API documentation

`algorithms`

`duration_based_chunks(splits: int, items: List[nodes.Item], durations: Dict[str, float]) -> List[TestGroup]`

Split tests into groups by runtime, ensuring that the groups do not overlap. The original list of test items is split by finding boundary indices i_0, i_1, i_2, ... and creating group_1 = items[0:i_0], group_2 = items[i_0:i_1], group_3 = items[i_1:i_2], ...

:param splits: Number of groups to split the tests into.
:param items: Test items passed down by Pytest.
:param durations: Cached test runtimes, assumed to contain timings only for the relevant tests.
:return: List of TestGroup

Source code in src/pytest_split/algorithms.py

def duration_based_chunks(
    splits: int, items: "List[nodes.Item]", durations: "Dict[str, float]"
) -> "List[TestGroup]":
    """
    Split tests into groups by runtime.
    Ensures tests are split into non-overlapping groups.
    The original list of test items is split into groups by finding boundary indices i_0, i_1, i_2
    and creating group_1 = items[0:i_0], group_2 = items[i_0:i_1], group_3 = items[i_1:i_2], ...

    :param splits: How many groups we're splitting in.
    :param items: Test items passed down by Pytest.
    :param durations: Our cached test runtimes, assumed to contain timings only for the relevant tests.
    :return: List of TestGroup
    """
    items_with_durations = _get_items_with_durations(items, durations)
    time_per_group = sum(map(itemgetter(1), items_with_durations)) / splits

    selected: "List[List[nodes.Item]]" = [[] for i in range(splits)]
    deselected: "List[List[nodes.Item]]" = [[] for i in range(splits)]
    duration: "List[float]" = [0 for i in range(splits)]

    group_idx = 0
    for item, item_duration in items_with_durations:
        if duration[group_idx] >= time_per_group:
            group_idx += 1

        selected[group_idx].append(item)
        for i in range(splits):
            if i != group_idx:
                deselected[i].append(item)
        duration[group_idx] += item_duration

    return [
        TestGroup(selected=selected[i], deselected=deselected[i], duration=duration[i])
        for i in range(splits)
    ]

`least_duration(splits: int, items: List[nodes.Item], durations: Dict[str, float]) -> List[TestGroup]`

Split tests into groups by runtime. The algorithm walks the test items in order of decreasing duration and assigns each one to the group with the smallest duration sum so far.

The algorithm sorts the items by their duration. Since the sorting algorithm is stable, ties will be broken by maintaining the original order of items. It is therefore important that the order of items be identical on all nodes that use this plugin. Due to issue #25 this might not always be the case.

:param splits: Number of groups to split the tests into.
:param items: Test items passed down by Pytest.
:param durations: Cached test runtimes, assumed to contain timings only for the relevant tests.
:return: List of groups

Source code in src/pytest_split/algorithms.py

def least_duration(
    splits: int, items: "List[nodes.Item]", durations: "Dict[str, float]"
) -> "List[TestGroup]":
    """
    Split tests into groups by runtime.
    It walks the test items, starting with the test with largest duration.
    It assigns the test with the largest runtime to the group with the smallest duration sum.

    The algorithm sorts the items by their duration. Since the sorting algorithm is stable, ties will be broken by
    maintaining the original order of items. It is therefore important that the order of items be identical on all nodes
    that use this plugin. Due to issue #25 this might not always be the case.

    :param splits: How many groups we're splitting in.
    :param items: Test items passed down by Pytest.
    :param durations: Our cached test runtimes, assumed to contain timings only for the relevant tests.
    :return:
        List of groups
    """
    items_with_durations = _get_items_with_durations(items, durations)

    # add index of item in list
    items_with_durations_indexed = [
        (*tup, i) for i, tup in enumerate(items_with_durations)
    ]

    # Sort by name to ensure it's always the same order
    items_with_durations_indexed = sorted(
        items_with_durations_indexed, key=lambda tup: str(tup[0])
    )

    # sort by duration in descending order (stable, so ties keep the name order)
    sorted_items_with_durations = sorted(
        items_with_durations_indexed, key=lambda tup: tup[1], reverse=True
    )

    selected: "List[List[Tuple[nodes.Item, int]]]" = [[] for _ in range(splits)]
    deselected: "List[List[nodes.Item]]" = [[] for _ in range(splits)]
    duration: "List[float]" = [0 for _ in range(splits)]

    # create a heap of the form (summed_durations, group_index)
    heap: "List[Tuple[float, int]]" = [(0, i) for i in range(splits)]
    heapq.heapify(heap)
    for item, item_duration, original_index in sorted_items_with_durations:
        # get group with smallest sum
        summed_durations, group_idx = heapq.heappop(heap)
        new_group_durations = summed_durations + item_duration

        # store assignment
        selected[group_idx].append((item, original_index))
        duration[group_idx] = new_group_durations
        for i in range(splits):
            if i != group_idx:
                deselected[i].append(item)

        # store new duration - in case of ties it sorts by the group_idx
        heapq.heappush(heap, (new_group_durations, group_idx))

    groups = []
    for i in range(splits):
        # sort the items by their original index to maintain relative ordering
        # we don't care about the order of deselected items
        s = [
            item for item, original_index in sorted(selected[i], key=lambda tup: tup[1])
        ]
        group = TestGroup(selected=s, deselected=deselected[i], duration=duration[i])
        groups.append(group)
    return groups
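
The greedy "longest test first, lightest group first" assignment can be tried in isolation. This is a minimal sketch under simplifying assumptions: it uses plain test names rather than `nodes.Item` objects, and the function name `assign_longest_first` and the sample timings are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple


def assign_longest_first(
    names: List[str], durations: Dict[str, float], splits: int
) -> List[List[str]]:
    """Greedily assign each test, longest first, to the currently lightest group."""
    # Sort indices by name first so the result does not depend on input order,
    # then by duration in descending order. sorted() is stable, so ties keep
    # the name order.
    order = sorted(range(len(names)), key=lambda i: names[i])
    order = sorted(order, key=lambda i: durations[names[i]], reverse=True)

    # Min-heap of (summed_duration, group_index); ties go to the lower index.
    heap: List[Tuple[float, int]] = [(0.0, g) for g in range(splits)]
    heapq.heapify(heap)

    assigned: List[List[int]] = [[] for _ in range(splits)]
    for i in order:
        total, g = heapq.heappop(heap)
        assigned[g].append(i)
        heapq.heappush(heap, (total + durations[names[i]], g))

    # Restore the original relative order inside each group.
    return [[names[i] for i in sorted(indices)] for indices in assigned]


# Hypothetical test names and timings.
names = ["a", "b", "c", "d", "e"]
durations = {"a": 1.0, "b": 5.0, "c": 3.0, "d": 3.0, "e": 2.0}
groups = assign_longest_first(names, durations, splits=2)
# groups == [["b", "e"], ["a", "c", "d"]], with 7.0 seconds in each group
```

Unlike a contiguous split, this approach may place neighbouring tests in different groups, which is the price paid for a more even distribution of runtime.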

`ipynb_compatibility`

`ensure_ipynb_compatibility(group: TestGroup, items: list) -> None`

Ensures that group doesn't contain partial IPy notebook cells.

pytest-split might, in principle, break up the cells of an IPython notebook into different test groups, in which case the tests will most likely fail (for starters, libraries are imported in Cell 0, so all subsequent calls to the imported libraries in the following cells will raise NameError).

Source code in src/pytest_split/ipynb_compatibility.py

def ensure_ipynb_compatibility(group: "TestGroup", items: list) -> None:
    """
    Ensures that group doesn't contain partial IPy notebook cells.

    ``pytest-split`` might, in principle, break up the cells of an
    IPython notebook into different test groups, in which case the tests
    most likely fail (for starters, libraries are imported in Cell 0, so
    all subsequent calls to the imported libraries in the following cells
    will raise ``NameError``).

    """
    if not group.selected or not _is_ipy_notebook(group.selected[0].nodeid):
        return

    item_node_ids = [item.nodeid for item in items]

    # Deal with broken up notebooks at the beginning of the test group
    first = group.selected[0].nodeid
    siblings = _find_sibiling_ipynb_cells(first, item_node_ids)
    if first != siblings[0]:
        for item in list(group.selected):
            if item.nodeid in siblings:
                group.deselected.append(item)
                group.selected.remove(item)

    if not group.selected or not _is_ipy_notebook(group.selected[-1].nodeid):
        return

    # Deal with broken up notebooks at the end of the test group
    last = group.selected[-1].nodeid
    siblings = _find_sibiling_ipynb_cells(last, item_node_ids)
    if last != siblings[-1]:
        for item in list(group.deselected):
            if item.nodeid in siblings:
                group.deselected.remove(item)
                group.selected.append(item)
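
The effect of this adjustment on group boundaries can be sketched with plain node-id strings. The example below is an illustration under stated assumptions, not the plugin's code: the helpers `is_notebook` and `siblings_of` are hypothetical stand-ins for `_is_ipy_notebook` and `_find_sibiling_ipynb_cells`, whose implementations are not shown here, and the node ids are assumed to look like `notebook.ipynb::Cell 0`.

```python
from typing import List


def is_notebook(node_id: str) -> bool:
    # Assumption: the file path is the part of the node id before "::".
    return node_id.split("::")[0].endswith(".ipynb")


def siblings_of(node_id: str, all_ids: List[str]) -> List[str]:
    # All node ids that belong to the same file, in collection order.
    path = node_id.split("::")[0]
    return [n for n in all_ids if n.split("::")[0] == path]


def align_to_notebooks(selected: List[str], all_ids: List[str]) -> List[str]:
    """Return a copy of selected that contains only whole notebooks."""
    selected = list(selected)
    if not selected or not is_notebook(selected[0]):
        return selected

    # Head: the notebook started in an earlier group, so drop its cells here.
    sibs = siblings_of(selected[0], all_ids)
    if selected[0] != sibs[0]:
        selected = [n for n in selected if n not in sibs]

    if not selected or not is_notebook(selected[-1]):
        return selected

    # Tail: the notebook continues past this group, so pull in the rest.
    sibs = siblings_of(selected[-1], all_ids)
    if selected[-1] != sibs[-1]:
        selected += [n for n in sibs if n not in selected]
    return selected


# Hypothetical collection order and a split that cuts through both notebooks.
all_ids = [
    "a.ipynb::Cell 0",
    "a.ipynb::Cell 1",
    "test_x.py::test_1",
    "b.ipynb::Cell 0",
    "b.ipynb::Cell 1",
]
first = align_to_notebooks(["a.ipynb::Cell 0"], all_ids)
middle = align_to_notebooks(
    ["a.ipynb::Cell 1", "test_x.py::test_1", "b.ipynb::Cell 0"], all_ids
)
last = align_to_notebooks(["b.ipynb::Cell 1"], all_ids)
```

In this sketch, a notebook that straddles a boundary ends up entirely in the group where its first cell was placed: the earlier group extends its tail, and the later group drops the leftover cells from its head.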