Search Algorithms and Utilities

Strategies

interact

interact(
    step: Callable[
        [AnswerPrefix, InteractStats], Opaque[P, Response[A | WrappedParseError, T]]
    ],
    process: Callable[[A, InteractStats], Opaque[P, B | Error]],
    tools: Mapping[type[T], Callable[[Any], Opaque[P, Any]]] | None = None,
    inner_policy_type: type[P] | None = None,
) -> Strategy[Branch, P, B]

A standard strategy for creating conversational agents.

A common pattern for interacting with LLMs is to have multi-message exchanges where the full conversation history is resent repeatedly. LLMs are also often allowed to request tool calls. This strategy implements this pattern. It is meant to be inlined into a wrapping strategy (since it is not decorated with strategy).

Parameters:

Name Type Description Default
step Callable[[AnswerPrefix, InteractStats], Opaque[P, Response[A | WrappedParseError, T]]]

A parametric opaque space, induced by a strategy or query that takes as an argument the current chat history (possibly empty) along with some statistics, and returns an answer to be processed. Oftentimes, this parametric opaque space is induced by a query with a special prefix field for receiving the chat history (see Query).

required
process Callable[[A, InteractStats], Opaque[P, B | Error]]

An opaque space induced by a query or strategy that is called on all model responses that are not tool calls, and which returns either a final response to be returned, or an error to be transmitted to the model as feedback (as an Error value with an absent or serializable meta field).

required
tools Mapping[type[T], Callable[[Any], Opaque[P, Any]]] | None

A mapping from supported tool interfaces to implementations. Tools themselves can be implemented using arbitrary strategies or queries, allowing the integration of horizontal and vertical LLM pipelines.

None
inner_policy_type type[P] | None

Ambient inner policy type. This information is not used at runtime but it can be provided to help type inference when necessary.

None
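
For illustration, here is a minimal sketch of a wrapping strategy that inlines `interact`. `AskModel` and `CheckAnswer` are hypothetical queries (not part of the library), the import convention `import delphyne as dp` is assumed, and the ambient inner policy is assumed to be a prompting policy:

```python
import delphyne as dp  # assumed import convention

# A minimal sketch, not a definitive usage: `AskModel` and
# `CheckAnswer` are hypothetical queries.
def solve(task: str) -> dp.Strategy[dp.Branch, dp.PromptingPolicy, str]:
    # `interact` is not decorated with `strategy`, so it is inlined
    # with `yield from` here rather than branched over.
    answer = yield from interact(
        step=lambda prefix, stats: AskModel(task, prefix=prefix).using(ambient_pp),
        process=lambda ans, stats: CheckAnswer(task, ans).using(ambient_pp),
    )
    return answer
```
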
Source code in src/delphyne/stdlib/search/interactive.py
def interact[P, A, B, T: md.AbstractTool[Any]](
    step: Callable[
        [dp.AnswerPrefix, InteractStats],
        Opaque[P, dq.Response[A | WrappedParseError, T]],
    ],
    process: Callable[[A, InteractStats], Opaque[P, B | dp.Error]],
    tools: Mapping[type[T], Callable[[Any], Opaque[P, Any]]] | None = None,
    inner_policy_type: type[P] | None = None,
) -> dp.Strategy[Branch, P, B]:
    """
    A standard strategy for creating conversational agents.

    A common pattern for interacting with LLMs is to have multi-message
    exchanges where the full conversation history is resent repeatedly.
    LLMs are also often allowed to request tool calls. This strategy
    implements this pattern. It is meant to be inlined into a wrapping
    strategy (since it is not decorated with `strategy`).

    Parameters:
        step: A parametric opaque space, induced by a strategy or query
            that takes as an argument the current chat history (possibly
            empty) along with some statistics, and returns an answer to
            be processed. Oftentimes, this parametric opaque space is
            induced by a query with a special `prefix` field for
            receiving the chat history (see `Query`).
        process: An opaque space induced by a query or strategy that is
            called on all model responses that are not tool calls, and
            which returns either a final response to be returned, or an
            error to be transmitted to the model as feedback (as an
            `Error` value with an absent or serializable `meta` field).
        tools: A mapping from supported tool interfaces to
            implementations. Tools themselves can be implemented using
            arbitrary strategies or queries, allowing the integration of
            horizontal and vertical LLM pipelines.
        inner_policy_type: Ambient inner policy type. This information
            is not used at runtime but it can be provided to help type
            inference when necessary.
    """

    prefix: dp.AnswerPrefix = []
    stats = InteractStats(num_rejected=0, num_tool_call_rounds=0)
    while True:
        resp = yield from branch(step(prefix, stats))
        prefix += [dp.OracleMessage("oracle", resp.answer)]
        match resp.parsed:
            case dq.FinalAnswer(a):
                if isinstance(a, WrappedParseError):
                    msg = dp.FeedbackMessage(
                        kind="feedback",
                        label=a.error.label,
                        description=a.error.description,
                        meta=a.error.meta,
                    )
                    stats.num_rejected += 1
                    prefix += [msg]
                else:
                    res = yield from branch(process(a, stats))
                    if isinstance(res, dp.Error):
                        msg = dp.FeedbackMessage(
                            kind="feedback",
                            label=res.label,
                            description=res.description,
                            meta=res.meta,
                        )
                        stats.num_rejected += 1
                        prefix += [msg]
                    else:
                        return res
            case dq.ToolRequests(tc):
                for i, t in enumerate(tc):
                    assert tools is not None
                    tres = yield from branch(tools[type(t)](t))
                    msg = dp.ToolResult(
                        "tool",
                        resp.answer.tool_calls[i],
                        t.render_result(tres),
                    )
                    prefix += [msg]
                stats.num_tool_call_rounds += 1

InteractStats dataclass

Statistics maintained by interact.

Attributes:

Name Type Description
num_rejected int

Number of answers that have been rejected so far, due to either parsing or processing errors.

num_tool_call_rounds int

Number of tool call rounds that have been requested by the LLM so far (a round consists of a single message that can contain several tool call requests).

Source code in src/delphyne/stdlib/search/interactive.py
@dataclass
class InteractStats:
    """
    Statistics maintained by `interact`.

    Attributes:
        num_rejected: Number of answers that have been rejected so far,
            due to either parsing or processing errors.
        num_tool_call_rounds: Number of tool call rounds that have been
            requested by the LLM so far (a round consists of a single
            message that can contain several tool call requests).
    """

    num_rejected: int
    num_tool_call_rounds: int

Policies

dfs

dfs(
    tree: Tree[Branch | Fail, P, T],
    env: PolicyEnv,
    policy: P,
    max_depth: int | None = None,
    max_branching: int | None = None,
) -> StreamGen[T]

The Standard Depth-First Search Algorithm.

Whenever a branching node is encountered, branching candidates are lazily enumerated and the corresponding children are recursively searched.

Parameters:

Name Type Description
max_depth optional

maximum number of branching nodes that can be traversed in a path to success.

max_branching optional

maximum number of children explored at each branching node.
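
As a usage sketch (following the `search_policy & inner_policy` composition used elsewhere on this page; `inner` is a hypothetical inner policy):

```python
# A DFS policy that explores at most 2 children per branching node
# and abandons any path containing more than 5 branching nodes.
policy = dfs(max_depth=5, max_branching=2) & inner
```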

Source code in src/delphyne/stdlib/search/dfs.py
@search_policy
def dfs[P, T](
    tree: Tree[Branch | Fail, P, T],
    env: PolicyEnv,
    policy: P,
    max_depth: int | None = None,
    max_branching: int | None = None,
) -> StreamGen[T]:
    """
    The Standard Depth-First Search Algorithm.

    Whenever a branching node is encountered, branching candidates are
    lazily enumerated and the corresponding children are recursively
    searched.

    Parameters:
        max_depth (optional): maximum number of branching nodes
            that can be traversed in a path to success.
        max_branching (optional): maximum number of children explored at
            each branching node.
    """
    assert max_branching is None or max_branching > 0
    match tree.node:
        case Success(x):
            yield Solution(x)
        case Fail():
            pass
        case Branch(cands):
            if max_depth is not None and max_depth <= 0:
                return
            cands = cands.stream(env, policy)
            if max_branching is not None:
                cands = cands.take(max_branching, strict=True)
            yield from cands.bind(
                lambda a: dfs(
                    max_depth=max_depth - 1 if max_depth is not None else None,
                    max_branching=max_branching,
                )(tree.child(a.tracked), env, policy).gen()
            ).gen()
        case _:
            unsupported_node(tree.node)

par_dfs

par_dfs(tree: Tree[Branch | Fail, P, T], env: PolicyEnv, policy: P) -> StreamGen[T]

Parallel Depth-First Search.

Whenever a branching node is encountered, all branching candidates are computed at once and the associated children are explored in parallel.
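
As a sketch of how the two variants compose (again with a hypothetical inner policy `inner`):

```python
sequential = dfs(max_branching=4) & inner  # children enumerated lazily
parallel = par_dfs() & inner  # candidates computed upfront, children searched in parallel
```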

Source code in src/delphyne/stdlib/search/dfs.py
@search_policy
def par_dfs[P, T](
    tree: Tree[Branch | Fail, P, T],
    env: PolicyEnv,
    policy: P,
) -> StreamGen[T]:
    """
    Parallel Depth-First Search.

    Whenever a branching node is encountered, all branching candidates
    are computed at once and the associated children are explored in
    parallel.
    """
    match tree.node:
        case Success(x):
            yield Solution(x)
        case Fail():
            pass
        case Branch(cands):
            cands = yield from cands.stream(env, policy).all()
            yield from Stream.parallel(
                [par_dfs()(tree.child(a.tracked), env, policy) for a in cands]
            ).gen()
        case _:
            unsupported_node(tree.node)

Combinators

sequence

sequence(
    policies: Iterable[PromptingPolicy], *, stop_on_reject: bool = True
) -> PromptingPolicy
sequence(
    policies: Iterable[SearchPolicy[N]], *, stop_on_reject: bool = True
) -> SearchPolicy[N]
sequence(
    policies: Iterable[Policy[N, P]], *, stop_on_reject: bool = True
) -> Policy[N, P]
sequence(policies: Iterable[_AnyPolicy], *, stop_on_reject: bool = True) -> _AnyPolicy

Try a list of policies, search policies, or prompting policies in sequence.

  • policies: An iterable of policies, search policies, or prompting policies to try in sequence.
  • stop_on_reject: If True, stop the sequence as soon as one policy sees all its resource requests denied. Note that this is necessary for termination when policies is an infinite iterator.
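
For instance, here is a sketch of iterative deepening built from an infinite sequence of DFS policies (budget management not shown):

```python
import itertools

# `stop_on_reject=True` (the default) is what guarantees termination
# here, since the iterator of policies is infinite.
deepening = sequence(dfs(max_depth=d) for d in itertools.count(1))
```
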
Source code in src/delphyne/stdlib/misc.py
def sequence(
    policies: Iterable[_AnyPolicy],
    *,
    stop_on_reject: bool = True,
) -> _AnyPolicy:
    """
    Try a list of policies, search policies, or prompting policies in
    sequence.

    Parameters:
        policies: An iterable of policies, search policies, or prompting
            policies to try in sequence.
        stop_on_reject: If `True`, stop the sequence as soon as one
            policy sees all its resource requests denied. Note that this
            is necessary for termination when `policies` is an infinite
            iterator.
    """

    it = iter(policies)
    first = next(it)
    if isinstance(first, PromptingPolicy):
        return sequence_prompting_policies(
            cast(Iterable[PromptingPolicy], policies),
            stop_on_reject=stop_on_reject,
        )
    elif isinstance(first, SearchPolicy):
        return sequence_search_policies(
            cast(Iterable[SearchPolicy[Any]], policies),
            stop_on_reject=stop_on_reject,
        )
    else:
        assert isinstance(first, Policy)
        return sequence_policies(
            cast(Iterable[Policy[Any, Any]], policies),
            stop_on_reject=stop_on_reject,
        )

or_else

or_else(main: SearchPolicy[N], other: SearchPolicy[N]) -> SearchPolicy[N]
or_else(main: Policy[N, P], other: Policy[N, P]) -> Policy[N, P]
or_else(main: _AnyPolicy, other: _AnyPolicy) -> _AnyPolicy

Take two policies, search policies, or prompting policies as arguments. Try the first one, and then the second one only if it fails (i.e., it does not produce any solution).
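
A sketch:

```python
# Prefer a cheap bounded search; fall back to an unbounded DFS only
# if the bounded one produces no solution.
policy = or_else(dfs(max_depth=2), dfs())
```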

Source code in src/delphyne/stdlib/misc.py
def or_else(main: _AnyPolicy, other: _AnyPolicy) -> _AnyPolicy:
    """
    Take two policies, search policies, or prompting policies as
    arguments. Try the first one, and then the second one only if it
    fails (i.e., it does not produce any solution).
    """
    if isinstance(main, PromptingPolicy):
        assert isinstance(other, PromptingPolicy)
        return prompting_policy_or_else(main, other)
    elif isinstance(main, SearchPolicy):
        assert isinstance(other, SearchPolicy)
        return search_policy_or_else(main, other)
    else:
        return policy_or_else(main, other)  # type: ignore

nofail

nofail(space: Opaque[P, A], *, default: B) -> Opaque[P, A | B]

Modify an opaque space so that branching over it can never fail.

If the stream associated with the opaque space gets exhausted and no solution is produced, the provided default value is used.

In demonstrations, the default value can be selected by using the #no_fail_default hint.
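
A sketch, to be read inside a strategy body (`RankCandidates` is a hypothetical query):

```python
# Branching over the wrapped space always succeeds: if the query's
# stream is exhausted without producing a value, `items[0]` is used.
best = yield from branch(
    nofail(RankCandidates(items).using(ambient_pp), default=items[0])
)
```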

Source code in src/delphyne/stdlib/misc.py
def nofail[P, A, B](space: Opaque[P, A], *, default: B) -> Opaque[P, A | B]:
    """
    Modify an opaque space so that branching over it can never fail.

    If the stream associated with the opaque space gets exhausted and no
    solution is produced, the provided default value is used.

    In demonstrations, the default value can be selected by using the
    `#no_fail_default` hint.
    """
    try_policy = dfs() @ elim_flag(NoFailFlag, "no_fail_try")
    def_policy = dfs() @ elim_flag(NoFailFlag, "no_fail_default")
    search_policy = or_else(try_policy, def_policy)
    return nofail_strategy(space, default=default).using(
        lambda p: search_policy & p
    )

iterate

iterate(
    next: Callable[[S | None], Opaque[P, tuple[T | None, S]]],
    transform_stream: Callable[[P], StreamTransformer | None] | None = None,
) -> Opaque[P, T]

Iteratively call a strategy or query, repeatedly feeding back the last call's output state into a new call and yielding values along the way.

A standard use case is to repeatedly call a query or strategy with a blacklist of previously generated values, so as to produce diverse success values.

Parameters:

Name Type Description Default
next Callable[[S | None], Opaque[P, tuple[T | None, S]]]

A parametric opaque space, induced by a query or strategy that takes a state as an input (or None initially) and outputs a new state, along with a generated value.

required
transform_stream Callable[[P], StreamTransformer | None] | None

An optional mapping from the inner policy to a stream transformer to be applied to the resulting stream of generated values.

None

Returns:

Type Description
Opaque[P, T]

An opaque space enumerating all generated values.
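
A sketch (`ProposeDistinct` is a hypothetical query):

```python
# The query takes the previous state (here, a blacklist of values
# generated so far, or `None` on the first call) and returns a
# `(new_value, new_state)` pair. The result is an opaque space
# enumerating all generated values.
diverse = iterate(
    lambda state: ProposeDistinct(problem, blacklist=state).using(ambient_pp)
)
```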

Source code in src/delphyne/stdlib/search/iteration.py
def iterate[P, S, T](
    next: Callable[[S | None], Opaque[P, tuple[T | None, S]]],
    transform_stream: Callable[[P], StreamTransformer | None] | None = None,
) -> Opaque[P, T]:
    """
    Iteratively call a strategy or query, repeatedly feeding back the
    last call's output state into a new call and yielding values along
    the way.

    A standard use case is to repeatedly call a query or strategy with a
    blacklist of previously generated values, so as to produce diverse
    success values.

    Arguments:
        next: A parametric opaque space, induced by a query or strategy
            that takes a state as an input (or `None` initially) and
            outputs a new state, along with a generated value.
        transform_stream: An optional mapping from the inner policy to a
            stream transformer to be applied to the resulting stream of
            generated values.

    Returns:
        An opaque space enumerating all generated values.
    """

    def iterate_policy(inner_policy: P):
        policy = _search_iteration()
        if transform_stream is not None:
            trans = transform_stream(inner_policy)
            if trans is not None:
                policy = trans @ policy
        return policy & inner_policy

    return _iterate(next).using(iterate_policy)

Opaque Spaces Sugar

just_dfs

just_dfs(policy: P) -> Policy[Branch | Fail, P]

Convenience shortcut to avoid passing lambdas to the get_policy argument of using, when using DFS in combination with the ambient inner policy.
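
A sketch (`sub` is a hypothetical sub-strategy):

```python
# The two calls below are equivalent; `just_dfs` avoids the lambda.
sub(x).using(lambda p: dfs() & p)
sub(x).using(just_dfs)
```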

Source code in src/delphyne/stdlib/misc.py
def just_dfs[P](policy: P) -> Policy[Branch | Fail, P]:
    """
    Convenience shortcut to avoid passing lambdas to the `get_policy`
    argument of `using`, when using DFS in combination with the ambient
    inner policy.
    """
    return dfs() & policy

just_compute

just_compute(policy: P) -> Policy[Compute, P]

Convenience shortcut to avoid passing lambdas to the get_policy argument of using, in the case of sub-strategies that only feature the Compute effect.

Source code in src/delphyne/stdlib/misc.py
def just_compute[P](policy: P) -> Policy[Compute, P]:
    """
    Convenience shortcut to avoid passing lambdas to the `get_policy`
    argument of `using`, in the case of sub-strategies that only feature
    the `Compute` effect.
    """
    return dfs() @ elim_compute() & policy

ambient_pp

ambient_pp(policy: PromptingPolicy) -> PromptingPolicy

Convenience shortcut to avoid passing lambdas to the get_policy argument of Query.using, when using the ambient inner policy as a prompting policy.
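
A sketch (`MyQuery` is a hypothetical query whose ambient inner policy is itself a prompting policy):

```python
# The two calls below are equivalent.
MyQuery(x).using(lambda p: p)
MyQuery(x).using(ambient_pp)
```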

Source code in src/delphyne/stdlib/misc.py
def ambient_pp(policy: PromptingPolicy) -> PromptingPolicy:
    """
    Convenience shortcut to avoid passing lambdas to the `get_policy`
    argument of `Query.using`, when using the ambient inner policy as a
    prompting policy.
    """
    return policy

ambient

ambient(policy: F) -> F

Convenience shortcut to avoid passing lambdas to the get_policy argument of Query.using, when using the ambient inner policy as a sub-policy (or as a sub-prompting policy).

Source code in src/delphyne/stdlib/misc.py
def ambient[F](policy: F) -> F:
    """
    Convenience shortcut to avoid passing lambdas to the `get_policy`
    argument of `Query.using`, when using the ambient inner policy as a
    sub-policy (or as a sub-prompting policy).
    """
    return policy

Universal Queries

UniversalQuery dataclass

Bases: Query[object]

A universal query, implicitly defined by the surrounding context of its call. See guess for more information.

Attributes:

Name Type Description
strategy str

Fully qualified name of the surrounding strategy (e.g., my_package.my_module.my_strategy).

expected_type str

A string rendition of the expected answer type.

tags Sequence[str]

Tags associated with the space induced by the query, which can be used to locate the exact location where the query is issued (the default tag takes the name of the variable that the query result is assigned to).

locals dict[str, object]

A dictionary that provides the values of a subset of local variables or expressions (as JSON values).

Experimental

This feature is experimental and subject to change.

Source code in src/delphyne/stdlib/universal_queries.py
@dataclass
class UniversalQuery(Query[object]):
    """
    A universal query, implicitly defined by the surrounding context of
    its call. See `guess` for more information.

    Attributes:
        strategy: Fully qualified name of the surrounding strategy
            (e.g., `my_package.my_module.my_strategy`).
        expected_type: A string rendition of the expected answer type.
        tags: Tags associated with the space induced by the query, which
            can be used to locate the exact location where the query is
            issued (the default tag takes the name of the variable that
            the query result is assigned to).
        locals: A dictionary that provides the values of a subset of
            local variables or expressions (as JSON values).

    !!! warning "Experimental"
        This feature is experimental and subject to change.
    """

    # TODO: add a `context` field where we store objects whose
    # documentation or source should be added to the prompt.

    strategy: str
    expected_type: str
    tags: Sequence[str]
    locals: dict[str, object]

    __parser__ = last_code_block.yaml

    @override
    def default_tags(self):
        return self.tags

    @property
    def strategy_source(self) -> str:
        """
        Return the source code of the strategy that contains this query.
        """
        strategy_obj = _load_from_qualified_name(self.strategy)
        assert callable(strategy_obj)
        return _source_code(strategy_obj)

strategy_source property

strategy_source: str

Return the source code of the strategy that contains this query.

guess

guess(
    annot: type[T], /, *, using: Sequence[object]
) -> Strategy[Branch | Fail, IPDict, T]
guess(
    annot: TypeAnnot[Any], /, *, using: Sequence[object]
) -> Strategy[Branch | Fail, IPDict, Any]
guess(
    annot: TypeAnnot[Any], /, *, using: Sequence[object]
) -> Strategy[Branch | Fail, IPDict, Any]

Attempt to guess a value of a given type, using the surrounding context of the call site along with the value of some local variables or expressions.

This function inspects the call stack to determine the context in which it is called and issues a UniversalQuery, with a tag corresponding to the name of the assigned variable. A failure node is issued if the oracle result cannot be parsed into the expected type. For example:

res = yield from guess(int, using=[x, y.summary()])

issues a UniversalQuery query tagged res, with attribute locals a dictionary with string keys "x" and "y.summary()".

Parameters:

Name Type Description
annot

The expected type of the value to be guessed.

using

A sequence of local variables or expressions whose value should be communicated to the oracle (a label for each expression is automatically generated using source information).

Note

Our use of an overloaded type should not be necessary anymore when TypeExpr is released with Python 3.14.

Experimental

This feature is experimental and subject to change.

Source code in src/delphyne/stdlib/universal_queries.py
def guess(
    annot: TypeAnnot[Any], /, *, using: Sequence[object]
) -> dp.Strategy[Branch | Fail, IPDict, Any]:
    """
    Attempt to guess a value of a given type, using the surrounding
    context of the call site along with the value of some local
    variables or expressions.

    This function inspects the call stack to determine the context in
    which it is called and issues a `UniversalQuery`, with a tag
    corresponding to the name of the assigned variable. A failure node is
    issued if the oracle result cannot be parsed into the expected type.
    For example:

    ```python
    res = yield from guess(int, using=[x, y.summary()])
    ```

    issues a `UniversalQuery` query tagged `res`, with attribute
    `locals` a dictionary with string keys `"x"` and `"y.summary()"`.

    Parameters:
        annot: The expected type of the value to be guessed.
        using: A sequence of local variables or expressions whose value
            should be communicated to the oracle (a label for each
            expression is automatically generated using source information).

    !!! note
        Our use of an overloaded type should not be necessary anymore
        when `TypeExpr` is released with Python 3.14.

    !!! warning "Experimental"
        This feature is experimental and subject to change.
    """

    # Extracting the name of the surrounding strategy
    strategy = surrounding_qualname(skip=1)
    assert strategy is not None
    strategy = _rename_main_module_in_qualified_name(strategy)

    # Extracting the name of the variable being assigned
    cur_instr_src = current_instruction_source(skip=1)
    ret_val_name = assigned_var_name(cur_instr_src)
    assert isinstance(ret_val_name, str)

    # Computing the 'locals' dictionary
    guess_args = call_argument_sources(cur_instr_src, guess)
    assert guess_args is not None
    _args, kwargs = guess_args
    using_args = _parse_list_of_ids(kwargs["using"])
    assert len(using) == len(using_args)
    locals = {k: pydantic_dump(type(v), v) for k, v in zip(using_args, using)}

    # Building the query
    query = UniversalQuery(strategy, str(annot), [ret_val_name], locals)

    ret = yield from branch(query.using(...))
    try:
        parsed = pydantic_load(annot, ret)
    except Exception as e:
        assert_never((yield from fail("parse_error", message=str(e))))
    return parsed
best_first_search

best_first_search(
    tree: Tree[Branch | Factor | Value | Fail, P, T],
    env: PolicyEnv,
    policy: P,
    child_confidence_prior: Callable[[int, int], float],
    max_depth: int | None = None,
) -> StreamGen[T]

Best First Search Algorithm.

Nodes can be branching nodes or factor nodes. Factor nodes feature a confidence score in the [0, 1] interval. The total confidence of any node in the tree is the product of all confidence factors found on the path from the root to this node. The algorithm stores all visited branching nodes in a priority queue. At every step, it picks the node with highest confidence and spends an atomic amount of effort trying to generate a new child. If it succeeds, the first descendant branching node is added to the tree and the algorithm continues.

Also, the total confidence of each branching node is multiplied by an additional penalty factor that depends on how many children have been generated already, using the child_confidence_prior argument. This argument is a function that takes as its first argument the depth of the current branching node (0 for the root, only incrementing when meeting other branching nodes) and as its second argument the number of children generated so far. It returns the penalty factor to be applied.

The max_depth parameter indicates the maximum depth a branch node can have. The root has depth 0 and only branch nodes count towards increasing the depth.
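
As a sketch of the prior's shape (the decay base is an arbitrary choice, not a library default):

```python
# Confidence in obtaining yet another useful child decays
# geometrically with the number of children already generated.
def geometric_prior(depth: int, num_children: int) -> float:
    return 0.5**num_children

search = best_first_search(
    child_confidence_prior=geometric_prior, max_depth=8
)
```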

Source code in src/delphyne/stdlib/search/bestfs.py
@search_policy
def best_first_search[P, T](
    tree: dp.Tree[Branch | Factor | Value | Fail, P, T],
    env: PolicyEnv,
    policy: P,
    child_confidence_prior: Callable[[int, int], float],
    max_depth: int | None = None,
) -> dp.StreamGen[T]:
    """
    Best First Search Algorithm.

    Nodes can be branching nodes or factor nodes. Factor nodes feature a
    confidence score in the [0, 1] interval. The total confidence of any
    node in the tree is the product of all confidence factors found on
    the path from the root to this node. The algorithm stores all
    visited branching nodes in a priority queue. At every step, it picks
    the node with highest confidence and spends an atomic amount of
    effort trying to generate a new child. If it succeeds, the first
    descendant branching node is added to the tree and the algorithm
    continues.

    Also, the total confidence of each branching node is multiplied by
    an additional penalty factor that depends on how many children have
    been generated already, using the `child_confidence_prior` argument.
    This argument is a function that takes as its first argument the
    depth of the current branching node (0 for the root, only
    incrementing when meeting other branching nodes) and as its second
    argument the number of children generated so far. It returns the
    penalty factor to be applied.

    The `max_depth` parameter indicates the maximum depth a branch node
    can have. The root has depth 0 and only branch nodes count towards
    increasing the depth.
    """
    # `counter` is used to assign ids that are used to solve ties in the
    # priority queue (the older element gets priority).
    counter = 0
    pqueue: list[_PriorityItem] = []  # a heap

    def push_fresh_node(
        tree: dp.Tree[Branch | Factor | Value | Fail, Any, Any],
        confidence: float,
        depth: int,
    ) -> dp.StreamGen[T]:
        match tree.node:
            case dp.Success():
                yield dp.Solution(tree.node.success)
            case Fail():
                pass
            case Factor() | Value():
                if isinstance(tree.node, Value):
                    penalty_fun = tree.node.value(policy)
                else:
                    penalty_fun = tree.node.factor(policy)
                # Evaluate metrics if a penalty function is provided
                if penalty_fun is not None:
                    eval_stream = tree.node.eval.stream(env, policy)
                    eval = yield from eval_stream.first()
                    # If we failed to evaluate the metrics, we give up.
                    if eval is None:
                        return
                    if isinstance(tree.node, Value):
                        confidence = penalty_fun(eval.tracked.value)
                    else:
                        confidence *= penalty_fun(eval.tracked.value)
                yield from push_fresh_node(tree.child(None), confidence, depth)
            case Branch():
                if max_depth is not None and depth > max_depth:
                    return
                state = _NodeState(
                    depth=depth,
                    children=[],
                    confidence=confidence,
                    stream=[tree.node.cands.stream(env, policy)],
                    node=tree.node,
                    tree=tree,
                    next_actions=[],
                )
                nonlocal counter
                counter += 1
                prior = child_confidence_prior(depth, 0)
                item_confidence = confidence * prior
                item = _PriorityItem(-item_confidence, counter, state)
                heapq.heappush(pqueue, item)
            case _:
                unsupported_node(tree.node)

    def reinsert_node(state: _NodeState) -> None:
        nonlocal counter
        counter += 1
        prior = child_confidence_prior(state.depth, len(state.children))
        item_confidence = state.confidence * prior
        item = _PriorityItem(-item_confidence, counter, state)
        heapq.heappush(pqueue, item)

    # Put the root into the queue.
    yield from push_fresh_node(tree, 1.0, 0)
    while pqueue:
        state = heapq.heappop(pqueue).node_state
        if not state.next_actions:
            if not state.stream[0]:
                # No more actions to take, we do not put the node back.
                continue
            generated, _, next = yield from state.stream[0].next()
            state.next_actions.extend([a.tracked for a in generated])
            state.stream[0] = next
        if state.next_actions:
            cand = state.next_actions.pop(0)
            child = state.tree.child(cand)
            state.children.append(child.ref)
            yield from push_fresh_node(child, 1, state.depth + 1)
        # We put the node back into the queue
        reinsert_node(state)

Abduction

Abduction dataclass

Bases: Node

Node for the singleton tree produced by abduction. See abduction for details.

An action is a successful proof of the main goal.

Source code in src/delphyne/stdlib/search/abduction.py
@dataclass
class Abduction(dp.Node):
    """
    Node for the singleton tree produced by `abduction`.
    See `abduction` for details.

    An action is a successful proof of the main goal.
    """

    prove: Callable[
        [Sequence[tuple[_Fact, _Proof]], _Fact | None],
        OpaqueSpace[Any, _Status],
    ]
    suggest: Callable[
        [_Feedback],
        OpaqueSpace[Any, Sequence[_Fact]],
    ]
    search_equivalent: Callable[
        [Sequence[_Fact], _Fact],
        OpaqueSpace[Any, _Fact | None],
    ]
    redundant: Callable[
        [Sequence[_Fact], _Fact],
        OpaqueSpace[Any, bool],
    ]

    def navigate(self) -> dp.Navigation:
        def aux(fact: dp.Tracked[_Fact] | None) -> dp.Navigation:
            # Take a fact as an argument and return a list of
            # (proved_fact, proof) pairs.
            res = yield self.prove([], fact)
            status, payload = res[0], res[1]
            if status.value == "proved":
                return [(fact, payload)]
            elif status.value == "disproved":
                return []
            else:
                assert status.value == "feedback"
                feedback = payload
                suggestions = yield self.suggest(feedback)
                proved: list[Any] = []
                for s in suggestions:
                    extra: Any = yield from aux(s)
                    proved.extend(extra)
                res = yield self.prove(proved, fact)
                status, payload = res[0], res[1]
                if status.value == "proved":
                    proved.append((fact, payload))
                return _remove_duplicates(proved, by=lambda x: drop_refs(x[0]))

        proved: Any = yield from aux(None)
        main_proof = _find_assoc(proved, None)
        if main_proof is None:
            raise dp.NavigationError(
                "No proof for the main goal was produced."
            )
        return main_proof

abduction

abduction(
    prove: Callable[
        [Sequence[tuple[Fact, Proof]], Fact | None],
        Opaque[P, AbductionStatus[Feedback, Proof]],
    ],
    suggest: Callable[[Feedback], Opaque[P, Sequence[Fact]]],
    search_equivalent: Callable[[Sequence[Fact], Fact], Opaque[P, Fact | None]],
    redundant: Callable[[Sequence[Fact], Fact], Opaque[P, bool]],
    inner_policy_type: type[P] | None = None,
) -> Strategy[Abduction, P, Proof]

Higher-order strategy for proving a fact via recursive abduction.

Parameters:

Name Type Description Default
prove Callable[[Sequence[tuple[Fact, Proof]], Fact | None], Opaque[P, AbductionStatus[Feedback, Proof]]]

take a sequence of already established facts as an argument along with a new fact, and attempt to prove this new fact. Three outcomes are possible: the fact is proved, the fact is disproved, or a list of suggestions is made of facts that might be helpful to prove first. None denotes the top-level goal to be proved.

required
suggest Callable[[Feedback], Opaque[P, Sequence[Fact]]]

take some feedback from the prove function and return a sequence of fact candidates that may be useful to prove before reattempting the original proof.

required
search_equivalent Callable[[Sequence[Fact], Fact], Opaque[P, Fact | None]]

take a collection of facts along with a new one, and return either the first fact of the list equivalent to the new fact or None. This is used to avoid spending search effort on proving equivalent facts.

required
redundant Callable[[Sequence[Fact], Fact], Opaque[P, bool]]

take a collection of established facts and decide whether they imply a new fact candidate. This is useful to avoid proving and accumulating redundant facts.

required

Returns:

Type Description
Strategy[Abduction, P, Proof]

a proof of the top-level goal.
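
A sketch of the wiring, with four hypothetical queries implementing the callbacks (policy details elided):

```python
proof = yield from abduction(
    prove=lambda proved, goal: TryProve(proved, goal).using(ambient),
    suggest=lambda fb: SuggestLemmas(fb).using(ambient),
    search_equivalent=lambda known, f: FindEquivalent(known, f).using(ambient),
    redundant=lambda known, f: IsRedundant(known, f).using(ambient),
)
```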

Source code in src/delphyne/stdlib/search/abduction.py
def abduction[Fact, Feedback, Proof, P](
    prove: Callable[
        [Sequence[tuple[Fact, Proof]], Fact | None],
        Opaque[P, AbductionStatus[Feedback, Proof]],
    ],
    suggest: Callable[
        [Feedback],
        Opaque[P, Sequence[Fact]],
    ],
    search_equivalent: Callable[
        [Sequence[Fact], Fact], Opaque[P, Fact | None]
    ],
    redundant: Callable[[Sequence[Fact], Fact], Opaque[P, bool]],
    inner_policy_type: type[P] | None = None,
) -> dp.Strategy[Abduction, P, Proof]:
    """
    Higher-order strategy for proving a fact via recursive abduction.

    Arguments:
      prove: take a sequence of already established facts as an
        argument along with a new fact, and attempt to prove this new
        fact. Three outcomes are possible: the fact is proved, the fact
        is disproved, or a list of suggestions is made of facts that
        might be helpful to prove first. `None` denotes the top-level
        goal to be proved.

      suggest: take some feedback from the `prove` function and return a
        sequence of fact candidates that may be useful to prove before
        reattempting the original proof.

      search_equivalent: take a collection of facts along with a new
        one, and return either the first fact of the list equivalent to
        the new fact or `None`. This is used to avoid spending search
        effort on proving equivalent facts.

      redundant: take a collection of established facts and decide
        whether they imply a new fact candidate. This is useful to avoid
        proving and accumulating redundant facts.

    Returns:
      a proof of the top-level goal.
    """
    res = yield spawn_node(
        Abduction,
        prove=prove,
        suggest=suggest,
        search_equivalent=search_equivalent,
        redundant=redundant,
    )
    return cast(Proof, res)

abduct_and_saturate

abduct_and_saturate(
    tree: Tree[Abduction, P, Proof],
    env: PolicyEnv,
    policy: P,
    max_rollout_depth: int = 3,
    scoring_function: ScoringFunction = _default_scoring_function,
    log_steps: LogLevel | None = None,
    max_raw_suggestions_per_step: int | None = None,
    max_reattempted_candidates_per_propagation_step: int | None = None,
    max_consecutive_propagation_steps: int | None = None,
) -> StreamGen[Proof]

A saturation-based, sequential policy for abduction trees.

This policy proceeds by saturation: it repeatedly grows a set of proved facts until the main goal is proved or some limit is reached.

It does so by repeatedly performing rollouts. Each rollout starts with the toplevel goal as a target, and attempts to prove this target assuming all facts in proved. If the target cannot be proved, suggestions for auxiliary facts to prove first are requested before another attempt is made. If still unsuccessful, one of the unproved suggestions is set as the new target and the rollout proceeds (up to some depth specified by max_rollout_depth).

The algorithm maintains four disjoint global sets of facts:

  • proved: facts that have been successfully proved
  • disproved: facts that have been disproved
  • redundant: facts that are implied by the conjunction of all facts from proved.
  • candidates: facts that have been suggested but do not belong to any of the three sets above.

Each step of a rollout proceeds as follows:

  • The current target is assumed to be a fact from the candidates set. Suggestions for new rollout targets are determined as follows (get_suggestions):
    • The suggest node function returns a list of candidates.
    • All suggestions are normalized using the search_equivalent node function (one call per suggestion).
    • Each normalized suggestion is added (add_candidate) to one of the proved, disproved, redundant, or candidates sets. At most one call to the prove and is_redundant node functions is made per suggestion.
    • Assuming the previous step results in at least one new fact being proved, all candidates from the candidates set are re-examined until saturation (saturate).
    • Remaining suggestions that are in candidates are potential targets for the next rollout step.
  • Assuming the current target is still not proved, the next rollout target is picked using the scoring_function parameter.

Parameters:

Name Type Description Default
max_rollout_depth int

The maximum depth of a rollout, as the maximal number of consecutive target goals that can be set (the first goal being the toplevel goal).

3
scoring_function ScoringFunction

Scoring function for choosing the next target goal at the end of each rollout step.

_default_scoring_function
log_steps LogLevel | None

If not None, log main steps of the algorithm at the provided severity level.

None
max_raw_suggestions_per_step int | None

Maximum number of suggestions from the suggest node function to consider at each rollout step. If more suggestions are available, the most frequent ones (under naive, syntactic equality) are chosen.

None
max_reattempted_candidates_per_propagation_step int | None

Maximum number of candidates that are reattempted at each propagation step. Candidates that have been proposed more frequently are selected in priority.

None
max_consecutive_propagation_steps int | None

Maximum number of propagation steps that are performed during a rollout step, or None if there is no limit.

None

Warning

Facts must be hashable.

Warning

By design, this policy tries to make as few calls to suggest as possible, since those typically involve LLM calls. However, by default, it can make a very large number of calls to prove, is_redundant and search_equivalent. This number can explode as the number of candidates increases (in particular, it can be quadratic in the number of candidates at each rollout step, due to saturation). Thus, we recommend setting proper limits using the hyperparameters whose names start with max_.

Note

No fact is attempted to be proved if it is redundant with already-proved facts. However, in the current implementation, the set of proved facts can still contain redundancy. For example, if x >= 0 is established before the stronger x > 0 is, the former won't be deleted.
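
A configuration sketch reflecting the recommendation above (the specific limits are illustrative, not defaults):

```python
search = abduct_and_saturate(
    max_rollout_depth=2,
    max_raw_suggestions_per_step=8,
    max_reattempted_candidates_per_propagation_step=16,
    max_consecutive_propagation_steps=4,
)
```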

Source code in src/delphyne/stdlib/search/abduction.py
@search_policy
def abduct_and_saturate[P, Proof](
    tree: dp.Tree[Abduction, P, Proof],
    env: PolicyEnv,
    policy: P,
    max_rollout_depth: int = 3,
    scoring_function: ScoringFunction = _default_scoring_function,
    log_steps: dp.LogLevel | None = None,
    max_raw_suggestions_per_step: int | None = None,
    max_reattempted_candidates_per_propagation_step: int | None = None,
    max_consecutive_propagation_steps: int | None = None,
) -> dp.StreamGen[Proof]:
    """
    A saturation-based, sequential policy for abduction trees.

    This policy proceeds by saturation: it repeatedly grows a set of
    proved facts until the main goal is proved or some limit is reached.

    It does so by repeatedly performing _rollouts_. Each rollout starts
    with the toplevel goal as a target, and attempts to prove this target
    assuming all facts in `proved`. If the target cannot be proved,
    suggestions for auxiliary facts to prove first are requested before
    another attempt is made. If still unsuccessful, one of the unproved
    suggestions is set as the new target and the rollout proceeds (up to
    some depth specified by `max_rollout_depth`).

    The algorithm maintains four disjoint global sets of facts:

    - `proved`: facts that have been successfully proved
    - `disproved`: facts that have been disproved
    - `redundant`: facts that are implied by the conjunction of all
      facts from `proved`.
    - `candidates`: facts that have been suggested but do not belong to
      any of the three sets above.

    Each step of a rollout proceeds as follows:

    - The current target is assumed to be a fact from the `candidates`
      set. Suggestions for new rollout targets are determined as follows
      (`get_suggestions`):
        - The `suggest` node function returns a list of candidates.
        - All suggestions are normalized using the `search_equivalent`
          node function (one call per suggestion).
        - Each normalized suggestion is added (`add_candidate`) to one
          of the `proved`, `disproved`, `redundant`, or `candidates`
          sets. At most one call to the `prove` and `is_redundant` node
          functions is made per suggestion.
        - Assuming the previous step results in at least one new fact
          being proved, all candidates from the `candidates` set are
          re-examined until saturation (`saturate`).
        - Remaining suggestions that are in `candidates` are potential
          targets for the next rollout step.
    - Assuming the current target is still not proved, the next rollout
      target is picked using the `scoring_function` parameter.

    Arguments:
        max_rollout_depth: The maximum depth of a rollout, as the
            maximal number of consecutive target goals that can be set
            (the first goal being the toplevel goal).
        scoring_function: Scoring function for choosing the next target
            goal at the end of each rollout step.
        log_steps: If not `None`, log main steps of the algorithm at the
            provided severity level.
        max_raw_suggestions_per_step: Maximum number of suggestions from
            the `suggest` node function to consider at each rollout
            step. If more suggestions are available, the most frequent
            ones (under naive, syntactic equality) are chosen.
        max_reattempted_candidates_per_propagation_step: Maximum number
            of candidates that are reattempted at each propagation step.
            Candidates that have been proposed more frequently are
            selected in priority.
        max_consecutive_propagation_steps: Maximum number of propagation
            steps that are performed during a rollout step, or `None` if
            there is no limit.

    !!! warning
        Facts must be hashable.

    !!! warning
        By design, this policy tries to make as few calls to `suggest`
        as possible, since those typically involve LLM calls. However,
        by default, it can make a very large number of calls to `prove`,
        `is_redundant` and `search_equivalent`. This number can explode
        as the number of candidates increases (in particular, it can be
        quadratic in the number of candidates at each rollout step, due
        to saturation). Thus, we recommend setting proper limits using
        the hyperparameters whose names start with `max_`.

    !!! note
        No fact is attempted to be proved if it is redundant with
        already-proved facts. However, in the current implementation,
        the set of proved facts can still contain redundancy. For
        example, if `x >= 0` is established before the stronger `x > 0`
        is, the former won't be deleted.
    """

    # TODO: stop the rollout if the current goal is proved.

    # Initialize tool statistics tracking
    call_stats = _CallStats()

    # Invariant: `candidates`, `proved`, `disproved` and `redundant` are
    # disjoint. Together, they form the set of "canonical facts".
    candidates: dict[_EFact, _CandInfo] = {}
    proved: dict[_EFact, _Proof] = {}
    disproved: set[_EFact] = set()
    # Facts that are implied by the conjunction of all proved facts.
    redundant: set[_EFact] = set()

    # It is easier to manipulate untracked facts and so we keep the
    # correspondence with tracked facts here.
    # Invariant: all canonical facts are included in `tracked`.
    tracked: dict[_EFact, _Tracked_EFact] = {None: None}

    # The `equivalent` dict maps a fact to its canonical equivalent
    # representative that is somewhere in `candidates`, `proved`,
    # `disproved` or `redundant`.
    equivalent: dict[_EFact, _EFact] = {}

    # Can a new fact make a candidate redundant? YES. So we should also
    # do this in `propagate`

    assert max_rollout_depth >= 1
    assert isinstance(tree.node, Abduction)
    node = tree.node

    def compute_fact_stats() -> _FactStats:
        return _FactStats(
            num_candidates=len(candidates),
            num_proved=len(proved),
            num_disproved=len(disproved),
            num_redundant=len(redundant),
        )

    def dbg(msg: str):
        if log_steps:
            stats = {
                "facts_stats": compute_fact_stats(),
                "call_stats": call_stats,
            }
            env.log(log_steps, msg, stats)

    def log_call_stats():
        env.info("abduct_and_saturate_call_stats", call_stats)

    def all_canonical() -> Sequence[_EFact]:
        return [*candidates, *proved, *disproved, *redundant]

    def is_redundant(f: _EFact) -> dp.StreamContext[bool]:
        if f is None:
            return False
        call_stats.is_redundant_calls += 1
        start_time = time.time()
        respace = node.redundant([tracked[o] for o in proved], tracked[f])
        res = yield from respace.stream(env, policy).first()
        call_stats.is_redundant_time_in_seconds += time.time() - start_time
        if res is None:
            raise _Abort()
        return res.tracked.value

    def add_candidate(c: _EFact) -> dp.StreamContext[None]:
        # Take a new fact and put it into either `proved`, `disproved`,
        # `candidates` or `redundant`. If a canonical fact is passed,
        # nothing is done.
        if c in all_canonical():
            return
        # We first make a redundancy check
        if (yield from is_redundant(c)):
            dbg(f"Redundant: {c}")
            redundant.add(c)
            return
        # If not redundant, we try and prove it
        call_stats.prove_calls += 1
        start_time = time.time()
        facts_list = [(tracked[f], p) for f, p in proved.items()]
        pstream = node.prove(facts_list, tracked[c]).stream(env, policy)
        res = yield from pstream.first()
        call_stats.prove_time_in_seconds += time.time() - start_time
        if res is None:
            raise _Abort()
        status, payload = res.tracked[0], res.tracked[1]
        if status.value == "disproved":
            disproved.add(c)
            dbg(f"Disproved: {c}")
            if c is None:
                raise _Abort()
        elif status.value == "proved":
            proved[c] = payload
            dbg(f"Proved: {c}")
            if c is None:
                raise _ProofFound()
        else:
            candidates[c] = _CandInfo(payload, 0, 0)

    def propagate() -> dp.StreamContext[Literal["updated", "not_updated"]]:
        # Go through each candidate and see if it is now provable
        # assuming all established facts.
        dbg("Propagating...")
        old_candidates = candidates.copy()
        # Determining which candidates to reattempt
        M = max_reattempted_candidates_per_propagation_step
        if M is None:
            to_reattempt = old_candidates
            candidates.clear()
        else:
            to_reattempt_list = list(old_candidates.items())
            to_reattempt_list.sort(key=lambda x: -x[1].num_proposed)
            to_reattempt = dict(to_reattempt_list[:M])
            for c in to_reattempt:
                del candidates[c]
        for c, i in to_reattempt.items():
            yield from add_candidate(c)
            if c in candidates:
                # Restore the counters if `c` is still a candidate
                candidates[c].num_proposed = i.num_proposed
                candidates[c].num_visited = i.num_visited
        return (
            "updated"
            if len(candidates) != len(old_candidates)
            else "not_updated"
        )

    def saturate() -> dp.StreamContext[None]:
        # Propagate facts until saturation
        i = 0
        m = max_consecutive_propagation_steps
        while (m is None or i < m) and (yield from propagate()) == "updated":
            i += 1

    def get_canonical(f: _EFact) -> dp.StreamContext[_EFact]:
        # The result is guaranteed to be in `tracked`
        if f in proved or f in disproved or f in candidates:
            # Case where f is a canonical fact
            return f
        assert f is not None
        if f in equivalent:
            # Case where an equivalent canonical fact is known already
            nf = equivalent[f]
            assert nf in all_canonical()
            return equivalent[f]
        # New fact whose equivalence must be tested
        prev = [tracked[o] for o in all_canonical() if o is not None]
        if not prev:
            # First fact: no need to make equivalence call
            return f
        call_stats.search_equivalent_calls += 1
        start_time = time.time()
        eqspace = node.search_equivalent(prev, tracked[f])
        res = yield from eqspace.stream(env, policy).first()
        call_stats.search_equivalent_time_in_seconds += (
            time.time() - start_time
        )
        if res is None:
            raise _Abort()
        res = res.tracked
        if res.value is None:
            return f
        elif res.value in all_canonical():
            equivalent[f] = res.value
            return res.value
        else:
            env.error("invalid_equivalent_call")
            return f

    def get_raw_suggestions(c: _EFact) -> dp.StreamContext[Sequence[_EFact]]:
        assert c in candidates
        sstream = node.suggest(candidates[c].feedback).stream(env, policy)
        res = yield from sstream.all()
        if not res:
            # If no suggestions are returned, we are out of budget and
            # abort so as to not call this again in a loop.
            raise _Abort()
        tracked_suggs = [s for r in res for s in r.tracked]
        M = max_raw_suggestions_per_step
        if M is not None and len(tracked_suggs) > M:
            counts: dict[_Fact, int] = defaultdict(int)
            for s in tracked_suggs:
                counts[s.value] += 1
            tracked_suggs.sort(key=lambda x: counts[x.value], reverse=True)
            tracked_suggs = tracked_suggs[:M]
        # Populate the `tracked` cache (this is the only place where new
        # facts can be created and so the only place where `tracked`
        # must be updated).
        suggs = [s.value for s in tracked_suggs]
        dbg(f"Suggestions: {suggs}")
        for s, ts in zip(suggs, tracked_suggs):
            if s not in tracked:
                tracked[s] = ts
        return suggs

    def get_suggestions(c: _EFact) -> dp.StreamContext[dict[_EFact, int]]:
        # Return a dict representing a multiset of suggestions
        assert c in candidates
        raw_suggs = yield from get_raw_suggestions(c)
        suggs: list[_EFact] = []
        for s in raw_suggs:
            suggs.append((yield from get_canonical(s)))
        len_proved_old = len(proved)
        for s in suggs:
            yield from add_candidate(s)
        if len_proved_old != len(proved):
            assert len(proved) > len_proved_old
            yield from saturate()
        suggs = [s for s in suggs if s in candidates]
        suggs_multiset: dict[_EFact, int] = {}
        for s in suggs:
            if s not in suggs_multiset:
                suggs_multiset[s] = 0
            suggs_multiset[s] += 1
        dbg(f"Filtered: {suggs_multiset}")
        return suggs_multiset

    try:
        yield from add_candidate(None)
        while True:
            cur: _EFact = None
            for _ in range(max_rollout_depth):
                dbg(f"Explore fact: {cur}")
                suggs = yield from get_suggestions(cur)
                if not suggs or cur in proved:
                    break
                n = sum(suggs.values())
                for s, k in suggs.items():
                    candidates[s].num_proposed += k / n
                infos = [candidates[c] for c in suggs]
                best = _argmax(
                    scoring_function(i.num_proposed, i.num_visited)
                    for i in infos
                )
                cur = list(suggs.keys())[best]
                candidates[cur].num_visited += 1
    except _Abort:
        log_call_stats()
        return
    except _ProofFound:
        log_call_stats()
        action = proved[None]
        child = tree.child(action)
        assert isinstance(child.node, dp.Success)
        yield dp.Solution(child.node.success)
        return

ScoringFunction

Bases: Protocol

A function for assigning a score to candidate facts to prove, so that the fact with the highest score is chosen next.
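
A sketch of a custom scoring function (an arbitrary heuristic, not the library default):

```python
def frequency_biased(num_proposed: float, num_visited: float) -> float:
    # Favor facts that are frequently suggested but rarely visited.
    return num_proposed / (1.0 + num_visited)
```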

Source code in src/delphyne/stdlib/search/abduction.py
class ScoringFunction(Protocol):
    """
    A function for assigning a score to candidate facts to prove, so
    that the fact with the highest score is chosen next.
    """

    def __call__(self, num_proposed: float, num_visited: float) -> float:
        """
        Arguments:
            num_proposed: Normalized number of times the fact was
                proposed by the `suggest` function. When the latter
                returns `n` suggestions, each suggestion's count is
                increased by `1/n`.
            num_visited: Number of times the fact was chosen as target
                in one step of a rollout.
        """
        ...

__call__

__call__(num_proposed: float, num_visited: float) -> float

Parameters:

Name Type Description Default
num_proposed float

Normalized number of times the fact was proposed by the suggest function. When the latter returns n suggestions, each suggestion's count is increased by 1/n.

required
num_visited float

Number of times the fact was chosen as target in one step of a rollout.

required
Source code in src/delphyne/stdlib/search/abduction.py
def __call__(self, num_proposed: float, num_visited: float) -> float:
    """
    Arguments:
        num_proposed: Normalized number of times the fact was
            proposed by the `suggest` function. When the latter
            returns `n` suggestions, each suggestion's count is
            increased by `1/n`.
        num_visited: Number of times the fact was chosen as target
            in one step of a rollout.
    """
    ...

_default_scoring_function

_default_scoring_function(num_proposed: float, num_visited: float) -> float

The default scoring function for fact candidates.

See ScoringFunction for details.

Source code in src/delphyne/stdlib/search/abduction.py
def _default_scoring_function(
    num_proposed: float, num_visited: float
) -> float:
    """
    The default scoring function for fact candidates.

    See `ScoringFunction` for details.
    """
    return -(num_visited / max(1, math.sqrt(num_proposed)))