The Strategy Language
Following the previous Overview chapter, we now provide more details on Delphyne's strategy language. We provide a quick overview of the most useful techniques and concepts and refer to the Reference for more details and explanations (follow the hypertext links).
Defining Strategies
A strategy is a program with unresolved choice points, which can be reified into a search tree. Delphyne allows writing strategies as Python generators that yield internal tree nodes and receive associated actions in return. The Strategy
type has three type parameters, corresponding to the strategy's signature (i.e., the type of nodes it can emit), its associated inner policy type and its return type. Strategy functions are typically defined via the strategy
decorator, which creates functions that return StrategyInstance
values, wrapping the underlying generator while adding some metadata and convenience methods (e.g., using
).
@strategy # (1)!
def find_param_value(
expr: str) -> Strategy[Branch | Fail, FindParamValueIP, int]: ...
- Example from the previous chapter.
Query functions can have arguments of arbitrary type (including functions). When launching strategies from commands and in the presence of type annotations, arguments are automatically unserialized using Pydantic. Thus, it is useful for top-level strategies to have serializable argument types that are properly annotated.
Branching nodes can be yielded via the branch
function:
def branch[P, T](cands: Opaque[P, T]) -> Strategy[Branch, P, T]: ...
Crucially, branch
does not take a query as an argument but an opaque space (Opaque
). An opaque space can be seen as a mapping from the ambient inner policy to an iterator of values (more precisely, a search stream). Opaque spaces can be produced from queries or strategy instances, by mapping the ambient inner policy to a prompting policy or a policy respectively:
class Query[T]:
def using[Pout](self,
get_policy: Callable[[Pout], PromptingPolicy]) -> Opaque[Pout, T]: ...
class StrategyInstance[N: Node, P, T]:
def using(self,
get_policy: Callable[[Pout], Policy[N, P]]) -> Opaque[Pout, T]: ...
Importantly, search policies such as dfs
are unaware of whether an opaque space originates from a query or a strategy. This guarantees modularity and allows queries to be transparently refined into dedicated strategies whenever more guidance is desired. Opaque spaces also allow queries with different signatures (i.e., yielding different kinds of tree nodes) to be composed, while being associated independent search policies.
Strategy Inlining
It is also possible to inline a strategy call within another strategy, provided that both strategies share the same signature and inner policy type. This can be done using the inline
method, which unwraps a StrategyInstance
value and gives access to the underlying generator.
For an example of a strategy that branches over results of a sub-strategy, see the prove_program_via_abduction_and_branching
strategy from the find_invariants
example. In particular, see how doing so results in nested inner policy records.
Source for examples/find_invariants/abduct_and_branch.py
"""
A simple Delphyne strategy to discover loop invariants with Why3.
"""
from collections.abc import Callable, Sequence
from dataclasses import dataclass
import delphyne as dp
from delphyne import Branch, Compute, Fail, Strategy, Value, strategy
import why3_utils as why3
# fmt: off
#####
##### Inner Policy Types
#####
@dataclass
class ProposeInvsIP:
propose: dp.PromptingPolicy
novel: dp.PromptingPolicy
@dataclass
class ProveProgIP:
propose: "dp.Policy[Branch | Fail, ProposeInvsIP]"
eval: dp.PromptingPolicy
quantify_eval: "Callable[[ProofStateMetrics], float] | None"
#####
##### Type Definitions
#####
@dataclass
class ProofStateMetrics:
has_redundant_invs: bool
type Proposal = Sequence[why3.Formula]
type Blacklist = Sequence[Proposal]
#####
##### Strategies
#####
@strategy
def prove_program_via_abduction_and_branching(
prog: why3.File,
) -> Strategy[Branch | Value | Fail | Compute, ProveProgIP, why3.File]:
annotated: why3.File = prog
while True:
feedback = yield from dp.compute(why3.check, prog, annotated)
yield from dp.ensure(feedback.error is None, "invalid program")
if feedback.success:
return annotated
remaining = [o for o in feedback.obligations if not o.proved]
yield from dp.ensure(
len(remaining) == 1, "too many remaining obligations")
unproved = remaining[0]
yield from dp.ensure(
not why3.invariant_init_obligation(unproved), "init")
if annotated != prog: # if invariant annotations are present
yield from dp.value(
EvaluateProofState(unproved)
.using(lambda p: p.eval, ProveProgIP),
lambda p: p.quantify_eval)
new_invariants = yield from dp.branch(
dp.iterate(
lambda prior:
propose_invariants(unproved, prior)
.using(lambda p: p.propose, ProveProgIP)))
annotated = why3.add_invariants(annotated, new_invariants)
@strategy
def propose_invariants(
obligation: why3.Obligation,
blacklist: Sequence[Proposal] | None,
) -> Strategy[Branch | Fail, ProposeInvsIP, tuple[Proposal, Blacklist]]:
if blacklist is None:
blacklist = []
proposal = yield from dp.branch(
ProposeInvariants(obligation, blacklist)
.using(lambda p: p.propose, ProposeInvsIP))
sanity_check = all(why3.no_invalid_formula_symbol(inv) for inv in proposal)
yield from dp.ensure(sanity_check, "sanity check failed")
if blacklist:
novel = yield from dp.branch(
IsProposalNovel(proposal, blacklist)
.using(lambda p: p.novel, ProposeInvsIP))
yield from dp.ensure(novel, "proposal is not novel")
return proposal, [*blacklist, proposal]
#####
##### Queries
#####
@dataclass
class ProposeInvariants(dp.Query[Proposal]):
unproved: why3.Obligation
blacklist: Sequence[Proposal]
__parser__ = dp.last_code_block.yaml
@dataclass
class IsProposalNovel(dp.Query[bool]):
proposal: Proposal
blacklist: Sequence[Proposal]
__parser__ = dp.last_code_block.yaml
@dataclass
class EvaluateProofState(dp.Query[ProofStateMetrics]):
unproved: why3.Obligation
__parser__ = dp.last_code_block.yaml
#####
##### Policies
#####
def prove_program_via_abduction_and_branching_policy(
fancy_model: str = "gpt-4o",
base_model: str = "gpt-4o",
max_depth: int = 2,
max_requests_per_proposal: int = 5,
root_proposal_penalty: float = 0.7,
nonroot_proposal_penalty: float = 1.0,
max_nonroot_proposals: int = 3,
enable_state_evaluation: bool = False,
min_value: float = 0.3,
) -> dp.Policy[Branch | Value | Fail | Compute, ProveProgIP]:
def compute_value(metrics: ProofStateMetrics) -> float:
prob = 1
if metrics.has_redundant_invs:
prob = 0
return max(min_value, prob)
def child_confidence_prior(depth: int, prev_gen: int) -> float:
if depth >= 1:
if prev_gen >= max_nonroot_proposals:
return 0
return nonroot_proposal_penalty ** prev_gen
return root_proposal_penalty ** prev_gen
n = dp.NUM_REQUESTS
fancy = dp.openai_model(fancy_model)
base = dp.openai_model(base_model)
propose_ip = ProposeInvsIP(
propose=dp.few_shot(fancy),
novel=dp.take(1) @ dp.few_shot(base),
)
proposal_limit = dp.BudgetLimit({n: max_requests_per_proposal})
prove_prog_ip = ProveProgIP(
propose=dp.with_budget(proposal_limit) @ dp.dfs() & propose_ip,
eval=dp.take(1) @ dp.few_shot(base),
quantify_eval=compute_value if enable_state_evaluation else None,
)
bestfs = dp.best_first_search(
child_confidence_prior=child_confidence_prior,
max_depth=max_depth)
return bestfs @ dp.elim_compute() & prove_prog_ip
Queries
New query types can be defined by subclassing the Query
class. We refer to the associated API documentation for details. Some highlights:
- Queries can be associated system prompts and instance prompts. Prompts can be defined inline or in separate Jinja files (see
find_invariants
example). - Query types can have fields of any type as long as they can be serialized and unserialized using Pydantic (this includes custom dataclasses).
- Prompts can be parameterized, and parameters instantiated on the policy side (e.g.,
params
argument offew_shot
). This is useful for testing prompt variations, specializing specific prompt fragments for particular, etc... - It is possible to define several answer modes for a query, each mode being associated a distinct parser (see
tests/example_strategies.py:GetFavoriteDish
for an example). - Queries support structured output and tool calls.
Trees and Reification
Strategies can be reified (i.e. compiled) into trees using the reify
function (see reference documentation for details and caveats). The Tree
class is defined as follows:
@dataclass(frozen=True)
class Tree[N: Node, P, T]:
node: N | Success[T]
child: Callable[[Value], Tree[N, P, T]]
ref: GlobalNodePath # (1)!
- To ignore on first reading. See documentation on references.
A tree contains either a node (Node
) or a success leaf (Node
). When applicable, children trees are indexed by actions (Value
). Actions result from combining elements of local spaces (Space
). For example, if node
has type Branch
, then an action corresponds to a branching candidate.
Adding New Effects
New types of effects beyond Branch
and Fail
can be added easily, by subclassing Node
. Here are a number of additional effects defined in the Delphyne standard library:
Join
: allows evaluating a sequence of independent sub-strategies, possibly in parallel.Compute
: allows performing expensive and possibly non-replicable/nondeterministic computations in strategies (see details in How-To Guide).Value
: allows adding value information into strategy trees. Such information can be leveraged by search policies (e.g.best_first_search
).Flag
: allows providing a finite number of alternative implementations for sub-tasks, to be selected either offline or at runtime.Message
: allows decorating trees with debugging messages.
Node types are dataclasses whose fields can be of several kinds:
- Nonparametric local spaces (
Space
), the main types of which are opaque spaces (OpaqueSpace
) and embedded trees (EmbeddedTree
). - Parametric local spaces, which are functions from local values (i.e. assembly of elements from local spaces) to local spaces.
- Data fields that contain policy metadata, debugging information, etc...
More details are available in the API Reference. For examples of defining new effects, you can refer to the source code of the aforementioned effects in the Delphyne standard library (in the delphyne.stdlib.nodes
module).