Evaluating Demonstrations
Evaluation Feedback
DemoFeedback
DemoFeedback: StrategyDemoFeedback | QueryDemoFeedback
Feedback sent by the server for each demonstration in a file.
QueryDemoFeedback
dataclass
Feedback sent by the server for a standalone query demonstration.
Attributes:
Name | Type | Description |
---|---|---|
kind | Literal['query'] | Always "query". |
diagnostics | list[Diagnostic] | Global diagnostics. |
answer_diagnostics | list[tuple[int, Diagnostic]] | Diagnostics attached to specific answers. |
Source code in src/delphyne/analysis/feedback.py
StrategyDemoFeedback
dataclass
Feedback sent by the server for each strategy demonstration.
Attributes:
Name | Type | Description |
---|---|---|
kind | Literal['strategy'] | Always "strategy". |
trace | Trace | The resulting browsable trace, which includes all visited nodes. |
answer_refs | dict[TraceAnswerId, DemoAnswerId] | A mapping from answer ids featured in the trace to the position of the corresponding answer in the demonstration. This mapping may be partial. For example, using value hints (e.g., |
saved_nodes | dict[str, TraceNodeId] | Nodes saved using the |
test_feedback | list[TestFeedback] | Feedback for each test in the demonstration. |
global_diagnostics | list[Diagnostic] | Diagnostics that apply to the whole demonstration (individual tests have their own diagnostics). |
query_diagnostics | list[tuple[DemoQueryId, Diagnostic]] | Diagnostics attached to specific queries. |
answer_diagnostics | list[tuple[DemoAnswerId, Diagnostic]] | Diagnostics attached to specific answers. |
implicit_answers | list[ImplicitAnswer] | Implicit answers that were generated on the fly and that can be explicitly added to the demonstration. |
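The diagnostics of a strategy demonstration are spread over several fields. As a minimal sketch, the following counts them by severity. It does not import delphyne: StubStrategyFeedback and StubTestFeedback are hypothetical stand-ins that only mirror the field names and types listed above, and count_by_severity is an illustrative helper, not part of the library.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

# Local aliases mirroring the documented types (not imported from delphyne).
DiagnosticType = Literal["error", "warning", "info"]
Diagnostic = tuple[DiagnosticType, str]


@dataclass
class StubTestFeedback:
    """Hypothetical stand-in for TestFeedback."""

    diagnostics: list[Diagnostic]
    node_id: Optional[int] = None


@dataclass
class StubStrategyFeedback:
    """Hypothetical stand-in exposing only the diagnostic-related fields."""

    global_diagnostics: list[Diagnostic] = field(default_factory=list)
    query_diagnostics: list[tuple[int, Diagnostic]] = field(default_factory=list)
    answer_diagnostics: list[tuple[tuple[int, int], Diagnostic]] = field(
        default_factory=list
    )
    test_feedback: list[StubTestFeedback] = field(default_factory=list)


def all_diagnostics(fb: StubStrategyFeedback) -> list[Diagnostic]:
    """Flatten global, per-query, per-answer and per-test diagnostics."""
    out = list(fb.global_diagnostics)
    out += [diag for _, diag in fb.query_diagnostics]
    out += [diag for _, diag in fb.answer_diagnostics]
    for test in fb.test_feedback:
        out += test.diagnostics
    return out


def count_by_severity(fb: StubStrategyFeedback) -> dict[str, int]:
    """Count diagnostics for each of the three documented severities."""
    counts = {"error": 0, "warning": 0, "info": 0}
    for severity, _message in all_diagnostics(fb):
        counts[severity] += 1
    return counts
```

Since the stand-ins use the same field names as the documented dataclasses, the same traversal should apply to a real feedback object, but this has not been checked against the library.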
Source code in src/delphyne/analysis/feedback.py
TestFeedback
dataclass
Feedback returned by the demo interpreter for a single test.
The test is considered successful if no diagnostic is a warning or an error. Most of the time, and even when unsuccessful, a test stops at a given node, which can be inspected in the UI and which is indicated in the node_id field.
Attributes:
Name | Type | Description |
---|---|---|
diagnostics | list[Diagnostic] | List of diagnostics for the test. |
node_id | TraceNodeId \| None | Identifier of the node where the test stopped. |
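The success criterion stated above (no diagnostic is a warning or an error) can be sketched as follows. The helper name is_successful is hypothetical, and the type aliases are redefined locally instead of being imported from delphyne.

```python
from typing import Literal

# Local aliases mirroring the documented Diagnostic types.
DiagnosticType = Literal["error", "warning", "info"]
Diagnostic = tuple[DiagnosticType, str]


def is_successful(diagnostics: list[Diagnostic]) -> bool:
    """Return True if no diagnostic is a warning or an error."""
    return not any(kind in ("warning", "error") for kind, _msg in diagnostics)
```

Under this reading, a test with no diagnostics at all, or with only "info" diagnostics, counts as successful.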
Source code in src/delphyne/analysis/feedback.py
ImplicitAnswer
dataclass
An implicit answer that is not part of the demonstration but was generated on the fly.
The VSCode extension then offers to add such answers explicitly to the demonstration. This is particularly useful for handling Compute nodes in demonstrations.
Attributes:
Name | Type | Description |
---|---|---|
query_name | str | Query name. |
query_args | dict[str, object] | Arguments passed to the query. |
answer | str | The implicit answer value, as a raw string (mode |
Source code in src/delphyne/analysis/feedback.py
DemoAnswerId
DemoAnswerId: tuple[int, int]
A (query_id, answer_index) pair that identifies an answer in a demo.
DemoQueryId
DemoQueryId: int
Index of the query in the queries section of a demo.
Diagnostic
Diagnostic: tuple[DiagnosticType, str]
A diagnostic gathers a type (i.e. severity) and a message.
DiagnosticType
DiagnosticType: Literal['error', 'warning', 'info']
Diagnostic type.
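To illustrate how these aliases fit together, here is a small sketch that groups answer diagnostics by the query they belong to, using the (query_id, answer_index) structure of DemoAnswerId. The aliases are redefined locally and group_by_query is a hypothetical helper, not a library function.

```python
from collections import defaultdict
from typing import Literal

# Local aliases mirroring the documented ones.
DemoQueryId = int
DemoAnswerId = tuple[int, int]  # (query_id, answer_index)
DiagnosticType = Literal["error", "warning", "info"]
Diagnostic = tuple[DiagnosticType, str]


def group_by_query(
    answer_diagnostics: list[tuple[DemoAnswerId, Diagnostic]],
) -> dict[DemoQueryId, list[tuple[int, Diagnostic]]]:
    """Map each query id to its (answer_index, diagnostic) pairs."""
    grouped: dict[int, list[tuple[int, Diagnostic]]] = defaultdict(list)
    for (query_id, answer_index), diagnostic in answer_diagnostics:
        grouped[query_id].append((answer_index, diagnostic))
    return dict(grouped)
```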
Browsable Traces
Trace
dataclass
A browsable trace.
Raw traces contain all the information necessary to recompute a trace but are not easily manipulated by tools. In comparison, browsable traces offer a more redundant but also more explicit view. This module provides a way to convert a trace from the former format to the latter.
Attributes:
Name | Type | Description |
---|---|---|
nodes | dict[TraceNodeId, Node] | A mapping from node ids to their description. |
Info
A browsable trace features answer identifiers, for which a meaning must be provided externally. For example, the demonstration interpreter also produces a mapping from answer ids to their position in the demonstration file. Likewise, commands like run_strategy return a raw trace (core.traces.Trace) alongside the browsable version, and this raw trace maps answer ids to their actual content.
Source code in src/delphyne/analysis/feedback.py
Node
dataclass
Information about a node.
Attributes:
Name | Type | Description |
---|---|---|
kind | str | Name of the node type, or |
success_value | ValueRepr \| None | The success value if the node is a success leaf, or |
summary_message | str \| None | A short summary message (see the |
leaf_node | bool | Whether the node is a leaf node. |
label | str \| None | A label describing the node, which can be useful for writing node selectors (although there is currently no guarantee that the label constitutes a valid selector leading to the node). Currently, the label shows all node tags, separated by "&". |
tags | list[str] | The list of all tags attached to the node. |
properties | list[tuple[Reference, NodeProperty]] | List of node properties (attached queries, nested trees, data fields...). Each property is accompanied by a pretty-printed, local space reference. |
actions | list[Action] | A list of explored actions. |
origin | NodeOrigin | The origin of the node in the global trace. |
Source code in src/delphyne/analysis/feedback.py
NodeOrigin
NodeOrigin: (
Literal["root"]
| tuple[Literal["child"], TraceNodeId, TraceActionId]
| tuple[Literal["nested"], TraceNodeId, TraceNodePropertyId]
)
Origin of a node.
A node can be the global root, the child of another node, or the root of a nested tree.
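Because every node records its origin, one can walk from any node back to the global root. The sketch below does so on a hypothetical StubNode that only carries an origin field. It assumes that, for both "child" and "nested" origins, the second tuple component is the id of the node the current node was reached from; path_to_root is an illustrative name, not a library function.

```python
from dataclasses import dataclass
from typing import Literal, Union

# Local aliases mirroring the documented types.
TraceNodeId = int
NodeOrigin = Union[
    Literal["root"],
    tuple[Literal["child"], int, int],   # (tag, parent node id, action id)
    tuple[Literal["nested"], int, int],  # (tag, owner node id, property id)
]


@dataclass
class StubNode:
    """Hypothetical stand-in for Node, keeping only the origin field."""

    origin: NodeOrigin


def path_to_root(nodes: dict[TraceNodeId, StubNode], node_id: TraceNodeId) -> list[TraceNodeId]:
    """Return the node ids from node_id up to the global root, inclusive."""
    path = [node_id]
    while True:
        origin = nodes[path[-1]].origin
        if origin == "root":
            return path
        # Both "child" and "nested" origins store the originating node second.
        _tag, previous_id, _index = origin
        path.append(previous_id)
```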
Action
dataclass
An action associated with a node.
Attributes:
Name | Type | Description |
---|---|---|
ref | Reference | Pretty-printed local reference for the action. |
hints | list[str] \| None | If the trace results from executing a demonstration, this provides the list of hints that can be used to recover the action through navigation. Otherwise, it is |
related_success_nodes | list[TraceNodeId] | List of related success nodes. A related success node is a node whose attached value was used in building the action. Indeed, in the VSCode extension's Path View, we get a sequence of actions and for each of them the list of success paths that were involved in building that action. |
related_answers | list[TraceAnswerId] | List of related answers. A related answer is an answer to a local query that is used in building the action. Storing this information is useful to detect useless answers that are not used in any action. |
destination | TraceNodeId | Id of the child node that the action leads to. |
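The related_answers field makes it possible to spot answers that no action uses. The following sketch illustrates one way to do this on simplified, hypothetical stand-ins (StubAnswer, StubQuery, StubAction, StubNode) that only mirror the relevant documented fields. The function unused_answers is an assumption for illustration and does not come from the library.

```python
from dataclasses import dataclass, field


@dataclass
class StubAnswer:
    id: int  # plays the role of a TraceAnswerId


@dataclass
class StubQuery:
    answers: list[StubAnswer]
    kind: str = "query"


@dataclass
class StubAction:
    related_answers: list[int]


@dataclass
class StubNode:
    # Each property is paired with a reference (a plain string here).
    properties: list[tuple[str, object]] = field(default_factory=list)
    actions: list[StubAction] = field(default_factory=list)


def unused_answers(nodes: dict[int, StubNode]) -> set[int]:
    """Return ids of answers present in the trace but used by no action."""
    present: set[int] = set()
    used: set[int] = set()
    for node in nodes.values():
        for _ref, prop in node.properties:
            # Only query properties carry answers.
            if getattr(prop, "kind", None) == "query":
                present.update(answer.id for answer in prop.answers)
        for action in node.actions:
            used.update(action.related_answers)
    return present - used
```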
Source code in src/delphyne/analysis/feedback.py
NodeProperty
NodeProperty: Data | NestedTree | Query
Description of a node property (see NodePropertyId).
Data
dataclass
Generic property that displays some data.
Attributes:
Name | Type | Description |
---|---|---|
kind | Literal['data'] | Always "data". |
content | str | String representation of the data content. |
Source code in src/delphyne/analysis/feedback.py
NestedTree
dataclass
A nested tree.
Attributes:
Name | Type | Description |
---|---|---|
kind | Literal['nested'] | Always "nested". |
strategy | str | Name of the strategy function that induces the tree. |
args | dict[str, ValueRepr] | Arguments passed to the strategy function. |
tags | list[str] | Tags attached to the space induced by the tree. |
node_id | TraceNodeId \| None | Identifier of the root node of the nested tree, or |
Source code in src/delphyne/analysis/feedback.py
Query
dataclass
Information about a query.
Attributes:
Name | Type | Description |
---|---|---|
kind | Literal['query'] | Always "query". |
name | str | Name of the query. |
args | dict[str, object] | Query arguments, serialized in JSON. |
tags | list[str] | Tags attached to the space induced by the query. |
answers | list[Answer] | All answers to the query present in the trace. |
Source code in src/delphyne/analysis/feedback.py
Answer
dataclass
An answer to a query.
Attributes:
Name | Type | Description |
---|---|---|
id | TraceAnswerId | Unique answer identifier. |
value | ValueRepr | Parsed answer value. |
hint | tuple[] \| tuple[str] \| None | If the trace results from executing a demonstration (vs running a policy with tracing enabled), then |
Source code in src/delphyne/analysis/feedback.py
Reference
dataclass
A reference to a space or to a value.
Several human-readable representations are provided:
Attributes:
Name | Type | Description |
---|---|---|
with_ids | str | A pretty-printed, id-based reference. |
with_hints | str \| None | A pretty-printed, hint-based reference. These are typically available in the output of the demonstration interpreter, but not when converting arbitrary traces that result from running policies. |
Source code in src/delphyne/analysis/feedback.py
ValueRepr
dataclass
Multiple representations for a Python object.
We allow providing several representations for Python objects: short, one-liner string descriptions, detailed descriptions, JSON representation... All of these can be leveraged by different tools and UI components.
Attributes:
Name | Type | Description |
---|---|---|
short | str | A short representation, typically obtained using the |
long | str \| None | A longer, often multi-line representation, typically obtained using the |
json | object | A JSON representation of the object. |
json_provided | bool | Whether a JSON representation is provided (the JSON field is |
Source code in src/delphyne/analysis/feedback.py
TraceAnswerId
TraceAnswerId: int
Global answer id, as set by core.traces.Trace.
TraceActionId
TraceActionId: int
Index of an action within a given node.
TraceNodePropertyId
TraceNodePropertyId: int
Index of a property within a given node. A property is an element that can be listed in the UI, which is either an attached query, a nested tree or some data.
Demonstration Interpreter
evaluate_demo
evaluate_demo(
demo: Demo, context: DemoExecutionContext, extra_objects: dict[str, object]
) -> DemoFeedback
Evaluate a query or strategy demonstration.
This is the main entrypoint of the demonstration interpreter.
Attributes:
Name | Type | Description |
---|---|---|
demo | Demo | The demonstration to evaluate. |
context | DemoExecutionContext | The execution context in which to resolve Python identifiers. |
extra_objects | dict[str, object] | Additional objects that can be resolved by name (with higher precedence). |
Returns:
Type | Description |
---|---|
DemoFeedback | A feedback object containing the results of the evaluation. |
Warning
This function creates an ObjectLoader internally and is therefore not thread-safe.
Source code in src/delphyne/analysis/demo_interpreter.py
DemoExecutionContext
dataclass
Demonstration Execution Context.
Attributes:
Name | Type | Description |
---|---|---|
strategy_dirs | Sequence[Path] | A list of directories in which strategy modules can be found, to be added to |
modules | Sequence[str] | A list of modules in which Python object identifiers should be resolved. Modules can be part of packages and so their name may feature |
Source code in src/delphyne/analysis/demo_interpreter.py
ObjectLoader
Utility class for loading Python objects.
Demonstration and command files may refer to Python identifiers that need to be resolved. This is done relative to an execution context (DemoExecutionContext) that specifies a list of directories to be added to sys.path, along with a list of modules.
An exception is raised if an object with the requested identifier can be found in several modules.
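The general mechanism described here (extending sys.path, then importing and optionally reloading a list of modules) can be sketched with the standard library alone. This is not the ObjectLoader implementation: load_modules is a hypothetical function whose parameters merely echo the strategy_dirs, modules and reload names used in this page.

```python
import importlib
import sys
from pathlib import Path
from types import ModuleType
from typing import Sequence


def load_modules(
    strategy_dirs: Sequence[Path],
    modules: Sequence[str],
    reload: bool = True,
) -> list[ModuleType]:
    """Import the given modules after adding the directories to sys.path."""
    for directory in strategy_dirs:
        if str(directory) not in sys.path:
            sys.path.insert(0, str(directory))
    # Make sure files created after interpreter start-up are visible.
    importlib.invalidate_caches()
    loaded: list[ModuleType] = []
    for name in modules:
        already_imported = name in sys.modules
        module = importlib.import_module(name)
        if reload and already_imported:
            # Pick up edits made since the module was first imported.
            module = importlib.reload(module)
        loaded.append(module)
    return loaded
```

In this sketch, a missing module surfaces as the standard ModuleNotFoundError raised by importlib. Mutating sys.path and reloading modules are process-wide side effects, which is one plausible reason why code relying on this kind of loading is not thread-safe.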
Source code in src/delphyne/analysis/demo_interpreter.py
__init__
__init__(
ctx: DemoExecutionContext,
extra_objects: dict[str, object] | None = None,
reload: bool = True,
)
Attributes:
Name | Type | Description |
---|---|---|
ctx | DemoExecutionContext | The execution context in which to resolve Python identifiers. |
extra_objects | dict[str, object] \| None | Additional objects that can be resolved by name (with higher precedence). |
reload | bool | Whether to reload all modules specified in the execution context upon initialization. Setting this value to |
Raises:
Type | Description |
---|---|
ModuleNotFound | A module could not be found. |
Source code in src/delphyne/analysis/demo_interpreter.py
find_object
find_object(name: str) -> Any
Find an object with a given name.
If the name is unqualified (it features no .), the object is looked up in every registered module, in order. If the name is qualified, it is looked up in the specified registered module.
Raises:
Type | Description |
---|---|
ObjectNotFound | The object could not be found. |
AmbiguousObjectIdentifier | The object name is ambiguous, i.e. it is found in several modules. |
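The lookup rules described for find_object can be illustrated with a standalone sketch. It is an approximation written for this page, not the library's code: resolve_name is a hypothetical function, the two exception classes are local stand-ins named after the documented ones, and treating extra_objects as checked first is an assumption based on the "higher precedence" wording.

```python
import cmath
import math
from types import ModuleType
from typing import Any, Optional


class ObjectNotFound(Exception):
    """Local stand-in for the documented exception."""


class AmbiguousObjectIdentifier(Exception):
    """Local stand-in for the documented exception."""


def resolve_name(
    modules: list[ModuleType],
    name: str,
    extra_objects: Optional[dict[str, object]] = None,
) -> Any:
    """Resolve a possibly qualified name against registered modules."""
    # Assumption: extra objects take precedence over module contents.
    if extra_objects and name in extra_objects:
        return extra_objects[name]
    if "." in name:
        # Qualified name: only look at the named registered module.
        module_name, _, attr = name.rpartition(".")
        for module in modules:
            if module.__name__ == module_name and hasattr(module, attr):
                return getattr(module, attr)
        raise ObjectNotFound(name)
    # Unqualified name: look in every registered module, in order.
    matches = [getattr(m, name) for m in modules if hasattr(m, name)]
    if not matches:
        raise ObjectNotFound(name)
    if len(matches) > 1:
        raise AmbiguousObjectIdentifier(name)
    return matches[0]
```

For example, with math and cmath registered, "floor" resolves to math.floor, "sqrt" is ambiguous because both modules define it, and "cmath.sqrt" disambiguates the lookup.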
Source code in src/delphyne/analysis/demo_interpreter.py
load_and_call_function
load_and_call_function(name: str, args: dict[str, Any]) -> Any
Load and call a function by wrapping a call to find_object.
Source code in src/delphyne/analysis/demo_interpreter.py
load_strategy_instance
load_strategy_instance(name: str, args: dict[str, Any]) -> StrategyComp[Any, Any, Any]
Load and instantiate a strategy function with given arguments.
Raises:
Type | Description |
---|---|
ObjectNotFound | If the strategy function cannot be found. |
AmbiguousObjectIdentifier | If an ambiguous name is given. |
StrategyLoadingError | If the object is not a strategy function or if the arguments are invalid. |
Source code in src/delphyne/analysis/demo_interpreter.py
load_query
load_query(name: str, args: dict[str, Any]) -> AbstractQuery[Any]
Load a query by name and instantiate it with given arguments.
Raises:
Type | Description |
---|---|
ObjectNotFound | If the query cannot be found. |
AmbiguousObjectIdentifier | If an ambiguous name is given. |
AssertionError | If the object is not a query. |
Source code in src/delphyne/analysis/demo_interpreter.py