Getting Started: Advanced Tools

Evaluating Invariants and Preconditions

The RDDLEnv provides built-in logic to validate state and action constraints automatically according to their RDDL descriptions, and by default it raises an error upon violation. However, in some applications you may wish to evaluate constraints yourself.

Constraint checking functions are implemented in the sampler field of a RDDLEnv:

import pyRDDLGym
env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
backend = env.sampler

One common use case is to validate actions before executing them in the environment (perhaps reverting to a default action if invalid):

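def validated_actions(backend, actions, default_actions):
    # silent=True returns False on violation instead of raising an exception
    if backend.check_action_preconditions(actions, silent=True):
        return actions
    return default_actions   # upon violation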

Below are the commonly used functions of sampler that can be accessed directly:

Commonly-used functions accessible in `sampler`
syntax	description
`check_state_invariants(silent)`	returns a bool indicating if all state invariants are satisfied (if `silent` is true, no exception is raised on violation)
`check_action_preconditions(actions, silent)`	returns a bool indicating if all action preconditions are satisfied (if `silent` is true, no exception is raised on violation)
`check_terminal_states()`	returns a bool indicating if any termination condition is satisfied

Inspecting the Model

The pyRDDLGym compiler provides a convenient API for querying a variety of properties about RDDL constructs in a domain. These can be accessed through the model field of a RDDLEnv:

import pyRDDLGym
env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
model = env.model

Below are some commonly used fields of model that can be accessed directly:

Commonly-used properties accessible in `model`
syntax	description
`horizon`	horizon as defined in the instance
`discount`	discount factor as defined in the instance
`max_allowed_actions`	`max-nondef-actions` as defined in the instance
`variable_types`	dict of pvariable types (e.g. non-fluent, …) for each variable
`variable_ranges`	dict of pvariable ranges (e.g. real, …) for each variable
`variable_params`	dict of parameters and their types for each variable
`type_to_objects`	dict of all defined objects for each type
`non_fluents`	dict of initial values for each non-fluent
`state_fluents`	dict of initial values for each state-fluent
`action_fluents`	dict of default values for each action-fluent
`interm_fluents`	dict of initial values for each interm-fluent
`observ_fluents`	dict of initial values for each observ-fluent
`cpfs`	dict of `Expression` objects for each cpf
`reward`	`Expression` object for reward function
`preconditions`	list of `Expression` objects for each action-precondition
`invariants`	list of `Expression` objects for each state-invariant

Expression objects are abstract syntax trees that describe the flow of computations in each cpf, constraint relation, or the reward function of the RDDL domain:

- the etype() function provides basic information about the expression, such as its type
- the args() function provides its sub-expressions, which consist of other Expression objects, aggregation variables, or other information required by the engine.
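As a rough illustration of how such a tree might be explored, the sketch below walks an expression recursively and counts each etype it encounters. The Node class is a hypothetical stand-in so that the example runs without pyRDDLGym, and the tuple-valued etypes are made up for the demo. The helper _get tolerates etype and args being exposed either as methods or as plain attributes, which is an assumption rather than something this page confirms.

```python
from collections import Counter


class Node:
    """Hypothetical stand-in for an Expression (not part of pyRDDLGym)."""

    def __init__(self, etype, args=()):
        self._etype = etype
        self._args = tuple(args)

    def etype(self):
        return self._etype

    def args(self):
        return self._args


def _get(obj, name):
    """Read obj.name, calling it if it is a method (assumption: either form may occur)."""
    value = getattr(obj, name)
    return value() if callable(value) else value


def count_etypes(expr, counts=None):
    """Recursively count the etype of every sub-expression reachable from expr."""
    counts = Counter() if counts is None else counts
    counts[_get(expr, "etype")] += 1
    for arg in _get(expr, "args"):
        # args may also hold non-expression items (names, constants); skip those
        if hasattr(arg, "etype") and hasattr(arg, "args"):
            count_etypes(arg, counts)
    return counts


# "some-var + 1.0" written as a tiny hand-built tree with made-up etypes
tree = Node(("arithmetic", "+"), [Node(("pvar", "some-var")), Node(("constant", "real"))])
counts = count_etypes(tree)
```

With a real model, the same function could in principle be applied to model.reward or to each entry of model.cpfs, provided the objects stored there expose etype and args as described above.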

Modifying the Problem Programmatically

The model supports modification of key properties. For example, it is possible to change non-fluents in the previous example, such as the mass of the cart:

import pyRDDLGym
env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
env.model.non_fluents['CART-MASS'] = 20.0

However, env will not reflect changes to the model, since the make() function caches and precomputes certain information such as non-fluent values. A long workaround is to decompile the model into RDDL code, save it to disk, and call make() again. A simpler solution is to create a new environment by passing the modified model directly:

new_env = pyRDDLGym.make(env.model, instance=None)
new_env.set_visualizer(env._visualizer.__class__)

The new_env is a copy of env with a modified cart mass.

Grounding a Problem

By default, pyRDDLGym simulates domains in a vectorized manner directly from the (lifted) domain description. Specifically, parameterized variables are represented internally as numpy arrays, whose values are propagated in a vectorized manner through mathematical expressions.

However, sometimes it is required to work with the grounded representation. For example, given a p-variable some-var(?x, ?y) of two parameters ?x and ?y, and the expression cpf'(?x, ?y) = some-var(?x, ?y) + 1.0;, the grounded representation is as follows:

cpf___x1__y1' = some-var___x1__y1 + 1.0;
cpf___x1__y2' = some-var___x1__y2 + 1.0;
cpf___x2__y1' = some-var___x2__y1 + 1.0;
cpf___x2__y2' = some-var___x2__y2 + 1.0;
...

where x1, x2... are the objects of ?x and y1, y2... are the objects of ?y. In other words, all p-variables are replaced by sets of non-parameterized variables (one per valid combination of objects), and all expressions are replaced by sets of expressions whose p-variable dependencies are replaced by their non-parameterized counterparts. In all cases, the grounded and lifted representations should produce the same numerical results, albeit in a slightly different format.
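The naming pattern in the grounded example can be reproduced with a few lines of plain Python. This is only a sketch of the enumeration idea, not the grounder's implementation: ground_names is a hypothetical helper, the object names are made up, and the separators (three underscores after the variable name, two between objects) are read off the example above.

```python
from itertools import product


def ground_names(name, objects_per_param):
    """Return grounded names for a p-variable, one per combination of objects.

    Follows the pattern shown above: the variable name, three underscores,
    then the objects joined by two underscores.
    """
    if not objects_per_param:      # a variable without parameters is already ground
        return [name]
    return [name + "___" + "__".join(combo) for combo in product(*objects_per_param)]


# hypothetical objects for ?x and ?y
xs, ys = ["x1", "x2"], ["y1", "y2"]

# print one grounded expression per combination of objects
for cpf, var in zip(ground_names("cpf", [xs, ys]), ground_names("some-var", [xs, ys])):
    print(f"{cpf}' = {var} + 1.0;")
```

The number of grounded variables is the product of the object counts of the parameters, which is why grounding large domains can become expensive.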

pyRDDLGym provides a convenient API for producing a grounded model:

from pyRDDLGym.core.grounder import RDDLGrounder
model = RDDLGrounder(env.model._AST).ground()

Much of the functionality for inspecting and operating on models described on this page also applies to grounded models.

Warning

Currently, the grounder only supports RDDL 1.0 syntax. Therefore, it does not ground expressions containing switch statements, matrix operations, vector distributions and Discrete, nested pvariables, or free parameters outside a pvariable (e.g. ?x == ?y). Support for enumerated types is currently limited.

Vectorized Input and Output

Some algorithms require a tensor representation of parameterized state-fluent outputs and/or action-fluent inputs. The RDDLEnv class provides a vectorized option to allow tensor representations of state and action fluents to be passed into and out of pyRDDLGym.

For example, a bool action-fluent put-out(?x, ?y) taking two parameters ?x and ?y, with 3 objects each, would be provided as a boolean-valued 3-by-3 matrix. State-fluents would be returned in a similar format from the environment.

This option can be enabled as follows:

import pyRDDLGym
env = pyRDDLGym.make("CartPole_Continuous_gym", "0", vectorized=True)

With this option enabled, the bounds of the observation_space and action_space of the environment are instances of gymnasium.spaces.Box with the correct shape and dtype.
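To make the put-out(?x, ?y) example concrete, the sketch below builds a 3-by-3 boolean matrix from a list of object pairs. It uses nested lists so that it runs with the standard library alone; in practice an array type would be the more natural container. The object names, the to_matrix helper, and the assumption that the vectorized action is passed in a dict keyed by the fluent name are all illustrative and not confirmed by this page.

```python
def to_matrix(true_pairs, xs, ys):
    """Build a len(xs)-by-len(ys) boolean matrix that is True at the given (x, y) pairs."""
    x_index = {x: i for i, x in enumerate(xs)}
    y_index = {y: j for j, y in enumerate(ys)}
    matrix = [[False] * len(ys) for _ in xs]
    for x, y in true_pairs:
        matrix[x_index[x]][y_index[y]] = True
    return matrix


# hypothetical objects; the text only states that there are 3 of each
xs = ["x1", "x2", "x3"]
ys = ["y1", "y2", "y3"]

# put-out(x2, y3) = true, every other entry false
put_out = to_matrix([("x2", "y3")], xs, ys)

# assumed layout of a vectorized action: fluent name -> matrix of values
actions = {"put-out": put_out}
```

Rows follow the order of the ?x objects and columns the order of the ?y objects, so the entry for (x2, y3) sits at row index 1, column index 2.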

Decompiling Models into RDDL

It is possible to decompile a Python model object back into (cleaned up) RDDL code. This is useful for generating RDDL descriptions of problems that have been modified programmatically in Python:

from pyRDDLGym.core.debug.decompiler import RDDLDecompiler
decompiler = RDDLDecompiler()
domain_rddl = decompiler.decompile_domain(model)   # domain.rddl
instance_rddl = decompiler.decompile_instance(model)    # instance.rddl
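One way to use the decompiled text, for instance as part of the save-to-disk workaround mentioned earlier, is to write both strings to files. The sketch below is a minimal helper under stated assumptions: save_rddl is a hypothetical name and not part of pyRDDLGym, the file names follow the comments in the snippet above, and the short strings stand in for real decompiler output.

```python
import tempfile
from pathlib import Path


def save_rddl(domain_rddl, instance_rddl, directory):
    """Write decompiled RDDL text to domain.rddl and instance.rddl in directory."""
    out = Path(directory)
    out.mkdir(parents=True, exist_ok=True)
    domain_path = out / "domain.rddl"
    instance_path = out / "instance.rddl"
    domain_path.write_text(domain_rddl, encoding="utf-8")
    instance_path.write_text(instance_rddl, encoding="utf-8")
    return domain_path, instance_path


# placeholder strings standing in for the output of the decompiler
with tempfile.TemporaryDirectory() as tmp:
    domain_path, instance_path = save_rddl("domain demo { }", "instance demo { }", tmp)
    saved = domain_path.read_text(encoding="utf-8")
```

The returned paths could then be given to make() as the domain and instance. Whether make() accepts pathlib objects is not confirmed here, so converting them with str() first is the cautious choice.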

Running pyRDDLGym through TCP

Some older algorithms and infrastructure built around the Java rddlsim required a TCP connection with a server that provides the environment interaction. pyRDDLGym provides a RDDLSimServer class that functions in a similar way.

To create and run a server built around a specific domain or instance:

from pyRDDLGym.core.server import RDDLSimServer
server = RDDLSimServer("/path/to/domain.rddl", "/path/to/instance.rddl", rounds, time, port=2323)
server.run()

The rounds argument specifies the number of episodes/rounds of simulation to perform, and time specifies how long the server connection should remain open. The optional port parameter allows multiple connections to be established in parallel at different ports. Finally, the run() command starts the server.

Embedding External Python Functions within RDDL

Starting in pyRDDLGym 2.5, it is possible to embed arbitrary Python functions within RDDL description files. The general syntax is described here in more detail.

To successfully compile a RDDL description file that contains external function references, i.e.:

cpf1 = $Func1[...](...);
cpf2 = $Func2[...](...);

you must provide a dictionary that maps each function name to a Python function when building the environment, where each function must respect both the input and output signatures as they are described in the RDDL file:

def func1(...):
    return ...
def func2(...):
    return ...

env = pyRDDLGym.make(..., backend_kwargs={'python_functions': {'Func1': func1, 'Func2': func2}})
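Before building the environment, it can be handy to check that every external function referenced in the RDDL text has an entry in the dictionary. The sketch below does this with a regular expression. Both helpers are hypothetical and not part of pyRDDLGym, and the pattern is an assumption based only on the $Name[...] form shown above; it does not account for comments or other syntax.

```python
import re


def referenced_functions(rddl_text):
    """Return the set of names that appear as $Name[ in the RDDL text."""
    return set(re.findall(r"\$([A-Za-z_]\w*)\s*\[", rddl_text))


def missing_functions(rddl_text, python_functions):
    """Names referenced in the RDDL text that have no callable entry in the mapping."""
    return sorted(
        name for name in referenced_functions(rddl_text)
        if not callable(python_functions.get(name))
    )


rddl_text = "cpf1 = $Func1[...](...);\ncpf2 = $Func2[...](...);"
registry = {"Func1": lambda *args: 0.0}        # Func2 deliberately left out
print(missing_functions(rddl_text, registry))  # ['Func2']
```

A check like this only confirms that the names line up. It says nothing about whether each function respects the input and output signatures declared in the RDDL file.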

Note

JaxPlan requires additional structure on external functions, namely that they support JAX’s JIT compilation. This implies that any external function in an environment used by JaxPlan must consist entirely of JAX-supported operations (e.g. JAX, Flax, Haiku, Optax). However, functions that use non-JAX operations (for example PyTorch or TensorFlow operations) can still be used in pyRDDLGym!

The pyRDDLGym Compiler (for Advanced Users)

Warning

This section is intended for advanced users who wish to become familiar with the backend of pyRDDLGym. We can only provide limited support for backend-related topics (bug reports and pull requests are always welcome), and the API can change at any time, which may render parts of this guide obsolete.

At the lowest level of abstraction of pyRDDLGym, syntactic analysis is performed statically whenever possible for optimal performance, because pyRDDLGym is written purely in Python and does not (currently) provide C/C++ binaries.

The syntax analysis and compilation of RDDL description files into Python objects happens statically when the environment is created. The resulting objects are directly referenced by the user-facing API (e.g. the RDDLEnv, simulator, optimizers, etc.) and include outputs from several distinct stages of compilation:

Compiler Components
name	description
RDDLParser	Parses a RDDL description file into an intermediate AST.
RDDLPlanningModel	Converts a parsed AST into a user-friendly model object.
RDDLValueInitializer	Compiles the initial values of all pvariables into numerical arrays.
RDDLLevelAnalysis	Summarizes the dependencies between CPFs, and computes their order of evaluation.
RDDLObjectsTracer	Traces the RDDL AST to compile type information about each subexpression, and does type checking.

Parsing

The following code illustrates the parsing of a domain description, returning a RDDL object that represents its AST:

from pyRDDLGym.core.parser.reader import RDDLReader
from pyRDDLGym.core.parser.parser import RDDLParser

rddl_string = RDDLReader("/path/to/domain.rddl", "/path/to/instance.rddl").rddltxt
parser = RDDLParser()
parser.build()
ast = parser.parse(rddl_string)

Model Objects

The AST can then be passed to a RDDLPlanningModel (RDDLLiftedModel in the code below), which compiles the AST into a user-friendly API with accessible properties and functions for operating on and modifying the (lifted) domain:

from pyRDDLGym.core.compiler.model import RDDLLiftedModel
model = RDDLLiftedModel(ast)

Initializing Values

It is now possible to extract the initial values of the pvariables by using a RDDLValueInitializer, which reads from the init-fluents block in the instance whenever possible, and otherwise from the default values in the domain:

from pyRDDLGym.core.compiler.initializer import RDDLValueInitializer
values = RDDLValueInitializer(model).initialize()
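The precedence rule described above (instance values first, domain defaults otherwise) can be pictured as a simple dictionary override. This is only a conceptual sketch with hypothetical fluent names and scalar values; the real initializer works on the compiled model and produces numerical arrays.

```python
def initial_values(domain_defaults, init_fluents):
    """Start from the domain defaults and override them with values from the instance."""
    values = dict(domain_defaults)   # copy so the defaults are left untouched
    values.update(init_fluents)      # instance values take precedence
    return values


# hypothetical fluents: only "pos" is listed in the instance's init-fluents block
domain_defaults = {"pos": 0.0, "vel": 0.0}
init_fluents = {"pos": 1.5}
merged = initial_values(domain_defaults, init_fluents)
```

Here "pos" takes the value 1.5 from the instance, while "vel" falls back to its default of 0.0.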

Analyzing Fluent Dependencies

It is also possible to compute the graph that summarizes the pvariables/CPFs on which each CPF depends, and run a topological sort on the graph to determine the correct order of CPF evaluation:

from pyRDDLGym.core.compiler.levels import RDDLLevelAnalysis
sorter = RDDLLevelAnalysis(model)
dependencies = sorter.build_call_graph()
levels = sorter.compute_levels()
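The idea behind the level computation can be sketched with the standard library's graphlib module (Python 3.9 or later). This is not the algorithm used by RDDLLevelAnalysis: the compute_levels function below is a stand-alone illustration, the CPF names are made up, and the shape of the dependency dict (each CPF mapped to the names it reads) is an assumption.

```python
from graphlib import TopologicalSorter


def compute_levels(dependencies):
    """Group CPFs into levels so that each CPF comes after the CPFs it depends on.

    dependencies maps each CPF name to the names it reads; names that are not
    themselves keys (e.g. state fluents) are treated as already available.
    """
    cpf_deps = {cpf: [d for d in deps if d in dependencies]
                for cpf, deps in dependencies.items()}
    level_of = {}
    # static_order() yields each CPF after all of its dependencies and
    # raises graphlib.CycleError if the dependencies are cyclic
    for cpf in TopologicalSorter(cpf_deps).static_order():
        level_of[cpf] = 1 + max((level_of[d] for d in cpf_deps[cpf]), default=-1)
    levels = {}
    for cpf, level in level_of.items():
        levels.setdefault(level, []).append(cpf)
    return {level: sorted(names) for level, names in sorted(levels.items())}


# hypothetical dependency graph: "pos" and "fuel" are state fluents, not CPFs
example = {
    "wind": [],
    "drift": ["wind", "pos"],
    "pos'": ["drift", "pos"],
    "fuel'": ["fuel"],
}
example_levels = compute_levels(example)
```

CPFs in the same level do not depend on one another, so they can be evaluated in any order once all lower levels have been computed.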

Tracing and Static Compilation

Finally, the code can be traced to compile static type information about each subexpression in the AST. This includes, for example, any free parameters in the scope of the subexpression and their types, the type of the value returned by the subexpression during evaluation, and information that is expensive to compute dynamically during simulation (e.g. instructions on how to operate on pvariables stored as tensors):

from pyRDDLGym.core.compiler.tracer import RDDLObjectsTracer
trace_info = RDDLObjectsTracer(model, cpf_levels=levels).trace()

This creates an object of type RDDLTracedObjects which can be queried for compiled information about each subexpression in the AST, i.e.:

trace_info.cached_objects_in_scope(expr)   # list free parameters in scope
trace_info.cached_object_type(expr)        # type of the value returned (None if primitive)
trace_info.cached_is_fluent(expr)          # whether expr is fluent (returned value can change over time)
trace_info.cached_sim_info(expr)           # low-level instructions for operating on returned value tensor