Getting Started: Advanced Tools
===============================

Evaluating Invariants and Preconditions
---------------------------------------

The ``RDDLEnv`` provides built-in logic to validate state and action constraints automatically
according to their RDDL descriptions (by default, a violation raises an error). However,
in some applications, you may wish to evaluate constraints yourself.

Constraint checking functions are implemented in the ``sampler`` field of a ``RDDLEnv``:

.. code-block:: python
	
    import pyRDDLGym
    env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
    backend = env.sampler

One common use case is to validate actions before executing them in the environment 
(perhaps reverting to a default action if invalid):

.. code-block:: python
	
    if backend.check_action_preconditions(actions, silent=True):
        return actions
    else: 
        return default_actions   # upon violation
    
Below are the commonly used functions of ``sampler`` that can be accessed directly:

.. list-table:: Commonly-used functions accessible in ``sampler``
   :widths: 40 80
   :header-rows: 1
   
   * - syntax
     - description
   * - ``check_state_invariants(silent)``
     - returns a bool indicating if all state invariants are satisfied (if ``silent`` is true, no exception is raised upon violation)
   * - ``check_action_preconditions(actions, silent)``
     - returns a bool indicating if all action preconditions are satisfied (if ``silent`` is true, no exception is raised upon violation)
   * - ``check_terminal_states()``
     - returns a bool indicating if any termination condition is satisfied
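
As a rough sketch of the validation pattern above, the helper below tries several candidate
actions in order and falls back to a default when none of them passes. The name
``choose_valid_action`` and its arguments are hypothetical and not part of pyRDDLGym;
only ``check_action_preconditions(actions, silent=True)`` is taken from the table above:

.. code-block:: python

    def choose_valid_action(backend, candidates, default_actions):
        """Return the first candidate that satisfies all action preconditions."""
        for actions in candidates:
            # silent=True makes the check return False rather than raise
            if backend.check_action_preconditions(actions, silent=True):
                return actions
        return default_actions   # no candidate was valid

Here ``backend`` would be ``env.sampler``, and each candidate is an action dict in
whatever format the environment expects.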

.. raw:: html 

   <a href="notebooks/checking_constraints_in_pyrddlgym.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Checking constraints in pyRDDLGym.
   </a>
   

Inspecting the Model
--------------------

The pyRDDLGym compiler provides a convenient API for querying a variety of properties about RDDL constructs in a domain.
These can be accessed through the ``model`` field of a ``RDDLEnv``:

.. code-block:: python
	
    import pyRDDLGym
    env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
    model = env.model

Below are some commonly used fields of ``model`` that can be accessed directly:
	
.. list-table:: Commonly-used properties accessible in ``model``
   :widths: 40 80
   :header-rows: 1
   
   * - syntax
     - description
   * - ``horizon``
     - horizon as defined in the instance
   * - ``discount``
     - discount factor as defined in the instance
   * - ``max_allowed_actions``
     - ``max-nondef-actions`` as defined in the instance
   * - ``variable_types``
     - dict of pvariable types (e.g. non-fluent, ...) for each variable
   * - ``variable_ranges``
     - dict of pvariable ranges (e.g. real, ...) for each variable
   * - ``variable_params``
     - dict of parameters and their types for each variable
   * - ``type_to_objects``
     - dict of all defined objects for each type
   * - ``non_fluents``
     - dict of initial values for each non-fluent
   * - ``state_fluents``
     - dict of initial values for each state-fluent
   * - ``action_fluents``
     - dict of default values for each action-fluent
   * - ``interm_fluents``
     - dict of initial values for each interm-fluent
   * - ``observ_fluents``
     - dict of initial values for each observ-fluent
   * - ``cpfs``
     - dict of ``Expression`` objects for each cpf
   * - ``reward``
     - ``Expression`` object for reward function
   * - ``preconditions``
     - list of ``Expression`` objects for each action-precondition
   * - ``invariants``
     - list of ``Expression`` objects for each state-invariant
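
The properties above can be combined into a quick summary of a problem. The following is
a minimal sketch that only reads attributes listed in the table; the helper name
``summarize_model`` is hypothetical and not part of pyRDDLGym:

.. code-block:: python

    def summarize_model(model):
        """Collect a few headline properties of a model into a plain dict."""
        return {
            'horizon': model.horizon,
            'discount': model.discount,
            'num_state_fluents': len(model.state_fluents),
            'num_action_fluents': len(model.action_fluents),
            'num_cpfs': len(model.cpfs),
            'num_preconditions': len(model.preconditions),
            'num_invariants': len(model.invariants),
        }

Calling ``summarize_model(env.model)`` would then return the summary for the loaded problem.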

``Expression`` objects are abstract syntax trees that describe the flow of computations
in each cpf, constraint relation, or the reward function of the RDDL domain:

- the ``etype()`` function provides basic information about the expression, such as its type
- the ``args()`` function provides its sub-expressions, which consist of other ``Expression`` objects, aggregation variables, or other information required by the engine.
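
Because expressions are trees, they can be explored recursively. The sketch below counts
how often each expression type occurs in a tree. The helper names are hypothetical, and the
code makes two assumptions that may not hold in every pyRDDLGym version: ``etype`` and
``args`` may be exposed either as methods or as plain properties (the ``_attr`` helper
accepts both), and sub-expressions may be nested inside lists or tuples within ``args``:

.. code-block:: python

    from collections import Counter

    def _attr(obj, name):
        # assumption: the attribute may be a method or a plain property
        value = getattr(obj, name)
        return value() if callable(value) else value

    def count_node_types(expr, counts=None):
        """Count occurrences of each expression type in the tree rooted at expr."""
        counts = Counter() if counts is None else counts
        counts[str(_attr(expr, 'etype'))] += 1
        stack = [_attr(expr, 'args')]
        while stack:
            item = stack.pop()
            if hasattr(item, 'etype') and hasattr(item, 'args'):
                count_node_types(item, counts)   # a sub-expression
            elif isinstance(item, (list, tuple)):
                stack.extend(item)               # look inside containers
            # anything else (names, constants, None) is ignored
        return counts

For example, ``count_node_types(model.reward)`` would summarize the structure of the reward function.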

.. raw:: html 

   <a href="notebooks/extracting_info_from_compiled_model_with_pyrddlgym.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Extracting information from the compiled planning problem.
   </a>
      
Modifying the Problem Programmatically
--------------------------------------

The model supports modification of key properties. For example, it is possible to change non-fluents 
in the previous example, such as the mass of the cart:

.. code-block:: python

    import pyRDDLGym
    env = pyRDDLGym.make("CartPole_Continuous_gym", "0")
    env.model.non_fluents['CART-MASS'] = 20.0

However, ``env`` will not reflect changes to the model, since the ``make()`` function 
caches and precomputes certain information such as non-fluent values.
A long workaround is to decompile the model into RDDL code, save it to disk, and call ``make()`` again.
A simpler solution is to create a new environment by passing the modified model directly:

.. code-block:: python

    new_env = pyRDDLGym.make(model, instance=None)
    new_env.set_visualizer(env._visualizer.__class__)

The ``new_env`` is a copy of ``env`` with a modified cart mass.

.. raw:: html 

   <a href="notebooks/modifying_a_problem_with_pyrddlgym.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Modifying a problem with pyRDDLGym.
   </a>


Grounding a Problem
-------------------

By default, pyRDDLGym simulates domains in a vectorized manner directly from the (lifted) domain description. 
Specifically, parameterized variables are represented internally as numpy arrays,
whose values are propagated in a vectorized manner through mathematical expressions.

However, sometimes it is required to work with the grounded representation. For example, 
given a p-variable ``some-var(?x, ?y)`` of two parameters ``?x`` and ``?y``, and the expression
``cpf'(?x, ?y) = some-var(?x, ?y) + 1.0;``, the grounded representation is as follows:

.. code-block:: shell

    cpf___x1__y1' = some-var___x1__y1 + 1.0;
    cpf___x1__y2' = some-var___x1__y2 + 1.0;
    cpf___x2__y1' = some-var___x2__y1 + 1.0;
    cpf___x2__y2' = some-var___x2__y2 + 1.0;
    ...

where ``x1, x2...`` are the objects of ``?x`` and ``y1, y2...`` are the objects of ``?y``.
In other words, all p-variables are replaced by sets of non-parameterized variables (one per valid combination of objects),
and all expressions are replaced by sets of expressions whose p-variable dependencies are replaced by their non-parameterized
counterparts. In all cases, the grounded and lifted representations should produce the same numerical results, 
albeit in a slightly different format.
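
To make the naming pattern in the listing above concrete, the following sketch enumerates
grounded variable names with ``itertools.product``. It only mirrors the
``name___obj1__obj2`` pattern shown above and is not pyRDDLGym's grounder; the function
name ``ground_names`` is hypothetical:

.. code-block:: python

    from itertools import product

    def ground_names(name, object_lists):
        """List one grounded name per combination of objects, in parameter order."""
        if not object_lists:
            return [name]   # a variable without parameters keeps its name
        return [name + '___' + '__'.join(combo)
                for combo in product(*object_lists)]

    names = ground_names('some-var', [['x1', 'x2'], ['y1', 'y2']])
    # ['some-var___x1__y1', 'some-var___x1__y2',
    #  'some-var___x2__y1', 'some-var___x2__y2']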
 
pyRDDLGym provides a convenient API for producing a grounded model:

.. code-block:: python
    
    from pyRDDLGym.core.grounder import RDDLGrounder
    model = RDDLGrounder(env.model._AST).ground()

Much of the functionality for inspecting and operating on models described on this page also applies to grounded models.

.. warning::
   Currently, the grounder only supports RDDL 1.0 syntax. It therefore cannot ground
   expressions containing switch statements, matrix operations, vector distributions and Discrete,
   nested pvariables, or free parameters outside a pvariable (e.g. ``?x == ?y``).
   Support for enumerated types is currently limited.

.. raw:: html 

   <a href="notebooks/grounding_a_problem_with_pyrddlgym.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Grounding a problem in pyRDDLGym.
   </a>
   
   
Vectorized Input and Output
---------------------------

Some algorithms require a tensor representation of parameterized state-fluent outputs and/or action-fluent inputs. 
The ``RDDLEnv`` class provides a ``vectorized`` option
to allow tensor representations of state and action fluents to be passed into and out of pyRDDLGym. 

For example, a ``bool`` action-fluent ``put-out(?x, ?y)`` taking two parameters 
``?x`` and ``?y``, with 3 objects each, would be provided as a boolean-valued 
3-by-3 matrix. State-fluents would be returned in a similar format from the environment.
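
The following sketch illustrates what such a matrix looks like by packing a dict of
grounded action values into a 3-by-3 boolean array. It is an illustration only and not how
pyRDDLGym converts values internally: the object names ``x1..x3`` and ``y1..y3``, the
grounded naming pattern ``put-out___x__y``, and the helper ``to_matrix`` are all assumptions:

.. code-block:: python

    import numpy as np

    xs = ['x1', 'x2', 'x3']   # hypothetical objects of ?x
    ys = ['y1', 'y2', 'y3']   # hypothetical objects of ?y

    def to_matrix(grounded, name, xs, ys):
        """Pack grounded boolean values into a len(xs)-by-len(ys) array."""
        matrix = np.zeros((len(xs), len(ys)), dtype=bool)
        for i, x in enumerate(xs):
            for j, y in enumerate(ys):
                # entries missing from the dict default to False
                matrix[i, j] = grounded.get(f'{name}___{x}__{y}', False)
        return matrix

    put_out = to_matrix({'put-out___x2__y3': True}, 'put-out', xs, ys)
    # put_out has shape (3, 3), with a single True entry at row 1, column 2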

This option can be enabled as follows:

.. code-block:: python
	
    import pyRDDLGym
    env = pyRDDLGym.make("CartPole_Continuous_gym", "0", vectorized=True)

With this option enabled, the bounds of the ``observation_space`` and ``action_space`` 
of the environment are instances of ``gymnasium.spaces.Box`` with the correct shape and dtype.


Decompiling Models into RDDL
----------------------------

It is possible to decompile a Python model object back into (cleaned up) RDDL code. This is useful for 
generating RDDL descriptions of problems that have been modified programmatically in Python:

.. code-block:: python

    from pyRDDLGym.core.debug.decompiler import RDDLDecompiler
    decompiler = RDDLDecompiler()
    domain_rddl = decompiler.decompile_domain(model)   # domain.rddl
    instance_rddl = decompiler.decompile_instance(model)    # instance.rddl
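
To keep the decompiled descriptions, they can be written to disk. The sketch below assumes
that both decompiled values are plain strings, as the comments above suggest; the helper
``save_rddl`` and the file names are hypothetical:

.. code-block:: python

    from pathlib import Path

    def save_rddl(domain_rddl, instance_rddl, folder):
        """Write decompiled RDDL text to domain.rddl and instance.rddl in folder."""
        folder = Path(folder)
        folder.mkdir(parents=True, exist_ok=True)   # create the folder if needed
        domain_path = folder / 'domain.rddl'
        instance_path = folder / 'instance.rddl'
        domain_path.write_text(domain_rddl)
        instance_path.write_text(instance_rddl)
        return domain_path, instance_path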


Running pyRDDLGym through TCP
-----------------------------

Some older algorithms and infrastructure built around the Java rddlsim required 
a TCP connection with a server that provides the environment interaction.
pyRDDLGym provides a ``RDDLSimServer`` class that functions in a similar way.

To create and run a server built around a specific domain or instance:

.. code-block:: python
	
    from pyRDDLGym.core.server import RDDLSimServer	
    server = RDDLSimServer("/path/to/domain.rddl", "/path/to/instance.rddl", rounds, time, port=2323)
    server.run()	
	
The ``rounds`` argument specifies the number of episodes/rounds of simulation to perform,
and ``time`` specifies how long the server connection should remain open. The optional ``port``
parameter allows multiple connections to be established in parallel on different ports.
Finally, the ``run()`` command starts the server.


Embedding External Python Functions within RDDL
-----------------------------------------------

Starting in pyRDDLGym 2.5, it is possible to embed arbitrary Python functions within RDDL description files.
The general syntax is described in more detail :ref:`here <external-functions>`.

To successfully compile a RDDL description file that contains external function references, for example:

.. code-block:: shell
	
    cpf1 = $Func1[...](...);
    cpf2 = $Func2[...](...);

you must provide a dictionary that maps each function name to a Python function when building the environment, where each function
must respect both the input and output signatures as they are described in the RDDL file:

.. code-block:: python
	  
    def func1(...):
        return ...
    def func2(...):
        return ...

    env = pyRDDLGym.make(..., backend_kwargs={'python_functions': {'Func1': func1, 'Func2': func2}})


.. raw:: html 

   <a href="notebooks/calling_external_functions.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Calling External Python Functions in pyRDDLGym.
   </a>

.. note::
    JaxPlan requires additional structure on external functions, namely that they must support JAX's JIT compilation.
    This implies that any external function in an environment used by JaxPlan must consist entirely of JAX-compatible operations (e.g. from JAX, flax, haiku, or optax).
    However, functions that use non-JAX operations (for example PyTorch or TensorFlow operations) can still be used in pyRDDLGym.
   

The pyRDDLGym Compiler (for Advanced Users)
-------------------------------------------

.. warning::
   This section is intended for advanced users who wish to become familiar with the
   backend of pyRDDLGym. We can only provide limited support for backend-related
   topics (bug reports and pull requests are always welcome),
   and the API can change at any time, which may make parts of this guide obsolete.

At the lowest level of abstraction of pyRDDLGym, syntactic analysis is performed statically whenever possible
for optimal performance, because pyRDDLGym is written purely in Python
and does not (currently) provide C/C++ binaries.

RDDL description files are analyzed and compiled into Python objects once, when the
environment is created. The user-facing API (e.g. the ``RDDLEnv``, simulator, optimizers, etc.)
then references these objects directly.
They include outputs from several distinct stages of compilation:

.. list-table:: Compiler Components
   :widths: 60 120
   :header-rows: 1
   
   * - name 
     - description
   * - `RDDLParser <https://github.com/pyrddlgym-project/pyRDDLGym/blob/main/pyRDDLGym/core/parser/parser.py>`_
     - Parses a RDDL description file into an intermediate AST.
   * - `RDDLPlanningModel <https://github.com/pyrddlgym-project/pyRDDLGym/blob/main/pyRDDLGym/core/compiler/model.py>`_
     - Converts a parsed AST into a user-friendly model object.
   * - `RDDLValueInitializer <https://github.com/pyrddlgym-project/pyRDDLGym/blob/main/pyRDDLGym/core/compiler/initializer.py>`_
     - Compiles the initial values of all pvariables into numerical arrays.
   * - `RDDLLevelAnalysis <https://github.com/pyrddlgym-project/pyRDDLGym/blob/main/pyRDDLGym/core/compiler/levels.py>`_
     - Summarizes the dependencies between CPFs, and computes their order of evaluation.
   * - `RDDLObjectsTracer <https://github.com/pyrddlgym-project/pyRDDLGym/blob/main/pyRDDLGym/core/compiler/tracer.py>`_
     - Traces the RDDL AST to compile type information about each subexpression, and does type checking.

Parsing
^^^^^^^^^^^^^^^^^^^

The following code illustrates the parsing of a domain description, 
returning a ``RDDL`` object that represents its AST representation:

.. code-block:: python
    
    from pyRDDLGym.core.parser.reader import RDDLReader	
    from pyRDDLGym.core.parser.parser import RDDLParser
    
    # use forward slashes: in a regular string, "\t" would be read as a tab character
    rddl_string = RDDLReader("/path/to/domain.rddl", "/path/to/instance.rddl").rddltxt
    parser = RDDLParser()
    parser.build()
    ast = parser.parse(rddl_string)

Model Objects
^^^^^^^^^^^^^^^^^^^

The AST can then be passed to a planning model class such as ``RDDLLiftedModel``, which compiles
the AST into a user-friendly API with accessible properties and functions
for operating on and modifying the (lifted) domain:

.. code-block:: python

    from pyRDDLGym.core.compiler.model import RDDLLiftedModel	
    model = RDDLLiftedModel(ast)

Initializing Values
^^^^^^^^^^^^^^^^^^^

It is now possible to extract the initial values of the pvariables by using a ``RDDLValueInitializer``,
which reads from the ``init-fluents`` block in the instance whenever possible, and
otherwise from the ``default`` values in the domain:

.. code-block:: python

    from pyRDDLGym.core.compiler.initializer import RDDLValueInitializer	
    values = RDDLValueInitializer(model).initialize()

Analyzing Fluent Dependencies
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is also possible to compute the graph that summarizes the pvariables/CPFs on which each CPF depends,
and run a topological sort on the graph to determine the correct order of CPF evaluation:

.. code-block:: python

    from pyRDDLGym.core.compiler.levels import RDDLLevelAnalysis    
    sorter = RDDLLevelAnalysis(model)
    dependencies = sorter.build_call_graph()
    levels = sorter.compute_levels()
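
To illustrate the idea behind the level computation, here is a small standalone sketch of a
level-wise topological sort over a dependency dict. It is not pyRDDLGym's implementation, and
the input format (a dict mapping each CPF name to the set of names it depends on) and the
function name ``levels_from_dependencies`` are assumptions made for this example:

.. code-block:: python

    def levels_from_dependencies(dependencies):
        """Group CPFs into levels so each CPF comes after the CPFs it depends on."""
        # only dependencies on other CPFs constrain the evaluation order
        remaining = {cpf: set(deps) & set(dependencies)
                     for cpf, deps in dependencies.items()}
        levels = {}
        level = 0
        while remaining:
            # CPFs whose dependencies have all been placed are ready
            ready = sorted(cpf for cpf, deps in remaining.items() if not deps)
            if not ready:
                raise ValueError(f'cyclic dependency among: {sorted(remaining)}')
            levels[level] = ready
            for cpf in ready:
                del remaining[cpf]
            for deps in remaining.values():
                deps.difference_update(ready)
            level += 1
        return levels

    example = {'a': set(), 'b': {'a'}, 'c': {'a', 'b'}, 'd': {'some-state'}}
    order = levels_from_dependencies(example)
    # {0: ['a', 'd'], 1: ['b'], 2: ['c']}

All CPFs within one level can be evaluated in any order, since none of them depends on another CPF in the same level.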

.. raw:: html 

   <a href="notebooks/analyzing_fluent_dependencies.html"> 
       <img src="_static/notebook_icon.png" alt="Jupyter Notebook" style="width:64px;height:64px;margin-right:5px;margin-top:5px;margin-bottom:5px;">
       Related example: Analyzing fluent dependencies and evaluation order.
   </a>
   
Tracing and Static Compilation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, the code can be traced to compile static type information about each subexpression in the AST, which 
includes for example any free parameters in the scope of the subexpression and their types,
the type of the value returned by the subexpression during evaluation,
or information that is expensive to compute dynamically during simulation 
(e.g. instructions on how to operate on pvariables stored as tensors):
 
.. code-block:: python
    
    from pyRDDLGym.core.compiler.tracer import RDDLObjectsTracer    
    trace_info = RDDLObjectsTracer(model, cpf_levels=levels).trace()

This creates an object of type ``RDDLTracedObjects``, which can be queried for
compiled information about each subexpression in the AST, for example:

.. code-block:: python
    
    trace_info.cached_objects_in_scope(expr)   # list free parameters in scope
    trace_info.cached_object_type(expr)        # type of the value returned (None if primitive)
    trace_info.cached_is_fluent(expr)          # whether expr is fluent (returned value can change over time)
    trace_info.cached_sim_info(expr)           # low-level instructions for operating on returned value tensor