.. _rddl-description:

RDDL Language Description
=========================

Introduction
-------------------

The Relational Dynamic Influence Diagram Language (RDDL) is a uniform language where states, 
actions, and observations (whether discrete or continuous) are parameterized variables and the 
evolution of a fully or partially observed (stochastic) process is specified via (stochastic) 
functions over next state variables conditioned on current state and action variables 
(n.b., concurrency is allowed). 
Parameterized variables are simply templates for ground variables that can be obtained when 
given a particular problem instance defining possible domain objects. 
Semantically, RDDL is simply a dynamic Bayes net (DBN) (with potentially many intermediate layers) 
extended with a simple influence diagram (ID) utility node representing immediate reward. 
An objective function specifies how these immediate rewards should be optimized over time for 
optimal control. 
For a ground instance, RDDL is just a factored MDP (or POMDP, if partially observed).

File Structure and Syntax
-------------------------

Domain Block
^^^^^^^^^^^^^^^^^^^

The ``domain`` block defines the domain of a problem that will be used to perform simulations. 
The objects, states and actions to be simulated, and the rules that the simulations should 
follow, are all defined in this block. 
The block should be written in a ``.rddl`` file as follows:

.. code-block:: shell

    domain {
        requirements {  };
        types {  };
        pvariables {  };
        cpfs {  };
        reward = ;
        state-invariants {  }; // optional
        action-preconditions {  }; // optional
        termination {  }; // optional
    }
    
.. note::
   There should not be a semicolon at the end of the domain block.
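
As a rough illustration of how these sections fit together, the following is a minimal sketch 
of a complete domain block. The domain name ``lamp_domain``, the ``lamp`` type, and the ``on`` 
and ``toggle`` variables are hypothetical names chosen only for this example, and the optional 
sections are left out:

.. code-block:: shell

    // hypothetical toy domain, for illustration only
    domain lamp_domain {
        requirements { reward-deterministic };
        types {
            lamp : object;
        };
        pvariables {
            on(lamp) : { state-fluent, bool, default = false };
            toggle(lamp) : { action-fluent, bool, default = false };
        };
        cpfs {
            // a lamp flips its state whenever it is toggled
            on'(?l) = if (toggle(?l)) then ~on(?l) else on(?l);
        };
        // reward is the number of lamps that are currently on
        reward = sum_{?l : lamp} [ on(?l) ];
    }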

Requirements
^^^^^^^^^^^^^^^^^^^

The ``requirements`` section of the domain block states the characteristics of the 
domain. Specifically, it tells the parser what kinds of parameterized variables 
will be defined, what types of values (integers, real numbers, or user-defined 
values) those variables can take, and how the actions and reward are determined.

The ``requirements`` section will be included in the .rddl file with the following format:

.. code-block:: shell

    requirements { , , ... };

There are currently nine requirements that can be declared in RDDL, as shown 
in the following table:

.. list-table:: Possible Requirements
   :widths: 40 80
   :header-rows: 1

   * - Requirement
     - Meaning
   * - concurrent
     - This domain permits multiple non-default actions
   * - constrained-state
     - This domain uses state constraints
   * - continuous
     - This domain uses real-valued parameterized variables
   * - cpf-deterministic
     - This domain uses deterministic conditional functions for transitions (it is important to note that RDDL can also be used to model deterministic domains)
   * - integer-valued
     - This domain uses integer variables
   * - intermediate-nodes
     - This domain uses intermediate parameterized variable nodes
   * - multivalued
     - This domain uses enumerated parameterized variables
   * - partially-observed
     - This domain uses observation parameterized variables so it is treated as a POMDP (not an MDP as is otherwise the case)
   * - reward-deterministic
     - This domain does not use a stochastic reward
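
For instance, a domain with real-valued variables, several simultaneous actions, and a 
deterministic reward might declare the requirements below. This particular combination is 
an illustrative assumption and is not taken from any specific domain file:

.. code-block:: shell

    requirements { continuous, concurrent, reward-deterministic };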

Types
^^^^^^^^^^^^^^^^^^^

The ``types`` section defines the objects and user defined values, if any, that will 
be used in the domain, according to the following format:

.. code-block:: shell

    types {
         : object;
         : object;
        ...

         : {@, @, ... };
    };

An ``object`` is a user-defined parameter type that is used to parameterize variables. 
Objects are often the things or people that are simulated to move or act in the domain. 

For example, consider a simulation where elevators are to travel between different 
floors and open doors to allow people to get on and off the elevators to 
ultimately minimize the waiting time (see elevators.rddl example). ``person``
and ``elevator`` can be declared as objects in the domain as follows:

.. code-block:: shell

    types {
        person : object;
        elevator : object;
    };

The ``@`` prefix specifies that the given value should be treated as an object rather than a pvariable.
This symbol is generally optional for objects in expressions, however:

.. warning::
   If the ``@`` symbol is not prepended to an object, and the domain defines a variable 
   without parameters that has the same name as the object, then it is ambiguous whether the 
   object or the variable is being referred to inside an expression. 
   The compiler will raise an exception in this case.
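
The enumerated form of a type declaration can be sketched in the same way as the object 
declarations. Here ``direction`` and its values ``@up`` and ``@down`` are hypothetical names 
introduced only for this example:

.. code-block:: shell

    types {
        elevator : object;
        // hypothetical enumerated type with two possible values
        direction : { @up, @down };
    };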

Parameterized Variables (pVariables)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This section declares all variables used in the domain. 
These include constants, state and action variables, and potentially 
intermediate and observation variables. 
Ultimately, these variables serve as the conditioning parameters in 
state transitions. 
Each variable declared in this section is either parameterized by one or more 
objects or non-parameterized, and is declared according to the following 
format:

.. code-block:: shell

    pvariables {
    
        // parameterized variables
        (, [, ...]) : { , , default =  };
        (, [, ...]) : { , , default =  };
        ...
        
        // non-parameterized variables
         : { , , default =  };
         : { , , default =  };
        ...
    };

The first field inside the braces, the variable type, specifies the role of the declared variable. 
It can take one of the following five values:

* ``non-fluent``: a variable that never changes during a simulation. Non-fluents are initialized in the non-fluents block before the simulation starts.
* ``state-fluent``: a state variable, which represents the state of a simulation and is often used to describe the state or relative state of objects (e.g., locations, occupancy).
* ``interm-fluent``: an intermediate variable, used for an intermediate conditional probability calculation. Intermediate fluents must have a level of stratification, and are strictly stratified so that an intermediate variable can only condition on state variables or on intermediate variables of a strictly lower level.
* ``observ-fluent``: an observation variable, used as a conditional observation probability in a partially observable Markov decision process (POMDP).
* ``action-fluent``: an action variable, which represents an action in a simulation and is often used to describe whether a transition between two different states is happening.
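
A few declarations following this format might look like the sketch below. All variable 
names (``CAPACITY``, ``PENALTY``, ``person-waiting``, ``open-door``) and default values are 
hypothetical and are not taken from the elevators.rddl example. The second field in each 
declaration is the value type:

.. code-block:: shell

    pvariables {
        // constants, fixed for the whole simulation
        CAPACITY(elevator) : { non-fluent, int, default = 4 };
        PENALTY : { non-fluent, real, default = 1.0 };

        // state: whether a given person is currently waiting
        person-waiting(person) : { state-fluent, bool, default = false };

        // action: whether a given elevator opens its door at this step
        open-door(elevator) : { action-fluent, bool, default = false };
    };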

The second field inside the braces, the value type, specifies the values the declared variable can take on. 
It can be one of the following options:

* ``bool``: a boolean-valued variable (i.e., true or false). Note that these variables evaluate to 1 or 0 when used in arithmetic expressions.
* ``int``: an integer-valued variable (e.g., 1, 2, 3, 10, 100 ...)
* ``real``: a real-valued variable (e.g., 0.1, 0.25, 1.414, 2.718, 3.142 ...)
* an enumerated type name: a set of values defined by the user in the ``types`` section
* ``