Extracting information from the compiled planning problem.

This intermediate notebook shows how to access the compiled planning problem that underlies an environment.

First install the required packages:

%pip install --quiet --upgrade pip
%pip install --quiet git+https://github.com/pyrddlgym-project/pyRDDLGym.git
%pip install --quiet git+https://github.com/pyrddlgym-project/rddlrepository.git
Note: you may need to restart the kernel to use updated packages.

Import the required packages:

import pyRDDLGym

We will use the Wildfire problem from IPPC 2014 as our illustrative example:

env = pyRDDLGym.make('Wildfire_MDP_ippc2014', '1')

The compiled information about the planning problem is stored in the environment’s model field:

env.model
<pyRDDLGym.core.compiler.model.RDDLLiftedModel at 0x14ae9eda2d0>

To extract the domain and instance names:

print(env.model.domain_name)
print(env.model.instance_name)
wildfire_mdp
wildfire_inst_mdp__1

To get the objects of each domain type, along with any enum types:

print(env.model.type_to_objects)
print(env.model.enum_types)
{'x_pos': ['x1', 'x2', 'x3'], 'y_pos': ['y1', 'y2', 'y3']}
set()

To get all fluents by category (each dictionary maps a fluent name to its default values; only the names are printed here):

print(env.model.non_fluents.keys())
print(env.model.state_fluents.keys())
print(env.model.action_fluents.keys())
print(env.model.derived_fluents.keys())
print(env.model.interm_fluents.keys())
print(env.model.observ_fluents.keys())
dict_keys(['COST_CUTOUT', 'COST_PUTOUT', 'PENALTY_TARGET_BURN', 'PENALTY_NONTARGET_BURN', 'NEIGHBOR', 'TARGET'])
dict_keys(['burning', 'out-of-fuel'])
dict_keys(['put-out', 'cut-out'])
dict_keys([])
dict_keys([])
dict_keys([])

To get the value ranges of all fluents:

print(env.model.variable_ranges)
{'COST_CUTOUT': 'real', 'COST_PUTOUT': 'real', 'PENALTY_TARGET_BURN': 'real', 'PENALTY_NONTARGET_BURN': 'real', 'NEIGHBOR': 'bool', 'TARGET': 'bool', 'burning': 'bool', "burning'": 'bool', 'out-of-fuel': 'bool', "out-of-fuel'": 'bool', 'put-out': 'bool', 'cut-out': 'bool'}
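
Because the ranges come back as a plain dictionary, ordinary Python is enough to regroup them. The sketch below works on a literal copy of the output above rather than on a live env.model, and the names ranges, by_range and next_state are illustrative only. Names ending in a prime (') are the next-state versions of the state fluents.

```python
from collections import defaultdict

# Literal copy of the printed variable_ranges (not read from a live model).
ranges = {
    'COST_CUTOUT': 'real', 'COST_PUTOUT': 'real',
    'PENALTY_TARGET_BURN': 'real', 'PENALTY_NONTARGET_BURN': 'real',
    'NEIGHBOR': 'bool', 'TARGET': 'bool',
    'burning': 'bool', "burning'": 'bool',
    'out-of-fuel': 'bool', "out-of-fuel'": 'bool',
    'put-out': 'bool', 'cut-out': 'bool',
}

# Group variable names by their declared range.
by_range = defaultdict(list)
for name, value_range in ranges.items():
    by_range[value_range].append(name)

# Primed names are the next-state counterparts of the state fluents.
next_state = [name for name in ranges if name.endswith("'")]

print(dict(by_range))
print(next_state)
```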

To get the variable parameters of each fluent:

print(env.model.variable_params)
{'COST_CUTOUT': [], 'COST_PUTOUT': [], 'PENALTY_TARGET_BURN': [], 'PENALTY_NONTARGET_BURN': [], 'NEIGHBOR': ['x_pos', 'y_pos', 'x_pos', 'y_pos'], 'TARGET': ['x_pos', 'y_pos'], 'burning': ['x_pos', 'y_pos'], "burning'": ['x_pos', 'y_pos'], 'out-of-fuel': ['x_pos', 'y_pos'], "out-of-fuel'": ['x_pos', 'y_pos'], 'put-out': ['x_pos', 'y_pos'], 'cut-out': ['x_pos', 'y_pos']}
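
The parameter lists can be combined with the object lists shown earlier to work out how many grounded instances each fluent has. This is a minimal sketch on literal copies of a few entries from the outputs above; num_groundings is a hypothetical helper, not part of pyRDDLGym.

```python
from math import prod

# Literal copies of the printed outputs (a subset of the fluents).
type_to_objects = {'x_pos': ['x1', 'x2', 'x3'], 'y_pos': ['y1', 'y2', 'y3']}
variable_params = {
    'COST_CUTOUT': [],
    'NEIGHBOR': ['x_pos', 'y_pos', 'x_pos', 'y_pos'],
    'TARGET': ['x_pos', 'y_pos'],
    'burning': ['x_pos', 'y_pos'],
    'put-out': ['x_pos', 'y_pos'],
}

def num_groundings(params):
    """Number of object combinations for a list of parameter types."""
    # An empty parameter list gives an empty product, which is 1 (a scalar).
    return prod(len(type_to_objects[p]) for p in params)

counts = {name: num_groundings(params) for name, params in variable_params.items()}
print(counts)
```

On this 3 x 3 instance, each fluent over (x_pos, y_pos) has 9 groundings, while NEIGHBOR, which takes two grid cells, has 81.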

To get all CPF expressions:

print(env.model.cpfs)
{"burning'": ([('?x', 'x_pos'), ('?y', 'y_pos')], <pyRDDLGym.core.parser.expr.Expression object at 0x0000014AEB10F2C0>), "out-of-fuel'": ([('?x', 'x_pos'), ('?y', 'y_pos')], <pyRDDLGym.core.parser.expr.Expression object at 0x0000014AEB10FA10>)}

To get the reward function expression:

print(env.model.reward)
Expression(etype=('arithmetic', '+'), args=
    Expression(etype=('arithmetic', '+'), args=
        Expression(etype=('arithmetic', '+'), args=
            Expression(etype=('aggregation', 'sum'), args=
                ('typed_var', ('?x', 'x_pos'))
                ('typed_var', ('?y', 'y_pos'))
                Expression(etype=('arithmetic', '*'), args=
                    Expression(etype=('pvar', 'COST_CUTOUT'), args=('COST_CUTOUT', None))
                    Expression(etype=('pvar', 'cut-out'), args=(cut-out, [?x, ?y]))))
            Expression(etype=('aggregation', 'sum'), args=
                ('typed_var', ('?x', 'x_pos'))
                ('typed_var', ('?y', 'y_pos'))
                Expression(etype=('arithmetic', '*'), args=
                    Expression(etype=('pvar', 'COST_PUTOUT'), args=('COST_PUTOUT', None))
                    Expression(etype=('pvar', 'put-out'), args=(put-out, [?x, ?y])))))
        Expression(etype=('aggregation', 'sum'), args=
            ('typed_var', ('?x', 'x_pos'))
            ('typed_var', ('?y', 'y_pos'))
            Expression(etype=('arithmetic', '*'), args=
                Expression(etype=('pvar', 'PENALTY_TARGET_BURN'), args=('PENALTY_TARGET_BURN', None))
                Expression(etype=('boolean', '^'), args=
                    Expression(etype=('boolean', '|'), args=
                        Expression(etype=('pvar', 'burning'), args=(burning, [?x, ?y]))
                        Expression(etype=('pvar', 'out-of-fuel'), args=(out-of-fuel, [?x, ?y])))
                    Expression(etype=('pvar', 'TARGET'), args=(TARGET, [?x, ?y]))))))
    Expression(etype=('aggregation', 'sum'), args=
        ('typed_var', ('?x', 'x_pos'))
        ('typed_var', ('?y', 'y_pos'))
        Expression(etype=('arithmetic', '*'), args=
            Expression(etype=('pvar', 'PENALTY_NONTARGET_BURN'), args=('PENALTY_NONTARGET_BURN', None))
            Expression(etype=('boolean', '^'), args=
                Expression(etype=('pvar', 'burning'), args=(burning, [?x, ?y]))
                Expression(etype=('boolean', '~'), args=
                    Expression(etype=('pvar', 'TARGET'), args=(TARGET, [?x, ?y])))))))
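
The printout shows that an expression is a tree in which every node carries an etype pair and a tuple of args. A recursive walk over that structure can, for example, list the variables an expression mentions. The sketch below uses a hypothetical stand-in class, Node, that only mimics the etype and args fields visible in the printout; whether the real Expression class exposes them under the same attribute names is an assumption to check before adapting this.

```python
from dataclasses import dataclass

@dataclass
class Node:
    """Hypothetical stand-in mimicking the etype/args fields in the printout."""
    etype: tuple
    args: tuple

# One term of the reward: sum over (?x, ?y) of COST_CUTOUT * cut-out(?x, ?y).
term = Node(('aggregation', 'sum'), (
    ('typed_var', ('?x', 'x_pos')),
    ('typed_var', ('?y', 'y_pos')),
    Node(('arithmetic', '*'), (
        Node(('pvar', 'COST_CUTOUT'), ('COST_CUTOUT', None)),
        Node(('pvar', 'cut-out'), ('cut-out', ['?x', '?y'])),
    )),
))

def collect_pvars(node, found=None):
    """Return the names of all pvar leaves in depth-first order."""
    if found is None:
        found = []
    if isinstance(node, Node):
        if node.etype[0] == 'pvar':
            found.append(node.etype[1])
        else:
            # Non-Node arguments (such as typed_var tuples) are skipped.
            for arg in node.args:
                collect_pvars(arg, found)
    return found

pvars = collect_pvars(term)
print(pvars)
```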

To get the precondition and invariant expressions:

print(env.model.preconditions)
print(env.model.invariants)
[]
[]

To get the discount factor, the horizon, and the maximum number of concurrent actions:

print(env.model.discount)
print(env.model.horizon)
print(env.model.max_allowed_actions)
1.0
40
1
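
The discount factor and horizon determine how per-step rewards add up over an episode. As a minimal sketch, assuming the standard finite-horizon discounted return (the sum over steps t of discount**t times the reward at step t), the values above can be used as follows. The reward sequence is made up for illustration and discounted_return is a hypothetical helper.

```python
def discounted_return(rewards, discount):
    """Sum of discount**t * reward_t over the given reward sequence."""
    return sum(discount ** t * r for t, r in enumerate(rewards))

discount = 1.0   # value printed by env.model.discount above
horizon = 40     # value printed by env.model.horizon above

# Hypothetical episode: a constant reward of -5.0 at every step.
rewards = [-5.0] * horizon
total = discounted_return(rewards, discount)
print(total)
```

With a discount of 1.0 the return is simply the sum of the rewards over the 40 steps.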

To ground out a variable tensor into a dictionary of grounded fluents with values:

print(dict(env.model.ground_var_with_values('put-out', [True, True, False, True, False, True, False, True, False])))
{'put-out___x1__y1': True, 'put-out___x1__y2': True, 'put-out___x1__y3': False, 'put-out___x2__y1': True, 'put-out___x2__y2': False, 'put-out___x2__y3': True, 'put-out___x3__y1': False, 'put-out___x3__y2': True, 'put-out___x3__y3': False}
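
The grounded names in this output follow a visible pattern: the fluent name, three underscores, then the objects joined by two underscores, with the last parameter varying fastest. The sketch below reproduces that pattern with itertools.product for parameterized fluents only. It is a re-implementation inferred from the output above, not the library's own code, and grounded_names is a hypothetical helper.

```python
from itertools import product

# Literal copy of the printed type_to_objects.
type_to_objects = {'x_pos': ['x1', 'x2', 'x3'], 'y_pos': ['y1', 'y2', 'y3']}

def grounded_names(name, params):
    """Yield grounded names for a fluent with at least one parameter."""
    # product iterates with the last parameter varying fastest.
    for objects in product(*(type_to_objects[p] for p in params)):
        yield name + '___' + '__'.join(objects)

values = [True, True, False, True, False, True, False, True, False]
grounded = dict(zip(grounded_names('put-out', ['x_pos', 'y_pos']), values))
print(grounded)
```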

To determine whether an expression object is non-fluent (does not change value across decision time steps):

print(env.model.is_non_fluent_expression(env.model.reward))
False