Extracting information from the compiled planning problem
In this intermediate notebook, we discuss how to access the underlying compiled planning problem.
First install the required packages:
%pip install --quiet --upgrade pip
%pip install --quiet git+https://github.com/pyrddlgym-project/pyRDDLGym.git
%pip install --quiet git+https://github.com/pyrddlgym-project/rddlrepository.git
Note: you may need to restart the kernel to use updated packages.
Import the required packages:
import pyRDDLGym
We will use the Wildfire problem from IPPC 2014 as our illustrative example:
env = pyRDDLGym.make('Wildfire_MDP_ippc2014', '1')
The compiled information about the planning problem is stored in the environment’s model field:
env.model
<pyRDDLGym.core.compiler.model.RDDLLiftedModel at 0x14ae9eda2d0>
To extract the domain and instance names:
print(env.model.domain_name)
print(env.model.instance_name)
wildfire_mdp
wildfire_inst_mdp__1
To get the objects of each domain and enum type:
print(env.model.type_to_objects)
print(env.model.enum_types)
{'x_pos': ['x1', 'x2', 'x3'], 'y_pos': ['y1', 'y2', 'y3']}
set()
To get all fluents by category (each dictionary maps fluent names to their default values; only the names are printed here):
print(env.model.non_fluents.keys())
print(env.model.state_fluents.keys())
print(env.model.action_fluents.keys())
print(env.model.derived_fluents.keys())
print(env.model.interm_fluents.keys())
print(env.model.observ_fluents.keys())
dict_keys(['COST_CUTOUT', 'COST_PUTOUT', 'PENALTY_TARGET_BURN', 'PENALTY_NONTARGET_BURN', 'NEIGHBOR', 'TARGET'])
dict_keys(['burning', 'out-of-fuel'])
dict_keys(['put-out', 'cut-out'])
dict_keys([])
dict_keys([])
dict_keys([])
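The per-category dictionaries can be combined into a single lookup from fluent name to category. The following is a minimal sketch that copies the fluent names from the output above instead of reading them from env.model, so it runs without pyRDDLGym installed; the category labels and the name category_of are illustrative choices, not part of the library.

```python
# Fluent names copied from the printed output above. In a live session these
# would come from env.model.non_fluents.keys(), env.model.state_fluents.keys(),
# and so on. The category labels are illustrative, not pyRDDLGym identifiers.
fluents_by_category = {
    'non-fluent': ['COST_CUTOUT', 'COST_PUTOUT', 'PENALTY_TARGET_BURN',
                   'PENALTY_NONTARGET_BURN', 'NEIGHBOR', 'TARGET'],
    'state-fluent': ['burning', 'out-of-fuel'],
    'action-fluent': ['put-out', 'cut-out'],
    'derived-fluent': [],
    'interm-fluent': [],
    'observ-fluent': [],
}

# Invert the mapping so that each fluent name points to its category.
category_of = {
    name: category
    for category, names in fluents_by_category.items()
    for name in names
}

print(category_of['burning'])   # state-fluent
print(category_of['NEIGHBOR'])  # non-fluent
```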
To get the value ranges of all fluents:
print(env.model.variable_ranges)
{'COST_CUTOUT': 'real', 'COST_PUTOUT': 'real', 'PENALTY_TARGET_BURN': 'real', 'PENALTY_NONTARGET_BURN': 'real', 'NEIGHBOR': 'bool', 'TARGET': 'bool', 'burning': 'bool', "burning'": 'bool', 'out-of-fuel': 'bool', "out-of-fuel'": 'bool', 'put-out': 'bool', 'cut-out': 'bool'}
To get the variable parameters of each fluent:
print(env.model.variable_params)
{'COST_CUTOUT': [], 'COST_PUTOUT': [], 'PENALTY_TARGET_BURN': [], 'PENALTY_NONTARGET_BURN': [], 'NEIGHBOR': ['x_pos', 'y_pos', 'x_pos', 'y_pos'], 'TARGET': ['x_pos', 'y_pos'], 'burning': ['x_pos', 'y_pos'], "burning'": ['x_pos', 'y_pos'], 'out-of-fuel': ['x_pos', 'y_pos'], "out-of-fuel'": ['x_pos', 'y_pos'], 'put-out': ['x_pos', 'y_pos'], 'cut-out': ['x_pos', 'y_pos']}
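Together with the objects of each type, the parameter lists determine how many grounded instances each variable has. The sketch below works this out for a few of the variables, using values copied from the outputs above; the helper count_groundings is a hypothetical name introduced for illustration and is not part of pyRDDLGym.

```python
import math

# Values copied from the outputs of env.model.type_to_objects and
# env.model.variable_params shown above (subset of variables only).
type_to_objects = {'x_pos': ['x1', 'x2', 'x3'], 'y_pos': ['y1', 'y2', 'y3']}
variable_params = {
    'COST_CUTOUT': [],
    'NEIGHBOR': ['x_pos', 'y_pos', 'x_pos', 'y_pos'],
    'burning': ['x_pos', 'y_pos'],
    'put-out': ['x_pos', 'y_pos'],
}


def count_groundings(params, objects_by_type):
    """Number of object tuples for a parameter list (hypothetical helper).

    A variable with no parameters has exactly one grounding, which is what
    math.prod returns for an empty sequence.
    """
    return math.prod(len(objects_by_type[p]) for p in params)


grounding_counts = {
    name: count_groundings(params, type_to_objects)
    for name, params in variable_params.items()
}
print(grounding_counts)
# {'COST_CUTOUT': 1, 'NEIGHBOR': 81, 'burning': 9, 'put-out': 9}
```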
To get all CPF expressions:
print(env.model.cpfs)
{"burning'": ([('?x', 'x_pos'), ('?y', 'y_pos')], <pyRDDLGym.core.parser.expr.Expression object at 0x0000014AEB10F2C0>), "out-of-fuel'": ([('?x', 'x_pos'), ('?y', 'y_pos')], <pyRDDLGym.core.parser.expr.Expression object at 0x0000014AEB10FA10>)}
To get the reward function expression:
print(env.model.reward)
Expression(etype=('arithmetic', '+'), args=
Expression(etype=('arithmetic', '+'), args=
Expression(etype=('arithmetic', '+'), args=
Expression(etype=('aggregation', 'sum'), args=
('typed_var', ('?x', 'x_pos'))
('typed_var', ('?y', 'y_pos'))
Expression(etype=('arithmetic', '*'), args=
Expression(etype=('pvar', 'COST_CUTOUT'), args=('COST_CUTOUT', None))
Expression(etype=('pvar', 'cut-out'), args=(cut-out, [?x, ?y]))))
Expression(etype=('aggregation', 'sum'), args=
('typed_var', ('?x', 'x_pos'))
('typed_var', ('?y', 'y_pos'))
Expression(etype=('arithmetic', '*'), args=
Expression(etype=('pvar', 'COST_PUTOUT'), args=('COST_PUTOUT', None))
Expression(etype=('pvar', 'put-out'), args=(put-out, [?x, ?y])))))
Expression(etype=('aggregation', 'sum'), args=
('typed_var', ('?x', 'x_pos'))
('typed_var', ('?y', 'y_pos'))
Expression(etype=('arithmetic', '*'), args=
Expression(etype=('pvar', 'PENALTY_TARGET_BURN'), args=('PENALTY_TARGET_BURN', None))
Expression(etype=('boolean', '^'), args=
Expression(etype=('boolean', '|'), args=
Expression(etype=('pvar', 'burning'), args=(burning, [?x, ?y]))
Expression(etype=('pvar', 'out-of-fuel'), args=(out-of-fuel, [?x, ?y])))
Expression(etype=('pvar', 'TARGET'), args=(TARGET, [?x, ?y]))))))
Expression(etype=('aggregation', 'sum'), args=
('typed_var', ('?x', 'x_pos'))
('typed_var', ('?y', 'y_pos'))
Expression(etype=('arithmetic', '*'), args=
Expression(etype=('pvar', 'PENALTY_NONTARGET_BURN'), args=('PENALTY_NONTARGET_BURN', None))
Expression(etype=('boolean', '^'), args=
Expression(etype=('pvar', 'burning'), args=(burning, [?x, ?y]))
Expression(etype=('boolean', '~'), args=
Expression(etype=('pvar', 'TARGET'), args=(TARGET, [?x, ?y])))))))
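The printout shows that each expression node carries an etype pair and an args sequence, which suggests a simple recursive traversal. The sketch below collects the names of all variables referenced in an expression. It uses a hypothetical stand-in class Expr that mimics only the etype and args fields visible in the printout, so it runs without pyRDDLGym; the helper collect_pvars is likewise an illustrative name, and the assumption is that the same traversal would apply to the real expression objects.

```python
from dataclasses import dataclass


@dataclass
class Expr:
    """Hypothetical stand-in with the etype and args fields seen in the printout."""
    etype: tuple
    args: tuple


def collect_pvars(node, found=None):
    """Return the set of variable names referenced anywhere below node."""
    if found is None:
        found = set()
    if isinstance(node, Expr):
        if node.etype[0] == 'pvar':
            # For a pvar node, args is (name, parameters).
            found.add(node.args[0])
        else:
            for arg in node.args:
                collect_pvars(arg, found)
    # Anything else (such as a typed_var tuple) is not an expression: skip it.
    return found


# The last summand of the reward printed above, rebuilt by hand.
term = Expr(('aggregation', 'sum'), (
    ('typed_var', ('?x', 'x_pos')),
    ('typed_var', ('?y', 'y_pos')),
    Expr(('arithmetic', '*'), (
        Expr(('pvar', 'PENALTY_NONTARGET_BURN'), ('PENALTY_NONTARGET_BURN', None)),
        Expr(('boolean', '^'), (
            Expr(('pvar', 'burning'), ('burning', ['?x', '?y'])),
            Expr(('boolean', '~'), (
                Expr(('pvar', 'TARGET'), ('TARGET', ['?x', '?y'])),
            )),
        )),
    )),
))

print(sorted(collect_pvars(term)))
# ['PENALTY_NONTARGET_BURN', 'TARGET', 'burning']
```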
To get the precondition and invariant expressions:
print(env.model.preconditions)
print(env.model.invariants)
[]
[]
To get the discount factor, the horizon, and the maximum number of concurrent actions:
print(env.model.discount)
print(env.model.horizon)
print(env.model.max_allowed_actions)
1.0
40
1
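One common use of the discount factor and horizon is computing the return of a rollout. The sketch below applies the standard discounted-sum definition with the values printed above as defaults; the reward sequence is made up for illustration, and discounted_return is a hypothetical helper, not a pyRDDLGym function.

```python
def discounted_return(rewards, discount=1.0, horizon=40):
    """Sum of discount**t * r_t over the first `horizon` steps (hypothetical helper)."""
    return sum((discount ** t) * r for t, r in enumerate(rewards[:horizon]))


# Made-up rollout: a constant reward of -5.0 at each of 40 steps.
rewards = [-5.0] * 40

print(discounted_return(rewards))                           # -200.0
print(discounted_return(rewards, discount=0.5, horizon=3))  # -8.75
```

With a discount of 1.0, as in this instance, the return is simply the sum of the per-step rewards.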
To ground a variable's tensor of values (passed here as a flat list) into a dictionary mapping grounded fluent names to values:
print(dict(env.model.ground_var_with_values('put-out', [True, True, False, True, False, True, False, True, False])))
{'put-out___x1__y1': True, 'put-out___x1__y2': True, 'put-out___x1__y3': False, 'put-out___x2__y1': True, 'put-out___x2__y2': False, 'put-out___x2__y3': True, 'put-out___x3__y1': False, 'put-out___x3__y2': True, 'put-out___x3__y3': False}
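The output suggests a naming pattern: the variable name, three underscores, then the object names joined by two underscores, with the last parameter varying fastest. The sketch below reproduces that pattern in plain Python to make the ordering explicit. It is inferred from the output above and is not the library's implementation; ground_names is a hypothetical helper.

```python
from itertools import product


def ground_names(name, object_lists):
    """Yield grounded names in the order inferred from the output above."""
    for combo in product(*object_lists):
        # product varies the last list fastest: (x1, y1), (x1, y2), (x1, y3), ...
        yield name + '___' + '__'.join(combo)


objects = [['x1', 'x2', 'x3'], ['y1', 'y2', 'y3']]
values = [True, True, False, True, False, True, False, True, False]

# Pair each grounded name with the value at the same flat index.
grounded = dict(zip(ground_names('put-out', objects), values))
print(grounded['put-out___x2__y3'])  # True
```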
To determine whether an expression object is non-fluent (does not change value across decision time steps):
print(env.model.is_non_fluent_expression(env.model.reward))
False