The blinker wrapper can wrap any Gym environment, adding a parallel observe action. Choosing observe costs a configurable amount of reward (i.e. it produces a negative reward), but it is required to obtain a fresh observation. If observe is not chosen, the observation remains stale. This forces agents to choose the best times to observe, and to skip observing when they can predict the relevant world state.
Additionally, the render(human=true) method will show a visual indication when an observation is being made.
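The observe mechanism described above can be illustrated with a minimal sketch. This is not the blinker source: the class name ObserveCostWrapper, the observe_cost parameter, the (env_action, observe) action tuple, and the toy CounterEnv are all hypothetical names chosen for illustration. The sketch assumes the classic Gym reset/step interface, and it assumes that reset returns a free fresh observation; the real wrapper may differ on both points.

```python
class CounterEnv:
    """Toy stand-in for a Gym environment (classic reset/step API)."""

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        # The observation is simply the step count; reward is always 1.0.
        self.t += 1
        done = self.t >= 5
        return self.t, 1.0, done, {}


class ObserveCostWrapper:
    """Hypothetical sketch: a parallel 'observe' action that costs reward."""

    def __init__(self, env, observe_cost=0.1):
        self.env = env
        self.observe_cost = observe_cost  # reward deducted per observation
        self._last_obs = None             # most recent observation shown to the agent

    def reset(self):
        # Assumption: the initial observation is given for free.
        self._last_obs = self.env.reset()
        return self._last_obs

    def step(self, action):
        # The action is a pair: the wrapped env's action plus an observe flag.
        env_action, observe = action
        obs, reward, done, info = self.env.step(env_action)
        if observe:
            # Pay the cost and refresh the observation.
            self._last_obs = obs
            reward -= self.observe_cost
        # Otherwise the agent keeps seeing the stale observation.
        info = dict(info, observed=bool(observe))
        return self._last_obs, reward, done, info


if __name__ == "__main__":
    env = ObserveCostWrapper(CounterEnv(), observe_cost=0.25)
    print(env.reset())
    print(env.step((0, False)))  # stale observation, full reward
    print(env.step((0, True)))   # fresh observation, reward reduced by the cost
```

In this sketch the underlying environment always advances; only what the agent sees depends on the observe flag, which is what makes predicting the world state worthwhile.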