OpenSpielWrapper

torchrl.envs.OpenSpielWrapper(*args, **kwargs)

Google DeepMind OpenSpiel environment wrapper.

GitHub: https://github.com/google-deepmind/open_spiel

Documentation: https://openspiel.readthedocs.io/en/latest/index.html

Parameters:

env (pyspiel.State) – the game to wrap.

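Keyword Arguments:
  • device (torch.device, optional) – if provided, the device to which the data is cast. Defaults to None.

  • batch_size (torch.Size, optional) – the batch size of the environment. Defaults to torch.Size([]).

  • allow_done_after_reset (bool, optional) – if True, the environment is allowed to be in a done state immediately after reset() is called. Defaults to False.

  • group_map (MarlGroupMapType or Dict[str, List[str]], optional) – how to group agents in tensordicts for input/output. See MarlGroupMapType for more information. Defaults to ALL_IN_ONE_GROUP.

  • categorical_actions (bool, optional) – if True, categorical specs are converted to the TorchRL equivalent (torchrl.data.Categorical); otherwise a one-hot encoding is used (torchrl.data.OneHot). Defaults to False.

  • return_state (bool, optional) – if True, "state" is included in the output of reset() and step(). The state can be passed to reset() to reset to that state rather than to the initial state. Defaults to False.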
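
The difference between the two action encodings selected by categorical_actions can be illustrated without TorchRL. The sketch below is plain Python and is not the library's implementation; the helper names to_one_hot and to_categorical are hypothetical and exist only for this illustration. It shows that a one-hot vector and an integer index carry the same information about which action was chosen.

```python
# Illustration only: plain Python, not TorchRL's OneHot / Categorical spec classes.
# Helper names below are hypothetical.


def to_one_hot(index: int, num_actions: int) -> list:
    """Encode an action index as a one-hot vector (the categorical_actions=False style)."""
    if not 0 <= index < num_actions:
        raise ValueError(f"index {index} is outside [0, {num_actions})")
    vec = [0] * num_actions
    vec[index] = 1
    return vec


def to_categorical(one_hot: list) -> int:
    """Recover the integer action index (the categorical_actions=True style)."""
    if sum(one_hot) != 1 or any(v not in (0, 1) for v in one_hot):
        raise ValueError("expected exactly one entry equal to 1")
    return one_hot.index(1)


if __name__ == "__main__":
    encoded = to_one_hot(2, 5)
    print(encoded)                  # one-hot form of action 2 out of 5
    print(to_categorical(encoded))  # integer form of the same action
```

The one-hot form grows with the size of the action space, while the categorical form stays a single integer per agent, which is the usual reason to prefer one over the other.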

Variables:

available_envs – the environments available to build.

Examples

>>> import pyspiel
>>> from torchrl.envs import OpenSpielWrapper
>>> from tensordict import TensorDict
>>> base_env = pyspiel.load_game('chess').new_initial_state()
>>> env = OpenSpielWrapper(base_env, return_state=True)
>>> td = env.reset()
>>> td = env.step(env.full_action_spec.rand())
>>> print(td)
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([2, 4672]), device=cpu, dtype=torch.int64, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False),
        next: TensorDict(
            fields={
                agents: TensorDict(
                    fields={
                        observation: Tensor(shape=torch.Size([2, 20, 8, 8]), device=cpu, dtype=torch.float32, is_shared=False),
                        reward: Tensor(shape=torch.Size([2, 1]), device=cpu, dtype=torch.float32, is_shared=False)},
                    batch_size=torch.Size([2]),
                    device=None,
                    is_shared=False),
                current_player: Tensor(shape=torch.Size([]), device=cpu, dtype=torch.int32, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                state: NonTensorData(data=FEN: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
                3009
                , batch_size=torch.Size([]), device=None),
                terminated: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False)},
            batch_size=torch.Size([]),
            device=None,
            is_shared=False)},
    batch_size=torch.Size([]),
    device=None,
    is_shared=False)
>>> print(env.available_envs)
['2048', 'add_noise', 'amazons', 'backgammon', ...]
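
In the printed TensorDict above, both players sit under a single "agents" entry whose leading dimension is 2, which is consistent with the default group_map of ALL_IN_ONE_GROUP. As a rough sketch of what a group map looks like in its Dict[str, List[str]] form, the plain-Python helpers below build the two most common layouts. The helper functions and the agent names "player_0" and "player_1" are assumptions made for illustration; they are not taken from the wrapper's source.

```python
# Illustration only: plain dictionaries in the Dict[str, List[str]] shape accepted
# by group_map. Function names and agent names are hypothetical.
from typing import Dict, List


def all_in_one_group(agent_names: List[str]) -> Dict[str, List[str]]:
    """Put every agent in one group called "agents" (ALL_IN_ONE_GROUP-style layout)."""
    return {"agents": list(agent_names)}


def one_group_per_agent(agent_names: List[str]) -> Dict[str, List[str]]:
    """Give each agent its own group, keyed by the agent's name."""
    return {name: [name] for name in agent_names}


if __name__ == "__main__":
    players = ["player_0", "player_1"]  # hypothetical agent names
    print(all_in_one_group(players))
    print(one_group_per_agent(players))
```

With a single group, per-agent tensors are stacked along a shared leading dimension, as in the example output; with one group per agent, each agent would get its own nested entry instead.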

reset() can restore a specific state rather than the initial state, as long as return_state=True.

>>> import pyspiel
>>> from torchrl.envs import OpenSpielWrapper
>>> from tensordict import TensorDict
>>> base_env = pyspiel.load_game('chess').new_initial_state()
>>> env = OpenSpielWrapper(base_env, return_state=True)
>>> td = env.reset()
>>> td = env.step(env.full_action_spec.rand())
>>> td_restore = td["next"]
>>> td = env.step(env.full_action_spec.rand())
>>> # Current state is not equal to `td_restore`
>>> (td["next"] == td_restore).all()
False
>>> td = env.reset(td_restore)
>>> # After resetting, now the current state is equal to `td_restore`
>>> (td == td_restore).all()
True
