MarlGroupMapType

torchrl.envs.MarlGroupMapType(value, names=None, *, module=None, qualname=None, type=None, start=1)

MARL group map type.

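torchrl's multi-agent support lets you control how the agents in your environment are grouped. You can group agents together (stacking their tensors) to take advantage of vectorization when they are passed through the same neural network. You can also split agents into separate groups when they are heterogeneous or should be processed by different neural networks. To define the grouping, pass a group_map when constructing the environment.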
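
A minimal sketch of what a custom group_map might look like: a plain dictionary from group name to the list of agent names in that group. The group names ("team_a", "team_b") and the helper validate_group_map are hypothetical and not part of torchrl; the rule that every agent belongs to exactly one non-empty group is an assumption made for this illustration.

```python
from typing import Dict, List


def validate_group_map(group_map: Dict[str, List[str]], agent_names: List[str]) -> None:
    """Hypothetical helper: raise ValueError unless each agent is in exactly one group."""
    seen: List[str] = []
    for group_name, members in group_map.items():
        if not members:
            raise ValueError(f"group {group_name!r} is empty")
        seen.extend(members)
    if len(seen) != len(set(seen)):
        raise ValueError("an agent appears in more than one group")
    if set(seen) != set(agent_names):
        raise ValueError("group_map does not cover exactly the environment's agents")


agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]

# Two heterogeneous teams, each meant to be handled by its own network.
custom_group_map = {
    "team_a": ["agent_0", "agent_1"],
    "team_b": ["agent_2", "agent_3"],
}

validate_group_map(custom_group_map, agent_names)  # passes silently
```

A dictionary of this form is what would be passed as group_map at environment construction; the exact constructor arguments depend on the environment wrapper you use.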

Alternatively, you can choose one of the predefined grouping strategies from this class.
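
One way to picture the two predefined strategies is as functions from a list of agent names to a group map. The sketch below is a pure-Python stand-in, not torchrl's implementation: the enum GroupingStrategy and the function build_group_map are hypothetical names introduced for illustration, and only the member names and the resulting dictionaries follow what this page documents.

```python
from enum import Enum
from typing import Dict, List


class GroupingStrategy(Enum):
    """Hypothetical stand-in mirroring the two strategies documented on this page."""

    ALL_IN_ONE_GROUP = 1
    ONE_GROUP_PER_AGENT = 2


def build_group_map(strategy: GroupingStrategy, agent_names: List[str]) -> Dict[str, List[str]]:
    """Return the group map that the given strategy yields for these agents."""
    if strategy is GroupingStrategy.ALL_IN_ONE_GROUP:
        # A single group named "agents" holds every agent.
        return {"agents": list(agent_names)}
    if strategy is GroupingStrategy.ONE_GROUP_PER_AGENT:
        # Each agent gets its own group, named after the agent.
        return {name: [name] for name in agent_names}
    raise ValueError(f"unknown strategy: {strategy}")


agent_names = ["agent_0", "agent_1", "agent_2", "agent_3"]
one_group = build_group_map(GroupingStrategy.ALL_IN_ONE_GROUP, agent_names)
per_agent = build_group_map(GroupingStrategy.ONE_GROUP_PER_AGENT, agent_names)
```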

With group_map=MarlGroupMapType.ALL_IN_ONE_GROUP and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and coming out of your environment will look like this:

>>> print(env.rand_action(env.reset()))
TensorDict(
    fields={
        agents: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([4, 9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([4, 1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([4, 3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([4]))},
    batch_size=torch.Size([]))
>>> print(env.group_map)
{"agents": ["agent_0", "agent_1", "agent_2", "agent_3"]}

With group_map=MarlGroupMapType.ONE_GROUP_PER_AGENT and agents ["agent_0", "agent_1", "agent_2", "agent_3"], the tensordicts going into and coming out of your environment will look like this:

>>> print(env.rand_action(env.reset()))
TensorDict(
    fields={
        agent_0: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_1: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_2: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([])),
        agent_3: TensorDict(
            fields={
                action: Tensor(shape=torch.Size([9]), device=cpu, dtype=torch.int64, is_shared=False),
                done: Tensor(shape=torch.Size([1]), device=cpu, dtype=torch.bool, is_shared=False),
                observation: Tensor(shape=torch.Size([3, 3, 2]), device=cpu, dtype=torch.int8, is_shared=False)},
            batch_size=torch.Size([]))},
    batch_size=torch.Size([]))
>>> print(env.group_map)
{"agent_0": ["agent_0"], "agent_1": ["agent_1"], "agent_2": ["agent_2"], "agent_3": ["agent_3"]}
