MultiAgentConvNet
- class torchrl.modules.MultiAgentConvNet(n_agents: int, centralized: bool | None = None, share_params: bool | None = None, *, in_features: int | None = None, device: DEVICE_TYPING | None = None, num_cells: Sequence[int] | None = None, kernel_sizes: Sequence[int | Sequence[int]] | int = 5, strides: Sequence | int = 2, paddings: Sequence | int = 0, activation_class: type[nn.Module] = <class 'torch.nn.modules.activation.ELU'>, use_td_params: bool = True, **kwargs)
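- Multi-agent CNN.

  In MARL settings, agents may or may not share the same policy for their actions: we say that the parameters are shared or not shared. Similarly, a network may take the entire observation space of all agents as input, or compute its output on a per-agent basis; we refer to these as "centralized" and "non-centralized", respectively.

  It expects inputs with shape (*B, n_agents, channels, x, y).

  Note: to initialize the MARL module parameters with the torch.nn.init module, refer to the get_stateful_net() and from_stateful_net() methods.

- Parameters:
- n_agents (int) – number of agents.
- centralized (bool) – if True, each agent computes its output from the observations of all agents; otherwise, each agent uses only its own observation.
- share_params (bool) – if True, the same ConvNet parameters are used for all agents; otherwise, each agent has its own ConvNet.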
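- Keyword Arguments:
- in_features (int, optional) – the input feature dimension. If left as None, lazy modules are used.
- device (str or torch.device, optional) – the device on which to create the module.
- num_cells (int or Sequence[int], optional) – number of cells of every layer between the input and the output. If an integer is provided, every layer has the same number of cells. If an iterable is provided, the output size of each layer (for this convolutional network, its number of output channels) matches the corresponding entry of num_cells.
- kernel_sizes (int, Sequence[Union[int, Sequence[int]]]) – kernel size(s) of the convolutional network. Defaults to 5.
- strides (int or Sequence[int]) – stride(s) of the convolutional network. If iterable, its length must match the depth defined by the num_cells or depth arguments. Defaults to 2.
- activation_class (Type[nn.Module]) – activation class to use. Defaults to torch.nn.ELU.
- use_td_params (bool, optional) – if True, the parameters can be found in self.params, which is a TensorDictParams object (it inherits from both TensorDict and nn.Module). If False, the parameters are contained in self._empty_net. The two approaches should behave roughly the same, but they are not interchangeable: for instance, a state_dict created with use_td_params=True cannot be used with use_td_params=False.
- **kwargs – arguments of ConvNet can be passed to customize the ConvNet.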
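The size of the last output dimension follows from kernel_sizes, strides, paddings and the spatial size of the input. The sketch below estimates it in plain Python. It assumes each layer follows the standard Conv2d output-size rule floor((n + 2p - k) / s) + 1 (square kernels, dilation 1) and that the final feature maps are flattened into one vector per agent, as the SquashDims layer in the example on this page suggests. The helpers conv_out_size and flat_features are hypothetical names, not part of TorchRL, and num_cells=(32, 32, 32) is read off the example's printed layers rather than taken from a documented default.

```python
from typing import Sequence


def conv_out_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial size after one convolution (dilation 1): floor((n + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


def flat_features(
    x: int,
    y: int,
    num_cells: Sequence[int] = (32, 32, 32),  # assumption: taken from the printed Conv2d layers
    kernel_size: int = 5,
    stride: int = 2,
    padding: int = 0,
) -> int:
    """Size of the flattened per-agent feature vector after all conv layers."""
    for _ in num_cells:
        x = conv_out_size(x, kernel_size, stride, padding)
        y = conv_out_size(y, kernel_size, stride, padding)
    # The last layer's channel count times the remaining spatial area.
    return num_cells[-1] * x * y


# 100 -> 48 -> 22 -> 9 along each spatial axis, then 32 * 9 * 9 features.
print(flat_features(100, 100))  # 2592
```

With a 100 x 100 input this gives 2592, which matches the last dimension of the output shape printed in the example.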
 
- Examples:
    >>> import torch
    >>> from torchrl.modules import MultiAgentConvNet
    >>> batch = (3, 2)
    >>> n_agents = 7
    >>> channels, x, y = 4, 100, 100
    >>> obs = torch.randn(*batch, n_agents, channels, x, y)
    >>> # Let's consider a centralized network with shared parameters.
    >>> cnn = MultiAgentConvNet(
    ...     n_agents,
    ...     centralized=True,
    ...     share_params=True,
    ... )
    >>> print(cnn)
    MultiAgentConvNet(
      (agent_networks): ModuleList(
        (0): ConvNet(
          (0): LazyConv2d(0, 32, kernel_size=(5, 5), stride=(2, 2))
          (1): ELU(alpha=1.0)
          (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (3): ELU(alpha=1.0)
          (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (5): ELU(alpha=1.0)
          (6): SquashDims()
        )
      )
    )
    >>> result = cnn(obs)
    >>> # The size of the last output dimension is determined by the layer
    >>> # arguments and by the shape of the input `obs`.
    >>> print(result.shape)
    torch.Size([3, 2, 7, 2592])
    >>> # Since both observations and parameters are shared, we expect all agents
    >>> # to have identical outputs (e.g. for a value function).
    >>> print(all(result[0,0,0] == result[0,0,1]))
    True

    >>> # Alternatively, a local network with parameter sharing
    >>> # (e.g. a decentralized policy with shared weights).
    >>> cnn = MultiAgentConvNet(
    ...     n_agents,
    ...     centralized=False,
    ...     share_params=True,
    ... )
    >>> print(cnn)
    MultiAgentConvNet(
      (agent_networks): ModuleList(
        (0): ConvNet(
          (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2))
          (1): ELU(alpha=1.0)
          (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (3): ELU(alpha=1.0)
          (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (5): ELU(alpha=1.0)
          (6): SquashDims()
        )
      )
    )
    >>> result = cnn(obs)
    >>> print(result.shape)
    torch.Size([3, 2, 7, 2592])
    >>> # Parameters are shared but observations are not, hence each agent has a different output.
    >>> print(all(result[0,0,0] == result[0,0,1]))
    False

    >>> # Or multiple local networks, identical in structure but with different weights.
    >>> cnn = MultiAgentConvNet(
    ...     n_agents,
    ...     centralized=False,
    ...     share_params=False,
    ... )
    >>> print(cnn)
    MultiAgentConvNet(
      (agent_networks): ModuleList(
        (0-6): 7 x ConvNet(
          (0): Conv2d(4, 32, kernel_size=(5, 5), stride=(2, 2))
          (1): ELU(alpha=1.0)
          (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (3): ELU(alpha=1.0)
          (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (5): ELU(alpha=1.0)
          (6): SquashDims()
        )
      )
    )
    >>> result = cnn(obs)
    >>> print(result.shape)
    torch.Size([3, 2, 7, 2592])
    >>> print(all(result[0,0,0] == result[0,0,1]))
    False

    >>> # Or where inputs are shared but parameters are not.
    >>> cnn = MultiAgentConvNet(
    ...     n_agents,
    ...     centralized=True,
    ...     share_params=False,
    ... )
    >>> print(cnn)
    MultiAgentConvNet(
      (agent_networks): ModuleList(
        (0-6): 7 x ConvNet(
          (0): Conv2d(28, 32, kernel_size=(5, 5), stride=(2, 2))
          (1): ELU(alpha=1.0)
          (2): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (3): ELU(alpha=1.0)
          (4): Conv2d(32, 32, kernel_size=(5, 5), stride=(2, 2))
          (5): ELU(alpha=1.0)
          (6): SquashDims()
        )
      )
    )
    >>> result = cnn(obs)
    >>> print(result.shape)
    torch.Size([3, 2, 7, 2592])
    >>> print(all(result[0,0,0] == result[0,0,1]))
    False