CatFrames

class torchrl.envs.transforms.CatFrames(N: int, dim: int, in_keys: Sequence[NestedKey] | None = None, out_keys: Sequence[NestedKey] | None = None, padding='same', padding_value=0, as_inverse=False, reset_key: NestedKey | None = None, done_key: NestedKey | None = None)

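Concatenates successive observation frames into a single tensor.

This transform is useful for creating a sense of motion or velocity in the observed features. It can also be used with models that need access to past observations, such as transformers. It was first proposed in "Playing Atari with Deep Reinforcement Learning" (https://arxiv.org/pdf/1312.5602.pdf).

When used as a transform within a transformed environment, CatFrames is a stateful class. It can be reset to its initial state by calling the reset() method, which accepts tensordicts with a "_reset" entry indicating which buffers to reset.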
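The stateful behaviour described above can be pictured with a small rolling buffer. The following is only an illustrative sketch in plain Python, not TorchRL's implementation: the `FrameBuffer` class and its methods are hypothetical names, frames are represented by integers, and the two padding modes are assumed to behave as described for the `padding` and `padding_value` arguments.

```python
from collections import deque


class FrameBuffer:
    """Toy stand-in (hypothetical) for the rolling buffer of a frame-stacking transform."""

    def __init__(self, n, padding="same", padding_value=0):
        if padding not in ("same", "constant"):
            raise ValueError(f"unknown padding mode: {padding!r}")
        self.n = n
        self.padding = padding
        self.padding_value = padding_value
        self.frames = deque(maxlen=n)

    def reset(self, first_frame):
        # Refill the whole buffer so that a full stack is available right after reset.
        fill = first_frame if self.padding == "same" else self.padding_value
        self.frames.clear()
        self.frames.extend([fill] * (self.n - 1))
        self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque silently drops the oldest frame.
        self.frames.append(frame)
        return list(self.frames)


same_buf = FrameBuffer(n=4)  # padding="same": repeat the first frame
same_reset = same_buf.reset(10)
same_step = same_buf.step(11)

const_buf = FrameBuffer(n=4, padding="constant", padding_value=0)
const_reset = const_buf.reset(10)
const_step = const_buf.step(11)
```

In this sketch a reset discards every frame from the previous episode, which is why a stacking transform has to know which buffers were reset.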

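Parameters:

N (int) – number of observations to concatenate.
dim (int) – dimension along which to concatenate the observations. Should be negative, so that it stays compatible with environments of different batch_size.
in_keys (sequence of NestedKey, optional) – keys pointing to the frames that have to be concatenated. Defaults to ["pixels"].
out_keys (sequence of NestedKey, optional) – keys pointing to where the output has to be written. Defaults to the value of in_keys.
padding (str, optional) – the padding method. One of "same" or "constant". Defaults to "same", i.e. the first value is used for padding.
padding_value (float, optional) – the value to use for padding if padding="constant". Defaults to 0.
as_inverse (bool, optional) – if True, the transform is applied as an inverse transform. Defaults to False.
reset_key (NestedKey, optional) – the reset key to be used as partial reset indicator. Must be unique. If not provided, defaults to the only reset key of the parent environment (if it has only one) and raises an exception otherwise.
done_key (NestedKey, optional) – the done key to be used as partial done indicator. Must be unique. If not provided, defaults to "done".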
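To see why a negative dim is recommended, it can help to work out the resulting shapes by hand. The helper below is a hypothetical, plain-Python sketch (it is not part of TorchRL) that computes the shape produced by concatenating n tensors of a given shape along dim, with and without a leading batch dimension.

```python
def cat_frames_shape(shape, dim, n):
    """Shape obtained by concatenating n tensors of `shape` along `dim` (illustrative helper)."""
    shape = list(shape)
    # Normalise negative dims the same way tensor libraries do.
    axis = dim if dim >= 0 else dim + len(shape)
    if not 0 <= axis < len(shape):
        raise IndexError(f"dim {dim} is out of range for shape {tuple(shape)}")
    shape[axis] *= n
    return tuple(shape)


# A single grayscale frame of shape (1, 64, 64), stacked along the channel-like dim.
single = cat_frames_shape((1, 64, 64), dim=-3, n=4)

# The same frames coming from a batch of 8 environments: dim=-3 still hits the right axis.
batched = cat_frames_shape((8, 1, 64, 64), dim=-3, n=4)

# A positive dim counts from the left, so it lands on the batch dimension instead.
positive = cat_frames_shape((8, 1, 64, 64), dim=0, n=4)

print(single, batched, positive)
```

Because negative indices are counted from the trailing (feature) dimensions, the same dim value selects the same axis regardless of how many batch dimensions precede it.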

Examples

>>> from torchrl.envs import CatFrames, Compose, TransformedEnv, UnsqueezeTransform
>>> from torchrl.envs.libs.gym import GymEnv
>>> env = TransformedEnv(GymEnv('Pendulum-v1'),
...     Compose(
...         UnsqueezeTransform(-1, in_keys=["observation"]),
...         CatFrames(N=4, dim=-1, in_keys=["observation"]),
...     )
... )
>>> print(env.rollout(3))

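The CatFrames transform can also be used offline to reproduce the effect of the online frame concatenation at a different scale (or to limit memory consumption). The following example gives the complete picture, together with the usage of a torchrl.data.ReplayBuffer.

Examples

>>> from torchrl.envs.utils import RandomPolicy
>>> from torchrl.envs import (
...     CatFrames, Compose, GrayScale, Resize, ToTensorImage, TransformedEnv, UnsqueezeTransform,
... )
>>> from torchrl.envs.libs.gym import GymEnv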
>>> from torchrl.collectors import SyncDataCollector
>>> # Create a transformed environment with CatFrames: notice the usage of UnsqueezeTransform to create an extra dimension
>>> env = TransformedEnv(
...     GymEnv("CartPole-v1", from_pixels=True),
...     Compose(
...         ToTensorImage(in_keys=["pixels"], out_keys=["pixels_trsf"]),
...         Resize(in_keys=["pixels_trsf"], w=64, h=64),
...         GrayScale(in_keys=["pixels_trsf"]),
...         UnsqueezeTransform(-4, in_keys=["pixels_trsf"]),
...         CatFrames(dim=-4, N=4, in_keys=["pixels_trsf"]),
...     )
... )
>>> # we design a collector
>>> collector = SyncDataCollector(
...     env,
...     RandomPolicy(env.action_spec),
...     frames_per_batch=10,
...     total_frames=1000,
... )
>>> for data in collector:
...     print(data)
...     break
>>> # now let's create a transform for the replay buffer. We don't need to unsqueeze the data here.
>>> # however, we need to point to both the pixel entry at the root and at the next levels:
>>> t = Compose(
...         ToTensorImage(in_keys=["pixels", ("next", "pixels")], out_keys=["pixels_trsf", ("next", "pixels_trsf")]),
...         Resize(in_keys=["pixels_trsf", ("next", "pixels_trsf")], w=64, h=64),
...         GrayScale(in_keys=["pixels_trsf", ("next", "pixels_trsf")]),
...         CatFrames(dim=-4, N=4, in_keys=["pixels_trsf", ("next", "pixels_trsf")]),
... )
>>> from torchrl.data import TensorDictReplayBuffer, LazyMemmapStorage
>>> rb = TensorDictReplayBuffer(storage=LazyMemmapStorage(1000), transform=t, batch_size=16)
>>> data_exclude = data.exclude("pixels_trsf", ("next", "pixels_trsf"))
>>> rb.add(data_exclude)
>>> s = rb.sample(1) # the buffer has only one element
>>> # let's check that our sample is the same as the batch collected during inference
>>> assert (data.exclude("collector")==s.squeeze(0).exclude("index", "collector")).all()

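Note

CatFrames currently only supports "done" signals at the root. Nested done signals, such as those found in MARL settings, are currently not supported. If this feature is needed, please raise an issue on the TorchRL repository.

Note

Storing stacks of frames in the replay buffer can significantly increase memory consumption (by a factor of N). To mitigate this, you can store trajectories directly in the replay buffer and apply CatFrames at sampling time. This approach involves sampling slices of the stored trajectories and then applying the frame-stacking transform. For convenience, CatFrames provides a make_rb_transform_and_sampler() method that creates:

A modified version of the transform suitable for use in replay buffers
A corresponding SliceSampler to use with the buffer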
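forward(tensordict: TensorDictBase) → TensorDictBase

Reads the input tensordict and applies the transform to the selected keys.

By default, this method:

calls _apply_transform() directly.
does not call _step() or _call().

This method is never called within env.step. However, it is called within sample().

Note

forward also works with regular keyword arguments, using dispatch to cast the argument names to the keys.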

Examples

>>> from tensordict import TensorDict, TensorDictBase
>>> from torchrl.envs import Transform
>>> class TransformThatMeasuresBytes(Transform):
...     '''Measures the number of bytes in the tensordict, and writes it under `"bytes"`.'''
...     def __init__(self):
...         super().__init__(in_keys=[], out_keys=["bytes"])
...
...     def forward(self, tensordict: TensorDictBase) -> TensorDictBase:
...         bytes_in_td = tensordict.bytes()
...         # write the measured size, not the builtin `bytes` type
...         tensordict["bytes"] = bytes_in_td
...         return tensordict
>>> t = TransformThatMeasuresBytes()
>>> env = env.append_transform(t) # works within envs
>>> t(TensorDict(a=0))  # Works offline too.

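make_rb_transform_and_sampler(batch_size: int, **sampler_kwargs) → tuple[Transform, torchrl.data.replay_buffers.SliceSampler]

Creates a transform and a sampler to be used with a replay buffer when storing frame-stacked data.

This method helps reduce redundancy in the stored data by avoiding the need to store entire stacks of frames in the buffer. Instead, it creates a transform that stacks frames on the fly during sampling, and a sampler that ensures the correct sequence length is maintained.

Parameters:

batch_size (int) – the batch size to use for the sampler.
**sampler_kwargs – additional keyword arguments to pass to the SliceSampler constructor.

Returns:

A tuple containing:
transform (Transform): a transform that stacks frames on the fly during sampling.
sampler (SliceSampler): a sampler that ensures the correct sequence length is maintained.

Return type:

tuple[Transform, SliceSampler]

Examples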

>>> env = TransformedEnv(...)
>>> catframes = CatFrames(N=4, ...)
>>> transform, sampler = catframes.make_rb_transform_and_sampler(batch_size=32)
>>> rb = ReplayBuffer(..., sampler=sampler, transform=transform)

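Note

When working with images, it is recommended to use distinct in_keys and out_keys in the preceding ToTensorImage transform. This keeps the tensors stored in the buffer separate from their processed counterparts, which should not be stored. For non-image data, consider inserting a RenameTransform before CatFrames to create a copy of the data that will be stored in the buffer.

Note

When adding the transform to the replay buffer, take care to also pass the transforms that precede CatFrames, such as ToTensorImage or UnsqueezeTransform, so that the CatFrames transform sees data formatted as it was during data collection.

Note

For a more complete example, refer to the examples folder of the torchrl GitHub repository: https://github.com/pytorch/rl/tree/main/examples/replay-buffers/catframes-in-buffer.py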

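transform_observation_spec(observation_spec: TensorSpec) → TensorSpec

Transforms the observation spec so that the resulting spec matches the transform mapping.

Parameters:

observation_spec (TensorSpec) – spec before the transform

Returns:

expected spec after the transform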
