BurnInTransform¶

class torchrl.envs.transforms.BurnInTransform(modules: Sequence[TensorDictModuleBase], burn_in: int, out_keys: Sequence[NestedKey] | None = None)[source]¶

用于部分烧入数据序列的转换。

此转换器对于在无法获得循环状态时获取最新的循环状态很有用。它沿着时间维度烧入从采样的顺序数据切片中获得的若干步，并返回具有烧入数据作为其初始时间步的剩余数据序列。此转换器旨在用作回放缓冲区转换器，而不是环境转换器。

参数:

modules (TensorDictModule 序列) – 用于烧入数据序列的模块列表。
burn_in (int) – 要烧入的时间步数。
out_keys (NestedKey 序列, 可选) – 目标键。默认为
` (所有指向下一个时间步的模块的 out_keys（例如“hidden”，如果) –
(“next” –
module)。 (“hidden”）是其中一个模块的 out_keys) –

注意

此转换器期望输入的 TensorDict 的最后一个维度是时间维度。它还假定所有提供的模块都可以处理顺序数据。

示例

>>> import torch
>>> from tensordict import TensorDict
>>> from torchrl.envs.transforms import BurnInTransform
>>> from torchrl.modules import GRUModule
>>> gru_module = GRUModule(
...     input_size=10,
...     hidden_size=10,
...     in_keys=["observation", "hidden"],
...     out_keys=["intermediate", ("next", "hidden")],
...     default_recurrent_mode=True,
... )
>>> burn_in_transform = BurnInTransform(
...     modules=[gru_module],
...     burn_in=5,
... )
>>> td = TensorDict({
...     "observation": torch.randn(2, 10, 10),
...      "hidden": torch.randn(2, 10, gru_module.gru.num_layers, 10),
...      "is_init": torch.zeros(2, 10, 1),
... }, batch_size=[2, 10])
>>> td = burn_in_transform(td)
>>> td.shape
torch.Size([2, 5])
>>> td.get("hidden").abs().sum()
tensor(86.3008)

>>> from torchrl.data import LazyMemmapStorage, TensorDictReplayBuffer
>>> buffer = TensorDictReplayBuffer(
...     storage=LazyMemmapStorage(2),
...     batch_size=1,
... )
>>> buffer.append_transform(burn_in_transform)
>>> td = TensorDict({
...     "observation": torch.randn(2, 10, 10),
...      "hidden": torch.randn(2, 10, gru_module.gru.num_layers, 10),
...      "is_init": torch.zeros(2, 10, 1),
... }, batch_size=[2, 10])
>>> buffer.extend(td)
>>> td = buffer.sample(1)
>>> td.shape
torch.Size([1, 5])
>>> td.get("hidden").abs().sum()
tensor(37.0344)

forward(tensordict: TensorDictBase) → TensorDictBase[source]¶

读取输入 tensordict，并对选定的键应用转换。

默认情况下，此方法

直接调用 _apply_transform()。
不调用 _step() 或 _call()。

此方法不会在任何时候在 env.step 中调用。但是，它会在 sample() 中调用。

注意

forward 也可以使用 dispatch 将参数名称转换为键，并使用常规关键字参数。

示例

>>> class TransformThatMeasuresBytes(Transform):
...     '''Measures the number of bytes in the tensordict, and writes it under `"bytes"`.'''
...     def __init__(self):
...         super().__init__(in_keys=[], out_keys=["bytes"])
...
...     def forward(self, tensordict: TensorDictBase) -> TensorDictBase:
...         bytes_in_td = tensordict.bytes()
...         tensordict["bytes"] = bytes
...         return tensordict
>>> t = TransformThatMeasuresBytes()
>>> env = env.append_transform(t) # works within envs
>>> t(TensorDict(a=0))  # Works offline too.

BurnInTransform¶

文档

教程

资源