torchrl.trainers.algorithms.configs.objectives.PPOLossConfig

class torchrl.trainers.algorithms.configs.objectives.PPOLossConfig(_partial_: bool = False, actor_network: Any = None, critic_network: Any = None, loss_type: str = 'clip', entropy_bonus: bool = True, samples_mc_entropy: int = 1, entropy_coeff: float | None = None, log_explained_variance: bool = True, critic_coeff: float = 0.25, loss_critic_type: str = 'smooth_l1', normalize_advantage: bool = True, normalize_advantage_exclude_dims: tuple = (), gamma: float | None = None, separate_losses: bool = False, advantage_key: str | None = None, value_target_key: str | None = None, value_key: str | None = None, functional: bool = True, actor: Any = None, critic: Any = None, reduction: str | None = None, clip_value: float | None = None, device: Any = None, _target_: str = 'torchrl.trainers.algorithms.configs.objectives._make_ppo_loss')

A class to configure a PPO loss.

Parameters: loss_type – The type of loss to use. Defaults to 'clip'.
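The `_target_` and `_partial_` fields in the signature suggest a Hydra-style convention, where a config object names the callable that builds the real object. The following is a minimal sketch of that pattern using only the standard library. `PPOLossConfigSketch` and `instantiate_sketch` are hypothetical stand-ins written for illustration, not TorchRL or Hydra APIs, and the `builtins.dict` target is an assumption chosen so the example runs without TorchRL installed.

```python
from dataclasses import asdict, dataclass
from functools import partial
from importlib import import_module


@dataclass
class PPOLossConfigSketch:
    """Local stand-in mirroring a few fields of the signature above.

    This is NOT the TorchRL class; it only copies some field names and defaults.
    """

    loss_type: str = "clip"
    critic_coeff: float = 0.25
    normalize_advantage: bool = True
    _partial_: bool = False
    # Hypothetical target for this sketch. The real default shown in the
    # signature is 'torchrl.trainers.algorithms.configs.objectives._make_ppo_loss'.
    _target_: str = "builtins.dict"


def instantiate_sketch(cfg):
    """Resolve `_target_` to a callable and call it with the remaining fields.

    If `_partial_` is true, return a functools.partial instead of calling.
    """
    fields = asdict(cfg)
    module_name, _, attr = fields.pop("_target_").rpartition(".")
    target = getattr(import_module(module_name), attr)
    is_partial = fields.pop("_partial_")
    return partial(target, **fields) if is_partial else target(**fields)


# Override one default, keep the rest.
cfg = PPOLossConfigSketch(loss_type="clip", critic_coeff=0.5)
built = instantiate_sketch(cfg)

# With `_partial_=True` the construction is deferred until the result is called.
deferred = instantiate_sketch(PPOLossConfigSketch(_partial_=True))
```

In this sketch the target simply collects the keyword arguments into a dict, which makes it easy to see which fields would be forwarded to the loss constructor. With the real class, the target would presumably receive the same kind of keyword arguments and build the PPO loss module.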
