VideoDecoder

class torchcodec.decoders.VideoDecoder(source: Union[str, Path, RawIOBase, BufferedReader, bytes, Tensor], *, stream_index: Optional[int] = None, dimension_order: Literal['NCHW', 'NHWC'] = 'NCHW', num_ffmpeg_threads: int = 1, device: Optional[Union[str, device]] = 'cpu', seek_mode: Literal['exact', 'approximate'] = 'exact')

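A single-stream video decoder.

Parameters:

source (str, pathlib.Path, bytes, torch.Tensor, or file-like object) –
The source of the video:
- If str: a local path or a URL to a video file.
- If pathlib.Path: a path to a local video file.
- If bytes object or torch.Tensor: the raw encoded video data.
- If file-like object: video data is read from the object on demand. The object must expose the methods read(self, size: int) -> bytes and seek(self, offset: int, whence: int) -> int. For more details, see: Streaming data through file-like support.
stream_index (int, optional) – Specifies which video stream to decode frames from. Note that this index is absolute across all media types. If left unspecified, the best stream is used.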

dimension_order (str, optional) –

The dimension order of the decoded frames. This can be either "NCHW" (default) or "NHWC", where N is the batch size, C is the number of channels, H is the height, and W is the width of the frames.

Note

Frames are natively decoded in NHWC format by the underlying FFmpeg implementation. Converting those into NCHW format is a cheap no-copy operation that allows these frames to be transformed using the torchvision transforms (https://pytorch.ac.cn/vision/stable/transforms.html).

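num_ffmpeg_threads (int, optional) – The number of threads to use for decoding. Use 1 for single-threaded decoding, which may be best when running multiple VideoDecoder instances in parallel. Use a higher number for multi-threaded decoding, which may be best when running a single VideoDecoder instance. Passing 0 lets FFmpeg decide on the number of threads. Default: 1.
device (str or torch.device, optional) – The device to use for decoding. Default: "cpu".
seek_mode (str, optional) – Determines whether frame access is "exact" or "approximate". Exact mode guarantees that requesting frame i always returns frame i, but it requires an initial scan of the file. Approximate mode is faster because it avoids scanning the file, but it is less accurate because it uses the file's metadata to calculate where frame i probably is. Default: "exact". For more on this parameter, see: Exact vs Approximate seek mode: Performance and accuracy comparison.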
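
The file-like requirement on source is small enough to satisfy with a hand-written class. The following is a minimal sketch, assuming the encoded video is already held in memory; the names InMemoryVideoSource, num_reads, and open_decoder are hypothetical and exist only to illustrate the two required methods.

```python
import io


class InMemoryVideoSource:
    """Hypothetical file-like wrapper exposing only read() and seek()."""

    def __init__(self, data: bytes):
        self._buffer = io.BytesIO(data)
        self.num_reads = 0  # illustrative counter, not required by the decoder

    def read(self, size: int) -> bytes:
        # Return up to `size` bytes starting at the current position.
        self.num_reads += 1
        return self._buffer.read(size)

    def seek(self, offset: int, whence: int) -> int:
        # Move the cursor and return the new absolute position.
        return self._buffer.seek(offset, whence)


def open_decoder(data: bytes):
    # Imported lazily so the class above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    # The wrapper is passed as `source` in place of a path or raw bytes.
    return VideoDecoder(InMemoryVideoSource(data))
```

A wrapper like this is mostly useful when the bytes come from somewhere that is expensive to read in full, such as a network resource, because the decoder only asks for the ranges it needs.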

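Variables:

metadata (VideoStreamMetadata) – The metadata of the video stream.
stream_index (int) – The stream index that this decoder retrieves frames from. If a stream index was provided at initialization, this is the same value. If it was left unspecified, this is the best stream.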
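
To see why the approximate seek_mode can be less accurate, it may help to model how a frame position could be guessed from metadata alone. The sketch below is not torchcodec's implementation; it assumes the guess is simply index / average_fps, and all function names and timestamp lists are made up for illustration.

```python
import bisect


def estimate_pts_seconds(index: int, average_fps: float) -> float:
    """Guess where frame `index` starts, assuming a constant frame rate."""
    return index / average_fps


def frame_index_played_at(pts_seconds: list[float], seconds: float) -> int:
    """Exact lookup: index of the last frame whose pts is <= seconds."""
    return bisect.bisect_right(pts_seconds, seconds) - 1


def mismatched_indices(pts_seconds: list[float], average_fps: float) -> list[int]:
    """Indices for which the metadata-based guess lands on a different frame."""
    return [
        i
        for i in range(len(pts_seconds))
        if frame_index_played_at(pts_seconds, estimate_pts_seconds(i, average_fps)) != i
    ]


# Hypothetical constant-rate stream: 6 frames at exactly 25 fps.
constant_rate_pts = [i / 25.0 for i in range(6)]

# Hypothetical variable-rate stream: 6 frames over 0.32 s (18.75 fps on average),
# with a gap between the third and fourth frames.
variable_rate_pts = [0.0, 0.04, 0.08, 0.20, 0.24, 0.28]

print(mismatched_indices(constant_rate_pts, 25.0))   # no mismatches
print(mismatched_indices(variable_rate_pts, 18.75))  # frames after the gap are missed
```

Under this model the guess is exact for constant-frame-rate streams and drifts once frame spacing is irregular, which matches the trade-off described for the two modes.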

Examples using VideoDecoder:

- Exact vs Approximate seek mode: Performance and accuracy comparison
- Accelerated video decoding on GPUs with CUDA and NVDEC
- Decoding a video with VideoDecoder
- Streaming data through file-like support
- Parallel video decoding: multi-processing and multi-threading
- How to sample video clips

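__getitem__(key: Union[Integral, slice]) → Tensor

Return the frame or frames at the given index or range, as a tensor.

Note

If you need to decode multiple frames, we recommend using the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: key (int or slice) – The index or range of the frame(s) to retrieve.
Returns: The frame or frames at the given index or range.
Return type: torch.Tensor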
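
Indexing follows ordinary Python sequence syntax. The sketch below shows integer and slice access; every_nth_frame and demo are hypothetical helper names, and the shapes in the comments assume the default "NCHW" dimension order.

```python
def every_nth_frame(decoder, n: int):
    """Return every n-th frame using slice syntax."""
    return decoder[::n]


def demo(path: str):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    first = decoder[0]                         # one frame: (C, H, W)
    subsampled = every_nth_frame(decoder, 10)  # several frames: (N, C, H, W)
    return first.shape, subsampled.shape
```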

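get_frame_at(index: int) → Frame

Return a single frame at the given index.

Note

If you need to decode multiple frames, we recommend using the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: index (int) – The index of the frame to retrieve.
Returns: The frame at the given index.
Return type: Frame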
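
Unlike indexing, this method returns a Frame rather than a bare tensor. The sketch below assumes that a Frame carries data, pts_seconds, and duration_seconds attributes; those names are not stated in this section, and describe_frame and demo are hypothetical helpers.

```python
def describe_frame(frame) -> str:
    """Summarize when a frame is displayed (attribute names are assumed)."""
    end_seconds = frame.pts_seconds + frame.duration_seconds
    return f"shown from {frame.pts_seconds:.3f}s to {end_seconds:.3f}s"


def demo(path: str) -> str:
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    frame = decoder.get_frame_at(index=0)
    # frame.data is assumed to hold the pixel tensor for this frame.
    return describe_frame(frame)
```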

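get_frame_played_at(seconds: float) → Frame

Return a single frame played at the given timestamp, in seconds.

Note

If you need to decode multiple frames, we recommend using the batch methods instead, since they are faster: get_frames_at(), get_frames_in_range(), get_frames_played_at(), and get_frames_played_in_range().

Parameters: seconds (float) – The time stamp, in seconds, at which the frame is played.
Returns: The frame that is played at seconds.
Return type: Frame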
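
A common use is grabbing a frame at a relative position, for example the midpoint of a video for a thumbnail. The sketch below assumes the stream starts at 0 seconds and that the metadata exposes a duration_seconds attribute; both are assumptions, and timestamp_at_fraction and middle_frame are hypothetical helpers.

```python
def timestamp_at_fraction(duration_seconds: float, fraction: float) -> float:
    """Convert a relative position in [0, 1) to a timestamp in seconds."""
    # The fraction is kept strictly below 1 as a precaution, so the
    # timestamp never reaches the end of the stream.
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return duration_seconds * fraction


def middle_frame(path: str):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    # `duration_seconds` is an assumed metadata attribute name.
    seconds = timestamp_at_fraction(decoder.metadata.duration_seconds, 0.5)
    return decoder.get_frame_played_at(seconds=seconds)
```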

get_frames_at(indices: list[int]) → FrameBatch

Return frames at the given indices.

Parameters: indices (list of int) – The indices of the frames to retrieve.
Returns: The frames at the given indices.
Return type: FrameBatch
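
Because the indices can be arbitrary, this method suits uniform sampling across a whole video. The following sketch computes evenly spaced indices and passes them in one call; evenly_spaced_indices and sample_frames are hypothetical helpers, and the use of len(decoder) to obtain the frame count is an assumption not stated in this section.

```python
def evenly_spaced_indices(num_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` indices spread uniformly over [0, num_frames)."""
    step = num_frames / num_samples
    return [int(i * step) for i in range(num_samples)]


def sample_frames(path: str, num_samples: int = 8):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    indices = evenly_spaced_indices(len(decoder), num_samples)
    # One batched call instead of `num_samples` separate get_frame_at() calls.
    return decoder.get_frames_at(indices=indices)
```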

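get_frames_in_range(start: int, stop: int, step: int = 1) → FrameBatch

Return multiple frames in the given index range.

Frames are in the half-open range [start, stop).

Parameters:

start (int) – Index of the first frame to retrieve.
stop (int) – End of the index range (exclusive, following Python conventions).
step (int, optional) – Step size between frames. Default: 1.

Returns:

The frames within the specified range.

Return type:

FrameBatch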
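
Since start, stop, and step follow Python conventions, the built-in range() can be used to predict which frames a call returns. This is a small sketch; indices_in_range and demo are hypothetical helpers, and the data attribute on the returned batch is an assumed name.

```python
def indices_in_range(start: int, stop: int, step: int = 1) -> list[int]:
    """Frame indices selected by the half-open range [start, stop)."""
    return list(range(start, stop, step))


def demo(path: str):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    batch = decoder.get_frames_in_range(start=0, stop=10, step=3)
    expected = indices_in_range(0, 10, 3)  # frames 0, 3, 6 and 9
    # `data` is assumed to be the stacked frame tensor of the batch.
    return batch.data.shape[0] == len(expected)
```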

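get_frames_played_at(seconds: list[float]) → FrameBatch

Return frames played at the given timestamps, in seconds.

Parameters: seconds (list of float) – The timestamps, in seconds, at which the frames are played.
Returns: The frames that are played at seconds.
Return type: FrameBatch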
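
One way to use this method is to sample a frame at a fixed time interval, independent of the frame rate. The sketch below builds such a timestamp list; timestamps_every and one_frame_per_second are hypothetical helpers, and duration_seconds is an assumed metadata attribute for a stream assumed to start at 0 seconds.

```python
def timestamps_every(interval_seconds: float, duration_seconds: float) -> list[float]:
    """Timestamps 0, interval, 2 * interval, ... strictly below the duration."""
    timestamps = []
    i = 0
    while i * interval_seconds < duration_seconds:
        timestamps.append(i * interval_seconds)
        i += 1
    return timestamps


def one_frame_per_second(path: str):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    # `duration_seconds` is an assumed metadata attribute name.
    seconds = timestamps_every(1.0, decoder.metadata.duration_seconds)
    return decoder.get_frames_played_at(seconds=seconds)
```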

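get_frames_played_in_range(start_seconds: float, stop_seconds: float) → FrameBatch

Return multiple frames in the given time range.

Frames are in the half-open range [start_seconds, stop_seconds). The pts (in seconds) of each returned frame lies within that half-open range.

Parameters:

start_seconds (float) – Time, in seconds, of the start of the range.
stop_seconds (float) – Time, in seconds, of the end of the range. As a half-open range, the end is excluded.

Returns:

The frames within the specified range.

Return type:

FrameBatch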
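
The half-open rule can be checked on plain timestamps. The sketch below takes the statement about pts at face value and filters a made-up list of timestamps; pts_in_half_open_range and clip are hypothetical helpers, and it is not a description of how the decoder selects frames internally.

```python
def pts_in_half_open_range(
    pts_seconds: list[float], start_seconds: float, stop_seconds: float
) -> list[float]:
    """Keep timestamps t with start_seconds <= t < stop_seconds."""
    return [t for t in pts_seconds if start_seconds <= t < stop_seconds]


def clip(path: str, start_seconds: float, stop_seconds: float):
    # Imported lazily so the helper above can be tried without torchcodec.
    from torchcodec.decoders import VideoDecoder

    decoder = VideoDecoder(path)
    return decoder.get_frames_played_in_range(
        start_seconds=start_seconds, stop_seconds=stop_seconds
    )
```

With timestamps 0.0, 0.5, 1.0, 1.5, and 2.0, the range [0.5, 1.5) keeps 0.5 and 1.0 and excludes 1.5, so consecutive clips that share a boundary do not repeat a frame.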
