WAV2VEC2_XLSR53¶

torchaudio.pipelines.WAV2VEC2_XLSR53¶

Wav2vec 2.0 模型（“base”架构），在来自多个数据集的 56,000 小时未标记音频上进行了预训练（Multilingual LibriSpeech [Pratap et al., 2020]，CommonVoice [Ardila et al., 2020] 和 BABEL [Gales et al., 2014]），未经微调。

最初由 *Unsupervised Cross-lingual Representation Learning for Speech Recognition* [Conneau et al., 2020] 的作者根据 MIT 许可发布，并以相同的许可重新分发。[License, Source]