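Building from Source¶

ExecuTorch uses CMake as its primary build system. Even if you don't use CMake directly, CMake can emit scripts for other formats such as Make, Ninja, or Xcode. For details, see cmake-generators(7).

System Requirements¶

Operating System¶

We have tested these instructions on the following systems, although they should also work in similar environments.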

Linux (x86_64)

CentOS 8+
Ubuntu 20.04.6 LTS+
RHEL 8+

macOS (x86_64/ARM64)

Big Sur (11.0)+

Windows (x86_64)

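Windows Subsystem for Linux (WSL) with any of the Linux options above

Software¶

conda or another virtual environment manager
- We recommend conda because it provides cross-language support and integrates smoothly with pip (Python's built-in package manager).
- Otherwise, Python's built-in virtual environment manager, python venv, is a good alternative.
g++ version 7 or higher, clang++ version 5 or higher, or any other C++17-compatible toolchain.
python version 3.10-3.12
ccache (optional) - a compiler cache that speeds up recompilation

Note that the cross-compilable core runtime code supports a wider range of toolchains, down to C++17. For portability details, see the Runtime Overview.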
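
The requirements above can be checked before starting a build. The following is a minimal sketch, assuming the tools are expected on PATH under the names g++, clang++, and ccache; the helper names are hypothetical and not part of ExecuTorch. It only checks that the tools are present, not their versions.

```python
import shutil
import sys

# Inclusive (major, minor) range taken from the software list above.
SUPPORTED_PYTHON = ((3, 10), (3, 12))


def python_version_supported(version=None):
    """Return True if (major, minor) falls in the supported 3.10-3.12 range."""
    major_minor = tuple((version or sys.version_info)[:2])
    low, high = SUPPORTED_PYTHON
    return low <= major_minor <= high


def find_tools(names=("g++", "clang++", "ccache")):
    """Map each tool name to its location on PATH, or None if it is missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    print("Python version supported:", python_version_supported())
    for tool, path in find_tools().items():
        print(f"{tool}: {path or 'not found'}")
```

Only one of g++ or clang++ needs to be present, and ccache is optional, so a "not found" line is not necessarily a problem.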

Environment Setup¶

Clone ExecuTorch¶

# Clone the ExecuTorch repo from GitHub
git clone -b release/0.7 https://github.com/pytorch/executorch.git && cd executorch

Create a Virtual Environment¶

Create and activate a Python virtual environment:

python3 -m venv .venv && source .venv/bin/activate && pip install --upgrade pip

Alternatively, install conda on your machine, then create a conda environment named "executorch":

conda create -yn executorch python=3.10.0 && conda activate executorch

Install the ExecuTorch pip Package from Source¶

# Install ExecuTorch pip package and its dependencies, as well as
# development tools like CMake, and backend support for XNNPACK and CoreML.
# If developing on a Mac, make sure to install the Xcode Command Line Tools first.
# Intel-based macOS systems require building PyTorch from source (see below)
./install_executorch.sh

For information on building PyTorch from source, see the PyTorch instructions.

Use the --use-pt-pinned-commit flag to install ExecuTorch with an existing PyTorch build:

./install_executorch.sh --use-pt-pinned-commit

For Intel-based macOS systems, use the --use-pt-pinned-commit --minimal flags:

./install_executorch.sh --use-pt-pinned-commit --minimal

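Note that only the XNNPACK and CoreML backends are supported by default. You can enable additional backends or disable default ones by setting the corresponding CMake flags: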

# Enable the MPS backend
CMAKE_ARGS="-DEXECUTORCH_BUILD_MPS=ON" ./install_executorch.sh

# Disable the XNNPACK backend
CMAKE_ARGS="-DEXECUTORCH_BUILD_XNNPACK=OFF" ./install_executorch.sh

For development mode, run the command with --editable, which lets you modify the Python source code and see the changes take effect immediately.

./install_executorch.sh --editable

# Or you can directly do the following if dependencies are already installed
# either via a previous invocation of `./install_executorch.sh` or by explicitly installing requirements via `./install_requirements.sh` first.
pip install -e . --no-build-isolation

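If you modify C++ files, you still need to reinstall ExecuTorch from source.

Warning: Some modules can't be imported directly in editable mode. This is a known issue that we are actively working to fix. To work around it, import from the submodule path instead: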
# This will fail
python -c "from executorch.exir import CaptureConfig"
# But this will succeed
python -c "from executorch.exir.capture import CaptureConfig"

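Note: Cleaning the build system

When fetching a new version from the upstream repo (via git fetch or git pull), it is a good idea to clean the old build artifacts. The build system does not currently adapt well to changes in build dependencies.

You should also update and pull the submodules again, in case their versions have changed.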
# From the root of the executorch repo:
./install_executorch.sh --clean
git submodule sync
git submodule update --init --recursive
The --clean command removes build artifacts and pip outputs, and clears the ccache if it is installed, ensuring a completely clean build environment.

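Build the ExecuTorch C++ Runtime from Source¶

ExecuTorch's CMake build system covers the runtime components that are likely to be useful to embedded systems users.

libexecutorch.a: The core of the ExecuTorch runtime. Does not contain any operator/kernel definitions or backend definitions.
libportable_kernels.a: The implementations of ATen-compatible operators, following the signatures in //kernels/portable/functions.yaml.
libportable_kernels_bindings.a: Generated code that registers the contents of libportable_kernels.a with the runtime.
- NOTE: This must be linked into your application with a flag like -Wl,-force_load or -Wl,--whole-archive. It contains functions that automatically register the kernels at load time, but the linker often prunes those functions by default because nothing calls them directly.
executor_runner: An example tool that runs a .pte program file using all-ones values as inputs and prints the outputs to stdout. It is linked with libportable_kernels.a, so the program may use any of the operators it implements.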
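
The two linker flags mentioned in the note are spelled differently on the command line. Below is a minimal sketch of a helper that builds the arguments for either style; the function name is hypothetical, and the closing -Wl,--no-whole-archive is an addition (standard for GNU-style linkers) so that later libraries are linked normally again.

```python
import platform


def whole_archive_flags(library, system=None):
    """Return linker arguments that keep every object file in `library`.

    Apple's linker takes -force_load followed by the archive path, while
    GNU-style linkers wrap the archive in --whole-archive/--no-whole-archive.
    """
    system = system or platform.system()
    if system == "Darwin":
        return [f"-Wl,-force_load,{library}"]
    return ["-Wl,--whole-archive", library, "-Wl,--no-whole-archive"]


if __name__ == "__main__":
    print(" ".join(whole_archive_flags("libportable_kernels_bindings.a")))
```

The returned list can be appended to the link line of whatever build system drives your application.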

Configure the CMake Build¶

Follow these steps after cloning or pulling the upstream repo, since the build dependencies may have changed.

# cd to the root of the executorch repo
cd executorch

# Clean and configure the CMake build system. It's good practice to do this
# whenever cloning or pulling the upstream repo.
./install_executorch.sh --clean
(mkdir cmake-out && cd cmake-out && cmake ..)

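Once this is done, you don't need to do it again until you pull from the upstream repo again or modify any CMake-related files.

CMake Build Options¶

The release build applies optimizations intended to improve performance and reduce binary size. It disables program verification and executorch logging, and adds optimization flags.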

-DCMAKE_BUILD_TYPE=Release

To further reduce the size of the release build, use both of the following:

-DCMAKE_BUILD_TYPE=Release \
-DEXECUTORCH_OPTIMIZE_SIZE=ON
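
The options above are passed to cmake at configure time. As a minimal sketch, the hypothetical helper below assembles a configure command with them. It assumes the -S/-B form of cmake (available since CMake 3.13) as an alternative to creating cmake-out by hand and running cmake .. inside it.

```python
import shlex


def configure_command(build_dir="cmake-out", release=True, optimize_size=False):
    """Assemble a cmake configure command using the options described above."""
    args = ["cmake", "-S", ".", "-B", build_dir]
    if release:
        args.append("-DCMAKE_BUILD_TYPE=Release")
    if optimize_size:
        args.append("-DEXECUTORCH_OPTIMIZE_SIZE=ON")
    return args


if __name__ == "__main__":
    # Print the command instead of running it so it can be reviewed first.
    print(shlex.join(configure_command(optimize_size=True)))
```

Run the printed command from the root of the executorch repo, or pass the list to subprocess.run once it looks right.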

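Compiler Cache (ccache)¶

If ccache is installed on your system, ExecuTorch detects and enables it automatically. This significantly speeds up recompilation by caching previously compiled objects.

If ccache is detected, you will see: ccache found and enabled for faster builds
If ccache is not found, you will see: ccache not found, builds will not be cached

Install ccache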

# Ubuntu/Debian
sudo apt install ccache

# macOS
brew install ccache

# CentOS/RHEL
sudo yum install ccache
# or
sudo dnf install ccache

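No additional configuration is needed; the build system uses ccache automatically when it is available.

See CMakeLists.txt

Build the Runtime Components¶

Build all targets with: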

# cd to the root of the executorch repo
cd executorch

# Build using the configuration that you previously generated under the
# `cmake-out` directory.
#
# NOTE: The `-j` argument specifies how many jobs/processes to use when
# building, and tends to speed up the build significantly. It's typical to use
# "core count + 1" as the `-j` value.
cmake --build cmake-out -j9

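Tip: For faster rebuilds, consider installing ccache (see the Compiler Cache section above). On the first build, ccache populates its cache. Subsequent builds with the same compiler flags can be significantly faster.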
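
The "core count + 1" rule of thumb for -j can be computed instead of hard-coded. A minimal sketch, with a hypothetical helper name, that derives the value from the number of CPUs Python reports:

```python
import os
import shlex


def build_command(build_dir="cmake-out", cores=None):
    """Return a cmake build command that uses core count + 1 parallel jobs."""
    cores = cores or os.cpu_count() or 1  # os.cpu_count() may return None
    return ["cmake", "--build", build_dir, f"-j{cores + 1}"]


if __name__ == "__main__":
    print(shlex.join(build_command()))
```

On an 8-core machine this yields the same -j9 used in the command above.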

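Use the example binary `executor_runner` to execute a .pte file¶

First, generate a .pte file, either by exporting an example model or by following the instructions in Model Export and Lowering.

To generate a simple model file, run the following command from the ExecuTorch directory. It creates a file named "add.pte" in the current directory.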

python -m examples.portable.scripts.export --model_name="add"

Then, pass it to the command-line tool:

./cmake-out/executor_runner --model_path add.pte

You should see the message "Model executed successfully" followed by the output values.

I 00:00:00.000526 executorch:executor_runner.cpp:82] Model file add.pte is loaded.
I 00:00:00.000595 executorch:executor_runner.cpp:91] Using method forward
I 00:00:00.000612 executorch:executor_runner.cpp:138] Setting up planned buffer 0, size 48.
I 00:00:00.000669 executorch:executor_runner.cpp:161] Method loaded.
I 00:00:00.000685 executorch:executor_runner.cpp:171] Inputs prepared.
I 00:00:00.000764 executorch:executor_runner.cpp:180] Model executed successfully.
I 00:00:00.000770 executorch:executor_runner.cpp:184] 1 outputs:
Output 0: tensor(sizes=[1], [2.])
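
When the runner is invoked from a script, the log above can be checked automatically. The following is a minimal sketch that assumes the log keeps the format shown here (it may differ between versions); the function name is hypothetical. It captures only the first line of each printed output, so tensors that span several lines are truncated.

```python
import re

SUCCESS_MARKER = "Model executed successfully."
OUTPUT_LINE = re.compile(r"^Output (\d+): (.*)$", re.MULTILINE)


def summarize_run(log_text):
    """Return (succeeded, {output_index: printed_value}) for a runner log."""
    succeeded = SUCCESS_MARKER in log_text
    outputs = {int(i): value.strip() for i, value in OUTPUT_LINE.findall(log_text)}
    return succeeded, outputs


# The last three lines of the log shown above.
sample_log = """\
I 00:00:00.000764 executorch:executor_runner.cpp:180] Model executed successfully.
I 00:00:00.000770 executorch:executor_runner.cpp:184] 1 outputs:
Output 0: tensor(sizes=[1], [2.])
"""

if __name__ == "__main__":
    print(summarize_run(sample_log))
```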

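Building ExecuTorch for Windows¶

This section outlines the current known working instructions for building and validating ExecuTorch on a Windows machine.

This demo uses the MobileNet v2 model to classify images with the XNNPACK backend.

Note that all commands should be executed in Windows PowerShell running in administrator mode.

Prerequisites¶

1. Install Miniconda for Windows¶

Download Miniconda for Windows from the official website.

2. Install Git for Windows¶

Download Git for Windows from the official website.

3. Install ClangCL for Windows¶

Download ClangCL for Windows from the official website.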

Create the Conda Environment¶

To check whether conda is available in your PowerShell prompt, run conda list or conda --version.

If PowerShell does not detect conda, run the PowerShell script named conda-hook.ps1 as follows:

& "$miniconda_dir\\shell\\condabin\\conda-hook.ps1"

where $miniconda_dir is the directory where you installed Miniconda. By default, it is "C:\Users\<username>\AppData\Local".

Create and activate the conda environment:¶

conda create -yn et python=3.12
conda activate et

Check Symlinks¶

Set the following Git configuration option to enable symlinks:

git config --global core.symlinks true

Set Up ExecuTorch¶

Clone ExecuTorch from the official GitHub repo.

git clone --recurse-submodules https://github.com/pytorch/executorch.git

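Run the Setup Script¶

Currently, many components do not build on Windows. The following instructions install a very minimal ExecuTorch that can be used as a sanity check.

Move into the `executorch` directory¶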

cd executorch

(Optional) Run the setup script with --clean before running the .bat file.¶

./install_executorch.bat --clean

Run the setup script.¶

You can run either the .bat file or the Python script.

./install_executorch.bat
# OR
# python install_executorch.py

Export MobileNet V2¶

Create the following script, named export_mv2.py:

from torchvision.models import mobilenet_v2
from torchvision.models.mobilenetv2 import MobileNet_V2_Weights

mv2 = mobilenet_v2(weights=MobileNet_V2_Weights.DEFAULT) # This is torch.nn.Module

import torch
from executorch.exir import to_edge
from executorch.backends.xnnpack.partition.xnnpack_partitioner import XnnpackPartitioner

model = mv2.eval() # turn into evaluation mode

example_inputs = (torch.randn((1, 3, 224, 224)),) # Necessary for exporting the model

exported_graph = torch.export.export(model, example_inputs) # Core Aten graph

edge = to_edge(exported_graph) # Edge Dialect

edge_delegated = edge.to_backend(XnnpackPartitioner()) # Parts of the graph are delegated to XNNPACK

executorch_program = edge_delegated.to_executorch() # ExecuTorch program

pte_path = "mv2_xnnpack.pte"

with open(pte_path, "wb") as file:
    executorch_program.write_to_file(file) # Serializing into .pte file

Run the export script to create the `mv2_xnnpack.pte` file.¶

python .\\export_mv2.py

Build and Install the C++ Libraries and Binaries¶

del -Recurse -Force cmake-out; `
cmake . `
  -DCMAKE_INSTALL_PREFIX=cmake-out `
  -DPYTHON_EXECUTABLE=$miniconda_dir\\envs\\et\\python.exe `
  -DCMAKE_PREFIX_PATH=$miniconda_dir\\envs\\et\\Lib\\site-packages `
  -DCMAKE_BUILD_TYPE=Release `
  -DEXECUTORCH_BUILD_EXTENSION_TENSOR=ON `
  -DEXECUTORCH_BUILD_FLATC=ON `
  -DEXECUTORCH_BUILD_PYBIND=OFF `
  -DEXECUTORCH_BUILD_XNNPACK=ON `
  -DEXECUTORCH_BUILD_KERNELS_CUSTOM=ON `
  -DEXECUTORCH_BUILD_KERNELS_OPTIMIZED=ON `
  -DEXECUTORCH_BUILD_KERNELS_QUANTIZED=ON `
  -DEXECUTORCH_ENABLE_LOGGING=ON `
  -T ClangCL `
  -Bcmake-out; `
cmake --build cmake-out -j64 --target install --config Release

where $miniconda_dir is the directory where you installed Miniconda. By default, it is "C:\Users\<username>\AppData\Local".

Run the MobileNet V2 Model with the XNNPACK Delegate¶

.\\cmake-out\\backends\\xnnpack\\Release\\xnn_executor_runner.exe --model_path=.\\mv2_xnnpack.pte

The expected output prints a tensor of size 1x1000 containing the class scores.

Output 0: tensor(sizes=[1, 1000], [
  -0.50986, 0.30064, 0.0953904, 0.147726, 0.231205, 0.338555, 0.206892, -0.0575775, … ])
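
To interpret the scores, the highest-scoring indices are usually what matters. A minimal sketch, using only the eight values visible in the truncated output above (a real run yields 1000 scores); the function name is hypothetical. Because the runner feeds the model generated input rather than a real image, the resulting ranking is useful only as a sanity check.

```python
def top_k(scores, k=5):
    """Return the k (index, score) pairs with the highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Only the first eight values printed above, not the full 1x1000 tensor.
partial_scores = [-0.50986, 0.30064, 0.0953904, 0.147726,
                  0.231205, 0.338555, 0.206892, -0.0575775]

if __name__ == "__main__":
    for index, score in top_k(partial_scores, k=3):
        print(index, score)
```

With the full output, each index would correspond to one of the 1000 ImageNet classes the model was trained on.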

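Congratulations! You have successfully set up ExecuTorch on your Windows device and run a MobileNet V2 model. You can now explore what ExecuTorch can do on your own Windows device.

Cross Compilation¶

The following are instructions for cross-compiling for Android and iOS.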

Android¶

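Build the executor_runner shell binary¶

Prerequisite: the Android NDK. Choose one of the following:
- Option 1: Download Android Studio by following the instructions to install the NDK.
- Option 2: Download the Android NDK directly from here.

Assuming the Android NDK is available, run: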

# Run the following lines from the `executorch/` folder
./install_executorch.sh --clean
mkdir cmake-android-out && cd cmake-android-out

# point -DCMAKE_TOOLCHAIN_FILE to the location where ndk is installed
cmake -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake  -DANDROID_ABI=arm64-v8a ..

cd  ..
cmake --build  cmake-android-out  -j9

adb shell mkdir -p /data/local/tmp/executorch
# push the binary to an Android device
adb push  cmake-android-out/executor_runner  /data/local/tmp/executorch
# push the model file
adb push  add.pte  /data/local/tmp/executorch

adb shell  "/data/local/tmp/executorch/executor_runner --model_path /data/local/tmp/executorch/add.pte"
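
The configure step above depends on ANDROID_NDK pointing at a valid NDK installation. A minimal sketch, with a hypothetical helper name, that builds the same cmake arguments from an NDK root and reports a missing environment variable early:

```python
import os
from pathlib import Path


def android_configure_command(ndk_root, abi="arm64-v8a", source_dir=".."):
    """Build the cmake configure arguments used above from an NDK root path."""
    toolchain = Path(ndk_root) / "build" / "cmake" / "android.toolchain.cmake"
    return [
        "cmake",
        f"-DCMAKE_TOOLCHAIN_FILE={toolchain.as_posix()}",
        f"-DANDROID_ABI={abi}",
        source_dir,
    ]


if __name__ == "__main__":
    ndk = os.environ.get("ANDROID_NDK")
    if ndk is None:
        print("ANDROID_NDK is not set; export it before configuring")
    else:
        print(" ".join(android_configure_command(ndk)))
```

The default source_dir of ".." matches running the command from inside cmake-android-out, as in the steps above.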

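Build an AAR from Source for App Integration¶

Prerequisites: the Android NDK from the previous section, and the Android SDK (Android Studio is recommended).

Assuming the Android NDK and SDK are available, run: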

export ANDROID_ABIS=arm64-v8a
export BUILD_AAR_DIR=aar-out
mkdir -p $BUILD_AAR_DIR
sh scripts/build_android_library.sh

This script builds the AAR, which contains the Java API and its corresponding JNI library. See this documentation for usage.

iOS¶

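For iOS, we build frameworks instead of static libraries; the frameworks also contain the public headers.

Install Xcode from the Mac App Store, then install the Command Line Tools from the terminal: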

xcode-select --install

Build the frameworks:

./scripts/build_apple_frameworks.sh

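Run the above command with the --help flag to learn more about building additional backends (such as Core ML, MPS, or XNNPACK). Note that some backends may require additional dependencies and specific versions of Xcode and iOS.

Copy the generated .xcframework bundles into your Xcode project, link them against your targets, and don't forget to add the extra linker flag -all_load.

See the iOS Demo App tutorial for more information.

Next Steps¶

You have successfully cross-compiled the executor_runner binary for the iOS and Android platforms. You can start exploring advanced features and capabilities. Here is a list of sections you might want to read next:

Selective build, which builds a runtime that links only the kernels the program uses and can significantly reduce binary size.
Tutorials on building the Android and iOS demo apps.
Tutorials on deploying applications to embedded devices such as ARM Cortex-M/Ethos-U and XTensa HiFi DSP.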