Faster Hash Operators

CUDA Operators

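template<typename TInput, typename TIdentity> void _zero_collision_hash_cuda(Tensor &output, Tensor &evict_slots, const Tensor &input, Tensor &identities, int64_t max_probe, bool circular_probe, int64_t cur_hour, bool readonly, bool support_evict, const std::optional<Tensor> &local_sizes, const std::optional<Tensor> &offsets, int32_t hash_identity, const std::optional<Tensor> &metadata, bool disable_fallback, const std::optional<Tensor> &input_metadata, int64_t eviction_threshold, int64_t eviction_policy, int64_t opt_in_prob, int64_t num_reserved_slots, const std::optional<Tensor> &opt_in_rands)

CUDA implementation of zero-collision hashing. This function performs zero-collision hashing on the feature IDs in the input tensor and returns the remapped IDs in the output tensor. If an eviction policy is enabled, it also updates the metadata table. Specifically, it performs the following steps:

For each input feature ID, compute its hash value with the MurmurHash3 algorithm. The hash value is used to look up a slot in the identity table (the tensor named identities).
Check whether the slot indexed by the hash value in the identity table is empty. If it is, insert the feature ID into that slot and return the hash value as the remapped ID.
If the slot is not empty, linearly probe the following slots until an empty slot is found or the maximum number of probes is reached. If an empty slot is found, insert the feature ID into it and return the index of that slot as the remapped ID.
If no empty slot is found, find an evictable slot according to the eviction policy and evict the feature ID stored there. Then insert the current feature ID into the evicted slot and return that slot's index as the remapped ID. The metadata table is updated accordingly.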
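
The first three steps (hash, check, linear probe) can be illustrated with a small host-side sketch. This is not the CUDA kernel: it assumes a flat `std::vector` as the identity table, uses `-1` as the empty marker, and replaces MurmurHash3 with a hypothetical identity hash (`toy_hash`) so the probe sequence is easy to follow. The names `kEmptySlot`, `toy_hash`, and `probe_and_insert` are illustrative, and returning the existing slot for an ID that is already stored is an assumption about the intended behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Marker for an unused slot (assumption: -1, as in the ZCH buffers).
constexpr int64_t kEmptySlot = -1;

// Hypothetical stand-in for MurmurHash3: the identity function.
inline uint64_t toy_hash(int64_t id) {
  return static_cast<uint64_t>(id);
}

// Returns the slot assigned to `id`, or -1 if no usable slot is found
// within `max_probe` probes. Eviction (the fourth step) is not handled here.
int64_t probe_and_insert(std::vector<int64_t>& identities, int64_t id,
                         int64_t max_probe, bool circular_probe) {
  assert(!identities.empty());
  const int64_t size = static_cast<int64_t>(identities.size());
  const int64_t start =
      static_cast<int64_t>(toy_hash(id) % static_cast<uint64_t>(size));
  for (int64_t i = 0; i < max_probe; ++i) {
    int64_t slot = start + i;
    if (slot >= size) {
      if (!circular_probe) {
        break;  // ran off the end of the table without wrapping
      }
      slot %= size;  // wrap around to the beginning of the table
    }
    if (identities[slot] == id) {
      return slot;  // ID already stored: reuse its slot (assumption)
    }
    if (identities[slot] == kEmptySlot) {
      identities[slot] = id;  // claim the empty slot
      return slot;
    }
  }
  return -1;
}
```

With a table of four slots, the IDs 1, 5, and 9 all hash to slot 1 in this sketch, so they land in slots 1, 2, and 3 in insertion order. A fourth colliding ID only finds slot 0 when `circular_probe` is true.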

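Parameters:

output – the output tensor, modified in place
evict_slots – the slots to be evicted
input – the input tensor
identities – the identity tensor
max_probe – the maximum number of probes
circular_probe – whether to use circular probing
cur_hour – the current hour
readonly – whether to use read-only mode
support_evict – whether eviction is supported
local_sizes – the local sizes tensor
offsets – the offsets tensor
hash_identity – whether to hash the identity
metadata – the metadata tensor
disable_fallback – whether to disable fallback
input_metadata – the input metadata tensor
eviction_threshold – the eviction threshold
eviction_policy – the eviction policy
opt_in_prob – the opt-in probability
num_reserved_slots – the number of reserved slots
opt_in_rands – the tensor of opt-in random numbers

Returns:

None (the output tensor is modified in place)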

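Tensor murmur_hash3_cuda(const Tensor &input, int64_t y, int64_t seed)

Murmur hash operator for CUDA devices.

This function implements the Murmur hash algorithm. Given an input tensor, a y value, and a seed value, it returns the hash values of the input tensor. The hash is computed with the MurmurHash3 x64 algorithm as implemented by the murmur_hash3_2x64 function in common_utils.cuh.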
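
To give a feel for how an input value, a y value, and a seed are combined, here is a scalar sketch built from the publicly known MurmurHash3 x64 constants and its `fmix64` finalizer. It is not the code in common_utils.cuh: the helper name `murmur3_mix_2x64`, the choice to fold the two 64-bit halves with XOR, and the treatment of each element as a single 64-bit word are assumptions made for illustration.

```cpp
#include <cstdint>

// Rotate left; r must be in [1, 63].
inline uint64_t rotl64(uint64_t x, int r) {
  return (x << r) | (x >> (64 - r));
}

// Finalization mix of MurmurHash3 x64.
inline uint64_t fmix64(uint64_t k) {
  k ^= k >> 33;
  k *= 0xff51afd7ed558ccdULL;
  k ^= k >> 33;
  k *= 0xc4ceb9fe1a85ec53ULL;
  k ^= k >> 33;
  return k;
}

// Hypothetical helper: hashes the 16-byte block (x, y) with `seed`,
// following the body and tail of MurmurHash3 x64_128, then folds the
// two 64-bit state words into one value (the fold is an assumption).
inline uint64_t murmur3_mix_2x64(uint64_t x, uint64_t y, uint64_t seed) {
  const uint64_t c1 = 0x87c37b91114253d5ULL;
  const uint64_t c2 = 0x4cf5ad432745937fULL;
  uint64_t h1 = seed;
  uint64_t h2 = seed;

  // Mix the first 64-bit word into h1.
  uint64_t k1 = x;
  k1 *= c1; k1 = rotl64(k1, 31); k1 *= c2;
  h1 ^= k1;
  h1 = rotl64(h1, 27); h1 += h2; h1 = h1 * 5 + 0x52dce729;

  // Mix the second 64-bit word into h2.
  uint64_t k2 = y;
  k2 *= c2; k2 = rotl64(k2, 33); k2 *= c1;
  h2 ^= k2;
  h2 = rotl64(h2, 31); h2 += h1; h2 = h2 * 5 + 0x38495ab5;

  // Finalization: fold in the block length (16 bytes) and avalanche.
  h1 ^= 16; h2 ^= 16;
  h1 += h2; h2 += h1;
  h1 = fmix64(h1); h2 = fmix64(h2);
  h1 += h2; h2 += h1;
  return h1 ^ h2;
}
```

An operator of this kind would apply such a mix to every element of the input tensor with the same y and seed, so equal inputs always map to equal hash values.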

Parameters:

input – the input tensor
y – the y value
seed – the seed value

Returns:

The output hash values

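CPU Operators

std::tuple<Tensor, Tensor> create_zch_buffer_cpu(const int64_t size, bool support_evict, std::optional<at::Device> device, bool long_type)

Creates the buffers for the ZCH identity table and metadata table. This function declares and initializes both tables. The identity table is a tensor of size [size, 1], and the metadata table is also a tensor of size [size, 1]. Every slot in both tables is initialized to the default value -1.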
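
The buffer layout described above can be mimicked without any tensor library. The sketch below is only an analogy: `make_zch_buffers` is a hypothetical name, each table is modelled as a `std::vector<int64_t>` of length `size` standing in for a [size, 1] tensor, and the `support_evict`, `device`, and `long_type` options are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Table = std::vector<int64_t>;

// Hypothetical stand-in for the buffer creation step: returns an identity
// table and a metadata table, both of length `size` and filled with -1.
std::pair<Table, Table> make_zch_buffers(int64_t size) {
  assert(size >= 0);
  Table identities(static_cast<std::size_t>(size), -1);
  Table metadata(static_cast<std::size_t>(size), -1);
  return {identities, metadata};
}
```

Because -1 marks an unused slot, a freshly created table reports every slot as empty to a probing routine.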

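Parameters:

size – the target tensor dimension
support_evict – whether eviction is supported
device – the device on which to allocate the tensors
long_type – whether to use a long type for the tensors

Returns: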

A tuple of two tensors holding the identity table and the metadata table buffers.

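Tensor murmur_hash3_cpu(const Tensor &input, int64_t y, int64_t seed)

Murmur hash operator for CPU.

This function implements the Murmur hash algorithm. Given an input tensor, a y value, and a seed value, it returns the hash values of the input tensor. The hash is computed with the MurmurHash3 x64 algorithm as implemented by the murmur_hash3_2x64 function in common_utils.cuh.

Parameters:

input – the input tensor
y – the y value
seed – the seed value

Returns:

The output hash values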

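std::tuple<Tensor, Tensor> zero_collision_hash_cpu(const Tensor &input, Tensor &identities, int64_t max_probe, bool circular_probe, int64_t exp_hours, bool readonly, const std::optional<Tensor> &local_sizes, const std::optional<Tensor> &offsets, const std::optional<Tensor> &metadata, bool, bool disable_fallback, bool _modulo_identity_DPRECATED, const std::optional<Tensor> &input_metadata, int64_t eviction_threshold, int64_t, int64_t opt_in_prob, int64_t num_reserved_slots, const std::optional<Tensor> &opt_in_rands)

Zero-collision hash operator for CPU.

This function performs zero-collision hashing on the feature IDs in the input tensor and returns the remapped IDs in the output tensor. If an eviction policy is enabled, it also updates the metadata table. Specifically, it performs the following steps:

For each input feature ID, compute its hash value with the MurmurHash3 algorithm. The hash value is used to look up a slot in the identity table (the tensor named identities).
Check whether the slot indexed by the hash value in the identity table is empty. If it is, insert the feature ID into that slot and return the hash value as the remapped ID.
If the slot is not empty, linearly probe the following slots until an empty slot is found or the maximum number of probes is reached. If an empty slot is found, insert the feature ID into it and return the index of that slot as the remapped ID.
If no empty slot is found, find an evictable slot according to the eviction policy and evict the feature ID stored there. Then insert the current feature ID into the evicted slot and return that slot's index as the remapped ID. The metadata table is updated accordingly.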
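
The last step, eviction, depends on the policy in use. As one possible illustration, the sketch below assumes a purely time-based policy: the metadata table stores the hour at which each slot was last seen, and a slot becomes evictable once it is older than `exp_hours`. The function names and this particular policy are assumptions, not the operator's actual logic.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the first slot in the probe window [start, start + max_probe)
// whose entry is older than exp_hours, or -1 if none qualifies.
int64_t find_evictable_slot(const std::vector<int64_t>& last_seen_hour,
                            int64_t start, int64_t max_probe,
                            int64_t cur_hour, int64_t exp_hours) {
  const int64_t size = static_cast<int64_t>(last_seen_hour.size());
  for (int64_t i = 0; i < max_probe; ++i) {
    const int64_t slot = start + i;
    if (slot >= size) {
      break;  // non-circular probing: stop at the end of the table
    }
    if (cur_hour - last_seen_hour[slot] > exp_hours) {
      return slot;
    }
  }
  return -1;
}

// Evicts the ID in an expired slot, stores `id` there, and refreshes the
// slot's metadata. Returns the slot index, or -1 if nothing can be evicted.
int64_t evict_and_insert(std::vector<int64_t>& identities,
                         std::vector<int64_t>& last_seen_hour, int64_t id,
                         int64_t start, int64_t max_probe, int64_t cur_hour,
                         int64_t exp_hours) {
  assert(identities.size() == last_seen_hour.size());
  const int64_t slot =
      find_evictable_slot(last_seen_hour, start, max_probe, cur_hour, exp_hours);
  if (slot < 0) {
    return -1;
  }
  identities[slot] = id;            // overwrite the evicted feature ID
  last_seen_hour[slot] = cur_hour;  // update the metadata table accordingly
  return slot;
}
```

The slot index returned here plays the role of both the remapped ID and the entry reported as evicted.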

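Parameters:

input – the input tensor
identities – the identity table
max_probe – the maximum number of probes
circular_probe – whether to use circular probing
exp_hours – the number of hours after which an identity table entry expires
readonly – whether to use read-only mode
local_sizes – the local sizes tensor
offsets – the offsets tensor
metadata – the metadata tensor
output_on_uvm – whether to place the output on UVM
disable_fallback – whether to disable fallback
_modulo_identity_DPRECATED – modulo identity (deprecated)
input_metadata – the input metadata tensor
eviction_threshold – the eviction threshold
eviction_policy – the eviction policy
opt_in_prob – the opt-in probability
num_reserved_slots – the number of reserved slots
opt_in_rands – the tensor of opt-in random numbers

Returns:

A tuple of two tensors: the first is the output tensor and the second contains the slots to be evicted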

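std::tuple<Tensor, Tensor> zero_collision_hash_meta(const Tensor &input, Tensor&, int64_t, bool, int64_t, bool, const std::optional<Tensor>&, const std::optional<Tensor>&, const std::optional<Tensor>&, bool, bool, bool, const std::optional<Tensor>&, int64_t, int64_t, int64_t, int64_t, const std::optional<Tensor>&)

Zero-collision hash operator for the Meta device.

This function performs zero-collision hashing on the feature IDs in the input tensor and returns the remapped IDs in the output tensor. If an eviction policy is enabled, it also updates the metadata table. Specifically, it performs the following steps:

For each input feature ID, compute its hash value with the MurmurHash3 algorithm. The hash value is used to look up a slot in the identity table (the tensor named identities).
Check whether the slot indexed by the hash value in the identity table is empty. If it is, insert the feature ID into that slot and return the hash value as the remapped ID.
If the slot is not empty, linearly probe the following slots until an empty slot is found or the maximum number of probes is reached. If an empty slot is found, insert the feature ID into it and return the index of that slot as the remapped ID.
If no empty slot is found, find an evictable slot according to the eviction policy and evict the feature ID stored there. Then insert the current feature ID into the evicted slot and return that slot's index as the remapped ID. The metadata table is updated accordingly.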

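Parameters:

input – the input tensor
identities – the identity table
max_probe – the maximum number of probes
circular_probe – whether to use circular probing
exp_hours – the number of hours after which an identity table entry expires
readonly – whether to use read-only mode
local_sizes – the local sizes tensor
offsets – the offsets tensor
metadata – the metadata tensor
output_on_uvm – whether to place the output on UVM
disable_fallback – whether to disable fallback
_modulo_identity_DPRECATED – modulo identity (deprecated)
input_metadata – the input metadata tensor
eviction_threshold – the eviction threshold
eviction_policy – the eviction policy
opt_in_prob – the opt-in probability
num_reserved_slots – the number of reserved slots
opt_in_rands – the tensor of opt-in random numbers

Returns:

A tuple of two tensors: the first is the output tensor and the second contains the slots to be evicted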

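Tensor murmur_hash3_meta(const Tensor &input, int64_t y, int64_t seed)

Murmur hash operator for the Meta device.

This function implements the Murmur hash algorithm. Given an input tensor, a y value, and a seed value, it returns the hash values of the input tensor. The hash is computed with the MurmurHash3 x64 algorithm as implemented by the murmur_hash3_2x64 function in common_utils.cuh.

Parameters:

input – the input tensor
y – the y value
seed – the seed value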
