Implementing BERT in NumPy: train the model with Hugging Face or PyTorch, save the parameters in NumPy format, then load them with NumPy for inference (runs on a Raspberry Pi)

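I previously implemented an MLP, a CNN, and an LSTM in NumPy. This time I'm going for something bigger: BERT, in pure NumPy. The main point is that it can run inference on a Raspberry Pi or on any other board where PyTorch cannot be installed.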

The model is a 7-class news classification model that I picked more or less at random on Hugging Face.

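Here are the model parameters. The individual entries don't matter much; the point is that the model takes up more than 400 MB on disk.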

bert.embeddings.word_embeddings.weight torch.Size([21128, 768])

bert.embeddings.position_embeddings.weight torch.Size([512, 768])

bert.embeddings.token_type_embeddings.weight torch.Size([2, 768])

bert.embeddings.LayerNorm.weight torch.Size([768])

bert.embeddings.LayerNorm.bias torch.Size([768])

bert.encoder.layer.0.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.0.attention.self.query.bias torch.Size([768])

bert.encoder.layer.0.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.0.attention.self.key.bias torch.Size([768])

bert.encoder.layer.0.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.0.attention.self.value.bias torch.Size([768])

bert.encoder.layer.0.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.0.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.0.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.0.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.0.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.0.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.0.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.0.output.dense.bias torch.Size([768])

bert.encoder.layer.0.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.0.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.1.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.1.attention.self.query.bias torch.Size([768])

bert.encoder.layer.1.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.1.attention.self.key.bias torch.Size([768])

bert.encoder.layer.1.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.1.attention.self.value.bias torch.Size([768])

bert.encoder.layer.1.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.1.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.1.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.1.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.1.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.1.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.1.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.1.output.dense.bias torch.Size([768])

bert.encoder.layer.1.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.1.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.2.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.2.attention.self.query.bias torch.Size([768])

bert.encoder.layer.2.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.2.attention.self.key.bias torch.Size([768])

bert.encoder.layer.2.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.2.attention.self.value.bias torch.Size([768])

bert.encoder.layer.2.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.2.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.2.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.2.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.2.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.2.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.2.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.2.output.dense.bias torch.Size([768])

bert.encoder.layer.2.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.2.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.3.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.3.attention.self.query.bias torch.Size([768])

bert.encoder.layer.3.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.3.attention.self.key.bias torch.Size([768])

bert.encoder.layer.3.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.3.attention.self.value.bias torch.Size([768])

bert.encoder.layer.3.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.3.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.3.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.3.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.3.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.3.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.3.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.3.output.dense.bias torch.Size([768])

bert.encoder.layer.3.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.3.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.4.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.4.attention.self.query.bias torch.Size([768])

bert.encoder.layer.4.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.4.attention.self.key.bias torch.Size([768])

bert.encoder.layer.4.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.4.attention.self.value.bias torch.Size([768])

bert.encoder.layer.4.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.4.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.4.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.4.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.4.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.4.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.4.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.4.output.dense.bias torch.Size([768])

bert.encoder.layer.4.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.4.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.5.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.5.attention.self.query.bias torch.Size([768])

bert.encoder.layer.5.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.5.attention.self.key.bias torch.Size([768])

bert.encoder.layer.5.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.5.attention.self.value.bias torch.Size([768])

bert.encoder.layer.5.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.5.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.5.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.5.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.5.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.5.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.5.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.5.output.dense.bias torch.Size([768])

bert.encoder.layer.5.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.5.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.6.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.6.attention.self.query.bias torch.Size([768])

bert.encoder.layer.6.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.6.attention.self.key.bias torch.Size([768])

bert.encoder.layer.6.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.6.attention.self.value.bias torch.Size([768])

bert.encoder.layer.6.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.6.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.6.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.6.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.6.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.6.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.6.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.6.output.dense.bias torch.Size([768])

bert.encoder.layer.6.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.6.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.7.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.7.attention.self.query.bias torch.Size([768])

bert.encoder.layer.7.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.7.attention.self.key.bias torch.Size([768])

bert.encoder.layer.7.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.7.attention.self.value.bias torch.Size([768])

bert.encoder.layer.7.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.7.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.7.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.7.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.7.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.7.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.7.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.7.output.dense.bias torch.Size([768])

bert.encoder.layer.7.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.7.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.8.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.8.attention.self.query.bias torch.Size([768])

bert.encoder.layer.8.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.8.attention.self.key.bias torch.Size([768])

bert.encoder.layer.8.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.8.attention.self.value.bias torch.Size([768])

bert.encoder.layer.8.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.8.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.8.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.8.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.8.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.8.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.8.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.8.output.dense.bias torch.Size([768])

bert.encoder.layer.8.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.8.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.9.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.9.attention.self.query.bias torch.Size([768])

bert.encoder.layer.9.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.9.attention.self.key.bias torch.Size([768])

bert.encoder.layer.9.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.9.attention.self.value.bias torch.Size([768])

bert.encoder.layer.9.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.9.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.9.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.9.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.9.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.9.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.9.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.9.output.dense.bias torch.Size([768])

bert.encoder.layer.9.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.9.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.10.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.10.attention.self.query.bias torch.Size([768])

bert.encoder.layer.10.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.10.attention.self.key.bias torch.Size([768])

bert.encoder.layer.10.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.10.attention.self.value.bias torch.Size([768])

bert.encoder.layer.10.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.10.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.10.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.10.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.10.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.10.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.10.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.10.output.dense.bias torch.Size([768])

bert.encoder.layer.10.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.10.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.11.attention.self.query.weight torch.Size([768, 768])

bert.encoder.layer.11.attention.self.query.bias torch.Size([768])

bert.encoder.layer.11.attention.self.key.weight torch.Size([768, 768])

bert.encoder.layer.11.attention.self.key.bias torch.Size([768])

bert.encoder.layer.11.attention.self.value.weight torch.Size([768, 768])

bert.encoder.layer.11.attention.self.value.bias torch.Size([768])

bert.encoder.layer.11.attention.output.dense.weight torch.Size([768, 768])

bert.encoder.layer.11.attention.output.dense.bias torch.Size([768])

bert.encoder.layer.11.attention.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.11.attention.output.LayerNorm.bias torch.Size([768])

bert.encoder.layer.11.intermediate.dense.weight torch.Size([3072, 768])

bert.encoder.layer.11.intermediate.dense.bias torch.Size([3072])

bert.encoder.layer.11.output.dense.weight torch.Size([768, 3072])

bert.encoder.layer.11.output.dense.bias torch.Size([768])

bert.encoder.layer.11.output.LayerNorm.weight torch.Size([768])

bert.encoder.layer.11.output.LayerNorm.bias torch.Size([768])

bert.pooler.dense.weight torch.Size([768, 768])

bert.pooler.dense.bias torch.Size([768])

classifier.weight torch.Size([7, 768])

classifier.bias torch.Size([7])
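As a rough cross-check of the 400 MB figure, the shapes listed above can be summed directly. This is a minimal sketch that hard-codes the shapes from the listing and assumes every parameter is stored as a 4-byte float32; the helper name `count_params` is made up for illustration.

```python
import math

# Shapes copied from the parameter listing above.
embeddings = [(21128, 768), (512, 768), (2, 768), (768,), (768,)]
encoder_layer = [
    (768, 768), (768,),    # query
    (768, 768), (768,),    # key
    (768, 768), (768,),    # value
    (768, 768), (768,),    # attention output dense
    (768,), (768,),        # attention output LayerNorm
    (3072, 768), (3072,),  # intermediate dense
    (768, 3072), (768,),   # output dense
    (768,), (768,),        # output LayerNorm
]
head = [(768, 768), (768,), (7, 768), (7,)]  # pooler + classifier

def count_params(shapes):
    """Sum the number of scalars over a list of shapes (hypothetical helper)."""
    return sum(math.prod(shape) for shape in shapes)

total = count_params(embeddings) + 12 * count_params(encoder_layer) + count_params(head)
size_mb = total * 4 / 1e6  # assumes float32, 4 bytes per value

print(total, round(size_mb, 1))
```

Under those assumptions the total comes to roughly 102 million parameters, or about 409 MB, which is consistent with the file size mentioned above.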

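Getting the NumPy version of BERT to work cost me two days of pitfalls. I compared it step by step against the Hugging Face source code, and it was really hard.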

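This is the BERT code implemented in NumPy. The scores differ very slightly from Hugging Face's, possibly because the model is large and small errors in the saved parameters accumulate.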
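One candidate for part of that gap, offered as an assumption rather than something this post verified: the `gelu` branch in the code below uses the tanh approximation, whereas the exact GELU is defined with the error function. The sketch compares the two using the standard library's `math.erf`; the function names `gelu_tanh` and `gelu_erf` are illustrative.

```python
import math
import numpy as np

def gelu_tanh(x):
    """Tanh approximation of GELU (the formula used in feed_forward_layer)."""
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * np.power(x, 3))))

# math.erf only accepts scalars, so vectorize it to apply it element-wise.
_erf = np.vectorize(math.erf)

def gelu_erf(x):
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1 + _erf(x / np.sqrt(2.0)))

x = np.linspace(-5, 5, 1001)
max_diff = np.max(np.abs(gelu_tanh(x) - gelu_erf(x)))
print(max_diff)
```

Whether this matters for the final score depends on which activation the checkpoint's configuration specifies, which was not checked here.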

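Reading the code below is a direct way to understand the structure of the BERT model: the details are simple and all in place. I'm honestly a little impressed with myself for digging into this.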

import numpy as np

def word_embedding(input_ids, word_embeddings):

    return word_embeddings[input_ids]

def position_embedding(position_ids, position_embeddings):

    return position_embeddings[position_ids]

def token_type_embedding(token_type_ids, token_type_embeddings):

    return token_type_embeddings[token_type_ids]

def softmax(x, axis=None):

    # e_x = np.exp(x).astype(np.float32) #

    e_x = np.exp(x - np.max(x, axis=axis, keepdims=True))

    sum_ex = np.sum(e_x, axis=axis,keepdims=True).astype(np.float32)

    return e_x / sum_ex

def scaled_dot_product_attention(Q, K, V, mask=None):

    d_k = Q.shape[-1]

    scores = np.matmul(Q, K.transpose(0, 2, 1)) / np.sqrt(d_k)

    if mask is not None:

        scores = np.where(mask, scores, np.full_like(scores, -np.inf))

    attention_weights = softmax(scores, axis=-1)

    # print(attention_weights)

    # print(np.sum(attention_weights,axis=-1))

    output = np.matmul(attention_weights, V)

    return output, attention_weights

def multihead_attention(input, num_heads,W_Q,B_Q,W_K,B_K,W_V,B_V,W_O,B_O):

    q = np.matmul(input, W_Q.T)+B_Q

    k = np.matmul(input, W_K.T)+B_K

    v = np.matmul(input, W_V.T)+B_V

    # Split the Q, K, V projections into separate heads

    q = np.split(q, num_heads, axis=-1)

    k = np.split(k, num_heads, axis=-1)

    v = np.split(v, num_heads, axis=-1)

    outputs = []

    for q_,k_,v_ in zip(q,k,v):

        output, attention_weights = scaled_dot_product_attention(q_, k_, v_)

        outputs.append(output)

    outputs = np.concatenate(outputs, axis=-1)

    outputs = np.matmul(outputs, W_O.T)+B_O

    return outputs

def layer_normalization(x, weight, bias, eps=1e-12):

    mean = np.mean(x, axis=-1, keepdims=True)

    variance = np.var(x, axis=-1, keepdims=True)

    std = np.sqrt(variance + eps)

    normalized_x = (x - mean) / std

    output = weight * normalized_x + bias

    return output

def feed_forward_layer(inputs, weight, bias, activation='relu'):

    linear_output = np.matmul(inputs,weight) + bias

    if activation == 'relu':

        activated_output = np.maximum(0, linear_output)  # ReLU activation

    elif activation == 'gelu':

        activated_output = 0.5 * linear_output * (1 + np.tanh(np.sqrt(2 / np.pi) * (linear_output + 0.044715 * np.power(linear_output, 3))))  # GELU activation (tanh approximation)

    elif activation == "tanh" :

        activated_output = np.tanh(linear_output)

    else:

        activated_output = linear_output  # no activation

    return activated_output

def residual_connection(inputs, residual):

    # Residual connection

    residual_output = inputs + residual

    return residual_output

def tokenize_sentence(sentence, vocab_file = 'vocab.txt'):

    with open(vocab_file, 'r', encoding='utf-8') as f:

        vocab = f.readlines()

        vocab = [i.strip() for i in vocab]

        # print(len(vocab))

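    tokenized_sentence = ['[CLS]'] + list(sentence) + ["[SEP]"]  # wrap the characters with [CLS] and [SEP]

    unk_id = vocab.index('[UNK]')  # fallback id; assumes vocab.txt contains '[UNK]'

    token_ids = [vocab.index(token) if token in vocab else unk_id for token in tokenized_sentence]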

    return token_ids

# Load the saved model parameters

model_data = np.load('bert_model_params.npz')

word_embeddings = model_data["bert.embeddings.word_embeddings.weight"]

position_embeddings = model_data["bert.embeddings.position_embeddings.weight"]

token_type_embeddings = model_data["bert.embeddings.token_type_embeddings.weight"]

def model_input(sentence):

    token_ids = tokenize_sentence(sentence)

    input_ids = np.array(token_ids)  # input token ids

    word_embedded = word_embedding(input_ids, word_embeddings)

    position_ids = np.array(range(len(input_ids)))  # position ids

    # Position embedding matrix, shape (max_position, embedding_size)

    position_embedded = position_embedding(position_ids, position_embeddings)

    token_type_ids = np.array([0]*len(input_ids))  # segment (token type) ids

    # Token type embedding matrix, shape (num_token_types, embedding_size)

    token_type_embedded = token_type_embedding(token_type_ids, token_type_embeddings)

    embedding_output = np.expand_dims(word_embedded + position_embedded + token_type_embedded, axis=0)

    return embedding_output

def bert(input,num_heads):

    ebd_LayerNorm_weight = model_data['bert.embeddings.LayerNorm.weight']

    ebd_LayerNorm_bias = model_data['bert.embeddings.LayerNorm.bias']

    input = layer_normalization(input,ebd_LayerNorm_weight,ebd_LayerNorm_bias)     # matches the Hugging Face output at this point

    for i in range(12):

        # Load this layer's weights, then call the multi-head self-attention function

        W_Q = model_data['bert.encoder.layer.{}.attention.self.query.weight'.format(i)]

        B_Q = model_data['bert.encoder.layer.{}.attention.self.query.bias'.format(i)]

        W_K = model_data['bert.encoder.layer.{}.attention.self.key.weight'.format(i)]

        B_K = model_data['bert.encoder.layer.{}.attention.self.key.bias'.format(i)]

        W_V = model_data['bert.encoder.layer.{}.attention.self.value.weight'.format(i)]

        B_V = model_data['bert.encoder.layer.{}.attention.self.value.bias'.format(i)]

        W_O = model_data['bert.encoder.layer.{}.attention.output.dense.weight'.format(i)]

        B_O = model_data['bert.encoder.layer.{}.attention.output.dense.bias'.format(i)]

        attention_output_LayerNorm_weight = model_data['bert.encoder.layer.{}.attention.output.LayerNorm.weight'.format(i)]

        attention_output_LayerNorm_bias = model_data['bert.encoder.layer.{}.attention.output.LayerNorm.bias'.format(i)]

        intermediate_weight = model_data['bert.encoder.layer.{}.intermediate.dense.weight'.format(i)]

        intermediate_bias = model_data['bert.encoder.layer.{}.intermediate.dense.bias'.format(i)]

        dense_weight = model_data['bert.encoder.layer.{}.output.dense.weight'.format(i)]

        dense_bias = model_data['bert.encoder.layer.{}.output.dense.bias'.format(i)]

        output_LayerNorm_weight = model_data['bert.encoder.layer.{}.output.LayerNorm.weight'.format(i)]

        output_LayerNorm_bias = model_data['bert.encoder.layer.{}.output.LayerNorm.bias'.format(i)]

        output = multihead_attention(input, num_heads,W_Q,B_Q,W_K,B_K,W_V,B_V,W_O,B_O)

        output = residual_connection(input,output)

        output1 = layer_normalization(output,attention_output_LayerNorm_weight,attention_output_LayerNorm_bias)    # matches the Hugging Face output at this point

        output = feed_forward_layer(output1, intermediate_weight.T, intermediate_bias, activation='gelu')

        output = feed_forward_layer(output, dense_weight.T, dense_bias, activation='')

        output = residual_connection(output1,output)

        output2 = layer_normalization(output,output_LayerNorm_weight,output_LayerNorm_bias)   # matches

        input = output2

    bert_pooler_dense_weight = model_data['bert.pooler.dense.weight']

    bert_pooler_dense_bias = model_data['bert.pooler.dense.bias']

    output = feed_forward_layer(output2, bert_pooler_dense_weight.T, bert_pooler_dense_bias, activation='tanh')    # matches

    return output

# for i in model_data:

#     # print(i)

#     print(i,model_data[i].shape)

id2label = {0: 'mainland China politics', 1: 'Hong Kong - Macau politics', 2: 'International news', 3: 'financial news', 4: 'culture', 5: 'entertainment', 6: 'sports'}

if __name__ == "__main__":

    # Example usage

    sentence = '马拉松决赛'  # "marathon final"

    # print(model_input(sentence).shape)

    output = bert(model_input(sentence),num_heads=12)

    # print(output)

    classifier_weight = model_data['classifier.weight']

    classifier_bias = model_data['classifier.bias']

    output = feed_forward_layer(output[:,0,:], classifier_weight.T, classifier_bias, activation='')

    # print(output)

    output = softmax(output,axis=-1)

    label_id = np.argmax(output,axis=-1)

    label_score = output[0][label_id]

    print(id2label[label_id[0]],label_score)
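The per-head loop in `multihead_attention` can also be written as a single batched computation by reshaping to a `(batch, heads, seq, head_dim)` layout. The following is a sketch, not the code used for the results in this post; `attention_loop` and `attention_batched` are illustrative names, and the random inputs stand in for real activations.

```python
import numpy as np

def softmax(x, axis=-1):
    e_x = np.exp(x - np.max(x, axis=axis, keepdims=True))
    return e_x / np.sum(e_x, axis=axis, keepdims=True)

def attention_loop(q, k, v, num_heads):
    """Split into heads and attend one head at a time (mirrors the loop above)."""
    outputs = []
    for q_, k_, v_ in zip(*(np.split(t, num_heads, axis=-1) for t in (q, k, v))):
        scores = np.matmul(q_, k_.transpose(0, 2, 1)) / np.sqrt(q_.shape[-1])
        outputs.append(np.matmul(softmax(scores), v_))
    return np.concatenate(outputs, axis=-1)

def attention_batched(q, k, v, num_heads):
    """Same computation with all heads handled in one matmul."""
    batch, seq, hidden = q.shape
    head_dim = hidden // num_heads

    def split_heads(t):
        # (batch, seq, hidden) -> (batch, heads, seq, head_dim)
        return t.reshape(batch, seq, num_heads, head_dim).transpose(0, 2, 1, 3)

    qh, kh, vh = split_heads(q), split_heads(k), split_heads(v)
    scores = np.matmul(qh, kh.transpose(0, 1, 3, 2)) / np.sqrt(head_dim)
    context = np.matmul(softmax(scores), vh)
    # (batch, heads, seq, head_dim) -> (batch, seq, hidden)
    return context.transpose(0, 2, 1, 3).reshape(batch, seq, hidden)

rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((1, 5, 768)) for _ in range(3))
same = np.allclose(attention_loop(q, k, v, 12), attention_batched(q, k, v, 12))
print(same)
```

The loop version is arguably easier to read next to the Hugging Face source; the batched version avoids twelve small matmuls per layer, which may or may not be noticeable on a small board.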

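This is a pretrained model that someone else published on Hugging Face: a RoBERTa model for 7-class news classification. The script below also saves the model parameters in NumPy format so that the code above can load them.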

import numpy as np

from transformers import AutoModelForSequenceClassification,AutoTokenizer,pipeline

model = AutoModelForSequenceClassification.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')

tokenizer = AutoTokenizer.from_pretrained('uer/roberta-base-finetuned-chinanews-chinese')

text_classification = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer)

print(text_classification("马拉松决赛"))

# print(model)

# Print the shape of each weight in the BERT model

for name, param in model.named_parameters():

    print(name, param.data.shape)

# Save the model parameters in NumPy format

model_params = {name: param.data.cpu().numpy() for name, param in model.named_parameters()}

np.savez('bert_model_params.npz', **model_params)

# model_params
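A quick way to sanity-check the export is to reload the `.npz` file and compare it with the in-memory arrays. The sketch below does this with a small made-up dictionary instead of the real model so that it runs without `transformers`; the two parameter names are reused from the listing, the values are random, and the file name `demo_params.npz` is hypothetical.

```python
import os
import tempfile
import numpy as np

# Stand-in for {name: param.data.cpu().numpy()}; random values, not real weights.
rng = np.random.default_rng(0)
params = {
    "classifier.weight": rng.standard_normal((7, 768)).astype(np.float32),
    "classifier.bias": rng.standard_normal(7).astype(np.float32),
}

path = os.path.join(tempfile.mkdtemp(), "demo_params.npz")
np.savez(path, **params)

# np.load returns a lazy, dict-like NpzFile; arrays are read on access.
with np.load(path) as loaded:
    keys_match = sorted(loaded.files) == sorted(params)
    values_match = all(np.array_equal(loaded[name], params[name]) for name in params)

print(keys_match, values_match)
```

The same check could be pointed at `bert_model_params.npz` and `model_params` on the machine where the export script runs.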

Comparing the two results:

Hugging Face: [{'label': 'sports', 'score': 0.9929242134094238}]

NumPy: sports [0.9928773]
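To put a number on the gap between the two scores above, the following sketch computes the absolute and relative difference. The tolerance `1e-4` is an arbitrary choice for illustration.

```python
import numpy as np

hf_score = 0.9929242134094238  # Hugging Face pipeline output
np_score = 0.9928773           # NumPy implementation output

abs_diff = abs(hf_score - np_score)
rel_diff = abs_diff / hf_score
close = np.isclose(hf_score, np_score, rtol=0, atol=1e-4)

print(f"abs diff = {abs_diff:.2e}, rel diff = {rel_diff:.2e}, within 1e-4: {close}")
```

The difference is a little under 5e-5, so both implementations agree on the label and on the first four decimal places of the score.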
