from transformers import AutoModel, AutoProcessor

transformers is a natural language processing (NLP) library developed and maintained by Hugging Face. It provides a wide range of pretrained models that can be used for tasks such as text classification, information extraction, and natural language generation.

The library includes many widely used NLP models, such as BERT, GPT, RoBERTa, and T5.

AutoModel and AutoProcessor are two classes in the transformers library, used to load pretrained models and processors.

AutoModel loads a pretrained model: given a model name, it automatically selects the matching model class and returns a ready-to-use model object.
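As a quick illustration of this auto-selection, the minimal sketch below loads two different public checkpoints and prints the concrete class that AutoModel chose for each (both names are standard Hugging Face Hub checkpoint IDs):

from transformers import AutoModel

# AutoModel reads each checkpoint's config and instantiates the matching class
bert = AutoModel.from_pretrained('bert-base-uncased')
gpt2 = AutoModel.from_pretrained('gpt2')

print(type(bert).__name__)  # BertModel
print(type(gpt2).__name__)  # GPT2Model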

AutoProcessor loads a processor in the same way: given a model name, it automatically selects the matching processor and returns a ready-to-use object. For downstream tasks, raw text must first be converted into model inputs, and that conversion is exactly what the processor does.
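To see what the processor actually produces, the sketch below encodes a sentence and inspects the result. One caveat: for a text-only checkpoint like bert-base-uncased, recent transformers versions let AutoProcessor fall back to the tokenizer; on older versions, AutoTokenizer is the safer choice for text-only models:

from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained('bert-base-uncased')
encoded = processor("Hello, world!", return_tensors='pt')

# The encoding is a dict of tensors the model accepts directly
print(encoded.keys())        # input_ids, token_type_ids, attention_mask
print(encoded['input_ids'])  # e.g. tensor([[ 101, 7592, 1010, 2088,  999,  102]])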

The following example uses AutoModel and AutoProcessor to load a pretrained model and its processor:

from transformers import AutoModel, AutoProcessor

# Load the pretrained model and its processor by checkpoint name
model_name = 'bert-base-uncased'
model = AutoModel.from_pretrained(model_name)
processor = AutoProcessor.from_pretrained(model_name)

# Convert the raw text into model inputs (token IDs, attention mask, ...)
text = "Hello, world!"
tokens = processor(text, return_tensors='pt', padding=True, truncation=True)

# Run the model on the encoded inputs
outputs = model(**tokens)

# Print the model outputs
print(outputs)

The output is:

Some weights of the model checkpoint at bert-base-uncased were not used when initializing BertModel: ['cls.predictions.decoder.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.bias', 'cls.predictions.transform.dense.weight', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[-0.0781,  0.1587,  0.0400,  ..., -0.2805,  0.0248,  0.4081],
         [-0.2016,  0.1781,  0.4184,  ..., -0.2522,  0.3630, -0.0979],
         [-0.7156,  0.6751,  0.6017,  ..., -1.1032,  0.0797,  0.0567],
         [ 0.0527, -0.1483,  1.3609,  ..., -0.4513,  0.1274,  0.2655],
         [-0.7122, -0.4815, -0.1438,  ...,  0.5602, -0.1062, -0.1301],
         [ 0.9955,  0.1328, -0.0621,  ...,  0.2460, -0.6502, -0.3296]]],
       grad_fn=<NativeLayerNormBackward0>), pooler_output=tensor([[-0.8130, -0.2470, -0.7289,  0.5582,  0.3357, -0.0758,  0.7851,  0.1526,
         -0.5705, -0.9997, -0.3183,  0.7643,  0.9550,  0.5801,  0.9046, -0.6037,
         -0.3113, -0.5445,  0.3740, -0.4197,  0.5471,  0.9996,  0.0560,  0.2710,
          0.3869,  0.9316, -0.7260,  0.8900,  0.9311,  0.5901, -0.5208,  0.0532,
         -0.9711, -0.1791, -0.8414, -0.9663,  0.2318, -0.6239,  0.0885,  0.1203,
         -0.8333,  0.1662,  0.9993,  0.1384,  0.1207, -0.3476, -1.0000,  0.2947,
         -0.7443,  0.7037,  0.6978,  0.5853,  0.0875,  0.4013,  0.3722,  0.1009,
         -0.1470,  0.1421, -0.2055, -0.4406, -0.6010,  0.2476, -0.7887, -0.8612,
          0.8639,  0.7504, -0.0738, -0.2541,  0.0941, -0.1272,  0.7828,  0.1683,
          0.0685, -0.8279,  0.4741,  0.2687, -0.6123,  1.0000, -0.3837, -0.9341,
          0.5166,  0.5990,  0.5714, -0.2885,  0.4897, -1.0000,  0.2800, -0.1625,
         -0.9728,  0.2292,  0.3729, -0.1447,  0.2490,  0.5224, -0.5050, -0.3634,
         -0.2048, -0.7688, -0.2677, -0.1745, -0.0355, -0.2574, -0.1838, -0.3517,
          0.2785, -0.3823, -0.3204,  0.4208, -0.0671,  0.6005,  0.3758, -0.3386,
          0.4421, -0.9251,  0.5425, -0.2365, -0.9684, -0.5510, -0.9714,  0.4726,
         -0.2355, -0.3178,  0.8958,  0.1285,  0.2222,  0.0103, -0.5784, -1.0000,
         -0.5691, -0.5153, -0.0901, -0.1982, -0.9424, -0.9055,  0.4781,  0.9141,
          0.0904,  0.9976, -0.2006,  0.8990, -0.3713, -0.6045,  0.5630, -0.3681,
          0.7174,  0.1177, -0.4574,  0.1722, -0.0565,  0.2068, -0.5352, -0.1658,
...
         -0.3599, -1.0000,  0.3665, -0.2367,  0.6221, -0.5721,  0.3542, -0.5887,
         -0.9486, -0.2115,  0.1483,  0.6009, -0.4153, -0.6647,  0.4821, -0.1477,
          0.8825,  0.7133, -0.2224,  0.2536,  0.5956, -0.6499, -0.6185,  0.8514]],
       grad_fn=<TanhBackward0>), hidden_states=None, past_key_values=None, attentions=None, cross_attentions=None)

This example loads the BERT model together with its processor, uses the processor to convert the text into model inputs, passes those inputs to the model, and obtains the model's output.
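Building on that output, the sketch below shows how one might pull out the two most commonly used fields: last_hidden_state (one 768-dimensional vector per token) and pooler_output (a single vector sometimes used as a rough sentence-level embedding). Wrapping the forward pass in torch.no_grad() also avoids the grad_fn entries seen above when gradients are not needed:

import torch
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained('bert-base-uncased')
processor = AutoProcessor.from_pretrained('bert-base-uncased')

tokens = processor("Hello, world!", return_tensors='pt')
with torch.no_grad():  # inference only, so skip gradient tracking
    outputs = model(**tokens)

# One hidden vector per token: (batch, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)  # torch.Size([1, 6, 768])
# The [CLS] vector passed through a dense + tanh head
print(outputs.pooler_output.shape)      # torch.Size([1, 768])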
