First, install the dependencies:
!pip install bitsandbytes modelscope
!pip install --upgrade accelerate
!pip install git+https://github.com/huggingface/transformers

Next, load the model and quantize it to 4-bit:
from modelscope import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen1.5-MoE-A2.7B-Chat")
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = AutoModelForCausalLM.from_pretrained("qwen/Qwen1.5-MoE-A2.7B-Chat", quantization_config=bnb_config)

Define the chat function:
def qwen_moe_chat(prompt: str) -> str:
    # Wrap the prompt in the chat format the model expects.
    messages = [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    model_inputs = tokenizer([text], return_tensors="pt").to('cuda')
    generated_ids = model.generate(model_inputs.input_ids, max_new_tokens=512)
    # Drop the prompt tokens so that only the newly generated tokens are decoded.
    generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return response

Finally, evaluate the model on a set of book reviews:
book_review = ["很生气,一晚上看完,只有生气。太矫情了",
"这个标题真是太贴切了,真的是罪,真的是美。",
"没有才调,看在材料份上加一星。",
"电影更值得一看",
"历史书做成这样真是太赞了!",
"废话太多",
"啥玩意,情色系的啊。故事一般。看的困。",
"你是猴子请来的逗逼吗。",
"人生读过最狗血的书之一 除了对了解穆斯林信仰风俗有所帮助之外都是狗血",
"作为资深影迷,这本书必读",
"两章果断弃!",
"跟看我高中同学的日记本差不多。",
"文不对题,读不下去。",
"莫非法国人的法语水平都堕落了?",
"就不加友情分了…",
"如隔夜白开,索然无味。",
"2015.1025 融合了我喜欢的所有元素,校园爱情、破镜重圆、高干子弟,可是却写不出一篇让人有一口气读下去的好文。",
"没多大意思,文笔俏皮轻佻得刻意。",
"据说抄袭大风刮过的《桃花债》和公子欢喜的《思凡》,呵呵哒",
"第二遍"]
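# The prompt is kept in Chinese, like the reviews. It reads: "Review: {} Classify the
# review above as 好评 (positive) or 差评 (negative); reply with 好评 or 差评 only."
prompt = "评论:{} 请将以上评论分类到 好评 或 差评(你只需要回复 好评 或 差评)"
for review in book_review:
    new_prompt = prompt.format(review)
    response = qwen_moe_chat(new_prompt)
    print(response, review)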
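
The loop above only prints each reply next to its review. To summarize the run, the replies can be mapped to a fixed label set and counted. The following is a minimal sketch, not part of the original walkthrough: `normalize_label`, `tally_labels`, and the `"unknown"` bucket are hypothetical helpers, and `sample_responses` is made-up text for illustration, not real model output.

```python
from collections import Counter

POSITIVE = "好评"  # label the prompt asks the model to use for positive reviews
NEGATIVE = "差评"  # label the prompt asks the model to use for negative reviews


def normalize_label(response: str) -> str:
    """Map a free-form reply to POSITIVE, NEGATIVE, or "unknown" (hypothetical helper)."""
    has_pos = POSITIVE in response
    has_neg = NEGATIVE in response
    if has_pos and not has_neg:
        return POSITIVE
    if has_neg and not has_pos:
        return NEGATIVE
    # Neither label, or both labels, appeared: treat the reply as ambiguous.
    return "unknown"


def tally_labels(responses):
    """Count how many replies fall into each normalized label."""
    return Counter(normalize_label(r) for r in responses)


# Made-up replies for illustration only; these are NOT actual model output.
sample_responses = ["好评", "差评。", "这条评论属于差评", "无法判断", "好评 或 差评"]
print(tally_labels(sample_responses))
```

In the evaluation loop, one way to use this would be to append each `response` to a list and call `tally_labels` on that list once the loop finishes. Replies that contain neither label, or both, land in `"unknown"` so they can be inspected by hand.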


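As a rough sanity check on the 4-bit loading step earlier, weight memory can be estimated from the parameter count and the bits stored per parameter. This is a back-of-the-envelope sketch under stated assumptions: `weight_memory_gb` is a hypothetical helper, and the total of 14.3 billion parameters is an assumption taken from outside this walkthrough. The estimate ignores quantization constants, any layers left unquantized, activations, and the KV cache, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes); hypothetical helper."""
    return num_params * bits_per_param / 8 / 1e9


# Assumed total parameter count for the model; not stated in this walkthrough.
TOTAL_PARAMS = 14.3e9

print(f"bf16 : {weight_memory_gb(TOTAL_PARAMS, 16):.2f} GB")
print(f"4-bit: {weight_memory_gb(TOTAL_PARAMS, 4):.2f} GB")
```

Under these assumptions, the 4-bit weights take roughly a quarter of the bf16 footprint, which is the motivation for passing `load_in_4bit=True` on a single GPU.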