核心内容摘要
揭秘“gb14may18XXXXXL民族”:一段跨越时空的文化回响
MiniCPM4-
5B-QAT-Int4-GPTQ-format · 模型库from modelscope import AutoTokenizer from vllm import LLM, SamplingParams model_name OpenBMB/MiniCPM4-
5B-QAT-Int4-GPTQ-format prompt [{role: user, content: 推荐5个北京的景点。
}] tokenizer AutoTokenizer.from_pretrained(model_name, trust_remote_codeTrue) input_text tokenizer.apply_chat_template(prompt, tokenizeFalse, add_generation_promptTrue) llm LLM( modelmodel_name, quantizationgptq_marlin, trust_remote_codeTrue, max_num_batched_tokens32768, dtypebfloat16, gpu_memory_utilization
8, ) sampling_params SamplingParams(top_p
7, temperature
7, max_tokens1024, repetition_penalty
1.