Published 2024-11-10 09:27
I am trying to use the BERTopic library with a custom text-generation model built on the transformers library, but I am hitting the RuntimeError below. I tried specifying the device as 0 (GPU) in the pipeline, yet the error persists. How can I resolve this?
Please help me understand what causes this error and how to fix it.
2024-08-30 10:44:11,684 - BERTopic - Dimensionality - Completed ✓
2024-08-30 10:44:11,688 - BERTopic - Cluster - Start clustering the reduced embeddings
/usr/local/lib/python3.10/dist-packages/joblib/externals/loky/backend/fork_exec.py:38: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.
pid = os.fork()
2024-08-30 10:44:17,485 - BERTopic - Cluster - Completed ✓
2024-08-30 10:44:17,498 - BERTopic - Representation - Extracting topics from clusters using representation models.
0%| | 0/66 [00:08<?, ?it/s]
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-11-5511f129a54a> in <cell line: 16>()
14 )
15
---> 16 topics, probs = topic_model.fit_transform(docs, embeddings)
13 frames
/usr/local/lib/python3.10/dist-packages/transformers/generation/logits_process.py in __call__(self, input_ids, scores)
351 @add_start_docstrings(LOGITS_PROCESSOR_INPUTS_DOCSTRING)
352 def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor) -> torch.FloatTensor:
--> 353 score = torch.gather(scores, 1, input_ids)
354
355 # if score < 0 then repetition penalty has to be multiplied to reduce the token probabilities
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cpu and cuda:0! (when checking argument for argument index in method wrapper_CUDA_gather)
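For context, the failing line lives in transformers' `RepetitionPenaltyLogitsProcessor`: it gathers the logits of previously generated tokens so it can penalize them. `torch.gather` requires the input and index tensors to be on the same device, and here `scores` is on `cuda:0` while `input_ids` is on `cpu`. A CPU-only sketch of what that line computes:

```python
import torch

# CPU-only sketch of the gather in transformers' RepetitionPenaltyLogitsProcessor:
# pick out the logits of tokens that were already generated.
scores = torch.tensor([[0.5, -1.0, 2.0, 0.1]])   # (batch, vocab) logits
input_ids = torch.tensor([[2, 0]])               # previously generated token ids
score = torch.gather(scores, 1, input_ids)       # scores[0, 2] and scores[0, 0]
print(score.tolist())                            # [[2.0, 0.5]]

# The RuntimeError above appears when scores lives on cuda:0 but input_ids
# stays on cpu (or vice versa): torch.gather refuses mixed devices.
```

So the fix is not in BERTopic itself: the tensors the pipeline feeds into generation and the logits the model returns must end up on the same device.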
My code is:
from ctransformers import AutoModelForCausalLM  # GGUF loading comes from ctransformers
from transformers import AutoTokenizer, pipeline

# Load the quantized GGUF model; hf=True wraps it for use with transformers
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/zephyr-7B-alpha-GGUF",
    model_file="zephyr-7b-alpha.Q4_K_M.gguf",
    model_type="mistral",
    gpu_layers=50,
    hf=True,
    # context_length=512,
    # max_new_tokens=512
)
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceH4/zephyr-7b-alpha")
prompt = """<|system|>You are a helpful, respectful and honest assistant for labeling topics.</s>
<|user|>
I have a topic that contains the following documents:
[DOCUMENTS]
The topic is described by the following keywords: '[KEYWORDS]'."""
generator = pipeline(
model=model, tokenizer=tokenizer,
task='text-generation',
max_new_tokens=50,
repetition_penalty=1.1,
device=0
)
from bertopic.representation import TextGeneration
zephyr = TextGeneration(generator, prompt=prompt, doc_length=10, tokenizer="char")
representation_model = {"Zephyr": zephyr}
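One likely cause, assuming `AutoModelForCausalLM` here comes from ctransformers (as in BERTopic's Zephyr example): ctransformers runs its `gpu_layers` internally and its `hf=True` wrapper returns logits as CPU tensors, while `device=0` tells the pipeline to move `input_ids` to `cuda:0`, producing exactly this mixed-device gather. A hedged sketch of the fix is simply to drop `device=0` and let ctransformers manage the GPU offload itself:

```python
# Sketch of the adjusted pipeline (model and tokenizer loaded as above).
# device=0 is removed so input_ids stay on the same device as the logits
# the ctransformers hf-wrapper produces; gpu_layers=50 already offloads
# the heavy computation to the GPU internally.
generator = pipeline(
    task="text-generation",
    model=model,
    tokenizer=tokenizer,
    max_new_tokens=50,
    repetition_penalty=1.1,
)
```

This is a configuration sketch, not a verified fix for every setup; if you use a plain transformers model instead, the opposite adjustment (moving the model to `cuda:0` to match `device=0`) would apply.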
Author: 黑洞官方问答小能手
Link: https://www.pythonheidong.com/blog/article/2045412/ead362a956b51089ed81/
Source: python黑洞网