update readme
parent 59b79af9d7
commit 7d505ea303
@@ -50,7 +50,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
The checkpoint from pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.
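For reference, a minimal sketch of how the pre-training-only checkpoint could be loaded from the Hub, assuming it works with the standard `transformers` Auto classes like the usage example earlier in the README; the prompt and generation settings below are arbitrary placeholders, and installations without the merged PR may additionally need `trust_remote_code=True`:

```python
# Minimal sketch: load the pre-training-only checkpoint from the Hugging Face Hub.
# Assumes the standard transformers Auto classes work for this repo; the exact model
# class may change once the Transformers PR (#22795) is merged.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "s-JoL/Open-Llama-V1-pretrain"
tokenizer = AutoTokenizer.from_pretrained(repo_id, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.float16, device_map="auto"
)

# Arbitrary placeholder prompt, just to exercise generation.
inputs = tokenizer("The quick brown fox", return_tensors="pt").to(model.device)
pred = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```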
- We have completed 300B token pre-training, training a total of 80K steps. The Global Batch Size is consistent with Llama at 4M tokens.
+ We have completed 330B token pre-training, training a total of 80K steps. The Global Batch Size is consistent with Llama at 4M tokens.
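As a rough sanity check on these figures, a minimal sketch (it assumes the 4M Global Batch Size is measured in tokens per optimizer step, as in Llama):

```python
# Rough consistency check: total pre-training tokens ≈ optimizer steps × global batch size.
# Assumes the 4M Global Batch Size means 4 million tokens per step.
steps = 80_000
tokens_per_step = 4_000_000

total_tokens = steps * tokens_per_step
print(f"{total_tokens / 1e9:.0f}B tokens")  # 320B, roughly consistent with the ~330B reported above
```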
The instruction-tuning data is composed of 7 parts in total; the resulting model shows some programming, mathematical, and multi-turn dialogue ability. Details of the data can be found in the Instruction-Tuning section.
Below is a demonstration of the model's multi-turn dialogue ability on a coding question:
@@ -51,7 +51,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
The checkpoint from pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.
- We have completed 300B token pre-training, training a total of 80K steps. The Global Batch Size is consistent with Llama at 4M tokens.
+ We have completed 330B token pre-training, training a total of 80K steps. The Global Batch Size is consistent with Llama at 4M tokens.
The instruction-tuning data is composed of 7 parts in total; the model shows some programming, mathematical, and multi-turn dialogue ability. See the Instruction-Tuning section for details on the data.
Below is a demonstration of the model's multi-turn dialogue ability on a coding question: