update readme

LiangSong 2023-05-09 17:03:13 +08:00
parent 59b79af9d7
commit 7d505ea303
2 changed files with 2 additions and 2 deletions


@@ -50,7 +50,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
The CheckPoint after pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.
-We have completed 300B-token pre-training, training a total of 80K steps; the Global Batch Size is 4M, consistent with Llama.
+We have completed 330B-token pre-training, training a total of 80K steps; the Global Batch Size is 4M, consistent with Llama.
The Instruction-tuning data is composed of 7 parts in total; the model has certain programming, mathematical, and multi-turn dialogue abilities. Specific data can be found in the Instruction-Tuning section.
Below is a demonstration of the model's multi-turn dialogue ability regarding code:
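
The hunk header above carries the context line `print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))`, which is the tail of the README's usage snippet. Below is a minimal sketch of how the pre-training-only checkpoint might be loaded and queried with `transformers`, assuming the standard Auto classes accept this repo id; the prompt text, generation settings, and `trust_remote_code=True` flag are illustrative assumptions rather than the project's official example (relevant while the Transformers [PR](https://github.com/huggingface/transformers/pull/22795) is still unmerged).

```python
# Minimal sketch, not the project's official example: load the pre-training-only
# checkpoint referenced in the diff and run a single generation. The prompt format,
# generation parameters, and trust_remote_code flag are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "s-JoL/Open-Llama-V1-pretrain"  # repo id taken from the README diff above

tokenizer = AutoTokenizer.from_pretrained(ckpt, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    ckpt,
    torch_dtype=torch.float16,
    trust_remote_code=True,  # assumed to be needed while the Transformers PR is unmerged
)
model = model.cuda().eval()

# Illustrative prompt; the actual prompt format used by the model may differ.
inputs = tokenizer("user:implement quick sort in python\nsystem:", return_tensors="pt")
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    pred = model.generate(**inputs, max_new_tokens=256, do_sample=True)

# Matches the context line shown in the hunk header above.
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```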


@@ -51,7 +51,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
The CheckPoint after pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.
-We have completed 300B-token pre-training, training a total of 80K steps; the Global Batch Size is 4M, consistent with Llama.
+We have completed 330B-token pre-training, training a total of 80K steps; the Global Batch Size is 4M, consistent with Llama.
The Instruction-tuning data is composed of 7 parts in total; the model has certain programming, mathematical, and multi-turn dialogue abilities. Specific data can be found in the Instruction-Tuning section.
Below is a demonstration of the multi-turn dialogue ability regarding code: