From 7d505ea3038b6e7ecf8159138138ad90c6a42edb Mon Sep 17 00:00:00 2001
From: LiangSong
Date: Tue, 9 May 2023 17:03:13 +0800
Subject: [PATCH] update readme

---
 README.md    | 2 +-
 README_zh.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index e67bc2b..bc2afce 100644
--- a/README.md
+++ b/README.md
@@ -50,7 +50,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
 The CheckPoint after pre-training only is also uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
 The model [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted for merging into the Transformers main branch.
 
-We have completed 300B token pre-training, training a total of 80 K steps. The Global Batch Size is consistent with Llama at 4M.
+We have completed 330B token pre-training, training a total of 80 K steps. The Global Batch Size is consistent with Llama at 4M.
 Using a total of 7 parts of data to constitute the Instruction-tuning data, the model has certain programming abilities, mathematical abilities, and multi-turn dialogue abilities. Specific data can be found in the Instruction-Tuning section.
 
 Below is a display of the model's multi-turn dialogue ability regarding code:
diff --git a/README_zh.md b/README_zh.md
index 4d27a36..89d2e46 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -51,7 +51,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
 只经过预训练的CheckPoint也上传至[s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain)。
 模型已提交[PR](https://github.com/huggingface/transformers/pull/22795)合并至Transformers main分支。
 
-我们完成了300B token的预训练,总共训练80 K step,Global Batch Size和Llama中一致为4M。
+我们完成了330B token的预训练,总共训练80 K step,Global Batch Size和Llama中一致为4M。
 使用总共7部分数据构成Instruction-tuning数据,模型具有一定的编程能力、数学能力和多轮对话能力,具体数据见Instruction-Tuning部分。
 
 如下是一个关于代码的多轮对话能力的展示