update readme add ckpt from hf

This commit is contained in:
LiangSong 2023-04-16 23:50:36 +08:00
parent b21441b14b
commit ad3d943a7d
2 changed files with 15 additions and 2 deletions

README.md

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-09 22:48:28
* @LastEditTime: 2023-04-16 23:49:06
* @FilePath: /Open-Llama/README.md
* @Description:
*
@@ -18,6 +18,12 @@ Open-Llama is an open source project that provides a complete set of training processes for building large language models
**Using the same evaluation method as the FastChat project, we compared Open-Llama against GPT-3.5; in our tests it reaches 84% of GPT-3.5's performance on Chinese questions. The detailed test results and checkpoint will be released soon.**
The Instruct-tuned checkpoint has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1). To use the checkpoint, first install the latest version of Transformers with the following command:
```bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
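A minimal usage sketch follows. Assumptions not confirmed by this README: that the forked Transformers exposes the checkpoint through the standard `AutoModelForCausalLM`/`AutoTokenizer` entry points, and that the `user:`/`system:` prompt format matches the instruction-tuning data.

```python
# Minimal loading sketch -- assumes the s-JoL/transformers fork registers the
# model with the standard Auto classes, and that the instruction-tuned model
# expects a "user:...\nsystem:" prompt. Both are assumptions, not confirmed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1")
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1")

# Encode a prompt and generate a short continuation.
inputs = tokenizer("user:Hello! How are you?\nsystem:", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```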
We have completed pre-training on 300B tokens, 80K steps in total, with a global batch size of 4M, the same as in Llama.
The instruction-tuning data is composed of 7 parts in total; the model shows certain programming, mathematical, and multi-turn dialogue abilities. See the Instruction-Tuning section for details on the data.

README_en.md

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-08 00:03:57
* @LastEditTime: 2023-04-16 23:49:28
* @FilePath: /Open-Llama/README_en.md
* @Description:
*
@@ -15,6 +15,13 @@ Translated by ChatGPT.
Open-Llama is an open source project that provides a complete set of training processes for building large-scale language models, from data preparation to tokenization, pre-training, instruction tuning, and reinforcement learning techniques such as RLHF.
## Progress
The checkpoint after Instruct-tuning has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1).
To use the checkpoint, first install the latest version of Transformers with the following command:
```bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
We completed pre-training on 300 billion tokens over a total of 80,000 steps, with a global batch size of 4 million, consistent with Llama. We constructed the instruction-tuning dataset from a total of 7 parts of data, giving the model certain programming, mathematical, and multi-turn dialogue abilities. For details on the data, please refer to the instruction-tuning section.
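As a quick consistency check on these numbers: 300 billion tokens over 80,000 steps works out to 3.0e11 / 8.0e4 = 3.75 million tokens per step, in line with the stated global batch size of roughly 4 million.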
[Demo](http://home.ustc.edu.cn/~sl9292/)