diff --git a/README.md b/README.md
index 4bc033e..bb0de69 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
  * @Author: LiangSong(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
  * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-04-09 22:48:28
+ * @LastEditTime: 2023-04-16 23:49:06
  * @FilePath: /Open-Llama/README.md
  * @Description:
  *
@@ -18,6 +18,11 @@ Open-Llama is an open source project that provides a complete set of training processes for building large language models
 
 **Using the same evaluation method as the FastChat project to compare Open-Llama against GPT-3.5, testing shows it can reach 84% of GPT-3.5's performance on Chinese questions; the specific test results and checkpoint will be released soon.**
 
+The instruction-tuned checkpoint has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1). To use the checkpoint, first install the latest version of Transformers with the following command:
+```bash
+pip install git+https://github.com/s-JoL/transformers.git@dev
+```
+
 We completed pre-training on 300B tokens, training 80K steps in total, with a Global Batch Size of 4M, the same as Llama.
 The Instruction-tuning data is composed of 7 parts in total; the model has certain programming, mathematical, and multi-turn dialogue abilities. See the Instruction-Tuning section for the specific data.
 
diff --git a/README_en.md b/README_en.md
index cc90dfb..626691f 100644
--- a/README_en.md
+++ b/README_en.md
@@ -2,7 +2,7 @@
  * @Author: LiangSong(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
  * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-04-08 00:03:57
+ * @LastEditTime: 2023-04-16 23:49:28
  * @FilePath: /Open-Llama/README_en.md
  * @Description:
  *
@@ -15,6 +15,13 @@ Translated by ChatGPT.
 Open-Llama is an open source project that provides a complete set of training processes for building large-scale language models, from data preparation to tokenization, pre-training, instruction tuning, and reinforcement learning techniques such as RLHF.
 
 ## Progress
+
+The instruction-tuned checkpoint has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1).
+To use the checkpoint, you first need to install the latest version of Transformers with the following command:
+```bash
+pip install git+https://github.com/s-JoL/transformers.git@dev
+```
+
 We completed pre-training on 300 billion tokens, training a total of 80,000 steps with a global batch size of 4 million, consistent with Llama.
 We constructed the instruction-tuning dataset from a total of 7 parts of data; the resulting model has certain programming, mathematical, and multi-turn dialogue abilities. For the specific data, please refer to the Instruction-Tuning section.
 [Demo](http://home.ustc.edu.cn/~sl9292/)
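
For reference, once the forked Transformers above is installed, loading the released checkpoint might look like the minimal sketch below. This is a hypothetical usage example: it assumes the `s-JoL/Open-Llama-V1` repository works with the standard `AutoTokenizer`/`AutoModelForCausalLM` classes via the fork; the prompt format and generation parameters are illustrative, not the project's documented interface.
```python
# Hypothetical usage sketch: assumes the forked Transformers
# (s-JoL/transformers@dev) registers the Open-Llama architecture
# with the Auto classes; the prompt format is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "s-JoL/Open-Llama-V1"  # checkpoint released on HuggingFace

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("user: Hello, what can you do?\nsystem: ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```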