update readme add ckpt from hf
This commit is contained in:
parent b21441b14b
commit ad3d943a7d
@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-09 22:48:28
* @LastEditTime: 2023-04-16 23:49:06
* @FilePath: /Open-Llama/README.md
* @Description:
*
@@ -18,6 +18,12 @@ Open-Llama is an open source project that provides a complete set of training processes for building large language models
**Using the same evaluation method as the FastChat project to compare Open-Llama against GPT-3.5, testing shows it reaches 84% of GPT-3.5's performance on Chinese questions; the detailed test results and CheckPoint will be released soon.**
The CheckPoint after Instruct-tuning has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1). To use the ckpt, first install the latest version of Transformers with the following command:
```bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
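A minimal loading sketch, under assumptions not stated in this README: the repo id comes from the HuggingFace link above, and the checkpoint is assumed to load through the standard `AutoTokenizer`/`AutoModelForCausalLM` classes once the forked Transformers build above is installed; the example prompt is illustrative only.

```python
# Minimal sketch: load the released checkpoint with the forked Transformers build.
# Assumption: the repo id from the link above works with the standard Auto classes.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1")
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1")

# Illustrative prompt; the actual instruction format may differ.
inputs = tokenizer("你好,请介绍一下你自己。", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```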
We completed pre-training on 300B tokens, training 80K steps in total with a Global Batch Size of 4M, consistent with Llama.
The Instruction-tuning data is built from 7 parts in total; the model shows a certain level of programming, mathematical, and multi-turn dialogue ability. See the Instruction-Tuning section for details on the data.
@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-08 00:03:57
* @LastEditTime: 2023-04-16 23:49:28
* @FilePath: /Open-Llama/README_en.md
* @Description:
*
@@ -15,6 +15,13 @@ Translated by ChatGPT.
Open-Llama is an open source project that provides a complete set of training processes for building large-scale language models, from data preparation to tokenization, pre-training, instruction tuning, and reinforcement learning techniques such as RLHF.
## Progress
The checkpoint after Instruct-tuning has been open-sourced on [HuggingFace](https://huggingface.co/s-JoL/Open-Llama-V1).
To use the checkpoint, first install the latest version of Transformers with the following command:
```bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
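As a rough usage sketch, the checkpoint could then be loaded and queried as below; the repo id is taken from the HuggingFace link above, while the use of the standard Auto classes, half-precision loading, a CUDA device, and the prompt are assumptions, not guarantees from this README.

```python
# Rough sketch: load the checkpoint in half precision on a GPU and run a short generation.
# Assumption: the repo id above works with the standard Auto classes; everything else is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1")
model = AutoModelForCausalLM.from_pretrained(
    "s-JoL/Open-Llama-V1", torch_dtype=torch.float16
).cuda()

inputs = tokenizer("Hello, please introduce yourself.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```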
We completed pre-training on 300 billion tokens, training a total of 80,000 steps with a global batch size of 4 million, consistent with Llama. We constructed the instruction-tuning dataset from a total of 7 parts of data, with which the model gains certain programming, mathematical, and multi-turn dialogue abilities. For the specific data, please refer to the instruction-tuning section.
[Demo](http://home.ustc.edu.cn/~sl9292/)