update readme

This commit is contained in:
LiangSong 2023-03-31 14:58:07 +08:00
parent 5dc1e77c66
commit d25b34c280
8 changed files with 81 additions and 7 deletions

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-03-29 21:50:37
* @LastEditTime: 2023-03-31 14:54:20
* @FilePath: /Open-Llama/README.md
* @Description:
*
@@ -16,10 +16,16 @@ Open-Llama is an open-source project that provides a complete set of training processes for building large language models
## Progress
After 30K steps of pre-training, the model has already shown a certain ability to generate (often nonsensical) text; the examples below show writing code, continuing a paper, and continuing Chinese text. Although correctness is still a problem, the outputs already read like real sentences.
Although pre-training is not yet complete, we have used the model pre-trained for 40K steps for instruction-tuning; the resulting model can follow simple commands.
<img src="assets/code.JPG" width="33%"><img src="assets/paper.JPG" width="33%"><img src="assets/chinese.JPG" width="33%">
[Demo](https://ffdd75ef89db6f1c97.gradio.live/)
Referring to some published tests of Wenxin Yiyan (ERNIE Bot), we also gave our model a quick test. Original report: [Baidu "Wenxin Yiyan" test: what is the level of domestic generative AI?](https://www.8btc.com/article/6809666)
Our model's results are shown in the figures below; more results are still to be tested. Due to network issues in mainland China, requests to the Demo above may be dropped; if there is no response for a long time, refresh and try again.
![image1](assets/image1.png)![image2](assets/image2.png)![image3](assets/image3.png)
A rough estimate of the cost of reaching the results above: the 40K-step pre-training used 150 million pre-training samples, roughly 110B tokens, and took 76 hours in total, which comes to about $19,152 at Google Cloud's A100 pricing. The subsequent instruction-tuning ran for 12k steps on 1.6M samples and took 3.4 hours, about $342. Training such a model from scratch therefore costs less than $20,000 in total.
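A few per-step numbers implied by this estimate, derived purely from the totals above rather than from the training configuration:
```python
# Derived only from the totals quoted above; not taken from the training config.
steps, samples, tokens = 40_000, 150e6, 110e9
hours, cost_usd = 76, 19152

print(f"{samples / steps:,.0f} samples per step")                # 3,750
print(f"{tokens / steps / 1e6:.2f}M tokens per step")            # 2.75M
print(f"{tokens / samples:.0f} tokens per sample on average")    # ~733
print(f"${cost_usd / hours:.0f} implied cluster cost per hour")  # ~$252
```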
## **Features**
### Ease of Use
@@ -152,10 +158,40 @@ Total mult-adds (G): 6.89
```
Current progress
![](assets/loss.png)
![](assets/pretrain_loss.png)
### Instruction-Tuning
We use three currently available open-source datasets for instruction-tuning; more tasks, as well as datasets we construct ourselves, will be added later.
- [yizhongw/self_instruct](https://huggingface.co/datasets/yizhongw/self_instruct)
- [BelleGroup/generated_train_0.5M_CN](https://huggingface.co/datasets/BelleGroup/generated_train_0.5M_CN)
- [BelleGroup/generated_train_1M_CN](https://huggingface.co/datasets/BelleGroup/generated_train_1M_CN)
We applied some preprocessing to the raw data; the format is as follows:
```
user: {prompt}<s>system: {completion}</s>
```
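As a rough illustration (not the repository's actual preprocessing code; `build_example` and the sample data are made up), the template could be applied like this:
```python
# Illustrative sketch only; the project's real preprocessing may differ in detail.
def build_example(prompt: str, completion: str) -> str:
    # "<s>" separates the user prompt from the system reply,
    # and "</s>" marks the end of the reply.
    return f"user: {prompt}<s>system: {completion}</s>"


print(build_example("Translate 'hello' into French.", "Bonjour."))
# user: Translate 'hello' into French.<s>system: Bonjour.</s>
```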
The training code is essentially the same as for pre-training and can be found in
```
instruction_tuning.py
```
The launch command is also essentially the same:
```bash
accelerate launch --config_file configs/default_config.yaml instruction_tuning.py
```
In some cases the following parameters may need to be specified:
```
--main_process_ip
--main_process_port
--num_processes
--num_machines
--machine_rank
```
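For example, a hypothetical two-machine run could be launched like this (the IP address and process counts are placeholders, not values taken from the project):
```bash
# Hypothetical two-node example; adjust the address and counts to your own cluster.
# On machine 0 (the main process):
accelerate launch --config_file configs/default_config.yaml \
    --main_process_ip 192.168.0.1 --main_process_port 29500 \
    --num_machines 2 --num_processes 16 --machine_rank 0 \
    instruction_tuning.py
# On machine 1, run the same command with --machine_rank 1.
```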
The loss during training is shown below; it mostly fluctuates and does not decrease much:
![loss](assets/instruct_loss.png)
### RLHF
## Performance Comparison

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-03-29 21:51:28
* @LastEditTime: 2023-03-31 14:57:55
* @FilePath: /Open-Llama/README_en.md
* @Description:
*
@@ -15,9 +15,17 @@ Translated by ChatGPT.
Open-Llama is an open source project that provides a complete set of training processes for building large-scale language models, from data preparation to tokenization, pre-training, instruction tuning, and reinforcement learning techniques such as RLHF.
## Progress
After 30K steps of pre-training, the model has demonstrated some language capabilities, as shown below in coding and paper continuation tasks. Although there are still issues with correctness, the generated outputs resemble sentences.
Although the complete pre-training is not finished yet, we used the 40K-step pre-trained model for instruction tuning; the resulting model can follow simple commands.
<img src="assets/code.JPG" width="50%"><img src="assets/paper.JPG" width="50%">
[Demo](https://ffdd75ef89db6f1c97.gradio.live/)
We also gave our model a quick test, referring to some published tests of Wenxin Yiyan (ERNIE Bot). The original report can be found at [Baidu "Wenxin Yiyan" test: what is the level of domestic generative AI?](https://www.8btc.com/article/6809666)
The results of our model are shown in the figures below, and more results are yet to be tested. Due to network issues in mainland China, requests to the Demo above may be dropped; if there is no response for a long time, please refresh and try again.
![image1](assets/image1.png)![image2](assets/image2.png)![image3](assets/image3.png)
We roughly estimate the cost of achieving the above results. The 40K-step pre-training used 150 million pre-training samples, which is about 110B tokens; the total training time was 76 hours, costing about $19,152 at Google Cloud's A100 pricing. The instruction-tuning ran for 12k steps on 1.6 million samples and took 3.4 hours, costing about $342. Therefore, the total cost of training such a model from scratch is less than $20,000.
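A quick sanity check on these totals, using only the numbers quoted above (the hourly rate is implied by them, not an official Google Cloud price):
```python
# Back-of-the-envelope check using only the figures quoted above.
pretrain_cost, pretrain_hours, pretrain_tokens = 19152, 76, 110e9
tuning_cost = 342

hourly_rate = pretrain_cost / pretrain_hours            # ~$252/h implied for the cluster
throughput = pretrain_tokens / (pretrain_hours * 3600)  # ~402k tokens/s
total_cost = pretrain_cost + tuning_cost                # 19,494 < 20,000

print(f"${hourly_rate:.0f}/h, {throughput:,.0f} tokens/s, total ${total_cost:,}")
```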
## **Features**
### Ease of Use
@@ -143,6 +151,35 @@ Current Progress
![](assets/loss.png)
### Instruction-Tuning
We performed instruction-tuning on three currently available open-source datasets, and we plan to add more tasks and our own constructed datasets in the future.
- [yizhongw/self_instruct](https://huggingface.co/datasets/yizhongw/self_instruct)
- [BelleGroup/generated_train_0.5M_CN](https://huggingface.co/datasets/BelleGroup/generated_train_0.5M_CN)
- [BelleGroup/generated_train_1M_CN](https://huggingface.co/datasets/BelleGroup/generated_train_1M_CN)
We did some preprocessing on the raw data; the format is as follows:
```
user: {prompt}<s>system: {completion}</s>
```
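As a small illustration (the helper below is not part of the repository), a formatted example can be split back into its prompt and completion:
```python
# Illustrative helper only; not part of the project's code base.
def split_example(text: str):
    # The template is "user: {prompt}<s>system: {completion}</s>"
    user_part, system_part = text.split("<s>", 1)
    prompt = user_part[len("user: "):]
    completion = system_part[len("system: "):].removesuffix("</s>")
    return prompt, completion


print(split_example("user: 1+1=?<s>system: 2</s>"))  # ('1+1=?', '2')
```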
The training code is essentially the same as for pre-training and can be found in
```
instruction_tuning.py
```
The launch command is also similar to pre-training:
```bash
accelerate launch --config_file configs/default_config.yaml instruction_tuning.py
```
In some cases, the following parameters may need to be specified:
```
--main_process_ip
--main_process_port
--num_processes
--num_machines
--machine_rank
```
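A hypothetical multi-node invocation might look like this (the address and counts below are placeholders only):
```bash
# Placeholder two-machine setup; substitute your own address and process counts.
accelerate launch --config_file configs/default_config.yaml \
    --main_process_ip 10.0.0.1 --main_process_port 29500 \
    --num_machines 2 --num_processes 16 --machine_rank 0 \
    instruction_tuning.py
# Repeat on the second machine with --machine_rank 1.
```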
The loss during training is shown below; it mostly fluctuates and does not decrease much:
![loss](assets/instruct_loss.png)
### RLHF
## Performance Comparison

BIN  assets/image1.png (new binary file, 221 KiB)
BIN  assets/image2.png (new binary file, 195 KiB)
BIN  assets/image3.png (new binary file, 325 KiB)
BIN  assets/instruct_loss.png (new binary file, 66 KiB)
BIN  (binary image file changed; 95 KiB before and after)

@@ -16,4 +16,5 @@ sentencepiece
triton
functorch==1.13.1
xformers
gradio
git+https://github.com/Bayes-Song/transformers.git