diff --git a/README.md b/README.md
index 4e77cef..d9f3e81 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,12 @@
 [**中文**](./README_zh.md) | [**English**](./README.md)
@@ -124,7 +124,7 @@ For a 7B model, the training speed with the native PyTorch Llama model in Transf
 If pre-training with 500B tokens, 38300 GPU hours are required. According to the hourly price for 8 A100-80G Spot GPUs on Google Cloud, which is 12.6 US dollars, the total cost is 60,300 US dollars.
 When using the unaccelerated version for training, the cost is 158,744 US dollars. The final training cost is reduced by 98,000 US dollars.
-For more testing, see [performance comparison with other open-source models](https://github.com/Bayes-Song/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
+For more testing, see [performance comparison with other open-source models](https://github.com/s-JoL/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
 ### Versatility
@@ -392,7 +392,7 @@ The following table summarizes the performance of currently available open-sourc
 ```
 @misc{openllama,
   title={Open-Llama},
-  author={Liang Song},
+  author={s-JoL},
   year={2023},
   howpublished={\url{https://github.com/s-JoL/Open-Llama}},
 }
diff --git a/README_zh.md b/README_zh.md
index 3ae0b15..02d7723 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -1,12 +1,12 @@
 [**中文**](./README_zh.md) | [**English**](./README.md)
@@ -124,7 +124,7 @@ v1版代码可见https://github.com/s-JoL/Open-Llama/tree/v1.0
 如果使用500B token进行预训练,需要训练38300 GPU时。按照Google Cloud上A100-80G Spot的价格计算,8卡每小时价格为12.6美元,则总价格为60300美元。
 当使用未加速版本训练时,价格为158744美元。最终降低训练成本9.8万美元。
-更多测试可见[和其他开源模型性能对比](https://github.com/Bayes-Song/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94)。
+更多测试可见[和其他开源模型性能对比](https://github.com/s-JoL/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94)。
 ### 通用性
 在训练语言模型时,我们希望能够构建一个通用的模型,可以适用于不同的语言和不同的领域。为了实现这一点,我们采用了以下策略:
@@ -367,7 +367,7 @@ accelerate launch --config_file configs/accelerate_configs/ds_stage1.yaml train_
 ```
 @misc{openllama,
   title={Open-Llama},
-  author={Liang Song},
+  author={s-JoL},
   year={2023},
   howpublished={\url{https://github.com/s-JoL/Open-Llama}},
 }
diff --git a/chat_server.py b/chat_server.py
index 68c7259..e53bed4 100644
--- a/chat_server.py
+++ b/chat_server.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-06 22:30:10
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-12 15:07:36
 FilePath: /Open-Llama/chat_server.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import torch
 import logging
@@ -15,14 +15,16 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V2", use_fast=False)
-model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V2", torch_dtype=torch.bfloat16, device_map="auto")
+model = AutoModelForCausalLM.from_pretrained(
+    "s-JoL/Open-Llama-V2", torch_dtype=torch.bfloat16, device_map="auto"
+)
 logging.warning("ready")
 with gr.Blocks() as demo:
     gr.Markdown(
         """
-    # [Open-Llama](https://github.com/Bayes-Song/Open-Llama)
+    # [Open-Llama](https://github.com/s-JoL/Open-Llama)
     完全使用Open-Llama项目从0开始训练的Instruct-GPT模型,当长时间无响应(如20s以上)可刷新重试。
     Instruct-GPT model is trained from scratch using the Open-Llama project without relying on any other pre-trained models. If there is no response for a long time (such as more than 20 seconds), please refresh and try again.
diff --git a/data/download_instruct.sh b/data/download_instruct.sh
index e781158..d3707ce 100644
--- a/data/download_instruct.sh
+++ b/data/download_instruct.sh
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
- # @Author: LiangSong(sl12160010@gmail.com)
+ # @Author: s-JoL(sl12160010@gmail.com)
  # @Date: 2023-04-05 23:18:10
- # @LastEditors: LiangSong(sl12160010@gmail.com)
+ # @LastEditors: s-JoL(sl12160010@gmail.com)
  # @LastEditTime: 2023-05-04 08:24:17
  # @FilePath: /Open-Llama/data/download_instruct.sh
  # @Description:
  #
- # Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ # Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 mkdir data/instruction_data
 wget -c --tries 3 'https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/HTML_cleaned_raw_dataset/sg_90k_part1_html_cleaned.json' -O data/sg_90k_part1_html_cleaned.json
diff --git a/data/download_the_pile.sh b/data/download_the_pile.sh
index a3fd260..ebf4237 100644
--- a/data/download_the_pile.sh
+++ b/data/download_the_pile.sh
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
- # @Author: LiangSong(sl12160010@gmail.com)
+ # @Author: s-JoL(sl12160010@gmail.com)
  # @Date: 2023-03-16 21:21:38
- # @LastEditors: LiangSong(sl12160010@gmail.com)
+ # @LastEditors: s-JoL(sl12160010@gmail.com)
  # @LastEditTime: 2023-03-26 22:58:02
  # @FilePath: /Open-Llama/data/download_the_pile.sh
  # @Description:
  # download the pile dataset and preprocess
- # Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ # Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 start=0
 end=29
diff --git a/data/download_wudao.sh b/data/download_wudao.sh
index f302a05..1850159 100644
--- a/data/download_wudao.sh
+++ b/data/download_wudao.sh
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
- # @Author: LiangSong(sl12160010@gmail.com)
+ # @Author: s-JoL(sl12160010@gmail.com)
  # @Date: 2023-03-16 21:21:56
- # @LastEditors: LiangSong(sl12160010@gmail.com)
+ # @LastEditors: s-JoL(sl12160010@gmail.com)
  # @LastEditTime: 2023-03-26 22:58:11
  # @FilePath: /Open-Llama/data/download_wudao.sh
  # @Description:
  # download wudao dataset and preprocess
- # Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ # Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 apt install unrar
diff --git a/data/preprocess_instruction.py b/data/preprocess_instruction.py
index 37b28a4..4f45ac6 100644
--- a/data/preprocess_instruction.py
+++ b/data/preprocess_instruction.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-30 20:52:10
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-04 08:32:04
 FilePath: /Open-Llama/data/preprocess_instruction.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from tqdm import tqdm
diff --git a/data/preprocess_the_pile.py b/data/preprocess_the_pile.py
index ede8a13..1fdaf0f 100644
--- a/data/preprocess_the_pile.py
+++ b/data/preprocess_the_pile.py
@@ -1,13 +1,13 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-16 22:35:38
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-26 22:59:38
 FilePath: /Open-Llama/data/preprocess_the_pile.py
 Description:
 Parse the dataset from the raw files and split them into different jsonl files based on the preset maximum number of lines, making it easy for parallel training to perform streaming reads.
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from glob import glob
diff --git a/data/preprocess_wudao.py b/data/preprocess_wudao.py
index 767d2f3..c0866b9 100644
--- a/data/preprocess_wudao.py
+++ b/data/preprocess_wudao.py
@@ -1,13 +1,13 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-16 22:10:44
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-26 22:59:55
 FilePath: /Open-Llama/data/preprocess_wudao.py
 Description:
 Parse the dataset from the raw files and split them into different jsonl files based on the preset maximum number of lines, making it easy for parallel training to perform streaming reads.
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from glob import glob
diff --git a/dataset/dataset.py b/dataset/dataset.py
index b1071cc..475f275 100644
--- a/dataset/dataset.py
+++ b/dataset/dataset.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-24 20:05:21
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:30:37
 FilePath: /Open-Llama/dataset/dataset.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import math
 import torch
diff --git a/dataset/validation.py b/dataset/validation.py
index 8f50df6..de1980f 100644
--- a/dataset/validation.py
+++ b/dataset/validation.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-18 00:06:41
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-27 01:09:20
 FilePath: /Open-Llama/dataset/validation.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 val_set = [
     "白日依山尽,",
diff --git a/solver/trainer.py b/solver/trainer.py
index bd57cb9..0feee16 100644
--- a/solver/trainer.py
+++ b/solver/trainer.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-24 20:05:21
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-08 22:51:42
 FilePath: /Open-Llama/solver/trainer.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import wandb
diff --git a/train_lm.py b/train_lm.py
index 11402e8..0b9b801 100644
--- a/train_lm.py
+++ b/train_lm.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-12 19:12:42
-LastEditors: LiangSong(sl12160010@gmail.com)
-LastEditTime: 2023-05-08 23:39:35
+LastEditors: s-JoL(sl12160010@gmail.com)
+LastEditTime: 2023-05-17 22:20:32
 FilePath: /Open-Llama/train_lm.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import yaml
 import math
diff --git a/utils/convert_ckpt.py b/utils/convert_ckpt.py
index 8a8f972..8ceaca8 100644
--- a/utils/convert_ckpt.py
+++ b/utils/convert_ckpt.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-28 19:55:13
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:30:29
 FilePath: /Open-Llama/utils/convert_ckpt.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import torch
 import sentencepiece as spm
diff --git a/utils/speed_test/accelerate/run.py b/utils/speed_test/accelerate/run.py
index a886ef1..88889be 100644
--- a/utils/speed_test/accelerate/run.py
+++ b/utils/speed_test/accelerate/run.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-08 22:44:44
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-08 23:15:57
 FilePath: /Open-Llama/speed_test/accelerate/run.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import torch
diff --git a/utils/speed_test/accelerate/run.sh b/utils/speed_test/accelerate/run.sh
index 2e60847..0f38877 100644
--- a/utils/speed_test/accelerate/run.sh
+++ b/utils/speed_test/accelerate/run.sh
@@ -1,12 +1,12 @@
 ###
- # @Author: LiangSong(sl12160010@gmail.com)
+ # @Author: s-JoL(sl12160010@gmail.com)
  # @Date: 2023-04-08 22:44:27
- # @LastEditors: LiangSong(sl12160010@gmail.com)
+ # @LastEditors: s-JoL(sl12160010@gmail.com)
  # @LastEditTime: 2023-04-11 21:58:43
  # @FilePath: /Open-Llama/speed_test/accelerate/run.sh
  # @Description:
  #
- # Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ # Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 total_gpu=8
 accelerate launch --config_file deepspeed_stage2.yaml --main_process_ip 127.0.0.1 --main_process_port 23335 --num_processes $total_gpu run.py
\ No newline at end of file
diff --git a/utils/speed_test/colossal-ai/run.py b/utils/speed_test/colossal-ai/run.py
index 8461802..917d3a1 100644
--- a/utils/speed_test/colossal-ai/run.py
+++ b/utils/speed_test/colossal-ai/run.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-11 20:07:35
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-11 21:56:23
 FilePath: /Open-Llama/speed_test/colossal-ai/run.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import os
 from functools import partial
diff --git a/utils/speed_test/lightning/run.py b/utils/speed_test/lightning/run.py
index 91469f5..c41c1bf 100644
--- a/utils/speed_test/lightning/run.py
+++ b/utils/speed_test/lightning/run.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-11 20:07:35
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-11 21:56:07
 FilePath: /Open-Llama/speed_test/lightning/run.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import torch
diff --git a/utils/train_tokenizer.py b/utils/train_tokenizer.py
index 6dd1991..b83420c 100644
--- a/utils/train_tokenizer.py
+++ b/utils/train_tokenizer.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-24 20:49:03
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:34:14
 FilePath: /Open-Llama/utils/train_tokenizer.py
 Description:
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import random
 from glob import glob