update header
parent d269affb42
commit 95973b5de1
12
README.md
@@ -1,12 +1,12 @@
 <!--
- * @Author: LiangSong(sl12160010@gmail.com)
+ * @Author: s-JoL(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
- * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-17 21:16:42
+ * @LastEditors: s-JoL(sl12160010@gmail.com)
+ * @LastEditTime: 2023-05-17 22:21:07
  * @FilePath: /Open-Llama/README.md
  * @Description:
  *
- * Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ * Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 -->
 [**中文**](./README_zh.md) | [**English**](./README.md)
 
@@ -124,7 +124,7 @@ For a 7B model, the training speed with the native PyTorch Llama model in Transf
 If pre-training with 500B tokens, 38300 GPU hours are required. According to the hourly price for 8 A100-80G Spot GPUs on Google Cloud, which is 12.6 US dollars, the total cost is 60,300 US dollars.
 When using the unaccelerated version for training, the cost is 158,744 US dollars. The final training cost is reduced by 98,000 US dollars.
 
-For more testing, see [performance comparison with other open-source models](https://github.com/Bayes-Song/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
+For more testing, see [performance comparison with other open-source models](https://github.com/s-JoL/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
 
 ### Versatility
 
@@ -392,7 +392,7 @@ The following table summarizes the performance of currently available open-sourc
 ```
 @misc{openllama,
 title={Open-Llama},
-author={Liang Song},
+author={s-JoL},
 year={2023},
 howpublished={\url{https://github.com/s-JoL/Open-Llama}},
 }
|
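The cost figures quoted in the README.md hunk at line 124 can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming the quoted 12.6 US dollars is the hourly price of a whole 8-GPU node, so the 38300 GPU hours are divided by 8 before pricing. The function and variable names are illustrative and do not come from the repository.

```python
def training_cost(gpu_hours: float, price_per_node_hour: float, gpus_per_node: int = 8) -> float:
    """Return the cost in US dollars of renting whole nodes for the given GPU hours."""
    node_hours = gpu_hours / gpus_per_node  # one node hour covers gpus_per_node GPU hours
    return node_hours * price_per_node_hour


# Figures quoted in the README text.
gpu_hours = 38_300            # GPU hours for 500B tokens with the accelerated version
price_per_node_hour = 12.6    # US dollars per hour for 8 A100-80G Spot GPUs
unaccelerated_cost = 158_744  # US dollars, as stated for the unaccelerated version

accelerated_cost = training_cost(gpu_hours, price_per_node_hour)
savings = unaccelerated_cost - accelerated_cost

print(f"accelerated cost: {accelerated_cost:,.1f} USD")
print(f"savings:          {savings:,.1f} USD")
```

Under that assumption the accelerated cost comes to roughly 60,300 US dollars and the savings to roughly 98,000 US dollars, which matches the rounded numbers in the README.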
12
README_zh.md
@@ -1,12 +1,12 @@
 <!--
- * @Author: LiangSong(sl12160010@gmail.com)
+ * @Author: s-JoL(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
- * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-17 21:17:41
+ * @LastEditors: s-JoL(sl12160010@gmail.com)
+ * @LastEditTime: 2023-05-17 22:20:48
  * @FilePath: /Open-Llama/README_zh.md
  * @Description:
  *
- * Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+ * Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 -->
 [**中文**](./README_zh.md) | [**English**](./README.md)
 
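@@ -124,7 +124,7 @@ The v1 code is available at https://github.com/s-JoL/Open-Llama/tree/v1.0
 If pre-training with 500B tokens, 38300 GPU hours are required. Based on the Google Cloud price for A100-80G Spot GPUs, which is 12.6 US dollars per hour for 8 GPUs, the total cost is 60,300 US dollars.
 When training with the unaccelerated version, the cost is 158,744 US dollars. The final training cost is reduced by 98,000 US dollars.
 
-For more testing, see [performance comparison with other open-source models](https://github.com/Bayes-Song/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
+For more testing, see [performance comparison with other open-source models](https://github.com/s-JoL/Open-Llama#%E5%92%8C%E5%85%B6%E4%BB%96%E5%BC%80%E6%BA%90%E6%A8%A1%E5%9E%8B%E6%80%A7%E8%83%BD%E5%AF%B9%E6%AF%94).
 ### Versatility
 
 When training language models, we want to build a general-purpose model that works across different languages and domains. To achieve this, we adopted the following strategies: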
@@ -367,7 +367,7 @@ accelerate launch --config_file configs/accelerate_configs/ds_stage1.yaml train_
 ```
 @misc{openllama,
 title={Open-Llama},
-author={Liang Song},
+author={s-JoL},
 year={2023},
 howpublished={\url{https://github.com/s-JoL/Open-Llama}},
 }
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-06 22:30:10
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-12 15:07:36
 FilePath: /Open-Llama/chat_server.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import torch
 import logging
@@ -15,14 +15,16 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 
 
 tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V2", use_fast=False)
-model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V2", torch_dtype=torch.bfloat16, device_map="auto")
+model = AutoModelForCausalLM.from_pretrained(
+    "s-JoL/Open-Llama-V2", torch_dtype=torch.bfloat16, device_map="auto"
+)
 logging.warning("ready")
 
 
 with gr.Blocks() as demo:
     gr.Markdown(
         """
-# [Open-Llama](https://github.com/Bayes-Song/Open-Llama)
+# [Open-Llama](https://github.com/s-JoL/Open-Llama)
 完全使用Open-Llama项目从0开始训练的Instruct-GPT模型,当长时间无响应(如20s以上)可刷新重试。
 
 Instruct-GPT model is trained from scratch using the Open-Llama project without relying on any other pre-trained models. If there is no response for a long time (such as more than 20 seconds), please refresh and try again.
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
-# @Author: LiangSong(sl12160010@gmail.com)
+# @Author: s-JoL(sl12160010@gmail.com)
 # @Date: 2023-04-05 23:18:10
-# @LastEditors: LiangSong(sl12160010@gmail.com)
+# @LastEditors: s-JoL(sl12160010@gmail.com)
 # @LastEditTime: 2023-05-04 08:24:17
 # @FilePath: /Open-Llama/data/download_instruct.sh
 # @Description:
 #
-# Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+# Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 mkdir data/instruction_data
 wget -c --tries 3 'https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/HTML_cleaned_raw_dataset/sg_90k_part1_html_cleaned.json' -O data/sg_90k_part1_html_cleaned.json
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
-# @Author: LiangSong(sl12160010@gmail.com)
+# @Author: s-JoL(sl12160010@gmail.com)
 # @Date: 2023-03-16 21:21:38
-# @LastEditors: LiangSong(sl12160010@gmail.com)
+# @LastEditors: s-JoL(sl12160010@gmail.com)
 # @LastEditTime: 2023-03-26 22:58:02
 # @FilePath: /Open-Llama/data/download_the_pile.sh
 # @Description:
 # download the pile dataset and preprocess
-# Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+# Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 start=0
 end=29
@@ -1,13 +1,13 @@
 #!/bin/bash
 ###
-# @Author: LiangSong(sl12160010@gmail.com)
+# @Author: s-JoL(sl12160010@gmail.com)
 # @Date: 2023-03-16 21:21:56
-# @LastEditors: LiangSong(sl12160010@gmail.com)
+# @LastEditors: s-JoL(sl12160010@gmail.com)
 # @LastEditTime: 2023-03-26 22:58:11
 # @FilePath: /Open-Llama/data/download_wudao.sh
 # @Description:
 # download wudao dataset and preprocess
-# Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+# Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 apt install unrar
 
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-30 20:52:10
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-04 08:32:04
 FilePath: /Open-Llama/data/preprocess_instruction.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from tqdm import tqdm
@@ -1,13 +1,13 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-16 22:35:38
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-26 22:59:38
 FilePath: /Open-Llama/data/preprocess_the_pile.py
 Description:
 Parse the dataset from the raw files and split them into different jsonl files based on the preset maximum number of lines,
 making it easy for parallel training to perform streaming reads.
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from glob import glob
@@ -1,13 +1,13 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-16 22:10:44
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-26 22:59:55
 FilePath: /Open-Llama/data/preprocess_wudao.py
 Description:
 Parse the dataset from the raw files and split them into different jsonl files based on the preset maximum number of lines,
 making it easy for parallel training to perform streaming reads.
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import json
 from glob import glob
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-24 20:05:21
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:30:37
 FilePath: /Open-Llama/dataset/dataset.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import math
 import torch
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-18 00:06:41
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-03-27 01:09:20
 FilePath: /Open-Llama/dataset/validation.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 val_set = [
     "白日依山尽,",
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-24 20:05:21
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-08 22:51:42
 FilePath: /Open-Llama/solver/trainer.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import wandb
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-12 19:12:42
-LastEditors: LiangSong(sl12160010@gmail.com)
-LastEditTime: 2023-05-08 23:39:35
+LastEditors: s-JoL(sl12160010@gmail.com)
+LastEditTime: 2023-05-17 22:20:32
 FilePath: /Open-Llama/train_lm.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import yaml
 import math
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-28 19:55:13
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:30:29
 FilePath: /Open-Llama/utils/convert_ckpt.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import torch
 import sentencepiece as spm
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-08 22:44:44
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-08 23:15:57
 FilePath: /Open-Llama/speed_test/accelerate/run.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import torch
@@ -1,12 +1,12 @@
 ###
-# @Author: LiangSong(sl12160010@gmail.com)
+# @Author: s-JoL(sl12160010@gmail.com)
 # @Date: 2023-04-08 22:44:27
-# @LastEditors: LiangSong(sl12160010@gmail.com)
+# @LastEditors: s-JoL(sl12160010@gmail.com)
 # @LastEditTime: 2023-04-11 21:58:43
 # @FilePath: /Open-Llama/speed_test/accelerate/run.sh
 # @Description:
 #
-# Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+# Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 ###
 total_gpu=8
 accelerate launch --config_file deepspeed_stage2.yaml --main_process_ip 127.0.0.1 --main_process_port 23335 --num_processes $total_gpu run.py
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-11 20:07:35
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-11 21:56:23
 FilePath: /Open-Llama/speed_test/colossal-ai/run.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import os
 from functools import partial
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-04-11 20:07:35
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-04-11 21:56:07
 FilePath: /Open-Llama/speed_test/lightning/run.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import time
 import torch
@@ -1,12 +1,12 @@
 """
-Author: LiangSong(sl12160010@gmail.com)
+Author: s-JoL(sl12160010@gmail.com)
 Date: 2023-03-24 20:49:03
-LastEditors: LiangSong(sl12160010@gmail.com)
+LastEditors: s-JoL(sl12160010@gmail.com)
 LastEditTime: 2023-05-06 23:34:14
 FilePath: /Open-Llama/utils/train_tokenizer.py
 Description:
 
-Copyright (c) 2023 by LiangSong(sl12160010@gmail.com), All Rights Reserved.
+Copyright (c) 2023 by s-JoL(sl12160010@gmail.com), All Rights Reserved.
 """
 import random
 from glob import glob