Commit Graph

107 Commits

Author SHA1 Message Date
LiangSong
16811d0efe update readme 2023-05-08 22:29:24 +08:00
LiangSong
92caa94490 support peft 2023-05-08 22:26:39 +08:00
LiangSong
7da40f1c83 fix typo 2023-05-08 19:00:06 +08:00
LiangSong
2df3e622e9 update readme 2023-05-08 18:59:01 +08:00
LiangSong
ec2b4d6ee7 fix split by shard bug 2023-05-08 14:03:05 +08:00
LiangSong
4a1e7bb44b Optimized the structure of configs, added support for deepspeed stage3, reduced memory usage by using Auto class to load models, and added support for training 65B models. 2023-05-06 23:37:17 +08:00
LiangSong
5b1f6a4861 fix epoch bug 2023-05-06 09:45:37 +08:00
LiangSong
f893a0f5b8 update dataset 2023-05-05 19:23:16 +08:00
LiangSong
758af69c73 update science instruct-tuning datasets 2023-05-05 19:00:37 +08:00
LiangSong
d24b4cce54 update preprocess format 2023-05-05 18:20:59 +08:00
LiangSong
85caa97a6a add xP3 dataset and belle_2M 2023-05-05 17:05:41 +08:00
LiangSong
00cbdbbf26 fix typo 2023-05-04 22:55:40 +08:00
LiangSong
693e3970d9 update readme 2023-05-04 22:54:10 +08:00
LiangSong
fbb7997607 fix typo 2023-05-04 22:32:15 +08:00
LiangSong
98ffab3a97 update readme and add half to server 2023-05-04 22:28:36 +08:00
LiangSong
5c876121cb update gradio, fix code format bug 2023-05-04 18:18:52 +08:00
LiangSong
a1acc90988 fix train_tokenizer bug 2023-05-04 16:00:56 +08:00
LiangSong
51686b5fb8 add split dataset by shard option to accelerate data loading 2023-05-04 09:20:23 +08:00
LiangSong
f0d41f937b update instruct_config and set all random seed to 42 2023-05-04 08:45:21 +08:00
LiangSong
dba2e2d680 update ShareGPT_90K preprocess 2023-05-04 08:34:38 +08:00
LiangSong
154456c976 set dataset shuffle seed to 42 2023-05-04 00:31:12 +08:00
LiangSong
c2184c6dd1 support multiple epochs 2023-05-03 00:02:01 +08:00
LiangSong
f05e929aad update config 2023-05-02 21:42:55 +08:00
LiangSong
0466673f76 support load model from accelerate ckpt 2023-04-29 20:40:42 +08:00
LiangSong
52cd09f664 update readme 2023-04-29 20:30:24 +08:00
LiangSong
fc21a75d1e add continue training 2023-04-29 20:28:39 +08:00
LiangSong
28b11a5bed update requirements 2023-04-29 13:39:03 +08:00
LiangSong
8b439dec4a update flops 2023-04-29 12:31:11 +08:00
LiangSong
a2816bd23d update readme 2023-04-29 12:06:55 +08:00
LiangSong
4c5e50e4aa update readme 2023-04-29 11:41:28 +08:00
LiangSong
c8037746c3 update readme 2023-04-28 22:45:45 +08:00
s-JoL
0ff8b2353f Merge pull request #30 from s-JoL/dev (update readme) 2023-04-28 19:54:52 +08:00
LiangSong
724265b435 update readme 2023-04-28 19:54:14 +08:00
s-JoL
0fd7dbd636 Merge pull request #29 from s-JoL/dev (update readme) 2023-04-28 19:50:29 +08:00
LiangSong
8c85535db3 update readme 2023-04-28 19:49:51 +08:00
LiangSong
676dcfd995 add hardware configuration to readme 2023-04-28 17:29:11 +08:00
s-JoL
f3c664bde3 Merge pull request #25 from s-JoL/dev (v2 release) 2023-04-28 15:11:02 +08:00
LiangSong
c890bce69c update readme 2023-04-28 15:10:41 +08:00
LiangSong
9baebfd49c Merge branch 'main' into dev 2023-04-28 15:08:25 +08:00
LiangSong
2fd13ff075 fix typo 2023-04-28 15:05:33 +08:00
LiangSong
0fdca8b949 update readme 2023-04-28 15:01:01 +08:00
LiangSong
49118aad42 update header config and add padding to concat_multiple_sequence 2023-04-27 23:42:11 +08:00
LiangSong
db6cdb51d0 unified pre-training and instruction-tuning: both use train_lm and dataset 2023-04-27 19:42:06 +08:00
LiangSong
97aff0e051 use split_dataset_by_node instead of accelerate.prepare to speed up data loading by 50% 2023-04-27 00:04:11 +08:00
LiangSong
0377b43628 update tokenizer to LlamaTokenizer 2023-04-26 18:53:30 +08:00
LiangSong
f41f5558ec update header 2023-04-24 23:19:07 +08:00
LiangSong
f8f4cde228 using huggingface datasets to accelerate training, using open-llama to pretrain 2023-04-24 19:13:53 +08:00
s-JoL
92af968637 Update README.md 2023-04-23 16:26:58 +08:00
s-JoL
cf852bc459 Update README.md 2023-04-23 16:26:21 +08:00
LiangSong
ad3d943a7d update readme add ckpt from hf 2023-04-16 23:50:36 +08:00
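
The sketches below illustrate a few of the techniques named in the log above. They are hedged approximations with hypothetical paths, model names, and hyperparameters, not the repository's actual code.

Commit 92caa94490 ("support peft") adds parameter-efficient fine-tuning. A minimal sketch of wrapping a causal LM with a LoRA adapter via the peft library; the checkpoint name and LoRA hyperparameters are assumptions.

```python
# Minimal LoRA sketch with the peft library (hypothetical hyperparameters).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_7b")  # assumed checkpoint

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # low-rank dimension (assumed value)
    lora_alpha=16,                         # scaling factor (assumed value)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()         # only the small LoRA matrices require grad
```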
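Commit 4a1e7bb44b mentions DeepSpeed ZeRO stage 3 and loading models through the Auto classes to cut memory (enabling 65B training). A hedged sketch of what that combination can look like with 🤗 Accelerate; the stage and batch settings are illustrative, not the repository's config.

```python
# Hedged sketch: ZeRO stage-3 via accelerate's DeepSpeedPlugin, plus Auto-class loading.
import torch
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin
from transformers import AutoModelForCausalLM

ds_plugin = DeepSpeedPlugin(zero_stage=3, gradient_accumulation_steps=4)  # assumed values
accelerator = Accelerator(mixed_precision="bf16", deepspeed_plugin=ds_plugin)

# AutoModelForCausalLM resolves the architecture from the checkpoint's config file,
# so the same training script can load 7B or 65B weights without code changes.
model = AutoModelForCausalLM.from_pretrained(
    "openlm-research/open_llama_7b", torch_dtype=torch.bfloat16
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
# model, optimizer and the dataloader would then be passed through accelerator.prepare(...)
```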
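Commit 51686b5fb8 adds an option to split the dataset by shard so each worker only reads its own files instead of skipping through the whole corpus. A rough sketch assuming the corpus is stored as many jsonl shards; the file pattern and rank handling are hypothetical.

```python
# Sketch: give each data-parallel rank its own subset of shard files.
from glob import glob
from datasets import load_dataset

rank, world_size = 0, 8                        # normally taken from the launcher/environment
files = sorted(glob("data/pretrain_*.jsonl"))  # hypothetical shard layout
my_files = files[rank::world_size]             # round-robin assignment of shards to ranks

dataset = load_dataset("json", data_files=my_files, split="train", streaming=True)
```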
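Commits f0d41f937b and 154456c976 pin the random seeds, including the dataset shuffle seed, to 42 for reproducibility. A minimal sketch of seeding in one place; accelerate's set_seed covers Python, NumPy, and torch.

```python
# Sketch: one call seeds python's random, numpy and torch (CPU and CUDA);
# the streaming dataset shuffle gets the same explicit seed.
from accelerate.utils import set_seed
from datasets import load_dataset

set_seed(42)
dataset = load_dataset("json", data_files="data/part_00.jsonl",  # hypothetical path
                       split="train", streaming=True)
dataset = dataset.shuffle(seed=42, buffer_size=10_000)
```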
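Commits fc21a75d1e ("add continue training") and 0466673f76 ("support load model from accelerate ckpt") add resuming from saved training state. A hedged sketch using Accelerator.save_state / load_state; the checkpoint directory is a made-up path.

```python
# Sketch: checkpoint and resume with 🤗 Accelerate; "ckpt/step_1000" is hypothetical.
import torch
from accelerate import Accelerator
from transformers import AutoModelForCausalLM

accelerator = Accelerator()
model = AutoModelForCausalLM.from_pretrained("openlm-research/open_llama_3b")  # assumed checkpoint
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model, optimizer = accelerator.prepare(model, optimizer)

accelerator.save_state("ckpt/step_1000")   # writes model, optimizer and RNG state
# ... later, to continue training from the same point:
accelerator.load_state("ckpt/step_1000")
```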
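Commit 49118aad42 adds padding to concat_multiple_sequence, the step that packs tokenized documents into fixed-length training blocks. The helper below is an illustration of the general technique (concatenate, cut into blocks, pad the tail), not the repository's implementation.

```python
# Illustrative packing helper: concatenate token-id lists, cut into fixed-length
# blocks, and pad the final partial block instead of dropping it.
from typing import List

def concat_and_pad(sequences: List[List[int]], seq_len: int, pad_id: int) -> List[List[int]]:
    flat = [tok for seq in sequences for tok in seq]
    blocks = [flat[i:i + seq_len] for i in range(0, len(flat), seq_len)]
    if blocks and len(blocks[-1]) < seq_len:
        blocks[-1] = blocks[-1] + [pad_id] * (seq_len - len(blocks[-1]))
    return blocks

print(concat_and_pad([[1, 2, 3], [4, 5], [6]], seq_len=4, pad_id=0))
# [[1, 2, 3, 4], [5, 6, 0, 0]]
```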
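Commit 97aff0e051 replaces wrapping the dataloader with accelerate.prepare by sharding the dataset itself with datasets.distributed.split_dataset_by_node, which the message credits with roughly 50% faster data loading. A minimal sketch; rank and world_size would normally come from the launcher.

```python
# Sketch: shard a streaming dataset across data-parallel ranks at the dataset level.
from datasets import load_dataset
from datasets.distributed import split_dataset_by_node

dataset = load_dataset("json", data_files="data/pretrain_*.jsonl",  # hypothetical files
                       split="train", streaming=True)
dataset = split_dataset_by_node(dataset, rank=0, world_size=8)

for example in dataset.take(2):
    print(example)
```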
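Commit 0377b43628 switches to transformers' LlamaTokenizer. A small usage sketch; the tokenizer path and padding choice are assumptions.

```python
# Sketch: load the sentencepiece-based LlamaTokenizer and tokenize a padded batch.
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")  # assumed path
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers typically ship without a pad token

batch = tokenizer(["Hello, Open-Llama!"], padding="max_length",
                  max_length=16, truncation=True, return_tensors="pt")
print(batch["input_ids"].shape)
```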
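Commit f8f4cde228 moves data loading to 🤗 datasets to speed up training. A sketch of the kind of streaming pipeline this enables (load, shuffle, tokenize on the fly); the file pattern and the "text" column name are assumptions.

```python
# Sketch: a streaming 🤗 datasets pipeline that tokenizes on the fly,
# so pre-training can start without materializing the tokenized corpus on disk.
from datasets import load_dataset
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")  # assumed path

dataset = load_dataset("json", data_files="data/pretrain_*.jsonl",  # hypothetical files
                       split="train", streaming=True)
dataset = dataset.shuffle(seed=42, buffer_size=10_000)
dataset = dataset.map(lambda ex: tokenizer(ex["text"]), remove_columns=["text"])
```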