Commit Graph

102 Commits

Author SHA1 Message Date
LiangSong
4a1e7bb44b Optimized the structure of configs, added support for DeepSpeed stage 3, reduced memory usage by using the Auto classes to load models, and added support for training 65B models. 2023-05-06 23:37:17 +08:00
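
For context, a minimal sketch of the two techniques this commit names, assuming standard transformers and DeepSpeed usage; the checkpoint name and config values are illustrative, not necessarily the repo's:

```python
# Sketch only: load a causal LM through the Auto classes so the architecture
# is resolved from the config, paired with a DeepSpeed ZeRO stage-3 config
# that shards parameters, gradients, and optimizer state across GPUs.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("openlm-research/open_llama_7b")  # example checkpoint
model = AutoModelForCausalLM.from_config(config)

# Illustrative ZeRO stage-3 section of a DeepSpeed JSON config:
ds_config = {
    "zero_optimization": {"stage": 3, "overlap_comm": True},
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,
}
```
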
LiangSong
5b1f6a4861 fix epoch bug 2023-05-06 09:45:37 +08:00
LiangSong
f893a0f5b8 update dataset 2023-05-05 19:23:16 +08:00
LiangSong
758af69c73 update science instruction-tuning datasets 2023-05-05 19:00:37 +08:00
LiangSong
d24b4cce54 update preprocess format 2023-05-05 18:20:59 +08:00
LiangSong
85caa97a6a add xP3 dataset and belle_2M 2023-05-05 17:05:41 +08:00
LiangSong
00cbdbbf26 fix typo 2023-05-04 22:55:40 +08:00
LiangSong
693e3970d9 update readme 2023-05-04 22:54:10 +08:00
LiangSong
fbb7997607 fix typo 2023-05-04 22:32:15 +08:00
LiangSong
98ffab3a97 update readme and add half-precision to server 2023-05-04 22:28:36 +08:00
LiangSong
5c876121cb update gradio, fix code format bug 2023-05-04 18:18:52 +08:00
LiangSong
a1acc90988 fix train_tokenizer bug 2023-05-04 16:00:56 +08:00
LiangSong
51686b5fb8 add split dataset by shard option to accelerate data loading 2023-05-04 09:20:23 +08:00
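
A hedged sketch of what a "split dataset by shard" option can look like with huggingface datasets: each rank streams only its own slice of the underlying files, so no worker reads data it will discard. The helper name and file layout are hypothetical.

```python
from datasets import load_dataset

def load_rank_shard(data_files, rank, world_size):
    # Hypothetical helper: keep every world_size-th file for this rank so
    # ranks stream disjoint shards instead of filtering a shared stream.
    my_files = data_files[rank::world_size]
    return load_dataset("json", data_files=my_files, streaming=True, split="train")

# Usage (paths illustrative):
# ds = load_rank_shard([f"data/part-{i:05d}.jsonl" for i in range(64)], rank=0, world_size=8)
```
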
LiangSong
f0d41f937b update instruct_config and set all random seed to 42 2023-05-04 08:45:21 +08:00
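
A sketch of pinning every common randomness source to 42, covering both this commit and the "set dataset shuffle seed to 42" entry below; exactly which libraries the repo seeds is an assumption.

```python
import random
import numpy as np
import torch
from datasets import load_dataset

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)

# Dataset shuffling gets the same fixed seed (file path is illustrative):
ds = load_dataset("json", data_files="data/train.jsonl", split="train").shuffle(seed=SEED)
```
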
LiangSong
dba2e2d680 update ShareGPT_90K preprocess 2023-05-04 08:34:38 +08:00
LiangSong
154456c976 set dataset shuffle seed to 42 2023-05-04 00:31:12 +08:00
LiangSong
c2184c6dd1 support multiple epochs 2023-05-03 00:02:01 +08:00
LiangSong
f05e929aad update config 2023-05-02 21:42:55 +08:00
LiangSong
0466673f76 support load model from accelerate ckpt 2023-04-29 20:40:42 +08:00
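
For reference, a minimal sketch of resuming from an Accelerate checkpoint directory; save_state/load_state are real Accelerate APIs, while the model and path here are stand-ins.

```python
import torch
from accelerate import Accelerator

model = torch.nn.Linear(8, 8)  # stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

accelerator = Accelerator()
model, optimizer = accelerator.prepare(model, optimizer)

# Restores weights, optimizer state, and RNG state from a directory written
# earlier by accelerator.save_state(...); the path is illustrative.
accelerator.load_state("checkpoints/step_10000")
```
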
LiangSong
52cd09f664 update readme 2023-04-29 20:30:24 +08:00
LiangSong
fc21a75d1e add support for continued training 2023-04-29 20:28:39 +08:00
LiangSong
28b11a5bed update requirements 2023-04-29 13:39:03 +08:00
LiangSong
8b439dec4a update flops 2023-04-29 12:31:11 +08:00
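
The usual back-of-the-envelope accounting for training FLOPs is roughly 6 FLOPs per parameter per token (about 2 for the forward pass, 4 for the backward); whether this commit uses exactly that formula is an assumption, and the numbers below are illustrative.

```python
# Rule-of-thumb estimate: train_flops ≈ 6 * n_params * n_tokens.
n_params = 7e9    # e.g. a 7B-parameter model
n_tokens = 1e12   # e.g. 1T training tokens
train_flops = 6 * n_params * n_tokens
print(f"{train_flops:.2e}")  # 4.20e+22
```
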
LiangSong
a2816bd23d update readme 2023-04-29 12:06:55 +08:00
LiangSong
4c5e50e4aa update readme 2023-04-29 11:41:28 +08:00
LiangSong
c8037746c3 update readme 2023-04-28 22:45:45 +08:00
s-JoL
0ff8b2353f Merge pull request #30 from s-JoL/dev (update readme) 2023-04-28 19:54:52 +08:00
LiangSong
724265b435 update readme 2023-04-28 19:54:14 +08:00
s-JoL
0fd7dbd636 Merge pull request #29 from s-JoL/dev (update readme) 2023-04-28 19:50:29 +08:00
LiangSong
8c85535db3 update readme 2023-04-28 19:49:51 +08:00
LiangSong
676dcfd995 add hardware configuration to readme 2023-04-28 17:29:11 +08:00
s-JoL
f3c664bde3 Merge pull request #25 from s-JoL/dev (v2 release) 2023-04-28 15:11:02 +08:00
LiangSong
c890bce69c update readme 2023-04-28 15:10:41 +08:00
LiangSong
9baebfd49c Merge branch 'main' into dev 2023-04-28 15:08:25 +08:00
LiangSong
2fd13ff075 fix typo 2023-04-28 15:05:33 +08:00
LiangSong
0fdca8b949 update readme 2023-04-28 15:01:01 +08:00
LiangSong
49118aad42 update header config and add padding to concat_multiple_sequence 2023-04-27 23:42:11 +08:00
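
A hypothetical reconstruction of what adding padding to concat_multiple_sequence could mean: pack tokenized documents into fixed-length training sequences and pad the final, partially filled one instead of dropping it. The signature below is guessed from the function name.

```python
# Hypothetical sketch, not the repo's exact code.
from typing import Iterable, List

def concat_multiple_sequence(docs: Iterable[List[int]], seq_len: int, pad_id: int) -> List[List[int]]:
    buffer: List[int] = []
    sequences: List[List[int]] = []
    for doc in docs:
        buffer.extend(doc)
        # Emit full-length sequences as soon as the buffer can fill one.
        while len(buffer) >= seq_len:
            sequences.append(buffer[:seq_len])
            buffer = buffer[seq_len:]
    if buffer:  # pad the leftover tokens up to seq_len instead of discarding them
        sequences.append(buffer + [pad_id] * (seq_len - len(buffer)))
    return sequences
```
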
LiangSong
db6cdb51d0 unify pre-training and instruction-tuning so both use train_lm and dataset 2023-04-27 19:42:06 +08:00
LiangSong
97aff0e051 use split_dataset_by_node instead of accelerate.prepare to speed up data loading by 50% 2023-04-27 00:04:11 +08:00
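
The function named here is real (datasets.distributed.split_dataset_by_node); the sketch below shows the switch the commit describes, with the corpus, rank, and batch size as placeholders.

```python
from datasets import load_dataset
from datasets.distributed import split_dataset_by_node
from torch.utils.data import DataLoader

# Shard the stream per rank up front, instead of letting accelerator.prepare
# wrap the DataLoader; each rank then only iterates its own slice.
ds = load_dataset("allenai/c4", "en", split="train", streaming=True)  # example corpus
ds = split_dataset_by_node(ds, rank=0, world_size=8)                  # rank-local shard
loader = DataLoader(ds, batch_size=8)
```
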
LiangSong
0377b43628 update tokenizer to LlamaTokenizer 2023-04-26 18:53:30 +08:00
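
A short sketch of the tokenizer switch; the checkpoint ID is an example, not necessarily the one the repo pins.

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")  # example ID
ids = tokenizer("Hello, world!", return_tensors="pt").input_ids
```
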
LiangSong
f41f5558ec update header 2023-04-24 23:19:07 +08:00
LiangSong
f8f4cde228 use huggingface datasets to accelerate training; use open-llama for pre-training 2023-04-24 19:13:53 +08:00
s-JoL
92af968637 Update README.md 2023-04-23 16:26:58 +08:00
s-JoL
cf852bc459 Update README.md 2023-04-23 16:26:21 +08:00
LiangSong
ad3d943a7d update readme, add ckpt from hf 2023-04-16 23:50:36 +08:00
LiangSong
b21441b14b disable concat docs 2023-04-15 19:35:24 +08:00
LiangSong
3f62a23ee2 update format 2023-04-12 22:16:15 +08:00
LiangSong
a4aa109dd3 add trainer and utils 2023-04-12 17:59:05 +08:00
LiangSong
ae0691c509 update utils 2023-04-12 17:15:40 +08:00
LiangSong
da1c927016 update speed test 2023-04-12 17:15:07 +08:00