Commit Graph

15 Commits

Author  SHA1  Message  Date
LiangSong  4a1e7bb44b  Optimized the structure of configs, added support for deepspeed stage3, reduced memory usage by using Auto class to load models, and added support for training 65B models.  2023-05-06 23:37:17 +08:00
LiangSong  693e3970d9  update readme  2023-05-04 22:54:10 +08:00
LiangSong  fbb7997607  fix typo  2023-05-04 22:32:15 +08:00
LiangSong  98ffab3a97  update readme and add half to server  2023-05-04 22:28:36 +08:00
LiangSong  5c876121cb  update gradio, fix code format bug  2023-05-04 18:18:52 +08:00
LiangSong  f05e929aad  update config  2023-05-02 21:42:55 +08:00
LiangSong  0466673f76  support loading model from accelerate ckpt  2023-04-29 20:40:42 +08:00
LiangSong  fc21a75d1e  add continue training  2023-04-29 20:28:39 +08:00
LiangSong  49118aad42  update header config and add padding to concat_multiple_sequence  2023-04-27 23:42:11 +08:00
LiangSong  db6cdb51d0  unified pre-training and instruction-tuning to both use train_lm and dataset  2023-04-27 19:42:06 +08:00
LiangSong  97aff0e051  use split_dataset_by_node instead of accelerate.prepare to accelerate data loading by 50%  2023-04-27 00:04:11 +08:00
LiangSong  f8f4cde228  using huggingface datasets to accelerate training, using open-llama to pretrain  2023-04-24 19:13:53 +08:00
LiangSong  c67d365db3  update format  2023-04-07 23:20:20 +08:00
LiangSong  f4ba4b6ff2  update chinese readme  2023-04-07 23:19:42 +08:00
LiangSong  1a731953da  update server  2023-04-07 10:04:05 +08:00
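
The most recent commit (4a1e7bb44b) moves model construction to the transformers Auto classes and enables DeepSpeed ZeRO stage 3 so that the 65B configuration fits in memory. The sketch below shows one way to wire that up with accelerate's DeepSpeedPlugin; the checkpoint id and precision settings are placeholders, not values taken from this repository's configs.

```python
from accelerate import Accelerator
from accelerate.utils import DeepSpeedPlugin
from transformers import AutoConfig, AutoModelForCausalLM

# ZeRO stage 3 shards parameters, gradients, and optimizer state across ranks,
# which is what makes very large (e.g. 65B) models trainable.
ds_plugin = DeepSpeedPlugin(zero_stage=3, gradient_accumulation_steps=1)
accelerator = Accelerator(mixed_precision="bf16", deepspeed_plugin=ds_plugin)

# The Auto classes resolve the concrete architecture from the config, so one
# code path covers every model size. The checkpoint id below is a placeholder.
config = AutoConfig.from_pretrained("openlm-research/open_llama_7b")
model = AutoModelForCausalLM.from_config(config)

# In a full training script the model, optimizer, and dataloader are then
# passed together to accelerator.prepare().
```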
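
Commits fc21a75d1e and 0466673f76 add resuming training from an accelerate checkpoint. A minimal, self-contained sketch of that save/load pattern follows; the model, optimizer, and checkpoint path here are toy placeholders, not the repository's own.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Toy model, optimizer, and data so the example runs on its own.
model = torch.nn.Linear(16, 16)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataloader = DataLoader(TensorDataset(torch.randn(32, 16)), batch_size=4)

accelerator = Accelerator()
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

# save_state() writes model weights, optimizer state, and RNG state to one
# directory; load_state() restores them so a later run can continue training.
accelerator.save_state("checkpoints/step_0")   # hypothetical path
accelerator.load_state("checkpoints/step_0")
```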
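
Commit 97aff0e051 replaces accelerate-prepared dataloaders with datasets' split_dataset_by_node, so each rank streams only its own shard of the corpus. A minimal sketch of that pattern, assuming placeholder data files, rank, and batch size:

```python
from datasets import load_dataset
from datasets.distributed import split_dataset_by_node
from torch.utils.data import DataLoader

# Placeholder values; in a distributed run these come from the launcher.
rank, world_size = 0, 8

# Stream the corpus instead of materializing it, then keep only this rank's shard.
dataset = load_dataset("json", data_files="data/pretrain/*.jsonl",
                       split="train", streaming=True)
dataset = split_dataset_by_node(dataset, rank=rank, world_size=world_size)

# Because sharding already happened at the dataset level, the DataLoader does
# not need to go through accelerator.prepare() to split batches across ranks.
dataloader = DataLoader(dataset, batch_size=2)
```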