Commit Graph

16 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| LiangSong | 4a1e7bb44b | Optimized the structure of configs, added support for deepspeed stage3, reduced memory usage by using Auto class to load models, and added support for training 65B models | 2023-05-06 23:37:17 +08:00 |
| LiangSong | 85caa97a6a | add xP3 dataset and belle_2M | 2023-05-05 17:05:41 +08:00 |
| LiangSong | f0d41f937b | update instruct_config and set all random seed to 42 | 2023-05-04 08:45:21 +08:00 |
| LiangSong | f05e929aad | update config | 2023-05-02 21:42:55 +08:00 |
| LiangSong | 0fdca8b949 | update readme | 2023-04-28 15:01:01 +08:00 |
| LiangSong | 49118aad42 | update header config and add padding to concat_multiple_sequence | 2023-04-27 23:42:11 +08:00 |
| LiangSong | db6cdb51d0 | unified pre-training and instruction-tuning to both use train_lm and dataset | 2023-04-27 19:42:06 +08:00 |
| LiangSong | 0377b43628 | update tokenizer to LlamaTokenizer | 2023-04-26 18:53:30 +08:00 |
| LiangSong | f8f4cde228 | using huggingface datasets to accelerate training, using open-llama to pretrain | 2023-04-24 19:13:53 +08:00 |
| LiangSong | a4aa109dd3 | add trainer and utils | 2023-04-12 17:59:05 +08:00 |
| LiangSong | ae0691c509 | update utils | 2023-04-12 17:15:40 +08:00 |
| LiangSong | bc16df4751 | add more instruction data | 2023-04-06 03:45:24 +08:00 |
| LiangSong | b9bc7eaf35 | fix long seq bug | 2023-03-31 10:12:28 +08:00 |
| LiangSong | a62ac2658f | add instruction-tuning | 2023-03-30 23:43:12 +08:00 |
| LiangSong | 918a8cdc3d | reformat code with black | 2023-03-27 14:34:59 +08:00 |
| LiangSong | 73a81a4205 | add high-performance Llama pre-train code | 2023-03-26 23:59:53 +08:00 |
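
Commit f0d41f937b pins every random seed to 42 for reproducible training runs. A minimal sketch of what such a helper might look like; the function name `set_all_seeds` is hypothetical, and the real code would presumably also seed NumPy and PyTorch, which is omitted here to keep the sketch dependency-free:

```python
import random

def set_all_seeds(seed: int = 42) -> None:
    # Hypothetical helper: seed every RNG the training run touches.
    # The actual repo would additionally call numpy.random.seed(seed)
    # and torch.manual_seed(seed); only the stdlib RNG is seeded here.
    random.seed(seed)

set_all_seeds()
first_draw = random.random()
set_all_seeds()
second_draw = random.random()
# re-seeding with the same value replays the same draw, so the two match
```

Fixing the seed this way makes separate runs of the data pipeline and training loop comparable, which matters when benchmarking changes such as the deepspeed stage3 support added later.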