Commit Graph

16 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| LiangSong | 95973b5de1 | update header | 2023-05-17 22:21:46 +07:00 |
| LiangSong | 59b79af9d7 | add comment | 2023-05-09 16:53:05 +08:00 |
| LiangSong | 92caa94490 | support peft | 2023-05-08 22:26:39 +08:00 |
| LiangSong | ec2b4d6ee7 | fix split by shard bug | 2023-05-08 14:03:05 +08:00 |
| LiangSong | 4a1e7bb44b | Optimized the structure of configs, added support for deepspeed stage3, reduced memory usage by using Auto class to load models, and added support for training 65B models. | 2023-05-06 23:37:17 +08:00 |
| LiangSong | f893a0f5b8 | update dataset | 2023-05-05 19:23:16 +08:00 |
| LiangSong | 85caa97a6a | add xP3 dataset and belle_2M | 2023-05-05 17:05:41 +08:00 |
| LiangSong | 51686b5fb8 | add split dataset by shard option to accelerate data loading | 2023-05-04 09:20:23 +08:00 |
| LiangSong | f0d41f937b | update instruct_config and set all random seed to 42 | 2023-05-04 08:45:21 +08:00 |
| LiangSong | 154456c976 | set dataset shuffle seed to 42 | 2023-05-04 00:31:12 +08:00 |
| LiangSong | 0fdca8b949 | update readme | 2023-04-28 15:01:01 +08:00 |
| LiangSong | 49118aad42 | update header config and add padding to concat_multiple_sequence | 2023-04-27 23:42:11 +08:00 |
| LiangSong | db6cdb51d0 | unified pre-training and instruction-tuning: both use train_lm and dataset | 2023-04-27 19:42:06 +08:00 |
| LiangSong | 0377b43628 | update tokenizer to LlamaTokenizer | 2023-04-26 18:53:30 +08:00 |
| LiangSong | f41f5558ec | update header | 2023-04-24 23:19:07 +08:00 |
| LiangSong | f8f4cde228 | using huggingface datasets to accelerate training, using open-llama to pretrain | 2023-04-24 19:13:53 +08:00 |