Commit Graph

6 Commits

Author  SHA1  Message  Date
LiangSong  0fdca8b949  update readme  2023-04-28 15:01:01 +08:00
LiangSong  49118aad42  update header config and add padding to concat_multiple_sequence  2023-04-27 23:42:11 +08:00
LiangSong  db6cdb51d0  unified pre-training and instrcution-tuning both use train_lm and dataset  2023-04-27 19:42:06 +08:00
LiangSong  0377b43628  update tokenizer to LlamaTokenizer  2023-04-26 18:53:30 +08:00
LiangSong  f41f5558ec  update header  2023-04-24 23:19:07 +08:00
LiangSong  f8f4cde228  using huggingface datasets to accelerate training, using open-llama to pretrain  2023-04-24 19:13:53 +08:00