| LiangSong | 95973b5de1 | update header | 2023-05-17 22:21:46 +07:00 |
| LiangSong | 59b79af9d7 | add comment | 2023-05-09 16:53:05 +08:00 |
| LiangSong | 92caa94490 | support peft | 2023-05-08 22:26:39 +08:00 |
| LiangSong | ec2b4d6ee7 | fix split by shard bug | 2023-05-08 14:03:05 +08:00 |
| LiangSong | 4a1e7bb44b | Optimized the structure of configs, added support for deepspeed stage3, reduced memory usage by using Auto class to load models, and added support for training 65B models. | 2023-05-06 23:37:17 +08:00 |
| LiangSong | f893a0f5b8 | update dataset | 2023-05-05 19:23:16 +08:00 |
| LiangSong | 85caa97a6a | add xP3 dataset and belle_2M | 2023-05-05 17:05:41 +08:00 |
| LiangSong | 51686b5fb8 | add split dataset by shard option to accelerate data loading | 2023-05-04 09:20:23 +08:00 |
| LiangSong | f0d41f937b | update instruct_config and set all random seed to 42 | 2023-05-04 08:45:21 +08:00 |
| LiangSong | 154456c976 | set dataset shuffle seed to 42 | 2023-05-04 00:31:12 +08:00 |
| LiangSong | 0fdca8b949 | update readme | 2023-04-28 15:01:01 +08:00 |
| LiangSong | 49118aad42 | update header config and add padding to concat_multiple_sequence | 2023-04-27 23:42:11 +08:00 |
| LiangSong | db6cdb51d0 | unified pre-training and instruction-tuning both use train_lm and dataset | 2023-04-27 19:42:06 +08:00 |
| LiangSong | 0377b43628 | update tokenizer to LlamaTokenizer | 2023-04-26 18:53:30 +08:00 |
| LiangSong | f41f5558ec | update header | 2023-04-24 23:19:07 +08:00 |
| LiangSong | f8f4cde228 | using huggingface datasets to accelerate training, using open-llama to pretrain | 2023-04-24 19:13:53 +08:00 |
| LiangSong | ae0691c509 | update utils | 2023-04-12 17:15:40 +08:00 |
| LiangSong | c67d365db3 | update format | 2023-04-07 23:20:20 +08:00 |
| LiangSong | 1a731953da | update server | 2023-04-07 10:04:05 +08:00 |
| LiangSong | bc16df4751 | add more instruction data | 2023-04-06 03:45:24 +08:00 |
| LiangSong | 562067230f | update dataset, add concat sequence from multiple docs | 2023-04-05 22:42:34 +08:00 |
| LiangSong | b9bc7eaf35 | fix long seq bug | 2023-03-31 10:12:28 +08:00 |
| LiangSong | a62ac2658f | add instruction-tuning | 2023-03-30 23:43:12 +08:00 |
| LiangSong | 87776f4370 | add BucketBySequenceLengthDataset to accelerate training speed | 2023-03-28 10:05:27 +08:00 |
| LiangSong | 918a8cdc3d | reformat code with black | 2023-03-27 14:34:59 +08:00 |
| LiangSong | 1976aea5f9 | update readme | 2023-03-27 02:12:59 +08:00 |
| LiangSong | 73a81a4205 | add high-performance Llama pre-train code | 2023-03-26 23:59:53 +08:00 |