LiangSong | 4a1e7bb44b | Optimized the structure of configs, added support for DeepSpeed stage 3, reduced memory usage by using the Auto classes to load models, and added support for training 65B models. | 2023-05-06 23:37:17 +08:00
LiangSong | 693e3970d9 | update readme | 2023-05-04 22:54:10 +08:00
LiangSong | fbb7997607 | fix typo | 2023-05-04 22:32:15 +08:00
LiangSong | 98ffab3a97 | update readme and add half to server | 2023-05-04 22:28:36 +08:00
LiangSong | 5c876121cb | update gradio, fix code format bug | 2023-05-04 18:18:52 +08:00
LiangSong | f05e929aad | update config | 2023-05-02 21:42:55 +08:00
LiangSong | 0466673f76 | support loading model from accelerate ckpt | 2023-04-29 20:40:42 +08:00
LiangSong | fc21a75d1e | add continue training | 2023-04-29 20:28:39 +08:00
LiangSong | 49118aad42 | update header config and add padding to concat_multiple_sequence | 2023-04-27 23:42:11 +08:00
LiangSong | db6cdb51d0 | unified pre-training and instruction-tuning: both use train_lm and dataset | 2023-04-27 19:42:06 +08:00
LiangSong | 97aff0e051 | use split_dataset_by_node instead of accelerate.prepare to speed up data loading by 50% | 2023-04-27 00:04:11 +08:00
LiangSong | f8f4cde228 | using huggingface datasets to accelerate training, using open-llama to pretrain | 2023-04-24 19:13:53 +08:00
LiangSong | c67d365db3 | update format | 2023-04-07 23:20:20 +08:00
LiangSong | f4ba4b6ff2 | update chinese readme | 2023-04-07 23:19:42 +08:00
LiangSong | 1a731953da | update server | 2023-04-07 10:04:05 +08:00