LiangSong | 4a1e7bb44b | Optimized the structure of configs, added support for DeepSpeed stage 3, reduced memory usage by using the Auto classes to load models, and added support for training 65B models. | 2023-05-06 23:37:17 +08:00
LiangSong | 693e3970d9 | update readme | 2023-05-04 22:54:10 +08:00
LiangSong | fbb7997607 | fix typo | 2023-05-04 22:32:15 +08:00
LiangSong | 98ffab3a97 | update readme and add half to server | 2023-05-04 22:28:36 +08:00
LiangSong | 5c876121cb | update gradio, fix code format bug | 2023-05-04 18:18:52 +08:00
LiangSong | f05e929aad | update config | 2023-05-02 21:42:55 +08:00
LiangSong | 0466673f76 | support loading model from accelerate ckpt | 2023-04-29 20:40:42 +08:00
LiangSong | fc21a75d1e | add continue training | 2023-04-29 20:28:39 +08:00
LiangSong | 49118aad42 | update header config and add padding to concat_multiple_sequence | 2023-04-27 23:42:11 +08:00
LiangSong | db6cdb51d0 | unified pre-training and instruction-tuning: both use train_lm and dataset | 2023-04-27 19:42:06 +08:00
LiangSong | 97aff0e051 | use split_dataset_by_node instead of accelerate.prepare to speed up data loading by 50% | 2023-04-27 00:04:11 +08:00
LiangSong | f8f4cde228 | using huggingface datasets to accelerate training, using open-llama to pretrain | 2023-04-24 19:13:53 +08:00
LiangSong | c67d365db3 | update format | 2023-04-07 23:20:20 +08:00
LiangSong | f4ba4b6ff2 | update chinese readme | 2023-04-07 23:19:42 +08:00
LiangSong | 1a731953da | update server | 2023-04-07 10:04:05 +08:00