Commit Graph

102 Commits

Author SHA1 Message Date
LiangSong
4a1e7bb44b Optimized the structure of configs, added support for DeepSpeed stage 3, reduced memory usage by using the Auto classes to load models, and added support for training 65B models. 2023-05-06 23:37:17 +08:00
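
For context, a minimal sketch of the two techniques this commit names, assuming standard transformers and DeepSpeed usage; the checkpoint name and config values are illustrative, not necessarily the repo's:

```python
# Sketch only: load a causal LM through the Auto classes so the architecture
# is resolved from the config, paired with a DeepSpeed ZeRO stage-3 config
# that shards parameters, gradients, and optimizer state across GPUs.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("openlm-research/open_llama_7b")  # example checkpoint
model = AutoModelForCausalLM.from_config(config)

# Illustrative ZeRO stage-3 section of a DeepSpeed JSON config:
ds_config = {
    "zero_optimization": {"stage": 3, "overlap_comm": True},
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,
}
```
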
LiangSong
5b1f6a4861 fix epoch bug 2023-05-06 09:45:37 +08:00
LiangSong
f893a0f5b8 update dataset 2023-05-05 19:23:16 +08:00
LiangSong
758af69c73 update science instruction-tuning datasets 2023-05-05 19:00:37 +08:00
LiangSong
d24b4cce54 update preprocess format 2023-05-05 18:20:59 +08:00
LiangSong
85caa97a6a add xP3 dataset and belle_2M 2023-05-05 17:05:41 +08:00
LiangSong
00cbdbbf26 fix typo 2023-05-04 22:55:40 +08:00
LiangSong
693e3970d9 update readme 2023-05-04 22:54:10 +08:00
LiangSong
fbb7997607 fix typo 2023-05-04 22:32:15 +08:00
LiangSong
98ffab3a97 update readme and add half-precision to server 2023-05-04 22:28:36 +08:00
LiangSong
5c876121cb update gradio, fix code format bug 2023-05-04 18:18:52 +08:00
LiangSong
a1acc90988 fix train_tokenizer bug 2023-05-04 16:00:56 +08:00
LiangSong
51686b5fb8 add split dataset by shard option to accelerate data loading 2023-05-04 09:20:23 +08:00
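
A hedged sketch of what a "split dataset by shard" option can look like with huggingface datasets: each rank streams only its own slice of the underlying files, so no worker reads data it will discard. The helper name and file layout are hypothetical.

```python
from datasets import load_dataset

def load_rank_shard(data_files, rank, world_size):
    # Hypothetical helper: keep every world_size-th file for this rank so
    # ranks stream disjoint shards instead of filtering a shared stream.
    my_files = data_files[rank::world_size]
    return load_dataset("json", data_files=my_files, streaming=True, split="train")

# Usage (paths illustrative):
# ds = load_rank_shard([f"data/part-{i:05d}.jsonl" for i in range(64)], rank=0, world_size=8)
```
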
LiangSong
f0d41f937b update instruct_config and set all random seed to 42 2023-05-04 08:45:21 +08:00
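
A sketch of pinning every common randomness source to 42, covering both this commit and the "set dataset shuffle seed to 42" entry below; exactly which libraries the repo seeds is an assumption.

```python
import random
import numpy as np
import torch
from datasets import load_dataset

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
torch.cuda.manual_seed_all(SEED)

# Dataset shuffling gets the same fixed seed (file path is illustrative):
ds = load_dataset("json", data_files="data/train.jsonl", split="train").shuffle(seed=SEED)
```
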
LiangSong
dba2e2d680 update ShareGPT_90K preprocess 2023-05-04 08:34:38 +08:00
LiangSong
154456c976 set dataset shuffle seed to 42 2023-05-04 00:31:12 +08:00
LiangSong
c2184c6dd1 support multiple epochs 2023-05-03 00:02:01 +08:00
LiangSong
f05e929aad update config 2023-05-02 21:42:55 +08:00
LiangSong
0466673f76 support load model from accelerate ckpt 2023-04-29 20:40:42 +08:00
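
For reference, a minimal sketch of resuming from an Accelerate checkpoint directory; save_state/load_state are real Accelerate APIs, while the model and path here are stand-ins.

```python
import torch
from accelerate import Accelerator

model = torch.nn.Linear(8, 8)  # stand-in model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

accelerator = Accelerator()
model, optimizer = accelerator.prepare(model, optimizer)

# Restores weights, optimizer state, and RNG state from a directory written
# earlier by accelerator.save_state(...); the path is illustrative.
accelerator.load_state("checkpoints/step_10000")
```
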
LiangSong
52cd09f664 update readme 2023-04-29 20:30:24 +08:00
LiangSong
fc21a75d1e add support for continued training 2023-04-29 20:28:39 +08:00
LiangSong
28b11a5bed update requirements 2023-04-29 13:39:03 +08:00
LiangSong
8b439dec4a update flops 2023-04-29 12:31:11 +08:00
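
The usual back-of-the-envelope accounting for training FLOPs is roughly 6 FLOPs per parameter per token (about 2 for the forward pass, 4 for the backward); whether this commit uses exactly that formula is an assumption, and the numbers below are illustrative.

```python
# Rule-of-thumb estimate: train_flops ≈ 6 * n_params * n_tokens.
n_params = 7e9    # e.g. a 7B-parameter model
n_tokens = 1e12   # e.g. 1T training tokens
train_flops = 6 * n_params * n_tokens
print(f"{train_flops:.2e}")  # 4.20e+22
```
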
LiangSong
a2816bd23d update readme 2023-04-29 12:06:55 +08:00
LiangSong
4c5e50e4aa update readme 2023-04-29 11:41:28 +08:00
LiangSong
c8037746c3 update readme 2023-04-28 22:45:45 +08:00
s-JoL
0ff8b2353f Merge pull request #30 from s-JoL/dev (update readme) 2023-04-28 19:54:52 +08:00
LiangSong
724265b435 update readme 2023-04-28 19:54:14 +08:00
s-JoL
0fd7dbd636 Merge pull request #29 from s-JoL/dev (update readme) 2023-04-28 19:50:29 +08:00
LiangSong
8c85535db3 update readme 2023-04-28 19:49:51 +08:00
LiangSong
676dcfd995 add hardware configuration to readme 2023-04-28 17:29:11 +08:00
s-JoL
f3c664bde3 Merge pull request #25 from s-JoL/dev (v2 release) 2023-04-28 15:11:02 +08:00
LiangSong
c890bce69c update readme 2023-04-28 15:10:41 +08:00
LiangSong
9baebfd49c Merge branch 'main' into dev 2023-04-28 15:08:25 +08:00
LiangSong
2fd13ff075 fix typo 2023-04-28 15:05:33 +08:00
LiangSong
0fdca8b949 update readme 2023-04-28 15:01:01 +08:00
LiangSong
49118aad42 update header config and add padding to concat_multiple_sequence 2023-04-27 23:42:11 +08:00
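
A hypothetical reconstruction of what adding padding to concat_multiple_sequence could mean: pack tokenized documents into fixed-length training sequences and pad the final, partially filled one instead of dropping it. The signature below is guessed from the function name.

```python
# Hypothetical sketch, not the repo's exact code.
from typing import Iterable, List

def concat_multiple_sequence(docs: Iterable[List[int]], seq_len: int, pad_id: int) -> List[List[int]]:
    buffer: List[int] = []
    sequences: List[List[int]] = []
    for doc in docs:
        buffer.extend(doc)
        # Emit full-length sequences as soon as the buffer can fill one.
        while len(buffer) >= seq_len:
            sequences.append(buffer[:seq_len])
            buffer = buffer[seq_len:]
    if buffer:  # pad the leftover tokens up to seq_len instead of discarding them
        sequences.append(buffer + [pad_id] * (seq_len - len(buffer)))
    return sequences
```
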
LiangSong
db6cdb51d0 unify pre-training and instruction-tuning so both use train_lm and dataset 2023-04-27 19:42:06 +08:00
LiangSong
97aff0e051 use split_dataset_by_node instead of accelerate.prepare to speed up data loading by 50% 2023-04-27 00:04:11 +08:00
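
The function named here is real (datasets.distributed.split_dataset_by_node); the sketch below shows the switch the commit describes, with the corpus, rank, and batch size as placeholders.

```python
from datasets import load_dataset
from datasets.distributed import split_dataset_by_node
from torch.utils.data import DataLoader

# Shard the stream per rank up front, instead of letting accelerator.prepare
# wrap the DataLoader; each rank then only iterates its own slice.
ds = load_dataset("allenai/c4", "en", split="train", streaming=True)  # example corpus
ds = split_dataset_by_node(ds, rank=0, world_size=8)                  # rank-local shard
loader = DataLoader(ds, batch_size=8)
```
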
LiangSong
0377b43628 update tokenizer to LlamaTokenizer 2023-04-26 18:53:30 +08:00
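
A short sketch of the tokenizer switch; the checkpoint ID is an example, not necessarily the one the repo pins.

```python
from transformers import LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("openlm-research/open_llama_7b")  # example ID
ids = tokenizer("Hello, world!", return_tensors="pt").input_ids
```
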
LiangSong
f41f5558ec update header 2023-04-24 23:19:07 +08:00
LiangSong
f8f4cde228 use huggingface datasets to accelerate training; use open-llama for pre-training 2023-04-24 19:13:53 +08:00
s-JoL
92af968637 Update README.md 2023-04-23 16:26:58 +08:00
s-JoL
cf852bc459 Update README.md 2023-04-23 16:26:21 +08:00
LiangSong
ad3d943a7d update readme, add ckpt from hf 2023-04-16 23:50:36 +08:00
LiangSong
b21441b14b disable concat docs 2023-04-15 19:35:24 +08:00
LiangSong
3f62a23ee2 update format 2023-04-12 22:16:15 +08:00
LiangSong
a4aa109dd3 add trainer and utils 2023-04-12 17:59:05 +08:00
LiangSong
ae0691c509 update utils 2023-04-12 17:15:40 +08:00
LiangSong
da1c927016 update speed test 2023-04-12 17:15:07 +08:00