Update README: add new model

LiangSong 2023-05-12 11:32:42 +08:00
parent ceb1fd067b
commit 7231d53ca4
2 changed files with 14 additions and 14 deletions

README.md

@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-05-08 22:28:51
* @LastEditTime: 2023-05-12 11:32:28
* @FilePath: /Open-Llama/README.md
* @Description:
*
@ -25,7 +25,7 @@ Open-Llama is an open-source project that offers a complete training pipeline fo
## **Main contents**
- **Support for Transformers/HuggingFace.** The checkpoint after instruct-tuning is open-sourced on [HuggingFace: s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1).
- **Support for Transformers/HuggingFace.** The checkpoint after instruct-tuning is open-sourced on [HuggingFace: s-JoL/Open-Llama-V2](https://huggingface.co/s-JoL/Open-Llama-V2).
- **Using the same evaluation method as the FastChat project, Open-Llama's performance is compared with GPT-3.5. Testing shows it reaches 84% of GPT-3.5's performance on Chinese questions.**
@ -37,17 +37,17 @@ pip install git+https://github.com/huggingface/transformers.git
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1").cuda()
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V2", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V2", device_map="auto")
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False)
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False, add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()
pred = model.generate(**inputs, max_new_tokens=512, do_sample=True)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
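With `device_map="auto"` the weights may be sharded across several devices, so a more defensive variant is to move the inputs to the device of the model's input embeddings rather than calling `.cuda()` unconditionally. A minimal sketch under that assumption, not the project's documented usage:

```python
# Sketch: place the inputs wherever the input embeddings landed, which is
# where generate() expects them when device_map="auto" shards the model.
device = model.get_input_embeddings().weight.device
inputs = {k: v.to(device) for k, v in inputs.items()}
pred = model.generate(**inputs, max_new_tokens=512, do_sample=True)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```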
The pre-training-only checkpoint has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
The pre-training-only checkpoint has also been uploaded to [s-JoL/Open-Llama-V2-pretrain](https://huggingface.co/s-JoL/Open-Llama-V2-pretrain).
The model [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted for merging into the Transformers main branch.
We have completed pre-training on 330B tokens, for a total of 80K training steps. The global batch size is consistent with Llama at 4M tokens.
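As a quick sanity check of those figures (a sketch; it assumes the 4M global batch size is counted in tokens per step):

```python
# Rough sanity check: total tokens ≈ training steps × tokens per step.
steps = 80_000
tokens_per_step = 4_000_000  # 4M-token global batch size, matching Llama
total_tokens = steps * tokens_per_step
print(f"{total_tokens / 1e9:.0f}B tokens")  # 320B, roughly the ~330B reported
```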
@ -135,7 +135,7 @@ When training language models, our goal is to build a versatile model that can h
- Python 3.7 or higher
- PyTorch 1.13
- Special version of [Transformers library](https://github.com/Bayes-Song/transformers)
- [Transformers library](https://github.com/huggingface/transformers)
- [Accelerate library](https://huggingface.co/docs/accelerate/index)
- CUDA 11.6 or higher (for GPU acceleration)
- Hardware configuration: currently using (64 CPU, 1000G Memory, 8xA100-80G) x N. Curiously, the system runs slightly slower when more CPUs are used; I suspect this is related to the dataloader's multiprocessing.
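A minimal check that an environment satisfies the list above (a sketch; it only assumes the packages import as `torch`, `transformers`, and `accelerate`):

```python
# Minimal environment check for the requirements listed above (a sketch).
import sys

import accelerate
import torch
import transformers

assert sys.version_info >= (3, 7), "Python 3.7 or higher is required"
print("PyTorch:", torch.__version__)            # expected: 1.13.x
print("CUDA runtime:", torch.version.cuda)      # expected: 11.6 or higher
print("CUDA available:", torch.cuda.is_available())
print("Transformers:", transformers.__version__)
print("Accelerate:", accelerate.__version__)
```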

README_zh.md

@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-05-08 22:29:53
* @LastEditTime: 2023-05-12 11:31:51
* @FilePath: /Open-Llama/README_zh.md
* @Description:
*
@ -25,7 +25,7 @@ Open-Llama is an open-source project that provides a complete training pipeline for building large language models
## **主要内容**
- **Supports direct use via Transformers/HuggingFace.** The instruct-tuned checkpoint is open-sourced on [HuggingFace: s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1).
- **Supports direct use via Transformers/HuggingFace.** The instruct-tuned checkpoint is open-sourced on [HuggingFace: s-JoL/Open-Llama-V2](https://huggingface.co/s-JoL/Open-Llama-V2).
- **Using the same evaluation method as the FastChat project, Open-Llama's performance is compared with GPT-3.5. Testing shows it reaches 84% of GPT-3.5's performance on Chinese questions.**
@ -38,17 +38,17 @@ pip install git+https://github.com/huggingface/transformers.git
from transformers import AutoModelForCausalLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1").cuda()
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V2", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V2", device_map="auto")
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False)
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False, add_special_tokens=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()
pred = model.generate(**inputs, max_new_tokens=512, do_sample=True)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
```
The pre-training-only checkpoint has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
The pre-training-only checkpoint has also been uploaded to [s-JoL/Open-Llama-V2-pretrain](https://huggingface.co/s-JoL/Open-Llama-V2-pretrain).
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.
We have completed pre-training on 330B tokens, for a total of 80K training steps; the global batch size is consistent with Llama at 4M tokens.
@ -134,7 +134,7 @@ The v1 code is available at https://github.com/s-JoL/Open-Llama/tree/v1.0
- Python 3.7 or higher
- PyTorch 1.13
- Special version of the [Transformers library](https://github.com/Bayes-Song/transformers)
- [Transformers library](https://github.com/huggingface/transformers)
- [Accelerate library](https://huggingface.co/docs/accelerate/index)
- CUDA 11.6 or higher (for GPU acceleration)
- Hardware configuration: currently using (64 CPU, 1000G Memory, 8xA100-80G) x N. Curiously, the system runs slightly slower when more CPUs are used; I suspect this is related to the dataloader's multiprocessing.
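If the dataloader's multiprocessing is indeed the cause, one way to probe it is to fix the worker count instead of scaling it with the CPU count. A hypothetical sketch, not the project's actual configuration (the function and argument names are illustrative):

```python
# Hypothetical sketch: keep a small, fixed DataLoader worker pool rather than
# one worker per CPU core; too many workers can add IPC and memory overhead.
from torch.utils.data import DataLoader

def build_dataloader(dataset, batch_size, num_workers=4):
    return DataLoader(
        dataset,
        batch_size=batch_size,
        num_workers=num_workers,   # fixed, not tied to the CPU count
        pin_memory=True,
        prefetch_factor=2,         # only valid when num_workers > 0
    )
```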