update readme

LiangSong 2023-04-28 22:45:45 +08:00
parent 0ff8b2353f
commit c8037746c3
2 changed files with 30 additions and 6 deletions

README.md

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-28 19:52:27
* @LastEditTime: 2023-04-28 22:44:21
* @FilePath: /Open-Llama/README.md
* @Description:
*
@@ -22,9 +22,21 @@ Open-Llama is an open-source project that provides a complete training pipeline for building large language models
**The training speed reaches 3620 tokens/s, faster than the 3370 tokens/s reported in the original Llama paper, putting it at the current state-of-the-art level.**
The CheckPoint after Instruct-tuning is open-sourced on [s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1). To use the CheckPoint, first install the latest version of Transformers with the following command:
``` base
The CheckPoint after Instruct-tuning is open-sourced on [HuggingFace: s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1). To use the CheckPoint, first install the latest version of Transformers with the following command:
``` bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
The model can then be loaded and used for inference:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the Instruct-tuned checkpoint from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1").cuda()

# Build a prompt in the user:/system: chat format and move it to the GPU
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()

# Sample a completion and decode it
pred = model.generate(**inputs, max_new_tokens=512, do_sample=True)
print(tokenizer.decode(pred.cpu()[0]).strip())
```
The CheckPoint after pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
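For reference, a minimal sketch of loading this pre-trained (not Instruct-tuned) CheckPoint for plain text continuation; it assumes the same patched Transformers installation as above, that the repository also ships the tokenizer files, and the example prompt is purely illustrative:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: s-JoL/Open-Llama-V1-pretrain contains both model weights and tokenizer files
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1-pretrain", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1-pretrain").cuda()

# No instruction tuning here, so use plain text continuation instead of the user:/system: format
inputs = tokenizer('Quick sort is an algorithm that', return_tensors='pt', return_attention_mask=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()

pred = model.generate(**inputs, max_new_tokens=128, do_sample=True)
print(tokenizer.decode(pred.cpu()[0]).strip())
```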
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.

README_en.md

@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
* @LastEditTime: 2023-04-28 19:53:01
* @LastEditTime: 2023-04-28 22:44:27
* @FilePath: /Open-Llama/README_en.md
* @Description:
*
@@ -22,9 +22,21 @@ Open-Llama is an open-source project that offers a complete training pipeline for building large language models
**The training speed reaches 3620 tokens/s, faster than the 3370 tokens/s reported in the original Llama paper, putting it at the current state-of-the-art level.**
The CheckPoint after Instruct-tuning is open-sourced on [s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1). To use the CheckPoint, first install the latest version of Transformers with the following command:
``` base
The CheckPoint after Instruct-tuning is open-sourced on [HuggingFace: s-JoL/Open-Llama-V1](https://huggingface.co/s-JoL/Open-Llama-V1). To use the CheckPoint, first install the latest version of Transformers with the following command:
``` bash
pip install git+https://github.com/s-JoL/transformers.git@dev
```
The model can then be loaded and used for inference:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the Instruct-tuned checkpoint from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1").cuda()

# Build a prompt in the user:/system: chat format and move it to the GPU
inputs = tokenizer('user:implement quick sort in python\nsystem:', return_tensors='pt', return_attention_mask=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()

# Sample a completion and decode it
pred = model.generate(**inputs, max_new_tokens=512, do_sample=True)
print(tokenizer.decode(pred.cpu()[0]).strip())
```
The CheckPoint after pre-training only has also been uploaded to [s-JoL/Open-Llama-V1-pretrain](https://huggingface.co/s-JoL/Open-Llama-V1-pretrain).
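For reference, a minimal sketch of loading this pre-trained (not Instruct-tuned) CheckPoint for plain text continuation; it assumes the same patched Transformers installation as above, that the repository also ships the tokenizer files, and the example prompt is purely illustrative:
``` python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: s-JoL/Open-Llama-V1-pretrain contains both model weights and tokenizer files
tokenizer = AutoTokenizer.from_pretrained("s-JoL/Open-Llama-V1-pretrain", use_fast=False)
model = AutoModelForCausalLM.from_pretrained("s-JoL/Open-Llama-V1-pretrain").cuda()

# No instruction tuning here, so use plain text continuation instead of the user:/system: format
inputs = tokenizer('Quick sort is an algorithm that', return_tensors='pt', return_attention_mask=False)
for k, v in inputs.items():
    inputs[k] = v.cuda()

pred = model.generate(**inputs, max_new_tokens=128, do_sample=True)
print(tokenizer.decode(pred.cpu()[0]).strip())
```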
A [PR](https://github.com/huggingface/transformers/pull/22795) has been submitted to merge the model into the Transformers main branch.