update readme

This commit is contained in:
LiangSong 2023-05-15 00:21:13 +08:00
parent 1ce8c18d83
commit 82c845a8ce
4 changed files with 6 additions and 6 deletions


@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-14 10:52:36
+ * @LastEditTime: 2023-05-15 00:21:01
* @FilePath: /Open-Llama/README.md
* @Description:
*
@@ -57,13 +57,13 @@ Using a total of 7 parts of data to constitute the Instruction-tuning data, the
Below is a display of the model's multi-turn dialogue ability regarding code:
- ![image4](assets/multiturn_chat_en.jpeg)
+ ![image4](assets/multiturn_chat_en.jpg)
## **Updates**
**[2023.5.8] Release v2.1**
- This update adds support for larger model training. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model on a single machine with 8 A100-80G GPUs**.
+ This update adds support for larger model training. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model with A100-80G GPUs**.
- The peft library is introduced to **support training methods such as lora**.
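The LoRA training that peft enables can be illustrated with a minimal, self-contained sketch of the underlying idea: instead of updating a full weight matrix, train two small low-rank factors. The layer sizes below are hypothetical and chosen for illustration, not Open-Llama's actual shapes:

```python
# Sketch of the LoRA idea that peft implements for linear layers:
# the frozen weight W (d_out x d_in) is augmented with a trainable
# low-rank update B @ A, where A is (r, d_in) and B is (d_out, r).

def lora_param_counts(d_in: int, d_out: int, r: int):
    """Compare trainable parameters: full fine-tuning vs. rank-r LoRA."""
    full = d_in * d_out          # updating the entire weight matrix
    lora = r * d_in + d_out * r  # only the two low-rank factors A and B
    return full, lora

full, lora = lora_param_counts(4096, 4096, 8)
print(full, lora)  # LoRA trains only a tiny fraction of the parameters
```

With these illustrative sizes, LoRA trains roughly 0.4% of the parameters a full fine-tune would, which is why it pairs well with large-model training on limited hardware.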


@@ -2,7 +2,7 @@
* @Author: LiangSong(sl12160010@gmail.com)
* @Date: 2023-03-10 21:18:35
* @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-14 10:52:08
+ * @LastEditTime: 2023-05-15 00:02:05
* @FilePath: /Open-Llama/README_zh.md
* @Description:
*
@@ -64,7 +64,7 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
**[2023.5.8] Release v2.1**
- This update adds support for training larger models. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model on a single machine with 8 A100-80G GPUs**.
+ This update adds support for training larger models. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model with A100-80G GPUs**.
The peft library is introduced to **support training methods such as lora**.
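The stage3 + offload + activation checkpoint combination mentioned above corresponds to a DeepSpeed config along the following lines. The specific values here are illustrative assumptions, not the repo's actual settings:

```python
import json

# Hypothetical DeepSpeed configuration showing ZeRO stage 3 with CPU
# offload and activation checkpointing -- the combination the release
# notes describe. Batch size and pinning choices are assumptions.
ds_config = {
    "zero_optimization": {
        "stage": 3,  # partition parameters, gradients, and optimizer state
        "offload_param": {"device": "cpu", "pin_memory": True},
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
    },
    "activation_checkpointing": {"partition_activations": True},
    "train_micro_batch_size_per_gpu": 1,
}
print(json.dumps(ds_config, indent=2))
```

Offloading parameters and optimizer state to CPU memory is what makes a 65B model fit within A100-80G GPU memory, at the cost of extra host-device traffic.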

Binary file not shown (after: 859 KiB).


@@ -18,4 +18,4 @@ functorch==1.13.1
xformers==0.0.16
gradio
peft
- git+https://github.com/huggingface/transformers.git
+ transformers