From 16811d0efeef750247cd6b0c6657505b4a52fe4e Mon Sep 17 00:00:00 2001
From: LiangSong <sl12160010@gmail.com>
Date: Mon, 8 May 2023 22:29:24 +0800
Subject: [PATCH] update readme

---
 README.md    | 10 ++++++----
 README_zh.md |  9 ++++++---
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 45af213..6b9e6a8 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@
  * @Author: LiangSong(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
  * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-08 22:25:57
+ * @LastEditTime: 2023-05-08 22:28:51
  * @FilePath: /Open-Llama/README.md
  * @Description:
  *
@@ -61,9 +61,11 @@ Below is a display of the model's multi-turn dialogue ability regarding code:
 
 **[2023.5.8] Release v2.1**
 
-This update adds support for larger model training. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model on a single machine with 8 A100-80G**.
-At the same time, the peft library is introduced to **support training such as lora**.
-The following table compares the training speed of Open-Llama and the original Llama, and the performance data of Llama is quoted from the original Llama paper.
+- This update adds support for larger model training. Using DeepSpeed stage3 + offload + activation checkpoint, you can **train a 65B model on a single machine with 8 A100-80G**.
+
+- The peft library is introduced to **support training such as lora**.
+
+- The following table compares the training speed of Open-Llama and the original Llama, and the performance data of Llama is quoted from the original Llama paper.
 | | DeepSpeed Stage | Offload | Activation Checkpoint | Total Token | GPU hours | Speed token/s/gpu | Batch Size | CPU Memory |
 |----------------|-----------------|---------|-----------------------|-------------|-----------|-------------------|------------|------------|
 | Open-Llama 7B | 1 | False | False | 173.7B | 13412 | 3587 | 2 | 94G |
diff --git a/README_zh.md b/README_zh.md
index f061ab5..0cd1616 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -2,7 +2,7 @@
  * @Author: LiangSong(sl12160010@gmail.com)
  * @Date: 2023-03-10 21:18:35
  * @LastEditors: LiangSong(sl12160010@gmail.com)
- * @LastEditTime: 2023-05-08 22:25:28
+ * @LastEditTime: 2023-05-08 22:28:40
  * @FilePath: /Open-Llama/README_zh.md
  * @Description:
  *
@@ -62,8 +62,11 @@ print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))
 
 **[2023.5.8] Release v2.1**
 
-本次更新加入对更大模型训练的支持,使用DeepSpeed stage3 + offload + activation checkpoint可以在**单机8卡A100-80G训练65B模型**。同时引入peft库**支持lora**等训练。
-下表对比了Open-Llama和Llama原文的训练速度,Llama性能数据引自Llama原文。
+- 本次更新加入对更大模型训练的支持,使用DeepSpeed stage3 + offload + activation checkpoint可以在**单机8卡A100-80G训练65B模型**。
+
+- 引入peft库**支持lora**等训练。
+
+- 下表对比了Open-Llama和Llama原文的训练速度,Llama性能数据引自Llama原文。
 | | DeepSpeed Stage | Offload | Activation Checkpoint | Total Token | GPU hours | Speed token/s/gpu | Batch Size | CPU Memory |
 |----------------|-----------------|---------|-----------------------|-------------|-----------|-------------------|------------|------------|
 | Open-Llama 7B | 1 | False | False | 173.7B | 13412 | 3587 | 2 | 94G |
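The v2.1 notes in the patch above attribute the 65B single-node result to DeepSpeed stage 3 with offload and activation checkpointing. As a rough illustration of what that combination looks like, here is a minimal sketch of a ZeRO stage-3 configuration with CPU offload, written as a Python dict; it is not taken from the Open-Llama repository, and the batch-size and precision values are placeholder assumptions.

```python
# Illustrative sketch only: a DeepSpeed ZeRO stage-3 config with CPU offload,
# expressed as a Python dict (DeepSpeed accepts the same structure as JSON).
# Batch size and bf16 settings are placeholder assumptions, not the
# repository's actual values.
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                              # ZeRO stage 3: shard params, grads, optimizer states
        "offload_optimizer": {"device": "cpu"},  # keep optimizer states in CPU memory
        "offload_param": {"device": "cpu"},      # keep parameters in CPU memory when not in use
    },
}

# Activation (gradient) checkpointing is enabled on the model itself, e.g. for a
# Hugging Face transformers model:
# model.gradient_checkpointing_enable()
```

Offloading optimizer states and parameters to CPU trades GPU memory for host memory and PCIe traffic, which is consistent with the release notes reporting a CPU Memory column alongside the training-speed figures.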
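The same notes mention LoRA-style training via the peft library. Below is a minimal sketch of wrapping a causal language model with a LoRA adapter using peft; the checkpoint name and hyperparameters are illustrative assumptions, not the repository's actual settings.

```python
# Illustrative sketch of LoRA fine-tuning setup with peft; the checkpoint path
# and hyperparameters below are assumptions, not Open-Llama's training config.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

model = AutoModelForCausalLM.from_pretrained("path/to/base-llama-checkpoint")  # placeholder path

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the low-rank update matrices
    lora_alpha=16,                         # scaling applied to the LoRA update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()         # only the small LoRA matrices remain trainable
```

With this wrapper only the low-rank adapter matrices receive gradients, which is what makes fine-tuning large checkpoints feasible on limited GPU memory.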