Update README_en.md
This commit is contained in:
parent be2f0960c7
commit ce8bc5249f
@@ -104,6 +104,7 @@ Check the DataLoader output with the following command:
 ```bash
 python3 dataset/pretrain_dataset.py
 ```
+Verification of data integrity can be found in this [issue](https://github.com/s-JoL/Open-Llama/issues/5).
 ### Model Structure
 We modified the [Llama](https://github.com/facebookresearch/llama) model in the Transformers library based on section 2.4, "Efficient Implementation," of the original paper, and introduced some optimizations from other papers. Specifically, we use the memory_efficient_attention operation from Meta's [xformers library](https://github.com/facebookresearch/xformers) to compute self-attention, which improves performance by about 30%. Please refer to modeling_llama.py for details.
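For context, a minimal sketch of the kind of call involved, assuming xformers is installed and a CUDA device is available (this is illustrative, not the repository's modeling_llama.py):

```python
# Illustrative sketch: causal self-attention via xformers' memory_efficient_attention.
# Assumes a CUDA device and xformers installed; shapes follow the xformers
# convention [batch, seq_len, n_heads, head_dim]. Not the repository's actual code.
import torch
import xformers.ops as xops

batch, seq_len, n_heads, head_dim = 2, 128, 8, 64
q = torch.randn(batch, seq_len, n_heads, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# LowerTriangularMask applies the causal mask used in autoregressive LMs like Llama.
out = xops.memory_efficient_attention(q, k, v, attn_bias=xops.LowerTriangularMask())
print(out.shape)  # torch.Size([2, 128, 8, 64])
```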
@@ -261,4 +262,4 @@ Build an online demo using Gradio.
 year={2023},
 howpublished={\url{https://github.com/Bayes-Song/Open-Llama}},
 }
 ```