chujiezheng committed on
Commit
00293b3
1 Parent(s): 3683640

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -11,7 +11,7 @@ The extrapolated (ExPO) model based on [`shenzhi-wang/Llama3-8B-Chinese-Chat`](h
 
  Specifically, we obtain this model by extrapolating **(alpha = 0.3)** from the weights of the SFT and DPO/RLHF checkpoints, achieving superior alignment with human preference.
 
- **Note:** This is an experimental model, as I have not comprehensively evaluated its Chinese ability. There may occur unexpected issues when we apply extrapolation to the new-language (i.e., Chinese) training.
+ **Note:** This is an experimental model, as I have not comprehensively evaluated its Chinese ability. **Unexpected issues may occur when we apply extrapolation to the DPO/RLHF alignment training for new languages (e.g., Chinese).**
 
  ## Evaluation Results
 
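
The README context line in this diff describes extrapolating with alpha = 0.3 from the SFT and DPO/RLHF checkpoint weights. A minimal sketch of what that could look like follows. It assumes the update rule `theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)`, which the diff itself does not spell out. The function name `extrapolate_weights` and the toy dict-of-lists checkpoints are hypothetical stand-ins for real model state dicts, not the author's code.

```python
from typing import Dict, List

# Hypothetical toy representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def extrapolate_weights(sft: Weights, aligned: Weights, alpha: float = 0.3) -> Weights:
    """Step past the aligned checkpoint along the SFT -> aligned direction.

    Assumed rule (not stated in the diff):
        theta_expo = theta_aligned + alpha * (theta_aligned - theta_sft)
    """
    if sft.keys() != aligned.keys():
        raise ValueError("checkpoints must share the same parameter names")

    expo: Weights = {}
    for name, aligned_vals in aligned.items():
        sft_vals = sft[name]
        if len(sft_vals) != len(aligned_vals):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Apply the extrapolation element by element.
        expo[name] = [a + alpha * (a - s) for s, a in zip(sft_vals, aligned_vals)]
    return expo


# Toy checkpoints with a single hypothetical two-element parameter.
sft_ckpt = {"layer.weight": [1.0, 2.0]}
dpo_ckpt = {"layer.weight": [2.0, 2.0]}
expo_ckpt = extrapolate_weights(sft_ckpt, dpo_ckpt, alpha=0.3)
```

With alpha = 0 the result equals the DPO/RLHF checkpoint, and larger alpha moves further along the direction the alignment training already took. That is one way to read the note's caution: if alignment training also shifted the model toward a new language, the extrapolation amplifies that shift too.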