joshuasundance committed
Commit 32b73ce
1 Parent(s): 2844481

Update README.md

Files changed (1):
  1. README.md +14 -11
README.md CHANGED
```diff
@@ -14,11 +14,14 @@ language:
 
 # Model Card for Model ID
 
-This is an experimental model made by using [https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc](joshuasundance/mypo-4k-rfc) for DPO training of [https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k](edumunozsala/phi3-mini-4k-qlora-python-code-20k).
+* **Base Model**: https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k
+* **Preference Dataset**: https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc
+* **Training Code**: https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc
+* **Training Metrics**: [trainer_state.json](trainer_state.json)
 
-The goal is to learn about model training and potentially get the base model to reliably produce Python with type hints.
+This is an experimental model made by using `joshuasundance/mypo-4k-rfc` for DPO training of `edumunozsala/phi3-mini-4k-qlora-python-code-20k`.
 
-I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
+The goal is to learn about model training and potentially get the base model to reliably produce Python with type hints. I chose `edumunozsala/phi3-mini-4k-qlora-python-code-20k` because I was able to train this model in one hour on my laptop.
 
 
 ## Model Details
@@ -31,7 +34,7 @@ This is the model card of a 🤗 transformers model that has been pushed on the
 - **Model type:** phi 3 qlora DPO
 - **Language(s) (NLP):** English
 - **License:** MIT
-- **Finetuned from model [optional]:** [https://huggingface.co/edumunozsala/phi3-mini-4k-qlora-python-code-20k](edumunozsala/phi3-mini-4k-qlora-python-code-20k)
+- **Finetuned from model [optional]:** `edumunozsala/phi3-mini-4k-qlora-python-code-20k`
 
 ### Model Sources [optional]
 
@@ -81,30 +84,30 @@ Use the code below to get started with the model.
 
 ### Training Data
 
-* Original qlora: [https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca](iamtarun/python_code_instructions_18k_alpaca)
-* DPO: [https://huggingface.co/datasets/joshuasundance/mypo-4k-rfc](joshuasundance/mypo-4k-rfc)
+* Original qlora: `iamtarun/python_code_instructions_18k_alpaca`
+* DPO: `joshuasundance/mypo-4k-rfc`
 
 ### Training Procedure
 
-See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
+See training code using `peft`, `transformers`, and `trl`
 
 #### Preprocessing [optional]
 
-See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
+See training code using `peft`, `transformers`, and `trl`
 
 #### Training Hyperparameters
 
-See [https://gist.github.com/joshuasundance-swca/a94672960733782865932a645587ccdc](training code) using `peft`, `transformers`, and `trl`
+See training code using `peft`, `transformers`, and `trl`
 
 #### Speeds, Sizes, Times [optional]
 
-See [https://huggingface.co/joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc/blob/main/trainer_state.json](trainer_state.json) in this repo
+See [trainer_state.json](trainer_state.json) in this repo
 
 [More Information Needed]
 
 ## Evaluation
 
-See [https://huggingface.co/joshuasundance/phi3-mini-4k-qlora-python-code-20k-mypo-4k-rfc/blob/main/trainer_state.json](trainer_state.json) in this repo
+See [trainer_state.json](trainer_state.json) in this repo
 
 ### Testing Data, Factors & Metrics
 
```
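The README above repeatedly points at the DPO training code built on `peft`, `transformers`, and `trl`. As background for what that training optimizes, here is a minimal sketch of the per-preference-pair DPO objective; the function name and the toy log-probability values are illustrative, not taken from the linked gist:

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (prompt, chosen, rejected) preference pair.

    Inputs are summed token log-probabilities of the chosen and rejected
    completions under the trained policy and the frozen reference model.
    beta scales the implicit reward margin.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), written as log1p(exp(-margin)) for stability
    return math.log1p(math.exp(-margin))
```

When the policy has not moved from the reference, the margin is zero and the loss is log 2; pushing probability mass toward the chosen completion (relative to the reference) drives the loss toward zero.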