LHC88 committed on
Commit 2971ac5
Parent(s): d16f74b

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -13,12 +13,14 @@ datasets:
 ![image/png](https://huggingface.co/openaccess-ai-collective/DPOpenHermes-7B/resolve/main/assets/dpopenhermes.png)
 
 ## Laser Config
+
+[**Lasered with AIDOcks**](https://github.com/l4b4r4b4b4/AIDocks)
 top_k_layers: 32
 datasets: wikitext2, ptb, c4
 
 ## OpenHermes x Notus x Neural
 
-[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 
 This is a second RL fine tuned model of [Teknium](https://huggingface.co/teknium)'s [OpenHermes-2.5-Mistral-7B](https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B) using the [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs) and [allenai/ultrafeedback_binarized_cleaned](https://huggingface.co/datasets/allenai/ultrafeedback_binarized_cleaned) preference datasets for reinforcement learning using Direct Preference Optimization (DPO)
 
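For context on the DPO step the README describes, below is a minimal sketch of how preference-based fine-tuning of OpenHermes-2.5-Mistral-7B on Intel/orca_dpo_pairs might be set up with Hugging Face `trl`'s `DPOTrainer`. This is an illustrative assumption, not the authors' training recipe (the badge above indicates the model was built with Axolotl); the column mapping, hyperparameters, and argument names are hypothetical and version-dependent.

```python
# Illustrative sketch only (not the authors' actual Axolotl config): a DPO
# fine-tune of OpenHermes-2.5-Mistral-7B on one of the listed preference
# datasets using Hugging Face trl. Hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base = "teknium/OpenHermes-2.5-Mistral-7B"
model = AutoModelForCausalLM.from_pretrained(base)
tokenizer = AutoTokenizer.from_pretrained(base)

# DPO expects rows with "prompt", "chosen", and "rejected" text columns;
# orca_dpo_pairs stores the prompt under "question" (assumed schema), so remap it.
raw = load_dataset("Intel/orca_dpo_pairs", split="train")
pairs = raw.map(
    lambda row: {
        "prompt": row["question"],
        "chosen": row["chosen"],
        "rejected": row["rejected"],
    },
    remove_columns=raw.column_names,
)

args = DPOConfig(
    output_dir="dpopenhermes-7b",
    beta=0.1,                        # strength of the implicit KL penalty (placeholder)
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,                     # a frozen copy serves as the reference policy by default
    args=args,
    train_dataset=pairs,
    tokenizer=tokenizer,             # older trl API; newer releases use processing_class
)
trainer.train()
```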