AriGraph: Learning Knowledge Graph World Models with Episodic Memory for LLM Agents Paper • 2407.04363 • Published Jul 5 • 26
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27 • 147
Latent Consistency Models LoRAs Collection Latent Consistency Models for Stable Diffusion - LoRAs and full fine-tuned weights • 4 items • Updated Nov 10, 2023 • 98
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22 • 107
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling Paper • 2402.12226 • Published Feb 19 • 40
In Search of Needles in a 10M Haystack: Recurrent Memory Finds What LLMs Miss Paper • 2402.10790 • Published Feb 16 • 40
Efficiently Programming Large Language Models using SGLang Paper • 2312.07104 • Published Dec 12, 2023 • 7
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions Paper • 2309.10150 • Published Sep 18, 2023 • 24
LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model Paper • 2401.02330 • Published Jan 4 • 14
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 211
LLaMA Beyond English: An Empirical Study on Language Capability Transfer Paper • 2401.01055 • Published Jan 2 • 53
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 144
timm Backbones Collection Pre-trained feature extraction backbones available in timm. • 18 items • Updated Jun 12 • 7
Rethinking Vision Transformers for MobileNet Size and Speed Paper • 2212.08059 • Published Dec 15, 2022 • 4
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer Paper • 2110.02178 • Published Oct 5, 2021 • 2