-
Exponentially Faster Language Modelling
Paper • 2311.10770 • Published • 118 -
stabilityai/stable-video-diffusion-img2vid-xt
Image-to-Video • Updated • 286k • 2.56k -
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
Paper • 2311.13384 • Published • 49 -
HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
Paper • 2311.12454 • Published • 29
Collections
Discover the best community collections!
Collections including paper arxiv:2312.06550
-
The Chosen One: Consistent Characters in Text-to-Image Diffusion Models
Paper • 2311.10093 • Published • 56 -
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models
Paper • 2311.12092 • Published • 20 -
DREAM: Diffusion Rectification and Estimation-Adaptive Models
Paper • 2312.00210 • Published • 14 -
HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models
Paper • 2312.00079 • Published • 14
-
DualMix: Unleashing the Potential of Data Augmentation for Online Class-Incremental Learning
Paper • 2303.07864 • Published • 1 -
Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks
Paper • 2305.13547 • Published • 1 -
MixPro: Simple yet Effective Data Augmentation for Prompt-based Learning
Paper • 2304.09402 • Published • 2 -
LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning
Paper • 2305.18169 • Published • 1
-
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Paper • 2211.05100 • Published • 28 -
CsFEVER and CTKFacts: Acquiring Czech data for fact verification
Paper • 2201.11115 • Published -
Training language models to follow instructions with human feedback
Paper • 2203.02155 • Published • 14 -
FinGPT: Large Generative Models for a Small Language
Paper • 2311.05640 • Published • 27
-
The Impact of Depth and Width on Transformer Language Model Generalization
Paper • 2310.19956 • Published • 9 -
Retentive Network: A Successor to Transformer for Large Language Models
Paper • 2307.08621 • Published • 170 -
RWKV: Reinventing RNNs for the Transformer Era
Paper • 2305.13048 • Published • 12 -
Attention Is All You Need
Paper • 1706.03762 • Published • 41
-
Understanding LLMs: A Comprehensive Overview from Training to Inference
Paper • 2401.02038 • Published • 61 -
Learning To Teach Large Language Models Logical Reasoning
Paper • 2310.09158 • Published • 1 -
ChipNeMo: Domain-Adapted LLMs for Chip Design
Paper • 2311.00176 • Published • 8 -
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Paper • 2308.09583 • Published • 7
-
AlpaGasus: Training A Better Alpaca with Fewer Data
Paper • 2307.08701 • Published • 22 -
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
Paper • 2303.03915 • Published • 6 -
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
SlimPajama-DC: Understanding Data Combinations for LLM Training
Paper • 2309.10818 • Published • 10
-
Creative Robot Tool Use with Large Language Models
Paper • 2310.13065 • Published • 8 -
CodeCoT and Beyond: Learning to Program and Test like a Developer
Paper • 2308.08784 • Published • 5 -
Lemur: Harmonizing Natural Language and Code for Language Agents
Paper • 2310.06830 • Published • 30 -
CodePlan: Repository-level Coding using LLMs and Planning
Paper • 2309.12499 • Published • 72
-
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Paper • 2402.17764 • Published • 590 -
Mixtral of Experts
Paper • 2401.04088 • Published • 157 -
Mistral 7B
Paper • 2310.06825 • Published • 47 -
Don't Make Your LLM an Evaluation Benchmark Cheater
Paper • 2311.01964 • Published • 1