- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 43
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 590
- Genie: Generative Interactive Environments
  Paper • 2402.15391 • Published • 70
- Humanoid Locomotion as Next Token Prediction
  Paper • 2402.19469 • Published • 26
Yosuke Yamaguchi
yamayou
AI & ML interests
None yet
Organizations
None yet
Collections
3
- Jamba: A Hybrid Transformer-Mamba Language Model
  Paper • 2403.19887 • Published • 103
- Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
  Paper • 2404.00399 • Published • 40
- Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
  Paper • 2404.02258 • Published • 103
- Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length
  Paper • 2404.08801 • Published • 62
models
None public yet
datasets
None public yet