Kye Gomez

kye

https://discord.gg/qUtxnK2NMf

kyegomezb

AI & ML interests

Neuroscience, Behavior Science, Anti-Matter, Anti-Gravity propulsion

kye's activity

upvoted 3 papers about 6 hours ago

Preference Tuning with Human Feedback on Language, Speech, and Vision Tasks: A Survey

Paper • 2409.11564 • Published 2 days ago • 12

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Paper • 2409.12183 • Published 1 day ago • 14

Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published 1 day ago • 63

upvoted a paper about 20 hours ago

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Paper • 2409.12191 • Published 1 day ago • 38

upvoted 2 papers about 22 hours ago

Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models

Paper • 2409.11136 • Published 2 days ago • 17

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

Paper • 2409.11406 • Published 2 days ago • 19

upvoted 7 papers 1 day ago

EzAudio: Enhancing Text-to-Audio Generation with Efficient Diffusion Transformer

Paper • 2409.10819 • Published 3 days ago • 11

A Comprehensive Evaluation of Quantized Instruction-Tuned Large Language Models: An Experimental Analysis up to 405B

Paper • 2409.11055 • Published 2 days ago • 13

Single-Layer Learnable Activation for Implicit Neural Representation (SL^{2}A-INR)

Paper • 2409.10836 • Published 3 days ago • 4

On the limits of agency in agent-based models

Paper • 2409.10568 • Published 6 days ago • 10

OSV: One Step is Enough for High-Quality Image to Video Generation

Paper • 2409.11367 • Published 2 days ago • 11

NVLM: Open Frontier-Class Multimodal LLMs

Paper • 2409.11402 • Published 2 days ago • 47

OmniGen: Unified Image Generation

Paper • 2409.11340 • Published 2 days ago • 52

upvoted 6 papers 2 days ago

Towards Predicting Temporal Changes in a Patient's Chest X-ray Images based on Electronic Health Records

Paper • 2409.07012 • Published 9 days ago • 3

Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models

Paper • 2409.06277 • Published 10 days ago • 12

One missing piece in Vision and Language: A Survey on Comics Understanding

Paper • 2409.09502 • Published 5 days ago • 23

Guiding Vision-Language Model Selection for Visual Question-Answering Across Tasks, Domains, and Knowledge Types

Paper • 2409.09269 • Published 6 days ago • 7

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Paper • 2409.10516 • Published 3 days ago • 26

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Paper • 2409.09214 • Published 6 days ago • 36

upvoted 2 papers 3 days ago

Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection

Paper • 2409.08513 • Published 7 days ago • 8

Apollo: Band-sequence Modeling for High-Quality Audio Restoration

Paper • 2409.08514 • Published 7 days ago • 5

upvoted 2 papers 4 days ago

PiTe: Pixel-Temporal Alignment for Large Video-Language Model

Paper • 2409.07239 • Published 8 days ago • 11

DSBench: How Far Are Data Science Agents to Becoming Data Science Experts?

Paper • 2409.07703 • Published 8 days ago • 58

upvoted 4 papers 6 days ago

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories

Paper • 2409.07440 • Published 8 days ago • 6

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Paper • 2409.06666 • Published 9 days ago • 51

Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Paper • 2409.08264 • Published 7 days ago • 39

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published 14 days ago • 37

upvoted 12 papers 7 days ago

gsplat: An Open-Source Library for Gaussian Splatting

Paper • 2409.06765 • Published 9 days ago • 11

Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models

Paper • 2409.07452 • Published 8 days ago • 18

Generative Hierarchical Materials Search

Paper • 2409.06762 • Published 9 days ago • 6

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Paper • 2409.07450 • Published 8 days ago • 10

Self-Harmonized Chain of Thought

Paper • 2409.04057 • Published 14 days ago • 15

ProteinBench: A Holistic Evaluation of Protein Foundation Models

Paper • 2409.06744 • Published 10 days ago • 6

MVLLaVA: An Intelligent Agent for Unified and Flexible Novel View Synthesis

Paper • 2409.07129 • Published 9 days ago • 7

Can Large Language Models Unlock Novel Scientific Research Ideas?

Paper • 2409.06185 • Published 10 days ago • 9

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published 9 days ago • 18

Agent Workflow Memory

Paper • 2409.07429 • Published 8 days ago • 25

PingPong: A Benchmark for Role-Playing Language Models with User Emulation and Multi-Model Evaluation

Paper • 2409.06820 • Published 9 days ago • 55

MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications

Paper • 2409.07314 • Published 8 days ago • 49

upvoted 10 papers 9 days ago

SongCreator: Lyrics-based Universal Song Generation

Paper • 2409.06029 • Published 10 days ago • 19

INTRA: Interaction Relationship-aware Weakly Supervised Affordance Grounding

Paper • 2409.06210 • Published 10 days ago • 24

OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs

Paper • 2409.05152 • Published 11 days ago • 27

Robot Utility Models: General Policies for Zero-Shot Deployment in New Environments

Paper • 2409.05865 • Published 10 days ago • 14

Paper Copilot: A Self-Evolving and Efficient LLM System for Personalized Academic Assistance

Paper • 2409.04593 • Published 13 days ago • 19

Draw an Audio: Leveraging Multi-Instruction for Video-to-Audio Synthesis

Paper • 2409.06135 • Published 10 days ago • 14

POINTS: Improving Your Vision-language Model with Affordable Strategies

Paper • 2409.04828 • Published 12 days ago • 21

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

Paper • 2409.05591 • Published 10 days ago • 24

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Paper • 2409.05840 • Published 10 days ago • 43

Towards a Unified View of Preference Learning for Large Language Models: A Survey

Paper • 2409.02795 • Published 15 days ago • 70

upvoted 4 papers 10 days ago

GST: Precise 3D Human Body from a Single Image with Gaussian Splatting Transformers

Paper • 2409.04196 • Published 13 days ago • 11

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Paper • 2409.04005 • Published 14 days ago • 16

Configurable Foundation Models: Building LLMs from a Modular Perspective

Paper • 2409.02877 • Published 15 days ago • 27

How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with High-Quality Data

Paper • 2409.03810 • Published 14 days ago • 29

upvoted 6 papers 12 days ago

Report Cards: Qualitative Evaluation of Language Models Using Natural Language Summaries

Paper • 2409.00844 • Published 18 days ago • 11

Building Math Agents with Multi-Turn Iterative Preference Learning

Paper • 2409.02392 • Published 16 days ago • 14

From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents

Paper • 2409.03512 • Published 14 days ago • 25

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding

Paper • 2409.03420 • Published 14 days ago • 23

FuzzCoder: Byte-level Fuzzing Test via Large Language Model

Paper • 2409.01944 • Published 16 days ago • 44

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing

Paper • 2409.01322 • Published 17 days ago • 94

upvoted a paper 14 days ago

WildVis: Open Source Visualizer for Million-Scale Chat Logs in the Wild

Paper • 2409.03753 • Published 14 days ago • 17