jizhongpeng

AI & ML interests

None yet

jizhongpeng's activity

upvoted a collection 21 days ago

Qwen2-VL

Vision-language model series based on Qwen2 • 15 items • Updated 1 day ago • 114

upvoted a paper 24 days ago

K-Sort Arena: Efficient and Reliable Benchmarking for Generative Models via K-wise Human Preferences

Paper • 2408.14468 • Published 24 days ago • 33

upvoted 2 papers 28 days ago

Towards flexible perception with visual memory

Paper • 2408.08172 • Published Aug 15 • 19

FRAP: Faithful and Realistic Text-to-Image Generation with Adaptive Prompt Weighting

Paper • 2408.11706 • Published 29 days ago • 5

upvoted a paper about 1 month ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6 • 58

upvoted a paper about 2 months ago

Q-Ground: Image Quality Grounding with Large Multi-modality Models

Paper • 2407.17035 • Published Jul 24 • 1

upvoted a collection about 2 months ago

Magpie-Qwen2 Datasets

Dataset built with Qwen2 72B and Qwen2 7B. • 6 items • Updated 5 days ago • 10

upvoted a paper about 2 months ago

LongVideoBench: A Benchmark for Long-context Interleaved Video-Language Understanding

Paper • 2407.15754 • Published Jul 22 • 19

upvoted a collection 2 months ago

🪐 SmolLM

A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated Aug 18 • 169

upvoted a paper 2 months ago

MJ-Bench: Is Your Multimodal Reward Model Really a Good Judge for Text-to-Image Generation?

Paper • 2407.04842 • Published Jul 5 • 52

upvoted a collection 2 months ago

InternVL 2.0

Expanding Performance Boundaries of Open-Source MLLM • 16 items • Updated Aug 10 • 72

upvoted 3 papers 3 months ago

CMC-Bench: Towards a New Paradigm of Visual Signal Compression

Paper • 2406.09356 • Published Jun 13 • 4

MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos

Paper • 2406.08407 • Published Jun 12 • 24

A-Bench: Are LMMs Masters at Evaluating AI-generated Images?

Paper • 2406.03070 • Published Jun 5 • 2

upvoted 2 collections 3 months ago

MaPO

This collection includes the models and datasets as a part of the MaPO release. • 9 items • Updated Jun 12 • 5

Qwen2

Qwen2 language models, including pretrained and instruction-tuned models of 5 sizes, including 0.5B, 1.5B, 7B, 57B-A14B, and 72B. • 39 items • Updated 1 day ago • 332

upvoted a collection 4 months ago

GLM-4

GLM-4 Open Models • 8 items • Updated 17 days ago • 99

upvoted a paper 4 months ago

The Road Less Scheduled

Paper • 2405.15682 • Published May 24 • 20

upvoted a collection 4 months ago

👑 Llama-3

My experiments with Llama-3 models • 61 items • Updated 4 days ago • 22

upvoted 2 collections 5 months ago

LLaVA++ (LLaMA-3 and Phi-3-Mini)

Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated Jun 11 • 22

LLaVA-LLaMA-3

Reproduction of various LLaVA models based on LLaMA-3 backbone. • 4 items • Updated Aug 16 • 2

upvoted an article 5 months ago

Welcome Llama 3 - Meta's new open LLM

Article • Apr 18 • 272

upvoted 2 collections 5 months ago

Meta Llama 3

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Aug 2 • 673

MGM

Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated May 3 • 46

upvoted a paper 5 months ago

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD

Paper • 2404.06512 • Published Apr 9 • 29

upvoted 2 papers 6 months ago

InternLM2 Technical Report

Paper • 2403.17297 • Published Mar 26 • 28

Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Paper • 2403.16990 • Published Mar 25 • 24

upvoted a collection 6 months ago

Common Corpus

The largest public domain dataset for training LLMs. • 27 items • Updated Jul 17 • 111

upvoted a paper 6 months ago

Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation

Paper • 2403.12015 • Published Mar 18 • 63

upvoted 4 papers 7 months ago

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

Paper • 2403.03507 • Published Mar 6 • 182

Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

Paper • 2403.03206 • Published Mar 5 • 56

Towards Open-ended Visual Quality Comparison

Paper • 2402.16641 • Published Feb 26 • 16

A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs

Paper • 2402.07116 • Published Feb 11 • 2

upvoted a paper 8 months ago

Towards A Better Metric for Text-to-Video Generation

Paper • 2401.07781 • Published Jan 15 • 14

upvoted 2 papers 9 months ago

Q-Boost: On Visual Quality Assessment Ability of Low-level Multi-Modality Foundation Models

Paper • 2312.15300 • Published Dec 23, 2023 • 2

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU

Paper • 2312.12456 • Published Dec 16, 2023 • 40

upvoted a collection 9 months ago

LLM Leaderboard best models ❤️‍🔥

A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated Jun 22 • 392

upvoted 2 papers 9 months ago

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

Paper • 2401.01117 • Published Jan 2 • 8

Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

Paper • 2312.17090 • Published Dec 28, 2023 • 4

upvoted 3 papers 10 months ago

Diffusion Model Alignment Using Direct Preference Optimization

Paper • 2311.12908 • Published Nov 21, 2023 • 47

SelfEval: Leveraging the discriminative nature of generative models for evaluation

Paper • 2311.10708 • Published Nov 17, 2023 • 14

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 26

upvoted a paper 11 months ago

Improved Baselines with Visual Instruction Tuning

Paper • 2310.03744 • Published Oct 5, 2023 • 36