What method of quantization was used to make this model?

by CCRss - opened

Great family of models!
I'm interested: is it just bnb quantization to 4-bit, or is it some closed method that you can't publish?

OpenBMB org • edited Aug 7

For the time being, it's just bnb.
Other quantization frameworks have difficulty supporting multimodal models, and we don't have enough energy to develop them yet.
Alternatively, you can use our GGUF version, which is also int4, here: https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf
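For reference, on-the-fly bnb 4-bit loading of the full-precision checkpoint looks roughly like the sketch below (a minimal example using transformers + bitsandbytes; the nf4/double-quant settings are my assumptions, not necessarily what was used for the published int4 weights):

```python
import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

# Assumed bnb settings; the exact config behind the released int4 weights isn't stated in this thread.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-V-2_6",        # full-precision base checkpoint
    trust_remote_code=True,          # MiniCPM-V ships custom modeling code
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM-V-2_6", trust_remote_code=True)
```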

Thanks for the reply. 🙏 That's a pity. Maybe I will try to do AWQ quantization of this model myself, if my supervisor accepts it.
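For anyone who wants to experiment, the usual AutoAWQ recipe for a text-only checkpoint looks roughly like the sketch below. Whether it applies cleanly to MiniCPM-V-2_6's multimodal architecture is exactly the open question (the vision encoder would likely need to be skipped or handled separately), and the quant settings here are just common defaults, nothing confirmed by OpenBMB:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "openbmb/MiniCPM-V-2_6"   # base checkpoint
quant_path = "MiniCPM-V-2_6-awq"       # hypothetical output directory

# Typical AWQ defaults; untested on this multimodal model.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Calibrates on AutoAWQ's default dataset and quantizes the language-model weights.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```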

@CCRss I am also interested in AWQ/MLC quantization for running on Jetson devices. Maybe we can join efforts at some point.

@Aniel99 I'd be happy to. You can contact me via Twitter: https://x.com/VladimirAlbrek1
