What method of quantization was used to make this model?

by CCRss - opened

Great family of models!
I'm interested: is it just bnb quantization to 4-bit, or is it some closed method that you can't publish?

OpenBMB org • edited Aug 7

For the time being, it's just bnb.
Other quantization frameworks have difficulty supporting multimodal models, and we don't have enough energy to develop them yet.
Alternatively, you can use our GGUF version, which is also int4, here: https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf
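For reference, on-the-fly bnb 4-bit loading of the full-precision checkpoint looks roughly like the sketch below (a minimal example using transformers + bitsandbytes; the nf4/double-quant settings are my assumptions, not necessarily what was used for the published int4 weights):

```python
import torch
from transformers import AutoModel, AutoTokenizer, BitsAndBytesConfig

# Assumed bnb settings; the exact config behind the released int4 weights isn't stated in this thread.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModel.from_pretrained(
    "openbmb/MiniCPM-V-2_6",        # full-precision base checkpoint
    trust_remote_code=True,          # MiniCPM-V ships custom modeling code
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("openbmb/MiniCPM-V-2_6", trust_remote_code=True)
```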

Thanks for the reply. 🙏 That's a pity. Maybe I will try to do AWQ quantization of this model myself, if my supervisor accepts it.
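For anyone who wants to experiment, the usual AutoAWQ recipe for a text-only checkpoint looks roughly like the sketch below. Whether it applies cleanly to MiniCPM-V-2_6's multimodal architecture is exactly the open question (the vision encoder would likely need to be skipped or handled separately), and the quant settings here are just common defaults, nothing confirmed by OpenBMB:

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "openbmb/MiniCPM-V-2_6"   # base checkpoint
quant_path = "MiniCPM-V-2_6-awq"       # hypothetical output directory

# Typical AWQ defaults; untested on this multimodal model.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}

model = AutoAWQForCausalLM.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Calibrates on AutoAWQ's default dataset and quantizes the language-model weights.
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```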

@CCRss I am also interested in AWQ/MLC quantization for running on Jetson devices. Maybe we can join efforts at some point.

@Aniel99 I'd be happy to. You can contact me via Twitter: https://x.com/VladimirAlbrek1
