---
license: mit
---

A series of models for testing the benefits of CoreML joint compression on iOS 18/macOS 15.

# mlp-*.mlpackage

A simple Up/Gate/SiLU/Down MLP block, repeated four times, using the Llama 2 7B dimensions.

All models use the 'CPU and Neural Engine' compute unit; latencies were measured in Xcode.

|Device|Model        |Precision          |Minimum (ms)|Median (ms)|
|:--   |:--          |:--                |--:         |--:        |
|M1 Max|mlp-float16  |float16            |19.30       |19.42      |
|M1 Max|mlp-4bit     |4-bit LUT          |5.93        |5.98       |
|M1 Max|mlp-2bit     |2-bit LUT          |5.92        |6.11       |
|M1 Max|mlp-4bit-int8|4-bit int8 LUT + A8|6.02        |6.31       |
|M1 Max|mlp-2bit-int8|2-bit int8 LUT + A8|6.00        |6.18       |
|M1 Max|mlp-int8-int8|W8A8               |9.78        |9.94       |
|M4    |mlp-4bit     |4-bit LUT          |-           |4.19       |
|M4    |mlp-2bit     |2-bit LUT          |-           |3.83       |
|M4    |mlp-4bit-int8|4-bit int8 LUT + A8|-           |4.14       |
|M4    |mlp-2bit-int8|2-bit int8 LUT + A8|-           |3.83       |
|M4    |mlp-int8-int8|W8A8               |-           |8.18       |

# Download

```
huggingface-cli download \
--local-dir . \
--local-dir-use-symlinks False \
smpanaro/coreml-joint-compression-test \
--include "*.mlpackage/*"
```
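
# Load and predict

Once downloaded, a package can be loaded and exercised from Python with coremltools using the same 'CPU and Neural Engine' compute unit as the table above. This is a minimal sketch, not part of the release: the input name (`"input"`) and shape (1 x 4096, the Llama 2 7B hidden size) are assumptions; check the model's actual description (e.g. `model.get_spec()` or the Xcode model viewer) before running.

```python
import numpy as np
import coremltools as ct

# Load one of the downloaded packages, restricted to CPU + Neural Engine.
model = ct.models.MLModel(
    "mlp-4bit.mlpackage",
    compute_units=ct.ComputeUnit.CPU_AND_NE,
)

# Assumed input: a single feature named "input" of shape (1, 4096).
# Inspect the model description for the real name and shape.
x = np.random.rand(1, 4096).astype(np.float32)
out = model.predict({"input": x})
print({name: value.shape for name, value in out.items()})
```

For latency numbers comparable to the table, use an Xcode performance report on the compiled model rather than timing `predict` calls from Python.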