
INT8 quantization gives me better accuracy than FP16! [D]

Our take

Numeric precision has a direct impact on deep learning inference quality. While comparing precisions against an FP32 baseline, I was surprised to find that INT8 post-training quantization yielded better inference accuracy than FP16. This challenged my expectations, since FP16 should in principle stay closer to FP32. I'm curious whether anyone else has run into this, and what factors might explain INT8 outperforming FP16 in this context. Let's explore this together!

Hi everyone,

I’m working on a deep learning model and I noticed something strange.

When I compare different precisions:

FP32 (baseline)
FP16
INT8 (post-training quantization)

I’m getting better inference accuracy with INT8 than FP16, which I didn’t expect.

I thought FP16 should be closer to FP32 and therefore more accurate than INT8, but in my case INT8 is actually performing better.

Has anyone seen this before? What could explain INT8 outperforming FP16 in inference?
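For anyone who wants to sanity-check the comparison, here is a rough sketch of how the three variants can be run side by side in ONNX Runtime and compared against the FP32 baseline. The model file names, input shape, and random stand-in input are placeholders, not my actual pipeline:

```python
# Rough sketch: run the same input through all three ONNX models and measure
# how far the FP16 and INT8 outputs drift from the FP32 baseline.
# Model file names and the random stand-in input are placeholders.
import numpy as np
import onnxruntime as ort

sessions = {
    "fp32": ort.InferenceSession("model_fp32.onnx", providers=["CPUExecutionProvider"]),
    "fp16": ort.InferenceSession("model_fp16.onnx", providers=["CPUExecutionProvider"]),
    "int8": ort.InferenceSession("model_int8.onnx", providers=["CPUExecutionProvider"]),
}

# Stand-in for a real validation batch.
x = np.random.rand(1, 3, 224, 224).astype(np.float32)

outputs = {}
for name, sess in sessions.items():
    input_name = sess.get_inputs()[0].name
    # An FP16 model may expect float16 inputs unless it was converted with
    # keep_io_types=True, so cast accordingly before running.
    dtype = np.float16 if name == "fp16" else np.float32
    outputs[name] = sess.run(None, {input_name: x.astype(dtype)})[0].astype(np.float32)

for name in ("fp16", "int8"):
    diff = np.abs(outputs[name] - outputs["fp32"])
    print(f"{name}: max |diff| vs FP32 = {diff.max():.5f}, mean |diff| = {diff.mean():.5f}")
```

Running something like this on real validation data would at least show whether the FP16 outputs genuinely diverge more from FP32 than the INT8 ones do, or whether the accuracy gap comes from somewhere else in the evaluation.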

Setup details:

Model exported via ONNX

FP16 used directly / INT8 via quantization

No major architecture changes
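For reference, here is roughly what the conversion pipeline looks like, assuming onnxruntime's post-training quantization utilities and onnxconverter-common for the FP16 cast. The file names are placeholders and this is a sketch rather than my exact script:

```python
# Minimal sketch (placeholder file names): derive FP16 and INT8 models
# from an existing FP32 ONNX export.
import onnx
from onnxconverter_common import float16
from onnxruntime.quantization import quantize_dynamic, QuantType

# FP16: cast the FP32 graph's tensors to float16, keeping float32 inputs/outputs.
fp32_model = onnx.load("model_fp32.onnx")
fp16_model = float16.convert_float_to_float16(fp32_model, keep_io_types=True)
onnx.save(fp16_model, "model_fp16.onnx")

# INT8: post-training dynamic quantization (weights stored as int8, activation
# scales computed at inference time). Static quantization with a calibration
# set is the other common post-training route.
quantize_dynamic("model_fp32.onnx", "model_int8.onnx", weight_type=QuantType.QInt8)
```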

submitted by /u/Fragrant_Rate_2583


Tagged with

#INT8, #FP16, #FP32, #quantization, #post-training quantization, #deep learning, #inference accuracy, #precision, #ONNX