What is Average H4 Score?
#2, opened by adnan-ahmad-tub
What does the term H4 stand for?
I believe it's the average of four Hugging Face automatic evaluation metrics, namely:
- ARC (25-s)
- HellaSwag (10-s)
- MMLU (5-s)
- TruthfulQA (MC) (0-s)
But I'm not sure. Can someone please confirm?
Thanks!
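To make the reading above concrete, here is a minimal sketch of the computation, assuming H4 is a plain unweighted mean of the four benchmark scores. The score values and the helper name `h4_average` are made up for illustration; they are not taken from any leaderboard.

```python
from statistics import mean

# The four benchmarks listed above (shot counts in parentheses).
H4_BENCHMARKS = (
    "ARC (25-s)",
    "HellaSwag (10-s)",
    "MMLU (5-s)",
    "TruthfulQA (MC) (0-s)",
)


def h4_average(scores: dict) -> float:
    """Unweighted mean of the four benchmark scores.

    Raises KeyError if any of the four benchmarks is missing.
    """
    return mean(scores[name] for name in H4_BENCHMARKS)


# Hypothetical scores for an imaginary model (not real leaderboard data).
example_scores = {
    "ARC (25-s)": 60.0,
    "HellaSwag (10-s)": 80.0,
    "MMLU (5-s)": 50.0,
    "TruthfulQA (MC) (0-s)": 40.0,
}

print(f"H4 average: {h4_average(example_scores):.2f}")
```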
Yes, that's exactly what it is. The llm-perf leaderboard currently reports the average score from the Open LLM Leaderboard for every hardware+backend configuration of a given model. We are assuming that the configuration causes no quality degradation, which is not always true: for example, some models whose weights are originally in float32 may score lower, or even behave unexpectedly, when loaded in float16.
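For a rough sense of why float16 loading can change results, the sketch below rounds a few values to half precision using only the standard library. The sample values are hypothetical and chosen to show rounding and underflow; this only illustrates numeric precision and says nothing about how any particular model or backend behaves.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_float16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Hypothetical "weights": an ordinary value, a tiny one, and a larger one.
sample_values = [0.1, 1e-8, 1234.5678]

for value in sample_values:
    v32 = to_float32(value)
    v16 = to_float16(value)
    # The float16 value carries far fewer significant bits, and very
    # small magnitudes can underflow to zero.
    print(f"original={value!r} float32={v32!r} float16={v16!r}")
```

Values outside the float16 range (magnitudes above roughly 65504) cannot be represented at all; `struct.pack` raises `OverflowError` for them.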
IlyasMoutawwakil changed discussion status to closed