What are "raw" metrics?

#856
by aginart-salesforce - opened

What is the difference between the "raw" scores and the scores used in the leaderboard?

And how can you convert from the lm-eval outputs to the scores in the leaderboard? Does the lm-eval-harness output the raw score?

Open LLM Leaderboard org

Hi @aginart-salesforce ,

Please check this page in our documentation about score normalization. If anything remains unclear, you can ping me here and I'll try to explain :)

alozowski changed discussion status to closed

I understand the logic of score normalization, but is this normalization applied when evaluating a model locally for the leaderboard? When I run a local evaluation, I only get the original (raw) score, not the normalized one. Thank you @alozowski

Open LLM Leaderboard org

Hi! No, when doing a local evaluation you need to compute the normalized scores yourself (using the snippets in the doc). We will soon provide scripts to reproduce the scores precisely.
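
For readers running local evaluations, here is a minimal sketch of what such a normalization can look like. It assumes the raw score is rescaled between a random-guess baseline and the maximum score, with anything below the baseline clipped to 0. The function name, its parameters, and the example baseline of 0.25 (a four-choice multiple-choice task) are illustrative assumptions, not the leaderboard's official script; the actual baselines per task should be taken from the documentation.

```python
def normalize_within_range(raw_score, lower_bound=0.0, higher_bound=1.0):
    """Rescale raw_score so lower_bound maps to 0 and higher_bound maps to 100.

    lower_bound is assumed to be the random-guess baseline for the task
    (e.g. 0.25 for four answer choices, 0.0 for free-form generation).
    """
    if higher_bound <= lower_bound:
        raise ValueError("higher_bound must be greater than lower_bound")
    # Scores below the baseline are clipped so they normalize to 0.
    clipped = max(raw_score - lower_bound, 0.0)
    return clipped / (higher_bound - lower_bound) * 100.0


# Hypothetical raw accuracy of 0.40 on a four-choice task (baseline 0.25).
normalized = normalize_within_range(0.40, lower_bound=0.25)
print(f"raw=0.40 -> normalized={normalized:.2f}")
```

With these assumed values, a raw accuracy of 0.40 lands at roughly 20 on the normalized scale, since it sits one fifth of the way between the 0.25 baseline and a perfect score.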
