Research
LMSYS Arena drops live leaderboard for three months of review
Citing ongoing research into gameability, LMSYS is pausing live Elo updates through July pending a methodology refresh.
LMSYS announced a three-month pause on live Elo updates for Chatbot Arena while the team reviews style-bias and prompt-quality concerns raised by outside researchers.
The decision is significant: for two years, Arena Elo has been the single most-watched signal in AI model evaluation. A pause, even one made for methodological reasons, creates space for competitors like WildBench, LiveBench, and Artificial Analysis to establish stronger market positions.
We’ll continue to show the last published Elo ratings on the leaderboard with a clear freeze marker, and we’ve begun tracking WildBench as a secondary signal.
Byline
Inference Index