OpenAI employees publicly accuse xAI of publishing misleading Grok 3 benchmark results

Jinshi Data, February 23: An OpenAI employee has publicly accused xAI, Elon Musk's AI company, of publishing misleading benchmark results for its latest AI model, Grok 3. In response, xAI co-founder Igor Babuschkin insisted that the company did nothing wrong. xAI's charts show two versions of Grok 3, Grok 3 Reasoning Beta and Grok 3 mini Reasoning, outperforming o3-mini-high, OpenAI's strongest currently available model, on AIME 2025. However, OpenAI employees quickly pointed out on X that xAI's charts omitted o3-mini-high's AIME 2025 score under the "cons@64" condition (short for "consensus@64", in which a model is given 64 attempts at each problem and its most frequent answer is taken as final). Babuschkin countered on X that OpenAI had itself published similarly misleading benchmark charts in the past, although those charts compared the performance of OpenAI's own models.
