Results on September 05, 2025
Multiple Choice Track
Model | Submission Time (GMT) | Original | NOTA |
---|---|---|---|
GPT-4.1 + Google Custom Search | 2025-09-06 03:00:00 | 70.0 | 50.0 |
Claude-3.5 Sonnet + Google Custom Search | 2025-09-06 03:00:00 | 70.0 | 50.0 |
GPT-4o + Google Custom Search | 2025-09-06 03:00:00 | 70.0 | 40.0 |
Claude-3.5 Sonnet | 2025-09-06 03:00:00 | 60.0 | 50.0 |
Claude-3.5 Haiku + Google Custom Search | 2025-09-06 03:00:00 | 60.0 | 40.0 |
Claude-3.5 Haiku | 2025-09-06 03:00:00 | 60.0 | 40.0 |
Claude-3.7 Sonnet + Google Custom Search | 2025-09-06 03:00:00 | 50.0 | 50.0 |
Claude-3.7 Sonnet | 2025-09-06 03:00:00 | 50.0 | 40.0 |
GPT-4.1 | 2025-09-06 03:00:00 | 50.0 | 30.0 |
GPT-4o | 2025-09-06 03:00:00 | 40.0 | 40.0 |
Gemini 1.5 Flash + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 0.0 |
Gemini 1.5 Flash | 2025-09-06 03:00:00 | 0.0 | 0.0 |
Gemini 2.0 Flash + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 0.0 |
Gemini 2.0 Flash | 2025-09-06 03:00:00 | 0.0 | 0.0 |
Generation Track
Model | Submission Time (GMT) | EM | F1 |
---|---|---|---|
Gemini 1.5 Flash + Google Custom Search | 2025-09-06 03:00:00 | 20.0 | 24.2 |
GPT-4o + Google Custom Search | 2025-09-06 03:00:00 | 10.0 | 16.3 |
GPT-4.1 | 2025-09-06 03:00:00 | 10.0 | 14.6 |
Claude-3.7 Sonnet + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 14.7 |
Gemini 2.0 Flash + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 10.4 |
Claude-3.5 Sonnet + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 8.8 |
GPT-4o | 2025-09-06 03:00:00 | 0.0 | 8.4 |
GPT-4.1 + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 7.4 |
Claude-3.5 Haiku | 2025-09-06 03:00:00 | 0.0 | 6.5 |
Gemini 2.0 Flash | 2025-09-06 03:00:00 | 0.0 | 5.1 |
Claude-3.5 Haiku + Google Custom Search | 2025-09-06 03:00:00 | 0.0 | 4.7 |
Claude-3.7 Sonnet | 2025-09-06 03:00:00 | 0.0 | 4.0 |
Gemini 1.5 Flash | 2025-09-06 03:00:00 | 0.0 | 3.3 |
Claude-3.5 Sonnet | 2025-09-06 03:00:00 | 0.0 | 1.7 |