Results on April 04, 2025
  Multiple Choice Track
| Model | Submission Time (GMT) | Original | NOTA | 
|---|---|---|---|
| Claude-3.5 Haiku + Google Custom Search | 2025-04-05 03:00:00 | 80.0 | 50.0 | 
| GPT-4o | 2025-04-05 03:00:00 | 65.0 | 50.0 | 
| Claude-3.5 Haiku | 2025-04-05 03:00:00 | 65.0 | 50.0 | 
| GPT-4o + Google Custom Search | 2025-04-05 03:00:00 | 65.0 | 45.0 | 
| Gemini 1.5 Flash + Google Custom Search | 2025-04-05 03:00:00 | 65.0 | 45.0 | 
| Claude-3.5 Sonnet + Google Custom Search | 2025-04-05 03:00:00 | 65.0 | 35.0 | 
| Claude-3.5 Sonnet | 2025-04-05 03:00:00 | 60.0 | 35.0 | 
| GPT-3.5 Turbo + Google Custom Search | 2025-04-05 03:00:00 | 60.0 | 25.0 | 
| Llama3.1-405B-Instruct + Google Custom Search | 2025-04-05 03:00:00 | 50.0 | 50.0 | 
| Gemini 1.5 Flash | 2025-04-05 03:00:00 | 45.0 | 45.0 | 
| GPT-3.5 Turbo | 2025-04-05 03:00:00 | 45.0 | 25.0 | 
| Llama3.1-405B-Instruct | 2025-04-05 03:00:00 | 25.0 | 25.0 | 
  Generation Track
| Model | Submission Time (GMT) | EM | F1 | 
|---|---|---|---|
| GPT-4o + Google Custom Search | 2025-04-05 03:00:00 | 10.0 | 20.1 | 
| GPT-4o | 2025-04-05 03:00:00 | 10.0 | 19.5 | 
| Llama3.1-405B-Instruct | 2025-04-05 03:00:00 | 5.0 | 16.6 | 
| Llama3.1-405B-Instruct + Google Custom Search | 2025-04-05 03:00:00 | 5.0 | 14.0 | 
| Gemini 1.5 Flash | 2025-04-05 03:00:00 | 5.0 | 12.7 | 
| Claude-3.5 Sonnet + Google Custom Search | 2025-04-05 03:00:00 | 0.0 | 11.2 | 
| Claude-3.5 Sonnet | 2025-04-05 03:00:00 | 0.0 | 10.1 | 
| Claude-3.5 Haiku | 2025-04-05 03:00:00 | 0.0 | 8.6 | 
| Gemini 1.5 Flash + Google Custom Search | 2025-04-05 03:00:00 | 0.0 | 8.4 | 
| GPT-3.5 Turbo + Google Custom Search | 2025-04-05 03:00:00 | 0.0 | 8.3 | 
| GPT-3.5 Turbo | 2025-04-05 03:00:00 | 0.0 | 6.5 | 
| Claude-3.5 Haiku + Google Custom Search | 2025-04-05 03:00:00 | 0.0 | 4.2 |