Welcome to RealTime QA
Check out the GitHub | Check out the paper
Latest results
Multiple Choice Track
Model | Submission Time (GMT) | Original Accuracy (%) | NOTA Accuracy (%) |
---|---|---|---|
GPT-4o + Google Custom Search | 2025-05-17 03:00:00 | 75.0 | 65.0 |
Claude-3.5 Haiku + Google Custom Search | 2025-05-17 03:00:00 | 70.0 | 80.0 |
GPT-4.1 + Google Custom Search | 2025-05-17 03:00:00 | 70.0 | 55.0 |
Gemini 1.5 Flash + Google Custom Search | 2025-05-17 03:00:00 | 65.0 | 65.0 |
Claude-3.5 Sonnet + Google Custom Search | 2025-05-17 03:00:00 | 65.0 | 60.0 |
Claude-3.7 Sonnet + Google Custom Search | 2025-05-17 03:00:00 | 65.0 | 60.0 |
Llama-4-Scout-17B-16E-Instruct + Google Custom Search | 2025-05-17 03:00:00 | 60.0 | 60.0 |
Llama-4-Maverick-17B-128E-Instruct + Google Custom Search | 2025-05-17 03:00:00 | 60.0 | 55.0 |
Claude-3.5 Haiku | 2025-05-17 03:00:00 | 55.0 | 65.0 |
GPT-4.1 | 2025-05-17 03:00:00 | 55.0 | 50.0 |
Gemini 1.5 Flash | 2025-05-17 03:00:00 | 45.0 | 40.0 |
Claude-3.7 Sonnet | 2025-05-17 03:00:00 | 45.0 | 30.0 |
GPT-4o | 2025-05-17 03:00:00 | 40.0 | 40.0 |
Llama-4-Maverick-17B-128E-Instruct | 2025-05-17 03:00:00 | 40.0 | 40.0 |
Claude-3.5 Sonnet | 2025-05-17 03:00:00 | 35.0 | 35.0 |
Llama-4-Scout-17B-16E-Instruct | 2025-05-17 03:00:00 | 30.0 | 30.0 |
Gemini 2.0 Flash + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 0.0 |
Gemini 2.0 Flash | 2025-05-17 03:00:00 | 0.0 | 0.0 |
Generation Track
Model | Submission Time (GMT) | EM (%) | F1 (%) |
---|---|---|---|
Llama-4-Scout-17B-16E-Instruct + Google Custom Search | 2025-05-17 03:00:00 | 25.0 | 31.7 |
Gemini 1.5 Flash + Google Custom Search | 2025-05-17 03:00:00 | 15.0 | 22.8 |
Gemini 2.0 Flash + Google Custom Search | 2025-05-17 03:00:00 | 5.0 | 10.0 |
Llama-4-Scout-17B-16E-Instruct | 2025-05-17 03:00:00 | 5.0 | 9.8 |
GPT-4o | 2025-05-17 03:00:00 | 5.0 | 9.8 |
GPT-4.1 | 2025-05-17 03:00:00 | 5.0 | 8.1 |
Gemini 2.0 Flash | 2025-05-17 03:00:00 | 5.0 | 7.8 |
Gemini 1.5 Flash | 2025-05-17 03:00:00 | 5.0 | 7.0 |
Claude-3.7 Sonnet + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 12.9 |
GPT-4o + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 11.0 |
Claude-3.5 Haiku | 2025-05-17 03:00:00 | 0.0 | 10.2 |
Llama-4-Maverick-17B-128E-Instruct + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 9.9 |
GPT-4.1 + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 9.1 |
Claude-3.5 Haiku + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 9.0 |
Claude-3.5 Sonnet + Google Custom Search | 2025-05-17 03:00:00 | 0.0 | 8.6 |
Claude-3.5 Sonnet | 2025-05-17 03:00:00 | 0.0 | 4.5 |
Llama-4-Maverick-17B-128E-Instruct | 2025-05-17 03:00:00 | 0.0 | 1.6 |
Claude-3.7 Sonnet | 2025-05-17 03:00:00 | 0.0 | 0.6 |
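The EM and F1 columns above are the usual exact-match and token-overlap scores for short-answer QA. The sketch below shows one common way to compute them, assuming SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles). The exact normalization used by the RealTime QA scorer is not stated on this page, so treat `normalize`, `exact_match`, `token_f1`, and `score` as illustrative names rather than the official evaluation code.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the official scorer may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def score(predictions, golds):
    """Return (EM, F1) as percentages averaged over all questions."""
    n = len(golds)
    em = 100.0 * sum(exact_match(p, g) for p, g in zip(predictions, golds)) / n
    f1 = 100.0 * sum(token_f1(p, g) for p, g in zip(predictions, golds)) / n
    return em, f1
```

Under this definition a prediction can earn partial F1 credit while scoring zero on EM, which is consistent with rows in the table that show an EM of 0.0 alongside a nonzero F1.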
Make a new submission
Download the latest set of RealTime QA questions (link).
Submit your model predictions (submission form).
Submission examples (.jsonl file) are available here.
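Since submissions are .jsonl files (one JSON object per line), a minimal sketch of writing and re-reading such a file may help. The field names `question_id` and `prediction` are hypothetical placeholders chosen for illustration; consult the linked submission examples for the actual schema.

```python
import json


def write_predictions(records, path):
    """Write one JSON object per line (the JSON Lines layout)."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_predictions(path):
    """Read a .jsonl file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical records; the real field names may differ.
example_records = [
    {"question_id": "example_0", "prediction": "Paris"},
    {"question_id": "example_1", "prediction": "42"},
]
```

Reading the file back with `read_predictions` before uploading is a cheap way to confirm that every line parses as valid JSON.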