Welcome to RealTime QA
Latest results
Multiple Choice Track
| Model | Submission Time (GMT) | Original (accuracy, %) | NOTA (accuracy, %) |
|---|---|---|---|
| GPT-4o + Google Custom Search | 2026-01-17 03:00:00 | 70.0 | 50.0 |
| GPT-4o | 2026-01-17 03:00:00 | 70.0 | 40.0 |
| GPT-4.1 | 2026-01-17 03:00:00 | 70.0 | 40.0 |
| Claude-3.7 Sonnet + Google Custom Search | 2026-01-17 03:00:00 | 60.0 | 70.0 |
| GPT-4.1 + Google Custom Search | 2026-01-17 03:00:00 | 60.0 | 50.0 |
| Claude-3.5 Haiku | 2026-01-17 03:00:00 | 60.0 | 40.0 |
| Claude-3.7 Sonnet | 2026-01-17 03:00:00 | 60.0 | 40.0 |
| Claude-3.5 Haiku + Google Custom Search | 2026-01-17 03:00:00 | 50.0 | 50.0 |
| Gemini 2.0 Flash + Google Custom Search | 2026-01-17 03:00:00 | 0.0 | 0.0 |
| Gemini 2.0 Flash | 2026-01-17 03:00:00 | 0.0 | 0.0 |
Generation Track
| Model | Submission Time (GMT) | EM | F1 |
|---|---|---|---|
| GPT-4o | 2026-01-17 03:00:00 | 10.0 | 24.1 |
| Gemini 2.0 Flash + Google Custom Search | 2026-01-17 03:00:00 | 10.0 | 17.7 |
| GPT-4.1 | 2026-01-17 03:00:00 | 10.0 | 17.2 |
| Gemini 2.0 Flash | 2026-01-17 03:00:00 | 0.0 | 11.3 |
| GPT-4o + Google Custom Search | 2026-01-17 03:00:00 | 0.0 | 11.0 |
| Claude-3.7 Sonnet + Google Custom Search | 2026-01-17 03:00:00 | 0.0 | 8.1 |
| GPT-4.1 + Google Custom Search | 2026-01-17 03:00:00 | 0.0 | 7.0 |
| Claude-3.5 Haiku + Google Custom Search | 2026-01-17 03:00:00 | 0.0 | 6.4 |
| Claude-3.7 Sonnet | 2026-01-17 03:00:00 | 0.0 | 3.0 |
| Claude-3.5 Haiku | 2026-01-17 03:00:00 | 0.0 | 0.0 |
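The EM and F1 columns above can be illustrated with a minimal sketch. It assumes SQuAD-style scoring (lowercasing, stripping punctuation and English articles, then comparing tokens); the page does not spell out RealTime QA's exact normalization, so the function names and normalization rules below are assumptions, not the official evaluation script.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop punctuation and English articles, split on whitespace.

    Assumption: SQuAD-style normalization; the official script may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized answers are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a partially correct answer earns no EM credit but some F1 credit, which is one way F1 can be well above EM for the same model in the table.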
Make a new submission
Download the latest RealTime QA question set (link).
Submit your model predictions (submission form).
Submission examples (.jsonl files) are available here.
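As a rough illustration of the .jsonl format in general (one JSON object per line), the sketch below serializes a set of predictions. The field names `question_id` and `prediction` and the file name are hypothetical placeholders; the official submission examples define the actual schema and should be followed instead.

```python
import json


def to_jsonl(predictions: dict[str, str]) -> str:
    """Serialize predictions as JSON Lines: one JSON object per line.

    Assumption: "question_id" and "prediction" are hypothetical field names,
    not the schema required by the submission form.
    """
    lines = [
        json.dumps({"question_id": qid, "prediction": answer}, ensure_ascii=False)
        for qid, answer in predictions.items()
    ]
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical output path and toy data, for illustration only.
    with open("predictions.jsonl", "w", encoding="utf-8") as f:
        f.write(to_jsonl({"q1": "Paris", "q2": "0"}))
```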