ARC Prize launches its toughest AI benchmark yet: ARC-AGI-2

Introduction to ARC Prize and ARC-AGI-2

The ARC Prize has launched the ARC-AGI-2 benchmark, alongside the announcement of its 2025 competition with $1 million in prizes. As AI progresses from performing narrow tasks to demonstrating general, adaptive intelligence, the ARC-AGI-2 challenges aim to uncover capability gaps and actively guide innovation. The ARC Prize team states that good AGI benchmarks act as useful progress indicators, better AGI benchmarks clearly discern capabilities, and the best AGI benchmarks do all this while also inspiring research and guiding innovation.

Beyond Memorisation

Since its introduction in 2019, the ARC-AGI benchmark has served as a “North Star” for researchers striving toward AGI. ARC-AGI-1 leaned into measuring fluid intelligence, a clear departure from datasets that reward memorisation alone. The mission of ARC Prize is also forward-looking: it aims to accelerate timelines for scientific breakthroughs, with benchmarks designed not just to measure progress but to inspire new ideas.

ARC-AGI-2: Closing the Human-Machine Gap

The ARC-AGI-2 benchmark is tougher for AI yet remains accessible to humans. Frontier AI reasoning systems still score in the single-digit percentages on ARC-AGI-2, while every task has been solved by humans in under two attempts. The benchmark's datasets vary in visibility, and its tasks probe symbolic interpretation, compositional reasoning, and contextual rule application, areas where humans excel and AI still struggles.

The Role of Efficiency

Measuring cost per task is essential because intelligence is not just the capability to solve problems but the ability to do so efficiently. Efficiency gaps between humans and frontier AI systems are already visible: a human panel solves ARC-AGI-2 tasks with 100% accuracy at $17 per task, while OpenAI o3 has an estimated 4% success rate at $200 per task. These figures underline the disparity in adaptability and resource consumption between humans and AI.
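
To make the efficiency gap concrete, the figures above can be turned into a rough "cost per solved task" by dividing the cost per task by the success rate. This derived metric is an illustrative assumption, not one the ARC Prize team is quoted as using, and the function name below is hypothetical. A minimal Python sketch:

```python
def cost_per_solved_task(cost_per_task: float, success_rate: float) -> float:
    """Expected spend, in dollars, to obtain one correct solution.

    Assumes every attempt costs the same and succeeds independently
    with probability `success_rate` (an illustrative simplification).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in the range (0, 1]")
    return cost_per_task / success_rate


# Figures quoted in the article: cost per task and success rate.
human_panel = cost_per_solved_task(cost_per_task=17.0, success_rate=1.00)
o3_estimate = cost_per_solved_task(cost_per_task=200.0, success_rate=0.04)

# How many times more expensive one solved task is for o3 than for the panel.
gap = o3_estimate / human_panel

print(f"Human panel: ${human_panel:,.0f} per solved task")
print(f"OpenAI o3 (estimated): ${o3_estimate:,.0f} per solved task")
print(f"Gap: about {gap:.0f}x")
```

Under these assumptions, the human panel comes out at $17 per solved task and o3 at about $5,000, a gap of roughly 294 times. The o3 inputs are themselves estimates, so the result is an order-of-magnitude illustration rather than a measurement.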

ARC Prize 2025

ARC Prize 2025 launches on Kaggle this week, promising $1 million in total prizes and showcasing a live leaderboard for open-source breakthroughs. The contest aims to drive progress toward systems that can efficiently tackle ARC-AGI-2 challenges. Among the prize categories are a grand prize of $700,000 for reaching 85% success within Kaggle efficiency limits, a top score prize of $75,000 for the highest-scoring submission, and a paper prize of $50,000 for transformative ideas contributing to solving ARC-AGI tasks.

Conclusion

The launch of ARC-AGI-2 and ARC Prize 2025 marks a significant step in measuring progress toward artificial general intelligence (AGI). By focusing on tasks that are easy for humans but challenging for AI, and by emphasising efficiency, the ARC Prize encourages innovation and collaboration among researchers. The 2025 competition, with its substantial prizes, is set to drive progress in the field, potentially leading to breakthroughs in efficient general systems.

FAQs

What is the ARC Prize?
The ARC Prize is an initiative that aims to guide innovation and measure progress towards achieving artificial general intelligence (AGI) through the creation of challenging benchmarks.
What is ARC-AGI-2?
ARC-AGI-2 is a benchmark designed to test AI systems’ ability to perform tasks that are relatively easy for humans but challenging for AI, focusing on aspects like symbolic interpretation, compositional reasoning, and contextual rule application.
What is the focus of the ARC Prize 2025 competition?
The ARC Prize 2025 competition focuses on driving progress toward systems that can efficiently tackle ARC-AGI-2 challenges, with an emphasis on efficiency and innovation.
How can I participate in the ARC Prize 2025 competition?
The competition is hosted on Kaggle, and participants can register and submit their solutions to compete for the prizes.
What are the key characteristics of the ARC-AGI-2 benchmark?
The ARC-AGI-2 benchmark includes tasks that require symbolic interpretation, compositional reasoning, and contextual rule application, which are areas where current AI systems struggle but humans perform well.