AI Models Outperform Humans at 2025 ICPC World Finals

GPT-5 scored a perfect 12/12 and Gemini 2.5 Deep Think solved 10/12 under supervised conditions at the ICPC World Finals. Their results surpassed the top human teams and underscore LLMs' growing algorithmic reasoning ability and enterprise promise.

TL;DR

  • GPT-5: perfect 12/12, matching gold-medal level.
  • Gemini 2.5 Deep Think: solved 10/12 problems in 677 minutes (second-place equivalent).
  • Best human team solved 11/12; official human gold teams: St. Petersburg State University, University of Tokyo, Beijing Jiaotong University, Tsinghua University; Harvard and MIT were top U.S. silver finishers.
  • Finals featured 139 universities from at least 103 countries; same 12 problems and five-hour limit for all entrants.
  • GPT-5 gave correct first answers on 11 problems and required multiple attempts on the hardest one.
  • Gemini solved eight problems within 45 minutes and two more within three hours.
  • Both model entries were supervised and followed ICPC rules (did not compete as human teams).
  • Gemini’s duct-flow solution combined dynamic programming with a priority-value representation, leveraged the minimax theorem, and used nested ternary searches across a convex “bowl-shaped” solution space to find optimal priorities.
  • OpenAI stated GPT-5’s performance did not come from ICPC-specific training; Google called the entrant an “advanced version” of Gemini 2.5 Deep Think.
  • Implication: foundation models can execute complex abstract reasoning and layered algorithmic problem solving under timed contest constraints.
  • Demonstrated capability breadth: combining dynamic programming, convex search, and minimax reasoning.
  • Potential utility: support for intricate engineering workflows and optimization tasks in enterprise settings.
  • Gemini earlier recorded a top performance at the IMO this year.

AI models top the 2025 ICPC World Finals AI track

OpenAI’s GPT-5 and Google DeepMind’s Gemini 2.5 Deep Think competed in the ICPC World Finals AI track and solved algorithmic problems at a level that surpassed the human field at this year’s event. The models tackled the same set of twelve problems under the same five-hour limit as the human teams, with submissions judged concurrently by the event’s judges.

Results at a glance

  • GPT-5 achieved a perfect score (12/12), matching the equivalent of a gold-medal performance.
  • Gemini 2.5 Deep Think solved 10 of 12 problems in 677 minutes, a result that would place it second overall.
  • The best human team solved 11 of 12 problems. Official human gold-medal teams were St. Petersburg State University, the University of Tokyo, Beijing Jiaotong University, and Tsinghua University; Harvard and MIT were the top U.S. finishers at the silver level.
  • The finals featured 139 universities from at least 103 countries, with every competitor receiving the same problems and time limit.

OpenAI reported that GPT-5 answered 11 of the 12 problems correctly on its first attempt and needed multiple attempts only for the hardest one, while Google reported that Gemini solved eight problems within the first 45 minutes and two more within three hours. Neither model competed as a human team; both entries were supervised and followed ICPC rules.

How the models approached difficult problems

One problem that eluded all human teams involved distributing liquid through a network of ducts. Gemini’s solution combined dynamic programming with an analytical insight: representing each reservoir with a priority value and searching for the values that made the resulting flow constraints tightest. That approach leveraged the minimax theorem and used nested ternary searches across a bowl-shaped convex solution space to quickly locate optimal priorities and derive an optimal duct configuration.
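
To make the search step concrete, here is a minimal, self-contained sketch of nested ternary search over a two-variable convex (“bowl-shaped”) objective, the kind of structure described above. The objective f below is a stand-in placeholder, not the contest’s duct-flow model, and the sketch omits the dynamic-programming and minimax components of Gemini’s actual solution.

```python
# Sketch: nested ternary search over a 2-D convex ("bowl-shaped") objective.
# The objective below is a placeholder, NOT the ICPC duct-flow problem.

def ternary_search(f, lo, hi, iters=100):
    """Minimize a unimodal (e.g. convex) 1-D function f on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            hi = m2          # minimum lies in [lo, m2]
        else:
            lo = m1          # minimum lies in [m1, hi]
    return (lo + hi) / 2

def minimize_2d(f, lo, hi, iters=100):
    """Nested ternary search: the outer search fixes x, the inner search
    minimizes over y. Convexity keeps both 1-D problems unimodal."""
    def best_over_y(x):
        y = ternary_search(lambda y: f(x, y), lo, hi, iters)
        return f(x, y)

    x = ternary_search(best_over_y, lo, hi, iters)
    y = ternary_search(lambda y: f(x, y), lo, hi, iters)
    return x, y, f(x, y)

if __name__ == "__main__":
    # Placeholder convex objective with its minimum at (1.5, -2.0).
    f = lambda x, y: (x - 1.5) ** 2 + (y + 2.0) ** 2
    print(minimize_2d(f, -10.0, 10.0))
```

The technique relies on convexity: for a fixed outer variable the objective is unimodal in the inner variable, and the partial minimum over the inner variable is itself convex (hence unimodal) in the outer one, so each ternary-search step can safely discard a third of its interval.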

OpenAI indicated that GPT-5’s performance did not stem from ICPC-specific training. Google described the entrant as an “advanced version” of Gemini 2.5 Deep Think.

What this means for developers and enterprise AI

The ICPC results show that foundation models can now carry out complex abstract reasoning and nontrivial algorithmic problem solving under contest constraints. For development teams, the demonstrations highlight two practical points:

  • Capability breadth: these models can combine algorithmic techniques—dynamic programming, convex search, minimax reasoning—rather than only retrieving patterns from training data.
  • Potential utility: increasingly sophisticated reasoning may unlock AI assistance for intricate engineering workflows and optimization tasks in enterprise settings.

Earlier in the year, Gemini also registered a top performance at the International Mathematical Olympiad, reinforcing that progress in mathematical reasoning is not isolated to a single event.

Broader context

Performances like these reignite conversations about the path toward more general problem-solving systems. The contest results document concrete instances of models executing layered algorithmic strategies and producing contest-level code under timed conditions, which contributes to ongoing evaluation of capabilities in research and industry settings.

Original source: https://venturebeat.com/ai/google-and-openais-coding-wins-at-university-competition-show-enterprise-ai?
