The rapid rise of Large Language Models (LLMs) has posed a pressing question for our community: is AI about to replace human competitive programmers, or is it simply becoming another tool? This talk cuts through the hype by examining the real algorithmic abilities of state-of-the-art AI through three recent milestones, tracing its role from contestant, to problem setter, to heuristic optimizer.
First, we examine AI as a competitor through LiveCodeBench Pro (NeurIPS 2025). To avoid data contamination from online archives, we evaluate models in real time on fresh Codeforces, ICPC, and IOI problems before solutions are published. The results reveal a clear divide: models do well on knowledge-heavy implementation tasks but struggle on ad-hoc, constructive, and observation-heavy problems. We discuss why AI still misses the creative "aha" moments that human experts often find naturally.
Next, we turn to AI as a problem setter with AutoCode (ICLR 2026). Writing strong statements, validators, and edge-case generators is one of the hardest parts of contest design. We present an automated system in which LLMs generate robust test suites that catch tricky "cheese" or hack solutions. More surprisingly, the system can also create novel, contest-grade problems that the AI itself still cannot solve to an Accepted verdict.
Finally, we explore AI as a heuristic optimizer through FrontierCS (under review, ICML 2026). Unlike standard competitive programming, which relies on binary AC/WA verdicts, heuristic contests use continuous scores on open-ended optimization tasks. FrontierCS extends this setting to NP-hard problems and evaluates how well models improve solutions when the true optimum is unknown. We show how these challenges naturally connect competitive programming to real-world computer science research.