Welcome to the Maze Navigation Challenge!
This interactive experience recreates the spatial reasoning experiment used to evaluate Large Language Models (LLMs). You'll navigate increasingly complex mazes, from 5×5 to 15×15, with limited visibility, just like the LLMs in our study.
Your task: Navigate from the green starting position (top-left) to the red goal (bottom-right). You can only see how far you can travel in each direction before hitting a wall.
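To make the limited-visibility idea concrete, here is a minimal sketch of how the per-direction distances could be computed. It is not the code behind this page: the grid encoding ('#' for a wall, '.' for an open cell), the direction names, and the function name `visible_distances` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not the challenge's actual format):
# '#' marks a wall and '.' marks an open cell.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def visible_distances(grid: List[str], pos: Tuple[int, int]) -> Dict[str, int]:
    """Return how many cells can be travelled in each direction from pos
    before hitting a wall or the edge of the maze."""
    rows, cols = len(grid), len(grid[0])
    distances: Dict[str, int] = {}
    for name, (dr, dc) in DIRECTIONS.items():
        r, c = pos
        steps = 0
        while True:
            nr, nc = r + dr, c + dc
            # Stop at the maze boundary or at a wall cell.
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                break
            r, c = nr, nc
            steps += 1
        distances[name] = steps
    return distances


# A tiny 3x3 example maze, with the start in the top-left corner.
EXAMPLE = [
    "..#",
    ".#.",
    "...",
]

print(visible_distances(EXAMPLE, (0, 0)))
# {'up': 0, 'down': 2, 'left': 0, 'right': 1}
```

In this sketch the player at the top-left corner learns only four numbers; the layout of the rest of the maze stays hidden until it is explored.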
Rules: You lose if you visit the same cell 10 times or make more than 3n² moves in an n×n maze (for example, more than 75 moves in a 5×5 maze or 675 in a 15×15 maze).
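The two losing conditions can be sketched as a small check. This is an illustrative sketch under stated assumptions, not the challenge's actual implementation: the names `move_limit` and `has_lost` are hypothetical, and the sketch assumes that the starting cell counts as one visit.

```python
from collections import Counter
from typing import List, Tuple

MAX_VISITS_PER_CELL = 10  # from the rules: visiting one cell 10 times loses


def move_limit(n: int) -> int:
    """Maximum number of moves allowed in an n x n maze (3n^2)."""
    return 3 * n * n


def has_lost(path: List[Tuple[int, int]], n: int) -> bool:
    """Check both losing conditions for a path of visited cells.

    Assumption: path lists every cell occupied in order, including the
    start, so the number of moves is len(path) - 1.
    """
    moves = len(path) - 1
    if moves > move_limit(n):
        return True
    visits = Counter(path)
    return any(count >= MAX_VISITS_PER_CELL for count in visits.values())


print(move_limit(5))   # 75
print(move_limit(15))  # 675
```

For example, a player who bounces between two neighbouring cells until each has been occupied 10 times loses on the revisit rule well before reaching the move limit.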
How did AI models perform on this challenge?
Our research paper shows which LLMs succeeded, which failed spectacularly, and how large the performance gap between languages was:
Read the paper on arXiv | View code on GitHub