Maze Navigation Challenge

Welcome to the Maze Navigation Challenge!

This interactive experience recreates the spatial reasoning experiment used to evaluate Large Language Models. You'll navigate increasingly complex mazes (5×5 to 15×15) with limited visibility, just like the LLMs in our study.

Your task: Navigate from the green starting position (top-left) to the red goal (bottom-right). You can only see how far you can travel in each direction before hitting a wall.
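The limited-visibility rule can be sketched in Python as follows. This is only an illustration, not the code behind this page: the maze representation (a set of open passages between adjacent cells) and the names `DIRECTIONS` and `visible_distances` are assumptions made for the example.

```python
from typing import Dict, FrozenSet, Set, Tuple

Cell = Tuple[int, int]  # (row, col), with (0, 0) at the top-left

# Hypothetical direction names and their (row, col) offsets.
DIRECTIONS: Dict[str, Cell] = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
}


def visible_distances(passages: Set[FrozenSet[Cell]], pos: Cell) -> Dict[str, int]:
    """Return how many cells can be walked in each direction before a wall.

    Assumption: `passages` holds one frozenset({a, b}) for every pair of
    adjacent cells with no wall between them. The outer boundary needs no
    special case, because no passage leads outside the grid.
    """
    distances: Dict[str, int] = {}
    for name, (dr, dc) in DIRECTIONS.items():
        steps = 0
        current = pos
        while True:
            nxt = (current[0] + dr, current[1] + dc)
            if frozenset((current, nxt)) not in passages:
                break  # a wall (or the maze edge) blocks further travel
            steps += 1
            current = nxt
        distances[name] = steps
    return distances


# A tiny 2×2 example: the only open passages are
# (0,0)-(0,1) and (0,1)-(1,1).
example_passages = {
    frozenset(((0, 0), (0, 1))),
    frozenset(((0, 1), (1, 1))),
}
example_view = visible_distances(example_passages, (0, 0))
```

In this example, the player at the top-left can move one cell to the right and no cells in any other direction.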

Rules: You lose if you visit the same cell 10 times or make more than 3n² moves in an n×n maze (for example, more than 75 moves in a 5×5 maze).
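The two losing conditions can be checked with a few lines of Python. This is a minimal sketch under stated assumptions, not the game's actual code: the names `move_limit` and `has_lost` are hypothetical, and the sketch assumes the starting cell counts as one visit and that the player loses as soon as any cell's visit count reaches 10.

```python
from collections import Counter
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, col)

MAX_VISITS_PER_CELL = 10  # from the rules: visiting a cell 10 times loses


def move_limit(n: int) -> int:
    """Maximum number of moves allowed in an n×n maze (3n²)."""
    return 3 * n * n


def has_lost(path: List[Cell], n: int) -> bool:
    """Check both losing conditions for a path through an n×n maze.

    Assumption: `path` lists every cell the player has occupied, in order,
    starting with the start cell, so the number of moves is len(path) - 1.
    """
    if not path:
        return False
    moves = len(path) - 1
    if moves > move_limit(n):
        return True  # exceeded the 3n² move budget
    visits = Counter(path)
    return max(visits.values()) >= MAX_VISITS_PER_CELL


limit_5x5 = move_limit(5)    # 3 * 25
limit_15x15 = move_limit(15)  # 3 * 225
```

For the first level (5×5) this gives a limit of 75 moves, and for the largest maze (15×15) a limit of 675.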

Level: 1 | Maze Size: 5×5 | Moves: 0 | Move Limit: 75

The highlighted cells show how far you can travel in each direction before hitting a wall.

Legend: Player | Goal | Visited | Visible

Use the arrow keys, WASD, or the buttons below, or click a neighboring cell.

How did AI models perform on this challenge?

Discover which LLMs succeeded, which failed spectacularly, and the performance gap between languages in our research paper:

Read the paper on arXiv | View code on GitHub

@misc{einarsson2025mazeevalbenchmarktestingsequential,
  title={MazeEval: A Benchmark for Testing Sequential Decision-Making in Language Models},
  author={Hafsteinn Einarsson},
  year={2025},
  eprint={2507.20395},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2507.20395}
}