When I was in elementary school, I realized that I could draw polished-looking mazes on a page of graph paper. It was a pretty straightforward process:
- Draw a big rectangle the size of your maze.
- Draw a winding one-square-wide path through the rectangle. This will be the correct path.
- Divide the area outside the correct path into multiple zones, bounded by walls.
- For each zone, remove exactly one wall between it and the correct path.
- Fill in each zone with random walls. Do this by branching the walls off of the existing zone walls, and never making either a loop of walls or a loop of cells. All four corners of every cell must touch a wall (i.e. no big open spaces).
The method looks something like this in practice:
These were a whole bunch of fun to draw and then sell to my classmates for 50ยข. The method made it easy to generate mazes of varying difficulty because you could draw the correct path as directly or circuitously as possible.
I always thought the most interesting mazes had no loops and makes every cell reachable. So that means that every wall has to eventually connect back to exactly one side wall. The mazes I constructed followed these well-defined rules, so each one always had exactly one solution.
It's visual
I remember taking an introductory computer science course in my freshman year of college where we were instructed to build a maze solving program. The standard "follow the left wall" algorithm is dead simple to reason about, so the exercise was really more about implementing that algorithm.
As part of the exercise, my professor provided us with a program template which
generated a maze at random, then placed a player somewhere in the maze. Our job
was then to fill out the contents of a function which moved that player through
the maze using functions like turnLeft()
, isFacingWall()
, moveForward()
,
etc.
I still have my copy of that assignment. Here's a recording of my final program:
After I completed the assignment, I got curious: how was the professor's
template able to generate a maze randomly each time? Thinking back to my
elementary school days, it seemed like maze generation would have been a fairly
complicated piece of logic. I peeked at the .cpp
source file and saw something
like this:
static void GenerateWalls(int left, int top, int right, int bottom)
{
if (bounds_too_small) {
return;
}
int x = random_between(left, right);
int y = random_between(top, bottom);
// add and remove some walls
// (...snip...)
GenerateWalls(left, top, x, y);
GenerateWalls(x, top, right, y);
GenerateWalls(left, y, x, bottom);
GenerateWalls(x, y, right, bottom);
}
int main() {
GenerateWalls(0, 0, maze_width - 1, maze_height - 1);
}
There was a whole bunch of math in that wall-twiddling section. I saw the recursive calls, so I knew it must be somehow splitting up the maze into smaller and smaller picees. Turns out that this was essentially the recursive division method. As described by Wikipedia:
In a rectangular maze, build at random points two walls that are perpendicular to each other. These two walls divide the large chamber into four smaller chambers separated by four walls. Choose three of the four walls at random, and open a one cell-wide hole at a random point in each of the three. Continue in this manner recursively, until every chamber has a width of one cell in either of the two directions.
Here's a visual demonstration:
This algorithm is closely dependent on the visual representation of the maze, since it just draws dividing lines repeatedly.
My home-grown kid method was also modeled on the visual representation of the maze; I just eyeballed what I thought a good path was, then eyeballed the dead zone regions, and filled in the rest at random.
It's a graph
One of the most important steps in good software engineering is modeling a problem space well. Oftentimes, difficult problems can be reduced to another problem which is easier to solve. Sometimes, those alternative formulations already have good solutions.
In fact, it's actually a little easier to generate a maze when you model it as a graph than if you just treat its visual appearance as its model. With this approach, the cells are nodes, and the boundaries between cells (whether walled or open) are edges between cell nodes. For a rectangular maze, each cell node has at most 4 neighbors (with side and corner cell nodes having fewer).
Let's model this!
// represents a rectangular maze with a given width and height
export class SquareMazeGraph {
constructor(width, height);
// an array of all Cells in the graph
get cells();
// an array of all CellBoundarys in the graph
get boundaries();
// reset all boundaries to either walls or
setAllBoundaries(bool);
}
// represents a single cell in that maze
export class Cell {
// a unique ID
constructor(id);
get id();
// all CellBoundarys that surround this cell
get boundaries();
// all Cell neighbors that touch this cell
get neighbors();
// returns the CellBoundary between this cell and the given
// neighbor
boundaryBetween(cell);
}
// represents a boundary (can be a wall or a passage) between
// two cells in the maze
export class CellBoundary {
constructor(firstCellId, secondCellId);
// lesser bordering cell ID
get firstId();
// greater bordering cell ID
get secondId();
// both neighboring Cells
get borderingCells();
// true if this is wall, false if its a passage
get isWall();
// sets the wall status
set isWall(newValue);
// returns the bordering Cell which is NOT the given one
traverseFrom(cell);
}
Now let's write the "Hello World" of maze algorithms: flip a coin for each edge to determine if it's a wall or not.
function* random(maze) {
maze.setAllBoundaries(false);
for (const b of maze.boundaries) {
b.isWall = Math.random() < 0.5;
yield [b];
}
}
(I'm implementing these algorithms as JavaScript
generator functions,
which is going to let us walk through their execution step by step via the
yield
statements. Any interesting cells or boundaries can be yielded, and
they'll be highlighted in the visualization.)
That looks like this:
This is clearly a useless maze. Large open spaces, closed-off areas, and quite often there's not even a route through it.
What we really want is a maze that satisfies the earlier constraints we discussed--every cell reachable, and no loops. When modeled as a graph, these "good" mazes are actually the same thing as a spanning tree.
Let's try something a bit more robust: a depth-first search algorithm. For this, we'll start at a random cell and just randomly pick unconnected cell neighbors to create corridors until we hit a dead end. When that happens, we'll backtrack until we have another route option, then start another corridor. Rinse and repeat until the entire maze is connected.
export function* depthFirstSearch(maze) {
maze.setAllBoundaries(true);
// we'll snake our way randomly through the maze, keeping
// track of our path to the starting point
const startCell = pickRandom(maze.cells);
const backtrackStack = [startCell];
// keep track of all the cells that we've connected to the
// larger maze
const visitedIds = new Set([startCell.id]);
while (backtrackStack.length) {
// get all unvisited neighbors of our current cell
const thisCell = backtrackStack.pop();
yield [thisCell];
const unvisitedNeighbors = thisCell.neighbors
.filter((cell) => !visitedIds.has(cell.id));
// but if there are no unvisited neighbors, we'll retreat
// back to the previous cell in the chain
if (unvisitedNeighbors.length === 0) {
continue;
}
// Pick one of those neighbors as random and connect it
// to the chain. It's now visited.
backtrackStack.push(thisCell);
const nextCell = pickRandom(unvisitedNeighbors);
const wallToRemove = thisCell.boundaryBetween(nextCell);
wallToRemove.isWall = false;
backtrackStack.push(nextCell);
visitedIds.add(nextCell.id);
yield [wallToRemove];
}
}
In practice, this becomes:
The depth-first search algorithm generates long and winding paths that feel natural. The correct path is often non-obvious, and might require some searching on larger puzzles.
Let's look at another algorithm: Randomized Kruskal's algorithm.
In this algorithm, we basically treat each cell as belonging to its own set. We then pick random walls in the maze, and, if the cells on either side belong to different sets (i.e aren't already connected somehow), then we remove the wall and join the sets that the two cells belong to.
export function* kruskalsAlgorithm(maze) {
maze.setAllBoundaries(true);
// Array that starts as [0 => 0, 1 => 1, 2 => 2, ...], with one entry per
// cell. Each key represents a cell ID, and each value is the "set ID" that
// that cell belongs to. There are more efficient ways to represent this
// (notably a disjoint-set data structure), but this naive implementation
// suitable for demonstration.
// At first, each cell belongs to its own set.
const cellSets = [...Array(maze.width * maze.height).keys()];
// A utility method to join two cell sets, given their "set IDs"
const joinSets = (setA, setB) => {
const setToChange = Math.max(setA, setB);
const newSet = Math.min(setA, setB);
for (let cellId = 0; cellId < cellSets.length; cellId++) {
if (cellSets[cellId] === setToChange) {
cellSets[cellId] = newSet;
}
}
};
// randomly order the boundaries
const boundaries = maze.boundaries;
shuffle(boundaries);
for (const boundary of boundaries) {
yield [boundary];
if (cellSets[boundary.firstId] !== cellSets[boundary.secondId]) {
boundary.isWall = false;
joinSets(cellSets[boundary.firstId], cellSets[boundary.secondId]);
yield boundary.borderingCells;
}
}
}
Here's what that looks like:
It's just bits
Interestingly, our maze and the state of every boundary in it can be represented
as a bit set. Or even more simply, as
just an integer. Each edge in the maze is represented by a single bit in that
integer, where 0
means no wall and 1
means wall. Here's an example with a
small maze:
The "Regenerate" button is doing the exact same thing as the random()
algorithm implemented above--flipping a coin for each boundary (bit) and using
that to determine if there's a wall or not (1
or 0
). When you view the
representation of the maze as an integer, that's exactly equivalent to just
generating a random integer in the range [0..2^40-1]
(for a 5x5 maze like
this). This representation is helpful to give us some sense of the
cardinality of the set of all
possible mazes.
But of course, not all random integers will be valid mazes, or have the properties that we described above as constituting a "good" maze (no loops, every cell reachable). We could actually write predicate functions for these, to test a given integer to see if it satisfies those constraints. That would look something like:
function isEveryCellReachable(mazeInteger) { ... }
function isNonLooping(mazeInteger) { ... }
function isGoodMaze(mazeInteger) {
return isEveryCellReachable(mazeInteger) && isNonLooping(mazeInteger);
}
Note that we aren't using an isSolvable(mazeInteger)
function here.
isEveryCellReachable()
is sufficient for that instead; if every cell in a maze
is reachable, then that maze is solvable. However, the inverse is not true; a
solvable maze does not guarantee that every cell is reachable. So put another
way, the set of mazes which satisies isEveryCellReachable
is a strict subset
of the mazes which satisfy isSolvable
).
The implementation of these is left as an exercise to the reader ๐. Hint: flood fill algorithm.
Algorithm bias
We can predict that "good" mazes are statistically rare when we're just flipping
random bits to generate a maze. The larger that a maze's dimensions grow, the
more unlikely it is that a maze will satisfy the isGoodMaze()
predicate. The
maze algorithms we've seen so far always generate "good" mazes--but each one is
biased towards specific tendencies.
The depth-first algorithm will tend to generate mazes with long and winding paths. The randomized Kruskal's algorithm tends to generate relatively simple patterns and paths. While each algorithm could, with the right seed, generate the same exact maze as the other algorithm, that's statistically unlikely.
Since all good mazes are just spanning trees of their graphs, can we make an unbiased selection from the set of all possible good mazes? That is, can we generate a uniform spanning tree?
Yes! Enter Wilson's Algorithm. First, pick a random cell and mark it as "connected" to the maze. Then, starting from any other cell, perform a random walk around the maze, cutting off unnecessary loops when you cross over your walk again. Once the path reaches the connected part of the maze, re-trace the walk from the start and bust down the walls in your way. Then mark all those new cells as connected to the maze, and start another random walk.
export function* wilsonsAlgorithm(maze) {
maze.setAllBoundaries(true);
// keep track of whether a cell is officially connected to the maze yet
const isConnected = Array(maze.width * maze.height).fill(false);
// add one cell to the maze at random
const initialMazeCell = pickRandom(maze.cells);
isConnected[initialMazeCell.id] = true;
yield [initialMazeCell];
// we'll use each unconnected cell as a starting point for our random walk
for (const c of maze.cells) {
if (isConnected[c.id]) {
continue;
}
// we'll store each cell in our walk as a key in the map, with the value
// being the direction (boundary) that the walk proceeds from that cell.
const walkDirections = new Map();
let walkPos = c;
while (true) {
// We'll pick a neighbor to randomly move in to and add that to our map.
// This will sometimes overwrite a previous move from that cell-which is
// the "loop-erasing" part of the algorithm.
const direction = pickRandom(walkPos.boundaries);
yield [walkPos, direction];
walkDirections.set(walkPos.id, direction);
walkPos = direction.traverseFrom(walkPos);
// if we've reached the connected part of the maze, we'll done
if (isConnected[walkPos.id]) {
break;
}
}
// Now that we have a series of directions to get to the maze, we'll
// re-execute that walk from the beginning cell, following the map
// of directions, busting down walls as we go and marking each cell
// connected.
let pathPos = c;
while (!isConnected[pathPos.id]) {
isConnected[pathPos.id] = true;
const wallToBust = walkDirections.get(pathPos.id);
yield [pathPos, wallToBust];
wallToBust.isWall = false;
pathPos = wallToBust.traverseFrom(pathPos);
}
// Note that there may be entries in the map which we didn't visit.
// That's okay--they were the erased parts of the random walk that we
// don't care about.
}
}
Wilson's algorithm takes a while to get going at first, as the search tries to randomly find the initial starting cell of the maze. But once that first path is connected, subsequent walks take less and less time to connect.
Wilson's algorithm, using loop-erased random walks, will generate an unbiased random maze from the set of all good mazes.
It's not just right angles
One big benefit of treating a maze as a graph is that the actual rendered appearance of the maze doesn't have to be rectangular. You could just as easily create a hexagonal maze using the same algorithm implementations. Or maybe a maze based on the cells in a Voronoi tessellation:
As long as your maze is modeled as a graph, the algorithms will work just fine regardless of visual appearance. The recursive division algorithm described about would be much more difficult to adapt to non-rectangular mazes.
Perspective
There's a ton of value in taking a step back from the surface-level problem and representing it in a different way. This oftentimes reveals a much different problem that might be easier to tackle than the original problem. In my mind, one of the most enjoyable parts of software engineering is discovering these alternate formulations, then successfully leveraging them to solve a nasty problem.
Further reading on mazes
This was just a toe-dip into the fascinating world of maze generation. If you're still curious, check out these resources:
- My standalone web app to generate mazes. This uses the same implementation as the demos above, just with more control.
- The Maze generation algorithm article on Wikipedia, which I heavily referenced for this topic.
- Jamis Buck's articles on maze generation algorithms. The article on Wilson's algorithm was incredibly helpful for me, and he has many similar articles just as helpful.
- Jamis Buck's book Mazes for Programmers, which dives deeper into the topic than I could ever hope to go in a blog post.