A maze is a maze is a maze

When I was in elementary school, I realized that I could draw polished-looking mazes on a page of graph paper. It was a pretty straightforward process:

Draw a big rectangle the size of your maze.
Draw a winding one-square-wide path through the rectangle. This will be the correct path.
Divide the area outside the correct path into multiple zones, bounded by walls.
For each zone, remove exactly one wall between it and the correct path.
Fill in each zone with random walls. Do this by branching the walls off of the existing zone walls, and never making either a loop of walls or a loop of cells. All four corners of every cell must touch a wall (i.e. no big open spaces).

The method looks something like this in practice:

Animation of a maze being created via the above set of steps, where each step is one frame of the animation. The maze starts out as a big empty box, then has a snaking path through it, then has zones outside the path added with entrances, then each zone is filled in with winding paths.

These were a whole bunch of fun to draw and then sell to my classmates for 50¢. The method made it easy to generate mazes of varying difficulty because you could draw the correct path as directly or circuitously as possible.

I always thought the most interesting mazes had no loops and makes every cell reachable. So that means that every wall has to eventually connect back to exactly one side wall. The mazes I constructed followed these well-defined rules, so each one always had exactly one solution.

It's visual

I remember taking an introductory computer science course in my freshman year of college where we were instructed to build a maze solving program. The standard "follow the left wall" algorithm is dead simple to reason about, so the exercise was really more about implementing that algorithm.

As part of the exercise, my professor provided us with a program template which generated a maze at random, then placed a player somewhere in the maze. Our job was then to fill out the contents of a function which moved that player through the maze using functions like turnLeft(), isFacingWall(), moveForward(), etc.

I still have my copy of that assignment. Here's a recording of my final program:

Animation of a blocky green maze being solved by a little arrow avatar. The avatar is performing the "follow the left wall" algorithm and slowly moving through the maze going up and down corridors. The animation starts over before the avatar finishes the maze.

After I completed the assignment, I got curious: how was the professor's template able to generate a maze randomly each time? Thinking back to my elementary school days, it seemed like maze generation would have been a fairly complicated piece of logic. I peeked at the .cpp source file and saw something like this:

static void GenerateWalls(int left, int top, int right, int bottom)
{
    if (bounds_too_small) {
        return;
    }

    int x = random_between(left, right);
    int y = random_between(top, bottom);

    // add and remove some walls
    // (...snip...)
    
    GenerateWalls(left, top, x, y);
    GenerateWalls(x, top, right, y);
    GenerateWalls(left, y, x, bottom);
    GenerateWalls(x, y, right, bottom);
}

int main() {
    GenerateWalls(0, 0, maze_width - 1, maze_height - 1);
}

There was a whole bunch of math in that wall-twiddling section. I saw the recursive calls, so I knew it must be somehow splitting up the maze into smaller and smaller picees. Turns out that this was essentially the recursive division method. As described by Wikipedia:

In a rectangular maze, build at random points two walls that are perpendicular to each other. These two walls divide the large chamber into four smaller chambers separated by four walls. Choose three of the four walls at random, and open a one cell-wide hole at a random point in each of the three. Continue in this manner recursively, until every chamber has a width of one cell in either of the two directions.

Here's a visual demonstration:

Recursive maze by Xabi-Vazquez, CC BY-SA 4.0, via Wikimedia Commons

This algorithm is closely dependent on the visual representation of the maze, since it just draws dividing lines repeatedly.

My home-grown kid method was also modeled on the visual representation of the maze; I just eyeballed what I thought a good path was, then eyeballed the dead zone regions, and filled in the rest at random.

It's a graph

One of the most important steps in good software engineering is modeling a problem space well. Oftentimes, difficult problems can be reduced to another problem which is easier to solve. Sometimes, those alternative formulations already have good solutions.

In fact, it's actually a little easier to generate a maze when you model it as a graph than if you just treat its visual appearance as its model. With this approach, the cells are nodes, and the boundaries between cells (whether walled or open) are edges between cell nodes. For a rectangular maze, each cell node has at most 4 neighbors (with side and corner cell nodes having fewer).

Let's model this!

// represents a rectangular maze with a given width and height
export class SquareMazeGraph {
  constructor(width, height);
  // an array of all Cells in the graph
  get cells();
  // an array of all CellBoundarys in the graph
  get boundaries();
  // reset all boundaries to either walls or
  setAllBoundaries(bool);
}

// represents a single cell in that maze
export class Cell {
  // a unique ID
  constructor(id);
  get id();
  // all CellBoundarys that surround this cell
  get boundaries();
  // all Cell neighbors that touch this cell
  get neighbors();
  // returns the CellBoundary between this cell and the given
  // neighbor
  boundaryBetween(cell);
}

// represents a boundary (can be a wall or a passage) between
// two cells in the maze
export class CellBoundary {
  constructor(firstCellId, secondCellId);
  // lesser bordering cell ID
  get firstId();
  // greater bordering cell ID
  get secondId();
  // both neighboring Cells
  get borderingCells();
  // true if this is wall, false if its a passage
  get isWall();
  // sets the wall status
  set isWall(newValue);
  // returns the bordering Cell which is NOT the given one
  traverseFrom(cell);
}

Now let's write the "Hello World" of maze algorithms: flip a coin for each edge to determine if it's a wall or not.

function* random(maze) {
  maze.setAllBoundaries(false);
  for (const b of maze.boundaries) {
    b.isWall = Math.random() < 0.5;
    yield [b];
  }
}

(I'm implementing these algorithms as JavaScript generator functions, which is going to let us walk through their execution step by step via the yield statements. Any interesting cells or boundaries can be yielded, and they'll be highlighted in the visualization.)

That looks like this:

This is clearly a useless maze. Large open spaces, closed-off areas, and quite often there's not even a route through it.

What we really want is a maze that satisfies the earlier constraints we discussed--every cell reachable, and no loops. When modeled as a graph, these "good" mazes are actually the same thing as a spanning tree.

Let's try something a bit more robust: a depth-first search algorithm. For this, we'll start at a random cell and just randomly pick unconnected cell neighbors to create corridors until we hit a dead end. When that happens, we'll backtrack until we have another route option, then start another corridor. Rinse and repeat until the entire maze is connected.

export function* depthFirstSearch(maze) {
  maze.setAllBoundaries(true);

  // we'll snake our way randomly through the maze, keeping
  // track of our path to the starting point
  const startCell = pickRandom(maze.cells);
  const backtrackStack = [startCell];

  // keep track of all the cells that we've connected to the
  // larger maze
  const visitedIds = new Set([startCell.id]);

  while (backtrackStack.length) {
    // get all unvisited neighbors of our current cell
    const thisCell = backtrackStack.pop();
    yield [thisCell];
    const unvisitedNeighbors = thisCell.neighbors
      .filter((cell) => !visitedIds.has(cell.id));

    // but if there are no unvisited neighbors, we'll retreat
    // back to the previous cell in the chain
    if (unvisitedNeighbors.length === 0) {
      continue;
    }

    // Pick one of those neighbors as random and connect it
    // to the chain. It's now visited.
    backtrackStack.push(thisCell);
    const nextCell = pickRandom(unvisitedNeighbors);
    const wallToRemove = thisCell.boundaryBetween(nextCell);
    wallToRemove.isWall = false;
    backtrackStack.push(nextCell);
    visitedIds.add(nextCell.id);

    yield [wallToRemove];
  }
}

In practice, this becomes:

The depth-first search algorithm generates long and winding paths that feel natural. The correct path is often non-obvious, and might require some searching on larger puzzles.

Let's look at another algorithm: Randomized Kruskal's algorithm.

In this algorithm, we basically treat each cell as belonging to its own set. We then pick random walls in the maze, and, if the cells on either side belong to different sets (i.e aren't already connected somehow), then we remove the wall and join the sets that the two cells belong to.

export function* kruskalsAlgorithm(maze) {
  maze.setAllBoundaries(true);

  // Array that starts as [0 => 0, 1 => 1, 2 => 2, ...], with one entry per
  // cell. Each key represents a cell ID, and each value is the "set ID" that
  // that cell belongs to. There are more efficient ways to represent this
  // (notably a disjoint-set data structure), but this naive implementation
  // suitable for demonstration.

  // At first, each cell belongs to its own set.
  const cellSets = [...Array(maze.width * maze.height).keys()];

  // A utility method to join two cell sets, given their "set IDs"
  const joinSets = (setA, setB) => {
    const setToChange = Math.max(setA, setB);
    const newSet = Math.min(setA, setB);
    for (let cellId = 0; cellId < cellSets.length; cellId++) {
      if (cellSets[cellId] === setToChange) {
        cellSets[cellId] = newSet;
      }
    }
  };

  // randomly order the boundaries
  const boundaries = maze.boundaries;
  shuffle(boundaries);

  for (const boundary of boundaries) {
    yield [boundary];
    if (cellSets[boundary.firstId] !== cellSets[boundary.secondId]) {
      boundary.isWall = false;
      joinSets(cellSets[boundary.firstId], cellSets[boundary.secondId]);
      yield boundary.borderingCells;
    }
  }
}

Here's what that looks like:

It's just bits

Interestingly, our maze and the state of every boundary in it can be represented as a bit set. Or even more simply, as just an integer. Each edge in the maze is represented by a single bit in that integer, where 0 means no wall and 1 means wall. Here's an example with a small maze:

The "Regenerate" button is doing the exact same thing as the random() algorithm implemented above--flipping a coin for each boundary (bit) and using that to determine if there's a wall or not (1 or 0). When you view the representation of the maze as an integer, that's exactly equivalent to just generating a random integer in the range [0..2^40-1] (for a 5x5 maze like this). This representation is helpful to give us some sense of the cardinality of the set of all possible mazes.

But of course, not all random integers will be valid mazes, or have the properties that we described above as constituting a "good" maze (no loops, every cell reachable). We could actually write predicate functions for these, to test a given integer to see if it satisfies those constraints. That would look something like:

function isEveryCellReachable(mazeInteger) { ... }
function isNonLooping(mazeInteger) { ... }

function isGoodMaze(mazeInteger) {
    return isEveryCellReachable(mazeInteger) && isNonLooping(mazeInteger);
}

Note that we aren't using an isSolvable(mazeInteger) function here. isEveryCellReachable() is sufficient for that instead; if every cell in a maze is reachable, then that maze is solvable. However, the inverse is not true; a solvable maze does not guarantee that every cell is reachable. So put another way, the set of mazes which satisies isEveryCellReachable is a strict subset of the mazes which satisfy isSolvable).

The implementation of these is left as an exercise to the reader 😉. Hint: flood fill algorithm.

Algorithm bias

We can predict that "good" mazes are statistically rare when we're just flipping random bits to generate a maze. The larger that a maze's dimensions grow, the more unlikely it is that a maze will satisfy the isGoodMaze() predicate. The maze algorithms we've seen so far always generate "good" mazes--but each one is biased towards specific tendencies.

The depth-first algorithm will tend to generate mazes with long and winding paths. The randomized Kruskal's algorithm tends to generate relatively simple patterns and paths. While each algorithm could, with the right seed, generate the same exact maze as the other algorithm, that's statistically unlikely.

Since all good mazes are just spanning trees of their graphs, can we make an unbiased selection from the set of all possible good mazes? That is, can we generate a uniform spanning tree?

Yes! Enter Wilson's Algorithm. First, pick a random cell and mark it as "connected" to the maze. Then, starting from any other cell, perform a random walk around the maze, cutting off unnecessary loops when you cross over your walk again. Once the path reaches the connected part of the maze, re-trace the walk from the start and bust down the walls in your way. Then mark all those new cells as connected to the maze, and start another random walk.

export function* wilsonsAlgorithm(maze) {
  maze.setAllBoundaries(true);

  // keep track of whether a cell is officially connected to the maze yet
  const isConnected = Array(maze.width * maze.height).fill(false);

  // add one cell to the maze at random
  const initialMazeCell = pickRandom(maze.cells);
  isConnected[initialMazeCell.id] = true;
  yield [initialMazeCell];

  // we'll use each unconnected cell as a starting point for our random walk
  for (const c of maze.cells) {
    if (isConnected[c.id]) {
      continue;
    }

    // we'll store each cell in our walk as a key in the map, with the value
    // being the direction (boundary) that the walk proceeds from that cell.
    const walkDirections = new Map();
    let walkPos = c;
    while (true) {
      // We'll pick a neighbor to randomly move in to and add that to our map.
      // This will sometimes overwrite a previous move from that cell-which is
      // the "loop-erasing" part of the algorithm.
      const direction = pickRandom(walkPos.boundaries);
      yield [walkPos, direction];
      walkDirections.set(walkPos.id, direction);
      walkPos = direction.traverseFrom(walkPos);

      // if we've reached the connected part of the maze, we'll done
      if (isConnected[walkPos.id]) {
        break;
      }
    }

    // Now that we have a series of directions to get to the maze, we'll
    // re-execute that walk from the beginning cell, following the map
    // of directions, busting down walls as we go and marking each cell
    // connected.
    let pathPos = c;
    while (!isConnected[pathPos.id]) {
      isConnected[pathPos.id] = true;
      const wallToBust = walkDirections.get(pathPos.id);
      yield [pathPos, wallToBust];
      wallToBust.isWall = false;
      pathPos = wallToBust.traverseFrom(pathPos);
    }
    // Note that there may be entries in the map which we didn't visit.
    // That's okay--they were the erased parts of the random walk that we
    // don't care about.
  }
}

Wilson's algorithm takes a while to get going at first, as the search tries to randomly find the initial starting cell of the maze. But once that first path is connected, subsequent walks take less and less time to connect.

Wilson's algorithm, using loop-erased random walks, will generate an unbiased random maze from the set of all good mazes.

It's not just right angles

One big benefit of treating a maze as a graph is that the actual rendered appearance of the maze doesn't have to be rectangular. You could just as easily create a hexagonal maze using the same algorithm implementations. Or maybe a maze based on the cells in a Voronoi tessellation:

[Colored Voronoi 3D Slice](https://commons.wikimedia.org/wiki/File:Coloured_Voronoi_3D_slice.svg), [CC BY-SA 3.0](https://creativecommons.org/licenses/by-sa/3.0), via Wikimedia Commons

As long as your maze is modeled as a graph, the algorithms will work just fine regardless of visual appearance. The recursive division algorithm described about would be much more difficult to adapt to non-rectangular mazes.

Perspective

There's a ton of value in taking a step back from the surface-level problem and representing it in a different way. This oftentimes reveals a much different problem that might be easier to tackle than the original problem. In my mind, one of the most enjoyable parts of software engineering is discovering these alternate formulations, then successfully leveraging them to solve a nasty problem.

John Ellmore