Last week I was implementing pathfinding, so that our NPCs could find their way around town. Until now they were just standing there, animated but immobile.

I haven’t written an A* implementation in a few years, so I stopped by Wikipedia to refresh my memory, and was not amused: the article was both full of trivia, and also completely inadequate at actually explaining anything. Which I guess describes Wikipedia in general. But I can’t imagine learning A* from it, if I didn’t already know how it works.

A* is really a nice and simple algorithm that’s easy to implement. There’s a lot of discussion in the article about sets and heuristics, but it all comes down to just one main point: it’s a simple search algorithm that uses a particular “open set” data structure, and a particular scoring formula. (The name “open *set*” is really a misnomer, it should not be implemented as a set, but I guess that’s our mathematical roots poking out again, and tripping people up.)

There’s a whole family of relatively simple search algorithms that are all very similar. For example, virtually every CS graduate learns in Intro to Algorithms how to do depth-first graph traversal using a stack, and that by replacing a stack with a queue, we switch from depth-first to breadth-first search.

A* uses the same logic as those two, and the difference boils down to using a priority queue (or some other sorted collection) for the open set:

- A simple stack implements depth-first search
- A simple queue implements breadth-first search
- A priority queue
- sorted on f score implements A*
- sorted on g score implements Dijkstra

Then in order for the priority queue to actually sort correctly, we need to provide a score value for each node, such as the f score or the g score.

And that’s it. That’s all there is to it. Everything else is mere bookkeeping. 🙂

(I’m being hyperbolic here, but only a tiny bit; all the f score and g score stuff is pretty straightforward.)

As a fun aside: when we were making the prototype for CityVille, I wrote the pathfinding a little hastily, missed the heuristic, and ended up with an implementation of Dijkstra. I had completely forgotten about it, until it showed up in profiling, when we were testing large cities – suddenly pathfinding was taking up a lot of processing time! But the fix was trivial – I switched over to an f-sorted priority queue, and the cost of pathfinding dropped back down to being negligible.

Only goes to show how much the choice of the open set matters. 🙂