A* search algorithm
A* is a set of steps (an algorithm) that computers can use to figure out how to get somewhere fast between two places. If you have a list of locations, and how hard it is to get from one straight to the other, using A* can quickly tell you the fastest way. It's related to Dijkstra's algorithm, but makes smart guesses so that it doesn't spend as long trying slow ways. It's a good series of steps if you only want the path between two places. If you're going to ask for many paths from the same map, then there are faster ways, that find all the answers at once, like the Floyd–Warshall algorithm. A* will not work if you want to visit several places on one trip (the Travelling salesman problem).
The steps
[change | change source]For an example, we'll say A is connected to places B and C, and B and C are both connected to D and E. D and E are both connected to Z. There are 4 possible ways to go from A to Z. You can go A-B-D-Z, A-C-D-Z, A-B-E-Z, or A-C-E-Z. A computer using A* first looks at how hard it is to get from A to B, and from A to C. This is the "cost" for those places. The cost of a place means how hard it is to get from A to that place. After writing down both costs, the computer looks at how hard it is to get from B to D, and adds this to B's cost. It writes this down as D's cost. Then the computer looks at how hard it is to get from C to D, and adds this to C's cost. This is a different cost for D, and if it's less than the one it already has, it will replace the old one. The computer only wants to know the best path, so it ignores the path with the higher cost. It will only remember one of A-B-D and A-C-D, whichever is faster.
The computer goes on and finds the fastest way to get to E. Finally, it goes from D to Z, and finds a cost, and from E to Z and finds a cost. It gets a final cost for Z, and this is the smallest cost it can get. Now the computer knows which way is the fastest, and it has the answer. The computer can do a similar series of steps, but with many many more places. Each time, it will look at the place that's nearest to A, and will add up the costs to that place's neighbors.
People call the above series of steps Dijkstra's algorithm. Dijkstra's algorithm can be slow, because it will look at many places that might be going the wrong way from Z. If you asked the computer how to get from one city to a near one, Dijkstra's algorithm might end up looking in another state.
A* fixes this problem. A* lets you tell the computer a guess for how far it will be from each place to the end. The computer can use the guess to tell roughly how far it will take to get from a certain place to Z. Instead of just picking the place nearest to A to look at, it will look at the one which is probably going to have the lowest total. It finds this total by adding the cost to the expected distance left. This way, it can look only in the direction where things will probably get better. It's okay if the guess is not perfect, but even a simple bad guess can make the program go a lot faster. If you're trying to find a path between two places in the real world, a good guess is just the distance between them in a straight line. The real path over roads will be longer, but this lets the program guess it, and it won't go in the wrong direction.
In math or computer science literature, this guess is often a function of the place, and it is called a heuristic. Each place is a vertex, and each path between two places is an edge. These are words from graph theory.
Uses
[change | change source]A* is one algorithm in a big group of similar algorithms. These algorithms are all different ways of finding a path between two places, over certain paths between them. They are called path-finding algorithms. "Places" and "paths" aren't always real places, though. They can be abstract, and they can be a metaphor for something else. One example where it is a real place is trying to find a road trip on Google Maps or Mapquest. In this example, the paths are roads, and the places are houses or cities. A second example is sending information over the Internet. The information needs to go through many computers to get to the end. Many routers use A* to find the best way to send information across the Internet. Here, the paths are cables, and the places are computers. For a third example, where it's not literal, you want to make a certain chemical from the ingredients you have. Mixing some of these ingredients will produce new chemicals, and those can be mixed with each other. A path-finding algorithm could tell you which order to mix them in. Here, the paths are chemical reactions, the cost is how long the reaction takes to finish, and the places are the different chemicals you can make.