Ants, the micro icons of industriousness and organization, apparently can teach us something about how computer networks work and how they can be improved as well. A team at MIT's Computer Science and Artificial Intelligence Laboratory has been studying ants to create a model of analysis of social networks, collective decision making among robot swarms, and communication in decentralized, ad hoc wireless networks.
MIT study confirms a long-held assumption in the scientific community, ants estimate their population density based on the frequency at which they bump into other ants while randomly exploring their environs. This ability appears to be key for several activities, including deciding where the colony establishes a new nest.
"It's intuitive that if a bunch of people are randomly walking around an area, the number of times they bump into each other will be a surrogate of the population density," says Cameron Musco, co-author of a paper on the research. "What we're doing is giving a rigorous analysis behind that intuition, and also saying that the estimate is a very good estimate, rather than some coarse estimate."
The researchers make a parallel between an ant's environment and a grid. An explorer ant starts at some cell of the grid and likely moves to one of the adjacent cells. It is also likely that it then moves to another cell adjacent to the one it departed from, and so on. In technical, statistical language this is called "random walk." The explorer ant counts the number of other ants in all the cells it visits.
Random walk versus random sampling
The study compares the random walk to random sampling, which is when cells are selected from the grid at random and the number of ants in each cell counted. In both cases accuracy improves with each additional sample, but the surprising factor here is that the random walk can reveal the true population density almost as quickly as the tried-and-tested random sampling method does.
The researchers say this is relevant because in many practical cases random sampling is not an option. For instance, in ad hoc networks, a given device knows only the locations of the devices in its immediate vicinity. An algorithm that uses random walks to aggregate information from multiple devices would be much easier to implement than one that has to characterize the network as a whole.
The study then gets interesting for its more counterintuitive findings.
It would be logical to assume that the explorer ant is likely to return to a cell it has already visited, so one given cell would more likely be oversampled than in the case of random samplings. But when the researchers got antsy about oversampled data and tried to filter it out, they found that, instead of improving their algorithm, it made it worse. They offer a theoretical explanation for that.
"If you're randomly walking around a grid, you're not going to bump into everybody, because you're not going to cross the whole grid," Musco says. "So there's somebody on the far side of the grid that I have pretty much a zero percent chance of bumping into. But while I'll bump into those guys less, I'll bump into local guys more. I need to count all my interactions with the local guys to make up for the fact that there are these faraway guys that I'm never going to bump into. It sort of perfectly balances out."
Graph data structure
To model the ants' environment, the researchers used a graph data structure consisting of nodes (circles), and edges, which are the line segments connecting nodes. In the grid, each cell is a node, and it shares edges only with those cells immediately adjacent to it.
If the graph is not very well connected, with just a chain of nodes, each connected only to the two nodes adjacent to it, then oversampling could become a problem, with the explorer stuck in the same set of nodes. But graphs describing communication networks often feature two random walks starting from the same node and then branching out in different directions. This being the case, random walks can offer almost the same level of accuracy as random sampling.
The bigger the number of explorers, the faster an analysis would produce an accurate estimate. "If they were 
robots instead of ants, they could get gains by talking to each other and saying, 'Oh, this is my estimate','" Musco adds.