Lately, I've been using Java for something I never imagined I would: Running time bound simulations and collecting statistics. This started from an Advanced AI homework, which required us to evaluate different strategies for solving the 8 Puzzle.

Depth First and Breadth First search -- the most brute force of them all -- are real processor hogs. They take a good amount of time to get through solutions. I set up my simulations to try 8 puzzle combinations in groups. Each group had members which were an equal number of moves away from the solution. I'll call this the "distance" of a group from the solution. I started looking at groups with distances 1 through 10. The idea was to see how depth first search scaled with distance, while using multiple boards to average out the time (since boards were randomly generated). The simulation started out fine, but somewhere half way through, Java ran out of heap space causing me to lose half an hour worth of simulations. Terrific. So I fixed the heap space with a JVM switch to allocate 768 M. This time, Java didn't seem to die on space, but for some reason, started halting randomly. Here's a section of the simulation output. The first column is the distance of the board and the second is the run time in microseconds.

4 4.265305

4 0.399163

4 8058.812957

4 0.495941

4 4.762955

Now, the simulation is set up, to repeat a given run 5 times, to average out run times. So, the same board configuration is run 5 times for a given algorithm. The numbers above are the run times of the same algorithm on the same damn board! What gives!? A 4 microsecond difference is probably justifiable, but a full 8000 microseconds?

Now, I'm not sure if Java / the JVM is entirely to blame for such random spikes. In order to validate that, I'd have to rewrite this simulation in C++ and post similar numbers. Somehow, I doubt that the deviation in run times will not be this high. As of now, I'm treating this as an open problem. I'm certainly not assigning blame to Java, but its definitely complicit!

More to come.

[And no, there were no other CPU hogging processes running on the box at simulation run time. Certainly none that would cause such spikes.]

## 0 comments:

Post a Comment