Showing posts with label java. Show all posts

Case insensitivity on HFS+

As most of you know (I have limited readership on my blog and I'm pretty sure I know all of you), I moved away from Linux to the Mac world a couple of months ago. I have an ancient PowerMac G4 for my home desktop and I use a MacBook Pro at work. Of course, I still use Linux for most development (a Linux laptop at home and a server at work), so I haven't needed to do any hardcore dev work on OS X to date.

This weekend, I decided to try my hand at porting Java-GNOME 4.x to OS X. I'm blogging that effort in a different post, but to cut to the chase, I was burned by case insensitivity on HFS+ for most of the weekend. It turns out that, despite being a UNIX, OS X by default chooses a case-preserving but case-insensitive configuration for its filesystem. While you can choose a case-sensitive configuration (you have to select it at format time!), you're likely to break third-party non-Apple apps such as Adobe Photoshop.

Consider the following piece of code:





public class Test {
    // Two nested interfaces whose names differ only in case.
    public interface CLICKED {
        public void perform();
    }

    public interface Clicked {
        public void perform();
    }

    public static void main(String[] args) {

        // Anonymous implementation of Clicked
        Clicked myClicked = new Clicked() {
            public void perform() {
                System.out.println("Inside interface Clicked");
            }
        };

        // Anonymous implementation of CLICKED
        CLICKED myCLICKED = new CLICKED() {
            public void perform() {
                System.out.println("Inside interface CLICKED");
            }
        };

        myClicked.perform();
        myCLICKED.perform();
    }
}





I'm sure you'll agree that this is a reasonable piece of Java by most standards. We declare two interfaces and create two objects by implementing their methods in anonymous classes. Now we invoke "javac" to compile it. We typically expect a class file for each outer class and a qualified class file for each inner interface or class, named Outer$Inner.class, and so forth. No sweat:

[pendyals@fermi:~/test/Test] $ javac Test.java
[pendyals@fermi:~/test/Test] $ ls
Test$1.class Test$CLICKED.class Test.class Test.java
[pendyals@fermi:~/test/Test] $

But what's this! We only have Test$CLICKED.class! Evidently we're missing a Test$Clicked.class! Where could it be? How about we look in the Test$CLICKED.class?


[pendyals@fermi:~/test/Test] $ javap -c Test\$CLICKED
Compiled from "Test.java"
public interface Test$Clicked{
public abstract void perform();
}


Evidently, the only interface that made it into the file is called Clicked, not CLICKED. As you'll have guessed by now, attempting to run this class results in an exception:



[pendyals@fermi:~/test/Test] $ java Test
Exception in thread "main" java.lang.NoClassDefFoundError: Test$CLICKED (wrong name: Test$Clicked)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:675)

...


If you haven't guessed what happened: javac created a Test$CLICKED.class file and put in the byte code for the CLICKED interface. Then, after compiling the byte code for the Clicked interface, it wanted to write that into its own file, Test$Clicked.class. Before writing, it checks whether the file exists and, if it does, simply overwrites it. HFS+, of course, dutifully returned a handle to "Test$CLICKED.class" when asked for "Test$Clicked.class", and javac silently overwrote it.
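If you want to see this filesystem behaviour without involving javac at all, here's a minimal sketch of a probe that assumes nothing beyond the standard java.nio.file API. It creates a temporary file with a lower-case name and then asks the filesystem whether the upper-case spelling of that name exists. The names CaseProbe, isCaseInsensitive and probeTempDir are made up for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;

// Hypothetical helper, not part of javac or Java-GNOME.
final class CaseProbe {

    // Returns true if 'dir' sits on a filesystem that treats names
    // differing only in case as the same file.
    static boolean isCaseInsensitive(Path dir) {
        try {
            // The lower-case prefix guarantees the name contains letters
            // whose upper-case form differs.
            Path lower = Files.createTempFile(dir, "caseprobe", ".tmp");
            try {
                String upperName = lower.getFileName().toString().toUpperCase(Locale.ROOT);
                // On a case-sensitive filesystem this sibling does not exist.
                return Files.exists(lower.resolveSibling(upperName));
            } finally {
                Files.deleteIfExists(lower);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: probes a fresh directory under java.io.tmpdir.
    static boolean probeTempDir() {
        try {
            Path dir = Files.createTempDirectory("caseprobe");
            try {
                return isCaseInsensitive(dir);
            } finally {
                Files.deleteIfExists(dir);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("case-insensitive: " + probeTempDir());
    }
}
```

The temp directory may live on a different volume from your build output, so to test the volume that matters, pass isCaseInsensitive the directory javac actually writes into.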

As is painfully evident, the case-insensitive nature of HFS+ has repercussions far beyond naming files correctly. Since languages like Java require you to name files after class names, an incorrectly designed application could fail silently and never alert you to the fact that a class file has been overwritten.

While the "typical" UNIX response would be to dismiss OS X and HFS+'s behavior as absurd or "immature" (yes, I've heard that one), it seems to me that a prudent design shouldn't require case-sensitivity in any form. I use the word "prudent" in a tongue-in-cheek sort of way, because there's no requirement for Java apps to run on OS X or Windows if the app designer doesn't want them to.
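One way to act on that, sketched under assumptions: before shipping, check the list of class file names you expect the compiler to produce for names that differ only in case. The CaseCollisions class below is hypothetical, and lower-casing with Locale.ROOT is only an approximation of how a real filesystem folds case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical build-time check, not an existing tool.
final class CaseCollisions {

    // Groups names that are equal once case is ignored and returns
    // only the groups with more than one member.
    static List<List<String>> find(List<String> fileNames) {
        Map<String, List<String>> byFoldedName = new LinkedHashMap<>();
        for (String name : fileNames) {
            // Approximation: real filesystems use their own case-folding rules.
            String folded = name.toLowerCase(Locale.ROOT);
            byFoldedName.computeIfAbsent(folded, k -> new ArrayList<>()).add(name);
        }
        List<List<String>> collisions = new ArrayList<>();
        for (List<String> group : byFoldedName.values()) {
            if (group.size() > 1) {
                collisions.add(group);
            }
        }
        return collisions;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList(
                "Test.class", "Test$CLICKED.class", "Test$Clicked.class");
        // Prints the one colliding pair from the example in this post.
        System.out.println(find(expected));
    }
}
```

For the Test example this flags Test$CLICKED.class and Test$Clicked.class as a pair that cannot coexist on a case-insensitive volume.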

Fingers burnt. Lesson learnt.

Java and Scientific Computing

Lately, I've been using Java for something I never imagined I would: running time-bound simulations and collecting statistics. This started with an Advanced AI homework assignment, which required us to evaluate different strategies for solving the 8 Puzzle.

Depth First and Breadth First search, the most brute force of them all, are real processor hogs. They take a good amount of time to get through solutions. I set up my simulations to try 8 Puzzle combinations in groups. Each group had members which were an equal number of moves away from the solution. I'll call this the "distance" of a group from the solution. I started looking at groups with distances 1 through 10. The idea was to see how depth first search scaled with distance, while using multiple boards to average out the time (since boards were randomly generated). The simulation started out fine, but about halfway through, Java ran out of heap space, causing me to lose half an hour's worth of simulations. Terrific. So I fixed the heap space with a JVM switch to allocate 768 MB. This time, Java didn't seem to die on space, but for some reason started halting randomly. Here's a section of the simulation output. The first column is the distance of the board and the second is the run time in microseconds.


4 4.265305
4 0.399163
4 8058.812957
4 0.495941
4 4.762955

Now, the simulation is set up to repeat a given run 5 times to average out run times. So, the same board configuration is run 5 times for a given algorithm. The numbers above are the run times of the same algorithm on the same damn board! What gives!? A 4-microsecond difference is probably justifiable, but a full 8000 microseconds?
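To put a number on how badly one spike skews an average, here's a small sketch that compares the mean and the median of the five run times listed above. The RunStats class is hypothetical and not part of my simulation code. Swapping the average for a median is just one way to keep a single outlier from dominating; it does nothing to explain where the spike comes from.

```java
import java.util.Arrays;

// Hypothetical helper for summarising repeated run times.
final class RunStats {

    // Arithmetic mean of the samples.
    static double mean(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Median of the samples; the input array is left untouched.
    static double median(double[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return (n % 2 == 1)
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    public static void main(String[] args) {
        // The five run times (in microseconds) from the output above.
        double[] runs = {4.265305, 0.399163, 8058.812957, 0.495941, 4.762955};
        System.out.println("mean   = " + mean(runs));
        System.out.println("median = " + median(runs));
    }
}
```

With these five samples the mean lands at roughly 1614 microseconds, while the median is 4.265305, which is much closer to what the other four runs look like.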


Now, I'm not sure if Java / the JVM is entirely to blame for such random spikes. To validate that, I'd have to rewrite this simulation in C++ and compare the numbers. Somehow, I doubt that the deviation in run times would be this high there. As of now, I'm treating this as an open problem. I'm certainly not assigning all the blame to Java, but it's definitely complicit!

More to come.

[And no, there were no other CPU hogging processes running on the box at simulation run time. Certainly none that would cause such spikes.]