Friday, November 30, 2007

JRuby pod-cast

I almost accidentally got to hear Neal Ford's interview about JRuby this morning.

Love it or hate it, Ruby cannot be ignored. Yet the articles and blog posts you find on Ruby vs. Java are often so emotional and single-minded that the whole debate has become a bit of a joke. But not this one.

Neal starts with a bit of history, then compares Ruby to Java (expressive power, meta-programming capabilities) and discusses the rising importance of the JVM as a multi-language platform. He goes on to compare JRuby with other JVM languages (Groovy, Jython, Jaskell, Scala) and when to use each, and explains how to get JRuby into your corporate development environment. The tips include "don't say the word Ruby as long as it runs on the JVM" and "use it in test or build environment first - Ruby has great advantages there and it's easier to justify a new tool for build or test than for production". Finally, he talks about his work on the first commercial product developed entirely in JRuby - the agile team management suite called Mingle.

I also liked his general comments on how GoF design patterns illustrate language deficiencies, his objection to programming in XML, and his argument that the concept of a protective programming environment, one that lets you scale software development by hiring zillions of bad programmers, is bankrupt.

P.S. Martin Fowler, Neal's colleague at ThoughtWorks, recently blogged about JRuby too.

Tuesday, November 27, 2007

Reflection and auto-boxing

In an earlier post I discovered that the way reflection handles synthetic bridge methods is somewhat lacking. Here is another reflection issue, this time a known problem: 6176992 - sort of paradox in a box.

Apparently Class#isAssignableFrom does not handle auto-boxing so if you have a method that returns a boolean, Boolean.class.isAssignableFrom(method.getReturnType()) will evaluate to false. I played with it a bit further, and if you have a method that takes an int, MyClass.class.getMethod(methodName, Integer.class) yields NoSuchMethodException.
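The two observations above can be reproduced with a few lines. This is only a sketch: the Sample class and the hasMethod helper are hypothetical names made up for the illustration.

```java
import java.lang.reflect.Method;

class AutoBoxingReflectionDemo {
    // Hypothetical sample class: one primitive return type, one primitive parameter.
    static class Sample {
        public boolean isReady() { return true; }
        public void setCount(int count) { }
    }

    // Returns true if getMethod finds a public method with exactly these parameter types.
    static boolean hasMethod(Class<?> owner, String name, Class<?>... parameterTypes) {
        try {
            owner.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Method isReady = Sample.class.getMethod("isReady");
        // The return type is the primitive boolean.class, not Boolean.class.
        System.out.println(Boolean.class.isAssignableFrom(isReady.getReturnType())); // false
        // The lookup only succeeds with the exact (primitive) parameter type.
        System.out.println(hasMethod(Sample.class, "setCount", Integer.class)); // false
        System.out.println(hasMethod(Sample.class, "setCount", int.class));     // true
    }
}
```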

I expected these APIs to return a non-primitive class, because Method#invoke accepts only Objects and translates them internally, and it worked long before the auto-boxing feature was added to Java. However, after giving this more thought, I decided that it's unfair to complain that the APIs return exact types. Also, a primitive type does not extend or inherit its boxed counterpart, and the Javadoc describes the behavior of isAssignableFrom impeccably - the method supports widening reference conversion only (chapter 5.1.5 in the 3rd edition of the Java Language Specification).

Nevertheless, Java Language Specification also defines widening primitive conversion (chapter 5.1.2) and (un)boxing (in 5.1.7 and 5.1.8) and I think it is fair to ask for a utility method that tests the compatibility of 2 classes that represent primitive types and a utility for boxing and un-boxing of classes. So what I'm really asking for is a java.lang.reflect.Primitive class, that provides some static utilities just like java.lang.reflect.Array does.

Until then, I am going to create my own utility for boxing/un-boxing - a Function<Class,Class> of course :-)
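Such a utility could look roughly like the sketch below. It is not the actual code from this project: the Boxing class and its members are hypothetical names, and java.util.function.Function (which arrived in later Java versions) stands in for the google-collections Function. Note that isAssignable here only adds boxing on top of isAssignableFrom; it does not cover widening primitive conversion such as int to long.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class Boxing {
    private static final Map<Class<?>, Class<?>> BOXED = new HashMap<>();
    private static final Map<Class<?>, Class<?>> UNBOXED = new HashMap<>();

    private static void register(Class<?> primitive, Class<?> wrapper) {
        BOXED.put(primitive, wrapper);
        UNBOXED.put(wrapper, primitive);
    }

    static {
        register(boolean.class, Boolean.class);
        register(byte.class, Byte.class);
        register(char.class, Character.class);
        register(short.class, Short.class);
        register(int.class, Integer.class);
        register(long.class, Long.class);
        register(float.class, Float.class);
        register(double.class, Double.class);
        register(void.class, Void.class);
    }

    // Maps a primitive class to its wrapper; any other class is returned unchanged.
    static final Function<Class<?>, Class<?>> BOX = c -> BOXED.getOrDefault(c, c);

    // Maps a wrapper class to its primitive; any other class is returned unchanged.
    static final Function<Class<?>, Class<?>> UNBOX = c -> UNBOXED.getOrDefault(c, c);

    // isAssignableFrom that treats a primitive and its wrapper as the same type.
    static boolean isAssignable(Class<?> target, Class<?> source) {
        return BOX.apply(target).isAssignableFrom(BOX.apply(source));
    }
}
```

With this in place, Boxing.isAssignable(Boolean.class, boolean.class) answers the question that the plain isAssignableFrom call could not.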

Thursday, November 22, 2007

Properties, static and yet dynamic

I think it's time to share some more of my Java properties ideas.

Property is a super-type-token and the definition looks like this:

public static final Property<Person,String> name =
new Property<Person,String> ("name") {};
So I've taken the property out of its usual habitat - the object it applies to, and let it be defined in any class! One prominent flexibility is the ability to add properties to an existing class by loading a new class.

How is the value associated with a particular object instance? A property should be assigned a "definition", which is in essence a Function that retrieves the value from a given object, for example by using reflection. The definition can be changed on the fly, and the user of the property need not be concerned.

That's all very nice, but reflection is not enough if we add properties on the fly. There are 2 ways of dealing with it. The more obvious one is having a Map or an array within the object where the values can be stored. This of course means that the object class is in advance "prepared" to have dynamic properties.

Another way is rather shocking at first glance... to me it was, anyway. Traditionally the object stores a property->value map. But why can't we have the property store an object->value map!? When I read about column-oriented databases, I realized it's not as crazy as it first looks. All I needed was a WeakIdentityHashMap. Unfortunately the JDK does not have one, but google-collections does - the ReferenceMap. Cloning and serialization also need special hooks, but it's not too much hassle.
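To make the "property stores an object->value map" idea concrete, here is a minimal sketch that uses only the JDK. It is an assumption-laden illustration, not the implementation from this project: ExternalProperty and Key are hypothetical names, a hand-rolled weak identity key replaces google-collections' ReferenceMap, and the cloning and serialization hooks are left out.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

class ExternalProperty<T, V> {
    // Weak key that compares owners by identity, so equals()/hashCode() of T are never used.
    private static final class Key extends WeakReference<Object> {
        private final int hash;

        Key(Object owner, ReferenceQueue<Object> queue) {
            super(owner, queue);
            hash = System.identityHashCode(owner);
        }

        @Override public int hashCode() { return hash; }

        @Override public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Key)) return false;
            Object mine = get();
            return mine != null && mine == ((Key) other).get();
        }
    }

    private final Map<Key, V> values = new HashMap<>();
    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final V defaultValue;

    ExternalProperty(V defaultValue) { this.defaultValue = defaultValue; }

    // Drops entries whose owner has been garbage collected.
    private void purge() {
        Object stale;
        while ((stale = queue.poll()) != null) {
            values.remove(stale);
        }
    }

    synchronized void set(T owner, V value) {
        purge();
        values.put(new Key(owner, queue), value);
    }

    synchronized V get(T owner) {
        purge();
        V value = values.get(new Key(owner, null));
        return value != null ? value : defaultValue;
    }
}
```

The owner class needs no preparation at all: the values live in the property, and an entry disappears once its owner is no longer reachable.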

I have 4 types of properties for now: Property, which only allows read access, UpdatableProperty, which supports read and write, and their multi-value counterparts, IterableProperty and CollectionProperty. This list can of course be extended. Any property can have a default value, and an updatable property can have a list of value constraints. I have a library of definitions - plain function, reflected, mapped, and ones using an internal or external map as I described above - and of course more can be added.

Last but not least - all properties that apply to a certain class of objects are tracked, so there's an API which I can query for the properties applicable to a given class. It collects the properties of all super-classes and super-interfaces - note that a property can be defined for an interface and in an interface (being a final static).

That's it for tonight. To be continued...

P.S. JavaEdge was really great, thanks Alpha folks!

Monday, November 19, 2007

Software proletariat

Don't miss today's Dilbert on the lower depths of our profession.

Sunday, November 18, 2007

Predicate + Function = Rule

Another enhancement on top of google-collections that I want to share with you.

In essence, a rule is a condition plus an action, so I use a Predicate as the condition and a Function as the action. The composite function then consists of a list of rules: it invokes the function that corresponds to the first predicate that evaluates to true. If none of the predicates match, the default function is invoked. Here is the source code, if you want more details.

For example, here is how I define a function "size" of an arbitrary object:

Function<Object, Integer> size = new Rules<Object,Integer>().
addRule(isNull(), constant(0)).
addRule(
or(
instanceOf(Collection.class),
instanceOf(Map.class)),
self().reflect("size").cast(Integer.class)).
addRule(
asPredicate(self().reflect("getClass").
reflect("isArray").cast(Boolean.class)),
self().reflect(Array.class, "getLength").cast(Integer.class)).
setDefault(constant(1));
If you read my earlier post, you already know about the FunctionChain; this is where I statically import self() from. The reflect() function transforms Object to Object by invoking a parameterless method with the given name on the current object, or by passing the current object to a single-parameter method of another object (or of a class, for static methods). The cast() method performs a cast to the given class. The instanceOf() function is a shortcut to ClassPredicate, which returns true if the class it holds is assignable from the class of a given object. I also assume statically imported or and isNull Predicates and Functions.constant. (I could replace the reflect function calls with special size and length functions, but I think it's enough to illustrate the idea as it is.)

The same segment in "normal" coding would look something like this:
if (object == null) {
    return 0;
} else if (object instanceof Collection) {
    return ((Collection) object).size();
} else if (object instanceof Map) {
    return ((Map) object).size();
} else if (object.getClass().isArray()) {
    return Array.getLength(object);
} else {
    return 1;
}
I don't pretend one is better than the other, but it's interesting enough, I think, and sometimes useful.
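For readers who cannot follow the source link, the rule-list idea described above can be sketched in a few lines. This is not the original Rules class: it assumes java.util.function.Predicate and Function (from later Java versions) instead of the google-collections interfaces, uses plain lambdas instead of the reflect()/cast() chain, and SizeExample is a hypothetical name.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

class Rules<F, T> implements Function<F, T> {
    // One rule: a condition and the action to run when the condition holds.
    private static final class Rule<F, T> {
        final Predicate<? super F> condition;
        final Function<? super F, ? extends T> action;

        Rule(Predicate<? super F> condition, Function<? super F, ? extends T> action) {
            this.condition = condition;
            this.action = action;
        }
    }

    private final List<Rule<F, T>> rules = new ArrayList<>();
    private Function<? super F, ? extends T> defaultAction = from -> null;

    Rules<F, T> addRule(Predicate<? super F> condition, Function<? super F, ? extends T> action) {
        rules.add(new Rule<F, T>(condition, action));
        return this;
    }

    Rules<F, T> setDefault(Function<? super F, ? extends T> action) {
        defaultAction = action;
        return this;
    }

    // Applies the action of the first matching rule, or the default if none matches.
    @Override public T apply(F from) {
        for (Rule<F, T> rule : rules) {
            if (rule.condition.test(from)) {
                return rule.action.apply(from);
            }
        }
        return defaultAction.apply(from);
    }
}

class SizeExample {
    // The "size of an arbitrary object" function, expressed as rules.
    static final Function<Object, Integer> SIZE = new Rules<Object, Integer>()
            .addRule(o -> o == null, o -> 0)
            .addRule(o -> o instanceof Collection, o -> ((Collection<?>) o).size())
            .addRule(o -> o instanceof Map, o -> ((Map<?, ?>) o).size())
            .addRule(o -> o.getClass().isArray(), Array::getLength)
            .setDefault(o -> 1);
}
```

Rule order matters: the null rule must come first, because the later predicates would throw on a null argument.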

Modularity - outside in

It is probably a pretty obvious thing, and yet some developers and software architects manage to overlook this.

When refactoring a project under the flag of "better modularity" one needs to start with northbound, southbound and other external interfaces - remove the coupling between the "big boxes" in the system, and only then go inside each box and improve its architecture. Otherwise we'll find ourselves redoing huge amounts of code over and over. Just moving your middle tier code base to JEE does not solve the problem... it is like adding meat balls to spaghetti, while what you really want is lasagna.

Just a thought.

Wednesday, November 14, 2007

Walls and bridges

Last time I told you about the FunctionChain utility I wrote on top of google-collections. But I discovered a very strange problem, so I have posted a fixed version and I'd like to tell you about the problem, since I learned a lot while dealing with it.

I got a Function instance and I want to find out whether its apply method accepts nulls. In order to do that, I need to check the existence of @Nullable annotation on the first parameter of the apply method. The annotation's retention policy is run-time, so the task sounds pretty trivial, right? Wrong!

How do I get hold of the apply method? At first I thought function.getClass().getMethod("apply", Object.class) should do the job, you know, because of erasure. And indeed a method is always returned, only it was sometimes missing the annotations. This was driving me crazy, until I looked at all the methods of function using reflection. Guess what - there are sometimes multiple apply methods. How could it be - it's an anonymous inner class implementing the Function interface alone...? And then the flashback hit me - right, synthetic bridge methods! If I defined my function to transform Integer to String, there would be a String apply(Integer) method in addition to Object apply(Object). And if I defined my function to transform an Object to Class (equivalent of Object.getClass()) then I got 2 apply methods taking Object, one returns an Object and the other one a Class. Unbelievable, ha?!

Another thing worth noting is that the annotation is present only on the original method, not on the bridge method. I have browsed the web and found 2 references to a similar problem, here and here. Both Michael Ernst and Rob Harrop think that not copying annotations to the bridge method is a javac bug, but the Sun Bug Database has no trace of it, so I'm going to submit one.
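The bridge method is easy to see for yourself. The sketch below assumes java.util.function.Function (from later Java versions) in place of the google-collections interface, and BridgeMethodDemo and applyMethods are hypothetical names. It deliberately says nothing about annotations, since compilers may differ on whether they copy them to the bridge.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class BridgeMethodDemo {
    // An anonymous class with concrete type arguments: javac adds a bridge apply(Object).
    static final Function<Integer, String> TO_STRING = new Function<Integer, String>() {
        @Override public String apply(Integer from) {
            return String.valueOf(from);
        }
    };

    // Collects the declared apply methods that are (or are not) bridge methods.
    static List<Method> applyMethods(Object function, boolean bridge) {
        List<Method> result = new ArrayList<>();
        for (Method method : function.getClass().getDeclaredMethods()) {
            if (method.getName().equals("apply") && method.isBridge() == bridge) {
                result.add(method);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Lists both apply methods declared by the anonymous class.
        for (Method method : TO_STRING.getClass().getDeclaredMethods()) {
            System.out.println(method + " bridge=" + method.isBridge());
        }
    }
}
```

Filtering on Method#isBridge is a simple way to get at the method the programmer actually wrote.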

Ok, at least I knew now what's going on, but how do I find out the right method? I decided that the only way of doing it is figuring out the first parameter in the Function generic definition. I used Neal Gafter's super-type-tokens trick with some minor enhancements - array support and ability to process arbitrary number of parameters. It allows me to find out the classes that substitute generic parameter definitions, and once I get the first one - it's the apply method parameter class, so I'll use it to look up the method. The apply method return type should not be a problem, because according to getMethod JavaDoc the method with a more specific return type will be found (and I can't imagine how there can be more than two methods with same parameter in our case).

I had to change Neal's code a bit, because while he's using getClass().getGenericSuperclass() I had to use getClass().getGenericInterfaces() and to find the one representing Function. I noticed that interfaces need to be manually collected from all super-classes, because each class would only return the ones it explicitly implements, and (usually) not the ones that the super-class implements. Bummer... To add insult to injury - there may be several implemented Types that represent a Function - for example if the parent implements Function<T,String> and child implements Function<Integer,String>, those would be 2 different types, and I, of course, am looking for the most specific one. LinkedHashSet did the job. Phew.
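A stripped-down version of that lookup might look like the sketch below. It is only an illustration under several assumptions: it uses java.util.function.Function rather than the google-collections interface, FunctionTypes and its members are hypothetical names, and it handles only the simple case where a class in the hierarchy directly implements Function with a concrete class as the first type argument (no type variables, arrays, or sub-interfaces of Function).

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.function.Function;

class FunctionTypes {
    // Hypothetical sample function used to try the lookup.
    static final Function<Integer, String> SAMPLE = new Function<Integer, String>() {
        @Override public String apply(Integer from) {
            return "#" + from;
        }
    };

    // Walks up the class hierarchy, because getGenericInterfaces() only reports
    // the interfaces a class declares itself, not those of its super-classes.
    static Class<?> parameterClass(Function<?, ?> function) {
        for (Class<?> c = function.getClass(); c != null; c = c.getSuperclass()) {
            for (Type type : c.getGenericInterfaces()) {
                if (type instanceof ParameterizedType) {
                    ParameterizedType parameterized = (ParameterizedType) type;
                    if (parameterized.getRawType() == Function.class) {
                        Type argument = parameterized.getActualTypeArguments()[0];
                        if (argument instanceof Class) {
                            return (Class<?>) argument;
                        }
                    }
                }
            }
        }
        return Object.class; // nothing more specific found, fall back to the erasure
    }

    // Looks up apply using the resolved parameter class instead of Object.class.
    static Method applyMethod(Function<?, ?> function) throws NoSuchMethodException {
        return function.getClass().getMethod("apply", parameterClass(function));
    }
}
```

The most specific declaration is found first because the walk starts at the function's own class and moves towards Object.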

You can look at the source code here; let me know what you think. All in all, it turned out to be much harder than I originally expected, and I am wondering if the google-collections folks would be willing to add similar functionality to their package... after all, run-time retention of @Nullable is useless without it.

Wednesday, November 7, 2007

Null-safe access to properties

First I am going to show a nice side-effect of using properties with google-collections interface Function. Like many others I am annoyed to write code of the style


Bar getBarOfThing(Thing thing) {
    Bar bar = null;
    if (thing != null) {
        Foo foo = thing.getFoo();
        if (foo != null) {
            bar = foo.getBar();
        }
    }
    return bar;
}
All the null checks make it unreadable. And heaven forbid I forget one null test... BOOM! NPE! It's much nicer in SQL, for example, where everything's a bag, no results - no problem, join it with whatever you want, you'll just get an empty bag.

Now imagine that
Function<Thing,Foo> foo = new Function<Thing,Foo>() {
    public Foo apply(Thing from) {
        return from.getFoo();
    }
};
and
Function<Foo,Bar> bar = new Function<Foo,Bar>() {
    public Bar apply(Foo from) {
        return from.getBar();
    }
};

Luckily google-collections have defined a @Nullable annotation and I am going to use it to differentiate between functions that can gracefully handle null (e.g. return some default value) and the ones that don't. Now I'm going to define a FunctionChain class, that provides me a fluent interface to chain the functions:
  FunctionChain<Thing,Bar> chain =
FunctionChain.<Thing>self().function(foo).function(bar);
Which I can safely apply to any Thing, including null
  assertTrue(chain.applyTo(null) == null);
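The chaining itself can be sketched in a few lines. This is a simplified stand-in, not the real FunctionChain: NullSafeChain and ChainExample are hypothetical names, java.util.function.Function replaces the google-collections interface, and the sketch ignores @Nullable altogether by treating every function as one that cannot handle null.

```java
import java.util.function.Function;

class NullSafeChain<F, T> {
    private final Function<F, T> composed;

    private NullSafeChain(Function<F, T> composed) {
        this.composed = composed;
    }

    // Starts a chain with the identity function.
    static <F> NullSafeChain<F, F> self() {
        return new NullSafeChain<F, F>(from -> from);
    }

    // Appends a function; it is skipped when the intermediate result is null.
    <R> NullSafeChain<F, R> function(Function<? super T, ? extends R> next) {
        return new NullSafeChain<F, R>(from -> {
            T intermediate = composed.apply(from);
            return intermediate == null ? null : next.apply(intermediate);
        });
    }

    T applyTo(F from) {
        return composed.apply(from);
    }
}

class ChainExample {
    // Hypothetical beans standing in for Thing, Foo and Bar.
    static class Bar { }
    static class Foo { Bar bar; Bar getBar() { return bar; } }
    static class Thing { Foo foo; Foo getFoo() { return foo; } }

    static final NullSafeChain<Thing, Bar> CHAIN =
            NullSafeChain.<Thing>self().function(Thing::getFoo).function(Foo::getBar);
}
```

A null anywhere along the chain simply falls through to the end, which is the "empty bag" behavior borrowed from SQL.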

Tuesday, November 6, 2007

Properties for Java

The need for better support for properties in Java has been raised many times and in several contexts, and I am definitely not going to repeat the reasoning here. It used to be on the list of features proposed for Java 7, and here's the last time I read about it on Rémi Forax's blog (quite a while ago). Fred recently blogged about another way of addressing the problem via abstract Enums. There were several other interesting proposals, and the main trends are summarized very nicely by Stephen here. Stephen is also the one who first blogged about the bean-independent property object and wrote a reference implementation - the Joda Beans project. I had started developing similar ideas in parallel, and since I have now reached the stage of a nice working prototype, this is what I hope to describe in the upcoming posts.

First of all, I chose the bean-independent property approach (although it was before Stephen coined the term), because I think the real added value in properties is being able to refer to the property regardless of any particular object instance. Another big design decision is to avoid changing the Java compiler, unlike Fred and Rémi. It's a very interesting exercise to play with javac and I am really grateful that Sun made it possible, but... IMO that's the last resort to solve the problem. Even in the best case, where the enhancement is accepted by Sun, the earliest official Java version it could work in is Java 7, which is still very far away. And using a home-made javac in a production environment is just not very realistic; imagine justifying it to your boss (or customer). I also tried to stay away from bytecode or source generation, and have managed well so far, although I might need to use it in the future in order to reduce the verbosity and improve the safety of my solution.

So here is what I managed to do: no compiler changes, fully type-safe properties, some null-safety built in (ideas similar to what Stephen blogged here), the properties can have annotations and can be referred to in annotations, they can be applied to existing Java Beans via reflection and they expose their own meta-data via reflection. In addition to all that, some Ruby-esque adding/modifying of properties on the fly is supported. Last but not least, I have been using google-collections quite a lot in this project, so there's very nice inter-operation. For example, Property implements com.google.common.base.Function, and writing pseudo-functional code in Java using google-collections and fluent interfaces has been discussed quite a lot recently.

So this is the blog-post number 0, sort of preface, that hopefully will keep you interested enough to come back and read the blog. To be continued...

Saturday, November 3, 2007

Functional programming for the JVM

Dear Java, my good old friend, there is something I need to tell you. You've been my only one for over 10 years, but now I have trouble staying faithful. It's not you, it's me. And no, I am not moving out, not just yet, because I can't afford to leave all your stuff behind.

But you don't satisfy all my needs anymore!

I know you are trying, dear, and good people try to help you - here Neal Gafter is working hard to give you closure support, Ricky Clarkson, a true alpha-geek who played with the first prototype of Neal's work, has blogged the results. But, sorry love, I can't wait until the big makeover in Java 7. What do you expect me to do with all these JVM languages around? These languages inter-operate (well, to certain extent) with your libraries, so I can enjoy all the usual stuff but I can also be more expressive. Here is an example of hooking up JRuby with Velocity and another one of writing servlets in Scala under Tomcat.

You wanna know who they are? Among the young ones, Scala deserves to be mentioned first, alongside Ruby, JavaScript and Python to name just a few. Classic ones include several Lisp dialects (e.g. Clojure and Common Lisp), Scheme and OCaml implemented over the JVM as well. Either there is a compiler that produces bytecode, or there's a JVM scripting engine, and sometimes both. I hope one day to make a deeper study and produce a proper comparison table, maybe then you'll understand.

You think that closures alone aren't good enough reason to have a little flirt on the side with another language? Hmm, consider the fact that "the other language" may offer me hot-swapping of classes, mix-ins, better modularity, pattern matching, better concurrency model and dynamic typing support ... who can blame me for being tempted.

Still loving you, yours truly but... not entirely.

Thursday, November 1, 2007

Beware: Overflows in numeric operations

Well, isn't it amazing how huge the problems caused by a simple little piece of code can be? Have a look at Java Puzzlers - there are plenty of examples. What is important to understand is that this is neither fiction nor the product of Neal's and Josh's perverted minds. It's out there in the software we use and the software we produce. Want proof? Nearly All Binary Searches and Mergesorts are Broken.

Here is the story of what happened to me. The code was written by a colleague, and at some point became my responsibility. I didn't notice anything wrong when first reviewing it, but later on I had to fix the problems. Now have a look:


public class Node implements Comparable {
private long _id;

public Node() {
_id = IdGenerator.getNewId();
}
public Node(long id) {
_id = id;
}

public boolean equals(Object obj) {
return compareTo(obj) == 0;
}

public int hashCode() {
return (int)(_id ^ (_id >> 32));
}

public int compareTo(Object obj) {
if (obj == null) {
return -1;
}
if (obj instanceof Node) {
Node tmp = (Node) obj;
return (int)(_id - tmp._id);
}
return -1;
}
}

Looks innocent, right? Why then would new Node(12884902889L).equals(new Node(4294968297L)) be true?! Ah, because 12884902889-4294968297=8589934592=2^33 and when cast to int, 2^33 is of course 0. Bellissimo!
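The truncation is easy to reproduce in isolation. In the sketch below, NodeIdComparison and its methods are hypothetical names; the safe variant relies on Long.compare, which was added to the JDK in a later Java version, so an explicit chain of < and > comparisons would be the contemporary equivalent.

```java
class NodeIdComparison {
    // The comparison from the Node class: the long difference is truncated to int.
    static int brokenCompare(long a, long b) {
        return (int) (a - b);
    }

    // Compares without subtracting, so nothing can be truncated or overflow.
    static int safeCompare(long a, long b) {
        return Long.compare(a, b);
    }

    public static void main(String[] args) {
        long a = 12884902889L;
        long b = 4294968297L;
        System.out.println(brokenCompare(a, b)); // 0, although the ids differ
        System.out.println(safeCompare(a, b) > 0); // true
    }
}
```

The same reasoning applies to equals: comparing the two ids directly with == avoids going through compareTo at all.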

But hey, here's one even worse. IdGenerator generates long Ids from 2 integers - persistent high and in-memory low, low being incremented all the time and high only when MAX_VALUE of ids have been given out or when the process restarts. So far so good. Look at this function then:

public synchronized long getNewAndReserve(int interval)
throws IdGeneratorException
{
long value;
if (low + interval >= MAX_VALUE) {
high = getNewHighValue();
low = 0;
}
value = getNewId(high, low);
low += interval;
return value;
}

MAX_VALUE is, BTW, Integer.MAX_VALUE. Now think puzzlers... YES! The sum of 2 ints is an int, so it can NEVER exceed MAX_VALUE; it overflows to a negative number instead! The fix for this is to change the condition to (low >= MAX_VALUE - interval) and to add an assertion that (interval > 0).
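Here is the broken condition next to the fixed one, reduced to two small methods. ReserveCheck and the method names are hypothetical, and the sketch throws an IllegalArgumentException where the text suggests an assertion.

```java
class ReserveCheck {
    static final int MAX_VALUE = Integer.MAX_VALUE;

    // Original condition: low + interval wraps around to a negative number on overflow.
    static boolean brokenNeedsNewHigh(int low, int interval) {
        return low + interval >= MAX_VALUE;
    }

    // Fixed condition: the subtraction cannot overflow as long as interval is positive.
    static boolean fixedNeedsNewHigh(int low, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive: " + interval);
        }
        return low >= MAX_VALUE - interval;
    }

    public static void main(String[] args) {
        int low = MAX_VALUE - 5;
        System.out.println(brokenNeedsNewHigh(low, 10)); // false, the overflow goes unnoticed
        System.out.println(fixedNeedsNewHigh(low, 10));  // true
    }
}
```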

All this makes you wonder whether a high-level programming language like Java should help you avoid this kind of problem...? But that's worthy of another post sometime. In the meantime, look after your numbers, folks.