Monday, December 31, 2007

Past forward

Since my previous post was devoted to Java history and this interview with Martin Odersky recently came out, I decided to do one more short transcription related to Generics history.

"I am an academic, actually, I'm working at the university in Switzerland. ... A long time ago I was a Ph.D. student of Niklaus Wirth, so I am in the Modula-2/Oberon camp... But then I drifted more and more into functional programming community, I have a good friend named Phil Wadler who is very active in the community, so he sometimes told me ... "there is this new thing, it's gonna bury functional programming, it is called Java". I wanted to find out more and he said it has garbage collection, and it runs everywhere on the Web ..., so what are you gonna do against that?!

So ... we said we'll try to do a functional language on the Java platform, because the Java platform looked very exciting to us then (it was in 1995, so a long time ago). ... To find out more about what it is (I always learn best when I write a compiler) let's write a Java compiler. We had this second ever Java compiler out there after the old JavaC.

Then we said - if we want to do a language it should be Java compatible, and we wrote a language called Pizza, which added stuff from functional programming to Java. We experimented a little bit with that and then Sun came and said - we like what you did, in particular the Generics, we wanna do that. Then we had a project with Gilad Bracha at Sun and we did this GJ thing, which was the Generics thing.

The compiler for GJ eventually became the current JavaC compiler, Sun took that over because they decided that this compiler was more maintainable than the original. (Interviewer: "from JavaC 5 on...") From Java 1.3 already.

So the compiler was in 1.3, but the Generics were disabled. In fact, the Generics were at first not that much disabled, there was a secret switch that people found out about, so there was this stealthy movement of people using Generics already in 1998 when it first came out. And then Sun pulled the plug, they said - you can't do that, so you have to re-write your compiler so that it doesn't make any Generics available. So there was a special mutilation script in the compiler which would rip it off, so that people couldn't use it for a while, until it came back in Java 5."

Thursday, December 13, 2007

Back to the future

If you look at the programming language history poster there's a bunch of arrows leading to Java genesis, one of which comes from Smalltalk. Ok, I heard about Smalltalk and saw some code in books. I also knew that Gilad Bracha was one of the authors of Strongtalk (Smalltalk with static types) and later immensely contributed to Java evolution while working at Sun. I also saw references to this supposedly great language called Self in some programming language articles. But only now do I see how it is all linked together, thanks to Avi Bryant's Keynote at RailsConf (yeah, of all places to learn about Java...), so I decided to transcribe here a small portion of this fascinating talk.

"There's a legend that there's something inherently about Ruby that makes it hard to implement a fast VM for it, because of open classes, because of dynamic typing - there's something about meta-programming, something about Ruby that makes it hard in this day to make it run fast. And it's just not true! It was true 20 years ago - it was a hard problem, but people solved it.

This is actually from Sun site, this is a bunch of papers that collectively explain how do you make Ruby go fast. But you'll note that they're written in 1989-91, you can go and look at the URL here, they come from the Self project. And the history is kind of interesting here, so since this is about the future and about the past, I'll give you a really brief history.

Self was a project at Sun to do Smalltalk, but even more so. Even purer object-oriented. And it meant that they had to find ways, because they wanted to make it so object-oriented, so pure, so turtles all the way down, they had to find more and more ways to make it go fast. And the technology they came up with was so interesting that they actually spun off a start-up to try to implement a new super-fast Smalltalk that was gonna be used on Wall Street. But before they got to release it, this was gonna be called Strongtalk, before they got to release it, Sun bought them back.

And Sun used the technology, used the HotSpot profiling and Just-In-Time compilation technology that have been developed here to build the Java VM. Which is sort of one of the great tragedies of technology, but it's true, that Java HotSpot VM in fact is what came of all of this great work of making dynamic languages go fast - of course they crippled it so that it doesn't make dynamic languages go fast anymore, but maybe that'll change. One interesting footnote to this is that a lot of the engineers that were involved in that have since left Sun and are back to doing Smalltalk."

Sunday, December 9, 2007

Extension methods proposal

I'd like to talk here about another Java 7 feature proposal lobbied by Google that generated lots of waves in the blogosphere after being revealed by Neal Gafter. A lot has been said (Peter, Alex, Stephen, Rémi and Stefan) so I'll add just 2 humble observations.

Adding capabilities to the language vs. maintaining a safe and clear programming environment is a tough balance to keep. I had my doubts in this particular case, but I think I tend to favor it, with one modification. I think it would be better if an extension method import were different from a normal static import, so the developer knows exactly what he is doing. I propose the following syntax (which, BTW, does not require any additional keywords in Java):

import static java.util.Collections.sort
    extends java.util.List;

The compiler would then check that the first argument of sort is a List (or its super-class or super-interface) and compile

List list = ...;
list.sort();

into

Collections.sort(list);

The main advantage of this proposal is that extension methods are imported consciously, rather than implicitly.
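For comparison, here is a small sketch of what already works with a plain static import. The proposed extends clause is hypothetical and does not compile, so it appears only in a comment; the class name SortDemo is made up for illustration.

```java
import static java.util.Collections.sort;

import java.util.ArrayList;
import java.util.List;

final class SortDemo {
    // Hypothetical proposal from this post (NOT valid Java):
    //   import static java.util.Collections.sort extends java.util.List;
    //   list.sort();   // would be compiled as Collections.sort(list)

    /** Returns a sorted copy, using the statically imported Collections.sort. */
    static List<String> sorted(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        sort(copy); // today's spelling: the list is an argument, not a receiver
        return copy;
    }
}
```

The only difference the proposal makes is on the call site: the first argument moves in front of the dot.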

The "importer" is also given some control over the overloading rules for the extension method. What if Collection has a sort method? What if ArrayList does? Well, with the syntax I propose here the developer targets the method to a specific class, so overloading rules can be applied just as if the method were in that class; in this example it would overload the Collection one, but not the ArrayList one. It reminds me of the Ruby feature that Neal Ford had mentioned - calling a method, and if it does not exist, executing some piece of code instead. So with proper overloading rules, could an imported extension method be a Java alternative to method missing? Finally, there's this dark corner question: "What if List had a sort method?" Personally, I think that if the class explicitly imports an extension method, it should be treated as a local definition, while the method defined in List is an inherited definition, and therefore (following the principles laid down by Gilad Bracha in one of his recent papers) the extension method should override the one in List. This is somewhat similar to the infamous "Mother May I" rule, but I'd better leave this for greater minds to solve.

Another interesting aspect of extension methods in general is their behavior when the first parameter of the method is generic. Take the Collections.max method for example. It accepts collections whose element type satisfies <T extends Object & Comparable<? super T>>. AFAIK there is no definition of extension method behavior with generics. Maybe they are not allowed, but if they were, I would expect that list.max() compiles only if the elements of my list are compatible with the static method definition, which in this particular case means that they should be Comparable to each other. Now this is an enhancement of the Java generics system - being able to add methods to classes based on their generic parameters! Wow, it's somewhat spooky, but really powerful, I can imagine a few neat things I could do with this :-) and since I am now going to spend some time dreaming, I am going to leave you to think about this...

P.S. (12/12/07) Just came across a vivid discussion of similar feature in .NET. First reaction - "ah, oh, I get it now, it's another one of those .NET has it so we gotta have it too features!". But seriously now - let's learn from their experience, the conclusions are pretty similar to what is discussed here.

Wednesday, December 5, 2007

Faster than C

It's not every day that you come across an example of a Java program that executes faster than its C equivalent. Although the original code is in Scala, the speed is due to the JVM and HotSpot, and not to the language in which the program was written. This is a good illustration of the JVM's strength, which IMO can and should be reused for other languages. (To be completely fair, gcc is not the strongest competitor among C compilers; there are optimizers that do the sort of trick that the JVM did. Nevertheless, it's impressive.)

Speaking of Java versus C/C++, there's an interesting debate on LtU regarding performance of GC vs. explicit memory management. The conclusion is not very surprising - there is a cost, but it is reasonable to pay given the other benefits of GC. It reminds me of a discussion I had with a guy who used to work with Java RTS (Real-Time Java System). I admit, when he first mentioned it, I didn't know what it was and it surely sounded weird, so I went and read a few web-pages. Java with explicit memory management.... hm? Poor Java, why would you do such a thing to it? So I asked him "Why don't you just use C or C++?" The answer was - "It's proven that programmers prefer Java and they are more productive writing Java code than C++". "All well, but shouldn't it be at least partially attributed to the fact that in normal Java we (programmers) don't need to manage the memory?" "I see your point" he smiled. That said, I admit that I know practically nothing about JME platform and how it solves real-time problems with Java. So I'm not shutting the door here - maybe I'm wrong.

But I just can't help but generalize this problem, looking at where things go with JSE. I mean it's great that Java evolves, but as wonderful as Java is, it's not the one universal tool that can solve every problem in the world. If there is a need for a Java-like language for real-time applications, then why can't some smart guy (at Sun or Google or wherever) create a language that does explicit memory management, but incorporates the other Java advantages? Just don't call it Java, please! (Yeah, I know that marketing guys really want you to, but let's try to keep some dignity, fellow enginerds).

Friday, November 30, 2007

JRuby pod-cast

I almost accidentally got to hear Neal Ford's interview about JRuby this morning.

Love it or hate it, Ruby cannot be ignored. Yet the articles and blog-posts you find on Ruby vs. Java are often so emotional and single-minded that it has become a really good joke. But not this one.

Neal starts with a bit of history, goes through Ruby comparison to Java (expressive power, meta-programming capabilities), the rising importance of JVM as a multi-language platform, comparison between JRuby and other JVM languages (Groovy, Jython, Jaskell, Scala) when to use them and how to get JRuby into your corporate development environment (the tips include - "don't say the word Ruby as long as it runs on the JVM" and "use it in test or build environment first - Ruby has great advantages there and it's easier to justify a new tool for build or test than for production") and finally about his work on first commercial product developed entirely in JRuby - the agile team management suite called Mingle.

I also liked his general comments regarding how GoF design patterns illustrate language deficiencies, his objection to XML programming and why the concept of a protective programming environment that allows scaling software development by hiring zillions of bad programmers is bankrupt.

P.S. Martin Fowler, Neal's colleague at ThoughtWorks, recently blogged about JRuby too.

Tuesday, November 27, 2007

Reflection and auto-boxing

In an earlier post I discovered that the way reflection handles synthetic bridge methods is somewhat lacking. Here is another reflection issue, this time a known problem: 6176992 - sort of paradox in a box.

Apparently Class#isAssignableFrom does not handle auto-boxing so if you have a method that returns a boolean, Boolean.class.isAssignableFrom(method.getReturnType()) will evaluate to false. I played with it a bit further, and if you have a method that takes an int, MyClass.class.getMethod(methodName, Integer.class) yields NoSuchMethodException.
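Both observations are easy to reproduce. The sketch below uses a made-up Sample class and only standard reflection calls; the helper names are mine, chosen for illustration.

```java
import java.lang.reflect.Method;

final class BoxingReflectionDemo {
    // Hypothetical target class for the lookups below.
    static final class Sample {
        public boolean isReady() { return true; }
        public void setCount(int count) { }
    }

    /** Asks whether Boolean.class is assignable from the boolean return type of isReady(). */
    static boolean booleanReturnIsAssignable() {
        try {
            Method m = Sample.class.getMethod("isReady");
            // getReturnType() yields boolean.class, not Boolean.class.
            return Boolean.class.isAssignableFrom(m.getReturnType());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reports whether setCount can be looked up with the given parameter class. */
    static boolean foundWith(Class<?> parameterType) {
        try {
            Sample.class.getMethod("setCount", parameterType);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // the lookup is exact: no boxing or unboxing is attempted
        }
    }
}
```

Here booleanReturnIsAssignable() comes back false, and the lookup succeeds with int.class but not with Integer.class.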

I expected these APIs to return a non-primitive class because Method#invoke accepts only Objects and translates them internally, and it worked long before the auto-boxing feature was added to Java. However, after giving this more thought, I decided that it's unfair to complain that the APIs return exact types. Also, a primitive type does not extend or inherit its boxed counterpart, and the Javadoc describes the behavior of isAssignableFrom impeccably - the method supports widening reference conversion only (chapter 5.1.5 in the 3rd edition of the Java Language Specification).

Nevertheless, Java Language Specification also defines widening primitive conversion (chapter 5.1.2) and (un)boxing (in 5.1.7 and 5.1.8) and I think it is fair to ask for a utility method that tests the compatibility of 2 classes that represent primitive types and a utility for boxing and un-boxing of classes. So what I'm really asking for is a java.lang.reflect.Primitive class, that provides some static utilities just like java.lang.reflect.Array does.

Until then, I am going to create my own utility for boxing/un-boxing - a Function<Class,Class> of course :-)
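A minimal sketch of such a utility, written with plain static methods rather than a google-collections Function so that it stands alone. The class and method names (Primitives, box, isAssignableAfterBoxing) are assumptions for illustration, and widening primitive conversion (int to long, say) is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;

final class Primitives {
    // Primitive class -> wrapper class, as defined by the boxing conversion.
    private static final Map<Class<?>, Class<?>> BOXED = new HashMap<>();
    static {
        BOXED.put(boolean.class, Boolean.class);
        BOXED.put(byte.class, Byte.class);
        BOXED.put(char.class, Character.class);
        BOXED.put(short.class, Short.class);
        BOXED.put(int.class, Integer.class);
        BOXED.put(long.class, Long.class);
        BOXED.put(float.class, Float.class);
        BOXED.put(double.class, Double.class);
    }

    /** Returns the wrapper class for a primitive type, or the argument itself otherwise. */
    static Class<?> box(Class<?> type) {
        Class<?> boxed = BOXED.get(type);
        return boxed != null ? boxed : type;
    }

    /** Like isAssignableFrom, but compares boxed forms, so Boolean accepts boolean. */
    static boolean isAssignableAfterBoxing(Class<?> target, Class<?> source) {
        return box(target).isAssignableFrom(box(source));
    }
}
```

An unbox method would be the inverse map; it is omitted here to keep the sketch short.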

Thursday, November 22, 2007

Properties, static and yet dynamic

I think it's time to share some more of my Java properties ideas.

Property is a super-type-token and the definition looks like this:

public static final Property<Person,String> name =
    new Property<Person,String>("name") {};

So I've taken the property out of its usual habitat - the object it applies to, and let it be defined in any class! One prominent flexibility is the ability to add properties to an existing class by loading a new class.
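To show why the trailing {} matters, here is a sketch of the super-type-token idea. It is not the actual Property class; TokenProperty, PropertyDemo and Person are hypothetical names, and the sketch only handles a direct anonymous subclass.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

abstract class TokenProperty<O, V> {
    private final String name;
    private final Type ownerType;
    private final Type valueType;

    protected TokenProperty(String name) {
        this.name = name;
        // The anonymous subclass created by "{}" records its type arguments,
        // which survive erasure and can be read back through reflection.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException("create as an anonymous subclass with type arguments");
        }
        Type[] args = ((ParameterizedType) superType).getActualTypeArguments();
        this.ownerType = args[0];
        this.valueType = args[1];
    }

    String name() { return name; }
    Type ownerType() { return ownerType; }
    Type valueType() { return valueType; }
}

final class PropertyDemo {
    static final class Person { }

    // Defined outside Person, exactly like the declaration in the post.
    static final TokenProperty<Person, String> NAME =
        new TokenProperty<Person, String>("name") {};
}
```

With this in place PropertyDemo.NAME knows at run time that it maps a Person to a String.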

How is the value associated with a particular object instance? A property should be assigned a "definition", which is in essence a Function that retrieves the value from a given object, by using reflection for example. The definition can be changed on the fly, and the user of the property need not be concerned.

That's all very nice, but reflection is not enough if we add properties on the fly. There are 2 ways of dealing with it. The more obvious one is having a Map or an array within the object where the values can be stored. This of course means that the object class is in advance "prepared" to have dynamic properties.

Another way is rather shocking at first glance... to me it was, anyway. Traditionally the object stores a property->value map. But why can't we have the property store an object->value map!? When I read about column-oriented databases, I realized it's not as crazy as it first looks. All I needed was WeakIdentityHashMap, unfortunately JDK does not have one, but google-collections do - the ReferenceMap. Also cloning and serialization need special hooks, but it's not too much hassle.
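The inverted layout can be sketched with JDK classes only. Note the swapped-in technique: this uses WeakHashMap, which compares keys with equals() rather than ==, so it is only an approximation of the weak identity map described above. ExternalValueStore is a hypothetical name.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

final class ExternalValueStore<O, V> {
    // The property, not the object, owns the object -> value map.
    // Weak keys let an entry go away once its owner is unreachable
    // (provided the stored value does not itself reference the owner).
    private final Map<O, V> values =
        Collections.synchronizedMap(new WeakHashMap<O, V>());
    private final V defaultValue;

    ExternalValueStore(V defaultValue) {
        this.defaultValue = defaultValue;
    }

    /** Returns the value stored for this owner, or the default if none was set. */
    V get(O owner) {
        V value = values.get(owner);
        return value != null ? value : defaultValue;
    }

    void set(O owner, V value) {
        values.put(owner, value);
    }
}
```

Any class can be given such a "column" without being prepared for it in advance, which is the whole point of the approach.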

I have 4 types of properties for now, Property that only allows read access, UpdatableProperty that supports read and write, and their multi-value counterparts: IterableProperty and CollectionProperty. This list can of course be extended. Any property can have a default value, and updatable property can have a list of value constraints. I have a library of definitions - plain function, reflected, mapped, ones using internal and external map like I described above and of course more can be added.

Last but not least - all properties that apply to a certain class of objects are tracked, so there's an API which I can query for applicable properties for a given class. It would collect properties of all super-classes and super-interfaces - note that property can be defined for an interface and in an interface (being a final static).

That's it for tonight. To be continued...

P.S. JavaEdge was really great, thanks Alpha folks!

Monday, November 19, 2007

Software proletariat

Don't miss today's Dilbert on the lower depths of our profession.

Sunday, November 18, 2007

Predicate + Function = Rule

Another enhancement on top of google-collections that I want to share with you.

In essence, a rule is a condition and an action. So I will use a Predicate as the condition and a Function as the action. The composite function then consists of a list of rules: for the first predicate that evaluates to true, it will invoke the corresponding function. If none of the predicates match, the default function will be invoked. Here is the source code, if you want more details.
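The first-match-wins logic fits in a few lines. The sketch below is not the linked source: it assumes java.util.function.Predicate and Function (Java 8) in place of the google-collections interfaces, and RuleList is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

final class RuleList<F, T> implements Function<F, T> {
    // One rule = one condition plus the action to run when it holds.
    private static final class Rule<F, T> {
        final Predicate<? super F> condition;
        final Function<? super F, ? extends T> action;

        Rule(Predicate<? super F> condition, Function<? super F, ? extends T> action) {
            this.condition = condition;
            this.action = action;
        }
    }

    private final List<Rule<F, T>> rules = new ArrayList<>();
    private Function<? super F, ? extends T> fallback = from -> null;

    RuleList<F, T> addRule(Predicate<? super F> condition,
                           Function<? super F, ? extends T> action) {
        rules.add(new Rule<F, T>(condition, action));
        return this;
    }

    RuleList<F, T> setDefault(Function<? super F, ? extends T> fallback) {
        this.fallback = fallback;
        return this;
    }

    @Override
    public T apply(F from) {
        // Rules are tried in insertion order; the first match decides.
        for (Rule<F, T> rule : rules) {
            if (rule.condition.test(from)) {
                return rule.action.apply(from);
            }
        }
        return fallback.apply(from);
    }
}
```

Because the composite is itself a Function, it can be nested inside another rule or passed anywhere a Function is expected.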

For example, here is how I define a function "size" of an arbitrary object:

Function<Object, Integer> size = new Rules<Object,Integer>().
    addRule(isNull(), constant(0)).
    addRule(
        or(
            instanceOf(Collection.class),
            instanceOf(Map.class)),
        self().reflect("size").cast(Integer.class)).
    addRule(
        asPredicate(self().reflect("getClass").
            reflect("isArray").cast(Boolean.class)),
        self().reflect(Array.class, "getLength").cast(Integer.class)).
    setDefault(constant(1));

If you read my earlier post, you already know about the FunctionChain, this is where I statically import self() from. The reflect() function transforms Object to Object by invoking a parameterless method with the given name on the current object, or passes the current object to a single-parameter method of another object (or class for static methods). The cast() method performs a cast to the given class. The instanceOf() function is a shortcut to ClassPredicate which returns true if the class it holds is assignable from a given class. I also assume statically imported or and isNull Predicates and Functions.constant. (I could replace the reflect function calls with special size and length functions, but I think it's enough to illustrate the idea as it is.)

The same segment in "normal" coding would look something like this:

if (object == null) {
    return 0;
} else if (object instanceof Collection) {
    return ((Collection) object).size();
} else if (object instanceof Map) {
    return ((Map) object).size();
} else if (object.getClass().isArray()) {
    return Array.getLength(object);
} else {
    return 1;
}

I don't pretend one is better than the other, but it's interesting enough, I think, and sometimes useful.

Modularity - outside in

It is probably a pretty obvious thing, and yet some developers and software architects manage to overlook this.

When refactoring a project under the flag of "better modularity" one needs to start with northbound, southbound and other external interfaces - remove the coupling between the "big boxes" in the system, and only then go inside each box and improve its architecture. Otherwise we'll find ourselves redoing huge amounts of code over and over. Just moving your middle tier code base to JEE does not solve the problem... it is like adding meat balls to spaghetti, while what you really want is lasagna.

Just a thought.

Wednesday, November 14, 2007

Walls and bridges

Last time I told you about the FunctionChain utility I wrote on top of google-collections. But I discovered a very strange problem, so I have posted a fixed version and I'd like to tell you about the problem, since I learned a lot while dealing with it.

I got a Function instance and I want to find out whether its apply method accepts nulls. In order to do that, I need to check the existence of @Nullable annotation on the first parameter of the apply method. The annotation's retention policy is run-time, so the task sounds pretty trivial, right? Wrong!

How do I get hold of the apply method? At first I thought function.getClass().getMethod("apply", Object.class) should do the job, you know, because of erasure. And indeed a method is always returned, only it was sometimes missing the annotations. This was driving me crazy, until I looked at all the methods of function using reflection. Guess what - there are sometimes multiple apply methods. How could it be - it's an anonymous inner class implementing the Function interface alone...? And then the flashback hit me - right, synthetic bridge methods! If I defined my function to transform Integer to String, there would be a String apply(Integer) method in addition to Object apply(Object). And if I defined my function to transform an Object to Class (equivalent of Object.getClass()) then I got 2 apply methods taking Object, one returns an Object and the other one a Class. Unbelievable, ha?!
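The duplicate apply methods are easy to observe. This sketch assumes java.util.function.Function as a stand-in for the google-collections interface; the class and helper names are made up.

```java
import java.lang.reflect.Method;
import java.util.function.Function;

final class BridgeMethodDemo {
    // An anonymous class, as in the post (a lambda would behave differently).
    static final Function<Integer, String> TO_TEXT = new Function<Integer, String>() {
        @Override
        public String apply(Integer from) {
            return String.valueOf(from);
        }
    };

    /** Counts the declared apply methods, optionally only compiler-generated bridges. */
    static int countApplyMethods(boolean bridgesOnly) {
        int count = 0;
        for (Method m : TO_TEXT.getClass().getDeclaredMethods()) {
            if (m.getName().equals("apply") && (!bridgesOnly || m.isBridge())) {
                count++;
            }
        }
        return count;
    }

    /** Returns the apply method that was actually written, skipping the bridge. */
    static Method declaredApply() {
        for (Method m : TO_TEXT.getClass().getDeclaredMethods()) {
            if (m.getName().equals("apply") && !m.isBridge()) {
                return m;
            }
        }
        throw new IllegalStateException("no non-bridge apply method found");
    }
}
```

Filtering on Method#isBridge is a simpler way to find the hand-written method when the class declares apply itself; it does not help when apply is inherited from a generic super-class.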

Another thing worth noting, is that the annotation is present only on the original method, not on the bridge method. I have browsed the web and found 2 references to similar problem here and here. Both Michael Ernst and Rob Harrop think that not copying annotations to bridge method is a javac bug, but Sun Bug Database has no trace, so I'm going to submit one.

Ok, at least I knew now what's going on, but how do I find out the right method? I decided that the only way of doing it is figuring out the first parameter in the Function generic definition. I used Neal Gafter's super-type-tokens trick with some minor enhancements - array support and ability to process arbitrary number of parameters. It allows me to find out the classes that substitute generic parameter definitions, and once I get the first one - it's the apply method parameter class, so I'll use it to look up the method. The apply method return type should not be a problem, because according to getMethod JavaDoc the method with a more specific return type will be found (and I can't imagine how there can be more than two methods with same parameter in our case).

I had to change Neal's code a bit, because while he's using getClass().getGenericSuperclass() I had to use getClass().getGenericInterfaces() and to find the one representing Function. I noticed that interfaces need to be manually collected from all super-classes, because each class would only return the ones it explicitly implements, and (usually) not the ones that the super-class implements. Bummer... To add insult to injury - there may be several implemented Types that represent a Function - for example if the parent implements Function<T,String> and child implements Function<Integer,String>, those would be 2 different types, and I, of course, am looking for the most specific one. LinkedHashSet did the job. Phew.
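The interface collection step might look roughly like this. It is a sketch, not the linked source: it assumes java.util.function.Function instead of the google-collections interface, ignores interfaces that extend Function, and gives up on type variables. FunctionTypes is a hypothetical name.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Function;

final class FunctionTypes {
    /** Gathers generic interfaces of a class and its super-classes, most specific first. */
    static Set<Type> allGenericInterfaces(Class<?> type) {
        Set<Type> result = new LinkedHashSet<>(); // keeps child-before-parent order
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Type t : c.getGenericInterfaces()) {
                result.add(t);
            }
        }
        return result;
    }

    /** Returns the first type argument of the most specific Function, if it is a plain class. */
    static Class<?> inputType(Class<?> type) {
        for (Type t : allGenericInterfaces(type)) {
            if (t instanceof ParameterizedType) {
                ParameterizedType p = (ParameterizedType) t;
                Type first = p.getActualTypeArguments()[0];
                if (p.getRawType() == Function.class && first instanceof Class) {
                    return (Class<?>) first;
                }
            }
        }
        return Object.class; // nothing more specific than the erasure was found
    }
}
```

The returned class can then be passed to getMethod("apply", ...) to look up the non-erased method.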

You can look at the source code here, let me know what you think. All in all, it turned out to be much harder than I originally expected, and I am wondering if google-collection folks would be willing to add similar functionality to their package... after all run-time retention of @Nullable is useless without it.

Wednesday, November 7, 2007

Null-safe access to properties

First I am going to show a nice side-effect of using properties with google-collections interface Function. Like many others I am annoyed to write code of the style


Bar getBarOfThing(Thing thing) {
    Bar bar = null;
    if (thing != null) {
        Foo foo = thing.getFoo();
        if (foo != null) {
            bar = foo.getBar();
        }
    }
    return bar;
}

All the null checks make it unreadable. And heaven forbid I forget one null test... BOOM! NPE! It's much nicer in SQL, for example, where everything's a bag, no results - no problem, join it with whatever you want, you'll just get an empty bag.

Now imagine that

Function<Thing,Foo> foo = new Function<Thing,Foo>() {
    public Foo apply(Thing from) {
        return from.getFoo();
    }
};

and

Function<Foo,Bar> bar = new Function<Foo,Bar>() {
    public Bar apply(Foo from) {
        return from.getBar();
    }
};

Luckily google-collections defines a @Nullable annotation and I am going to use it to differentiate between functions that can gracefully handle null (e.g. return some default value) and the ones that can't. Now I'm going to define a FunctionChain class that provides me with a fluent interface to chain the functions:

FunctionChain<Thing,Bar> chain =
    FunctionChain.<Thing>self().function(foo).function(bar);

Which I can safely apply to any Thing, including null:

assertTrue(chain.applyTo(null) == null);
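A stripped-down sketch of the chaining idea follows. It is not the real FunctionChain: it assumes java.util.function.Function, treats every function as unable to handle null (there is no @Nullable check), and the name NullSafeChain is made up.

```java
import java.util.function.Function;

final class NullSafeChain<F, T> {
    private final Function<F, T> composed;

    private NullSafeChain(Function<F, T> composed) {
        this.composed = composed;
    }

    /** Starts a chain that simply returns its input. */
    static <F> NullSafeChain<F, F> self() {
        return new NullSafeChain<F, F>(from -> from);
    }

    /** Appends a step; the step is skipped when the value so far is null. */
    <R> NullSafeChain<F, R> function(Function<? super T, ? extends R> next) {
        return new NullSafeChain<F, R>(from -> {
            T intermediate = composed.apply(from);
            return intermediate == null ? null : next.apply(intermediate);
        });
    }

    T applyTo(F from) {
        return composed.apply(from);
    }
}
```

Each step only ever sees a non-null argument, so the null checks live in one place instead of being repeated at every call site.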

Tuesday, November 6, 2007

Properties for Java

The need for better support for properties in Java has been raised many times and in several contexts, and I am definitely not going to repeat the reasoning here. It used to be on the list of features proposed for Java 7, and here's the last time I read about it on Rémi Forax's blog (quite a while ago). Fred recently blogged about another way of addressing the problem via abstract Enums. There were several other interesting proposals, and the main trends are summarized very nicely by Stephen here. Stephen is also the one who first blogged about the bean-independent property object and wrote a reference implementation - the Joda Beans project. I had started developing similar ideas in parallel, and since I have now reached the stage of a nice working prototype, this is what I hope to describe in the upcoming posts.

First of all, I chose the bean-independent property approach (although it was before Stephen coined the term), because I think the real added value in properties is being able to refer to the property regardless of any particular object instance. Another big design decision is to avoid changing the Java compiler, unlike Fred and Rémi. I mean it's a very interesting exercise to play with javac and I am really grateful that Sun made it possible, but... IMO that's the last resort to solve the problem. Even in the best case where the enhancement is accepted by Sun, the earliest time when it will work in an official Java version is Java 7, which is still very far away. And using a home-made javac in a production environment is just not very realistic; imagine justifying it to your boss (or customer). I also tried to stay away from bytecode or source generation, and have managed well so far, although I might need to use it in the future in order to reduce the verbosity and improve the safety of my solution.

So here is what I managed to do: no compiler changes, fully type-safe properties, some null-safety built in (ideas similar to what Stephen blogged here), the properties can have annotations and can be referred to in annotations, they can be applied to existing Java Beans via reflection and they expose their own meta-data via reflection. In addition to all that some Ruby-esque adding/modifying properties on the fly is supported. The last, but not least, is that I have been using google-collections quite a lot in this project so there's very nice inter-operation, for example Property implements com.google.common.base.Function and writing pseudo-functional code in Java using google-collections and fluent interfaces has been discussed quite a lot recently.

So this is the blog-post number 0, sort of preface, that hopefully will keep you interested enough to come back and read the blog. To be continued...

Saturday, November 3, 2007

Functional programming for the JVM

Dear Java, my good old friend, there is something I need to tell you. You've been my only one for over 10 years, but now I have trouble staying faithful. It's not you, it's me. And no, I am not moving out, not just yet, because I can't afford to leave all your stuff behind.

But you don't satisfy all my needs anymore!

I know you are trying, dear, and good people try to help you - here Neal Gafter is working hard to give you closure support, Ricky Clarkson, a true alpha-geek who played with the first prototype of Neal's work, has blogged the results. But, sorry love, I can't wait until the big makeover in Java 7. What do you expect me to do with all these JVM languages around? These languages inter-operate (well, to certain extent) with your libraries, so I can enjoy all the usual stuff but I can also be more expressive. Here is an example of hooking up JRuby with Velocity and another one of writing servlets in Scala under Tomcat.

You wanna know who they are? Among the young ones, Scala deserves to be mentioned first, alongside Ruby, JavaScript and Python to name just a few. Classic ones include several Lisp dialects (e.g. Clojure and Common Lisp), Scheme and OCaml implemented over the JVM as well. Either there is a compiler that produces bytecode, or there's a JVM scripting engine, and sometimes both. I hope one day to make a deeper study and produce a proper comparison table, maybe then you'll understand.

You think that closures alone aren't good enough reason to have a little flirt on the side with another language? Hmm, consider the fact that "the other language" may offer me hot-swapping of classes, mix-ins, better modularity, pattern matching, better concurrency model and dynamic typing support ... who can blame me for being tempted.

Still loving you, yours truly but... not entirely.

Thursday, November 1, 2007

Beware: Overflows in numeric operations

Well, isn't it amazing how huge the problems caused by a simple little piece of code can be? Have a look at Java Puzzlers - there are plenty of examples. What is important to understand is that it's neither fiction nor the product of Neal and Josh's perverted minds. It's out there in the software we use and the software we produce. Want proof? Nearly All Binary Searches and Mergesorts are Broken.

Here is the story of what happened to me. The code was written by a colleague, and at some point became my responsibility. I didn't notice anything wrong when first reviewing it, but later on I had to fix the problems. Now have a look:


public class Node implements Comparable {
    private long _id;

    public Node() {
        _id = IdGenerator.getNewId();
    }
    public Node(long id) {
        _id = id;
    }

    public boolean equals(Object obj) {
        return compareTo(obj) == 0;
    }

    public int hashCode() {
        return (int)(_id ^ (_id >> 32));
    }

    public int compareTo(Object obj) {
        if (obj == null) {
            return -1;
        }
        if (obj instanceof Node) {
            Node tmp = (Node) obj;
            return (int)(_id - tmp._id);
        }
        return -1;
    }
}

Looks innocent, right? Why, then, would new Node(12884902889L).equals(new Node(4294968297L)) be true?! Ah, because 12884902889-4294968297=8589934592=2^33, and when cast to int only the low 32 bits survive, so 2^33 of course becomes 0. Bellissimo!
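One way to avoid the truncation is to compare the long values directly instead of subtracting and casting. The class below is a sketch of that idea, not the fix that went into the real code; SafeNode and truncatedDifference are names made up for illustration.

```java
final class SafeNode implements Comparable<SafeNode> {
    private final long id;

    SafeNode(long id) {
        this.id = id;
    }

    @Override
    public int compareTo(SafeNode other) {
        // Long.compare never overflows, unlike (int) (id - other.id).
        return Long.compare(id, other.id);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof SafeNode && ((SafeNode) obj).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }

    /** The original subtract-and-cast comparison, kept only to show the failure. */
    static int truncatedDifference(long a, long b) {
        return (int) (a - b);
    }
}
```

With the two ids from above, truncatedDifference returns 0 while compareTo and equals tell the nodes apart.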

But hey, here's one even worse. IdGenerator generates long Ids from 2 integers - persistent high and in-memory low, low being incremented all the time and high only when MAX_VALUE of ids have been given out or when the process restarts. So far so good. Look at this function then:

public synchronized long getNewAndReserve(int interval)
    throws IdGeneratorException
{
    long value;
    if (low + interval >= MAX_VALUE) {
        high = getNewHighValue();
        low = 0;
    }
    value = getNewId(high, low);
    low += interval;
    return value;
}

MAX_VALUE is BTW Integer.MAX_VALUE. Now think puzzlers... YES! The sum of 2 ints is an int, so it can NEVER exceed MAX_VALUE; it overflows to a negative number instead! The fix for this is to change the condition to (low >= MAX_VALUE - interval) and to add an assertion that (interval > 0).
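The two conditions can be compared side by side. This is a stand-alone sketch with hypothetical method names, using Integer.MAX_VALUE directly rather than the generator's own constant.

```java
final class OverflowCheck {
    /** The original test: the int sum wraps around before it is compared. */
    static boolean naiveNeedsNewHigh(int low, int interval) {
        return low + interval >= Integer.MAX_VALUE;
    }

    /** The rearranged test: the subtraction cannot overflow for a positive interval. */
    static boolean safeNeedsNewHigh(int low, int interval) {
        if (interval <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        return low >= Integer.MAX_VALUE - interval;
    }
}
```

For low = Integer.MAX_VALUE - 5 and interval = 10 the naive test says no new high value is needed, because the sum has wrapped to a negative number; the rearranged test says yes.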

All this makes you wonder whether a high-level programming language like Java should help you avoid this kind of problem. But that's worth another post sometime. In the meantime, look after your numbers, folks.

Friday, October 12, 2007

Why is this blog named this way?

Because of my main occupation and lipstick colour.