Friday, January 9, 2009

Static types are from Mars, Dynamic types are from Venus


static - associated with logic and acting by the rules, strong, efficient, usually responsible for safety and order enforcement; but non-compromising (for better or worse), non-adaptive, weak in communication skills.

dynamic - associated with beauty and elegance, light, good in communication skills, can usually be easily made to do what you want them to, change all the time and adapt to change well; but unpredictable, act on intuition rather than logic, often seen as less efficient and weaker.


Wednesday, January 7, 2009

Types and components

Just some thoughts following a recent conversation I had. Don't we always want static types? If we can detect errors in our program, why not do it as early as possible? Sure. So when is "as early as possible"? I think the answer depends on what our program is - is it one monolithic piece or a component?


In the past monolithic software was prevalent, but today I would bet that most software is meant to be a component. Just look at open-source as it used to be before Java - you'd usually download the C source-code, change it if you need, build it all on your machine, and create one big executable. (And BTW - congratulations, you are a geek.) Cross platform Java changed the situation. Now open-source projects give you a jar to download, you put it on your classpath and you are ready to go. This lowered the bar for open-source adoption, and contributed (aside less restrictive licenses and other factors) to baby-boom of open-source frameworks and tools. Nowadays a Java developer cannot even imagine having no 3rd party jars in the classpath!

Static type check validates our software component against other components in the compilation environment. Does it match the runtime environment? What about different configurations of the runtime environment - there are tests and real deployments, and various types of deployments, and any given installation environment can change over time - new components being added, other updated or removed? How do we guarantee that compile time checks still hold? The short answer is - we can't. We need dynamic type safety anyway. Now let's examine the added value and the price (yes, there is one!) of deeply static types.

On code organization level, we try to reduce the dependencies to bare minimum - hide classes behind interfaces that we hope will remain more stable. The problem is that number of interfaces and factories in our application grows, while pursuing modularity we sacrifice simplicity... So maybe the problem is in the name? Some go as far as add support for structural types, minimizing dependency to a single field/method signature (not a problem-free solution, but there are interesting refinements). All this may help, but doesn't really solve the problem.

Another aspect we need to deal with is building and packaging the software. Here we enter the world of dependency management, the world of "make", Ant, Maven, repositories, jar versions; if it's a large enough and complex enough software we work on, simply speaking - we enter the world of pain. I still find it strange that we haven't found a better way.

As for application deployment and its problems, we'll get back to it later. But the truth is that no matter how hard we try, we can't guarantee there will be no errors when we deploy our software, so ... JVM doesn't trust us and gives us verification.

Simply put, when class is compiled, some of its requirements from other classes are captured and encoded into the bytecode. Then JVM would check them when the class is loaded, and reject the class if they can't be met. (This is really an over-simplified description of a complex algorithm, which also takes time to execute, despite optimization efforts on JVM side.) So this isn't really a dynamic check, it's something in between - names in our class get linked when it is loaded. In the classic Java SE class-loading scheme, where components are basically a chain, this scheme should work. But if we want real components, ones we can add, override, replace or remove while program is running - sweet turns sour. Our interfaces and factories have names, and classes that represent them need to reside in some "common vocabulary" usually loaded by the parent classloader, because it's not only the class bytecode that matters, but also who loaded what. Since we are talking actual classes, not their names, once we loaded two components, they cannot change their protocol of communication without reloading their parent, they also can't use a different version of a sub-component that the parent component has referenced.

In JEE that sort of things is necessary, that's why classloading in JEE is a terrible mess, not only it does not follow any specification, but it is different in almost each and every app server (wasn't there supposed to be portability?!) If you ever used commons-logging in a JEE app, you probably know what I mean. Maybe it got fixed lately, I don't know, but the Tech Guide for commons-logging is an ode to classloader frustration.

Back to deployment: whenever there are some sort of dynamic components - JEE, Spring or OSGi, there is always reflection. And most of the time there's lots of XML too. It's an escape route from static types. I attended Alef Arendsen's session at JavaEdge that presented OSGi and SpringSource. I carefully watched Alef juggle between XML, source and console like a child who watches a circus magician trying to uncover his tricks. But I didn't quite figure out the magic. And that was a whole session just for HelloWorld. I know Spring folks are doing best they can, and they're smart and all... but comparing to Smalltalk, I wasn't quite impressed. As for other solutions, although I haven't tried this out, there's Guice/OSGi integration without XML and with dynamic proxies and on-the-fly bytecode generation with ASM, but there's some overhead for the user, because it requires intermediate objects for services. So this way or the other, looks like JVM platform is holding us back.

Verification is addition, not replacement of dynamic checks. So what we get is basically a triple check of correctness (javac, verifier, dynamic) but loss of flexibility - we are interfering with components runtime life-cycles. If the invocation target is resolved just in time when the call is made, nothing precludes the target component from being reloaded between calls. But with preemptive validation, we get a static dependency tree at runtime, classes wired with each other "too early" and for good, which makes reloading a component very hard (although people keep trying). The reason for "early linking" is also performance, but late binding doesn't mean that the runtime platform can't do any optimization heuristics... but they'll have to be dynamic optimizations in the style of JIT. Will invokedynamic bring the salvation?

It seems when we are talking about multiple components, "statically typed platform" does not quite do the job. Static type check may mean a lot inside a component, but as for inter-component communication they are not only useless, but harmful. People sometimes dismiss dynamic types, because they think "it's like static types, but without static types". What they may not realize is that you are not just loosing, you are gaining something with dynamic types. You get late binding and meta-programming, and in a multi-component environment, it means a whole lot!

And that's when we are talking "inside the platform" components developed in the same language. Once you work with a system that runs on a different platform or developed in a different language - our type system doesn't normally stretch across the communication boundary. The other system may not even have static types, and since we are only as strong as the weakest link, our static types don't really help us. I think every time we try to encode types into communication between systems we end up with a monster like CORBA or Web Services. But there's another (unfortunately popular) extreme of just sending a string over and hoping for the best - with no checking on our side at all. Then we are relying on the other system to stop us from doing damage, and there's no way to correctly blame the component that made an error - was it a wrong string or an unexpected change on the other side? I think that ideally type or contract checking and conversions can be done dynamically on both sides, and not as part of the protocol. This results in light and flexible data-exchanging protocols (like HTTP or ATOM) which are easier to work with and I think will win in the end. On the more theoretical level I like this model for intercommunication and of course there are Aliens, that model external system as a special object in our system.

So as far as I see - components simply require a dynamic environment, they may be statically checked inside, but act as "dynamic" to the outside world. Sort of hard skeleton and soft shell. Indeed soft parts are much easier to fit together and less breakable, due to flexibility - this is used often in mechanical engineering and in nature, so why not in software?

Thursday, December 11, 2008

How not to implement Comparable

I have already blogged about the danger of numeric overflows. I recently came across this example in Java course materials (!!!):

public class Foo implements Comparable<Foo> {
private int number;
public int compareTo(Foo o) {
if (this == o) return 0;
return number - o.number;
}
}
How do you think Integer.MAX_VALUE compares to negative numbers? It will appear smaller. This reminds me of even worse case we encountered in a real codebase. Look at this:
public class Foo implements Comparable<Foo> {
private long id;
//...
public int compareTo(Foo o) {
if (this == o) return 0;
return (int)(id - o.id);
}
public boolean equals(Object o) {
return o != null && (o instanceOf Foo) && compareTo((Foo)o) == 0;
}
}
How do you think new Foo(8325671243L) and new Foo(25505540427L) compare? They are equal, but I will do it ala Weiqi Gao and leave you to find out why... :-)

Static initializers - update

After some interesting comments to my previous write-up I decided to make a follow-up post with some additional details.

Here is the code from Effective Java:

public class Person {
private final Date birthDate;
//...
private static final Date BOOM_START;
private static final Date BOOM_END;

static {
Calendar gmtCal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
gmtCal.set(1946, Calendar.JANUARY, 1, 0, 0, 0);
BOOM_START = gmtCal.getTime();
gmtCal.set(1965, Calendar.JANUARY, 1, 0, 0, 0);
BOOM_END = gmtCal.getTime();
}

public boolean isBabyBoomer() {
return birthDate.compareTo(BOOM_START) >= 0 &&
birthDate.compareTo(BOOM_END) < 0;
}
}


and here's how I would have changed it:

public class BabyBoom {
private static final BabyBoom boom = new BabyBoom();
private Date start = null;
private Date end = null;

private BabyBoom() {}

public static BabyBoom getInstance() {
if (boom.start == null || boom.end == null) {
Calendar gmtCal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
gmtCal.set(1946, Calendar.JANUARY, 1, 0, 0, 0);
boom.start = gmtCal.getTime();
gmtCal.set(1965, Calendar.JANUARY, 1, 0, 0, 0);
boom.end = gmtCal.getTime();
}
return boom;
}

public boolean contains(Date birthDate) {
return birthDate.compareTo(start) >= 0 && birthDate.compareTo(end) < 0;
}
}

public class Person {
private final Date birthDate;
//...
public boolean isBabyBoomer() {
BabyBoom boom = BabyBoom.getInstance();
return boom.contains(birthDate);
}
}

I didn't synchronize getInstance, because initializing the dates twice does no harm, so it's not worth the price of synchronization. However I did check that both fields are initialized before returning the object in getInstance

Sunday, December 7, 2008

A case against static initializers

"Effective Java" is an excellent book. I recently bought the 2nd edition, and it is absolutely fabulous, priceless. However after quite some time in the industry, I've learnt not to take any advice blindly. First edition was also excellent, but several of the items were revisited since then. So here's an item that I have mixed feelings about - Avoid creating unnecessary objects. The advice is to use static initializers for the expensive computation. 

Static initializers are double-edged sword. It's like with the stock exchange in times of crisis - for a particular individual it may be a good idea to sell the stock, but the trouble is that everybody's doing it, and in the end everybody's losing big-time. Same applies to static initializers. One or two may seem harmless, but they add up and together create a big problem. The fact that static initializers are (edit) may be invoked at program start-up affects everybody and since they potentially interfere with classloading, it's very hard (edit) harder to debug them if anything goes wrong.

Here is how it usually gets out of hand: people start with initializing static members in static blocks. Map of values, sort of configuration details. That alone sounds harmless. But soon comes the time when the values in a map need to be read from a properties file, so here we got IO within static block. Uh oh, better catch these exceptions. Before you notice the whole thing turns into a puzzler. Don't believe me? Here's a real problem I had.

Server in Java, client is either applet or JNLP. On certain machines, only one of them can run. You run server first - client never comes up, no error whatsoever. You reboot and connect as client to another machine - no problem. But if you try to start up the server locally - silent death. A team in India spends months on it. In vain. The whole release is detained, escalation to senior management. The thing ends up on my desk after a ruthless blame game between teams. Long story short: it's a DirectX problem. Why the **** does the server need DirectX? Ah. What's next after reading defaults from a properties file? Reading them from the database. Oh, but it's a different process, and the database needs to be up. So we not just connect to database from initialization block, we wait for the database process to be up. Great idea. How? We follow "best coding practices" and reuse: find a poller utility somewhere in the JDK. Apparently there is a java.awt.Timer. Why not? Great idea. Apparently, a touch of one AWT class causes a bunch of other AWT classes to load, which in turn loads DirectX and OpenGL dlls. And guess what - Windows on some machines has a nasty bug, that only allows to load them once per machine, regardless of the user. And when another user tries to do it - the loading gets stuck. And our server is of course a system process, while the client belongs to the logged in user. 

Since it was a last minute fix, we solved it with some JVM flags that disabled DirectX and OpenGL. The problem was not the fix, but the diagnosis. If it was part of the regular code, it would have been easy to connect with a debugger, see what call gets stuck, investigate it from there. But as it was part of start-up, people didn't know where to look. Not to mention the man-months accumulatively spent by developers who waited for the server to restart when testing. 

So... what's the lesson here? Life is better without static, avoid it as much as you can. 

Java... tea?

No clone for you!

Java Practices call Cloneable/Object#clone API pathalogical and suggest to avoid it. I have to agree. Here's my little story. 


I have a method that receives a certain object, which is Cloneable. Now I want to actually clone that object, so I am calling the clone method. Makes sense, no? Apparently not. My slightly outdated version of IDEA was confused, just as I was, and did not complain. But Javac was cruel and uncompromizing. No clone for you. 


Because Cloneable does not require a clone() method, and Object#clone() can be protected. WTF? My objects subclass different not-necessarily-cloneable classes, so they can't have a common cloneable superclass. So I'll define my own ReallyCloneable interface that has a public clone method, sounds simple enough...? Not!
public interface ReallyCloneable extends Cloneable {
    public Object clone() _
}
Now I realize there's CloneNotSupportedException to throw. A checked exception. I have 2 options: be a good girl, adhere to best practices and clone the signature of Object#clone, or ... not. Let's explore both options. 
public interface ReallyCloneable extends Cloneable {
    public Object clone() throws CloneNotSupportedException;
}
Now all my classes (let's assume they are simple enough to contain only primitives, Strings and arrays of the above) implement ReallyCloneable and add those 3 lines:
public Object clone() throws CloneNotSupportedException {
    return super.clone();
}
But whoever clones my objects also needs to add:
try {
    //call clone
} catch (CloneNotSupportedException e)  {
    //handle it... how???
}
Because implementing Cloneable and all the rest means that JVM will not necessarity throw CloneNotSupportedException, but as far as Javac is concerned, it still may. So no matter what, the caller needs to prepare himself for the worst. Then why bother with interfaces and all that, I ask? People say Java is verbose. I don't mind verbose, but bloated with boiler-plate to this extent?! 

The other option is not to declare the exception in ReallyCloneable. This is possible because overriding rules do not require to throw the exceptions of the overridden method.
public Object clone() {
  try {
    return super.clone();
  } catch (CloneNotSupportedException e) {
    //hm...? 
  }
}
Even though JLS allows it, my IDE and static analysis tools will complain. So I have to either reconfigure them or add some "escape" annotations. More typing... By now I feel like a typomaniac. And wait, I should handle the exception - I don't want to swallow it. Besides, what if someone really objects to being cloned? There is a need for a runtime exception. 
public class CloneFailedException extends RuntimeException {
public CloneFailedException() {
super();
}
  public CloneFailedException(CloneNotSupportedException e) {
    super(e);
  }
}
and
public Object clone() {
  try {
    return super.clone();
  } catch (CloneNotSupportedException e) {
    throw new CloneFailedException(e); 
  }
}
Now at least the callers don't have to catch. But they will still have to cast, because there is no way to express a self type with Java's static types system. Sigh.

And all this why? and for what? Couldn't we just have Cloneable with a public clone and a RuntimeException to throw if we didn't want to be cloned? Simple as that? 

This is yet another example of static type system's failed attempt to protect us from ourselves. 



Saturday, November 29, 2008

Brains, bucks and programming languages

The title is supposed to be a paraphrase of "sex, drugs and rock'n'roll" in a geeky context.

This autumn I went to see Paul McCartney in concert - a lifetime dream come true. For most people Paul McCartney is first of all an ex-Beatle. Indeed, during the concert he played many classic Beatles tunes to please the audience. And the audience was very pleased. Then, he cashed in the multi-million-dollar cheque and went back to England to do what he really likes - which at this point seems to be composing experimental electronic music.  To me it looks pretty fair. 

Recently I listenned  to James Gosling's keynote at the JVM Language Summit. I actually enjoyed the presentation very much. One of the things he said, was something like "My dream would be to implement Fortran over JVM ... ah, but I have a day-job".  Now, not that JVM really needs a Fortran IMO. But think about it for a second. How many people in the world can design a programming language? How many of them can design a good programming language? And a popular one? Java is more popular than Beatles. Uhm, well... even if it's not, you get the idea.  Now what can be more important for James Gosling to do during his day job than design a programming language of his choice, I should ask his employer? What? Throwing T-shirts at JavaOne attendants? No, really. Why is it that James Gosling can't do anything he freaking likes for the rest of his life?

I think something in our business is unfair. I am not saying Microsoft model is right, I am very pro open-source and free software and all that. But I'm confused - something about it isn't right. Large IT companies make loads of money, and waste a lot of it on complete crap - I've seen this from inside. So how come Gilad Bracha cannot find funding for Newspeak development? This is totally surreal!

There goes another angry post.
 

DSL - fuel for life

Do you feel that your programming language is too bloated? I do, and I know I am not alone

Let's take a look at Java. You may ask - what do you mean, when you say Java? Ah, good question. There is Java, the language, as described in the spec. Java the Standard. Don't you wonder where's the new edition of the book, BTW? Anyway, then there's mini-edition, enterprise edition, real-time Java... there's a huge stack of official/certified technologies (e.g. all the JEE stuff - is JSP or JSF Java? is JPQL?), for which there are often multiple vendors. That's not all, there are all the popular open-source frameworks that don't bother getting Sun's approval, and yet they possess lion-share of the market (e.g. Eclipse, Spring). There's no chance to even keep track of all this, forget mastering. And yet, I don't feel that I have all I need. I have all that, and yet I can't write a descent program the way I'd like to, because I am missing some core features. What? Let's see - how about proper modularity, closures, tuples, local type inference, properties... 

Java made the grade in expanding layer by layer - to the extent where it reminds me the state of the Earth in Wall-E. If you haven't seen the movie, I'll just say that the Earth drowned in human-produced garbage, and the garbage made it inhabitable for anything organic. The garbage in Java makes it impossible for the core, organic features of the language to grow. In the movie, people are leaving, and robots stay back to clean up - now you make the analogy. 


As a Java programmer I really identify with Wall-E. Moving tons of garbage around, day after day, in a desperate attempt to clean up the world - something obviously impossible. And man, I'd love to be that flying-iPod-looking Eve from outer space. She's so strong, so modern, so clean and shiny...  And she has a mission!



If Wall-E is my Java, then my Eve is no doubt Newspeak

So what does it mean in a wider sense of programming languages? I think good language should have a small core, as little redundancy as possible. And it should grow from then on. Here's Guy Steel at OOPSLA '98:


I think that the best way of growing is via internal domain specific languages, DSLs, - you don't change the core language, yet you cover more and more domains. Anders Hejlsberg also talked about it at the recent PDC conference - adding features as a DSL; and then, if the DSL is very successful and popular, add syntactic sugar to the core language, as they did with LINQ. I don't think that you need that latter part if your language is well suited for DSLs to begin with. So the language should have a small core and be well suited for DSLs. Java isn't DSL-friendly. Scala is much better, e.g. the Actors library. Haskell, Ruby and of course Smalltalk are really good at it. 

The last part is getting rid of garbage as you grow, and again, Gilad has an idea how to do that

I could go on forever and ever about DSLs, showing examples like parser combinators, unit-test and SQL libraries and so on. Martin Fowler is writing a book. So, instead of boring you with repeating what's been said many times, and enumerating lots of references, I will just finish with this cutest quote:

DSLs are for making languages bear-able.
    For I am a bear of very little brain and long words confuse me. [Milne 1926]

The premise of this subject is that computers should adapt to the ways of people, and not the other way around.  


Thursday, October 23, 2008

Types and other virtues

Here's a thought. Why do Java Generics attract so much bad vibes? If we're talking about types again, gotta visit our old friends - ML, Haskell. They are doing fine, they've got it all figured out. Hindley-Milner etc., they've proved themselves to be right.

But them and Java, it's apples and oranges. So to bring them to the same arena with object oriented languages, let's see how are they doing on the front of extensibility? Well, there are some recent advancements, polymorphic variants in OCaml, extensible records in Haskell, but it's not like the other stuff they've figured out ages ago, there's a slight hint of hesitation here.

So having types and being object oriented at the same time - maybe it's just, hm, non-trivial. If Martin Odersky says it's hard for the compiler (and he's a genius!) - then it really must be. Now look at it as a reverse Turing test - if the machine can't figure it out, how the **** are we supposed to? Which means we need an escape route. If declaring the right type is hard, we need an easy way out.

Pre-generics Java didn't bother itself too much with complex static types, the trick was to fall back on the dynamic types whenever in doubt: arrays - ArrayStoreException, collections - ClassCastException. Some say it's broken. It's a hole in a fence, but it's a way out . "Oops"

Then Generics came and fixed the hole in the fence. But, without realizing, they violated the status-quo, the eco-system, they stepped on a butterfly. Suddenly, the brain hurts, and the only way out is dumping all the angle brackets and jumping over the fence. It's a "No Exit"!

What does Scala do? Ah! First of all Scala has a much more advanced type system to start with. But the real trick is implicits. If it's too hard to declare the right type, just imperatively describe the conversion. Done deal, here's your way out. It's pretty cool, but implicits have their share of bad vibes too. Can't live with them, can't live without them, I guess.

So how about implicits for Java 7? No, actually how about Scala for Java 7? It's been done before, you know...

Sunday, October 5, 2008

The Great Divide

What hasn't been said about static vs. dynamic types in programming languages? Read on at your own risk, because here I go again...

When you think of static types what comes to mind? Haskell? OCaml? Scala? Dear friend, you are better than most of us, but you have clicked the wrong URL. Peace. See you in another post...

Did you say Java? Still with me? Good. Listen, now when the others have gone, just between you and me, the guys from the previous paragraph - they're on to some good stuff. Check it out, you won't regret it. But don't quit your day job, not just yet. It's a bit complicated, but did you ever witness extreme programming methodology implemented in a big corporation? No? Then picture this: Elbonians take over Dilbert's firm and make everybody do XP. They even send pointy-haired boss to a Certified Scrum Master course. Get the outcome? It can only end like the implementation of Carl Marx's ideas in Russian countryside. I am trying to say - there are ideals, and there is reality. In reality, Haskell programs have bugs too.

So Java, you say. How do you feel about dynamic types? Cool? Get out of here. No really, it's no fun preaching to the converted. See you!

Oh no, heaven forbid, you won't touch them with a stick. You're my guy then. So let's rewind to Java 1.4 days, after all many Java developers still use 1.4 and many others look back at it with nostalgia. Are you one of them? Ok. So what about pre-generics collections, do you think they are statically typed? Hmmm... And what percentage of your code involves collections? So this code was not entirely statically checked. Now add all the reflection stuff...

But then of course came Generics. And suddenly Java is much more complex. How do I make my code compile, gee, wildcards, captures... ?! I am trying to get something done, hello!... It's easy of course to blame Generics implementation, but if we learn something from the folks whom I kindly asked to leave in the beginning, they'll tell you that finding correct static type for every element in your program is hard. They of course think that hard is good, they are noble men with ideals, they like overcoming challenges. But you and I, we're just trying to make a living. So we curse Sun and back off to an untyped collection. Hm, maybe we're just doing the right thing? Maybe sometimes we just know that our program is correct, but the compiler demands more and more typing and wastes our time?

Java 5 was all about improving type-checking. If pre-defined types were not enough, annotations came handy. Define your own and test it at compile-time or at run-time. Did it ever happen to you that there were so many annotations, that you couldn't see the code?

See, more types is not always a good thing. Unless you're very keen on intellectual challenges. James Gosling said this about Scala - functional programs will make your brain hurt, they are for calculus lovers. He's right. It's the kind of pain you feel in your muscles when you start working out, you know, that indicates they are still alive... So working out is good, but we can't afford doing it all day, ha?

Maybe the appeal of plain old Java was that it's a combination of static and dynamic checks? So it's not that all dynamic is evil, maybe it's a matter of how much and where?

Give dynamic types a break. Who knows, you may find eventually that they're good for some things.

Peace.

P.S. I've done some role playing here, just for the record. I do love Generics, even though they were hard to master. Annotations are overall very useful. Right now I don't do as much Java as I used to, and I do other fascinating languages (static and dynamic) as I, for long time, wanted to.

Wednesday, October 1, 2008

Two Days In A Life

The ban that Israel's government has put on The Beatles performance in the 60s disappointed not only Israelis, but many Jewish people who were part of The Beatles phenomena, like their legendary manager Brian Epstein. Over four decades later, Paul McCartney represented the Fab Four on stage in Park HaYarkon, Tel Aviv, and rocked the audience with a great show.


Negotiations for the concert have been tough, but after a series of rumors and denials the deal was finally set. Ticket offices were stormed on opening, 50,000 tickets were sold eventually, filling the park with fans and music lovers.

Read more in this illustrated account of Paul McCartney's visit to the Holy Land...