Sententia cdsmithus

May 27, 2007

An interesting old piece of writing

Filed under: Java — cdsmith @ 3:39 am

Most of the time, I cringe when I read something that I wrote a long time ago! I was reading this, though, and I actually still agree with something that I said several years ago. It’s even still a little relevant, as Spring is still popular in the Java world. I should warn you that I haven’t kept up with recent developments in Spring.

For the sake of posterity, the following is what I wrote several years ago.

Last night, I went to see someone talk about Spring at the Pikes Peak Java Developer’s Group.

I will gladly join my voice with the loudest of the critics of Enterprise JavaBeans. Spring is supposed to be the way to avoid EJBs; the way to write simple, lightweight Java applications without all the stifling rules and framework-ness of EJBs. As a result, you might expect I’ll be a big fan of Spring. I hoped that I would be.

Spring is fundamentally built around beans, which are generally supposed to be described in an XML file. The beans are often called “POJOs”, which is a (fairly commonly used, and not invented by Spring) acronym for “plain old Java objects.” Now I start to get worried. If we were really working with plain old Java objects, Spring would call them “objects”. As a general rule, when someone says POJO, you can assume that they are either: (a) fundamentally uncomfortable with doing straightforward object-oriented design, because they are framework-dependent; or (b) talking to people who meet the description in (a). But Spring, in fact, is largely talking to people who were formerly dependent on J2EE application servers. So I move on.

Here’s the interesting part about that XML file, though. This XML file is basically an XML mapping of a limited subset of Java. There are elements for creating beans, with attributes for parameters. There are elements for specifying the constructor arguments for those beans, and for setting properties of those beans.
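To make the comparison concrete, here is a minimal sketch of those same three things (creating an object, passing constructor arguments, setting properties) written directly in Java. The names ExampleDataSource, OrderService, and Wiring, and the JDBC URL, are invented for illustration; they are not taken from Spring or from any real project.

```java
// Hypothetical classes, invented only for this sketch.
final class ExampleDataSource {
    private final String url;
    private int poolSize = 1;

    // Plays the role of a constructor argument in a bean definition.
    ExampleDataSource(String url) {
        this.url = url;
    }

    // Plays the role of a property setter in a bean definition.
    void setPoolSize(int poolSize) {
        this.poolSize = poolSize;
    }

    String describe() {
        return url + " (pool=" + poolSize + ")";
    }
}

final class OrderService {
    private final ExampleDataSource dataSource;

    OrderService(ExampleDataSource dataSource) {
        this.dataSource = dataSource;
    }

    String describe() {
        return "OrderService -> " + dataSource.describe();
    }
}

// The "configuration file": ordinary code that the compiler checks.
final class Wiring {
    static OrderService orderService() {
        ExampleDataSource dataSource = new ExampleDataSource("jdbc:example:orders");
        dataSource.setPoolSize(5);
        return new OrderService(dataSource);
    }
}
```

Calling Wiring.orderService() hands back a fully assembled object graph. A misspelled class name or a wrong argument type in Wiring is a compile error rather than something discovered at startup.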

At this point, though, I was loudly wondering the following: why aren’t we writing this in Java?

The answers I heard:

“You don’t have to write initialization code.”
“This file can be maintained by managers who don’t know Java.”
“You can change your error handling strategy without changing a single line of code.”

All of these statements are complete B.S.! XML is nothing more than a framework for defining file formats. When XML is used to describe the process for creating an object, it qualifies as code in my book, and in the book of most other reasonable people.

Let me put this another way. I can open up the Java Language Specification, 3rd Edition to section 18.1, and I see a BNF grammar for Java. Given a few hours, I can write a simple application to convert this grammar into a DTD or XML Schema. Give me less than a week, and I’ll have javac running to parse the grammar from my XML format. Now, can managers who don’t know Java create powerful applications easily? After all, they possess this magical ability to express things if they are represented in XML, right?

“You can make a change without recompiling.”

This isn’t a good thing. The compiler performs important work, such as checking to make sure that we aren’t trying to use types that don’t exist. Yes, I hear you saying that there is the Spring IDE instead, which runs inside of Eclipse and checks that for us — wait a second; if we’re editing this file in Eclipse, isn’t it impossible to save a file without recompiling?!? (The answer is no, but everyone I know leaves the automatic build feature on.)

More to the point… Have we regressed from automated build procedures so much that the prospect of running the compiler as a part of deployment configuration scares us? Even more to the point, why are we willing to use other tools (such as the Spring IDE validator that checks for references to undefined classes) but not compilers?

“You can do lazy initialization.”

So can Java. In fact, one of the more useful little-known language features in Java is deferred class initialization. If I declare static fields in a class, they don’t get created until the first time that class is accessed. The correct way to do a lazily initialized object in Java is this:

public class MyClass
{
    public static final MyType myObject
        = new MyType(someParameters);
}

That’s it! All done.
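If you want to see the deferral happen, the following is a small sketch that records when the object is actually constructed. All of the names here (InitLog, ExpensiveService, Services, LazyInitDemo) are made up for the demonstration. It relies on the rule that a class is initialized on the first use of one of its static fields, provided the field is not a compile-time constant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records construction events.
final class InitLog {
    static final List<String> events = new ArrayList<>();
}

final class ExpensiveService {
    ExpensiveService() {
        InitLog.events.add("ExpensiveService created");
    }

    String greet() {
        return "hello";
    }
}

// Same shape as MyClass above: a static final field with an initializer.
final class Services {
    static final ExpensiveService SERVICE = new ExpensiveService();
}

final class LazyInitDemo {
    // Meant to be called once, before anything else touches Services.
    static String run() {
        int before = InitLog.events.size();  // Services has not been initialized yet
        Services.SERVICE.greet();            // first access triggers initialization
        int after = InitLog.events.size();
        return "before=" + before + " after=" + after;
    }
}
```

On a first call, run() should report that nothing had been constructed before the field was touched and that exactly one object existed afterward.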

“There are tools to view this XML file visually.”

This is the first response that actually makes some sense. It’s true; there is the Spring IDE, and it can display (and, in the future, edit) this XML file visually as UML-ish boxes on the screen. There’s quite a cost, though, for this advantage. Perhaps the time would be better spent on producing good CASE tools.

What do we lose by using XML?

For one thing, it requires specialized tools to verify things that the average compiler would check, such as that any types used in the file are actually defined. More importantly, the average compiler (and especially the average IDE) would do a better job. If I build a source file in Eclipse, I can right-click on the method or field declaration and select the option to search for references in the workspace. If I define a bean only to be referenced from several other beans, I can make it private and the compiler in Eclipse will warn me if I don’t use it. If I use a deprecated constructor or method, I’ll get a warning. I could go on and on.

Second, it’s less readable. If XML to create arbitrary Java objects were more readable than Java, we’d write our code in an XML application. That prospect would scare me, and it would probably scare anyone else, too… even the most ardent Spring fanatic.

Third, you lose type-safety. There’s a whole slew of Spring-specific rules out there about resolving types when creating Java objects from the inherently untyped XML format. In the end, though, it comes down to this: every fundamental configuration item spends at least some time as a string. Another important point: it’s absolutely impossible to ever get type-safe general collections from Spring.
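Here is a rough sketch of what “spends some time as a string” costs. It is not Spring’s actual conversion machinery; PoolSettings and StringConfigDemo are hypothetical names, and the map simply stands in for attribute values read out of an XML file.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical settings class with one int-typed field.
final class PoolSettings {
    final int maxConnections;

    PoolSettings(int maxConnections) {
        this.maxConnections = maxConnections;
    }
}

final class StringConfigDemo {
    // Stand-in for a loader that starts from text: every value is a String first.
    static PoolSettings fromStrings(Map<String, String> props) {
        return new PoolSettings(Integer.parseInt(props.get("maxConnections")));
    }

    // Reports whether a raw text value survives conversion at runtime.
    static String tryLoad(String rawValue) {
        try {
            PoolSettings settings =
                fromStrings(Collections.singletonMap("maxConnections", rawValue));
            return "ok: " + settings.maxConnections;
        } catch (NumberFormatException e) {
            return "runtime failure";
        }
    }

    // By contrast, new PoolSettings("ten") would not compile at all.
}
```

With a well-formed value such as "10" the load succeeds; with "ten" the mistake only surfaces when the program runs, which is exactly the check the compiler would otherwise have done for free.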

Here’s the killer, though. If I take that XML file from a Spring project and translate it into Java source code, your average team code review would reject it. It’s not good code. It’s as simple as that. We are not just wrapping Java in a proprietary XML mapping; we’re wrapping bad Java in a proprietary XML mapping.

As a reaction to EJBs, Spring removes 95% of the unnecessary complexity of Enterprise JavaBeans. EJB developers seem to look at Spring and get really excited. It’s an alternative framework that’s less complicated. On the other hand, there is a fairly well-established community of Java developers who have spent the last seven years or so developing enterprise-class applications in Java without EJBs. To these people, Spring needs to solve specific programming problems to justify its existence and use. Otherwise, it’s just deployment hassle.

So what does Spring do?

One thing it does, and perhaps the single most important thing, is “dependency injection”. Dependency injection is a buzzword. Five years ago, many programmers would have described it using this alternate phrase: “design that is at least mediocre”.
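For anyone who has only met the term through a framework, this is roughly what it looks like with no framework at all. The sketch below uses invented names (TimeSource, SessionChecker, InjectionDemo): a class declares what it needs in its constructor, and whoever creates it decides which implementation to hand over.

```java
// Hypothetical dependency: something that can report the current time.
interface TimeSource {
    long now();
}

// Depends only on the interface and receives it through the constructor.
final class SessionChecker {
    private final TimeSource timeSource;

    SessionChecker(TimeSource timeSource) {
        this.timeSource = timeSource;
    }

    boolean isExpired(long expiresAtMillis) {
        return timeSource.now() >= expiresAtMillis;
    }
}

final class InjectionDemo {
    // Production wiring: use the real system clock.
    static SessionChecker production() {
        return new SessionChecker(System::currentTimeMillis);
    }

    // Test wiring: freeze time at a chosen instant.
    static SessionChecker fixedAt(long millis) {
        return new SessionChecker(() -> millis);
    }
}
```

InjectionDemo.fixedAt(1000L).isExpired(500L) is true and InjectionDemo.fixedAt(1000L).isExpired(2000L) is false, with no container, XML, or reflection involved.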

Spring also suggests a few interesting third-party products. If someone is writing an application with Spring, then we assume that they are at least considering using Hibernate if they’ve got database access to take care of. We assume that they are looking at possible uses for aspects to remove some types of duplicated code.

Promoting third-party products can be quite useful, and teaching good design even more so. Nevertheless, it doesn’t justify creating a new framework! Robert Glass, in his book Facts and Fallacies of Software Engineering, marvels at the mindset that could lead someone to criticize stifling software methodologies by inventing a new methodology. The same appears to be true of Spring. It’s complaining about stifling frameworks by inventing a new framework.

It could be that this is a necessary intermediate step. After all, it is definitely true that it’s easier to convince a programmer to evaluate a new tool like Spring than to re-evaluate his or her design skills. If part of “learning Spring” is to start using basic modular design and separation of concerns (a.k.a., “dependency inversion”), then Spring can help. However, when it leaves a residue of XML-parsed JavaBeans in its wake, one is forced to wonder if the medicine is worse than the disease, or at least if there might be a less invasive cure.

Snapshotting: A neat problem and solution in Haskell

Filed under: Haskell — cdsmith @ 3:13 am

I solved an interesting problem recently in Haskell.

Now, I’m no expert at Haskell, and this may not even be a very good solution. I got where I did by way of a lot of weird side paths and a lot of help from more experienced Haskellers on IRC, and I made plenty of my own mistakes along the way. It is, though, definitely an interesting solution. So I’ll describe it here.

Of course, this all came about to solve a particular problem. In this case, the problem was a multi-user dungeon (MUD). These programs are text-based network applications that allow users to connect and interact with a common virtual world.

I chose to represent the “world” using a nifty module called “software transactional memory” in the Haskell core library. STM is a great thing, and you should learn about it if you don’t know about it already. I found Simon Peyton-Jones’ paper at http://research.microsoft.com/~simonpj/tmp/beautiful.ps very useful. The general idea, though, is that mutable variables are encapsulated with a type constructor TVar, and can only be modified in a transaction.

I wanted to preserve a little of the pure functional flavor, minimize the use of TVars, and generally follow the philosophy of avoiding premature optimization; so the (incomplete) beginnings of the world look something like this:

 data World = World { worldStartRoom    :: TVar RoomId,
                      worldNextRoomId   :: TVar RoomId,
                      worldNextPlayerId :: TVar PlayerId,
                      worldNextItemId   :: TVar ItemId,
                      worldRooms        :: TVar (M.Map RoomId   Room),
                      worldPlayers      :: TVar (M.Map PlayerId Player),
                      worldItems        :: TVar (M.Map ItemId   Item),
                      worldPlayerNames  :: TVar (M.Map String   Player) }
     deriving Eq

Now, it is also necessary to occasionally save the state of the world out to disk. That data is being constantly accessed and updated by bazillions of threads, and I need an internally consistent state. Great, because STM solves exactly that problem! This transaction will read the whole world, so it will certainly have its costs, but at least it’ll be correct, right?

Here’s where it gets ugly: STM transactions may not perform I/O operations. In other words, the code will need to first do all its STM stuff, save it off in memory somewhere, and only then begin writing to a file. After defining a bunch of data structures containing TVars (okay, only one for now, but that might change), it now appears that I would need to define an identical set of data structures that provide access to a copy of the world from outside of an STM transaction. In other words, it now looks like I need a SWorld, as follows:

 data SWorld = SWorld { sworldStartRoom    :: RoomId,
                        sworldNextRoomId   :: RoomId,
                        sworldNextPlayerId :: PlayerId,
                        sworldNextItemId   :: ItemId,
                        sworldRooms        :: (M.Map RoomId   Room),
                        sworldPlayers      :: (M.Map PlayerId Player),
                        sworldItems        :: (M.Map ItemId   Item),
                        sworldPlayerNames  :: (M.Map String   Player) }
     deriving Eq

I immediately don’t like this, because it looks identical to the definition of the earlier World, and in fact it is a requirement of the application that the two remain identical. That’s a lot of manual maintenance. Fortunately, we get type polymorphism to the rescue!

 data World a = World { worldStartRoom'    :: a RoomId,
                        worldNextRoomId'   :: a RoomId,
                        worldNextPlayerId' :: a PlayerId,
                        worldNextItemId'   :: a ItemId,
                        worldRooms'        :: a (M.Map RoomId   Room),
                        worldPlayers'      :: a (M.Map PlayerId Player),
                        worldItems'        :: a (M.Map ItemId   Item),
                        worldPlayerNames'  :: a (M.Map String   Player) }
     deriving Eq

So now World is a parametric type, with a type parameter a. But a is not just any old type parameter! It has a kind of (* -> *). The earlier World type is now just World TVar. The snapshot type is a little uglier, but not much. It would be nice to have a simple identity type constructor, but such a thing doesn’t exist. (In the course of looking for it, I did come across a discussion about adding it into a forthcoming release of Haskell on the haskell-prime mailing list; but there are substantial arguments against it.) So in lieu of a true identity type constructor, I can just use the Identity monad. The snapshot type is now World Identity. Instead of having to make sure they both have the same fields, I know they do because they are the same type!

(One interesting change is that if I want to derive the Eq type class for World, I have to enable undecidable instances in GHC. I don’t really understand why, but it doesn’t surprise me; I’m doing some fairly bizarre stuff here.)

Now it is convenient to have a way to get things into and out of a TVar or Identity without worrying too much about which kind of World I’m using. A type class does the trick! Because dealing with TVar values must be done in STM, the operations are defined in the STM monad.

 class Wrapper a where
     extract :: a b -> STM b
     inject  :: b -> STM (a b)   

 instance Wrapper TVar where
     extract = readTVar
     inject  = newTVar   

 instance Wrapper Identity where
     extract = return . runIdentity
     inject  = return . return

Admittedly, that last instance for Identity looks odd. In inject, for instance, I want to take a value, wrap that in the Identity monad, and then wrap that in an STM block (not because it’s necessary, but because it’s part of the type signature for the class). The extract function uses runIdentity to get the value out of the Identity monad, and then returns it in the STM monad to satisfy the type class. A little more boilerplate…

 mkWorld a b c d e f g h = do
     a' <- inject a
     b' <- inject b
     c' <- inject c
     d' <- inject d
     e' <- inject e
     f' <- inject f
     g' <- inject g
     h' <- inject h
     return (World a' b' c' d' e' f' g' h')   

 convertWorld (World a b c d e f g h) = do
     a' <- extract a
     b' <- extract b
     c' <- extract c
     d' <- extract d
     e' <- extract e
     f' <- extract f
     g' <- extract g
     h' <- extract h
     mkWorld a' b' c' d' e' f' g' h'   

 worldStartRoom    = extract . worldStartRoom'
 worldNextRoomId   = extract . worldNextRoomId'
 worldNextPlayerId = extract . worldNextPlayerId'
 worldNextItemId   = extract . worldNextItemId'
 worldRooms        = extract . worldRooms'
 worldPlayers      = extract . worldPlayers'
 worldItems        = extract . worldItems'
 worldPlayerNames  = extract . worldPlayerNames'

These functions handle the grunt work of dealing with various kinds of worlds. The first one defines a constructor that takes simple values, and automatically wraps them up in TVars or the Identity monad. The second gets these things out of one kind of world and into another. The third through tenth extract the values of fields automatically, so that code isn’t littered with readTVar and runIdentity calls all willy-nilly. (It may look like this cure is worse than the disease. That would be wrong, because that last block of code would have to have been written even if I’d used a completely separate type for the snapshot. The only real added overhead is the class and instances for Wrapper, and that’s independent of type! That means if I change Player to have some TVar fields later on, there’s no extra code to write.)

So this accomplished my goal. And sure enough, it basically does what I want. In the end, though, the boilerplate was too much for me. I ended up turning to Template Haskell to handle it for me. The code now looks like this:

 data World a = World { worldStartRoom'    :: a RoomId,
                        worldNextRoomId'   :: a RoomId,
                        worldNextPlayerId' :: a PlayerId,
                        worldNextItemId'   :: a ItemId,
                        worldRooms'        :: a (M.Map RoomId   Room),
                        worldPlayers'      :: a (M.Map PlayerId Player),
                        worldItems'        :: a (M.Map ItemId   Item),
                        worldPlayerNames'  :: a (M.Map String   Player) }
     deriving Eq   

 $(snapshottingType ''World) 

And that looks to me like an elegant, maintainable piece of code.
