Nick Wienholt has posted about a project he is working on where the age-old (well, at least as old as .NET) DataSet discussion has kicked off.

I "think" I know the project he is working on, so I also know some of the other people involved. I can tell you that the debate isn't going to be lightweight and that many good points will be made, although it'll probably end in deadlock like it always does. So I won't go over the same old arguments again.

Everyone knows you can model most businesses using either a relational database or a set of domain objects, so it's not really a question of capability. The conclusion that I am fast coming to is that no matter which approach you take, you almost always end up modelling the wrong thing.

So let me put a bit of a sideways spin on the argument. How many of you end up with a class or data-set called "Customer" or something like it? If I were to guess, I'd say that 65% of you would be nodding your heads. For the remaining 35% (does quick check of math) it would probably be because "Customer" has no place in the problem space. For example, if you are building CRM apps for Australian toll-ways you'd probably have a "Victim" class.

OK, now that you've thought about your object model a little bit, think back to how the business might have operated without computer assistance. When the relationship with a customer began, they might have filled in some kind of application form (some industries don't require this). Once that was done, the form would probably have been posted to head office, where they created a file (you know, the type that sits inside the filing cabinet).

Let's say our customer moved house and, as a result, changed their address and phone number. Well, they would probably have filled in a change of address form and posted that to head office too. Someone at head office would have filed that request along with the application, probably updated the customer details on the top of the file, and maybe forwarded the file to another relationship manager if the customer happened to move between service regions (although the transfer would probably have been transparent to the customer).

This second interaction is where things get interesting. If you didn't want to drive your customer nuts, you wouldn't ask them to re-enter all their details. You'd probably just ask for some key information, like a customer reference number and the new address information.
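
To make that concrete, here is a minimal sketch in Python of what a "change of address form" might look like as a message. Everything in it is made up for illustration (the ChangeOfAddress and CustomerFiles names, the fields, the in-memory dictionary standing in for the filing cabinet); it is not taken from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeOfAddress:
    """Hypothetical 'form': only a key plus the details that changed."""
    customer_ref: str
    new_address: str
    new_phone: str


class CustomerFiles:
    """Hypothetical head office filing cabinet, held in memory."""

    def __init__(self):
        self._files = {}

    def open_file(self, customer_ref, name, address, phone):
        # The original application form creates the file.
        self._files[customer_ref] = {
            "name": name,
            "address": address,
            "phone": phone,
            "forms": [],
        }

    def apply(self, form):
        # File the form alongside the application, then update the
        # details on the top of the file. Nothing else is touched.
        file = self._files[form.customer_ref]
        file["forms"].append(form)
        file["address"] = form.new_address
        file["phone"] = form.new_phone

    def details(self, customer_ref):
        # Return a copy so callers cannot edit the file directly.
        file = self._files[customer_ref]
        return {key: file[key] for key in ("name", "address", "phone")}


cabinet = CustomerFiles()
cabinet.open_file("C-1001", "Pat Example", "1 Old Street", "555-0100")
cabinet.apply(ChangeOfAddress("C-1001", "2 New Road", "555-0199"))
```

Note that the customer's name never crosses the wire on the second interaction; the message carries the reference number and the new details, and that is all.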

Compare that to what you might be doing in your existing business application when you transfer the entire customer representation back and forth between the client application and the source of truth. I can understand read-only interfaces where you do that, but it just doesn't make sense when changing data, because that's not the way the business operates.

In business, individual operations against customer representations in the system are significant. When I want to increase the credit limit on my credit card, the bank will launch a whole bunch of associated processes, such as sending me a letter by snail mail informing me of the change. If the interface into the system only allows you to transfer the entire customer representation from client to server, then once it gets back to the server it is going to be tricky to work out which subtle operations were performed on it and trigger the right background processes.
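
As a rough illustration of why an explicit message makes this easier, here is a small Python sketch. The names (IncreaseCreditLimit, SourceOfTruth, the outbox list) are hypothetical, and the "letter" is just a string appended to a list; a real system would obviously do more. The point is only that the server knows exactly which operation was requested, so it has an obvious place to kick off the associated processes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IncreaseCreditLimit:
    """Hypothetical message: one named operation, not a whole customer."""
    customer_ref: str
    new_limit: int


class SourceOfTruth:
    """Hypothetical server side that dispatches on the message type."""

    def __init__(self):
        self.credit_limits = {}
        self.outbox = []  # stand-in for the snail mail queue
        self._handlers = {
            IncreaseCreditLimit: self._increase_credit_limit,
        }

    def send(self, message):
        # The message type tells us which operation happened, so there
        # is no need to diff two customer snapshots to find out.
        handler = self._handlers[type(message)]
        handler(message)

    def _increase_credit_limit(self, message):
        old_limit = self.credit_limits.get(message.customer_ref, 0)
        if message.new_limit <= old_limit:
            raise ValueError("new limit must be higher than the current one")
        self.credit_limits[message.customer_ref] = message.new_limit
        # The associated background process hangs off the operation.
        self.outbox.append(
            f"Letter to {message.customer_ref}: credit limit raised "
            f"from {old_limit} to {message.new_limit}"
        )


bank = SourceOfTruth()
bank.credit_limits["C-1001"] = 2000
bank.send(IncreaseCreditLimit("C-1001", 5000))
```

With a whole-customer update, the equivalent logic would have to compare the incoming representation with the stored one, field by field, and guess the intent from the differences.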

So, whichever modelling mechanism you choose, please look at modelling the "messages" between the customer and the source of truth, not the customer itself.

Now that I have that off my chest, I will outline one of my concerns about data-sets. The tool support is great, but it tends to drive people towards producing in-memory representations of the source of truth, so you (typically) end up passing the customer across the network again. That doesn't mean using objects alleviates this, but because the tool support for objects is still fairly poor, you tend to get a bit more creative in what you send across the wire.

End rant.