Why OO Business Applications Always Wind Up Splitting Methods from Data And Not Encapsulating Either

One of the main—perhaps the main—selling point of OO is Encapsulation. Which is actually two selling points. Firstly Information Hiding, that is, hiding the innards of a class so you can use it without needing to know how it works. Secondly Modularity, holding related data and methods together in one place. It's the use of objects & classes to achieve these two effects that gives OO a large part of its flavour, distinct from procedural or functional programming.

The power this gives is the ability to model the domain naturally. This is power of OO. Given a business having customers with addresses, orders, invoices, products etc, we can model the domain - the entities and their relationships - very naturally in code by creating classes to represent them.

So as OO developers coming to line-of-business applications we expect to apply these techniques. We think about what should be private or public to each class, and we think about what responsibilities belong with classes, and what methods & data are needed to fulfil that responsibility.

The problem is, businesses don't do modularity and information hiding. (Well they do of course, by being split up into business functions - sales, accounting, warehouse, customer services etc. Information private to one department is not usually available or even of interest to staff in another. But that's not the level we're interested in in a domain model).

Look again at our OO classes: customers, addresses, invoices, orderlines, products etc. These things are not encapsulated in business use at all. Firstly, there is no hidden information; when a business user goes to their information system to get information about a customer they expect just that: Information. No hiding. Secondly, information is all they expect. They do not expect the information system to model the customer's ability to Walk() and Talk(). No methods - and so little modularity - are needed.

This is true of almost everything in a business's domain model. From a business users point of view, it's all just screens of information. Documents. And this - the humble document - is the heart of what we missed in our sketch of a domain model. The real thing we are modelling is not the flesh and blood Customer, or the physical Product they want to buy. What we're really modelling in line-of-business apps is paper documents in a filing cabinet.

A piece of paper in a filing cabinet has no methods to model. It's just a data-holder. An accurate domain model for a piece-of-paper-for-a-customer is just a list of fields or properties. You can wrap get() & set() methods round them if you like, but all you're doing is adding boilerplate. (As an aside, a big advantage of computers over paper is much better performance in modelling relationships between those pieces of paper. A database can look up a customer's order history much faster than a human with a set of filing cabinets).

So where are the methods in a business? They are largely in the business processes. (The phrase 'business process' is one that your business people use, which tells you that business processes should be part of your domain too).

To the business, a process is usually a sequence of actions done by a chain of people. In a typical business process, a piece of paper (actual or virtual) is passed around; information is updated on it, or on other pieces of paper; and very occasionally something is actually done, such as finding an item in a warehouse and putting in a box.

Which class in your domain is responsible for these actions? Probably none of them. You could create a WarehouseWorker class responsible for Pick()ing and Ship()ing the product, but you'll quickly realise (unless your warehouse is staffed by robots for whom you are writing the control programs) that you can't write any code that actually belongs in those methods. You can write code for UpdateStockCount(), to record the fact that stock is decremented; and you could write an UpdateCustomerOrderStatus() so that the customer services department can tell the customer how their order is getting on. Indeed, that's probably exactly what's being required of your system. But which class in your domain responsible for these actions?

If you think that the Customer class, or the Order class, is responsible for UpdateCustomerOrderStatus(), then you would appear to be imagining a piece of paper updating itself, which is not a normal responsibility for a piece of paper. (Either that, or you're imagining the ghost of the customer coming on-site to keep track of their order. The real customer has no such responsibility. Their only responsibility at this point is to sit at home waiting for their package to arrive). In the world before computers it was probably the responsibility of the Stock Controller or Inventory Controller (yes, those are real job titles still held by real people) to do this book-keeping. So your domain model could/should include a StockController class with that responsibility.

Now ask yourself, what data do I need to encapsulate in the fields of a StockController class? The answer is almost certainly none whatsoever. You are not usually asked to model the real stock controllers' working hours or height or physical strength in a line-of-business application. All you need to model is their ability to UpdateStockCount() & UpdateCustomerOrderStatus(). You need model no internal state.

This is typical of business process modelling. Business processes are nearly always stateless - all the information they need is given to them each time they are invoked. This is true for so-called 'long-running' processes too: the classes responsible for performing each step of the process are not responsible for holding state. Rather, they expect to have data (or ids/keys to look up the data) passed to it.

In short: ninety percent of the domain of a line-of-business application is usually correctly modelled as:

  • Documents holding data
  • Stateless public processes which receive these documents as their inputs

Because most of what you are modelling is paper documents (which do nothing, except record data) being worked on by employees (about whom you know nothing, except their ability to update the documents).

And that's why data is not encapsulated in business applications, nor are methods and data held together in the same class.

Objections

But wait - surely we still have encapsulation and information hiding between for instance the UI layer vs data access layer vs domain model?

Yes, that's right. The StockController exposes the fact that it can UpdateStockCount() but hides the fact (in fact, doesn't even know itself) that this is achieved by writing to a copy of SQL Server installed on machine XYZ. A clear sign of concerns that should be separated at quite a high level is that they are semantically unrelated, that is, they speak a different language. The UI layer and the domain model know nothing of this 'Sql Server' of which you speak.

The point is rather that you should not be disappointed if it turns out that your domain model feels more anaemic than rich. That models the reality of typical business domains.

What about business rules & logic? They aren't stateless business processes?

Also true. And the above line of thought need not be applied to them. (Does an abstract concept such as 'business rule' have state? Yes if it's the-rule-as-applied-to-a-specific-case; no if it's 'The Rule' in abstract).

Improving the accuracy of software project estimates: multiply everything by 3

I found when I'd worked in software for a couple of years that everything I delivered took me about three times longer than I expected.

Eventually I realised that my 'gut feel' for estimating a coding task was 'about how long will it take me to code this if I make no errors & get it right first go'. Which is a good starting point for an estimate, so long as you then go on to add testing, debugging, changing or misunderstanding requirements and time to release.

If you have stable requirements and a pushbutton deployment toolchain, then 3 x gut feel is may be about right. That covers clarification of requirements, testing, subsequent clarification of misunderstanding, and deploy.

If you haven't got those things, x5 or higher is usually closer.

I notice that others have found something similar. I'm please to find that multiplying by 3 puts me about 4.7% ahead of the curve - http://alistair.cockburn.us/The+magic+of+pi+for+project+managers.

Project Success vs Project Value

Somehow earlier this year I subscribed to the chaos report email. Given the significant criticism of chaos report's measure of success - on time, on spec, on budget - I was amused by this week's email which poses the question whether their success criteria are relevant. What the Standish group does have that most of us don't, is access to a large number of (unpublished and hence unverifiable) examples.

From the email: "Which is more important project success or project value? It turns out the project success and project value is orthogonal or at right angles to each other. The harder you strive for perfect success the lower your project value. The harder you strive for greater values the lower the success rate.

We know this because we have coded each of the 50,000 projects within the CHAOS Database a success rate and value rate. Each rate has a score from 0 to 100. By grouping the projects by organization we can come up with a ranking by success and value. We then can compare the rankings. One of the items we discuss is another question Is the traditional triple Constraints (cost, time, and quality) measurement still appropriate?"

I'm surprised by the assertion that success and value are orthogonal. I'd have thought that there was at least some connection between time/cost/functionality and perceived value; if the value of your project is not in the spec, surely you wrote the wrong spec?