Tag Archives: software design

Three Historical Definitions of the Open/Closed Principle and a Claim that it’s Pointless

Bertrand Meyer first published the OCP in his influential 1980s book Object-Oriented Software Construction:

  • “A module is said to be open if it is still available for extension. For example, adding new fields, or performing new functions.
  • A module will be said to be closed if it is available for use by other modules.”

It’s a neat double-definition, not least because the definition of Closed is both useful, and one that might not contradict the definition of Open. But Meyer’s proposed technique—use subclassing to achieve Openness of a Closed module—is widely ignored. Many of us have discovered the pain of working with inheritance hierarchies, so we savour the Gang of Four’s sage dictum: “prefer composition over inheritance.”

Dynamic languages like JavaScript can do the open/closed trick quite easily. The danger is that, in doing so, you develop inscrutable code and are left with a system that, when it works, you don’t know how it works; and so when it breaks you don’t know how to fix it.

Uncle Bob all-but-redefined the Open/Closed principle by using Interfaces as his technique. The interface is fixed and Closed; modules that depend on it can rely on it not changing. But the Implementation is Open: it can be changed without breaking the interface.
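The interface technique can be sketched in a few lines of Java. This is only an illustration: PriceCalculator, FlatRate, BulkDiscount and Checkout are hypothetical names, and the discount rule is invented for the example.

```java
// Closed: clients depend only on this contract, which is assumed not to change.
interface PriceCalculator {
    long priceInPence(int quantity, long unitPriceInPence);
}

// Open: a first implementation behind the closed interface.
class FlatRate implements PriceCalculator {
    public long priceInPence(int quantity, long unitPriceInPence) {
        return quantity * unitPriceInPence;
    }
}

// Open: new behaviour arrives as a new class; the interface and Checkout are untouched.
class BulkDiscount implements PriceCalculator {
    public long priceInPence(int quantity, long unitPriceInPence) {
        long full = quantity * unitPriceInPence;
        // Hypothetical rule: 10% off for ten or more items.
        return quantity >= 10 ? full * 90 / 100 : full;
    }
}

// A client module that relies only on the closed interface.
class Checkout {
    private final PriceCalculator calculator;

    Checkout(PriceCalculator calculator) {
        this.calculator = calculator;
    }

    long total(int quantity, long unitPriceInPence) {
        return calculator.priceInPence(quantity, unitPriceInPence);
    }
}
```

Swapping new FlatRate() for new BulkDiscount() when constructing Checkout changes the behaviour without editing Checkout or the interface.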

It is worth mentioning a word of wisdom from the .Net team’s Framework Design Guidelines: the weakness of interfaces in Java and .Net is precisely that they are 100% closed. There can be no version 2. Or rather, if there is an InterfaceV2 then it can usually have no useful relationship to InterfaceV1. You might as well call it ICompletelyUnrelatedInterface. (Or perhaps one could put the versioning at the namespace level.)
This versioning problem is widely felt in service oriented systems with public interfaces. It is often addressed by creating a new endpoint for a new version of the service. Offering two versions of a service becomes on the whole precisely as expensive as offering two services, which is to say twice as expensive. This is unfortunate.

Contrast this with Meyer’s vision of OCP: On Meyer’s subclassing approach, version 1 clients and version 2 clients would call the same service and get the same responses. Version 2 clients would recognise, and so be able to use, the enhanced v2 capabilities; whereas version 1 clients would only recognise the version 1 capabilities. But here I see a second problem with Meyer’s vision: I’ve almost never seen systems (or even parts of systems) that can achieve it in practice. It’s a beautiful dream. But unachievable. It is a pipedream.

More recently (Dec 2016), Michael Feathers has offered an updated version, towards the bottom of the page at Towards a galvanizing definition of technical debt:
“our code is better to the degree that we don’t have to change it much when we add features. We should be able to make modifications primarily by adding new classes and functions rather than changing existing ones”
This is a much ‘softer’ formulation than Meyer’s or Bob Martin’s and you could take it as just a rule of thumb; something to weigh in the balance against other factors. Feathers’ implementation in this case (and I’m left with the impression that in a different codebase he’d be happy with a different implementation) is doing event driven code as most people think it should be done: use an AddEventListener() interface, which makes the code Open to all kinds of extension.
This AddEventListener() is exactly the approach used in the HTML spec and other GUI frameworks of the past 20 years. The downside is that the ‘closed’ bit of the interface is so small and weakly-typed that it’s almost non-existent. The interface tells you nothing about the semantics. (What kind of events can I listen to? What information do I get about each event? What can I do with them? I can only find out by reading the HTML spec, which turns out to be quite hard going, or turning to MDN, or, the first port of call for many, StackOverflow. In a bespoke codebase replace this with “ask for documentation; find it is incomplete; and then hunt through the code for examples of how I can use it”).
Strongly typed interfaces are at least somewhat self-documenting—they offer a definitive list of all syntactically valid calls to the service—even if that documentation depends heavily on how well the developers chose their method and parameter names.
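To make the contrast concrete, here is a minimal Java sketch of both styles. All the names (LooseEvents, OrderListener, TypedOrderEvents) are hypothetical and are not taken from the HTML spec or from Feathers’ code; the point is only what each signature does and does not tell the caller.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Weakly typed: open to any extension, but the signature documents nothing
// about which event names exist or what a payload contains.
class LooseEvents {
    private final Map<String, List<Consumer<Object>>> listeners = new HashMap<>();

    void addEventListener(String eventName, Consumer<Object> listener) {
        listeners.computeIfAbsent(eventName, k -> new ArrayList<>()).add(listener);
    }

    void fire(String eventName, Object payload) {
        for (Consumer<Object> listener : listeners.getOrDefault(eventName, List.of())) {
            listener.accept(payload);
        }
    }
}

// Strongly typed: the interface itself lists the events and the data each carries.
interface OrderListener {
    void orderPlaced(String orderId, int itemCount);

    void orderCancelled(String orderId);
}

class TypedOrderEvents {
    private final List<OrderListener> listeners = new ArrayList<>();

    void addOrderListener(OrderListener listener) {
        listeners.add(listener);
    }

    void placeOrder(String orderId, int itemCount) {
        for (OrderListener listener : listeners) {
            listener.orderPlaced(orderId, itemCount);
        }
    }
}
```

The loose version accepts a listener for an event name that never fires without complaint; the typed version cannot express that mistake, at the cost of a larger closed surface.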

These three examples leave me with mixed feelings. OCP seems like trying to square the circle, and Meyer’s choice of name was a well-chosen contradiction. Yet the goals—Openness for extension, Closedness for reliability—are unavoidable.

Dan North, amongst others, has suggested that OCP, and indeed all the SOLID principles, are of limited value and we should drop them in favour of something else. I sympathise (I think that SOLID is a mishmash of mixed value) but I’m willing to wrestle for a couple more years with OCP before I admit defeat.

I’d rather have the above three techniques, and others, in my toolkit because my software design still has to address the two contradictory requirements that Meyer identified in the 80s:
–Because my software is still evolving, it has to be open for evolution: it has to change.
–Because my software is already in use, and hence being depended on by some other software or person, it has to be reliable and therefore can’t change.

Coplien & Bjørnvig : Lean Architecture For Agile Developers. A Review

Four years after this book came out, Agile Architecture has at last become a Thing. But as the nuance of its title hints, this book is not fad-driven. It is a carefully thought-out exposition of what architects can learn from lean and agile ideas, and what they can do better as a result.

Well. It’s partly that. If you are a practising architect, it is actually four must-read books in one.

If you are not, you might dismiss this book for two reasons. The first is that, judging by other reviews & my own experience, the homespun style of the first half does not suit a bullet-pointed gimme-the-headlines-now generation. The second is that if you have not experienced the pitfalls of architecting and doing software in a real organisation with actual people in it, then Coplien & Bjørnvig’s pearls of wisdom may impress you much as the agile manifesto might impress a cattle rancher.

It is not a beginner’s book. It is the mature distillation of the sweat-soaked notebooks of a fellow-traveller who has stumbled over rocky terrain, been through the tarpits and has some hard-bloody-earned (and, academically researched) wisdom to share as a result.

I said four must-reads in one. They are:

1) The (literally) decades of experience of a leading practitioner & thinker in the field.

2) A thought-through answer to the question: what can we learn from lean & agile? Whilst value-chains and some technique feature, the authors’ secret conviction is surely that Technology Is All About Human Beings. “Everybody, all together, from early on” is their Lean Secret. “Deferring interaction with stakeholders [users, the business, customers, domain experts, developers], or deferring decisions beyond the responsible moment slows progress, raises cost and increases frustration. A team acts like a team from the start.”

3) Which leads to that which for me, as a more techy-focused reader, was the marvel of the book. Clements et al in “Software Architecture in Practice” offered attribute-driven design, a partitioning of the system based on a priority ordering of, primarily, technical quality requirements. Coplien & Bjørnvig all but deduce a partitioning based on the priority ordering of people: Users first, Development team next.

“The end-users’ mental model” is the refrain on which they start early and never stop hammering. From this they suggest that the first partitioning is What the System Is from What the System Does. I am tempted to paraphrase as, Domain Model from Use Cases. But their point is partly that this also neatly matches the primary partition by rate of change, because the last thing to change in a business is, the business of the business. If you’re in retail, you might change what and how and where you sell, but your business is still Selling Stuff. And at lower levels inside the organisation too: an accounting department, though there have been five thousand years of technology change, still does accounting. The users know this. They understand their domain and if the fundamental form of your software system matches the end user mental model then it can survive–nay, enable!–change and stay fundamentally fit for purpose.

Second, don’t fight Conway’s law. “Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations”. The deduction for agile teams is, partition so as to maximise the long-term autonomy of your self-organising teams. Even when the team divisions are imposed for non-technical reasons–geography, politics, whatever–still allow that fact to trump more ‘technical’ considerations. This may not match your vision of technical perfection, but it will still be the best way. Ruth Malan recently paraphrased Conway’s law as, “if the architecture of the system and the architecture of the organization are at odds, the architecture of the organization wins”. Don’t be a loser.

These points seem to me obvious in hindsight, yet they turn traditional approaches to high-level designs on their head. After considering People, yes, we can consider rates of change, quality attributes, technology areas. But it’s always People First.

4) And finally. This is the first book-sized exposition of DCI architecture, which I would describe as an architectural pattern for systems that have users. Having separated What the System Is from What the System Does, DCI provides the design pattern for how What the System Does (the Use Cases) marshals the elements of the What the System Is. There are at least three notable outcomes.

Firstly that use cases are mapped closely to specific pieces of code; in the best case each use case can be encapsulated in its own component.

Secondly, that the relationship between Domain elements and Use Cases is expressed as “In Use Case X Domain element Y plays the Role of Z”. This brings significant clarity to both, and is part of the key to ‘componentising’ use cases; the Roles needed for a use case become, in the code, its public dependencies. In UML-speak, the Required Interfaces for such a UseCaseComponent are the roles needed to ‘play’ the use case, and those roles are Interfaces which are Implemented by Classes in the Domain Model. Coupling is reduced, cohesion is gained, clarity abounds.

Thirdly, the simplicity of the mapping from business architecture to code is greatly increased. Suddenly one can draw simple straight lines between corresponding elements of architecture and code.
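The roles-as-interfaces reading above can be sketched in Java as follows. This is a simplification under stated assumptions, not the book’s own code: the names MoneySource, MoneySink, Account and TransferMoney are chosen for illustration, and full DCI attaches role behaviour to objects in ways that plain Java interfaces only approximate.

```java
// Roles: the Required Interfaces of the use case component.
interface MoneySource {
    void withdraw(long pence);
}

interface MoneySink {
    void deposit(long pence);
}

// What the System Is: a domain class that can play either role.
class Account implements MoneySource, MoneySink {
    private long balance;

    Account(long openingBalance) {
        this.balance = openingBalance;
    }

    public void withdraw(long pence) {
        if (pence > balance) {
            throw new IllegalStateException("insufficient funds");
        }
        balance -= pence;
    }

    public void deposit(long pence) {
        balance += pence;
    }

    long balance() {
        return balance;
    }
}

// What the System Does: one use case, encapsulated in one component,
// whose public dependencies are exactly the roles it needs.
class TransferMoney {
    private final MoneySource source;
    private final MoneySink sink;

    TransferMoney(MoneySource source, MoneySink sink) {
        this.source = source;
        this.sink = sink;
    }

    void execute(long pence) {
        source.withdraw(pence);
        sink.deposit(pence);
    }
}
```

Read aloud, the constructor says: in the use case TransferMoney, one Account plays the role of MoneySource and another plays the role of MoneySink.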

The authors say of their work, “This book is about a Lean approach to domain architecture that lays a foundation for agile software change”. To my mind, this hits the agile architecture nail on the head. Agile software development has only ever succeeded at scale because the people doing it either knew, or had given to them, enough architecture to make it work. Software Agility, just like every other software Ility, must either be supported by the architecture or it ain’t gonna happen.

But the best thing I got from this book was the proof, before my very eyes, that correct technical design flows from knowing how to put human beings at the centre.


Introducing: The Semantic Field. Or, The One Truly Correct Usage of Layered Architecture in the World

Bear with me if you already abandoned layered architecture long ago. You may be quite familiar with the thought that layered architectures often fail to apply the Dependency Inversion principle, and often thereby induce tight coupling of un-modular, un-testable layers.

I wish to do two things in this post. First, I propose that the notion “Semantic Field” better captures the one big idea that layered architecture nearly gets right. Second, I will discuss the One Truly Correct Usage of Layered Architecture In the World in order to show why it’s the wrong choice for nearly all other usages.

Semantic Field

“Semantic Field” or “Semantic Domain” is a term from linguistics. Words are in the same semantic field if they are related to the same area of reality. (The word domain is pretty much what you’d call it as a DDDer). Orange is in the same semantic domain (let’s call it the fruit domain) as Apple. But it’s also with Red in the semantic domain of Colour, whereas Apple isn’t. That’s how natural language rolls.

Kent Beck used the term conceptual symmetry to explain why he didn’t like this code snippet:

void process(){
    input();
    count++;
    output();
}

and wanted to change count++; into tally();. Somehow the count++ doesn’t seem to be on the same level as the method calls. Indeed it isn’t. It’s the same feeling you have when you see:

void applyToJoin(Customer customer){
    if(eligibilityRules.validate(customer)){
        membershipList.accept(customer);
        htmlBtnUpdate.setEnabled();
    }
}

that a method dealing in business rules and processes should not also know about html buttons. Semantic Field is the notion we want here. The clean code rule is “One Level of Abstraction per Function” and I propose to rename it as “One semantic field per method”. In fact, one semantic domain per class, namespace, module, or … layer.
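One way the applyToJoin example might be split so that each class stays inside a single semantic field is sketched below. The names MembershipEvents, Membership, JoinPage and Applicant, and the age-based eligibility rule, are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// A minimal stand-in for the customer in the snippet above.
class Applicant {
    final int age;

    Applicant(int age) {
        this.age = age;
    }
}

// Business vocabulary only: a notification expressed in domain terms.
interface MembershipEvents {
    void applicantJoined(Applicant applicant);
}

class Membership {
    private final List<Applicant> members = new ArrayList<>();
    private final MembershipEvents events;

    Membership(MembershipEvents events) {
        this.events = events;
    }

    void applyToJoin(Applicant applicant) {
        if (isEligible(applicant)) {
            members.add(applicant);
            events.applicantJoined(applicant); // no mention of html buttons here
        }
    }

    // Hypothetical eligibility rule, for illustration only.
    private boolean isEligible(Applicant applicant) {
        return applicant.age >= 18;
    }

    int memberCount() {
        return members.size();
    }
}

// UI vocabulary only: this is the one class that knows about buttons.
class JoinPage implements MembershipEvents {
    boolean updateButtonEnabled = false;

    public void applicantJoined(Applicant applicant) {
        updateButtonEnabled = true;
    }
}
```

The business method now speaks only of eligibility, members and joining; the translation into button states lives with the rest of the UI vocabulary.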

This is what layers get right: The idea that inside a given layer you understand a specific semantic domain, and don’t use vocabulary from the semantic fields of the layers above or below you.

Where layers go wrong is, well, the layering. The belief that all top-level dependencies in a system can be expressed in one dimension, top to bottom. They just can’t. Squeezing your code into 1 dimension makes you do contortions that are utterly unhelpful. Strict layering adds to this a second failure mode: It makes you write pointless passthrough code, which ought to be deleted.

(Layering does get a second thing right: no cyclic dependencies. Code with mutual dependencies will try to morph into ball of mud architecture. I’m sure this is half the reason why layered architecture became wildly popular. It was a vast improvement on ball-of-mud.)

The One Truly Correct Usage of Layered Architecture In The World

The other reason we were entranced by layered architecture for a decade was the ISO OSI 7 layer model for networking. It seemed so obviously, thoroughly, beautifully, correct.
OSI 7 layer architecture

Each layer is clearly (well, it was clear up to about layer 5, after that it got a bit hazy for some of us) and cleanly independent of the other layers. Each layer is a different semantic domain. The bottom layer deals with physical connectors and with what voltage represents a 1 or a 0 and how a bit sequence is encoded as an electrical waveform. The next layer deals with frames, complete with a destination and a source. The next layer deals in routes: how to get packets to this destination from the source. The next layer deals in messages: how to turn them into packets and back again. And so on.
And, the layer-cake picture precisely models the dependencies between the layers. At least to layer 5, each layer relies on and adds value to the layer beneath it.

It was beautiful. It made sense. It was what I wanted my software to look like. It was a siren, luring us all to shipwreck our software on the rock of a beautiful but evil vision of how it should always be.

Why a Layered Architecture is Nearly Always Wrong For Any Other Software System

The bit that isn’t wrong

The part of the OSI model that is applicable to 99% of all known software is the separation into semantic fields. This is why we used to say that business logic shouldn’t be in the UI layer; html buttons live in a different semantic domain to customers and invoices. (Except: it was the wrong way to put it. The presentation layer does reference business logic because in an interactive system usability is achieved by having the UI reflect the business logic; for instance by hiding options that are not valid for the current user).

The bit that fails miserably

The part of the OSI model that is applicable to very very few systems is the layering. In the OSI architecture the strict layering works because the language of each layer can be defined in terms of layers beneath it. Session, Frame, Bit are in separate semantic domains, but the model allows Frame to be defined in terms of Bit, Session in terms of Frame, and so on.

This is almost never the case in layered business software. The vocabulary of a UI cannot be defined in terms of the vocabulary of commerce and business administration, and the vocabulary of a business cannot be defined in terms of data entities. They just are separate domains. The fact that that second one sometimes works a little bit (you can define a customer–incorrectly–as rows in data tables) is what seduces you into thinking it should work. But it doesn’t. You cannot define your business in terms of a data layer.

In particular then, a layered architecture with UI on top is always wrong; and business layer on top of data layer is always, but less obviously and more seductively, wrong. Hexagonal architecture (aka ports and adapters) is a much better model for most systems because it doesn’t confine dependencies to a single dimension (in addition to the already well-known fact that it gets your dependencies pointing the right way).
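A minimal ports-and-adapters sketch in Java shows the shape. Every name here (StockLevels, DispatchNotices, Dispatcher, InMemoryStock, ConsoleNotices) is hypothetical; the point is that the core owns its ports, and adapters on any number of sides depend inwards on them.

```java
import java.util.HashMap;
import java.util.Map;

// Ports: owned by the business core and written in its vocabulary.
interface StockLevels {
    int inStock(String sku);

    void setInStock(String sku, int quantity);
}

interface DispatchNotices {
    void dispatched(String sku, int quantity);
}

// The core depends only on its own ports, not on a layer "below" it.
class Dispatcher {
    private final StockLevels stock;
    private final DispatchNotices notices;

    Dispatcher(StockLevels stock, DispatchNotices notices) {
        this.stock = stock;
        this.notices = notices;
    }

    boolean dispatch(String sku, int quantity) {
        int available = stock.inStock(sku);
        if (quantity > available) {
            return false;
        }
        stock.setInStock(sku, available - quantity);
        notices.dispatched(sku, quantity);
        return true;
    }
}

// Adapters: one per outside concern, each implementing a port.
class InMemoryStock implements StockLevels {
    private final Map<String, Integer> levels = new HashMap<>();

    public int inStock(String sku) {
        return levels.getOrDefault(sku, 0);
    }

    public void setInStock(String sku, int quantity) {
        levels.put(sku, quantity);
    }
}

class ConsoleNotices implements DispatchNotices {
    public void dispatched(String sku, int quantity) {
        System.out.println("Dispatched " + quantity + " x " + sku);
    }
}
```

A database-backed StockLevels or an email-sending DispatchNotices would be further adapters alongside these, with no change to Dispatcher.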

DDD: Treating the UI layer as a domain

Having recognised that user interface is a separate semantic domain, should we apply some DDD thinking and treat it as a bounded context with its own domain? The domain of an MVC web UI includes controllers, actions, routes, etc. But it must reference business logic all the time when deciding what to display, whether to accept user input, and ultimately to do anything with that input. To some, making the UI layer its own domain context, and giving it adapters to interface with the business domain seems like over-engineering, whilst others advocate almost exactly that.

I recommend that you should at least be aware that if you do not do this de-coupling (and in MVC web apps I personally almost never have) then your UI layer will have two semantic domains inside it. It’s a trade-off, but a sufficiently small one that I would usually come down in favour of whichever side has fewer total lines of code.

DistributedMethodCallError: The belief that calling across a network is better than calling within a process

Distributed Method Call Error
The belief that methods and functions communicating across a network is somehow better than communicating within a single process on a single machine.

Process this error by politely throwing a verbal exception, inquiring as to what, exactly, is better. And then explain how the answers you’re getting back are the wrong answers.

Here are response templates for the three main areas in which a distributed architecture does not make X better:

If X is one of: | Response
Separation of Concerns, Coupling, Cohesion or similar | But X is not primarily about deployment scenarios, so distributing your deployment does not improve X.
Reliability, Performance, Robustness or similar | But as you’ll know from the Fallacies of Distributed Computing, if not from bitter experience, distributed computing makes things harder not better.
Deployability, continuous deployment or integration | But deploying to multiple hosts is harder, not easier, than deploying to a single host.

Yes, there are problems for which distributed computing appears to be part of a solution. Redundancy as a reliability tactic appears to push you to distributed computing. So does horizontal scaling as a performance or capacity tactic. But these things are done extremely well at the infrastructure level by a black box: a load balancer. Notice that load balancing does not force a decision on whether each instance of your application-or-service is deployed entirely on a single box or is distributed.

So if you think that microservices or any other form of distributed deployment address issues such as dependency management, coupling, cohesion, continuous deployment, avoiding domino failure, then may I put it to you that you have confused, not separated, your concerns. In 4+1 terms, you may be confounding your physical and process models (i.e. the runtime models) with your logical & development (‘coding-time’) models. As Simon Brown pithily put it, “if you can’t build a structured monolith, what makes you think microservices are the answer?”.

PS

Yesterday I read a blog about ‘Monolithic’ architecture which said – with pictures and everything – that if your problem is how to carry on coding effectively (add new features, fix bugs, resolve technical debt etc) as the size and complexity of the code base increases, then the solution is a distributed deployment architecture with synchronous calls over HTTP using XML or JSON.

I could weep. You manage your codebase by managing your codebase! Not by making your runtime deployment and threading models 50 times more complicated. This is what I mean by confounding the logical & development models with the process & deployment models.

PPS

If you’re not familiar with the 4+1 architecture views: The

  1. Logical view describes your classes’ behaviour and relationships; the
  2. Development view describes how software is organised and subdivided into e.g. modules, components, layers, and how they are turned into deployable artefacts; the
  3. Process view describes what threading and processes and distributed communications you use and the
  4. Physical view (though I’d rather call it the Deployment view) describes what machines run what code on the running system

The ‘+1’ is the use cases, or user stories.

When I first saw 4+1 I only really ‘got’ the logical view. As years passed, I realised that this reflected a lack of experience on my part. When you first do distributed or asynchronous computing, you begin to see why you’d want a process view and a physical (or deployment) view.

4+1 is quite long in the tooth and has evolved in use. There are other sets of viewpoints. Rozanski & Woods’ seven viewpoints show the benefit of a decade’s more experience. You may think 7 is a lot, but for a small or simple system some of them need only be a sentence or two.

Why OO Business Applications Always Wind Up Splitting Methods from Data And Not Encapsulating Either

One of the main (perhaps the main) selling points of OO is Encapsulation. Which is actually two selling points. Firstly Information Hiding, that is, hiding the innards of a class so you can use it without needing to know how it works. Secondly Modularity, holding related data and methods together in one place. It’s the use of objects & classes to achieve these two effects that gives OO a large part of its flavour, distinct from procedural or functional programming.

The power this gives is the ability to model the domain naturally. This is the power of OO. Given a business having customers with addresses, orders, invoices, products etc, we can model the domain – the entities and their relationships – very naturally in code by creating classes to represent them.

So as OO developers coming to line-of-business applications we expect to apply these techniques. We think about what should be private or public to each class, and we think about what responsibilities belong with classes, and what methods & data are needed to fulfil that responsibility.

The problem is, businesses don’t do modularity and information hiding. (Well they do of course, by being split up into business functions – sales, accounting, warehouse, customer services etc. Information private to one department is not usually available or even of interest to staff in another. But that’s not the level we’re interested in in a domain model).

Look again at our OO classes: customers, addresses, invoices, orderlines, products etc. These things are not encapsulated in business use at all. Firstly, there is no hidden information; when a business user goes to their information system to get information about a customer they expect just that: Information. No hiding. Secondly, information is all they expect. They do not expect the information system to model the customer’s ability to Walk() and Talk(). No methods – and so little modularity – are needed.

This is true of almost everything in a business’s domain model. From a business user’s point of view, it’s all just screens of information. Documents. And this – the humble document – is the heart of what we missed in our sketch of a domain model. The real thing we are modelling is not the flesh and blood Customer, or the physical Product they want to buy. What we’re really modelling in line-of-business apps is paper documents in a filing cabinet.

A piece of paper in a filing cabinet has no methods to model. It’s just a data-holder. An accurate domain model for a piece-of-paper-for-a-customer is just a list of fields or properties. You can wrap get() & set() methods round them if you like, but all you’re doing is adding boilerplate. (As an aside, a big advantage of computers over paper is much better performance in modelling relationships between those pieces of paper. A database can look up a customer’s order history much faster than a human with a set of filing cabinets).

So where are the methods in a business? They are largely in the business processes. (The phrase ‘business process’ is one that your business people use, which tells you that business processes should be part of your domain too).

To the business, a process is usually a sequence of actions done by a chain of people. In a typical business process, a piece of paper (actual or virtual) is passed around; information is updated on it, or on other pieces of paper; and very occasionally something is actually done, such as finding an item in a warehouse and putting it in a box.

Which class in your domain is responsible for these actions? Probably none of them. You could create a WarehouseWorker class responsible for Pick()ing and Ship()ing the product, but you’ll quickly realise (unless your warehouse is staffed by robots for whom you are writing the control programs) that you can’t write any code that actually belongs in those methods. You can write code for UpdateStockCount(), to record the fact that stock is decremented; and you could write an UpdateCustomerOrderStatus() so that the customer services department can tell the customer how their order is getting on. Indeed, that’s probably exactly what’s being required of your system. But which class in your domain is responsible for these actions?

If you think that the Customer class, or the Order class, is responsible for UpdateCustomerOrderStatus(), then you would appear to be imagining a piece of paper updating itself, which is not a normal responsibility for a piece of paper. (Either that, or you’re imagining the ghost of the customer coming on-site to keep track of their order. The real customer has no such responsibility. Their only responsibility at this point is to sit at home waiting for their package to arrive). In the world before computers it was probably the responsibility of the Stock Controller or Inventory Controller (yes, those are real job titles still held by real people) to do this book-keeping. So your domain model could/should include a StockController class with that responsibility.

Now ask yourself, what data do I need to encapsulate in the fields of a StockController class? The answer is almost certainly none whatsoever. You are not usually asked to model the real stock controllers’ working hours or height or physical strength in a line-of-business application. All you need to model is their ability to UpdateStockCount() & UpdateCustomerOrderStatus(). You need model no internal state.

This is typical of business process modelling. Business processes are nearly always stateless – all the information they need is given to them each time they are invoked. This is true for so-called ‘long-running’ processes too: the classes responsible for performing each step of the process are not responsible for holding state. Rather, they expect to have data (or ids/keys to look up the data) passed to them.

In short: ninety percent of the domain of a line-of-business application is usually correctly modelled as:

  • Documents holding data
  • Stateless public processes which receive these documents as their inputs

Because most of what you are modelling is paper documents (which do nothing, except record data) being worked on by employees (about whom you know nothing, except their ability to update the documents).

And that’s why data is not encapsulated in business applications, nor are methods and data held together in the same class.
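In Java the documents-plus-stateless-processes shape might look like the sketch below. StockController and its two methods come from the discussion above (renamed to Java’s lower-camel convention); the StockRecord and OrderRecord classes and all their field names are hypothetical.

```java
// Documents: plain data holders, the "pieces of paper in a filing cabinet".
class StockRecord {
    String productCode;
    int quantityInStock;
}

class OrderRecord {
    String orderId;
    String productCode;
    int quantity;
    String status = "RECEIVED";
}

// A stateless process: no fields at all; everything it needs is passed in on each call.
class StockController {
    void updateStockCount(StockRecord stock, OrderRecord order) {
        stock.quantityInStock -= order.quantity;
    }

    void updateCustomerOrderStatus(OrderRecord order, String newStatus) {
        order.status = newStatus;
    }
}
```

Nothing is hidden in the documents and nothing is stored in the process, which is exactly the split between data and methods described above.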

Objections

But wait – surely we still have encapsulation and information hiding between for instance the UI layer vs data access layer vs domain model?

Yes, that’s right. The StockController exposes the fact that it can UpdateStockCount() but hides the fact (in fact, doesn’t even know itself) that this is achieved by writing to a copy of SQL Server installed on machine XYZ. A clear sign of concerns that should be separated at quite a high level is that they are semantically unrelated, that is, they speak a different language. The UI layer and the domain model know nothing of this ‘Sql Server’ of which you speak.

The point is rather that you should not be disappointed if it turns out that your domain model feels more anaemic than rich. That models the reality of typical business domains.

What about business rules & logic? They aren’t stateless business processes?

Also true. And the above line of thought need not be applied to them. (Does an abstract concept such as ‘business rule’ have state? Yes if it’s the-rule-as-applied-to-a-specific-case; no if it’s ‘The Rule’ in abstract).