Should a method name describe what it does or what it intends?

Bob Martin raises a good example in InformIT: Robert C. Martin’s Clean Code Tip of the Week #1: An Accidental Doppelgänger in Ruby > Duplication of two functions that do the same thing but mean different things by it.

I recently stumbled into a slightly different take on the question of should a function say what it does or what it intends?

When a  function implements business process Alpha that today consists of steps A and B (but tomorrow  may change) should you call the function DoBusinessProcessAlpha or call it DoStepsAandB?

One answer would be, if the function is in a public package which exposes business functionality then the name should probably show that it does BusinessProcessAlpha. But if it is a private, not exposed, function then the reader is probably looking for the detail, that it does StepsAandB.

The question is more awkward if Steps A and B are themselves business process functions. That is, if you asked your customer, they would understand what steps A and B mean.

I suppose in that case you could always call it DoAlphaAsStepsAandB().

Your CPU vs your code. Two short lessons in software performance

There’s a great Q and Answer at dijkstra-path-finding-in-c-is-15x-slower-than-c-version which starts with C++ code at 2ms vs C# at 38ms; and finishes with C# at 2.8ms.

On similar lines, the solution to this riddle — why-does-net-core-2-0-perform-worse-than-net-framework-4-6-1 — has nothing to do with the framework; it results from the slower performance of 64bit code compared to 32bit.

Which just goes to confirm what you might have suspected;

VMs aren’t that slow; and
For best performance, you have to understand your VM, and your language, and your O/S, and ultimately your CPU.

Three Historical Definitions of the Open/Closed Principle and a Claim that it’s Pointless

Bertrand Meyer first published the OCP in his influential 1980’s book Object Oriented Software Construction:

  • “A module is said to be open if it is still available for extension. For example, adding new fields, or performing new functions.
  • A module will be said to be closed if it is available for use by other modules.”

It’s a neat double-definition, not least because the definition of Closed is both useful, and one that might not contradict the definition of Open. But Meyer’s proposed technique—use subclassing to achieve Openness of a Closed module—is widely ignored. Many of us have discovered the pain of working with inheritance hierarchies, so we savour the Gang of Four’s sage dictum: “prefer composition over inheritance.”

Dynamic languages like javascript can do the open/closed trick quite easily. The danger is that, in doing so, you develop inscrutable code and and are left with system that, when it works, you don’t know how it works; and so when it breaks you don’t know how to fix it.

Uncle Bob all-but-redefined the Open/Closed principle by using Interfaces as his technique. The interface is fixed and Closed; modules that depend on it can rely on it not changing. But the Implementation is Open: it can be changed without breaking the interface.

It is worth mentioning a word of wisdom from the .Net team’s Framework Design Guidelines: the weakness of interfaces in Java and .Net is precisely that they are 100% closed. There can be no version 2. Or rather, if there is a InterfaceV2 then it can usually have no useful relationship to InterfaceV1. You might as well call it ICompletelyUnrelatedInterface. (Or perhaps one could put the versioning at the namespace level).
This versioning problem is widely felt in service oriented systems with public interfaces. It is often addressed by creating a new endpoint for a new version of the service. Offering two versions of a service becomes on the whole precisely as expensive as offering two services, which is to say twice as expensive. This is unfortunate.

Contrast this with Meyer’s vision of OCP: On Meyer’s subclassing approach, version 1 clients and version 2 clients would call the same service and get the same responses. Version 2 clients would recognise, and so be able to use, the enhanced v2 capabilities; whereas version 1 clients would only recognise the version 1 capabilities. But here I see a second problem with Meyer’s vision: I’ve almost never seen systems (or even parts of systems) that can achieve it in practise. It’s a beautiful dream. But unachievable. It is a pipedream.

More recently (Dec 2016), Michael Feathers has offered an updated version, towards the bottom of the the page at Towards a galvanizing definition of technical debt:
“our code is better to the degree that we don’t have to change it much when we add features. We should be able to make modifications primarily by adding new classes and functions rather than changing existing ones”
This is much ‘softer’ formulation than Meyer’s or Bob Martin’s and you could take it as just a rule of thumb; something to weigh in the balance against other factors. Feathers implementation in this case (and I’m left with the impression that in a different codebase he’d be happy with a different implementation) is doing event driven code as most people think it should be done: use an AddEventListener() interface, which makes the code Open to all kinds of extension.
This AddEventListener() is exactly the approach used in the HTML spec and other GUI frameworks of the past 20 years. The downside is that the ‘closed’ bit of the interface is so small and weakly-typed that it’s almost non-existent. The interface tells you nothing about the semantics. (What kind of events can I listen to? What information do I get about each event? What can I do with them? I can only find out by reading the HTML spec, which turns out to be quite hard going, or turning to MDN, or, the first port of call for many, StackOverflow. In a bespoke codebase replace this with “ask for documentation; find it is incomplete; and then hunt through the code for examples of how I can use it”).
Strongly typed interfaces are at least somewhat self-documenting—they offer a definitive list of all syntactically valid calls to the service—even if that documentation depends heavily on how well the developers chose their method and parameter names.

These three examples leave me with mixed feelings. OCP seems like trying to square the circle, and Meyer’s choice of name was a well-chosen contradiction. Yet the goals—Openness for extension, Closedness for reliability—are unavoidable.

Dan North, amongst others, has suggest that OCP, and indeed all the SOLID principles, are of limited value and we should drop them in favour of something else. I sympathise—I think that SOLID is a mishmash of mixed value—but I’m willing to wrestle for a couple more years with OCP before I admit defeat.

I’d rather have the above three technique, and others, in my toolkit because my software design still has to address the two contradictory requirements that Meyer identified in the 80s:
–Because my software is still evolving, it has to be open for evolution: it has to change.
–Because my software is already is use, and hence being depended on by some other software or person, it has to be reliable and therefore can’t change.

Conway’s Law & Distributed Working. Some Comments & Experience

The eye-opener in my personal experience of Conway’s law was this:

A company with an IT department on the 1st floor, and a marketing department on the 2nd floor, where the web servers were managed by the marketing department (really), and the back end by the IT department.

I was a developer in the marketing department. I could discuss and change web tier code in minutes. To get a change made to the back end would take me days of negotiation, explanation and release co-ordination.

Guess where I put most of my code?

Inevitably the architecture of the system became Webtier vs Backend. And inevitably, I put code on the webserver which, had we been organised differently, I would have put in a different place.

This is Conway’s law: That the communication structure – the low cost of working within my department vs the much higher cost of working across a department boundary – constrained my arrangement of code, and hence the structure of the system. The team “just downstairs” was just too far.  What was that gap made of? Even that small physical gap raised the cost of communication; but also the gaps & differences in priorities, release schedules, code ownership, and—perhaps most of all—personal acquaintance; I just didn’t know the people, or know who to ask.

Conway’s Law vs Distributed Working

Mark Seemann has recently argued that successful, globally distributed, OSS projects demonstrate that co-location isn’t all it’s claimed to be. Which set me thinking about communication in OSS projects.

In my example above, I had no ownership (for instance, no commit rights) to back end code and I didn’t know, and hence didn’t communicate with, the people who did. The tools of OSS—a shared visible repository, the ability to ‘see’ who is working on what, public visibility of discussion threads, being able to get in touch, to to raise pull requests—all serve to reduce the cost of communication.

In other words, the technology helps to re-create, at a distance, the benefits enjoyed by co-located workers.

When thinking of communication & co-location, I naturally think of talking. But @ploeh‘s comments have prodded me into thinking that code ownership is just as big a deal as talking. It’s just something that we take for granted in a co-located team. I mean, if your co-located team didn’t have access to each other’s code, what would be the point of co-locating?

Another big deal with co-location is “tacit” knowledge, facilitated by, as Alistair Cockburn put it, osmotic communication. When two of my colleagues discuss something, I can overhear it and be aware of what’s going on without having to be explicitly invited. What’s more, I can quickly filter out what isn’t relevant to me, or I can spontaneously join conversations & decisions that do concern me. Without even trying, everyone is involved when they need to be in a way that someone working in a separate room–even one that’s right next door–can’t achieve.

But a distributed project can achieve this too. By forcing most communication through shared public channels—mailing lists, chatrooms, pull request conversations—a distributed team can achieve better osmotic communication than a team which has two adjacent rooms in a building.

The cost, I guess, is that typing & reading is more expensive (in time) than talking & listening. Then again, the time-cost of talking can be quite high too (though not nearly as a high as the cost of failing to communicate).

I still suspect that twenty people in a room can work faster than twenty people across the globe. But the communication pathways of a distributed team can be less constrained than those same people in one building but separated even by a flimsy partition wall.

References