Caution: my English is far from perfect. (My Russian is not always good either.)

Friday, 22 August 2014

Semantic Versioning is Not the Solution

People often think they can introduce incompatible changes into their library's API and simply increase the major version number, as semantic versioning proposes, to spare the library's clients from problems.

It is not true.

Consider the dependency tree of my-application:
  web-server 1.1.1
    commons-logging 1.1.1
  db-client 1.1.1
    commons-logging 1.1.1
  authentication 1.1.1
    commons-logging 1.1.1
Now commons-logging changes its API incompatibly and is released as commons-logging 2.0.1. The authentication library adopts commons-logging 2.0.1 in its 1.1.2 release, while the other libraries still depend on 1.1.1:
  web-server 1.1.1
    commons-logging 1.1.1
  db-client 1.1.1
    commons-logging 1.1.1
  authentication 1.1.2
    commons-logging 2.0.1
Now my-application is broken, because the dependency tree includes two versions of commons-logging which share package, class and function names, and thus cannot be loaded simultaneously.
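
The conflict above can be detected mechanically. Here is a minimal sketch, assuming the dependency tree is available as a plain dictionary; the data layout and the find_conflicts helper are hypothetical and not part of any real build tool:

```python
from collections import defaultdict

# Hypothetical representation of the tree from the post:
# (library, version) -> list of (dependency, version).
BEFORE = {
    ("web-server", "1.1.1"): [("commons-logging", "1.1.1")],
    ("db-client", "1.1.1"): [("commons-logging", "1.1.1")],
    ("authentication", "1.1.1"): [("commons-logging", "1.1.1")],
}
AFTER = {
    ("web-server", "1.1.1"): [("commons-logging", "1.1.1")],
    ("db-client", "1.1.1"): [("commons-logging", "1.1.1")],
    ("authentication", "1.1.2"): [("commons-logging", "2.0.1")],
}


def find_conflicts(dependencies):
    """Return {library: sorted versions} for every library required in more than one version."""
    required = defaultdict(set)
    for deps in dependencies.values():
        for name, version in deps:
            required[name].add(version)
    return {
        name: sorted(versions)
        for name, versions in required.items()
        if len(versions) > 1
    }


print(find_conflicts(BEFORE))  # {}
print(find_conflicts(AFTER))   # {'commons-logging': ['1.1.1', '2.0.1']}
```

Detecting the conflict does not resolve it: as long as both versions claim the same names, only one of them can be loaded.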

When you release an incompatible API this way, you essentially split the world of dependent libraries into two parts: the ones depending on the old version, and the ones depending on the new version. Libraries from the first part cannot be used together with libraries from the second part.

A better way to introduce an incompatible API is to release it as a new library, for example commons-logging2 or new-logging. Make it possible to use the new library simultaneously with the old one, e.g. give it a new package name.
Doing so will protect clients in the majority of cases.
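
To illustrate, here is a rough sketch in Python, where modules play the role of packages. The module names commons_logging and commons_logging2, their log functions, and the make_module helper are all made up for this example; the modules are built in memory only to keep the sketch self-contained:

```python
import sys
import types


def make_module(name, source):
    """Build an in-memory module from source text and register it under `name`."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# Hypothetical old API: log(message).
make_module(
    "commons_logging",
    "def log(message):\n    return 'v1: ' + message\n",
)
# Hypothetical incompatible API, released under a NEW name: log(level, message).
make_module(
    "commons_logging2",
    "def log(level, message):\n    return 'v2 [' + level + ']: ' + message\n",
)

# Old and new clients can now live in the same process.
import commons_logging
import commons_logging2

print(commons_logging.log("started"))           # v1: started
print(commons_logging2.log("INFO", "started"))  # v2 [INFO]: started
```

Had the new API been registered under the old name commons_logging, it would simply have replaced the old module, and every caller of log(message) would break.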

If we release a new library for every new API, there is no need for such a thing as a "major version number".

NB: in some module systems, most notably in JavaScript, there are no global package/class names through which different versions of a library can interfere. But in the majority of programming languages that problem exists.


Cosmin Lehene said...

What you're describing seems to be a limitation of the programming language rather than of the versioning model. If you could load modules with different versions without suffering from namespace clashes, you could have both libraries at the same time.
That's assuming they won't be conflicting outside the scope of your runtime (i.e. if they write binary to the same destination that wouldn't work) but such a conflict would be similar in your example.

This said, semantic versioning is not the solution in the same way versioning is not the solution. You're essentially "summarizing" the state of an entire collection of interfaces into a version. That's a lossy (and poor) representation of the information about your global state.

Anton Vodonosov said...

I agree that being able to load different versions freely would be nice (in JavaScript we often can; in Java sometimes too, using a separate class loader).

But this language "limitation" is not the root cause of the problem, in my opinion.

The root cause is the library author's attempt to call different things by the same name: reusing the old name for a new API.

As for your second point, I agree: SemVer's power to describe compatibility between what the client requires and what the library provides is too coarse-grained and otherwise limited.

David McKay said...

I don't believe that a new package is the right action here at all. Why not namespace your versions within the package?

namespace Common\Library\v2;
// The leading backslash makes PHP resolve the parent class from the global namespace.
class CommonLogger extends \Common\Library\v1\CommonLogger {}

By no means perfect, but leaning towards a better goal.

Anton Vodonosov said...

By "package" I mean the same thing that you call a "namespace".

Maurizio Turatti said...

OSGi is a well-known solution for this problem in the Java world.
However, it introduces quite an additional layer of complexity, as OSGi itself is not as simple as its sponsors tend to suggest.

AM said...

A possible approach: letting incompatible versions be transitive.

Since authentication is switching from commons-logging 1.1.1 to commons-logging 2.0.1, which breaks compatibility, its own version should reflect this fact and indicate that it is no longer compatible with commons-logging 1.*; for example, it could be updated to authentication 2.0.0.

I agree that this means the version numbers won't tell anything about API compatibility, but they would also be about binary compatibility.

Anton Vodonosov said...

@AM, this doesn't fix the root cause: two commons-logging versions can't be loaded together, and therefore they split the world of libraries depending on them into two parts which can't be used together.

Suppose we need the new authentication 2.0.1, which provides an important new feature or bugfix. We can't use it until all our other dependencies switch to the same commons-logging version that authentication uses. We are locked in.

The solution I propose is much simpler and more reliable. It also lets the library's clients migrate with less work: we can leave old, tested code as is, relying on the old API, and use the new API in new code.

It is also important that breaking compatibility is seldom really needed. Almost always, instead of incompatibly changing e.g. a function's argument list, we can introduce a new function and leave the old, deprecated function in place (maybe reimplemented in terms of the new, better function). But I see people who are aware of semantic versioning carelessly break backward compatibility because they think a new major version number protects clients.
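
A small sketch of that approach, with made-up function names (log is the old API, log_event is the hypothetical new one):

```python
import warnings


def log_event(message, level="INFO", tags=()):
    """New, richer API (hypothetical): takes a level and optional tags."""
    tag_text = ",".join(tags)
    suffix = f" ({tag_text})" if tag_text else ""
    return f"[{level}] {message}{suffix}"


def log(message):
    """Old API, kept for existing callers and reimplemented on top of log_event."""
    warnings.warn(
        "log() is deprecated; use log_event()",
        DeprecationWarning,
        stacklevel=2,
    )
    return log_event(message)


print(log("started"))                         # [INFO] started
print(log_event("started", "WARN", ("db",)))  # [WARN] started (db)
```

Old callers keep working and only get a deprecation warning, while new code can use the extra arguments.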

Anton Vodonosov said...

@Maurizio Turatti, true. Even more generally speaking, Java allows loading two different versions of the same class in separate classloaders (that's how OSGi works). And you are right, it's overkill (who would use Java's StringUtils via OSGi?). BTW, commons.lang did the right thing and released its new API version in a new package, org.apache.commons.lang3; I can imagine the huge breakage all over the Java world if commons.lang had broken all the old classes and just changed the major version number.
