Interface Identification
As something of a follow-up to my last discussion, this post looks at a couple of ways that interfaces are identified, found and versioned.
The simplest implementation of the interface pattern is C++’s (possibly Java’s, but I haven’t used Java enough to speak authoritatively on it). C++ doesn’t have interfaces as part of the language, so it implements them as a pattern: a class whose functions are all pure virtual (abstract), which another class then implements. This approach identifies the interface by name. If you have the name, you can have the interface.
    class IPrinter {
    public:
        virtual void Print() = 0;
    };

    extern IPrinter* GetPrinter();

    IPrinter* myPrinter = GetPrinter();
I’ve deliberately used an external GetPrinter() function to illustrate the problem. That function lives in a separate module (DLL/so/dylib) and is exported using C++ name mangling (I’m not even going near the issue of different compilers doing that differently). Mangling schemes encode the types of all parameters (and, with some compilers such as MSVC, the return type as well), so we definitely link against a GetPrinter() with the signature we declared. However, what if the other module looks like this:
    class IPrinter {
    public:
        virtual void Open() = 0;
        virtual int Print() = 0;
    };

    IPrinter* GetPrinter() {
        // return something
    }
The external module has a newer version of IPrinter, with an extra method and a different return type for Print(). Unfortunately, the mangled name of GetPrinter() records only the name IPrinter, not the members of the class, so it doesn’t change. Linking succeeds, but we get a pointer to a different interface.
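To make "a pointer to a different interface" concrete, here is a small model of what typically goes wrong. It assumes the common implementation strategy where virtual calls are dispatched through a table indexed by declaration order; the C++ standard doesn’t require that, and the names below (Slot, OldModuleTable, NewModuleTable, ClientCallsPrint) are invented for this sketch. Mixing the two real class definitions would be undefined behaviour, so the table is simulated by hand:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// A hand-rolled stand-in for a vtable: an ordered list of function slots.
using Slot = std::string (*)();

std::string OpenImpl()  { return "Open"; }
std::string PrintImpl() { return "Print"; }

// What the client was compiled against: Print() is the first virtual function.
const std::size_t kPrintSlotSeenByClient = 0;

// Table matching the old IPrinter (Print only).
std::vector<Slot> OldModuleTable() { return {&PrintImpl}; }

// Table matching the new IPrinter (Open was declared before Print).
std::vector<Slot> NewModuleTable() { return {&OpenImpl, &PrintImpl}; }

// The compiled client only knows "call slot 0"; it has no idea what is there.
std::string ClientCallsPrint(const std::vector<Slot>& table) {
    assert(kPrintSlotSeenByClient < table.size());
    return table[kPrintSlotSeenByClient]();
}
```

In this model, ClientCallsPrint(OldModuleTable()) returns "Print", while ClientCallsPrint(NewModuleTable()) returns "Open": the client asked for Print() and silently got something else. A real mismatch is worse, since the return types differ as well.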
This can be solved using naming conventions, for example, IPrinter2/IPrinter3/etc. The problem with conventions is that they are generally not well defined. The naming convention used in COM, however, is quite clear:
Each interface — the immutable contract of a functional group of methods — is referred to at run time with a globally unique interface identifier (IID). This IID … allows a client to ask an object precisely whether it supports the semantics of the interface. [Emphasis mine, source: MSDN]
Again, this is only a convention. However, it is supported by registration, type libraries and other tools that complicate the process of stuffing up. (For those who haven’t ever used COM, don’t get scared off. The C++ support is really quite good – most of the time you can hardly tell that it’s not C++ classes.) The rules are straightforward – if you’ve released an interface into the wild, you can only change it if you change its identifier.
In fact, you generally add new interfaces to your object and never remove old ones, since that way you get good backwards compatibility. With plain C++, a client has to be rebuilt against the new definition, but a client built against an old COM definition still works with a new implementation, since construction and access to interfaces are both handled by the implementation.
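A rough sketch of that rule in portable C++ follows. It is not real COM: the actual QueryInterface takes a 128-bit IID and an output pointer and returns an HRESULT, and there is reference counting on top. Everything here (IUnknownLike, InterfaceId, the two IID values, LaserPrinter and the two client functions) is made up for illustration. The point is only that the old contract is never edited, the revised contract gets a new identifier, and the object decides which interfaces it hands out:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical identifiers; real COM uses 128-bit GUIDs.
using InterfaceId = std::uint64_t;
constexpr InterfaceId IID_IPrinter  = 0x1001;
constexpr InterfaceId IID_IPrinter2 = 0x1002;

// Simplified stand-in for IUnknown: the only way to reach another interface.
struct IUnknownLike {
    virtual void* QueryInterface(InterfaceId id) = 0;
    virtual ~IUnknownLike() = default;
};

// Original published contract. Never edited after release.
struct IPrinter : IUnknownLike {
    virtual void Print() = 0;
};

// Revised contract, published under a new identifier.
struct IPrinter2 : IUnknownLike {
    virtual void Open() = 0;
    virtual int Print() = 0;
};

class LaserPrinter : public IPrinter2 {
public:
    LaserPrinter() : v1_(*this) {}

    void* QueryInterface(InterfaceId id) override {
        if (id == IID_IPrinter2) return static_cast<IPrinter2*>(this);
        if (id == IID_IPrinter)  return static_cast<IPrinter*>(&v1_);
        return nullptr;  // interface not supported
    }
    void Open() override { open_ = true; }
    int Print() override {
        assert(open_ && "Open() must be called before Print()");
        return ++pages_;  // number of pages printed so far
    }

private:
    // Keeps the old contract alive by forwarding to the new implementation.
    struct V1Adapter : IPrinter {
        explicit V1Adapter(LaserPrinter& owner) : owner_(owner) {}
        void* QueryInterface(InterfaceId id) override {
            return owner_.QueryInterface(id);
        }
        void Print() override {
            owner_.Open();   // the old contract had no Open(), so open implicitly
            owner_.Print();  // the old contract returns nothing
        }
        LaserPrinter& owner_;
    };

    V1Adapter v1_;
    bool open_ = false;
    int pages_ = 0;
};

// A client written against the original contract only.
bool OldClientPrints(IUnknownLike& object) {
    auto* printer = static_cast<IPrinter*>(object.QueryInterface(IID_IPrinter));
    if (printer == nullptr) return false;
    printer->Print();
    return true;
}

// A newer client asks for the revised contract and gets the page count back.
int NewClientPrints(IUnknownLike& object) {
    auto* printer = static_cast<IPrinter2*>(object.QueryInterface(IID_IPrinter2));
    if (printer == nullptr) return -1;
    printer->Open();
    return printer->Print();
}
```

Both clients can be pointed at the same LaserPrinter. The old one never learns that IPrinter2 exists, and neither can end up holding a pointer to an interface it didn’t ask for.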
Of course, nobody uses COM any more, right? We’ve all moved on to .NET and C#… right? Interfaces in managed code are first-class citizens, unlike in C++, but they have many levels of robustness for identification.
Due to the detailed metadata made available in managed class libraries, you can import the definition of an interface from the file containing the implementation. When you compile your project, the version of the implementation file is stored and checked at runtime. Mismatched versions will cause a crash.
However, version numbers do not necessarily increment for each build. Rest assured, if you change the members of IPrinter behind the client’s back, they will notice and crash anyway (eventually), but if you change the version number as well the crash will be neater (as in, it provides an explanation of what’s wrong).
Ensuring that version numbers increment automatically on the implementation helps, as long as the client application takes it into account. The reference to the implementation contains the current version number, but also a flag indicating whether that version number is relevant. Disable that, and any version of a file with that name will be used. (This could also be referred to as the “asking for trouble” flag, though there are so many of those around that we’ll call this one “Specific Version” to disambiguate.)
You can go one step further and sign your assembly with a strong name key, which doesn’t guarantee that the contents haven’t been altered, but does guarantee that this is the interface you are looking for. Whether this is better or worse than using a GUID, I’m not 100% sure. It’s certainly more work, so I like to think that it’s better.
A recent and interesting approach to interfaces has come out of Google’s Go language. Go’s documentation has an explanation for C++ programmers, but the short story is that while interfaces are defined in much the same way, a type is automatically associated with an interface when it implements the same methods. So the following Printer interface applies automatically to both structures shown:
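    type Printer interface {
        Open()
        Print() int
    }

    type BoringOldPrinter struct {
        printerHandle int
    }

    // The methods use pointer receivers, so it is *BoringOldPrinter
    // (and *MultiFunctionPrinter below) that satisfies Printer.
    func (p *BoringOldPrinter) Open()      {}
    func (p *BoringOldPrinter) Print() int { return 0 } // placeholder result

    type MultiFunctionPrinter struct {
        printerHandle int
        scannerHandle int
    }

    func (p *MultiFunctionPrinter) Open()      {}
    func (p *MultiFunctionPrinter) Scan() int  { return 0 } // extra method, ignored by Printer
    func (p *MultiFunctionPrinter) Print() int { return 0 }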
I’ll leave the discussion of Go here and move on to Python, primarily because Go is not yet mature enough to discuss versioning. It appears that at present, changing the implementation requires creating new function names, which is a step backwards from C++. I will wait and see.
Can Python really be said to have interfaces? Probably not. It seems similar to Go (“seems” because I don’t understand Go that well, not because I’m inventing stuff about Python) in that if you know the method you want, any object implementing that method will do. Go at least requires all the methods in the interface to be present. However, neither of these languages provides strict interfaces. Which in Python’s case is good. Strict interfaces add overhead that Python does not need (yet…).
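For readers who think in C++, the difference between the two can be approximated with templates. This is an analogy rather than a description of how either language works, and it is checked at compile time, where Python checks at run time. LabelMaker, PrintOnce, IsPrinter and OpenAndPrint are invented for the sketch. PrintOnce is the Python-like case (only the method actually called has to exist), and IsPrinter is the Go-like case (every method of the interface has to be present):

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Neither type inherits from any interface class.
struct BoringOldPrinter {
    void Open() {}
    int Print() { return 1; }
};

struct LabelMaker {
    int Print() { return 7; }  // has Print() but no Open()
};

// Python-like: only the method that is actually called has to exist.
template <typename T>
int PrintOnce(T& device) {
    return device.Print();
}

// Go-like: every method of the "interface" must be present.
// Primary template: assume T is not a printer.
template <typename T, typename = void>
struct IsPrinter : std::false_type {};

// Chosen only when T has both `void Open()` and `int Print()`.
template <typename T>
struct IsPrinter<T, typename std::enable_if<
        std::is_same<decltype(std::declval<T&>().Open()), void>::value &&
        std::is_same<decltype(std::declval<T&>().Print()), int>::value>::type>
    : std::true_type {};

// Refuses to compile for types that do not satisfy the whole interface.
template <typename T>
int OpenAndPrint(T& device) {
    static_assert(IsPrinter<T>::value, "T must provide Open() and int Print()");
    device.Open();
    return device.Print();
}
```

PrintOnce accepts both types, while OpenAndPrint accepts BoringOldPrinter and rejects LabelMaker at compile time. Neither check says anything about versions, which is the gap the rest of this post is about.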
C++ is where huge projects and libraries are done, with C# on the rise and COM on the decline. Google is trying to position Go as a C++ replacement but seems to have missed the importance of strict interfaces, while Python does a great job of filling its niche. Strict and controlled interfaces are important for large, multi-developer projects (and essential when developer communication is limited, such as when someone releases a library). However, since I’ve just hit 1000 words, I’ll leave that discussion for next time.
