Exploring the essence of object-oriented programming

Warning: no actual answers are given in this post 🙂

For a long time I’ve been fascinated by what object-oriented programming is really all about. I’ve never talked to someone who could give me a conclusive definition, or who could even give me the reasons why OOP is a good (or bad) idea. Lately I’ve been doing quite a lot of research, which led me to things like The Early History of Smalltalk, Smalltalk itself, a lot of work by Alan Kay, the Viewpoints Research Institute, Lisp, COLAs, Meta II and a lot more. Very interesting, I can tell you, but while I learned a lot, I still don’t have the answers. Nevertheless, I thought it was time to write down my results/thoughts so far.

For a lot of people, OOP is mostly about classes, inheritance, polymorphism and modelling nouns from the real world. When you go back in history, however, you’ll find that the term object-oriented programming was first used by Alan Kay, and he doesn’t seem to mean the aforementioned things at all. Instead, he says the most important idea about OOP is messaging, and that the term OOP was actually not that well chosen since it puts the focus on objects instead. He also says:

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

I’m specifically interested in this “definition”. Why? If a visionary like Alan Kay finds this important, it probably is. And I like to know important stuff 🙂 So what do these three things mean, what problems do they solve, and how does OOP relate to them? Spoiler: I don’t have (all) the answers yet. Let’s visit these three concepts, and how I understand them at the moment.

Messaging. Communication between objects is done through messaging only. Each object decides for itself how to interpret and act on a message. The only thing that needs to be agreed on up front is how messaging works.

Local retention and protection and hiding of state-process. An object has private memory which no other object can access. If another object wants access to some of an object’s data, this needs to happen through a message, and the object itself is in control of actually granting that request.
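To make those first two points a little more concrete, here’s a toy C# sketch of my own (purely illustrative, nothing from Kay or Smalltalk): the only way in is a message, the receiving object decides what to do with it, and its state stays private.

class Counter
{
    private int _count;                          // local, protected state: no other object can touch this

    public object Send(string message)           // the single entry point: a message
    {
        switch (message)
        {
            case "increment": _count++; return null;
            case "current":   return _count;
            default:          return "does not understand: " + message;   // the receiver decides, not the sender
        }
    }
}

A caller can only do something like new Counter().Send("increment"); whether and how that request is honored is entirely up to the object.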

Extreme late-binding of all things. This is the one whose consequences I have the most trouble grasping. The idea is to bind everything to concrete choices as late as possible. For example: compiling to bytecode binds your code to a particular machine later than compiling directly to machine code does. The main idea here is that you can delay important choices until you have a better understanding. In C# a lot of decisions are already made for you (everything is an object, no multiple inheritance, etc.) and we bind early to those decisions since there is no way to change them later. If we later find out multiple inheritance would help in some part of our code, we have no way to do that. The same goes for selecting which method is gonna interpret a message: in C# this is determined at compile time, which forces you to do all kinds of weird stuff should you actually want to decide that at runtime.
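To illustrate that last point, a small contrast of my own (again just a sketch) between early and late binding of the method that handles a call:

using System;

class Circle { public void Draw() { Console.WriteLine("circle"); } }
class Square { public void Draw() { Console.WriteLine("square"); } }

class LateBindingDemo
{
    static void Main()
    {
        object[] shapes = { new Circle(), new Square() };
        foreach (object shape in shapes)
        {
            // shape.Draw();              // won't compile: the compiler binds against the static type 'object'
            dynamic lateBound = shape;    // defer the decision...
            lateBound.Draw();             // ...and resolve Draw() at runtime on whatever object is actually there
        }
    }
}

The dynamic keyword only gets you so far, of course, but it does show the difference between committing at compile time and committing at the last possible moment.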

From a software development point of view, I’ve seen all three things being helpful in one context or another, so it feels logical that combining them indeed yields a very powerful programming model. Why Alan Kay thinks these three specifically are that important, however, I’m still not sure. At one point he talks about Real Computers All The Way Down (RCATWD), meaning that every object is a complete computer in its own right and can therefore represent anything. That’s powerful, and I currently think it’s related to that, but I’m not exactly sure in what way yet.

One thing I noticed on reviewing this post is that at no point am I talking about programming languages. That’s probably not coincidental, in the sense that OOP is not tied to any particular language; you can actually do OOP in a non-OOP language, and you can program in a non-OO style in OO languages.

So what is OOP? It seems safe to say that the fundamental elements of this programming style are at least objects and messages, where objects communicate with each other through messages. Objects have a private memory to store the data they need to carry out their computation. And that’s where things get messy: does everything need to be an object? If it is, how do you actually represent data (something like Church numerals?)? If it’s not, where do you draw the line? Does it really matter, or is it fine to do just parts of your system in OOP? How does functional programming relate to OOP?

All kinds of questions I don’t really have an answer to at the moment, but I’ll get back to you if I find out…

Integrating application data using Atom feeds

While Atom is often associated with things like blogs or news feeds, it turns out it’s also an excellent vehicle for integrating application data across your systems. At Infi we’re using this technique in several integration scenarios for our customers, and it’s quickly becoming a preferred way of solving this kind of problem.

This post assumes a basic understanding of Atom (though you can probably follow it if you don’t). If you need to brush up your knowledge, you can read a basic introduction here.

Context

We often have a situation where a subset of users manage data that is exposed to a much larger set of different users. For example, we might have a website that deals with selling used cars from car dealers, where the car dealers manage the car data, such as pictures, descriptions, etc. This data is then viewed by a large group of users on the website. One of our customers has a similar model, though in a different domain.

As this customer has grown, more and more systems needed access to this data: several websites, external APIs, an e-mail marketing system, and so on. We used to do these integrations on an ad-hoc basis, using a custom-built solution for every integration (e.g. database integration, RPC, XML feeds). As you can imagine, this was becoming an increasing pain point, since it doesn’t really scale well along a couple of dimensions, such as performance, documentation and reward/effort. The situation looked more or less like this:

[Figure: the current situation, with a separate custom-built integration for every consumer]

Requirements

To remove some of this pain we decided to build something new, with the following requirements:

Good performance. This means two things: first, we want low-latency responses to requests to the integration system. Second, we don’t want high loads on the integration solution to pull down other parts of our system.

Low latency updates. It’s OK for the data to be eventually consistent, but under normal conditions we want propagation times to be low, say on the order of several minutes or less.

No temporal coupling. Consumers shouldn’t be temporally coupled to the integration system. This means they should be able to continue doing their work even when the integration endpoint is unavailable.

Support multiple logical consumers. All (new) integrations should be done through the new system, such that we don’t need to do any modifications for specific consumers.

Ease of integration. It should be easy for consumers to integrate with this new system. Ideally, we should only have to write and maintain documentation on the domain-specific parts and nothing else.

It turns out we can address all these requirements by combining two ideas: snapshotting the data on mutations and consumer-driven pub/sub with Atom feeds.

Solution

The core idea is that whenever an update on a piece of data occurs, we snapshot it, and then post it to an append-only Atom feed:

[Figure: handling a mutation by snapshotting the data and appending it to the Atom feed]
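As an illustration of the producing side (a sketch of my own, not our production code; ISnapshotStore is a hypothetical abstraction), in .NET this boils down to wrapping the snapshot in an Atom entry and appending it to whatever storage backs the feed:

using System;
using System.IO;
using System.ServiceModel.Syndication;
using System.Xml;

public interface ISnapshotStore
{
    void Append(SyndicationItem entry);     // append-only storage backing the feed
}

public class SnapshotPublisher
{
    private readonly ISnapshotStore _store;

    public SnapshotPublisher(ISnapshotStore store) { _store = store; }

    public void Publish(string entityId, string snapshotXml)
    {
        var entry = new SyndicationItem
        {
            Id = "urn:snapshot:" + entityId + ":" + Guid.NewGuid(),
            Title = new TextSyndicationContent("Snapshot of " + entityId),
            LastUpdatedTime = DateTimeOffset.UtcNow,
            Content = SyndicationContent.CreateXmlContent(
                XmlReader.Create(new StringReader(snapshotXml)))
        };

        _store.Append(entry);               // the entry is never modified or removed afterwards
    }
}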

This way we essentially build up a history of all the mutations inside the feed. The second part of the solution comes from organizing (paging) the feed in a specific way, building on top of the Feed Paging and Archiving RFC (RFC 5005):

[Figure: the feed structure, split into fully filled archive pages and the current head page]

In this case we divide the snapshots into pages of 20 entries each.

The value in this structure is that only the root and the currently ‘active’ pages (i.e. not completely filled pages) have dynamic content. All the other resources/URLs are completely static, making them indefinitely cacheable. In general, most of the pages will be full, so almost everything will be cacheable. This means we can easily scale out using nginx or any other caching HTTP proxy.
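As an illustration (a sketch of my own, not our actual code), the server-side caching decision can be as simple as looking at whether a page is full:

static class FeedCaching
{
    const int PageSize = 20;

    // A page that already contains 20 entries will never change again, so caches may keep it
    // practically forever; the root and the active page expire quickly so consumers see new entries soon.
    public static string CacheControlFor(int entriesOnPage)
    {
        return entriesOnPage == PageSize
            ? "public, max-age=31536000"    // effectively immutable archive page
            : "public, max-age=60";         // root / active page
    }
}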

Consuming the feed

From a consumer point of view you’re gonna maintain your own copy of the data and use that to serve your needs. You keep this copy up to date by chasing the ‘head’ of the feed. This is done by continuously following the rel="previous" link in the feed documents. Once you reach the head of the feed (indicated by no entries and a missing rel="previous" link) you keep polling that URL until a new entry appears.
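A minimal head-chasing consumer could look something like this (a sketch of my own using System.ServiceModel.Syndication, not our actual client; a real consumer would also remember the last processed entry id so it doesn’t re-apply entries):

using System;
using System.Linq;
using System.ServiceModel.Syndication;
using System.Threading;
using System.Xml;

class FeedChaser
{
    public void Run(string url)
    {
        while (true)
        {
            SyndicationFeed page;
            using (var reader = XmlReader.Create(url))
                page = SyndicationFeed.Load(reader);

            foreach (var entry in page.Items)
                Apply(entry);                                   // update the local copy of the data

            var previous = page.Links.FirstOrDefault(l => l.RelationshipType == "previous");
            if (previous != null)
                url = previous.GetAbsoluteUri().ToString();     // not at the head yet: keep following
            else
                Thread.Sleep(TimeSpan.FromSeconds(30));         // at the head: poll until new entries appear
        }
    }

    void Apply(SyndicationItem entry)
    {
        // domain-specific processing of the snapshot goes here
    }
}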

Evaluating the solution

To see how this solution fulfills our requirements, let’s revisit them:

Good performance 

  • Since data is snapshotted, we don’t need to do expensive querying on our transactional system’s database. This allows for both quick responses from the integration system and isolation of our transactional system.
  • Because almost everything is cacheable, you can easily scale out and provide low-latency responses.

Low latency updates

  • The detection of changes to the data is event-based rather than some sort of large batch process. This allows the data to flow quickly through the system and appear in the feed. Polling the head of the feed is not expensive and can therefore be done on a per-minute or even per-second basis, so the clients themselves will also notice updates quickly.

No temporal coupling

  • First of all, the consumers are decoupled from the integration system because they have their own copy of the data so they don’t have to query the feed in real-time to serve their needs. Secondly, the integration system itself is also not coupled to the system containing the original data, since the snapshots are stored in the Atom feed.

Support multiple logical consumers

  • Multiple logical consumers are trivially supported by handing out the URL to the feed endpoint.
  • One problem is that different consumers require different parts of the data. Currently, we’ve solved this by always serving the union of all required data to all consumers. This isn’t exactly elegant, but it works (though we sometimes need to add fields). A better solution would be for clients to send an Accept header containing a media type that identifies the specific representation of the data they want.
  • We also built in rudimentary authentication based on pre-shared keys and HTTP basic authentication.

Ease of integration

  • This is where the whole Atom thing comes in. Since both Atom and the paging technique are standardized, we only need to document the structure of our own data. Clients, in turn, can use any standard Atom reader to process the data.
  • To make things even easier to integrate, we also created a custom JSON representation of Atom. This is useful for consumers on platforms that don’t have strong XML support.

Conclusion

As you can see, all our requirements are met by this solution. In practice, it also works very well. We’ve been able to reduce the load on our systems, get data into other systems more quickly, and it has become easier, both for us and our partners, to implement integrations. We started out doing this for just one type of data (property information), but quickly implemented it for other types of data as well.

One of the challenges is picking the right level of granularity for the feeds. Pick the granularity too narrow, and it becomes harder for a client to consume (it needs to keep track of lots of feeds). Pick it too wide, and there will be a lot of updates containing little change. In our case, common sense and some basic analysis of how many updates to expect worked out fine.

The main drawbacks we encounter are twofold. First, people not familiar with Atom and the paging standard sometimes have trouble working with the feed structure. Especially when there’s no platform support for Atom or even basic XML, we sometimes still have to help out. Second, for some integrations maintaining a copy of the data on the consumer side proved to be a bit too much (or not even possible). For these situations we built a consumer of our own, which serves data to these thinner clients.

Credits

The ideas in this post are not new. We were particularly inspired by EventStore, which also provides an Atom-based API for exposing its event streams.

When not to do responsive design

The premise of responsive web design is great: by using one set of HTML, CSS and JavaScript for all device types you can shorten development time and costs compared to maintaining multiple sets. While I haven’t seen any proof of this, I can imagine it being true in certain situations. In general, however, I don’t think it is, and there are a lot of situations where responsive can actually increase your development time and costs.

As part of my job, I regularly get asked by clients whether they should go responsive. My usual answer is that they probably shouldn’t, for various reasons. In this post I’ll list some of them.

You want to test new features often

As a first case, let’s say you’re introducing a new feature and are not yet completely sure whether your users are gonna value it. In this case it doesn’t make sense to go all the way right away. Instead, you want to put something into the market as quickly as possible, and when it turns out your feature is actually a hit, implement it for the other platforms.

Clients are often surprised by this reasoning, since it’s assumed that the whole point of responsive is that you have to do the work only once. Well, that’s just not true. No matter how you do it, you will need to design how the scaling up/down will actually work: how the pages are laid out at the different sizes, and how the interaction works. This will always be extra work, apart from any coding activities. On the coding side, there will still be extra development work, since building HTML/CSS that works for a variety of device types will always be harder than building it for a single device type. Finally, you will need to test it on all the device types.

So going responsive has increased your time to market, while what you wanted was to test the feature ASAP, on part of your audience instead of on all your users.

You want to support a new class of device

You already have a mobile site, but now want to add a desktop version. You have two options here: build separate desktop HTML/CSS or rebuild both the desktop and mobile site as a responsive site.

I’d always go for the first option because of the following implications when (re)building both together:

  • I get more development work since I now need to implement two designs instead of one.
  • The mobile site needs to be retested and debugged.
  • I probably can’t release both independently. That means I probably need to hold off on changes to the mobile site until I’ve released the desktop site, which might take a while.

In short, it’s gonna take more time to do a desktop version and you’re holding back mobile development, while I’m not convinced there is any benefit. Adding just the desktop version also goes hand in hand with the XP principle of baby steps, changing one thing at a time, which I strongly believe in.

One thing I might consider is retrofitting the finished desktop version into a responsive version that also covers the mobile site. But only when I can see the benefits of going responsive (see the conclusion).

The use cases vary greatly among device classes

When I’m buying something online, I usually do this on a desktop. I’d like to browse a little bit, compare, read some reviews, etc, before making the purchase. After making the purchase, I then use my phone to regularly check up on the status of my order.

It’s pretty obvious these are two completely different use cases, which therefore require completely different interaction models and navigation paths. This is gonna be hard to do when you’re doing responsive, since responsive pretty much assumes a 1-to-1 mapping between device classes in what you can do on each page and how you navigate. So again, without a clear benefit, you’ve seriously constrained your own ability to provide the most appropriate user experience for each device type.

You want separate teams handling mobile and desktop

As stated above, since use cases probably vary among device classes, I might want to have separate teams handling the development of each, both of them optimizing for their specific type of use cases. I want those teams to be autonomous. Having them work in the same code base is not gonna make that work: there needs to be a lot of coordination to avoid breaking each other’s work, and you can never release independently. So using responsive hurts your ability to scale/structure your teams as you like.

Conclusion

To be fair, most of the problems listed above can actually be circumvented if you try hard enough. Doing that, however, nullifies the entire argument for doing responsive in the first place, which is saving time/costs.

The underlying problem with all of the above cases is that you’re introducing coupling, and we all know that the wrong kind of coupling can really hurt your code. In the above examples, the coupling is working against you instead of for you, manifesting itself in less flexibility, less agility, longer time to market and a lesser end-user experience. All this without any real, clear benefit. For me, that’s hard to justify. Especially since, in my experience, it’s not that hard to build either a separate desktop or mobile variant of your site once you already have the other. Most of the time actually goes into other work, such as settling on functionality, implementing that functionality, designing basic styles (which you can share), etc. I think in a lot of situations this will actually save development time/costs compared to going responsive.

Only in situations where you have a very basic design, just a small amount of functionality that’s not going to change a lot, and little need for flexibility might responsive actually reduce development time, and even then not by a lot (I dare say at most 5%, if you do reuse basic style components).

Don’t get me wrong, I strongly feel you should provide a good experience for as big an audience as possible. I just don’t think responsive (across all device classes) is the general way to do it.

On URLs

Some people think the only way to get the same URL for desktop and mobile is to go responsive. This is not true, since you can detect the device class server-side and then decide which HTML you’re gonna serve. And really, Google doesn’t care.
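A minimal sketch of what that could look like in ASP.NET MVC (the controller, view names and data access are hypothetical, but Request.Browser.IsMobileDevice is a real server-side capability check):

using System.Web.Mvc;

public class CarController : Controller
{
    public ActionResult Detail(int id)
    {
        var model = LoadCar(id);                                // stand-in for real data access
        var isMobile = Request.Browser.IsMobileDevice;          // server-side device detection
        return View(isMobile ? "Detail.Mobile" : "Detail.Desktop", model);   // same URL, different HTML
    }

    private object LoadCar(int id) { return new { Id = id }; }
}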

Afterthought: How this relates to MVC

MVC emerged as a method to have multiple views on the same data. Jeff Atwood once wrote an article arguing that on the web, HTML can be seen as the model and CSS as the view. I don’t agree. For me, HTML is part of the view. To show multiple representations of the same data (the model), as you do when viewing on multiple devices, you create multiple views, each comprising both HTML and CSS.

Creating self-signed X.509 (SSL) certificates in .NET using Mono.Security

***Disclaimer***

I’m not a security expert. For that reason, I’m not completely sure in what kind of situations you can use this solution, but you should probably not use it for any production purposes. If you are an expert, please let me know if there are any problems (or not) with the solution.

*** End disclaimer ***

I recently had to programmatically create self-signed X.509 certificates in a .NET application for my WadGraphEs Azure Dashboard project. Specifically, I wanted to generate a PKCS#12 .pfx file containing the private key and the certificate, as well as a DER .cer file containing the certificate only.

Unfortunately there doesn’t seem to be an out-of-the-box managed API available from Microsoft, but I was able to make it work using Mono.Security. To see how it’s done, let’s start with how to generate them with makecert.exe in the first place.

Creating a self-signed certificate using makecert.exe

makecert.exe is a Microsoft tool that you can use to create self-signed certificates. It’s documented here, and the basic syntax to create a self-signed certificate is:

makecert -sky exchange -r -n "CN=certname" -pe -a sha1 -len 2048 -ss My "certname.cer"

This will do a couple of things:

  • Generate a 2048-bit private/public exchange-type key pair
  • Generate a certificate with name "CN=certname", signed with the above-mentioned keys
  • Store the certificate + private key in the “My” certificate store
  • Store the DER format certificate only in the file “certname.cer”

So the .cer file containing the certificate is already generated using this method, and we can get to the .pfx file by exporting it (Copy to file…) from certmgr.msc.

Now, the problem is that we can’t easily do this from code. I specifically needed a managed solution, so invoking makecert.exe from my application wouldn’t do, and neither would using the Win32 APIs. Luckily, the Mono guys created a managed port of makecert.exe, so with a bit of tuning it should be possible to generate the certificate ourselves.

Mono.Security to the rescue

The code for the makecert port is available at https://github.com/mono/mono/blob/master/mcs/tools/security/makecert.cs. To use it to generate the self-signed certificate, I extracted the code that’s actually exercised by the command line parameters given above, and put that into its own class:
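Something along these lines (a rough sketch rather than a verbatim listing: the class and method names are my own, and the exact Mono.Security members are best checked against the makecert.cs source linked above):

using System;
using System.Security.Cryptography;
using Mono.Security.X509;

// Sketch of the makecert.cs essentials for "-sky exchange -r -pe -a sha1":
// build a v3 certificate where issuer equals subject and sign it with the subject's own key.
public static class SelfSignedCertificate
{
    public static byte[] Generate(string subjectName, RSA key)
    {
        byte[] serialNumber = Guid.NewGuid().ToByteArray();
        serialNumber[0] &= 0x7F;                          // keep the serial number positive

        var builder = new X509CertificateBuilder(3)       // X.509 v3
        {
            SerialNumber = serialNumber,
            IssuerName = subjectName,                     // self-signed: issuer == subject
            SubjectName = subjectName,
            SubjectPublicKey = key,
            NotBefore = DateTime.UtcNow,
            NotAfter = DateTime.UtcNow.AddYears(10),
            Hash = "SHA1"                                 // -a sha1
        };

        return builder.Sign(key);                         // raw DER-encoded certificate
    }
}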

Generating a .pfx and .cer is now done as follows (once you’ve installed Mono.Security via NuGet):
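Again a sketch rather than the final word (the password and file names are made up, and the PKCS12 member names should be verified against Mono.Security):

using System.IO;
using System.Security.Cryptography;
using Mono.Security.X509;

class Program
{
    static void Main()
    {
        var key = new RSACryptoServiceProvider(2048);          // -len 2048, see the remarks below
        byte[] rawCert = SelfSignedCertificate.Generate("CN=certname", key);

        File.WriteAllBytes("certname.cer", rawCert);           // DER, certificate only

        var pfx = new PKCS12();
        pfx.Password = "password";                             // protects the private key in the .pfx
        pfx.AddCertificate(new X509Certificate(rawCert));
        pfx.AddPkcs8ShroudedKeyBag(key);
        pfx.SaveToFile("certname.pfx");                        // certificate + private key
    }
}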

And that’s it, you have now created a pfx/cer pair from pure managed code.

Closing remarks on Mono.Security

There are a couple of peculiarities with the makecert port:

  • The tool initializes CspParameters subjectParams and CspParameters issuerParameters based on the command line arguments, but it does not actually seem to be using them when generating the certificate. I don’t think our set of command line parameters actually influences those two objects, but it’s still a little bit weird.
  • The tool doesn’t support the -len parameter, so I changed the way the key is generated: instead of using the RSA.Create() factory, I hard-coded it to new RSACryptoServiceProvider(2048), which should do it. I’ve also confirmed the resulting key length using both OpenSSL and certmgr.msc.

It’d be great if someone can independently verify whether the above two points are indeed working as intended.

Anyway, big thanks to the Mono.Security team for providing the makecert port.

 

Implementing the payroll case study

At Infi we’re trying to create an environment of continuous learning. One of the things we do is a yearly study period together with co-workers. The subject can be anything related to software development, and this year we chose to study Uncle Bob Martin’s excellent book Agile Software Development: Principles, Patterns, and Practices.

One of the sections in the book is The Payroll Case Study, which details the development of a fictional payroll application. During the case study we learn how to apply the principles and patterns in practice, and we learn about the various trade-offs that need to be made. Since this specific domain is close to the ones we usually deal with (business-oriented domains), it’s actually quite a good example for explaining some of the things we run into on a day-to-day basis. Therefore, I decided to implement this study as a baseline example for future blog posts. I also implemented the various packaging schemes and used NDepend to create abstractness vs. instability diagrams, which are pretty cool 🙂
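For reference, the metrics behind that diagram (as defined in the book) are simple enough to compute yourself; a tiny hypothetical helper:

static class PackageMetrics
{
    // Instability I = Ce / (Ca + Ce): how sensitive a package is to changes elsewhere.
    public static double Instability(int efferentCouplings, int afferentCouplings)
    {
        return (double)efferentCouplings / (afferentCouplings + efferentCouplings);
    }

    // Abstractness A = abstract types / total types in the package.
    public static double Abstractness(int abstractTypes, int totalTypes)
    {
        return (double)abstractTypes / totalTypes;
    }

    // Distance from the "main sequence" D = |A + I - 1|; the diagram plots A against I.
    public static double DistanceFromMainSequence(double abstractness, double instability)
    {
        return System.Math.Abs(abstractness + instability - 1);
    }
}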

I put the code on GitHub, so check it out! I also welcome you to create pull requests 🙂