Vectors as ADT

I talked about autognostic objects a couple of weeks ago, and in that post contrasted them with abstract data types (ADTs). I promised to follow up with a post on an ADT implementation, so here it is.

First of all, let’s state the autognosis property once again: an autognostic object can only have detailed knowledge of itself. This constraint is required for objects, but not for ADTs. On the contrary: ADTs are allowed (maybe even expected) to inspect detailed information from other values of their own type (and only of their type).

From that point of view, it’s perfectly fine to implement the Vector add operation in an ADT as follows:
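A minimal sketch of what such an ADT could look like in JavaScript. The names `create`, `add` and `toString`, and the use of a closure as the module boundary, are assumptions made for illustration; the representation is hidden from clients by convention only.

```javascript
// A Vector ADT: clients are meant to see only the type's operations.
// The {x, y} representation is private to this module by convention.
const Vector = (() => {
  // Builds the hidden representation.
  const create = (x, y) => Object.freeze({ x, y });

  // add reaches into the representation of BOTH operands,
  // which is allowed because both are values of type Vector.
  const add = (augend, addend) =>
    create(augend.x + addend.x, augend.y + addend.y);

  const toString = (v) => `(${v.x}, ${v.y})`;

  return { create, add, toString };
})();

const sum = Vector.add(Vector.create(1, 2), Vector.create(3, 4));
console.log(Vector.toString(sum)); // prints "(4, 6)"
```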

As you can see, we blatantly access the private data (the x and y) of the addend in order to perform the calculation. We can do this because both the augend and addend are of type Vector and ADTs are allowed to access each other’s private data when they’re of the same type.

The name Vector denotes a type abstraction. With this kind of abstraction, the abstraction boundary is based on a type name (Vector). This means that as a client all you can see is the type and operations, but the implementation is hidden. “Within” the type, though, you have full access to the implementation and representations. It also means that, contrary to objects, you cannot easily interoperate with other values, since they have a different type and therefore have a hidden representation. All ADTs are based on type abstraction.

This also has implications for extensibility; specifically, an ADT has to know all possible representations. To see that, let’s say we again want to add a polar representation for the Vector. We do this so we can keep full accuracy when creating a vector from polar coordinates, accuracy that would be lost if we converted it to rectangular coordinates first. In JavaScript, we can implement that as follows:
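One way this could be sketched, assuming a `kind` tag on the hidden representation. The tag and the function names `fromRectangular`, `fromPolar`, `x`, `y` and `magnitude` are hypothetical, chosen for illustration.

```javascript
// Vector ADT with two hidden representations, distinguished by a tag.
const Vector = (() => {
  const fromRectangular = (x, y) => Object.freeze({ kind: "rect", x, y });
  const fromPolar = (r, theta) => Object.freeze({ kind: "polar", r, theta });

  // Every operation has to know about every representation.
  const x = (v) => (v.kind === "rect" ? v.x : v.r * Math.cos(v.theta));
  const y = (v) => (v.kind === "rect" ? v.y : v.r * Math.sin(v.theta));
  const magnitude = (v) => (v.kind === "rect" ? Math.hypot(v.x, v.y) : v.r);

  // For simplicity the sum is always stored in rectangular form.
  const add = (augend, addend) =>
    fromRectangular(x(augend) + x(addend), y(augend) + y(addend));

  return { fromRectangular, fromPolar, x, y, magnitude, add };
})();

const polar = Vector.fromPolar(2, Math.PI / 2);
console.log(Vector.magnitude(polar)); // 2, returned exactly as supplied

const sum = Vector.add(Vector.fromRectangular(1, 1), polar);
console.log(Vector.x(sum), Vector.y(sum)); // approximately 1 and 3
```

Adding a third representation would mean touching `x`, `y` and `magnitude` again, which is the point of the example.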

It isn’t pretty, but in languages that have sum types and static typing it tends to work a bit better.

The important thing is that we had to change the ADT significantly to support the new representation. In fact, every new representation will require changes to the ADT. Compare that to objects, where we were able to add new representations without changing any of the existing ones. The reason is that ADTs are abstracted by type, while objects are abstracted by interface.

In general, ADTs are much less suited to adding new representations than objects are. It turns out this difference in extensibility is at the heart of the differences between ADTs and objects, and I’ll dive into that further in a future post. Don’t think ADTs are all bad, though: they have other qualities… If you’d like a sneak peek, check out the Expression Problem on Wikipedia.

Starving outgoing connections on Windows Azure Web Sites

I recently ran into a problem where an application running on Windows Azure Web Apps (formerly Windows Azure Web Sites or WAWS) was unable to create any outgoing connections. The exception thrown was particularly cryptic:

[SocketException (0x271d): An attempt was made to access a socket in a way forbidden by its access permissions x.x.x.x:80]
   System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress) +208
   System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket,
     IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception) +464

And no matter how much I googled, I couldn’t find anything related. Since it was definitely related to creating outgoing connections, and not specific to any service (I couldn’t connect over HTTP or to SQL), I started to consider that WAWS was limiting the number of outbound connections I could make. More specifically, I hypothesized I was running out of ephemeral ports.

So I did a lot of debugging, looking around for non-disposed connections and such, but couldn’t really find anything wrong with my code (except the usual). However, when running the app locally I did see a lot of open HTTP connections. Now, I’m not gonna go into details but it turns out this had something to do with a (not very well documented) part of .NET: ServicePointManager. This manager is involved in all HTTP connections and keeps connections open so they can be reused later.

When doing this on a secure connection with client authentication, there are some specific rules on when it can reuse the connections, and that’s exactly what bit me: for every outgoing request I did, a new connection was opened, not reusing any already open connection.

The connections stay open for 100 seconds by default, so if I had enough requests coming in (translating to a couple of outgoing requests each), the number of connections indeed became quite high. On my local machine, this wasn’t a problem, but it seems Web Apps constrains the number of open connections you can have.
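To get a feel for the numbers, here is a rough back-of-the-envelope sketch. It assumes a steady request rate, that no connection is ever reused, and that each one stays open for the full 100 seconds; the function name and the example rates are made up for illustration.

```javascript
// Steady-state estimate: open connections = new connections per second
// multiplied by the number of seconds each one stays open.
function estimateOpenConnections(
  incomingPerSecond,
  outgoingPerIncoming,
  keepOpenSeconds = 100
) {
  return incomingPerSecond * outgoingPerIncoming * keepOpenSeconds;
}

// 10 incoming requests per second, each causing 2 outgoing requests.
console.log(estimateOpenConnections(10, 2)); // prints 2000
```

Under those assumptions, even a modest 10 requests per second already ends up above the 1920 connections I measured for a single S1 instance.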

As far as I know, these limits aren’t documented anywhere, so instead I’ll post them here. Note that these limits are per App Service plan, not per App.

App Service Plan                   Connection Limit
Free F1                            250
Shared D1                          250
Basic B1, 1 instance               1920
Basic B2, 1 instance               3968
Basic B3, 1 instance               8064
Standard S1, 1 instance            1920
Standard S1, 2 instances           1920 per instance
Standard S2, 1 instance            3968
Standard S3, 1 instance            8064
Premium P1, 1 instance (Preview)   1920

I think it’s safe to say that the number of available connections is per instance, so that 3 S3 instances have 3*8064 connections available. I also didn’t measure P2 and P3, and I assume they are equal to the B2/S2 and B3/S3 levels. If someone happens to know an official list, please let me know.

The odd numbering of the limits might make more sense if you look at it in hex: 0x780 (1920), 0xF80 (3968) and 0x1F80 (8064).
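A quick sketch that makes the pattern explicit: each measured limit sits exactly 128 below a power of two. The helper name `describeLimit` is made up for illustration, and what those 128 are used for is not something I know.

```javascript
// For a measured limit, return its hex form and its distance
// to the next power of two.
function describeLimit(limit) {
  const nextPowerOfTwo = 2 ** Math.ceil(Math.log2(limit));
  return {
    hex: "0x" + limit.toString(16).toUpperCase(),
    nextPowerOfTwo,
    gap: nextPowerOfTwo - limit,
  };
}

for (const limit of [1920, 3968, 8064]) {
  const { hex, nextPowerOfTwo, gap } = describeLimit(limit);
  console.log(`${limit} = ${hex} = ${nextPowerOfTwo} - ${gap}`);
}
// prints:
// 1920 = 0x780 = 2048 - 128
// 3968 = 0xF80 = 4096 - 128
// 8064 = 0x1F80 = 8192 - 128
```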

If you run into trouble with ServicePointManager yourself, I have a utility class that might come in handy to debug this problem.

 

Autognostic objects in C# and Javascript

William Cook talks about the difference between objects and Abstract Data Types (ADTs) in this great paper: On Understanding Data Abstraction, Revisited. In his treatise on objects, he lists a property called “Autognosis” which, in his words, describes the constraint:

An object can only access other objects through their public interfaces.

Put another way (again by Cook):

An autognostic object can only have detailed knowledge of itself.

I find this constraint particularly interesting, because it’s something that’s easily violated in conventional OOP languages. Presumably because it’s so easy, a lot of OOP code also actually does violate it.

Let’s take the example of a 2D vector; a reasonable implementation in C# (I’ll come to Javascript later) might go something like this:

So what’s the problem? Well, to compute the sum of the two vectors the augend needs access to the private data of addend (addend._x and addend._y). Thus, one object (the augend) accesses another object (the addend) through something else than the public interface. This violates the Autognosis constraint.

In Javascript, we can have an equivalent implementation:
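A minimal sketch of what such an implementation might look like, assuming a constructor function with plain `x` and `y` fields:

```javascript
// Constructor function with plain data fields.
function Vector(x, y) {
  this.x = x;
  this.y = y;
}

// add reads the addend's x and y fields directly.
Vector.prototype.add = function (addend) {
  return new Vector(this.x + addend.x, this.y + addend.y);
};

const sum = new Vector(1, 2).add(new Vector(3, 4));
console.log(sum.x, sum.y); // prints 4 6
```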

Whether the autognosis property is violated in this case is a little less clear: Javascript doesn’t have access modifiers, so it isn’t clear from the code itself what is part of the public contract and what is not. If x and y are part of it, it isn’t violated. For now, however, let’s assume only fields that have a function value are part of the public contract. So let’s fix it:
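One possible fix, sketched under the assumption above that only function-valued fields are public. The factory name `createVector` and the accessors `getX` and `getY` are illustrative choices.

```javascript
// x and y live in the closure; only functions are exposed.
function createVector(x, y) {
  return {
    getX: () => x,
    getY: () => y,
    // add only talks to the addend through its public functions.
    add(addend) {
      return createVector(x + addend.getX(), y + addend.getY());
    },
  };
}

const sum = createVector(1, 2).add(createVector(3, 4));
console.log(sum.getX(), sum.getY()); // prints 4 6
```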

By itself, this doesn’t bring us very much: we’ve just wrapped the instance variables. However, from an extensibility point of view this has brought us a great deal: we can now create new, interoperable vector implementations:

And their interoperation:
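A self-contained sketch covering both the new implementation and the interoperation, assuming the public contract consists of `getX`, `getY` and `add`. The factory names are hypothetical.

```javascript
// Existing implementation: stores rectangular coordinates.
function createRectangularVector(x, y) {
  return {
    getX: () => x,
    getY: () => y,
    add: (addend) =>
      createRectangularVector(x + addend.getX(), y + addend.getY()),
  };
}

// New implementation: stores polar coordinates, exposes the same contract.
function createPolarVector(r, theta) {
  const self = {
    getX: () => r * Math.cos(theta),
    getY: () => r * Math.sin(theta),
    add: (addend) =>
      createRectangularVector(
        self.getX() + addend.getX(),
        self.getY() + addend.getY()
      ),
  };
  return self;
}

const rect = createRectangularVector(1, 1);
const polar = createPolarVector(2, 0); // points along the x axis

// Either kind can be the augend or the addend.
const a = rect.add(polar);
const b = polar.add(rect);
console.log(a.getX(), a.getY()); // prints 3 1
console.log(b.getX(), b.getY()); // prints 3 1
```

Note that `createRectangularVector` did not have to change at all to work with the polar implementation.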

Let’s take a moment to appreciate what we did here. We were able to introduce a new vector implementation and use that seamlessly within our existing code, without needing to change any of that existing code. This is great, since it means that your application can be extended and improved after writing it in the first place, without needing to modify it.

There is another great advantage to this: besides you writing new extensions, other people can do that as well, allowing your application to do stuff that you never imagined in the first place. Admittedly, the vector example doesn’t really show those traits, but OO design principles are exactly what’s at the heart of platform ecosystems such as iOS and Android apps, or even the internet itself.

To summarize, that’s one of the great benefits of OOP: objects provide for autonomous extensibility with interoperability. Autognosis is one of the key facilitators of that property.

Afterthoughts

This story isn’t specific to JavaScript. In C# or Java we can use interfaces to obtain similar results. We should be careful not to explicitly test for class type in the implementation though, since that again violates the autognosis property. (The same goes for JavaScript, but testing for types seems to be less common there.)

Also notice that nowhere in this post did I talk about either classes or inheritance, yet I still arrived at some very useful properties directly related to OOP. This strengthens my belief that classes and inheritance are not fundamental parts of OOP.

I haven’t said too much about ADTs in this post; however, I’ll try to do a follow-up post to explore how you’d do a similar extension using ADTs.

Credits

Inspiration for this post came from:

My 16 definitions of Microservices

If you’re in software development these days it’s almost impossible to have missed all the buzz about microservices. The first time I heard about it was in a presentation by Fred George, which I really enjoyed and which really challenged my thinking. After that, I heard the term popping up more and more and noticed that there isn’t really any consensus on what it means. This really confused me, since a lot of people are talking about different things but calling them by the same name. However, they all do seem to revolve around the idea of partitioning a large body of code into smaller parts. The question then becomes: on what criteria are you partitioning? Or, what defines a service boundary?

I think that’s the essence of my confusion: people partition their code across different dimensions. So, to clear up my own mind, I’ve decided to compile a list of partitioning dimensions/definitions I’ve come across and share them here with you:

#1 Deployment unit

A microservice is a unit of deployment. By this definition every part of your system that you individually deploy is a microservice. For example: a web site, a web service, a SPA, background tasks, etc. This definition is usually related to the scalability properties of microservices: the idea that you can individually scale different parts of your system, something that’s useful if different parts are under different loads. The services are not necessarily independently deployable.

#2 Independent deployments

By this definition a microservice is a piece of your system that you can (and should) deploy independently. This consists of at least one deployment unit. You are independently deployable if you can deploy your service without other services needing to deploy as well. A service will consist of more than one deployment unit if the individual deployment units must be deployed together for the service to keep functioning correctly. In this approach services are autonomous in the sense that they can be updated independently without depending on other services.

#3 Business function

Each service is responsible for a specific business function, such as billing, sales, pricing or recommendations. This is mostly useful if it’s combined with Independent teams (#5), Separate code base (#15) or Independent deployments (#2). With Independent teams, it’s clear to the business owner who to talk to when there’s a problem or when they need a new feature. Separate code bases help because the discipline of not using raw data from a different business function is enforced, since that code is simply not nearby.

#4 Technical function

Here, a service is defined by its technical function such as front-end, back-end, database or messaging infrastructure. As with Business function (#3) this would usually not be considered a microservice without Independent Teams (#5), Separate code base (#15) or Independent deployments (#2). Yet, I’ve seen people calling a good old 3-tier architecture a microservices architecture without there being any other criteria.

#5 Independent teams

Each team is responsible for one service. In essence the code a team is working on defines the service boundary. The team develops and runs the service completely by themselves (a two-pizza team). They can work autonomously in the sense that they can make all the decisions with regard to the service they are responsible for. When they have a dependency on another service they agree on a contract with the team responsible for that service. Teams are usually aligned to Business function (#3) or Technical function (#4) and sometimes also have their own Separate code base (#15).

#6 Private memory space

Services are defined by their ability to run in their own memory space. If an application runs in its own memory space, it’s a service. Such services cannot be crashed by other services running in the same process (since there aren’t any). Also the in-memory data is completely private to the service, so can’t be used by other services. Each service can potentially be built on a different platform or programming language (#10).

#7 Independent database

Each service has its own private database. Services can’t access databases from other services. There are as many services as there are databases in the system: a database defines the service. The services are completely autonomous in their choice of data storage, schema refactorings, etc. They can also be held completely responsible for the conceptual integrity of their data. In general a service will only have a few tables (or data schemas if you like). This is important because a big problem in large monoliths is the ease of access to data that you conceptually have no business touching. If you’re providing a CRUD-y interface on your service, this doesn’t count.

#8 Temporally decoupled

A service is a piece of code that’s temporally decoupled from other pieces of code. This means a service can keep operating (for a finite amount of time) even if services it needs to interact with or depends on are down. This usually implies some form of async messaging using queues or service buses. RPC between services is out of the question because you cease to be functional whenever the other service is down.
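To illustrate the idea, here is a toy sketch in which an in-memory array stands in for a durable queue or service bus. The class `MessageQueue`, the "order" and "billing" services and the message shape are all hypothetical.

```javascript
// Hypothetical in-memory stand-in for a durable queue or service bus.
class MessageQueue {
  constructor() {
    this.messages = [];
  }

  publish(message) {
    this.messages.push(message);
  }

  // Hands every pending message to the handler and clears the backlog.
  drain(handler) {
    const pending = this.messages;
    this.messages = [];
    pending.forEach(handler);
    return pending.length;
  }
}

const queue = new MessageQueue();
const billed = [];

// Order service: publishes an event instead of calling billing directly.
function placeOrder(orderId) {
  queue.publish({ type: "OrderPlaced", orderId });
  return "accepted";
}

// Billing is "down": nothing consumes, yet orders are still accepted.
placeOrder(1);
placeOrder(2);
console.log(queue.messages.length); // prints 2

// Billing comes back up and works through the backlog.
queue.drain((message) => billed.push(message.orderId));
console.log(billed); // prints [ 1, 2 ]
```

With a direct RPC call from `placeOrder` to billing, both orders would have failed while billing was down.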

#9 Communicating via REST/JSON over HTTP

A microservice is any application that communicates with other microservices via REST/JSON over HTTP. This specifically rules out multiple services running in the same process or using some form of binary protocol. This is mostly done from an interoperability standpoint, since such a service is highly interoperable: a lot of platforms speak REST/JSON and HTTP.

#10 Independent choice of platform/programming language

Two microservices are not two microservices if it isn’t possible to write them in different languages. In this sense, it’s possible for each service to “pick the right tool for the job”. Service boundaries are defined by the ability to do this.

#11 Objects

Microservices are no different from “real objects”. With real objects being the way Alan Kay originally thought of them: objects providing a goal-based interface to provide some sort of service to the user. Interactions between objects will generally occur through messaging with a synchronous in-proc call being a specific kind of message, but not the only way objects can communicate (though that’s the only way that ended up in OOP languages).

#12 Containers/Docker

A microservice is any application that runs inside a container such as Docker. When people define it this way there are usually other constraints as well (specifically Independent deployments (#2) or Communicating via REST/JSON over HTTP (#9)), but I’ve come across definitions that seemingly don’t require anything else but containerization.

#13 SOA done right

SOA was originally intended to do a lot of stuff that’s now attributed to microservices. However, SOA somehow ended up being largely synonymous with web services, and people that were once big on SOA are now happy to have a new name for the same ideas with microservices. The definition of a service in SOA is also not that clear, but I think (correct me if I’m wrong) it was mostly supposed to be about interoperability and IT/business alignment.

#14 Shared-nothing architecture

Services are pieces of code that “share nothing”. That means they’re very well encapsulated: no conceptual or physical data sharing. This combines Private memory space (#6) and Independent database (#7). For some people it also means not sharing a virtual or physical machine to run on.

#15 Separate code base

A service is defined by the code base it’s in. Or equivalently: each service has its own code base. This is absolutely required for Independent choice of platform/programming language (#10) but it’s also implied in many of the other definitions.

#16 Bounded context

A service is isomorphic with a DDD Bounded Context. This means services and BCs map one-to-one. It will usually involve the strong code and runtime isolation found in some of the other definitions, to enforce the models staying clean. Eric Evans also did a talk about this at DDDX.

Conclusion

I think of the above list more as axes than definitions. In fact, I find it highly unlikely that any architecture that calls itself a microservices architecture will conform to only one of the points in the above list. Two or three will be much more likely.

So what is the right definition? I really think it doesn’t matter. All of the above definitions have pros and cons in given contexts. The question is not so much which definition to use, but what context you’re in. You are probably not Amazon or Netflix, so things that apply to them might not apply to you. Quite the contrary, using their definitions and playing by their rules will probably hurt you more than it will help you. Therefore, pick the definition that helps you with the problems you’re facing, be it scalability, time to market, quality or whatever, but don’t get distracted by anything else just to be on the microservices bandwagon.

Closing remarks

I’m sorry I haven’t provided a list of sources with regard to the definitions. Most are straight off the top of my head, and come from blogs I read, talks I’ve seen or people I’ve talked to. If you disagree and think microservices are well defined, please let me know. Also, if I missed any definitions, don’t hesitate to comment.

 

AzurePlot

For the last couple of months I’ve been working on a little side-project: AzurePlot. Yesterday I put out a big new release, with a lot of features that should make it something that’s useful for others as well. From the readme:

AzurePlot plots metrics from various Azure services. It’s designed to be a better alternative to the native charting/monitoring capabilities of the Azure portal, focusing on usability and performance. It works by accessing the APIs provided by the individual services.

The reason I built it is that I’m kind of disappointed with the Azure portal with regard to its charting/metrics capabilities. Whenever there is a production issue with one of our projects on Azure, gaining insight through the Azure portal is just a big PITA because it’s slow, lacks basic features, etc. Simply, it isn’t usable to do the kind of diagnosis I need to do. So that needed to change.

I’ve used Graphite in the past, and was really happy with that, so at first I wanted to get my Azure metrics into Graphite. That became reality with WadGraphEs. However, to access the data you need to have access to the Azure APIs and that requires uploading a management certificate, giving you wide access to the Azure subscription. I didn’t want that kind of responsibility in my product, so I set out to build an intermediate application that you would host yourself, which exposes the metrics through an additional API of its own. When building that, I figured it would make sense if that project was also able to render the data. That became AzurePlot.

I’m quite happy with how it turned out. Currently it can:

  • Read metrics data from Windows Azure Web Sites, Web/worker roles, VMs and SQL Database
  • Chart those metrics in a dashboard
  • Export the metrics through an API for external consumption

To give you an idea what it looks like:

[Screenshot: websites]

There’s still a lot of work to be done, such as consuming data from more data sources and providing more powerful chart manipulations. I will continue working on that in the coming months. Even so, if you’re running on Azure you might find it really useful, so check it out!