Starving outgoing connections on Windows Azure Web Sites

I recently ran into a problem where an application running on Windows Azure Web Apps (formerly Windows Azure Web Sites or WAWS) was unable to create any outgoing connections. The exception thrown was particularly cryptic:

[SocketException (0x271d): An attempt was made to access a socket in a way forbidden by its access permissions x.x.x.x:80]
   System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress) +208
   System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket,
     IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception) +464

And no matter how much I googled, I couldn’t find anything related. Since it was definitely related to creating outgoing connections, and not specific to any service (I couldn’t connect over HTTP or to SQL), I started to suspect that WAWS was limiting the number of outbound connections I could make. More specifically, I hypothesized I was running out of ephemeral ports.

So I did a lot of debugging, looking around for non-disposed connections and such, but couldn’t really find anything wrong with my code (except the usual). However, when running the app locally I did see a lot of open HTTP connections. Now, I’m not gonna go into details but it turns out this had something to do with a (not very well documented) part of .NET: ServicePointManager. This manager is involved in all HTTP connections and keeps connections open so they can be reused later.

When making requests over a secure connection with client authentication, there are specific rules about when a connection can be reused, and that’s exactly what bit me: every outgoing request I made opened a new connection instead of reusing one that was already open.

The connections stay open for 100 seconds by default, so if I had enough requests coming in (each translating to a couple of outgoing requests), the number of open connections indeed became quite high. On my local machine this wasn’t a problem, but it seems Web Apps constrains the number of open connections you can have.
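To get a feel for how quickly this adds up, here is a rough back-of-the-envelope sketch in Python. It assumes that no connection is ever reused and that each one lingers for the full 100-second idle time, so the steady-state count is simply arrival rate times lifetime. The function name and the load figures are made up for illustration; the 1920 figure is the limit I measured for a single S1 instance (listed further down).

```python
# Default idle time before an unused HTTP connection is closed (100 seconds, as stated above).
IDLE_SECONDS = 100

def estimated_open_connections(incoming_per_second, outgoing_per_request,
                               idle_seconds=IDLE_SECONDS):
    """Steady-state connection count if nothing is reused (rate x lifetime)."""
    return incoming_per_second * outgoing_per_request * idle_seconds

# Hypothetical load: 10 incoming requests per second, 2 outgoing calls each.
open_connections = estimated_open_connections(10, 2)
print(open_connections)          # 2000
print(open_connections > 1920)   # True: more than one S1 instance allowed in my measurements
```

The estimate ignores the time a connection is actively in use, so treat it as a lower bound rather than a precise figure.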

As far as I know, these limits aren’t documented anywhere, so instead I’ll post them here. Note that these limits are per App Service plan, not per App.

App Service Plan                     Connection Limit
Free F1                              250
Shared D1                            250
Basic B1, 1 instance                 1920
Basic B2, 1 instance                 3968
Basic B3, 1 instance                 8064
Standard S1, 1 instance              1920
Standard S1, 2 instances             1920 per instance
Standard S2, 1 instance              3968
Standard S3, 1 instance              8064
Premium P1, 1 instance (Preview)     1920

I think it’s safe to say that the number of available connections is per instance, so that 3 S3 instances have 3*8064 = 24192 connections available. I didn’t measure P2 and P3; I assume they are equal to the B2/S2 and B3/S3 levels. If someone happens to know of an official list, please let me know.
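If you want to estimate the budget for your own plan, the measured values can be put in a small lookup. This is only a sketch: the numbers are my measurements, not official figures, the per-instance multiplication is my assumption, and the names MEASURED_LIMITS and plan_connection_budget are made up for the example.

```python
# Per-instance limits as measured in the list above (not official figures).
MEASURED_LIMITS = {
    "F1": 250, "D1": 250,
    "B1": 1920, "B2": 3968, "B3": 8064,
    "S1": 1920, "S2": 3968, "S3": 8064,
    "P1": 1920,
}

def plan_connection_budget(tier, instances=1):
    """Total connections, assuming the limit applies to each instance separately."""
    return MEASURED_LIMITS[tier] * instances

print(plan_connection_budget("S3", 3))  # 24192
```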

The odd-looking limits might make more sense if you look at them in hex: 0x780 (1920), 0xF80 (3968) and 0x1F80 (8064).
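A quick Python check of the hex values also shows that each measured limit sits exactly 128 below the next power of two. That is just an arithmetic observation about these three numbers; I don’t know how the limits are actually derived. The helper name gap_to_next_power_of_two is made up for the example.

```python
def gap_to_next_power_of_two(limit):
    """Return (next power of two above limit, distance from limit to it)."""
    next_power = 1 << limit.bit_length()
    return next_power, next_power - limit

for limit in (1920, 3968, 8064):
    next_power, gap = gap_to_next_power_of_two(limit)
    print(f"{limit} = {hex(limit)} = {next_power} - {gap}")

# 1920 = 0x780 = 2048 - 128
# 3968 = 0xf80 = 4096 - 128
# 8064 = 0x1f80 = 8192 - 128
```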

If you run into trouble with ServicePointManager yourself, I have a utility class that might come in handy to debug this problem.

 

  • Jordan Chang

    I’m so glad I found your article. I’ve been stuck with a production issue for the last 2 days, very similar to the issue you’ve described. I’m getting about 3 to 5 SocketExceptions for my REST API calls to my backend API server, while the rest of them succeed. My site is heavily used every minute, and the SocketExceptions keep occurring; the number is very low, but enough to fill up my logs.

    Unable to connect to the remote server. A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.

  • Puneet Gupta

    This happens if you hit the sandbox limits

    To learn about the sandbox limits, please refer to https://github.com/projectkudu/kudu/wiki/Azure-Web-App-sandbox

  • Henrik

    This saved my sanity today T_T

  • Federico Gerardo Báez

    Hi Freek! Great post! We’re currently having a similar issue and I was wondering how you finally fixed it.
    We have several outgoing connections running over HTTPS and, having read your article, it seems that the connections are not being reused.
    I was wondering what the “specific rules” are for when the ServicePointManager can reuse connections.
    I’m really tempted to tweak ServicePointManager.MaxServicePointIdleTime to a lower value, but I don’t know whether this could backfire on me (and I don’t have a good idea of what the best value would be). My logic says that if the connections can’t be reused, the idle time should be 0.
    Any feedback would be really welcome, thanks in advance 🙂

    • Hi Federico, thanks, I appreciate it.

      What I ended up doing was reusing a single instance of the top-level class that depended on the HTTPS connection.

      I’m not really sure what the rules are for reusing the connection; I don’t think they’re documented. Are you using client certificates as well? In my case it was related to that, but I had to do some real digging in the .NET source code to figure that out.

      You probably don’t want to tweak those parameters (yet), since it’s treating the symptoms instead of the cause. Did you try to use the debugging class I linked to in the post? Could you tell me a bit more about your case?