Persistent Connections

English is not my first language, so I have to remind myself of definitions from time to time. Bear with me if you see me define a phrase before I elaborate on it. To persist something is to keep it in existence. So what does a persistent connection refer to, and why is this something software engineers should know about?

TCP is the de facto protocol of network communication. When we want to send data between node A and node B we establish a TCP connection. Be it a simple visit to google.com or an Oracle database connection, under the hood it is almost always TCP.
Establishing the connection involves a lot of work: handshaking, acknowledgments and, when the connection is encrypted, making sure the parties are in fact who they say they are. We don't need to discuss the details of what exactly happens in this post, but one thing you should know: opening a TCP connection is expensive.
So imagine this scenario. You visit the http://www.nationalgeographic.com website. This uses the HTTP protocol, which underneath opens a TCP connection to the National Geographic server. So let's take a time machine back to 1996, when HTTP/1.0 had just been released. This is what would happen:
Open TCP Connection national geographic website (this has many steps remember)
Read Index.html and send it back to the browser
Close TCP Connection
For each Image in the page
     Open TCP Connection
     Read Image from the server disk, send back to browser
     Close TCP Connection
Next image

Images are just an example of the things we need to load; back then there were much nastier resources.
So now imagine the overhead of opening and closing all those connections: the network congestion from all the acknowledgments being sent back and forth, and the wasted processing cycles the server and client have to endure. That is why persistent connections became popular: open a connection, leave it open while we send everything we have, and close it once we are done. Here is a modern visit to nationalgeographic.com in 2016, over HTTP/1.1:

Open TCP Connection national geographic website
Read Index.html and send it back to the browser
For each Image in the page
     Read Image from the server disk, send back to browser
Next image
... Do more
... Do more
Close TCP Connection
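
To make the difference between the two flows concrete, here is a minimal Python sketch. It is only an illustration under a few assumptions: a throwaway local server stands in for the National Geographic server, and the names `CountingHandler`, `fetch_with_new_connection_each_time`, `fetch_with_persistent_connection` and `count_connections` are made up for this example. The server counts how many TCP connections it accepts while the client fetches the same list of resources in both styles.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that counts accepted TCP connections."""

    protocol_version = "HTTP/1.1"  # lets the server keep connections open
    connections_opened = 0

    def setup(self):
        # setup() runs once per TCP connection, not once per request.
        super().setup()
        type(self).connections_opened += 1

    def do_GET(self):
        body = b"resource " + self.path.encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_with_new_connection_each_time(host, port, paths):
    """HTTP/1.0 style: open, request, close for every single resource."""
    bodies = []
    for path in paths:
        conn = http.client.HTTPConnection(host, port)
        conn.request("GET", path, headers={"Connection": "close"})
        bodies.append(conn.getresponse().read())
        conn.close()
    return bodies


def fetch_with_persistent_connection(host, port, paths):
    """HTTP/1.1 style: open once, send every request, close at the end."""
    conn = http.client.HTTPConnection(host, port)
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)
            # The response must be read fully before the connection is reused.
            bodies.append(conn.getresponse().read())
    finally:
        conn.close()
    return bodies


def count_connections(fetch, paths):
    """Run `fetch` against a local server; return the connections it accepted."""
    CountingHandler.connections_opened = 0
    server = ThreadingHTTPServer(("127.0.0.1", 0), CountingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        fetch(host, port, paths)
        return CountingHandler.connections_opened
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    paths = ["/index.html", "/a.png", "/b.png"]
    print("new connection each time:",
          count_connections(fetch_with_new_connection_each_time, paths))
    print("persistent connection:",
          count_connections(fetch_with_persistent_connection, paths))
```

With the first style the server should see one connection per resource; with the second, a single connection carries all the requests.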

Now, I know this is very specific to browsers and web servers, but the same story is true for database connections. In my years of experience as a programmer working with databases, I developed a habit, and I am sure most of you did too, of opening a connection, sending a query and then closing the connection. That might be fine and barely noticeable if you have 10 or so users on your application, but as you scale up, you will start noticing performance degradation.
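
One common way to break that open-query-close habit is to keep a few connections open and lend them out. Here is a rough sketch of that idea. The `ConnectionPool` class and `run_queries` helper are hypothetical names for this post, and the standard library's `sqlite3` only stands in for a real networked database, where the cost of connecting is much higher.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical pool: opens `size` connections once, then lends them out."""

    def __init__(self, connect, size):
        self.opened = 0  # how many real connections were ever opened
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())
            self.opened += 1

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks if every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # hand it back instead of closing it

    def close(self):
        # Close whatever is currently idle; call this at shutdown.
        while not self._idle.empty():
            self._idle.get_nowait().close()


def run_queries(pool, count):
    """Run `count` queries, borrowing a pooled connection for each one."""
    results = []
    for i in range(count):
        with pool.connection() as conn:
            results.append(conn.execute("SELECT ? * 2", (i,)).fetchone()[0])
    return results


if __name__ == "__main__":
    pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), size=2)
    print(run_queries(pool, 5))
    print("connections opened:", pool.opened)
    pool.close()
```

However many queries run, the number of connections opened stays at the pool size.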
Another thing you gain from persistent connections is PUSH events. This is how WhatsApp is able to freak you out by instantly delivering your wife's message "Where are you!" to your phone the moment she hits send on hers. WhatsApp does that by keeping a live connection open from your mobile to their server (on top of that connection it reportedly speaks a customized variant of XMPP, a messaging protocol; we can touch upon that in another post).
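
Here is a tiny sketch of what "push" means on an open connection. This is not how WhatsApp actually does it; it is plain TCP on localhost with newline-separated messages, and `push_server`, `receive_pushed` and `demo` are names I made up. The point is that the client connects once and never sends a request, yet the messages still arrive.

```python
import socket
import threading


def push_server(listener, messages):
    """Accept one client, then push messages to it without being asked."""
    conn, _ = listener.accept()
    with conn:
        for message in messages:
            conn.sendall(message.encode("utf-8") + b"\n")


def receive_pushed(host, port):
    """Open one connection and just listen; no request is ever sent."""
    received = []
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:  # the loop ends when the server closes
                received.append(line.rstrip("\n"))
    return received


def demo(messages):
    """Wire a server and a client together on a free local port."""
    listener = socket.create_server(("127.0.0.1", 0))
    host, port = listener.getsockname()
    server = threading.Thread(target=push_server, args=(listener, messages))
    server.start()
    try:
        return receive_pushed(host, port)
    finally:
        server.join()
        listener.close()


if __name__ == "__main__":
    for message in demo(["Where are you!", "Hello?"]):
        print("pushed to client:", message)
```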
Disadvantages?

We mentioned the advantages of persistent connections, but are there any disadvantages? Yes, there is no free lunch, apparently.
When using persistent connections, you keep each connection alive on both the client and the server, so you eat up more memory the more connections you keep alive. So you have to be smart about closing idle connections. Another problem that came with persistent connections is specific to TCP. In a nutshell, now that we keep connections alive on the server, attackers came up with this idea: "Hey, what if we made the server run out of memory by establishing millions of connections and never replying back?" That is one of the ideas behind denial-of-service (DoS and DDoS) attacks. Again, we can touch more on this in another post.
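
Being smart about idle connections can be as simple as remembering when each connection was last used and closing the ones that have been quiet for too long. Below is a minimal sketch of that bookkeeping. `IdleConnectionTracker` is a hypothetical name, the 30-second timeout is an arbitrary value for the example, and real servers usually have this built in as a keep-alive or idle timeout setting.

```python
import time


class IdleConnectionTracker:
    """Hypothetical helper: tracks last activity and reaps idle connections."""

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # seconds of silence we tolerate
        self._clock = clock               # injectable so it can be tested
        self._last_seen = {}              # connection id -> last activity time

    def touch(self, conn_id):
        """Record activity on a connection (also registers a new one)."""
        self._last_seen[conn_id] = self._clock()

    def reap(self, close):
        """Call `close(conn_id)` for every connection idle for too long."""
        now = self._clock()
        expired = [conn_id for conn_id, seen in self._last_seen.items()
                   if now - seen >= self.idle_timeout]
        for conn_id in expired:
            del self._last_seen[conn_id]
            close(conn_id)
        return expired

    def open_count(self):
        return len(self._last_seen)


if __name__ == "__main__":
    fake_now = [0.0]  # a fake clock so the demo does not have to sleep
    tracker = IdleConnectionTracker(idle_timeout=30, clock=lambda: fake_now[0])
    tracker.touch("client-a")
    fake_now[0] = 20.0
    tracker.touch("client-b")
    fake_now[0] = 35.0
    print("closed:", tracker.reap(close=lambda conn_id: None))
    print("still open:", tracker.open_count())
```

A server would call `touch` whenever data arrives on a connection and run `reap` periodically, passing a function that really closes the socket.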
-Hussein
Published on July 04, 2016 19:35