I want to move my simple blocking-socket code to libuv, so that I can handle more than a single connection per thread. The catch is that I also want to do that with TLS, and that seems to be much harder. There are a bunch of GitHub projects that talk about this, but as I know nothing about libuv (and very little about OpenSSL), I decided to write my own TLS echo server with libuv to get a better understanding of how it all plays together.
Sit tight, this might take a while to explain. This is a complex topic, and it took me a couple of nights of hacking to get it to work, and then a lot of thinking to simplify it into something that I actually like.
There seems to be great documentation for libuv, which is awesome. I went over the simple echo server sample and it seems relatively straightforward. Making the jump to using TLS is a bit harder. OpenSSL makes it really easy to set up SSL on a socket file descriptor and read from or write to it. There is even support for non-blocking operations, but I didn’t want to be forced to write my own select()/poll() code, so how can I integrate these two libraries?
OpenSSL has the notion of a BIO, which stands for Basic I/O. Basically, this is a stream abstraction. One of the options that OpenSSL has available is the memory BIO. So the overall idea is to:
- Set up libuv to accept a connection
- Set up OpenSSL with the server-side configuration
- When a new connection comes through, set up a new SSL instance from that configuration
- Read data from the socket and pass it to the SSL instance, and vice versa
- Enjoy encrypted communication
The devil is in the details, naturally. In my experience, the most complex part, after getting the initial handshake to work, is the fact that you can get a renegotiation at any time, which means that a write request can fail because OpenSSL needs to read more data first. That really complicates the amount of state that you have to manage.
Basically, when you manage your own state, any SSL_write() may require you to do an SSL_read() and then retry the previous write. The simplest scenario that we have here is the call to SSL_accept() on the connection, which results in the following code to manage this state:
To handle a read, we need to check, after every read, whether the act of reading caused us to need to write (the client wants to renegotiate the connection, so OpenSSL needs to send data on the connection, which we need to orchestrate) before we can do the actual read. For writes, we need to remember what we are writing, read and write from the network, and then retry our write. This is awkward to do when using synchronous calls, but the amount of state that we have to keep in async, callback-driven programming is a lot. I got it working, but it was really hard and mostly a big house of cards.
I really didn’t like that approach, and decided that I should go about it in a very different way. I realized that I had a fundamental conceptual error in how I approached libuv. Unlike standard async programming in C#, for example, libuv is based on the idea of a loop. In other words, unlike in the code above, you aren’t going to set up the next read from the network after each one. That is already done for you. You just call uv_read_start() and you’ll get served the data from the network whenever it is available. You can also inject your own behaviors into the loop, which makes things really interesting for us.
Here is the logic: we continuously read from the network and pass the buffer to OpenSSL. We then try to read the decrypted data with SSL_read(). This can fail because we are waiting for more data, and that is fine. We’ll be called again when there is such data. However, we’ll also add a step at the end of the I/O loop to check if there are any pending buffers that need to be flushed to the network. For writes, if we fail to do the write because we need to read, we’ll register the write to be executed later and wait for the network to deliver the data that we need to read.
Given that C isn’t an OO language, I think that I’ll start by explaining the structs that hold the system together, and then the operations that are invoked on them:
The first thing to note here is that we have clear layers in the code. We have the connection_handler_t here, which is a bunch of function pointers that allow higher-level code to work with a connection abstraction. The first portion of the code defines the interface that I expect callers to use. As you can see, we have a few functions that deal with creating, establishing and tearing down a connection. We also have the most common operations, reads and writes.
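To make the shape of that interface concrete, here is a minimal sketch of what such a table of function pointers could look like. The type names connection_handler_t and tls_uv_connection_state_t come from the post, but the field names, the signatures, the bytes_seen field and the demo_* / deliver_plaintext helpers are assumptions made for illustration, not the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Per-connection state. The real struct would hold the SSL instance,
 * the libuv handle and so on; bytes_seen is a hypothetical field that
 * exists only so this sketch has something to observe. */
typedef struct tls_uv_connection_state {
    size_t bytes_seen;
} tls_uv_connection_state_t;

/* The callbacks that higher-level code supplies (assumed field names). */
typedef struct connection_handler {
    tls_uv_connection_state_t *(*create_connection)(void);
    int (*connection_established)(tls_uv_connection_state_t *c);
    void (*connection_closed)(tls_uv_connection_state_t *c, int status);
    int (*read)(tls_uv_connection_state_t *c, void *buf, size_t len);
} connection_handler_t;

/* A trivial handler: allocate the state and count incoming bytes. */
tls_uv_connection_state_t *demo_create(void) {
    return calloc(1, sizeof(tls_uv_connection_state_t));
}

int demo_established(tls_uv_connection_state_t *c) {
    (void)c;
    return 1; /* non-zero: keep the connection open */
}

void demo_closed(tls_uv_connection_state_t *c, int status) {
    (void)status;
    free(c);
}

int demo_read(tls_uv_connection_state_t *c, void *buf, size_t len) {
    (void)buf;
    c->bytes_seen += len;
    return 1;
}

const connection_handler_t demo_handler = {
    demo_create, demo_established, demo_closed, demo_read
};

/* What the lower layer does when plaintext is available: it calls
 * the handler, the handler never asks for data itself. */
int deliver_plaintext(const connection_handler_t *h,
                      tls_uv_connection_state_t *c,
                      void *buf, size_t len) {
    assert(h != NULL && h->read != NULL && c != NULL);
    return h->read(c, buf, len);
}
```

The point of the sketch is the direction of the calls: the caller fills in the table once, and from then on the lower layer decides when each function runs.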
The write method is pretty obvious, I think. You give it a buffer and it takes care of writing it to the other side. Note that this is an asynchronous process, and if there are any errors in the process, you’ll get them in the connection_closed callback. Reading, on the other hand, is completely out of your hands and will be invoked directly by the lower-level code whenever it feels like it. This inversion of control may feel strange for people who are used to invoking I/O directly, but it likely allows for better overall performance.
Now that we have the interface, let’s build a TLS echo server with it. Here is what that looks like:
You can see that there isn’t really much done here. On connection creation, we simply allocate space for a tls_uv_connection_state_t. This is a callback because your code might want to allocate more space for whatever you want to keep in the per-connection structure. When the connection is established (after the SSL negotiation, etc.), you get a chance to initiate things from the server side. In the code above, we simply let the client know that the connection has been successful. From that point on, we simply echo back to the client anything that they send us.
The SSL and libuv initialization is bare-bones stuff and not really interesting. The nice bits happen at the end of the snippet, where we define the overall server state and wire together the protocol definition.
That is great, but where is the part where stuff actually gets done?
A note about this code. I’m writing this primarily for ease of reading and understanding. I’m ignoring a lot of potential errors that I would be obliged to handle in production code. Handling them would significantly complicate the code, but it must be done if you want to use this code for anything other than understanding the overall concept.
Let’s finish setting up the libuv machinery before we jump to any other code, shall we? Here is what this looks like:
This is fairly straightforward. We are listening on a socket and binding any incoming connection to the on_new_connection() callback. There is also the after_io prepare handle, which we use to handle delayed operations (I’ll talk about this later). For now, I want to focus on accepting new connections and processing them.
There is quite a lot going on in this method, and not all of it is obvious. First, we handle accepting the connection and binding its input to the libuv event loop. Then we create a connection and set up some of the SSL details.
We create an SSL instance for this connection and create two Basic I/O instances that reside in memory, one for the incoming stream and one for the outgoing stream. We’ll be using them to pass data through the OpenSSL encryption, negotiation, etc. We also mark this as a server instance.
Once that is done, we invoke the connection_established() callback and then tell the libuv event loop to start pumping data from this socket to the handle_read() callback. For now, I want to ignore the connection_established() callback; it isn’t important for understanding the flow of the code at this point (but we’ll circle back to it). It is important to understand that by the time we call this callback, the connection is ready to use and can receive and send data. Well, not receive, because we don’t provide a way to pull data from the connection; we’ll be pushing that data to the provided callback. This will happen by libuv calling the handle_read() method whenever there is data on the socket. Here is how we handle this:
When libuv calls us with some data, we write this data into the read buffer for OpenSSL and then call SSL_read() to get the unencrypted data that was sent to us. There are some issues here. First, SSL/TLS has framing, and the amount of data that you read from the network isn’t going to be the amount of unencrypted bytes that you get in the end. Another issue is that we need to be careful about renegotiations, which are generally permitted at any point, but can cause a read to do a write (and may require a write to do a read).
You might have noticed that this code contains absolutely no indication of this. Instead, we call SSL_read() to get the plaintext data from OpenSSL. We continue to do this until we get an error from SSL_read(). This can be either a real error or an indication that we need to read more from the network. Whenever I get some bytes from OpenSSL, I pass them directly to the read() callback that was provided to us.
If you examine the code carefully, you’ll see that when we run out of data to read, we try to flush the SSL state of the connection. Let’s look at what that method does:
We check if the connection is already in the queue, and if it isn’t, we check whether it should be added. There are two reasons why a connection should be added to the pending_writes queue. First, we may have data buffered in the write buffer of the SSL connection, which needs to be sent over the network. Second, we may have failed writes that we need to retry after we read more data into the SSL connection.
You might notice that we are doing some pointer hopping in the process of registering the connection in the queue. This is basically using a doubly linked list and will be important later. If we are putting stuff into a queue, what is going to be reading from this queue?
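The pointer hopping is easier to follow in isolation. Below is a minimal sketch of an intrusive doubly linked queue, where the links live inside the connection struct itself. The pending_writes name is from the post; connection_t, server_state_t, in_queue, prev_pending, next_pending and the two functions are assumed names for illustration. One benefit of the back pointer, presumably the reason it matters later, is that a connection can be unlinked from the middle of the queue without walking the list.

```c
#include <assert.h>
#include <stddef.h>

typedef struct connection connection_t;

struct connection {
    int in_queue;               /* non-zero while linked into the queue */
    connection_t *prev_pending; /* neighbor closer to the head */
    connection_t *next_pending; /* neighbor closer to the tail */
};

typedef struct server_state {
    connection_t *pending_writes; /* head of the queue, NULL when empty */
} server_state_t;

/* Push a connection at the head of the queue, unless it is already queued. */
void queue_add(server_state_t *s, connection_t *c) {
    assert(s != NULL && c != NULL);
    if (c->in_queue)
        return;
    c->prev_pending = NULL;
    c->next_pending = s->pending_writes;
    if (s->pending_writes != NULL)
        s->pending_writes->prev_pending = c;
    s->pending_writes = c;
    c->in_queue = 1;
}

/* Unlink a connection from anywhere in the queue in constant time. */
void queue_remove(server_state_t *s, connection_t *c) {
    assert(s != NULL && c != NULL);
    if (!c->in_queue)
        return;
    if (c->prev_pending != NULL)
        c->prev_pending->next_pending = c->next_pending;
    else
        s->pending_writes = c->next_pending; /* c was the head */
    if (c->next_pending != NULL)
        c->next_pending->prev_pending = c->prev_pending;
    c->prev_pending = NULL;
    c->next_pending = NULL;
    c->in_queue = 0;
}
```

The in_queue flag is what makes it safe to call queue_add() after every read and every write without worrying about linking the same connection twice.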
Remember that when we set up the libuv stuff, we used the after_io prepare handle? This is called on each iteration of the loop, just before libuv checks if there is any I/O to process. This gives us the chance to deal with the confusing read-on-write and write-on-read nature of OpenSSL in a more structured manner. Let’s first look at the code, and then see how this all plays together.
This is what actually handles writing to the network. We take data from the SSL write buffer and send it to the network. Once the write is done, we free the buffers that were held for this operation and check if there was any issue with the write (if so, we abort the connection). This is all being driven by this method, which is called before we check for available I/O.
There is quite a lot going on in here. First, we iterate through the pending writes for all the connections we have. For each of the connections, we flush the SSL buffer and then check if we have pending writes to process. If we don’t, our work is done and we can remove the connection from the queue. If we do have any pending writes, we need to handle them.
I do that by using SSL_write(), which will write them into the in-memory buffer. I continue doing so until one of the following happens:
- I run out of pending writes.
- I run out of buffer space and need to flush.
- I need to renegotiate and need to read from the network.
In the first case, I’ve successfully pushed the data to the SSL buffer, so I can call flush_ssl_buffer() and then remove the connection from the queue. In the second case, I’ll flush the SSL write buffer and try again.
However, in the last case, I’m just aborting the writes. I need to do a read, and that will be handled on the next iteration of the libuv loop. There is some bookkeeping there to make sure that if we successfully wrote data into the SSL buffer, we won’t be writing that again, but this is pretty much it. You’ll note that I’m playing games with pointers to pointers there. That keeps the code that consumes the queue clean, while allowing me to skip over an entry in the linked list without removing it from the list.
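The pointer-to-pointer trick is worth seeing on its own. This is a simplified sketch that uses a singly linked list to keep it short, and a plain wants_read flag as a stand-in for SSL_write() failing with SSL_ERROR_WANT_READ. The conn_t type, its fields and process_queue() are all assumed names, and the real code does actual SSL work where this one only clears a counter.

```c
#include <assert.h>
#include <stddef.h>

typedef struct conn {
    struct conn *next;
    int pending;    /* number of writes still deferred */
    int wants_read; /* stand-in: the write cannot proceed until we read */
} conn_t;

/* Walk the queue once. Connections whose writes complete are unlinked;
 * connections that must read first are skipped but stay in the list.
 * Returns the number of connections removed. */
int process_queue(conn_t **head) {
    assert(head != NULL);
    int removed = 0;
    conn_t **link = head; /* address of the pointer that points at the current node */

    while (*link != NULL) {
        conn_t *c = *link;
        if (c->wants_read) {
            link = &c->next; /* step over it, leaving it linked */
            continue;
        }
        c->pending = 0;  /* stand-in for pushing the writes through SSL */
        *link = c->next; /* unlink; works the same for head and middle */
        c->next = NULL;
        removed++;
    }
    return removed;
}
```

Because link always holds the address of whichever pointer leads to the current node, removing the head and removing a node in the middle are the same assignment, and skipping a node is just advancing link.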
This is pretty much it, I have to say. We now have a system where both writes and reads work in conjunction to get the proper SSL behavior, even when we have renegotiation going on.
One thing you’ll not find in this code is a call to SSL_accept(), or indeed any behavior related to explicitly managing the SSL state. I’m letting OpenSSL handle all of that and rely on the fact that SSL_write() and SSL_read() will handle the handshake and renegotiations on their own for me.
Let’s do a simple walkthrough of what is going on with a connection to the TLS echo server.
On connection established (and before we read anything from the network), we call connection_write():
This is fairly straightforward. We try to write to the buffer, and if we are successful, great. The check_if_need_to_flush_ssl_state() call will take care of actually sending that to the client.
If the write buffer is full, we empty it and try again. The interesting thing happens when we need to read in order to complete this write. In this case, we copy the data to write and store it on the side, then we proceed normally and wait for libuv to deliver the next read buffer for this connection. When that is done, we’ll be sending the deferred write to the client.
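The "copy the data and store it on the side" step could look roughly like the sketch below. The copy matters because the caller’s buffer may be gone by the time the write is retried. The pending_write_t and conn_t types, their fields, and the defer_write() and free_pending() functions are assumptions for illustration; the sketch keeps the writes in FIFO order so that they go out in the order they were issued.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct pending_write {
    struct pending_write *next;
    size_t len;
    char data[]; /* private copy of the caller's bytes */
} pending_write_t;

typedef struct conn {
    pending_write_t *head; /* oldest deferred write */
    pending_write_t *tail; /* newest deferred write */
} conn_t;

/* Copy buf and append it to the connection's deferred writes.
 * Returns 1 on success, 0 if memory could not be allocated. */
int defer_write(conn_t *c, const void *buf, size_t len) {
    assert(c != NULL && buf != NULL);
    pending_write_t *pw = malloc(sizeof(*pw) + len);
    if (pw == NULL)
        return 0;
    pw->next = NULL;
    pw->len = len;
    memcpy(pw->data, buf, len);
    if (c->tail != NULL)
        c->tail->next = pw;
    else
        c->head = pw;
    c->tail = pw;
    return 1;
}

/* Release every deferred write, for example when the connection closes. */
void free_pending(conn_t *c) {
    assert(c != NULL);
    while (c->head != NULL) {
        pending_write_t *next = c->head->next;
        free(c->head);
        c->head = next;
    }
    c->tail = NULL;
}
```

Using a flexible array member keeps each deferred write in a single allocation, so there is only one free() per entry.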
It may be easier to explain the flow with a real example. When a new connection comes into the server, we create a new SSL instance and then we call:
connection_write(connection, "OK\r\n", 4);
This is the very first time that we actually interact with the SSL instance, and the call to SSL_write() is going to fail (because we haven’t established the SSL connection yet) with an SSL_ERROR_WANT_READ error. In response to this, we’ll copy the buffer we got and place it into the pending_writes of this connection. We also start listening for new data on the connection. The client will send the ClientHello message, which we’ll read and then feed into the SSL instance. That will cause OpenSSL to write the ServerHello to the in-memory buffer. When check_if_need_to_flush_ssl_state() is called, it will flush that message to the client.
Eventually, we’ll get the connection established and at this point we’ll be sending the deferred write to the client.
There are a bunch of other details, but they aren’t crucial to understanding this approach. You can find the whole code sample here. I’ll reiterate that it doesn’t have proper error handling, but it is less than 350 lines of C code that does something quite nice and exposes an API that should be quite interesting to consume.
I’m really interested in feedback on this blog post, both on whether this approach makes any sense and on what you think about the code.