Unlike HTTPS, analysing HTTP traffic with tools like Wireshark is pretty easy because everything is in clear text. Wireshark will even give you the request performance (49ms highlighted below). I can also see that the request was sent in packet 4 (after the three-way handshake), and the response came in packet 6. The delta between packet 4 and packet 6 is your server response time.
But what about packet 5? Packet 5 is the acknowledgement of the data at the operating system level, rather than at the application layer. Normally, if the request takes more than 50ms to answer (your OS may vary), we will see what’s called a delayed acknowledgement, which the application data may piggyback on. However, this naked acknowledgement (no application payload) came back 3ms later. The reason for this is that the request was smaller than a full segment (see the MSS in the SYN packets), which meant that the OS attached the PSH flag, which the receiver must acknowledge straight away.
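The measurement described above can be sketched in a few lines of Python. This is only an illustration, not how Wireshark computes its figure: the `Packet` tuple, the `server_response_time` helper, the payload sizes and the handshake timestamps are all hypothetical, chosen so that the naked acknowledgement lands 3ms after the request and the response 49ms after it.

```python
from typing import NamedTuple, Optional, Sequence


class Packet(NamedTuple):
    number: int        # packet number in the capture
    time_ms: int       # capture timestamp in milliseconds (illustrative values)
    from_client: bool  # True if the client sent the packet
    payload_len: int   # TCP payload bytes; 0 for SYN and ACK-only segments


def server_response_time(packets: Sequence[Packet]) -> Optional[int]:
    """Delta between the first client payload and the first server payload after it."""
    request = next((p for p in packets if p.from_client and p.payload_len > 0), None)
    if request is None:
        return None
    for p in packets:
        # Naked ACKs carry no application data, so they are skipped here.
        if p.number > request.number and not p.from_client and p.payload_len > 0:
            return p.time_ms - request.time_ms
    return None


capture = [
    Packet(1, 0, True, 0),      # SYN
    Packet(2, 3, False, 0),     # SYN, ACK
    Packet(3, 3, True, 0),      # ACK
    Packet(4, 4, True, 120),    # HTTP request (hypothetical size)
    Packet(5, 7, False, 0),     # naked ACK, 3 ms after the request
    Packet(6, 53, False, 800),  # HTTP response, 49 ms after the request
]

print(server_response_time(capture))  # 49
```

Filtering on payload length rather than on packet position is what lets the same idea survive an extra acknowledgement or two in the middle of the exchange.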
So what happens when we wrap this up in HTTPS? We can use the same logic for measuring the request and response cycles as we did with HTTP; it just means that we can’t see the actual payloads. In most cases we can expect that a payload will be sent to the server, and the delta between that payload and the return payload is our server response time(1).
The second interesting thing is that we will now be at the mercy of SSL/TLS setup, which involves additional round trips before the connection is established. The screenshot below demonstrates a simple HTTPS request with connection setup, TLS establishment, HTTP, and session teardown.
If we break this down, it’s actually quite a simple request and response cycle(2).
Events
- The first 3 packets are the normal TCP three-way handshake.
- In packet 4, instead of the HTTP request, we have the ‘Client Hello’. The Client Hello is a backwards-compatible offer from the client to the server to negotiate specific TLS parameters, including pre-existing TLS Session IDs and available cipher suites (e.g. TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256).
- Packet 5 is an operating system level TCP acknowledgement, indicating that the Client Hello has been received.
- A ‘Server Hello’ is received in packet 6 with the server’s selection of cipher suite and other important TLS parameters, including the server TLS certificate.
- Packet 10 tells the server that all subsequent communications will be encrypted using the agreed parameters.
- Packet 12 tells the client that all subsequent communications will be encrypted using the agreed parameters.
- Packet 14 is the HTTP request from the client (Encrypted using TLS).
- Packet 15 (8ms later) is the beginning of the HTTP response, followed by packet 17 (packet 16 is an acknowledgement at the TCP level).
- Packets 19 and 21 are the encrypted alert (type 21) which is session tear down. Even though it says alert, this is normal for TLS teardown and does not indicate a problem.
- Packets 20 and 22 are normal TCP teardown.
- Packets 23 and 24 are resets (RST) towards the server. Resets are now commonly used to tear down some types of TLS communications(3).
From this we can see that even though the actual server response time for the request was only 8ms, it took 236ms for the application to get to the beginning of the server response, due to TCP and TLS overhead.
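As a quick worked check of those two numbers, the share of the wait that was spent before the server even started working on the request can be computed directly. The figures come from the capture; the variable names are just illustrative, and the 236ms is assumed to be measured from the first packet of the connection.

```python
total_to_first_byte_ms = 236  # connection start to first response packet (from the capture)
server_response_ms = 8        # packet 14 (request) to packet 15 (response)

# Everything that is not server response time is treated as setup overhead.
overhead_ms = total_to_first_byte_ms - server_response_ms
overhead_share = overhead_ms / total_to_first_byte_ms

print(f"setup overhead: {overhead_ms} ms ({overhead_share:.0%} of the wait)")
# setup overhead: 228 ms (97% of the wait)
```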
If this were a high-latency link (e.g. satellite), this would have taken even longer (a back-of-the-envelope estimate for StarLink is roughly 500ms, with geostationary satellite taking 2.2 seconds).
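A back-of-the-envelope estimate like this can be reproduced with a simple round-trip model. The sketch below assumes four round trips before the first response byte (one for the TCP handshake, two for a full TLS handshake of the kind seen in this capture, one for the HTTP request itself) plus the 8ms server time. The `time_to_first_byte_ms` helper and the round-trip times are hypothetical: 57ms was picked because it reproduces the 236ms seen here, and 550ms is an assumed geostationary round trip.

```python
def time_to_first_byte_ms(rtt_ms: float, server_ms: float = 8, round_trips: int = 4) -> float:
    """Estimate the wait before the first response byte on a fresh HTTPS connection.

    Assumes 1 round trip for TCP setup, 2 for the TLS handshake and 1 for the
    HTTP request/response, with no packet loss or connection reuse.
    """
    return round_trips * rtt_ms + server_ms


# Hypothetical round-trip times, for illustration only.
for label, rtt in [("this capture (assumed RTT)", 57), ("geostationary satellite", 550)]:
    print(f"{label}: ~{time_to_first_byte_ms(rtt):.0f} ms")
```

With these assumptions the model gives 236ms for the capture and about 2.2 seconds for the geostationary case. The server time barely matters once the round-trip time dominates.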
If you got this far, thanks for reading! If you want to learn more about this type of protocol analysis, pick up a copy of
- The exception to this is async communication like WebSockets. This is a subscribe-type model where you will see a payload go to the server, and then either sporadic responses back to the client from the server, or a response every 60 (30, 20, 10) seconds.
- This session only has one HTTP object fetched. Typically you would see a persistent connection which reuses the same TCP/TLS connection to make further requests, reducing overheads.
- The reason for RST being used in this way is to do with the behaviour of the TCP stack specified all the way back in RFC793. If a reset is sent, the socket is closed immediately, as opposed to waiting for 2 times the MSL (Maximum Segment Lifetime), which is typically four minutes in TIME_WAIT. Check out RFC1337 for some interesting commentary.