I recently zipped through Udacity’s Client-Server Communication course. While this is information that I use somewhat regularly at work, it was nice to solidify my understanding (especially since I started in EE, not CS). The first half of the course is about HTTP.
HTTP stands for HyperText Transfer Protocol, and defines the application-level protocol used to send and receive hypertext. Hypertext is text that contains links–to other hypertext files, images, styles, and so on. HTTP describes a pattern of requests from a client, and responses from a server.
HTTP Request
An HTTP request is a message sent from a client to the server. Requests must include a verb/method, a path to the resource requested, and the HTTP version used.
For example: GET /pics/doggo.jpg HTTP/1.1
is a request to get doggo.jpg from the specified location, using HTTP/1.1.
Possible methods include GET, POST, PUT, PATCH, DELETE, HEAD, OPTIONS, TRACE and CONNECT. Some methods, like HEAD, GET, OPTIONS and TRACE, are “safe,” meaning that they shouldn’t have side effects (changing data on the server or elsewhere). On the other hand, methods like POST, PUT, DELETE and PATCH create side effects.
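To make the request line concrete, here is a minimal sketch that splits one into its three parts and checks whether the method is one of the safe ones listed above. The function names `parse_request_line` and `is_safe` are my own, not from the course or any library.

```python
# Methods described above as "safe" (no side effects expected).
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def parse_request_line(line):
    """Split a request line into (method, path, version)."""
    method, path, version = line.split(" ")
    return method, path, version


def is_safe(method):
    """Return True if the method is in the safe set above."""
    return method in SAFE_METHODS


method, path, version = parse_request_line("GET /pics/doggo.jpg HTTP/1.1")
print(method, path, version, is_safe(method))
```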
HTTP requests can include a few other things in addition to the request line shown above (verb + resource location + HTTP version): header fields, an empty line, and an optional message body.
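As a rough sketch of how those pieces fit together on the wire in HTTP/1.1, the helper below joins the request line, header fields, the empty line, and a body. `build_request` and the host `example.com` are placeholders I made up for illustration.

```python
def build_request(method, path, headers, body=b"", version="HTTP/1.1"):
    """Serialize a request: request line, headers, empty line, then body."""
    lines = [f"{method} {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Lines end with CRLF; an empty line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# example.com is a placeholder host, not from the original post.
raw = build_request("GET", "/pics/doggo.jpg", {"Host": "example.com"})
print(raw.decode("ascii"))
```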
Headers carry additional information: about the sender, the intended recipient, the connection settings, what kind of data to send back, and so on. A few examples include:
Host, the internet host and port number of the resource being requested
User-Agent, what type of browser is making the request
Connection, whether the connection should be kept alive, etc.
Accept, what media types are acceptable for the response (text/html, etc.)
Accept-Language, what language to return the content in
More HTTP header fields can be found here.
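Here is a small sketch of reading header fields back out of a raw header block. Header names are case-insensitive, so this version lowercases them. The function name `parse_headers` and the sample values are assumptions for illustration only.

```python
def parse_headers(block):
    """Parse 'Name: value' lines into a dict, stopping at the empty line."""
    headers = {}
    for line in block.split("\r\n"):
        if not line:
            break  # the empty line marks the end of the headers
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize to lowercase.
        headers[name.strip().lower()] = value.strip()
    return headers


# Sample values are made up for illustration.
raw = (
    "Host: example.com:8080\r\n"
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)
headers = parse_headers(raw)
print(headers["host"], headers["accept"])
```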
Each linked resource in the requested file is fetched with another request (and those files can in turn trigger more requests).
REST
The methods described earlier are often used in an architectural pattern called REST. REST stands for Representational State Transfer. Among other things, it describes a pattern of using HTTP methods to create, read, update and delete a type of object.
If we were to use the methods discussed earlier in a RESTful way (to manage data about users), it might look like:
GET /person/alice HTTP/1.1 (requesting data about an existing person)
POST /person/ HTTP/1.1 (making a new person, Bob)
PUT /person/alice HTTP/1.1 (updating Alice’s information)
DELETE /person/eve HTTP/1.1 (deleting Eve’s record)
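To show how a server might map those four methods onto create, read, update and delete, here is a toy in-memory handler. Everything in it is hypothetical: the `handle` function, the `people` dict, and the choice of status codes (201 for created, 204 for a delete with no body, 404 for anything unmatched) are just one reasonable way to do it.

```python
# Toy in-memory "database" of people, keyed by lowercase name.
people = {"alice": {"name": "Alice"}, "eve": {"name": "Eve"}}


def handle(method, path, body=None):
    """Return (status_code, payload) for a request under /person/."""
    parts = [p for p in path.split("/") if p]
    if not parts or parts[0] != "person":
        return 404, None
    key = parts[1] if len(parts) > 1 else None

    if method == "GET" and key in people:        # read
        return 200, people[key]
    if method == "POST" and key is None and body:  # create
        people[body["name"].lower()] = body
        return 201, body
    if method == "PUT" and key in people and body:  # update
        people[key] = body
        return 200, body
    if method == "DELETE" and key in people:     # delete
        del people[key]
        return 204, None
    return 404, None


print(handle("POST", "/person/", {"name": "Bob"}))
print(handle("GET", "/person/bob"))
print(handle("DELETE", "/person/eve"))
print(handle("GET", "/person/eve"))
```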
HTTP Response
If we send a request to a server, it makes sense that we should expect a response back. Responses have two required pieces of data: the HTTP version, and a status code.
For example: HTTP/1.1 200
A status code of 200 means success. Other status codes can be found here.
Like HTTP requests, responses can also include header fields, an empty line, and an optional message body.
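Here is a minimal sketch of pulling a raw HTTP/1.1 response apart into its status line, headers, and body. `parse_response` is a name I made up, and the sample response is invented. Real responses usually include a short reason phrase like "OK" after the status code, so the sketch handles that too.

```python
def parse_response(raw):
    """Split a raw response into (version, code, reason, headers, body)."""
    # The first empty line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")

    # Status line: version, status code, optional reason phrase.
    version, _, rest = lines[0].partition(" ")
    code, _, reason = rest.partition(" ")

    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# Invented sample response for illustration.
raw = b"HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: 5\r\n\r\nhello"
version, code, reason, headers, body = parse_response(raw)
print(version, code, reason, headers["content-type"], body)
```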
Example response headers include:
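Content-Length, how many bytes of data the client should expect
Server, info about the server answering the request
Content-Type, the format of the data being returned
Date, the timestamp of the response
Last-Modified, the date the document was last changed

A related request header, If-Modified-Since, lets the server skip sending the payload back (responding with 304 Not Modified) if the resource hasn’t changed since the given date.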
You can find more response headers here.
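The Last-Modified / If-Modified-Since pairing is easier to see in code. Below is a rough sketch of the decision a server might make: if the client’s copy is at least as new as the resource, answer 304 (Not Modified) with no payload; otherwise send the full 200 response. The function name `conditional_status` and the dates are invented; the date parsing uses Python’s standard `email.utils` module.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since=None):
    """Return 304 if unchanged since the client's date, else 200."""
    if if_modified_since is None:
        return 200  # no condition sent, so return the full resource
    client_date = parsedate_to_datetime(if_modified_since)
    return 200 if last_modified > client_date else 304


# Hypothetical resource that last changed at noon UTC on 1 Jan 2024.
changed = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(conditional_status(changed, "Mon, 01 Jan 2024 12:00:00 GMT"))
print(conditional_status(changed, "Sun, 31 Dec 2023 12:00:00 GMT"))
```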
HTTP Versions
I’ve mentioned HTTP versions a few times without explaining them. HTTP/1.1 was created in 1997; HTTP/2 was standardized in 2015 and is now supported by all major browsers.
HTTP/1.1, having been created back in the late 90s, isn’t well-equipped to handle modern web browsing needs. It has a few problems, including head-of-line blocking: on a single connection, later requests have to wait for the first request to finish before they can start. Browsers typically open up to about 6 connections per host at once, but this is only a partial solution because each new connection has to do a TCP handshake (time-consuming). Head-of-line blocking is one of the main reasons for bundling JS and other files.
HTTP/2 solves a number of problems with HTTP/1.1 and makes web browsing faster and more secure. It solves the head-of-line problem by using multiplexing: you have one connection carrying multiple streams, and a stream that is blocked can yield to the others. Additionally, HTTP/2 is a binary protocol, so messages are no longer human readable on the wire (use Wireshark!), and headers are compressed.
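To get a feel for why multiplexing helps, here is a toy model (not real HTTP/2, which also involves priorities and flow control). Each response is some number of equally sized chunks, and sending one chunk takes one time unit. `sequential` sends whole responses one after another on a single connection, while `interleaved` sends one chunk per stream in round-robin order. The file names and sizes are made up.

```python
def sequential(sizes):
    """Finish time of each response when sent one after another."""
    t, done = 0, {}
    for name, chunks in sizes.items():
        t += chunks
        done[name] = t
    return done


def interleaved(sizes):
    """Finish time of each response when chunks are sent round-robin."""
    remaining = dict(sizes)
    t, done = 0, {}
    while remaining:
        for name in list(remaining):
            t += 1  # send one chunk of this stream
            remaining[name] -= 1
            if remaining[name] == 0:
                done[name] = t
                del remaining[name]
    return done


# Made-up responses: one large file queued ahead of two small ones.
sizes = {"big.js": 6, "style.css": 2, "icon.png": 1}
print(sequential(sizes))   # {'big.js': 6, 'style.css': 8, 'icon.png': 9}
print(interleaved(sizes))  # {'icon.png': 3, 'style.css': 5, 'big.js': 9}
```

In this model the total time is the same either way, but the small files no longer have to wait behind the big one.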
Lastly, HTTP/2 effectively requires TLS (or rather, the spec allows unencrypted HTTP/2, but all major browsers only support it over TLS).
If you’d like to take the Udacity course (it’s free!) check it out here.