This post continues the first post on HTTP, the Hypertext Transfer Protocol.
HTTP Classified
Chunking
When the server wants to transfer a large file efficiently, or does not know the total size in advance, it can split the data into chunks. For this, HTTP provides the Transfer-Encoding header. Its syntax is as follows:
If the server sets the value to chunked, it can split the whole body into chunks, preceding each chunk with a size marker (the chunk length in hexadecimal). A chunk of size zero marks the end of the body. The encoding header and the size markers are in red in the example:
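As a rough sketch of the same framing in code, the Python below encodes and decodes a chunked body. The function names encode_chunked and decode_chunked and the chunk size of 8 bytes are assumptions made for illustration, and the decoder ignores chunk extensions and trailers.

```python
def encode_chunked(body: bytes, chunk_size: int = 8) -> bytes:
    """Frame body as a chunked message body (illustrative helper)."""
    out = bytearray()
    for start in range(0, len(body), chunk_size):
        chunk = body[start:start + chunk_size]
        # Size marker: the chunk length in hexadecimal, then CRLF.
        out += f"{len(chunk):X}".encode("ascii") + b"\r\n"
        # Chunk data, then CRLF.
        out += chunk + b"\r\n"
    # A zero-size chunk followed by an empty line ends the body.
    out += b"0\r\n\r\n"
    return bytes(out)


def decode_chunked(data: bytes) -> bytes:
    """Reassemble a chunked body (skips extensions, ignores trailers)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size_field = data[pos:line_end].split(b";")[0]
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)


framed = encode_chunked(b"Hello, chunked world!", chunk_size=8)
print(framed)
print(decode_chunked(framed))
```

In a real response these bytes would follow the headers, one of which would be Transfer-Encoding: chunked.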
Caching
HTTP supports caching in several ways.
Expires header
- The server sends a resource to the client and attaches Expires: <datetime value>, thus setting the resource's expiration date.
- On the next request for that resource, if it has not yet expired, the client takes the data from its cache rather than from the server, eliminating an extra transfer.
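The client-side check can be sketched in a few lines of Python. The helper name is_fresh is hypothetical, and the example date is made up; the sketch treats an unparsable Expires value as already expired, which is how caches are expected to behave.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh(expires_header: str, now: datetime) -> bool:
    """Return True if a cached resource may still be used at time `now`."""
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # An invalid date (for example "0") means "already expired".
        return False
    if expires.tzinfo is None:
        # Assume UTC when the header carries no usable zone.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires


header = "Wed, 21 Oct 2015 07:28:00 GMT"
before = datetime(2015, 10, 21, 7, 0, tzinfo=timezone.utc)
after = datetime(2015, 10, 21, 8, 0, tzinfo=timezone.utc)
print(is_fresh(header, before))  # still valid: serve from cache
print(is_fresh(header, after))   # expired: ask the server again
```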
ETag header
Another method is to use the ETag, the entity tag. Each version of a resource gets a unique ETag value, e.g. ETag: “345678000”. If the resource changes, its ETag changes too. The client sends the ETag of its cached copy back in the If-None-Match request header; if it matches the server's current ETag, the server answers 304 Not Modified with no body, so unchanged data is not sent again. See the following figures:
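The exchange can also be sketched in code. Below is a minimal server-side version in Python, in which the client returns the ETag it holds in the If-None-Match request header. The helper names make_etag and respond are hypothetical, hashing the body is only one assumed way to derive an ETag (servers may just as well use version numbers or timestamps), and If-None-Match is simplified to a single value.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the content (an assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of `body`."""
    etag = make_etag(body)
    if if_none_match == etag:
        # The client's copy is current: no payload is sent.
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First request: the client has nothing cached yet.
status, headers, payload = respond(b"<h1>v1</h1>", None)
cached_etag = headers["ETag"]

# Second request: resource unchanged, so the server replies 304.
print(respond(b"<h1>v1</h1>", cached_etag)[0])

# Third request: resource changed, so the full body is sent again.
print(respond(b"<h1>v2</h1>", cached_etag)[0])
```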
Cache-Control header
Caching can also be controlled through the Cache-Control header with a max-age directive. With it the server sets the maximum time (in seconds) for which a resource is valid. E.g., Cache-Control: max-age=600 lets the client cache the resource, which then expires after 10 minutes.
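A small Python sketch of how a client might read this directive. The function names max_age_seconds and is_still_valid are hypothetical, and the parsing is deliberately simplified: it handles comma-separated directives but not every corner of the header grammar. Note that max-age is counted in seconds.

```python
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract the max-age value (in seconds) from a Cache-Control header."""
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        value = value.strip().strip('"')
        if name.strip().lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_still_valid(age_seconds: int, cache_control: str) -> bool:
    """True if a copy that is `age_seconds` old is within max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


print(max_age_seconds("public, max-age=600"))
print(is_still_valid(599, "max-age=600"))
print(is_still_valid(600, "max-age=600"))
```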
In practice, the server is usually configured to use all of the above-mentioned methods of cache control, in the hope that the client browser will support at least one of them.
Cookie
Cookies are a way to work around the stateless nature of the protocol. A small piece of data (the cookie) preserves some information (state) from earlier requests and responses, for the user’s convenience.
A cookie serves these functions:
- Authenticates the user
- Stores information about the user
- Tracks the user’s actions
- Collects statistics
Besides the name=value pairs, a cookie can also contain an expiration date, a path and a domain. See the following figure to understand how a cookie maintains a session across requests:
Other attributes:
- The Secure attribute makes the browser send the cookie only over secure HTTP (HTTPS).
- The HttpOnly attribute makes the cookie inaccessible to scripts in the client browser; it is only sent to the server with requests.
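The attributes above can be seen together in a short Python sketch built on the standard http.cookies module. The cookie name session_id, its value and the domain example.com are made-up placeholders.

```python
from http.cookies import SimpleCookie

# Server side: build the value of a Set-Cookie response header.
cookie = SimpleCookie()
cookie["session_id"] = "abc123"
cookie["session_id"]["path"] = "/"
cookie["session_id"]["domain"] = "example.com"
# An integer is treated as "seconds from now" and rendered as a date.
cookie["session_id"]["expires"] = 3600
cookie["session_id"]["secure"] = True    # send over HTTPS only
cookie["session_id"]["httponly"] = True  # hide from browser scripts

set_cookie_value = cookie["session_id"].OutputString()
print("Set-Cookie:", set_cookie_value)

# Client side: the browser sends back only the name=value pairs.
# Server side again: parse the incoming Cookie request header.
incoming = SimpleCookie()
incoming.load("session_id=abc123; theme=dark")
print(incoming["session_id"].value)
```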
Cookie expiration
Cookies are persistent (they survive beyond the current session) only if an expiration date is set (the Expires attribute); otherwise they are deleted at the end of the client-server session. Persistent cookies are deleted once their expiration date has passed. For security reasons, users may also delete cookies from their computers at any time.
HTTP Comet and full-duplex data transfer
Comet is a set of techniques that let the server deliver data to the client as soon as the data becomes available. Over plain HTTP the client still has to issue the request (poll or long-poll), and the server sends its resources in the response at an appropriate time. This is implemented in several ways:
- Polling – the client repeatedly asks the server for updates at short intervals; the server replies immediately each time, with new data if any is ready.
- Long-poll – the client asks the server once, and the server holds the request open and replies only when updated data is ready; the client then sends the next request.
- WebSocket – a technology implementing full-duplex client-server transfer, so either side can send at any time. Older browsers do not support it. Some libraries were developed for this, e.g. socket.io
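The difference between polling and long-polling can be sketched without any network code. In the Python below, an in-process queue stands in for "data the server has ready", and the functions poll and long_poll are hypothetical stand-ins for request handlers, not a real HTTP implementation.

```python
import queue
import threading
from typing import Optional

# Updates the "server" has ready for the client.
updates: "queue.Queue[str]" = queue.Queue()


def poll() -> Optional[str]:
    """Polling: answer at once, whether or not there is new data."""
    try:
        return updates.get_nowait()
    except queue.Empty:
        return None


def long_poll(timeout: float) -> Optional[str]:
    """Long-poll: hold the request until data arrives or time runs out."""
    try:
        return updates.get(timeout=timeout)
    except queue.Empty:
        return None


def demo() -> None:
    print(poll())  # nothing ready yet, so the reply is empty
    # An update becomes available 0.2 s from now.
    threading.Timer(0.2, updates.put, args=("new data",)).start()
    print(long_poll(timeout=2.0))  # blocks until the update arrives


if __name__ == "__main__":
    demo()
```

With polling, the client would have to call poll() in a loop and would mostly get empty replies; with long-polling, one held request returns as soon as something is ready.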
HTTPS (secure)
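This is the same HTTP, but sent over an encrypted connection. The encryption is done through SSL (Secure Sockets Layer) or its successor TLS (Transport Layer Security). The server proves its identity with a certificate, which carries the server's public key; the matching private key stays on the server. The two sides use these keys to agree on the session keys that encrypt and decrypt the traffic. Because intermediaries cannot read the encrypted traffic, shared (proxy) caches cannot cache it, although the browser's own cache still can. Read more on HTTPS in the next post.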
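For a feel of what the client side of this looks like, here is a minimal Python sketch using the standard ssl module. It only builds the client configuration and opens no connection; refusing anything older than TLS 1.2 is an assumption chosen for the example, not a requirement of HTTPS.

```python
import ssl

# Default client settings: verify the server's certificate chain
# and check that the certificate matches the requested hostname.
context = ssl.create_default_context()

# Assumed policy for this sketch: refuse SSL and early TLS versions.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode == ssl.CERT_REQUIRED)
print(context.check_hostname)
```

Such a context could then be passed to a client such as http.client.HTTPSConnection through its context argument.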
Tools to work with HTTP
To wrap up this overview of HTTP, here are some tools for working with the protocol, ordered from low-level to high-level:
- tcpdump – console application
- Wireshark, Fiddler – GUI applications
- Browser tools: HttpFox, Firebug, Chrome Developer Tools, Opera Dragonfly