Hypertext

We all know what text is. It's not a stretch from text to the concept of a text document - we're all pretty familiar with that idea too. One thing about text documents (think about paper documents) is that they often refer to other documents. These references might be footnotes, citations, bibliographies, or just embedded as quotations in the text.

The prefix hyper, as used in mathematics, means extension. The concept of somehow extending text documents - such that you could instantaneously reach things such as references - was inspired by pre-computer technologies like microfilm. The idea first appeared in "As We May Think", an article written by Vannevar Bush in 1945, in which a futuristic device called the Memex allowed a user to instantly skip and link to content made of chains of microfilm frames. Bush did not use the word hypertext himself. In the 1960s, digital document systems brought the concept closer to reality, and Ted Nelson coined the terms hypertext and hypermedia (the latter referring to a system where not just text could be linked and skipped to, but also images, sound, and video).

The concept of having links within documents that can be followed instantaneously is a powerful one. It's not just that a reader can quickly skip to a different document (and then return to the original); documents can also embed other documents and media from different sources. If you consider pre-digital information systems (books, card catalogs, and libraries), you can see how much of a leap this is.

There is a lot more history to hypertext. You are encouraged to do some research, but let's move on to how hypertext moved from an emerging idea to the technology that we use every single day.

While working at CERN in 1989, Tim Berners-Lee proposed a project to link together text documents already on the internet; the project came to be called the WorldWideWeb. The core of the proposal was a protocol for addressing documents, requesting documents over TCP, and delivering documents. Crucially, within these documents was a way to embed addresses of other documents. The software rendering a document could then let a user ask it to fetch and display the resource at one of those addresses. We of course recognize this as a link. We use them every day :)

If you haven't put it together yet, the WorldWideWeb project is where we got the www from, and documents that were available on this system were written in an early version of HTML - which stands for HyperText Markup Language. The "software" that rendered these documents was the first web browser. Some of the very first web browsers were text based; the Line Mode Browser and Lynx are among the most influential. Berners-Lee is also credited with creating the first web server at CERN, to serve the documents to the first browsers.

HTTP Protocol

The glue between the browser and the server is the protocol that they use to address, request, and deliver documents (which are more accurately called resources, since they need not be text). The protocol is the HyperText Transfer Protocol. Just like the "echo" protocol we saw in the last chapter, it's just a text-based protocol. Text is sent from the client (the web browser) and interpreted by the server, and text is sent back as a response. The difference is that the text is much more structured, such that it can include metadata about the resources being requested and delivered, along with the data and resources themselves.
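To make the "structured text" idea concrete, here is a minimal sketch of the text a client might send and the text a server might send back. The host name, path, and sample response are made up for illustration, and the two helper functions (build_request and parse_response) are hypothetical names, not part of any library. Nothing is sent over the network here; the sketch only builds and picks apart the text.

```python
# A sketch of the text exchanged in one HTTP/1.1 request/response cycle.
# The host, path, and SAMPLE_RESPONSE are illustrative assumptions.


def build_request(host, path="/"):
    """Return the text a client would send to ask for one resource."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: what we want done, and to what
        f"Host: {host}",         # metadata about the request
        "Connection: close",     # more metadata
        "",                      # a blank line ends the metadata
        "",
    ]
    return "\r\n".join(lines)


def parse_response(raw):
    """Split response text into (status code, metadata dict, body)."""
    head, _, body = raw.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    # The status line looks like: HTTP/1.1 200 OK
    status_code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers, body


# What a server's reply might look like for a tiny HTML document.
SAMPLE_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<p>Hello</p>\n"
)

if __name__ == "__main__":
    print(build_request("example.com", "/index.html"))
    print(parse_response(SAMPLE_RESPONSE))
```

The request and the response share the same shape: a first line, some lines of metadata, a blank line, and then (optionally) the data itself. Actually delivering this text would mean writing it to a TCP connection, just as with the echo protocol.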

The HTTP protocol has proven to be a remarkably powerful method of exchanging data on networks. It is fairly simple, but it is efficient and flexible. At its heart is the concept of resources, each of which has an address (we'll see this address referred to as a URL - uniform resource locator). If we think of HTTP as a language, then resources are the nouns - they are the things we do things with. The verbs of HTTP are the things we do - the requests web browsers (clients) perform on resources. The adjectives are metadata that we use to describe both nouns and verbs - we'll soon recognize these as request and response headers.
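As a rough illustration of the language analogy, the sketch below labels the parts of one request as verb, noun, and adjectives. The function name describe_request, the example URL, and the header values are assumptions made up for this example; only urlsplit comes from Python's standard library.

```python
from urllib.parse import urlsplit


def describe_request(method, url, headers):
    """Label the parts of a request using the noun/verb/adjective analogy."""
    parts = urlsplit(url)
    return {
        # The verb: what the client wants done.
        "verb": method.upper(),
        # The noun: the resource, identified by its address (URL).
        "noun": {
            "url": url,
            "host": parts.hostname,
            "path": parts.path or "/",
        },
        # The adjectives: metadata describing the request.
        "adjectives": dict(headers),
    }


if __name__ == "__main__":
    description = describe_request(
        "get",
        "http://example.com/books/42",
        {"Accept": "text/html"},
    )
    print(description["verb"], description["noun"]["path"])
```

Read aloud, this request says something like "get the resource at /books/42 on example.com, preferably as HTML".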