3.3.3 URL
A Uniform Resource Locator (URL) is a kind of address that contains not only the exact location of a resource on the Internet but also the protocol used to retrieve it and, if necessary, additional information such as a search query. Let us look at such a URL:

http://www.w3.org:80/pub/WWW/History.html

The text before the first colon in this example specifies the protocol used (http stands for Hypertext Transfer Protocol). The following list shows common protocols for URLs. It is not complete, since new protocols can always be added.
| http | Hypertext Transfer Protocol |
| ftp | File Transfer Protocol |
| gopher | The Gopher Protocol |
| wais | Wide Area Information Servers |
| hyperg | Hyper-G (Hyperwave) Protocol |
| telnet | Reference to a Telnet session |
| mailto | SMTP (Simple Mail Transfer Protocol) |
| news | NNTP (Network News Transfer Protocol) |
| cid | Content Identifier for MIME |
The remainder of the string depends on the protocol, but let us discuss our example. The text after the double slash, up to the optional port or the next single slash (www.w3.org), is the Internet domain name of the host the server is running on. The number after the second colon is optional and specifies the port used. Adding port ``80'' in our example is redundant because it is the default port for HTTP. The substring after the port (/pub/WWW/History.html) shows the location of the resource in the hierarchically structured WWW server; ``History.html'' is the filename of the resource. If the URL does not contain a filename, the server returns the ``default index'' of the directory, which can be a file or a list of the files in the directory.
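
The decomposition described above can be sketched in a few lines of Python using the standard library's urllib.parse module. This is only an illustration, not part of the original text: the helper name split_url and the small DEFAULT_PORTS table are assumptions introduced here, and the table lists just two well-known default ports.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: a tiny table of well-known default ports.
DEFAULT_PORTS = {"http": 80, "ftp": 21}


def split_url(url):
    """Break a URL into the parts discussed in the text (hypothetical helper)."""
    parts = urlsplit(url)
    # The port is optional; fall back to the protocol's default if it is omitted.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    # The last path segment is the filename; it is empty for a directory URL.
    filename = parts.path.rsplit("/", 1)[-1]
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": port,
        "path": parts.path,
        "filename": filename,
    }


if __name__ == "__main__":
    for key, value in split_url("http://www.w3.org:80/pub/WWW/History.html").items():
        print(f"{key}: {value}")
```

Calling split_url on a URL without an explicit port, such as http://www.w3.org/pub/WWW/, yields port 80 and an empty filename, which corresponds to the case where the server falls back to the default index of the directory.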
A URL can contain additional information for internal purposes, usually
separated from the rest by a question mark or a semicolon, e.g.:
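
As a purely hypothetical illustration (the host www.example.org, the path, and all parameter names below are assumptions, not taken from the original text), the following Python sketch shows how the parts after a semicolon and after a question mark can be separated with the standard urllib.parse module:

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL: ";type=a" is a parameter attached to the last path segment,
# and everything after "?" is the query string.
url = "http://www.example.org/doc;type=a?query=url&lang=en"

parts = urlparse(url)
path = parts.path             # location of the resource without extras
params = parts.params         # text after the semicolon
query = parse_qs(parts.query) # query string decoded into a dictionary of lists

if __name__ == "__main__":
    print("path:  ", path)
    print("params:", params)
    print("query: ", query)
```

Note that urlparse, unlike urlsplit, separates the semicolon-delimited parameters from the path, which is why it is used here.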
