3.3.5 HTML

  The HyperText Markup Language (HTML) is a simple language for the interchange of hypertext on the WWW. It is an application of SGML (Standard Generalized Markup Language, ISO 8879). The purpose of a markup language is to separate the content of a document from its presentation. An HTML document is therefore largely independent of the hardware and software used to display it. What is described in HTML is just the logical structure of the document (headings, paragraphs, lists, boldface, images, hyperlinks, etc.). This lets users choose the font and style they find most convenient for reading. On the other hand, authors of an HTML document cannot know exactly how it will look when browsed by an arbitrary user, and should therefore view it with several browsers and font sizes before releasing it.
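
As a rough illustration of what "logical structure" means here, the following fragment marks up a heading, a paragraph, a list and a hyperlink without saying anything about fonts or sizes. The text and the URL http://myhost/intro.html are invented for this sketch; how each element is rendered is left to the browser.

```html
<!-- Only the structure is stated; fonts, sizes and margins are up to the browser. -->
<H1>Hypermedia Systems</H1>
<P>This paragraph contains a <B>boldface</B> phrase.</P>
<UL>
   <LI>first list item
   <LI>second list item
</UL>
<P>See the <A HREF="http://myhost/intro.html">introduction</A> for details.</P>
```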

The first HTML standard was HTML 1.0. All available browsers can display the features of this standard, which consists of structured hypertext and inline images. To allow for more sophisticated WWW pages, browser vendors implemented enriched HTML versions without waiting for the lengthy standardisation process. Finally, HTML 2.0, which supports forms for user input, was published as a standard (RFC 1866, November 1995); popular vendor extensions such as frames and backgrounds were not part of it.

Again, some browsers started to support new features which other browsers did not. Thus, there are pages which look fine when viewed with Netscape Navigator (NN) but not when viewed with Microsoft Internet Explorer (MSIE), and vice versa. For example, if the width of a Java applet is defined as 100% (the whole page width), it is displayed correctly in NN but not in MSIE, which accepts only exact widths in pixels. Fortunately, on the 14th of January 1997 the latest proposal, HTML 3.2, was endorsed by the Director of the W3 Consortium as a W3C Recommendation[*]. HTML 3.2 adds new features such as tables, Java applets, superscripts and subscripts, and text flow around images, and provides full backwards compatibility with HTML 2.0. The W3C HTML Editorial Review Board (HTML ERB), together with more focused working groups, will work on the next version of HTML (code-named Cougar)[*].
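
The applet case mentioned above might look as follows. This is only a sketch: the class name MyApplet.class and the sizes are hypothetical, and the browser behaviour noted in the comments is the one reported in the text, not something verified here.

```html
<!-- Percentage width: according to the text, displayed correctly by NN but not by MSIE. -->
<APPLET CODE="MyApplet.class" WIDTH="100%" HEIGHT="200">
   Text shown by browsers that cannot run Java applets.
</APPLET>

<!-- Exact width in pixels, as defined by HTML 3.2. -->
<APPLET CODE="MyApplet.class" WIDTH="600" HEIGHT="200">
   Text shown by browsers that cannot run Java applets.
</APPLET>
```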

HTML documents are plain text documents which use a set of tags to mark up text. Marked-up text is delimited by a start-tag (<name>) and an end-tag (</name>). Text which should appear in boldface looks like this: <B> sample text </B>. Tag names are case-insensitive, so the example could also be written as <b> sample text </b>. Some tags do not need an end-tag, and some tags can also carry attributes, each consisting of an attribute name, the character =, and a value. For example, the tag to include an inline image looks like this:

<IMG SRC="http://myhost/images/myimage.gif">
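
To make these rules concrete, here is a small sketch combining the cases mentioned: a tag pair, tags without an end-tag, and tags with attributes. The URL http://myhost/index.html, the ALT text and the sample wording are placeholders chosen for illustration.

```html
<!-- Start-tag and end-tag enclose the marked-up text; case does not matter. -->
<B>bold text</B> and <i>italic text</i>

<!-- BR (line break) and HR (horizontal rule) have no end-tag. -->
first line<BR>
second line
<HR>

<!-- Attributes are written as name="value" inside the start-tag. -->
<IMG SRC="http://myhost/images/myimage.gif" ALT="My image">
<A HREF="http://myhost/index.html">link text</A>
```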

An HTML 3.2 document consists of a document type declaration, a header, which contains meta and style information, and a body, which contains the actual document. Let's look at a small example:

  <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
  <HTML>
     <HEAD>
        <TITLE>This is the title of a document</TITLE>
        ... other head elements
     </HEAD>
     <BODY>
        ... document body
     </BODY>
  </HTML>
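
For comparison, a filled-in version of such a skeleton could look like the sketch below. The title, the META entry and the body text are made up for this example and are not taken from a real page.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
   <HEAD>
      <TITLE>A minimal HTML 3.2 document</TITLE>
      <!-- Meta information about the document, here a hypothetical author name. -->
      <META NAME="author" CONTENT="A. N. Author">
   </HEAD>
   <BODY>
      <H1>A minimal HTML 3.2 document</H1>
      <P>The body contains the text that the browser displays.</P>
   </BODY>
</HTML>
```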

Since runs of white space and line breaks in the HTML file are collapsed into a single space by the viewer, the author can lay out the source file at will. For example,

<P> This is one Paragraph.</P>

is equivalent to

<P> This is one Paragraph. </P>

is equivalent to

<P>
This
is
one
Paragraph.
</P>
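
One exception worth keeping in mind is preformatted text: inside a PRE element, browsers keep spaces and line breaks as written. A small sketch with invented content:

```html
<!-- Inside PRE, spaces and line breaks are kept as typed. -->
<PRE>
Name      Size
index     2 KB
images   40 KB
</PRE>
```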

I will not describe HTML tags in detail because good descriptions of HTML can be found on the WWW[*].