3.3.5 HTML
The HyperText Markup Language (HTML) is a simple language for the interchange of hypertext on the WWW. It is an application of SGML (Standard Generalized Markup Language, ISO 8879). The purpose of a markup language is to separate the content of a document from its format. An HTML document is therefore largely independent of the hardware and software used to display it. What is described in HTML is just the logical structure of the document (headline, paragraph, list, boldface, images, hyperlinks, etc.). This lets users choose the font and style that are most convenient for reading. On the other hand, authors of an HTML document cannot know exactly how it will look when browsed by an arbitrary user, and should therefore view it with several browsers and font sizes before releasing it.
The first HTML standard was HTML 1.0. All available browsers can display the features of this standard, which consists of structured hypertext and inline images. To allow for more sophisticated WWW pages, browser companies implemented enriched HTML versions without waiting for the lengthy standardisation process. Finally, HTML 2.0, which supports forms for user input, was published as a standard (RFC 1866, November 1995).
Again, some browsers started to support new features which other browsers
did not. Thus, there are pages which look fine when viewed with Netscape
Navigator (NN) but not when viewed with Microsoft Internet Explorer (MSIE),
and vice versa. E.g., if you define the width of a Java applet as 100%
(whole page width), it will be displayed correctly by NN but not by MSIE,
because MSIE accepts only exact widths in pixels. Fortunately, on 14 January
1997 the latest proposal, HTML 3.2, was endorsed by the Director of the W3
Consortium as a W3C Recommendation. HTML 3.2 adds widely deployed features
such as tables, Java applets, text flow around images, and superscripts and
subscripts, and remains backwards compatible with HTML 2.0. The W3C HTML
Editorial Review Board (HTML ERB), together with more focused working
groups, will work on the next version of HTML (code-named Cougar).
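The applet width difference mentioned above can be illustrated with a short sketch. The class name Example.class and the sizes are hypothetical placeholders; the browser behaviour noted in the comments is the one described in the text, not something this sketch verifies.

```html
<!-- Width given as a percentage of the page width:
     according to the text, shown correctly by NN but not by MSIE -->
<APPLET CODE="Example.class" WIDTH="100%" HEIGHT="200">
Your browser does not support Java applets.
</APPLET>

<!-- Width given as an exact number of pixels:
     the form that, according to the text, MSIE requires -->
<APPLET CODE="Example.class" WIDTH="400" HEIGHT="200">
Your browser does not support Java applets.
</APPLET>
```

The text between the start-tag and the end-tag is shown by browsers that cannot run applets. An author who wants the same page to work in both browsers would probably choose the pixel form.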
HTML documents are plain text documents which use a set of tags to
define formatted text. Formatted text is delimited by a start-tag
(<name>) and an end-tag (</name>). Text which should appear in boldface
would look like this: <B> sample text </B>. The name of the tag is
case-insensitive; thus, our example could also look like this:
<b> sample text </b>. There are tags which do not need an end-tag, and some
tags can also include attributes, which consist of an attribute name and
the character = followed by a value. E.g., the tag to include an inline
image would look like: <IMG SRC="picture.gif"> (picture.gif is only a
placeholder file name).
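The tag forms described above can be sketched as follows. All file names and texts (picture.gif, page2.html, the ALT text) are hypothetical placeholders chosen for illustration; the tags and attributes themselves are part of HTML 3.2.

```html
<!-- Start-tag and end-tag around the affected text -->
<B>sample text</B>

<!-- Tags without an end-tag: a line break and a horizontal rule -->
First line<BR>
Second line
<HR>

<!-- Attributes: attribute name, the character =, then a value -->
<IMG SRC="picture.gif" ALT="A sample picture" ALIGN="middle">
<A HREF="page2.html">a hyperlink to another page</A>
```

The IMG tag shows that a single tag can carry several attributes, and that a tag with attributes does not necessarily need an end-tag.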
An HTML 3.2 document consists of a document type definition, a header, which contains meta and style information, and a body, which contains the actual document. Let's look at a small example:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>This is the title of a document</TITLE>
... other head elements
</HEAD>
<BODY>
... document body
</BODY>
</HTML>
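As one possible way of filling in the placeholders of the skeleton above, a complete small document might look like the following sketch. The title, author name, texts, and file names are invented for illustration only.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>A small sample page</TITLE>
<!-- Meta information about the document (hypothetical author name) -->
<META NAME="author" CONTENT="A. N. Author">
</HEAD>
<BODY>
<H1>A headline</H1>
<P>A paragraph with <B>boldface</B> text and a
<A HREF="other.html">hyperlink</A>.</P>
<UL>
<LI>first list item
<LI>second list item
</UL>
<P><IMG SRC="picture.gif" ALT="An inline image"></P>
</BODY>
</HTML>
```

The sketch uses only the logical elements named earlier in this section: headline, paragraph, list, boldface, image, and hyperlink.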
Since runs of white space and line breaks in the HTML file are collapsed
into a single space by the browser, the author can lay out the source file
at will.
<P> This is one Paragraph.</P>
is equivalent to
<P>   This   is   one   Paragraph.   </P>
is equivalent to
<P>
This
is
one
Paragraph.
</P>
I will not describe HTML tags in detail because good descriptions of HTML
can be found on the WWW.
