This document tries to define the technical terms it uses or to provide links to definitions. If you find terms which are unknown to you and not defined here, please consult eg the Terms section of the HTML 2.0 specification or some of the general Internet glossaries. (The most authoritative Internet glossary is probably RFC 1983.)
People who have heard about HTML 3.0 should notice that HTML 3.2 is not an extension or a variant of HTML 3.0, which has now been withdrawn. (The version numbers 3.0 and 3.2 are misleading!) More exactly, HTML 3.2 contains
For a good summary of the new features in HTML 3.2 as compared with HTML 2.0, consult the article What's New in HTML 3.2 in the World Wide Web Journal, but please notice that it contains a few mistakes.
HTML 3.2 has been defined by the World Wide Web Consortium. It is supported by several browsers to a large extent, and it will probably become the common basis understood by almost all relevant Web software. The next version, an extension to HTML 3.2, is being developed under the code name Cougar.
An older standard, HTML 2.0, is supported to an even larger extent, since HTML 3.2 is an extension of HTML 2.0.
However, to be exact, the following HTML 2.0 features have been removed in HTML 3.2:
This document does not discuss general issues of Web authoring, such as overall design of documents and document collections. For these, see my list of suggested reading.
In addition to such issues, you need to know where to put your HTML document to make it accessible to the world; this may involve things like setting up directory and file protections suitably. Please consult your local Web support for information relevant at your site.
This document concentrates on basic HTML usage. In particular, this document does not give realistic examples about applets or image maps. (The main reason for this is that the author felt that a basic document was urgently needed, and providing good examples about such complicated and somewhat controversial issues would have taken too much time.)
For printing on paper, you may wish to use the PostScript version (generated from the HTML version with Netscape), which also exists in a much smaller form, as compressed (with the Unix compress utility).
In general, you should be able to read this document on any decent WWW browser. However, tables (TABLE elements) have been used in this document, mainly in the description of attributes, since they are essentially tabular information best presented so. Unfortunately this means that parts of this document are almost illegible when viewed with browsers which cannot present tables (eg most versions of Lynx).
The author hereby gives general permission to copy and distribute this document or parts thereof in any medium, provided that all copies contain, in a manner appropriate for the medium, an acknowledgement of authorship and the URL of the original document, ie http://www.hut.fi/%7ejkorpela/HTML3.2/
The permission granted above does not imply permission to distribute this document in a modified form or as a translation. Please contact the author to discuss the conditions for such actions.
Explanation: The author wishes to preserve the integrity of the document. This includes specifying the context when distributing or using excerpts and informing the reader about the availability of the entire document in its most up-to-date form.
Please notice that most introductory texts on HTML do not present the language exactly as defined by HTML 3.2; some of them might differ a lot from it. This is understandable, since the language HTML evolves rapidly (and even divergently).
The specification is relatively short and technical, and consulting the older HTML 2.0 specification (also known as RFC 1866) can be useful, since the current HTML 3.2 specification can sometimes be understood only by assuming HTML 2.0 as a background document.
In order to understand the HTML specifications exactly, some fluency in reading SGML (the metalanguage used to describe the syntax of HTML formally) is required. SGML as a whole is rather complicated, and the SGML standard is only available in printed form. However, for the purpose of understanding the SGML descriptions of the syntax of HTML (that is, HTML DTDs), the following material usually gives you enough information:
There are some minor internal inconsistencies in the HTML 3.2 specification.
Notice that documents on HTML (even some of the above-mentioned) very often contain information about features which do not belong to HTML 3.2.
Even if you know HTML 3.2 well, you will sooner or later violate the specification by mistake; for instance, just forgetting an ending quote can cause a lot of such violations. You may not notice the error in your environment but your readers may get confused.
It is not sufficient to check that "it works" on your browser. Other people will use that browser in a different environment or with different settings, different versions of the browser, or even quite different browsers. Browsers very often pass invalid HTML without giving error messages, perhaps even handling it in such a way that things seem to work fine. For other people, it might be a mess. Looking at your document on a few different browsers may help to detect problems, but it would be too tedious to do that for all important browsing environments.
Therefore, validate your code. You can use eg the HTML Validation Service of WebTechs, which is easy to use.
Passing validation means that there are no violations of HTML syntax (provided that the validator does its job right). Checking the quality of the document is a different thing. There are some checkers such as WebLint which can be used to test the document for various common problems - for things which, although technically legal, are likely to provoke known browser bugs, etc. Checkers may of course perform an HTML syntax check too, but typically they are rougher than validators. They might declare a document syntactically legal when it isn't, or declare it illegal when it is. Nevertheless, they are useful tools, both for alerting newcomers to potential problems, and for picking up errors made by even the most experienced.
For more information, see Heikki Kantola's nice compact list of validators and checkers and WDG's (annotated) rather extensive list of validators and checkers.
In addition to character repertoire and encoding (of characters by bit combinations), there is a special feature which is fixed in HTML: the interpretation of numerical character escapes of the form &#n; where n is a number. Such an escape is to be interpreted as the character corresponding to n in ISO 10646 and Unicode. In practice, browsers cannot represent all ISO 10646 characters, but the specifications imply that if a browser presents &#n; as a character, it must use the ISO 10646 character. (Unfortunately, browsers may violate this.)
In practice, you should use ISO Latin 1 characters only. Currently or in the near future you can hardly expect general support for extensions to it, although support for some national alphabets may exist nationally. Support for ISO Latin 1 should exist in all browsers, but there are problems even with this. You may of course decide to stick to the ASCII character set, which is a subset of ISO Latin 1, especially if you do not need letters with diacritic marks (or, in general, letters other than English a - z).
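As a small illustration of numerical character escapes, consider the following fragment. It is only a sketch: the sample text is made up for this example. The numbers used are 228 (a with diaeresis), 241 (n with tilde) and 169 (copyright sign), which are the code positions of these characters in ISO Latin 1 and in ISO 10646 alike.

```html
<!-- Numerical escapes for three ISO Latin 1 characters:
     228 = a with diaeresis, 241 = n with tilde, 169 = copyright sign -->
<P>J&#228;rvenp&#228;&#228; and Espa&#241;a, &#169; 1996</P>
```

A browser which supports ISO Latin 1 should present this exactly as if the characters had been typed directly.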
The printable characters of ASCII (with code values from 32 to 126 in decimal) are the following:
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
The other printable characters of ISO Latin 1 (with code values from 160 to 255 in decimal) are the following:
¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö × Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ÷ ø ù ú û ü ý þ ÿ
Note: The presentation of some characters in the copy of this document may be defective eg due to lack of font support. Naturally, the appearance of characters varies from one font to another.
If your keyboard or text editor does not allow you to enter (ie to type directly) some ISO Latin 1 characters such as ä or ñ, you can use the character escape conventions.
Some practical warnings to those who create HTML documents on microcomputers:
<H1> <H1 ALIGN=LEFT>
<H1>Foreword</H1>
In such cases the two tags and the part of the document enclosed by them form a unit which is called an HTML element. Some tags, eg <HR>, are HTML elements by themselves, and for them the corresponding end tag would be illegal. In the sequel we will usually refer to tags by their name only, omitting the obligatory angle brackets.
For some elements which logically consist of a start tag, some content and an end tag, it is legal to omit the end tag, possibly even the start tag. For example, you can omit the end tag </P> and let browsers and other software imply it when necessary. The exact rules for allowable tag omission are given in the HTML specification, often only in the formal (SGML) syntax, so they can be hard to read. Moreover, some browsers are known to misbehave if you omit some end tags even when the specs allow it, and this can have drastic effects eg when nested tables are involved. Thus it is wisest to use explicit end tags always for all elements which logically have an end tag.
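The difference between implied and explicit end tags can be sketched as follows (the paragraph texts are made up for illustration). Both forms are legal, but the second one leaves nothing for the browser to guess.

```html
<!-- Legal, but each end tag must be implied by the browser -->
<P>First paragraph.
<P>Second paragraph.

<!-- The same content with explicit end tags -->
<P>First paragraph.</P>
<P>Second paragraph.</P>
```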
You can also omit the quotes from an attribute value if the value consists of the following characters only (cf the technical concept of name):
Within attribute values, no HTML tags are recognized. On the other hand, escape sequences are recognized and interpreted.
There is a minimized syntax for attributes when the attribute value is the same as the attribute name. For instance, <UL COMPACT="COMPACT"> can be abbreviated as <UL COMPACT> (and it is common practice to do so). Some user agents even require minimization for some attributes (COMPACT, ISMAP, CHECKED, NOWRAP, NOSHADE, NOHREF), so perhaps it is best to use the minimized syntax when applicable.
Successive attribute specifications must be separated with blanks (or newlines).
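The attribute conventions described above might look as follows in practice. This is a sketch with made-up list contents; the attribute values chosen for HR are arbitrary.

```html
<!-- Full form: the attribute value is the same as the attribute name -->
<UL COMPACT="COMPACT">
<LI>An item</LI>
</UL>

<!-- Minimized form of the same attribute -->
<UL COMPACT>
<LI>An item</LI>
</UL>

<!-- Several attributes, separated by blanks or newlines -->
<HR NOSHADE SIZE="4"
    WIDTH="50%">
```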
The general syntax of URLs is the following:
scheme://host:port/path/filename
where
http | a Web document (to be accessed using Hypertext Transfer Protocol, HTTP) |
ftp | a file in a so-called FTP server, to be retrieved using File Transfer Protocol |
gopher | a file in a Gopher server |
mailto | electronic mail address |
news | a newsgroup or an article in Usenet news |
telnet | for starting an interactive session via the Telnet protocol (which is part of TCP/IP) |
www.hut.fi (or sometimes a numerical TCP/IP address); notice that typically, but not necessarily, Web servers have domain names starting with www
:port
http URLs. For other URLs, simplifications and special interpretations are applied. For example, a mailto URL is just of the form mailto:address where address is a normal Internet E-mail address like Jukka.Korpela@hut.fi.
Please notice that appending anything to the E-mail address in a mailto URL is nonstandard and may result in lost mail without anyone noticing!
It is safest to enclose URLs in quotes when writing them as attribute values in HTML.
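For instance, an http URL and a mailto URL written as quoted attribute values could look like this (a minimal sketch; the link texts are chosen just for illustration):

```html
<A HREF="http://www.hut.fi/">www.hut.fi</A>
<A HREF="mailto:Jukka.Korpela@hut.fi">Jukka.Korpela@hut.fi</A>
```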
For an overview of URLs, see W3C material on addressing.
As regards the technical specifications of the syntax of URLs, see RFC 1738 (absolute URLs) and RFC 1808 (relative URLs).
In particular, the specifications say that within a URL only a limited set of characters can be used as such:
alphanumeric characters (A to Z, a to z, 0 to 9)
the characters $-_.+!*'(),
the characters ;/?:@=&# provided that they are used in the special meaning reserved for them in the RFCs mentioned above.
Other characters must be encoded. (The characters ;/?:@=&# must also be encoded, if they are not used in the special meaning.)
This encoding (which is defined by URL specifications, not HTML specifications) consists of using the percent sign followed by two hexadecimal digits, presenting the code position. For example, tilde (~) should be presented as %7E and space as %20. (Violating the rules is much more likely to cause problems in the latter case than in the former.)
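The encoding can be illustrated with the following sketch. The first URL is the address of this document with the tilde encoded; the file name in the second link is hypothetical and only serves to show an encoded space.

```html
<!-- Tilde written as %7E -->
<A HREF="http://www.hut.fi/%7Ejkorpela/HTML3.2/">Learning HTML 3.2 by Examples</A>

<!-- Space in a hypothetical file name written as %20 -->
<A HREF="my%20notes.html">My notes</A>
```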
In this document, upper case letters are used for the above-mentioned constructs. This may help the reader distinguish HTML code from normal text.
However, the following constructs are (in general) case sensitive:
The term newline is used to denote an end of line designation. Theoretically SGML specifies that a line (record) should begin with a record start character (line feed, LF, ASCII code 10) and end with a record end character (carriage return, CR, ASCII code 13). In practice, HTML documents are presented and transmitted using the newline convention of the computer system used. Therefore, HTML browsers are encouraged to accept any of the three common representations, namely CR LF sequence, CR only, and LF only, as line separators and to infer the missing record end and start characters.
Thus, it does not matter how you divide the text into lines, since a newline is equivalent to a blank. Notice, however, that you must not divide a word into two lines in HTML. If you eg divide the word international into two lines as follows:
inter-
national
it will be interpreted as equivalent to
inter- national
and the result is not what you want.
Thus, you must use HTML tags such as P or BR to force line breaks, if they are necessary for the logical representation of your document.
Browsers usually do not divide words into two lines, except possibly when a word contains a hyphen. The HTML 3.2 Reference Specification is not very explicit in this matter; it just says, in the discussion of tables, the following:
For some user agents it may be necessary or desirable to break text lines within words. In such cases a visual indication that this has occurred is advised.
Beware that the line length is outside your control. It depends on the browser, device, and settings used by the people who look at your document. You can force line breaks but not prevent line breaks between words, in general. (You can try to prevent line breaks by using non-breaking spaces.)
As regards newlines in conjunction with HTML tags, there are special rules:
<P>
Text
is equivalent to
<P>Text
and
Text
</P>
is equivalent to
Text</P>
However, popular browsers (such as Netscape and Internet Explorer)
are known to violate these official rules.
For example, if you write an A element as follows:
<A HREF="foo.html">bar
</A>
then many browsers incorrectly display it as if the link text
had a blank appended. Since browsers often indicate links with
underlining, there could be an extra underlined space.
Thus, in some cases removing a newline before an end tag
may help in improving the presentation on popular but buggy
browsers. See the document
White Space Bugs in Browsers
for more detailed explanation with examples.
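The workaround mentioned above can be sketched as follows, using the same hypothetical link as in the example:

```html
<!-- End tag on a line of its own: some browsers show an extra underlined blank -->
<A HREF="foo.html">bar
</A>

<!-- End tag immediately after the link text avoids the problem -->
<A HREF="foo.html">bar</A>
```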
The horizontal tab character (HT) can appear in the HTML source. Within PRE elements, tabs have a special interpretation. Otherwise a tab is equivalent to a space. Thus, it does not imply tabulation of any kind. (In order to present tabular data, use the TABLE element.) It is best to avoid tabs in HTML code and to use a suitable number of spaces instead, if one wants to format the HTML source code into tabular form.
Apart from the elements at the topmost levels, namely HTML, HEAD and BODY, the HTML elements are classified into three major categories:
Any text element (including plain text) can appear wherever a block element is allowed, by virtue of implicitly forming a paragraph (P element) when necessary.
A rule of thumb which may help in remembering which elements are block elements and which are text elements: block elements cause paragraph breaks, text elements do not.
Note: Often block elements can contain both text elements and
other block elements, ie blocks can be nested.
Text elements can be nested, too.
On the other hand,
text elements may not contain block elements.
For example,
<CITE><H3>Origin of Species</H3></CITE>
is invalid (since CITE
is a text element and H3 is a block element)
and also illogical (you don't really mean that the heading
as a structure
is a citation, do you?)
whereas
<H3><CITE>Origin of Species</CITE></H3>
would be legal, although different browsers might treat it differently
(letting either H3 or CITE determine the rendering, or possibly
using a mixture of the two).
Similarly, don't embed
headings into A NAME
tags but vice versa.
It is also illegal to have a paragraph break (P tag)
within eg a STRONG element; although several
browsers can handle it, the semantics is ambiguous and you should use
separate start and end STRONG tags within each paragraph (if you really
want to emphasize such large portions of text!).
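The last two recommendations might be followed as in the sketch below. The anchor name and the texts are made up for this illustration.

```html
<!-- The heading contains the anchor, not vice versa -->
<H2><A NAME="intro">Introduction</A></H2>

<!-- STRONG is ended and started again in each paragraph -->
<P><STRONG>First emphasized paragraph.</STRONG></P>
<P><STRONG>Second emphasized paragraph.</STRONG></P>
```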
The same information is presented in the individual tag descriptions, in their Allowed context and Contents parts. Here it is presented in a compact form. This form does not cover all details but might be more illustrative.
Legend:
A, ADDRESS, APPLET, B, BIG, BLOCKQUOTE, BODY, CAPTION, CENTER, CITE, CODE, DD, DFN, DIV, DT, EM, FONT, FORM, H1, H2, H3, H4, H5, H6, HTML, I, KBD, LI, P, PRE (with restrictions), SAMP, SMALL, STRIKE, STRONG, SUB, SUP, TD, TH, TT, U, VAR.
The following are not text containers but may contain text elements indirectly, ie contain elements which are text containers:
DIR, DL, MENU, OL, TABLE, TR, UL.
The following may not contain text elements at all:
AREA, BASE, BASEFONT, BR, HEAD, HR, IMG, INPUT, ISINDEX, LINK, MAP, META, OPTION, PARAM, SCRIPT, SELECT, STYLE, TEXTAREA, TITLE.
Similarly I will use the term block container to denote any element which may contain a block element directly (as opposed to containing an element which contains a block element). Block containers are: BLOCKQUOTE, BODY, CENTER, DD, DIV, FORM, HTML, LI (when within UL or OL), TD, TH.
Obviously, since some characters such as < are used with a very special meaning in HTML, there must be some way of expressing them as data characters, ie when they should appear eg as part of the document itself or in a URL. The convention is that the following notations are used:
character | notation | usual name(s) of the character |
---|---|---|
< | &lt; | less than character, left angle bracket |
> | &gt; | greater than character, right angle bracket |
& | &amp; | ampersand |
There was notation &quot; for the double quote (") in HTML 2.0, but it does not belong to HTML 3.2 (for certain technical reasons). The double quote can be typed as such within normal text, and within quoted strings as well if the single quotes are used as the outermost quotes. (In the rare cases where this does not work, you can use &#34; to represent the double quote.)
Notice that the semicolon is part of the escape sequence. In principle, it is necessary only if the following character would otherwise be recognized as part of the name. In practice, it is best to adopt the habit of always terminating an escape sequence with a semicolon.
In escape sequences, the case of letters is significant. For example, the ampersand & may not be represented as &AMP; (this escape sequence is undefined), and the escape sequences &auml; and &Auml; denote two distinct characters, a umlaut (a dieresis, the letter a with two dots above it) in lower case and in upper case (ä and Ä); notice the principle of uppercasing only the first letter in the escape notation (&AUML; is undefined).
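A short sketch of named escape sequences in running text follows (the sentences are made up for illustration). Each escape is written in the exact case required and terminated with a semicolon.

```html
<P>Write &lt;P&gt; to show a tag as text, and AT&amp;T to show an ampersand.</P>
<P>The escapes &auml; and &Auml; denote the lower case and the upper case
a with diaeresis.</P>
```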
The need for the above-mentioned escape sequences arises from the syntax of HTML. In fact there are escape sequences for all characters in the ISO Latin 1 character set. There are
&copy; | copyright sign, © |
&reg; | registered trademark sign, ® |
&nbsp; | non-breaking space |
However, there is usually little reason to use other escape sequences than &lt; and &gt; and &amp;. Using &auml; instead of ä might seem to give some character code independency, but it does not; if a browser can display &auml; correctly, it can also display correctly a document in which the character ä is specified directly. But notice that sometimes you cannot input some special characters directly due to keyboard restrictions, and in such cases you can have use for notations like &auml;.
And please notice that "character ä" means the ISO Latin 1 character with name "small letter a with diaeresis" (diaeresis = umlaut), with code 344 in octal, 228 in decimal. It can be entered into an HTML document in various ways. It is possible that pressing a key labeled with ä or Ä is not among those ways. For instance, on a Macintosh with Scandinavian keyboard the ä key normally produces a character quite different from ä in ISO Latin 1. Various programs may or may not handle this by performing character code conversions.
Some browsers support other escape sequences than those mentioned above, for example &trade; and &cbsp;. The use of such notations is strongly discouraged. (Notation &trade; refers to a symbol which does not belong to ISO Latin 1 at all; you may wish to use the HTML 3.2 conformant notation <SUP><SMALL>TM</SMALL></SUP> instead. Notation &cbsp; stands for "conditional breaking space", not in ISO Latin 1 and possibly not intended to be a character at all.)
This name concept occurs in the description of HTTP-EQUIV and NAME attributes of the META element and in the description of NAME attribute of the PARAM element.
In other contexts, a string which is used to name something may contain other characters as well but then it must be quoted.
It is of course possible that due to software or hardware limitations all colors cannot be presented. On some devices, the actual rendering might be just black and white or different shades of grey.
When a color is specified as the value of an attribute, there are two possibilities:
The following table lists the predefined color names and their numerical equivalents.
Black = "#000000" | Green = "#008000" |
Silver = "#C0C0C0" | Lime = "#00FF00" |
Gray = "#808080" | Olive = "#808000" |
White = "#FFFFFF" | Yellow = "#FFFF00" |
Maroon = "#800000" | Navy = "#000080" |
Red = "#FF0000" | Blue = "#0000FF" |
Purple = "#800080" | Teal = "#008080" |
Fuchsia = "#FF00FF" | Aqua = "#00FFFF" |
These colors were originally picked as being the standard 16 colors supported with the Windows VGA palette. The HTML 3.2 Reference Specification contains a section on colors with sample images in each of the 16 colors.
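Both ways of specifying a color can be seen in the following sketch, where the particular color choices are arbitrary. The numerical values are taken from the table above, so "#008080" and Teal mean the same color.

```html
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080">
<P>Normal text, followed by
<FONT COLOR="Maroon">text in a named color</FONT> and
<FONT COLOR="#008080">text in the color also known as Teal</FONT>.</P>
</BODY>
```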
See also:
A browser should multiply the pixel values by an appropriate factor when rendering to very high resolution devices such as laser printers. For instance if a user agent has a display with 75 pixels per inch and is rendering to a laser printer with 600 dots per inch, then it should multiply the pixel values given in HTML attributes by a factor of 8.
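To make the arithmetic concrete, consider a horizontal rule with a width given in pixels. The value 300 is chosen only for illustration.

```html
<!-- 300 pixels is 4 inches on a display with 75 pixels per inch.
     On a printer with 600 dots per inch the rule should span
     300 * 8 = 2400 dots, which is again 4 inches. -->
<HR WIDTH="300">
```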
The question whether &nbsp; should prevent line breaks when rendering HTML documents is ambiguous. The HTML 2.0 specification says:
Use of the non-breaking space and soft hyphen indicator characters is discouraged because support for them is not widely deployed.
The soft hyphen should really be avoided; it serves no useful purpose in HTML. But as regards the non-breaking space, you can well use it to try to prevent line breaks where you don't want them. And although the HTML 3.2 Reference Specification is not explicit about the matter in general, it suggests, in the discussion of the NOWRAP attribute of TH and TD elements, that &nbsp; should act as non-breaking space within table cells at least.
If you use non-breaking spaces, use them instead of normal spaces, not in addition to them. For instance, if you wish to prevent a line break between version and 3, type version&nbsp;3 (not version &nbsp;3).
On the other hand, within a table in HTML 3.2, &nbsp; can have quite a different meaning, which can be described as non-empty space: when a table is presented with borders, cells with empty contents are drawn without them, and spaces only do not constitute contents - but &nbsp; does! This peculiar semantics does not prevent &nbsp; from acting as a non-breaking space as well.
For further confusion, some people use &nbsp; to force spaces into the visible presentation of a document, eg by putting an &nbsp; or a few of them into the beginning of a paragraph to get its first line indented. This may actually work on some browsers, but it is unwise to rely on that, and it is normally useless to try to enforce such presentation features anyway.
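Both uses of the non-breaking space can be sketched with a small table (the cell contents are made up for illustration). In the first data cell the escape keeps the two words on the same line; in the second it just gives the cell some contents, so that a border is drawn around it.

```html
<TABLE BORDER>
<TR><TH>Item</TH><TH>Note</TH></TR>
<TR><TD>version&nbsp;3</TD><TD>&nbsp;</TD></TR>
</TABLE>
```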
You can begin a comment with the four-character sequence <!-- (less than sign, exclamation sign, two hyphens) and terminate it with the three-character sequence --> (two hyphens, greater than sign). Don't use the character pair -- or the character > within a comment. For example:
<!-- Written by Jukka Korpela -->
(For a more thorough discussion of comment syntax, see document HTML comments by WDG.)
It is generally preferable to include metainformation about the document into HTML elements, such as META. Consider making information about purpose, author, creation and last update time etc a visible part of the document itself, too.
Thus, comments should be inserted in rare cases only, eg to comment the HTML code itself to explain things that may look odd. Remember that a comment is part of an HTML file, to be transmitted whenever the document is delivered. Therefore, to avoid wasting bandwidth, if you have a long story to tell, put it into a separate document and insert just its URL into a comment.
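A possible division of labour between META elements and comments is sketched below. The NAME value Author is a common convention rather than something defined by HTML 3.2, and the URL in the comment is hypothetical.

```html
<HEAD>
<TITLE>A sample HTML document</TITLE>
<!-- Metainformation goes into an element, not into a comment -->
<META NAME="Author" CONTENT="Jukka Korpela">
<!-- Markup conventions are explained at http://www.example.org/notes.html -->
</HEAD>
```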
HTML editors and converters often insert a few comment lines into the beginning of an HTML file. Such indications can be helpful and should not be removed.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<TITLE>Hello</TITLE>
Hello world
In fact, this document implicitly has the following structure, ie it is equivalent to the following:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>Hello</TITLE>
</HEAD>
<BODY>
Hello world
</BODY>
</HTML>
This means that apart from the first line, the entire file is an HTML element which contains a HEAD element, with the TITLE element as contents, and a BODY element, with the plain text as contents.
Thus, in the absence of HTML, HEAD, and BODY tags a browser implicitly assumes them in suitable places. Therefore, your document always contains a head and a body.
Here we will simply emphasize that every HTML document should contain certain basic information about its origin. The local recommendations may specify in detail the form in which that information should be provided.
The importance of providing origin information becomes evident if we think how people increasingly find documents using search engines or link lists. In such contexts the document pops up as such, in isolation, even if you may have intended that people find it by following links which you have carefully designed so that they give background information. When a user has eg found your document using AltaVista, he most probably wants to know what kind of document it is. Therefore, each HTML file should provide the very basic information (or link to information) about its origin and nature. For example, in a book-like document collection divided into small files, every file should contain at least a link to the "front page" of the "book".
At least the following origin information should be provided:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<TITLE>A sample HTML document</TITLE>
<LINK REV="made" HREF="mailto:jukka.korpela@hut.fi">
</HEAD>
<BODY>
<H1>A sample HTML document</H1>
This is a sample HTML document exemplifying a suggested way of presenting basic origin information.
<HR>
<P>
<A HREF="http://www.hut.fi/~jkorpela/">Jukka Korpela</A>,
<a href="mailto:Jukka.Korpela@hut.fi">Jukka.Korpela@hut.fi</a>
<BR>
This document belongs to the context of
<a href="index.html">Learning HTML 3.2 by Examples</a>
<BR>
The URL for this document is
<KBD> http://www.hut.fi/~jkorpela/HTML3.2/skel.html </KBD>
<BR>
Created: December 5, 1996
</BODY>
</HTML>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
(where you theoretically should have HTML 3.2 Final instead of HTML 3.2)
<TITLE>Introduction to General Absurdity</TITLE>
Most browsers don't complain if you omit these, but they are required by the HTML 3.2 definition. More importantly, there are good practical reasons to include them:
Optionally, the HEAD element may contain the following elements in addition to a TITLE element:
The tags for expressing major structural features, so-called block level tags, are the following:
A recommendable approach, which may need adjustments to fit your local recommendations, is the following:
Lists can be nested in the sense that an item in a list, ie an LI (or DD) element, may in turn contain a list element.
Notice that the basic paragraph element P is not nestable, ie you cannot have P elements within a P element to create subparagraphs. However, the various list elements effectively provide an itemization structure which essentially corresponds to subparagraph division. Moreover, the list elements are nestable.
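A nested list used as a substitute for subparagraphs might look like the following sketch, in which the item texts are made up for illustration. The inner UL element is part of the contents of the first LI element.

```html
<UL>
<LI>Block level elements
    <UL>
    <LI>Headings</LI>
    <LI>Lists</LI>
    </UL>
</LI>
<LI>Text level elements</LI>
</UL>
```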
Logical markup shall be preferred. Use physical markup only if it is really relevant that part of a text is displayed in a particular physical way (if possible). The need for physical markup may arise when referring to information in fixed presentation form, such as text in a book or in an image. Such situations occur rarely.
For instance, use the STRONG element for strong emphasis, letting the various Web browsers express the emphasis in the way which is the best in the environment where they are used. Do not use the B element (indicating bolding), except in the rare occasions where you are writing about some text appearing in boldface somewhere.
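The distinction can be sketched as follows. Both sentences are made up for this illustration; the second one assumes a case where the text really is about boldface in a printed work.

```html
<!-- Logical markup: the browser decides how to express strong emphasis -->
<P><STRONG>Validate your code</STRONG> before publishing it.</P>

<!-- Physical markup, used because the text is about boldface as such -->
<P>On the printed cover the title appears in bold: <B>Origin of Species</B>.</P>
```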
When style sheets become generally usable, both authors and readers will be able to affect the rendering (eg font, color, and background) of elements. For instance, someone might wish to have all program code extracts presented with yellow background and larger than normal font whereas someone might prefer some quite different methods of distinguishing them from normal text. Such operations will be much easier if logical markup has been used consistently.
In addition to being more flexible with respect to various browsers and rendering environments, logical markup has the following advantage over physical markup: Increasingly, computer programs are used for extracting information from HTML documents for various purposes like indexing. For this to work, it is much better to have logical markup indicating eg that some text is more important than the rest or a quotation of computer printout, rather than having designations of physical fonts.
Both logical and physical markup is done using HTML elements with start and end tags. It follows from the nature of the HTML language that markups must not overlap. For instance, the following is in error:
This has some <B>bold and <I></B>italic text</I>.
On the other hand, markup elements can be nested. User agents should do their best when rendering structures like the following:
This is <I>italic text which contains <U>underlined text</U> within it</I> whereas <U>this is normal underlined text</U>.
Obviously, browsers with limited font repertoire can have difficulties in presenting text markup.
Avoid emphasizing too much, since emphasizing everything is tantamount to saying everything with the same emphasis, ie not emphasizing anything! (The proverbial student who underlines everything in his textbook has not grasped the idea of emphasizing.)
Unfortunately there is no phrase element for "de-emphasis", ie for indicating segments of text as less important. If you really need that, you may consider using the SMALL element. But especially if the less important text is relatively long, it might often be a better idea to put it "behind hyperlinks", into separate documents to which there are links in the main document. A person who follows such a link is probably interested in the text, so he probably prefers seeing it as normal text, and there is no need for any de-emphasis.
The DFN element can be regarded as a special kind of emphasis, too, but logically it indicates that a term is used in a context where it is defined. This is a very useful element in principle but unfortunately many browsers, including Netscape, do not effectively support it.
The VAR element indicates that a piece of text (typically, a word) is a variable, ie a generic notation to be replaced by different actual expressions.
The other phrase elements involve different kinds of citations or quotations:
element | meaning |
---|---|
CITE | citation (title of a book or article or equivalent) |
CODE | program code or equivalent (eg HTML code) |
SAMP | sample output from programs, scripts, commands etc |
KBD | text to be typed from a keyboard by a user; typically used when giving instructions |
Please do not identify eg the concept of emphasis with its physical representation on your browser (or even its typical representation on several browsers). See below for notes and examples on rendering markup.
element | meaning |
---|---|
TT | "teletype" text, ie monospaced text |
I | italics |
B | bold |
U | underlined |
STRIKE | strike-through text |
BIG | large font |
SMALL | small font |
SUB | subscript |
SUP | superscript |
Note: SUB and SUP might reasonably be regarded as phrase-level markup, and as mentioned above, SMALL might be used as a substitute for the missing phrase markup for de-emphasis.
The FONT (and BASEFONT) element offers more possibilities to control font sizes than BIG and SMALL. However, all use of font size control in HTML should be avoided.
For example, some browsers (eg Internet Explorer) render TT (and CODE) so that the font is significantly smaller than normal text font, and this disproportion is preserved when the setting for font size is changed; moreover, Internet Explorer renders VAR with monospaced font whereas most graphical browsers use (much more naturally) italics. On the other hand, in Netscape these font sizes are separately settable and by default the same font size is used for both, but "the same" is the technical size in points - in practice monospaced font looks bigger than normal proportional font!
Thus, avoid messing with font sizes; use phrase markup and other structural elements and let the users, if they dislike the font sizes, define fonts in their browser settings the best they can.
The following table is intended to give an idea of the variation. It (verbally) presents the rendering of markup elements in Netscape Navigator, Microsoft Internet Explorer, and Lynx. Notice that there is variation even within each of these programs - depending on version, platform, and system-wide or user's own configuration - so this is just a typical situation. Thus, consider this an illustration of the different things that might happen rather than a description of what actually happens in some particular program.
element | Netscape | Internet Explorer | Lynx |
---|---|---|---|
EM | italics | italics | underlined |
DFN | normal text | italics | normal (monospaced) |
CODE | monospaced | monospaced small | normal (monospaced) |
SAMP | monospaced | monospaced small | normal (monospaced) |
KBD | monospaced | monospaced small | normal (monospaced) |
VAR | italics | monospaced small | normal (monospaced) |
CITE | italics | italics | underlined |
TT | monospaced | monospaced small | normal (monospaced) |
I | italics | italics | underlined |
B | bold | bold | underlined |
U | normal text | underlined | underlined |
STRIKE | strike-through | strike-through | text between [DEL: and :DEL] |
BIG | larger than normal | larger than normal | normal text |
SMALL | smaller than normal | slightly smaller than normal | normal text |
SUB | lowered, slightly smaller | lowered | normal text |
SUP | raised, slightly larger | raised | normal text |
These relate to unnested elements. Nesting of text elements may affect the rendering.
The following example illustrates the approach in the context of an introduction to the Perl programming language.
<P>The following Perl script prints out its input so that each line begins with a running line number:</P>
<PRE><CODE>
#!/usr/bin/perl
$line = 1;
while (&lt;&gt;) {
print $line++, " ", $_; }
</CODE></PRE>
<P>The scalar variable <CODE>$line</CODE> is of course the line counter.</P>
<P>The loop construct is of the form<BR>
<CODE>while (&lt;&gt;) {</CODE><BR>
<VAR>process one line of input</VAR>
<CODE>}</CODE><BR>
</P>
<P>Assuming that you have written this script (the simpler version of it) into a file named <KBD>lines</KBD>, you could test it using a command of the form<BR>
<KBD>./lines</KBD> <VAR>datafile</VAR><BR>
In particular, using the script as input to itself, you would do as follows (the details of system output vary from one system to another):
</P>
<PRE>
<SAMP>lk-hp-23 perl 251 % </SAMP><KBD>./lines lines</KBD>
<SAMP>1 #!/usr/bin/perl
2 $line = 1;
3 while (&lt;&gt;) {
4 print $line++, " ", $_; }
lk-hp-23 perl 252 % </SAMP>
</PRE>
Notes on the example:
Thus, on the Web there is no such thing as the layout of a document. As an author you cannot dictate layout, just make some efforts to affect it. The following notes, and all information related to layout-oriented features of HTML, should be read with this in mind.
Several HTML elements have optional attributes which can be used to affect the way in which the element is rendered. Consult the detailed descriptions of individual HTML tags to see the possibilities and to read notes about them.
In particular, you may wish to center parts of the text to make them more distinguishable from normal text. You can use the ALIGN=CENTER attribute in several elements like P or DIV (or the separate CENTER element).
If you wish to separate major portions of your document visually from each other, you can use the HR element. Typically it is rendered as a full width horizontal line. But please use this in addition to structuring tools like headings, not as a substitute for them.
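A minimal sketch of the constructs mentioned above, with hypothetical headings and text. It shows ALIGN=CENTER on a P element and on a DIV element, and an HR element used in addition to headings rather than instead of them.

```html
<H2>Opening hours</H2>
<!-- centering a single paragraph -->
<P ALIGN=CENTER>Open daily from 9 to 17.</P>
<!-- centering a larger block which may contain several elements -->
<DIV ALIGN=CENTER>
<P>Closed on public holidays.</P>
</DIV>
<!-- visual separator; the heading below still carries the structure -->
<HR>
<H2>Contact information</H2>
<P>Please use the address given at the end of this page.</P>
```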
As regards detailed layout issues such as forcing or preventing line breaks, see section Division into lines and the use of blanks and tabs. Font issues were discussed above.
Technically links are specified using A (anchor) elements, and the technical issues are discussed in the description of the A tag. Here we just present the basic idea, a very simple example, and a few pragmatic or stylistic notes.
A link is a directed connection between a particular point in a document and another particular point in the same or another document. The points are often called anchors in HTML terminology.
The two ends of a link (the anchors) are in different logical positions: the link is from one point to another. The latter, called the target of the link, is very often the beginning of a document or, perhaps more logically speaking, an entire document.
In the simplest case, you create a link from one point of your document to another document (which could be your own or written by someone else, perhaps physically located at the other side of the globe). You have to decide which words act as a visual representation of the link, ie as the phrase which refers to the other document, and you need to know the Web address (the URL) of that document. Then you just put the pieces together into a suitable A element. For instance:
I work at <A HREF="http://www.hut.fi/english.html">HUT</a>.
This might, in one environment, be rendered as follows:
I work at HUT.
The link text, here the abbreviation HUT, acts as a link to a Web document which explains what the abbreviation means and also provides a lot of information about it. The renderings vary a lot - the link text might be underlined, colored, or otherwise distinguishable from normal text. The user (reader) is assumed to know how links are rendered in the particular environment.
Although it is technically easy to set up links, it is pragmatically often very difficult to use them the right way. Here are some practical guidelines:
Assuming that we have some graphics in some format in a file, there are two essentially different ways to use it in a Web document. You can either link to it or embed it into your document. In the first case, you use an anchor (A) element; in the latter case, an IMG element. In the first case, when a user accesses your document he sees eg a verbal phrase which acts as a link, and activating that link causes an image to be displayed, either in the same window or in another, depending on the browser and its settings. On the other hand, an embedded image is part of your document; when a user accesses your document, the image is loaded along with it and displayed as part of it.
In both cases, the user will see the image only if the browser supports the particular graphics format. The most commonly supported formats are GIF and JPEG. They are often the only formats supported for embedded images. For linked images, the support is typically wider (it might include eg PostScript, PDF, and PNG) and extensible by the user (by installing new viewers and making suitable additions to the settings of the browser). The reason is that linked images are typically implemented so that the browser knows nothing of the graphics format itself but only knows how to launch a separate program to present it.
As a special case, it is possible to combine linking and embedding in a sense: you can create a document which contains an image which acts (instead of verbal link text) as a link to another image. Typically, the embedded image is rather small, stamp-like, often a small coarse version of the image to which it points as a link.
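A sketch of this combination might look as follows. The file names sae.gif and sae-small.gif are assumptions for illustration; the small image would have to be prepared separately with some graphics program.

```html
<!-- The embedded small image acts as the link "text" for the large image -->
<A HREF="sae.gif"><IMG SRC="sae-small.gif"
ALT="[Small picture of Siamese algae eater; follow the link for a larger version]"></A>
```

The ALT text matters here, since on a browser which does not display images it is the only thing the user can select as the link.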
Linking to an image is usually permitted without specific permission. On the other hand, embedding an image means using it in a way which requires the author's permission, and the author must be mentioned. (See Web Law FAQ.) Obviously, some images are so simple that copyright is not applicable. Moreover, there is a large number of collections of images, some of which are in the public domain.
To illustrate linking to images and embedding images, let us consider a GIF image which has been put onto a suitable place so that it is accessible using the URL http://www.hut.fi/%7elsarakon/sae.gif. Now I could refer to it in the following way:
<A HREF="http://www.hut.fi/~lsarakon/">Liisa Sarakontu</A> has drawn <A HREF="http://www.hut.fi/~lsarakon/sae.gif">a picture of Siamese algae eater</A>.
On the other hand, since Liisa has given me the permission to do so, I could embed the image into a document of mine as follows:
The Siamese algae eater (<I>Crossocheilus siamensis</I>) is often mixed up with another algae eating fish, the "false Siamensis" (<I>Garra taeniata</I> or <I>Epalzeorhynchus sp.</I>). Below you can see drawings of them by <A HREF="http://www.hut.fi/~lsarakon/">Liisa Sarakontu</A>.
<P>
<IMG SRC="http://www.hut.fi/~lsarakon/sae.gif" ALT="[Picture of Siamese algae eater]">
<P>
<IMG SRC="http://www.hut.fi/~lsarakon/false.gif" ALT='[Picture of "false Siamensis"]'>
The issue of good use of images is very difficult and many-faceted. No attempt to cover it will be made here. The author has written a separate treatise How to use images in communication in general and on the Web in particular.
There is no general support in HTML 3.2 for presenting mathematical formulas. Consult the W3C document on Math Markup to see what work is in progress in this respect. However, you can use some software (eg TeX) to produce the representation of a formula as an image, eg in PostScript form, and use the IMG tag to embed it into your document or the A tag to create a link to it. The latter method is often worth considering, especially for large formulas. The reader may prefer reading the text without distractions and looking at the formula (image) at the very moment he is prepared to do so. Moreover, he may prefer looking at it in a separate window (which is separately adjustable in size and positionable on the screen).
In some cases, when just a few separate symbols are needed within the text and they have reasonable textual alternatives, the following kind of approach can be suitable:
The Greek letter <IMG SRC="http://www.ece.cmu.edu/icons/Sigma.xbm" ALT="sigma"> is often used to denote summation.
There is a problem, however: since an image has fixed dimensions whereas the size of letters is browser-dependent, there might be an unesthetic disproportion.
Sometimes it is best to present mathematical expressions in linearized notation. For example, instead of trying to find a way of presenting the square root of 2 in the normal mathematical way, you might write just sqrt(2). It depends on intended audience whether you need to explain such notations.
Table cells are often called table elements, but it is best to avoid that in the HTML context, since it might cause confusion eg with the TABLE element, which is the HTML description of an entire table.
Tables are the most important improvement in HTML 3.2 in comparison with HTML 2.0. On the other hand, the table constructs of HTML 3.2 are only a subset of The HTML3 Table Model (RFC 1942).
Unfortunately tables are not yet supported by all browsers, and even if support exists it may be of poor quality. (Text-only browsers and speech-based user agents will always have difficulties with complicated tables, of course.) See Alan Flavell's review Tables on non-table browser for information about making tables look somewhat reasonable, if possible, also on browsers which do not support tables.
Another unfortunate situation is that people have started using table elements just to get a desired layout of pages, not to represent data which is logically matrix-like in structure.
<TABLE>
<TR> <TD> 1 </TD> <TD> 0 </TD> </TR>
<TR> <TD> 0 </TD> <TD> 1 </TD> </TR>
</TABLE>
and it looks like the following on a typical browser:
1 | 0 |
0 | 1 |
Thus, the TABLE tags enclose the table rows, each of which is enclosed by TR tags and encloses table cells enclosed by TD tags. This corresponds to the logical structure of a table as a set of rows consisting of cells. You can abbreviate the table structure by omitting the TD and TR end tags (since a browser implicitly assumes them), but at the expense of losing the logical clarity to some extent:
<TABLE> <TR> <TD> 1 <TD> 0 <TR> <TD> 0 <TD> 1 </TABLE>
Moreover, although omitting those end tags is legal HTML 3.2, it may in practice confuse some browsers (including Netscape) in some cases.
The use of blanks and newlines in the HTML code for a table is irrelevant to the visual appearance of a table when viewed with a browser, since that appearance is controlled by HTML tags. However, it is often useful to lay out the table cells in the HTML code so that items in the same column are aligned, to make the structure clear for you (or whoever has to maintain the HTML document).
<P>An illustration of the use of the TABLE element in HTML.</P>
<TABLE BORDER=1>
<CAPTION>Finnish, English, and scientific names for some animals</CAPTION>
<TR><TH>Finnish name</TH><TH>English name</TH><TH>Scientific name</TH></TR>
<TR><TD>hirvi</TD><TD>elk</TD><TD><I>Alces alces</I></TD></TR>
<TR><TD>orava</TD><TD>squirrel</TD><TD><I>Sciurus vulgaris</I></TD></TR>
<TR><TD>susi</TD><TD>wolf</TD><TD><I>Canis lupus</I></TD></TR>
</TABLE>
Notice that some table cells in the example contain text markup; in this case, there is a specific reason for using the I element.
In the simplest case you can just write a TABLE element (with attributes defaulted) which contains a single row which contains two data cells, each of which contains a paragraph.
In a more general case, you should divide the parallel texts into logical parts, such as paragraphs, and make each part a cell of the table. This may require a lot of work (unless you have a suitable program to do the job), since you must take care of "merging" the text: after the first part of the first text, you must have the first part of the second text, etc.
The following example presents a passage from the Bible in three versions and translations:
<TABLE>
<CAPTION><STRONG>The beginning of Genesis in three languages</STRONG></CAPTION>
<TR ALIGN=LEFT VALIGN=TOP>
<TH><TH>Latin (Vulgate)</TH><TH>English (King James version)</TH>
<TH>Finnish (1992 version)</TH>
</TR><TR ALIGN=LEFT VALIGN=TOP>
<TH>1</TH>
<TD>In principio creavit Deus caelum et terram.</TD>
<TD>In the beginning God created the heaven and the earth.</TD>
<TD>Alussa Jumala loi taivaan ja maan.</TD>
</TR><TR ALIGN=LEFT VALIGN=TOP>
<TH>2</TH>
<TD>Terra autem erat inanis et vacua et tenebrae super faciem abyssi et spiritus Dei ferebatur super aquas.</TD>
<TD>And the earth was without form, and void; and darkness was upon the face of the deep. And the Spirit of God moved upon the face of the waters.</TD>
<TD>Maa oli autio ja tyhjä, pimeys peitti syvyydet, ja Jumalan henki liikkui vetten yllä. </TD>
</TR><TR ALIGN=LEFT VALIGN=TOP>
<TH>3</TH>
<TD>Dixitque Deus "Fiat lux" et facta est lux.</TD>
<TD>And God said, Let there be light: and there was light.</TD>
<TD>Jumala sanoi: "Tulkoon valo!" Ja valo tuli.</TD>
</TR></TABLE>
Notice that the ALIGN and VALIGN attributes can be essential for achieving good rendering. Browsers cannot know the nature of tables from their contents, so there are situations where the document author may need to control formatting issues like alignment.
Using a TABLE element for a definition list is perhaps not an intended use of that element but it is often useful, especially since the author can control things like alignment and use of borders. Consult the document Examples of various list elements in HTML for a very simple example of presenting a definition list as a table with default attribute settings. Usually you probably want the "definition terms" to be left-aligned, as in the following example:
<TABLE> <CAPTION>The first three letters of the Greek alphabet</CAPTION> <TR><TH ALIGN=LEFT>alpha</TH> <TD> the first letter of the Greek alphabet </TD> </TR> <TR><TH ALIGN=LEFT>beta</TH> <TD> the second letter of the Greek alphabet </TD> </TR> <TR><TH ALIGN=LEFT>gamma</TH> <TD> the third letter of the Greek alphabet. </TD> </TR> </TABLE>
For numerical tables, proper alignment is usually crucial for easily readable rendering. (It is in a sense a structural feature, since it relates to the comparability of items of a column.)
Integer values in a column should be right aligned. This is easy to achieve in principle. There are two alternatives:
Values containing a decimal point (or, in many languages, a decimal comma) should be aligned according to that separator, but unfortunately this is not possible in HTML 3.2. (There are suggested ways of expressing such requests, but currently there is little if any support for them.) One solution is to present such values so that there is the same number of digits to the right of the decimal point in every value in a column, and use ALIGN=RIGHT.
However, the rendering might be unsatisfactory if numbers are presented using a proportional font so that digits are of essentially different sizes. It is possible but tedious to overcome this by putting the data in each numerical cell within a TT element. (Notice that it is not legal for a TT element to contain a TABLE element!)
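As a sketch of the approach just described, each numerical cell below is right-aligned, written with the same number of decimals, and wrapped in a TT element. The items and values are invented for illustration.

```html
<TABLE>
<TR><TH>item</TH><TH>weight</TH></TR>
<!-- TT goes inside each cell; a TT element must not contain the TABLE -->
<TR><TD>A</TD><TD ALIGN=RIGHT><TT>12.80</TT></TD></TR>
<TR><TD>B</TD><TD ALIGN=RIGHT><TT>1.65</TT></TD></TR>
<TR><TD>C</TD><TD ALIGN=RIGHT><TT>0.03</TT></TD></TR>
</TABLE>
```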
The following example contains first a hand-formatted table presented using the PRE element, then the same data using a TABLE element. In general, it takes more work and care to use a TABLE element but the result is often much better.
Measurement results:
<PRE>
time   temperature  pressure
12:00     26         12.8
12:15     22.5        9.8
12:30     11          1.65
12:45      3.3        0.03
13:00      0.05       0.002
</PRE>
<TABLE>
<CAPTION>Measurement results</CAPTION>
<TR><TH>time</TH><TH>temperature</TH><TH>pressure</TH></TR>
<TR ALIGN=RIGHT><TD>12:00 </TD><TD>26.00 </TD><TD>12.800 </TD></TR>
<TR ALIGN=RIGHT><TD>12:15 </TD><TD>22.50 </TD><TD> 9.810 </TD></TR>
<TR ALIGN=RIGHT><TD>12:30 </TD><TD>11.00 </TD><TD> 1.650 </TD></TR>
<TR ALIGN=RIGHT><TD>12:45 </TD><TD> 3.30 </TD><TD> 0.030 </TD></TR>
<TR ALIGN=RIGHT><TD>13:00 </TD><TD> 0.05 </TD><TD> 0.002 </TD></TR>
</TABLE>
The index is implemented in HTML using normal links, eg
<A HREF="af.html">Afghanistan</A>
What we will discuss here is how to present the link names, or some other pieces of text, as a list, table, or some other structure.
If you only read HTML specifications, the obvious answer is to use the DIR or MENU construct. However, as mentioned and exemplified in the general discussion of lists, this is not practically feasible. Thus, if we prefer having the menu in multicolumn format, as we usually do, we must use other constructs.
One possibility is to format the menu by hand and enclose it into a PRE element. If the menu items are link texts, you should first format it as text only, then add the anchor (A) tags, since adding them obscures the layout. For clarity, therefore, the following example is presented without links (unlike the other alternatives):
<PRE>
Afghanistan          Albania       Algeria    American Samoa
Andorra              Angola        Anguilla   Antarctica
Antigua and Barbuda  Arctic Ocean  Argentina  Armenia
</PRE>
Another possibility, which should be the normal one, is to present the items simply as a text paragraph, using eg a blank or a blank and a comma as separator. This means that the browser takes care of dividing the text into lines and the presentation is very compact:
<BASE HREF="http://www.odci.gov/cia/publications/nsolo/factbook/">
<P>
<A HREF="af.htm">Afghanistan</A>,
<A HREF="al.htm">Albania</A>,
<A HREF="ag.htm">Algeria</A>,
<A HREF="aq.htm">American Samoa</A>,
<A HREF="an.htm">Andorra</A>,
<A HREF="ao.htm">Angola</A>,
<A HREF="av.htm">Anguilla</A>,
<A HREF="ay.htm">Antarctica</A>,
<A HREF="ac.htm">Antigua and Barbuda</A>,
<A HREF="ocat.htm">Arctic Ocean</A>,
<A HREF="ar.htm">Argentina</A>,
<A HREF="am.htm">Armenia</A>
</P>
Of course, it is possible to force line breaks by using a BR element (eg to make a change in the initial letter cause a new line in an example like above). If you think the items are not distinguishable enough in the rendering, consider prefixing each item with a special character like * (and using just spaces as separator).
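The two variations just mentioned could be combined as in the following sketch: each item is prefixed with an asterisk, and a BR element forces a new line where the initial letter changes. The entries beginning with B and their file names are assumptions added for illustration, and the relative URLs presume a BASE element like the one in the example above.

```html
<P>
* <A HREF="af.htm">Afghanistan</A>
* <A HREF="al.htm">Albania</A>
* <A HREF="ag.htm">Algeria</A><BR>
<!-- the initial letter changes here, so a line break is forced above -->
* <A HREF="bf.htm">The Bahamas</A>
* <A HREF="ba.htm">Bahrain</A>
</P>
```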
However, if for some reason the presentation must be such that all items occupy the same amount of space, then one can either use the PRE method described above or take the effort of designing a suitable TABLE element. Example:
<BASE HREF="http://www.odci.gov/cia/publications/nsolo/factbook/">
<TABLE><TR>
<TD WIDTH=160><A HREF="af.htm">Afghanistan</A></TD>
<TD WIDTH=160><A HREF="al.htm">Albania</A></TD>
<TD WIDTH=160><A HREF="ag.htm">Algeria</A></TD>
<TD WIDTH=160><A HREF="aq.htm">American Samoa</A></TD>
</TR><TR>
<TD WIDTH=160><A HREF="an.htm">Andorra</A></TD>
<TD WIDTH=160><A HREF="ao.htm">Angola</A></TD>
<TD WIDTH=160><A HREF="av.htm">Anguilla</A></TD>
<TD WIDTH=160><A HREF="ay.htm">Antarctica</A></TD>
</TR><TR>
<TD WIDTH=160><A HREF="ac.htm">Antigua and Barbuda</A></TD>
<TD WIDTH=160><A HREF="ocat.htm">Arctic Ocean</A></TD>
<TD WIDTH=160><A HREF="ar.htm">Argentina</A></TD>
<TD WIDTH=160><A HREF="am.htm">Armenia</A></TD>
</TR></TABLE>
Alternatively, you might wish to consider the effect of using a table with borders.
Notice that this solution is rather unclean. It involves a TABLE structure where the division into lines is (normally) made for layout purposes only, and adding new items usually requires complete restructuring of the table. You typically need to insert WIDTH attributes to ensure that table columns are of the same width, and the specification is inherently device-dependent since it must be given in pixels. In particular, the presentation might not be the desired one if the physical font size in pixels differs too much from what you think it should be.
Thus, this approach should be avoided in general. Hopefully future browsers will support the UL element in a more advanced way, automatically selecting a compact multicolumn presentation when applicable, or at least support the DIR element in the intended way.
case | neut. | masc. | fem. |
---|---|---|---|
nom. | id | is | ea |
acc. | id | eum | eam |
gen. | eius | eius | eius |
dat. | ei | ei | ei |
abl. | eo | eo | ea |
Obviously this calls for using a table in HTML, and using the above-explained constructs you can write a simple table presentation for the data. However, if you would like to make it more explicit that there are identical entries in adjacent cells, you can use the ROWSPAN and COLSPAN attributes as follows:
<TABLE BORDER=1 ALIGN=CENTER CELLPADDING=3>
<CAPTION>Declension of <I>is</I> in singular</CAPTION>
<TR><TH></TH><TH>neuter</TH><TH>masc.</TH><TH>fem.</TH></TR>
<TR><TH>nom.</TH><TD ROWSPAN=2 VALIGN=MIDDLE><I>id</I></TD>
<TD><I>is</I></TD><TD><I>ea</I></TD></TR>
<TR><TH>acc.</TH><TD><I>eum</I></TD><TD><I>eam</I></TD></TR>
<TR><TH>gen.</TH><TD COLSPAN=3 ALIGN=CENTER><I>eius</I></TD></TR>
<TR><TH>dat.</TH><TD COLSPAN=3 ALIGN=CENTER><I>ei</I></TD></TR>
<TR><TH>abl.</TH><TD COLSPAN=2 ALIGN=CENTER><I>eo</I></TD>
<TD><I>ea</I></TD></TR>
</TABLE>
For example, the first data cell is specified to have ROWSPAN=2, which effectively means that two adjacent cells in the same column are combined into one cell. Notice that when writing the HTML code for the next row (the second TR element after the header row) we simply leave out a cell element corresponding to the location which has already been taken into use.
Nested tables easily become confusing. Moreover, there are browsers which cannot handle nested tables in general or which get confused with complicated nested tables. Of course, nested tables can be the natural way of expressing information, when it is logically an array of something which may in turn be an array.
Basically you just need to be very careful in writing HTML code for nested tables. No new elements or other features are needed, just a combination of those which have already been described. But due to deep nesting one easily makes mistakes, and the results can be really messy, and locating the error may take time.
The simplest case is probably a table with a single row consisting of two cells, each of which contains a table. This might be used for presenting two similar tables in parallel for comparison. To proceed with our grammatical example, here is a table containing two tables, one for declension in singular and one for declension in plural:
<TITLE>tbl</TITLE>
<TABLE ALIGN=CENTER>
<CAPTION>Declension of <I>is</I></CAPTION>
<TR><TD>
<TABLE BORDER=1 ALIGN=CENTER CELLPADDING=3>
<CAPTION>Singular</CAPTION>
<TR><TH></TH><TH>neuter</TH><TH>masc.</TH><TH>fem.</TH></TR>
<TR><TH>nom.</TH><TD ROWSPAN=2 VALIGN=MIDDLE><I>id</I></TD>
<TD><I>is</I></TD><TD><I>ea</I></TD></TR>
<TR><TH>acc.</TH><TD><I>eum</I></TD><TD><I>eam</I></TD></TR>
<TR><TH>gen.</TH><TD COLSPAN=3 ALIGN=CENTER><I>eius</I></TD></TR>
<TR><TH>dat.</TH><TD COLSPAN=3 ALIGN=CENTER><I>ei</I></TD></TR>
<TR><TH>abl.</TH><TD COLSPAN=2 ALIGN=CENTER><I>eo</I></TD>
<TD><I>ea</I></TD></TR>
</TABLE>
</TD>
<TD>
<TABLE BORDER=1 ALIGN=CENTER CELLPADDING=3>
<CAPTION>Plural</CAPTION>
<TR><TH></TH><TH>neuter</TH><TH>masc.</TH><TH>fem.</TH></TR>
<TR><TH>nom.</TH><TD ROWSPAN=2 VALIGN=MIDDLE><I>ea</I></TD>
<TD><I>ii (ei)</I></TD><TD><I>eae</I></TD></TR>
<TR><TH>acc.</TH><TD><I>eos</I></TD><TD><I>eas</I></TD></TR>
<TR><TH>gen.</TH><TD COLSPAN=2 ALIGN=CENTER><I>eorum</I></TD>
<TD><I>earum</I></TD></TR>
<TR><TH>dat.</TH><TD COLSPAN=3 ROWSPAN=2 ALIGN=CENTER VALIGN=MIDDLE>
<I>iis (eis)</I></TD></TR>
<TR><TH>abl.</TH></TR>
</TABLE>
</TD></TR>
</TABLE>
Notice the explicit use of end tags like </TD>. The same code with omissible tags omitted is equivalent according to the HTML 3.2 specification, but Netscape has a bug which can make it present a nested table incorrectly in the absence of end tags.
The default alignment is the following:
There is no way to set different defaults for an entire table. (Although the TABLE element accepts an ALIGN attribute, it affects the positioning of the entire table!)
However, you can use the ALIGN and VALIGN attributes in TH and TD elements to set the alignments for an individual cell, and you can use the same attribute in a TR element to set the alignment defaults for the cells within that element (ie within one row); naturally, such defaults can be overridden in individual elements.
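The following sketch illustrates how a row-level default can be overridden in a single cell. The cell contents are invented for illustration.

```html
<TABLE BORDER=1>
<!-- defaults for every cell in this row: right-aligned, at the top -->
<TR ALIGN=RIGHT VALIGN=TOP>
<!-- this cell overrides the row default for horizontal alignment -->
<TD ALIGN=LEFT>elk</TD>
<!-- these cells use the alignments set in the TR element -->
<TD>1200</TD>
<TD>35</TD>
</TR>
</TABLE>
```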
The possible values of ALIGN (in TH, TD and TR elements) are LEFT, CENTER, and RIGHT, for aligning the contents of a cell horizontally with respect to the left, center or right of the space for the cell. Notice that when aligning to the left or right, there can still be some space between the contents and the left or right border of the cell, depending on the setting of the CELLPADDING attribute of the enclosing TABLE element.
The possible values of VALIGN (in TH, TD and TR elements) are TOP, MIDDLE, and BOTTOM, for aligning the contents of a cell vertically with respect to the top, center or bottom of the space for the cell. As stated above, the default is VALIGN=MIDDLE. Notice that when VALIGN=TOP or VALIGN=BOTTOM is used, there can still be some space between the contents and the upper or lower border of the cell, depending on the setting of the CELLPADDING attribute of the enclosing TABLE element.
The short answer is: Don't. When necessary, use logical markup for text elements within tables as well as elsewhere. (Previous discussion contained a simple example of this.)
Assuming that you really need to designate font face, size and color (or just insist on doing so), the laborious way of doing it elementwise is the only portable way. Here portable means that you can, with some confidence, expect the HTML code to work on most browsers (assuming that they have table support at all, of course). This is not just a standards issue. In particular, in Netscape the BASEFONT element does not affect text in tables (it is disputable whether it should, according to the standard).
To summarize the situation, as regards portable solutions in the above-mentioned sense:
Style sheets provide tools for affecting the rendering in a rather detailed manner, but support for them in browsers is still under development.
The basic idea of style sheets is to provide tools for specifying features of the visible (or audible) representation of HTML documents without introducing new HTML tags and attributes for the purpose. The presentation style is specified in a manner which allows several style specifications (by the author and by users, as well as browser defaults) to be taken into account when rendering a document. This will allow control over indentation, colors, fonts, etc in a sophisticated manner. For more information about style sheets in general, consult the W3C pages on style sheets and WDG pages on style sheets.
Almost at the same time as the HTML 3.2 Reference Specification was accepted as a W3C Recommendation, a recommendation with similar status was accepted concerning style sheets: Cascading Style Sheets, level 1, abbreviated CSS1. The two recommendations are, however, separate in the sense that the combination of style sheet specifications with HTML documents has not been defined exactly. In particular, CSS1 mentions the ID and CLASS attributes for selecting specific pieces of text, but these attributes are not in HTML 3.2. The same applies to the attributes of the STYLE element and to the proposed SPAN element.
The HTML 3.2 language provides two ways of referring to style sheets in HTML documents:
Additional methods of referring to style sheets in HTML will probably be possible, and some of them are already supported. For a short general discussion, see Linking Style Sheets to HTML by WDG. There is also a W3C Working Draft HTML3 and Style Sheets which discusses these issues.
An HTML 3.2 conforming browser need not support style sheets in any way (except by recognizing the STYLE element and hiding its contents). However, there is increasing support for some features of CSS1 in browsers.
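As a hedged sketch only, assuming a browser with some CSS1 support, a document head might refer to style sheets as follows. The file name mystyle.css and the particular CSS1 rules are hypothetical. No attributes are used in the STYLE element, since HTML 3.2 defines none for it.

```html
<HEAD>
<TITLE>Style sheet sketch</TITLE>
<!-- an external style sheet; mystyle.css is a hypothetical file name -->
<LINK REL="stylesheet" HREF="mystyle.css">
<!-- embedded rules; the comment delimiters inside STYLE are meant to keep
     browsers which do not recognize the element from displaying the rules -->
<STYLE>
<!--
H1 { color: navy }
P  { margin-left: 5% }
-->
</STYLE>
</HEAD>
```

A browser without style sheet support should simply ignore both constructs and render the document with its own defaults.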
The structure of the tag descriptions is as follows:
This presentation does not discuss the XMP, LISTING, and PLAINTEXT elements. They are now deprecated (obsolete), and PRE should be used instead.
In principle, the A element can also be used for some other purposes which are currently of little practical value.
The user may select the anchor text (in a browser-dependent manner, eg using arrow keys to move the cursor and the Enter key to select, or using the mouse to move the cursor and a mouse button click to select). In that case the document, or the location in a document, specified by the target will be fetched and presented to the user, provided that it exists and is accessible. A browser may allow the user to select whether the document is to be displayed in the same or in another window on the screen.
The visual look of anchor texts is settable by user options in many browsers. It can depend on whether the target has been visited by the user or not. It is also affected by any LINK and VLINK attributes in the BODY element. When a document is printed, anchor texts might appear, depending on the browser and its settings, as normal text or as underlined text, or footnotes (indicating the target URLs) might be attached to them.
If anchor text is (or contains) an IMG element, a browser generally indicates the image as a link by drawing a colored (typically blue) border around the image. The width (and existence) of such a border can be controlled by the BORDER attribute of the IMG element.
A elements which do not contain an HREF attribute have no effect on the rendering of a document.
or
<A NAME="name"></A>
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | string | a name for a link end | must be unique within the document; case sensitive |
HREF | URL | network address for the linked resource | could be another HTML document, a PDF file, an image etc |
REL | string | the forward relationship, also known as the "link type"; cf. LINK with REL | in principle, could be used by browsers in several ways, eg to determine how to deal with the linked resource when printing out a collection of linked resources |
REV | string | the reverse relationship | a link from document A to document B with REV=relation expresses the same relationship as a link from B to A with REL=relation |
TITLE | string | a title for the linked resource | advisory |
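The uniqueness requirement for NAME values can be checked mechanically. The following is a minimal sketch in Python, not something defined by HTML 3.2; the class and function names are hypothetical, and it uses only the standard html.parser module. Names are compared case-sensitively, in line with the table above.

```python
from collections import Counter
from html.parser import HTMLParser


class AnchorNameCollector(HTMLParser):
    """Collect the NAME attribute values of all A elements in a document."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lower case,
        # but leaves attribute values untouched.
        if tag == "a":
            for key, value in attrs:
                if key == "name" and value is not None:
                    self.names.append(value)


def duplicate_anchor_names(html_text):
    """Return the NAME values that occur more than once, sorted."""
    collector = AnchorNameCollector()
    collector.feed(html_text)
    collector.close()
    counts = Counter(collector.names)
    return sorted(name for name, count in counts.items() if count > 1)


sample = '<A NAME="final">x</A> <A NAME="final">y</A> <A NAME="Final">z</A>'
print(duplicate_anchor_names(sample))
```

In the sample, "final" is reported as a duplicate, whereas "Final" counts as a different name because the comparison is case sensitive.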
The value of a TITLE attribute might be used eg
mailto:
URL
<P>A hyperlink referring to a document in the same directory as the current one:
<A HREF="ADDRESS.html">Examples of using ADDRESS tag</A>.
<P>A hyperlink referring to a document elsewhere:
<A HREF="http://www.hut.fi/english.html">HUT</A>.
<P>A hyperlink in which the link text contains markup:
<a href="http://www.iki.fi/oa/HTML/"><cite>The HTML test set</cite></a>
<p>A hyperlink referring to a label in the same document:
<A HREF="#final">final example</A>.
<P>A hyperlink referring to a label in another document:
<A HREF="http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimerP2.html#UR"> URL info in HTML Primer</A>
<P>A link to an image:
<A HREF="http://www.hut.fi/~jkorpela/perhe.jpg" TITLE="Yucca's family picture, by Minna">a family picture</A>.
<P><A NAME="final">Finally, this is just text to which you can refer with a hyperlink.</A>
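How an HREF value splits into a document address and a label (the part after "#", matching an A NAME value in the target) can be illustrated with a short Python sketch. The function name link_target and the document address http://example.org/guide/A.html are hypothetical; urljoin and urldefrag are standard library functions, and real browsers may differ in details.

```python
from urllib.parse import urldefrag, urljoin


def link_target(document_url, href):
    """Return (document address, label) for an HREF found in document_url.

    The label is an empty string when the HREF has no "#" part.
    """
    absolute = urljoin(document_url, href)    # resolve relative references
    address, label = urldefrag(absolute)      # split off the part after "#"
    return address, label


current = "http://example.org/guide/A.html"   # hypothetical current document
for href in ("ADDRESS.html", "#final",
             "http://www.ncsa.uiuc.edu/General/Internet/WWW/HTMLPrimerP2.html#UR"):
    print(href, "->", link_target(current, href))
```

A reference like #final resolves to the current document itself, so the browser only has to move to the named location.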
As regards ISMAP, see the IMG examples.
It depends on the browser how references to resources like audio and video files are handled. If a browser supports them, it typically supports some particular repertoire of file formats by initiating ("launching") a separate program for "playing" the file. (It might use a distinct program for each file format or a general-purpose media player program for a large set of formats.) Thus, for example, in order to listen to .au files the user needs, in addition to suitable hardware, a program which can produce sounds according to specifications in .au format, and the user's browser must have settings which instruct it to launch that player program for .au files.
Don't use anchor texts like Click here. They look extremely stupid eg in a paper copy of a document. Warren Steel says in Hints for Web Authors:
You don't need to say "Click here for information on our graduate programs;" just insert the link into what you were saying: "Our excellent graduate programs ..." Links to large files or unusual formats should be so marked, perhaps in a parenthetical note. "Our stirring fight song (400k .au) ..."
You can make plain text and binary files of various formats available to other people alongside your HTML files, and you can describe them and provide links to them in your HTML documents. However, your server may not support the file format involved, so try to use some widely known format and the corresponding file name suffix; see also WDG Web Authoring FAQ, questions 5 and 6.
Of course, such links will be useful only to people who can use a program which processes the particular file format in a meaningful way. Processing might consist of displaying an image or animation, playing music, or doing some spreadsheet calculations, for example. This might take place within a browser or in a separate program launched automatically by a browser (when configured to do so), or "offline" so that the Web browser is used just to retrieve the file and to save it into a local file, to be opened later by an application.
Example:
The budget proposal is available as a <A HREF="budget.zip">zipped Excel file</A>
People using computers on which Excel is available will then be able to view your document in it. How smoothly this works depends on the browser and its settings. Of course they also need some program (eg WinZip) for unpacking a .zip file, but such software exists for almost all environments and should be installed anyway. There are two reasons for my suggesting the zipped format:
It is a rather common error to omit quotes or the closing quote in an HREF attribute. Some browsers are permissive, others may get very confused, so that the link may not work at all.
You cannot nest A elements, but you can write a dual-purpose A element which has both an HREF and a NAME attribute, eg
<A NAME="foo" HREF="#bar">zap</A>
It is not obvious exactly what entity an A NAME element names. The most natural interpretation seems to be that it is a part of the document, namely the part between the start and end tags. However, notice that only text elements are allowed within the contents and that most browsers seem to interpret things so that an A NAME element just names a location (a point) in the document, namely the location of the start tag, leaving the position of the end tag meaningless. (However, an end tag </A> is obligatory!)
It is syntactically legal to have an A element with empty content, such as <A NAME="foo"></A>, but this has been observed to confuse some browsers. The simple solution is to include a few words from the text in the A NAME element, eg
<P><A NAME="summary">To summarize</A>, it is legal but not advisable to have an A element with empty content.</P>
You can use a mailto: URL in the HREF attribute.
Example:
My E-mail address is <A HREF="mailto:Jukka.Korpela@hut.fi"> Jukka.Korpela@hut.fi</A>.
(Please avoid constructs like <A HREF="mailto:address">Mail me!</A> which are useless eg when reading a paper copy of the document.)
Selecting such a link typically means that the browser invokes an E-mail composer, with the recipient field prefilled. It is not possible to prefill other fields in any reliable way. Use forms instead of simple mailto: links if you want to prefill something.
<ADDRESS>
<P> Jukka.Korpela@hut.fi </P>
</ADDRESS>
One idea is to provide just the author's name but so that it is a link to a home page containing more information. This is typically suitable for short documents to be viewed on the screen only.
<ADDRESS>
<P> <A HREF="http://www.hut.fi/~jkorpela/">Jukka Korpela</A> </P>
</ADDRESS>
A longer, more typical example:
<ADDRESS>
<P>
Jukka Korpela, M.S. (Math.)<BR>
Helsinki University of Technology Computing Centre<BR>
FIN-02150 Espoo<BR>
Finland
</P><P>
Telephone International +358 9 451 4319
</P><P>
Electronic mail (Internet): <A HREF="mailto:Jukka.Korpela@hut.fi">Jukka.Korpela@hut.fi</A><BR>
WWW home page: <A HREF="http://www.hut.fi/%7Ejkorpela/">http://www.hut.fi/%7Ejkorpela/</A>
</P>
</ADDRESS>
NCSA Beginner's Guide to HTML says that the ADDRESS element "is not used for postal addresses", but the HTML 2.0 specification contains no such statement; on the contrary, its example of ADDRESS illustrates using it for a postal address.
Several browsers, including Netscape, do not use normal paragraph breaks when rendering ADDRESS. Therefore it is advisable to use explicit P tags around the address information, although they are in principle unnecessary. Since P is allowed within ADDRESS but not vice versa, use the same style as in the above examples.
It is advisable to obey applicable standards when writing address information. In particular, when providing telephone numbers, please apply CCITT recommendation E.123.
The ADDRESS tag itself creates no links; to provide eg a link to the author's home page or a mailto link to the author's E-mail address, use the normal A tag with an HREF attribute (within the ADDRESS structure or outside it); see also the META element and the LINK element with REV attribute.
Don't forget to add BR tags for line breaks.
attribute name | possible values | meaning | notes |
---|---|---|---|
CODEBASE | URL | the base URL of the applet; this typically refers to the directory or folder containing the code of the applet | default is the URL of the document |
CODE | string | class file, ie the name of the file that contains the compiled Applet subclass of the applet | obligatory; interpreted as relative to the base specified by the CODEBASE attribute; cannot be absolute |
ALT | string | a textual description, to be displayed in place of applet | the contents of the element can be used for the same purpose, with more flexibility |
NAME | string | a name for the applet instance | such names make it possible for applets in the same document to find (and communicate with) each other. |
WIDTH | integer | suggested width, in pixels, not counting any windows or dialogs which the applet brings up | obligatory |
HEIGHT | integer | suggested height, in pixels, not counting any windows or dialogs which the applet brings up | obligatory |
ALIGN | TOP, MIDDLE, BOTTOM, LEFT, RIGHT | positioning of the applet display area | similar to ALIGN attribute of IMG |
HSPACE | integer | suggested horizontal gutter (width of white space to the immediate left and right of the applet display area), in pixels | cf. the HSPACE attribute of IMG |
VSPACE | integer | suggested vertical gutter (height of white space above and below the applet display area), in pixels | cf. the VSPACE attribute of IMG |
The exact meaning and intended use of text elements in the contents is somewhat obscure. The following is the wording of the HTML 3.2 Reference Specification:
Following the PARAM elements, the content of APPLET elements should be used to provide an alternative to the applet for user agents that don't support Java. - - Java-compatible browsers ignore this extra HTML code. You can use it to show a snapshot of the applet running, with text explaining what the applet does. Other possibilities for this area are a link to a page that is more useful for the Java-ignorant browser, or text that taunts the user for not having a Java-compatible browser.
Notice that text elements in the contents and the ALT attribute in the start tag are two ways of having something displayed in place of the applet. There are two differences: the value of ALT is a plain string, whereas the elements may contain text markup; and an ALT attribute has no effect if the browser does not know the APPLET element at all, whereas such a browser probably processes the text elements in the contents - it simply ignores the APPLET (and PARAM) start and end tags.
A simple example:
<APPLET CODE="Bubbles.class" WIDTH=500 HEIGHT=500 ALIGN=MIDDLE>
Java applet that draws animated bubbles.
</APPLET>
A more complicated example, using parameter passing (PARAM element):
<APPLET CODE="AudioItem" WIDTH=15 HEIGHT=15 ALIGN=TOP>
<PARAM NAME=snd VALUE="Hello.au|Welcome.au">
Java applet that plays a welcoming sound.
</APPLET>
A further example, making use of CODEBASE:
<APPLET CODEBASE="applets/NervousText" CODE="NervousText.class" WIDTH=300 HEIGHT=50>
<PARAM NAME=TEXT VALUE="Java is Cool!">
<IMG SRC="sorry.gif" ALT="This looks better with Java support">
</APPLET>
To help the user, a browser may display, in the status line, the contents of the ALT attribute as the mouse or other pointing device is moved over an area.
attribute name | possible values | meaning | notes |
---|---|---|---|
SHAPE | RECT, CIRCLE, POLY | shape of the area | default is RECT |
COORDS | string of a form which depends on SHAPE | coordinates for the area | obligatory except for defaulted SHAPE |
HREF | URL | address of a document | acts as a hypertext link |
NOHREF | NOHREF | means that this region has no action | useful when you want to cut a hole in a hotzone region |
ALT | string | textual description of the area | obligatory |
The meanings of the SHAPE values and the syntax and semantics of COORDS for each shape are as follows:
SHAPE value | form of area | syntax of COORDS | meaning of COORDS |
---|---|---|---|
SHAPE=RECT | rectangle | COORDS="x1,y1,x2,y2" | the x and y coordinates of the upper left and lower right corner |
SHAPE=CIRCLE | circle | COORDS="x0,y0,r" | the x and y coordinates of the center and length of the radius |
SHAPE=POLY | polygon | COORDS="x1,y1,x2,y2,x3,y3,..." | the x and y coordinates of the vertices |
The x and y coordinate values are measured in pixels from the upper left corner of the associated image. This means that the y values increase downwards.
Alternatively, an x or y can also be specified as a percentage, with the percent sign appended to a number, to be interpreted as a percentage of the width or height of the image, respectively. Example:
SHAPE=RECT COORDS="0, 0, 50%, 100%"
Examples of various shapes:
SHAPE=RECT COORDS="0,0,9,9" | a rectangle of 10 by 10 pixels in the top left corner of the image |
SHAPE=CIRCLE COORDS="10,10,5" | a circle with radius of 5 pixels and center at location (10,10) |
SHAPE=POLY COORDS="10,50,15,20,20,50" | a polygon (in this case, a triangle) with vertices at (10,50), (15,20), and (20,50) |
<AREA HREF="guide.html" ALT="Guide" COORDS="0,0,118,28">
A draft version of HTML 3.2 contained DEFAULT as a possible value of SHAPE, to be used to specify what happens if the user selects a point which does not belong to any area specified in other AREA elements. This was removed. The same effect can be achieved by using SHAPE=RECT COORDS="0,0,100%,100%". Such an AREA element should be the last one within a MAP element, for the reason explained above.
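The coordinate rules for the three shapes, and the advice to put a catch-all AREA last, can be illustrated with a small hit-testing routine. This is only a sketch in Python of how a browser might pick the area for a selected point, relying on the HTML 3.2 rule that, when regions overlap, the one defined first takes precedence. The function names, the 200 by 100 pixel image and the target default.html are hypothetical, and resolving a percentage radius against the image width is a simplification of mine.

```python
import math


def _to_pixels(item, extent):
    """Convert one COORDS item to pixels; 'N%' is taken relative to extent."""
    item = item.strip()
    if item.endswith("%"):
        return float(item[:-1]) * extent / 100.0
    return float(item)


def parse_coords(coords, width, height):
    """Split a COORDS string; items alternate between x and y values."""
    return [_to_pixels(item, width if i % 2 == 0 else height)
            for i, item in enumerate(coords.split(","))]


def contains(shape, coords, x, y, width, height):
    """Tell whether point (x, y) lies in the area; y grows downwards."""
    c = parse_coords(coords, width, height)
    shape = shape.upper()
    if shape == "RECT":
        x1, y1, x2, y2 = c              # upper left and lower right corner
        return x1 <= x <= x2 and y1 <= y <= y2
    if shape == "CIRCLE":
        x0, y0, r = c                   # center and radius
        return math.hypot(x - x0, y - y0) <= r
    if shape == "POLY":
        points = list(zip(c[0::2], c[1::2]))
        inside = False
        # Ray casting: count the edges crossed by a ray going right from (x, y).
        for (ax, ay), (bx, by) in zip(points, points[1:] + points[:1]):
            if (ay > y) != (by > y):
                cross_x = ax + (y - ay) * (bx - ax) / (by - ay)
                if x < cross_x:
                    inside = not inside
        return inside
    raise ValueError("unknown SHAPE: " + shape)


def find_area(areas, x, y, width, height):
    """Return the HREF of the first area containing (x, y), or None."""
    for shape, coords, href in areas:
        if contains(shape, coords, x, y, width, height):
            return href
    return None


areas = [
    ("RECT", "0,0,118,28", "guide.html"),
    ("RECT", "0,0,100%,100%", "default.html"),   # catch-all, placed last
]
print(find_area(areas, 50, 10, 200, 100))
print(find_area(areas, 150, 90, 200, 100))
```

If the catch-all rectangle were listed first, it would match every point and the more specific area would never be selected.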
The ALT attribute is used to provide text labels which can be displayed in the status line as the mouse or other pointing device is moved over hotzones, or for constructing a textual menu for non-graphical user agents. Authors are strongly recommended to provide meaningful ALT attributes to support interoperability with speech-based or text-only user agents. But notice that the value must be just a string with no text markup.
Compare <B>bolded text</B> with normal text.
See general notes on text markup, which provide additional examples.
For example, given
<BASE href="http://foo.com/index.html">
the IMG element
<IMG SRC="images/bar.gif">
refers to image
http://foo.com/images/bar.gif
attribute name | possible values | meaning | notes |
---|---|---|---|
HREF | URL | base URL to be used | obligatory; must be absolute |
<BASE HREF="http://www.hut.fi/%7ejkorpela/">
This implies that eg the link
<A HREF="lists.html">list examples</A>
is equivalent to
<A HREF="http://www.hut.fi/%7ejkorpela/lists.html">list examples</A>
Since only one BASE element per document is allowed, you cannot have different base URLs in different parts of an HTML file.
In the absence of a BASE element in a document, the URL of the document itself is the base URL within it. (This is not necessarily the same as the URL used to request the document, since the base URL may be overridden by an HTTP header accompanying the document.)
It is advisable to enclose the URL in quotes, although this is not always mandatory.
Don't forget the slash "/". Anything that follows the last slash in the URL in a BASE element is interpreted as belonging to the filename part and ignored.
The following is equivalent to the BASE element in the example above:
<BASE HREF="http://www.hut.fi/%7ejkorpela/foobar">
whereas the following are equivalent to each other, so the meaning of the first one is probably not what was intended:
<BASE HREF="http://www.hut.fi/%7ejkorpela">
<BASE HREF="http://www.hut.fi/">
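The effect of the trailing slash can be checked with Python's standard urljoin function, which resolves a relative URL against a base URL in essentially the way described here. The wrapper name resolve_against_base is hypothetical, and a browser's own resolver may differ in corner cases.

```python
from urllib.parse import urljoin


def resolve_against_base(base_href, relative_url):
    """Resolve relative_url the way a link is resolved against BASE HREF."""
    return urljoin(base_href, relative_url)


bases = [
    "http://www.hut.fi/%7ejkorpela/",        # intended base
    "http://www.hut.fi/%7ejkorpela/foobar",  # "foobar" is ignored
    "http://www.hut.fi/%7ejkorpela",         # slash forgotten
    "http://www.hut.fi/",
]
for base in bases:
    print(base, "->", resolve_against_base(base, "lists.html"))
```

The first two bases yield http://www.hut.fi/%7ejkorpela/lists.html, whereas the last two both yield http://www.hut.fi/lists.html, since everything after the last slash of the base is dropped.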
It is not obvious whether BASEFONT applies to tables. In Netscape, for example, BASEFONT does not affect the font size within tables. (Thus, to affect the font size within tables you must insert font changing elements into each cell!)
The actual font sizes used depend on the browser. See rendering notes about the FONT element.
attribute name | possible values | meaning |
---|---|---|
SIZE | string | size of the font (1 - 7) |
It is not obvious from the HTML 3.2 Reference Specification whether the SIZE attribute here follows the same rules as in the FONT element or has to be just an unsigned integer.
<P>This is text with default font size (3).</P>
<BASEFONT SIZE=5>
<P>This is text with font size 5 with <FONT SIZE=1>some text</FONT> inserted with font size 1.</P>
Use FONT or, preferably, SMALL or BIG to set font size locally (but notice that paragraph breaks are not allowed within FONT).
BASEFONT can be regarded as a global counterpart for FONT with SIZE. In a sense, BODY with TEXT is a global counterpart for FONT with COLOR.
That was a <BIG>big</BIG> mistake!
See general notes on text markup, which provide additional examples.
It is unspecified what happens if BIG elements are nested; it might or might not result in using a font which is larger than you get with a single BIG.
The FONT element may provide more alternatives for specifying different font sizes.
<P>The original context of the saying <I>O tempora, o mores</I> is the following:</P>
<BLOCKQUOTE>
<P>
O tempora, o mores! Senatus haec intellegit. consul videt; hic tamen vivit. Vivit? immo vero etiam in senatum venit, fit publici consilii particeps, notat et designat oculis ad caedem unum quemque nostrum.
</P>
<P ALIGN=RIGHT>
<A HREF="http://www.dla.utexas.edu/depts/classics/documents/Cic.html"> Cicero</A>,
<A HREF="http://www.dla.utexas.edu/depts/classics/documents/cat1.html"> <CITE>Oratio in Catilinam Prima</CITE></A>, 2
</P>
</BLOCKQUOTE>
Since BLOCKQUOTE is a block element, it is normally used for relatively long quotations. As regards short quotations to be presented with no paragraph breaks around them, present them using text level markup. In special cases, you might use CODE, SAMP, KBD or CITE, but in the general case you have to resort to specifying the physical presentation, eg using italics (I element) or quotes according to your preferences and the norms of the language you use. (There is no generic text-level element for quotations in HTML 3.2, mainly because the rules for presenting such quotations are different in different languages.)
If it is essential to have the text displayed as it is written (with respect to division into lines and the use of blanks and tabs), consider using PRE.
When describing man-machine interaction, use the specific elements CODE, SAMP and KBD for quotations of program code, program output, and keyboard input.
Do not use BLOCKQUOTE to achieve indentation. A browser may or may not use indentation to present BLOCKQUOTE.
It is good manners to specify the source of a quotation in some suitable way. In several cases this is even required by law (copyright legislation). If possible, provide a hyperlink to the source document on the Web in addition to specifying the source in the text.
The BLOCKQUOTE element itself provides no structured way of presenting source information. The example above presents one method of doing so.
If you do not like the font used by browsers for BLOCKQUOTE, there is not very much to be done; however, style sheets may change this. If you wish to force eg an italic font to be used (if possible) by using the I element, remember that as a text element it does not allow eg paragraph breaks (or a BLOCKQUOTE) within it, so you must use a separate I element within each paragraph (P element).
As an exception to quotations being exact reproductions of the quoted text, you may leave out words which are irrelevant in the context of the quotation even if they appear in the middle of the quoted text; in such cases you should indicate the omission clearly (the notations - - and ... are the most common ways of doing this). Be very careful in such omissions; it is easy, but quite inappropriate, to quote someone selectively so that he seems to say something very different from what he really said - perhaps even just the opposite. As another exception, when necessary you may add clarifying words but only to convey the original meaning appropriately, not to change it to conform to your own thoughts. Typically, you add the correlate of a pronoun like it. You should clearly indicate such clarifications as not being part of the original; the most common way to do this is to put them into square brackets.
attribute name | possible values | meaning |
---|---|---|
BGCOLOR | color specification | background color for the document |
TEXT | color specification | color for the text of the document |
LINK | color specification | color for unvisited hypertext links |
VLINK | color specification | color for visited hypertext links |
ALINK | color specification | color for active hypertext links; used to stroke the text for a link at the moment the user selects (eg clicks on) the link |
BACKGROUND | URL | URL for an image to be used to tile the background. |
<BODY>
<H1>Sample document</H1>
<P>
This is just a trivial sample document. Its body contains first a heading, then a paragraph, and nothing else.
</P>
</BODY>

<BODY BGCOLOR=AQUA TEXT="#848484" LINK=RED VLINK=PURPLE ALINK=GREEN >
<H1>Sample document</H1>
<P>
This is also a trivial sample document. Its body contains first a heading, then a paragraph, and then a paragraph containing a link. However, the BODY element uses attributes to affect the visual rendering.
</P>
<P>
This document was written by <A HREF="http://www.hut.fi/~jkorpela/">Jukka Korpela</A>.
</P>
</BODY>

<BODY TEXT=BLUE LINK=RED VLINK=BLUE ALINK=PINK BACKGROUND="http://www.hut.fi/~jkorpela/HTML3.2/wave.gif" >
<H1>Sample document</H1>
<P>
This document contains first a heading, then a paragraph, and then a paragraph containing a link. However, the BODY element uses attributes to affect the visual rendering, including a background image.
</P>
<P>
This document was written by <A HREF="http://www.hut.fi/~jkorpela/">Jukka Korpela</A>.
</P>
</BODY>
Be careful when playing with background images and colors. What looks cool on your screen might be disgusting on some other (or in someone else's opinion).
If you set some of the attributes BGCOLOR, TEXT, LINK, VLINK and ALINK, set them all. Otherwise eg your specified background color might coincide with the user's default color for text.
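The "set them all" rule is easy to check automatically. The following Python sketch is merely an illustration of such a check; the function name and the representation of the BODY attributes as a dictionary are my assumptions, not anything defined by HTML 3.2.

```python
COLOR_ATTRIBUTES = ("BGCOLOR", "TEXT", "LINK", "VLINK", "ALINK")


def missing_color_attributes(body_attrs):
    """Return the color attributes that should be added to a BODY tag.

    body_attrs maps attribute names (in any letter case) to values.
    If no color attribute is set at all, nothing is reported, since
    the user's own defaults then apply consistently.
    """
    present = {name.upper() for name in body_attrs}
    if not present & set(COLOR_ATTRIBUTES):
        return []
    return [name for name in COLOR_ATTRIBUTES if name not in present]


print(missing_color_attributes({"BGCOLOR": "AQUA"}))
print(missing_color_attributes({"BACKGROUND": "wave.gif", "TEXT": "BLUE"}))
```

Notice that a BODY tag which sets BACKGROUND and TEXT but not BGCOLOR is also reported: if the background image cannot be loaded, the text color meets the user's default background color.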
Select the text color so that it works together with the background color or the colors of the background image. For instance, red on green can cause serious problems, because a significant number of people have difficulties in distinguishing them.
The text color can be affected locally by FONT elements with COLOR attribute. Background color cannot be set locally in HTML 3.2; if you want to use different backgrounds, you have to write separate HTML files (or use style sheets).
You can set both BGCOLOR and BACKGROUND. If you do, browsers typically give preference to BACKGROUND, but if the background image cannot be loaded, BGCOLOR is used.
attribute name | possible values | meaning | notes |
---|---|---|---|
CLEAR | LEFT, RIGHT, ALL, NONE | control of text flow | default is NONE |
The attribute can be used to move down past floating images on either margin. <BR CLEAR=LEFT> moves down past floating images on the left margin, <BR CLEAR=RIGHT> does the same for floating images on the right margin, while <BR CLEAR=ALL> does the same for such images on both left and right margins.
<P>
You should always end the terminal session with the command
<BR> logout <BR>
or some other operation with the same effect.
</P>
The BR element can be used to simulate subparagraphs as explained in the description of the P element.
BR elements with CLEAR attribute are often needed when embedded images are used; see the description of the IMG element.
Some people use multiple BR elements to force whitespace. This need not work in all browsers. If you wish to force empty vertical space, consider using a suitable PRE element.
Usually the caption is horizontally centered. (HTML 3.2 provides no tool for changing the browser behavior in this respect.)
attribute name | possible values | meaning | notes |
---|---|---|---|
ALIGN | TOP, BOTTOM | placement of the caption relative to the table | usually the default is TOP |
<CAPTION>Summary of measurement results</CAPTION>
<CAPTION><EM>Mean temperatures</EM></CAPTION>
See the discussion of tables, which contains additional examples, too.
Some browsers (eg Netscape) do not render the caption in a visually distinctive manner. Using phrase markup such as EM or STRONG within the CAPTION element may therefore be desirable.
<P>
This is a normal paragraph which will be rendered according to default alignments, which usually means left alignment.
</P>
<CENTER>
<P>
This is text which will be centered.
</P>
<P>
This is a longer text paragraph which will be centered. It is so long that line breaks will most probably occur. Notice that the division into lines is usually not the same as in the HTML file.
</P>
</CENTER>
CENTER is defined as equivalent to DIV with ALIGN=CENTER. CENTER was introduced by Netscape before they added support for the DIV element. It is retained in HTML 3.2 on account of its widespread deployment.
Since CENTER is a block element, it terminates an open P element (ie causes the browser to assume an implied </P> tag when necessary). Other than this, user agents are not expected to render paragraph breaks before and after CENTER elements. If paragraph breaks are desired, you can use the P element with an ALIGN attribute instead.
I learned this from <CITE>The Origin of Species</CITE>.
Accepting this, the question arises how quotations are to be presented within text. (For quotations to be presented as separate paragraphs, or even sequences of paragraphs, BLOCKQUOTE is the natural choice.) You can either use quotation marks according to the rules of the language in which your own document is written, or some other suitable method, such as italics, ie the I element. The latter is often suitable for very short (eg single-word) quotations.
Expressions like <CODE>a[i++] + b[i++]</CODE> should not be used, since they cause undefined behavior.
See also notes on presenting interaction with computer and general remarks on phrase elements.
The end tag </DD> can always be omitted, and it usually is omitted.
<DD>See RFC 822.</DD>
For more realistic examples, see the description of the DL element.
See also general notes on rendering markup.
<DFN>Ichthyology</DFN> is the branch of natural science which studies fish.
The HTML 2.0 specification does not include DFN but mentions it as an element which "has been deployed to some extent".
See also general remarks on phrase elements.
In theory, the recommendation has been and still is that a DIR element be rendered as a multicolumn directory list.
attribute name | possible values | meaning |
---|---|---|
COMPACT | COMPACT | reduced inter-item spacing |
<DIR>
<LI>one
<LI>two
<LI>three
</DIR>
A larger list of very small elements (typically this is not rendered in a suitable manner):
<DIR>
<LI>A<LI>B<LI>C<LI>D<LI>E<LI>F<LI>G<LI>H<LI>I<LI>J<LI>K<LI>L<LI>M
<LI>N<LI>O<LI>P<LI>Q<LI>R<LI>S<LI>T<LI>U<LI>V<LI>W<LI>X<LI>Y<LI>Z
</DIR>
See also Examples of various list elements in HTML.
attribute name | possible values | meaning |
---|---|---|
ALIGN | LEFT, CENTER, RIGHT | alignment of text within the element |
The ALIGN attribute specifies the default alignment; it can be overridden by ALIGN attributes in enclosed elements (eg P elements).
<P>
This is a normal paragraph which will be rendered according to default alignments, which usually means left alignment.
</P>
<DIV ALIGN=CENTER>
<P>
This is text which will be centered.
</P>
<P>
This is a longer text paragraph which will be centered. It is so long that line breaks will most probably occur. Notice that the division into lines is usually not the same as in the HTML file.
</P>
</DIV>
The following example shows how to present (poetic) text as centered and with a particular division into lines:
<DIV ALIGN=CENTER>
Mieleni minun tekevi<BR>
aivoni ajattelevi<BR>
lähteäni laulamahan<BR>
saa'ani sanelemahan.<BR>
<P ALIGN=RIGHT><CITE>Kalevala</CITE></P>
</DIV>
Since DIV is a block-like element, it terminates an open P element (ie causes the browser to assume an implied </P> tag when necessary). Other than this, user agents are not expected to render paragraph breaks before and after DIV elements. If paragraph breaks are desired, you can use the P element with an ALIGN attribute instead.
attribute name | possible values | meaning |
---|---|---|
COMPACT | COMPACT | more compact style of rendering |
Normally you have pairs of DT and DD elements, of course. Multiple DT elements may be paired with a single DD element; this means that several terms share the same definition. A document should not contain multiple consecutive DD elements.
<DL>
<DT>Recursion, indirect
<DD>See <I>indirect recursion</I>.
<DT>Indirect recursion
<DD>See <I>recursion, indirect</I>.
</DL>
See also: Examples of various list elements in HTML.
You can use a TABLE element instead of an UL element (but remember that not all browsers support tables). See general notes about list elements.
The end tag </DT> can always be omitted, and it usually is omitted.
<DT>Terminus technicus.</DT>
For more realistic examples, see the description of the DL element.
The EM element is <EM>logical</EM> markup as opposed to <EM>physical</EM> markup such as the I element.
You can use STRONG for stronger emphasis.
See also general remarks on phrase elements.
A browser may provide a user option for defining which font is to be used and which physical font size shall be used to correspond to the default font size (3) in HTML. Setting the font size in HTML may decrease or increase the actual font size used, in a browser dependent manner.
or
<FONT COLOR=colorspec>text</FONT>
attribute name | possible values | meaning | notes |
---|---|---|---|
SIZE | string | size of the font, either a number in the range 1 - 7 or a signed integer like "+1" or "-2" | signed value is added to the current base font size as set by BASEFONT to produce a size number in the range 1 - 7 |
COLOR | color specification | color to be used for the contents | might clash with background color! |
Some user agents also support a FACE attribute which accepts a comma separated list of font names in order of preference. This is used to search for an installed font with the corresponding name.
This is some text <FONT SIZE=-1>including text which may appear in a smaller font</FONT>.
<P>
This is an attempt to present one <B><U><FONT SIZE=7 COLOR=RED>word</FONT></U></B> very prominently: in bold face, underlined, in the largest font available, and in red.
Use BASEFONT to set font size for a large part of the document. (Notice that paragraph breaks are not allowed within FONT.)
The attributes in the BODY tag can be used to set the background color or the default text font color or both. Of course you should not use the background color for text!
A browser need not implement FONT so that SIZE values 1 - 7 all correspond to different font sizes. The implementation of FONT on some popular browsers is as follows:
You may wish to use a separate file for checking the visual appearance of the different markup elements on your browser to see how it displays different font sizes. Consult information about color specifications for color samples, or a separate file containing text in 16 colors corresponding to the predefined color names.
There are two kinds of relativity involved in font sizes. First, in HTML we refer to font sizes with numbers in the range 1 - 7 which are in some browser and device dependent manner mapped to physical sizes (expressed eg in pixels, points or millimeters). The mapping is usually not linear; you should not assume that eg font size 3 is half of font size 6. Second, the way in which the font size (in the HTML meaning) is specified in the SIZE attribute can be relative; for instance, SIZE="+1" (which is quite different from SIZE="1" or SIZE=1) means the current base font size plus one, and the sum itself is relative in the sense explained above.
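The second kind of relativity, the difference between a signed and an unsigned SIZE value, can be sketched in a few lines of Python. The function name is hypothetical, and clamping an out-of-range result to the range 1 - 7 is my assumption about how a browser would keep the size within that range; the mapping from the resulting number to a physical size remains browser dependent.

```python
def effective_font_size(size_attr, base_size=3):
    """Compute the HTML font size number (1 - 7) for a FONT SIZE value.

    A value with a leading sign is added to the base font size (as set
    by BASEFONT, default 3); an unsigned value is used as such.
    """
    text = str(size_attr).strip()
    if text.startswith(("+", "-")):
        size = base_size + int(text)     # relative: SIZE="+1", SIZE="-2"
    else:
        size = int(text)                 # absolute: SIZE="1", SIZE=1
    return max(1, min(7, size))          # assumption: clamp to 1 - 7


for value in ("+1", "1", "-2"):
    print(value, "->", effective_font_size(value))
```

With the default base font size 3, SIZE="+1" gives size 4, whereas SIZE="1" gives size 1.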
attribute name | possible values | meaning | notes |
---|---|---|---|
ACTION | URL | address of the server-side form handler | an HTTP server (typically, a CGI script) or a mailto: URL (which is not supported by all browsers) |
METHOD | GET, POST | HTTP method (as defined in the HTTP specification) to be used to send the contents of the form to the server (when the ACTION attribute specifies an HTTP server) | default is GET |
ENCTYPE | string | media type used to encode the contents of the form | default is application/x-www-form-urlencoded |
Notice in particular that there are some elements which may only appear within a FORM element. They can be used for various purposes as follows:
mailto: link (using A element), but it hopefully illustrates the structure of form specifications in a very simple case.
Tell me what you think about my document: <FORM ACTION="http://www.hut.fi/cgi-bin/mailto?Jukka.Korpela@hut.fi" METHOD=POST> <TEXTAREA ROWS=5 COLS=72 NAME=Comments></TEXTAREA> <P> <INPUT TYPE=SUBMIT VALUE=Send> </FORM>
The example above, as well as the two other examples below, uses a simple CGI script named mailto (not to be mixed up with mailto URLs!) and accessible using a URL of the form http://www.hut.fi/cgi-bin/mailto?addr where addr is an E-mail address. This particular CGI script has been coded to send the contents of the form as an E-mail message containing name-value pairs in a format which is both legible by humans and easy to process automatically. You can test these forms if you like, but please notice that they really send your message to the author; and please do not copy the ACTION attribute into a form of your own, since the service referred to is not intended to be a public service. (There are such public forms services elsewhere.)
The following more complicated example contains, in addition to an area for free text input, a selection menu. This might be a good way of getting evaluations, since for many people it is easier to fill in a simple form than to write free comments.
<FORM ACTION="http://www.hut.fi/cgi-bin/mailto?Jukka.Korpela@hut.fi" METHOD=POST> Please tell your opinion about the overall quality of this document: <SELECT NAME=evaluation> <OPTION>No opinion <OPTION>Very poor <OPTION>Rather poor <OPTION>Average <OPTION>Rather good <OPTION>Very good </SELECT> <P> You can also be more specific by writing a few comments: <TEXTAREA NAME=Comments ROWS=5 COLS=72></TEXTAREA> <P> <INPUT TYPE=SUBMIT VALUE=Send> </FORM>
One more example:
This is a form for sending your personal evaluation of the document <CITE>Learning HTML by Examples</CITE> as a whole. <FORM ACTION= "http://www.hut.fi/cgi-bin/mailto?Jukka.Korpela@hut.fi" METHOD="POST"> <P> Your home page URL (if any): <INPUT TYPE=TEXT SIZE=30 NAME=Home VALUE="http://"> </P><P> Please rate the overall <EM>usefulness</EM> of the document (to you):<BR> <INPUT TYPE=RADIO NAME=Useful VALUE="Very little">Very little (or none)<BR> <INPUT TYPE=RADIO NAME=Useful VALUE="Little">Little<BR> <INPUT TYPE=RADIO NAME=Useful VALUE="Some">Some<BR> <INPUT TYPE=RADIO NAME=Useful VALUE="Great">Great<BR> <INPUT TYPE=RADIO NAME=Useful VALUE="Very great">Very great </P><P> What about general <EM>understandability</EM>? <SELECT NAME=Understandability> <OPTION VALUE=undef>(No opinion) <OPTION VALUE=verydifficult>Very difficult <OPTION VALUE=difficult>Difficult <OPTION VALUE=avg>Average <OPTION VALUE=easy>Easy <OPTION VALUE=veryeasy>Very easy </SELECT> </P><P>Please feel free to add any comments you like:<BR> <TEXTAREA ROWS=5 COLS=72 NAME=Comments></TEXTAREA> <INPUT TYPE=HIDDEN NAME="Via" VALUE="FORM-3"> </P><P> <INPUT TYPE=CHECKBOX NAME="*** Response requested! ***"> Would appreciate a personal answer; E-mail address: <INPUT TYPE=TEXT SIZE=25 NAME=From> </P> <P>When you are finished with filling the form, select this: <INPUT TYPE=SUBMIT VALUE=Send></P> </FORM> You should get a response saying that a message was sent to Jukka.Korpela@hut.fi. If you want to get back to the page from which you came to this form, please use the "Back" function of your browser twice.
Notice the use of a HIDDEN field named Via. It is invisible to users filling in the form but allows the recipient of the E-mail message to recognize the origin (form) from which the message was generated.
In general, you need a CGI script in order to use HTML forms. See eg Introduction to the Common Gateway Interface (CGI) and CGI Programming FAQ. Writing CGI scripts requires more knowledge about programming than most HTML authors are willing to acquire. Moreover, Web server maintainers may have strict policies on CGI scripts for security reasons. Thus, please consult your local Web server documentation or contact your local webmaster for information about CGI scripts made available at your site, read their documentation, and write your forms so that you take into account the requirements of the script you have chosen to use.
If you cannot find a locally available CGI script that suits your needs, you may wish to consider using a CGI on a remote server. There are some services which allow you to use CGI scripts on their site, usually for some fee, but there are also free services.
Although the HTML 3.2 specification allows the ACTION attribute to refer to a mailto: URL, providing an easy way of creating forms for submitting information via E-mail, notice that this facility is not supported by all browsers. For example, a browser might just invoke its internal E-mail composer from scratch, ignoring the way in which the form has been filled! (This applies to Internet Explorer 3.0, for example.) Moreover, even if a browser supports this feature, the generated E-mail message is in the x-www-form-urlencoded form (which is confusing although not completely illegible). To summarize, avoid using an ACTION which refers to a mailto: URL.
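To give an idea of what the x-www-form-urlencoded form looks like and how a recipient could decode it, here is a minimal Python sketch. The function name `decode_form_body` and the sample message body are hypothetical; the field names are borrowed from the form examples above.

```python
from urllib.parse import parse_qsl


def decode_form_body(body):
    """Decode an x-www-form-urlencoded string into (name, value) pairs."""
    # keep_blank_values=True keeps fields that the user left empty.
    return parse_qsl(body, keep_blank_values=True)


# Hypothetical body of a message generated by a form with two fields.
raw_body = "evaluation=Very+good&Comments=Nice+work%21"

for name, value in decode_form_body(raw_body):
    print(name + ": " + value)
```

In the raw form, spaces appear as "+" and other special characters as "%" followed by two hexadecimal digits, which is why the undecoded message is confusing to read.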
You can have more than one form in the same document.
The ISINDEX element predates the FORM element and was used for simple keyword searches.
where n is 1, 2, 3, 4, 5, or 6.
attribute name | possible values | meaning |
---|---|---|
ALIGN | LEFT, CENTER, RIGHT | alignment of the heading |
The default is left alignment, but this can be overridden by an enclosing DIV or CENTER element.
<H1>Notes on General Relativity</H1>
<H1 ALIGN=CENTER>The story of my life</H1> <H2>Preface</H2> <H3>General remarks</H3>
There is a separate file which contains headings of all levels.
Avoid using H5 and H6 at all. More than four levels of headings are rarely needed, and popular browsers may display H5 and H6 in a manner which is less prominent than normal text!
See general structure recommendations for a detailed suggestion on heading usage.
In particular, don't use eg H5 or H6 to cause text to be presented in a small font just because some browsers present them so. Other browsers - or even future versions of those browsers - may well adopt the more reasonable view that even the lowest level headings should be presented at least as prominently as normal text. If small font is what you really want, use the SMALL (or FONT) element.
Since heading elements are intended to be presented prominently by a browser, don't make them very long. Normally you should not try to add anything to the presentation by using text markup within the heading text. It is the job of a browser to present headings as headings. And for the same reason you should not write a heading in all upper case.
It might be a good idea to make every heading an anchor, ie a possible target of a link. Example:
Other people (or you) may then link to specific sections in your document, not just to the document as a whole. Notice that you must put the A element within the heading element, not vice versa.
Both the start and end tags can be omitted.
<HEAD> <TITLE>Getting started with Perl</TITLE> </HEAD>
In a speech based user agent, the tag could be rendered as a pause.
attribute name | possible values | meaning | notes |
---|---|---|---|
ALIGN | LEFT, RIGHT, CENTER | horizontal alignment of the rule | default is CENTER |
NOSHADE | NOSHADE | requests the rule to be rendered in a solid color | as opposed to the traditional two-color "groove" |
SIZE | integer | height of the rule, in pixels | |
WIDTH | width specification | width of the rule | |
<P> Some text, followed by a basic (default) horizontal rule. </P> <HR> <P> Some other text. </P>
<P> A horizontal rule placed at the right and half the width of the document layout: </P> <HR ALIGN="RIGHT" WIDTH="50%"> <P> An example with all possible spices: placed at left, solid rule (no shading), height 5 pixels, width 100 pixels: </P> <HR ALIGN="LEFT" NOSHADE SIZE=5 WIDTH=100>
It is usually better to use a percentage specification than an absolute number of pixels. The user's window might be very different from yours.
attribute name | possible values | meaning |
---|---|---|
VERSION | string | version of HTML |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN"> <TITLE>Hello</TITLE> Hello world
Usually the dog is said to form the species <I>Canis familiaris</I>, but genetically dogs belong to the same species as the wolf, <I>Canis lupus</I>.
However, don't overuse the I element. In particular, for emphasis use EM or STRONG, and for variables (placeholders) use VAR. See general notes on text markup.
Words and phrases taken as such from other languages (than the language in which the document is written), such as status quo, Weltanschauung or sauna, are often presented in italics. However, the more common the word or phrase is (in your text or in your language in general), the less the reader benefits from designating them as foreign and the more he may be disturbed by the frequent occurrence of different fonts in the text.
In linguistics, when referring to words and phrases as in "the plural of ox is oxen", it is normal to use italics. (HTML 2.0 suggests the use of SAMP for such purposes, but that would be unnatural.)
The rules for scientific names for organisms say that the names should be written in italics if possible, so it is natural to write them within I elements. The same applies to symbols of physical quantities such as F for force; the VAR element might sound suitable, but I elements are rendered in the required way, in italics, more probably than VAR elements are.
The positioning of the image is affected by the attributes of the IMG element.
attribute name | possible values | meaning | notes |
---|---|---|---|
SRC | URL | address of the image | obligatory; see notes on graphics formats |
ALT | string | text description of the image | |
ALIGN | TOP, MIDDLE, BOTTOM, LEFT, RIGHT | positioning of the image relative to the current textline | default is BOTTOM |
HEIGHT | integer | suggested height, in pixels | suggestion only |
WIDTH | integer | suggested width, in pixels | suggestion only |
BORDER | integer | suggested line border width, in pixels | relevant when the IMG element appears as an anchor text; use BORDER=0 to suppress the border |
HSPACE | integer | suggested horizontal gutter (width of white space to the immediate left and right of the image), in pixels | default value is a small non-zero number |
VSPACE | integer | suggested vertical gutter (height of white space above and below the image), in pixels | default value is a small non-zero number |
USEMAP | URL | fragment identifier for a client-side image map | maps are defined with the MAP element; names of maps are case sensitive |
ISMAP | ISMAP | indicates that the image is a server-side image map | when the user clicks on the image, this attribute causes the cursor location to be passed to the server. |
Attributes HEIGHT, WIDTH, HSPACE, VSPACE, and USEMAP were not in HTML 2.0! And in HTML 2.0 the allowed values for ALIGN were TOP, MIDDLE, BOTTOM only.
The WIDTH and HEIGHT attributes, when used together, allow user agents to reserve screen space for the image before the image data has arrived over the network. This may imply faster formatting and allow the user to start reading while data transfer is still in progress. These attributes were not designed for automatic resizing of images by browsers. Although some browsers are able to scale the image according to WIDTH and HEIGHT attributes, don't rely on it. Thus they should specify the true size of the image. (Use a suitable program, such as xv on many Unix systems, for finding out the size in pixels and for scaling the image if needed.)
The different values of ALIGN have the following meanings:
Note that some browsers (eg Internet Explorer 2.0 and 3.0) introduce spurious spacing with multiple left or right aligned images. As a result authors can't depend on this being the same for browsers from different vendors. See BR for ways to control text flow.
As regards ISMAP, here is an example of how you use it:
<a href="/cgibin/navbar.map"><img src=navbar.gif ismap border=0></a>
The location clicked is passed to the server as follows. The user agent derives a new URL from the URL specified by the HREF attribute by appending a question mark (?), the x coordinate, a comma (,), and the y coordinate of the location, with coordinates expressed in pixels. The link is then followed using the new URL. For instance, if the user clicked at the location x=10, y=27 then the derived URL will be: "/cgibin/navbar.map?10,27". - It is generally a good idea to suppress the border (using the attribute BORDER=0) and to tell the user explicitly that the image is clickable.
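The derivation of the new URL for a server-side image map can be sketched in Python as follows. The function name `ismap_url` is hypothetical; the rule it applies is only the one described above (question mark, x coordinate, comma, y coordinate).

```python
def ismap_url(href, x, y):
    """Derive the URL a user agent follows after a click on an ISMAP image.

    href is the value of the HREF attribute of the enclosing A element;
    x and y are the pixel coordinates of the clicked location.
    """
    return "{}?{},{}".format(href, x, y)


print(ismap_url("/cgibin/navbar.map", 10, 27))
```

The program on the server side then has to parse the two numbers after the question mark and decide which resource corresponds to that location.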
<IMG SRC="Yucca.jpg" ALT="[Picture of Yucca]" WIDTH=110 HEIGHT=168> <P> <IMG SRC="Yucca.jpg" ALT="[Picture of Yucca]" WIDTH=110 HEIGHT=168 ALIGN=RIGHT> This is a simple example of embedding images. This paragraph should be displayed, in a graphical browser, with an image at the right, and before this paragraph the same image should appear separately, with default alignment. </P>
Using IMG with ISMAP, to create a clickable map:
<A HREF="http://www.hut.fi/cgi-bin/imagemap/Pictures/English/english.map"> <IMG HEIGHT="400" WIDTH="400" SRC="http://www.hut.fi/Pictures/English/english.gif" ALT="Helsinki University of Technology" ISMAP> </A>
There is no HTML feature specifically intended for a caption for an image. One reasonable way of including a caption (when the image appears on its own and not alongside the text) is the following:
<P> <IMG SRC="sae.gif" ALT="[Siamese algae eater]"> <BR> Siamese algae eater. <SMALL>Drawing by <A HREF="http://www.hut.fi/~lsarakon/">Liisa Sarakontu</A>.</SMALL> </P>
If you want a picture to appear at the left (or right) of a text paragraph, you should put the IMG element (with ALIGN=LEFT or ALIGN=RIGHT attribute) at the beginning of the paragraph (P element). Otherwise the result may look messy. Moreover, it is good practice to have a BR element with the CLEAR attribute at the end of such a paragraph, to avoid confusing effects. In general, putting an image alongside text is a potential source of problems; for example, a user with a narrow window might not see the text at all.
Dianne Gorman has written an illustrative document Aligning Images and Text (part of her Introduction to HTML).
The semantics and intended use of the ALT attribute are vague. It might be viewed as a recommended way of providing a textual presentation of the contents of an image, to be used as a replacement for the image in text-only browsers, speech-based user agents etc. However, much more typically it contains a verbal explanation of the image, such as a title or perhaps just a name for the image. This seems suitable in the common situation of using a graphical browser with automatic image loading disabled: the user decides on the basis of the verbal explanation whether to load this particular image. (Graphical browsers vary in their behavior in such a situation: some browsers display the ALT value, others may display a small generic image which says very little.) And it is often difficult to say how the ALT text could be a good replacement for the image, since the syntax restricts the value to be just a string with no HTML markup. - A. J. Flavell has written an extensive document Use of ALT texts in IMGs.
There are two ways of implementing clickable image maps in HTML documents:
For more information about image maps, see eg
Image maps can be very useful in association with geographical maps. (See eg the "Virtual Tourist" map at http://www.vtourist.com/webmap/.) They might conceivably be used in other contexts as well, for instance to allow the user to select an item in a display of purchasable objects or a detail in a plan of a house or to request information about a part of a device described by a drawing. In general, an image map can be very useful for things which are inherently visual in two or more dimensions. However, in actual practice most use of image maps is abuse. Example 2 above is a typical case: a natural, simple text menu would be easier to use and more efficient, and it would work fine on text-only browsers, too. (See section Using tables to represent menus for various implementations of menus.)
Since client side image maps are faster and have other benefits as well but are not supported by all browsers, you may wish to combine server side and client side image maps in the following way:
imagemap or htimage; consult the documentation of the server).
<A HREF="/cgi-bin/htimage/your.map">
<IMG SRC="image/your.gif" ... ISMAP USEMAP="#yourmap"></A>
That way new browsers will use the client side image map, whereas old browsers will ignore the USEMAP attribute and pass the request to the server.
INPUT - input fields in forms
Purpose
To specify, within a form, input fields such as single line text fields, password fields, checkboxes, radio buttons, submit and reset buttons, hidden fields, file upload, image buttons, etc.
attribute name | possible values | meaning | notes |
---|---|---|---|
TYPE | TEXT, PASSWORD, CHECKBOX, RADIO, SUBMIT, RESET, FILE, HIDDEN, IMAGE | type of the input field | default is TEXT |
NAME | string | name to be used to identify the field when submitting the contents to the server | required for all but SUBMIT and RESET |
VALUE | string | initial value of the input field; when TYPE is SUBMIT or RESET, provides a textual label | obligatory, if TYPE is RADIO or CHECKBOX |
CHECKED | CHECKED | when TYPE is RADIO or CHECKBOX, initializes the field to checked state | |
SIZE | integer | visible size of the field, as number of average character widths | |
MAXLENGTH | integer | maximum number of characters permitted in a text field | default is: no limit |
SRC | URL | address of an image | for fields with background images |
ALIGN | TOP, MIDDLE, BOTTOM, LEFT, RIGHT | image alignment for graphical submit buttons | as ALIGN in IMG (and HTML 2.0 allows only TOP, MIDDLE, BOTTOM here, too); default is BOTTOM |
The different values of the TYPE attribute correspond to different kinds of input fields as follows.
TYPE=TEXT (the default)
A single line text field whose visible size can be set using the SIZE attribute, eg SIZE=40 for a 40 characters wide field. Users should be able to type more than this limit though with the text scrolling through the field to keep the input cursor in view. You can enforce an upper limit on the number of characters that can be entered with the MAXLENGTH attribute. The NAME attribute is used to name the field, while the VALUE attribute can be used to initialize the text string shown in the field when the document is first loaded.
Notice that text input is restricted to a single line. Use the TEXTAREA element to define multi-line text fields.
Example:
<INPUT TYPE=TEXT SIZE=40 NAME=user value="your name">
TYPE=PASSWORD
This is like TYPE=TEXT but the browser should not echo the characters, so that people around the user will not see them. Typically, the browser uses a generic character like * to indicate that some character has been sent. The actual input is sent normally (without encryption!). You can use SIZE and MAXLENGTH attributes to control the visible and maximum length exactly as for regular text fields.
Example:
<INPUT TYPE=PASSWORD SIZE=12 NAME=pw>
TYPE=CHECKBOX
Used for simple Boolean attributes, or for attributes that can take multiple values at the same time. The latter is represented by several checkbox fields with the same NAME and a different VALUE attribute. Each checked checkbox generates a separate name/value pair in the submitted data, even if this results in duplicate names. Use the CHECKED attribute to initialize the checkbox to its checked state.
Example:
<INPUT TYPE=CHECKBOX CHECKED NAME=uscitizen VALUE=yes>
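The effect of several checked checkboxes sharing one NAME can be sketched in Python. The field name `interest` and its values are hypothetical, and the encoding shown assumes the default application/x-www-form-urlencoded media type.

```python
from urllib.parse import parse_qs, urlencode


def encode_fields(pairs):
    """Encode (name, value) pairs the way a form submission would."""
    return urlencode(pairs)


# Two checked checkboxes with the same NAME but different VALUE attributes.
checked = [("interest", "html"), ("interest", "cgi")]
body = encode_fields(checked)
print(body)

# A form handler must be prepared for repeated names;
# parse_qs collects all values of a name into one list.
print(parse_qs(body))
```

A script that reads only the first value for each name would silently lose the other selections, so the handler has to expect lists.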
TYPE=RADIO
Used for attributes which can take a single value from a set of alternatives. Each radio button field in the group should be given the same NAME attribute. Radio buttons require an explicit VALUE attribute. Only the checked radio button in the group generates a name/value pair in the submitted data. One radio button in each group should be initially checked (thus providing a default value) using the CHECKED attribute.
Example:
<INPUT TYPE=RADIO NAME=age VALUE="0-12"> <INPUT TYPE=RADIO NAME=age VALUE="13-17"> <INPUT TYPE=RADIO NAME=age VALUE="18-25"> <INPUT TYPE=RADIO NAME=age VALUE="26-35" CHECKED> <INPUT TYPE=RADIO NAME=age VALUE="36-">
TYPE=SUBMIT
This defines a button that users can click to submit the contents of the form to the server. A label is set for the button from the VALUE attribute. If the NAME attribute is given, then the name/value pair for the submit button will be included in the submitted data. You can include several submit buttons in the form. See TYPE=IMAGE for graphical submit buttons.
Example:
<INPUT TYPE=SUBMIT VALUE="Party on ...">
TYPE=RESET
This defines a button that users can click to reset form fields to their initial state when the document was first loaded. You can set a label by providing a VALUE attribute. Reset buttons are never sent as part of the contents of a form.
Example:
<INPUT TYPE=RESET VALUE="Start over ...">
TYPE=FILE
This provides a means for users to attach a file to the contents of the form.
This feature is not commonly supported yet. Notice that some browsers only seemingly support it, eg by including the name of the file instead of its contents!
The element is generally rendered as a text field and an associated button which, when clicked, invokes a file browser to select a file name. The file name can also be entered directly in the text field.
Just like for TYPE=TEXT you can use the SIZE attribute to set the visible width of this field in average character widths. You can set an upper limit to the length of file names using the MAXLENGTH attribute.
Some user agents support the ability to restrict the kinds of files (that can be attached to the contents of a form) using an ACCEPT attribute. The value of that attribute is a comma-separated list of MIME content types. For example, ACCEPT="image/*" would restrict files to images. Notice that the ACCEPT attribute is not defined in HTML 3.2, although it is defined in RFC 1867 to which the HTML 3.2 Reference Specification refers in this context for further information.
An Internet media type is, generally speaking, a property of a data set, describing both the general type of data (such as "text" or "image" or "application"; the last one refers to program-specific internal data formats) and, as a subtype, a specific format for the data. The concept was originally defined as MIME content types. The HTML 3.2 Reference Specification refers to RFC 1521 but that specification was superseded by RFC 2046 (in November 1996). The procedure for registering types is given in RFC 2048; according to it, the registry is kept at ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/
Further information on using forms for file upload can be found in RFC 1867.
Example:
<INPUT TYPE=FILE NAME=photo SIZE=20>
TYPE=HIDDEN
This indicates that the field should not be rendered to the user. A hidden field provides a means for servers to store state information with a form. This will be passed back to the server when the form is submitted, using the name/value pair defined by the corresponding attributes. This is a workaround for the statelessness of HTTP and an alternative to using so-called HTTP cookies.
Example:
<INPUT TYPE=HIDDEN NAME=customerid VALUE="c2415-345-8563">
TYPE=IMAGE
This acts as a submit button (cf. TYPE=SUBMIT), but it is rendered by an image rather than a text string and the form is submitted so that information about the clicked location is passed, too. The URL for the image is specified with the SRC attribute. The image alignment can be specified with the ALIGN attribute. In this respect, graphical submit buttons are treated identically to IMG elements (so you can set ALIGN to LEFT, RIGHT, TOP, MIDDLE or BOTTOM). A NAME attribute is required. When the user clicks on the button, the x and y coordinates of the location clicked are passed to the server as two name/value pairs. The names are derived by taking the name of the field and appending .x for the x value and .y for the y value.
Example:
<P>Now choose a point on the map: <INPUT TYPE=IMAGE SRC="map.gif" NAME=point>
Notice that image fields cause problems to people using text-only or speech-based user agents or graphical browsers with automatic loading of images disabled.
The specifications do not mention the VALUE attribute for INPUT TYPE=IMAGE, but at least one text-mode browser takes its value as the substitute for the image. Thus, defining a meaningful VALUE attribute is a good idea, if the form makes sense even if the script processing it does not get (meaningful) x and y values.
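The name/value pairs generated by a graphical submit button can be sketched in Python as follows. The function name `image_button_pairs` is hypothetical, and the field name `point` is taken from the example above; the clicked coordinates are made up for the illustration.

```python
from urllib.parse import urlencode


def image_button_pairs(name, x, y):
    """Build the two name/value pairs sent for an INPUT TYPE=IMAGE field.

    The names are the field name with ".x" and ".y" appended.
    """
    return [(name + ".x", str(x)), (name + ".y", str(y))]


# Hypothetical click at x=10, y=27 on the field named "point".
pairs = image_button_pairs("point", 10, 27)
print(urlencode(pairs))
```

A script which also wants to work without the image should treat missing `point.x` and `point.y` values as a normal case, in line with the remark on the VALUE attribute above.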
<INPUT TYPE=RESET VALUE="Start over ...">
The use of INPUT for text input is restricted to single line fields. Use TEXTAREA to define multi-line text fields.
Use SELECT for menus.
The semantics for ISINDEX are currently well defined only when the base URL for the enclosing document is an HTTP URL. Typically, when the user presses the enter (return) key, the query string is sent to the server identified by the base URL for this document. For example, if the query string entered is "ten green apples" and the base URL is:
http://www.acme.com/
then the query generated is:
http://www.acme.com/?ten+green+apples
The ISINDEX element only provides an interface to a program (typically, a CGI script) which interprets the query. Merely inserting an ISINDEX element does not make the document searchable! (On the other hand, notice that most Web browsers provide some "search in this document" feature, so you need not make any special effort in order to allow your readers to perform simple searches within a document.)
attribute name | possible values | meaning |
---|---|---|
PROMPT | string | prompt message |
The PROMPT attribute can be used to specify a prompt string for the input field, replacing a browser-dependent default prompt string (which might be eg This is a searchable index. Enter search keywords).
<BASE HREF="http://www.hut.fi/cgi-bin/finger"> Searching for a user at <a href="http://www.hut.fi/">HUT</a>. <ISINDEX PROMPT="User id at HUT:">
There are no restrictions on the number of characters that can be entered in the query string.
In practice, the query string is restricted to Latin-1 as there is no current mechanism for the URL to specify a character set for the query.
When the query is generated from the input, space characters are mapped to "+" characters, and normal URL character escaping mechanisms apply. For further details see the HTTP specification.
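How the query URL is generated can be sketched in Python. The function name `isindex_query_url` is hypothetical; the sketch assumes that the base URL contains no query part of its own and that the input is limited to Latin-1, as noted above.

```python
from urllib.parse import quote_plus


def isindex_query_url(base_url, query):
    """Build the URL requested when the user submits an ISINDEX query.

    Spaces are mapped to "+" and other special characters are escaped
    as "%" followed by two hexadecimal digits, using Latin-1 codes.
    """
    return base_url + "?" + quote_plus(query, encoding="latin-1")


print(isindex_query_url("http://www.acme.com/", "ten green apples"))
```

With this sketch, a character outside Latin-1 makes `quote_plus` raise a `UnicodeEncodeError`, which mirrors the restriction described above rather than working around it.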
Finally, type <KBD>logout</KBD> and press the return key.
Although program code might be regarded as keyboard input (to be typed by a programmer), especially in the context of teaching programming, it is more natural to use the CODE element for code fragments.
It is arguable whether one should use the KBD element for command names (or names of programs) as well, even when they do not appear in a context which discusses how commands are given. One might say that a command name like ls (in Unix) is just a name, not keyboard input. But I recommend using KBD, since it is difficult and sometimes quite artificial to distinguish eg ls as keyboard input (or part of it) and as the name of a command (or program). Notice that when a command name appears at the beginning of a statement, grammar rules require a capital initial which might be misleading (by suggesting to the user that the case of letters is irrelevant on keyboard input); by using KBD - usually rendered using a monospaced font, and therefore distinguishing the command name from normal text - we make it more acceptable to violate the grammar rule.
As usual in HTML, division into lines and the use of blanks and tabs is selected by the browser, not honoring the one in the HTML file. Be careful in telling the user when he should press the return or enter key, since this may not correspond to the visual layout of your instructions.
See also notes on presenting interaction with computer and general remarks on phrase elements.
The end tag </LI> can always be omitted, and it usually is omitted.
When the (innermost) enclosing list element is UL or DIR or MENU:
attribute name | possible values | meaning |
---|---|---|
TYPE | DISC, SQUARE, CIRCLE | bullet style |
When the (innermost) enclosing list element is OL:
attribute name | possible values | meaning |
---|---|---|
TYPE | 1, a, A, i, I | numbering style (as in OL) |
VALUE | integer | sequence number (see OL) |
<LI>A list item.</LI>
For more realistic examples, see Examples of various list elements in HTML and examples given in the descriptions of UL, DIR, MENU, and OL elements.
The list of bullet types was chosen to cater for the original bullet shapes used by Mosaic in 1993. The list is not very logical. Usually the default bullet type in UL lists is DISC, if the list is not within a UL list, and SQUARE and CIRCLE in the next levels of nesting. In Lynx, the situation is similar with the shapes DISC, SQUARE, and CIRCLE presented as star (*), plus (+) and letter o.
It is hard to imagine any good use for the TYPE attribute in a LI element, as opposed to defining the bullet type for all items of a list in a UL element or other list element.
or
<LINK REV=relation HREF=URL>
attribute name | possible values | meaning |
---|---|---|
HREF | URL | URL for linked resource |
REL | string | type of "forward" link |
REV | string | type of "reverse" link |
TITLE | string | advisory title string for the linked resource |
A link from document A to document B with REV=relation expresses the same relationship as a link from B to A with REL=relation.
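The equivalence between REL and REV can be expressed as a small Python sketch which normalizes every link into its "forward" form. The function name `normalize_link` and the tuple layout (from, to, relation) are assumptions made for the illustration.

```python
def normalize_link(source, target, rel=None, rev=None):
    """Express a LINK element as (from, to, relation) tuples in REL form.

    source is the document containing the LINK element and target is
    the resource named in its HREF attribute.
    """
    links = []
    if rel:
        # A forward link keeps its direction.
        links.append((source, target, rel))
    if rev:
        # A reverse link is the same relation seen from the other end.
        links.append((target, source, rev))
    return links


# Both calls describe the same relationship between documents A and B.
print(normalize_link("A", "B", rev="MADE"))
print(normalize_link("B", "A", rel="MADE"))
```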
<LINK REL=STYLESHEET HREF="basic.css">
A simple LINK element providing authorship information:
<LINK REV=MADE HREF="mailto:jukka.korpela@hut.fi">
Some LINK elements which might appear in a large document divided into separate but interlinked HTML files:
<LINK REL=CONTENTS HREF="toc.html"> <LINK REL=PREVIOUS HREF="doc31.html"> <LINK REL=NEXT HREF="doc33.html">
There was an Internet Draft, draft-ietf-html-relrev-00.txt, on proposed relationship values. (Officially the draft has been withdrawn.) Some of the most common (mentioned in the HTML 3.2 Reference Specification) are:
attribute setting | type of link (role of linked resource) |
---|---|
REL=CONTENTS | A document serving as a table of contents. |
REL=INDEX | A document providing an index for the current document. |
REL=GLOSSARY | A document providing a glossary of terms that pertain to the current document. |
REL=COPYRIGHT | A copyright statement for the current document. |
REL=NEXT | The next document to visit in a guided tour. |
REL=PREVIOUS | The previous document in a guided tour. |
REL=HELP | A document offering help, eg describing the wider context and offering further links to relevant documents. This is aimed at reorienting users who have lost their way. |
REL=BOOKMARK | A bookmark, used to provide direct links to key entry points into an extended document. The TITLE attribute may be used to label the bookmark. Several bookmarks may be defined in each document, and provide a means for orienting users in extended documents. |
The above list is just an extract from a withdrawn draft. But if you intend to write new software which uses LINK elements or if you want to include such elements into your document just in case some program happens to make use of them, then conformance to the above list is probably better than reinventing the wheel. See also W3C working draft Hypertext Links in HTML.
In conjunction with style sheets, a LINK element with REL=STYLESHEET can be used.
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | string | a name for the map, referable to in USEMAP attributes of IMG elements | obligatory; case sensitive |
<IMG SRC="navbar.gif" BORDER=0 USEMAP="#map1">
<MAP NAME="map1">
<AREA HREF="guide.html" ALT="Access Guide" SHAPE=RECT COORDS="0,0,118,28">
<AREA HREF="search.html" ALT="Search" SHAPE=RECT COORDS="184,0,276,28">
<AREA HREF="shortcut.html" ALT="Go" SHAPE=RECT COORDS="118,0,184,28">
<AREA HREF="top10.html" ALT="Top Ten" SHAPE=RECT COORDS="276,0,373,28">
</MAP>

Theoretically, the recommendation has been and still is that the MENU element be rendered as a single column menu list.
attribute name | possible values | meaning |
---|---|---|
COMPACT | COMPACT | reduced interitem spacing |

<MENU>
<LI> Undo
<LI> Cut
<LI> Copy
<LI> Paste
<LI> Find
<LI> Find Again
</MENU>

See also Examples of various list elements in HTML.
The name of the element might be misleading. There is no true selection menu involved, just a display of menu keywords. To present a true selection menu you can use hyperlink anchors (A elements). See the section Using tables to represent menus.
What is done with this information depends on the programs (eg browsers) which process HTML files.
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | name | meta information item name | alternative to HTTP-EQUIV attribute |
HTTP-EQUIV | name | meta information item name | alternative to NAME attribute |
CONTENT | string | meta information contents | a META element must contain this attribute |
<META NAME=DESCRIPTION CONTENT=
"An extensive guide to writing HTML 3.2 documents, with examples and practical advice.">
<META NAME=KEYWORDS CONTENT="structural HTML, logical markup">

Several Web search engines, such as InfoSeek and AltaVista, recognize META elements with NAME values DESCRIPTION and KEYWORDS. They might be used when indexing documents, and the CONTENT value corresponding to DESCRIPTION could be used as the abstract for the document when returning query results (instead of just taking the first few words of a document, which is often not very enlightening). Thus, it is advisable to include META elements similar in form to those of the example above. For some more information, consult
The META tag affects the way your document is indexed when it is included into a data base of a search engine. It will not make a robot find the document when it searches candidates for inclusion into a data base. Therefore, if you think the document is important, and especially if there are not several links to it in other documents, consider additionally using facilities like "Add URL" on the AltaVista main page.
The difference between NAME and HTTP-EQUIV is that the latter has a special significance when documents are retrieved via HTTP, whereas the interpretation of NAME attributes is up to each particular browser or other program which processes HTML files (although some common practices may emerge and might be standardized later). HTTP servers may use the property name specified by the HTTP-EQUIV attribute to create an RFC 822 style header in the HTTP response. (RFC 822 defines the format of electronic mail messages used on the Internet.) A server may disregard any META elements which specify information controlled by the server, such as "Server", "Date", and "Last-modified"; see the HTTP specification for details. For example,
<META HTTP-EQUIV="Expires" CONTENT="Tue, 20 Aug 1996 14:25:27 GMT">
will result in the HTTP header

Expires: Tue, 20 Aug 1996 14:25:27 GMT

If an organization requires authors to include meta information such as authorship information and expiration times in a specific format, special software might be written to scan through the WWW server periodically in order to send automatic reminders to authors.
In contrast with the UL element, the items are numbered (consecutively by default).
attribute name | possible values | meaning | notes |
---|---|---|---|
TYPE | 1, a, A, i, I | numbering style | case of letter is significant |
START | integer | starting sequence number | default is 1 |
COMPACT | COMPACT | reduced interitem spacing | |

Attributes TYPE and START were not in HTML 2.0!
The meanings of the values of TYPE are the following:
Type | Numbering style | The first few numbers |
---|---|---|
1 | normal (Arabic) numbers | 1, 2, 3, ... |
a | Latin letters in lowercase | a, b, c, ... |
A | Latin letters in uppercase | A, B, C, ... |
i | Roman numbers in lowercase | i, ii, iii, ... |
I | Roman numbers in uppercase | I, II, III, ... |
<P> Proceed as follows: </P>
<OL>
<LI> Try to guess how to use the program.
<LI> If it fails, send lots of questions to Usenet News.
<LI> If they flame you, consider contacting local user support.
<LI> When everything else fails, read the manuals.
</OL>

An example where it is natural to use Roman numbers:

<P> The declensions of nouns in Latin are best distinguished by the ending of the genitive singular: </P>
<OL TYPE=I>
<LI> <I>-ae</I>, eg <I>terra:terrae</I>
<LI> <I>-i</I>, eg <I>annus:anni</I>
<LI> <I>-is</I>, eg <I>labor:laboris</I>
<LI> <I>-us</I>, eg <I>fructus:fructus</I>
<LI> <I>-ei</I>, eg <I>dies:diei</I>.
</OL>

A contrived example to show the effects of attributes and overriding them in LI elements:

<OL TYPE=a START=3 COMPACT>
<LI> first item
<LI> second item
<LI VALUE=8> item after skipping a few values
<LI> next item
<LI TYPE=A> going on with uppercase
<LI> this is the last item.
</OL>

See also Examples of various list elements in HTML.
The sequence numbers of the items start from the value of the START attribute (by default 1). You can set it later on with the VALUE attribute on LI elements. Both of these attributes expect integer values. (Even if you have set the TYPE attribute to something else than 1, the values of the VALUE attribute must be specified using the normal notation of numbers as sequences of digits.) You can't indicate that numbering should be continued from a previous list or skip missing values without giving an explicit number.
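The following sketch illustrates the last point with an invented set of instructions. Since a list cannot refer to a previous list, the author has to count the items of the first list and write the continuation number by hand:

```html
<P>First do this:</P>
<OL>
<LI> Unpack the device.
<LI> Connect the cables.
<LI> Turn the power on.
</OL>
<P>When the indicator light is on, continue:</P>
<!-- The previous list had three items, so START=4 is written explicitly -->
<OL START=4>
<LI> Insert the disk.
<!-- Two steps are left out here; the number 7 must be given explicitly -->
<LI VALUE=7> Close the cover.
</OL>
```

If an item is later added to the first list, the START value of the second list must be updated manually.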
The alignment of numbers is unspecified. In particular, Roman numbers might be left or right aligned or centered. (This is outside the control of the document author when using the OL element; you may wish to consider the alternative of using a table.)
In nested OL lists, it would be natural to use numbering of the form m.n but the specifications are silent about this. In practice, most browsers use simple numbering which is independent of any nesting.
The end tag can always be omitted.
attribute name | possible values | meaning | notes |
---|---|---|---|
SELECTED | SELECTED | the option is selected by default | in a SELECT element without the MULTIPLE attribute, at most one OPTION element may have this set |
VALUE | string | property value to be used when submitting the contents of the form; this is combined with the property name as given by the NAME attribute of the enclosing SELECT element | defaults to the contents of the element |
According to the HTML 2.0 specification, "the initial state has the first option selected, unless a SELECTED attribute is present on any of the OPTION elements". On the other hand, the HTML 3.2 Reference Specification leaves the default initial state open, so it is safest to assume that it is browser-dependent. You may wish to deal with this problem by providing a dummy first option (eg "No selection") and making it SELECTED, thus ensuring the same behavior from all HTML 3.2 conformant browsers.
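A minimal sketch of the dummy first option described above. The field name "gender" and the empty VALUE are assumptions made for this illustration; the script processing the form would have to treat the empty value as "nothing chosen":

```html
<SELECT NAME="gender">
<!-- Dummy option, explicitly SELECTED so that the initial state
     does not depend on the browser -->
<OPTION SELECTED VALUE="">No selection
<OPTION>female
<OPTION>male
</SELECT>
```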
<OPTION>female</OPTION>
Browsers usually format paragraphs to fit into the horizontal space (screen or window width) available.
Paragraphs are usually rendered flush left with a ragged right margin. The ALIGN attribute can be used to specify explicitly the horizontal alignment.
attribute name | possible values | meaning |
---|---|---|
ALIGN | LEFT, CENTER, RIGHT | alignment of the paragraph (flush left, centered, flush right) |
The default is left alignment, but this can be overridden by an enclosing DIV (or CENTER) element.
<P> This is a normal text paragraph which contains so many characters that it will most probably be split into several lines by a browser. </P>

A contrived example:

<P> This is a normal text paragraph with no attribute for horizontal alignment. Nothing special. </P>
<P ALIGN=CENTER> <B>This is a paragraph which should be centered. It should also appear in bold face but this results from explicit use of a B element. Centering itself should not affect the font.</B> </P>
<P ALIGN=RIGHT> This is a paragraph which should be rendered flush right. It is difficult to see why you would ever <EM>like</EM> to use this option! </P>

See also the examples about BLOCKQUOTE, one of which makes reasonable use of ALIGN=RIGHT.
If you intend to use P for alignment purposes, such as centering text, remember that a P element may only contain text elements. The DIV element may contain block elements, too.
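As an illustration of the difference, the following sketch (with invented content) centers a heading, a paragraph and a list together. This would not be possible with a single P element, since a heading or a list may not appear inside P:

```html
<DIV ALIGN=CENTER>
<H2>Opening hours</H2>
<P>Monday to Friday, from 9 to 17.</P>
<UL>
<LI> Closed on public holidays
<LI> Shorter hours in July
</UL>
</DIV>
```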
There is no way in HTML (in HTML 3.2 at least) to make text appear "justified" (solid-right), unless you want to resort to using the PRE element. More exactly, such presentation issues are browser-dependent, and the great majority of browsers use ragged right margin.
The end tag </P> can always be omitted, and it usually is omitted. This, however, may distort people's thoughts: they regard <P> as a paragraph separator, but in fact it initiates a paragraph (to be terminated by an explicit </P> or implicitly by tags like <P> or <H1>).
Paragraphs cannot be nested. (This is the other side of the "nice" feature that </P> can be omitted.) One way of simulating subparagraphs is to use BR elements around a piece of text within a P element. Another way is to use list elements (such as UL) instead of P elements.
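A sketch of the two workarounds, using invented text. In the first form the line breaks are purely presentational; in the second the parts are marked up as list items:

```html
<!-- Simulating subparagraphs with BR inside one P element -->
<P>
The committee considered two options.
<BR>
Option A was cheaper but slower.
<BR>
Option B was faster but required new equipment.
<BR>
The committee chose option B.
</P>

<!-- The same material using a list instead -->
<P>The committee considered two options:</P>
<UL>
<LI> Option A was cheaper but slower.
<LI> Option B was faster but required new equipment.
</UL>
<P>The committee chose option B.</P>
```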
The division into lines in the rendering usually does not match the HTML source. See the section Division into lines and the use of blanks and tabs.
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | name | name of the parameter | obligatory |
VALUE | string | value of the parameter | SGML character entities such as &eacute; and &#185; are expanded before the parameter value is passed to the applet; to include an & character use &amp; |
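A small sketch of how such a parameter might be written. The applet Greeting.class and the parameter name "message" are hypothetical; only the treatment of the entities is the point here:

```html
<!-- Greeting.class and the parameter name "message" are hypothetical -->
<APPLET CODE="Greeting.class" WIDTH=200 HEIGHT=50>
<!-- &eacute; and &amp; are expanded before the value reaches the applet -->
<PARAM NAME="message" VALUE="Caf&eacute; &amp; bar">
Your browser does not support Java applets.
</APPLET>
```

Under these assumptions the applet would receive the parameter value with the accented letter and a plain ampersand, not the entity references themselves.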
attribute name | possible values | meaning | notes |
---|---|---|---|
WIDTH | integer | width of text in characters | not yet supported in general |
The value of WIDTH should be equal to or greater than the length of the longest line. In principle, the WIDTH attribute is meant for providing a browser information which it can use to select a suitably-sized font or to adjust indentation to make the text fit. Unfortunately this is not usually done by browsers. You should not expect that eg text wider than 80 characters gets displayed correctly (even if you use the WIDTH attribute).
<PRE>
To be or not to be,
that is the question.
</PRE>

A more realistic example:

The printable characters of ASCII:
<PRE>
  ! " # $ % &amp; ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; &lt; = &gt; ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~
</PRE>

An attempt to present line printer like computer output:

The printout from the program is the following. Each line contains ten real numbers, each in a field of ten characters. Notice that when viewing this document on WWW, the rendering of the printout can be unsatisfactory; in such a case widen the WWW window, if possible.
<PRE WIDTH=100>
 0.5138707 0.1757256 0.3086337 0.5345317 0.9476302 0.1717277 0.7022309 0.2264168 0.4947661 0.1246986
 0.0838954 0.3896298 0.2772301 0.3680532 0.9834590 0.5353862 0.7656789 0.6464736 0.7671438 0.7802362
 0.8229621 0.1519211 0.6254769 0.3146764 0.3469039 0.9172033 0.5197607 0.4011658 0.6067690 0.7854244
</PRE>

In situations like this, you may consider the effect of using BASEFONT before PRE. (This is not a good solution but it might serve as a workaround until browsers begin to support the WIDTH attribute.)
An example of PRE element containing links (this might also be presented using a table):
Contact information (phone and E-mail):
<PRE>
help desk     4344  <A HREF="mailto:atk-neuvonta@hut.fi">atk-neuvonta@hut.fi</A>
operators     4341  <A HREF="mailto:opr@hut.fi">opr@hut.fi</A>
WWW problems  4331  <A HREF="mailto:webmaster@hut.fi">webmaster@hut.fi</A>
</PRE>
The discussion of presenting interaction with computer contains an additional example with embedded text markup.
As another alternative, often suitable for large pieces of text or data, consider writing a separate text file to which you have a link in your HTML code.
Previous versions of HTML contained the XMP, LISTING, and PLAINTEXT elements. They are now deprecated (obsolete), and PRE should be used instead.
One typical use for PRE has been to present tables, and this may still be a good idea in some cases (see example 2). However, the HTML TABLE element can be used for much more advanced tabular presentation. (You might still consider the possibility of presenting your tables in two alternative forms, using TABLE as the basic form but providing a PRE form for those readers who use a non-table browser.)
Although A elements and phrase markup (eg STRONG) can be used, the capabilities of a browser in presenting them may be more restricted than outside PRE elements. See also notes on presenting interaction with computer.
You can even use tabs in the preformatted text, although it is better to use multiple spaces, since you cannot be sure of how tab stops are set in the reader's environment. The language specification says that the tab character should position to the next 8 character boundary but discourages its use.
Although a browser must show the document so that line breaks correspond to those in the source code, a browser is not forbidden from using eg constant left indentation for preformatted paragraphs.
You cannot change font size within a PRE element (and you cannot put a PRE element inside a FONT element, for example), but the BASEFONT affects preformatted text, too.
In principle, a P tag is not allowed within a PRE element, since P is a block element, not a text element. However, the HTML 2.0 specification encourages browsers to accept it, with the remark that a P within a PRE element should produce only one line break, not a line break plus a blank line.
If character < or > or & occurs in the data, it must be expressed using the escape syntax (as in example 2). In particular you must do so when including HTML code into your document for the purpose of displaying the source code.
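For instance, a sketch of a PRE element which displays a fragment of HTML source (the fragment itself is invented). Each < is written as &lt;, each > as &gt;, and each & as &amp;:

```html
<P>A heading containing an ampersand is written as follows:</P>
<PRE>
&lt;BODY&gt;
&lt;H1&gt;Tom &amp;amp; Jerry&lt;/H1&gt;
&lt;/BODY&gt;
</PRE>
```

A browser shows the second line of the preformatted text as the source code of an H1 element whose content contains the entity reference for an ampersand. Notice the double escaping: the & of the displayed entity reference must itself be written as &amp;.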
The SGML standard requires that the parser remove a newline immediately following the start tag or immediately preceding the end tag. Thus it should not matter whether you have the <PRE> tag on a separate line or as a prefix to the first line of the text. However, some browsers fail to obey this, so you may consider using the latter presentation to prevent an extra line.
The fatal error message <SAMP>Bus error - core dumped</SAMP> can be caused by very different bugs in your program.
In HTML 2.0 this element was defined as follows:

The SAMP element indicates a sequence of literal characters, typically rendered in a mono-spaced font. For example:
The only word containing the letters <samp>mt</samp> is dreamt.

However, since the HTML 3.2 description is more specific and restrictive, you should use SAMP only to present sample output, not eg in the way the example in the HTML 2.0 specification suggests.
See also notes on presenting interaction with computer and general remarks on phrase elements.
Technically, these elements are defined with CDATA as the content type. As a result they may contain only SGML characters. All markup characters or delimiters are ignored and passed as data to the application, except for the character pair </ followed immediately by a letter (a - z, A - Z). This means that the end tag of the element (or of an element in which it is nested) is recognized. (Scripts may need to contain eg HTML end tags as data. Different scripting languages provide different methods for coping with this.)
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | string | a property name that is used to identify the menu choice when the form is submitted to the server | obligatory; each selected option results in a name/value pair being included as part of the contents of the form |
SIZE | integer | sets the number of visible choices | applicable when MULTIPLE is set |
MULTIPLE | MULTIPLE | signifies that the user can make multiple selections from the menu | by default only one selection is allowed |
Example:
<SELECT NAME="flavor">
<OPTION VALUE=a>Vanilla
<OPTION VALUE=b>Strawberry
<OPTION VALUE=c>Rum and Raisin
<OPTION VALUE=d>Peach and Orange
</SELECT>

As an alternative to SELECT, you may wish to consider using an INPUT element with TYPE=CHECKBOX or TYPE=RADIO, typically resulting in a rendering which allows the user to see all alternatives at a glance.
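As a sketch, the flavor menu of the example above might be rewritten with radio buttons as follows. The fragment is assumed to appear inside a FORM element, and making the first alternative CHECKED is a choice made for this illustration only:

```html
<!-- Assumed to be inside a FORM element -->
<P>Choose a flavor:</P>
<P>
<INPUT TYPE=RADIO NAME="flavor" VALUE=a CHECKED> Vanilla<BR>
<INPUT TYPE=RADIO NAME="flavor" VALUE=b> Strawberry<BR>
<INPUT TYPE=RADIO NAME="flavor" VALUE=c> Rum and Raisin<BR>
<INPUT TYPE=RADIO NAME="flavor" VALUE=d> Peach and Orange
</P>
```

All four radio buttons share the same NAME, so only one of them can be selected at a time, which corresponds to a SELECT element without the MULTIPLE attribute.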
<P> This is normal text. </P>
<P> <SMALL> This text will be presented in a smaller font, if possible. </SMALL> </P>

An example which uses SMALL to simulate "small caps" font style:

J<SMALL>UKKA</SMALL> K<SMALL>ORPELA</SMALL> has written an HTML primer G<SMALL>ETTING</SMALL> S<SMALL>TARTED WITH</SMALL> HTML.

See also an example combining SMALL and SUP in the description of SUP.

The use of SMALL to simulate "small caps" as in example 2 above is not particularly effective. Some browsers simply ignore SMALL, leading to an all upper case presentation. In popular browsers, SMALL seems to cause presentation which is just marginally (if at all) smaller than normal font. It is better to use logical markup than to stick to presentation conventions designed for traditional forms of publication. For example, use CITE for book titles and other citations. (A user who wants to see them in all caps style might consider using style sheets for the purpose.) Unfortunately there is no logical markup for people's names in the current HTML standard.
It is unspecified what happens if SMALL elements are nested; it might or might not result in using a font which is smaller than you get with a single SMALL.
The FONT element may provide more alternatives for specifying different font sizes.
Notice that people may set the normal text font in their browser to something which is just big enough for them to read. If you use SMALL, the result might be illegibly small.
See general notes on text markup, which provide additional examples.
"Private agency" means an accredited nonpublic school, a nonprofit institution of higher education <STRIKE>eligible for tuition grants</STRIKE>, or a hospital.
If you use STRIKE in your document, it is advisable to include a note about its meaning. Even if you use it for the "normal" meaning, indicating deletion, you should tell this to your readers, since some of them might view the document with browsers which do not support STRIKE at all (and display text within STRIKE elements as normal text). You might even provide a way of getting different versions of the document, with STRIKE replaced by some other method of presenting deleted text.
See general notes on text markup, which provide additional examples.
The HTML 2.0 specification does not include STRIKE but mentions it as an element which has been "deployed to some extent".
The HTML 3.2 Reference Specification warns that 'STRIKE may be phased out in favor of the more concise "S" tag from HTML 3.0'.
For your own safety, <STRONG>turn the power off before opening the device.</STRONG>
The STRONG element involves stronger emphasis than the EM element.
See also general remarks on phrase elements.
Technically, these elements are defined with CDATA as the content type. As a result they may contain only SGML characters. All markup characters or delimiters are ignored and passed as data to the application, except for the character pair </ followed immediately by a letter (a - z, A - Z). This means that the end tag of the element (or of an element in which it is nested) is recognized.
It is legal, and recommendable, to use the HTML comment delimiters <!-- and --> around the contents of a STYLE element. The reason is that by doing so you ensure that old browsers (ignorant of STYLE) will not display the contents.
<HEAD>
<STYLE><!--
BODY { font-family: sans-serif }
U { font-family: serif }
--></STYLE>
</HEAD>
<BODY>
Sample text 1.<BR>
<U>Sample text 2.</U>
</BODY>
As a side effect, subscripts often cause lines to be unevenly spaced.
Let us form the sum of all x<SUB>i</SUB>'s, ie x<SUB>1</SUB> + x<SUB>2</SUB> + ... + x<SUB>n</SUB>.
Usage in chemistry:
SO<sub>3</sub> + H<sub>2</sub>O -> H<sub>2</sub>SO<sub>4</sub>
Using SUB and SUP to affect the presentation of fractions:
Fractions ½ and ¼ and ¾ have their own symbols in ISO Latin 1. Other fractions like <SUP>2</SUP>/<SUB>3</SUB> must be essentially presented in linearized notation, although you can use SUB and SUP to affect the presentation.
Since this tag is new, support for it is not universal. Some browsers simply ignore it, displaying eg a<SUB>1</SUB> as a1. And naturally, text-only browsers cannot truly support SUB.
Subscripts can be nested. This may, however, result eg in rendering inner subscripts in a very small font. Internet Explorer ignores SUB tags after nesting level of two.
See also general notes on text markup.
As a side effect, superscripts often cause lines to be unevenly spaced.
The notation A<SUP>T</SUP> denotes the transpose of A.
Consider the equation x<SUP>n</SUP> + y<SUP>n</SUP> = z<SUP>n</SUP>.
The expression a<SUP>b<SUP>c</SUP></SUP> means a<SUP>(b<SUP>c</SUP>)</SUP>.
This example is a text paragraph which contains several superscripted expressions such as m<SUP>2</SUP> and e<SUP>x</SUP>. They may affect the visual appearance of the paragraph by forcing the browser to use different line heights. This applies in particular to expressions with large and nested superscripts such as (f(a))<SUP>e<SUP>x<SUP>2y</SUP></SUP></SUP>.
Non-mathematical examples:<BR> The word "first" can be written as 1<SUP>st</SUP>.<BR> Foo<SUP><SMALL>TM</SMALL></SUP> is a trademark of Bar, Inc.<BR> In French, the word "mademoiselle" is abbreviated M<SUP>lle</SUP>.
There is also a tag for subscripts, SUB, but HTML 3.2 provides no general support for mathematical formulas.
Since this tag is new, support for it is not universal. Some browsers simply ignore it, displaying eg a<SUP>T</SUP> as aT. And naturally, text-only browsers cannot truly support SUP.
Superscripts can be nested, as the last example shows. This may, however, result eg in rendering inner superscripts in a very small font. Internet Explorer ignores SUP tags after nesting level of two.
See also general notes on text markup.
A table is generally sized automatically by a browser to fit the contents, but you can also set the table width using the WIDTH attribute.
attribute name | possible values | meaning | notes |
---|---|---|---|
ALIGN | LEFT, CENTER, RIGHT | horizontal alignment of the entire table | default is LEFT, but this can be overridden by an enclosing DIV or CENTER element |
WIDTH | width specification | width of the entire table | by default, width is determined by a browser to fit the contents |
BORDER | integer | width of the frame, in pixels | value of 0 (default) means no border; some browsers also accept plain BORDER with the same meaning as BORDER=1 |
CELLSPACING | integer | spacing between cells, in pixels | see note below |
CELLPADDING | integer | spacing (padding), in pixels, between the contents of a cell and the border around a cell | |
Typically the BORDER attribute (with nonzero value) sets the default value of CELLSPACING to 1. This means that by setting a border for the entire table you also set borders of one pixel for the individual cells.
In traditional desktop publishing software, adjacent table cells share a common border. This is not the case in HTML. Each cell is given its own border which is separated from the borders around neighboring cells. This separation can be set in pixels using the CELLSPACING attribute (eg CELLSPACING=10). The same value also determines the separation between the table border and the borders of the outermost cells.
<TABLE>
<CAPTION>Areas of the Nordic countries, in sq km</CAPTION>
<TR><TH>Country</TH> <TH>Total area</TH> <TH>Land area</TH>
<TR><TH>Denmark</TH> <TD ALIGN=RIGHT> 43,070 </TD><TD ALIGN=RIGHT> 42,370</TR>
<TR><TH>Finland</TH> <TD ALIGN=RIGHT>337,030 </TD><TD ALIGN=RIGHT>305,470</TR>
<TR><TH>Iceland</TH> <TD ALIGN=RIGHT>103,000 </TD><TD ALIGN=RIGHT>100,250</TR>
<TR><TH>Norway</TH> <TD ALIGN=RIGHT>324,220 </TD><TD ALIGN=RIGHT>307,860</TR>
<TR><TH>Sweden</TH> <TD ALIGN=RIGHT>449,964 </TD><TD ALIGN=RIGHT>410,928</TR>
</TABLE>

An example of control over presentation style:

<TABLE ALIGN=CENTER WIDTH="80%" BORDER=1 CELLSPACING=10 CELLPADDING=3>
<CAPTION>The Nordic countries</CAPTION>
<TR><TD>Denmark</TD> <TD>Finland </TD> <TD>Iceland </TD> <TD>Norway </TD> <TD>Sweden </TD> </TR>
</TABLE>
Tables can be nested. However, nested tables (and large tables in general) can be confusing, and there are implementation deficiencies involved. If you have a large collection of material which might be presented as a structure of nested tables, give some thought to the question whether it is useful (to your readers) that you do so. Often it pays off to present the material first as a compact overview table, then to accompany it with tables containing details about each part.
When there is normal text before or after a table, it is advisable to end the preceding paragraph with an explicit </P> tag and to begin the following paragraph with an explicit <P> tag. Otherwise the browser (eg Netscape) may not render the table with suitable empty vertical space around it.
Be careful. If numbers of cells in different rows do not match (taking COLSPAN attributes into account), the result is most probably a total mess.
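A sketch of a table (with invented content) in which the numbers of cells do match. Every row accounts for exactly three columns when the COLSPAN attributes are taken into account:

```html
<TABLE BORDER=1>
<!-- One cell spanning all three columns: 3 -->
<TR><TH COLSPAN=3>First quarter</TH></TR>
<!-- Three ordinary cells: 1 + 1 + 1 = 3 -->
<TR><TD>January</TD> <TD>February</TD> <TD>March</TD></TR>
<!-- One cell spanning two columns plus one ordinary cell: 2 + 1 = 3 -->
<TR><TD COLSPAN=2>Winter campaign</TD> <TD>Spring campaign</TD></TR>
</TABLE>
```

Counting the columns row by row in this manner is a simple check to make before publishing a table which uses COLSPAN.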
The default alignments are often unsuitable, especially for numerical tables. Unfortunately there is no way for specifying the default alignment for table cells, except rowwise in the TR element; notice that the ALIGN attribute of a TABLE element specifies the alignment of the entire table and does not affect the default alignments for cells.
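A sketch of setting the default alignment rowwise, with invented data. ALIGN=RIGHT in each TR element right-aligns the numeric cells of that row, and the row heading overrides the row default with its own ALIGN attribute:

```html
<TABLE>
<TR><TH>Item</TH> <TH>Price</TH> <TH>Quantity</TH></TR>
<!-- ALIGN in TR sets the default for all cells of the row -->
<TR ALIGN=RIGHT><TH ALIGN=LEFT>Paper</TH> <TD>3.70</TD> <TD>12</TD></TR>
<TR ALIGN=RIGHT><TH ALIGN=LEFT>Pens</TH> <TD>4.69</TD> <TD>8</TD></TR>
</TABLE>
```

The ALIGN attribute has to be repeated in every TR element, since there is no way to set it once for the entire table.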
Several versions of Netscape do not obey an ALIGN=CENTER attribute in a TABLE element. The common solution is to enclose the entire TABLE element in a CENTER element as well.
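A minimal sketch of this workaround, with an invented table. The ALIGN=CENTER attribute is kept for browsers which obey it, and the CENTER element takes care of the others:

```html
<CENTER>
<TABLE ALIGN=CENTER BORDER=1>
<TR><TD>Denmark</TD> <TD>Finland</TD> <TD>Iceland</TD></TR>
</TABLE>
</CENTER>
```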
In principle, the end tag </TD> can always be omitted. This is not recommendable, since some browsers (including Netscape) may act incorrectly when the end tag is omitted.
attribute name | possible values | meaning | notes |
---|---|---|---|
NOWRAP | NOWRAP | suppress word wrap | equivalent to using non-breaking spaces, &nbsp;, instead of normal spaces within the contents of the cell |
ROWSPAN | integer | number of rows spanned by the cell | default is 1 |
COLSPAN | integer | number of columns spanned by the cell | default is 1 |
ALIGN | LEFT, CENTER, RIGHT | horizontal alignment of data in the cell | default is LEFT or the ALIGN attribute in an enclosing TR element |
VALIGN | TOP, MIDDLE, BOTTOM | vertical alignment of data in the cell | overrides a VALIGN attribute in an enclosing TR element |
WIDTH | integer | suggested width of the cell, in pixels | the browser should use the value unless it conflicts with the width requirements for other cells in the same column |
HEIGHT | integer | suggested height of the cell, in pixels | the browser should use the value unless it conflicts with the height requirements for other cells in the same row |
The TD and TH elements are very similar; in particular, they have the same attributes. The TD element is for data in a table whereas the TH element is for headings of columns or rows in a table. The visible differences are:
Normally you should let browsers select suitable height and width for table cells. If you really need to use WIDTH or HEIGHT attributes, it is best to specify the (same) WIDTH attribute for all elements in a column and the (same) HEIGHT attribute for all elements in a row. Some browsers might not honor the requirements otherwise; it is debatable whether this is a bug or a feature.
The area is initialized with the contents of the TEXTAREA element, using monospaced font. The contents is displayed as it is written, similarly to PRE elements.
attribute name | possible values | meaning | notes |
---|---|---|---|
NAME | string | a property name that is used to identify the textarea field when the form is submitted to the server | obligatory |
ROWS | integer | number of visible text lines | obligatory |
COLS | integer | visible width of text, in average character widths | obligatory |
A browser should not interpret the ROWS and COLS attributes as restricting the size of the actual input. On the contrary, the browser should provide some means to scroll through the contents of the textarea field when the contents extend beyond the visible area.
A browser may wrap visible text lines to keep long input lines visible without need for scrolling.
The contents is used to initialize the text that is shown in the input field when the document is first loaded.
<TEXTAREA NAME=address ROWS=4 COLS=40> Your address here ... </TEXTAREA>
For single-line input fields you can use an INPUT element with TYPE=TEXT.
It is recommended in the specifications that user agents canonicalize line endings to CR, LF (ASCII decimal 13, 10) when submitting the contents of the field. However, authors should not rely on this, since not all browsers behave so. The character set for submitted data should be ISO Latin 1, unless the server has previously indicated that it can support alternative character sets.
The HTML specifications do not quite explicitly require that the contents of a TEXTAREA element (specifying the initial value) is to be rendered as it is written with respect to division into lines etc (similarly to PRE elements), but this is clearly the intention.
Browsers do not always honor the ROWS and COLS attributes exactly. Rather often the visible input area is somewhat larger than specified by them.
You cannot use ROWS and COLS attributes to restrict the size of the actual input, nor can you do that with other HTML constructs. The script that processes the form can be written so that it takes care of handling excessively large input if needed.
In principle, the end tag </TH> can always be omitted. This is not recommendable, since some browsers (including Netscape) may act incorrectly when the end tag is omitted.
attribute name | possible values | meaning | notes |
---|---|---|---|
NOWRAP | NOWRAP | suppress word wrap | equivalent to using non-breaking spaces, &nbsp;, instead of normal spaces within the contents of the cell |
ROWSPAN | integer | number of rows spanned by the cell | default is 1 |
COLSPAN | integer | number of columns spanned by the cell | default is 1 |
ALIGN | LEFT, CENTER, RIGHT | horizontal alignment of data in the cell | default is CENTER or the ALIGN attribute in an enclosing TR element |
VALIGN | TOP, MIDDLE, BOTTOM | vertical alignment of data in the cell | overrides a VALIGN attribute in an enclosing TR element |
WIDTH | integer | suggested width of the cell, in pixels | the browser should use the value unless it conflicts with the width requirements for other cells in the same column |
HEIGHT | integer | suggested height of the cell, in pixels | the browser should use the value unless it conflicts with the height requirements for other cells in the same row |
The TD and TH elements are very similar; in particular, they have the same attributes. The TD element is for data in a table whereas the TH element is for headings of columns or rows in a table. The visible differences are:
<TITLE>A study of population dynamics</TITLE>
On the other hand, the title should be relatively short to fit into one line under all reasonable circumstances. The HTML 2.0 specification says that long titles may be truncated and that titles should be at most 63 characters in length.
See also general notes about the head section.
Use the H1 or some other heading element to specify the main heading to be displayed as part of the document. Using such a heading at the beginning of a document and using a TITLE element are not alternatives but serve different purposes; both are strongly recommended. The title text and the main heading text may well be identical, but of course they need not be.
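A minimal document skeleton with both a title and a main heading could look like this. It reuses the sample title from above; the body text is just a placeholder.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<!-- The title is typically shown outside the document text, e.g. in a window title bar. -->
<TITLE>A study of population dynamics</TITLE>
</HEAD>
<BODY>
<!-- The main heading is displayed as part of the document itself. -->
<H1>A study of population dynamics</H1>
<P>Text of the document.
</BODY>
</HTML>
```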
In principle, the end tag </TR> can always be omitted. This is not recommended, since some browsers (including Netscape) may behave incorrectly when the end tag is omitted.
attribute name | possible values | meaning | notes |
---|---|---|---|
ALIGN | LEFT, CENTER, RIGHT | default horizontal alignment in cells | can be overridden by ALIGN attributes in TH and TD elements |
VALIGN | TOP, MIDDLE, BOTTOM | default vertical alignment in cells | can be overridden by VALIGN attributes in TH and TD elements |
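As a sketch of how the row-level defaults interact with attributes of individual cells, consider a row where all cells are right-aligned except one. The numbers are sample data only.

```html
<TABLE BORDER>
<!-- ALIGN and VALIGN here set the defaults for every cell in the row. -->
<TR ALIGN=RIGHT VALIGN=TOP>
  <TD>3.70</TD>
  <TD>4.69</TD>
  <!-- This cell overrides the row default and is centered instead. -->
  <TD ALIGN=CENTER>8.02</TD>
</TR>
</TABLE>
```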
<TR><TD>3.70 <TD>4.69 <TD>8.02 </TR>
Compare <TT>monospaced font</TT> with normal font.
See general notes on text markup, which provide additional examples.
Compare <U>underlined text</U> with normal text.
In typewritten text, underlining is customarily used for various purposes other than emphasis, too, but in HTML it is usually better to use e.g. the I element (to produce italics).
One particular reason for avoiding U is that typically Web browsers present links using underlining (instead of or in addition to other methods such as different color). Therefore, if you use U elements, the reader may have serious difficulties in distinguishing them from links.
The HTML 2.0 specification does not include U but mentions it as an element which has been "deployed to some extent".
See general notes on text markup, which provide additional examples.
attribute name | possible values | meaning | notes |
---|---|---|---|
TYPE | DISC, SQUARE, CIRCLE | default bullet style for items | Not in HTML 2.0! |
COMPACT | COMPACT | reduced interitem spacing | |
The default value of the bullet type generally depends on the level of nesting of (various) lists.
Remember to buy <UL> <LI> milk <LI> bread <LI> apples. </UL>
A contrived example to show what the bullets may look like. Notice that a TYPE attribute in an LI element overrides that of an enclosing UL element.
<UL TYPE=DISC COMPACT> <LI> disc <LI TYPE=SQUARE> square <LI TYPE=CIRCLE> circle </UL>
See also Examples of various list elements in HTML.
A UL element must contain at least one LI element. Some people and some HTML editors may generate UL elements with just text within, possibly even nesting UL elements just in the hope of getting different amounts of indentation. If you have to resort to such tricks, enclose the text in an LI element (although this will usually cause a bullet in the display) and that in turn in a UL element. (Style sheets will provide mechanisms for controlling indentation.)
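If indentation is attempted this way despite the drawbacks, the markup could be kept valid along the following lines. This is only a sketch with placeholder text; each piece of text sits inside an LI element, so every UL element contains at least one LI.

```html
<UL>
<!-- The text is wrapped in LI, so a bullet will usually be displayed. -->
<LI> This text is indented as a list item.
  <UL>
  <LI> This text is nested one level deeper.
  </UL>
</UL>
```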
VAR { font-style : italic }
.)
See general notes on rendering markup.
In the simplest case, the command for deleting a file in Unix is<BR> <KBD>rm</KBD> <VAR>filename</VAR>