This section describes how the 24 December 1999 version of the HTML 4.01
specification differs from the 24 April 1998 version of the HTML 4.0
specification.
In section 1.4, removed copyright details; the section now refers to the
W3C site instead.
All references to the document character set now cite ISO 10646 (with one
mention of UNICODE to signal equivalence). Other references to UNICODE refer only to the
bidirectionality algorithm.
Any changes to future HTML 4 DTDs will not invalidate documents that
conform to the DTDs of the present specification. The HTML Working Group
reserves the right to correct known bugs.
Software conforming to the DTDs of the present specification may ignore
features of future HTML 4 DTDs that it does not recognize.
7.2 HTML version information: Use
undated, HTML 4 URIs for system identifiers. These URIs are also used globally
in all examples.
7.4.4 Meta data: Removed note
about ongoing work at W3C on meta data and replaced with a note about RDF.
7.4.4.2 Meta data: At the end
of the section on HTTP headers, removed the auto-refresh example (since not
part of the Recommendation) and added a note to use server-side redirects.
12.2 The A element: The
description of the type attribute for the A
and (LINK) elements has been modified to emphasize its advisory
nature.
12.2.3 Anchors with the id
attribute: It is legal for "name" and "id" to appear in the same start tag
when they are both defined for an element. They must have identical
values.
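The rule above can be checked mechanically. The following is a minimal sketch, not part of the specification: it uses Python's standard html.parser module, and the names NameIdChecker and find_name_id_mismatches are hypothetical. As a simplification, it inspects every start tag rather than only the elements for which both attributes are defined.

```python
from html.parser import HTMLParser


class NameIdChecker(HTMLParser):
    """Collect start tags whose name and id attributes disagree."""

    def __init__(self):
        super().__init__()
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only tags carrying both attributes are subject to the rule.
        if "name" in attr_map and "id" in attr_map:
            if attr_map["name"] != attr_map["id"]:
                self.mismatches.append((tag, attr_map["name"], attr_map["id"]))


def find_name_id_mismatches(markup):
    """Return (tag, name, id) tuples for start tags that break the rule."""
    checker = NameIdChecker()
    checker.feed(markup)
    checker.close()
    return checker.mismatches
```

The parser reports tag and attribute names in lowercase, so an A element appears as "a" in the result.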
12.3.3 Links and search
engines: Removed reference to dir attribute in example since it
doesn't apply to linked resources (only element content and attribute text
values).
12.4.1 Resolving relative URIs:
Since RFC 2616 does not include a Link header field, the following statement is
qualified for earlier versions of HTTP 1.1: "Link elements specified by HTTP
headers are handled exactly as LINK elements that appear explicitly in a
document."
13.2 The IMG element: Added a
note that user agents must provide different mechanisms for accessing the
"longdesc" URI (of IMG) and the "src" URI (of A) when an IMG is part of the
content of an A element.
13.3 The OBJECT element: Added
a note that when the value of "type" for OBJECT and the Content-Type HTTP
header differ, the latter takes precedence.
13.3 The OBJECT element: Added
a statement to use PARAM instead of the "data" and "classid" attributes for
OBJECT together.
13.4 The APPLET element: Added
a note that, for security reasons, only subdirectories are searched for the
"codebase" attribute of APPLET.
13.6.1 Client-side image
maps: The definition of the "poly" value has been clarified. A note now
states that user agents should close the polygon described by the "coords"
attribute of AREA if authors have not closed it.
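One way a user agent might apply this note is sketched below. This is an illustration only, under the assumption that "coords" holds a comma-separated list of x,y pairs; the function name close_polygon is hypothetical.

```python
def close_polygon(coords):
    """Return coords with the first x,y pair repeated at the end if needed."""
    values = [value.strip() for value in coords.split(",")]
    if len(values) < 2 or len(values) % 2 != 0:
        raise ValueError("coords must hold an even number of values")
    # A polygon is closed when its last pair equals its first pair.
    if values[:2] != values[-2:]:
        values += values[:2]
    return ",".join(values)
```

A value that is already closed is returned unchanged, so the function can be applied to any "poly" coordinate list.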
17.2.1 Control types: In the
description of radio buttons, when no radio button is initially selected, user
agent behavior for selecting one is undefined. This differs from RFC 1866.
17.4 The INPUT element: Added
missing "ismap" for the INPUT element. Also, in the definition of value, added "checkbox" to the values of type that require a value.
17.6.1: When no option is
preselected, user agent behavior is undefined. Authors should supply an
explicit none option to cover this case. This behavior differs from RFC
1866.
SGML Declaration of HTML 4:
Removed text about up-to-date references to ISO 10646. Replaced with:
"Revisions of the HTML 4 specification may update the reference to ISO 10646 to
include additional changes."
Image map examples using "poly" have been fixed to form a closed polygon.
Also, the last pair of coordinates is the same as the first to close the
polygon.
The HR element should also take the
lang and dir attributes. These are noted as being defined
elsewhere at the element's definition, but were left out of the DTDs.
The OBJECT element's archive
attribute is defined in the DTD as taking a value of type "%URI;". This is
incorrect: the value may be a space-separated list of URIs (as indicated in the
definition of the attribute and in the DTD comment).
The FORM element's DTD fragment should include a definition for
the accept attribute, which is listed in the element's
definition. The definition should be the following:
accept %ContentTypes; #IMPLIED -- list of MIME types for file upload --
At the end of the section, the following sentences are incorrect: "The list
of terms in the content is ALL, INDEX, NOFOLLOW, NOINDEX. The name and the
content attribute values are case-insensitive." In fact, the
META definition specifies that values for the
name and content attributes are case-sensitive.
The specification reads, "Blank lines are not permitted." Blank lines are
permitted in the robots.txt file, just not within a single "record". Note that
the specification doesn't define record.
Further down the page, the specification reads, "There must be exactly one
"User-agent" field per record." In fact, there can be more than one User-Agent
field in the robots.txt file, just not more than one per record.
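The distinction between the file and a record can be illustrated with a short sketch. Because the specification does not define "record", the code assumes that records are groups of lines separated by blank lines and that lines starting with "#" are comments; both are assumptions, and the function names split_records and user_agent_count are hypothetical.

```python
def split_records(text):
    """Split robots.txt text into records, assuming blank lines separate them."""
    records, current = [], []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # A blank line ends the current record, if one is open.
            if current:
                records.append(current)
                current = []
            continue
        if stripped.startswith("#"):
            # Assumed comment syntax; comment lines are skipped.
            continue
        current.append(stripped)
    if current:
        records.append(current)
    return records


def user_agent_count(record):
    """Count the User-agent fields in one record (field names compared case-insensitively)."""
    return sum(
        1
        for line in record
        if line.split(":", 1)[0].strip().lower() == "user-agent"
    )
```

Under these assumptions, a file may contain several User-agent fields in total while each record still contains only one.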
For information about search robots, please consult, for example:
The [URI] reference should be
updated to RFC 2396 as of August 1998. "Uniform Resource Identifiers (URI):
Generic Syntax", T. Berners-Lee, R. Fielding, L. Masinter, August 1998. RFC
2396 updates [RFC1738] and [RFC1808].
A.1.3 Minor typographical errors that were
corrected
In the sentence beginning "Please consult the SGML standard", the phrase
"an end tag closes all omitted start tags up to the matching start tag (section
7.5.1)" should read "an end tag closes, back to the matching start tag, all
unclosed intervening start tags with omitted end tags".
The sentence "The first COL element refers to the first 39
columns (doing nothing special to them) and the second one assigns an id value
to the fortieth columns so that style sheets may refer to it." should have
"fortieth column" instead.
The last sentence should read "Further information is given below on using
links for..." (change "of" to "on"). This sentence is also missing its closing
punctuation.
In the paragraph that begins "In the following example...", the phrase
"cause it so be instantiated" should be changed to "cause it to be instantiated"
(change "so" to "to").
Just after the deprecated example, the sentence "This example may be
rewritten as follows with OBJECT as follows:" should say "This
example may be rewritten with OBJECT as follows:".
Under the "coords" attribute, the word "and" should be substituted for the
word "a" so the sentence reads, "This attribute specifies the position and
shape on the screen."
In the "Deprecated" example, the first sentence should read "If the
clear attribute is set to left or all, the next line will appear as
follows:" ("the" before "next line").
The list of "attributes defined elsewhere" was inadvertently omitted after
the definition of NOFRAMES. These attributes are:
class, id, lang,
dir, title, style, and the %events; attributes.
In the examples at the end of the section, change "Content-Disposition:
attachment" to "Content-Disposition: file". Also, in an earlier example, change
"server.dom" to "server.com".
After the first example, the indefinite article before "content-type" needs
to be "a", not "an". The same applies to "content-type" in the next paragraph.
In the sentence beginning "Documents that do not specify...", the indefinite
article "a" needs to be removed from before "default scripting language
information".
In the paragraph on the COLGROUP element, the last
sentence should read: "The semantics of
COLGROUP have been clarified over previous drafts, and
rules="basic" has been replaced by rules="groups"."
Under "Provide keywords and descriptions", the middle of the sentence "The
value of the name attribute sought by a search
attribute is not defined by this specification." should read "search engine"
instead.
In the seventh paragraph, added "back to the matching start tag" to "(e.g.,
they must be properly nested, an end tag closes, back to the matching start
tag, all unclosed intervening start tags with omitted end tags (section 7.5.1),
etc.)."
In a content model definition, "A" means that "A" must occur one time and
only one time. Also, added "+(A)" and "-(A)" to the section on content model
syntax.
All uses of "cracker" in this section and its subsections are replaced with
"hacker". Also, definitions of "hacker" and "nerd" taken from "The Hacker's
Dictionary".
Some versions of Netscape Navigator 4.0X crash upon reading Chapter 3 of
previous versions of this specification. Netscape is aware of this bug and has
fixed it in version 4.5. To work around this bug, go to the
Edit/Preferences/Advanced submenu and disable Style Sheets (and possibly
JavaScript).
"http://www.w3.org/TR/PR-html4/cover.html" was said to designate the
current HTML specification. The current HTML specification is actually at
http://www.w3.org/TR/REC-html40.
The second paragraph read "In this table definition, we specify that the
cell in row four, column two should span a total of three columns, including
the current row." It now ends "...including the current column."
The last sentence of the second paragraph applied to both the
IMG and INPUT elements. However, the
ismap attribute is not defined for
INPUT. The sentence now only applies to
IMG.
The CSS style rule "BR.mybr { clear: left }" was incorrect, since it refers
to the class "mybr" and not the id value. The correct syntax is: "BR#mybr {
clear: left }".
All the examples containing a Document Type Declaration used something like
"THE_LATEST_VERSION_/frameset.dtd" or "THE_LATEST_VERSION_" as the system
identifier for the Frameset DTD. They now use the proper document type
declaration indicated in Section 7.2.
The "attributes defined elsewhere" for
OPTION and OPTGROUP mistakenly listed
onfocus, onblur, and
onchange. The "attributes defined elsewhere" section was missing for
the SELECT element (please see the DTD for the full list of
attributes).
The sentence "The following elements support the
readonly attribute: INPUT and
TEXTAREA." read "The following elements support the
readonly attribute: INPUT,
TEXT, PASSWORD, and
TEXTAREA."
The first paragraph read: "It is also possible to specify the scripting
language in each SCRIPT element via the
type attribute. In the absence of a default scripting language
specification, this attribute must be set on each
SCRIPT element." Since the type attribute
is required for the SCRIPT element, this paragraph now
reads: "The type attribute must be specified for
each SCRIPT element instance in a document. The value of the
type attribute for a
SCRIPT element overrides the default scripting language for that
element."
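A document can be checked against the revised wording with a small script. The sketch below is illustrative only: it relies on Python's standard html.parser module, and the names ScriptTypeChecker and scripts_missing_type are hypothetical.

```python
from html.parser import HTMLParser


class ScriptTypeChecker(HTMLParser):
    """Count SCRIPT start tags that lack a non-empty type attribute."""

    def __init__(self):
        super().__init__()
        self.missing = 0

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive in lowercase from the parser.
        if tag == "script" and not dict(attrs).get("type"):
            self.missing += 1


def scripts_missing_type(markup):
    """Return how many SCRIPT elements in markup omit the type attribute."""
    checker = ScriptTypeChecker()
    checker.feed(markup)
    checker.close()
    return checker.missing
```

A result of zero means every SCRIPT element in the input carries a type attribute; the sketch does not check whether the value names a real content type.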
The comment for the character reference "not" read "= discretionary
hyphen". This has been removed.
The FPI in the comment read "-//W3C//ENTITIES Full Latin 1//EN//HTML"; it
is now "-//W3C//ENTITIES Latin1//EN//HTML".
The S element, which is deprecated, was listed as part of the changes between HTML
3.2 and HTML 4.0. This element was not actually defined in HTML 3.2. It is now in the new elements
list.
The longdesc attribute was said to be specified for
tables. It is not. Instead, the summary attribute allows authors
to give longer descriptions of tables.
The sentence "You may help search engines by using the
LINK element with rel="start" along with the
title attribute,..." read "You may help search engines by using the
LINK element with rel="begin" along with a
TITLE,..." The same stands for the companion example.
The sentence "This can be altered by setting the width attribute of the
TABLE element." read "This can be altered by setting the width-TABLE
attribute of the TABLE element."
The sentence "Rules for handling objects too large for a column apply when
the explicit or implied alignment results in a situation where the data exceeds
the assigned width of the column." read "too large for column". The meaning of
the sentence was unclear since it referred to "rules" governing an error
condition; user agent behavior in error conditions lies outside the scope of
the specification.
The second word "of" was missing in "Despite never receiving consensus in
standards discussions, these drafts led to the adoption of a range of new
features."
The sentence "Element types that are designed to have no content are called
empty elements." contained one too many "elements". The word "a" was missing in
the sentence "A few HTML element types use an additional SGML feature to
exclude elements from a content model".
Also, in list item two, a period was missing between "optional" and
"Two".
The last word "a" was missing in the sentence "The meaning of a property
and the set of legal values for that property should be defined in a reference
lexicon called profile."
The sentence "Links that express other types of relationships have one or
more link types specified in their source anchors." read "Links that express
other types of relationships have one or more link type specified in their
source anchor."
The second paragraph reads "the hreflang attribute provides user agents
about the language of a..." It should read "the hreflang attribute provides
user agents with information about the language of a..."
In the sentence beginning "Any number of
PARAM elements may appear in the content of an
OBJECT or APPLET element,..." a space was
missing between "APPLET" and "element".
The first sentence read, "In an HTML document, an element must receive
focus from the user in order to become active and perform their tasks" (instead
of "its" tasks).
Just before section 18.2.3, the sentence that includes "a name attribute
takes precedence over an id if both are set." read "over a id if both are
set.".
This section referred to a non-existent cols attribute. This
attribute is not part of HTML 4.0. Calculating the number of columns in a table
is described in Section
11.2.4.3, in the chapter on tables. In sections B.5.1 and B.5.2,
occurrences of cols have been replaced by "the number of columns
specified by the COL and COLGROUP elements".
In the sentence "The values for the frame attribute have been chosen to
avoid clashes with the rules, align and valign attributes." a space was missing
between "the" and "frame" and the last attribute was "valign-COLGROUP".
Almost all attributes that specify the presentation of an HTML document
(e.g., colors, alignment, fonts, graphics, etc.) have been deprecated in favor of style sheets. The list of attributes in the appendix
indicates which attributes have been
deprecated.
The id and class attributes allow authors to assign
name and class information to
elements for style sheets, as anchors, for scripting, for object declarations,
general purpose document processing, etc.
The HTML 4.0 table model has grown out of early work on HTML+ and the
initial draft of
HTML3.0. The earlier model has been extended in response to requests from
information providers as follows:
Authors may specify tables that may be incrementally displayed as the user
agent receives data.
Authors may specify tables that are more accessible to users with
non-visual user agents.
Authors may specify tables with fixed headers and footers. User agents may
take advantage of these when scrolling large tables or rendering tables to
paged media.
The HTML 4.0 table model also satisfies requests for optional column-based
defaults for alignment properties, more flexibility in specifying table frames
and rules, and the ability to align on designated characters. It is expected,
however, that style sheets will take over
the task of rendering tables in the near future.
In addition, a major goal has been to provide backwards compatibility with
the widely deployed Netscape implementation of tables. Another goal has been to
simplify importing tables conforming to the SGML CALS model. The latest draft
makes the align attribute compatible with the latest
versions of the most popular browsers. Some clarifications have been made to
the role of the dir attribute and recommended behavior when
absolute and relative column widths are mixed.
A new element, COLGROUP, has been introduced to allow sets of
columns to be grouped with different width and alignment properties specified
by one or more COL elements. The semantics of
COLGROUP have been clarified over previous drafts, and
rules="basic" has been replaced by rules="groups".
The style attribute is included as a means for extending
the properties associated with edges and interiors of groups of cells. Examples
include the line style (dotted, double, thin/thick, etc.), the color or pattern
fill for the interior, cell margins, and font information. This will be the
subject for a companion specification on style sheets.
The frame and rules attributes have been
modified to avoid SGML name clashes with each other, and to avoid clashes with
the align and
valign attributes. These changes were additionally motivated by the
desire to avoid future problems if this specification is extended to allow frame and rules attributes with other
table elements.
The OBJECT element allows generic inclusion of
objects.
The IFRAME and
OBJECT elements allow authors to create embedded documents.
The alt attribute is required on the
IMG and AREA elements.
The mechanism for creating
image maps now allows authors to create more accessible image maps. The
content model of the MAP element has changed for this reason.
This specification introduces several new attributes and elements that
affect forms:
The accesskey attribute allows authors to specify direct
keyboard access to form controls.
The disabled attribute allows authors to make a form
control initially insensitive.
The readonly attribute allows authors to prohibit changes
to a form control.
The LABEL element associates a label with a particular form
control.
The FIELDSET element groups related fields together and,
in association with the LEGEND element, can be used to name the
group. Both of these new elements allow better rendering and better
interactivity. Speech-based browsers can better describe the form and graphic
browsers can make labels sensitive.
A new set of attributes, in combination with scripts, allows form providers to verify
user-entered data.
The BUTTON element and
INPUT with type set to "button" can be used
in combination with scripts to create
richer forms.
The OPTGROUP element allows authors to group menu options
together in a SELECT, which is particularly important for form
accessibility.
Many elements now feature event
attributes that may be coupled with scripts; the script is executed when
the event occurs (e.g., when a document is loaded, when the mouse is clicked,
etc.).
The HTML 4.0 specification makes additional clarifications with respect to
the bidirectional
algorithm.
The use of CDATA to define the
SCRIPT and STYLE elements does not preserve the
ability to transcode documents, as described in section 2.1 of
[RFC2070].