"lang=" attribute



HTML 4.0

This is paraphrased from HTML 4.0, which was proposed in working draft form on July 8, 1997. At the time of this writing, its acceptance had not been addressed.

Note: User agents (browsers) must not use the "lang=" attribute to determine text directionality.

The "lang=" attribute establishes the natural, spoken language for a page or a part of it. The objective is to make the Internet truly international, accepting all languages, characters and writing styles. The "lang=" attribute allows rules to be set for search engines, speech synthesis, character sets (fonts or glyphs), quotation marks, hyphenation, ligatures, spacing, and spelling and grammar checking, based on the stipulated language code.

The argument to the "lang=" attribute is made up of two parts: a primary code and an optional subcode, separated by a hyphen ("-"). The primary code is usually a two-letter language code.

e.g. <tag lang="en">
A two-letter subcode is "understood to be a (ISO 3166) country code". The W3C gives several examples, not all of which use country codes:
<tag lang="en-US"> <tag lang="en-cockney"> <tag lang="i-cherokee">
They also propose a method of handling such "artificial languages" as Elvish and Klingon. For such languages they propose the primary code "x", as in lang="x-klingon".
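
The two-part structure described above can be sketched in Python. This is only an illustration, not part of the HTML 4.0 draft: the function names split_lang and classify_primary and the category labels are hypothetical, and the reading of "i" as an IANA-registered prefix comes from RFC 1766 rather than from the text above.

```python
def split_lang(value):
    """Split a lang value into (primary, subcodes) at the hyphens."""
    primary, *subcodes = value.split("-")
    return primary, subcodes


def classify_primary(primary):
    """Give a rough label for a primary code (labels are illustrative)."""
    code = primary.lower()  # language codes are not case-sensitive
    if code == "x":
        return "experimental"       # e.g. artificial languages
    if code == "i":
        return "iana-registered"    # assumption taken from RFC 1766
    if len(code) == 2 and code.isascii() and code.isalpha():
        return "two-letter language code"
    return "unrecognized"


for example in ("en", "en-US", "en-cockney", "i-cherokee"):
    primary, subcodes = split_lang(example)
    print(example, "->", classify_primary(primary), subcodes)
```

The sketch only separates and labels the parts; it does not check that a two-letter code is actually a registered language or country code.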

The default value of "lang=" is "unknown". White space is not allowed in the argument, and language codes are not case-sensitive.
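
As a minimal sketch of these three rules (default value, no white space, case-insensitivity), the following Python helpers show one way a program might normalize and compare "lang=" values. The names normalize_lang and same_lang are hypothetical, and treating an empty value as "unknown" is an assumption made for the example.

```python
def normalize_lang(value=None):
    """Return a lowercase lang value, or "unknown" when none is given."""
    if not value:
        # Assumption: a missing or empty value falls back to the default.
        return "unknown"
    if any(ch.isspace() for ch in value):
        raise ValueError("white space is not allowed in a lang value")
    return value.lower()


def same_lang(a, b):
    """Compare two lang values without regard to case."""
    return normalize_lang(a) == normalize_lang(b)


print(normalize_lang())            # no value given
print(same_lang("en-US", "EN-us"))
```

Lowercasing before comparison is just one convenient way to get a case-insensitive match; the draft itself does not prescribe a canonical form.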