Getting the start of HTML pages right
We will begin at the beginning, which seems the logical place to start. Let's consider the opening elements of an HTML page and make sure we understand what each essential part does.
As with so many things on the web, remembering the exact syntax of each item inside the head section is not particularly important. Understanding what each item is for is important, however. I generally copy and paste the opening code each time, or keep it saved as a text snippet, and I would recommend you do too.
The first few lines of an HTML page should look something like this:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
The doctype
So, what do we actually have there? First of all, we opened our document with the HTML5 Doctype declaration:
<!DOCTYPE html>
If you're a fan of lowercase, then <!doctype html> is just as good. It makes no difference.
The html tag and lang attribute
After the Doctype declaration, we open the html tag, which is the first and therefore the root element of our document. We also use the lang attribute to specify the language for the document, and then we open the <head> section:
<html lang="en">
<head>
Specifying alternate languages
According to the W3C specifications (http://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes), the lang attribute specifies the primary language for the element's contents and for any of the element's attributes that contain text. You can imagine how useful this will be to assistive technology such as screen readers. If you're not writing pages in English, you'd best specify the correct language code. For example, for Japanese, the HTML tag would be <html lang="ja">.
For a full list of languages, take a look at http://www.iana.org/assignments/language-subtag-registry.
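Because the lang attribute describes "the element's contents", it can also be set on an element inside the page, not only on the html tag. The following is a minimal sketch of that idea; the sentence and the Japanese phrase are made up purely for illustration:

```html
<!-- Hypothetical content: a page declared as English that quotes one Japanese phrase -->
<p>The Japanese word for "thank you" is <span lang="ja">ありがとう</span>.</p>
```

Here the page as a whole would still be declared with <html lang="en">, while the span marks just the phrase that is in a different language.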
Character encoding
Finally, we specify the character encoding, which in simple terms tells the browser how to interpret the bytes of the document as characters. As meta is a void element, it doesn't require a closing tag:
<meta charset="utf-8" />
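To illustrate what "void element" means in practice, here is a small sketch showing the two ways the same tag can be written in HTML. The trailing slash is optional and browsers treat both forms the same way. These are alternatives, so a real page should contain only one of them:

```html
<!-- Form 1: no trailing slash -->
<meta charset="utf-8">

<!-- Form 2: trailing slash, as used in this chapter -->
<meta charset="utf-8" />
```

Which form you use is a matter of style; the important thing is to be consistent within a project.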
Unless you have a good reason to specify otherwise, the value for the charset is always utf-8.
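Putting the pieces covered so far together, a complete minimal page might look something like the sketch below. The title element, the closing tags, and the body are not part of the opening lines discussed above; they are added here as assumptions so the example forms a whole document, and the title text is a placeholder:

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8" />
    <!-- Placeholder title: replace with your own -->
    <title>Page title</title>
  </head>
  <body>
    <!-- Page content goes here -->
  </body>
</html>
```

This is the kind of opening code worth keeping in a text snippet, ready to paste in at the start of each new page.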