Tags
All content in XHTML is placed in so-called tags. The syntax of this paragraph, for example, looks like this:
<p lang="en">
All content in XHTML is placed in so-called tags. The syntax of this paragraph, for example, looks like this:
</p>
Every kind of content has its own tag. In the example above the content is a paragraph, and paragraphs get a 'p' tag. Tag names are written in lowercase. Right after the letter p comes a space and then a so-called 'attribute'. An attribute is a piece of code that says something about the tag or its content. In this case we are looking at a lang attribute, which states the language of the content within the tag. Other attributes can influence the graphical presentation of the content or attach scripts to elements.
Another thing that catches the eye is that the content is followed by '</p>'. This piece of code closes the tag. It is an unbreakable law that every tag in XHTML must be closed.
A few commonly used tags are: <p> (paragraph text), <img> (image), <ul> (unordered list), <table> (table), <h1> (high level header), <h2> (lower level header), <div> (division, rough segmentation of groups of content) and <a> (hyperlink).
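As a rough sketch of how these tags look in practice, the fragment below combines them. The id, file name, link address and texts are placeholders chosen for illustration, not taken from a real page. Note how every tag is closed, and how an element without content, such as img, is closed within the tag itself.

```html
<div id="intro">
  <h1>Page title</h1>
  <h2>Section title</h2>
  <p lang="en">A paragraph with a <a href="http://www.example.com/">hyperlink</a>.</p>
  <ul>
    <li>First item</li>
    <li>Second item</li>
  </ul>
  <!-- An element without content is terminated with " />" -->
  <img src="photo.jpg" alt="Description of the photo" />
  <table>
    <tr>
      <td>A table cell</td>
    </tr>
  </table>
</div>
```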
Head
The article Anatomy of a (web) document explained how an XHTML document is divided into a head section and a body section. The head section contains information that is not intended to be seen by the end user, but instead helps a web browser render the page correctly. Below, I will touch on the most important and most commonly used parts of the head section.
Meta
One of the most important elements that we put inside the head section is the character set declaration of the document's content. A character set is a table of codes and their matching letters or characters. Imagine that, for whatever reason, you would like to use a character set that contains only (post-)Latin characters; then this is the place to indicate it. The browser then knows which character set to use and is able to render the text as intended.
There is a vast number of character sets to choose from, each with its own history. A few random examples are:
- iso-8859-1 (Exclusively suitable for some western languages, including English)
- iso-8859-5 (Cyrillic)
- big5 (Chinese, traditional)
- windows-950 (Chinese, traditional)
- shift_jis (Japanese)
- iso-8859-8 (Hebrew)
- iso-8859-3 (Maltese)
- iso-8859-6 (Arabic)
- utf-8 (All sets)
- utf-16 (All sets)
The above list is far from complete. For historical reasons, many alphabets and writing systems have at least one dedicated code table. Some scripts, such as Arabic, have four or more, all of which try to do the same job. For those who are interested, iana.org offers a complete list of all registered character sets.
It would be far beyond the scope of this article to elaborate on the history and workings of text encoding. Therefore my dogma: use UTF-8 exclusively. UTF-8 is an encoding from the so-called Unicode family. Unicode offers a standard encoding that covers the characters of practically all writing systems, plus room for future additions. This may sound ideal to you, and so it is. There is more to learn about text encoding in general and Unicode in particular at joelonsoftware.com.
The syntax for the character set declaration is as follows:
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
Here utf-8 indicates the character set. As you may have noticed, the character set declaration is applied as an attribute of the meta tag.
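To show where this declaration lives, here is a minimal sketch of a head section. The title text is a placeholder; the title element itself is included because XHTML requires one in every head.

```html
<head>
  <!-- Character set declaration, as described above -->
  <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
  <!-- Placeholder title; not shown in the page itself, but in the browser's title bar -->
  <title>Example page</title>
</head>
```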
An XHTML document is a full member of the XML family. Therefore, the very first two lines of each XHTML document will look like this:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
The first line declares the XML version used, as well as the character set in which the document itself is encoded. Here, too, UTF-8 is recommended. The second line indicates which XHTML version the document uses. We call this the document type declaration, and it is probably one of the most important lines of the document: because of this line, the browser knows exactly how to interpret the document's code. In this case the XHTML 1.0 Transitional standard is used.
The image below illustrates what the code of a simple XHTML page may look like.
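In text form, such a page could look roughly like the sketch below. It only combines the pieces discussed so far; the title, heading and paragraph texts are placeholders and are not taken from the original example.

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
    <title>A simple XHTML page</title>
  </head>
  <body>
    <!-- Placeholder content: one heading and two paragraphs -->
    <h1>A simple XHTML page</h1>
    <p>This is the first paragraph.</p>
    <p>This is the second paragraph.</p>
  </body>
</html>
```

Saved as a file and opened in a browser, a page like this should show the heading and the two paragraphs without any styling being added.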

It might have occurred to you that your browser automatically leaves some space between two paragraphs of text. Besides that, the heading of the page looks a little larger and bolder. This happens because your browser knows, even without any layout information, what to do with things such as paragraphs and headings, but also lists, tables, etc.
The xmlns attribute on the html tag won't have escaped your attention either. This is a mandatory part of the XHTML document. It names the XML namespace, written in the form of an internet address, to which the tags in the document belong.
Now that you are capable of assembling an XHTML document, you are ready to embellish this page using Cascading Style Sheets, or simply 'CSS'.