HTML Basics

The Hypertext Markup Language (HTML), first developed in the early 1990s, is the language used to write hypertext documents for the World Wide Web. HTML is a "markup language": it defines the structure of a document using tags placed around text. A web browser reads this markup and displays a web page. This article begins with a brief introduction to HTML and then focuses on the structure of an HTML document, mandatory and optional tags, and attributes. As you read these sections, pay attention to the rules for writing HTML. For example, every tag is enclosed in an opening "<" and a closing ">" angle bracket, and an HTML5 document is conventionally built on four basic tags.
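
As a quick preview of the vocabulary used in the rest of this article, here is a minimal sketch of a single element. The paragraph tag and the lang attribute are illustrative choices only; neither is part of the four basic tags discussed next.

```html
<!-- opening tag: the tag name and one attribute (name="value") inside angle brackets -->
<p lang="en">
  This text is the content of the paragraph element.
</p>
<!-- the closing tag above repeats the tag name after a slash -->
```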

Standard HTML Tags

Four HTML tags are considered standard for all web pages: <html>, <head>, <title>, and <body>. Strictly speaking, an HTML5 parser will infer <html>, <head>, and <body> when they are left out, and only the title is required in most documents. Writing all four explicitly is nevertheless the accepted convention, and many IDEs insert them automatically when you start a new HTML page. Since these four tags are recommended for every web page, I personally keep a template, shown below, that I copy whenever I begin a new page.

<html>
  <head>
    <title>Please change this to the title of your page</title>
  </head>
  <body>
  </body>
</html>

Program 2 – HTML template
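
Browsers also expect a document type declaration before the html tag; without it they fall back to a legacy "quirks" rendering mode. A slightly fuller variant of the template might look like the sketch below. The lang attribute and the meta charset line are common additions assumed here for illustration, not part of the template above.

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- declares the character encoding of the file -->
    <meta charset="utf-8">
    <title>Please change this to the title of your page</title>
  </head>
  <body>
  </body>
</html>
```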

This template code can be represented as a top-down tree, as shown below. In this tree, the html tag contains two elements, the head and the body. Likewise, the head section contains the title, and as we will see shortly, the head and the body sections will contain many other HTML elements. Thus, we will call these three tags (html, head, and body) container tags. The title contains only a block of text, so we will call it a block tag.


Figure 1 - Tree layout of an html document
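
To make the container idea concrete, the sketch below fills in the template with a few extra elements. The meta, h1, and p elements are illustrative choices that have not been introduced yet; the point is only where they sit in the tree.

```html
<html>
  <head>
    <!-- head is a container: it holds the title and other information about the page -->
    <meta charset="utf-8">
    <title>Container example</title>
  </head>
  <body>
    <!-- body is a container: it holds the content displayed in the browser window -->
    <h1>A heading</h1>
    <p>A paragraph of text.</p>
  </body>
</html>
```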

These four tags (html, head, title, and body) are special in that they define the structure of an HTML document and are called Document Structure tags. This will be covered more fully in the next section. But first, there are some points to be made about how to structure HTML files.

In the file in Program 2, note that the contents of each container tag (html, head, and body) are indented to show the hierarchical structure of the document, mirroring the tree in Figure 1. The HTML processor does not require this; it reads the tags and text and does not care how the source is laid out. However, indenting makes it much easier for the developers and maintainers of web pages to understand what is going on in the file.
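
For comparison, the sketch below writes the template from Program 2 with no indentation or line breaks. A browser builds the same html, head, title, and body element structure from it, but the nesting is far harder for a person to see.

```html
<html><head><title>Please change this to the title of your page</title></head><body></body></html>
```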

The second thing I always recommend when writing HTML code is to close every container tag as soon as its opening tag is entered. That is, when <head> is entered, </head> is entered immediately, before anything is placed between them. Many IDEs do this automatically. The reason to enter the closing tag right away is to enforce boundaries on the ideas and concepts being developed inside the container. This does not make sense to many novices, who seem to see ideas as unstructured information that starts at the top of the document and simply streams to the end. Novice ideas often appear (to me) to be a jumble of thoughts; novices do not see a purpose in creating boundaries or structure to express an idea. The problem shows up in all areas of academia, in the form of unreadable papers and documentation. This is why indenting and container boundaries are so important: they enforce a structured way of presenting ideas. It is also why a basic course in CS, which teaches this kind of structuring, can be valuable for students of any major.

But since this concept of structuring ideas is such an enigma to students, I also give a practical reason for entering the closing tag when the opening tag is entered: if the closing tag is not entered immediately, it is likely to be forgotten completely, which leads to other problems. For students, though, the most persuasive reason seems to be that they will not lose points on a test.
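
One concrete illustration of such a problem is sketched below. It uses the title tag rather than a container tag, because the effect is easy to see: the closing title tag has been forgotten, and browsers typically read everything that follows as part of the title text, so the heading and paragraph never appear on the page. The h1 and p elements are illustrative additions.

```html
<html>
  <head>
    <!-- the closing </title> tag is missing on the next line -->
    <title>My page
  </head>
  <body>
    <h1>This heading is swallowed by the title</h1>
    <p>So is this paragraph.</p>
  </body>
</html>
```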