HTML Basics

The Hypertext Markup Language (HTML), developed in the 1990s, is a language for describing hypertext documents that are shared across the World Wide Web. HTML is a "markup language": it defines the structure of a document using tags placed around the text. A web browser reads this markup and displays a web page. This article begins with a brief introduction to HTML, then focuses on the structure of an HTML document, mandatory and optional tags, and attributes. As you read these sections, pay attention to the rules for writing HTML. For example, an HTML5 document is normally written with four basic tags, each enclosed in an opening "<" and a closing ">" angle bracket.

Document Structure Tags and a Simple Web Page

An HTML document is divided into two main sections, the head and the body. The head contains metadata, and the body contains the information to be displayed on the web page.
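As a minimal sketch of this division, a skeleton document might look like the following. The title and paragraph text are placeholders chosen for illustration and are not part of the original material.

```html
<!DOCTYPE html>
<html>
  <head>
    <!-- Metadata: describes the page but is not rendered in the page itself -->
    <title>Placeholder title</title>
  </head>
  <body>
    <!-- Content: everything the browser displays on the page -->
    <p>Placeholder text shown on the page.</p>
  </body>
</html>
```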

To understand this difference, it is important to understand the term metadata. According to Dictionary.com, meta is "a prefix added to the name of something that consciously references or comments upon its own subject or features". Hence a meta-analysis is a study of studies, a metalanguage is a language used to describe languages, and so on.

Metadata is what its name implies: data about the data on a web page. It defines how the page interprets and processes its data. For example, functions used in a web page are defined in the head, as are the event handlers and the CSS rules that control how the content is displayed. Anything that defines the behavior of the page belongs in the head of the document.

What is not in the head of an HTML document is as important as what is. The head should not output any information (or data) to be placed on the web page. Functions and other structures defined in the head should return strings to be printed in the body, rather than printing to the page themselves. If a statement defines something to be rendered on the page itself, it does not belong in the head.

This implies (correctly) that the body of an HTML document should contain everything that is rendered and placed on the web page. Any text to be displayed, images to be rendered, or forms to be processed belong in the body of the document. Conversely, the body should not contain metadata such as functions, CSS, or code to handle events.

Nothing in HTML enforces this policy, but there are few good reasons to violate it. When the contents of the head and body are mixed, it generally shows that the programmer did not have a clear concept of what the page is to do.
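The following sketch illustrates one way to keep this separation. The function name buildGreeting, the element id greeting, and the style rule are hypothetical names invented for this example. The head defines the style, the function, and the event handling, while the body contains only the content to be rendered.

```html
<!DOCTYPE html>
<html>
  <head>
    <title>Head and body separation</title>
    <style>
      /* Presentation rule: metadata about how the content is displayed */
      h1 { color: navy; }
    </style>
    <script>
      // Hypothetical helper: returns a string and does not write to the page itself
      function buildGreeting(name) {
        return "Hello, " + name + "!";
      }

      // Event handling is also defined in the head. Once the body has been
      // parsed, the returned string is placed into an element of the body.
      document.addEventListener("DOMContentLoaded", function () {
        document.getElementById("greeting").textContent = buildGreeting("reader");
      });
    </script>
  </head>
  <body>
    <h1>Separation example</h1>
    <!-- The body holds only what is rendered; this paragraph is filled in by the script above -->
    <p id="greeting"></p>
  </body>
</html>
```

Opening this file in a browser should show the heading followed by the text "Hello, reader!". Waiting for the DOMContentLoaded event is one design choice among several. It is used here because it lets all of the code stay in the head while still reaching an element defined in the body.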

In the Document Structure, the <title> tag is shown as metadata. This is because the title is what appears on the tab in the browser and is not rendered on the page.

Program 3 is a simple HTML web page that illustrates the concepts covered so far. Note that the program uses the large heading (<h1>) and paragraph (<p>) block tags and the image (<img>) tag, which have not yet been covered. As stated before, this text is not intended to teach HTML, CSS, JavaScript, or any other language or program. It is intended to give a motivated intermediate programmer, specifically students doing research with me, enough background to start that research. A complete list of HTML tags can be found at https://www.w3schools.com/tags/, and many tutorials exist on how to use them in web pages. Readers interested in more of the functionality of the tags can easily look them up on the web. It is expected that readers of this text are sufficiently advanced to research and learn the implementation details of this type of material on their own.

Enter Program 3 into a file with a ".html" extension. Note that the file must have some form of an HTML extension (e.g. .html or .htm) for the browser to recognize it as an HTML file. Place a JPEG picture (any picture) in a file named dog.jpg in the same directory, and open the HTML file in a browser such as Chrome, Firefox, Safari, IE, or MS Edge. You should get a page similar to Figure 2.

<!DOCTYPE html>
<html>
<!--
  Author: Charles Kann
  Date: 5/17/2017
  Purpose: A first example of an HTML program
-->
  <head>
    <title>First HTML Web Page</title>
  </head>
  <body>
    <h1>First page</h1>
    <p>
      This is a first page of text, and shows how to
      insert a picture of a dog
      into a page.
    </p>
    <!-- The alt text is shown if dog.jpg cannot be loaded -->
    <img src="dog.jpg" alt="A picture of a dog"/>
    <p>
      This page also shows how to handle text using the
      paragraph (&lt;p&gt;) tag, as well as how to show
      the &lt; and &gt; symbols in html text.
    </p>
  </body>
</html>

Program 3 – First HTML Web Page

Comments in HTML begin with <!-- and continue until -->. Program 3 has a comment at the start of the document that serves as a preamble for the file. The need for file preamble comments, and for commenting code correctly, is stressed in every introductory programming course I have ever encountered. However, students seem to believe that such commenting is not useful and applies only to introductory classes or to the first language they learned, and they abandon these lessons as soon as they think they can safely get away with it. That is why, at every level, student programs need to be graded on commenting, and a poorly commented program by a senior should be given an F, even if it works. Commenting is not something to be avoided. It is always good practice, and that includes web development.
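As a small sketch of the comment syntax, a preamble comment and an inline comment might be written as follows. The author name, date, and text are placeholders invented for this example. Comments are not rendered on the page, although they remain visible to anyone who views the page source.

```html
<!DOCTYPE html>
<!--
  Author:  Placeholder Name
  Date:    (date written)
  Purpose: Show a preamble comment and an inline comment
-->
<html>
  <head>
    <title>Comment example</title>
  </head>
  <body>
    <!-- Only the paragraph below appears on the page -->
    <p>This text is displayed; the comments are not.</p>
  </body>
</html>
```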


Figure 2 – Output from the first Web Page