CS402 Study Guide

Unit 1: Networking Fundamentals

1a. describe the evolution of computer networks and the internet

  • What are the major milestones that highlight the invention and progress of computer networks as we know them today?

In the early days of computing, in the 1950s and before, "networks" consisted basically of "batch processing", where users would need to physically carry their work, like punched cards, to the computer. These systems supported only a single user at a time, but they avoided the headache of running wires to each of the many users of the mainframe.

In the late 50s and 60s, batch processing was replaced by interactive processing through time-sharing. Users' terminals were connected directly to a mainframe computer, and they shared the processing time of that computer.

The biggest move toward today's networks started in 1964, when Paul Baran wrote reports outlining packet networks. It was not until 1969 that the first nodes of a true network, ARPANET, became operational. ARPANET featured two especially important applications, Telnet and FTP. Telnet allowed users on one computer to access a remote computer and run programs and commands as if it were local. FTP allowed for high-speed file transfer between computers.

In 1972, the first email system was developed by Ray Tomlinson. Interestingly, email was invented before TCP/IP became operational; that giant step did not occur until 1980. By 1983, ARPANET had fully adopted the TCP/IP protocol. ARPANET was retired in 1990, and the World Wide Web became publicly available in 1991. In its early years the web coexisted with Gopher, a text-based, menu-driven system for browsing documents. In 1993 it gained its first widely used "modern" graphical web browser, Mosaic.

To review, see Introduction to Networking Fundamentals. For more detail about computing milestones, see this suggested additional article on Computer Networks.

 

1b. describe the difference between a computer network and a distributed system

  • What are the main differences between a computer network and a distributed system?
  • What applications are better suited for each one of the systems?

A network is a system that consists of many computers connected to each other but operating independently. Users are aware of each individual computer and resource.

A distributed system consists of many computers working together as a single unit. A distributed system automatically allocates jobs between available processors, which makes the process completely transparent to users. Users in a distributed system see the system as a single big system. A distributed system depends on a layer of software, called middleware, while a network is simply a group of computers connected together. Networks are best when you need to share resources to make them available to everyone, such as a printer.

To review, see Introduction to Networking Fundamentals. For a detailed description of distributed computing and several examples for distributed systems, see this suggested additional article on Distributed Computing.

 

1c. explain the use of layers in networking

  • What are the 7 layers of the OSI reference model and the counterpart layers of the TCP/IP model?
  • What is the function of each layer?

The challenges involved with networking computers together could be overwhelming if they were considered as one single system. At the beginning of networking, designers decided to use a "divide and conquer" approach to divide network functions into logical layers. This split bigger problems into smaller, more manageable problems.

Each of these layers is composed of software and/or hardware modules that perform related network services. Each layer uses the services provided by the layer immediately underneath it, and provides services to the layer above it. Data to be transmitted must pass down through the layers of the source node to the communication medium (that is, the physical link). The data travels across the physical link and up through the layers of the destination node to the user. This is called end-to-end communication. As long as the interface between the layers is not changed, implementers are free to change how a layer accomplishes a given function internally.

The interface, as defined by the protocol, cannot change. Each layer in the stack deals with messages, which are normally limited to a maximum size. Each layer in the stack adds a header to its messages. This header is used to synchronize the data with the same layer in the remote peer, and it contains the information that lets the remote peer decide what to do with the data. Data flows down the stack on the sending host. Starting with the raw data in the application, each layer adds a header and passes the result to the layer below it. When the remote peer receives the message, each layer reads its header, decides what to do with the message, removes the header, and passes the rest to the layer above it, which repeats the same process. The process looks like this:

[Figure: application data passing down the layers of the source host, gaining a header at each layer, and back up the layers of the destination host]

Applications running on both machines need to exchange data. We will use the term Application Data Unit (ADU) to refer to data units exchanged by the two applications. The transport layer receives data from the application. It divides the data into manageable units and attaches a transport header (TH) to each one, forming what is called a "segment". The TCP header has a minimum length of 20 bytes. The transport layer then gives the segment to the network layer. The network layer attaches its own header (NH), which is at least 20 bytes for IPv4, to form a "packet" and gives it to the data link layer. The data link layer forms a "frame" by adding a header, denoted here as DLH, and optionally a trailer, shown here as DLT. Not all data link protocols add a trailer, but Ethernet adds a 14-byte header and a 4-byte trailer, not counting the preamble of 7 bytes and the Start of Frame Delimiter (SFD) of 1 byte. The data portion of an Ethernet frame has a maximum of 1500 bytes.
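The wrapping and unwrapping described above can be sketched in a few lines of Python. This is only an illustration: the header lengths come from the text, but the header contents are placeholder bytes rather than real TCP, IP, or Ethernet fields, and the function names are hypothetical.

```python
# Header sizes taken from the text. The contents are placeholder bytes,
# not real TCP/IP/Ethernet header fields.
TH_LEN, NH_LEN, DLH_LEN, DLT_LEN = 20, 20, 14, 4

def encapsulate(adu: bytes) -> bytes:
    """Wrap application data the way the sending stack does, top to bottom."""
    segment = b"T" * TH_LEN + adu                      # transport layer adds TH
    packet = b"N" * NH_LEN + segment                   # network layer adds NH
    frame = b"D" * DLH_LEN + packet + b"C" * DLT_LEN   # data link adds DLH and DLT
    return frame

def decapsulate(frame: bytes) -> bytes:
    """Strip the headers in reverse order, as the receiving stack does."""
    packet = frame[DLH_LEN:len(frame) - DLT_LEN]       # data link removes DLH and DLT
    segment = packet[NH_LEN:]                          # network layer removes NH
    return segment[TH_LEN:]                            # transport layer removes TH

frame = encapsulate(b"hello")
print(len(frame))           # 5 data bytes plus 58 bytes of headers and trailer
print(decapsulate(frame))   # the original application data
```

Each layer touches only its own header, which is why one layer can be reimplemented without affecting the others.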

Notice that by now a lot of overhead needs to be added before the actual data goes into the physical layer. For TCP/IP with Ethernet in the data link layer that will be 20 (TH) + 20 (NH) + 14 (DLH) + 4 (DLT) + 7 (Pr) + 1 (SFD) = 66 bytes. The 1500-byte data portion of a full-size frame must also hold the 40 bytes of TH and NH, which leaves room for 1460 bytes of application data. Such a frame occupies 1460 + 66 = 1526 bytes on the wire, which represents an overhead of 66/1526 = 4.3%. For smaller amounts of data, such as 100 bytes, the overhead will be a whopping 66/166 = 39.8%, going even higher for smaller amounts of data. Needless to say, larger frame sizes are preferred.
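The overhead arithmetic can be checked with a short Python sketch. It assumes the minimum header sizes listed above and ignores Ethernet's limits on frame size. The function name and the sample payload sizes are hypothetical.

```python
# Per-frame overhead in bytes, as itemized in the text.
OVERHEAD = {"TH": 20, "NH": 20, "DLH": 14, "DLT": 4, "preamble": 7, "SFD": 1}

def overhead_fraction(app_bytes: int) -> float:
    """Fraction of the transmitted bytes that is overhead rather than data."""
    total_overhead = sum(OVERHEAD.values())   # 66 bytes
    return total_overhead / (app_bytes + total_overhead)

for size in (10, 100, 1000):
    print(f"{size} data bytes: {overhead_fraction(size):.1%} overhead")
```

The fraction shrinks as the payload grows, which is the reason larger frames are preferred.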

One final point: notice in the figure that we show all the layers on the source host as talking with the corresponding layer in the destination host. This is, of course, a "logical" connection. Each layer uses the header to send information to its counterpart layer on the other side. We think of this as each layer having a logical connection to the counterpart layer on the other side.

To review, see The Reference Models.

 

1d. explain the difference between Local Area Networks (LANs), Metropolitan Area Networks (MANs), and Wide Area Networks (WANs)

  • What is the difference between a LAN, MAN, and WAN?
  • Four different topologies for a LAN are ring, bus, star, and mesh. Of these, which one:
    • is more effective under low load conditions?
    • is more effective under high load conditions?
    • yields the highest reliability?
    • is more deterministic?

A local area network (LAN) is a network where all the nodes share the same physical medium and have a common broadcast domain. Traditionally, LANs were described as networks of privately owned computers in close proximity, like a home network. However, the total size of a LAN can extend for miles. The primary requirement for a network to be classified as a LAN is that its nodes share the same broadcast domain and physical medium.

A MAN, or Metropolitan Area Network, covers a city. A great example of a MAN is the cable television networks that are ubiquitous in cities all around the country.

A WAN, or Wide Area Network, spans a large geographical area that could be a country, a continent, or even the entire planet. It is composed of a combination of hosts and routers that span large areas.

In a LAN with a bus topology, multiple nodes connect to a single bus and use an access method such as CSMA to share the media. A bus topology provides excellent performance under low load conditions, since there are few collisions and stations have access to the media when they need it. At high load conditions, however, the media will be busy most of the time, which results in many collisions and low throughput. In a bus configuration, a single node failure has no effect on the operation of the other nodes in the network, which makes it reliable in that respect. However, a link failure will split the network into two or more isolated networks. This could leave some members of the network unable to access essential services.

In a LAN with a ring topology, the nodes are physically connected in a ring. Each node needs to wait for a token to arrive before being able to send its data. One possible drawback of such a configuration is that if any host in the ring fails, the ring breaks and the network goes down. Contrast that with Ethernet using CSMA, where every station works independently and one failure does not affect the operation of others in the network. But a token ring has the advantage of being basically deterministic. Once a station releases the token, it can determine when the media will be available to it again, based on the number of stations in the ring. This topology is also relatively good under heavy loads. Some other features of the token ring topology will be considered later in this study guide.
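The deterministic behavior of a token ring can be illustrated with a rough upper bound. The sketch below assumes each station may hold the token for at most a fixed time and that the token takes a fixed time to travel once around the ring. Both parameters and the function name are hypothetical, not part of any specific token ring standard.

```python
def worst_case_token_wait(stations: int, max_hold_s: float, ring_latency_s: float) -> float:
    """Upper bound on how long a station waits to get the token back.

    Assumes every other station holds the token for the maximum allowed
    time, plus one full trip of the token around the ring.
    """
    if stations < 1:
        raise ValueError("a ring needs at least one station")
    return (stations - 1) * max_hold_s + ring_latency_s

# Hypothetical ring: 10 stations, 10 ms maximum hold time, 1 ms ring latency.
print(worst_case_token_wait(10, 0.010, 0.001))
```

A CSMA bus offers no such bound, since a station may collide repeatedly under heavy load.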

In a star topology, all nodes connect to a central physical interface. This is an effective communication strategy, but the central physical interface is a single point of failure. A failure there means that the whole network could go down. The failure of a single node, however, would not affect the operation of the rest of the system.

In theory, a full mesh topology would provide the highest performance and redundancy for a small network. However, this topology also has the highest cost, because every host needs a direct link, and therefore a network interface, to every other host. It becomes virtually impossible once the number of hosts exceeds the number of available network interfaces per host.
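The cost of a full mesh follows from simple counting: with n hosts, each host needs n - 1 interfaces, and the network needs n(n - 1)/2 links in total. A minimal Python sketch (the function names are hypothetical) shows how quickly this grows.

```python
def full_mesh_links(hosts: int) -> int:
    """Number of point-to-point links needed to connect every pair of hosts."""
    return hosts * (hosts - 1) // 2

def interfaces_per_host(hosts: int) -> int:
    """Each host needs one interface for every other host."""
    return hosts - 1

for n in (4, 10, 50):
    print(n, "hosts:", full_mesh_links(n), "links,",
          interfaces_per_host(n), "interfaces per host")
```

By comparison, a star with the same number of hosts needs only one link per host to the central interface.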

To review, see Services and Protocols and The Reference Models. For more detailed information, see this suggested additional article on Local Area Networks.

 

1e. explain the role of the Network Request for Comments (RFC) as a mechanism to develop, review, and incorporate standard changes in a network protocol

  • When would you use an RFC as opposed to a manufacturer's specification or other forms of documentation, like ISO standards or IEEE standards?

RFCs were originally developed as part of the ARPANET project as a means to disclose, share, and generate discussions about a particular protocol. Today, they have become the official publication channel for the Internet Engineering Task Force (IETF). They are the de facto standards that define and describe networking protocols and algorithms.

You use RFCs to learn about the operation of a generic protocol like DNS or OSPF. RFCs give the complete specifications and algorithms used by the open protocols that are common in networks. RFCs are tightly controlled by the IETF. You would not use them to troubleshoot network issues or failing components, where a manufacturer's documentation is more useful.

The Institute of Electrical and Electronics Engineers (IEEE) and the International Organization for Standardization (ISO) are two independent non-governmental organizations that have created a number of standards that are common in networking. IEEE 802.11 is an example of a standard that defines the basic operation of wireless networks. ISO is an international organization that produces many types of standards, not only for networking. Its standards were heavily used in the early days of networking, and its OSI standard described the 7-layer model that has been used ever since. However, RFCs are the most prominent tool when trying to learn about networking protocols.

Read the full history and uses of RFCs in The Role of RFC in Computer Networks.

 

1f. describe different switching techniques, such as packet, circuit, and virtual calls

  • What switching technique requires the lowest overhead, but offers the worst resiliency in case of router failures?
  • What switching technique provides the best reliability in case of router failures, but has a high overhead?

Circuit switching, datagram packet switching, and virtual call switching are the three switching techniques used in computer networks.

With circuit switching, the path that all packets will follow is established at the beginning and kept the same throughout the whole exchange of data. This model was followed by telephone networks, where a circuit from caller to recipient is established at the beginning and kept open for the duration of the call. Establishing the circuit takes time before any data can flow, which can be considered a disadvantage. Circuit switching also performs the worst when a router fails. If a router fails, the initial "circuit establishment" phase must be repeated to establish a new circuit, which leads to packet losses and transmission delays. Circuit switching has advantages, however, like low overhead. Once the circuit is established, all that is needed in the header to route the packets is a small circuit ID number. QoS is also easy to implement with circuit switching, since quality parameters can be negotiated during circuit establishment. Virtual call switching is a variation of circuit switching in which packets from different calls can be interleaved during transmission, because the routing decision is based on their virtual circuit ID.

Datagram packet switching was modeled after the postal system, where no two packets need to follow the same route. Each packet carries a header with full address information. Routers in the path make routing decisions based on the final destination, using routing tables built by routing protocols. This gives the best flexibility and reliability, even in the case of router failures or congestion. Another advantage is that there is no setup delay, since packets can be sent immediately without waiting to establish a circuit. However, datagram packet switching requires the highest overhead, since the full addressing information is needed on each packet. IP is a classic example of a datagram packet switching technique.

A real-life implementation of virtual circuit switching was a technique called asynchronous transfer mode (ATM). Data was sent in 53-byte cells, of which only 5 bytes were used for the header. This is a small header, since it was primarily used to indicate the virtual path and circuit ID. The percentage of overhead for ATM was 5/53 = 9.4%.
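The difference in how the two packet-based techniques forward data can be sketched as two table lookups. This is a simplified illustration with hypothetical table entries and port names. Real routers match address prefixes, and real virtual circuit switches also rewrite the circuit ID at each hop.

```python
# Datagram switching: every packet carries the full destination address,
# and the router looks it up independently for each packet.
routing_table = {"hostA": "port1", "hostB": "port2"}   # hypothetical entries

def forward_datagram(dest_address: str) -> str:
    return routing_table[dest_address]

# Virtual circuit switching: the table is filled in once, during circuit
# establishment, and each packet then carries only a small circuit ID.
vc_table = {7: "port2", 9: "port1"}                    # hypothetical circuit IDs

def forward_vc(circuit_id: int) -> str:
    return vc_table[circuit_id]

print(forward_datagram("hostB"))   # decision based on the full address
print(forward_vc(7))               # decision based on the circuit ID only
```

If the router holding `vc_table` fails, its entries are lost and the circuit must be set up again, whereas datagrams can simply be routed around the failure.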

To review, see Introduction to Networking Fundamentals. For more detailed information, see this suggested additional article on Circuit Switching.

 

1g. differentiate between connection-oriented and connectionless services

  • If you were designing a registered electronic mail system, would you want a connection-oriented service or a connectionless service?
  • If you were designing a network to allow users to log in remotely, would you want a connection-oriented or connectionless service?
  • UDP is a connectionless service. What kinds of real-life scenarios are best suited for UDP?

In a connection-oriented service, a connection is established ahead of time, before data is sent through the network. In connectionless systems, by contrast, each data packet must contain the full address of the destination, and packets are sent independently of each other. The best-known example of a connection-oriented service is TCP. Note that TCP provides connection-oriented communications while using the connectionless infrastructure provided by IP. You would want to provide connection-oriented services for applications that require user login, or for data-sensitive applications that cannot tolerate missing information, such as money transfers.

UDP is the most common example of a connectionless service, where data flows without an initial agreement being established and occasional data loss is tolerated. Examples of connectionless services are video streaming applications, outward data dissemination, or even electronic mail systems. TCP is a poor fit for real-time video streaming applications, because its connection-oriented design requires that any single lost segment be retransmitted. Transmission would need to stop while the source retransmitted that particular segment, even if the data it carried was insignificant. That would create continuous interruptions in the stream. It would be better to see a small blip in a video than to have the video interrupted every few seconds.

Viewers of sporting events, for example, would rather lose a bit of color definition and still see a winning play than have the video stop during the play. At the opposite extreme, if you are logging in remotely to your bank account to retrieve money, you would much prefer a connection-oriented protocol, where every single cent of your money is accounted for. Most protocols in the TCP/IP suite have been designed to use either TCP or UDP. One big exception to this is DNS, which can use either UDP or TCP depending on the type of data being handled.
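The difference between the two service types is visible in how a program uses them. The sketch below uses Python's standard `socket` module over the loopback address. The function names, buffer size, and timeouts are arbitrary choices for illustration, and both ends run in one process only to keep the example short.

```python
import socket

def udp_exchange(message: bytes) -> bytes:
    """Connectionless: the sender simply addresses a datagram and sends it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))          # port 0: the OS picks a free port
        receiver.settimeout(2.0)
        sender.sendto(message, receiver.getsockname())   # no handshake first
        data, _ = receiver.recvfrom(2048)
        return data

def tcp_exchange(message: bytes) -> bytes:
    """Connection-oriented: a connection is set up before any data is sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))
        listener.listen(1)
        # create_connection() performs the TCP handshake with the listener.
        with socket.create_connection(listener.getsockname()) as client:
            conn, _ = listener.accept()
            with conn:
                conn.settimeout(2.0)
                client.sendall(message)
                return conn.recv(2048)

print(udp_exchange(b"ping"))
print(tcp_exchange(b"ping"))
```

On a real network the UDP datagram could be lost without either side being told, while TCP would retransmit until the data arrived or the connection failed.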

To read more about connection-oriented and connectionless services, see Services and Protocols.

 

1h. describe the differences between wireless, fiber, and copper media for the transmission of data in a computer network

  • What maximum data rates can be obtained with cat-6 STP cable?
  • What type of cable would you recommend using if high data rates are expected while spanning long distances?

The capacity for transmitting data has improved dramatically over time across a variety of media types, including UTP, STP, fiber, and even wireless media. In the early days of networking, copper cable was the standard. With the invention of Ethernet in 1973, and its standardization and commercial introduction in 1980, coaxial cable became the standard. You would often see a thick orange coaxial cable running from end to end in the buildings of organizations that used this technology. Connecting stations to this cable was not an easy task, so "vampire taps" were introduced to make connections. Also known as "piercing taps", they were devices that clamped onto the coaxial cable and pierced into it, like a vampire, to insert a couple of probes and connect to the internal copper wires.

As networks evolved, it became obvious that this type of cable was too thick, too heavy, and too difficult to handle if the transition to "personal computers" was going to happen smoothly. That led to the creation of UTP, followed by STP cable, in the mid-80s. But these cables all suffered from one big drawback: they provided low bandwidth. At that time, there were no bandwidth-hungry applications, so regular UTP cables worked fine. With the many bandwidth-hungry applications that we have today, copper wire required extreme improvements if it was still going to be used.

Over the years, STP cables have improved by leaps and bounds. For example, cat-6 cable can support data rates of up to 10 Gbps at a maximum length of 55 meters, or 1 Gbps if the distance increases to around 100 meters. For short distances and rates like these, copper is a good choice if cost is a concern. However, it is not a good choice for long-distance, high data-rate hauls. For that, you would need to step up to fiber.
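The cat-6 limits quoted above can be turned into a small rule-of-thumb helper. This is only a sketch of the reasoning in this section, using the figures given in the text. The function name is hypothetical, and a real cabling decision would also weigh cost, interference, and installation constraints.

```python
def recommend_medium(rate_gbps: float, distance_m: float) -> str:
    """Pick a medium using the cat-6 limits quoted in the text."""
    if rate_gbps <= 1 and distance_m <= 100:
        return "cat-6 copper"      # 1 Gbps is supported up to about 100 m
    if rate_gbps <= 10 and distance_m <= 55:
        return "cat-6 copper"      # 10 Gbps is supported only up to 55 m
    return "fiber"                 # long distances or higher rates

print(recommend_medium(10, 50))
print(recommend_medium(10, 80))
print(recommend_medium(1, 2000))
```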

A comparison between different transmission media options can be found in Transmission Media.

 

Unit 1 Vocabulary

This vocabulary list includes terms and acronyms that might help you with the review items above and some terms you should be familiar with to be successful in completing the final exam for the course.

Try to think of the reason why each term is included.

  • ARPANET
  • Telnet
  • FTP
  • Batch processing
  • Distributed System
  • LAN
  • MAN
  • WAN
  • Physical Layer
  • Data Link Layer
  • Ethernet
  • Network Layer
  • IP
  • Transport Layer
  • TCP
  • UDP
  • Ring topology
  • Bus topology
  • Star topology
  • Mesh topology
  • RFC
  • OSI
  • IEEE
  • Circuit Switching
  • Virtual Circuit Switching
  • Datagram Packet Switching
  • Broadcast
  • Connection-oriented service
  • Connectionless service
  • UTP
  • STP
  • Fiber