Advanced Topics in E-R Modeling
Degree of a relationship
Relationships can be classified by the number of entity types involved. This is referred to as the degree of a relationship. To this point, we have concerned ourselves with relationships between two entity types. This is, by far, the most common type of relationship. The most common degrees of relationships are as follows:
- This is a relationship between two entity types.
- This is a relationship between three entity types.
- This is a relationship involving only one entity type.
In the real world, there are relationships other than those involving two things. For example, suppose that we want to capture which employees use which skills on which project. We might try to represent this data in a database as three binary relationships between skills and project, project and employee, and employee and skill.
The applies relationship indicates which employee applies which skill. The used on relationship indicates which skill is used on which project. The works on relationship indicates which employee works on which project. But this is not enough to specify which employee uses which skill on which project. Suppose you know the following:
- Lindsey and Mackenzie have worked on projects A and B.
- Lindsey has used skills interface design and database design while Mackenzie only used her database design skill.
- used on
- Both skills have been used on both projects.
Given this information, it is impossible to figure out which projects Lindsey used which skills on. She could have used interface design on project B and database design on project A – or the other way around. Or she might have used both skills on both projects. The database simply does not give us enough information.
In order to capture the necessary information, the database needs a ternary relationship. In this case, the database needs a relationship, called esp, among employee, skill, and project.
The esp relationship captures information three pieces at a time. It stores facts such as:
- Lindsey used interface design skill on project A.
- Mackenzie used database design skill on project A.
- Lindsey used interface design skill on project B.
- Lindsey used database design skill on project B.
- Mackenzie used database design skill on project B.
Notice that this ternary relationship captures the information represented in the three binary relationships:
- Lindsey worked on project A (first fact).
- Lindsey worked on project B (third and fourth facts).
- Mackenzie worked on project A (second fact).
- Mackenzie worked on project B (fifth fact).
- Lindsey used her interface design skill (first and third facts).
- Lindsey used her database design skill (fourth fact).
- Mackenzie used her database design skill (second and fifth facts).
- Both skills have been used on both projects (all the facts).
Implementing ternary relationships does not mean that you have to get rid of the binary relationships. You only get rid of the binary relationships if they capture a subset of the information captured by the ternary relationship. If a binary relationship captures information that differs from the ternary relationship, then the binary relationship should be retained if the information is important to your company. For example, consider the following:
The esp relationship stays the same as in the previous ER diagram. The binary relationships are different.
- has/held by relationship
- An employee has a certain skill. This is different than esp because there are some skills that an employee has that he or she may not have used on a particular project.
- needs/needed on relationship
- A project needs a particular skill. This is different than esp because there may be some skills for which employees have not been assigned to the project yet.
- manages/managed by relationship
- An employee manages a project. This is a completely different dimension than skill so it could not be captured by esp.
We have three entities: car, customer, and salesperson. Consider the following situations.
There is a ternary relationship called sell relating all three entities. Sells: ``Customer cust bought car car from salesperson SP.
Question: Using this data, who sold Sam the Cobra?
We have three binary relations as shown below:
- Customer cust buys car car.
- buys from
- Customer cust buys from salesperson SP.
- Salesperson SP has sold car car.
Question: Using this data, who sold Sam the Cobra?
Thus far in this section the ER diagrams have not represented the cardinality of ternary relationships. There is a different method for determining cardinalities of higher order relationships:
- Cover up all but one of the lines going from the rectangle representing the relationship (esp in the above case) to the entities. (For example, begin by covering up the lines from espto employee and from esp to project.)
- For the one remaining entity, ask yourself: Can there be only one of these items for any single combination of these other entities, or can there be many? (Continuing this example, ask yourself: Can there be only one skill for a certain employee and certain project, or can an employee use many skills on one project?)
- If the answer is many, then put a crow's foot on the line going to this entity. If the answer is one, then put a dash across it.
- Repeat this process until each entity involved in the relationship has been examined.
For the esp relationship, what is the cardinality for each entity type?
The final, and possibly the most difficult, relationship is the recursive relationship. This is a relationship that an entity has with itself. But it really doesn't have to be difficult if you think about it as you would any ordinary binary relationship. Let's look at an example.
Think of an employee who is the manager of other employees.
A manager manages many employees and an employee has exactly one direct manager. This is pretty straightforward. But, now, realize that a manager is really just another name for an employee. So, replace managers with employees in this diagram.
Now, this diagram has the entity type employees represented twice. To remedy this situation, "pull" the relationship diamond down and slide the two employee rectangles so that they are lying on top of each other. Now the diagram looks like the following:
This diagram represents what we want:
- Reading clockwise starting at employee: An employee may not manage any other employees but may manage many.
- Reading counter-clockwise starting at employee: All employees are managed by exactly one other employee.
Not everyone in the company has a manager. The president will not have a direct manager. This is handled in the data in the table by indicating that the president's manager is the president. A little trick.
Two entities can have more than one type of relationship. This is not surprising; further, it is not difficult to represent in a database or in an ER diagram. Consider the entity types person and insurance policy and the relationships between them of pays for and is insured under.
Look at these relationships one at a time.
- A person pays for zero or more insurance policies. An insurance policy is paid for by exactly one person.
- A person is insured under zero or more insurance policies. An insurance policy covers one or more persons.
These are two distinct relationships. They mean two different things – that is why they are represented as two separate relationships in the ER diagram.
Other entity types
When we examined attributes earlier, the attributes were exclusively attached to entity types. However, it is also possible for a relationship to have attributes. A many-to-many relationship is, in some respects, a special case of the other relationships. It requires that another entity be created to "hold" information related to the relationship. This new, special type of entity is called an associative entity.
An associative entity is an entity that can only exist between two other entities, and that exists to store information about the relationship between or among those entities.
For example, suppose we are considering the following situation. A person can have membership in many clubs while a club can have many members. This is represented by this figure:
Now consider a reformulation of the above ER diagram in which the many-to-many relationship is replaced by two one-to-many relationships with an associative entity in the middle:
This second representation of the situation is preferred because it more directly maps to the underlying database that will be created as a result of this relationship. A many-to-many relationship must be represented by a separate table, just as separate entity types must be represented by separate tables. In this case we have an associative entity being used to represent a relationship between two entity types.
This new entity type also allows the database to capture data about the relationship itself. A person can be a member of many clubs and a club can have many members. A natural piece of information to want to store is the date the person joined the club. If the attribute is of the person entity type, then this would indicate when the person joined a club but we would not know which club. If the attribute is of the club entity type, then this would indicate (possibly) when the club was founded or (possibly) when the most recent member joined the club but we would not know the dates on which each person joined. The solution is to make join date an attribute of the membership associative entity (the entity representing the relationship).
Mind you, the basic many-to-many diagram (the first one above) is just fine for the initial stages of drawing the ER diagram. It allows you to draw all the relationships without crowding the diagram with additional entity types. However, as the process continues, at some point you should convert the first type of diagram into the second (for the reasons given above).
Attributive (or weak) entity
Attributive entities are entities, but with a difference – they only exist because some other entity exists.
An attributive entity is completely dependent upon another entity for its existence.
For example, if you were to define two entities employee and salary-history, then the second would be an attributive entity because the record of an employee's salary history could only exist if a record of an employee also exists. Joe Smith's salary history wouldn't make much sense if Joe Smith doesn't exist in the data base. An attributive entity is represented by a rounded rectangle within another rectangle as shown below.
Entity subtype partitioning
This section describes two different ways in which subtypes of an entity can be related to one another and to the supertype: the optional versus mandatory question, and the disjoint versus overlapping question. Certainly, entity subtypes should be classified along both dimensions – that is, you should identify whether the subtype is mandatory or optional and whether it is disjoint or overlapping. All four combinations are possible and each is appropriate at different times.
Optional versus mandatory
Assume there is an entity type called person, and entity subtypes called customer and employee. When a person is created, the designer of the database has two options:
- He/she can demand that the person be classified as one of the subtypes.
- He/she can allow a person to be created without classifying the person as any subtype.
Neither one is preferable to the other. The proper one to choose depends on the business situation.
Mandatory subtyping is represented by creating a double line from the supertype (person in the following ER diagram) to line joining the subtypes. Optional subtyping is represented by leaving a single line from the supertype to the circle. In both cases you draw the dash across the line going to the supertype.
So, what does this tell you? Since it is a mandatory subtype partitioning (you know this from the double line), whenever data for a new person is entered into the database, it must be classified as either a customer or an employee. The database user cannot simply add information about a generic person – she must know whether this person is a customer or an employee. If this had been an optional subtype partitioning, then when that user was entering data about an employee, she had the option of classifying the person as an employee or as a customer – but did not have to classify the person as either.
Disjoint versus overlapping
Consider now the company supertype and the subtypes AR_co and AP_co. As a designer you can specify whether or not an entity of subtype AR_co can also be an entity of type AP_co. Certainly it is not abnormal to think that you can do business with companies that do business with you. Think of being a consultant for Ameritech or IBM.
The following are the two possibilities.
If entities are allowed to be no more than one subtype, then the subtypes are said to be disjoint.
If entities can be classified as several subtypes, then the subtypes are said to be overlapping.
Disjoint subtypes are represented by putting a "d" below the supertype line. Overlapping subtypes are represented by putting a "o" there.
The above figure tells us that this is a disjoint entity subtyping. This means that whenever data for a new company is entered into the database, the company can be classified as either AR_co or AP_co but not both. If this had been an overlapping entity subtyping, then when that user was entering data about a company, she would have had the option of classifying the company as both AR_co and AP_co.
Suppose there is a student entity type with subtypes CIS majors, finance majors, and accounting majors. Is this a mandatory or optional subtyping? Are these subtypes disjoint or overlapping?
Suppose I didn't tell you that this should be an entity subtype problem. Would you represent it this way? What else would you do?
Aggregation of entity types
Subtypes are generally thought of in terms of X is a Y (which is why these are commonly referred to as is-a relationships). Another type of relationship that needs to be represented in a database is the part-of relationship, more formally called aggregation. When an entity is made up of several different types of other entities, an aggregation relationship may be called for.
Consider the relationship between a car and its engine and body. The engine and body are both part of the car. The relationship is represented as follows in an ER diagram:
Types of attributes
Sometimes it is instructive to classify an attribute by the means in which the value is determined. Here are the three possibilities.
- These are values provided to the business. These are the types of attributes that we have been discussing so far. Think of name, address, etc. These values cannot be deduced from the values of other attributes.
- This is invented and exists only in the database. An example might be a unique identifier for a department. This value is not changed once it is set.
- This is a value that can be calculated from the value of other attributes in the database. An example might be the age of an employee when the birth date is in the database. These attributes should, generally, not be stored in the database but should be calculated when needed.
Not all entities have a value for every attribute; however, some attributes must have a value for all entities.
- An entity need not have a value associated with an optional attribute.
- An entity must have a value associated with a mandatory attribute.
For example, assume the employee entity type has attributes hire date and termination date. Hire date would certainly be classified as a mandatory attribute; if the employee didn't have a hire date, then the person couldn't very well be an employee.
Termination date is an optional attribute. You would expect that many people in the database would not have a termination date while others, who are obviously ex-employees, do have a value associated with the termination date attribute.
The optionality of an attribute depends highly on the business situation, how the information is gathered, and how the business updates its database. One company might classify an attribute of an entity type as optional while another company might classify the same attribute of the same type as mandatory. Consider the following example:
Consider the attribute sale price of the catalog item entity type for a computer mail order company. Company A has a policy that they do not put an item into the catalog until it has a price; thus, they do not create a catalog item entity until they can assign a value to the attribute sale price. For this company the sale price attribute is mandatory.
On the other hand, Company B has a policy that they put an item into their catalog as soon as they decide to stock it. This way they can make their product line look as broad as possible. They put Call us for latest quote in the catalog instead of a price. Thus, they do create catalog items even before they have assigned a value to the attribute sale price. For this company the sale price attribute is optional.
Again, in order to classify an attribute as optional or mandatory, you must understand the business situation and practices.
For each of the following attributes, determine whether it is optional or mandatory:
- Delivery date of a customer order
- Order dateof a customer order
- Title of a book
- Serial number of a television
Other information about attributes
The database designer should also determine miscellaneous other information about each attribute:
- This is the value that an attribute should take if it is not assigned a value. For example, the state field of an employee table might have the default value of MI.
- permitted range
- These are the values that an attribute is allowed to take. This ensures that a value that gets put into the database is valid. For example, the sale_price field of the inventory table might have a permitted range of sale_price > 0.
- A composite attribute is an attribute made up of many other attributes. The reason for creating a composite attribute is that the attribute itself is referred to as a whole. Think of an employee's address that is made up of the house number, street, city, state, and zip. One way of storing this attribute is to create an attribute address that would contain the whole address. For example, a value of address might be
address = 202 Crest Avenue, Ann Arbor, MI 48103
An alternative to this would be to store these pieces of data in separate fields in the table. For example, the above information might be stored as
street = 202 Crest Avenue
city = Ann Arbor
state = MI
zip = 48103
Separating the attributes in this way allows database users to refer to each field independently. For example, under the second scheme, a user could easily and quickly determine the employees who live in Michigan. Under the first scheme, this would not be nearly as fast.
Source: Scott Moore, http://db4u.wikidot.com/eradvanced
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.