Data-Oriented Design

This text uses the Martin (1990) version of Information Engineering to illustrate data-oriented design. The result of data-oriented analysis – entity-relationship diagrams, data flow diagrams, CRUD matrices, and so on – is translated into screen designs, production database designs, action diagrams, procedural structures, and security plans. Compared to other approaches, data-oriented design strongly emphasizes security, recovery, and audit controls, relating each to data and processes in the application.

In this chapter, you will learn about the concepts and terminologies for data-oriented design, analyzing data and defining system controls, and the action diagram. The action diagram shows the processing details for an application in a structured format, which can be translated into programs and modules. You will also learn about menu structure, dialogue flow, and hardware and software installation and testing.

Analyze Data Use and Distribution

Develop Action Diagram

Guidelines for Developing an Action Diagram

An action diagram shows the procedural structure and processing details for an application. It is built from the process hierarchy and process data flow diagram (PDFD) developed during IE analysis (see Figure 9-45 for ABC's PDFD). The diagram uses only structured programming constructs to convert the PDFD into a hierarchy of processes that can be divided into programs and modules. First we discuss the components of the diagram, then we discuss how to build an action diagram from the process hierarchy and PDFD.

Action diagrams use different bracket structures to depict the code elements in an application. The basic structured programming tenets (iteration, selection, and sequence) are all accommodated, with several variations provided. As Figure 10-24 shows, a sequence bracket is a simple bracket. It is optionally identified with a process name and ended with the term ENDPROC to represent a program module consisting of a sequence of instructions.

When a module is designed and detailed in another document or diagram, a rounded rectangle containing the module name is drawn between the brackets (see Figure 10-25). When the module is not yet defined in detail, a rounded rectangle with question marks down the right side is shown. Reusable modules are drawn with a vertical bar to represent reuse.

FIGURE 10-24 Simple Sequence Bracket Format

FIGURE 10-25 Module Designation Format

Selection of modules from the PDFD is shown by a selection bracket (also called a condition bracket) which begins with an IF condition and ends with the term ENDIF (see Figure 10-26a). If the conditional statement has multiple conditions, two other options are allowed. The condition can be stated as an IF statement with one or more ELSE conditions (see Figure 10-26b), or a condition can be stated as a mutually exclusive selection list as in Figure 10-26c; this selection list is eventually translated into an IF statement.
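The equivalence between a mutually exclusive selection list and an IF statement with ELSE conditions can be sketched in code. The following is only an illustration: the choice keys and process names are hypothetical stand-ins, not taken from the figures.

```python
# Hypothetical menu choices and process names, used only for illustration.
SELECTION_LIST = {
    "rent_return": "Rent/Return",
    "customer_maint": "Customer Maintenance",
    "video_maint": "Video Maintenance",
}


def select_by_list(choice):
    """Mutually exclusive selection list: at most one entry can match."""
    return SELECTION_LIST.get(choice)


def select_by_if(choice):
    """The same selection written as IF ... ELSE IF ... ELSE ... ENDIF."""
    if choice == "rent_return":
        return "Rent/Return"
    elif choice == "customer_maint":
        return "Customer Maintenance"
    elif choice == "video_maint":
        return "Video Maintenance"
    else:
        return None
```

Both functions select the same process for every input, which is why the list form can be treated as shorthand for the IF form.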

Repetition is shown with a double-bracketed figure. The repetition bracket name begins with either DO or DO WHILE + condition (see Figure 10-27). The bracket ends with either UNTIL + condition (Figure 10-27a) or ENDDO (Figure 10-27b). DO WHILE implies that the condition is checked before the enclosed statements are executed, so DO WHILE processing may occur zero times. Conversely, DO UNTIL implies that the condition is checked after the enclosed statements are executed, so DO UNTIL processing occurs at least once.
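The difference between the two repetition forms can be seen in a minimal sketch. Python has no built-in DO UNTIL, so the post-test loop is emulated here; the helper names `do_while` and `do_until` are hypothetical.

```python
def do_while(condition, body):
    """Pre-test loop: the condition is checked before each pass."""
    passes = 0
    while condition():
        body()
        passes += 1
    return passes


def do_until(condition, body):
    """Post-test loop: the body runs first, then the condition is checked."""
    passes = 0
    while True:
        body()
        passes += 1
        if condition():
            break
    return passes


queue = []  # no work is waiting

# The DO WHILE body never runs because the condition is false at the start.
while_passes = do_while(lambda: len(queue) > 0, lambda: queue.pop())

# The DO UNTIL body runs once even though the exit condition is already true.
until_passes = do_until(lambda: True, lambda: None)
```

With an empty queue, `while_passes` is 0 and `until_passes` is 1, matching the "zero times" and "at least once" behavior described above.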

Miscellaneous items include goto, exit, and concurrency identification. A goto is shown by an arrow leaving one level and pointing to the line for the destination level, with a goto statement and destination at the right of the arrow (Figure 10-28a).

An exit is shown as an arrow leaving one level and pointing to the line for the destination level, with the word exit at the right of the arrow (Figure 10-28b). Unless an exit destination is named with the exit, the calling module is always the exit destination. For example, if Rent/Return calls CustomerAdd, the exit from CustomerAdd returns to Rent/Return. Similarly, if CustomerMaint calls CustomerAdd, the exit from CustomerAdd returns to CustomerMaint. That is, the calling module, whatever it is, is the return module.

Processes can be sequential or concurrent. Concurrent processes execute at the same time. There are two types of concurrent processes: independent and dependent. Independent concurrent processes are those which execute at the same time but do not synchronize their process completion. For example, when Process Payment and Compute Change is complete in ABC's application, printing and file updates of several types could all be concurrent. If there is no checking on the success of their completions with subsequent action for any failures, these processes are independent. Independent concurrency is shown on the diagram by an arc which connects the module brackets (Figure 10-28). Dependent concurrent processes are those which must be synchronized to coordinate further application actions. Dependent concurrency is shown on the diagram by an asterisk (or some other special character) on the arc connecting the modules (Figure 10-28d). Dependent concurrent processes require the development of a synchronization module, if not already in the application, to ensure complete, accurate processing.
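The distinction between independent and dependent concurrency can be sketched with Python's standard `concurrent.futures` module. This is an illustration under assumptions: the task functions and the two runner names are hypothetical, and threads stand in for whatever concurrency mechanism an implementation would actually use.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def print_receipt():
    return "receipt printed"


def update_open_rentals():
    return "open rentals updated"


def run_independent(tasks):
    """Start the tasks together; no one inspects whether they succeeded."""
    with ThreadPoolExecutor() as pool:
        for task in tasks:
            pool.submit(task)  # the result and any exception are ignored
    # Leaving the 'with' block waits for the threads to end, but no
    # success check or follow-up action is performed.


def run_dependent(tasks):
    """Start the tasks together, then synchronize on their outcomes."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task) for task in tasks]
        wait(futures)  # the synchronization point
    failures = [f.exception() for f in futures if f.exception() is not None]
    if failures:
        raise RuntimeError(f"{len(failures)} concurrent task(s) failed")
    return [f.result() for f in futures]
```

Here `run_dependent` plays the role of the synchronization module: it is the one place that knows whether every concurrent process completed and that decides what happens when one did not.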

Now that you know the bracket symbols used to define action diagrams, we turn to the steps for developing one. The steps are to translate processes into levels of action using structured constructs, design modules, perform reusability analysis, decide module timing, add data to the diagram, and, optionally, add screens to the diagram.

FIGURE 10-26 Conditional Bracket Design Formats

The first step is to translate processes into levels of action. The first-level diagram is developed from the process hierarchy diagram to identify the major activities being performed by the application. The activities themselves are added to the diagram as they are written on the hierarchy diagram. The structured constructs should identify sequence and any selection or conditional processing relating to the activities. Most often, when the diagram is begun at the activity level, the alternative processes are mutually exclusive. When the diagram starts at the process level (Figure 10-29), any construct might apply. The example shows a mutually exclusive selection from among the three alternatives.

Now we shift to the process data flow diagram (Figure 10-30) to add process details to the action diagram. Remember that the processes on the PDFD must match exactly the processes on the hierarchic decomposition diagram. We use the PDFD to translate the structural relationships between the processes correctly. The structural relationships are on the PDFD and not on the decomposition; they refer to the sequential, conditional, and repetitive relationships between processes.

FIGURE 10-27 Repetition Bracket Design Formats

In developing the second-level action diagram, we first add the processes, in sequence, from the PDFD. Then the brackets are drawn to reflect the sequential, conditional, and repetitive structural relationships. In the example (Figure 10-31), the main processes are Identify Item and Vendor, Sort by Vendor and Item, Get Price, Create Order, and Mail Order. Between these processes, there are two repetitive blocks: one based on New Releases, and the other based on Vendors (see Figure 10-32). We identify the repetitive blocks by looking at the circular loops and the conditions for repeating the process(es). Notice that the Sort is not included in either loop.

Next, evaluate each process grouping. Identify Item is alone within its loop. Sort is also alone. The last three processes are together and are analyzed. The processes are sequential but according to the PDFD, they are not all processed in sequence. If the vendor has not changed from the previous item, we Get Price and Create Order. When the Vendor changes, we File and Mail the order. These statements from the PDFD translate into the IF conditional statement in the action diagram as shown in Figure 10-33.

The diagram is correct in interpreting the PDFD, but it is incomplete as a program specification. First we need to deal with the First Vendor. The First Vendor will not equal Last Vendor, and to file an order for a nonexistent vendor is wrong. Second, think about what an order looks like (Figure 10-34). There are one-time Vendor information and variable lines of Item information. Where the PDFD says Create Order, it really means Add Item to Order. When the Vendor changes and an order is complete, we want to format Vendor information for the new order. Figure 10-35 reflects these details and is ready for the next step. The purpose of this example is to show how a correct PDFD may need elaboration to translate into program specifications.
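The elaborated logic can be sketched as a loop over the sorted items. This is one possible reading of the description, not a transcription of Figure 10-35: the data layout (vendor and item pairs plus a price table) and the function name are assumptions, and the final step that files the last open order after the loop is an addition the text does not state but that the loop needs to be complete.

```python
def create_purchase_orders(lines, prices):
    """Build one order per vendor.

    lines  -- (vendor, item) pairs, already sorted by vendor and item
    prices -- item -> price lookup (hypothetical stand-in for Get Price)
    Returns the orders in the sequence they would be filed and mailed.
    """
    mailed = []
    order = None
    last_vendor = None
    for vendor, item in lines:
        if order is None:
            # First Vendor: there is no previous order to file.
            order = {"vendor": vendor, "items": []}   # Create Vendor Info
        elif vendor != last_vendor:
            mailed.append(order)                      # File Order, Mail Order
            order = {"vendor": vendor, "items": []}   # Create Vendor Info
        # Get Price, then Add Item to Order
        order["items"].append((item, prices[item]))
        last_vendor = vendor
    if order is not None:
        mailed.append(order)  # assumption: the last open order is also filed
    return mailed
```

Note that the vendor information is created in two places, once for the first vendor and once on each vendor change, while each item line is priced and added exactly once.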

FIGURE 10-28 Miscellaneous Bracket Design Formats

FIGURE 10-29 Process Hierarchy and First-Level Action Diagram

Using the action diagram, modules are defined. There are few guidelines on this aspect of Information Engineering. In general, you should try to define modules that perform one well-defined process and nothing else. The guidelines presented in Chapter 8 for module definition can be applied here. For the example in Figure 10-35, the IF ... ELSE IF ... ELSE processing is the module's control flow. Within the control flow we have stand-alone processes that conveniently define modules. Figure 10-36 shows the module names, each enclosed in its own rounded rectangular box to indicate that there are more details for each module. The submodules are each further diagrammed or, if fully documented in a data dictionary, refer to the dictionary entry in the module box.

For Create Purchase Order processing, then, we have a main module and submodules for Create Vendor Info, Get Price, Create Order Item, File Order, and Mail Order. Notice that Create Vendor Info is used twice.

Next, the action diagram modules are compared to templates already in use to determine whether reuse of existing modules is possible. As reusable modules are identified, the process details are removed from the action diagram and replaced with a call statement. The called module name should indicate whether the reused module is customized for this application or not. The conventional way to identify customized reused modules is by a prefix or suffix on the name. For example, a date compare routine might be used to determine lateness. If not modified, the name of the routine might be DateCompare. If customized, the name of the routine might be RentDateCompare or LateReturnDateCompare. In the example in Figure 10-36, Sort uses a utility program, a special class of reusable module. The Sort statement is removed from the diagram and replaced with a call statement. No other modules in this example are general enough for reuse.
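The naming convention can be illustrated with a small sketch. The behavior of both routines is an assumption made for the example: the text names DateCompare and LateReturnDateCompare but does not define what they compute.

```python
from datetime import date


def date_compare(first, second):
    """Generic reusable routine: whole days from 'second' to 'first'.

    Negative when 'first' is earlier, zero when the dates are equal.
    """
    return (first - second).days


def late_return_date_compare(due, returned):
    """Customized variant for rentals: days late, never negative.

    The name's prefix signals that this is the generic routine adapted
    for one application rather than the unmodified library version.
    """
    return max(0, date_compare(returned, due))
```

A caller that sees `date_compare` knows it is using the shared routine as is; a caller that sees `late_return_date_compare` knows to look for application-specific rules.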

FIGURE 10-30 Sample Process Data Flow Diagram

FIGURE 10-31 Second-Level Action Diagram

When reusability analysis is complete, the action diagram should show the mainline logic of the application with modules for the processes and subprocesses. At this point, timing of processes is decided and added to the diagram. Recall that processes can be sequential or concurrent, and that concurrent processes can be either independent or dependent. Frequently, user requirements will identify required concurrency. If no user requirements identify concurrent operations, a design decision to offer or not offer concurrency is made by the SEs. Concurrency is expensive and adds a level of maintenance complexity to the application that the user might not want.

Optional concurrency is determined by evaluating module interrelationships again. Only groups of sequential modules are evaluated at first. Then the groups themselves are evaluated for possible concurrency. In Figure 10-36, two groups of two or more modules are present. The first is Get Price with Create Order Item. The second group is File Order, Mail Order, and Create Vendor Information on Order. Working backward, we ask if the modules are dependent on each other. Could we create an order item without knowing the price? In this case, the answer is no; we must know the price. Therefore, the modules are dependent and cannot be concurrent. In the second group, we might perform File Order and Mail Order at the same time, if success of the file operation is not an issue. Create Vendor Information cannot be done until the last order is fully processed. To decide on concurrency, we need to know the details of error handling. In this case, we find that errors are checked and handled in the module in which they can occur. If a fatal error occurs, the application does no other processing on this order. This process definition implies sequence to the processes. If the processes were concurrent and a fatal error occurred, some undesired processing would occur. Therefore, in this example, concurrency is not an option.

FIGURE 10-32 Repetitive Blocks on Second-Level Action Diagram

FIGURE 10-33 Conditional Statements on Second-Level Action Diagram

FIGURE 10-34 Order Example

Next, the entities and data elements used by the processes are added to the diagram(s). By the time this action is complete, every attribute of every relation must, at least, have been identified for creation and deletion. Any attributes not included in the processing should be reconsidered for elimination from the application. These process definitions should include attributes added to the relations as a result of design activities.

If the action diagrams are developed manually, screen identifiers can be added to the diagram with entities and attributes linked to screens (see Figure 10-38). The diagram then links data sources and destinations to both processes and screens. This type of diagram does manually what linkages in a CASE tool automate.

ABC Video Example Action Diagram

The steps to developing the action diagram are to develop the levels of action using structured constructs, design modules, perform reusability analysis, decide module timing, add data to the diagram, and, optionally, add screens to the diagram. Only the first-level action diagram includes all of the processes. The lower-level diagrams consider Rent/Return processing and Video Maintenance only. The other processes are left as an exercise.

FIGURE 10-35 Order Format Details on Action Diagram

The first-level action diagram is based on the process hierarchy (Figure 10-39). First we draw the general bracket and add the module names, indicating the structural relationships between the modules by the bracket type (Figure 10-40). In the ABC diagram, the processes are all mutually exclusive.

Then, using the PDFD as reference (Figure 10-41), we develop the next level of procedural detail. The subprocess names are added to the diagram as shown in the PDFD (and process hierarchy). For each subprocess, the structural brackets indicating modular control are added.

The subprocesses for Video Maintenance are for create, retrieval, update, and delete processing. These processes are all mutually exclusive, so the diagram is simple (Figure 10-42). At the lowest level, we identify modules that refer to the dictionary for process details.

Rent/Return has all of the complexity in the application. Each cluster of modules is discussed separately. First, Get Request is always executed whenever Rent/Return is invoked (Figure 10-43).

FIGURE 10-36 Module Boxes on Action Diagram

FIGURE 10-37 Data Addition to High Level Action Diagram

Then the conditional statement for determining the type of request is added (Figure 10-43). The two options are If Customer and If Video ID, and each has its own processes.

Next, Open Rentals are read and displayed until all Open Rentals for this customer are in memory (Figure 10-44). The Open Rental loop is a simple Do While process.

Then video returns are processed using a repetition with a conditional structure (Figure 10-45). Late fees are checked in a repetitive loop for all Open Rentals (Figure 10-46). New rental Video IDs are entered for all new rentals (Figure 10-47). Process Payment and Make Change is a stand-alone module. Then, for all open and new rentals, the Open Rentals file is updated; for all of today's returns, history is updated; and if payment is made or a user requests, a receipt is printed (Figure 10-48). The consolidated action diagram is shown in Figure 10-49.

Next, evaluate the diagram to identify program modules. As in the example above, we have naturally identified modules as part of process definition. For instance, Get Valid Customer is a small, self-contained module that does one thing only. The module uses a Customer ID to access the Customer relation. If the entry is present, the credit is checked. The name, address, and credit status are returned. The remaining modules, which we originally defined as business processes doing one thing, should each be reviewed to ensure that they are, in fact, single purpose. This is left as a class activity.
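A single-purpose module of this kind might look like the following sketch. The record layout, the credit rule (credit is good when nothing is owed), and the status values are all hypothetical; the text specifies only the input and the three returned items.

```python
def get_valid_customer(customer_id, customers):
    """Look up one customer and report name, address, and credit status.

    customers -- dict keyed by Customer ID, standing in for the Customer
                 relation (hypothetical layout)
    Returns None when the Customer ID is not on file.
    """
    record = customers.get(customer_id)
    if record is None:
        return None
    # Hypothetical credit rule: any unpaid balance puts the customer on hold.
    credit_status = "OK" if record["balance_due"] == 0 else "HOLD"
    return record["name"], record["address"], credit_status
```

The module does one thing: it neither prompts the clerk nor decides what to do about bad credit, which keeps it callable from any process that needs a validated customer.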

In addition, we can now resolve the issue held over from analysis about whether to keep separate or consolidate Get Open Rentals, Add Return Date and Check for Late Fees. Individually, each of these processes is singular (i.e., does one thing). If they are consolidated, they would remain singular but be placed within the same repetition loop. The issue here, then, is which method is easier to program and implement in the intended language, and which provides the better user interface. We need to visualize the user interface and memory processing for each alternative.

FIGURE 10-38 Optional Screen Processing on Action Diagram

If the modules are kept separate, all Open Rentals are read first and displayed. Then the clerk can be prompted for new videos or for returns. If we prompt for returns every time, many wasted entries to deny return processing will be made. If we prompt for either new or return Video IDs, we need a method of knowing which is entered. Assuming we figure that out, we then get all returns and enter today's date for returned videos. Then all entries on the screen are scanned to determine new late fees.

If the modules are consolidated, as each Open Rental is read, Late Fees are computed for tapes with return dates and no late fees (see Figure 10-50). There are two options for this process. Either we assume there are no more returns or the clerk must respond to each Open Rental. With the first option, the clerk would have a selectable option for more return processing. When chosen, each return Video ID is entered and Late Fees are computed for that video.

FIGURE 10-39 ABC Video Process Hierarchy Diagram

Notice that both alternatives have problems. The separation alternative has a problem in dealing with returns, and there will be a slight delay for Late Fee processing. The consolidation option actually modifies the processes from the PDFD somewhat for Late Fee processing.

Data storage for a rental in memory is the same for both alternatives. We need a location for customer information, a table for open rentals, a table for new rentals, and locations for payment information. The separate alternative requires three iterations through the table for Open Rentals. The consolidated alternative requires one iteration, or two if returns are present.

The alternatives are approximately the same in implementation complexity, although three iterations are more likely to contain bugs than one. The human interface design is the same for both alternatives; the only difference at the interface is the speed and timing with which data appear on the Open Rentals lines. In this respect the consolidated alternative is slightly faster. The difference in memory processing is the number of iterations through Open Rental data. Again, the consolidated alternative is preferred somewhat because it is less likely to contain bugs. With no overwhelming evidence for or against either alternative, this amounts to a judgment call. We will choose the consolidated alternative to minimize the probability of errors and the number of iterations through the data. The action diagram, reflecting consolidated open rental processing, is in Figure 10-50.
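The consolidated alternative can be pictured as a single pass over the Open Rentals. The sketch below is an assumption-laden illustration, not the design in Figure 10-50: the field names, the per-day fee, and the use of a due date to measure lateness are all hypothetical.

```python
from datetime import date

LATE_FEE_PER_DAY = 1.00  # hypothetical rate


def load_open_rentals(rows):
    """Read each Open Rental once and compute any missing late fee.

    A fee is computed only for rentals that already have a return date
    and no late fee recorded, as in the consolidated alternative.
    Returns the in-memory table of open rentals.
    """
    table = []
    for row in rows:
        rental = dict(row)  # copy so the caller's data is not modified
        if rental["return_date"] is not None and rental["late_fee"] is None:
            days_late = max(0, (rental["return_date"] - rental["due_date"]).days)
            rental["late_fee"] = days_late * LATE_FEE_PER_DAY
        table.append(rental)
    return table
```

Reading, displaying, and fee computation share one loop, so there is one place for loop-control errors instead of three.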

FIGURE 10-40 ABC First-Level Action Diagram

The next activity is reusability analysis. ABC has no library of reusable modules to consider since it currently has no computer processing. The types of modules the consultants are likely to have might be relevant to error processing or to screen interactions. For our purposes, we assume no reusable modules.

To assess module timing, we analyze the module clusters. The only modules that could be concurrent are those in the last cluster to update files and print a receipt. Before deciding concurrency, we must decide the details of history processing that were deferred from analysis. We have two types of history files: Customer and Video. Customer History is a separate file that contains the Customer ID and all Video IDs rented by that customer. No counts, dates, or copy information are anticipated. This description complies with the case requirements in Chapter 2.

Video History contains Video ID, Copy ID, Year, Month, Number of Rentals, and Days of Rental for each entry. This data description also complies with the case requirements in Chapter 2. The issue to be decided is whether Video History is maintained during on-line processing, or whether the current month's activity is kept with Copy information. If the second alternative is chosen, we need a monthly process to update the Video History and reinitialize the counts in the Copy relation. If the first alternative is chosen, we have two more alternatives. First, we might need both update and create processing because, for any one copy, we would not know in advance whether it has a historical entry or not. This alternative requires bug-prone processing that is more complex than keeping counts in the current Copy relation. Second, we could create an empty entry for every tape at the beginning of every month. This alternative is not attractive because it generates many empty records on history. Both of these alternatives would require history to be on-line. Keeping current counts with Copy relations does not require history to be on-line. The final argument for keeping the counts in Copy information is that, to maintain status of a given tape, Copy information must be updated upon video return anyway. As long as the tuple is being read, updating it with count information requires adding lines of code rather than a new module. From this discussion, it should be clear that keeping current counts in the Copy relation is the preferred alternative. We document this and the other changes in the Data Dictionary.
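The preferred alternative can be sketched as two small routines: one that runs at video return and one that runs monthly. The field names are hypothetical, and the choice to skip copies with no rentals in the month is an assumption made to avoid the empty history records criticized above.

```python
def record_return(copy, days_out):
    """On video return: update status and the current-month counts.

    The Copy tuple is being updated anyway, so the counts cost only a
    few extra lines rather than a separate module.
    """
    copy["status"] = "in"
    copy["month_rentals"] += 1
    copy["month_days"] += days_out


def month_end_rollup(copies, history, year, month):
    """Monthly batch: move counts to Video History, then reinitialize."""
    for copy in copies:
        if copy["month_rentals"] > 0:  # assumption: no empty history entries
            history.append({
                "video_id": copy["video_id"],
                "copy_id": copy["copy_id"],
                "year": year,
                "month": month,
                "rentals": copy["month_rentals"],
                "days": copy["month_days"],
            })
        copy["month_rentals"] = 0
        copy["month_days"] = 0
```

Because history is written only by the monthly routine, the history file does not need to be on-line during rental processing.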

Now we can discuss module timing for the last group of modules. In this group we create and/or update Open Rentals, update Copy, and Print Receipt. Recall from analysis that Vic does not want file update success to be known to the customers. The receipt should be printed regardless of updating success. This implies that printing could be concurrent with the file processes. The file updates cannot be concurrent because they will all be on the same device. Since there is already contention for the file among the users, it is unlikely that we would want to increase contention by having the updates concurrent. If printing is the only concurrent process, it is not worth the cost to provide concurrency. Therefore, the processes will be made sequential for production operation. Figure 10-50 is not changed at this point.
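The sequential decision, combined with the requirement that the receipt print regardless of update success, might be sketched as follows. The function name and the way failures are collected are assumptions; the text does not say how update errors are reported internally.

```python
def finish_rental(update_steps, print_receipt):
    """Run the file updates one after another, then print the receipt.

    update_steps  -- list of (name, callable) pairs, run in order
    print_receipt -- callable; always invoked, whatever the updates did
    Returns a list of (name, message) for any updates that failed, so
    staff can follow up without the customer seeing the failure.
    """
    failures = []
    for name, step in update_steps:
        try:
            step()
        except Exception as exc:  # broad catch is acceptable in a sketch
            failures.append((name, str(exc)))
    print_receipt()
    return failures
```

Nothing here runs concurrently: the updates share one device, and printing alone does not justify the added cost and maintenance complexity.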

FIGURE 10-41 ABC Video Process Dependency Diagram

The entities and data attributes are added to the diagram next to show input and output processing. Two entities, EOD and Rental Archive, are still undefined, having been deferred in analysis. These are left as an exercise. The entities referenced in Rent/Return processing (Customer, Open Rental, Video, Copy, Customer History, and EOD) are all shown in Figure 10-51. When an action diagram arrow points from an entity to a process, it means that the entire tuple is accessed. The final action is to add screens to the action diagram, but they are not yet defined, so this activity is left as a future exercise.

FIGURE 10-42 ABC Video Maintenance Second-Level Action Diagram