CS302: Data Gathering Techniques for Each Application Type | Saylor Academy

Software Review

Frequently, applications are replacing older software that supports the work of user departments. Study of the existing software provides you with information about the current work procedures and the extent to which they are constrained by the software design. This, in turn, gives you information about questions to raise with the users, for instance, how much do they want work constrained by the application? If they could remove the constraints, how would they do the work?

The weaknesses of getting information from software review are that documentation might not be accurate or current, code might not be readable, and time might be wasted if the application is being discarded.

To summarize, the methods of collecting information relating to applications include interviews, group meetings, observation, questionnaires, temporary job assignment, document review, and software review. For obtaining information relating to requirements for applications, interviews and JAD meetings are the most common.

Data Collection and Application Type

In this section, we identify the data gathering techniques most useful for each application type. Like most aspects of application development, the techniques can be used for all application types, but because of their strengths and weaknesses, they do not always result in the type of information that is needed most. In this section, we first match data collection techniques to the data types discussed in the first section. Then, the data types are matched to application types (from Chapter 1). Next, we match the data collection techniques to application types based on the data types they have in common.

Data Collection Technique and Data Type

Table 4-7 summarizes the discussion of the above sections. Matching data collection techniques to data types makes it more likely that we identify the information of interest than choosing techniques without regard to data type. As the table shows, interviews and meetings are useful for eliciting all types of information, which is why they are the techniques most frequently used in application work.

Observation provides only crude numerical estimates of volumes and is restricted to the current time frame; the information it yields varies in ambiguity and may have variable semantics (see Table 4-7). Because the information from an observation is unstructured, some skill is required of the SE to impose a structure that fits the situation. Also, the information may be incomplete.

Questionnaires can ask structured questions about any time frame but only obtain complete answers for questions asked (see Table 4-7). If the questions are open-ended, the completeness might be quite low. Ambiguity in questionnaires should be low, but the question semantics might be misinterpreted by the respondents. Questions about volume at a department or organization level are usually inappropriate, but questions about the volume of transactions or the time for transaction processing for individual workers yield meaningful information.

Table 4-7 Data Collection Techniques and Data Type

Technique	Time	Structure	Completeness	Ambiguity	Semantics	Volume
Interview	All	All	All	All	Varies	All
Meeting	All	All	All	All	Varies	All
Observation	Current	Unstruct.	Incomplete	May vary	Varies	Crude measure
Questionnaire	All	Structured	Complete for questions asked	Low	Fixed but might be subject to interpretation	Individual volumes only
Temporary job assignment	Current	Unstruct.	Incomplete	Low-med.	Varies	For period of observation but may not be representative
Internal documents	Past-current	Unstruct.	Incomplete	Low-med.	Varies	Maybe
External documents	Mostly current-future	Unstruct.	Incomplete	Low-med.	Relatively fixed	N/A
Software review	Past-current	Structured	Complete for software	Low-med.	Fixed	Maybe

Temporary job assignments are similar to observation in having a high degree of uncertainty associated with the information obtained (see Table 4-7). The information tends to be current, unstructured, and incomplete, depending on the work period. Ambiguity varies from low to medium depending on how well-defined and structured the work is. Semantic content might vary depending on the shared definitions in the workgroup.

Documents provide unstructured, incomplete information from which relevant volume information is unlikely to be obtained. The time orientation differs depending on whether the documents are internal or external to the company (see Table 4-7). Internal documents are mostly oriented to the past or current situation. External documents are mostly oriented to current or future topics. The semantics of external documents on mature technologies or topics tend to be relatively fixed, while the semantics of internal documents might vary by department or division.

Software review provides structured past and possibly current information because the software is automated. The ambiguity should be low to medium, and semantics should be fixed since the application embeds definitions of data and processes in code. Information on volumes may be present but should be cross-checked using other methods.
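As an illustration only, the Time and Structure columns of Table 4-7 can be encoded as a small lookup and filtered by the time frame an SE needs to learn about. The names `PERIODS`, `TABLE_4_7`, and `techniques_covering` are hypothetical, and splitting each time entry into past, current, and future is an assumption made for this sketch, not part of the original table.

```python
# Assumed mapping from the Time entries of Table 4-7 to discrete periods.
PERIODS = {
    "All": {"past", "current", "future"},
    "Current": {"current"},
    "Past-current": {"past", "current"},
    "Mostly current-future": {"current", "future"},
}

# technique -> (time, structure), copied from Table 4-7.
TABLE_4_7 = {
    "Interview": ("All", "All"),
    "Meeting": ("All", "All"),
    "Observation": ("Current", "Unstructured"),
    "Questionnaire": ("All", "Structured"),
    "Temporary job assignment": ("Current", "Unstructured"),
    "Internal documents": ("Past-current", "Unstructured"),
    "External documents": ("Mostly current-future", "Unstructured"),
    "Software review": ("Past-current", "Structured"),
}


def techniques_covering(period):
    """Return the techniques whose time orientation includes the given period."""
    return [
        name
        for name, (time, _structure) in TABLE_4_7.items()
        if period in PERIODS[time]
    ]


print(techniques_covering("future"))
```

Under these assumptions, asking for "future" returns only interviews, meetings, questionnaires, and external documents, which mirrors the point that observation, job assignment, internal documents, and software review say little about what is to come.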

Data Type and Application Type

Application types are transaction processing (TPS), query, decision support (DSS), group decision support (GDSS), executive information (EIS), and expert systems (ES). Each has one or more predominant data type characteristics that identify it. Table 4-8 shows all applications categorized for all data types. Here, we discuss only the data types that differentiate between application types.

Table 4-8 Application Type by Data Type

Application Type	Time	Structure	Completeness	Ambiguity	Semantics	Volume
TPS	Current	Structured	Complete	Low	Fixed	Any
Query	Past, current	Structured	Complete	Low	Fixed	Any
DSS	All	Structured	Varies	Low-med.	Varies	Med.-high
GDSS	Current-future	Unstruct.	Incomplete	Medium-high	Varies	Low
EIS	Future	Unstruct.	Incomplete	Medium-high	Varies	Low-med.
Expert system	Current based on past	Semi-structured	Incomplete	Medium-high	May vary	Low

TPS contains predominantly known, current, structured, and complete information (see Table 4-8). Recall that TPS are the operational applications of a company. To control and maintain records of current operations, you must have known, structured, current, and complete information.

Query applications have similar characteristics to TPS with the difference that they might concentrate on historical information in addition to current information (see Table 4-8). Queries are questions posed on data to find problems and solutions and to analyze, summarize, and report on data. To confidently perform summaries and reports, the data must be structured, complete, and interpreted consistently, being both unambiguous and of fixed semantics.

DSS are statistical analysis tools that allow the development of information that aids the decision process. The data that identifies DSS may span all time frames, may be incomplete and ambiguous, may have variable semantics, and is of medium to high volume (see Table 4-8). DSS might be used, for instance, in analyzing which of two variations on a given product might enjoy the larger market share. To do this analysis, past sales, current sales, and sales trends in the industry might all be analyzed and tied together to develop an answer.

GDSS are meeting facilitation tools for groups of people. GDSS tools operate in a structured manner, working on data that is unstructured, current, and future-oriented. GDSS mostly deal with incomplete data that contains semantic and other ambiguities (see Table 4-8). The tools themselves are complete, unambiguous, and so forth, but the meeting information they process is not.

EIS are future-oriented applications that allow executives to scan the environment and identify trends, economic changes, or other industry activities that affect their governance of a company. EIS deals mostly with 'messy' data that is unstructured, incomplete, ambiguous, and contains variable semantics (see Table 4-8). Interpretation is always a problem with such data, which is why executives who excel at reading the environment are highly compensated.

Last, expert systems manage and reason through semistructured, incomplete, ambiguous, and variable semantic data (see Table 4-8). Experts and ESs take random, unstructured information and impose a structure. They reason how to interpret the data to remove ambiguity and fix the semantics. Therefore, although the data coming into the application might have these fuzzy characteristics, the data processing is highly structured.
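The idea of data types that "differentiate between application types" can be made concrete with a short sketch. It copies the rows of Table 4-8 into a dataclass and reports the dimensions on which two application types differ. The names `DataProfile`, `TABLE_4_8`, and `differing_fields` are hypothetical, and comparing cell values as plain strings is a simplifying assumption.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class DataProfile:
    """One row of Table 4-8; values are kept as the table's text labels."""
    time: str
    structure: str
    completeness: str
    ambiguity: str
    semantics: str
    volume: str


TABLE_4_8 = {
    "TPS": DataProfile("Current", "Structured", "Complete", "Low", "Fixed", "Any"),
    "Query": DataProfile("Past, current", "Structured", "Complete", "Low", "Fixed", "Any"),
    "DSS": DataProfile("All", "Structured", "Varies", "Low-med.", "Varies", "Med.-high"),
    "GDSS": DataProfile("Current-future", "Unstruct.", "Incomplete", "Medium-high", "Varies", "Low"),
    "EIS": DataProfile("Future", "Unstruct.", "Incomplete", "Medium-high", "Varies", "Low-med."),
    "Expert system": DataProfile("Current based on past", "Semi-structured", "Incomplete",
                                 "Medium-high", "May vary", "Low"),
}


def differing_fields(app_a, app_b):
    """Return the data-type dimensions on which two application types differ."""
    a, b = TABLE_4_8[app_a], TABLE_4_8[app_b]
    return [f.name for f in fields(DataProfile) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_fields("TPS", "Query"))
print(differing_fields("GDSS", "EIS"))
```

Comparing TPS with query applications reports only the time dimension, which matches the observation that query applications resemble TPS except for their additional concentration on historical information.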

Table 4-9 Data Collection Technique and Application Type

	TPS	Query	DSS	GDSS	EIS	ES
Interview	X*	X	X	X	X	X
Meeting	X	X	X	X	X	X
Observation	X	X	X	Limited	Limited	X
Questionnaire	X	X	X
Temporary job assignment	X	X	X
Internal documents	X	X	X	Limited
External documents	X	X	X	X	X	X
Software review	X	X	X	Limited	Limited	Limited

*Boldface identifies the most frequently used method.

Data Collection Technique and Application Type

Finally, having discussed the different data types, we want to know which data collection techniques are best for each application type. Combining the information in Tables 4-7 and 4-8, we develop Table 4-9 to summarize data collection techniques for each application type. The table entry in boldface shows the principal data collection method for each technique.

TPS and query applications can profit from the use of all techniques. Meetings and interviews predominate because they elicit the broadest range of responses in the shortest time (see Table 4-9). Observation and temporary job assignment are particularly useful in obtaining background information about the current problem domain, but need to be used with caution so as not to prejudice the design of the application. Questionnaires are useful when the number of people to be interviewed is over 50. Questionnaires are also useful in identifying characteristics of users that determine, for instance, the training required of users during organizational feasibility analysis. In addition, if screen design involves choices of, for instance, colors or different types of screen arrangements, questionnaires might be useful for presenting a small set of alternatives from which the actual users choose.

DSS also are shown as having a use for all data collection techniques, but not all techniques are practical in all cases (see Table 4-9). DSS are generally developed for use by people in jobs with a significant amount of discretion in what they do and how they do it. Therefore, observing or working with one or two people as representative may result in a biased view of the application requirements for a general purpose DSS. Even for a custom DSS, observation and job assignments might both be impractical if the SE does not know enough about the job being supported to interpret what she or he observes. The same holds true of documents. Documents, such as statistical reports, might be useful for providing samples of the types of analyses desired in a DSS. Other documents, such as policies, procedures, and so on, are not likely to be relevant to the application. For general purpose DSS with a large number of users, questionnaires are a useful way to identify the range of problems and analysis techniques required in the DSS. This information might be followed by interviews or meetings to determine DSS details.
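For readers who want to query Table 4-9 rather than scan it, the following is a minimal sketch of the table as a lookup. The names `APP_TYPES`, `TABLE_4_9`, and `techniques_for` are hypothetical. The table leaves some cells blank, and the column positions of those blanks are an assumption here, inferred from the discussion of each application type in this section.

```python
APP_TYPES = ["TPS", "Query", "DSS", "GDSS", "EIS", "ES"]

# "X" = technique applies, "Limited" = limited use, "" = no entry.
# Placement of the blank cells is an assumption based on the surrounding text.
TABLE_4_9 = {
    "Interview": ["X", "X", "X", "X", "X", "X"],
    "Meeting": ["X", "X", "X", "X", "X", "X"],
    "Observation": ["X", "X", "X", "Limited", "Limited", "X"],
    "Questionnaire": ["X", "X", "X", "", "", ""],
    "Temporary job assignment": ["X", "X", "X", "", "", ""],
    "Internal documents": ["X", "X", "X", "Limited", "", ""],
    "External documents": ["X", "X", "X", "X", "X", "X"],
    "Software review": ["X", "X", "X", "Limited", "Limited", "Limited"],
}


def techniques_for(app_type, include_limited=False):
    """List the techniques marked for an application type in Table 4-9."""
    column = APP_TYPES.index(app_type)
    allowed = {"X", "Limited"} if include_limited else {"X"}
    return [name for name, row in TABLE_4_9.items() if row[column] in allowed]


print(techniques_for("EIS"))
print(techniques_for("GDSS", include_limited=True))
```

With the blank cells placed this way, the EIS column yields interviews, meetings, and external documents, and passing `include_limited=True` adds the techniques the table marks as having only limited use.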

GDSS are usually custom-built suites of software packages that provide different types of support for automated meetings. As such, the SE working on a GDSS environment needs to know the types of issues, the number of participants, and the types of reasoning and group consensus techniques desired. GDSS components are neither common knowledge nor frequently used; you might build one GDSS in a career. Therefore, significant time would be spent finding out about the market, vendors, and GDSS components. External documents on vendor products are useful in developing questions that elicit the required information. After knowledge of the market is obtained, interviews and meetings are useful to determine the specific requirements and to review, with users, what the GDSS can and cannot do. Other methods might have some limited value. For instance, observation of an actual meeting that might be automated would be useful for the SE to gain insight about how a tool might work. Internal documents that provide information about the meetings the GDSS is expected to support would also be useful. Both of these techniques, observation and document review, have a specific limited role in providing the information needed to build a GDSS. Any software review that is done would be a review of other companies' GDSS facilities or of vendor products, rather than a review of in-house software.

EIS are similar to GDSS in their rarity and in the general lack of knowledge about what an EIS is. EIS are not standard applications with a screen for data entry of some type and reports that are displayed. EIS are information presentation facilities that can be structured with menus and selection tools, but may display document pages, newspaper articles, book abstracts, summary reports, and so on. EIS are usually built for a small number of users, which eliminates the use of questionnaires. EIS are custom and one-of-a-kind environments for which past documents or software will be of limited value. Observation is most likely limited because executives would be uncomfortable being observed. Temporary job assignment is not possible because you cannot just 'be an executive' for a week or two. This leaves external documents, interviews, and meetings as the most likely techniques for data collection (see Table 4-9). As with GDSS, external documents will mostly be used to identify the market, vendors, and products. Interviews are most likely to be used to determine executives' information needs and preferred delivery platforms.

Finally, SEs use interviews, observation, and external documents the most in developing expert systems (see Table 4-9). Experts frequently can talk about external aspects of their jobs, the physical cues they use as inputs, and the result of their reasoning and how it is applied to the business. They are just as frequently unable to discuss their reasoning processes and how they put the cues together to make sense of unstructured situations. Experts, by definition of the term expert, have so internalized their work that they just do it. They don't think consciously about how they are doing what they do. Therefore, observation, in particular, the use of protocol analysis, is useful in getting information the expert might not be able to articulate. Protocol analysis is time-consuming and indefinite because you, the SE, are inferring a reasoning process from actions taken. At best, the protocol analysis gives you questions to ask about the work that assist the experts in discussing aspects of work they ordinarily cannot. Thus, observation is interleaved with interviews to discuss what is observed. As the process continues, structure is imposed on both the data and the problems to begin to develop the ES. The process of obtaining an expert's reasoning processes is called knowledge elicitation. The process of structuring the unstructured data and reasoning information is called knowledge engineering. Knowledge engineering is an activity that is difficult to learn and requires training through an apprenticeship approach in which the trainee works with an expert knowledge engineer.

Source: Adapted from Sue Conger, https://resources.saylor.org/CS/CS302/OER/The_New_Software_Engineering.pdf
This work is licensed under a Creative Commons Attribution 3.0 License.

Last modified: Friday, 8 December 2023, 1:05 PM
