Data Gathering Techniques for Each Application Type

Read this section and identify the data types needed by each application type. These are summarized in a table that relates application type to the type of data it needs. Using this table (application type/data type needed) and the previous table (data gathering technique – data type obtained), requirements stakeholders can select the most appropriate data gathering techniques. This is illustrated in the combined table that relates data gathering technique to application type.

Software Review 

Applications frequently replace older software that supports the work of user departments. Studying the existing software tells you about the current work procedures and the extent to which they are constrained by the software design. This, in turn, suggests questions to raise with the users. For instance, how much do they want the work constrained by the application? If the constraints were removed, how would they do the work?

The weaknesses of getting information from software review are that documentation might not be accurate or current, code might not be readable, and the time might be wasted if the application is being discarded.

To summarize, the methods of collecting information relating to applications include interviews, group meetings, observation, questionnaires, temporary job assignment, document review, and software review. For obtaining information about application requirements, interviews and JAD meetings are the most common.


Data Collection and Application Type

In this section, we identify the data gathering techniques most useful for each application type. Like most aspects of application development, the techniques can be used for all application types, but because of their strengths and weaknesses, they do not always result in the type of information that is needed most. In this section, we first match data collection techniques to the data types discussed in the first section. Then, the data types are matched to application types (from Chapter 1). Next, we match the data collection techniques to application types based on the data types they have in common.


Data Collection Technique and Data Type 

Table 4-7 summarizes the discussion in the preceding sections. By matching data collection techniques to the types of data they yield, we are more likely to obtain the information of interest than by choosing techniques without regard to data type. As the table shows, interviews and meetings are useful for eliciting all types of information. This is the reason they are most frequently used in application work.

Observation provides only crude numerical estimates of volumes and is restricted to the current time frame. Its ambiguity varies, and its semantics may vary as well (see Table 4-7). Because the information from an observation is unstructured, some skill is required of the SE to impose a structure that fits the situation. Also, the information may be incomplete.

Questionnaires can ask structured questions about any time frame but only obtain complete answers for the questions asked (see Table 4-7). If the questions are open-ended, completeness might be quite low. Ambiguity in questionnaires should be low, but respondents might misinterpret the semantics of a question. Questions about volume at a department or organization level are usually inappropriate. Questions about the volume of transactions or the time taken to process a transaction by individual workers, however, yield meaningful information.

Table 4-7 Data Collection Techniques and Data Type

| Technique | Time | Structure | Completeness | Ambiguity | Semantics | Volume |
|---|---|---|---|---|---|---|
| Interview | All | All | All | All | Varies | All |
| Meeting | All | All | All | All | Varies | All |
| Observation | Current | Unstruct. | Incomplete | May vary | Varies | Crude measure |
| Questionnaire | All | Structured | Complete for questions asked | Low | Fixed but might be subject to interpretation | Individual volumes only |
| Temporary job assignment | Current | Unstruct. | Incomplete | Low-med. | Varies | For period of observation but may not be representative |
| Internal documents | Past-current | Unstruct. | Incomplete | Low-med. | Varies | Maybe |
| External documents | Mostly current-future | Unstruct. | Incomplete | Low-med. | Relatively fixed | N/A |
| Software review | Past-current | Structured | Complete for software | Low-med. | Fixed | Maybe |


Temporary job assignments are similar to observation in having a high degree of uncertainty associated with the information obtained (see Table 4-7). The information tends to be current, unstructured, and incomplete, depending on the work period. Ambiguity varies from low to medium depending on how well-defined and structured the work is. Semantic content might vary depending on the shared definitions in the workgroup. 

Documents provide unstructured, incomplete information from which relevant volume information is unlikely to be obtained. The time orientation differs depending on whether the documents are internal or external to the company (see Table 4-7). Internal documents are mostly oriented to the past or current situation. External documents are mostly oriented to current or future topics. The semantics of external documents on mature technologies or topics tend to be relatively fixed, while the semantics of internal documents might vary by department or division.

The software provides structured past and possibly current information because it is automated. The ambiguity should be low to medium, and semantics should be fixed since the application embeds definitions of data and processes in code. Information on volumes may be present but should be cross-checked using other methods.
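The characteristics in Table 4-7 can be treated as records and filtered by the kind of data a project needs. The following is a minimal sketch, not a method given in the source. The names `TechniqueProfile`, `TABLE_4_7`, and `techniques_covering` are hypothetical, and encoding each Time entry as a set of time frames (for example, "Mostly current-future" as current and future) is a simplifying assumption that drops qualifiers such as "mostly".

```python
from dataclasses import dataclass

# Assumption: every Time entry in Table 4-7 is reduced to a set of time frames.
ALL_TIMES = frozenset({"past", "current", "future"})


@dataclass(frozen=True)
class TechniqueProfile:
    """One row of Table 4-7 (hypothetical encoding)."""
    name: str
    time: frozenset      # time frames the technique can report on
    structure: str
    completeness: str
    ambiguity: str
    semantics: str
    volume: str


TABLE_4_7 = [
    TechniqueProfile("Interview", ALL_TIMES, "All", "All", "All", "Varies", "All"),
    TechniqueProfile("Meeting", ALL_TIMES, "All", "All", "All", "Varies", "All"),
    TechniqueProfile("Observation", frozenset({"current"}), "Unstructured",
                     "Incomplete", "May vary", "Varies", "Crude measure"),
    TechniqueProfile("Questionnaire", ALL_TIMES, "Structured",
                     "Complete for questions asked", "Low",
                     "Fixed but subject to interpretation",
                     "Individual volumes only"),
    TechniqueProfile("Temporary job assignment", frozenset({"current"}),
                     "Unstructured", "Incomplete", "Low-med.", "Varies",
                     "For period of observation"),
    TechniqueProfile("Internal documents", frozenset({"past", "current"}),
                     "Unstructured", "Incomplete", "Low-med.", "Varies", "Maybe"),
    TechniqueProfile("External documents", frozenset({"current", "future"}),
                     "Unstructured", "Incomplete", "Low-med.",
                     "Relatively fixed", "N/A"),
    TechniqueProfile("Software review", frozenset({"past", "current"}),
                     "Structured", "Complete for software", "Low-med.",
                     "Fixed", "Maybe"),
]


def techniques_covering(time_frame: str) -> list[str]:
    """Return the techniques whose Time entry includes the given time frame."""
    return [t.name for t in TABLE_4_7 if time_frame in t.time]
```

For example, `techniques_covering("future")` lists the techniques that, under this encoding, can supply future-oriented information: interviews, meetings, questionnaires, and external documents.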


Data Type and Application Type 

Application types are transaction processing (TPS), query, decision support (DSS), group decision support (GDSS), executive information (EIS), and expert systems (ES). Each has one or more predominant data type characteristics that identify it. Table 4-8 shows all application types categorized for all data types. Here, we discuss only the data types that differentiate between application types.

Table 4-8 Application Type by Data Type

| Application Type | Time | Structure | Completeness | Ambiguity | Semantics | Volume |
|---|---|---|---|---|---|---|
| TPS | Current | Structured | Complete | Low | Fixed | Any |
| Query | Past, current | Structured | Complete | Low | Fixed | Any |
| DSS | All | Structured | Varies | Low-med. | Varies | Med-high |
| GDSS | Current-future | Unstruct. | Incomplete | Medium-high | Varies | Low |
| EIS | Future | Unstruct. | Incomplete | Medium-high | Varies | Low-med. |
| Expert system | Current based on past | Semi-structured | Incomplete | Medium-high | May vary | Low |


TPS contains predominantly known, current, structured, and complete information (see Table 4-8). Recall that TPS are the operational applications of a company. To control and maintain records of current operations, you must have known, structured, current, and complete information. 

Query applications have similar characteristics to TPS with the difference that they might concentrate on historical information in addition to current information (see Table 4-8). Queries are questions posed on data to find problems and solutions and to analyze, summarize, and report on data. To confidently perform summaries and reports, the data must be structured, complete, and interpreted consistently, being both unambiguous and of fixed semantics.

DSS are statistical analysis tools that allow the development of information that aids the decision process. The data that characterizes DSS can span all time frames and may be incomplete, ambiguous, variable in semantics, and medium to high in volume (see Table 4-8). DSS might be used, for instance, in analyzing which of two variations on a given product might enjoy the larger market share. To do this analysis, past sales, current sales, and sales trends in the industry might all be analyzed and tied together to develop an answer.

GDSS are meeting facilitation tools for groups of people. GDSS tools operate in a structured manner, working on data that is unstructured, current, and future-oriented. GDSS mostly deal with incomplete data that contains semantic and other ambiguities (see Table 4-8). The tools themselves are complete, unambiguous, and so forth, but the meeting information they process is not.

EIS are future-oriented applications that allow executives to scan the environment and identify trends, economic changes, or other industry activities that affect their governance of a company. EIS deals mostly with 'messy' data that is unstructured, incomplete, ambiguous, and contains variable semantics (see Table 4-8). Interpretation is always a problem with such data, which is why executives who excel at reading the environment are highly compensated.

Last, expert systems manage and reason through semistructured, incomplete, ambiguous, and variable semantic data (see Table 4-8). Experts and ESs take random, unstructured information and impose a structure. They reason how to interpret the data to remove ambiguity and fix the semantics. Therefore, although the data coming into the application might have these fuzzy characteristics, the data processing is highly structured.

Table 4-9 Data Collection Technique and Application Type

| Technique | TPS | Query | DSS | GDSS | EIS | ES |
|---|---|---|---|---|---|---|
| Interview | X* | X | X | X | X | X |
| Meeting | X | X | X | X | X | X |
| Observation | X | X | X | Limited | Limited | X |
| Questionnaire | X | X | X | | | |
| Temporary job assignment | X | X | X | | | |
| Internal documents | X | X | X | Limited | | |
| External documents | X | X | X | X | X | X |
| Software review | X | X | X | Limited | Limited | Limited |

*Boldface identifies the most frequently used method.

Data Collection Technique and Application Type 

Finally, having discussed the different data types, we want to know which data collection techniques are best for each application type. Combining the information in Tables 4-7 and 4-8, we develop Table 4-9 to summarize data collection techniques for each application type. The table entry in boldface shows the principal data collection method.
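The idea of combining Tables 4-7 and 4-8 through the data types they share can be sketched in code. This is an illustration under stated assumptions, not the procedure used to build Table 4-9, which also reflects practical judgment discussed in the text (for instance, executives' discomfort with being observed). The sketch uses only the Time and Structure columns. The scoring rule, the names `TECHNIQUES`, `APPLICATIONS`, `match_score`, and `full_matches`, and the encoding of "Current based on past" as past and current are all hypothetical.

```python
# Assumption: only the Time and Structure columns are compared.
ALL_TIMES = {"past", "current", "future"}

# From Table 4-7: technique -> (time frames covered, structure obtained)
TECHNIQUES = {
    "Interview": (ALL_TIMES, "All"),
    "Meeting": (ALL_TIMES, "All"),
    "Observation": ({"current"}, "Unstructured"),
    "Questionnaire": (ALL_TIMES, "Structured"),
    "Temporary job assignment": ({"current"}, "Unstructured"),
    "Internal documents": ({"past", "current"}, "Unstructured"),
    "External documents": ({"current", "future"}, "Unstructured"),
    "Software review": ({"past", "current"}, "Structured"),
}

# From Table 4-8: application type -> (time frames needed, structure needed)
APPLICATIONS = {
    "TPS": ({"current"}, "Structured"),
    "Query": ({"past", "current"}, "Structured"),
    "DSS": (ALL_TIMES, "Structured"),
    "GDSS": ({"current", "future"}, "Unstructured"),
    "EIS": ({"future"}, "Unstructured"),
    "ES": ({"past", "current"}, "Semi-structured"),
}


def match_score(technique: str, application: str) -> int:
    """Count how many of the two compared characteristics line up (0 to 2)."""
    tech_time, tech_structure = TECHNIQUES[technique]
    app_time, app_structure = APPLICATIONS[application]
    # Time matches when the technique covers every time frame the application needs.
    time_ok = app_time <= tech_time
    # Structure matches when the technique yields "All" or exactly the needed structure.
    structure_ok = tech_structure in ("All", app_structure)
    return int(time_ok) + int(structure_ok)


def full_matches(application: str) -> list[str]:
    """Return the techniques that match the application on both characteristics."""
    return [name for name in TECHNIQUES if match_score(name, application) == 2]
```

Under these assumptions, `full_matches("EIS")` yields interviews, meetings, and external documents. Results for other application types need not agree with Table 4-9, because the table weighs all six data types as well as the practical constraints described in this section.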

TPS and query applications can profit from the use of all techniques. Meetings and interviews predominate because they elicit the broadest range of responses in the shortest time (see Table 4-9). Observation and temporary job assignment are particularly useful for obtaining background information about the current problem domain, but they need to be used with caution so as not to prejudice the design of the application. Questionnaires are useful when the number of people to be interviewed is over 50. They are also useful for identifying user characteristics that determine, for instance, the training users will require, as assessed during organizational feasibility analysis. In addition, if the screen design involves choices such as colors or alternative screen arrangements, questionnaires can present a small set of alternatives from which the actual users choose.

DSS are also shown as having a use for all data collection techniques, but not all techniques are practical in all cases (see Table 4-9). DSS are generally developed for use by people in jobs with a significant amount of discretion in what they do and how they do it. Therefore, observing or working with one or two people as representatives may result in a biased view of the application requirements for a general purpose DSS. Even for a custom DSS, observation and job assignments might both be impractical if the SE does not know enough about the job being supported to interpret what she or he observes. The same holds true of documents. Documents such as statistical reports might be useful for providing samples of the types of analyses desired in a DSS. Other documents, such as policies and procedures, are not likely to be relevant to the application. For a general purpose DSS with a large number of users, questionnaires are a useful way to identify the range of problems and analysis techniques required in the DSS. This information might be followed by interviews or meetings to determine DSS details.

GDSS are usually custom-built suites of software packages that provide different types of support for automated meetings. As such, the SE working on a GDSS environment needs to know the types of issues, the number of participants, and the types of reasoning and group consensus techniques desired. GDSS components are neither common knowledge nor frequently used; you might build one GDSS in a career. Therefore, significant time would be spent finding out about the market, vendors, and GDSS components. External documents on vendor products are useful in developing questions that elicit the required information. After knowledge of the market is obtained, interviews and meetings are useful to determine the specific requirements and to review, with users, what the GDSS can and cannot do.

Other methods might have some limited value. For instance, observation of an actual meeting that might be automated would be useful for the SE to gain insight about how a tool might work. Internal documents that provide information about the meetings the GDSS is expected to support would also be useful. Both of these techniques, observation and document review, have a specific limited role in providing the information needed to build a GDSS. Any software review that is done would be a review of other companies' GDSS facilities or of vendor products, rather than a review of in-house software.

EIS are similar to GDSS in their rarity and in the general lack of knowledge about what they are. EIS are not standard applications with a screen for data entry of some type and reports that are displayed. EIS are information presentation facilities that can be structured with menus and selection tools, but may display document pages, newspaper articles, book abstracts, summary reports, and so on. EIS are usually built for a small number of users, which eliminates the use of questionnaires. EIS are custom, one-of-a-kind environments for which past documents or software will be of limited value. Observation is most likely limited because executives would be uncomfortable being observed. Temporary job assignment is not possible because you cannot just 'be an executive' for a week or two. This leaves external documents, interviews, and meetings as the most likely techniques for data collection (see Table 4-9). As with GDSS, external documents will mostly be used to identify the market, vendors, and products. Interviews are most likely to be used to determine executives' information needs and preferred delivery platforms.

Finally, SEs use interviews, observation, and external documents the most in developing expert systems (see Table 4-9). Experts frequently can talk about external aspects of their jobs, the physical cues they use as inputs, and the result of their reasoning and how it is applied to the business. They are just as frequently unable to discuss their reasoning processes and how they put the cues together to make sense of unstructured situations. Experts, by definition of the term expert, have so internalized their work that they just do it. They don't think consciously about how they are doing what they do. Therefore, observation, in particular, the use of protocol analysis, is useful in getting information the expert might not be able to articulate. Protocol analysis is time-consuming and indefinite because you, the SE, are inferring a reasoning process from actions taken. At best, the protocol analysis gives you questions to ask about the work that assist the experts in discussing aspects of work they ordinarily cannot. Thus, observation is interleaved with interviews to discuss what is observed. As the process continues, structure is imposed on both the data and the problems to begin to develop the ES. The process of obtaining an expert's reasoning processes is called knowledge elicitation. The process of structuring the unstructured data and reasoning information is called knowledge engineering. Knowledge engineering is an activity that is difficult to learn and requires training through an apprenticeship approach in which the trainee works with an expert knowledge engineer.


Source: Adapted from Sue Conger, https://resources.saylor.org/CS/CS302/OER/The_New_Software_Engineering.pdf
This work is licensed under a Creative Commons Attribution 3.0 License.

Last modified: Friday, December 8, 2023, 1:05 PM