Data Collection Techniques


Description

Data gathering is the interaction between the software engineer (a business analyst) and the customers (including users). There are many techniques for gathering data, including interviews, meetings, observations, questionnaires, and reviewing software, internal documents, and external documents. Data gathering is an activity where ethical and professional conduct issues typically arise, particularly regarding privacy, security, responsibility, accountability, and communication.

Single Interview

There are seven techniques we use for data gathering during application development: interviews, group meetings, observation, temporary job assignment, questionnaires, review of internal and external documents, and review of software. Each technique is best suited to particular uses, and each limits the amount and type of information that can be obtained. The strengths and weaknesses of the techniques are summarized in Table 4-2, which is referenced throughout this section.

In general, you always want to validate the information received from any source through triangulation. Triangulation is obtaining the same information from multiple sources. You might ask the same question in several interviews, compare questionnaire responses to each item, or check in-house and external documents for similar information. When a discrepancy is found, you reverify it with the original and triangulated sources as much as possible. If the information is critical to the application being correctly developed, put the definitions, explanations, or other information in writing and have it approved by the users separately from the other documentation. Next, we discuss each data collection technique.
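The cross-check itself can be kept very simple. As an illustrative sketch only (the question identifiers, source names, and answers below are invented, not taken from the text), triangulated answers can be recorded per question, and any question whose sources disagree can be flagged for reverification:

    from collections import defaultdict

    # Answers gathered from several sources for the same questions.
    # The questions, sources, and answers are hypothetical.
    answers = [
        ("daily_rental_volume", "branch manager interview", "about 300"),
        ("daily_rental_volume", "monthly operations report", "about 300"),
        ("daily_rental_volume", "clerk questionnaire", "450 to 500"),
        ("late_fee_policy", "policy manual", "$1.50 per day"),
        ("late_fee_policy", "clerk interview", "$1.50 per day"),
    ]

    by_question = defaultdict(dict)
    for question, source, answer in answers:
        by_question[question][source] = answer

    # Flag any question whose sources disagree, so it can be reverified.
    for question, sources in by_question.items():
        if len(set(sources.values())) > 1:
            print(f"Reverify '{question}':")
            for source, answer in sources.items():
                print(f"  {source}: {answer}")

Anything flagged this way goes back to the original and triangulated sources, as described above.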

TABLE 4-2 Summary of Data Collection Techniques

Interviews

Strengths:
  • Get both qualitative and quantitative information
  • Get both detail and summary information
  • Good method for surfacing requirements

Weaknesses:
  • Takes some skill
  • May obtain biased results
  • Can result in misleading, inaccurate, or irrelevant information
  • Requires triangulation to verify results
  • Not useful with large numbers of people to be interviewed (e.g., over 50)

Group Meetings

Strengths:
  • Decisions can be made
  • Can get both detail and summary information
  • Good for surfacing requirements
  • Gets many users involved

Weaknesses:
  • Decisions with a large number of participants can take a long time
  • Wastes time
  • Interruptions divert attention of participants
  • Arguments about turf, politics, etc. can occur
  • Wrong participants lead to poor results

Observation

Strengths:
  • Surfaces unarticulated procedures, decision criteria, reasoning processes
  • Not biased by opinion
  • Observer gets good problem domain understanding

Weaknesses:
  • Might not be a representative time period
  • Behavior might be changed as a result of being observed
  • Time consuming

Review Software

Strengths:
  • Good for learning current work procedures as constrained or guided by software design
  • Good for identifying questions to ask users about functions: how they work and whether they should be kept

Weaknesses:
  • May not be current
  • May be inaccurate
  • Time consuming

Questionnaire

Strengths:
  • Anonymity for respondent
  • Attitudes and feelings might be more honestly expressed
  • Large numbers of people can be surveyed easily
  • Best for limited-response, closed-ended questions
  • Good for multicultural companies to surface biases, or requirements and design features that should be customized to fit local conventions

Weaknesses:
  • Recall may be imperfect
  • Unanswered questions mean you cannot get the information
  • Questions might be misinterpreted
  • Reliability or validity may be low
  • Might not add useful information to what is already known

Temporary Assignment

Strengths:
  • Good for learning current context, terminology, procedures, problems
  • Basis for questions you might not otherwise ask

Weaknesses:
  • May not include representative work activities or time period
  • Time consuming
  • May bias future design work

Review Internal Documents

Strengths:
  • Good for learning history and politics
  • Explains current context
  • Good for understanding current application
  • Saves interview/user time

Weaknesses:
  • May bias future design work
  • Not useful for obtaining attitudes or motivations

Review External Documents

Strengths:
  • Good for identifying industry trends, surveys, expert opinions, other companies' experiences, and technical information relating to the problem domain

Weaknesses:
  • May not be relevant
  • Information may not be accurate
  • May bias future design work


TABLE 4-3 Steps to Conducting a Successful Interview

1. Make an appointment that is at the convenience of the interviewee.
2. Prepare the interview; know the interviewee.
3. Be on time.
4. Have a planned beginning to the interview.
a. Introduce yourself and your role on the project.
b. Use open-ended general questions to begin the discussion.
c. Be interested in all responses, pay attention.
5. Have a planned middle to the interview.
a. Combine open-ended and closed-ended questions to obtain the information you want.
b. Follow-up comments by probing for more detail.
c. Provide feedback to the interviewee in the form of comments, such as, "Let me tell you what I think you mean, ... "
d. Limit your notetaking to avoid distracting the interviewee.
6. Have a planned closing to the interview.
a. Summarize what you have heard. Ask for corrections as needed.
b. Request feedback, note validation, or other actions of interviewee.
      • Give him or her a date by which they will receive information for review.
      • Ask him or her for a date by which the review should be complete.
c. If a follow-up interview is scheduled, confirm the date and time.

A good interview has a beginning, middle, and end. In the beginning, you introduce yourself and put the interviewee at ease. Begin with general questions that are inoffensive and not likely to evoke an emotional response. Pay attention to answers, both to get cues for other questions and to gauge the honesty and attitude of the interviewee. In the middle, be businesslike and stick to the subject. Get all the information you came for, using the techniques you chose in advance. If some interesting side information emerges, ask if you can talk about it later, and then do so. In closing, summarize what you have heard and tell the interviewee what happens next. You may write notes and ask him or her to review them for accuracy. If you take notes, try to get them back to the interviewee for review within 48 hours. Also, have the interviewee commit to completing the review by a specific date to aid in your time planning. If you say you will follow up with some activity, make sure you do.

Interviews use two types of questions: open-ended and closed-ended. An open-ended question is one that asks for a multi-sentence response. Open-ended questions are good for eliciting descriptions of current and proposed application functions, and for identifying feelings, opinions, and expectations about a proposed application. They can also be used to obtain any lengthy or explanatory answers. Examples of open-ended question openings are: "Can you tell me about ...", "What do you think about ...", and "Can you describe how you use ...".

A closed-ended question is one which asks for a yes/no or specific answer. Closed-ended questions are good for eliciting factual information or forcing people to take a position on a sensitive issue. An example of a closed-ended question is: "Do you use the monthly report?" A 'yes' response might be followed by an open-ended question, "Can you explain how?"

The questions can be ordered in such a way that the interview might be structured or unstructured (see Table 4-4). A structured interview is one in which the interviewer has an agenda of items to cover, specific questions to ask, and specific information desired. A mix of open and closed questions is used to elicit details of interest. For instance, the interview might start with "Describe the current rental process." The respondent would describe the process, most often using general terms. The interviewer might then ask specific questions, such as, "What is the daily volume of rentals?" Each structured interview is basically the same because the same questions are asked in the same sequence. Tallying the responses is fairly easy because of the structure.
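Because a structured interview follows a fixed script, tallying is straightforward. The following sketch is illustrative only; the questions, answer categories, and recorded answers are invented to show how the same closed-ended questions, asked in the same order, can be tallied across interviewees:

    from collections import Counter

    # Hypothetical structured-interview script: every interview asks the
    # same questions, in the same order.
    script = [
        {"id": "Q1", "type": "open",   "text": "Describe the current rental process."},
        {"id": "Q2", "type": "closed", "text": "What is the daily volume of rentals?"},
        {"id": "Q3", "type": "closed", "text": "Do you use the monthly report?"},
    ]

    # Hypothetical recorded answers, one dictionary per interviewee.
    interviews = [
        {"Q2": "under 300", "Q3": "yes"},
        {"Q2": "300 to 500", "Q3": "yes"},
        {"Q2": "under 300", "Q3": "no"},
    ]

    for question in script:
        if question["type"] != "closed":
            continue  # open-ended answers are summarized, not tallied
        tally = Counter(i.get(question["id"], "no answer") for i in interviews)
        print(question["text"], dict(tally))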

An unstructured interview is one in which the interview unfolds and is directed by responses of the interviewee. The questions tend to be mostly open-ended. There is no set agenda, so the interviewer, who knows the information desired, uses the responses from the open-ended questions to develop ever more specific questions about the topics. The same questions used above as examples for the structured interview might also be used in an unstructured interview; the difference is that above, they are determined as a 'script' in advance. In an unstructured situation, the questions flow from the conversation.

TABLE 4-4 Comparison of Structured and Unstructured Interviews

Structured Interviews

Strengths:
  • Uses uniform wording of questions for all respondents
  • Easy to administer and evaluate
  • More objective evaluation of respondents and answers to questions
  • Requires little training

Weaknesses:
  • Cost of preparation can be high
  • Respondents do not always accept the high level of structure and its mechanical posing of questions
  • High level of structure is not suited to all situations
  • Reduces spontaneity and the interviewer's ability to follow up on comments of the interviewee

Unstructured Interviews

Strengths:
  • Provides greater flexibility in question wording to suit the respondent
  • May surface otherwise overlooked information

Weaknesses:
  • Can be difficult to conduct because the interviewer must listen carefully to develop questions about issues that arise spontaneously from answers
  • Requires practice
  • May waste respondent and interviewer time
  • Interviewer bias in questions or reporting of results is more likely
  • Extraneous information must be culled through
  • Analysis and interpretation of results may be lengthy
  • Takes more time to collect essential facts

Structured interviews are most useful when you know the information desired in advance of the interview (see Table 4-4). Conversely, unstructured interviews are most useful when you cannot anticipate the topics or specific outcome. A typical series of interviews with a user client begins with unstructured interviews to give you an understanding of the problem domain. The interviews get progressively structured and focused as the information you need to complete the analysis also gets more specific.

User interview results should always be communicated back to the interviewee within a short period of time, and the interviewee should be given a deadline for the review. If the person and/or information are critical to the application design being correct, you should ask for comments even after the deadline is missed. If the person is not key to the development, the deadline marks the period during which you will accept changes; after that date, you continue work assuming the information is correct.

It is good practice to develop diagram(s) as part of the interview documentation. At the beginning of the next interview session, you discuss the diagram(s) with the user and give him or her any written notes to verify at a later time. You get immediate feedback on the accuracy of the graphic and your understanding of the application. The benefits of this approach are both technical and psychological. From a technical perspective, you are constantly verifying what you have been told. By the time the analysis is complete, both you and the client have confidence that the depicted application processing is correct and complete. From a psychological perspective, you increase user confidence in your analytical ability by demonstrating your problem understanding. Each time you improve the diagram and deepen the analysis, you also increase user confidence that you will build an application that answers his or her needs.

Interviews are useful for obtaining both qualitative and quantitative information (see Table 4-2). The types of qualitative information are opinions, beliefs, attitudes, policies, and narrative descriptions. The types of quantitative information include frequencies, numbers, and quantities of items to be tracked or used in the application. 

Interviews, and other forms of data collection, can give you misleading, inaccurate, politically motivated, or irrelevant information (see Table 4-2). You need to learn to read the person's body language and behavior to decide whether the same information needs to be verified further. Table 4-5 lists respondent behaviors you might see in an interview and the actions you might take in dealing with them.

For instance, if you suspect the interviewee of lying or 'selectively remembering' information, try to cross-check the answers with other, more reliable sources. If the interview information is found to be false, ask the interviewee to please explain the differences between his or her answers and the other information. The session does not need to be a confrontation, rather, it is a simple request for explanation. Be careful not to accuse or condemn, simply try to get the correct information.

Persistence and triangulation are key to getting complete, accurate information. You are not required to become 'friends' with the application users, but interviews are smoother, yield more information for the time spent, and usually involve less 'game-playing' if you are friendly than if you are viewed as distant, overly objective, or uninterested.


Source: Sue Conger, https://resources.saylor.org/CS/CS302/OER/The_New_Software_Engineering.pdf
This work is licensed under a Creative Commons Attribution 3.0 License.

Meetings

Meetings are gatherings of three or more people for a fixed period to discuss a small number of topics and sometimes to reach consensus decisions. Meetings can both complement and replace interviews. They complement interviews by allowing a group verification of individual interview results. They can replace interviews by providing a forum for users to collectively work out the requirements and alternatives for an application. Thus, meetings can be useful for choosing between alternatives, verifying findings, and for soliciting application ideas and requirements.

Meetings can also be a colossal waste of time (see Table 4-2). In general, the larger the meeting, the fewer the decisions and the longer they take. Therefore, before having a meeting, a meeting plan should be developed. The agenda should be defined and circulated in advance to all participants. The number of topics should be kept to between one and five. The meeting should be for a fixed period with specific checkpoints for decisions required. In general, meetings should be no longer than two hours to maintain the attention of the participants. The agenda should be followed and the meeting moved along by the project manager or SE, whoever is running the meeting. Minutes should be generated and circulated to summarize the discussion and decisions. Any follow-up items should identify the responsible person(s) and a date by which the item should be resolved.

Meetings are useful for surfacing requirements, reaching consensus, and obtaining both detailed and summary information (see Table 4-2). If decisions are desired, it is important to ask the decision makers to attend and to tell them in advance of the goals for the meeting. If the wrong people participate, time is wasted and the decisions are not made at the meeting.

Joint application development (JAD) is a special form of meeting in which users and technicians meet continuously over several days to identify application requirements (see Figure 4-3). Before a JAD session, users are trained in the techniques used to document requirements; in particular, diagrams for data and processes are taught. Then, in preparation for the JAD session, the users document their own jobs using the techniques and collect copies of all forms, inputs, reports, memos, faxes, and so forth used in performing their jobs.

TABLE 4-5 Interviewee Behaviors and Interviewer Responses

  • Behavior: Guesses at answers rather than admit ignorance.
    Response: After the interview, cross-check the answers.
  • Behavior: Tries to tell the interviewer what she or he wants to hear rather than the correct facts.
    Response: Avoid questions with implied answers. Cross-check answers.
  • Behavior: Gives irrelevant information.
    Response: Be persistent in bringing the discussion back to the desired topic.
  • Behavior: Stops talking when the interviewer takes notes.
    Response: Do not take notes at this interview; write notes as soon as the interview is done. Ask only the most important questions, and schedule more than one interview to get all the information.
  • Behavior: Rushes through the interview.
    Response: Suggest coming back later.
  • Behavior: Wants no change because she or he likes the current work environment.
    Response: Encourage elaboration of the present work environment and its good aspects. Use the information to define what gets kept from the current method.
  • Behavior: Shows resentment; withholds information or answers guardedly.
    Response: Begin the interview with personal chitchat on a topic of interest to the interviewee. After the person starts talking, work into the interview.
  • Behavior: Is not cooperative, refusing to give information.
    Response: Get the information elsewhere. Ask this person, "Would you mind verifying what someone else tells me about this topic?" If the answer is no, do not use this person as an information source.
  • Behavior: Gripes about the job, pay, associates, supervisors, or treatment.
    Response: Listen for clues and be noncommittal in your comments. An example might be, "You seem to have lots of problems here; maybe the proposed application might solve some of them." Try to move the interview to the desired topic.
  • Behavior: Acts like a techno-junkie, advocating state-of-the-art everything.
    Response: Listen for the information you are looking for. Do not become involved in a campaign for technology that does not fit the needs of the application.

A JAD session lasts from 3 to 8 days, and from 7 to 10 hours per day. The purpose of the sessions is to get all the interested parties in one place, to define application requirements, and to accelerate the process of development. Several studies show that JAD can compress an analysis phase from three months into about three weeks, with comparable results. The advantage of such sessions is that users' commitment is concentrated into a short period of time. The disadvantage is that users might allow interruptions to divert their attendance at JAD meetings, thus not meeting the objective. JAD is discussed in more detail in the Introduction to Part II.


FIGURE 4-3 JAD Meeting

Observation

Observation is the manual or automated monitoring of one or more persons' work. In manual observation, a person sits with the individual(s) being observed and takes notes of the activities and steps performed during the work (see Table 4-2). In automated observation, a computer keeps track of software used, e-mail correspondence and partners, and actions performed using a computer. Computer log files are then analyzed to describe the work process based on the software and procedures used. 
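As a sketch of how such log files might be analyzed (the log format, file name, and field order here are assumptions for illustration, not a prescribed format), a simple script can count how often each application appears for one observed user:

    import csv
    from collections import Counter

    def summarize_usage(log_path, user):
        """Count log events per application for one observed user.

        Assumes each line of the log is: timestamp,user,application
        (an invented format for this sketch).
        """
        usage = Counter()
        with open(log_path, newline="") as log_file:
            for timestamp, logged_user, application in csv.reader(log_file):
                if logged_user == user:
                    usage[application] += 1
        return usage

    # Example use (the file name is hypothetical):
    # for app, count in summarize_usage("usage_log.csv", "jsmith").most_common():
    #     print(app, count)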

Observation is useful for obtaining information from users who cannot articulate what they do or how they do it (see Table 4-2). In particular, for expert systems, taking protocols of work is a useful form of observation. A protocol is a detailed minute-by-minute list of the actions performed by a person. Videotaping is sometimes used for continuous tracking. The notes or tapes are analyzed for events, key verbal statements, or actions that indicate reasoning, work procedure, or other information about the work.
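A protocol can be held as a simple list of timestamped entries and scanned for wording that hints at decisions or reasoning. The entries and keywords below are invented purely for illustration:

    # A protocol held as timestamped entries; entries and keywords are hypothetical.
    protocol = [
        ("09:01", "opens rental history screen"),
        ("09:02", "says: 'I always check late returns first'"),
        ("09:04", "decides to waive the late fee"),
        ("09:05", "prints the customer receipt"),
    ]

    # Words that may signal reasoning or a decision criterion.
    KEYWORDS = ("decides", "check", "because", "always", "first")

    for time, entry in protocol:
        if any(word in entry for word in KEYWORDS):
            print(time, entry)  # candidate evidence of reasoning or decision making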

There are three disadvantages to observation (see Table 4-2). First, the time of observation might not be representative of the activities that normally take place, so the SE might get a distorted view of the work. Second, knowing they are being observed might lead people to change their behavior; this problem can be lessened somewhat by extended observation, during which the person being observed loses sensitivity to being watched. The last disadvantage is that observation can be time-consuming and may not yield any greater understanding than could be obtained with less time-consuming methods of data collection.

Observation has several advantages. Little opinion is injected into the SE's view of the work. The SE can gain a good understanding of the current work environment and work procedures. The SE can focus on the issues of importance to him or her without alienating or disturbing the individual being observed. And some of the barriers between users and SEs that hamper interviews and validation of findings might be overcome through the contact that observation provides.

Some ground rules for observation are necessary to prepare for the session. You should identify and define what is going to be observed. Be specific about the length of time the observation requires. Obtain both management approval and approval of the individual(s) to be observed before beginning. Explain to the individuals being observed what is being done with the information and why. It is unethical to observe someone without their knowledge or to mislead an individual about what will be done with the information gained during the observation session.

Temporary Job Assignment

There is no substitute for experience. With a temporary job assignment, you get a more complete appreciation for the tasks involved and the complexity of each than you ever could by simply talking about them. Also, you learn firsthand the terminology and the context of its use (see Table 4-2). The purpose, then, of temporary job assignment is to make the assignee more knowledgeable about the problem domain. 

Temporary assignments usually last two weeks to one month, long enough for you to become comfortable that most normal and exceptional situations have occurred, but not long enough to become truly expert at the job. A temporary assignment gives you a basis for formulating questions about which functions of the current method of work should be kept and which should be discarded or modified.

One disadvantage of work assignments is that they are time-consuming and may not cover a representative period (see Table 4-2); careful choice of the period can minimize this problem. The other disadvantage is that the SE taking the temporary assignment might become biased about the work process, content, or people in a way that affects future design work.

Questionnaire

A questionnaire is a paper-based or computer-based form of interview. Questionnaires are used to obtain information from a large number of people. The major advantage of a questionnaire is anonymity, which can lead to more honest answers than might be obtained through interviews. Also, standardized questions provide reliable data upon which decisions can be based.

Questionnaire items, like interview questions, can be either open-ended or closed-ended. Recall that open-ended questions have no specific response intended. They are less reliable for obtaining complete factual information and are subject to recall difficulties, selective perception, and distortion by the person answering the question. Since the interviewer neither knows the specific respondent nor has contact with the respondent, open-ended questions that lead to other questions might go unanswered. An example of an open-ended question is: "List all new functions which you think the new application should do".

A closed-ended question is one which asks for a yes/no or graded specific answer. For example, "Do you agree with the need for a history file?" would obtain either a yes or no response. Questionnaire construction is a learned skill that requires consideration of the reliability and validity of the instrument. Reliability is the extent to which a questionnaire is free of measurement errors. This means that if a reliable questionnaire were given to the same group several times, the same answers would be obtained. If a questionnaire is unreliable, repeated measurement would result in different answers every time. 

Questionnaires that try to measure mood, satisfaction, and other emotional characteristics of the respondent tend to be unreliable because they are influenced by how the person feels that day. You improve reliability by testing the questionnaire. When the responses are tallied, statistical techniques are used to verify the reliability of related sets of questions.
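One widely used statistic for this purpose is Cronbach's alpha, which estimates how consistently a set of related items is answered. The text does not prescribe a particular technique, so the following is only a sketch, with invented five-point responses (rows are respondents, columns are items in one related set):

    from statistics import pvariance

    # Invented responses: rows are respondents, columns are three related
    # items answered on a 1-to-5 scale.
    responses = [
        [4, 5, 4],
        [2, 3, 2],
        [5, 5, 4],
        [3, 3, 3],
        [4, 4, 5],
    ]

    k = len(responses[0])                          # number of items in the set
    item_variances = [pvariance(item) for item in zip(*responses)]
    total_variance = pvariance([sum(row) for row in responses])

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)
    alpha = (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
    print(f"Cronbach's alpha = {alpha:.2f}")       # closer to 1 suggests higher reliability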
Validity is the extent to which the questionnaire measures what you think you are measuring. For instance, assume you want to know the extent to which a CASE tool is being used in both frequency of use and number of functions used. Asking the question, "How well do you use the CASE tool?" might obtain a subjective assessment based on the individual's self-perception. If they perceive themselves as skilled, they might answer that they are extensive users. If they perceive themselves as novices, they might answer that they do not use the tool extensively. A better set of questions would be "How often do you use the CASE tool?" and "How many functions of the tool do you use? Please list the functions you use". These questions specifically ask for numbers which are objective and not tied to an individual's self-perception. The list of functions verifies the numbers and provides the most specific answer possible.

Some guidelines for developing questionnaires are summarized in Table 4-6 and discussed here. First, determine the information to be collected, what facts are required, and what feelings, lists of items, or nonfactual information is desired. Group the items by type of information obtained, type of questions to be asked, or by topic area. Choose a grouping that makes sense for the specific project. 

For each piece of information, choose the type of question that best obtains the desired response. Select open-ended questions for general, lists, and nonfactual information. Select closed-ended questions to elicit specific, factual information, or single answers.

Compose a question for each item. For a closed-ended question, develop a response scale. The five-response Likert-like scale is the most frequently used. The low and high ends of the scale indicate the poles of responses, for instance, Totally Disagree and Totally Agree. The middle response is usually neutral, for instance, Neither Agree Nor Disagree. Examine the question and ask yourself if it has any words that might not be interpreted as you mean them. What happens if the respondent does not know the answer to your question? Do you need a response that says, I Don't Know? Is a preferred response hidden in the question? Are the response choices complete and ordered properly? Does the question have the same meaning for every department and possible respondent? If the answers to any of these questions indicate a problem, reword the question to remove the problem.
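To make this concrete, a closed-ended item with a five-response Likert-like scale, an explicit "I Don't Know" choice, and a simple check that recorded answers are among the allowed responses might be sketched as follows (the item wording and identifiers are invented for illustration):

    # One hypothetical closed-ended item with a five-response scale and an
    # explicit "I Don't Know" choice.
    LIKERT_5 = [
        "Totally Disagree",
        "Disagree",
        "Neither Agree Nor Disagree",
        "Agree",
        "Totally Agree",
    ]

    item = {
        "id": "Q7",
        "text": "The monthly report is essential to my work.",
        "responses": LIKERT_5 + ["I Don't Know"],
    }

    def is_valid(item, answer):
        """Return True only if the answer is one of the allowed responses."""
        return answer in item["responses"]

    print(is_valid(item, "Agree"))      # True
    print(is_valid(item, "Sometimes"))  # False: not an allowed response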

If you have several questions that ask similar information, examine the possibility of eliminating one or more items. If you are doing statistical analysis of the answers, you might want similar questions to see if the responses are also similar (i.e., are correlated). If you are simply tallying the responses and acting on the information, try to use one question for each piece of information needed. The minimalist approach keeps the questionnaire shorter and easier to tally.

TABLE 4-6 Guidelines for Questionnaire Development

1. Determine what facts are desired and which people are best qualified to provide them. 
2. For each fact, select either an open-ended or closed-ended question. Write several questions and choose the one or two that most clearly ask for the information. 
3. Group questions by topic area, type of question, or some context-specific criteria. 
4. Examine the questionnaire for problems: 

  • More than two questions asking the same information 
  • Ambiguous questions 
  • Questions for which respondents might not have the answer 
  • Questions that bias the response 
  • Questions that are open to interpretation by job function, level of organization, etc. 
  • Responses that are not comprehensive of all possible answers 
  • Confusing ordering of questions or responses 

5. Fix any problems identified above. 

6. Test the questionnaire on a small group of people (e.g., 5-10). Ask for both comments on the questions and answers to the questions. 

7. Analyze the comments and fix wording ambiguities, biases, word problems, etc. as identified by the comments. 

8. Analyze the responses to ensure that they are the type desired. 

9. If the information is different than you expected, the questions might not be direct enough and need rewording. If you don't get useful information that you don't already know, reexamine the need for the questionnaire. 

10. Make final edits, print in easy-to-read type. Prepare a cover letter. 

11. Distribute the questionnaire, addressing the cover letter to the person by name. Include specific instructions about returning the questionnaire. Provide a self-addressed, stamped envelope if mailing is needed. 

Pretest the questionnaire on a small group of representative respondents. Ask them to give you feedback on all of the items that they don't understand, that they think are ambiguous, badly worded, or have responses that do not fit the item. Also ask them to complete the questionnaire. The answers of this group should highlight any unexpected responses that, whether the group identified a problem or not, mean that the question was not interpreted as intended. If the pretest responses do not provide you with new information needed to develop the project, the questionnaire might not be needed or might not ask the right questions. Reexamine the need for a questionnaire and revise it as needed. Finally, change the questionnaire based on the feedback from the test group. The pretest and revision activities increase the validity of the questionnaire.

Provide a cover letter for the questionnaire that briefly describes the purpose and type of information sought. Give the respondent a deadline for completing the questionnaire that is not too distant. For instance, three days is better than two weeks. The more distant the due date, the less likely the questionnaire will be completed. Include information about respondent confidentiality and voluntary questionnaire completion, if they are appropriate. Ideally, the questionnaire is anonymous and voluntary. To the extent possible, address the letter to the individual respondent. 

Give the respondent directions about returning the completed questionnaire. If mailing is required, provide a stamped, self-addressed envelope. If interoffice mail is used, provide your mail stop address. If you will pick up responses, tell the person where and when to have the questionnaire ready for pickup.

Document Review

New applications rarely spring from nothing. There is almost always a current way of doing work that is guided by policies, procedures, or application systems. Study of the documentation used to teach new employees, to guide daily work, or to use an application can provide valuable insight into what work is done.

The term documents refers to written policy manuals, regulations, and standard operating procedures that organizations provide as a guide for managers and employees. Document types include those that describe organization structure, goals, and work. Examples of each document type follow:

  • Policies
  • Procedures
  • User manuals
  • Strategy and mission statements
  • Organization charts
  • Job descriptions
  • Performance standards
  • Delegation of authority
  • Chart of accounts
  • Budgets
  • Schedules
  • Forecasts
  • Any long- or short-range plans
  • Memos
  • Meeting minutes
  • Employee training documents
  • Employee manuals
  • Transaction files, e.g., time sheets, expense records
  • Legal documents, e.g., copyrights, patents, trademarks, etc.
  • Historical reports
  • Financial statements
  • Reference files, e.g., customers, employees, products, vendors

Documents are not always internal to a company. External documents that might be useful include technical publications, research reports, public surveys, and regulatory information. Examples of external documents follow:

  • Research reports on industry trends, technology trends, technological advances, etc.
  • Professional publications with salary surveys, marketing surveys, or product development information
  • IRS or American Institute of CPAs reports on taxes, workmen's compensation, affirmative action, financial reporting, etc.
  • Economic trends by industry, region, country, etc.
  • Government stability analyses for developing countries in which the application might be placed
  • Any publications that might influence the goals, objectives, policies, or work procedures relating to the application

Documentation is particularly useful for SEs to learn about an area with which they have no previous experience. It can be useful for identifying issues or questions about work processes or work products for which users need a history. Documents provide objective information that usually does not discuss user perceptions, feelings, or motivations for work actions. 

Documents are less useful for identifying attitudes or motivations. These topics might be important issues, but documents may not contain the desired information.

Software Review

Frequently, applications are replacing older software that supports the work of user departments. Study of the existing software provides you with information about the current work procedures and the extent to which they are constrained by the software design. This, in turn, gives you information about questions to raise with the users, for instance, how much do they want work constrained by the application? If they could remove the constraints, how would they do the work? 

The weaknesses of getting information from software review are that documentation might not be accurate or current, code might not be readable, and the time might be wasted if the application is being discarded.

To summarize, the methods of collecting information relating to applications include interviews, group meetings, observation, questionnaires, temporary job assignment, document review, and software review. For obtaining information relating to requirements for applications, interviews and JAD meetings are the most common.