«Abstract We conducted an industrial case study of a distributed team in the USA and the Czech Republic that used Extreme Programming. Our goal was to ...»
One problem area that is unquestionably influenced by DSD is intra-team communication. Creating effective communication directly between developers and customers is also essential in XP. However, the geographic distances between customers and developers in the F-15 project made face-to-face and telephone communication cost-prohibitive. This section discusses how the team used asynchronous communication (email) to establish communication channels between customer and developer to manage XP user stories.
A significant portion of the project’s difficulties resulted from the fact that the development team was six time zones ahead of the project management team, providing only two hours of overlap in which both teams could communicate synchronously. The team considered using free VoIP technologies, but installing them for all project management personnel was deemed too costly for the small amount of time they would be used during the short project. The team also used instant messaging (IM) systems to facilitate some communication between the developers and customers. The IM systems were subject to the same time zone difficulties as telephone conversations, and were predominantly used for one-on-one conversations between developer and customer to understand specific technical issues. IM systems were not used when discussing user requirements, since the team wanted all stakeholders to be aware of such conversations, and the one-on-one nature of instant messaging prevented this information from being distributed to the whole team. This restricted ability to communicate synchronously is particularly troublesome when the XP methodology is used, because interactive communication between customer and developers is needed, due in large part to the informal nature of XP user stories. The daily status meetings and iteration/release planning meetings (described in Section 4.2) were not sufficient to field all of the developers’ questions, however. In place of the informal face-to-face communication common in XP, the team used an informal email approach. Approximately one month into the project, the F-15 team began to use a mailing list as their primary form of communication. All messages sent to the mailing list were distributed to all members of the project management and development teams. Typically, the developers would build a backlog of queries for the project management team during their workday in the Czech Republic.
Whenever project management personnel arrived at work in the morning in the US, they would address the developers’ questions.
Therefore, the developers received answers to their questions when they returned for the next day’s work. The email exchanges took a very binary form of Question and Answer. The prompt response to questions was essential in building trust that questions will be promptly answered.
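The two-hour window of synchronous overlap mentioned above can be reproduced with simple interval arithmetic. The following is a minimal sketch, not a description of the team's actual schedules: the function name `overlap_hours` and the 09:00-17:00 working day at both sites are assumptions made purely for illustration.

```python
def overlap_hours(start_a, end_a, start_b, end_b, offset_b_ahead):
    """Hours per day during which both sites are at work.

    Times are local clock hours on a 0-24 scale. offset_b_ahead is the
    number of hours by which site B's clock is ahead of site A's clock.
    Working days that wrap around midnight are not handled.
    """
    # Express site B's working day on site A's clock.
    b_start_on_a = start_b - offset_b_ahead
    b_end_on_a = end_b - offset_b_ahead
    # Two intervals overlap between the later start and the earlier end.
    overlap = min(end_a, b_end_on_a) - max(start_a, b_start_on_a)
    return max(0, overlap)


# Assumed 09:00-17:00 working days at both sites (not stated in the study),
# with the development site six hours ahead of project management.
shared = overlap_hours(9, 17, 9, 17, offset_b_ahead=6)
print(shared)
```

Under these assumed working hours, the sketch yields the same two-hour window reported for the project; different working hours would of course shift the result.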
We examined archived data from the F-15 team’s email listserv to identify messages concerning user story clarifications and specification, acceptance tests, bug identifications, planning and scope definitions, prototypes of layouts, and more. The classification and quantification of the email communications between developers and project management personnel was conducted through a multi-hour session involving the lead author, the development manager, the project manager, and the project tracker. Figure 2 provides the breakdown of the various types of emails according to iteration. Points on the chart represent the frequency of a message type during that iteration; e.g., in iteration 2-5, there were approximately 40 clarification messages and 30 bug-related messages.
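The per-iteration breakdown described above amounts to counting classified messages by iteration and category. A minimal sketch of such a tally follows. The records are hypothetical toy data, not the study's data, and the name `tally_by_iteration` is an assumption introduced for illustration; the classification itself was done manually by the study participants.

```python
from collections import Counter

# Hypothetical toy records: (iteration, category) pairs as they might be
# logged after manual classification. These are NOT the study's data.
messages = [
    ("2-4", "clarification"),
    ("2-5", "clarification"),
    ("2-5", "bug"),
    ("2-5", "clarification"),
    ("2-5", "acceptance test"),
]


def tally_by_iteration(records):
    """Return a dict mapping iteration -> Counter of category frequencies."""
    table = {}
    for iteration, category in records:
        table.setdefault(iteration, Counter())[category] += 1
    return table


counts = tally_by_iteration(messages)
for iteration in sorted(counts):
    print(iteration, dict(counts[iteration]))
```

Each point on a chart such as Figure 2 would then correspond to one (iteration, category) count in the resulting table.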
This chart provides many insights into the progression of the project. Iteration 2-1 marks the beginning of widespread use of the mailing list, and also marks the point in which the customer role became well-defined in the project. In subsequent iterations, the number of messages increases significantly. We note that this chart only encompasses messages sent to the mailing list, and not between individuals.
The most numerous type of message is a clarification message, which represents developer-customer dialogues that clarify the user stories and elucidate specification details. We also note that there are a number of specification-related messages, which correspond to the customer defining new or changing existing user stories during the iterations. Planning and scope related messages deal with adding or removing functionality from user stories. These messages suggest that the customer was actively involved throughout the process, and influenced the day-to-day activities of the developers. At iteration 2-5 (the release point), the communication flow increases significantly.
Here, also, we see bug-related messages begin to appear as the customer was more rigorously testing the product before delivery. The bug-related messages also appear at approximately the same time as acceptance testing messages. After the release point, we see some post-release bug messages, as well as specification and clarification messages, but on a much smaller scale.
This email analysis provides us with two important insights. First, the change-in-specification and scoping/planning messages indicate that the customer was influencing the process on a daily basis. Second, communication between developers and customers was occurring on a regular basis and not just at critical points during the lifecycle. These observations seem to suggest that it is possible for email communication to sufficiently exhibit the characteristics implicit in face-to-face communication in XP, despite claims by XP authors that face-to-face communication is necessary [7, 8]. While face-to-face communication may be the richest form of communication, this data suggests that email communication may serve as a sufficient replacement when face-to-face (or telephone) interaction is cost-prohibitive or restricted due to security concerns or company policy.
Furthermore, it has been noted in a previous study of an XP team that email communication between the developers and project management was deemed adequate.
Recommendation: When face-to-face, synchronous communication is infeasible, use an email listserv to increase the chance of a response and encourage prompt, useful, and conclusive responses to emails.
A timely response to developer inquiries will prevent development from slowing or progressing down an undesired path while awaiting a definitive answer.
5.4. Process visibility and control
Conjecture 4: In a globally-distributed XP team, providing the team with continuous access to process and product information can help to improve process control and plan effectiveness.
This section describes how the F-15 team used the XPlanner tool to effectively manage their project. In XP, the customer is not only in control of the features, but also of scheduling and prioritization. This requires that the customer has high visibility into the process to accurately gauge the project’s progress and the evolution of the system’s features. In GSD, process visibility may become an issue when the project management team is not present to observe first-hand the progress made on the project. Furthermore, without active, face-to-face communication, it may be difficult to get a failing project back on track since project management personnel are not on hand to actively guide the project, encourage team members, and oversee status meetings. In GSD, a universally-accessible knowledge base that documents project status can help in achieving project control through high project visibility. The use of XPlanner to store electronic versions of user story cards provides visibility into the progress reported per user story.
In XPlanner, both customers and developers accessed electronic representations of user story cards (see Figure 3 for an example). These user stories, in turn, contained more concrete tasks. Each task had a corresponding progress bar that indicated how much time had been spent on the task by developers, how much time was estimated to be spent in total, and whether or not the task had been completed. Project management personnel, including the customer, could see how individual features in the system were progressing. The developers updated the information in XPlanner three times per week during the early and middle phases of the project, and updated it on a daily basis late in the project.
Figure 3. Example of user story from XPlanner
One way to interpret the planning effectiveness of a project is through its schedule adherence. The information in XPlanner provided project velocity measurements. These velocity measurements were then used to determine the amount of work that was planned in the coming iteration and placed bounds on the number of user stories the customer could integrate into the next iteration. The team’s estimation accuracy (how well their estimates of user stories matched the actual time spent) is shown in Figure 4. In Figure 4, the bars are individual user stories and are grouped according to iteration (a user story that spanned multiple iterations appears separately in each iteration). Bars above the line represent overestimates, while bars below the line represent underestimates. The scale is somewhat skewed because of the sizes of estimates involved; e.g., an estimate of 6 hours with an actual of 1 hour results in a 500% overestimate. There were 52 overestimates, 33 underestimates, and 17 correct estimates out of 102 samples. The number and magnitude of the overestimate errors are much greater than those of the underestimates, indicating a tendency for the group to err on the side of caution when creating their estimates. Underestimates are of greater concern, since they imply additional work and possibly omitted functionality due to time constraints. The largest magnitude of relative error for a single underestimate was 90%. This suggests that, while the team’s schedule adherence was not perfect, they avoided continuous underestimates that may have led to unanticipated reductions in project scope at the delivery date. The XPlanner tool was cited by both the development manager and the customers as being essential to the successful management of the project.
[Figure 4: relative estimation error (%) per user story, grouped by iteration; iterations 1-1 through 3-4 on the horizontal axis]
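The estimation-accuracy measure discussed above can be made concrete with a short sketch. The formula, relative error = (estimate − actual) / actual, is inferred from the worked example in the text (6 hours estimated against 1 hour actual gives a 500% overestimate) and is an assumption rather than a documented definition; the function names and the sample pairs are likewise hypothetical.

```python
from collections import Counter


def relative_error_pct(estimate_hours, actual_hours):
    """Signed relative estimation error in percent.

    Positive values are overestimates, negative values underestimates.
    Formula inferred from the 6h-vs-1h example; an assumption.
    """
    if actual_hours <= 0:
        raise ValueError("actual_hours must be positive")
    return (estimate_hours - actual_hours) * 100.0 / actual_hours


def classify(estimate_hours, actual_hours):
    """Label one estimate as an overestimate, underestimate, or correct."""
    if estimate_hours > actual_hours:
        return "overestimate"
    if estimate_hours < actual_hours:
        return "underestimate"
    return "correct"


# Hypothetical (estimate, actual) pairs in hours; not the study's data.
samples = [(6, 1), (1, 10), (4, 4)]
summary = Counter(classify(est, act) for est, act in samples)

for est, act in samples:
    print(est, act, relative_error_pct(est, act), classify(est, act))
print(dict(summary))
```

Under this formula, an underestimate can never exceed 100% in magnitude (the estimate cannot fall below zero), whereas overestimates are unbounded. This asymmetry is one reason the scale of a chart such as Figure 4 appears skewed toward overestimates.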
Recommendation: Use globally-available project management tools to record and monitor the project status on a daily basis.
Using an online project management tool can provide an accurate gauge of project status for all stakeholders.
The tool is only as useful as its users make it, so it is essential that the information contained in such a tool is accurate and up-to-date. The recommended use of such a tool is certainly not novel. However, we feel that it is a particularly important communication tool in an XP GSD team.
6. Case study limitations
Yin outlines four tests for judging the quality of research design, including case studies, in producing meaningful results. We enumerate the limitations of our case study using these four test categories:
• Construct validity, or establishing measures that reflect the theory under test, can be problematic in case study research. To combat this issue, operational measures for the concept being studied should be pre-established, such as with a benchmark like the XP-EF. The XP-EF has been used in multiple case studies [35, 36, 52], and its measurements were constructed using Basili’s Goal-Question-Metric approach. The majority of our analysis is reliant upon interviews and questionnaire responses. Published examples for customer satisfaction interviews were used as guides for these artifacts [5, 30]. Additionally, we did not begin with any theories, but let conjectures emerge. Our goal was not to perform rigorous scientific analysis of data, but to build theories from observed phenomena to motivate further study.
• Internal validity, or the condition that all potential factors that might influence the data are controlled except the one under study. The uncontrollable nature of case studies and the inevitable confounding factors raise internal validity concerns. There is also a concern that the industry participants may behave differently when under observation. The research interest in this project did not begin until the project neared its major release point. At that time, only a few project management personnel were aware that a study was possible. Data collection commenced several months after the project had ended. The personnel turnover during development may also have influenced the
• External validity, or how well the results of the study can be generalized to the world outside the research situation. External validity is the strength of industrial case studies. However, the results can only be generalized to the context in which the case study was conducted. Potts contends that the real-world complications of a large industrial project are more likely to produce representative problems and phenomena than a laboratory example. He also points out that searching for completely generic findings via case studies is illusory. As such, we provide conjectures only and make recommendations for further research, but draw no definite conclusions.
We do believe that this study is beneficial to other projects in which the customer-developer relationship is distributed and where there is a need for frequent interaction.
• Experimental reliability, which assesses whether another investigator would get similar results by following the experimental procedures with the same case study. As with construct validity, the use of a benchmark can aid in experimental reliability. The quantitative data in this study can be reconstructed following the procedures in XP-EF v1.4. The counting and classification of email listserv messages described in Section 5.3 is partially subjective. No standards were used to define the classification categories, and the classification of individual messages was a decision made by the development manager, project manager, and project tracker based upon high-level interpretations of the individual categories. Thus, it is possible that repeating the counting procedure with the same personnel would yield some different classifications. Also, the interviews conducted with project stakeholders are likely to yield different results based on the individual style of the interviewer.