Friday, July 15, 2016

Are students/teams following the Agile process? How would you know?

I teach a large project course at Berkeley. Each semester, ~40 teams of 6 students each work on open-ended SaaS projects with real external customers. The projects complement an aggressive syllabus of learning Agile/XP using Rails; most of that syllabus is also available in the form of free MOOCs on edX and is complemented by our inexpensive and award-winning textbook [shameless plug].

We require the teams to use iteration-based Agile/XP to develop the projects. During each of four iterations, the team will meet with their customer, use lo-fi mockups to reach agreement on what to do next, use Pivotal Tracker to enter the user stories they'll work on, use Cucumber to turn these into acceptance tests, use RSpec to develop the actual code via TDD, use Travis CI and CodeClimate to keep constant tabs on their code coverage and code quality, and do continuous deployment to Heroku. Everything is managed via GitHub—we encourage teams to use branch-per-feature and to use pull requests as code reviews. Most teams communicate using Slack; we advise teams to do daily or almost-daily 5-minute standups, in addition to longer "all hands" meetings, and to try pair programming and if possible promiscuous pairing. Some teams also use Cloud9 virtual machines to make the development environment consistent across all team members (we provide a setup script that installs the baseline tools needed to do the course homeworks and project). There are also two meetings between the team and their "coach" (TA) during each iteration; coaches help clean up user stories to make them SMART, and advise teams on design and implementation problems.

While the non-project parts of the course benefit from extensive autograding, it's much harder to scale project supervision. Even with a staff of teaching assistants, each TA must keep track of 6 to 8 separate project groups. Project teams are graded according to many criteria; some are qualitative and based on the substance of the meetings with coaches (there is a structured rubric that coaches follow, which we describe in the "Student Projects" chapter of the Instructor's Guide that accompanies our textbook), but others are based on quantitative measurements of whether students/teams are in fact following the Agile process.

In our CS education research we've become quite interested in how to provide automation to help instructors monitor these quantitative indicators, so that we can spot dysfunctional teams early and intervene.

We've observed various ways in which teams don't function properly:

  • Lack of communication: Teams don't communicate frequently enough (e.g., they don't do standups) or communicate in a fractured way, for example, a team of 6 really becomes two teams of three who don't talk to each other. The result is "merges from hell", lack of clarity on who is doing what, etc.
  • Not using project-management/project-planning tools effectively: One goal of Pivotal Tracker (or similar tools such as Trello or GitHub Issues) is to plan, track, prioritize, and assign work, so that it's easy to see who's working on what, how long they've been working on it, and what has already been delivered or pushed to production. Some teams drift away from using it, and rely on either ad-hoc meeting notes or "to-do lists" in Google Docs. The result is poor ability to estimate when things will be done and finger-pointing when a gating task isn't delivered and holds up other tasks.
  • Not following TDD: Projects are required to have good test coverage (85% minimum, combined between unit/functional and acceptance tests), and we preach TDD. But some students never give TDD a fair shake, and end up writing the tests after the code. The result is bad tests that cover poorly-factored code in order to meet the coverage requirement.
  • Poor code quality: We require all projects to report CodeClimate scores, and this score is part of the evaluation. Although we repeatedly tell students that TDD is likely to lead them to higher-quality code, another side-effect of not following TDD is poorly-factored code that manifests as a low CodeClimate score, with students then manually fixing up the code (and the tests) to improve those scores.
There are other dysfunctions, but the above are the main ones. 

To help flag these, I've become interested in how to study not only teams and team dynamics, but what analytics can tell us about the extent to which students are following the Agile process (or any process). We recently summarized and discussed some papers in our reading group that address this. (You can flip through the summary slides of our reading group discussion, but it may help to read the summaries and/or papers first.)

One of these papers, Metrics in Agile Project Courses, considers a variety of metrics obtainable from tool APIs to evaluate how effectively students are following Scrum, especially by measuring:
  • Average life of a pull request, from open to merge. “Optimal” is <24 hours “based on our past experience with the course”, but too-short lifetimes may be bad because they indicate the code wasn't reviewed thoroughly before merging.
  • Percent of merge requests with at least one comment or other activity (eg associated task). This is similar to our metric of pull-request activity in ProjectScope.
  • Mean time between a CI (continuous integration) failure and first successful build on the main development branch.
  • All indicators are tracked week-to-week (or sprint-to-sprint or iteration-to-iteration) so instructors see trends as well as snapshots.
  • Number of deploys this iteration (In their course, “deployment” == “customer downloads latest app”, but for SaaS we could just look at Heroku deploy history.)
In a more detailed paper (How surveys, tutors, and software help to assess Scrum adoption in a classroom software engineering project), the same group used a combination of student self-surveys, TA observation of team behaviors, and a tool they built called "ScrumLint" that which analyzes GitHub commits and issues relative to 10 “conformance metrics” developed for their course, including (eg) “percentage of commits in the last 30 minutes before the sprint-end deadline”. Each conformance metric has a “rating function” that computes a score between 0 (practice wasn't followed at all) and 100 (practice was followed very well); each team can see their scores on all metrics.

What's interesting is that they actually point to an Operations Research-flavored paper on a methodology for detecting nonconformance with a process (see the "Zazworka et al." reference in the summary for more details). The methodology helps you define a specific step or subprocess within a workflow, and rigorously define what it means to "violate" it.

With AgileVentures, we are trying to encapsulate some of this in a set of gems called ProjectScope that will be the basis of a SaaS app for Agile coaches trying to monitor multiple projects. Check back here periodically (or join AV!) if you are interested in helping out!

No comments:

Post a Comment