Monday, December 19, 2016

“Towards a new hacker ethic”

I just read the transcript (and saw slide deck) of a great talk, Programming is Forgetting: Toward a New Hacker Ethic by Allison Parrish at the 2016 Open Hardware Summit.
The whole talk is worth watching/reading, but the part I liked best is Parrish’s reformulation of the “hacker ethic” laid out in Steven Levy’s book Hackers in the 1980s. Levy set out to chronicle the cultural history of hacking (the good kind, i.e. building and tinkering with computing and technology systems, as opposed to maliciously subverting them) and he claimed to summarize the “hacker ethic” in the following four points:
  • Access to computers should be unlimited and total
  • All information should be free
  • Mistrust authority; promote decentralization [of control and content]
  • Hackers should be judged by their hacking [skills], not “bogus” criteria such as degrees, age, race, or position
The gist of Parrish’s talk is that while the above principles are noble, the anecdotes Levy recounts of hacker behavior are often inconsistent with the above points, including a noteworthy anecdote in which some hackers “exercised their ethic” in a way that dismissively interfered with the work of Margaret Hamilton, who would go on to invent the term “software engineering” and to be the lead architect of the guidance computer software for the Apollo missions that landed the first humans on the moon.


For me the best part of the talk was Parrish’s proposed reformulation of the “hacker ethic” in a way that takes the form of questions that creators of technological systems should ask themselves as they think about deploying those systems. I doubt I can improve on her phrasing, so I’ll quote the talk transcript directly:
“…my ethic instead takes the form of questions that every hacker should ask themselves while they’re making programs and machines. So here they are.
Instead of saying access to computers should be unlimited and total, we should ask “Who gets to use what I make? Who am I leaving out? How does what I make facilitate or hinder access?”
Instead of saying all information should be free, we could ask “What data am I using? Whose labor produced it and what biases and assumptions are built into it? Why choose this particular phenomenon for digitization or transcription? And what do the data leave out?” 
Instead of saying mistrust authority, promote decentralization, we should ask “What systems of authority am I enacting through what I make? What systems of support do I rely on? How does what I make support other people?” 
And instead of saying hackers should be judged by their hacking, not bogus criteria such as degrees, age, race, or position, we should ask “What kind of community am I assuming? What community do I invite through what I make? How are my own personal values reflected in what I make?”
A few weeks ago a Medium post went viral in which developer Bill Sourour discussed “The code I’m still ashamed of”. At the direction of his manager, he had written code for a website that was posing as a general-information website where you could take a quiz to determine what prescription drugs were recommended for your particular symptoms and condition. In fact, though, the website was effectively an advertisement for a specific drug, and no matter what your responses to the quiz, the recommendation would always be the same—you needed this company’s drug. (A young woman later killed herself due to depression attributable in part to consuming the drug.) Business Insider reported that Sourour’s post had triggered a storm of “confessions” on Reddit from other engineers who were ashamed of having done similar things under duress, and includes some pointed comments from software thought leader “Uncle Bob” Martin such as “We are killing people.” He warns us that the Volkswagen emissions-cheating scandal was probably just the tip of the iceberg, and that even though in this case the CEO was ultimately held accountable (which doesn’t always happen), “it was software developers who wrote that code. It was us. Some programmers wrote cheating code. Do you think they knew? I think they probably knew.”

Uncle Bob goes on to lament that coding bootcamps rarely include any required material on software ethics, and I'm beginning to fear we don't do enough at Berkeley either. In my work as a college instructor, I do have to deal with breaches of academic integrity of various sorts, from straight-ahead plagiarism to students paying freelancers to do their homework to students presenting false documentation about medical emergencies to avoid taking an exam. Disturbingly often, when these students are confronted with evidence of their actions, their only remorse seems to be that they were caught, and I find myself wondering whether they are the software writers who will go on to insert “cheat code” into a future consumer product. We do have a required Engineering Ethics course and there is a software engineering ethics code endorsed by the Association for Computing Machinery, but I worry that our ethical training doesn't have sharp enough teeth. As Uncle Bob wrote, “We [software developers] rule the world, we just don’t know it yet.” We’d better start acting like it. Self-reflection questions like those proposed by Parrish would be a good place to start.

Tuesday, December 13, 2016

Getting a good software internship in industry or academia

I hire summer students all the time to work on software projects related to my research and teaching at Berkeley, and I have students coming to me frequently asking for advice on either getting a good internship or selecting among various offers. Leaving aside the (very important) nontechnical aspects of your interview or job choice, here's my technical advice from the point of view of what I look for in a software hire (and, not coincidentally, what I teach in Berkeley's software engineering course):

Have a story for testing. Testing and debugging software is much harder than writing it. The question you should be prepared to answer is not "Do you test?" but rather "How do you test?" Do you have tests that can be run programmatically? Do you have any idea how much of your code is covered by tests?

Demonstrate that you know the ecosystem. The developer ecosystem, roughly centered around GitHub, now includes continuous integration (Travis, Jenkins), project tracking (Pivotal Tracker), code quality measurement (CodeClimate), and more. How do you use these tools?

Have you worked in a team? Software development is rarely a solo endeavor anymore; most interesting software is built by teams. What practices does your team use to organize its activities (scrum, standups)? To manage simultaneous development (branches, fork-and-pull)? To communicate and stay on same page about the project (Slack, Basecamp)?

Be a full-stack developer.  What can I do with a front-end developer who can't write the back end? That's like someone who can build the front of my house but not the side walls or stairs. What I need is a house, even if a very simple one. Similarly, even a back-end developer must be able to get my app out to the end user in some way, even if the user interface is modest and simple.

Get good at picking up new languages and frameworks. This is more relevant for full-time jobs, but I'd rather hire someone who can learn new things fast than someone who's an expert on whatever framework I happen to be using right now, since new frameworks and languages come along all the time. How would you demonstrate that you can learn things fast?

Understand Unix. The single most influential set of ideas in modern server and client operating systems derives from the Unix programming model. The ability to quickly put together simple tools to do a job is vital to developer productivity. If I ask you "Given a set of large text files, count approximately how many distinct email addresses they contain," and your first question is "In what language," we're done.  One plausible answer is the shell command:

    grep -E -o '\b\w+@\w+\b'  filenames | sort | uniq | wc -l

Do your own projects.  If the only software you write is in course projects, I'd wonder if you really love coding enough. Imagine trying to compete in a tennis tournament if the only time you had spent on the court was during your actual tennis lessons. Class projects just don't give you enough mileage to get really good at software; doing your own projects (extracurriculars, pro-bono work, hackathons, personal projects, contributing to an open-source project, whatever) is essential. Be prepared to answer "What is the coolest personal software project you've worked on and what made it exciting/challenging/a great learning experience?"

Show me your code.  As my colleague and GitHub engineer Jesse Toth has succinctly put it, “Your code is your résumé.” A résumé without evidence of coding skill is minimally useful. If you can’t make your GitHub public repos part of your code portfolio or add me as a view-only collaborator on some repos, at least send me a link to a tarball of code you’ve written.

Sunday, December 11, 2016

"Keeping the lights on" for ESaaS-built pro-bono software deployments

Berkeley's software engineering course, which I developed with Prof. Dave Patterson, has something important in common with the nonprofit AgileVentures, founded by Prof. Sam Joseph (who also is the lead facilitator for our edX MOOC on Agile Development).

Both organizations allow developers-in-training (students in the case of Berkeley; professionals in the case of AV) to work on open source pro bono software projects in a mentored setting, usually for nonprofits and NGOs. Indeed, since 2012, Berkeley students have delivered/deployed over 100 such projects, many still in production use by the original customer.

A perennial problem we've had, though, is what to do when each offering of the course ends. How can these nontechnical customers arrange for basic maintenance of their apps if they don't have any IT staff who can do this? Even if the customer wants future students to continue working on the software, it needs to be kept running until the next course offering.

This semester (Fall 2016) we're trying something new. AgileVentures is introducing a "Nonprofit Basic Support" membership tier that gives nonprofits basic support for maintaining these SaaS apps. For a very low monthly fee, an AgileVentures developer will be the maintenance contact for the app, ensure its "lights are kept on" (restart when needed, etc.), and advise the customer if the app needs other attention, for example, if the app's resource needs require it to be moved to a paid or higher hosting tier.

The goal is to "keep the lights on" either until the next team of students or AV developers further enhances the app, or until the customer decides to take over maintenance of the app themselves (or move it to another contractor).

Of course, a few customers don't need this service; they may already have in-house staff, or the app may be one that was already in use and for which our student developers just provided new features. But for the majority of customers who are nontechnical and may not even be able to afford in-house IT for maintaining these apps, we look forward to seeing how this experiment works out!

Thursday, September 29, 2016

Agile DevOps

If you're surprised at how frequently in this blog I mention articles from Communications of the ACM ("CACM"), you're missing out. Especially if you're a student, membership is inexpensive and the flagship monthly magazine tends to be full of really good stuff relevant to both practice and research.

Today I'm blogging about an article in the July 2016 issue (yes, I'm behind in my reading) talking about the "small batches principle" for Dev/Ops tasks. The article is written by a Dev/Ops engineer at Stack Overflow, the now-legendary site where all programmers go to give and receive coding help and which was co-founded by Joel Spolsky, one of my software heroes and author of the excellent Joel On Software blog and books.

This article talks about various in-house tasks associated with deploying software, such as pre-release testing and hot-patching, building and running a monitoring system, and other tasks that this company (and many others) historically did once in a great while. The month during which a new release was being tested and deployed became known as "hell month" because of the magnitude and pain of the tasks.

The article describes how Stack Overflow has migrated to a model of doing smaller chunks of work incrementally to avoid having to do very large chunks every few months; how they moved to a "minimum viable product" mentality (what is the simplest product you can build that will enable you to get feedback from the customer to validate or reject the features and product direction); how they adopted a "What can get done by Friday"-driven mentality, so that there would always be some new work on which their customers (in this case, the development and test engineers) could comment on; and so on.

Essentially, they moved to an Agile model of Dev/Ops: do work in small batches so each batch minimizes the risk of going off in the wrong direction; deploy incremental new changes frequently to stimulate customer feedback; thereby avoid painful "hell months" (possibly analogous to "merges from hell" on long lived development branches).

Agile applies to a lot of things, but this article does a nice job of mapping the concepts to the Dev/Ops world from the pure development world.

Tuesday, September 20, 2016

Flipped classroom? No thanks, I'd rather you lecture at me

As an early adopter and enthusiast of MOOCs, I've followed the "flipped classrooms" and "lecture is dead" conversations with some interest. Indeed, why would students attend lecture—and why would professors repeat last semester's lectures—when high-quality versions are available online for viewing anytime?

This semester, I'm once again teaching Software Engineering at Berkeley to about 180 undergraduates. (Past enrollments have been as high as 240, but we capped it lower this semester to make projects more manageable.) In the past, Dave Patterson and I have team-taught this class and we've developed a fairly well-polished set of lectures and other materials that are available as a MOOC on edX, Agile Development Using Ruby on Rails, and which we also use as a SPOC.

Last spring, by the end of the semester our lecture attendance had plummeted to about 15% of enrollment. We surveyed the students anonymously to ask how we could make lecture time more valuable. A majority of students suggested that since they could watch pre-recorded lectures, why not use lecture time to do more live coding demos and work through concrete examples? In other words, a classic "flipped lecture" scenario.

So this semester I dove in feet-first to do just that. Less than 20% of my lecture time has been spent covering material already in the recorded lectures; instead, we made those available to students from the get-go. The rest has consisted of live demos showing the concepts in action, and activities involving extensive peer learning, of which I'm a huge fan. I've even started archiving the demo "scripts" and starter code for the benefit of other instructors. The "contract" was that we would list which videos (and/or corresponding book sections) students should read or watch before lecture, and then during the lecture time (90 minutes twice a week) the students would be well prepared to understand the demo and ask good questions as it went along.

I told the students I would also sprinkle "micro-quizzes" into the lectures to spot-check that they were indeed reading/viewing the preparatory materials. The micro-quiz questions were intended to be simple recall questions that you'd have no trouble answering if you skimmed the preparatory material for the main ideas. (We use iClicker devices that are registered to the students, so we can map both responses and participation during lecture to individual students.)

Today in lecture, a bit more than 4 weeks into the course, I've officially declared the flipped experiment a failure. (Well, not a failure. A negative result is interesting. But it's not the result I expected.)

Since we made the pre-recorded lectures available within an edX  using the edX platform itself, and students have to login with their Berkeley credentials to access the course, we can see using edX Insights (part of the platform's built-in analytics) how many people are watching the videos.

According to the edX platform's analytics, the typical video is partially viewed by about 20 people. Only 45 people have ever watched any video to completion. Remember, this is in a class of 180 enrolled students, whose previous cohort specifically requested this format.

Maybe people are watching the videos after lecture rather than before? If they were, you'd expect video viewing numbers to be higher for videos corresponding to previous weeks, but they're not.

Maybe people are reading the book instead? If they were, the performance on the microquizzes should be nearly unimodal—they are simple recall questions that you cannot help but get right if you even skim the video or the book—but in fact the answer distributions are in some cases nearly uniform.

Maybe people already know the material from (e.g.) an internship? One or two students did approach me after today's lecture to tell me that. I believe them, and I know there are some students like this. But we also gave a diagnostic quiz at the beginning of the semester, and based on its results, very few people match this description.

Maybe students don't have time to watch videos or read the book before lecture? This is a 4-unit course, which means students should nominally be spending 12 hours per week total on it, of which lecture and section together account for only 4. The reading assignments, which can be done instead of watching the videos, average out to 15-20 pages twice a week, or 30-40 pages per week. Speaking as a faculty member leading an upper-division course in the world's best CS department at the world's best public university, I don't believe 30-40 pages of reading per week per course is onerous. Also, in past offerings of the course, we've polled students towards the end of the course to ask how many hours they are actually spending per week. The average has consistently been 8-10 hours, even towards the end where the big projects come due. So by the students' own self-reporting, there's 2-4 hours available to either do the readings or watch the videos.

As you might imagine, planning and executing live demos requires a couple of hours of preparation per hour of lecture to come up with a demo "script", stage code that can be copy-pasted so students don't have to watch me type every keystroke, ensure the examples run in such a way as to illustrate the correct points or techniques, remember what to emphasize in peer-learning questions at each step of the demo, and so on. But it's discouraging to do this if at most 1/9 of the students are doing the basic prep work that will enable them to get substantive value out of the live demo examples.

So at the end of lecture today I informed the students that I wouldn't use this format any longer—those who wanted to work through examples could come to office hours, which have been quiet so far—and I asked them to vote using the clickers on how future lectures should be conducted:

(A) Deliver traditional lectures that closely match the pre-recorded ones
(B) Cancel one lecture each week, replacing it with open office hours (in addition to existing office hours)
(C) Cancel both lectures each week, replacing both with open office hours (ditto)
(D) I have another suggestion, which I am posting right now on Piazza with tag #lecture
(E) I don’t care/I’ll probably ignore lecture regardless of format

Below are the results of that poll. The people have spoken. Less work for me, but disappointing on many levels, and I feel bad for the 20-30 people who do prepare and who have been getting a lot of value out of the more involved examples in "lecture" that go beyond the book or videos. And it doesn't seem like a great use of my time to do a live performance of material that's already recorded and on which I'm unlikely to improve, though it is a lot less work for me (if not as interesting).

(Note that the number of poll respondents adds up to 121, consistent with previous lectures this semester. So even in steady state at the beginning of the semester, 1/3 of the students aren't showing up, even though participation in peer-instruction activities using the clickers counts for part of the grade.)

(Update: I posted a link to this article on the Berkeley CS Facebook page, and many students have commented. One particularly interesting comment comes from a student who took my class previously: “When I took your class, as someone who rarely, if ever, went to lecture or prepared, the microquizzes helped me convince myself to do that. They weren't hard, so getting them wrong was just embarrassing to me. I was probably the best student I ever was until you stopped doing them my semester for reasons I can't recall and after that the whole thing fell apart for me.” So maybe I should just stick to my guns on the "read before lecture" part and make the microquizzes worth more than the ~10% they're currently worth...)

A year or so ago, I approached the campus Committee on Curriculum and Instruction to ask whether they'd be likely to approve a "lab" version of this course, in which there is no live lecture (only pre-recorded), the lecture hours are replaced with open lab/office hours, and all the other course elements (teaching assistants, small sections, office hours for teaching staff, etc.) are preserved. They said that would likely be fine. It's sounding like a better and better deal to me given that the majority of students want me to spend my time doing something nearly identical to what I've done in past semesters. And if previous semesters are any indication—and this has been very consistent since I started teaching this course in 2012—lecture attendance will fall off about halfway through. (I won't be surprised if the rationale given is that the lectures are available to watch online, even though the data we're gathering this semester shows that's clearly not happening.)

In an ideal universe, maybe we'd have 2 versions of the course, one tailored to people who do the prep work and want a fast-paced in-depth experience in lecture and another for people who prefer to be lectured to in the traditional manner. But we don't live in that ideal universe because of finite resource constraints. And one could argue that we already have that second version of the course—it's on edX and is the highest-grossing and one of the most popular courses on the platform, and it's free.

Instructors: What has your experience been flipping a large class?

Students: What would you do?

Wednesday, September 7, 2016

The hidden benefits of microservices

If you know me or read my writings, you probably know that I've become a big fan of Communications of the ACM, the monthly magazine of the Association for Computing Machinery, the world's largest and most prestigious professional society for computing. The August issue had a very nice essay by one of the original Amazon engineers on the hidden benefits of moving to a microservices-based architecture.

The appearance of the article is particularly timely since I just started teaching Engineering Software as a Service to about 200 Berkeley undergraduates in this fall semester, and a service-oriented approach to thinking about SaaS is a cornerstone of the course.

I'm particularly interested in an Amazonian's views since in the ESaaS course and accompanying book we cite Amazon's early move to services as an example of a tech company being ahead of an important curve.

The article is short and well worth a read, especially as it's written by an engineer who was there from the first days of the "monolithic" Amazon.com site and through the transition to the service-oriented architecture Amazon has today, including many services also made available publicly through AWS. Essentially, the 7 benefits apply as long as your APIs are stable and documented, and you have a tight dev/ops team keeping the service reliably running:

  1. Permissionless innovation: no need to ask "gatekeepers" for permission to add/try new features, as long as the features don't interfere with existing use of the API.
  2. Forced to design for failure: cross-service failures are hard to debug, so there's a strong motivation to "design for debuggability/diagnosis" so that many failure types become routine and even automatically recoverable at the individual microservice level.
  3. Forced to disrupt trust boundaries: within small teams, mutual trust is easier since everyone can sign off on every commit. Large teams can't do this, and a microservices architecture both enables small teams and forces coming to terms with the fact that the same level of trust that exists within a team cannot always stretch across API boundaries. This is good for scaling an organization, as Conway's Law suggests ("Any organization will produce a system design that mirrors the organization's internal communication structure").
  4. A service's developers are also its operators, so they are very close to their customers, whether those customers are end users or other microservices. The tighter customer feedback loop accelerates the pace of improvement in the service.
  5. Makes deprecation somewhat easier: if you version the API it becomes clear who "really" cares about the legacy versions.
  6. Makes it possible to change the schema or even the persistence model for data and metadata without permission from others.
  7. Allows separating out particularly sensitive functions as microservices (e.g. handling sensitive data such as financial or health) and "concentrating the pain" (and focus) of how best to steward that data.
  8. Encourages thinking more aggressively about testing from the outside as well as the inside: continuous deployment, "smoke tests" where an experimental feature is rolled out fractionally to see if anything breaks, and phased deployment of new features can all improve the precision of the test suite and result in lower time-to-repair when something breaks in production.
As the author points out, moving from an existing monolithic app to microservices isn't easy (former Director of Engineering at Twitter, Raffi Krikorian, explained what a big move this was for Twitter and why the "Rails doesn't scale" meme wasn't really an accurate description of what happened there). And unless your microservices architecture really does enable the above behaviors, you're probably not 100% of the way there. But this is clearly the way SaaS architecture is going, and will soon be its own new chapter in ESaaS.



Tuesday, August 16, 2016

Encrypting application-level data at rest

In a previous post I described how I keep sensitive config data such as API keys secret for a public app.

What about data that is kept per-user, such as a password or API key for each user or other similar record in your databases?

For that I use the attr_encrypted gem. There is a great post on how to do this; if you're using Figaro as recommended in my earlier post, that gives you an easy place to store the symmetric-encryption key(s) used by attr_encrypted, so instead of

attr_encrypted :user, :key => ENV["USERKEY"]

you would instead say

attr_encrypted :user, :key => Figaro.env.USERKEY

Beware of attr_encrypted and serialize

Be aware that if you are using serialize to store (say) a hash or array in an ActiveRecord text field, there appear to be some weird interactions if you try to use attr_encrypted on that field. What I have found to work is this:

Before:

class User < ActiveRecord::Base
  serialize :api_keys, Hash
end

User.new.api_keys  # =>  {}

After:

class User < ActiveRecord::Base
  attr_encrypted :api_keys, :key => Figaro.env.SECRET, \
     :marshal => true
end

User.new.api_keys  # => nil


That last option instructs attr_encrypted to marshal the resulting data structure before encrypting, and unmarshal after reading from the database and decrypting. However, whereas using serialize gives newly-created attributes a default value of a new instance of the serialized type (in this case, that would be the empty hash), this is not true with attr_encrypted. To remedy this, if your app relies on the serialized value always being non-nil, I'd advise using an after_initialize block to enforce the invariant that the attribute's default value is always an instance of the serialized class:

class User < ActiveRecord::Base
  attr_encrypted :api_keys, :key => Figaro.env.SECRET, \
    :marshal => true
  def ensure_is_hash ; self.api_keys ||= {} ; end
  after_initialize :ensure_is_hash
  private          :ensure_is_hash
  # ...
end

User.new.api_keys  # =>  {}


Et voila, with the exception of the after_initialize callback, your encrypted-at-rest attributes will now behave the same way as regular unencrypted attributes.


Keeping secrets

Lately I've been involved with a bunch of apps that have to manage sensitive or semi-sensitive data, such as passwords, API keys, and so on.  I've converged on a couple of ways of managing this data that I thought I'd share. Most of my apps are hosted on Heroku, but the advice here applies to other deployment platforms too.

I'll distinguish two kinds of data: configuration-level data that is specified once for the whole app (for example, an API key for the app to access another microservice), and sensitive app-level data (for example, a password or other secret that needs to be stored for each user).  In this post I describe how to store configuration-level secret data; a separate post builds on this one to store app-level secret data.

You'll need the Figaro gem and a command-line-friendly installation of your favorite encryption package; I use GPG.

Use Figaro

For managing sensitive config data such as API keys, here's a methodology that observes two important guidelines:
  1. DRY: the secret data should be kept all in one place and nowhere else.
  2. Sensitive data should never be checked into version control (eg GitHub), especially if the app is otherwise open source.
First, set up your app to use the Heroku-friendly Figaro gem to manage the app's secrets. In brief, Figaro uses the well-known technique of accessing  your app's sensitive config data as environment variables, but:
  • It centralizes all secrets in a file config/application.yml
  • it lets you access them through a proxy object, so that environment variable FooBar can be accessed as Figaro.env.FooBar. This allows you to stub/override certain config variables for testing if you want, and also (more importantly) to specify different values of those variables for production environment vs. development. For example, many microservices like Stripe let you setup two different API keys—a regular key, and a "testing" key that behaves the same as a regular key but no financial transactions are actually performed. Using Figaro, your app doesn't have to know which key it uses, because the correct key values for each environment can be supplied in application.yml
When you setup Figaro, it correctly adds config/application.yml to your .gitignore, since this file containing secrets should not be versioned (at least not in cleartext).

Encrypt  & version your secrets file

Next, agree with the rest of your team on a symmetric key for encrypting the secrets file. You can then encrypt the file like so (this example uses GPG and the bash shell):

export KEY=your-secret-key-value
gpg --passphrase "$KEY" --encrypt --symmetric --armor \
   --output config/application.yml.asc  config/application.yml

This will create the ASCII-armored encrypted file config/application.yml.asc, which I then check into version control.  Note that the security of this file relies on having chosen a good symmetric-encryption key.


Make sure developers can generate the secrets file

Of course, the config/application.yml file is now needed for your app to run, but only config/application.yml.asc exists in the repo. So any developer who clones the repo needs to know the value of $KEY, and when they clone the repo, they must manually create the secrets file from its encrypted version by performing the decrypt operation:

export KEY=your-secret-key-value
gpg --passphrase "$KEY" --decrypt \
   --output config/application.yml  config/application.yml.asc

Make sure Heroku knows the secrets

Figaro arranges to make all the info in config/application.yml available as environment ("configuration") variables on Heroku. Whoever has deploy access to the Heroku app can do this step:

figaro heroku:set -e production -a name-of-heroku-app


Make sure CI knows the secrets

Finally, if you're using continuous integration (you are, right?) it probably also needs to be able to generate config/application.yml in order to run the tests. On Travis, I add a step to the "before-script" to do this:

before_script:
  - gpg --passphrase "$KEY" --output config/application.yml \
       --decrypt config/application.yml.asc

Of course, this requires Travis to know the value of $KEY, so you also need to go to your app's config variables in Travis and set the value for KEY manually. These steps are easily adapted for Semaphore or other CI environments—in general, you manually supply the CI environment with the symmetric key value, and you add a before-step to the build to generate the unencrypted secrets file from the encrypted one.

When you change the contents of config/application.yml

If you add or modify secrets within application.yml, you'll need to re-create and commit config/application.yml.asc, and notify your developers that they must merge the new file and manually re-create config/application.yml from it.  You'll also need to re-run figaro heroku:set -e production to populate the deploy with the new values.

That's it. This seems complex, but after the one-time setup, it's basically maintenance-free. In the next post I'll talk about encrypting data at rest per-user.

Tuesday, July 19, 2016

Pair Programming MOOC Style - is it Agile?

This summer the “Engineering Software as a Service” (ESaaS) MOOC got re-branded as “Agile Developing using Ruby on Rails” (ADuRoR). This was a recommendation from edX and I felt it was probably a good move, since in the past there always seemed to be a few people surprised to find that the course was not about cloud computing per se, but about software engineering principles applied to web application development. The course was basically the same, although we were upgrading fully to Rails 4 and RSpec 3 and that was a fair amount of work in itself - more so in terms of updating materials than upgrading software. The big change we did make was to have pair programming become “required” for the first time, and have it assessed by peer review.

We’d introduced “optional” pairing some two years earlier. In the first instance, each assignment had a bonus 10% for submitting a video of pair programming on the assignment, but we also accepted solo videos or text summaries of the assignment talking about the pros and cons of pairing versus solo work. Part of our motivation for requiring more pair programming from the learners is our plan for a “part 3” or “project” capstone course. This will complete the ADuRoR Series, and will involve teams of MOOC learners collaborating on building software solutions for charities and non-profits around the world. We’re keen to ensure that this capstone course has teams that work effectively together. We strongly believe that pair programming is a critical part of an effective software development team. It’s not that you can’t develop great software without pair programming. It’s that you can’t work effectively in a team without good people and communication skills.

Somebody who would pair program for all the ADuRoR MOOC assignments would likely be better equipped to work effectively in a project team than someone who did not. Not that this is a guarantee; there are no guarantees, but we need some mechanism to ensure a high chance of success for our initial project teams. There’s also lots of evidence that pair programming has a big positive impact on learning to code.

As it happens the Agile process itself does not mandate using pair programming, test driven development, scrums, or anything of the sort. Let’s just refresh our memory as to the principles behind the Agile Manifesto:
Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
Welcome changing requirements, even late in development. Agile processes harness change for the customer’s competitive advantage.
Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
Business people and developers must work together daily throughout the project.
Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
Working software is the primary measure of progress.
Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
Continuous attention to technical excellence and good design enhances agility.
Simplicity–the art of maximizing the amount of work not done–is essential.
The best architectures, requirements, and designs emerge from self-organizing teams.
At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
There’s so much here, but as is clear, tests and pairing are not mentioned explicitly, nor is Scrum or Kanban, or many other things that are commonly associated with Agile. Personally I see the last “stanza” as the critical one. Regular reflection on how to become more effective, tuning and adjusting behaviour accordingly.

This seems like extraordinarily good advice to me. I happen to really enjoy pair programming. Not everyone does, and it’s not right for every team; but the crucial thing is that in any period you look back and say well we tried X and it’s gone well, gone badly, made no difference. And then we try a variant of X, throw out X, continue X as appropriate.

Our hope with the project course is to have teams of learners operate in agile teams. Ultimately we want to empower those teams to evolve their own Agile process based on what works for them. However we’d very much like all those team members to have at least tried TDD, BDD, pairing etc. to inform their own Agile Process. We also believe that remote pair programming and remote scrum meetings will be a critical part of distributed project teams working effectively together. To the extent that you accept Google Hangouts (or similar video conferencing software) as a form of face to face conversation, then we can see this also fits in with the principles behind the Agile Manifesto.
The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
One of the main challenges for pairing in our MOOC has been the logistics of helping online learners find pair partners. While we had over 500 successful pairing sessions during the MOOC, some students have struggled to find pair partners available at convenient times for them, particularly on the later assignments since from start to finish the number of learners attempting the assignment goes down by an order of magnitude. It’s also clear that the logistical overhead of arranging a pairing session is not great for those on tight schedules.

Our options going forward appear to include:
  1. reducing the amount of credit given to pairing; so it doesn’t feel as “mandatory”
  2. improving the mechanism by which pair partners are found
  3. improving the support and documentation of how to manage pairing
The danger with the first step is that it would likely reduce the number of people attempting to pair, and further exacerbate the problem of finding pair partners. We certainly don’t want learners to become frustrated with attempting to pair program; we do believe that both pairing, and reviewing other learners pair videos are valuable learning experiences. The ultimate question is can we reduce the pairing logistics overhead sufficiently through options 2 and 3 …

Friday, July 15, 2016

Are students/teams following the Agile process? How would you know?

I teach a large project course at Berkeley. Each semester, ~40 teams of 6 students each work on open-ended SaaS projects with real external customers. The projects complement an aggressive syllabus of learning Agile/XP using Rails; most of that syllabus is also available in the form of free MOOCs on edX and is complemented by our inexpensive and award-winning textbook [shameless plug].

We require the teams to use iteration-based Agile/XP to develop the projects. During each of four iterations, the team will meet with their customer, use lo-fi mockups to reach agreement on what to do next, use Pivotal Tracker to enter the user stories they'll work on, use Cucumber to turn these into acceptance tests, use RSpec to develop the actual code via TDD, use Travis CI and CodeClimate to keep constant tabs on their code coverage and code quality, and do continuous deployment to Heroku. Everything is managed via GitHub—we encourage teams to use branch-per-feature and to use pull requests as code reviews. Most teams communicate using Slack; we advise teams to do daily or almost-daily 5-minute standups, in addition to longer "all hands" meetings, and to try pair programming and if possible promiscuous pairing. Some teams also use Cloud9 virtual machines to make the development environment consistent across all team members (we provide a setup script that installs the baseline tools needed to do the course homeworks and project). There are also two meetings between the team and their "coach" (TA) during each iteration; coaches help clean up user stories to make them SMART, and advise teams on design and implementation problems.

While the non-project parts of the course benefit from extensive autograding, it's much harder to scale project supervision. Even with a staff of teaching assistants, each TA must keep track of 6 to 8 separate project groups. Project teams are graded according to many criteria; some are qualitative and based on the substance of the meetings with coaches (there is a structured rubric that coaches follow, which we describe in the "Student Projects" chapter of the Instructor's Guide that accompanies our textbook), but others are based on quantitative measurements of whether students/teams are in fact following the Agile process.

In our CS education research we've become quite interested in how to provide automation to help instructors monitor these quantitative indicators, so that we can spot dysfunctional teams early and intervene.

We've observed various ways in which teams don't function properly:

  • Lack of communication: Teams don't communicate frequently enough (e.g., they don't do standups) or communicate in a fractured way, for example, a team of 6 really becomes two teams of three who don't talk to each other. The result is "merges from hell", lack of clarity on who is doing what, etc.
  • Not using project-management/project-planning tools effectively: One goal of Pivotal Tracker (or similar tools such as Trello or GitHub Issues) is to plan, track, prioritize, and assign work, so that it's easy to see who's working on what, how long they've been working on it, and what has already been delivered or pushed to production. Some teams drift away from using it, and rely on either ad-hoc meeting notes or "to-do lists" in Google Docs. The result is poor ability to estimate when things will be done and finger-pointing when a gating task isn't delivered and holds up other tasks.
  • Not following TDD: Projects are required to have good test coverage (85% minimum, combined between unit/functional and acceptance tests), and we preach TDD. But some students never give TDD a fair shake, and end up writing the tests after the code. The result is bad tests that cover poorly-factored code in order to meet the coverage requirement.
  • Poor code quality: We require all projects to report CodeClimate scores, and this score is part of the evaluation. Although we repeatedly tell students that TDD is likely to lead them to higher-quality code, another side-effect of not following TDD is poorly-factored code that manifests as a low CodeClimate score, with students then manually fixing up the code (and the tests) to improve those scores.
There are other dysfunctions, but the above are the main ones. 

To help flag these, I've become interested in how to study not only teams and team dynamics, but what analytics can tell us about the extent to which students are following the Agile process (or any process). We recently summarized and discussed some papers in our reading group that address this. (You can flip through the summary slides of our reading group discussion, but it may help to read the summaries and/or papers first.)

One of these papers, Metrics in Agile Project Courses, considers a variety of metrics obtainable from tool APIs to evaluate how effectively students are following Scrum, especially by measuring:
  • Average life of a pull request, from open to merge. “Optimal” is <24 hours “based on our past experience with the course”, but too-short lifetimes may be bad because they indicate the code wasn't reviewed thoroughly before merging.
  • Percent of merge requests with at least one comment or other activity (eg associated task). This is similar to our metric of pull-request activity in ProjectScope.
  • Mean time between a CI (continuous integration) failure and first successful build on the main development branch.
  • All indicators are tracked week-to-week (or sprint-to-sprint or iteration-to-iteration) so instructors see trends as well as snapshots.
  • Number of deploys this iteration (In their course, “deployment” == “customer downloads latest app”, but for SaaS we could just look at Heroku deploy history.)
In a more detailed paper (How surveys, tutors, and software help to assess Scrum adoption in a classroom software engineering project), the same group used a combination of student self-surveys, TA observation of team behaviors, and a tool they built called "ScrumLint" that which analyzes GitHub commits and issues relative to 10 “conformance metrics” developed for their course, including (eg) “percentage of commits in the last 30 minutes before the sprint-end deadline”. Each conformance metric has a “rating function” that computes a score between 0 (practice wasn't followed at all) and 100 (practice was followed very well); each team can see their scores on all metrics.

What's interesting is that they actually point to an Operations Research-flavored paper on a methodology for detecting nonconformance with a process (see the "Zazworka et al." reference in the summary for more details). The methodology helps you define a specific step or subprocess within a workflow, and rigorously define what it means to "violate" it.

With AgileVentures, we are trying to encapsulate some of this in a set of gems called ProjectScope that will be the basis of a SaaS app for Agile coaches trying to monitor multiple projects. Check back here periodically (or join AV!) if you are interested in helping out!



Wednesday, June 29, 2016

Pair programming with a bot partner?

I just ran across a paper from 2010 with the provocative (and Googleable) title "The Impact of a Peer-Learning Agent Based on Pair Programming in a Programming Course." (IEEE Transactions on Education 53(2) May 2010)

The premise is that beginning programmers interact with an intelligent tutoring system that alternates between a "driver" and "navigator/observer" role in pair programming. The overall workflow is that the programming task starts with a flowchart, and then code is written to implement what the flowchart shows. When the bot is in "driver" mode, it "asks for help" from the student (playing the navigator/observer) in figuring out the flowchart, by showing the student an incomplete flowchart and having her fill in the blanks. The bot then writes code for the flowchart, but makes deliberate mistakes (syntactic and semantic errors) and expects the student to correct them, so the student "learns by teaching".

When the bot is in "navigator/observer" mode, it observes student-entered code and responds to various common errors, both from bug libraries and from instructor-supplied "keyword-triggered" messages corresponding to common mistakes or commonly-chosen suboptimal problem-solving strategies (so there's no AI in this part of it).

(The bot's "navigator mode" was this group's previous work; what's new here is the bot alternating between the driver role and navigator role, even though the bot's implementation of those roles doesn't quite match pair programming—"pair programming inspired" might be more accurate).

In both modes, simple Bayesian models are used to estimate the likelihood of students making a particular error, based on concept-dependence graphs and observed mastery of the student on predecessor nodes in the concept graph.;
A limited randomized-controlled between-subjects trial, in which students used the system bracketed by pre- and post-testing, showed that students exhibited better transfer (p>.05) and retained material longer (p>.05) when using the system that switched roles periodically than when using the system that was always in "navigator mode". The effect size was large (0.79) for transfer and small (0.38) for retention.

The student modeling isn't particularly sophisticated, and the "navigator mode" bot isn't a particularly impressive use of NLP or AI, but the idea of having a tutoring system switch between roles is intriguing. Conversation bots and program analysis have both made lots of progress since this paper was published; the possibility of doing a real pair-programming chatbot for scaffolded programming assignments is quite intriguing!



Friday, June 10, 2016

Pair Programming Considered Harmful?

One of the students in our distributed teams course posted to the forum there asking 
This course has been a fantastic introduction to distributed team management tools and techniques. I am curious, though, to read responses, defences or critiques, of pair programming. Are there effective ways to do distributed team management without it? I present this article as a starting point: not exactly a balanced and neutral viewpoint as its title makes clear, but one that may elicit response. 
http://techcrunch.com/2012/03/03/pair-programming-considered-harmful/
I was all ready to get out the big guns in defence of pair programming, but despite the provocative title I found the article reasonably balanced and a very interesting read.  Pair programming (like many other techniques) is no panacea.

You can read a summary of some more research on pair programming here:


In my experience (and backed by various studies) pair programming works particularly well for learning developers, and for experienced developers when they are working on something new and complicated.

However I agree with the conclusion of the article author


The true answer is that there is no one answer; that what works best is a dynamic combination of solitary, pair, and group work, depending on the context, using your best judgement. Paired programming definitely has its place. (Betteridge’s Law strikes again!) In some cases that place may even be “much of most days.” But insisting on 100 percent pairing is mindless dogma, and like all mindless dogma, ultimately counterproductive.
You can do distributed team management without pairing, but without it you miss one important tool for helping your junior developers level up and distributing knowledge through the team.

Dogma is the danger.  Saying "never ever pair program" or "never ever work alone" is dangerous.  One needs to achieve a good balance.  The danger that I see for a lot of companies and organisations is that they reject pair programming out of hand.  They even have open plan offices, but everyone sits there with their headphones, working in their little silo, potentially working in a highly counter-productive fashion.

I would say follow the Agile process.  Every week you tinker and try things.  No pair programming this week?  Let's see what happens if we do 100% pair programming this week?  Everyone sick of pair programming - let's try all doing mob programming this week.  etc. etc.

The critical thing is to keep reflecting on what is (and isn't) working for your team.  Sometimes you need to be bloody minded to make a difference.  Other times you need to have an open mind.  Keep reflecting to work out when you need which :-)

Monday, May 23, 2016

If you can only speak TDD, maybe you'll think TDD?

I just finished reading Through the Language Glass, an interesting popular-press exposition of the relationship between spoken language and perception, and in particular of the "weak version" of the Sapir-Whorf hypothesis, according to which a speakers' use of language influences their mental processing.
An example is given of the Guugu Yimithirr aboriginal language, in which all location-specifying speech acts use only absolute compass directions (NSEW) as opposed to relative directions. For example, if we were holding hands and standing next to each other, instead of saying "I am standing to the right of Armando", you'd say "I am standing to the north of Armando" (or whatever direction it happened to be). And if we then rotated clockwise together and faced a different direction, you'd then have to say "I'm now standing to the east of Armando." G-Y speakers have to think constantly about their orientation relative to the world, since communication about place would be impossible otherwise. Experiments are described whose provocative finding is that this makes it more difficult, e.g., for G-Y speakers to solve spatial puzzles in which the relative spatial relationships between objects are unchanged but their absolute orientation is changed.
What does this have to do with Agile+SaaS? I've been looking for ways to more strongly encourage the students in the Berkeley course to really commit to TDD. (Many do, of course, but some still write tests "after the fact" only because they know we will be grading them on test coverage.) I wonder if one could devise a lab exercise in which students develop some functionality via pair programming, but they are only allowed to speak in terms of test results when discussing with each other. This would be an interactive (vs. autograded/solo) exercise, and the goal would be to get students to use a vocabulary that effectively limits their communication to describing things in terms of test results. Eg "For this next line, the effect I want is that given a valid URL, it should retrieve a valid XML document." I know many of us already (strive to) maintain this mindset, but is anyone up for designing/scaffolding an exercise that would help us teach it to TDD initiates?

Tuesday, May 17, 2016

Amazon Echo is a fun way to practice "API life skills"

Last week was my birthday, and I got an Amazon Echo. It's a remarkably cool device, but of course the coolest thing is you can write your own apps for it. (Technically they are "Alexa apps" and not "Echo apps"—Alexa is the back end that runs the apps, the Echo is a specific device that acts as a client, but there's also mobile client apps.) Amazon has a nice development environment in which you essentially write a simple "grammar" for the phrases you want your app to understand, and some "event handlers" that get called when particular phrases are recognized. You can write the event handlers in Node.js, Python, or Java. (No Ruby, but hey, I'm ecumenical and Python is just fine.)
Students of Part 2 of ESaaS know that I have strong feelings about the use of Node.js for server-side code. (This post does a great job of summarizing my views, as does this mildly NSFW xtranormal video.) Nonetheless, Amazon's developer documentation starter examples are mostly in Node, so I decided to give it a fair shake.
I decided to make an Alexa app that gives me upcoming departure information for BART, the Bay Area's rapid transit system, which I use daily to get to work. They have an RESTful XML API in which simple HTTP GETs return an XML data structure you can parse.

What does the code look like in Node.js to do a simple GET? It looks like this:

    http.get(endpoint + queryString, function (res) {
        var noaaResponseString = '';
        console.log('Status Code: ' + res.statusCode);

        if (res.statusCode != 200) {
            tideResponseCallback(new Error("Non 200 Response"));
        }

        res.on('data', function (data) {
            noaaResponseString += data;
        });

        res.on('end', function () {
            var noaaResponseObject = JSON.parse(noaaResponseString);

            if (noaaResponseObject.error) {
                console.log("NOAA error: " + noaaResponseObj.error.message);
                tideResponseCallback(new Error(noaaResponseObj.error.message));
            } else {
                var highTide = findHighTide(noaaResponseObject);
                tideResponseCallback(null, highTide);
            }
        });
    }).on('error', function (e) {
        console.log("Communications error: " + e.message);
        tideResponseCallback(new Error(e.message));
    });


As a less-experienced JavaScript programmer, I had to stare at it for awhile to understand its flow, and then I realized my God, all it does is fetch a JSON data structure using an HTTP GET call. Here's the code I wrote in Python to do the same thing (using BART's API):

try:
  self.body = urllib2.urlopen(url).read()
except urllib2.URLError:
  self.error = 'The BART website did not respond'

But I digress. The point is writing Alexa apps is a great way to build up your API-fu, since the best Alexa apps will probably receive concise speech commands, interact with several back-end APIs, and present concise speech output. And you can write and test the apps without having a device, using Amazon's web interface to test how the app responds to specific phrases. Plus the apps can be pretty fun to use, and the device makes me feel like I live in a home of the vaguely near future.Does no one but me see what is wrong with this picture? What is app code doing trying to respond to individual HTTP-level events at the app level? This clearly violates the "A" in the SOFA principle I teach in ESaaS: a function should stick to a single level of abstraction. I'm no more impressed with Node than I was at the beginning of the exercise.
So try out writing some Alexa apps—you can even deploy them free on Lambda, a "managed AWS" service in which you deploy handlers rather than full VMs (essentially).
But for the love of all that is good…write them in Python.

Looking under the hood in cloud datacenters

There's a great article in the May 2016 issue of Communications of the ACM authored by a group of Googlers, including former Berkeley PhD student David Oppenheimer (now the tech lead on Kubernetes) and my PhD advisor Eric Brewer (Berkeley professor on leave to be Google's VP Infrastructure), on the evolution of cloud-based management systems at Google. They were doing "containerization" of cloud servers before almost anyone else, and contributed to the Linux containerization code. The article distills over 10 years of designing and working with different approaches to containerizing software to share the hardware in Google's datacenters, and some lessons learned and still-open problems. A couple of key messages:

  • The "unit of virtualization" is no longer the virtual machine but the application, since modern systems like Kubernetes and Docker virtualize the app container and the files and resources on which it depends, but actually hide some details of the underlying OS, instead relying on OS-independent APIs. Apps are therefore less sensitive to OS-specific APIs and easier to port across containers.
  • As a result of having both resource containers and filesystem-virtualization containers map to an app (or small collection of apps that make up a service), troubleshooting and monitoring naturally have affinity to apps rather than (virtual) machines, simplifying devops.
  • A "label" scheme in which a given instance of an app has one or more labels simplifies deployment and devops as well. For example, if one instance of a service is misbehaving, that instance can (eg) have the "production" label removed from it, so that monitoring/load balancing services will exclude it from their operations but leave it running so it can be debugged in situ.
The article also points to still-open problems, including (surprise!) configuration (every configuration DSL eventually becomes Turing-complete and they're all mutually incompatible), dependency management (related to configuration, but also introduces new complications, i.e. what if service X depends on Y but Y must first go through a registration or authentication step before it can be deployed), and more.

Overall a really interesting read that suggests, ironically, that the programmer-facing view of cloud datacenters is actually a "back to the 1990s" view of managing a collection of individual apps with APIs (remember ASPs?) rather than managing a collection of either virtual or physical machine images.