ESaaS: Software as a Service as a Blog, by Armando Fox

Risks of Cryptocurrencies (2018-08-15)<div dir="ltr" style="text-align: left;" trbidi="on">
Cryptocurrencies and blockchains are now all the rage. In 1999, if your startup ended in "dot com", people threw money at you; today, if your startup has something to do with blockchain or cryptocurrency, likewise.<br />
I am naturally a skeptic, having seen a handful of boom-and-bust cycles in technology. It's not that I'm skeptical about enthusiasm for a new technology <i>per se</i>; I am as much an early adopter and techno-apologist as anyone. Indeed, in 2007, my colleagues in Berkeley's RAD Lab (where I was a researcher at the time) and I <a href="http://d1smfj0g31qzek.cloudfront.net/abovetheclouds.pdf" target="_blank">cheered loudly</a> for Amazon's new "public utility" style cloud computing (EC2, or Elastic Compute Cloud), <i>against </i>skeptics who dismissed it as hype and nothing fundamentally new. Ten years later, Amazon owns >30% of the infrastructure services market on the strength of its utility-computing offerings, and Netflix, one of the largest app installations on the planet, relies entirely on EC2 for its computing. This is largely in keeping with <a href="http://abovetheclouds.berkeley.edu/" target="_blank">what we predicted would happen</a>.<br />
But sometimes, elements of techno-optimism about a new techno-fad seem to willfully ignore economic reality, technical reality, social reality, or a combination of the three. For example, in the late 90s, peer-to-peer filesharing was going to make centralized servers obsolete. The reasons given were many: P2P didn't rely on a single centralized point of coordination or failure (if properly designed and implemented); "spare" cycles and storage around the network could be harnessed to store and process data, rather than paying for centralized capacity to do so; Big Evil Governments would be unable to take down P2P networks acting in ways they didn't like; and so on.<br />
In fact, my advisor at the time, Eric Brewer (now VP Infrastructure at Google), argued persuasively that from a technical and economic perspective, centralization made <i>more </i>sense than distribution. A centralized cluster that's well-run requires fewer human resources per server, can be made more secure and more reliable because all the eggs are in one basket to which dedicated attention can be given, and so on. He was right: Amazon EC2 is the manifestation of exactly that. It is true that there are still good reasons to use P2P, in particular when evasion of censorship or takedown is needed because the site is trafficking in illegal goods. Yes, there is a free-speech argument to be made, but overwhelmingly, the uses of P2P have been to facilitate transactions or discourse that society has already agreed should be illegal.<br />
Similarly, the 2005 "One Laptop Per Child" $100 laptop was supposed to be the technical innovation that would get laptops and the Internet into the hands of children in developing countries, but it <a href="https://www.theverge.com/2018/4/16/17233946/olpcs-100-laptop-education-where-is-it-now" target="_blank">failed miserably</a> in doing that. While the prototype did feature some technical innovations, the nonprofit's founders and spiritual leaders had little experience in either large-scale hardware procurement or the politics of development projects in foreign countries. The OLPC, with its one-off technology and quirky bespoke OS, was rapidly eclipsed by $200-400 Intel-based subnotebooks running Windows, and the resulting momentum arguably created that new product category; today Chromebooks can be had for under $100 from various vendors.<br />
In the spirit of tempering techno-optimism with doses of socio-political-economic reality, I had a great time reading Nick Weaver's devastating critique of cryptocurrencies in a recent <i>Communications of the ACM</i> (<i><a href="https://cacm.acm.org/magazines/2018/6/228046-risks-of-cryptocurrencies/fulltext" target="_blank">Inside Risks: Risks of Cryptocurrencies</a>, </i>CACM 61(6), June 2018). Nick was a grad student in systems at Berkeley at about the same time I was, and is now a security researcher at Berkeley's International Computer Science Institute. In his essay, he argues persuasively that cryptocurrencies are neither fit for purpose as a stable store of value, nor invulnerable to mutation as proponents claim, nor sufficiently decentralized in practice to avoid trusting a small cabal of entities that effectively control most of the wealth. On top of that, bugs in the code can lead to situations that are catastrophic not only for individual coin holders but for entire societies.<br />
The prose is not polemical or hyperbolic, but it is implacable, relentless, and dryly sardonic in the style of something like <i><a href="https://archive.org/details/pascal-not-my-favorite" target="_blank">Why Pascal Is Not My Favorite Programming Language</a>, </i>and it musters its evidence well. Suffice to say it affirmed my original decision (on which I've never wavered) to stay the hell away from cryptocurrencies.<br />
To be fair, P2P and OLPC had important follow-on effects, and their development resulted in important ideas that play key roles as part of today's ecosystem (which, coincidentally, is focused on centralized servers serving commodity-built clients). It's just that the ideas haven't been deployed in the way envisioned by their creators, because the socio-technical arguments underlying their positions were uncertain to begin with and readily crumbled.<br />
Go read Nick's article. If you like it, feel free to tip him in Bitcoin.<br />
<br /></div>
Funding agencies, are you listening? How to build an unsuccessful research center (2017-12-28)<div dir="ltr" style="text-align: left;" trbidi="on">
In 2014, my colleague David Patterson had a CACM Viewpoint entitled <i><a href="https://cacm.acm.org/magazines/2014/3/172507-how-to-build-a-bad-research-center/fulltext" target="_blank">How to Build a Bad Research Center</a>, </i>offering tongue-in-cheek <i>bad </i>advice to build a research center that's unlikely to produce breakthrough work. Dave used the recent wrap-up of the Berkeley Parallel Computing Lab (ParLab) as a case study in building a <i>successful </i>center, following a pattern used by many successful centers for which Dave had been a PI or co-PI.<br />
<br />
About a year ago (I'm behind in my reading), Jesper Larsson Träff, a parallel computing researcher at TU Wien, wrote an eerily similar Viewpoint on <i><a href="https://cacm.acm.org/magazines/2016/12/210379-mismanaging-parallel-computing-research-through-eu-project-funding/fulltext" target="_blank">(Mis)managing Parallel Computing Research Through EU Project Funding</a>, </i>which makes the point that the bureaucracy and heavyweight project management style required for multinational EU Projects (solicited regularly by the European Commission), including reliance on artificial "deliverables" that seldom correlate with actual research breakthroughs, thwart efforts to do groundbreaking work.<br />
<br />
Reading Träff's litany of bad research-management practices, it's almost as if the European Commission read the "Eight Commandments for a Bad Center" in Dave's article, but failed to realize they were supposed to be satirical. Both <a href="https://cacm.acm.org/magazines/2014/3/172507-how-to-build-a-bad-research-center/fulltext" target="_blank">Dave's</a> and <a href="https://cacm.acm.org/magazines/2016/12/210379-mismanaging-parallel-computing-research-through-eu-project-funding/fulltext" target="_blank">Jesper's</a> short pieces are well worth reading in their entirety; if you're pressed for time, I hope these side-by-side representative quotes from each may convince you to do so:<br />
<br />
<b>Patterson: "</b>Bad Commandment #2: Expanse is measured geographically, not intellectually. For example, in the U.S. the ideal is having investigators from 50 institutions in all 50 states, as this would make a funding agency look good to the U.S. Senate."<br />
<b><br /></b>
<b>Träff: </b>"…the enforced geographical and thematic spread among consortium members in reality means that time and personnel per group is often too small to pursue serious and realistic research."<br />
<b><br /></b>
<b>Patterson: </b>"Bad Commandment 8. Thou shalt honor thy paper publishers. Researchers of research measure productivity by the number of papers and the citations to them. Thus, to ensure center success, you must write, write, write and cite, cite, cite."<br />
<b><br /></b>
<b>Träff: </b>"As part of the dissemination activities it is important for all consortium members to show regular presence at various associated, often self-initiated workshops and conferences, including meetings and workshops of other projects; high publication output is encouraged … The primary purpose of many of these activities is to meet the dissemination plans [deliverables], and has led to a proliferation of workshops presenting and publishing project-relevant trivialities. Apart from the time this consumes, the apparent authority of a workshop masquerading as a scientific event at a well-established conference may reinforce docility and low standards in the naive Ph.D., and appall and deter the more observant one."<br />
<br />
<b>Patterson:</b> "A key to the success of our centers has been feedback from outsiders. Twice a year we hold three-day retreats with everyone in the center plus dozens of guests from other institutions [including industrial partners]. Having long discussions with outsiders often reshapes the views of the students and the faculty, as our guests bring fresh perspectives… Researchers desperately need insightful and constructive criticism, but rarely get it. During retreats, we are not allowed to argue with feedback when it is given; instead, we go over it carefully on our own afterward." <i>Note: In the Berkeley research center model, industrial affiliates pay a fee to participate in the project.</i><br />
<br />
<b>Träff: </b>"…in many cases industrial involvement makes a lot of sense. However, it seems confusing at best to (ab)use scientific research projects to subsidize European (SME) industry. Can't this be done much more effectively by direct means? In any case, it would be more transparent and less ambiguous if industrial participation in EU projects was not directly funded through the projects. Strong projects would be interesting enough that industry would want to participate out of its own volition, which in particular should be the case for large businesses." <i>Note: In the EU Project research center model, public funds are used to pay private companies to participate in the project.</i><br />
<i><br /></i>
Träff closes his article with some suggestions for moving towards a "radically better" funding model for European high-performance computing research. In this section, unsurprisingly, Träff and Patterson find themselves in agreement:<br />
<br />
<b>Träff: </b>"[P]roposals and projects to a larger extent [should] be driven by individual groups with a clear vision and consortium members selected by their specific project-relevant expertise. It would make long-term scientific, and perhaps even commercial sense to make it possible and easy to extend projects that produce actual, great results or valuable prototypes…more possibilities for travel grants and lightweight working groups to foster contacts between European research groups, and more EU scholarships for Ph.D. students doing their Ph.D. at different European universities would also be welcome additions."<br />
<br />
<b>Patterson: </b>"After examining 62 NSF-funded centers in computer science, the researchers found that multiple disciplines increase chances of research success, while research done in multiple institutions—especially when covering a large expanse—decreases them: 'The multi-university projects we studied were less successful, on average, than projects located at a single university. ... Projects with many disciplines involved excelled when they were carried out within one university.'" (J. Cummings and S. Kiesler, <i>Collaborative research across disciplinary and organizational boundaries</i>, Social Studies of Science 35, 5 (2005), 703–722.)<br />
<br />
While the two pieces don't overlap 100%, there are great lessons to be learned from reading both.<br />
<br />
The question is whether the responsible funding agencies are reading them.</div>
"The Death of Big Software" (2017-12-19)<div dir="ltr" style="text-align: left;" trbidi="on">
The current edition of our <a href="http://saasbook.info/" target="_blank">ESaaS</a> textbook makes the case that as early as 2007, Jeff Bezos was pushing Amazon towards an "API only" integration strategy, in which all services would have to be designed with externalizable APIs and all inter-service communication (even inside the company) would have to occur over those APIs. We pointed out that this service-decomposition approach to designing a large service—i.e. compose it from many microservices—was one of the first highly-visible successes of "service-oriented architecture," which in 2011 had begun to acquire a reputation as a content-free buzzphrase bandied by technomarketeer wanna-be's. Indeed, I've recently reported on other CACM pieces extolling the <a href="http://saasbook.blogspot.com/2016/09/the-hidden-benefits-of-microservices.html" target="_blank">benefits of microservices-based architecture</a> and how such an organization <a href="http://saasbook.blogspot.com/2016/09/agile-devops.html" target="_blank">enables a more agile approach to Dev/Ops</a>.<br />
<br />
As we reported in ESaaS, enormous monolithic projects (the opposite of SOA) fail too often, are too complex to manage, and so on. Since small agile teams (Bezos calls them "2-pizza teams") work best on modest-sized projects, all the more reason to decompose a very large service into its component parts, and put a small agile team on each part. But a <a href="https://cacm.acm.org/magazines/2017/12/223060-the-death-of-big-software/fulltext" target="_blank">Viewpoint (op-ed) article in the December 2017 CACM</a> goes farther and asserts simply that "big" monolithic software systems are dead. In addition to the execution problems, big monolithic systems, with their implied vendor lock-in and very long development cycles, are anathema to a nimble business. And the advent of inexpensive cloud computing removed an important deployment obstacle to "going SOA", since a multi-component app could be deployed and accessed from anywhere, even if the components came from different vendors.<br />
<br />
All of which is to say that the time is more than ripe for the Second Edition of <a href="http://saasbook.info/" target="_blank">ESaaS</a> (probably coming sometime next year) to take an "API first" approach to service design: rather than thinking in terms of an app and its features, for the architecture design phase think in terms of resources (in the REST sense) and operations you want to expose on them. Expect a thoroughly revamped discussion of SaaS architecture including a précis of the last 5-8 years of evolution, quickly accelerating from the early stumbling blocks of SOAP-based SOA through the growth of simple HTTP/XML/JSON-based RESTful services and now to an "API first" microservices design approach. <a href="http://swaggerhub.com/" target="_blank">SwaggerHub</a>, here we come!<br />
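To make the resources-and-operations mindset concrete, here is a minimal sketch, in plain Ruby, of the kind of design artifact an API-first team writes down before doing any UI work. The movie-catalog resource and endpoint names are hypothetical, purely for illustration:

```ruby
# "API first": before designing any pages, enumerate the RESTful resources
# and the operations to be exposed on them. Hypothetical example; the
# resource names are illustrative only.
ENDPOINTS = [
  ['GET',    '/movies',     'list movies, with filtering and pagination'],
  ['POST',   '/movies',     'create a movie'],
  ['GET',    '/movies/:id', 'fetch a single movie'],
  ['PUT',    '/movies/:id', 'update a movie'],
  ['DELETE', '/movies/:id', 'delete a movie'],
]

# Every client -- the web UI, a mobile app, a partner integration --
# consumes these same operations; none gets a private back door.
VERBS = ENDPOINTS.map(&:first).uniq   # => ["GET", "POST", "PUT", "DELETE"]
```

Only after this table is settled does the team decide which clients (web pages, mobile screens, third-party integrations) will consume which operations.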
<br />
<br />
<br /></div>
Berkeley SaaS course Demo Day Competition winners! (2017-12-19)<div dir="ltr" style="text-align: left;" trbidi="on">
In the Berkeley campus version of the ESaaS course (<a href="http://cs169.saas-class.org/" target="_blank">Computer Science 169</a>, Software Engineering), which typically enrolls anywhere from 100 to 240 students working in teams of six, we used to have an end-of-semester poster session where every team would have a poster about their project. While it made for a lively event, it was hard for me (the instructor) to spend more than a few minutes at each poster. Of course, there are many other project deliverables, so the grade didn't really depend very heavily on the poster, but I still felt that it was a hurried and stressful process that could stand to be improved.<br />
<br />
Last summer, Cal alumna <a href="http://carinaboo.github.io/" target="_blank">Carina Boo</a>, who had previously served as a TA for this course and is now a full-time software engineer at Google, guest-taught the course. Her offering was very successful, and one of her innovations was to substitute an optional Demo Day Competition for the poster session. Participation was completely voluntary and carried no extra-credit points; each team got up to a 10-minute slot both to discuss the challenges and lessons learned from working with their customer and to present technical aspects of their app that might impress a panel of three judges, all full-time software engineers as well as former students and TAs of this course (among them Kevin Casey, now at Facebook, and Tony Lee, now at Splunk).<br />
<br />
I decided to adopt this format for this semester as well, and it worked so well that I believe I'll keep it for future semesters. Nine of 21 teams signed up, entirely voluntarily, to give 10-minute presentations, and some of their customers showed up as well to endorse the team’s work. The presentations and the work they represented were uniformly excellent, and I’m pleased to congratulate the winners here:<br />
<br />
<b>1st place: Berkeley MarketPlace,</b> a Berkeley-exclusive C2C buyer and seller market (also won <b>Best UI/UX </b>subcategory). Congratulations to <i>Jiazhen Chen, Shuyin Deng, Amy Ge, Yunyi Huang, Ashton Teng, Jack Ye</i><br />
<i><br /></i>
<b>2nd place: ESaaS Engagements, </b>an app to track ESaaS customers across different institutions. Congratulations to <i>Sungwon Cho, Joe Deatrick, Kim Hoang, Dave Kwon, Kyriaki Papadimitriou, Julie Soohoo.</i><br />
<i><br /></i>
<b>Runner up: Berkeley Student Learning Center, </b>an app to help schedule tutoring sessions. Congratulations to <i>Jennifer Be, Haggai Kaunda, Maiki Rainton, Nah Seok, Salvador Villegas, Alex Yang.</i><br />
<b><br /></b>
<b>Most Technically Challenging: </b>a tie between two projects that the judges considered to have surmounted unusually tough technical challenges:<br />
<ul style="text-align: left;">
<li>iC-uC, a lab app that helps optometry students model eye geometry and vision for the UC Berkeley Optometry School. The app had to integrate a Matlab terminal and graphs with a Java wrapper and a Rails front-end. </li>
<li><a href="http://audience1st.com/" target="_blank">Audience1st</a>, a ticketing and backoffice app used by various community theaters in the Bay Area. Among the features added was the online migration of the entire authentication system to OmniAuth, <i>without </i>disturbing the existing login experience, and without access to customers’ current passwords.</li>
</ul>
<br />
<br />
<br /></div>
Making CS/IT education more immersive (2017-10-06)<div dir="ltr" style="text-align: left;" trbidi="on">
This month's Communications of the ACM includes an article by Thomas A. Limoncelli on <i><a href="https://cacm.acm.org/magazines/2017/10/221315-four-ways-to-make-cs-and-it-more-immersive/fulltext" target="_blank">Four Ways to Make CS and IT More Immersive</a>. </i>The author laments that by and large, CS students aren't learning best practices in the classroom, and offers four specific pieces of advice for serving students better.<br />
<br />
<br />
<b>1. </b>Use best-of-breed Dev/Ops tools from the start: the normal way for students to work on coding tasks should be to use Git, submit via commit/push, and provide pointers to CI reports.<br />
<div>
<br /></div>
<div>
In our <a href="http://saas-class.org/" target="_blank"><i>ESaaS </i>course and MOOC</a>, students do just this. Git is used from day one, and student projects (in the campus version of the course, and a coming-soon Part 3 MOOC) are required to include coverage, CodeClimate, and CI badges on their repo front pages. Homework submission is often via deployment to Heroku.</div>
<div>
<br /></div>
<div>
<b>2. </b>Homework programs, even Hello World, should generate Web pages, not text. This requires understanding minimal bits of SaaS architecture, languages, and moving parts.</div>
<div>
<br /></div>
<div>
<i>ESaaS</i>’s first assignment is indeed a Hello World for Ruby, but the very next one has students manipulating a simple SaaS app built using Sinatra. The goal is to get students “thinking SaaS” early on.</div>
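For readers who haven't seen it, Hello World as a web page is barely longer than Hello World on the console. Sinatra is a gem, so as a self-contained sketch here is the equivalent at the Rack level, the interface layer beneath Sinatra and Rails; this is an illustration, not the actual course assignment:

```ruby
# "Hello World" as a web page rather than console text. A Rack application
# is any object that responds to #call(env) and returns
# [status, headers, body] -- the contract Sinatra and Rails implement
# underneath their nicer syntax.
HelloApp = lambda do |env|
  [200, { 'Content-Type' => 'text/html' }, ['<h1>Hello, world!</h1>']]
end

# Simulate one HTTP request the way a Rack server (or Sinatra) would:
status, _headers, body = HelloApp.call(
  'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/'
)
```

Pointing any Rack server at this object (e.g. `run HelloApp` in a `config.ru`) serves the page; Sinatra's `get '/' do ... end` is sugar over exactly this contract, which is why even a first assignment can involve real SaaS moving parts.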
<div>
<br /></div>
<div>
<b>3. </b>Curricula should start with a working system that shows best practices, not just build from low-level to higher-level abstractions.</div>
<div>
<br /></div>
<div>
<i>ESaaS’</i>s second assignment has students examining code for a simple Sinatra app that follows good coding practices, includes integration and unit tests (before students have been introduced to creating or reading code for such tests), and is activated by deploying to the public cloud.</div>
<div>
<br /></div>
<div>
<b>4. </b>Focus on reading, understanding, and adding value to an existing system—not just on greenfield development. The former is much more common than the latter in the software engineering profession.</div>
<div>
<br /></div>
<div>
<i>ESaaS </i>indeed includes a 2-part homework assignment on enhancing legacy code. More exciting, an increasing fraction of the projects undertaken in Berkeley’s version of the course (10 of 11 projects in the Summer 2017 offering; 13 of 20 in Fall 2017) are existing, functioning legacy systems whose owners want additional features.</div>
<div>
<br /></div>
<div>
Our course materials are freely available on edX as the 2-part (soon to be 3-part) MOOC sequence <i>Agile Development Using Ruby on Rails</i>, and plenty of instructor materials are available for teachers wanting to use this content or the accompanying <a href="http://saasbook.info/" target="_blank">textbook</a> in a SPOC.</div>
<div>
<br /></div>
<div>
<br /></div>
</div>
Improving gender & ethnic diversity in CS: what can faculty do? (2017-09-05)<div dir="ltr" style="text-align: left;" trbidi="on">
Recently, my colleagues Dave Patterson, John Hennessy and Maria Klawe, all storied contributors to both academic and professional computing, wrote an <a href="https://www.wired.com/story/what-james-damore-got-wrong-about-gender-bias-in-computer-science/" target="_blank">excellent essay</a> enumerating all the things that were incorrect (and there were many) in Google engineer James Damore's "anti-diversity <a href="http://gizmodo.com/exclusive-heres-the-full-10-page-anti-diversity-screed-1797564320" target="_blank">post</a>." For those who didn't hear about said post, which initially circulated internally at Google but eventually found its way onto the public Internet, Damore argued that the reason there are so few women in computing is because biological differences make women less fit than men for the technical and leadership activities associated with computing, and that companies who strive for gender parity are therefore misguided.<br />
<br />
I can’t improve on my colleagues’ extensive rebuttal of what was a poorly-argued piece costumed in the garb of science, but as I just gave a phone interview to the <a href="http://dailycal.org/" target="_blank">Daily Cal</a> (UC Berkeley’s student-run newspaper) on this topic, I thought I’d summarize my comments in this post too.<br />
<br />
While Berkeley (and most of our peer institutions) makes ongoing and vigorous efforts, through a panoply of programs, to improve the representation of women and underrepresented minorities (URMs) in STEM fields, I believe such programs can only work if they are rooted in the <i>relationships </i>that structure students’ day-to-day lives on campus and collectively form an institutional culture. Some of those relationships are among students themselves—as peers in a course, between student and TA, or as part of a student-run club or academic group. Other relationships are between students and faculty—as course instructors, research advisors, or sponsors/liaisons to student-run activities.<br />
<br />
In my faculty role, I try to do two things: <b>lead by example </b>in my interactions with students, and <b>listen </b>to them.<br />
<br />
<b>Leading by example. </b>There’s quite a bit we can do in our courses and research projects, such as:<br />
<br />
<ul style="text-align: left;">
<li>Invite women and URM guest speakers, both to talk about their experience of being part of an underrepresented demographic in tech and simply to give interesting technical talks.</li>
<li>If we supervise GSIs, or are the faculty sponsor of a student group, DeCal course, or other student-centric activity, we can make it a point when engaging with those students to bring up unconscious bias and the importance of improving diversity in our field. Especially for GSIs, their behavior, language, and demeanor matter because of their position of authority.</li>
<li>We can make students aware of gender/ethnic diversity results directly relevant to the course when possible. In my <a href="http://cs169.saas-class.org/" target="_blank">software engineering project course</a>, I encourage project teams to make an effort to include women, as research on teams has found the presence of women to be a strong predictor of team success. Indeed, a <a href="https://arstechnica.com/information-technology/2016/02/data-analysis-of-github-contributions-reveals-unexpected-gender-bias/" target="_blank">study of GitHub code repositories found</a> that women's contributions to code bases were more likely to be accepted, <i style="font-weight: bold;">unless</i> you could tell from their profile that they were women! I also often introduce bits of computing history (it’s my hobby) relevant to the lecture topics; most students aren’t aware that the “masculinization” of computing didn’t begin in earnest until the 1960s, and that until then <a href="http://mariehicks.net/blog/?p=606" target="_blank">virtually all programmers were women</a>. In CS375, a required GSI orientation and training course that I’ve co-taught, we do an in-class activity based on <a href="https://www.youtube.com/watch?v=NW5s_-Nl3JE" target="_blank">Google’s unconscious-bias training</a> that directly demonstrates our proneness to unconscious bias.</li>
</ul>
<br />
<br />
<div>
<b>Listening.</b> As faculty, we can decide to allocate time to participate in programs designed to support women and URMs entering the field. One example is the Berkeley <a href="https://eecs.berkeley.edu/cs-scholars" target="_blank">CS Scholars</a> program, of which I'm a faculty sponsor. When I do get to interact with women or underrepresented minorities, I <i><b>listen </b></i>to what they have to say about their experience so that I know what to fix. Was there a difficult interaction with a professor? with a GSI? with a fellow student? Unless we ourselves have had the experiences that many women or minorities have had in computing (and given the problem statement, most of us haven’t), this is the only way to reliably understand what to work on.</div>
<div>
<br /></div>
<div>
Department- and University-supported diversity programs are great, and leaders like Patterson, Hennessy and Klawe have to stand up and point out the problems in posts like Damore's. That is a necessary but not sufficient part of fixing the problem. The “mentality of a field” is based on the values embodied in the relationships among people in that field, and as faculty it’s part of our responsibility to bring those values to every student relationship in which we participate.</div>
<div>
<br /></div>
<br />
<div>
<br /></div>
<br /></div>
Our techno-fetishism problem (2017-08-24)<div dir="ltr" style="text-align: left;" trbidi="on">
A frequent meme, both in the <a href="http://saas-class.org/" target="_blank">Software Engineering class</a> I teach and during my open office hours (when any student can come for advice or discussion on anything), is what appears to be a premature fear of lack of scalability when developing Internet services/Web apps. This concern is often hard to disentangle from <i>techno-fetishism</i>—the often irrational desire to use the latest rockstar tech, without really understanding whether it solves a problem you have, while offering the flimsy rationalization that “we need to use it for {scalability | dependability | having a good user experience | responsiveness}.”<br />
<br />
There are various manifestations of this meme, including:<br />
<ul style="text-align: left;">
<li>"We want to use {Mongo | Cassandra | your favorite NoSQL DB} because relational databases don't scale."</li>
<li>"We want to write our server in Node.js because synchronous/blocking/task-parallel appservers don't scale." (Weaker version: replace "don't scale" with "require more servers".)</li>
<li>"Our app is going to be <b>huge</b>, and we'd like your advice on how to achieve high scalability and 24x7 dependability across our soon-to-be-worldwide user base."</li>
</ul>
<div>
These concerns are well-intentioned; many of these students are old enough to remember how MySpace crashed and burned partly because of its inability to get these things right, leaving a vacuum into which Facebook immediately stepped. The rest is history.<br />
<br />
But since that time, the tools ecosystem has come a <i>long </i>way, computers have gotten way faster, and server cycles and storage have dropped in price by one or more orders of magnitude.</div>
<div>
<br /></div>
<div>
My challenge is therefore to help these students understand two things clearly:</div>
<div>
<ol style="text-align: left;">
<li>No technology will help you if your fundamental design prevents scaling (your app is poorly factored and/or cannot be divided into smaller services that can be scaled separately).</li>
<li>If your app <i>is </i>well-factored, these days you can get surprisingly far with conventional RDBMS-backed stacks on commodity hardware. </li>
</ol>
<div>
Ozan Onay makes this point well in his <a href="https://blog.bradfieldcs.com/you-are-not-google-84912cf44afb" target="_blank">post</a> "You are not Google," which incidentally includes quotes from my Berkeley colleague Prof. Joe Hellerstein. As Onay points out:<br />
<blockquote class="tr_bq">
"As of 2016, Stack Exchange [the forum system of which StackOverflow is just one part] served 200 million requests per day, backed by just four SQL servers: a primary for Stack Overflow, a primary for everything else, and two replicas."</blockquote>
</div>
</div>
<div>
In other words, the highest-traffic interactive Web app used by developers all over the world uses the most mundane (in many developers’ view) of technologies—most SQL servers support this kind of master/slave replication by simply tweaking a config file, no special app programming required. Similarly, as of a couple of years ago, all of <a href="http://pivotaltracker.com/" target="_blank">Pivotal Tracker</a> was hosted on a single RDBMS instance.<br />
<br /></div>
<div>
A more sarcastic view of the same meme is put forth in <a href="https://www.youtube.com/watch?v=bzkRVzciAZg" target="_blank">one of my favorite xtranormal movies</a> on Node.js. Go ahead, watch it. It’s only a few minutes long. You’ll thank me someday.</div>
<div>
<br /></div>
<div>
The basic message is the same: Most new technologies, languages, frameworks, and so on evolved to fill a specific technical need, and especially if they evolved in a large company, they probably fill a technical need that you don't have, and that you may never have.<br />
<br />
There is a subtext to the message too: it is relatively rare that a new tool or framework embodies a fundamentally new idea about software engineering. The event-loop framework used by Node.js has been around since the beginning of time; indeed, the original Mac OS required <i>all </i>apps to be written this way (in part because the original Mac used a microprocessor that didn’t support virtual memory, making true preemptive multitasking impossible), which was super painful. OS designers created threads to make this pain go away; threading systems automate exactly the bookkeeping that Node programmers are expected to do by hand. <i>And modern threading systems running on modern hardware are damn fast.</i></div>
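To make the contrast concrete, here is a tiny Ruby sketch (illustrative only; the task names and structure are invented for this example). The first half hand-rolls a cooperative event loop the way Node-style code does, chopping one task into callbacks; the second half writes each task as straight-line code and lets the stdlib threading system do the interleaving:

```ruby
# Hand-rolled event loop: the programmer splits task A into two callbacks
# so that task B can run in between. This is the bookkeeping the
# event-loop style demands of you.
log   = []
queue = []
queue << proc { log << "A1"; queue << proc { log << "A2" } }
queue << proc { log << "B" }
queue.shift.call until queue.empty?
# log is now ["A1", "B", "A2"]: cooperative interleaving, done by hand.

# Threaded version: each task is plain straight-line code; the threading
# system handles the interleaving for us.
results = Queue.new
threads = %w[A B].map { |name| Thread.new { results << "task #{name} done" } }
threads.each(&:join)
# results now holds both completion messages, in whatever order the
# scheduler chose.
```

The point is not that one style is wrong, but that the "novel" style asks the programmer to do work the runtime could do instead.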
<div>
<br /></div>
<div>
Finally, and most importantly, <i>no </i>tool or framework can save you from a bad design. Some frameworks try to help you by effectively legislating some design decisions out of existence (e.g. Rails all but mandates a Model-View-Controller architecture), but there's still plenty of room to shoot yourself in the foot design-wise.<br />
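To see that separation concretely, here is a deliberately tiny, framework-free Ruby sketch of the MVC split that Rails enforces (all class, method, and data names here are invented for illustration): the model owns the data, the view owns presentation, and the controller merely mediates.

```ruby
# Model: owns the data and nothing else.
Movie = Struct.new(:title, :rating)

# View: owns presentation; knows nothing about where movies come from.
module MovieView
  def self.render(movies)
    movies.map { |m| "#{m.title} (#{m.rating})" }.join("\n")
  end
end

# Controller: mediates; fetches model objects and hands them to the view.
class MoviesController
  def initialize(movies)
    @movies = movies
  end

  def index
    MovieView.render(@movies)
  end
end

puts MoviesController.new([Movie.new("Up", "PG")]).index
# prints: Up (PG)
```

Mixing these roles (say, formatting display strings inside the model) is exactly the kind of design foot-shooting a framework can discourage but not prevent.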
<br />
That will be your real obstacle to scaling and dependability—not the fact that you didn't go with Node.js or write your front end using React. So you can still wear that Node t-shirt, but please consider wearing it ironically.</div>
</div>
Armando Fox | 2017-06-16 | Teach yourself to code using online tutorials…but which ones?<div dir="ltr" style="text-align: left;" trbidi="on">
There’s an almost embarrassing plethora of “teach yourself to code” tutorials online—including reference sites like W3schools, interactive “try it in a browser” sites like CodeCademy and CodeHunt, beginner-friendly step-by-step activities such as those featured on Code.org, and many more. And that’s not even counting the dozens of online courses, both free (edX, Coursera) and paid (Udemy, Lynda), devoted to some aspect or other of learning to code.<br />
<br />
But which of these are actually likely to be effective in producing learning? Researchers who study how students learn have developed a rich body of well-tested ideas and recommendations around <i>evidence-based </i>teaching and learning—that is, the kind where you can unambiguously measure whether learning is taking place—but which if any of the online teach-yourself-to-code resources actually follow these principles?<br />
<br />
My colleague Prof. Andrew Ko at the University of Washington and his student Ada Kim have produced a very nice paper addressing this question, which they presented at SIGCSE 2017, the largest conference focused on computer science education. Kim and Ko studied 30 free and paid highly popular online resources for teaching yourself beginning coding skills.<br />
<br />
They have numerous findings, but the one that struck me most was that while many of these sites cover most of the same material (with varying degrees of beginner-friendly organization and quality of presentation), most focus on <i>how </i>to understand and practice a particular concept in coding, but provide little practice on identifying <i>when and why </i>to apply that concept in real programming situations. While Kim and Ko stop short of making “hard” recommendations, they find that the tutorials from <a href="http://code.org/">Code.org</a>, and online games such as <a href="http://helpgidget.org/" target="_blank">Gidget</a>, <a href="http://lightbot.com/" target="_blank">Lightbot</a>, and <a href="http://codehunt.com/" target="_blank">CodeHunt</a>, do the best job at incorporating elements of evidence-based teaching that have been shown to promote learning:<br />
<br />
<ul style="text-align: left;">
<li>Provide immediate and personalized feedback to the learner</li>
<li>Organize concepts into a hierarchy, with clear goals for learning each concept</li>
<li>Provide opportunities for mastery learning, i.e. practicing a concept repeatedly until understanding is complete</li>
<li>Promote meta-cognition—knowing <i>how and when </i>to use a concept, not just <i>how </i>to use it</li>
</ul>
<div>
With everyone trying to make a buck (or just get visibility/sell ad space) claiming to teach beginners to code, a critical look at these resources through the lens of Computer Science Education Research is a welcome breath of fresh air. <a href="http://dl.acm.org/citation.cfm?doid=3017680.3017728" target="_blank">Read their 6-page paper here</a>.</div>
<div>
<br /></div>
</div>
Armando Fox | 2017-03-02 | The 7 (or so) habits of highly successful projects<div dir="ltr" style="text-align: left;" trbidi="on">
Each time we offer the course based on the Engineering Software as a Service curriculum at UC Berkeley, students complete a substantial open-ended course project in teams of 4 to 6. Each team works with a nonprofit or campus business unit to solve a specific business problem using SaaS; the student organization <a href="http://bptech.berkeley.edu/">Blueprint</a> helps us recruit customers.<br />
<br />
Besides creating screencasts demonstrating their projects, students also participate in a poster session in which we (instructors) ask them to reflect on the experience of doing the project and how the techniques taught in the class helped (or didn't help) their success.<br />
<br />
Virtually all project teams praised BDD and Cucumber as a way of driving the development of the app and reaching agreement with the customer, and all agreed on the importance of the proper use of version control. Beyond that, there was more variation in which techniques different teams used (or wished in retrospect that they had used). Below is the most frequently heard advice our students would give to future students doing similar projects, in approximate order of popularity (that is, items earliest in the list were independently reported by the largest number of project teams).<ol>
<li><strong>Reuse, don't reinvent.</strong> Before coding something that is likely to be a feature used by other SaaS apps (file upload, capability management, and so on), take the time to search for Ruby gems or JavaScript libraries you can use or adapt. Even two hours of searching is less time than it takes to design, code and test it yourself.</li>
<br />
<li><strong>Start from a good object-oriented design and schema.</strong> Take time to think about the key entity types (models), the relationships among them, and how to capture those relationships in a schema using associations and foreign keys. A good design reduces the likelihood of a painful refactoring due to a schema change.</li>
<br />
<li><strong>Weekly meetings are not enough. </strong> Especially with the larger 6-person teams we used in the Fall 2013 course offering (237 students forming 40 teams), a 15-minute daily standup meeting helped tremendously in keeping everyone on track, preventing conflicting or redundant work, and informally sharing knowledge among team members who ran into problems. Teams that met only once a week and supplemented it with online chat or social-networking groups wished they had met more often.</li>
<br />
<li><strong>Commit to TDD early.</strong> Teams that relied heavily on TDD found its greatest value in regression testing: regression bugs were spotted immediately and could be fixed quickly. Teams that didn't commit to TDD had problems with regression when adding features or refactoring. Teams that used TDD also noted that it helped them organize not only their code, but their thoughts on how it would be used ("the code you wish you had").</li>
<br />
<li><strong>Use a branch per feature.</strong> Fine-grained commits and branch-per-feature were essential in preventing conflicts and keeping the master branch clean and deployment-ready.</li>
<br />
<li><strong>Avoid silly mistakes by programming in pairs. </strong> Not everyone paired, but those who did found that it led to higher quality code and avoided silly mistakes that might have taken extra time to debug otherwise.</li>
<br />
<li><strong>Divide work by stories, not by layers. </strong> Teams in which one or a pair of developers owned a story had far fewer coordination problems and merge conflicts than teams that stratified by layer (front-end developer, back-end developer, JavaScript specialist, and so on) and also found that all team members understood the overall app structure better and were therefore more confident when making changes or adding features.</li>
</ol>
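The TDD habit (item 4) can be made concrete with minitest, which ships with Ruby: write the test first, describing "the code you wish you had," then implement just enough to make it pass. The Cart class below is hypothetical, invented purely for illustration:

```ruby
require 'minitest/autorun'

# Step 1: the test, written first. It documents the API we wish we had,
# and later serves as a regression test when we refactor.
class CartTest < Minitest::Test
  def test_total_sums_item_prices
    cart = Cart.new
    cart.add('book', 10)
    cart.add('pen', 2)
    assert_equal 12, cart.total
  end
end

# Step 2: just enough implementation to make the test pass.
class Cart
  def initialize
    @items = []
  end

  def add(name, price)
    @items << [name, price]
  end

  def total
    @items.sum { |_name, price| price }
  end
end
```

Once a test like this exists, a refactoring that breaks <code>total</code> is caught immediately, which is precisely the benefit the teams above reported.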
There you have it—the seven habits of highly successful projects, distilled from student self-reflections from approximately sixty projects over two offerings of the course. We hope you find them helpful!</div>
Armando Fox | 2016-12-19 | “Towards a new hacker ethic”<div dir="ltr" style="text-align: left;" trbidi="on">
I just read the transcript (and saw slide deck) of a great talk, <i><a href="http://opentranscripts.org/transcript/programming-forgetting-new-hacker-ethic/" target="_blank">Programming is Forgetting: Toward a New Hacker Ethic</a> </i>by Allison Parrish at the 2016 Open Hardware Summit.<br />
The whole talk is worth watching/reading, but the part I liked best is Parrish’s reformulation of the “hacker ethic” laid out in Steven Levy’s book <i>Hackers </i>in the 1980s. Levy set out to chronicle the cultural history of hacking (the good kind, i.e. building and tinkering with computing and technology systems, as opposed to maliciously subverting them) and he claimed to summarize the “hacker ethic” in the following four points:<br />
<ul style="text-align: left;">
<li>Access to computers should be unlimited and total</li>
<li>All information should be free</li>
<li>Mistrust authority; promote decentralization [of control and content]</li>
<li>Hackers should be judged by their hacking [skills], not “bogus” criteria such as degrees, age, race, or position</li>
</ul>
<div>
The gist of Parrish’s talk is that while the above principles are noble, the anecdotes Levy recounts of hacker behavior are often inconsistent with the above points, including a noteworthy anecdote in which some hackers “exercised their ethic” in a way that dismissively interfered with the work of Margaret Hamilton, who would go on to invent the term “software engineering” and to be the lead architect of the guidance computer software for the Apollo missions that landed the first humans on the moon.</div>
<div>
<br /></div>
<br />
For me the best part of the talk was Parrish’s proposed reformulation of the “hacker ethic” in a way that takes the form of questions that creators of technological systems should ask themselves as they think about deploying those systems. I doubt I can improve on her phrasing, so I’ll quote the talk transcript directly:<br />
<blockquote class="tr_bq">
“…my ethic instead takes the form of questions that every hacker should ask themselves while they’re making programs and machines. So here they are.</blockquote>
<blockquote class="tr_bq">
Instead of saying access to computers should be unlimited and total, we should ask “Who gets to use what I make? Who am I leaving out? How does what I make facilitate or hinder access?”</blockquote>
<blockquote class="tr_bq">
Instead of saying all information should be free, we could ask “What data am I using? Whose labor produced it and what biases and assumptions are built into it? Why choose this particular phenomenon for digitization or transcription? And what do the data leave out?” </blockquote>
<blockquote class="tr_bq">
Instead of saying mistrust authority, promote decentralization, we should ask “What systems of authority am I enacting through what I make? What systems of support do I rely on? How does what I make support other people?” </blockquote>
<blockquote class="tr_bq">
And instead of saying hackers should be judged by their hacking, not bogus criteria such as degrees, age, race, or position, we should ask “What kind of community am I assuming? What community do I invite through what I make? How are my own personal values reflected in what I make?”</blockquote>
A few weeks ago a Medium post went viral in which developer Bill Sourour discussed “<a href="https://medium.freecodecamp.com/the-code-im-still-ashamed-of-e4c021dff55e#.8mc652kkv" target="_blank">The code I’m still ashamed of</a>”. At the direction of his manager, he had written code for a website that was posing as a general-information website where you could take a quiz to determine what prescription drugs were recommended for your particular symptoms and condition. In fact, though, the website was effectively an advertisement for a specific drug, and no matter what your responses to the quiz, the recommendation would always be the same—you needed this company’s drug. (A young woman later killed herself due to depression attributable in part to consuming the drug.) <a href="http://www.businessinsider.com/programmers-confess-unethical-illegal-tasks-asked-of-them-2016-11" target="_blank">Business Insider reported</a> that Sourour’s post had triggered a storm of “confessions” on Reddit from other engineers who were ashamed of having done similar things under duress, and includes some pointed comments from software thought leader “Uncle Bob” Martin such as “We are killing people.” He warns us that the Volkswagen emissions-cheating scandal was probably just the tip of the iceberg, and that even though in this case the CEO was ultimately held accountable (which doesn’t always happen), “it was software developers who wrote that code. It was us. Some programmers wrote cheating code. Do you think they knew? I think they probably knew.”<br />
<br />
Uncle Bob goes on to lament that coding bootcamps rarely include any required material on software ethics, and I'm beginning to fear we don't do enough at Berkeley either. In my work as a college instructor, I do have to deal with breaches of academic integrity of various sorts, from straight-ahead plagiarism to students paying freelancers to do their homework to students presenting false documentation about medical emergencies to avoid taking an exam. Disturbingly often, when these students are confronted with evidence of their actions, their only remorse seems to be that they were caught, and I find myself wondering whether they are the software writers who will go on to insert “cheat code” into a future consumer product. We do have a required Engineering Ethics course and there is a <a href="http://www.acm.org/about/se-code" target="_blank">software engineering ethics code endorsed by the Association for Computing Machinery</a>, but I worry that our ethical training doesn't have sharp enough teeth. As Uncle Bob wrote, “We [software developers] rule the world, we just don’t know it yet.” We’d better start acting like it. Self-reflection questions like those proposed by Parrish would be a good place to start.</div>
Armando Fox | 2016-12-13 | Getting a good software internship in industry or academia<div dir="ltr" style="text-align: left;" trbidi="on">
I hire summer students all the time to work on software projects related to my research and teaching at Berkeley, and I have students coming to me frequently asking for advice on either getting a good internship or selecting among various offers. Leaving aside the (very important) nontechnical aspects of your interview or job choice, here's my technical advice from the point of view of what I look for in a software hire (and, not coincidentally, what I teach in <a href="http://sites.google.com/site/ucbsaas">Berkeley's software engineering course</a>):<br />
<br />
<strong>Have a story for testing.</strong> Testing and debugging software is much harder than writing it. The question you should be prepared to answer is not "Do you test?" but rather "<em>How</em> do you test?" Do you have tests that can be run programmatically? Do you have any idea how much of your code is covered by tests?<br />
<br />
<strong>Demonstrate that you know the ecosystem.</strong> The developer ecosystem, roughly centered around GitHub, now includes continuous integration (Travis, Jenkins), project tracking (Pivotal Tracker), code quality measurement (CodeClimate), and more. How do you use these tools?<br />
<br />
<strong>Have you worked in a team? </strong>Software development is rarely a solo endeavor anymore; most interesting software is built by teams. What practices does your team use to organize its activities (scrum, standups)? To manage simultaneous development (branches, fork-and-pull)? To communicate and stay on same page about the project (Slack, Basecamp)?<br />
<br />
<strong>Be a full-stack developer. </strong>What can I do with a front-end developer who can't write the back end? That's like someone who can build the front of my house but not the side walls or stairs. What I need is a house, even if a very simple one. Similarly, even a back-end developer must be able to get my app out to the end user in some way, even if the user interface is modest and simple.<br />
<br />
<strong>Get good at picking up new languages and frameworks.</strong> This is more relevant for full-time jobs, but I'd rather hire someone who can learn new things fast than someone who's an expert on whatever framework I happen to be using right now, since new frameworks and languages come along all the time. How would you demonstrate that you can learn things fast?<br />
<br />
<strong>Understand Unix.</strong> The single most influential set of ideas in modern server and client operating systems derives from the Unix programming model. The ability to quickly put together simple tools to do a job is vital to developer productivity. If I ask you "Given a set of large text files, count <em>approximately</em> how many distinct email addresses they contain," and your first question is "In what language," we're done. One plausible answer is the shell command:<br />
<br />
<pre> grep -E -o '\b\w+@\w+\b' filenames | sort | uniq | wc -l</pre>
<br />
<strong>Do your own projects. </strong> If the only software you write is in course projects, I'd wonder if you really love coding enough. Imagine trying to compete in a tennis tournament if the only time you had spent on the court was during your actual tennis lessons. Class projects just don't give you enough mileage to get really good at software; doing your own projects (extracurriculars, pro-bono work, hackathons, personal projects, contributing to an open-source project, whatever) is essential. Be prepared to answer "What is the coolest personal software project you've worked on and what made it exciting/challenging/a great learning experience?"<br />
<br />
<strong>Show me your code. </strong>As my colleague and GitHub engineer <a href="https://github.com/blog/1519-jesse-toth-is-a-githubber" rel="nofollow" target="_blank">Jesse Toth</a> has succinctly put it, <i>“Your code is your résumé.”</i> A résumé without evidence of coding skill is minimally useful. If you can’t make your GitHub public repos part of your code portfolio or add me as a view-only collaborator on some repos, at least send me a link to a tarball of code you’ve written.</div>
Armando Fox | 2016-12-11 | "Keeping the lights on" for ESaaS-built pro-bono software deployments<div dir="ltr" style="text-align: left;" trbidi="on">
Berkeley's <a href="http://cs169.saas-class.org/" target="_blank">software engineering course</a>, which I developed with Prof. Dave Patterson, has something important in common with the nonprofit <a href="https://agileventures.org/" target="_blank">AgileVentures, founded by Prof. Sam Joseph</a> (who also is the lead facilitator for our <a href="http://saas-class.org/" target="_blank">edX MOOC on Agile Development</a>).<br />
<br />
Both organizations allow developers-in-training (students in the case of Berkeley; professionals in the case of AV) to work on open source <i>pro bono </i>software projects in a mentored setting, usually for nonprofits and NGOs. Indeed, since 2012, Berkeley students have delivered/deployed <a href="http://projects.saas-class.org/" target="_blank">over 100 such projects</a>, many still in production use by the original customer.<br />
<br />
A perennial problem we've had, though, is what to do when each offering of the course ends. How can these nontechnical customers arrange for basic maintenance of their apps if they don't have any IT staff who can do this? Even if the customer wants future students to continue working on the software, it needs to be kept running until the next course offering.<br />
<br />
This semester (Fall 2016) we're trying something new. AgileVentures is introducing a "<a href="http://www.agileventures.org/nonprofit-basic-support" rel="nofollow" target="_blank">Nonprofit Basic Support</a>" membership tier that gives nonprofits basic support for maintaining these SaaS apps. For a very low monthly fee, an AgileVentures developer will be the maintenance contact for the app, ensure its "lights are kept on" (restart when needed, etc.), and advise the customer if the app needs other attention, for example, if the app's resource needs require it to be moved to a paid or higher hosting tier.<br />
<br />
The goal is to "keep the lights on" either until the next team of students or AV developers further enhances the app, or until the customer decides to take over maintenance of the app themselves (or move it to another contractor).<br />
<br />
Of course, a few customers don't need this service; they may already have in-house staff, or the app may be one that was already in use and for which our student developers just provided new features. But for the majority of customers who are nontechnical and may not even be able to afford in-house IT for maintaining these apps, we look forward to seeing how this experiment works out!<br />
<br /></div>
Armando Fox | 2016-09-29 | Agile DevOps<div dir="ltr" style="text-align: left;" trbidi="on">
If you're surprised at how frequently in this blog I mention articles from <a href="http://cacm.acm.org/" target="_blank">Communications of the ACM</a> ("CACM"), you're missing out. Especially if you're a student, membership is inexpensive and the flagship monthly magazine tends to be full of really good stuff relevant to both practice and research.<br />
<br />
Today I'm blogging about an article in the July 2016 issue (yes, I'm behind in my reading) on the "<a href="http://www.cacm.acm.org/magazines/2016/7/204027-the-small-batches-principle/fulltext" target="_blank">small batches principle</a>" for Dev/Ops tasks. The article is written by a Dev/Ops engineer at Stack Overflow, the now-legendary site where all programmers go to give and receive coding help, co-founded by Joel Spolsky, one of my software heroes and author of the excellent <i><a href="http://www.joelonsoftware.com/" target="_blank">Joel On Software</a> </i>blog and books.<br />
<br />
This article talks about various in-house tasks associated with deploying software, such as pre-release testing and hot-patching, building and running a monitoring system, and other tasks that this company (and many others) historically did once in a great while. The month during which a new release was being tested and deployed became known as "hell month" because of the magnitude and pain of the tasks.<br />
<br />
The article describes how Stack Overflow has migrated to a model of doing smaller chunks of work incrementally to avoid having to do very large chunks every few months; how they moved to a "minimum viable product" mentality (what is the simplest product you can build that will enable you to get feedback from the customer to validate or reject the features and product direction); how they adopted a "What can get done by Friday"-driven mentality, so that there would always be <i>some </i>new work on which their customers (in this case, the development and test engineers) could comment; and so on.<br />
<br />
Essentially, they moved to an Agile model of Dev/Ops: do work in small batches so each batch minimizes the risk of going off in the wrong direction; deploy incremental new changes frequently to stimulate customer feedback; thereby avoid painful "hell months" (possibly analogous to "merges from hell" on long lived development branches).<br />
<br />
Agile applies to a lot of things, but this article does a nice job of mapping the concepts to the Dev/Ops world from the pure development world.<br />
<br /></div>
Armando Fox | 2016-09-20 | Flipped classroom? No thanks, I'd rather you lecture at me<div dir="ltr" style="text-align: left;" trbidi="on">
As an early adopter and enthusiast of MOOCs, I've followed the "flipped classrooms" and "lecture is dead" conversations with some interest. Indeed, why would students attend lecture—and why would professors repeat last semester's lectures—when high-quality versions are available online for viewing anytime?<br />
<br />
This semester, I'm once again teaching <a href="http://cs169.saas-class.org/" target="_blank">Software Engineering at Berkeley</a> to about 180 undergraduates. (Past enrollments have been as high as 240, but we capped it lower this semester to make projects more manageable.) In the past, Dave Patterson and I have team-taught this class and we've developed a fairly well-polished set of lectures and other materials that are available as a MOOC on edX, <i><a href="https://www.edx.org/xseries/agile-development-using-ruby-rails" target="_blank">Agile Development Using Ruby on Rails</a>, </i>and which we also use as a <a href="https://en.wikipedia.org/wiki/Small_private_online_course" target="_blank">SPOC</a>.<br />
<br />
Last spring, by the end of the semester our lecture attendance had plummeted to about 15% of enrollment. We surveyed the students anonymously to ask how we could make lecture time more valuable. A majority of students suggested that since they could watch pre-recorded lectures, why not use lecture time to do more live coding demos and work through concrete examples? In other words, a classic "flipped lecture" scenario.<br />
<br />
So this semester I dove in feet-first to do just that. Less than 20% of my lecture time has been spent covering material already in the recorded lectures; instead, we made those available to students from the get-go. The rest has consisted of live demos showing the concepts in action, and activities involving extensive peer learning, of which I'm a huge fan. I've even started <a href="https://github.com/saasbook/flipped-demos" target="_blank">archiving the demo "scripts" and starter code for the benefit of other instructors</a>. The "contract" was that we would list which videos (and/or corresponding <a href="http://www.saasbook.info/" target="_blank">book sections</a>) students should read or watch <i>before </i>lecture, and then during the lecture time (90 minutes twice a week) the students would be well prepared to understand the demo and ask good questions as it went along.<br />
<br />
I told the students I would also sprinkle "micro-quizzes" into the lectures to spot-check that they were indeed reading/viewing the preparatory materials. The micro-quiz questions were intended to be simple recall questions that you'd have no trouble answering if you skimmed the preparatory material for the main ideas. (We use <a href="http://iclicker.com/" target="_blank">iClicker</a> devices that are registered to the students, so we can map both responses and participation during lecture to individual students.)<br />
<br />
Today in lecture, a bit more than 4 weeks into the course, I've officially declared the flipped experiment a failure. (Well, not a failure. A negative result is interesting. But it's not the result I expected.)<br />
<br />
Since we made the pre-recorded lectures available through the edX platform itself, and students have to log in with their Berkeley credentials to access the course, we can use edX Insights (the platform's built-in analytics) to see how many people are watching the videos.<br />
<br />
According to the edX platform's analytics, the typical video is partially viewed by about 20 people. Only 45 people have ever watched <i>any </i>video to completion. Remember, this is in a class of 180 enrolled students, whose previous cohort specifically requested this format.<br />
<br />
<b>Maybe people are watching the videos after lecture rather than before? </b>If they were, you'd expect video viewing numbers to be higher for videos corresponding to previous weeks, but they're not.<br />
<b><br /></b>
<b>Maybe people are reading the book instead? </b>If they were, the performance on the microquizzes should be nearly unimodal—they are simple recall questions that you cannot help but get right if you even skim the video <i>or </i>the book—but in fact the answer distributions are in some cases nearly uniform.<br />
<br />
<b>Maybe people already know the material from (e.g.) an internship? </b>One or two students did approach me after today's lecture to tell me that. I believe them, and I know there are some students like this. But we also gave a diagnostic quiz at the beginning of the semester, and based on its results, very few people match this description.<br />
<br />
<b>Maybe students don't have time to watch videos or read the book before lecture? </b>This is a 4-unit course, which means students should nominally be spending 12 hours per week total on it, of which lecture and section together account for only 4. The reading assignments, which can be done instead of watching the videos, average out to 15-20 pages twice a week, or 30-40 pages per week. Speaking as a faculty member leading an upper-division course in the world's best CS department at the world's best public university, I don't believe 30-40 pages of reading per week per course is onerous. Also, in past offerings of the course, we've polled students towards the end of the course to ask how many hours they are <i>actually </i>spending per week. The average has consistently been 8-10 hours, even towards the end where the big projects come due. So by the students' own self-reporting, there's 2-4 hours available to either do the readings or watch the videos.<br />
<br />
As you might imagine, planning and executing live demos requires a couple of hours of preparation per hour of lecture to come up with a demo "script", stage code that can be copy-pasted so students don't have to watch me type every keystroke, ensure the examples run in such a way as to illustrate the correct points or techniques, remember what to emphasize in peer-learning questions at each step of the demo, and so on. But it's discouraging to do this if at most 1/9 of the students are doing the basic prep work that will enable them to get substantive value out of the live demo examples.<br />
<br />
So at the end of lecture today I informed the students that I wouldn't use this format any longer—those who wanted to work through examples could come to office hours, which have been quiet so far—and I asked them to vote using the clickers on how future lectures should be conducted:<br />
<br />
(A) Deliver traditional lectures that closely match the pre-recorded ones<br />
(B) Cancel one lecture each week, replacing it with open office hours (in addition to existing office hours)<br />
(C) Cancel both lectures each week, replacing both with open office hours (ditto)<br />
(D) I have another suggestion, which I am posting right now on Piazza with tag #lecture<br />
(E) I don’t care/I’ll probably ignore lecture regardless of format<br />
<div>
<br /></div>
<div>
Below are the results of that poll. The people have spoken. Less work for me, but disappointing on many levels, and I feel bad for the 20-30 people who <i>do </i>prepare and who <i>have </i>been getting a lot of value out of the more involved examples in "lecture" that go beyond the book or videos. And even though it's less work for me, it doesn't seem like a great use of my time to give a live performance of material that's already recorded and on which I'm unlikely to improve.</div>
<div>
<br /></div>
<div>
(Note that the number of poll respondents adds up to 121, consistent with previous lectures this semester. So even in steady state at the beginning of the semester, 1/3 of the students aren't showing up, even though participation in peer-instruction activities using the clickers counts for part of the grade.)</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpdRM4iyJjhxPQiqh4ro5DrZTj5n3iJ0A0pu2vzma_GAS0NZLtlPhNGfchll6ybJhtdndtSszTVduoRuZVAskX1dGSDR8bzEVFgqBEZde_UmpMiIpJABKZKXIkD5STXV37IcacT2wHkqk/s1600/Screen+Shot+2016-09-20+at+11.32.40+AM.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="232" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjpdRM4iyJjhxPQiqh4ro5DrZTj5n3iJ0A0pu2vzma_GAS0NZLtlPhNGfchll6ybJhtdndtSszTVduoRuZVAskX1dGSDR8bzEVFgqBEZde_UmpMiIpJABKZKXIkD5STXV37IcacT2wHkqk/s320/Screen+Shot+2016-09-20+at+11.32.40+AM.png" width="320" /></a></div>
<div>
<br /></div>
<div>
<b>(Update: </b>I posted a link to this article on the Berkeley CS Facebook page, and many students have commented. One particularly interesting comment comes from a student who took my class previously: “When I took your class, as someone who rarely, if ever, went to lecture or prepared, the microquizzes helped me convince myself to do that. They weren't hard, so getting them wrong was just embarrassing to me. I was probably the best student I ever was until you stopped doing them my semester for reasons I can't recall and after that the whole thing fell apart for me.” So maybe I should just stick to my guns on the "read before lecture" part and make the microquizzes worth more than the ~10% they're currently worth...)<br />
<br />
A year or so ago, I approached the campus Committee on Curriculum and Instruction to ask whether they'd be likely to approve a "lab" version of this course, in which there is no live lecture (only pre-recorded), the lecture hours are replaced with open lab/office hours, and all the other course elements (teaching assistants, small sections, office hours for teaching staff, etc.) are preserved. They said that would likely be fine. It's sounding like a better and better deal to me given that the majority of students want me to spend my time doing something nearly identical to what I've done in past semesters. And if previous semesters are any indication—and this has been very consistent since I started teaching this course in 2012—lecture attendance will fall off about halfway through. (I won't be surprised if the rationale given is that the lectures are available to watch online, even though the data we're gathering this semester shows that's clearly not happening.)</div>
<div>
<br /></div>
In an ideal universe, maybe we'd have 2 versions of the course, one tailored to people who do the prep work and want a fast-paced in-depth experience in lecture and another for people who prefer to be lectured to in the traditional manner. But resources are finite, so we don't live in that ideal universe. And one could argue that we already <i>have </i>that second version of the course—it's on edX and is the highest-grossing and one of the most popular courses on the platform, and it's free.<br />
<br />
Instructors: What has your experience been flipping a large class?<br />
<br />
Students: What would you do?<br />
<br /></div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com9tag:blogger.com,1999:blog-5008974428698310534.post-91633520368894772702016-09-07T09:53:00.003-07:002016-09-07T09:53:44.877-07:00The hidden benefits of microservices<div dir="ltr" style="text-align: left;" trbidi="on">
If you know me or read my writings, you probably know that I've become a big fan of <i>Communications of the ACM,</i> the monthly magazine of the Association for Computing Machinery, the world's largest and most prestigious professional society for computing<i>. </i>The August issue had a very nice essay by one of the original Amazon engineers on the <a href="http://cacm.acm.org/magazines/2016/8/205046-the-hidden-dividends-of-microservices/abstract" target="_blank">hidden benefits of moving to a microservices-based architecture</a>.<br />
<br />
The appearance of the article is particularly timely since I just started teaching Engineering Software as a Service to about 200 Berkeley undergraduates this fall semester, and a service-oriented approach to thinking about SaaS is a cornerstone of the course.<br />
<br />
I'm particularly interested in an Amazonian's views since in the ESaaS course and accompanying <a href="http://www.saasbook.info/" target="_blank">book</a> we cite Amazon's early move to services as an example of a tech company being ahead of an important curve.<br />
<br />
The article is short and well worth a read, especially as it's written by an engineer who was there from the first days of the "monolithic" Amazon.com site and through the transition to the service-oriented architecture Amazon has today, including many services also made available publicly through AWS. Essentially, these benefits apply as long as your APIs are stable and documented, and you have a tight dev/ops team keeping the service reliably running:<br />
<br />
<ol style="text-align: left;">
<li>Permissionless innovation: no need to ask "gatekeepers" for permission to add/try new features, as long as the features don't interfere with existing use of the API.</li>
<li>Forced to design for failure: cross-service failures are hard to debug, so there's a strong motivation to "design for debuggability/diagnosis" so that many failure types become routine and even automatically recoverable at the individual microservice level.</li>
<li>Forced to disrupt trust boundaries: within small teams, mutual trust is easier since everyone can sign off on every commit. Large teams can't do this, and a microservices architecture both enables small teams and forces coming to terms with the fact that the same level of trust that exists within a team cannot always stretch across API boundaries. This is good for scaling an organization, as Conway's Law suggests ("Any organization will produce a system design that mirrors the organization's internal communication structure").</li>
<li>A service's developers are also its operators, so they are very close to their customers, whether those customers are end users or other microservices. The tighter customer feedback loop accelerates the pace of improvement in the service.</li>
<li>Makes deprecation somewhat easier: if you version the API it becomes clear who "really" cares about the legacy versions.</li>
<li>Makes it possible to change the schema or even the persistence model for data and metadata without permission from others.</li>
<li>Allows separating out particularly sensitive functions as microservices (e.g. handling sensitive data such as financial or health) and "concentrating the pain" (and focus) of how best to steward that data.</li>
<li>Encourages thinking more aggressively about testing from the outside as well as the inside: continuous deployment, "smoke tests" where an experimental feature is rolled out fractionally to see if anything breaks, and phased deployment of new features can all improve the precision of the test suite and result in lower time-to-repair when something breaks in production.</li>
</ol>
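As a concrete illustration of the deprecation point above, versioning the API lets a team instrument the legacy version and see who still depends on it before retiring it. Here's a minimal sketch in plain Ruby (a hypothetical dispatcher for illustration only, not code from the article):

```ruby
# Hypothetical versioned-API dispatcher (illustration only): requests to
# the legacy v1 endpoint still succeed but are counted per caller, so the
# team can see who "really" cares about the old version before deprecating it.
LEGACY_CALLERS = Hash.new(0)

def handle_request(version, caller_id)
  case version
  when "v1"
    LEGACY_CALLERS[caller_id] += 1   # track remaining legacy users
    { status: "ok", deprecated: true }
  when "v2"
    { status: "ok", deprecated: false }
  else
    { status: "error", error: "unknown API version" }
  end
end

handle_request("v1", "billing-service")
handle_request("v2", "search-service")
LEGACY_CALLERS.keys  # only legacy callers are listed here
```

Once the counter stays at zero for a while, the v1 handler can be removed without asking anyone's permission.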
<div>
As the author points out, moving from an existing monolithic app to microservices isn't easy (former Director of Engineering at Twitter, Raffi Krikorian, <a href="https://www.youtube.com/watch?v=lPAFuhSXBck&list=PLvJHhqSX0xZ_CBBO-LfMLly5-s9fy3eet&index=1" target="_blank">explained</a> what a big move this was for Twitter and why the "Rails doesn't scale" meme wasn't really an accurate description of what happened there). And unless your microservices architecture really does enable the above behaviors, you're probably not 100% of the way there. But this is clearly the way SaaS architecture is going, and will soon be its own new chapter in ESaaS.</div>
<br />
<br />
<br /></div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-3622732092629107932016-08-16T12:33:00.002-07:002016-08-16T12:50:20.403-07:00Encrypting application-level data at rest<div dir="ltr" style="text-align: left;" trbidi="on">
In a <a href="http://saasbook.blogspot.com/2016/08/keeping-secrets.html" target="_blank">previous post</a> I described how I keep sensitive config data such as API keys secret for a public app.<br />
<br />
What about data that is kept per-user, such as a password or API key for each user or other similar record in your databases?<br />
<br />
For that I use the <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span> gem. There is a <a href="http://www.kendrickcoleman.com/index.php/Tech-Blog/encrypting-data-using-attr-encrypted-with-rails.html" target="_blank">great post</a> on how to do this; if you're using Figaro as recommended in my earlier post, that gives you an easy place to store the symmetric-encryption key(s) used by <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span>, so instead of<br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">attr_encrypted :user, :key => ENV["USERKEY"]</span><br />
<div>
<br /></div>
<div>
you would instead say</div>
<div>
<br /></div>
<div>
<div>
<span style="font-family: Courier New, Courier, monospace;">attr_encrypted :user, :key => Figaro.env.USERKEY</span></div>
</div>
<div>
<br /></div>
<h3 style="text-align: left;">
Beware of <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span> and <span style="font-family: Courier New, Courier, monospace;">serialize</span></h3>
<div style="text-align: left;">
Be aware that if you are using <span style="font-family: Courier New, Courier, monospace;">serialize</span> to store (say) a hash or array in an ActiveRecord text field, there appear to be some weird interactions if you try to use <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span> on that field. What I have found to work is this:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<b>Before:</b></div>
<div style="text-align: left;">
<b><br /></b></div>
<span style="font-family: Courier New, Courier, monospace;">class User < ActiveRecord::Base</span><br />
<span style="font-family: Courier New, Courier, monospace;"> serialize :api_keys, Hash</span><br />
<span style="font-family: Courier New, Courier, monospace;">end</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<span style="font-family: Courier New, Courier, monospace;">User.new.api_keys # => {}</span><br />
<span style="font-family: Courier New, Courier, monospace;"><br /></span>
<div style="text-align: left;">
<b>After:</b></div>
<div style="text-align: left;">
<b><br /></b></div>
<span style="font-family: Courier New, Courier, monospace;">class User < ActiveRecord::Base</span><br />
<span style="font-family: "Courier New", Courier, monospace;"> attr_encrypted :api_keys, :key => Figaro.env.SECRET, \</span><br />
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"> :marshal => true</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;">end</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<span style="font-family: Courier New, Courier, monospace;">User.new.api_keys # => nil</span><br />
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
That last option instructs <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span> to marshal the resulting data structure before encrypting, and unmarshal after reading from the database and decrypting. However, whereas using serialize gives newly-created attributes a default value of a new instance of the serialized type (in this case, that would be the empty hash), this is not true with <span style="font-family: Courier New, Courier, monospace;">attr_encrypted</span>. To remedy this, if your app relies on the serialized value always being non-nil, I'd advise using an <span style="font-family: Courier New, Courier, monospace;">after_initialize</span> block to enforce the invariant that the attribute's default value is always an instance of the serialized class:</div>
<div style="text-align: left;">
<br /></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;">class User < ActiveRecord::Base</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"> attr_encrypted :api_keys, :key => Figaro.env.SECRET, \</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"> :marshal => true</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"><b> def ensure_is_hash ; </b></span><b style="font-family: "Courier New", Courier, monospace;">self.api_keys ||= {} ;</b><b style="font-family: "Courier New", Courier, monospace;"> end</b></div>
<div style="text-align: left;">
<b style="font-family: "Courier New", Courier, monospace;"> after_initialize :ensure_is_hash</b></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"><b> private :ensure_is_hash</b></span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"> # ...</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;">end</span></div>
<div style="text-align: left;">
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<span style="font-family: Courier New, Courier, monospace;">User.new.api_keys # => {}</span><br />
<div>
<span style="font-family: Courier New, Courier, monospace;"><br /></span></div>
<div style="text-align: left;">
<br /></div>
Et voilà, with the exception of the <span style="font-family: Courier New, Courier, monospace;">after_initialize</span> callback, your encrypted-at-rest attributes will now behave the same way as regular unencrypted attributes.<br />
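To see what the <span style="font-family: Courier New, Courier, monospace;">:marshal => true</span> option is accomplishing, here is a self-contained sketch of the marshal-then-encrypt round trip using only Ruby's standard library (this is NOT the gem's actual implementation; the key here stands in for <span style="font-family: Courier New, Courier, monospace;">Figaro.env.SECRET</span>):

```ruby
require "openssl"

# Minimal sketch of what :marshal => true does (NOT attr_encrypted's real
# code): Marshal the Ruby object to bytes, encrypt those bytes before they
# reach the database, then decrypt and un-Marshal on the way back out.
KEY = OpenSSL::Random.random_bytes(32)  # stand-in for Figaro.env.SECRET

def encrypt_attr(value, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").encrypt
  cipher.key = key
  iv = cipher.random_iv
  ciphertext = cipher.update(Marshal.dump(value)) + cipher.final
  { iv: iv, tag: cipher.auth_tag, data: ciphertext }  # what gets stored
end

def decrypt_attr(record, key)
  cipher = OpenSSL::Cipher.new("aes-256-gcm").decrypt
  cipher.key = key
  cipher.iv = record[:iv]
  cipher.auth_tag = record[:tag]
  Marshal.load(cipher.update(record[:data]) + cipher.final)
end

stored = encrypt_attr({ "github" => "some-api-key" }, KEY)
decrypt_attr(stored, KEY)  # round-trips the original hash
```

Note that, just as in this sketch, decrypting a record that was never written yields nothing, which is why the default-value behavior of <span style="font-family: Courier New, Courier, monospace;">serialize</span> is lost and the <span style="font-family: Courier New, Courier, monospace;">after_initialize</span> trick is needed.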
<br />
<br /></div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-12067394945890733802016-08-16T12:33:00.000-07:002016-08-16T12:52:29.351-07:00Keeping secrets<div dir="ltr" style="text-align: left;" trbidi="on">
Lately I've been involved with a bunch of apps that have to manage sensitive or semi-sensitive data, such as passwords, API keys, and so on. I've converged on a couple of ways of managing this data that I thought I'd share. Most of my apps are hosted on Heroku, but the advice here applies to other deployment platforms too.<br />
<br />
I'll distinguish two kinds of data: configuration-level data that is specified once for the whole app (for example, an API key for the app to access another microservice), and sensitive app-level data (for example, a password or other secret that needs to be stored for each user). In this post I describe how to store configuration-level secret data; a <a href="http://saasbook.blogspot.com/2016/08/encrypting-application-level-data-at.html" target="_blank">separate post</a> builds on this one to store app-level secret data.<br />
<br />
You'll need the <a href="https://github.com/laserlemon/figaro" target="_blank">Figaro gem</a> and a command-line-friendly installation of your favorite encryption package; I use <a href="http://gnupg.org/" target="_blank">GPG</a>.<br />
<h3 style="text-align: left;">
Use Figaro</h3>
<div>
For managing sensitive config data such as API keys, here's a methodology that observes two important guidelines:</div>
<div>
<ol style="text-align: left;">
<li>DRY: the secret data should be kept all in one place and nowhere else.</li>
<li>Sensitive data should never be checked into version control (e.g., GitHub), especially if the app is otherwise open source.</li>
</ol>
<div>
First, set up your app to use the Heroku-friendly <a href="https://github.com/laserlemon/figaro" target="_blank">Figaro</a> gem to manage the app's secrets. In brief, Figaro uses the well-known technique of accessing your app's sensitive config data as environment variables, but:</div>
</div>
<div>
<ul style="text-align: left;">
<li>It centralizes all secrets in a file <span style="font-family: "courier new" , "courier" , monospace;">config/application.yml</span></li>
<li>It lets you access them through a proxy object, so that environment variable <span style="font-family: "courier new" , "courier" , monospace;">FooBar </span>can be accessed as <span style="font-family: "courier new" , "courier" , monospace;">Figaro.env.FooBar. </span><span style="font-family: inherit;">This allows you to stub or override certain config variables for testing if you want, and also (more importantly) to specify different values of those variables for production vs. development environments. For example, many services, such as Stripe, let you set up two different API keys: a regular key, and a "testing" key that behaves like the regular key except that no financial transactions are actually performed. Using Figaro, your app doesn't have to know which key it uses, because the correct key values for each environment can be supplied in </span><span style="font-family: "courier new" , "courier" , monospace;">application.yml</span></li>
</ul>
<div>
<span style="font-family: inherit;">When you setup Figaro, it correctly adds </span><span style="font-family: "courier new" , "courier" , monospace;">config/application.yml</span><span style="font-family: inherit;"> to your </span><span style="font-family: "courier new" , "courier" , monospace;">.gitignore,</span><span style="font-family: inherit;"> since <i>this file containing secrets should not be versioned (at least not in cleartext).</i></span></div>
</div>
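The proxy idea in that second bullet can be sketched in a few lines of plain Ruby. This is a simplified stand-in, not Figaro's real implementation, and <span style="font-family: "courier new" , "courier" , monospace;">STRIPE_KEY</span> is a hypothetical variable name used only for illustration:

```ruby
# Simplified stand-in for Figaro's env proxy (not the gem's real code):
# attribute-style reads are forwarded to ENV, so tests can stub methods on
# the proxy object instead of mutating the process environment.
class FakeFigaroEnv
  def method_missing(name, *args)
    ENV.key?(name.to_s) ? ENV[name.to_s] : super
  end

  def respond_to_missing?(name, include_private = false)
    ENV.key?(name.to_s) || super
  end
end

ENV["STRIPE_KEY"] = "sk_test_123"  # hypothetical variable, for illustration
FakeFigaroEnv.new.STRIPE_KEY       # reads through the proxy
```

Because all reads go through one object, a test can replace that object (or stub a single method on it) to exercise the app with different config values.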
<h3 style="text-align: left;">
Encrypt & version your secrets file</h3>
<div>
Next, agree with the rest of your team on a symmetric key for encrypting the secrets file. You can then encrypt the file like so (this example uses GPG and the <i>bash </i>shell):</div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">export KEY=</span><i><span style="font-family: inherit;">your-secret-key-value</span></i></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">gpg --passphrase "$KEY" --encrypt --symmetric --armor \</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> --output config/application.yml.asc config/application.yml</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span></div>
<div>
<span style="font-family: inherit;">This will create the ASCII-armored encrypted file </span><span style="font-family: "courier new" , "courier" , monospace;">config/application.yml.asc</span><span style="font-family: inherit;">, which I then check into version control. Note that the security of this file relies on having chosen a good symmetric-encryption key.</span></div>
<h3 style="text-align: left;">
<span style="font-family: inherit;"><br /></span></h3>
<h3 style="text-align: left;">
<span style="font-family: inherit;">Make sure developers can generate the secrets file</span></h3>
<div>
<span style="font-family: inherit;">Of course, the config/application.yml file is now needed for your app to run, but only config/application.yml.asc exists in the repo. So any developer who clones the repo needs to know the value of $KEY, and when they clone the repo, they must manually create the secrets file from its encrypted version by performing the decrypt operation:</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">export KEY=</span><i>your-secret-key-value</i></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">gpg --passphrase "$KEY" --decrypt \</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> --output config/application.yml config/application.yml.asc</span></div>
</div>
<h3 style="text-align: left;">
<span style="font-family: inherit;">Make sure Heroku knows the secrets</span></h3>
<div>
<span style="font-family: inherit;">Figaro arranges to make all the info in config/application.yml available as environment ("configuration") variables on Heroku. Whoever has deploy access to the Heroku app can do this step:</span></div>
<div>
<span style="font-family: inherit;"><br /></span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">figaro heroku:set -e production -a </span><span style="font-family: inherit;"><i>name-of-heroku-app</i></span></div>
<h3 style="text-align: left;">
<span style="font-family: inherit;"><br /></span></h3>
<h3 style="text-align: left;">
<span style="font-family: inherit;">Make sure CI knows the secrets</span></h3>
<div>
<span style="font-family: inherit;">Finally, if you're using continuous integration (<i>you are, right?</i>) it probably also needs to be able to generate config/application.yml in order to run the tests. On Travis, I add a step to the "before-script" to do this:</span></div>
<div>
<div>
<br /></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;">before_script:</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> - gpg --passphrase "$KEY" --output config/application.yml \</span></div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"> --decrypt config/application.yml.asc</span></div>
</div>
<div>
<span style="font-family: "courier new" , "courier" , monospace;"><br /></span></div>
<div>
<span style="font-family: inherit;">Of course, this requires Travis to know the value of </span><span style="font-family: "courier new" , "courier" , monospace;">$KEY</span><span style="font-family: inherit;">, so you also need to go to your app's config variables in Travis and set the value for </span><span style="font-family: "courier new" , "courier" , monospace;">KEY</span><span style="font-family: inherit;"> manually. These steps are easily adapted for Semaphore or other CI environments—in general, you manually supply the CI environment with the symmetric key value, and you add a before-step to the build to generate the unencrypted secrets file from the encrypted one.</span></div>
<h3 style="text-align: left;">
When you change the contents of config/application.yml</h3>
<div>
If you add or modify secrets within <span style="font-family: "courier new" , "courier" , monospace;">application.yml</span>, you'll need to re-create and commit <span style="font-family: "courier new" , "courier" , monospace;">config/application.yml.asc</span>, and notify your developers that they must merge the new file and manually re-create <span style="font-family: "courier new" , "courier" , monospace;">config/application.yml</span> from it. You'll also need to re-run <span style="font-family: "courier new" , "courier" , monospace;">figaro heroku:set -e production</span> to populate the deploy with the new values.</div>
<div>
<br /></div>
<div>
That's it. This seems complex, but after the one-time setup, it's basically maintenance-free. In the next post I'll talk about encrypting data at rest per-user.</div>
<div>
<br /></div>
</div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-80556577294237812712016-07-19T11:25:00.001-07:002016-07-19T11:25:38.211-07:00Pair Programming MOOC Style - is it Agile?This summer the “Engineering Software as a Service” (ESaaS) MOOC got re-branded as “Agile Development using Ruby on Rails” (ADuRoR). This was a recommendation from edX and I felt it was probably a good move, since in the past there always seemed to be a few people surprised to find that the course was not about cloud computing per se, but about software engineering principles applied to web application development. The course was basically the same, although we were upgrading fully to Rails 4 and RSpec 3 and that was a fair amount of work in itself - more so in terms of updating materials than upgrading software. The big change we did make was to have pair programming become “required” for the first time, and have it assessed by peer review.<br />
<br />
We’d introduced “optional” pairing some two years earlier. Initially, each assignment carried a 10% bonus for submitting a video of pair programming on it, though we also accepted solo videos or written summaries discussing the pros and cons of pairing versus solo work. Part of our motivation for requiring more pair programming from the learners is our plan for a “part 3” or “project” capstone course. This will complete the ADuRoR Series, and will involve teams of MOOC learners collaborating on building software solutions for charities and non-profits around the world. We’re keen to ensure that this capstone course has teams that work effectively together. We strongly believe that pair programming is a critical part of an effective software development team. It’s not that you can’t develop great software without pair programming. It’s that you can’t work effectively in a team without good people and communication skills.<br />
<br />
Somebody who pair programmed for all the ADuRoR MOOC assignments would likely be better equipped to work effectively in a project team than someone who did not. That’s no guarantee, of course, but we need some mechanism to give our initial project teams a high chance of success. There’s also lots of evidence that pair programming has a big positive impact on learning to code.<br />
<br />
As it happens the Agile process itself does not mandate using pair programming, test driven development, scrums, or anything of the sort. Let’s just refresh our memory as to the <a href="http://www.agilemanifesto.org/principles.html">principles behind the Agile Manifesto</a>:<br />
<blockquote>
<i>Our highest priority is to satisfy the customer
through early and continuous delivery
of valuable software.</i></blockquote>
<blockquote>
<i>Welcome changing requirements, even late in
development. Agile processes harness change for
the customer’s competitive advantage.</i></blockquote>
<blockquote>
<i>Deliver working software frequently, from a
couple of weeks to a couple of months, with a
preference to the shorter timescale.</i></blockquote>
<blockquote>
<i>Business people and developers must work
together daily throughout the project.</i></blockquote>
<blockquote>
<i>Build projects around motivated individuals.
Give them the environment and support they need,
and trust them to get the job done.</i></blockquote>
<blockquote>
<i>The most efficient and effective method of
conveying information to and within a development
team is face-to-face conversation.</i></blockquote>
<blockquote>
<i>Working software is the primary measure of progress.</i></blockquote>
<blockquote>
<i>Agile processes promote sustainable development.
The sponsors, developers, and users should be able
to maintain a constant pace indefinitely.</i></blockquote>
<blockquote>
<i>Continuous attention to technical excellence
and good design enhances agility.</i></blockquote>
<blockquote>
<i>Simplicity–the art of maximizing the amount
of work not done–is essential.</i></blockquote>
<blockquote>
<i>The best architectures, requirements, and designs
emerge from self-organizing teams.</i></blockquote>
<blockquote>
<i>At regular intervals, the team reflects on how
to become more effective, then tunes and adjusts
its behavior accordingly.</i></blockquote>
There’s so much here, but as is clear, tests and pairing are not mentioned explicitly, nor is Scrum or Kanban, or many other things that are commonly associated with Agile. Personally I see the last “stanza” as the critical one. Regular reflection on how to become more effective, tuning and adjusting behaviour accordingly.<br />
<br />
This seems like extraordinarily good advice to me. I happen to really enjoy pair programming. Not everyone does, and it’s not right for every team; but the crucial thing is that at regular intervals you look back and say: we tried X, and it went well, went badly, or made no difference. Then you try a variant of X, throw out X, or continue X, as appropriate.<br />
<br />
Our hope with the project course is to have teams of learners operate in agile teams. Ultimately we want to empower those teams to evolve their own Agile process based on what works for them. However we’d very much like all those team members to have at least tried TDD, BDD, pairing etc. to inform their own Agile Process. We also believe that remote pair programming and remote scrum meetings will be a critical part of distributed project teams working effectively together. To the extent that you accept Google Hangouts (or similar video conferencing software) as a form of face to face conversation, then we can see this also fits in with the principles behind the Agile Manifesto.<br />
<blockquote>
<i>The most efficient and effective method of
conveying information to and within a development
team is face-to-face conversation.</i></blockquote>
One of the main challenges for pairing in our MOOC has been the logistics of helping online learners find pair partners. While we had over 500 successful pairing sessions during the MOOC, some students have struggled to find pair partners available at times convenient for them, particularly on the later assignments, since the number of learners attempting each assignment drops by an order of magnitude from the first assignment to the last. It’s also clear that the logistical overhead of arranging a pairing session is a real burden for those on tight schedules.<br /><br />
Our options going forward appear to include:<br />
<ol>
<li>reducing the amount of credit given to pairing, so it doesn’t feel as “mandatory”</li>
<li>improving the mechanism by which pair partners are found</li>
<li>improving the support and documentation of how to manage pairing</li>
</ol>
The danger with the first option is that it would likely reduce the number of people attempting to pair, further exacerbating the problem of finding pair partners. We certainly don’t want learners to become frustrated with attempting to pair program; we do believe that both pairing and reviewing other learners’ pairing videos are valuable learning experiences. The ultimate question is whether we can reduce the pairing logistics overhead sufficiently through options 2 and 3 …Sam Josephhttp://www.blogger.com/profile/10788506730233381803noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-6560589475169567162016-07-15T10:20:00.002-07:002017-01-02T15:35:47.420-08:00Are students/teams following the Agile process? How would you know?<div dir="ltr" style="text-align: left;" trbidi="on">
I teach a <a href="https://www.saasbook.info/about/ucbcs169" target="_blank">large project course</a> at Berkeley. Each semester, ~40 teams of 6 students each work on open-ended SaaS projects with <a href="http://projects.saas-class.org/" target="_blank">real external customers</a>. The projects complement an aggressive syllabus of learning Agile/XP using Rails; most of that syllabus is also available in the form of <a href="http://saas-class.org/" target="_blank">free MOOCs on edX</a> and is complemented by our inexpensive and award-winning <a href="http://www.saasbook.info/" target="_blank">textbook</a> [shameless plug].<br />
<br />
We require the teams to use iteration-based Agile/XP to develop the projects. During each of four iterations, the team will meet with their customer, use lo-fi mockups to reach agreement on what to do next, use Pivotal Tracker to enter the user stories they'll work on, use Cucumber to turn these into acceptance tests, use RSpec to develop the actual code via TDD, use Travis CI and CodeClimate to keep constant tabs on their code coverage and code quality, and do continuous deployment to Heroku. Everything is managed via GitHub—we encourage teams to use branch-per-feature and to use pull requests as code reviews. Most teams communicate using Slack; we advise teams to do daily or almost-daily 5-minute standups, in addition to longer "all hands" meetings, and to try pair programming and if possible promiscuous pairing. Some teams also use Cloud9 virtual machines to make the development environment consistent across all team members (we provide a <a href="https://github.com/saasbook/courseware/wiki/Setting-up-Cloud9" target="_blank">setup script</a> that installs the baseline tools needed to do the course homeworks and project). There are also two meetings between the team and their "coach" (TA) during each iteration; coaches help clean up user stories to make them SMART, and advise teams on design and implementation problems.<br />
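That workflow leans on a small-steps TDD rhythm: write a failing test, make it pass, refactor. Here is a minimal sketch of one such micro-cycle in plain Ruby with bare assertions (in the course the tests would be RSpec examples tied to a Cucumber scenario; the helper itself is a made-up example, not course code):

```ruby
# Step 1 (red): the assertions at the bottom were written first and failed.
# Step 2 (green): this is the simplest code that makes them pass.
# The helper is hypothetical: truncating a user-story title for display
# in an iteration report.
def truncate_title(title, max = 20)
  title.length <= max ? title : title[0, max - 3] + '...'
end

# The tests that drove the code above:
raise 'short titles pass through' unless
  truncate_title('Add login page') == 'Add login page'
raise 'long titles are cut to the max length' unless
  truncate_title('As a customer I want to export my data').length == 20
raise 'cut titles end with an ellipsis' unless
  truncate_title('As a customer I want to export my data').end_with?('...')
```

The refactor step would then clean up the implementation with the tests as a safety net.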
<br />
While the non-project parts of the course benefit from extensive autograding, it's much harder to scale project supervision. Even with a staff of teaching assistants, each TA must keep track of 6 to 8 separate project groups. Project teams are graded according to many criteria; some are qualitative and based on the substance of the meetings with coaches (there is a structured rubric that coaches follow, which we describe in the "Student Projects" chapter of the <a href="http://www.saasbook.info/instructors/resources" target="_blank">Instructor's Guide that accompanies our textbook</a>), but others are based on quantitative measurements of whether students/teams are in fact following the Agile process.<br />
<br />
In our <a href="http://acelab.berkeley.edu/" target="_blank">CS education research</a> we've become quite interested in how to provide automation to help instructors monitor these quantitative indicators, so that we can spot dysfunctional teams early and intervene.<br />
<br />
We've observed various ways in which teams don't function properly:<br />
<br />
<ul style="text-align: left;">
<li><b>Lack of communication:</b> Teams don't communicate frequently enough (e.g., they don't do standups) or communicate in a fractured way; for example, a team of six really becomes two teams of three who don't talk to each other. The result is "merges from hell", lack of clarity on who is doing what, etc.</li>
<li><b>Not using project-management/project-planning tools effectively: </b>One goal of Pivotal Tracker (or similar tools such as Trello or GitHub Issues) is to plan, track, prioritize, and assign work, so that it's easy to see who's working on what, how long they've been working on it, and what has already been delivered or pushed to production. Some teams drift away from using it, and rely on either ad-hoc meeting notes or "to-do lists" in Google Docs. The result is poor ability to estimate when things will be done and finger-pointing when a gating task isn't delivered and holds up other tasks.</li>
<li><b>Not following TDD: </b>Projects are required to have good test coverage (85% minimum, combined between unit/functional and acceptance tests), and we preach TDD. But some students never give TDD a fair shake, and end up writing the tests after the code. The result is bad tests that cover poorly-factored code in order to meet the coverage requirement.</li>
<li><b>Poor code quality: </b>We require all projects to report CodeClimate scores, and this score is part of the evaluation. Although we repeatedly tell students that TDD is likely to lead them to higher-quality code, another side-effect of not following TDD is poorly-factored code that manifests as a low CodeClimate score, with students then manually fixing up the code (and the tests) to improve those scores.</li>
</ul>
<div>
There are other dysfunctions, but the above are the main ones. </div>
<div>
<br /></div>
<div>
To help flag these, I've become interested in how to study not only teams and team dynamics, but also what analytics can tell us about the extent to which students are following the Agile process (or any process). We recently <a href="https://berkeleyx.berkeley.edu/wiki/teamwork_and_team_dynamics#measuring_behaviors_of_student_software_project_teams" target="_blank">summarized and discussed some papers</a> in our reading group that address this. (You can flip through the <a href="https://docs.google.com/presentation/d/1IadS1-rA4se2_zYR5al-rqrJPznGlCY7uECgOp_R-vo/edit#slide=id.g1536cf7f65_9_0" target="_blank">summary slides of our reading group discussion</a>, but it may help to read the summaries and/or papers first.)</div>
<div>
<br /></div>
<div>
One of these papers, <i>Metrics in Agile Project Courses, </i>considers a variety of metrics obtainable from tool APIs to evaluate how effectively students are following Scrum, especially by measuring:</div>
<div>
<ul style="text-align: left;">
<li>Average life of a pull request, from open to merge. “Optimal” is under 24 hours “based on our past experience with the course”, but very short lifetimes may also be bad, since they suggest the code wasn't reviewed thoroughly before merging.</li>
<li>Percent of merge requests with at least one comment or other activity (e.g., an associated task). This is similar to our metric of pull-request activity in ProjectScope.</li>
<li>Mean time between a CI (continuous integration) failure and the first subsequent successful build on the main development branch.</li>
<li>Number of deploys per iteration. (In their course, “deployment” == “customer downloads latest app”, but for SaaS we could just look at Heroku deploy history.)</li>
<li>All indicators are tracked week-to-week (or sprint-to-sprint or iteration-to-iteration), so instructors see trends as well as snapshots.</li>
</ul>
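Most of these indicators are one API call plus a little arithmetic. A hedged sketch in Ruby of the first two: the input records are canned hashes shaped like the GitHub API's pull-request JSON (`created_at`, `merged_at`, and `comments` are real field names; the numbers here are invented):

```ruby
require 'time'

# Average open-to-merge life (in hours) and percent of PRs with any comment,
# given pull-request records shaped like the GitHub API's JSON.
def pr_metrics(pulls)
  merged = pulls.reject { |pr| pr['merged_at'].nil? }
  hours  = merged.map do |pr|
    (Time.parse(pr['merged_at']) - Time.parse(pr['created_at'])) / 3600.0
  end
  { avg_life_hours:    hours.sum / hours.size,
    pct_with_comments: 100.0 * pulls.count { |pr| pr['comments'] > 0 } / pulls.size }
end

# Canned data standing in for GET /repos/:owner/:repo/pulls?state=closed
pulls = [
  { 'created_at' => '2016-07-01T10:00:00Z', 'merged_at' => '2016-07-01T22:00:00Z', 'comments' => 2 },
  { 'created_at' => '2016-07-02T10:00:00Z', 'merged_at' => '2016-07-03T22:00:00Z', 'comments' => 0 }
]
metrics = pr_metrics(pulls)
# (12 + 36) / 2 = 24.0 hours average; 1 of 2 PRs has comments (50%)
```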
In a more detailed paper (<i>How surveys, tutors, and software help to assess Scrum adoption in a classroom software engineering project</i>), the same group used a combination of student self-surveys, TA observation of team behaviors, and a tool they built called "ScrumLint" which analyzes GitHub commits and issues relative to 10 “conformance metrics” developed for their course, including, e.g., “percentage of commits in the last 30 minutes before the sprint-end deadline”. Each conformance metric has a “rating function” that computes a score between 0 (practice wasn't followed at all) and 100 (practice was followed very well); each team can see its scores on all metrics.</div>
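A “rating function” is just a map from a raw measurement to a 0-to-100 score. Here is a toy version of my own (not ScrumLint's actual function) for the last-minute-commits metric, where the score falls linearly as the share of deadline-rush commits rises:

```ruby
# Toy conformance rating: 100 = practice followed well, 0 = not at all.
# Input: percentage of the team's commits made in the 30 minutes before
# the sprint-end deadline.
def last_minute_commit_rating(pct_last_minute)
  score = 100 - 2 * pct_last_minute  # 50% last-minute commits already rates 0
  score.clamp(0, 100)
end

last_minute_commit_rating(0)   # => 100  (no deadline rush)
last_minute_commit_rating(10)  # => 80
last_minute_commit_rating(75)  # => 0   (clamped)
```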
<div>
<br /></div>
<div>
What's interesting is that they actually point to an Operations Research-flavored paper on a methodology for detecting nonconformance with a process (see the "Zazworka et al." reference in the summary for more details). The methodology helps you isolate a specific step or subprocess within a workflow and rigorously define what it means to "violate" it.</div>
<div>
<br /></div>
<div>
With <a href="http://agileventures.org/" target="_blank">AgileVentures</a>, we are trying to encapsulate some of this in a set of gems called ProjectScope that will be the basis of a SaaS app for Agile coaches trying to monitor multiple projects. Check back here periodically (or join AV!) if you are interested in helping out!</div>
<div>
<br /></div>
<br />
<br /></div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-68592770965341233762016-06-29T13:53:00.005-07:002017-08-27T15:16:26.098-07:00Pair programming with a bot partner?<div dir="ltr" style="text-align: left;" trbidi="on">
I just ran across a paper from 2010 with the provocative (and Googleable) title "The Impact of a Peer-Learning Agent Based on Pair Programming in a Programming Course." (IEEE Transactions on Education 53(2) May 2010)<br />
<br />
The premise is that beginning programmers interact with an intelligent tutoring system that alternates between a "driver" and "navigator/observer" role in pair programming. The overall workflow is that the programming task starts with a flowchart, and then code is written to implement what the flowchart shows. When the bot is in "driver" mode, it "asks for help" from the student (who has the role of navigator/observer) in figuring out the flowchart, by showing the student an incomplete flowchart and having her fill in the blanks. The bot then writes code for the flowchart, but makes deliberate mistakes (syntactic and semantic errors) and expects the student to correct them, so the student "learns by teaching".<br />
<br />
When the bot is in "navigator/observer" mode, it observes student-entered code and responds to various common errors, both from bug libraries and from instructor-supplied "keyword-triggered" messages corresponding to common mistakes or commonly-chosen suboptimal problem-solving strategies (so there's no AI in this part of it).<br />
<br />
(The bot's "navigator mode" was this group's previous work; what's new here is the bot alternating between the driver role and navigator role, even though the bot's implementation of those roles doesn't quite match pair programming—"pair programming inspired" might be more accurate).<br />
<br />
In both modes, simple Bayesian models are used to estimate the likelihood of students making a particular error, based on concept-dependence graphs and observed mastery of the student on predecessor nodes in the concept graph.<br />
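To give the flavor of that kind of estimate (this is a toy model of my own, not the paper's actual one): if you assume mastery of prerequisite concepts is independent, the chance of success on a dependent concept is a base success rate times the product of the prerequisite masteries, and the error likelihood is its complement:

```ruby
# Toy estimate of how likely a student is to err on a concept, given
# observed mastery (0.0..1.0) of each predecessor in the concept graph.
# Prerequisite masteries are assumed independent for simplicity.
def error_likelihood(base_success, prereq_masteries)
  1.0 - base_success * prereq_masteries.reduce(1.0, :*)
end

# A student solid on 'variables' but shaky on 'loops', attempting 'arrays':
p_err = error_likelihood(0.9, [0.95, 0.5])  # 1 - 0.9 * 0.95 * 0.5, about 0.57
```

A tutor could then spend its hints on the concepts with the highest estimated error likelihood.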
<br />
A limited randomized-controlled between-subjects trial, in which students used the system bracketed by pre- and post-testing, showed that students exhibited better transfer (p<.05) and retained material longer (p<.05) when using the system that switched roles periodically than when using the system that was always in "navigator mode". The effect size was large (0.79) for transfer and small (0.38) for retention.<br />
<br />
The student modeling isn't particularly sophisticated, and the "navigator mode" bot isn't a particularly impressive use of NLP or AI, but the idea of having a tutoring system switch between roles is intriguing. Conversation bots and program analysis have both made lots of progress since this paper was published; a real pair-programming chatbot for scaffolded programming assignments now seems within reach!<br />
<br />
<br />
<br /></div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com1tag:blogger.com,1999:blog-5008974428698310534.post-78410669604905900572016-06-10T07:32:00.000-07:002016-06-13T03:01:03.554-07:00Pair Programming Considered Harmful?<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="p1">
One of the students in our distributed teams course posted to the forum there asking </div>
<blockquote class="tr_bq">
<i>This course has been a fantastic introduction to distributed team management tools and techniques. I am curious, though, to read responses, defences or critiques, of pair programming. Are there effective ways to do distributed team management without it? I present this article as a starting point: not exactly a balanced and neutral viewpoint as its title makes clear, but one that may elicit response.</i> </blockquote>
<blockquote class="tr_bq">
<i><a href="http://techcrunch.com/2012/03/03/pair-programming-considered-harmful/">http://techcrunch.com/2012/03/03/pair-programming-considered-harmful/</a></i></blockquote>
<div class="p2">
I was all ready to get out the big guns in defence of pair programming, but despite the provocative title I found the article reasonably balanced and a very interesting read. Pair programming (like many other techniques) is no panacea.</div>
<div class="p2">
<span class="s1"></span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p1">
<span class="s1">You can read a summary of some more research on pair programming here:</span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p5">
<span class="s1"><a href="https://en.wikipedia.org/wiki/Pair_programming#Studies">https://en.wikipedia.org/wiki/Pair_programming#Studies<span class="s2"></span></a></span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p1">
<span class="s1">In my experience (and backed by various studies) pair programming works particularly well for learning developers, and for experienced developers when they are working on something new and complicated.</span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p1">
<span class="s1">However, I agree with the article author's conclusion:</span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<table cellpadding="0" cellspacing="0"><tbody>
<tr><td class="td1" valign="baseline"><br /></td><td class="td2" valign="middle"><blockquote class="tr_bq">
<span class="s1"><i>The true answer is that there is no one answer; that what works best is a dynamic combination of solitary, pair, and group work, depending on the context, using your best judgement. Paired programming definitely has its place. (Betteridge’s Law strikes again!) In some cases that place may even be “much of most days.” But insisting on 100 percent pairing is mindless dogma, and like all mindless dogma, ultimately counterproductive.</i></span></blockquote>
</td>
</tr>
</tbody>
</table>
<div class="p3">
You can do distributed team management without pairing, but without it you miss one important tool for helping your junior developers level up and distributing knowledge through the team.</div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p1">
<span class="s1">Dogma is the danger. Saying "never ever pair program" or "never ever work alone" is dangerous. One needs to achieve a good balance. The danger I see for a lot of companies and organisations is that they reject pair programming out of hand. They may even have open-plan offices, but everyone sits there with headphones on, each in their own little silo, potentially working in a highly counter-productive fashion.</span></div>
<div class="p3">
<span class="s1"><i></i></span><br /></div>
<div class="p1">
<span class="s1">I would say follow the Agile process: every week you tinker and try things. No pair programming this week? Let's see what happens if we do 100% pair programming next week. Everyone sick of pair programming? Let's all try mob programming for a week. And so on.</span></div>
<br />
<div class="p1">
<span class="s1">The critical thing is to keep reflecting on what is (and isn't) working for your team. Sometimes you need to be bloody-minded to make a difference. Other times you need to have an open mind. Keep reflecting to work out when you need which </span><span class="s4">:-)</span></div>
</div>
Sam Josephhttp://www.blogger.com/profile/10788506730233381803noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-26194932252162169512016-05-23T08:53:00.001-07:002017-08-27T15:17:32.229-07:00If you can only speak TDD, maybe you'll think TDD?<div dir="ltr" style="text-align: left;" trbidi="on">
I just finished reading <i><a href="http://www.amazon.com/gp/product/0312610491/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=0312610491&linkCode=as2&tag=httpwwwarmanc-20&linkId=O3OHCWPHECNUBEN5" rel="nofollow">Through the Language Glass</a></i>, an interesting popular-press exposition of the relationship between spoken language and perception, and in particular of the "weak version" of the <a href="https://en.wikipedia.org/wiki/Linguistic_relativity">Sapir-Whorf hypothesis</a>, according to which a speaker's use of language influences their mental processing.<br />
An example is given of the Guugu Yimithirr aboriginal language, in which all location-specifying speech acts use only absolute compass directions (NSEW) as opposed to relative directions. For example, if we were standing next to each other, instead of saying "I am standing to the right of Armando", you'd say "I am standing to the north of Armando" (or whatever direction it happened to be). And if we then rotated clockwise together and faced a different direction, you'd then have to say "I'm standing to the east of Armando." G-Y speakers have to think constantly about their orientation relative to the world, since communication about place would be impossible otherwise. Experiments are described whose provocative finding is that this makes it more difficult, e.g., for G-Y speakers to solve spatial puzzles in which the <i>relative </i>spatial relationships between objects are unchanged but their <i>absolute orientation </i>is changed.<br />
What does this have to do with Agile+SaaS? I've been looking for ways to more strongly encourage the students in the Berkeley course to really commit to TDD. (Many do, of course, but some still write tests "after the fact" only because they know we will be grading them on test coverage.) I wonder if one could devise a lab exercise in which students develop some functionality via pair programming, but they are <i>only </i>allowed to speak in terms of test results when discussing with each other. This would be an interactive (vs. autograded/solo) exercise, and the goal would be to get students to use a vocabulary that effectively <i>limits </i>their communication to describing things in terms of test results. E.g., "For this next line, the effect I want is that given a valid URL, it should retrieve a valid XML document." I know many of us already (strive to) maintain this mindset, but is anyone up for designing/scaffolding an exercise that would help us teach it to TDD initiates?</div>
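To make the idea concrete, that spoken sentence translates almost word-for-word into a test. A sketch in plain Ruby (in class it would be an RSpec example; the HTTP fetch is injected as a lambda so the sketch runs without a network):

```ruby
require 'uri'
require 'rexml/document'

# "Given a valid URL, it should retrieve a valid XML document" --
# the spoken test, written down before the code. The fetcher is
# injected so this is self-contained; a real version would pass
# something like ->(uri) { Net::HTTP.get(uri) }.
def fetch_xml(url, fetcher)
  uri = URI.parse(url)
  raise ArgumentError, 'not a valid URL' unless uri.is_a?(URI::HTTP)
  REXML::Document.new(fetcher.call(uri))
end

canned = ->(_uri) { '<departures><train>Richmond in 5 min</train></departures>' }
doc = fetch_xml('http://example.com/etd.xml', canned)
raise 'expected a valid XML document' unless doc.root.name == 'departures'
```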
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-26437648639115805802016-05-17T17:16:00.000-07:002017-08-27T15:34:25.590-07:00Amazon Echo is a fun way to practice "API life skills"<div dir="ltr" style="text-align: left;" trbidi="on">
Last week was my birthday, and I got an <a href="http://www.amazon.com/gp/product/B00X4WHP5E/ref=as_li_tl?ie=UTF8&camp=1789&creative=9325&creativeASIN=B00X4WHP5E&linkCode=as2&tag=httpwwwarmanc-20&linkId=AG7TG56IQATLQL6O">Amazon Echo</a><img alt="" border="0" height="1" src="https://ir-na.amazon-adsystem.com/e/ir?t=httpwwwarmanc-20&l=as2&o=1&a=B00X4WHP5E" style="border: none !important; margin: 0px !important;" width="1" />. It's a remarkably cool device, but of course the coolest thing is you can write your own apps for it. (Technically they are "Alexa apps" and not "Echo apps"—Alexa is the back end that runs the apps, the Echo is a specific device that acts as a client, but there are also mobile client apps.) Amazon has a nice development environment in which you essentially write a simple "grammar" for the phrases you want your app to understand, and some "event handlers" that get called when particular phrases are recognized. You can write the event handlers in Node.js, Python, or Java. (No Ruby, but hey, I'm ecumenical and Python is just fine.)<br />
Students of Part 2 of ESaaS know that I have strong feelings about the use of Node.js for server-side code. (<a href="http://www.unlimitednovelty.com/2012/08/debunking-nodejs-gish-gallop.html">This post</a> does a great job of summarizing my views, as does this <a href="https://www.youtube.com/watch?v=bzkRVzciAZg">mildly NSFW xtranormal video</a>.) Nonetheless, Amazon's developer documentation starter examples are mostly in Node, so I decided to give it a fair shake.<br />
<br />
I decided to make an Alexa app that gives me upcoming departure information for <a href="http://bart.gov/">BART</a>, the Bay Area's rapid transit system, which I use daily to get to work. They have an <a href="http://api.bart.gov/">RESTful XML API</a> in which simple HTTP GETs return an XML data structure you can parse. OK, simple enough. All I would need to do in Node is do a simple GET of a URL, and find some XML parsing library so I can parse the result.<br />
<br />
What does the code look like in Node.js to do a simple GET? <a href="https://gist.github.com/armandofox/1a797f3ba58af2bfa9dbe8d3ac60f32c">It looks like this</a>:<br />
<script src="https://gist.github.com/armandofox/1a797f3ba58af2bfa9dbe8d3ac60f32c.js"></script>
<br />
<br />
As a less-experienced JavaScript programmer, I had to stare at it for a while to understand its flow, and then I realized <i>my God, all it does is fetch a JSON data structure using an HTTP GET call.</i> Does anyone else see what is wrong with this picture? What is <i>app code </i>doing trying to respond to individual HTTP-level events that occur at the transport level? This clearly violates the "A" in the SOFA principle I teach in ESaaS: <i>a function should stick to a single level of <u>a</u>bstraction.</i> Plus the code is butt-ugly. I'm no more impressed with Node than I was at the beginning of the exercise.<br />
<br />
Here's the <a href="https://gist.github.com/armandofox/a8899d74a4db63019f075dd8a7738a34">code I wrote in Python</a> to do the same thing (using BART's API):<br />
<br />
<script src="https://gist.github.com/armandofox/9133ca194a163c888db5037783ff9520.js"></script>
<br />
<br />
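For contrast, here is the same kind of task factored so each function sticks to a single level of abstraction, sketched in Ruby (the ESaaS house language; the XML shape loosely mirrors BART's etd response, and the transport step is stubbed with canned data so the sketch is self-contained):

```ruby
require 'rexml/document'

# Each function stays at one abstraction level (the "A" in SOFA):
# transport, parsing, and presentation never mix.

def fetch_departures_xml(_station)
  # Transport level only. A real version would be a one-liner like
  # Net::HTTP.get(URI(...)) against BART's etd endpoint; canned here.
  '<root><etd><destination>Richmond</destination><minutes>5</minutes></etd>' \
    '<etd><destination>Fremont</destination><minutes>11</minutes></etd></root>'
end

def parse_departures(xml)
  # Parsing level only: XML in, plain Ruby data out.
  doc = REXML::Document.new(xml)
  doc.get_elements('//etd').map do |etd|
    { destination: etd.elements['destination'].text,
      minutes: etd.elements['minutes'].text.to_i }
  end
end

def announce(departures)
  # Presentation level only: data in, one spoken-style sentence out.
  departures.map { |d| "#{d[:destination]} train in #{d[:minutes]} minutes" }.join('; ')
end

announce(parse_departures(fetch_departures_xml('EMBR')))
# "Richmond train in 5 minutes; Fremont train in 11 minutes"
```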
But I digress. The point is writing Alexa apps is a great way to build up your API-fu, since the best Alexa apps will probably receive concise speech commands, interact with several back-end APIs, and present concise speech output. And you can write and test the apps without having a device, using Amazon's web interface to test how the app responds to specific phrases. Plus the apps can be pretty fun to use, and the device makes me feel like I live in a home of the vaguely near future.<br />
So try out writing some Alexa apps—you can even deploy them free on Lambda, a "managed AWS" service in which you deploy handlers rather than full VMs (essentially).<br />
But for the love of all that is good…write them in Python.</div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com1tag:blogger.com,1999:blog-5008974428698310534.post-69759830925818182352016-05-17T15:56:00.001-07:002016-05-17T15:56:16.433-07:00Looking under the hood in cloud datacenters<div dir="ltr" style="text-align: left;" trbidi="on">
There's a great <a href="http://cacm.acm.org/magazines/2016/5/201605-borg-omega-and-kubernetes/fulltext">article</a> in the May 2016 issue of Communications of the ACM authored by a group of Googlers, including former Berkeley PhD student David Oppenheimer (now the tech lead on Kubernetes) and my PhD advisor Eric Brewer (Berkeley professor on leave to be Google's VP Infrastructure), on the evolution of cloud-based management systems at Google. They were doing "containerization" of cloud servers before almost anyone else, and contributed to the Linux containerization code. The article distills over 10 years of designing and working with different approaches to containerizing software to share the hardware in Google's datacenters, and some lessons learned and still-open problems. A couple of key messages:<br />
<br />
<ul style="text-align: left;">
<li>The "unit of virtualization" is no longer the virtual machine but the application, since modern systems like Kubernetes and Docker virtualize the app container and the files and resources on which it depends, but actually hide some details of the underlying OS, instead relying on OS-independent APIs. Apps are therefore less sensitive to OS-specific APIs and easier to port across containers.</li>
<li>As a result of having both resource containers and filesystem-virtualization containers map to an app (or small collection of apps that make up a service), troubleshooting and monitoring naturally have affinity to apps rather than (virtual) machines, simplifying devops.</li>
<li>A "label" scheme in which a given instance of an app has one or more labels simplifies deployment and devops as well. For example, if one instance of a service is misbehaving, that instance can (eg) have the "production" label removed from it, so that monitoring/load balancing services will exclude it from their operations but leave it running so it can be debugged <i>in situ</i>.</li>
</ul>
<div>
The article also points to still-open problems, including (surprise!) configuration (every configuration DSL eventually becomes Turing-complete and they're all mutually incompatible), dependency management (related to configuration, but also introduces new complications, e.g., what if service X depends on Y but Y must first go through a registration or authentication step before it can be deployed), and more.</div>
<div>
<br /></div>
<div>
Overall a really interesting read that suggests, ironically, that the programmer-facing view of cloud datacenters is actually a "back to the 1990s" view of managing a collection of individual apps with APIs (remember ASPs?) rather than managing a collection of either virtual or physical machine images.</div>
</div>
Armando Foxhttp://www.blogger.com/profile/08989194878245993131noreply@blogger.com0tag:blogger.com,1999:blog-5008974428698310534.post-41836639270719790302014-05-21T11:27:00.000-07:002017-07-31T11:27:40.489-07:00Dealing with exploding demand for computing education<div dir="ltr" style="text-align: left;" trbidi="on">
UW CS chair Ed Lazowska <a href="http://news.cs.washington.edu/2014/05/19/responding-to-the-explosion-of-student-interest-in-computer-science/">makes the case nicely</a> that interest in CS is booming, and that it's not likely to be just a flash in the pan, so how are we going to meet this demand and deal with larger course sizes?<br />
<br />
In terms of offering a high-quality course experience to ever-larger numbers of students, my view (doubtless shared by others) is that we cannot just do the same thing we've been doing but at larger scale; we need to find new ways to do things. Neither can we plop the students in front of a MOOC with minimal instructor guidance, though clearly those technologies have a role to play.<br />
<br />
In the interest of trying to figure some of this out, I collected some thoughts based on the Berkeley experience. I presented these as a position statement at the recent Dagstuhl seminar on MOOCs, but I'm also on an internal committee in Berkeley CS charged with figuring out how to expand access to the CS curriculum while keeping quality high, so these thoughts are percolating there as well. Comments welcome.<br />
<h2>
Separating Scalable from Non-Scalable Elements to Refactor Residential Course Delivery</h2>
<br />
<div dir="ltr">
In Fall 2013, enrollment in UC Berkeley’s introductory CS course exceeded 1,000 for the first time, reflecting a trend in top CS departments for exploding demand for CS.</div>
<br />
<span style="line-height: 1.5em;">Most residential courses are offered in a “one size fits all” model: a single lecture, a single set of assignments or labs that everyone does, multiple recitation sections that generally cover the same material as one another, a lab session where everyone works on the same lab with TA’s on hand to help, and perhaps small-group tutoring sessions or office hours as the only place where students with differing needs get more customized attention. The assumption is that this combination of elements serves the “mass of the distribution” of students, stragglers can squeak by with the additional support of office hours or tutoring, and superstars can use their spare time to do undergraduate research.</span><br />
<br />
Yet as enrollments grow, we observe that (a) not all aspects of offering the course scale equally well in terms of instructor resources, and (b) the “outliers” (superstars and stragglers) become more and more pronounced, with the former representing underutilized talent and the latter an inordinate drain on instructional staff time.<br />
<br />
I argue that MOOC technology, in the format of a SPOC (Small Private Online Course), can provide the leverage needed to refactor course resources to improve how we serve these exploding populations by separating the “scalable” from “non-scalable” elements of the course and by carefully thinking about the role MOOCs can play in each part.<br />
<h2 dir="ltr">
Things that scale well</h2>
<div dir="ltr">
Certain MOOCs have demonstrated that some elements of a course scale inexpensively and well:</div>
<ul>
<li dir="ltr"><div dir="ltr">
Sophisticated automatic grading, such as is used in many CS MOOCs, allows nontrivial assignments to be automatically graded nearly instantly (vs. handed back a week later) and with finer-grained feedback than TA’s or readers could provide.</div>
</li>
<li dir="ltr"><div dir="ltr">
In some domains, automatic problem generation builds on autograding to support mastery learning. Both autograding and problem generation can take advantage of inexpensive public cloud computing rather than requiring extensive on-campus infrastructure.</div>
</li>
<li dir="ltr"><div dir="ltr">
Free video distribution (YouTube) makes lectures cheap to distribute.</div>
</li>
<li dir="ltr"><div dir="ltr">
Well-structured Q&A forums such as StackOverflow or Piazza allow students to help each other, with occasional intervention from teaching staff.</div>
</li>
</ul>
<h2 dir="ltr">
Things that scale poorly, and how to address them</h2>
<div dir="ltr">
<b>Larger variance across student cohorts</b>. Especially for courses with “broad but shallow” prerequisites, students’ levels of preparation may vary, and during the course, different cohorts of students may need help with different topics. <b>Mitigation:</b> “Just-in-time” flexible deployment of teaching staff.</div>
<br />
<div dir="ltr">
<b>High-end outliers (superstars):</b> Superstar students are often an underused resource. Moreover, an instructor whose efforts are all expended on simply managing a large course is unable to identify these superstars (and sometimes they’re not obvious) and cultivate them further (invite them as research assistants, etc.). <b>Mitigation:</b> Identify ways to formally recognize and train these students to be effective in helping their peers.</div>
<br />
<div dir="ltr">
<b>Low-end outliers (stragglers):</b> Stragglers can take a disproportionate amount of staff time, leading to an Amdahl’s Law-like effect that limits course scaling. <b>Mitigation:</b> Combine JIT deployment of teaching staff with SPOC resources that enable mastery learning.</div>
<br />
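The Amdahl’s Law analogy can be made quantitative. If automation (autograding, videos, forums) speeds up a fraction p of per-student staff effort by a factor s, but straggler support in the remaining fraction resists automation, the overall reduction in staff effort is capped at 1/(1−p), no matter how good the automation gets. A toy calculation (the specific numbers are hypothetical):

```python
def course_speedup(p_scalable, s):
    """Amdahl's Law applied to staff effort: a fraction p_scalable of
    per-student staff time is accelerated by factor s (autograding,
    videos, forums); the remainder (e.g. straggler support) is not."""
    return 1.0 / ((1.0 - p_scalable) + p_scalable / s)

# If 90% of staff time is automatable, even near-infinite automation
# yields at most a 10x reduction in effort; the non-scalable straggler
# support becomes the bottleneck for course scaling.
for s in (10, 100, 1e9):
    print(f"speedup with automation factor {s:g}: {course_speedup(0.9, s):.2f}")
```

This is why the mitigation above attacks the serial fraction itself: mastery-learning SPOC resources shrink the per-straggler staff time rather than just speeding up the already-scalable parts.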
<div dir="ltr">
<b>Learning activities that are interaction-intensive:</b> Especially in engineering courses, most real synthesis learning occurs in design projects, but these are grading-intensive and interaction-intensive. <b>Mitigation:</b> Refactor the course into multiple courses, each of which concentrates either on “high scale” or “high touch” but not both, and resource the courses differently.</div>
<br />
<div dir="ltr">
The main argument is that we must both create new staff roles and amplify the productivity and leverage of those roles using MOOC technology, thereby increasing overall teaching productivity.</div>
<h2 dir="ltr">
Rethinking Teaching Staff: New Roles & Flexible Deployment</h2>
New teaching roles are already being created, but in an ad hoc, post hoc fashion. We need to formalize these roles, resource them, and train the people who fill them. One possible factoring of roles (with some overlap among them, e.g. some community stewards may also be contributors) might be as follows:<br />
<ol>
<li dir="ltr"><div dir="ltr">
Authors/Creators make an initial set of editorial decisions that result in a narrative through a body of material, analogous to textbook authors. We have argued that the combination of SPOCs and e-books is a promising formula for packaging such content.</div>
</li>
<li dir="ltr"><div dir="ltr">
Core Contributors create additional material such as assignments, assessments, tutorials and other scaffolding within the author-provided framework, which may be used and adapted by many instructors downstream. They might, for example, participate in direct teaching during the school year and spend the summer doing course development or analyzing the previous semester’s learning outcomes data. The SPOC delivery model and the software supporting it make it more convenient than ever to use SPOCs for “curricular technology transfer.”</div>
</li>
<li dir="ltr"><div dir="ltr">
Community Stewards become experts on the materials and help other instructors (including TA’s and other teaching staff) work effectively with the much larger range of materials available in a SPOC (compared to traditional textbooks).</div>
</li>
<li dir="ltr"><div dir="ltr">
Course Managers keep courses running smoothly by keeping tabs on student cohorts to understand who’s having difficulty where, marshalling and deploying instructional staff to respond to those needs, responding to escalations from instructional staff, and so on. The course manager must have domain expertise comparable to a very strong student, and may also be responsible for handling violations of academic integrity in the course.</div>
</li>
<li dir="ltr"><div dir="ltr">
Discussion Leaders facilitate small-group discussions (analogous to today’s recitation sections) using a combination of their own and provided materials.</div>
</li>
<li dir="ltr"><div dir="ltr">
Tutors work with small groups of students on specific material with which they need help.</div>
</li>
<li dir="ltr"><div dir="ltr">
Students/learners also help each other in person (e.g. “guerrilla lab sections”) and virtually (e.g. discussion forums, hangouts).</div>
</li>
</ol>
People in these new roles will have to be formally recognized and trained.<br />
<br />
<b>Recognition</b>: Residential campuses currently recognize relatively few “official” teaching roles, such as lecturer, TA/head TA, lab TA/lab assistant, reader/grader. New roles should be recognized with a combination of academic credit and stipends. For example, graduate TA’s receive a stipend and tuition waiver, but are also required to complete certain teaching activities as part of their PhD preparation. At some schools, undergraduates can also be either regular or lab TA’s. At Berkeley, a third mechanism allows undergraduates to receive credit for “Teaching in EECS” even if they can only commit 2-4 hours a week (vs. the 10-hour minimum for regular TA’s).<br />
<br />
<b>Training (teaching skills)</b>: Many campuses’ current orientations and courses for teaching assistants focus on training “full” TA’s who will teach sections, create materials, grade assessments, conduct review sessions, and more. These courses therefore cover more than is necessary for some of the other roles. For example, “dealing with disruptive students in class” is not a topic that tutors would need much experience with. Some basic training for less-than-full-TA’s might cover:<br />
<div>
<ul style="text-align: left;">
<li>how to help students stuck on problems without giving away the answer</li>
<li>how to “coach” students to effectively use techniques such as pair programming or peer grading, both to evaluate each other’s work and to learn from the process of doing so.</li>
</ul>
<div dir="ltr">
<b>Training (orientation to the material):</b> Both Code.org and the NSF CS10K project aim to train high school teachers to deliver computer science courses. Not only will the courses themselves be delivered as SPOCs, but these projects are also creating “teacher training SPOCs” that will be combined with live and remote interaction (e.g. Google Hangouts) to train instructors in the use of the materials.</div>
<br />
<h2 dir="ltr">
Example 1: UC Berkeley/edX CS169x, Software Engineering</h2>
<div dir="ltr">
This MOOC was developed from a campus course whose enrollment had also been growing, and it features many of the elements that “scale well,” including rigorous but automatically graded programming assignments. Over 100,000 MOOC students have attempted the course and over 10,000 have earned certificates across five offerings. The course now has a facilitator, a faculty member at another university who became excited about the material after taking the course as a MOOC student. He marshals the volunteer community TA’s drawn from alumni of previous offerings, but is also a contributor who has created his own materials. He also stewards a community of classroom instructors using the material in their classrooms in a SPOC model.</div>
<br />
<div dir="ltr">
A SPOC is a Small Private Online Course, usually based on MOOC materials but with heavy involvement of the instructor “on the ground” in customizing and facilitating the course. I have argued elsewhere for the potential of this model and reported on successful initial trials using Berkeley MOOC materials in a SPOC setting at half a dozen universities, finding that different instructors use different subsets of the resources, some add their own, and some either don’t use our videos or use them to deepen their own understanding of the material before presenting it to their own students. Meanwhile, the course’s original authors continue to improve the foundational materials and textbook, relying on this network of support to efficiently disseminate those changes and create new materials around them.</div>
<h2>
Example 2: UC Berkeley CS61A, Great Ideas in Software Abstraction</h2>
<div dir="ltr">
This campus-based course, which has no corresponding MOOC as of this writing, is a rigorous introduction to the main paradigms of programming—procedural abstraction, data abstraction, functional, and logic. It is based on a transliteration into Python of Abelson & Sussman’s renowned Structure & Interpretation of Computer Programs. For the record-breaking 1100-student offering of the course in Fall 2013, the lecturer huddled with TA’s weekly to understand which topics students were having trouble with, and then deployed a subset of teaching staff to create and run “guerrilla sections” specifically covering those difficult topics. Combined with making his lectures available online in advance of the live lecture, this meant that most students didn’t attend lecture and most students didn’t attend the same sections. He also recruited star alumni from the previous semester to serve as tutors or lab assistants who committed only 2-4 hours per week in exchange for academic credit designated as “Teaching in EECS”, a teaching role that most CS courses have yet to exploit. These helpers knew the material, since they were alumni of the course, and received informal training on how to address common questions on homework and lab exercises without giving away the answers.</div>
<h2 dir="ltr">
Summary</h2>
<ol>
<li dir="ltr"><div dir="ltr">
The current “one size fits all” model of residential course delivery is a poor fit for exploding enrollments as well as for faculty productivity in an era of tightening budgets.</div>
</li>
<li dir="ltr"><div dir="ltr">
Separating the scalable from the non-scalable parts of a course allows the two to be resourced separately. The scalable parts can serve the bulk of the distribution, while the non-scalable parts can be resourced in a way tailored to the outliers, both the superstars and the stragglers.</div>
</li>
</ol>
MOOC technology in the form of curated SPOCs, with appropriate new teaching roles supporting the course, can play a role in both the scalable and non-scalable elements.<br />
<h2 dir="ltr">
Acknowledgments</h2>
These ideas come from conversations with the UC Berkeley Taskforce on Computer Science Curriculum (CS-TFOC), which includes David Culler, Dan Garcia, Björn Hartmann, David Wagner; John DeNero, Dave Patterson, and Andrew Huang (Berkeley); Mehran Sahami (Stanford); Saman Amarasinghe (MIT); Sam Joseph (Hawaii Pacific University, and lead facilitator for CS 169x on edX); and many others.</div>
</div>