Working in Public: The Making and Maintenance of Open Source Software
The cycle looks something like this: Open source developers write and publish their code in public. They enjoy months, maybe years, in the spotlight. But, eventually, popularity offers diminishing returns. If the rewards of maintaining code fail to outpace the costs, many of these developers quietly retreat to the shadows.
There are countless initiatives today aimed at helping more developers contribute to open source projects. These efforts are widely championed as “good for open source,” and they are frequently accomplished by tapping into a public sense of goodwill. However, in speaking to maintainers privately, I learned that these initiatives frequently cause them to seize with anxiety, because such initiatives often attract low-quality contributions. This creates more work for maintainers—all contributions, after all, must be reviewed before they are accepted. Maintainers frequently lack infrastructure to ...more
I started to see the problem is not that there’s a dearth of people who want to contribute to an open source project, but rather that there are too many contributors—or they’re the wrong kind of contributors. Open source code is public, but it doesn’t have to be participatory: maintainers can buckle under excess demand for their attention.
There is a long, and growing, tail of projects that don’t fit the typical model of collaboration. Such projects include Bootstrap, a popular design framework used by an estimated 20% of all websites, where three developers have authored over 73% of commits. Another example is Godot, a software framework for making games, where two developers respond to more than 120 issues opened on their project per week. [Figure: Bootstrap’s contributors in 2017, plotted by number of commits.] This distribution, where one or a few developers do most of the work, followed by a long tail of casual contributors, and ...
One study found that in more than 85% of the open source projects the researchers examined on GitHub, less than 5% of developers were responsible for over 95% of code and social interactions.
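As a rough illustration of how a concentration figure like this could be computed, the sketch below measures the share of commits written by the most active authors. The function name `top_share` and the sample commit list are hypothetical; they are not taken from the study or from any project's real history.

```python
from collections import Counter


def top_share(commit_authors, top_n):
    """Return the fraction of commits made by the top_n most active authors."""
    counts = Counter(commit_authors)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # most_common(top_n) yields (author, commit_count) pairs, busiest first.
    top = sum(n for _, n in counts.most_common(top_n))
    return top / total


# Hypothetical commit log: one entry per commit, holding the author's name.
commits = ["ana"] * 50 + ["ben"] * 30 + ["cy"] * 10 + [f"casual{i}" for i in range(10)]

print(f"Top 3 authors wrote {top_share(commits, 3):.0%} of commits")
```

In this made-up log, three authors account for most commits while ten one-time contributors form the long tail, which is the shape the highlights above describe.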
In contrast to big, monolithic software projects that give rise to persistent communities, npm packages are designed to be small and modular, which leads to fewer maintainers per project and a transitory relationship with the code they write. Viewed in this light, the lack of contributors today reflects an adaptation to changing environmental circumstances, wherein the relationship between a maintainer, contributors, and users is lighter, more transactional.
Casual contributors are often aware that they have little context for what is going on behind the scenes of the project. But more importantly, they do not want to spend time familiarizing themselves with a project’s goals, roadmap, and contribution process. These developers primarily see themselves as users of the project; they do not think of themselves as part of a “contributor community.”
The problem facing maintainers today is not how to get more contributors but how to manage a high volume of frequent, low-touch interactions. These developers aren’t building communities; they’re directing air traffic.
Open source has always served as a vanguard for the rest of our online behavior. In the late 1990s, open source was the poster child for a hopeful vision of widespread public collaboration, then dubbed “peer production.” Because open source software was starting to outpace software sold by companies, economists believed that these developers had achieved the unthinkable. As the internet floated peacefully in its embryonic state, it really did seem possible that the world might eventually be powered by the efforts of self-organized communities. But over the last twenty years, open source ...more
Richard Stallman, the MIT hacker who’s generally credited with starting the free software movement, was inspired to launch the GNU project, a free software operating system, in 1983, after attempting to customize a Xerox printer in MIT’s AI Lab and finding that he could not access or modify its source code. Stallman wanted to liberate code from proprietary use. The term “free” refers to being able to do what you want with the code, rather than the code being free of charge. (Thus the oft-repeated phrase, attributed to Stallman, “Free as in freedom, not free as in beer,” and the occasional use ...more
It’s hard to overemphasize the extent to which freedom of code matters to free software developers. Bradley Kuhn, a leader of the nonprofit umbrella organization Software Freedom Conservancy, likens his lifestyle to that of a vegetarian. Just as a vegetarian doesn’t eat meat, to the extent that he’s able, Bradley doesn’t use proprietary software. This means not using websites like Twitter, Medium, YouTube, or GitHub. Code, like livestock, needs liberation from humanity, even at the expense of personal convenience.
The term “hacker” was popularized by author Steven Levy, who memorably captured a portrait of the 1980s hacker generation in the book Hackers: Heroes of the Computer Revolution. In Hackers, Levy profiles a number of well-known programmers of the time, including Bill Gates, Steve Jobs, Steve Wozniak, and Richard Stallman. He suggests that hackers believe in sharing, openness, and decentralization, which he calls the “hacker ethic.” According to Levy’s portrait, hackers care about improving the world, but don’t believe in following the rules to get there.
Hackers are characterized by bravado, showmanship, mischievousness, and a deep mistrust of authority. Hacker culture still lives on today, in the way that beatniks, hippies, and Marxists still exist, but hackers don’t capture the software cultural zeitgeist in the same way that they used to. The generational successor to hackers today might be cryptographers and those who dabble in information security: those who flirt with the law, and do so with a wink and a bow.
Focusing on the relationship between contributors and users, we can think of projects in terms of their contributor growth and user growth. This gives us four production models: federations, clubs, toys, and stadiums.
Federations are projects with high contributor growth and high user growth. These are the “bazaars,” first described by Eric S. Raymond, which we typically think of when we imagine an open source project. These projects are rare but impactful: just as most startups don’t end up like Facebook, most open source projects aren’t Linux. Although they comprise a small percentage of open source projects (less than 3%, according to one study), federations occupy the most mindshare due to the size of each project. Rust, Node.js, and Linux are all examples of federations.
Clubs are projects with high contributor growth and low user growth, leading to a roughly overlapping group of contributors and users. While there are fewer users overall, these users are more likely to participate as contributors. Astropy, for example, is a package that provides core functionality and tooling for those using Python for astronomy and astrophysics. While Astropy will never be used by most developers, the package’s narrow focus makes it easier to recruit contributors and maintain relevance among those for whom Astropy is extremely important.
Toys are projects with low contributor growth and low user growth. They’re probably the least interesting production model to analyze for the purposes of this book, because they are effectively personal projects. Toys are like a side project or a weekend project. Eventually, they might become more widely used, but in their current stage they’re just something that an individual developer enjoys tinkering around with for fun. For example, developer Andrey Petrov made a project called ssh-chat, a client that lets users chat through the Secure Shell (SSH) protocol. While the project has thousands ...more
Stadiums are projects with low contributor growth and high user growth. While they may receive casual contributions, their regular contributor base does not grow proportionally to their users. As a result, they tend to be powered by one or a few developers. Many widely depended-upon packages and libraries fit into this model, including webpack, Babel, Bundler, and RSpec. Stadiums are becoming increasingly commonplace today. In a stadium model, one or a few maintainers make decisions on behalf of a broader user base. Unlike a federation or club, whose communities are decentralized, a stadium’s ...more
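The four models above amount to a two-by-two grid over contributor growth and user growth. A minimal sketch of that grid follows; the function name and the boolean high/low inputs are assumptions made for illustration, since the text does not define numeric thresholds for "high" or "low" growth.

```python
def production_model(high_contributor_growth: bool, high_user_growth: bool) -> str:
    """Map the two growth dimensions onto the four production models."""
    if high_contributor_growth and high_user_growth:
        return "federation"  # e.g. Rust, Node.js, Linux
    if high_contributor_growth:
        return "club"        # e.g. Astropy
    if high_user_growth:
        return "stadium"     # e.g. webpack, Babel, Bundler, RSpec
    return "toy"             # e.g. a personal or weekend project


# Print the full grid, one line per combination.
for contributors in (True, False):
    for users in (True, False):
        print(contributors, users, production_model(contributors, users))
```

The example projects in the comments are the ones named in the highlights; deciding whether a real project's growth counts as high or low is left to whoever applies the grid.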
While it feels obvious today that we want to freely share the things we make, the early success of open source captivated scholars and economists because it defied everything we thought we knew about how and why people create. Open source developers were frequently characterized as “hobby” developers (most famously in Bill Gates’s 1976 “Open Letter to Hobbyists,” which we’ll get to later), because the assumption was that only companies could make “real” software. Even Carl Shapiro and Hal R. Varian’s Information Rules: A Strategic Guide to the Network Economy, a 1999 book widely regarded as ...more
Previously, our understanding of how and why people make things was modeled after Ronald Coase’s theory of the firm, which proposes that firms (i.e., companies, organizations, and other institutions with centralized resources) naturally emerge as a way to reduce transaction costs in the market. Coase would’ve told us that only companies make software because, from a coordination standpoint, managing the resources required to pull off such a feat would be most efficiently handled within the same organization. By contrast, the open source projects attracting attention in the late 1990s and ...
A few of the conditions that Benkler identifies as necessary to pull off commons-based peer production are intrinsic motivation, modular and granular tasks, and low coordination costs.
In his 1975 book The Mythical Man-Month, Fred Brooks tackles the problem of organizational design for teams building software. He cites an idea from computer scientist Harlan Mills, who suggests organizing developers like a surgical team, “rather than a hog-butchering team.” In the surgical-team model, there is a “chief programmer,” who, like a surgeon, sets the project’s specifications and design. The surgeon has a “copilot,” who serves as their confidante and right arm. Then there are a number of supporting roles, including someone who handles money and administration, someone who writes ...
The contributors to an open source project can be classified as either active or casual, which is often determined based on the frequency of their contributions.
Active contributors (also called “regular contributors” or “long-term contributors”) are considered members of the project, based on their reputation or the consistency of their contributions. They’re what we typically imagine when we think of open source contributors: a community of developers in which members are invested in one another and in the project.
Casual contributors are, generally speaking, those who have made one contribution or fewer (such as an unmerged pull request) to the project. However, it’s easier to identify them by their low affinity to the project. A casual contributor may have made multiple contributions, but this type of contributor’s defining characteristic is that they don’t feel a deeper connection to the contributor community, beyond a desire to see their own contributions merged. Viewed through the lens of membership, they are better understood as users than contributors.
To the untrained eye, writing software appears to be all about the new and shiny, free from the earthly troubles of working with atoms rather than imaginary bits. In practice, software ages quietly, in the shadows, and stubbornly refuses to die.
Firstly, software, once written, is never really finished. It might be feature-complete, but, in order to continue running, software almost always requires some sort of ongoing maintenance. At minimum, that might mean keeping dependencies up-to-date, but it might also mean things like upgrading infrastructure to meet demand, fixing bugs, or updating documentation. So-called “greenfield” projects—those where a developer gets to write software from scratch—are coveted for a reason. Most of the work that software developers do is not writing new code, but rather tending to the code that someone ...more
A second observation is that once software finds a set of users, it’s hard for it to ever really disappear. Someone out there is probably going to use that code for a very long time. Some of the oldest code ever written is still running in production today. Fortran, which was first developed in 1957 at IBM, is still widely used in aerospace, weather forecasting, and other computational industries. COBOL, another programming language, was first released in 1959. Network Time Protocol, used to synchronize time between computer systems, was initially developed in the early 1980s.
Software doesn’t die, because someone out there, someone its developers may not even be aware of, will continue to use it. The author Neal Stephenson once described Unix as “not so much a product as it is a painstakingly compiled oral history of the hacker subculture. It is our Gilgamesh epic . . . Unix is known, loved, and understood by so many hackers that it can be re-created from scratch whenever someone needs it.” Code is not a product to be bought and sold so much as a living form of knowledge.
Jacob Thornton, the developer who cocreated Bootstrap, suggested that open source is, instead, “free as in puppy”: Open-sourcing something is kind of like adopting a cute puppy. You write this project with your friends, it’s really great, and you’re like, “OK, like I’ll open-source it, it’ll be fun! Like, whatever, we’ll get on the front page of Hacker News.” . . . And it is! It’s super fun, it’s a great thing. But what happens is, puppies grow and get old, and pretty soon . . . your puppy’s kinda like a mature dog . . . . and you’re like, “Oh my god, so much time is required for me to take ...more
Software requires physical infrastructure to reliably serve a large audience without any downtime, security attacks, or interruptions in service. Today, these costs have mostly been handed off to central providers, but that doesn’t make them any less real. These companies work hard to make infrastructure costs feel invisible to the rest of us. Content is rarely hosted by users anymore. Publishing on Medium or GitHub, for example, means creators never even have to think about hosting costs. The cost of uploading a photo or video to Instagram or YouTube is paid for by the platform.
Werner Vogels, chief technology officer for Amazon and an architect of Amazon Web Services, describes how the marginal cost of physical infrastructure can become significant at scale: Under the covers these services are massive distributed systems that operate on a worldwide scale. This scale creates additional challenges, because when a system processes trillions and trillions of requests, events that normally have a low probability of occurrence are now guaranteed to happen and must be accounted for upfront.
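The effect Vogels describes can be put in numbers. Assuming each request independently triggers a rare event with probability p (independence is an assumption of this sketch, not a claim from the quote), the chance of seeing the event at least once in n requests is 1 - (1 - p)^n, which approaches 1 as n grows. The function name and the sample values below are hypothetical.

```python
import math


def prob_at_least_once(p: float, n: int) -> float:
    """Probability that an event with per-request probability p
    occurs at least once across n independent requests."""
    if p <= 0 or n <= 0:
        return 0.0
    if p >= 1:
        return 1.0
    # 1 - (1 - p)**n, computed with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n * math.log1p(-p))


# Hypothetical one-in-a-billion failure, observed at growing request volumes.
for n in (1_000, 1_000_000_000, 1_000_000_000_000):
    print(f"{n:>17,} requests -> {prob_at_least_once(1e-9, n):.6f}")
```

At small volumes the event is negligible; once the request count far exceeds 1/p, it is effectively certain, which is why it must be accounted for upfront.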
If software consumption were truly zero marginal cost, it would be just as easy for anyone else to maintain their own version of GitHub as it is for GitHub itself. But it’s far more efficient for a single platform to manage the code, security, infrastructure, support, and whatever else comes with maintaining a software product. Developers use GitHub over GitLab not just for the network effects but also for the former’s security and reliability. The same goes for why someone would use a Google product, like Gmail or Google Docs, over that of a startup. It costs money and manpower to do these ...more
User support is a significant marginal cost associated with software, plaguing not just open source developers but the biggest technology companies, which are still figuring out how to manage unprecedented levels of adoption. When user adoption is low, the cost of support feels trivially small, perhaps nonexistent. The vast majority of users will quietly download code without ever making themselves known. However, as adoption grows, the once trivial cost of support can become significant. Perhaps only 0.1% of users require support. If a company has 1,000 users, then only one user needs ...more
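The arithmetic in this passage scales linearly: expected support requests equal the number of users times the fraction who need help. The sketch below uses the 0.1% rate and the 1,000-user case from the text; the larger user counts and the function name are hypothetical additions for illustration.

```python
def expected_support_requests(users: int, support_rate: float = 0.001) -> float:
    """Expected number of users needing support at a given support rate."""
    return users * support_rate


# 1,000 users comes from the text; the larger counts are hypothetical.
for users in (1_000, 1_000_000, 100_000_000):
    needing_help = expected_support_requests(users)
    print(f"{users:>11,} users -> about {needing_help:,.0f} needing support")
```

The rate stays constant while the absolute workload grows with adoption, which is how a cost that once felt trivial can become significant.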
Eric S. Raymond once coined the aphorism “Given enough eyeballs, all bugs are shallow.” His point is that open source software presents an advantage over closed source software, because if more people can inspect the code it will increase the chance that more bugs will be discovered. The implication is that support can be handled in a fully decentralized manner that will distribute its costs among users. But as Fred Brooks wryly notes in his classic engineering book The Mythical Man-Month, first published two decades before Raymond made his claim, although “more users find more bugs,” this ...
Code is “cleanest” when it’s first released, because that’s the time at which developers are thinking about the project holistically and writing it from scratch. As more code is added incrementally, software starts to become unwieldy, like a building from the 1850s that’s had new rooms, plumbing, and electric wiring added piecemeal over the years.
When adding code to a project, it can be hard to prioritize decisions on a long time horizon. A critical bug or security vulnerability might require taking shortcuts to patch things up quickly, but eventually those short-term decisions start to add up. Developers refer to this as technical debt: making choices that are easier today but that cost time and money to address later on. Open source projects are particularly susceptible to technical debt because they accept contributions from developers who may not necessarily know one another, nor have full context on the project. Scope creep refers ...more
Refactoring is the process by which developers pay down technical debt: rewriting, simplifying, and otherwise cleaning up the codebase without changing its functionality. Much like editing a book versus writing it for the first time, refactoring code is often dreaded and unrewarding work. Since open source developers tend toward work that they find intrinsically motivating, not only are open source...
While static code does not change over time, code in active state is closely intertwined with its dependencies, meaning other code that runs with it. If software is a LEGO house, each of its dependencies can be thought of as a LEGO brick. An open source project might be one of those bricks, with a different project, maintained by another group of developers, forming another brick. Over time, code falls out of sync with its dependencies, becoming incompatible with newer versions of those dependencies. Even if your code doesn’t change, everything else around it changes. Code that hasn’t been ...more
One particularly challenging aspect of dependency management is security. Much like technical debt and refactoring, security vulnerabilities can be time-consuming to manage, with little upside for the developer, coupled with the fear of an extremely bad situation if they miss something important. In a commercial setting, developers are paid to deal with the things they don’t feel like doing. In open source, where so much work runs on personal motivation, security can easily fall by the wayside.
Maintenance is often classified as reactive work: it’s the minimum required to keep things running smoothly. But users’ needs change over time, too. Software must also change to meet these needs, or else risk becoming irrelevant. Clayton Christensen famously identifies and analyzes this problem in The Innovator’s Dilemma, the book in which he tries to understand why successful companies can be overtaken by new ones, even if they are doing well. By focusing too much on iterating upon their incumbent product, companies risk missing major opportunities for so-called “disruptive innovation,” which ...more
Zero marginal cost means that developers have free access to endless amounts of code. Any developer can find millions of repositories on GitHub for free. In addition to the educational benefits, having access to others’ code helps bring developers’ ideas into reality faster by reducing their fixed costs.
It’s cheaper to reuse existing software components than to write code from scratch, which also makes it possible for entrepreneurs to start software companies with fewer up-front costs. The entire software industry owes its financial success to leveraging this arbitrage. These benefits are passed down to software’s users, too. If software doesn’t cost much to make, developers can offer consumers more of their tools, toys, and applications at affordable prices. But software’s “zero marginal cost” property heavily favors its consumers. If software is free to consume, in terms of value accrued to ...more
Software producers have never really figured out how to sell code itself. Stratechery’s Ben Thompson makes a similar observation about the music industry: “The music industry was primarily selling plastic discs in jewel cases; the music encoded on those discs was a means of differentiating those pieces of plastic from other ones, but music itself was not being sold.” By tweaking the properties of non-rivalry and non-excludability, which make information zero marginal cost, producers artificially nudged consumers into paying for code, tamping it into physical formats like a genie stuffed ...
Code by itself is not, and has never been, worth anything, and consumers already know this intuitively when they refuse to directly pay for it. These lessons were memorably encapsulated by Bill Gates’s attempts to sell BASIC in the 1970s. BASIC was the software used to run Altair, a personal computer made by a company called Micro Instrumentation and Telemetry Systems (MITS). It was Microsoft’s first product, licensed to MITS and sold together with the Altair. At one of MITS’s demos, a paper tape containing BASIC was stolen. Pirated copies of BASIC began to appear and permeate through the ...more
Producers constantly fight an uphill battle to threaten, lock down, beg, blame, and shame consumers into paying for content. They must play the role of cowboys, roping consumers with their lassos and dragging them in the desired direction, whether by threatening the users with lawsuits or wheedling readers into disabling ad-blocking software—which extends to not allowing them to read articles in a browser’s private mode. Although we continue to pay lip service to the idea that consumers should pay for software somehow, code struggles against its bonds, spurred perhaps by the famous ...more
Economist David Friedman tells a joke that goes like this: Two economists walked past a Porsche showroom. One of them pointed at a shiny car in the window and said, “I want that.” “Obviously not,” the other replied. The joke is about revealed preference: the idea that we can only understand consumer preferences based on their actual behavior. If we were to rewrite the joke about open source software, it might go something like this: Two developers cloned a popular open source proje...
Even as software’s purchase value is being driven dramatically down, its social value seems to be going dramatically up. We can’t live without software anymore, but we also don’t want to pay for it. How is this the case? The author Jane Jacobs explores these conflicting views in her 1961 book The Death and Life of Great American Cities, in which she tries to explain why urban planning policy failed cities. Jacobs’s major critique of urban planning in the 1950s is that the planners treated cities—the layout of their buildings, parks, and roads—as static objects, which were only developed at the ...more
Code, in active state, carries its value in its dependencies, or who else is currently using it. If I publish code and nobody uses it, it’s worth less than other code I’ve written that’s embedded in software used by millions. I may derive personal value from the other code I wrote, but its value to others is negligible. In this way, software is comparable to public infrastructure, and similar valuation methodologies apply. Like code, infrastructure derives its value from its active dependencies, irrespective of the cost of its construction or maintenance.
We measure the reputational value of content differently depending on whether we’re viewing it in static or active state. “Likes” and “follows” are not the same reward on social media. A viral tweet might gain thousands of likes, but those don’t necessarily translate into follows. The same goes for views and likes on a YouTube video, versus subscribing to its creator’s channel. The difference between likes and follows also helps us understand when maintenance costs matter. A viral YouTube video doesn’t necessarily have ongoing costs if its creator doesn’t plan on making more videos. But if the ...more