Kindle Notes & Highlights
Read between June 24 and June 26, 2023
The cycle looks something like this: Open source developers write and publish their code in public. They enjoy months, maybe years, in the spotlight. But, eventually, popularity offers diminishing returns. If the rewards of maintaining code fail to outpace the costs, many of these developers quietly retreat to the shadows.
One study found that in more than 85% of the open source projects the researchers examined on GitHub, less than 5% of developers were responsible for over 95% of code and social interactions.
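A concentration figure like this one can be checked against any project's commit history. The following is a minimal sketch, not the study's method: the function name `top_share`, the per-developer commit counts, and the choice to round the top group up to at least one developer are all assumptions made for illustration.

```python
import math


def top_share(commits_by_dev: dict[str, int], top_fraction: float = 0.05) -> float:
    """Return the share of all commits made by the most active developers.

    `top_fraction` is the slice of developers to treat as the "top" group
    (0.05 means the most active 5%). The group is rounded up so that it
    always contains at least one developer. This is an illustrative
    definition, not the one used by the study quoted above.
    """
    counts = sorted(commits_by_dev.values(), reverse=True)
    total = sum(counts)
    if total == 0:
        return 0.0
    top_n = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:top_n]) / total


# Hypothetical project: one dominant maintainer and nine casual contributors.
example = {"maintainer": 91, **{f"casual_{i}": 1 for i in range(9)}}
share = top_share(example)  # share of commits from the single most active developer
```

The same calculation could be run over comments or reviews instead of commits to approximate the "social interactions" half of the claim.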
The role of a maintainer is evolving. Rather than coordinating with a group of developers, these maintainers are defined by the need for curation: sifting through the noise of interactions, such as user questions, bug reports, and feature requests, which compete for their attention.
The problem facing maintainers today is not how to get more contributors but how to manage a high volume of frequent, low-touch interactions. These developers aren’t building communities; they’re directing air traffic.
This book is an attempt to identify, and expand upon, what it means to be online today, told through the story of open source, where individual developers write code consumed by millions.
Although the Linux Foundation reports more than 14,000 contributors to the Linux kernel since 2005, Torvalds is still the only person who’s allowed to merge those contributions into the main project.
If GitHub is like Facebook, SourceForge was the MySpace of code-hosting platforms: the first significant product of its kind, and, though still alive today, mostly remembered as a blueprint.
Free software developers frequently champion copyleft licensing, like the GNU General Public License (GPL).
As one can imagine, copyleft licenses aren’t commercially friendly, because companies must license their own software on the same terms. So early open source advocates began to emphasize permissive licenses, like the Berkeley Software Distribution (BSD) and the Massachusetts Institute of Technology (MIT) licenses, which allow developers to do pretty much whatever they want with the code, without changing the terms of their own projects.
Today, the MIT license is, by far, the most popular license used by GitHub projects.
Platforms deliver value to third parties that build on top of them, whereas aggregators are pure intermediaries.
Thompson references a quote attributed to Bill Gates, who defines a platform as “when the economic value of everybody that uses it exceeds the value of the company that creates it.”
More than any other type of creator, open source developers seem like they should’ve been most impervious to platform effects.
And yet, GitHub continues to dominate. Even the Apache Software Foundation, an umbrella organization formed in 1999, which was once perceived as reluctant to adopt modern open source tooling, announced in 2019 that it was migrating its Git-based projects to GitHub.
Fogel is a thoughtful, and in my view underrated, voice of early open source culture. He wrote the only notable book to date on the production of open source: Producing Open Source Software: How to Run a Successful Free Software Project, first published in 2005. It’s available in its entirety online at https://producingoss.com/
In conservation biology, the term charismatic megafauna refers to the idea that polar bears sell environmental causes better than mollusks or insects. The cuter, the better.
Open source software is frequently characterized as participatory, which implies that anyone can modify its code. While this is theoretically true, in practice open source is not blindly open to every person who wants to change it.
Some developers have permission to merge changes into the trunk (or master), which is the baseline version of the project. Having these permissions is often referred to as commit access,
Commit access is a technical permission, but there are also social considerations.
Before they merge a change in, they must also consider how it will be received by other contributors and users. Bigger projects often use a formal “request for comments” (RFC) process to allow communities to discuss these changes before they are merged. In Python, for example, these requests are called Python Enhancement Proposals (PEPs), while in Go, another programming language, a formal proposal is called a “design document.”...
In some projects, nobody gets commit access besides the author, no matter how big the project gets.
The process by which a developer gains commit access varies widely between projects and is subject to preexisting social norms.
These differences in social norms are often closely intertwined with technical design.
And Debian has a monolithic, tightly coupled codebase, where green-lighting the wrong maintainer could, indeed, have dire consequences. But JavaScript, including Node.js, is designed to be modular, where each maintainer has a limited ability to affect other components of the ecosystem, so JavaScript developers are more likely to prioritize moving fast and accepting contributions.
Bitcoin’s community, like Clojure’s, prioritizes stability and security, preferring to move slowly and with care, even if it means including fewer features and contributors. Ethereum is more like Node.js: it’s a platform for others to develop on, flinging itself far and wide.
The process of getting a change approved depends, among other things, on the complexity of the change (and the complexity of the project), as well as on one’s reputation among those with the ability to approve the change.
At minimum, open source projects hosted on GitHub can be broken into three parts: code (the final output of a project), an issue tracker (a way to discuss changes), and pull requests (a way to make changes).
Although the concepts of an issue (also called a ticket) and a pull request (also called a patch) are much older than GitHub, issues and pull requests are GitHub’s branding of these features, and therefore aren’t quite so easy to migrate between platforms.
At a high level, open source projects tend to move from closed → open → either closed or distributed development (depending on their size).
While some open source developers write code in public from the very beginning, many prefer to do their initial creative work in private, so they can properly articulate their ideas before opening the project up for feedback.
Nathan Marz, who wrote the data-computation system Apache Storm, timed its release with his talk at a software conference called Strange Loop, open-sourcing the project live onstage.
As a project becomes more widely used, more developers will interact with it. One heuristic for when this transition occurs is when maintainers start doing more non-code than code work on the project, such as triaging issues and reviewing others’ pull requests.
If they don’t have many contributors, they’ll start to filter out the noise, pulling back to a more closed, focused state of development. In a closed state, maintainers are more selective about reviewing external contributions, so they can focus on their work.
In a distributed state, maintainers actively recruit more contributors to pitch in, with the goal of retaining them in the project.
A project’s contributor growth is a function of its technical scope, support required, ease of participation, and user adoption.
A project that feels feature-complete won’t attract as many contributors as one that’s extensible and customizable.
Focusing on the relationship between contributors and users, we can think of projects in terms of their contributor growth and user growth. This gives us four production models: federations, clubs, toys, and stadiums.
Federations are projects with high contributor growth and high user growth. These are the “bazaars,” first described by Eric S. Raymond, which we typically think of when we imagine an open source project.
Rust, Node.js, and Linux are all examples of federations.
Federations are similar to companies or NGOs. They’re more complex to manage from a governance standpoint, so they tend to develop processes—voting, leadership positions, foundations, working groups, and technical councils—that address coordination issues within their contributor community.
As their contributor community grows, federations typically “shard” contributors into smaller working groups,
Federations also often employ an RFC (request for comments) process, similar to a ballot initiative, to manage major change proposals to the project.
Clubs are projects with high contributor growth and low user growth, leading to a roughly overlapping group of contributors and users.
Kazuhiro Yamashita et al. describe contributor retention using the terms “magnet” and “sticky,”
Magnetic projects are those that attract a large proportion of new contributors. Sticky projects are those where a large proportion of contributors continue to make contributions.
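These two definitions can be made concrete as ratios over sets of contributors. This is a rough sketch under stated assumptions: the function names, the use of two consecutive time periods, and the choice of denominators (all new contributors across the projects being compared for magnetism, last period's contributors for stickiness) are illustrative choices, not formulas taken from the highlighted text.

```python
def magnetism(project_new: set[str], all_new: set[str]) -> float:
    """Fraction of all new contributors in a period that this project attracted.

    `all_new` is assumed to be every new contributor seen across the
    projects being compared during the same period.
    """
    if not all_new:
        return 0.0
    return len(project_new & all_new) / len(all_new)


def stickiness(previous: set[str], current: set[str]) -> float:
    """Fraction of last period's contributors who contributed again this period."""
    if not previous:
        return 0.0
    return len(previous & current) / len(previous)


# Hypothetical data: four contributors last period, two of whom came back.
last_period = {"ana", "ben", "chi", "dev"}
this_period = {"ana", "ben", "eve"}
sticky_score = stickiness(last_period, this_period)

# Hypothetical data: four newcomers across all projects, one joined this project.
magnet_score = magnetism({"eve"}, {"eve", "fay", "gus", "hal"})
```

A project can score high on one measure and low on the other, which is why the two terms are kept separate.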
Successful clubs are highly sticky,
It’s the difference between living in a small town and a big city: in a city, communities easily divide into smaller groups, but in a small town everybody knows everybody else’s business and cares more about who’s spending time with whom.
Toys are projects with low contributor growth and low user growth. They’re probably the least interesting production model to analyze for the purposes of this book, because they are effectively personal projects.
Open source projects on GitHub with fewer than ten stars would also fall into the toy category.
Stadiums are projects with low contributor growth and high user growth.
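The four production models above form a two-by-two grid over contributor growth and user growth. A minimal sketch of that grid follows; the names `Model` and `production_model` are hypothetical, and reducing each growth axis to a simple high/low boolean is an assumption made for illustration, since the highlights give no numeric threshold.

```python
from enum import Enum


class Model(Enum):
    FEDERATION = "federation"  # high contributor growth, high user growth
    CLUB = "club"              # high contributor growth, low user growth
    TOY = "toy"                # low contributor growth, low user growth
    STADIUM = "stadium"        # low contributor growth, high user growth


def production_model(high_contributor_growth: bool, high_user_growth: bool) -> Model:
    """Map the two growth axes onto the four production models."""
    if high_contributor_growth:
        return Model.FEDERATION if high_user_growth else Model.CLUB
    return Model.STADIUM if high_user_growth else Model.TOY


# Few contributors but many users: the stadium case.
model = production_model(high_contributor_growth=False, high_user_growth=True)
```

How to decide what counts as "high" growth for a given project is left open here and would need its own measure.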