Eric S. Raymond's Blog, page 18
March 26, 2017
src 1.13 is released
My exercise in how small you can make a version-control system and still have it be useful, src, does seem to have a significant if quiet fanbase out there. I can tell because patches land in my mailbox at a slow but steady rate.
As the blurb says: Simple Revision Control is RCS/SCCS reloaded with a modern UI, designed to manage single-file solo projects kept more than one to a directory. Use it for FAQs, ~/bin directories, config files, and the like. Features integer sequential revision numbers, a command set that will seem familiar to Subversion/Git/hg users, and no binary blobs anywhere.
March 21, 2017
When ancient-history geeks go bad
A few minutes ago here at chez Raymond, my friend John Desmond says: “So, have you heard about the new Iraqi national anthem?”
I said “Uh, OK, I’m braced for this. What about it?”
He said “In the good old Sumer time.”
I pointed a finger at him and said “You’re Akkad!”
Yes. Yes, we probably do both deserve a swift kicking.
March 20, 2017
cvs-fast-export 1.43 is released
Maintaining cvs-fast-export is, frankly, a pain in the ass. Parts of the code I inherited are head-achingly opaque. CVS repositories are chronically prone to malformations that look like bugs in the tool and/or can’t be adapted to in any safe way. Its actual bugs are obscure and often difficult to fix – the experience is not unlike groping for razor-blades in the dark. But people expect cvs-fast-export to “just work” anyway and don’t know enough about what a Zeno’s tarpit the domain problem is to be properly grateful when it does.
Still I persevere. Somebody has to; the thought of vital code being trapped in CVS is pretty nervous-making if you know everything that can go wrong with it.
This release fixes a bug introduced by an incorrect optimization hack in 2014. It should only have affected you if you tried to use the -c option.
If you use this at a place that pays developers, please have your organization contribute to my Patreon feed. Some of my projects are a pleasure to do for free; this one is grubby, hard work.
March 18, 2017
Things Every Hacker Once Knew: 1.12
Latest version, as usual, here.
New stuff: Note just how crazily heterogeneous the six-bit character sets were. FTP. Ctrl-V on Unix systems. A correction about uu{de|en}code. Timeline updates for ’74 and ’77.
The pace of submissions continues to slow.
March 16, 2017
An apologia for terminal games
Yes, to a certain segment of the population I suppose I define myself as a relic of ancient times when I insist that one can write good and absorbing computer games that don’t have a GUI – that throw down old-school in a terminal emulator.
Today I’m shipping a new release of the game greed – which is, I think, one of the better arguments for this proposition. Others include roguelike dungeon crawlers (nethack, angband, moria, larn), VMS Empire, the whole universe of text adventure games that began with ADVENT and Zork, and Super Star Trek.
I maintain a bunch of these old games, including an improved version of the BSD Battleships game and even a faithful port of the oldest of them all: wumpus, which I let you play (if you want) in a mode that emulates the awful original BASIC interface, all-caps as far as the eye can see.
Some of these I keep alive only because somebody ought to; they’re the heritage grain of computer gaming, even if they look unimpressive to the modern eye. But others couldn’t really be much improved by a GUI; greed, in particular, is like that. In fact, if you ranked heritage terminal games by how little GUIfication would improve them, I think greed would probably be right at the top (perhaps sharing that honor with ski). That in itself makes greed a bit interesting.
Much has been gained by GUIfying games; I have my own favorites in that style, notably Civilization II and Spaceward Ho! and Battle For Wesnoth (on which I was a developer for years). But the very best terminal games retain, I think, a distinct charm of their own.
Some of them (text adventures, roguelikes) work, I think, the way a novel does, or Scott McCloud taught us minimalist cartooning does; they engage the user’s own imagination as a peripheral, setting up a surprisingly strong interaction between the user’s private imagery and the bare elements of the game. At their best, such games (like novels) can have a subtle imaginative richness that goes well beyond anything this week’s graphical splatterfest offers.
More abstract puzzle games like greed don’t quite do that. What they offer instead is some of the same appeal as tiling window managers. In these games there is no waste, no excess, no bloat, no distraction; it’s all puzzle value all the way down. There’s a bracing quality about that.
Ski is kind of hermaphroditic that way. You can approach it as a cartoon (Aieee! Here comes the Yeti! Flee for your life!) or as a pure puzzle game. It works either way.
Finally, maybe it’s just me, but one thing I think these old-school terminal games consistently do better than their modern competition is humor. This is probably the McCloud effect again. I’ve laughed harder at, and retained longer, the wry turns of phrase from classic text adventures than any sight gag I’ve ever seen in a GUI game.
So, enjoy. It’s an odd and perhaps half-forgotten corner of our culture, but no less valuable for that.
UPDATE: I probably shouldn’t have described wumpus (1972) as “the oldest of them all”, because there were a few older games for teletypes like Hammurabi, aka Hamurabi (with a single ‘m’) aka The Sumer game from 1968. But wumpus is the oldest one that seems to be live in the memory of the hacker culture; only SPACEWAR (1962) has a longer pedigree, and it’s a different (vector graphics) kind of thing.
March 14, 2017
Semantic locality and the Way of Unix
An important part of the Way of Unix is to try to tackle large problems with small, composable tools. This goes with a tradition of using line-oriented textual streams to represent data. But…you can’t always do either. Some kinds of data don’t serialize to text streams well (example: databases). Some problems are only tractable to large, relatively monolithic tools (example: compiling or interpreting a programming language).
Can we say anything generatively useful about where the boundary is? Anything that helps us do the Way of Unix better, or at least helps us know when we have no recourse but to write something large?
Yes, in fact we can. Back in 2015 I was asked why reposurgeon, my editor for version-control repositories, is not written as a collection of small tools rather than as a relatively large interpreter for a domain-specific language. I think the answer I gave then generalizes to a deeper insight, and a productive set of classifications.
(Part of the motivation for expanding that comment into this essay now is that I recently got mail from a reposurgeon user who shipped me several good fix patches and complimented the tool as a good example of the Way of Unix. I actually had to pause to think about whether he was right or not, because reposurgeon is certainly not a small program, either in code bulk or internal complexity. At first sight it doesn’t seem very Unixy. But he had a point, as I hope to show.)
Classic Unix line-oriented text streams have what I’m going to call semantic locality. Consider as an example a Unix password file. The semantic boundaries of the records in it – each one serializing a user’s name, numeric user ID, home directory, and various other information – correspond nicely to the syntactic boundary at each end of line.
Semantic locality means you can do a lot by looking at relatively small pieces (like line-at-a-time) using simple state machines or parsers. Well-designed data serializations tend to have this property even when they’re not textual, so you can do Unix-fu tricks on things like the binary packed protocol a uBlox GPS ships.
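To make this concrete, here’s a minimal Python sketch. The field layout is the real /etc/passwd structure, though the sample records are invented; the point is that each line is a complete record, so a one-pass parser needs no context beyond the line it’s looking at:

```python
import io

# Two sample /etc/passwd-style records.  Each line is a complete,
# self-contained record: the semantic boundary coincides with the
# syntactic one at end of line.
sample = io.StringIO(
    "root:x:0:0:root:/root:/bin/bash\n"
    "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n"
)

# One pass, line at a time: split on the field separator and pull
# out the login name (field 1) and home directory (field 6).
for line in sample:
    fields = line.rstrip("\n").split(":")
    name, home = fields[0], fields[5]
    print(f"{name} -> {home}")
```

Nothing here ever has to look backward or forward across a line boundary; that is what good semantic locality buys you.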
Repository internals are different. A lot of the most important information – for example, the DAG structure of the history – is extremely de-localized; you have to do complicated and failure-prone operations on an entire serialized dump of the repository (assuming you can get such a thing at all!) to recover it. You can’t do more than the very simplest transforms on the de-localized data without knowing a lot of fragile context.
(Another, probably more familiar example of a data structure with poor semantic locality is a database dump. It may be possible to express individual records in tables in a representation that has good locality, but the web of relationships between tables is nowhere expressed in a way that is local for parsing.)
Now, if you are a really clever Unix hacker, the way you deal with the problem of representing and editing repositories is by saying “Fsck it. I’m not going to deal with repository internals at all, only lossless textual serializations of them.” Voila, reposurgeon! All your Unix-fu is suddenly relevant again. You exile your serialization/deserialization logic into stream exporters and importers which have just one extremely well-defined job, just as the Way of Unix prescribes.
Inside those importer/exporter tools…Toto, you’re not in Unix-land anymore, at least not as far as the gospel of small separable tools is concerned. That’s OK; by using them to massage the repository structures into a shape with better semantic locality you’ve made the conceptually hard part (the editing operations) much easier to specify and implement. You can’t get all the way to line-oriented text streams, but you can get close enough that ed(1), the ancient Unix line-oriented text editor, makes a good model for reposurgeon’s interface.
To sharpen this point, consider repocutter. This companion tool in the reposurgeon distribution takes advantage of the fact that Subversion itself can serialize a repository into a textual dumpfile. There’s a repertoire of useful operations that repocutter can perform on these dumpfiles; notably, one of them is dissecting a multi-project Subversion repository dump to get out any one of the project histories in a dumpfile of its own. While repocutter has a more limited repertoire than reposurgeon, it does behave like a classic Unix filter.
Stepping back from the details of reposurgeon, we can use it as a paradigmatic case for a couple of more general observations that explain and generalize traditional Unix practice.
First: semantic locality aids decomposability. Whether you get to write a flock of small separable tools or have to do one large one is largely controlled by whether your data structure has a lossless representation with good semantic locality or not.
Or, to put it more poetically, you can carve a data structure at its joints only if there’s a representation with joints to carve.
Second: There’s almost nothing magic about line-oriented text streams other than their good semantic locality. (I say “almost” only because eyeball friendliness and the ability to edit them with unspecialized tools also matter.)
Third: Because semantic locality aids decomposability, part of the Way of Unix is to choose data structures and data formats that maximize semantic locality (under the constraint that you have to represent the data’s entire ontology).
That third point is the philosophical generalization of “jam it into a line-oriented text-stream representation”; it’s why that works, when it works.
Fourth: When you can transform a data structure or representation to a form with better semantic locality, you can collect gains in decomposability.
That fourth point is the main trick that reposurgeon pulls. I had more to say about this as a design strategy in Automatons, judgment amplifiers, and DSLs.
March 12, 2017
Ones-complement arithmetic: it lives!
Most hackers know how the twos-complement representation of binary numbers works, and are at least aware that there was an older representation called “ones-complement” in which you negated a binary number by inverting each bit.
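For concreteness, here is a small Python sketch of the two negation rules, assuming an 8-bit word (the width is arbitrary; real ones-complement machines like the 1100 series used 36 bits):

```python
WIDTH = 8
MASK = (1 << WIDTH) - 1  # 0xFF: keep results within an 8-bit word

def ones_complement_negate(x):
    # Ones-complement negation: invert every bit.
    return ~x & MASK

def twos_complement_negate(x):
    # Twos-complement negation: invert every bit, then add one.
    return (~x + 1) & MASK

# 5 is 0b00000101.
assert ones_complement_negate(5) == 0b11111010   # -5 in ones-complement
assert twos_complement_negate(5) == 0b11111011   # -5 in twos-complement

# The famous quirk: ones-complement has a negative zero (all bits set)
# distinct from positive zero (all bits clear).
assert ones_complement_negate(0) == MASK
```

That negative zero is one reason twos-complement won: it gives every bit pattern a unique value and makes addition circuitry simpler.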
This came up on the NTPsec development list recently, with a question about whether we might ever have to port to a non-twos-complement machine. To my utter, gob-smacked astonishment, it turns out ones-complement systems still exist – though, thankfully, not as an issue for us.
I thought I could just mumble something about the CDC 6600 and be done, but if you google “ones-complement machines” you’ll find that Unisys still ships a series of machines under the ClearPath Dorado brand (latest variant introduced 2015) that are emulations of their old 1100-series mainframes running over Intel Xeon hardware – and these have ones-complement arithmetic.
This isn’t a practical port blocker for NTPsec, as NTP will never run over the batch OS on these things – it’s about as POSIX-compatible as the Bhagavad-Gita. It’s just weird and interesting that ones-complement machines survive in any form at all.
And a bit personal for me. My father was a programmer at Univac in the 1950s and early ’60s. He was proud of his work. My very first interaction with a computer ever was getting to play a very primitive videogame on the oscilloscope-based video console of a Univac 1108. This was in 1968. I was 11 years old, and my game machine cost $8M and took up the entire ground floor of an office building in Rome, Italy.
Other than the 1100, the ones-complement machines Wikipedia mentions (LINC, PDP-1, and CDC 6600) are indeed all long dead. There was a ones-complement “CDC Cyber” series as late as 1989, but again this was never going to implement POSIX.
About other competitors to twos-complement there is less to say. Some of them are still used in floating-point representations, but I can find no evidence that sign-magnitude or excess-k notation has been used for integers since the IBM 7090 in 1959.
There’s a comp.lang.std.c article from 1993 that argues in some technical detail that a C compiler is not practical on ones-complement hardware because too many C idioms have twos-complement assumptions baked in. The same argument would apply to sign-magnitude and excess-k.
March 8, 2017
How to change the world in Zen easy lessons
This morning I stumbled over a comment from last September that I somehow missed replying to at the time. I suspect it’s something more than one of my readers has wondered about, so here goes…
Edward Cree wrote:
If I’m really smart enough to impress esr, I feel like I ought to be doing more with myself than toy projects, games, and an obscure driver. It’s not that I’m failing to change the world, it’s that I’m not even trying. (Not for want of causes, either; there are plenty of things I’d change about the world if I could, and I suspect esr would approve of most of them.)
Obviously without Eric’s extroversion I won’t be as influential as him, but… dangit, Eric, what’s your trick? You make having a disproportionate effect on the course of history look easy! Why can I never find anything important to hack on?
There are several reasons people get stuck this way. I’ve experienced some of them myself. I’ve seen others.
If this sounds like you, dear reader, the first question to ask yourself is whether you are so attached to having a lot of potential that you fear failing in actuality. I don’t know Edward’s age, but I’ve seen this pattern in a lot of bright young people; it manifests as a lot of project starts that are potentially brilliant but a failure to follow through to the point where you ship something that has to meet a reality test. Or in an opposite way: as self-constraining to toy projects where the risk of failure is low.
So my first piece of advice is this: if you want to have “a disproportionate effect on the course of history”, the first thing you need to do is give yourself permission to fail – as long as you learn something from every failure, and are ready to keep scaling up your bets after success.
The second thing you need to do is finish something and ship it. No, more than that. You need to make finishing and shipping things a habit, something you do routinely. There are things that can be made to look easy only by cultivating a lot of self-discipline and persistence. This is one of them.
(The good news is that once you get your self-discipline to the required level it won’t feel like you have to flog yourself any more. It’ll just be habit. It’ll be you.)
Another thing you need to do is actually pay attention to what’s going on around you, at every scale. 99% of the time, you find important things to hack on by noticing possibilities other people have missed. The hard part here is seeing past the blinding assumptions you don’t know you have, and the hard part of that is being conscious of your assumptions.
Here’s my favorite example of this from my own life. After I described the many-eyeballs-make-bugs-shallow effect, I worried for years at the problem of why nobody in the hacker culture had noticed it sooner. After all, I was describing what was already a decades-old folk practice in a culture not undersupplied with bright people – why didn’t I or anybody else clue in faster?
I remember vividly the moment I got it. I was pulling on my pants in a hotel in Trondheim, Norway, idly chewing over this question yet again. It was because we all thought we knew why we were simultaneously innovating and achieving low error rates – we had an unexamined, unconscious explanation that suited us and we never looked past it.
That assumption was this: hackers write better software because we are geniuses, or at least an exceptionally gifted and dedicated elite among programmers. Our culture successfully recruits and selects for this.
The insidious thing about this explanation is that it’s not actually false. We really are an exceptionally gifted elite. But as long as you don’t know that you’re carrying this assumption, or know it and fail to look past it because it makes you feel so good, it will be nearly impossible to notice that something else is going on – that the gearing of our social machine matters a lot, and is an evolved instrument to maximize those gifts.
There’s an old saw that it’s not the things you don’t know that hurt you, it’s the things you think you know that ain’t so. I’m amplifying that: it’s the things you don’t know you think that hurt you the most.
It’s not enough to be rigorous about questioning your assumptions once you’ve identified them. The subtler work is noticing you have them. So when you’re looking for something important to hack on, the question to learn to ask is: what important problems is everybody, including you, seeing right past? Pre-categorizing and dismissing?
There’s a kind of relaxed openness to what is, a seeing past preconceptions, that is essential to creativity. We all half-know this; it’s why hackers resonate so strongly with Zen humor. It’s in that state that you will notice the problems that are really worth your effort. Learn to go there.
As for making it look easy…it’s only easy in the same way that mastery always makes a skill look easier than it is. When someone like John Petrucci or Andy Timmons plays a guitar lick with what looks like simple, effortless grace, you’re not seeing the years of practice and effort they put into getting to where that fluency and efficiency is natural to them.
Similarly, when you see me doing things with historical-scale consequences and making it look easy, you’re not seeing the years of practice and effort I put in on the component skills (chopping wood, drawing water). Learning to write well. Learning to speak well. Getting enough grasp on what makes people tick that you know how to lead them. Learning enough about your culture that you can be a prophet, speak its deepest yearnings and its highest aspirations to it, bringing to consciousness what was unconscious before. These are learnable skills – almost certainly anyone reading this is bright enough to acquire them – but they’re not easy at all.
Want to change the world? It’s doable. It’s not magic. Be aware. Be courageous. And will it – want it enough that you accept your failures, learn from them, and never stop pushing.
March 6, 2017
Reposturgeon recruits the CryptBitKeeper!
I haven’t announced a reposurgeon release on the blog in some time because recent releases have mostly been routine stuff and bugfixes. But today we have a feature that many will find interesting: reposurgeon can now read BitKeeper repositories. This is its first new version-control system since Monotone was added in mid-2015.
Those of you who remember the BitKeeper flap of 2005 might assume some fearsome magic was required, but no. BitKeeper has gone open source and now has a “bk fast-export” subcommand, so adding read-side support was pretty trivial. In theory the write-side support ought to work – there’s also a “bk fast-import” that reposurgeon uses – but the importer does not work well. It doesn’t seem to handle tag declarations, and I have core-dumped it during basic testing. I expect this will be fixed eventually; BitMover has a business incentive to make imports easy, after all.
While reposurgeon has your attention, I guess I should mention another recent development. The svncutter tool that I wrote back around 2009 is back, as “repocutter”, and now part of the reposurgeon distribution with improved documentation. There are some cases of Subversion repositories holding multiple projects that it turns out are better handled by slicing them apart with repocutter than by trying to do the entire dissection in reposurgeon.
Yes, there are still some pending bugs in weird Subversion cases. I decided to ship a release anyway, deferring those, because read support for BitKeeper seemed important enough. I believe that makes reposurgeon’s coverage of Unix-hosted version-control systems about as complete as is technically possible.
March 2, 2017
Things Every Hacker Once Knew: 1.11
The newest version of Things Every Hacker Once Knew is only a minor update.
There’s material on SIGHUP; six-bit characters on 36-bit machines; a correction that XMODEM required 8 bits; and why screensavers are called that.
New submissions are ramping down; I don’t expect to need to issue another update of this for some time.