Eric S. Raymond's Blog, page 48

December 23, 2013

De-normalizing dissent

I really hadn’t been planning to comment on the Duck Dynasty brouhaha. But conservative gadfly Mark Steyn (a very funny, witty man even if you disagree with his politics) has described the actual strategy of GLAAD and its allies with a pithy phrase that I think deserves wider circulation – “de-normalizing dissent”.


OK, let’s get the obvious out of the way first. Judged by his remarks in Esquire, Duck Dynasty patriarch Phil Robertson is an ignorant, bigoted cracker who reifies almost every bad redneck stereotype there is. His religion is barely distinguishable from a psychotic delusional system. Nothing I am about to say should be construed as a defense of the content of his beliefs.


On the other hand, Steyn has a point when he detects something creepy and totalitarian about the attempt to hound Robertson out of his job and out of public life. True, nothing GLAAD has done rises to the level of state coercion – there is no First Amendment issue here, no violence or threat of same in play.


But what GLAAD and its allies are trying to accomplish is not mere moral suasion either; they’re trying to make beliefs they disapprove of unspeakable in polite society by making the consequences of expressing them so unpleasant that people will self-censor. In Steyn’s well-chosen phrase, they’re trying to de-normalize dissent.



This is a fun game. Do I get to play? I think anyone who speaks of communism or socialism in less opprobrious terms than they would apply to Nazis should be considered morally equivalent to a National Socialist and shunned by all right-thinking people. Let’s remember the hundreds of millions of genocide victims and de-normalize that advocacy!


What? You don’t think that’s a good idea? You think even odious dissent should remain part of the conversation? Then welcome to the ranks of Phil Robertson’s defenders. Unless you’re a blatant partisan hypocrite.


Steyn likes to say that the right way to react to people who try to de-normalize your dissent is to push back twice as hard. I agree. So I give you my favorite Twitter hashtag of the day, to be applied to GLAAD any time it tries this kind of public bullying: #ButtNazis.


Don’t let the #ButtNazis fuck your free speech up the ass while pretending they’re on a lofty mission of moral uplift. Trying to publicly shame and humiliate Robertson, OK; trying to get him fired and shunned, not OK. The former would have been education; the latter was an ugly power play intended to establish GLAAD as arbiters of what can and cannot be said.


All freedom-loving people should reject such attempts, and I am heartened to see that even many homosexual public figures are doing so. (Camille Paglia’s reported description of GLAAD’s behavior as “Stalinist, fascist” struck me as particularly apt.) GLAAD’s influence is going to greatly diminish after this, which is as it should be.


I hope this fiasco will serve as a warning to other activist organizations that there is a line between persuasion and suppressive bullying which they cross at their own peril. Myself, I promise to continue putting the defense of free expression over any form of partisanship.


December 20, 2013

Your new word of the week: explorify?

There are a lot of things people writing software do in the world of bits that don’t have easy analogs in the world of atoms. Sometimes it can be tremendously clarifying when one of those things gets a name, as for example when Martin Fowler invented the term “refactoring” to describe modifying a codebase with the intent to improve its structure or aesthetics without changing its behavior.


There’s a related thing we do a lot when trying to wrap our heads around large, complicated codebases. Often the most fruitful way to explore code is to modify it, because you don’t really know you have understood a piece of code until you can modify it successfully.


Sometimes – often – this can feel like launching an expedition into the untamed jungle of code, from some base camp on the periphery deeper and deeper into trackless wilderness. It is certainly possible to lose your bearings. And large, old codebases can be very jungly, overgrown and organic – full of half-planned and semi-random modifications, dotted with occasional clearings where the light gets in and things locally make sense.



A refactoring expedition can serve very well for this kind of exploration, but it’s not the only kind. As a trivial-sounding example, when trying to grok a large mass of older C code one of the first things I tend to do is identify where ints and chars are actually logic flags and re-type them as C99 bools.


This isn’t refactoring in the strict sense – no code organization or data structures change. It can be very effective, though, because identifying all the flags tends to force your mental model closer to the logic structure of the code.


Another thing I often do for the same reason is identify related global variables and corral them into context structures. (Note to self: must find and release the YACC mods I wrote years ago to support multiple parsers in the same runtime.)


For a clearer example of how this concept is different from refactoring, consider another common subtype of it: adding a small feature, not so much because the feature is needed but to improve and verify your knowledge of the code. The inverse happens – I’ve occasionally gone on exploratory hunts for dead or obsolete code – but it’s much less common.


I think we need a word for this. I spent a significant amount of mental search time riffling through my vocabulary looking for an existing word to repurpose, but didn’t find one. My wife, who’s as lexophilic as I am, didn’t turn up anything either.


Therefore I propose “explorify”, a portmanteau of “explore” and “modify”. But I’m much less attached to that particular word than I am to having one for the concept. Perhaps one of my commenters will come up with something better.


Sample usage:


“I was explorifying and found a bug. Patch enclosed.”


“Yes, I can probably do that feature. But I’ll need some time to explorify first.”


“No, we probably didn’t need strictly hex literal recognition there. I was explorifying.”


December 15, 2013

Announcing cvssync, with thoughts on “good enough”

There’s an ancient Unix maxim to the effect that a tool that gets 85% of your job done now is preferable to one that gets 100% done never. Sometimes chasing corner cases is more work than the problem really justifies.


In today’s dharma lesson, I shall illustrate this principle with a real-world and useful example.



In my last blog post I explained why I had to shoot cvsps through the head. Some of my regulars regretted the loss of the good feature bolted to its crappy repo-analysis code – it could fetch remote CVS repository metadata for analysis rather than requiring the repositories to have already been mirrored locally.


To fill this functional gap, I needed a tool for mirroring the contents of a remote CVS repository to a local directory. There’s floating folklore to the effect that a tool called “cvssuck” does this job, but when I tried to use it, it failed in about the most annoying possible way. It mirrored the directory structure of the remote site without fetching any masters!


Upon investigation I discovered that the cvssuck project site has disappeared and there hasn’t been a release in years. Disgusted, I asked myself how it could possibly have become that broken. Seemed to me the whole thing ought to be a trivial wrapper around rsync.


Or…maybe not. What scanty documentation I found for cvssuck made a big deal out of the fact that it (inefficiently) used CVS itself to fetch masters. This wouldn’t make any sense if the masters were rsync-accessible, because then rsync would be a much faster and more efficient way to do the same job.


But I thought about the sites I generally have to fetch from when I’m grabbing CVS repositories for conversion, as I did most recently for the groff project. SourceForge. Savannah. These sites (and, I suspect, most others that still support CVS) do in fact allow rsync access so that project administrators can use it to do offsite backups.


OK, so suppose I write a little wrapper around rsync to fetch from these sites. It might not do the guaranteed fetch that cvssuck advertises…but on the other hand cvssuck does not seem to actually work, at least not any more. What have I got to lose?


About an hour of experimentation and 78 lines of Python code later, I had learned a few things. First, a stupid-simple wrapper around rsync does in fact work for SourceForge and Savannah. And second, there is a small but significant value the wrapper can add.


The only thing you are pretty much guaranteed to be able to find out about a CVS repository is the CVS command needed to check out a working copy. For example, the groff CVS page gives you this command:



cvs -z3 -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/groff co

You have to figure out for yourself that the module name should also be “groff”, but there are clues to that on the web page. For those of you blessed enough to be unfamiliar with CVS, a single instance can host multiple projects that can be checked out separately; the module name selects one of these.


It isn’t necessarily clear how to get from that cvs invocation to an rsync command. Here’s how you do it. First, lop off the “anonymous@” part; that is a dummy login credential. Treat “/sources/groff” as a file path to the repository directory, then realize that the module is a subdirectory. You end up writing this:



rsync -avz cvs.savannah.gnu.org:/sources/groff/groff my-local-directory

That’s really simple, but it turns out not to work on SourceForge, because SourceForge runs an rsync daemon and hides the absolute file path to the repository. The corresponding fetch from SourceForge, if groff existed there, would look like this:



rsync -avz groff.cvs.sourceforge.net::cvsroot/groff/groff groff

Note the double colon and absence of leading ‘/’ on the repository path.


The value a wrapper script can add is knowing about these details so you don’t have to. Thus, cvssync. You call it with the arguments you would give a CVS checkout command. It pulls those apart, looks at the hostname, figures out how to reassemble the elements into an rsync command, and runs that.
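The host-specific logic involved is tiny. Here is a simplified sketch of its shape (illustrative only, not the actual cvssync code):


def rsync_command(host, path, module, localdir):
    "Map a CVS pserver host/path/module onto an rsync invocation."
    if host.endswith(".sourceforge.net"):
        # SourceForge hides the absolute path behind an rsync daemon;
        # note the double colon and the missing leading '/'.
        project = host.split(".")[0]
        source = "%s::cvsroot/%s/%s" % (host, project, module)
    else:
        # General rule (works for Savannah): treat the pserver path as a
        # directory and the module as a subdirectory of it.
        source = "%s:%s/%s" % (host, path, module)
    return ["rsync", "-avz", source, localdir]

print(" ".join(rsync_command("cvs.savannah.gnu.org", "/sources/groff",
                             "groff", "my-local-directory")))


Run against the Savannah example above, that prints exactly the rsync command we derived by hand.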


This just shipped with cvs-fast-export release 0.7. At the moment it really only knows two things: a special rule about building rsync commands for SourceForge, and a general rule that happens to work for Savannah and should work for most other CVS sites as well. More hosting-site rules would be easy to add – a line or two of Python at most for each hosting site.


This wrapper doesn’t do the last 15% of the job; it will fail if the CVS host blocks rsync or has an unusual directory structure. But that 85% now is more valuable than 100% never, especially when its capabilities are so easily extended.


And hey, it only took an hour for me to write, test, document, and integrate into the cvs-fast-export distribution. This is the Great Way of Unix; heed the lesson.


December 11, 2013

How to demolish your software project with style

I did something unusual today. I pulled the plug on one of my own projects.


In Solving the CVS-lifting problem and Announcing cvs-fast-export I described how I accidentally ended up maintaining two different CVS-to-something-else exporters.


I finally got enough round tuits to put together two-thirds of the head-to-head comparison I’ve been meaning to do – that is, compare the import-stream output of cvs-fast-export to that of cvsps to see how they rate against each other. I wrote both git-stream output stages, so this was really a comparison of the analysis engines.


I wasn’t surprised which program did a better job; I’ve read and modified both pieces of code, after all. Keith Packard’s analysis engine, in cvs-fast-export, is noticeably more elegant and craftsmanlike than the equivalent in cvsps. (Well, duh. Yeah, that Keith Packard, the co-architect of X.)


What did surprise me was the magnitude of the quality difference once I could actually compare them head-to-head. Bletch. Turns out it’s not a case of a good job versus mildly flaky, but of good job versus suckage.


The comparison, and what I discovered when I tried to patch cvsps to behave less badly, was so damning that I did something I don’t remember ever having felt the need to do before. I shot one of my own projects through the head.



The wrong thing to do in this situation is to just let the bad code hang out there in the noosphere gradually bitrotting, with no maintenance and no warnings to people who might stumble over it and think it’s safe to use or salvageable. This is bad for the same reasons abandoning a physical building and letting it decay into a public hazard is bad.


Instead, I shipped a final archival release with an end-of-life notice, prominent warnings in the documentations about the Bad Things that are likely to happen if you try to use it, and a pointer to a better alternative.


This is the right thing to do. The responsible thing. Which I’m making a point of since I’ve too often seen people fall into doing the wrong thing – usually through embarrassment at the prospect of admitting that they made a mistake or, possibly, can’t meet the qualifications to finish what they started.


I’ll say it straight up: I tried hard, but I can’t fix cvsps. Peeling away the shims and kluges and junk just reveals more shims and kluges and junk. Well, in the repo-analysis code, anyway; there’s another piece, a partial CVS client for fetching metadata out of remote CVS repositories, that is rather good. It’s why I kept trying to salvage the whole mess for about ten months longer than I should have.


What I think happened here is that the original author of cvsps did a fast, sloppy ad-hoc job that worked well enough for simple cases but never matured because he didn’t encounter the less simple ones. Keith, on the other hand, did what I would do in like circumstances – thought the problem entirely through on an algorithmic level and nuked it flat. His code is solid.


One of the differences that makes is that Keith’s code copes better when put under unanticipated stress, such as me coming along and sawing off the entire git-aware output back end to replace it with a stream-file emitter. But I digress. I’m not here today to talk about architecture, but about how to demolish your project with style.


Software is communication to other human beings as much, or more so, than it is communication to computers. As an open-source hacker, you are part of a craft community with a past and a future. If you care about your craft and your community, the end of a project leaves you with a duty to clean up after it so that it becomes a positive lesson to those who come after you, rather than a trap and attractive nuisance.


And now I’ll get off my soapbox and go back to work. On cvs-fast-export. After this, making sure it has a really good test suite before I ship 1.0 seems even more important.


December 5, 2013

Heads up: the reposturgeon is mutating!

A few days ago I released reposurgeon 2.43. Since then I’ve been finishing up yet another conversion of an ancient repository – groff, this time, from CVS to git at the maintainer’s request. In the process, some ugly features and irregularities in the reposurgeon command language annoyed me enough that I began fixing them.


This, then, is a reposurgeon 3.0 release warning. If you’ve been using 2.43 or earlier versions, be aware that there are already significant non-backwards-compatible changes to the language in the repository head version, and there may be more before I ship. Explanation follows, embedded in more general thoughts about the art of language design.



First, a justification. Most computer languages (including domain-specific languages like reposurgeon’s) incur high costs when they change incompatibly. It’s a bad thing when a program breaks halfway through its expected lifetime – or worse, when its behavior changes in subtle ways without visibly breaking. Responsible language maintainers don’t make such changes at all if they can help it, and never do so casually.


But reposurgeon has an unusual usage pattern. Lift procedures written in reposurgeon are generally written once, used for a repository conversion, then discarded. This means that users are exposed to incompatibility problems only if they change versions while a conversion is in progress. This is usually easy to avoid, and when it can’t be avoided the lift recipes are generally short and relatively easy to verify.


Thus, the costs from reposurgeon compatibility breakage are unusually low, and I have correspondingly more freedom to experiment than most language designers. Still, conservatism about breaking compatibility sometimes does deter me, because I don’t want to casually obsolesce the knowledge of reposurgeon in my users’ heads. Making them re-learn the language at every release would be rude and obtrusive of me.


That conservatism has a downside beyond just slowing the evolution of the language, however. It can sometimes lead to design decisions, made to preserve compatibility, that produce warts on the language and that you come to regret later. Over time these pile up as a kind of technical debt that eventually has to be discharged. That discharge is what’s happening to reposurgeon now.


Now I’ll stop speaking abstractly and point at some actual ugly spots. The early design of reposurgeon’s language was strongly influenced by the sorts of things you can easily do in Python’s Cmd class for building line-oriented interpreters. What Cmd wants you to do is write command handler methods that are chosen based on the first whitespace-separated token on the line, and get the rest of the line as an argument. Thus, when reposurgeon interprets this:



read foobar.fi random extra text

what actually happens is that it’s turned into a method call to



do_read("foobar.fi random extra text")

and how you parse that text input in do_read() is up to you. Which is why, in the original reposurgeon design, I used the simplest possible syntax. If you said



read foobar.svn
delete /nasty content/ obliterate
write foobar2.fi

this was interpreted as “read and parse the Subversion stream dump in the file foobar.svn, delete every commit for which the change comment includes the string ‘nasty content’, then write out the resulting history as a fast-import stream to the file foobar2.fi”.
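All of those commands go through the same dispatch. If you have never used Cmd, here is a minimal sketch of the pattern (illustrative, not reposurgeon’s actual interpreter class):


import cmd

class Surgeon(cmd.Cmd):
    prompt = "reposurgeon% "

    def do_read(self, line):
        # Cmd strips the leading "read"; everything after it arrives
        # here verbatim, and tokenizing it is the handler's problem.
        print("would read from %r" % (line or "standard input"))

    def do_EOF(self, line):
        return True

Surgeon().onecmd("read foobar.fi random extra text")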


Looks innocent enough, yes? But there’s a problem lurking here. I first bumped into it when I wanted to specify an optional behavior for stream writes. In some circumstances you want some extra metainformation appended to each change comment as it goes out, a fossil identification (like, say, a Subversion commit number) retained from the source version control system. The obvious syntax for this would look like this:



write fossilize foobar2.fi

or, possibly, with the ‘fossilize’ command modifier after the filename rather than before it. But there’s a problem; “write” by itself on a line means “stream the currently selected history to standard output”, just as “read” means “read a history dump from standard input”. So, if I write



write fossilize

what do I mean? Is this “write a fossilized stream to standard output”, or “write an unfossilized stream to the file ‘fossilize’”? Ugh…


What the universe was trying to tell me is that my Cmd-friendly token-oriented syntax wasn’t rich enough for my semantic domain. What I needed to do was take the complexity hit in my command language parser to allow it to look at this



write --fossilize foobar2.fi

and say “aha, --fossilize begins with two dashes, so it’s an option rather than a command argument.” The handler would be called more or less like this:



do_write("foobar2.fi", options=["--fossilize"])

I chose at the time not to do this because I wanted to keep the implementation simplicity of just treating whitespace-separated tokens on the command line as positional arguments. What I did instead was introduce a “set” command (and a dual “clear” command) to manipulate global option flags. So the fossilized write came to look like this:



set fossilize
write foobar2.fi
clear fossilize

That was my first mistake. Those of you with experience at this sort of design will readily anticipate what came of opening this door – an ugly profusion of global option flags. By the time I shipped 2.43 there were seven of them.


What’s wrong with this is that global options don’t naturally have the same lifetime as the operations they’re modifying. You can get unexpected behavior in later operations due to persistent global state. That’s bad design; it’s a wart on the language.


Eventually I ended up having to write my own command parser anyway, for a different reason. There’s a “list” command in the language that generates summary listings of events in a history. I needed to be able to save reports from it to a file for later inspection. But I ran into the modifier-syntax problem again. How is the do_list() handler supposed to know which tokens in the line passed to it are target filenames?


Command shells like reposurgeon have faced this problem before. Nobody has ever improved on the Unix solution to the problem, which is to have an output redirection syntax. Here’s a reminder of how that works:



ls foo # Give me a directory listing of foo on standard output
ls >bar # Send a listing of the current directory to file bar
ls foo >bar # Send a listing of foo to the file bar
ls >bar foo # same as above - ls never sees the ">bar"

In reposurgeon-2.9 I bit the bullet and implemented redirection parsing in a general way. I found almost all the commands that could be described as report generators and used my new parser to make them support “>” output redirection. A few commands that took file inputs got re-jiggered to use “<” input redirection instead.
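The parsing this requires is shallow. A minimal sketch of the idea (not the actual reposurgeon parser) looks like this:


def parse_line(line):
    "Split a command line into arguments and redirection targets."
    args, infile, outfile = [], None, None
    for token in line.split():
        if token.startswith("<"):
            infile = token[1:]        # take input from this file
        elif token.startswith(">"):
            outfile = token[1:]       # send report output to this file
        else:
            args.append(token)
    return args, infile, outfile

print(parse_line("<foo.map"))         # ([], 'foo.map', None)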


For example, there’s an “authors read” command that reads text files mapping local Subversion- and CVS-style usernames to DVCS-style IDs. Before 2.9, the command to apply an author map looked like this:



authors read foo.map

That changed to



authors read <foo.map

But notice that I said “almost all”. To be completely consistent, the expected syntax of my first example should have changed to look like this:



read <foobar.svn
write >foobar2.fi

That is, read and write should have changed to always require redirection rather than ever taking filenames as arguments. But when I got to that point, I retained I/O filename arguments for those commands only, also supporting the new syntax but not decommissioning the old.


That was my second mistake. Technical debt piling up…but, you see, I thought I was being kind to my users. The other commands I had changed to require redirection were rarely used; “read” and “write”, on the other hand, pretty much have to occur in every lift script. Breaking my users’ mental model of them seemed like the single most disruptive change I could possibly make. Put plainly, I chickened out.


Now we fast-forward to 2.42 and the groff conversion, during which the technical debt finally piled high enough to topple over.


There’s a reposurgeon command “unite” that’s used to merge multiple repositories into one. I won’t go into the full algorithm it uses except to note that if you give it two repositories that are linear, and the root of one of them was committed later than the tip of the other, the obvious graft occurs – the later root commit is made the child of the earlier tip commit. I needed this during the groff conversion.


Every time you do a unite you have a namespace-management problem. The repositories you are gluing together may have collisions in their branch and tag names – in fact they almost certainly have one collision, on the default branch name “master”. The unite primitive needs to do some disambiguation.


The policy it had before 2.43 was very simple; every tag and branch name gets either prefixed or suffixed with the name of the repo it came from. Thus, if you merge two repos named “early” and “late”, you end up with two branches named “master-early” and “master-late”.


This turns out to be dumb and heavy-handed when applied to two linear repos with “master” as the only collision. The natural thing to do in that case is to leave all the (non-colliding) names alone, rename the early tip branch to “early-master”, and leave the late repo’s “master” branch named “master”.
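To make the difference concrete, here is a minimal sketch of the two policies (illustrative pseudo-logic, not the actual unite implementation):


def unite_renames(early, late, natural=False):
    "Return (early_renames, late_renames) mapping old branch/tag names to new ones."
    if not natural:
        # Old policy: every name gets the repo name glued on.
        # (In the real thing the affix is the repo's actual name;
        # "early" and "late" here follow the example above.)
        return ({name: name + "-early" for name in early},
                {name: name + "-late" for name in late})
    # Natural policy: only colliding names change, and only on the early
    # side, so the late repo keeps "master" and friends untouched.
    return ({name: "early-" + name for name in early & late}, {})

print(unite_renames({"master", "v1.0"}, {"master", "v2.0"}, natural=True))
# -> ({'master': 'early-master'}, {})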


I decided I wanted to implement this as a policy option for unite – and then ran smack dab into the modifier-syntax problem again. Here’s what a unite command looks like (actual example from recent work):



unite groff-old.fi groff-new.fi

Aarrgh! Redirection syntax won’t save me this time. Any token I could put in that line as a policy switch would look like a third repository name. Dammit, I need a real modifier syntax and I need it now.


After reflecting on the matter, I once again copied Unix tradition and added a new syntax rule: tokens beginning with “--” are extracted from the command line and put in a separate option set also available to the command handler. Because why invent a new syntactic style when your audience already knows one that will suit? It’s good interface engineering to re-use classic notations.
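In terms of the parsing sketch earlier, the new rule is just one more branch: anything led by “--” lands in an option set rather than the argument list. Again illustrative, not the production parser:


def parse_line(line):
    "Split a command line into options, arguments, and redirection targets."
    options, args, infile, outfile = set(), [], None, None
    for token in line.split():
        if token.startswith("--"):
            options.add(token)
        elif token.startswith("<"):
            infile = token[1:]
        elif token.startswith(">"):
            outfile = token[1:]
        else:
            args.append(token)
    return options, args, infile, outfile

options, args, _, _ = parse_line("--natural groff-old.fi groff-new.fi")
# options contains "--natural"; args is ['groff-old.fi', 'groff-new.fi']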


I mentioned near the beginning of this rant that this is what I should have done to the parser much sooner. Now my new unite policy can be invoked something like this:



unite --natural groff-old.fi groff-new.fi

OK, so I implemented option extraction in my command parser. Then it hit me: if I’m prepared to accept a compatibility break, I can get rid of most or all of those ugly global flags – I can turn them into options for the read and write commands. Cue angelic choirs singing hosannahs…


Momentary aside: This is not exceptional. This is what designing domain-specific languages is like all the time. You run into these same sorts of tradeoffs over and over again. The interplay between domain semantics and expressive syntax, the anxieties about breaking compatibility, even the subtle sweetness of finding creative ways to re-use classic tropes from previous DSLs…I love this stuff. This is my absolute favorite kind of design problem.


So, I gathered up my shovels and rakes and other implements of destruction and went off to abolish global flags, re-tool the read & write syntax, and otherwise strive valiantly for truth, justice, and the American way. And that’s when I received my just comeuppance. I collided head-on with a kluge I had put in place to preserve the old, pre-redirection syntax of read and write.


Since 2.9 the code had supported two different syntaxes:



read foobar.fi   # Old
read <foobar.fi  # New

The problem was that the easiest way to do this had been to look at the argument line before the redirection parser sees it, and prepend a “<” to the filename when the old syntax was in use.


Friends, when this sort of thing happens to you, here is what you will do if you are foolish. You will compound your kluge with another kluge, groveling through the string with some kind of rule like “insert < before the first token that does not begin with --". And that kluge will, as surely as politicians lie, come back around to bite you in the ass at some future date.


If you are wise, you will recognize that the time has come to commit a compatibility break, and repent of your error in not doing so sooner. That is why in the repository tip version of reposurgeon the old pre-redirection syntax of "read" and "write" is now dead; the command interpreter will throw an error if you try it.


(Minor complication: "read foo" still does something useful if foo is a directory rather than a file. But that's OK because we have unambiguous option syntax now.)


But another thing about this kind of design is that once you've accepted you need to do a particular compatibility break, it becomes a propitious time for others. Because one big break is usually easier to cope with than a bunch of smaller ones spread over time.


That means it's open season until 3.0 ships on changes in command names and syntactic elements. If you have used reposurgeon, and there is something you consider a wart on the design, now is the time to tell me about it.



December 2, 2013

shipper is about to go 1.0 – reviewers requested

If you’re a regular at A&D or on my G+ feed, and even possibly if you aren’t, you’ll have noticed that I ship an awful lot of code. I do get questions about this; between GPSD, reposurgeon, giflib, doclifter, and bimpty-bump other projects, it is reasonable that other hackers sometimes wonder how I do it.


Here’s part of my answer: be fanatical about automating away every part of your workflow that you can. Every second you don’t spend on mechanical routines is a second you get to use as creative time.


Soon, after an 11-year alpha period, I’m going to ship version 1.0 of one of my main automation tools. This thing would be my secret weapon if I had secrets. The story of how it came to be, and why it took 11 years to mature, should be interesting to other hackers on several different levels.



The background…


I’m the designer or maintainer of around 40 open-source projects. Even allowing for the fact that more than half of those are very stable old code that only needs a release once in a blue moon, the cumulative amount of boring fingerwork involved in keeping these updated is considerable.


When I say “boring fingerwork” I’m not even talking about coding effort, but rather the mundane tasks of uploading tarballs to archive locations, updating web pages, mailing out announcements, sending release notifications to Freecode, broadcasting heads-ups on relevant IRC channels, et cetera.


For older projects this shipping overhead is often more work than applying the small fixes and patches that trigger each release. It’s tedious, fiddly stuff – and irritatingly error-prone if done by hand.


A long time ago, now, I decided to stop doing it by hand. My overall goal was simple: I wanted to be able to type “make release” (or, more recently, “scons release”) in my project directory and have the right things happen, without fail. So I started building a tool to automate away as much tedium and fiddliness as I could. I called it “shipper”, because, well, that’s what it does.


Shipper’s job is to identify deliverables (like, say, tarballs and generated web pages) and push them to appropriate destinations (like, a public FTP directory or a website). It’s also intended to issue release notifications over various channels.


One of the things all these announcements and many of the names of deliverables will have in common is an embedded version number. One of the goals of shipper’s design is to allow you to specify the release’s version number in one place and one place only – because when you repeat a detail like that from memory you will occasionally get it wrong, with embarrassing results.


As for version numbers, so for other pieces of metadata that archive sites and forges and announcement channels commonly want – like a short description of the project’s purpose, or a home page link, or the name of a project IRC channel. A design goal is that you only need to specify anything like this once per project; shipper will find it and publish it anywhere it needs to go.


To that end, shipper looks in several different places to mine the data it wants. You can specify some things that aren’t project specific, like the Web location of your personal website, in a “.shipper” file in your home directory. If your project has a Debian-style control file, or an RPM specification, it will look in those for things they normally carry, like a homepage location or project description. Finally the project can have its own “.shipper” file to specify other things shipper might need to know.
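In outline, the mining works something like this – a simplified sketch rather than the production code, with illustrative file paths and the assumption that per-project settings override per-user defaults:


import os

def parse_keyvalue_file(path):
    "Parse simple 'Key: value' lines; continuation lines are ignored here."
    fields = {}
    with open(path) as fp:
        for line in fp:
            if ":" in line and not line.startswith(("#", " ", "\t")):
                key, _, value = line.partition(":")
                fields[key.strip()] = value.strip()
    return fields

def gather_metadata(projectdir):
    "Merge metadata sources; more specific sources override general ones."
    metadata = {}
    for source in (os.path.expanduser("~/.shipper"),       # per-user defaults
                   os.path.join(projectdir, "control"),    # Debian-style control file
                   os.path.join(projectdir, ".shipper")):  # per-project overrides
        if os.path.exists(source):
            metadata.update(parse_keyvalue_file(source))
    return metadata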


The third kind of knowledge that shipper has is embodied in code. It knows, for example, that if you specify “sourceforge” as a delivery destination, it needs to compose the name of the download directory to which your tarballs should be copied in a particular way that begins with frs.sourceforge.net and includes your project name. Because it would be silly for each and every one of your Makefiles to include that recipe; you might get it wrong the Nth time you repeat it, and what if sourceforge’s site structure changes?


There are some things shipper doesn’t try to know. Like, how to send release notifications to freecode.com; what it knows is how to call freecode-submit to do that. Actually, shipper doesn’t even know how to copy files across the network; instead, it knows how to generate scp and lftp commands given a source and destination.


I’ve been using versions of shipper on my own projects since 2002. It’s an important enabler of my ability to ship three or four or sometimes even more software releases within the span of a week. But here at Eric Conspiracy Secret Labs, we release no code before its time. And until very recently I was just not happy with shipper’s design.


It was getting the job done, but in an ugly way that required lots of option switches and dropped various kinds of intermediate files in the project directory while it was operating. But then I had a conceptual breakthrough.


Old shipper was complicated and ugly because it had two main modes of operation: one to show you what it was going to do, by listing the commands it would generate – then another to actually do them. The intermediate files it was leaving around during the process were text content for email and freecode.com announcements.


The breakthrough was this: Why not give up on executing commands entirely, and instead generate a shellscript to be piped to sh?
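The core of that design fits in a few lines. Here is a stripped-down sketch (illustrative only; the real thing knows about more delivery channels and mines its inputs from the metadata described above):


def emit_release_script(version, deliverables, webdir, announcement):
    "Emit shell commands for a release; the caller pipes them to 'sh -e -x'."
    out = []
    for f in deliverables:
        out.append("scp -p %s %s/%s" % (f, webdir, f))
    out.append("git tag -a %s -m 'Tagged for external release %s'"
               % (version, version))
    out.append("git push; git push --tags")
    # Announcement text rides along as a here-document, so no
    # intermediate files get dropped in the project directory.
    out.append("freecode-submit <<'EOF'\n%s\nEOF" % announcement)
    return "\n".join(out) + "\n"

print(emit_release_script("0.19", ["shipper-0.19.tar.gz", "README"],
                          "login.example.org:/public/html/shipper",
                          "(release announcement text here)"))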


With that design, most of the options go away. If you want to see what shipper will do, you run it and look at the output. The contents of what used to be intermediate files are here-documents in the generated shellscript. The Makefile recipe for releasing shipper itself just looks like this:



VERS=$(shell sed

Here, the output of shipper is being piped to sh -e -x; the options make the first error in a generated command fatal and echo commands to standard output just before they’re performed.


Note the trick being played here: VERS, as set in the makefile and passed to shipper, is mined from where the version number is set in the shipper script itself. For a C project, it might make more sense to set the version in the Makefile and pass it into the C compilation with -DVERSION=$(VERS).


The point is, either way, there’s a single point of truth about the version number, and all the email and IRC and other announcements that shipper might generate will reflect it.


Here is shipper’s control file:



# This is not a real Debian control file, though the syntax is compatible.
# It's project metadata for the shipper tool

Package: shipper

Description: Automated shipping of open-source project releases.
shipper is a power distribution tool for developers with multiple
projects who do frequent releases. It automates the tedious process
of shipping a software release and (if desired) templating a project
web page. It can deliver releases in correct form to SourceForge,
Berlios, and Savannah, and knows how to post a release announcement
to freecode.com via freecode-submit.

XBS-Destinations: freecode, mailto:esr@thyrsus.com

Homepage: http://www.catb.org/~esr/shipper

XBS-HTML-Target: index.html

XBS-Gitorious-URL: https://gitorious.org/shipper

XBS-IRC-Channel: irc://chat.freenode.net/#shipper

XBS-Logo: shipper-logo.png

XBS-Freecode-Tags: packaging, distribution

XBS-VC-Tag-Template: %(version)s

By now you have enough information to guess what most of this is declaring. XBS-Destinations says that shipper should send a release notification to freecode.com and an email notification to me (as a smoke test).


The XBS-HTML-Target line tells it to template a simple web page and include it in the web deliverables; you can see the result here. XBS-Logo, if present, is used in generating that page. The template used to generate the page is easily customized.


XBS-VC-Tag-Template tells shipper how to compose a tag to be pushed to the project repo to mark the release. This value simply substitutes in the release version. You might want a prefix, something like "release-%(version)s", on yours.


Here's what the shipper-generated release script for shipper looks like:



cat >index.html <<INAGADADAVIDA
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Transitional//EN'
   'http://www.w3.org/TR/xhtml1/DTD/xhtml...


[bulky stuff omitted here]



INAGADADAVIDA

scp -p COPYING login.ibiblio.org:/public/html/catb/esr/shipper/COPYING
scp -p shipper-0.19.md5 login.ibiblio.org:/public/html/catb/esr/shipper/shipper-0.19.md5
scp -p NEWS login.ibiblio.org:/public/html/catb/esr/shipper/NEWS
scp -p TODO login.ibiblio.org:/public/html/catb/esr/shipper/TODO
scp -p shipper-0.19.tar.gz login.ibiblio.org:/public/html/catb/esr/shipper/shipper-0.19.tar.gz
scp -p README login.ibiblio.org:/public/html/catb/esr/shipper/README
scp -p index.html login.ibiblio.org:/public/html/catb/esr/shipper/index.html
scp -p shipper-logo.png login.ibiblio.org:/public/html/catb/esr/shipper/shipper-logo.png
git tag -a 0.19 -m 'Tagged for external release 0.19'
git push; git push --tags
freecode-submit <<INAGADADAVIDA

INAGADADAVIDA

irkerd -i 'irc://chat.freenode.net/#shipper' 'shipper-0.19 has just shipped.'
# That's all, folks!

Yes, that last line sends an announcement to the #shipper channel on freenode. Notice how things like the Description section in the freecode.com submission form are copied directly from the control file.


It's worth re-emphasizing that none of those commands were generated by hand - I'm spared the boring and glitch-prone process of typing them all. I just push the go-button and, boom, a complete and consistent release state gets pushed everywhere it needs to go. Look, ma, no hand-work!


And that's the point. You set up your per-project metadata once and go. Only the things that must change each release need to be altered - and shipper knows how to extract the most recent changes from your NEWS file. Imagine how much mechanical ritual and distraction from more important things this has saved me since 2002!


At long last, I think shipper is ready for beta, for other people to try using it. I'd love it if people contributed shipping methods for other forges. The documentation needs a critique from someone who doesn't know the tool intimately. There might be ways I'm not seeing to make the tool simpler and more effective - I'm unhappy that the -w option still exists. There's still work to be done.


But it's worth doing. This isn't just about convenience either, though that matters. By reducing the friction cost of shipping, shipper encourages frequent incremental releases on short cycles. That, in turn, makes open-source development work better and faster, which is a good thing for all of us.


December 1, 2013

Reposurgeon Battles All Monsters!

Though there haven’t been any huge dramatic improvements since Subversion analysis got good enough to use even on horribly gnarly repositories, reposurgeon continues to quietly get better and faster. I shipped 2.43 a few minutes ago.



Credit for much of the recent under-the-hood work goes to Julien Rivaud. He’s been continuing to speed-tune the code in support of the conversion of the huge and tangled Blender repository, which finally finished last week. His latest improvement speeds up the evaluation of selection expressions by short-circuiting logical operations as they’re evaluated left to right.


Meanwhile, I’ve been adding some user-visible features as the need for them becomes apparent in doing conversions. Currently I’m in the process of lifting the history of groff from CVS to git – a surprisingly easy one, considering the source.


Recent new primitives include: =O, =M, =F selectors for parentless, merge, and fork commits; a svn_noautoignores option to suppress the normal simulation of default Subversion ignores in a translated repo; a ‘manifest’ command that reports path-to-mark mappings; a ‘tagify’ command that changes empty commits into tags; and a ‘reparent’ command for modifying ancestry links in the DAG.


Fear the reposturgeon!


November 22, 2013

GPSD 3.10 is shipped – and announcing the GPSD Time Service HOWTO

Blogging has been light recently because I’ve been working very hard on a major GPSD release, which I just shipped. This is mostly new features, not bugfixes, and it’s probably the most new code we’ve shipped in one release since about 2009.



The most user-visible feature of 3.10 is that 1PPS events are now visible in gpsmon – if you have a GPS that delivers this signal you can fire it up and watch how your system clock drifts in real time against the GPS top of second. Also you’ll see visible indicators of PPS in the packet logging window at the start of each reporting cycle.


For those of you using GPSD with marine AIS radios, the Inland AIS system used on the Thames and Danube is now fully decoded. We’ve also added support for the aid-to-navigation messages used in English and Irish coastal waters. There’s a new AIS data relay utility, gps2udp, that makes it easy to use GPSD to feed AIS aggregation sites like AISHub. AIS report control has been cleaned up, with text dumping of controlled-vocabulary fields no longer conditional on the “scaled” flag (that was dumb!) but done unconditionally in new JSON attributes paired with the numeric ones.


There’s alpha-stage RTCM3 decoding; I expect this to become more fully baked in future releases. No ADSB yet, alas – we’ve had people express interest but nobody is actually coding.


The usual bug fixes, too. Use of remote data sources over TCP/IP is much more reliable than it was in 3.9; more generally, the daemon is less vulnerable to incorrectly dropping packets when write boundaries from an I/O source land in the middle of packets. Mode and speed changes to u-blox devices now work reliably; there had been a race condition after device startup that made them flaky.


The most significant changes, though, are in features related to time service. GPSD, which is used by quite a few Stratum 1 network time servers, now feeds ntpd at nanosecond rather than microsecond resolution. The PPS drift report that is part of gpsd’s JSON report stream if your GPS emits 1PPS is now nanosecond-resolution as well.


And, after weeks of effort, we’ve shipped along with 3.10 the first edition of the GPSD Time Service HOWTO. This document explains in practical detail how to use GPSD and a 1PPS-capable GPS to set up your own Stratum 1 time server.


This might not seem like a big deal, but the HOWTO is actually the first explanation accessible to ordinary mortals of a good deal of what was previously black magic known only to a handful of metrologists, NTP maintainers, and time-nut hobbyists. What happened was that I cornered several domain experts and beat them mercilessly until they confessed. :-)


Now I’m gonna go catch up on my sleep…


November 5, 2013

Finally, one-line endianness detection in the C preprocessor

In 30 years of C programming, I thought I’d seen everything. Well, every bizarre trick you could pull with the C preprocessor, anyway. I was wrong. Contemplate this:



#include <stdint.h>

#define IS_BIG_ENDIAN (*(uint16_t *)"\0\xff" < 0x100)

That is magnificently awful. Or awfully magnificent, I'm not sure which. And it pulls off a combination of qualities I've never seen before:




Actually portable (well, assuming you have C99 stdint.h, which is a pretty safe assumption in 2013).
Doesn't require runtime code.
Doesn't allocate storage, not even constant storage.
One line, no auxiliary definitions required.
Readily comprehensible by inspection.

Every previous endianness detector I've seen failed one or more of these tests and annoyed me in so doing.


In GPSD it's replacing this mess:



/*
__BIG_ENDIAN__ and __LITTLE_ENDIAN__ are define in some gcc versions
only, probably depending on the architecture. Try to use endian.h if
the gcc way fails - endian.h also doesn not seem to be available on all
platforms.
*/
#ifdef __BIG_ENDIAN__
#define WORDS_BIGENDIAN 1
#else /* __BIG_ENDIAN__ */
#ifdef __LITTLE_ENDIAN__
#undef WORDS_BIGENDIAN
#else
#ifdef BSD
#include <sys/endian.h>
#else
#include <endian.h>
#endif
#if __BYTE_ORDER == __BIG_ENDIAN
#define WORDS_BIGENDIAN 1
#elif __BYTE_ORDER == __LITTLE_ENDIAN
#undef WORDS_BIGENDIAN
#else
#error "unable to determine endianess!"
#endif /* __BYTE_ORDER */
#endif /* __LITTLE_ENDIAN__ */
#endif /* __BIG_ENDIAN__ */

And that, my friends, is progress.


October 30, 2013

Dell UltraSharp 2713 monitor – bait and switch warning

I bought a Dell-branded product this afternoon. That was a mistake I will not repeat.


Summary: the 2713UM only reaches its rated 2560×1440 resolution when connected via DVI-D. On HDMI it is limited to 1920×1080; on VGA to 2048×1152. This $700 and supposedly professional-grade monitor is thus functionally inferior to the $300 Auria I still have connected to the other head of the same video card.


Two things make this extra infuriating:


I spent more than four hours on the phone with three different Dell technical-support people to find out that not only don’t they know how to fix this, nobody can give any reason for it. It’s a completely arbitrary, senseless limit. The monitor’s EDID hardware apparently tells lies to the host system that low-ball its capabilities. This couldn’t happen by accident; somebody designed in this nonsense.


And then neglected to tell potential customers about it. Nothing anywhere in the promotional material for this monitor even hints at these limits, and Dell’s own technical support people haven’t been clued in either. Bait and switch taken to a whole new level.


(Why did I buy a Dell product? Because it was the only thing I could get my hands on same-day that matched the specs of my other Auria, which went flickery-crazy early this afternoon.)


When I unloaded about this on Tech Support Guy #3, he passed me to a marketing representative. I explained, relatively politely under the circumstances, that I had over 15K social-media followers and was planning to give Dell a public black eye over their repeated bungling unless somebody gave me a really good reason not to.


She declined to send me the $400 that would have left me no worse off than if I’d simply bought another Auria, then passed me to somebody she described as a manager. But I could tell by the accent he was just another drone in a call center in East Fuckistan who had neither the ability nor the intention to improve my day. After two more iterations of this I had had enough and hung up.


Dell. You pay more, but you’ll get less. Pass it on.


(Yes, I typoed the model number originally.)

