30 Days in the Hole
Yes, it’s been a month since I posted here. To be more precise, 30 Days in the Hole – I’ve been heads-down on a project with a deadline which I just barely met, and then preoccupied with cleanup from that effort.
The project was reposurgeon’s biggest conversion yet, the 280K-commit history of the GNU Compiler Collection. As of Jan 11 it is officially lifted from Subversion to Git. The effort required to get that done was immense, and involved one hair-raising close call.
I was still debugging the Go translation of the code four months ago when the word came from the GCC team that they had a firm deadline of December 16 to choose between reposurgeon and a set of custom scripts written by a GCC hacker named Maxim Kyurkov. Which I took a look at – and promptly recoiled from in horror.
The problem wasn’t the work of Kyurkov himself; his scripts looked pretty sane to me. But they relied on git-svn, and that was very bad. It works adequately for live gatewaying to a Subversion repository, but if you use it for batch conversions it has any number of murky bugs, including a tendency to badly screw up the location of branch joins.
The problem I was facing was that Kyurkov and the GCC guys, never having had their noses rubbed in these problems as I had, might be misled by git-svn’s surface plausibility into using it, and winding up with a subtly damaged conversion and increased friction costs for the rest of time. To head that off, I absolutely had to win on 16 Dec.
Which wasn’t going to be easy. My Subversion dump analyzer had problems of its own. I had persistent failures on some particularly weird cases in my test suite, and the analyzer itself was a hairball that tended to eat RAM at prodigious rates. Early on, it became apparent that the 128GB Great Beast II was actually too small for the job!
But a series of fortunate occurrences followed. One was that a friend at Amazon was able to lend me access to a really superpowered cloud machine with 512GB. The second and much more important was in mid-October, when a couple of occasional reposurgeon contributors, Julien “__FrnchFrgg__” Rivaud and Daniel Brooks, showed up to help – Daniel having wangled his boss’s permission to go full-time on this until it was done. (His boss’s company is critically dependent on GCC flourishing…)
Many, many hours of hard work followed – profiling, smashing out hidden O(n**2) loops that exploded on a repo this size, reducing working set, fixing analyzer bugs. I doubled my lifetime consumption of modafinil. And every time I scoped what was left to do I came up with the same answer: we would just barely make the deadline. Probably.
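For readers who haven't met a hidden quadratic loop, here is a minimal Go sketch of the general pattern. It is not reposurgeon's code: the `node` type and both function names are hypothetical, invented only for illustration. The first version rescans earlier records for every record; the second gets the same answer in one pass by remembering the last revision seen per path in a map.

```go
package main

import "fmt"

// node is a hypothetical stand-in for one record in a parsed dump.
// It is NOT reposurgeon's actual data structure.
type node struct {
	path     string
	revision int
}

// prevOnPathQuadratic returns, for each node, the revision of the most
// recent earlier node with the same path, or -1 if there is none.
// The inner backward scan makes this O(n^2) in the worst case.
func prevOnPathQuadratic(nodes []node) []int {
	out := make([]int, len(nodes))
	for i := range nodes {
		out[i] = -1
		for j := i - 1; j >= 0; j-- {
			if nodes[j].path == nodes[i].path {
				out[i] = nodes[j].revision
				break
			}
		}
	}
	return out
}

// prevOnPathLinear computes the same result in a single pass by keeping
// a map from path to the last revision seen for that path.
func prevOnPathLinear(nodes []node) []int {
	out := make([]int, len(nodes))
	last := make(map[string]int, len(nodes))
	for i, n := range nodes {
		if rev, ok := last[n.path]; ok {
			out[i] = rev
		} else {
			out[i] = -1
		}
		last[n.path] = n.revision
	}
	return out
}

func main() {
	nodes := []node{
		{"trunk/a.c", 1},
		{"trunk/b.c", 2},
		{"trunk/a.c", 3},
		{"trunk/a.c", 5},
	}
	fmt.Println(prevOnPathQuadratic(nodes))
	fmt.Println(prevOnPathLinear(nodes))
}
```

On a handful of records the two are indistinguishable. The difference only shows up at the scale of hundreds of thousands of commits, which is why loops like the first one can hide for years.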
Until…until I had a moment of perspective after three weeks of futile attempts to patch the latest round of Subversion-dump analyzer bugs and realized that trying to patch-and-kludge my way around the last 5% of weird cases was probably not going to work. The code had become a rubble pile; I couldn’t change anything without breaking something else.
It looked like time to scrap everything downstream of the first-stage stream parser (the simplest part, and the only one I was completely sure was correct) and rebuild the analyzer from first principles using what I had learned from all the recent failures.
Of course the risk I was taking was that come deadline time the analyzer wouldn’t be 95% right but rather catastrophically broken – that there simply wouldn’t be time to get the cleaner code working and qualified. But after thinking about the odds a great deal, I swallowed hard and pulled the trigger on a rewrite.
I made the fateful decision on 29 Nov 2019, and as the Duke of Wellington famously said, “It was a damned near-run thing.” If I had waited even a week longer to pull that trigger, we would probably have failed.
Fortunately, what actually happened was this: I was able to factor the new analyzer into a series of passes, very much like code-analysis phases in a compiler. The number fluctuated; there ended up being 14 of them, but – and this is the key point – each pass was far simpler than the old code, and the relationships between them well-defined. Several intermediate state structures that had become more complication than help were scrapped.
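The shape of a pass-structured design can be sketched in a few lines of Go. This is an illustration of the general idea only, under loud assumptions: the `state` and `pass` types, `runPasses`, and the three toy passes are all hypothetical and have nothing to do with the actual 14 passes or their data structures. Each pass is a small function over shared state, run in a fixed order, and a failure is reported with the name of the pass that raised it.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// state is a hypothetical shared structure handed from pass to pass.
type state struct {
	lines []string // data being refined
	done  []string // names of passes completed so far
}

// pass pairs a name with one small, single-purpose transformation.
type pass struct {
	name string
	run  func(*state) error
}

// runPasses applies each pass in order and stops at the first failure,
// wrapping the error with the name of the pass that produced it.
func runPasses(s *state, passes []pass) error {
	for _, p := range passes {
		if err := p.run(s); err != nil {
			return fmt.Errorf("pass %q: %w", p.name, err)
		}
		s.done = append(s.done, p.name)
	}
	return nil
}

// examplePasses returns three toy passes, purely for illustration.
func examplePasses() []pass {
	return []pass{
		{"trim", func(s *state) error {
			for i, l := range s.lines {
				s.lines[i] = strings.TrimSpace(l)
			}
			return nil
		}},
		{"drop-empty", func(s *state) error {
			kept := s.lines[:0]
			for _, l := range s.lines {
				if l != "" {
					kept = append(kept, l)
				}
			}
			s.lines = kept
			return nil
		}},
		{"validate", func(s *state) error {
			if len(s.lines) == 0 {
				return errors.New("no content left")
			}
			return nil
		}},
	}
}

func main() {
	s := &state{lines: []string{" a ", "", "b"}}
	if err := runPasses(s, examplePasses()); err != nil {
		fmt.Println("failed:", err)
		return
	}
	fmt.Println(s.lines, s.done)
}
```

The payoff of this shape is that each pass can be tested and reasoned about in isolation, and what one pass hands to the next is explicit in the shared state rather than buried in one large routine.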
Eventually Julien took over two of the trickier intermediate passes so I could concentrate on the worst of the bunch. Meanwhile, Daniel was unobtrusively finding ways to speed the code and slim its memory usage down. And – a few days before the deadline – the GCC project lead and a sidekick showed up on our project channel to work on improving the conversion recipe.
After that, formally getting the nod to do the conversion was not a huge surprise. But there was a lot of cleanup, verification, and tuning to be done before the official repository cutover on Jan 11. What with one thing and another, it was Jan 13 before I could declare victory and ship 4.0.
After which I promptly…collapsed. Having overworked myself, I picked up a cold. Normally for me this is no big deal; I sniffle and sneeze for a few days and it barely slows me down. Not this time – hacking cough, headaches, flu-like symptoms except with no fever at all, and even the occasional dizzy spell because the trouble spread to my left ear canal.
I’m getting better now. But I had planned to go to the big pro-Second Amendment demonstration in Richmond on Jan 20th and had to bail at the last minute because I was too sick to travel.
Anyway, the mission got done. GCC has a really high-quality Git repository now. And there will be a sequel to this – my first GCC compiler mod.
And posting at something like my usual frequency will resume. I have a couple of topics queued up.
Eric S. Raymond's Blog
