Colin Watson
Tue, 29 Jul 2008

(This post is inspired by this post on ArsGeek, hence the title. Thanks to Scott James Remnant for some of the ideas in this post.)

I'll get to the details of ArsGeek's post in a moment, but really the more interesting part of it is the discussion of how we sometimes end up releasing software with (from various people's perspectives) serious bugs, a question of release management. As a member of the Ubuntu release team, I thought I'd respond to this and try to clarify what goes on.

Release management strategies

Firstly, I know of no way to ensure a bug-free release of a system the size of a modern GNU/Linux distribution, with software coming from a huge variety of sources with all kinds of different QA standards, not to mention plenty of sources within the distribution itself where new bugs can be introduced. Indeed, ensuring releases free of serious bugs is rather more of an art than a science.

What may be surprising is that, for a composite distribution of software from lots of sources, it doesn't actually seem to make a whole lot of difference to bug density whether you choose to whack all the bugs out before picking a release date, or just pick a release date up-front and tell everyone to aim for that (as long as people believe your release dates; more on this later). I can't offer hard evidence for this, only the gut feel of somebody with experience of both approaches, but it does seem to me to be the case.

The idea of releasing "when it's ready" is a very appealing one. I have some experience here: I used to be one of Debian's release managers too, which is just about as far to that end of the scale as you get among GNU/Linux distributions. In theory, you put all the software into a big pot, stir it up, collect all the bugs, and at some point when it smells reasonably good you get everyone to whack away at the bugs in turn until they're all gone. Then you release.

In theory, theory and practice are the same thing ...
In practice, what happens is that there are a number of counterbalancing pressures. Developers don't want to be tied down waiting for all those other lazy developers who haven't fixed their bugs yet; users want the latest version of their favourite application which has some new and shiny feature; security support for the last release is getting harder as it gets older and the backport distance increases; people with new hardware want a new release that will support it; and so on. The longer you spend fixing bugs, the more the internal pressures show, and the harder it gets. Thus, even Debian compromises on this, and many of the bugs that are "fixed" shortly before release are in fact dealt with by removing leaf packages from the distribution, documenting workarounds, or simply deciding that they aren't release-critical. And this is only for the highest-severity bugs! Many other bugs, by straightforward necessity, are simply not important enough to hold up the release of the whole distribution. Now, Ubuntu opted for the time-based approach up front. You pick a date and you stick to it as if your life depended on it. Within that constraint, you figure out (read: guess) when new upstream software and major new features have to land in order to be able to fix the expected new bugs they introduce, and make case-by-case decisions on anything that slips past the intermediate deadlines you impose. You have to say "no" a lot, and you spend a lot of time in planning. Nowadays, this is familiar to most people: you don't delay a time-based release for new features. The hard bit comes when things go wrong, and then you find that you can't delay a time-based release to fix bugs either! If you slip your deadlines as a matter of course, then nobody will believe your release dates next time, and you'll start finding that people ignore the internal deadlines because "hey, they're not going to release on time anyway"; it all goes to pot very quickly. 
(We saw this in Debian: every time a published release date slipped, interest in putting in time to meet the next one diminished.)

Thus you typically have a difficult choice: you can roll back to the last-known-good version (if this is even possible; often enough, interdependencies mean you'd end up rolling back a whole subsystem in order to get back to something that worked, and figuring out whether that's worth it can be harder than just fixing the bug to start with), or you can document the new bugs that have been introduced. Each choice will make somebody unhappy, as one group of people expects the latest-and-greatest software, and another expects stability. Neither group is intrinsically wrong, but you can't satisfy both of them in the presence of buggy software.

Of course, there is a third choice: fix the bug already! In rather a lot of cases, we do, and we also spend a great deal of time triaging and organising bugs so that developers can focus on the most important ones first. In a time-based world, though, the clock is ticking and you have finite resources. If you have any, you can get your paid staff to focus on the really critical issues, and of course we do that; many of the Ubuntu team who are employed by Canonical pull some pretty long hours around release time. (As a side note, time-based releases seem to work a bit better when paid developers are involved because you can get away with doing this.) But all those paid staff tend to have extensive demands on their time, and so some things usually slip through the net.

Ubuntu 8.04 decisions

So, we have a bunch of tough decisions to make. I don't especially expect this to attract sympathy: after all, if it were easy, everyone would be doing it. Nevertheless, we have to make tough decisions all the time where there may not necessarily be a right answer, and they are an inevitable consequence of a time-based release process.
When this kind of thing goes wrong, people rightly criticise us; as ArsGeek says, the buck stops with Ubuntu and Canonical as far as Ubuntu users are concerned. It often isn't as straightforward as it looks, though, since you don't get a practical demonstration of what would have gone wrong the other way round. If you look at it purely on the basis of bugs introduced, it seems clear that we should have held back on switching to GVFS (and GIO) and instead stuck with GNOME-VFS. The VFS layer is at the core of GNOME, though. Sticking with GNOME-VFS would mean either that we'd have to be using something that the upstream developers were no longer putting much effort into supporting, very probably resulting in other regressions, or we'd have had to ship GNOME 2.18 in Ubuntu 8.04 as we did in 7.10. This would be a big deal: the update in 8.10 would be fiendishly difficult because now we'd be behind, security support would be harder to achieve, software developers would find that software developed on other systems of around the same vintage might not even build on Ubuntu, and many users would be unhappy because they'd be missing out on new features introduced in GNOME 2.20. Instead, what we decided to do was to take the pain and do what we could to mitigate it, and, partly as a result of this, we handled the 8.04.1 point release somewhat differently from how we've handled similar point releases in the past. We set a date for 8.04.1 in advance, and a number of people worked solely towards that. Canonical sponsored work in GNOME to help iron out certain GVFS bugs that had already been raised as major omissions in the new framework. We agreed that 8.04.1 would be the first in a series of six-monthly point releases of 8.04 (out of phase with the main Ubuntu release cycle) so that users would be able to plan around it. There were various other major decisions in Ubuntu 8.04, some of which made headlines and some of which didn't. 
Here are a couple of examples with contrasting results:
Specific problems

ArsGeek raised several specific annoyances, so I'll respond to those while I'm here.
In many of these cases, I noticed that nobody had actually filed a bug to tell us about the problem. While of course we do as much testing as we can internally, and you can always say "oh, that's obvious, they should have tested that one", we still don't have anything like the number of people you'd need to do complete testing on a full operating system, and we do have to rely on our users to report problems in the things they use. So, the last in my list of reasons why annoyances exist is that, every so often, developers don't notice them, and the users who do notice them don't report them!

Mon, 23 Jun 2008

Christoph:
That's because

    $ perl -le 'print "yoo" if (1 + 1) =~ /3/'

perlop(1) has a useful table of precedence.
To my horror, I recently saw this online SSH key generator. I hope nobody reading this needs to be told why this is a bad idea. However, in case you do, here are a few reasons:
I think this is probably being done in innocent seriousness (although I kind of hope it's a joke in poor taste), and have e-mailed the contact address offering to explain why it's a bad idea.

Sat, 12 Apr 2008

Ubuntu's live CD installer, Ubiquity, needs to suppress desktop automounting while it's doing partitioning and generally messing about with mount points, otherwise its temporary mount points end up busy on unmount due to some smart-arse desktop component that decides to open a window for it. To date, it employs the following methods, each of which was sufficient at the time:
This is getting ridiculous. Dear desktop implementors: please pick a configuration mechanism and stick to it, and provide backward compatibility if you can't. This is not a rocket-science concept. I rather liked the
I hacked together a little timesaver for developers this morning: omni completion for Launchpad bugs in Vim's debchangelog mode. To use it, install vim 7.1-138+1ubuntu3 once it hits the mirrors, open up a debian/changelog file, type "LP: #", and hit Ctrl-X Ctrl-O. It'll think for a while and then give you a list of all the bugs open in Launchpad against the package in question, from which you can select to insert the bug number into your changelog. Here's a screenshot to make it clearer:
Thanks to Stefano Zacchiroli for doing the same for Debian bugs back in July.

Tue, 29 Jan 2008

See Encodings in man-db for context.

Yesterday, I uploaded man-db 2.5.1-1 to unstable. With this version, not only is it possible to install manual pages in UTF-8 (as with 2.5.0, although with fewer bugs), but it's also possible to ask man to produce a version of an arbitrary page in the encoding of your choice, and have it guess the source encoding for you fairly reliably. This finally provides enough support to have debhelper automatically recode manual pages to UTF-8.

It'll probably take a little while to shake out the corner-case bugs, but I'm generally pretty happy with this. Once the new man-db and debhelper land in testing, I'll send a note to debian-devel-announce and push harder on my policy amendment.

Considering the historical state of man-db when it comes to localisation, and all of the dependencies and general yak-shaving that had to be tackled to get here, this represents the end of probably several hundred hours of work, so I'm pretty happy that this is out the door. The only remaining step is to add UTF-8 input support to groff, which fortunately Brian M. Carlson is working on. After that, we can reasonably claim to have dragged manual pages kicking and screaming into the 21st century.

Thu, 29 Nov 2007

Erich: I do sometimes wonder why we don't relax the definition of "safe" upgrades to include installing new packages but still not removing old ones. I know that many of my uses of dist-upgrade are just for when something grows a new dependency that I didn't previously have installed. (Of course this wouldn't always help as it wouldn't account for a new dependency that conflicted with an old dependency, but never mind. It would certainly do wonders for the metapackage case.)

Mon, 17 Sep 2007

I've spent some quality upstream time lately with man-db. Specifically, I've been upgrading its locale support.
I recently published a pre-release, man-db 2.5.0-pre2, mainly for translators, but other people may be interested in having a look at it as well. I hope to release 2.5.0 quite soon so that all of this can land in Debian. Firstly, man-db now supports creating and using databases for per-locale hierarchies of manual pages, not just English. This means that apropos and whatis can now display information about localised manual pages. Secondly, I've been working on the transition to UTF-8 manual pages. Now, modulo some hacks, groff can't yet deal with Unicode input; some possible input characters are reserved for its internal use which makes full 32-bit input difficult to do properly until that's fixed. However, with a few exceptions, manual pages generally just need the subset of Unicode that corresponds to their language's usual legacy character set, so for now it's good enough to just recode on the fly from UTF-8 to some appropriate 8-bit character set and use groff's support for that. man-db has actually supported doing this kind of thing for a while, but
it's been difficult to use since it only applies to
I'm still debating whether Debian policy should recommend installing
UTF-8 manual pages in
If your key has so many UIDs and such a combinatorially exploded number of signatures on it that it takes gpg minutes just to start up in --edit-key mode, then I probably won't bother signing it. HTH, HAND.

Sat, 23 Dec 2006
More on the Google Summer of Code: as well as the project I'm mentoring for Debian, I'm mentoring Evan Dandrea (no blog yet?), writing a migration assistant for Ubiquity. I haven't talked much about Ubiquity, mostly because I've been far too busy writing it. Ubuntu has needed an installer for its live CD for a while, partly because, well, loads of users want it, and partly because it will cut Canonical's costs quite a lot if we only have to send out half the number of CDs in shipit (apparently single-CD packaging is significantly cheaper than double-CD packaging). I'd been resisting doing something from scratch because I love d-i and I think it would be a really bad idea to end up maintaining two installer implementations from the ground up (live CDs are nice, but they don't cut it for everyone). So when I was given the task of doing a shiny live CD installer with a custom-designed UI, while I started with a more-or-less from-scratch implementation put together by Guadalinex (thanks!), I fairly quickly diverged from that and morphed it into something that uses d-i code for as much of the backend operation and logic as it sensibly can. I'd call it a sort of debconf frontend except that its design is almost opposite to how a debconf frontend should work, in that it's highly specialised to react to particular question names. This had a lot of advantages in terms of being fairly quick to write, although in the long term I think I might prefer something closer to cdebconf plugins for the job; we'll see how things turn out. Evan wanted to write an extension to this to automatically migrate settings and documents from an existing Windows installation, which I think would be an absolutely excellent thing to have: automatic migration is a real killer feature in an installer. In fact, d-i already has a start at this, namely os-prober, which in conjunction with some bootloader installer code magically sets up boot menu entries for other operating systems on your disk. 
So, I suggested to Evan that he might want to put most of the clever logic in a udeb, so that it can be used in d-i too, and to my relief he seemed quite enthused by the idea. He's starting on the preliminary work now and I look forward to seeing his progress. Best of luck, Evan!
I'm mentoring Matheus Morais in the Google Summer of Code, porting d-i to the Hurd. We've exchanged a few mails and he has in hand all the preliminary (but not yet functional; wouldn't want to make it too easy :-)) patches I've put together in the past. I think I should be reasonably well-placed to judge his progress. Best of luck, Matheus!

Mon, 06 Feb 2006

Joey writes about the lack of new tools that fit into the Unix philosophy. My favourite of such things I've written is sponge. It addresses the problem of editing files in-place with Unix tools, namely that if you just redirect output to the file you're trying to edit then the redirection takes effect (clobbering the contents of the file) before the first command in the pipeline gets round to reading from the file. Switches like sed -i and perl -i work around this, but not every command you might want to use in a pipeline has such an option, and you can't use that approach with multiple-command pipelines anyway.

I normally use sponge a bit like this:

    sed '...' file | grep '...' | sponge file

Since it's so trivial I imagine lots of other people have written something similar (another common name for it seems to be inplace; my name indicates soaking up all the input and then squeezing it all out again); but I do keep meaning to try to get a rewritten version into coreutils at some point.

Fri, 27 Jan 2006
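(A postscript to the sponge entry of 06 Feb 2006 above.) The "soak up all the input, then squeeze it out" idea can be sketched in a few lines. This is only an illustration in Python, not the actual sponge source; the function name and its two-argument signature are assumptions made for the example. The point it shows is ordering: the output file is opened (and so truncated) only after the input has been read to the end.

```python
from typing import BinaryIO


def sponge(stream: BinaryIO, path: str) -> None:
    """Hypothetical sketch of a sponge-like helper (not the real tool).

    Reads the whole of `stream` into memory first, and only then opens
    `path` for writing, so `path` may safely be the file that `stream`
    (or an earlier stage of a pipeline) is still reading from.
    """
    # Soak: consume every byte of input before touching the output file.
    data = stream.read()

    # Squeeze: opening with "wb" truncates the file, which is harmless
    # now that nothing remains to be read from it.
    with open(path, "wb") as out:
        out.write(data)
```

To use it as a pipeline stage, one could call `sponge(sys.stdin.buffer, sys.argv[1])` from a small script. Holding everything in memory keeps the sketch short; a more careful version might spill large inputs to a temporary file.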
Joey has been
campaigning
for a while to get everything in the archive changed to depend on
Recently it occurred to me that we didn't necessarily have to do it that
way round. In a bout of late-night hacking while staying awake to look after
a sick child (he seems mostly OK now, although the rushed trip to the
hospital earlier was a bit on the nerve-wracking side), I've shuffled things
around in the cdebconf package so that it no longer has any file conflicts
with debconf or debconf-doc, and changed the debconf confmodule to fire up
the cdebconf frontend rather than its own if the
DEBCONF_USE_CDEBCONF environment variable is non-empty. (The
details of this may change before it actually gets uploaded, as I'd like to
get Joey to look it over and approve it first.) This allows you to install
cdebconf, set that environment variable, and play around with cdebconf with
relative ease; when we come to switch to cdebconf for real, instead of a
huge conflicting mess that apt will probably have trouble resolving, it'll
just be a matter of changing a couple of lines in
Of course, don't expect cdebconf to be a complete working replacement for debconf just yet; if you try using it for a dist-upgrade run it'll fall over. Due to its d-i heritage, it doesn't yet load templates automatically; that has to be done by hand. Frontend names differ from debconf's, which will need some migration code. At the moment it can only handle UTF-8 templates, which are mandated in the installer but only optional in the rest of the system. It doesn't have all of debconf's rich array of database modules. I haven't adapted the Perl or Python confmodules yet. The list goes on. However, I think we at least stand a chance of getting a handle on the problem now.

(I'll post this article to debian-devel once the changes have been reviewed and uploaded.)

Mon, 09 Jan 2006

Working on free software has made me fairly revision control system-agnostic; I can't afford to get too wedded to any one system because as soon as I do somebody will invent something new and I'll have to convert again, so I just work with whatever other people on the same project are using. Even CVS doesn't make a lot of difference to the way I work as long as I'm working online and have cvsps handy. And of course I usually don't bother with revision control if I'm just tweaking somebody else's Debian source package a bit (in which case I just use debdiff for paranoia).

Using bzr at work, though, I think I just found my killer app in Michael Ellerman's shelve plugin. My working style generally involves alternating between doing lots and lots of stuff in the one working copy and (after testing) going through and committing it in logical chunks. This is fine if everything's in separate files (most revision control systems let you commit just some files), but if several of the chunks are in the one file then I'm reduced to saving diffs and manually editing out the bits I don't want to commit yet, which is obviously pretty tedious and error-prone.
bzr shelve presents each diff hunk in your working copy to you in turn and asks you whether you want to keep it. If you say no, that hunk gets unapplied and goes into a "shelf", where bzr unshelve can later reapply it. In the meantime commits act as though the shelved hunks didn't exist. This doesn't help if you want to defer only one of two immediately adjacent changes that end up in the same hunk, of course, but it vastly reduces the scale of the problem.

I suppose it would be easy enough to write a shelve-a-like for any other system; it's just that I haven't seen it for any other system yet. If working with systems that lack it really starts to annoy me, I may have to rip out the guts of shelve and figure out how to make it generic.

Tue, 03 Jan 2006

Hot on the heels of Joey's tale of getting rid of base-config (the second stage of the installer) in Debian, we've now pretty much got rid of it in Ubuntu Dapper too. The upshot of this is that rather than asking a bunch of questions, installing the base system, and rebooting to install everything else, we now just install everything in one go and reboot into a completed system.

This does mean that, if your system doesn't boot, you don't get to find out about it for a bit longer. However, it has lots of advantages in terms of speed (the much-maligned archive-copier mostly goes away), reducing code duplication (base-config had a bunch of infrastructure of its own which was done better in the core installer anyway), comprehensibility, and killing off some annoying bugs like #13561 (duplicate mirror questions in netboot installs), #15213 (second stage hangs if you skip archive-copier in the first stage), and #19571 (kernel messages scribble over base-config's UI).

To go with Joey's Debian timeline, the Ubuntu history looks a bit like this:
Although it caused some friction, I'm glad that we did the first cuts of many of these things outside Debian and got to try things out before landing version-2-quality code in Debian. The end result is much nicer than the intermediate ones ever were.

Sometimes following up on a bug takes you a lot further than you expected. Debian bug #337041 looked like it was going to be fairly straightforward once I upgraded coreutils to figure out what the new IUTF8 flag actually did, since the SSH2 protocol already supports transferring termios flags around.

Unfortunately, since IUTF8 is relatively new, it doesn't have a number assigned in the draft connection protocol. Moreover, that Internet-Draft is in the last stages before becoming an RFC and can't be modified any more, and it doesn't include any facility for private-use extensions. D'oh. To add further complication, since IUTF8 is Linux-specific, it's not hard to imagine that some other OS might introduce something with the same name but subtly different semantics, and so the SSH protocols can't just defer to POSIX for the definition but instead have to spell out exactly what they mean.

As a result of all of this, it looks like the best way to make progress might be for me to write an I-D myself that creates a channel extension to set or clear IUTF8, and attempt to enlist support from some upstream implementors. I didn't expect bug triage to lead me into the Internet standardisation process quite so quickly!

New year, new blog. I've had a LiveJournal for a while, but don't write very much in it, and many of its readers wouldn't be interested in me talking about Debian and such anyway. I think the best solution is for me to keep technical posts here.