How not to package a library

I need to vent a little steam about the way that GNU binutils is packaged in Ubuntu.

Now binutils is a very important package; it contains many of the tools which form the basis of the development toolchain on Linux and other free platforms, and which are widely used on non-free platforms too.  Important parts of the build pipeline such as gas, ld, ar, and strip live here.  So do a bunch of command-line utilities which are less well known but just as useful in their way, like strings, objdump, readelf, and addr2line.

Binutils also contains some more obscure stuff that practically nobody uses…like the BFD library.

BFD stands for Binary File Descriptor (not Big F***ing Deal as you might guess).  It’s basically the binutils internal abstraction layer for object files, executables, and core files.  These days on almost any Unix-like platform that means ELF objects, but the BFD library supports all kinds of file formats and platforms.  Dozens of them.

So BFD is a very important library.  And its packaging is a mess.

BFD lives in a kind of confused halfway state between being a supported externally visible library and an internal abstraction layer. The library is documented, and the documentation is shipped with binutils and includes this statement:

BFD is a package which allows applications to use the same routines to
operate on object files whatever the object file format. A new object
file format can be supported simply by creating a new BFD back end and
adding it to the library.

Now that really sounds like it’s meant to be external. But, and this is a big but, there’s no meaningful ABI for the library. Its internal structures are entirely visible in the header file, and the API requires you to reach into those structures, but the names, types, and order of the fields are subject to arbitrary change from one binutils version to the next. The library has no API or ABI versioning scheme either, so there’s no easy way to write code which handles multiple versions.
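
To make that concrete, here’s roughly what a minimal BFD client looks like (a sketch from memory, so treat the details loosely): even just listing section names means chasing pointers through the bfd and asection structures that bfd.h exposes.

#define PACKAGE "bfd-client"          /* some packaged copies of bfd.h    */
#define PACKAGE_VERSION "0"           /* refuse to compile without these  */
#include <bfd.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    bfd *abfd;
    asection *sec;

    if (argc < 2)
        return 1;
    bfd_init();
    abfd = bfd_openr(argv[1], NULL);   /* NULL means the default target */
    if (abfd == NULL || !bfd_check_format(abfd, bfd_object)) {
        fprintf(stderr, "%s: not an object file\n", argv[1]);
        return 1;
    }
    /* The section list is reached directly through structure fields whose
     * names and layout are whatever this particular binutils release says. */
    for (sec = abfd->sections; sec != NULL; sec = sec->next)
        printf("%s\n", sec->name);
    bfd_close(abfd);
    return 0;
}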

Sensibly written libraries go to great lengths to hide their internal structures, or at least to keep the ABI view of those structures stable, and to keep the signatures of their functions stable. GNOME libraries typically do this.
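
A minimal sketch of that approach, with purely illustrative names (this is not any real library’s API): the public header forward-declares the handle and exposes only functions, so the real structure can change shape freely between releases without breaking the ABI.

/* widget.h -- illustrative only.  The structure is merely forward-declared
 * here; its fields live in widget.c and can be added, removed, or
 * reordered at will without affecting callers. */
#ifndef WIDGET_H
#define WIDGET_H

typedef struct widget widget_t;              /* opaque handle */

widget_t *widget_open(const char *name);     /* accessor functions      */
int       widget_frob(widget_t *w, int n);   /* instead of letting      */
void      widget_close(widget_t *w);         /* callers poke at fields  */

#endif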

Another set of libraries will change their APIs and ABIs from time to time but use the ELF symbol versioning mechanism to ensure that older applications will continue to run. The best example is glibc. On my system there are two (presumably incompatible) versions of the pthread_cond_timedwait function.


% objdump --dynamic-syms /lib/x86_64-linux-gnu/libc.so.6
...
0000000000101ed0 g DF .text 0000000000000026 GLIBC_2.3.2 pthread_cond_timedwait
0000000000131c70 g DF .text 0000000000000026 (GLIBC_2.2.5) pthread_cond_timedwait
...
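
For the curious, the mechanism looks roughly like this (a hedged sketch with made-up names, not glibc’s actual source): the library keeps both implementations and binds them to different version nodes with .symver, the nodes themselves being declared in a small version script passed to the linker.

/* Sketch of GNU symbol versioning; names are invented.  Build as a shared
 * library with a version script along the lines of:
 *     OLD_1.0 { };
 *     NEW_2.0 { } OLD_1.0;
 */
int my_wait_v1(int timeout_ms)               /* the original ABI */
{
    return timeout_ms;
}

int my_wait_v2(long long timeout_ns)         /* the new, incompatible ABI */
{
    return (int)(timeout_ns / 1000000);
}

/* '@' keeps the old implementation reachable under the old version node;
 * '@@' makes the new one the default for newly linked programs. */
__asm__(".symver my_wait_v1, my_wait@OLD_1.0");
__asm__(".symver my_wait_v2, my_wait@@NEW_2.0");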

Yet another class of libraries does change its API, but provides a way to detect this at compile time. For example, here’s some typical code which uses Berkeley DB:

#if (DB_VERSION_MAJOR > 3) || ((DB_VERSION_MAJOR == 3) && (DB_VERSION_MINOR > 0))
r = (dbenv->open)(dbenv, dbdir, flags, 0644);
#else
r = (dbenv->open)(dbenv, dbdir, NULL, flags, 0644);
#endif

But BFD doesn’t do any of that (well, that’s not entirely fair: some more recent releases have added a little support for conditional compilation, like BFD_SUPPORTS_PLUGINS). The ABI changes and there’s no way to know at link time.
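
For what it’s worth, that kind of guard can be tested defensively like this (a sketch; exactly how and when the macro is defined varies between binutils releases):

/* After including bfd.h: take the plugin-aware path only if the feature
 * macro exists and is true, since older releases don't define it at all. */
#if defined(BFD_SUPPORTS_PLUGINS) && BFD_SUPPORTS_PLUGINS
    /* safe to use the plugin-related parts of the API */
#else
    /* older BFD: fall back to plain object file handling */
#endif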

This creates one giant headache for distro packager folks.

RedHat solved it by not building or shipping a dynamic library for BFD. Instead they ship a static library only, in the binutils-devel package. This is pretty sensible; applications will link against -lbfd and get the static library. The executables might be a bit bigger, but they work and that counts for a lot.

Ubuntu solved it by shipping a dynamic library, and appending the version number of binutils to the name of the library. Not just a library version number which indicates the ABI revision and changes rarely, but the full version of the package, which changes on every single release. Like this

/usr/lib/libbfd-2.22.90-system.20120924.so

and all the binutils programs that link against it get the dynamic library and depend on that library at runtime.

% ldd /usr/bin/objdump
...
libbfd-2.22.90-system.20120924.so => /usr/lib/libbfd-2.22.90-system.20120924.so

which keeps those executable sizes down. When a new binutils package is released, there’s a new library name, but all the executables that link against it at runtime are also updated, so it Just Works, right?

Wrong! When somebody like me comes along and actually uses the BFD library in another package, like ggcov, everything falls apart. But not immediately, oh no. What happens is:

  1. User downloads the ggcov source.
  2. User builds ggcov, linking against -lbfd, and gets today’s libbfd dynamic library.
  3. Ubuntu’s packaging system detects and records in the ggcov package a dependency on binutils, but not on the specific version.
  4. User happily uses my package for a while.
  5. A few weeks later, Ubuntu releases a new binutils package with a new dynamic library name.
  6. Of course there’s no warning from the packaging system when the user upgrades binutils.
  7. But the BFD library ggcov was using is now gone.
  8. User runs ggcov again; it fails to start due to the missing library.
  9. User files a bug report against ggcov. Like Ubuntu #713811.
  10. User decides to try the latest version of ggcov; rinse and repeat.

And I get to pick up all the pieces. Great.

Anyway, the latest release of ggcov has a workaround for this mess. It tries quite hard to link against a static BFD library first, and only falls back to a dynamic library if no static library can be found. This is about a dozen lines of convoluted shell code in configure.in that would not be necessary if the Ubuntu folks were a little more thoughtful when packaging binutils.

Ok, that’s enough venting for now.

ggcov 0.9 released

Features include

  • support for gcc 4.7
  • automatically suppressing the failure branch of assert() statements
  • a workaround for the Ubuntu libbfd linking bug
  • numerous minor bug fixes and UI improvements.

You can grab the source or upgrade the Debian package.

Hyperloop

Hyperloop concept drawings, from SpaceX.com

Hyperloop was announced earlier today. I just sent the following to one of the two feedback addresses for Hyperloop.

G’day,

I read your hyperloop document with considerable interest today.

My first reaction was: f****ing awesome!

My second reaction was: I’m just a software engineer but I can see there’s some corners here that need more thought. So I wrote these down. I hope they help. Here goes…

5 minutes sounds like a very short loading time for 28 humans. You would certainly need a rapidly loadable luggage container. I suspect you’ll need more platforms and a longer loading time than you’ve allowed for.

What’s powering the capsule into the airlock from the tube side? Ditto on the platform side? Onto and off of the turntable?

I don’t see any calculations on how you are designing an airlock to achieve a < 2 min cycle time.

What happens if one of the linear induction motors has a transient failure, so that it fails to accelerate (or worse, decelerates) one capsule but succeeds in accelerating the next one? Will the second capsule catch up to the first?

What happens if there is a break in the tube? Earthquakes won’t do this but a terrorist or military strike could. Presumably the first capsule to arrive in the next 2 minutes is toast, but what about subsequent ones? You mention some form of automatic braking system – how is this triggered?

You mention two capsule configurations, one with small section for semi-recumbent passengers and one with a large section for motor vehicles. What about a pure cargo version using standardized aircraft load pallets?

http://en.m.wikipedia.org/wiki/Unit_load_device

Intermingling cargo capsules with passenger capsules in the capsule stream during off peak periods might be a good way to earn some more revenue without decreasing passenger throughput.

Speaking of semi-recumbent passengers, how do you plan to deal with obese or wheelchair-bound passengers?

There’s no room for toilet facilities and no room to move from a seat into a toilet; the journey is 30-40 minutes with no stop possible and you are carrying Californians with giant sodas. Will you be installing catheters or just hosing down the capsule at each stop?

It’s nice that the passengers will have an entertainment system. Will they have air? And light? Would breathing air be stored, or replenished using bleed air from the compressor? How about cabin air conditioning – you have a small sealed room full of humans, all perspiring, breathing, and farting.

6.11 feet is 1.86 metres, not 1.10.

Are your tube thickness calculations taking into account a failure mode where the capsule is deflected inside the tube and starts bouncing around between the walls?

What’s the limit on the capsule separation? Wake turbulence from the thrust nozzle? Automatic braking distance?

What happens if a pylon falls? Are the tubes attached to it, or just resting on top under their own weight?

The system is not immune to wind, ice, fog, and rain; they’re just much smaller operational threats. All moisture presents a danger of rust, so there will need to be a constant painting procedure for the tube. Freeze/thaw cycles could damage the pylons or their foundations. High rainfall could create boggy conditions around a single pylon, leading to subsidence larger or more rapid than the adjustment system can cope with. High winds could tear off the solar panels, potentially interrupting or reducing power to the linear induction motors. The ambient temperature of the low-pressure air in the tube would affect how much water was needed for cooling (albeit not by very much).

Safety/emergency exits… how?

You mention a braking system…to bring a 15 tonne capsule to rest from 1200 km/h. That would be an interesting design. I don’t see any allowance for it in the weight or cost budgets.

Why would you expect a level of security comparable to airports? The worst case casualty scenario for an onboard bomb is much smaller than with any airliner or even a train (assuming the braking system in following capsules actually works). And nobody is going to be hijacking a capsule to fly into a building. You could get away with a much faster and laxer security procedure, which would be a competitive advantage against air travel.

Best of luck with Hyperloop — this is the kind of thinking that makes America worth moving to.

Does the world need 32 bit inodes?

This one’s for filesystems geeks.

Eric Sandeen posted The world wants 32-bit inodes on his blog back in 2008, and recently followed up with another comment. His conclusion is that it’s (still) not safe to use 64-bit inode numbers (e.g. the inode64 option in XFS) because of compatibility issues with userspace programs.

Eric’s original post grew out of an email discussion we had back in 2007, and he used a script that I wrote back then to gather data. He’s carefully preserved that script – thanks Eric! – so I downloaded it and ran it on a more modern, fast-moving desktop/laptop distro to see if I could get another data point.

Here’s the output of the old script on a 64b Ubuntu 12.10 install.

$ ./summarise_stat.pl /usr/bin /bin /usr/sbin /sbin
58099 76.9% don’t use any stat() family calls at all
12493 16.5% use 32-bit stat() family interfaces only
4498 6.0% use 64-bit stat64() family interfaces only
491 0.6% use both 32-bit and 64-bit stat() family interfaces

So it turns out that script had a few problems.

  • I don’t have 70000 executables! The script had a bug which mishandled symlinks, in particular the symlink /usr/bin/X11 -> . which Ubuntu sets up, and massively overcounted executables.
  • The script doesn’t take into account the difference between 32b and 64b executables, in particular that a 64b executable can use the old stat() interface without problems because its st_ino field is wider (a quick way to check this is sketched just below).
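
Here’s a quick way to see that second point for yourself (a trivial sketch, nothing to do with the script itself): on a 64-bit build, or on a 32-bit build compiled with -D_FILE_OFFSET_BITS=64, st_ino is already 64 bits wide and plain stat() is safe.

/* Compile once as 64-bit and once as 32-bit, with and without
 * -D_FILE_OFFSET_BITS=64, and compare the reported width of st_ino. */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    struct stat sb;

    printf("sizeof st_ino = %zu bytes\n", sizeof(sb.st_ino));
    if (stat("/", &sb) == 0)
        printf("inode of / = %llu\n", (unsigned long long)sb.st_ino);
    return 0;
}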

Here’s an updated script with those and a few other quirks fixed. I also added an explicit list of all the broken executables. This is the result of the fixed script on the same machine:

$ ./summarise_stat.pl /usr/bin /bin /usr/sbin /sbin
Summary by status
-----------------
672 33.4% are scripts (shell, perl, whatever)
1338 66.5% are 64-bit executables
1 0.0% use both 32-bit and 64-bit stat() family interfaces [BROKEN]
1 0.0% BROKEN
List of broken files
--------------------
These use both 32-bit and 64-bit stat() family interfaces
/usr/bin/skype

So there’s one single executable which is doing the wrong thing. This is really good, because most recent machines are 64-bit, even desktops and laptops. For comparison, here’s the output on a 32-bit Ubuntu 12.10 box, edited for brevity:

$ ./summarise_stat.pl /usr/bin /bin /usr/sbin /sbin
Summary by status
-----------------
460 28.3% are scripts (shell, perl, whatever)
687 42.3% don’t use any stat() family calls at all
165 10.1% use 32-bit stat() family interfaces only [BROKEN]
305 18.8% use 64-bit stat64() family interfaces only
9 0.6% use both 32-bit and 64-bit stat() family interfaces [BROKEN]
174 10.7% BROKEN
List of broken files
--------------------
These use 32-bit stat() family interfaces only
/usr/bin/nautilus
/usr/bin/ischroot
/usr/bin/file-roller
/bin/mountpoint
/sbin/e2fsck
/sbin/mke2fs
/sbin/e2image
/sbin/cryptsetup

These use both 32-bit and 64-bit stat() family interfaces
[…nothing interesting…]

Note how there’s a larger proportion of 32-bit programs which use stat64() (i.e. do the right thing) than in the sample from 2008. Note also that very few of the programs which use stat() look like they might care about st_ino at all. It looks to me as though application developers and distro vendors have been addressing this problem by fixing applications.

Of course this is only a single machine install, not the entire software repo, so there may be sampling issues.

To go back to the theory, the executables having problems will be the intersection of 3 sets:

  1. 32b programs (every program on a 32bit machine, hardly any on a 64b machine), and
  2. programs using the old stat interface, and
  3. programs caring about the value stat() reports for st_ino.

In 2007 set 1 was the majority of programs; now it’s a minority, and shrinking thanks to the march of hardware progress. Set 2 has always been a small minority of all programs, and it looks like it’s shrinking due to the march of software progress. Set 3 is a very small minority. I would say the trend is towards problematic programs basically drying up. This is excellent news, if my data can be believed.

I think the right thing to do is for distros to force the issue. One way is to make the 32-bit glibc wrapper for the stat() family put a fixed invalid value into st_ino; 0xffffffff would probably be best. We could then document this field as being invalid for stat() on 32-bit machines (“please use stat64 instead”). That way, 32-bit programs which use the old stat() but ignore st_ino will keep working just fine, instead of the current situation where they get spurious EOVERFLOW errors.
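
In other words, something like this hypothetical behaviour in the compat layer (names and structure are entirely made up; this is not how glibc is actually organised):

#define _LARGEFILE64_SOURCE            /* for struct stat64 */
#include <sys/stat.h>

#define ST_INO_INVALID 0xffffffffUL    /* "don't trust this, use stat64" */

/* Hypothetical sketch of the proposed 32-bit wrapper behaviour: never
 * fail with EOVERFLOW because of st_ino; store a sentinel instead. */
static void fill_stat32_ino(struct stat *st32, const struct stat64 *st64)
{
    /* ...all the other fields get copied exactly as they are today... */
    if (st64->st_ino <= 0xffffffffULL)
        st32->st_ino = (unsigned long)st64->st_ino;
    else
        st32->st_ino = ST_INO_INVALID;
}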

Another perfectly reasonable approach would be to remove the overflow check for st_ino, just report the lower 32 bits, and document the problem in the stat(2) manpage. Currently EOVERFLOW is documented only to occur when st_size overflows 32b, not st_ino, so this documentation change would be a clarification and not a change in documented behaviour.

The Wild West is over

It seems that Tim Bray has a draft RFC for a new HTTP status code.

New status codes aren’t exactly common things because the guys who developed HTTP 1.1 filled in the handful of holes left by the very first HTTP standard. I did not expect to see another one, and yet here it is.

451 Unavailable For Legal Reasons

That’s right: the Law interfering with the technical operation of the World Wide Web is now so important and common an event that we need a status code for it baked into the HTTP protocol itself.

Tim’s example of usage is humorous in the best tradition of RFCs, but the inescapable fact is that the fun time for techies is over. The Internet’s Wild West phase, where anything goes, where sheer competence and bravado won respect, where fortunes were made, where men were men, horses were horses, and women were schoolmarms, is done. Finished. “Tamed”. Succumbed to the Rule of Law. Or more precisely, the Rule of Lawyers.

This is terrifying. What happens to the Type A/nonconformist/rugged individualist personalities who built the place? Superannuated. Marginalised. Incarcerated. Shot in the back during a poker game. Well maybe not that last one, the analogy is not that strong.

I wonder if Elon Musk needs help building his Mars colony.

Changing Jobs

Today was my last day at FastMail. It was great working with those guys and I would recommend it to anyone. But now I’m heading off to work for SGI in the Bay Area. It’s going to be a great challenge!

Equality

In the last few days, various folks on social media have replaced their profile pictures with a red rendering of the = symbol, mathematical talk for “equals”. Or various amusing riffs on that, such as the Equals-Dalek.

This is meant to indicate that the person involved supports the right of other people to choose to be married, without any level of government saying something silly like “no you can’t, because the person you love and want to marry has the wrong set of chromosomes”.

Just for the record, I support that freedom.

I also support the freedom to breathe. It’s more or less the same thing, just operating at a different level on Maslow’s hierarchy. It’s not something you can morally justify denying people, absent real damage caused to third parties.

I don’t feel the need to replace my profile picture with an = for the same reason I don’t feel the need to replace my profile picture with a big red O2. It should be obvious, duh.

The Oatmeal said it best

NovaProva 1.1 released

I’m pleased to announce that a new version of NovaProva, the new generation unit test framework for C, has been released. NovaProva is available for download at http://www.novaprova.org now.

Changes in version 1.1:

  • ported to Linux x86_64
  • all the meta-tests pass now
  • minor cleanups

For more information, see Getting Started.

Sentences that are always a lie

One of the things I really don’t enjoy collecting is sentences that are always a lie, i.e. sentences that are never spoken when they’re true. For example:

“Please don’t get me wrong, I’m not here to sell you anything”

The terrible reign of Windows phone is over

Yes, I have an iPhone again.

Yesterday I went out and bought a shiny new iPhone 4S. Apple had a sale on, and the Apple Store in Chadstone was extraordinarily busy. Of course the sale didn’t apply to the device I wanted…sigh…so the only effect of the sale was to make it harder to find a redshirted somebody to take my money. Quite a lot of my money.

Why didn’t I buy an iPhone 5? Because there’s nothing that compelling about it. A faster processor is nice, but my old iPhone was a 3GS so any current model will be an improvement. The 5’s camera is no better than the 4S’s. LTE might be nice, but I really don’t need to be giving Telstra any more money when I can use WiFi at home and in the office and Instapaper for most of my reading on the train. The 5 does have one large negative: a different connector. We are an iPhone household; three days ago there were three 30-pin chargers around the house, two at work, and one in the car. If we’re going to have to spend time and money changing all these over or fitting adaptors, I don’t see any good reason to go to yet another proprietary connector instead of the EU-standard Micro-USB connector.

So now I’m in a position to describe what it’s like to go back to an iPhone after several weeks of exile in Windows Phone land.

The very first thing I noticed: I actually had a choice of cases, instead of just one naff rubber thing from Nokia in my choice of two ugly fluorescent colours. Yay for a form factor that people actually make accessories for!

The very best thing I noticed: there are actual apps again! I can play Boggle. And Words with Friends. And use Urbanspoon. Yay for a platform that people actually write apps for!

The most significant accumulation of small effects: Mobile Safari is about a bajillion times better than mobile IE. Pages actually load reliably, instead of crapping out at the 80% mark about 20% of the time. Pages load much faster; it feels about two to three times as fast. Sometimes I barely have time to notice that the new page has loaded. Typically the page title and the first few lines of content appear on screen in under a second, even before the page has finished loading, instead of making me wait a minute or two for the whole thing to download. Touching clickable screen items doesn’t randomly highlight completely irrelevant rectangles on the screen before failing to notice the touch. Sites actually format themselves half decently, instead of deciding that you must be on a desktop because “Windows” is in the user agent string. Most amusingly, The Huff doesn’t decide I’m viewing from a Blackberry and go completely useless (this made me pity Blackberry users almost as much as Windows Phone users).

Best difference for a network engineer: Safari’s page loading progress bar moves in fits and starts as the page loads and keeps going until the end. This might look inelegant but at least it’s honest. Mobile IE on the other hand is obviously lying to me. It shows progress going linearly with time and smoothly from 0% to about 80%, always, for every page. Even when I’m in a Metro tunnel and I know that there is no reception. Even when the immediately upstream DNS server is not responding. Seriously Microsoft, it’s insulting.

Worst surprise about cleaning up a Windows phone: There is no obvious way to upload SMS conversations to somewhere else. WTF? Still exploring this one.

Least surprising surprise about cleaning up a Windows phone: The Pictures tile on the main screen shows a slideshow of randomly selected photos from your Camera Roll and Saved Photos albums. Even after you’ve “deleted” them. I saw this coming and avoided taking any embarrassing photos with the phone.

The one feature I actually miss from the Windows phone: The Windows virtual keyboard shows the virtual keys in the case that will actually be typed. So in shift mode, you see Q W E R T Y, but otherwise q w e r t y. The iOS virtual keyboard shows capitals always. It’s a small thing, but it’s surprising to see Microsoft actually get something more “right” than Apple.