2008-11-25 18:34:20

Your daily pot of MySQL insanity: Threading

I really know I shouldn't look into MySQL source code, but since there are customers who really entrust their data to this piece of software, I sometimes have to.

The task today: Updating PHP on Etch to version 5.2.6. Sounds easy, right? Wrong. Because after you put up with disabling all the new security extensions which prevent the worst PHP scripts you've ever seen from running, and after fiddling half a day with weird compile-time detection scripts which don't work in a uniform way, i.e. discover different environments under different circumstances, thus leaving you with a PHP which is in a constant debate with itself where its extension directory really is, …

…you finally come to the point where you have a working PHP package which waits infinitely on a futex on exit. A command as simple as php -v hangs indefinitely.

Following the code path further up, there is main(), php_module_shutdown(), zend_shutdown() (Why this stack order?!), zend_hash_graceful_reverse_destroy(), zend_hash_apply_deleter(), module_destructor(), zm_shutdown_mysql() (Argh, there's our other worst offender), mysql_server_end(), my_end(), my_thread_global_end(), pthread_cond_destroy(), …

Digging through the code a bit, and digging through the glibc code some more, it becomes clear that this is indeed one of the good old pthread_exit() NPTL bugs from older glibc days. For once, it seems that Debian has also shipped beta software to the world. At least it wasn't some important system component, only the libc.

Looking around some more, I stumbled over PHP bug #42625. It is not very helpful, though; it only proclaims:

 Problem will solve itself when you rename /lib/tls to

Thanks for the suggestion, dear PHP people, but I have to get some work done first. Of course the above workaround does its job, but it is simply not what you would like to see in your server environment. And I don't mean the name.

So I dug out a different patch for MySQL which adds code to detect whether or not the glibc NPTL is in use. In fact, it gets the check entirely wrong by checking if the OS is Linux, while the bug is certain to appear on other operating systems running glibc as well. In that precise moment, though, I couldn't have cared less.
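For comparison, glibc can be asked directly which threading library is in use, instead of guessing from the operating system. The following is only a sketch of that idea and not the code from the patch: running_on_nptl() is a made-up helper name, and it assumes a libc that provides the glibc-specific _CS_GNU_LIBPTHREAD_VERSION key for confstr().

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not taken from the MySQL patch.
 * Returns 1 if the threading library reports itself as NPTL,
 * 0 if it reports something else, and -1 if the query is unavailable. */
static int running_on_nptl(void)
{
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    char buf[64];
    /* confstr() returns the size needed including the trailing NUL,
     * or 0 if the key is not supported. */
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, sizeof(buf));

    if (n == 0 || n > sizeof(buf))
        return -1;
    /* glibc reports strings such as "NPTL 2.7" or "linuxthreads-0.10". */
    return strncmp(buf, "NPTL", 4) == 0 ? 1 : 0;
#else
    /* No such key on this libc: the caller has to decide what to assume. */
    return -1;
#endif
}
```

A caller would treat -1 as "don't know" and fall back to whatever the build-time default was.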

Digging through the related header files, some more horrid things turned up which made it perfectly clear that I should not have gone there. One comment was rather odd:

/*Irena: compiler does not like this: */
/*#define my_pthread_getprio(pthread_t thread_id) pthread_dummy(0) */

Yeah, Irena, I can imagine perfectly well why the compiler didn't like that. After all, typed arguments aren't really supported in preprocessor macros. It can be hacked in using tricks, but they will always be tricks. However, now that you found out that the compiler doesn't like it, why commit it?
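To spell out what the compiler wants here: macro parameters are bare names, and if the argument type is supposed to be checked, a small inline function does that without tricks. This is a sketch only. The pthread_dummy() below is a stand-in stub written for this example (the real one in the MySQL headers may look different), and my_pthread_getprio_checked() is a made-up name.

```c
#include <pthread.h>

/* Stand-in stub for illustration; MySQL's own pthread_dummy() may differ. */
static int pthread_dummy(int ret)
{
    return ret;
}

/* Macro parameters are plain identifiers. The preprocessor knows nothing
 * about types, so "pthread_t thread_id" is not a valid parameter name. */
#define my_pthread_getprio(thread_id) pthread_dummy(0)

/* Hypothetical alternative: an inline function gives real type checking. */
static inline int my_pthread_getprio_checked(pthread_t thread_id)
{
    (void)thread_id; /* unused, exactly as in the macro version */
    return pthread_dummy(0);
}
```

Both forms ignore their argument and return 0; the difference is that only the function version complains when it is handed something that is not a pthread_t.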

Only a number of lines further, there is some code to handle a nonexistent errno:

#ifndef ESRCH
/* Define it to something */
#define ESRCH 1
#endif

Maybe it is not the best idea after all to define ESRCH to the same value as EPERM, which is 1 on Linux.
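Why this matters is easy to show: on Linux, EPERM is 1 and ESRCH is 3, so any code that tells the two apart stops working once both are 1. Below is a sketch of a less harmful fallback. The value 10001 is an arbitrary assumption, picked only because it is far away from the usual errno range, and describe_thread_error() is a made-up function for demonstration.

```c
#include <errno.h>

/* Hypothetical fallback: a value far outside the usual errno range, so a
 * missing ESRCH can never alias EPERM. The number itself is an assumption. */
#ifndef ESRCH
#define ESRCH 10001
#endif

/* Made-up example of code that depends on the two codes being distinct.
 * With "#define ESRCH 1" on a platform where EPERM is 1, the second
 * branch could never be reached. */
static const char *describe_thread_error(int err)
{
    if (err == EPERM)
        return "permission denied";
    if (err == ESRCH)
        return "no such thread";
    return "other error";
}
```

On any platform that already defines ESRCH, the fallback is skipped and the function sees the real values.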

Then finally I am at the code I want to modify, but there is some oddity as well (there's always one more oddity). The added thread types for the header file are not flags or anything, they just enumerate types of libraries. However, no enum is used. That's ok, but the numbering is slightly odd:

#define THD_LIB_OTHER 1
#define THD_LIB_NPTL 2
#define THD_LIB_LT 4

Why bitwise numbering if it is not a flag? Or can a threading library be LT and OTHER at the same time?
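For mutually exclusive kinds, a plain enum with consecutive values says what is meant. A sketch, not what the patch does: the enum tag thd_lib and the helper thd_lib_name() are invented here, and reading LT as LinuxThreads is my assumption.

```c
/* Mutually exclusive library kinds: consecutive values, no bit patterns. */
enum thd_lib {
    THD_LIB_OTHER = 1,
    THD_LIB_NPTL,   /* 2 */
    THD_LIB_LT      /* 3, assumed to stand for LinuxThreads */
};

/* Hypothetical helper: maps a library kind to a printable name. */
static const char *thd_lib_name(enum thd_lib lib)
{
    switch (lib) {
    case THD_LIB_OTHER: return "other";
    case THD_LIB_NPTL:  return "NPTL";
    case THD_LIB_LT:    return "LinuxThreads";
    }
    return "unknown";
}
```

With an enum, the compiler can also warn when a switch forgets one of the kinds, which a pile of #defines never will.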

Of course, that's all not dangerous, but it gives a very odd impression …

Posted by Tonnerre Lombard | File under: broken, programming