domm (4030)

http://domm.plix.at/

Just in case you'd like to know, I'm currently a full-time father of 2 kids, half-time Perl hacker, sort-of DJ, bicyclist, no longer dreadlocked and 33 years old.

I'm also head of Vienna.pm [pm.org], maintainer of the CPANTS [perl.org] project, member of the TPF Grants Committee [perlfoundation.org] and the YAPC Europe Foundation [yapceurope.org].

I've got stuff on CPAN [cpan.org], held various talks [domm.plix.at] and organised YAPC::Europe 2007 in Vienna [yapceurope.org].

Journal of domm (4030)

Thursday July 31, 2008
01:18 PM

hateful mysql

[ #37070 ]

GAAA!!!

Yesterday I got a "funny" bug report: If the user 'märrie' logs in, she gets the account of user 'marrie'. As the system in question gets its data from a combination of different sources (SAP, a semi-external single sign-on system, some local data) I expected some hellish encoding problems (which would be strange, because we switched the whole system over to utf8 two years ago).

So I start my bug-hunt at the lowest level and connect to the MySQL DB. I do a quick "select username from users where username = 'märrie'" (just for basic sanity checking) and get back:
marrie
märrie

WTF??

To cut a long bug-hunt short: MySQL considers 'ä' and 'a' to be the same character (at least under the default utf8 collation, utf8_general_ci) - not only for sorting, where this makes a little bit of sense, but also for selecting, which is totally pointless.

The solution: use the 'utf8_bin' collation:
mysql> SELECT 'ä' = 'a' COLLATE utf8_general_ci;
+-----------------------------------+
| 'ä' = 'a' COLLATE utf8_general_ci |
+-----------------------------------+
|                                 1 |
+-----------------------------------+
1 row in set (0.00 sec)

mysql> SELECT 'ä' = 'a' COLLATE utf8_bin;
+----------------------------+
| 'ä' = 'a' COLLATE utf8_bin |
+----------------------------+
|                          0 |
+----------------------------+
1 row in set (0.00 sec)
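
For what it's worth, here is a minimal sketch of how the fix could be applied, either per query or on the column itself. The users table and username column are taken from the sanity-check query above; the VARCHAR(255) NOT NULL definition is purely an assumption, so substitute the column's real type. One caveat: utf8_bin compares code points, so it is case-sensitive as well, and 'Marrie' and 'marrie' become two different usernames.

```sql
-- Per query: force a binary comparison for this one lookup.
SELECT username FROM users WHERE username = 'märrie' COLLATE utf8_bin;

-- Permanently: change the collation of the column itself.
-- ASSUMPTION: VARCHAR(255) NOT NULL is a made-up definition. MODIFY restates
-- the whole column definition, so use the real one here.
ALTER TABLE users
    MODIFY username VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin NOT NULL;

-- Check the result: the Collation column should now read utf8_bin.
SHOW FULL COLUMNS FROM users LIKE 'username';
```

With the column switched over, the plain "where username = 'märrie'" lookup should no longer match 'marrie', without needing a COLLATE clause in every query.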

/me hates MySQL!

The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • $ cd projects/MahDayJobb/
    $ cat sql/schemapatch-00039.sql
    ALTER DATABASE CHARACTER SET = 'utf8' COLLATE = 'utf8_unicode_ci';