William Jiang

JavaScript,PHP,Node,Perl,LAMP Web Developer – http://williamjxj.com; https://github.com/williamjxj?tab=repositories

utf8_general_ci vs. utf8_unicode_ci

When setting up a MySQL database, what is the difference between utf8_general_ci and utf8_unicode_ci?
I found a good explanation in this MySQL forums post: http://forums.mysql.com/read.php?103,187048,188748#msg-188748

utf8_general_ci is a very simple collation. For each character, it:
– removes all accents,
– then converts to upper case,
and uses the code of the resulting “base letter” for comparison.
For example, the Latin letters ÀÁÅåāă (and every other Latin letter “a”
with any accent, in either case) all compare as equal to “A”.
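
As a rough illustration of the two steps above, here is a minimal Python sketch. It is not how MySQL implements the collation (MySQL uses its own internal weight tables). The function name general_ci_like_key is made up for this example, and Unicode NFD decomposition is an assumption standing in for "removes all accents".

```python
import unicodedata

def general_ci_like_key(text):
    """Approximate the described behaviour: drop accents, then upper-case."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Discard the combining marks, keeping only the base letters.
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_accents.upper()

letters = "ÀÁÅåāă"
keys = {general_ci_like_key(ch) for ch in letters}
print(keys)  # all six letters collapse to the single key "A"
```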

utf8_unicode_ci uses the default Unicode collation element table (DUCET).

The main differences are:
1. utf8_unicode_ci supports so-called expansions and ligatures, for example:
the German letter ß (U+00DF LATIN SMALL LETTER SHARP S) is sorted near “ss”, and
the letter Œ (U+0152 LATIN CAPITAL LIGATURE OE) is sorted near “OE”.
utf8_general_ci does not support expansions or ligatures; it sorts
all these letters as single characters, and sometimes in the wrong order.

2. utf8_unicode_ci is *generally* more accurate for all scripts.
For example, in the Cyrillic block,
utf8_unicode_ci is fine for all of these languages:
Russian, Bulgarian, Belarusian, Macedonian, Serbian, and Ukrainian,
while utf8_general_ci is fine only for the Russian and Bulgarian subset of Cyrillic.
The extra letters used in Belarusian, Macedonian, Serbian, and Ukrainian
are not sorted well by utf8_general_ci.
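
To make point 1 more concrete, here is a small Python sketch that contrasts a one-character-to-one-weight comparison with one that supports expansions. The EXPANSIONS table and both key functions are hypothetical and cover only the two letters mentioned above. The real DUCET is far larger, and MySQL does not compare strings through Python-style keys.

```python
import unicodedata

# Hypothetical, deliberately tiny expansion table (an assumption for this sketch).
EXPANSIONS = {"ß": "ss", "Œ": "OE", "œ": "oe"}

def single_char_key(text):
    """Each character yields at most one base letter (no expansions)."""
    result = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        # str.upper() turns "ß" into "SS"; keep one letter to stay single-character.
        result.append(base.upper()[:1])
    return "".join(result)

def expansion_key(text):
    """Expand ligatures and sharp s first, then compare base letters."""
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in text)
    return single_char_key(expanded)

words = ["Ofen", "Œuvre", "Ode"]
by_single = sorted(words, key=single_char_key)
by_expansion = sorted(words, key=expansion_key)
print(by_single)     # Œ sorts after every plain "O" word
print(by_expansion)  # Œ sorts between "Ode" and "Ofen", i.e. near "OE"
```

The MySQL manual describes the same contrast for sharp s: ß compares equal to “s” under utf8_general_ci and equal to “ss” under utf8_unicode_ci.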

The disadvantage of utf8_unicode_ci is that it is a little
slower than utf8_general_ci.

So when you need a better sort order, use utf8_unicode_ci,
and when performance is your overriding concern, use utf8_general_ci.
