Converting MySQL character sets

MySQL uses character set definitions on different levels. There is a default character set for the MySQL server, one character set for each database, one for each table and even one for a single column. You should generally take care that you use a consistent dataset for all columns and tables inside a database and the database itself. There exists no SQL-command or command line utility to convert a database from one character set to another. Therefore I will describe a quick way to accomplish this.

Let’s assume we have a database called testdb with a current character set of utf8. We want to convert it to latin1.

The first step is to export the database in the currently used character set via mysqldump:

mysqldump --opt -umyusername -p testdb > testdb.sql

Now open the exported file and replace all occurrences of =utf8 with the new character set =latin1. Here is an example how to do this in the vi texteditor:

:%s/=utf8/=latin1/g

The next step is to set the new collation for the database testdb. Connect to the database with a mysql client and run the following SQL command:

alter database testdb collate latin1_swedish_ci;

We are using latin1_swedish_ci as that is the default collation for the latin1 character set. You can check if the changes are working with these commands:

use testdb;
show variables like "character_set_database";
show variables like "collation_database";

The last step is to reimport our modified dump into the database. Here you need to make sure that you use the old character set (in this case utf8) during the import and not the new character set:

mysql --default-character-set=utf-8 -umyusername -p testdb < testdb.sql

Now you’re done. The database stores its data consistently with the new character set.

Leave a Reply