MySQL 8: How to change charset and collation of a table

Updated: January 25, 2024 By: Guest Contributor Post a comment

Introduction

MySQL is a robust data storage system that supports various character sets and collations, allowing developers to work with multiple languages and data comparison methods. As your applications evolve or localization requirements change, you may encounter the need to alter the character set and collation for your database tables. By understanding how to execute these changes, you can ensure the proper handling of string comparisons and sorting, ultimately leading to better application performance and user experiences.

In this tutorial, we will explore how to change the character set and collation of tables in MySQL 8. We will cover the concepts of character set and collation, reasons why you might need to change them, and give code examples ranging from basic to advanced, to guide you through the processes.

Understanding Character Set and Collation

A character set is a set of symbols and encodings. In a database, a character set determines how text data is stored. Collation, on the other hand, is a set of rules for comparing characters in a character set, defining how data is sorted and compared.

It is essential to choose the appropriate character set and collation that matches the data you intend to store because they impact not only how data is stored but also how string comparisons, sorting, and indexing are performed.

When Do You Need to Change Charset and Collation?

You might need to change the character set and collation for various reasons, including:

  • Supporting internationalization in your application.
  • Improving performance through proper character set optimization.
  • Aligning the character set and collation of independent tables within a single database to avoid inconsistencies during joins or query results.

Viewing Current Charset and Collation of a Table

Before you change anything, it is a good practice to check the current character set and collation of your table. You can do it using the following SQL command:

SHOW FULL COLUMNS FROM your_table_name;

This command shows the character set and collation among other details of each column in your table. You can get the default character set and collation of the table itself with:

SHOW TABLE STATUS LIKE 'your_table_name';

Changing the Charset and Collation of a Table

The basic syntax for changing the character set and collation of a table is as follows:

ALTER TABLE your_table_name CONVERT TO CHARACTER SET charset_name COLLATE collation_name;

Here’s an example:

ALTER TABLE user_info CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

You will use utf8mb4 as it supports a broader range of Unicode characters compared to utf8. Following up with the correct collation for the character set; for International English, utf8mb4_unicode_ci is a good default as it offers accurate sorting behavior in a wide range of languages.

Particular Column Charset and Collation Change

To change a specific column’s charset and collation in MySQL 8, use this syntax:

ALTER TABLE your_table_name CHANGE column_name column_name column_definition CHARACTER SET charset_name COLLATE collation_name;

Example:

ALTER TABLE user_info CHANGE bio bio TEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

This statement changes the character set and collation of the ‘bio’ column in the ‘user_info’ table to utf8mb4 and utf8mb4_unicode_ci correspondingly.

Changing the Default Charset and Collation of a Table

To change the default charset and collation for a table without converting the existing data:

ALTER TABLE your_table_name DEFAULT CHARACTER SET charset_name COLLATE collation_name;

Here is an example:

ALTER TABLE user_info DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

This will set the utf8mb4 and utf8mb4_unicode_ci as the default for future data inserted into the table but will not convert existing data to the new character set or collation.

Converting Data to New Charset and Collation

When changing the character set and/or collation for a table, it may be necessary to convert the data in the table to match the new settings. It ensures consistency across your data and is done using the convert operation:

ALTER TABLE your_table_name CONVERT TO CHARACTER SET charset_name COLLATE collation_name;

This command will alter not just the table default settings but also convert all the existing column values to the new character set and collation.

Advanced Use Case: Collation for Complex Queries

In some scenarios, you may need to manually specify the collation for individual queries, especially when working with advanced queries involving multiple tables with different collations. In such cases, you can override the default collation using the COLLATE clause in your queries:

SELECT * FROM table1 JOIN table2 ON table1.text_column COLLATE utf8mb4_unicode_ci = table2.text_column COLLATE utf8mb4_unicode_ci;

This query overrides any table-specified collations and ensures a proper comparison between columns of different tables.

Conclusion

Understanding how to adjust character sets and collations is essential for developers dealing with MySQL databases. Using the aforementioned strategies, you can ensure data consistency, improve sorting and comparison operations, and ensure internationalization support within your databases.