Python: Convert character to Unicode code point and vice versa

Updated: June 1, 2023 By: Khue

This concise and straight-to-the-point article will show you how to convert a character to a Unicode code point and turn a Unicode code point into a character in Python.

What are Unicode code points?

A Unicode code point is a numerical value that represents a specific character in the Unicode standard. It is a unique identifier assigned to each character and can range from 0 to 0x10FFFF.

The Unicode code point allows for the universal representation and encoding of characters from various writing systems and languages. It provides a standardized way to represent and process text in different scripts and symbols.
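As a quick check of the upper bound mentioned above, Python exposes the largest code point its strings can hold as sys.maxunicode. The snippet below is a minimal sketch that assumes Python 3.3 or later, where this value is always 0x10FFFF; the variable name max_code_point is only illustrative.

```python
import sys

# Largest code point a Python str can hold (always 0x10FFFF on Python 3.3+).
max_code_point = sys.maxunicode

print(max_code_point)       # 1114111
print(hex(max_code_point))  # 0x10ffff
```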

Convert a character to a Unicode code point in Python

Python provides a function named ord(), which takes a character as input and returns its corresponding Unicode code point. This is the tool we’ll use to get the job done.

Example:

character = 'A'
code_point = ord(character)
print(f"The code point of '{character}' is {code_point}")

character = '😄'
code_point = ord(character)
print(f"The code point of '{character}' is {code_point}")

Output:

The code point of 'A' is 65
The code point of '😄' is 128516
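Code points are conventionally written in hexadecimal with a U+ prefix and at least four digits, so it can be handy to format the result of ord() that way. The helper below is a small sketch; the name to_u_notation is hypothetical and not part of Python.

```python
def to_u_notation(character: str) -> str:
    """Return the code point of a single character in U+XXXX form."""
    # :04X pads to at least four uppercase hex digits; longer values keep all digits.
    return f"U+{ord(character):04X}"


print(to_u_notation('A'))   # U+0041
print(to_u_notation('😄'))  # U+1F604
```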

If you inadvertently (or intentionally) pass the ord() function a string that is not exactly one character long, it will not be happy and will raise a TypeError as shown below:

character = 'ABC_DEF'
code_point = ord(character)
# TypeError: ord() expected a character, but string of length 7 found
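Since ord() accepts only one character at a time, a longer string has to be processed character by character. One possible way to do that, sketched here with a list comprehension (the variable names are illustrative):

```python
text = 'ABC_DEF'

# Call ord() once per character instead of once on the whole string.
code_points = [ord(character) for character in text]

print(code_points)  # [65, 66, 67, 95, 68, 69, 70]
```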

Convert a Unicode code point to a character in Python

In order to convert a Unicode code point to its corresponding character, you can use the Python built-in chr() function.

Example:

code_point = 102
character = chr(code_point)
print(f"The character corresponding to code point {code_point} is '{character}'")

Output:

The character corresponding to code point 102 is 'f'
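chr() and ord() are inverses of each other, and chr() accepts any integer expression, including a hexadecimal literal. The short sketch below illustrates a round trip using the emoji code point from earlier in this article:

```python
code_point = 0x1F604  # same value as the decimal 128516
character = chr(code_point)
print(character)  # 😄

# Converting back with ord() returns the original number.
round_trip = ord(character)
print(round_trip == code_point)  # True
```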

What if you feed the chr() function an invalid, out-of-range input (the valid range is from 0 to 0x10FFFF)? Well, it will raise a ValueError, like this:

code_point = 10222111
character = chr(code_point)

# ValueError: chr() arg not in range(0x110000)
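If the code point comes from untrusted input, you may prefer to catch the error rather than let the program stop. The wrapper below is only a sketch: safe_chr and its fallback parameter are hypothetical names, and using U+FFFD (the Unicode replacement character) as the default fallback is an assumption, not something chr() does on its own.

```python
def safe_chr(code_point: int, fallback: str = '\uFFFD') -> str:
    """Return chr(code_point), or fallback if the value is out of range."""
    try:
        return chr(code_point)
    except ValueError:
        # Raised for negative numbers and for values above 0x10FFFF.
        return fallback


print(safe_chr(102))                    # f
print(safe_chr(10222111) == '\uFFFD')   # True
```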