PHP: How to convert a string to char codes and vice versa

Updated: January 10, 2024 By: Guest Contributor

Introduction

Converting between strings and character codes is a common task in programming. PHP provides built-in functions for it, covering both single-byte and multibyte text. This guide shows how to perform these conversions in PHP.

Understanding Character Encoding

Before converting anything, it helps to understand what a character encoding is. A character encoding defines how raw bytes map to characters. UTF-8 is the most common encoding in PHP applications, but PHP strings themselves are plain byte sequences, so byte-oriented functions such as strlen() and ord() see bytes rather than characters.
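The difference between bytes and characters can be seen in a short sketch. It assumes the mbstring extension is loaded and PHP 7.2 or later (for mb_ord()); the variable names are illustrative only.

```php
<?php
// "é" (U+00E9) is one character but two bytes (0xC3 0xA9) in UTF-8.
// The \u{...} escape keeps the example independent of the source file encoding.
$text = "h\u{E9}llo"; // "héllo"

$byteCount = strlen($text);             // counts bytes
$charCount = mb_strlen($text, 'UTF-8'); // counts characters

echo $byteCount, "\n"; // 6
echo $charCount, "\n"; // 5

// ord() looks only at the first byte of its argument.
$leadByte  = ord("\u{E9}");             // 195 (0xC3), the UTF-8 lead byte
$codePoint = mb_ord("\u{E9}", 'UTF-8'); // 233 (U+00E9), the actual code point

echo $leadByte, ' ', $codePoint, "\n";
```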

Converting a String to Char Codes

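To convert a string to character codes in PHP, you can use the ord() function, which returns the value (0 to 255) of the first byte of a string, and loop over the string to cover every character. For ASCII text, each character is exactly one byte. Here’s a basic example: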

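$string = 'hello';
$charCodes = [];
// strlen() counts bytes, which equals the character count for ASCII text.
$length = strlen($string);
for ($i = 0; $i < $length; $i++) {
    // $string[$i] is the byte at offset $i; ord() returns its numeric value.
    $charCodes[] = ord($string[$i]);
}
print_r($charCodes);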

This prints an array containing 104, 101, 108, 108 and 111, the codes for each character in the string ‘hello’.
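The same result can be reached without an explicit loop. The following is a sketch of two alternatives using standard functions (str_split() with array_map(), and unpack()); like the loop, both work byte by byte, so they suit ASCII text.

```php
<?php
$string = 'hello';

// str_split() breaks the string into one-byte pieces; ord() maps each to its value.
$viaArrayMap = array_map('ord', str_split($string));

// unpack('C*') reads every byte as an unsigned char. Its result is indexed
// from 1, so array_values() renumbers it from 0.
$viaUnpack = array_values(unpack('C*', $string));

print_r($viaArrayMap); // 104, 101, 108, 108, 111
print_r($viaUnpack);   // same values
```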

Handling Multibyte Characters

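If you’re working with multibyte characters (like characters from the Cyrillic alphabet or Chinese characters), ord() only returns the first byte of each character, not its code point. Use the mb_ord() function instead, which is available from PHP 7.2 with the mbstring extension and returns the Unicode code point of a UTF-8 character: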

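$string = '世界';
$charCodes = array();
// Pass the encoding explicitly so the result does not depend on the configured default.
$length = mb_strlen($string, 'UTF-8');
for ($i = 0; $i < $length; $i++) {
    // mb_substr() extracts one whole character, however many bytes it occupies.
    $charCode = mb_ord(mb_substr($string, $i, 1, 'UTF-8'), 'UTF-8');
    $charCodes[] = $charCode;
}
print_r($charCodes);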
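On PHP 7.4 or later, the loop can be shortened with mb_str_split(), which splits by character instead of by byte. The sketch below assumes that version and the mbstring extension, and also formats each code point in the conventional U+XXXX notation; the variable names are illustrative.

```php
<?php
$string = "\u{4E16}\u{754C}"; // 世界, written with escapes so file encoding does not matter

// Split into whole characters, then look up each character's code point.
$chars = mb_str_split($string, 1, 'UTF-8');
$charCodes = array_map(
    fn (string $char): int => mb_ord($char, 'UTF-8'),
    $chars
);

// Code points are usually written in hexadecimal as U+XXXX.
$labels = array_map(
    fn (int $code): string => sprintf('U+%04X', $code),
    $charCodes
);

print_r($charCodes); // 19990, 30028
print_r($labels);    // U+4E16, U+754C
```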

Converting Char Codes to a String

When converting character codes back to a string, you can use chr() for single-byte values (0 to 255, which covers ASCII) and mb_chr() for Unicode code points that should become UTF-8 text. The following example illustrates chr():

$charCodes = [104, 101, 108, 108, 111];
$string = '';
foreach ($charCodes as $code) {
    $string .= chr($code);
}
echo $string;
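As with the forward conversion, the loop can be replaced by a one-liner. This sketch shows two options built from standard functions; note that chr() wraps values outside 0 to 255, so both are only suitable for byte values.

```php
<?php
$charCodes = [104, 101, 108, 108, 111];

// Map every code to its one-byte string, then join the pieces.
$viaImplode = implode('', array_map('chr', $charCodes));

// pack('C*') writes each argument as an unsigned char (one byte).
$viaPack = pack('C*', ...$charCodes);

echo $viaImplode, "\n"; // hello
echo $viaPack, "\n";    // hello
```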

To build a UTF-8 string from Unicode code points, use mb_chr():

$charCodes = [19990, 30028];
$string = '';
foreach ($charCodes as $code) {
    $string .= mb_chr($code, 'UTF-8');
}
echo $string;
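When the code points come from untrusted input, it can help to validate them before calling mb_chr(). The helper below is a hypothetical sketch (codePointsToString() is not a PHP built-in): it rejects anything that is not a Unicode scalar value, meaning integers from 0 to 0x10FFFF excluding the surrogate range 0xD800 to 0xDFFF.

```php
<?php
/**
 * Hypothetical helper: build a UTF-8 string from a list of Unicode code points.
 * Throws if a value is not an integer or is not a valid Unicode scalar value.
 */
function codePointsToString(array $codePoints): string
{
    $result = '';
    foreach ($codePoints as $code) {
        $valid = is_int($code)
            && $code >= 0
            && $code <= 0x10FFFF
            && !($code >= 0xD800 && $code <= 0xDFFF); // surrogates are not characters
        if (!$valid) {
            throw new InvalidArgumentException(
                'Invalid code point: ' . var_export($code, true)
            );
        }
        $result .= mb_chr($code, 'UTF-8');
    }
    return $result;
}

echo codePointsToString([19990, 30028]), "\n"; // 世界
```

Throwing an exception is one design choice; skipping or substituting invalid values would be equally reasonable depending on the application.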

Advanced Techniques

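In more advanced scenarios, you might want a custom mapping between characters and codes. The example below writes every character as a fixed-width, three-digit code: characters listed in $customMap get the code assigned there, and every other character falls back to its ord() value. The custom codes are kept above 255 so that they can never collide with a byte value. Because str_split() works on bytes, this scheme is meant for single-byte (ASCII) text.

Custom Encoding

$string = 'Custom encode this!';
// Illustrative map: custom codes must have three digits and be above 255.
$customMap = [' ' => 900, '!' => 901];
$encoded = '';
foreach (str_split($string) as $char) {
    // Use the custom code if one exists, otherwise the byte value.
    $encoded .= sprintf('%03d', $customMap[$char] ?? ord($char));
}
echo $encoded; // 067117115116111109900101110099111100101900116104105115901

Custom Decoding

$encoded = '067117115116111109900101110099111100101900116104105115901';
// Reverse the map used for encoding so that codes point back to characters.
$reverseMap = array_flip($customMap);
$decoded = '';
for ($i = 0, $length = strlen($encoded); $i < $length; $i += 3) {
    $code = (int) substr($encoded, $i, 3);
    // Custom codes are looked up in the map; everything else is a byte value.
    $decoded .= $reverseMap[$code] ?? chr($code);
}
echo $decoded; // Custom encode this!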
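A fixed width of three digits cannot represent code points above 999, which rules out most multibyte characters. One possible alternative, sketched here with hypothetical helper names (encodeCodePoints() and decodeCodePoints()), is to separate the codes with a delimiter instead of padding them. It assumes PHP 7.4 or later with the mbstring extension and valid UTF-8 input.

```php
<?php
// Hypothetical helper: encode a UTF-8 string as dash-separated code points.
function encodeCodePoints(string $text): string
{
    $codes = array_map(
        fn (string $char): int => mb_ord($char, 'UTF-8'),
        mb_str_split($text, 1, 'UTF-8')
    );
    return implode('-', $codes);
}

// Hypothetical helper: reverse encodeCodePoints().
function decodeCodePoints(string $encoded): string
{
    if ($encoded === '') {
        return ''; // explode() on an empty string would yield one empty element
    }
    $chars = array_map(
        fn (string $code): string => mb_chr((int) $code, 'UTF-8'),
        explode('-', $encoded)
    );
    return implode('', $chars);
}

$encoded = encodeCodePoints("Hi \u{4E16}\u{754C}");
echo $encoded, "\n";                   // 72-105-32-19990-30028
echo decodeCodePoints($encoded), "\n"; // Hi 世界
```

The delimiter makes each code self-delimiting, at the cost of a variable-length output.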

Conclusion

In PHP, converting strings to character codes and back is straightforward with built-in functions: ord() and chr() for single bytes, mb_ord() and mb_chr() for Unicode code points. Choosing the right pair for your data lets you handle a wide range of encoding tasks, including working with different character sets and implementing custom encoding schemes.