Using NumPy char.isdecimal() function (3 examples)

Updated: February 29, 2024 By: Guest Contributor

Introduction

NumPy, a core library for numerical computing in Python, provides comprehensive support for array operations. Its numpy.char module, which applies string operations element-wise to arrays of strings, is less well known but just as useful. This article looks at the char.isdecimal() function from that module through three progressively more complex examples.

What is NumPy’s char.isdecimal() Used for?

The numpy.char.isdecimal() function works element-wise on arrays of strings: for each element, it returns True if the element contains only decimal characters and False otherwise. It follows the same rule as Python's str.isdecimal(), so signs, decimal points, and whitespace do not count as decimal characters. This is particularly useful in data cleaning and validation, where you need to confirm that values contain digits only.
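To make the notion of a "decimal character" more concrete, here is a minimal sketch comparing isdecimal() with the related numpy.char.isdigit() function. The sample characters are illustrative choices: an ASCII digit, the Arabic-Indic digit three (U+0663), and the superscript two (U+00B2).

```python
import numpy as np

# Illustrative characters: ASCII '7', Arabic-Indic digit three, superscript two
chars = np.array(['7', '\u0663', '\u00b2'])

# Decimal characters are those usable to write base-10 numbers
decimal_flags = np.char.isdecimal(chars)

# isdigit() is broader and also accepts digit-like symbols such as superscripts
digit_flags = np.char.isdigit(chars)

print(decimal_flags)
print(digit_flags)
```

The superscript two passes isdigit() but not isdecimal(), while the non-ASCII Arabic-Indic digit passes both. If only ASCII digits should be accepted, an additional check is needed.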

Syntax:

char.isdecimal(a)

Here, a is the input array of strings. The output is a boolean array with the same shape as a.

Basic Usage

In our first example, let’s examine the function in its most straightforward application:

import numpy as np

# Creating an array of numeric strings and texts
arr = np.array(['123', '456', 'hello', '789world'])

# Applying the isdecimal function
decimal_check = np.char.isdecimal(arr)

print(decimal_check)

The output would be:

[ True  True False False]

This shows that the function separates strings consisting solely of decimal characters from those that mix digits with other characters ('789world') or contain no digits at all ('hello').
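Letters are not the only characters that cause a False result. The following minimal sketch uses a few made-up edge cases (not part of the example above) to show that a minus sign, a decimal point, or a leading space is enough to fail the check.

```python
import numpy as np

# Illustrative edge cases: plain digits, negative number, float, leading space
samples = np.array(['42', '-42', '4.2', ' 42'])

# Only the first element consists purely of decimal characters
edge_case_check = np.char.isdecimal(samples)

print(edge_case_check)
```

Only '42' passes, so strings that hold negative numbers or floats need a different check or some preprocessing before isdecimal() is useful.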

Filtering Decimals in an Array

Moving to a more practical example, let’s use isdecimal() to filter out non-decimal strings from an array:

import numpy as np

# Array of mixed content
arr = np.array(['200', '300', 'Python', '400', 'C++'])

# Applying isdecimal and filtering
decimal_array = arr[np.char.isdecimal(arr)]

print(decimal_array)

The resulting array will be:

['200' '300' '400']

Here, we used the boolean array returned by isdecimal() as a mask to index arr, keeping only the elements made up entirely of decimal characters and yielding a cleaner set of numeric strings.
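A natural next step after filtering is to convert the surviving strings to integers. The sketch below assumes, as in this example, that every kept element is a plain run of ASCII digits, so astype(int) can parse it. The variable name numbers is introduced here for illustration.

```python
import numpy as np

arr = np.array(['200', '300', 'Python', '400', 'C++'])

# Keep only the decimal strings, as in the example above
decimal_array = arr[np.char.isdecimal(arr)]

# Convert the filtered strings to integers (assumes plain ASCII digit strings)
numbers = decimal_array.astype(int)

print(numbers)
print(numbers.sum())
```

Filtering first matters here: calling astype(int) on the original array would raise a ValueError because of entries such as 'Python'.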

Advanced Applications: Combining with Other NumPy Operations

In our final example, we show how isdecimal() can be combined with other NumPy functions to perform more complex operations. A common task is grouping and counting decimal vs. non-decimal strings within an array. Let’s see this in action:

import numpy as np

arr = np.array(['10', 'abc', '20', 'def', '30', 'ghi'])

# Identify decimal strings
is_decimal = np.char.isdecimal(arr)

# Counting decimals
num_decimals = np.sum(is_decimal)

# Grouping decimals and non-decimals
decimal_vals = arr[is_decimal]
non_decimal_vals = arr[~is_decimal]

print(f'Decimal values: {decimal_vals}')
print(f'Non-decimal values: {non_decimal_vals}')
print(f'Total number of decimal values: {num_decimals}')

Output:

Decimal values: ['10' '20' '30']
Non-decimal values: ['abc' 'def' 'ghi']
Total number of decimal values: 3

This operation allows us to neatly categorize and count the elements of our array into decimals and non-decimals, showcasing the versatility of isdecimal() when integrated into broader data processing workflows.
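For validation, it often helps to know where the offending entries are, not just how many there are. The following sketch reuses the array from this example and reports the positions of the non-decimal elements with np.flatnonzero, plus the share of decimal elements via the mean of the boolean mask. The variable names bad_positions and decimal_share are illustrative.

```python
import numpy as np

arr = np.array(['10', 'abc', '20', 'def', '30', 'ghi'])
is_decimal = np.char.isdecimal(arr)

# Indices of the elements that failed the decimal check
bad_positions = np.flatnonzero(~is_decimal)

# Fraction of elements that passed (True counts as 1, False as 0)
decimal_share = is_decimal.mean()

print(f'Non-decimal positions: {bad_positions}')
print(f'Share of decimal values: {decimal_share:.2f}')
```

The positions can then be used to look up the matching rows in the original data source and correct or drop them.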

Conclusion

The numpy.char.isdecimal() function gives you a simple, element-wise way to tell decimal-only strings apart from everything else in an array. In data cleaning and validation tasks, the boolean mask it returns lets you filter, count, and group string elements in a few lines, which makes it a handy part of a data scientist’s toolkit.