NumPy – Using char.find() function (4 examples)

Updated: March 2, 2024 By: Guest Contributor

NumPy, the cornerstone library for numerical computing in Python, offers a wide array of functions designed to operate on arrays for efficient computation. Among its lesser-known treasures is the char.find() function, part of the numpy.char module, which lets users search for substrings within an array of strings. This tutorial demonstrates the practicality and versatility of char.find() through a series of examples, ramping up from simple use cases to more complex applications.

Table Of Contents

1 Understanding char.find()

1.1 Basic Usage

1.2 Case Sensitivity

1.3 Searching at the Beginning or End

1.4 Advanced Search Patterns

2 Conclusion

Understanding `char.find()`

The np.char.find() function in NumPy provides a vectorized way to search for substrings within each element of an array of strings. For each element, it returns the index of the first occurrence of the substring, or -1 if the substring is not present. This mirrors Python’s native str.find() method, including its optional start and end arguments, but applies the search element-wise across the whole array in a single call.
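To make the parallel with str.find() concrete, here is a minimal sketch that compares the vectorized call with an explicit Python loop and also uses the optional start argument. The sample words are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
words = np.array(['banana', 'bandana', 'cabana'])

# Vectorized search: one call covers every element.
vectorized = np.char.find(words, 'an')

# Equivalent pure-Python loop calling str.find() on each element.
looped = np.array([w.find('an') for w in words])

# Optional start argument: begin searching at index 2 of each string.
from_two = np.char.find(words, 'an', start=2)

print(vectorized)  # indices 1, 1, 3
print(looped)      # same indices as the vectorized call
print(from_two)    # indices 3, 4, 3
```

The returned positions are always counted from the beginning of each string, even when start is given.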

Basic Usage

import numpy as np

cities = np.array(['New York', 'Los Angeles', 'Chicago', 'Houston'])
results = np.char.find(cities, 'or')
print(results)

In this example, we’ve searched for the substring ‘or’ in an array of city names. The output shows the index of the first occurrence of ‘or’ in each string:

[ 5 -1 -1 -1]

‘New York’ contains ‘or’ at index 5, while the others do not contain ‘or’, resulting in -1.

Case Sensitivity

The char.find() function is case-sensitive, which means it distinguishes between uppercase and lowercase letters. To perform a case-insensitive search, convert the array to a single case before searching and write the search term in that same case. Here’s an example:

import numpy as np

cities = np.array(['New York', 'Los Angeles', 'Chicago', 'Houston'])
cities_lower = np.char.lower(cities)
results = np.char.find(cities_lower, 'ch')
print(results)

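The output is:

[-1 -1  0 -1]

Only ‘Chicago’ matches: once lowercased, it contains ‘ch’ at index 0. A case-sensitive search for ‘ch’ on the original array would return -1 for every element, because ‘Chicago’ begins with an uppercase ‘C’.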
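When the search term itself may arrive in mixed case, one option is to lowercase both sides and then use the result as a mask on the original array, so the original capitalization is preserved in the output. This is a sketch under that assumption; the name needle is hypothetical and stands in for any user-supplied term.

```python
import numpy as np

cities = np.array(['New York', 'Los Angeles', 'Chicago', 'Houston'])
needle = 'CH'  # hypothetical search term in arbitrary case

# Lowercase both the array and the search term so the comparison ignores case.
positions = np.char.find(np.char.lower(cities), needle.lower())

# Use the result as a boolean mask on the original, unmodified array.
matches = cities[positions >= 0]

print(positions)  # -1, -1, 0, -1
print(matches)    # only 'Chicago'
```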

Searching at the Beginning or End

While char.find() searches for the substring’s first appearance anywhere within the string, specific situations may call for finding substrings at the start or end of strings. A match at the very start of a string has index 0, so comparing the result with 0 identifies strings that begin with the substring. This example shows how to identify strings beginning with ‘New’:

import numpy as np

cities = np.array(['New York', 'Los Angeles', 'Chicago', 'Houston', 'New Orleans'])
results = np.char.find(cities, 'New')
print(results == 0)

By checking whether the indices are equal to 0, we can determine which cities start with ‘New’. Testing for indices greater than or equal to 0 would only show that ‘New’ appears somewhere in the string. The output:

[ True False False False  True]

indicates that both ‘New York’ and ‘New Orleans’ match the criteria.
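NumPy also provides np.char.startswith() and np.char.endswith(), which express prefix and suffix checks directly. The sketch below contrasts them with a plain containment test; the entry ‘Brand New’ is a made-up value added only to show where the three checks differ.

```python
import numpy as np

# 'Brand New' is a hypothetical entry added to show the difference.
names = np.array(['New York', 'Los Angeles', 'New Orleans', 'Brand New'])

# Containment: 'New' appears anywhere in the string.
contains_new = np.char.find(names, 'New') >= 0

# Prefix and suffix checks.
starts_with_new = np.char.startswith(names, 'New')
ends_with_new = np.char.endswith(names, 'New')

print(contains_new)     # True, False, True, True
print(starts_with_new)  # True, False, True, False
print(ends_with_new)    # False, False, False, True
```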

Advanced Search Patterns

For more sophisticated search requirements, such as finding substrings that follow a particular pattern, users might need to resort to regular expressions. However, the char.find() function can still be useful for simpler pattern matching. For instance, finding strings that contain a numerical digit can be achieved by searching for each digit individually and combining the results:

import numpy as np

data = np.array(['Model 3', 'Cybertruck', 'Model S', 'Roadster'])
has_number = np.zeros(len(data), dtype=bool)
for digit in '0123456789':
    has_number |= np.char.find(data, digit) >= 0
print(has_number)

The output,

[ True False False False]

reveals that only ‘Model 3’ contains a numerical digit. ‘Model S’ ends in a letter, so it does not match.
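For comparison, the same check can be written with a regular expression. This is one possible approach rather than a NumPy feature: it uses Python’s standard re module and a list comprehension, so the matching runs in a Python loop instead of a vectorized call.

```python
import re
import numpy as np

data = np.array(['Model 3', 'Cybertruck', 'Model S', 'Roadster'])

# \d matches any single decimal digit.
digit = re.compile(r'\d')

# Apply the pattern to each element and collect the results in a boolean array.
has_number = np.array([digit.search(s) is not None for s in data])

print(has_number)  # True, False, False, False
```

The regular expression scales better to richer patterns, while the char.find() loop keeps the work inside NumPy for simple cases.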

Conclusion

The char.find() function in NumPy offers a concise way to search for substrings across an entire array of strings. The examples above covered basic searches, case-insensitive matching, prefix checks, and simple pattern detection. Combined with other NumPy string operations, it keeps text processing inside array code, which is convenient in data science and machine learning projects where text data is prevalent.

Series: NumPy Intermediate & Advanced Tutorials
