NumPy: How to Split an Array Horizontally (column-wise)

Updated: January 22, 2024 By: Guest Contributor Post a comment

Overview

NumPy is a fundamental package for scientific computing in Python, and one of its many utilities is the ability to easily manage and manipulate arrays. In certain situations, you may need to split an array, particularly column-wise. This guide will walk you through how to do just that, starting with basic examples and gradually moving towards more advanced use-cases.

Introduction to np.hsplit

NumPy provides a function specifically designed for horizontal splitting, np.hsplit(). This function splits an array into multiple sub-arrays horizontally (column-wise).

Basic Usage of np.hsplit

import numpy as np

# Create an example array
array = np.arange(16).reshape(4, 4)
print('Original array:\n', array)

# Split the array into 2 sub-arrays horizontally
divided = np.hsplit(array, 2)
print('Divided array:', divided)

Output:

Original array:
 [[ 0 1 2 3]
 [4 5 6 7]
 [ 8 9 10 11]
 [12 13 14 15]]

Divided array: [array([[ 0, 1],
 [ 4, 5],
 [ 8, 9],
 [12, 13]]), array([[ 2, 3],
 [ 6, 7],
 [10, 11],
 [14, 15]])]

Splitting Along a Specific Column

If you want to split an array at specific column indices, you can pass a list of columns to np.hsplit().

# Split the array at the 1st and 3rd column
sections = np.hsplit(array, [1, 3])

print('Sections:', sections)

Output:

Sections: [array([[ 0],
 [ 4],
 [ 8],
 [12]]), array([[ 1, 2],
 [ 5, 6],
 [ 9, 10],
 [13, 14]]), array([[ 3],
 [ 7],
 [11],
 [15]])]

Working with Higher Dimensional Arrays

NumPy’s hsplit can also be used on higher-dimensional arrays. However, the split will always occur along the second axis.

# Create a 3D array
array_3d = np.arange(24).reshape(2, 4, 3)
print('3D array:\n', array_3d)

# Split the 3D array into 2 sub-arrays along the second axis
divided_3d = np.hsplit(array_3d, 2)
print('\nSplit 3D array:\n', divided_3d)

Output:

3D array:
 [[[ 0 1 2]
 [3 4 5]
 [6 7 8]
 [9 10 11]]

 [[12 13 14]
 [15 16 17]
 [18 19 20]
 [21 22 23]]]

Split 3D array:
 [array([[[0, 1, 2],
 [6, 7, 8]],

 [[12, 13, 14],
 [18, 19, 20]]]), array([[[3, 4, 5],
 [9, 10, 11]],

 [[15, 16, 17],
 [21, 22, 23]]])]

Splitting Using np.array_split

For scenarios where you need more flexible splits that do not necessarily divide the array into equal parts, you can use np.array_split()

Unequal Column Split

# Indicate the number of splits - may result in unequal sections
uneven = np.array_split(array, 3, axis=1)
print('Uneven split:\n', uneven)

Output:

Uneven split:
 [array([[0, 1],
 [4, 5],
 [8, 9],
 [12, 13]]), array([[ 2],
 [ 6],
 [10],
 [14]]), array([[ 3],
 [ 7],
 [11],
 [15]])]

Error Handling in Array Splitting

It’s important to be aware that np.hsplit can throw a ValueError if the array cannot be divided into equal sections or if the indices provided create an impossible division. Always make sure the way you want to split your array is compatible with its shape.

Conclusion

Splitting an array horizontally is a common operation while handling multidimensional data in NumPy. You can make use of np.hsplit for equally divided splits, or np.array_split when you need more control over the divisions, even if they’re not equal. Understanding these tools can significantly aid in the data processing and manipulation stages of a project.