Using NumPy random Generator.permutation() method (5 examples)

Updated: March 1, 2024 By: Guest Contributor

Introduction

NumPy’s Generator.permutation() method returns a randomly reordered copy of an array, or a randomly ordered range of integers. This tutorial walks through five practical examples, ranging from simple array shuffling to column permutation and random subset selection. Whether you are a beginner or looking to expand your NumPy repertoire, these examples show how to use permutation() effectively.

What is Generator.permutation() Used for?

Before diving into the examples, it helps to understand what Generator.permutation() does. In essence, it randomly permutes a sequence or returns a permuted range. If you pass an array, it returns a shuffled copy and leaves the original untouched (for a multidimensional array, only the order along the chosen axis changes). If you pass an integer n, it treats it as np.arange(n) and returns a shuffled version of that range. The method belongs to the Generator API introduced in NumPy 1.17, which NumPy recommends for new code over legacy functions such as np.random.permutation().

Syntax:

generator.permutation(x, axis=0)

Where:

  • x: If x is an integer, returns a permuted range [0, x). If an array, returns a shuffled copy.
  • axis: Axis along which to permute. Default is 0.

Example 1: Basic Array Shuffling

import numpy as np

# Create a random generator object
rng = np.random.default_rng()

# Create an array
arr = np.array([1, 2, 3, 4, 5])

# Shuffle the array
shuffled_arr = rng.permutation(arr)

# Output
print(shuffled_arr)

This basic example demonstrates how to shuffle an array. Note that the original array remains unchanged, and a new array is returned with shuffled elements.
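To make the "original array remains unchanged" point concrete, here is a minimal sketch that contrasts permutation() with Generator.shuffle(), which reorders its argument in place. The seed value 123 and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# The seed value is arbitrary; it only makes the run repeatable
rng = np.random.default_rng(123)

arr = np.array([1, 2, 3, 4, 5])

# permutation() returns a new array and leaves arr alone
shuffled_copy = rng.permutation(arr)
print(arr)  # [1 2 3 4 5]

# shuffle() reorders its argument in place and returns None
in_place = arr.copy()
result = rng.shuffle(in_place)
print(result)    # None
print(in_place)  # same five values, in a random order
```

Reaching for permutation() is usually the safer default when other code still needs the original order.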

Example 2: Permuting an Integer Range

import numpy as np

# Create a random generator object
rng = np.random.default_rng()

# Permute the integers 0 through 4 (equivalent to shuffling np.arange(5))
print(rng.permutation(5))

# Example output (yours will differ because the result is random)
# [3 0 4 1 2]

In this example, by providing an integer instead of an array, we instruct permutation() to generate an array ranging from 0 to 4 and then shuffle it. This approach is useful for generating random indexes or orders.
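As one possible use of such random indexes, the sketch below puts a plain Python list into a random order. The list of names and the seed are hypothetical, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)  # arbitrary seed for repeatability

# Hypothetical list of participants to arrange in a random order
participants = ["Ana", "Ben", "Chen", "Dara", "Eli"]

# A shuffled version of the indexes 0..4
order = rng.permutation(len(participants))

# Use the shuffled indexes to reorder the list
random_order = [participants[i] for i in order]

print(order)
print(random_order)
```

Because the indexes come from a permutation, every participant appears exactly once in the result.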

Example 3: Shuffling Multidimensional Arrays

import numpy as np

# Create a random generator object
rng = np.random.default_rng()

# Create a 2D array
arr = np.array([[1, 2], [3, 4], [5, 6]])

# Shuffle the array along the first axis
shuffled_arr = rng.permutation(arr)

# Output
print(shuffled_arr)

This example highlights the behavior of permutation() with multidimensional arrays. It shuffles the array along its first axis, effectively shuffling the rows if you’re thinking in terms of a matrix layout. This is particularly useful for tasks like randomizing the order of dataset samples without changing the data within each sample.
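When samples and labels live in separate arrays, shuffling each one independently would break their pairing. A common workaround, sketched here with a small hypothetical dataset (the names X and y and the seed are assumptions), is to permute the row indexes once and apply them to both arrays.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed

# Hypothetical dataset: 4 samples with 2 features each, plus one label per sample
X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([0, 1, 0, 1])

# One permutation of the row indexes, applied to both arrays
idx = rng.permutation(len(X))
X_shuffled = X[idx]
y_shuffled = y[idx]

print(X_shuffled)
print(y_shuffled)
```

Each row of X_shuffled still lines up with its original label in y_shuffled.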

Example 4: Permuting Columns Using Advanced Indexing

import numpy as np

# Create a random generator object
rng = np.random.default_rng()

# Create a 2D array
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Get a permutation of column indexes
col_permutation = rng.permutation(arr.shape[1])

# Apply the permutation to each row (i.e., permute columns)
permuted_arr = arr[:, col_permutation]

# Output
print(permuted_arr)

This advanced example shows how to shuffle columns instead of rows by applying a permutation to the column indexes. It illustrates the flexibility of permutation() in manipulating data structures for complex applications.
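The axis parameter from the syntax section offers a more direct route to the same kind of result. The following sketch, with an arbitrary seed, passes axis=1 so that whole columns are reordered in a single call.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Permute along axis 1: whole columns are reordered, rows stay aligned
permuted_cols = rng.permutation(arr, axis=1)

print(permuted_cols)
```

The explicit index approach is still worth knowing when the same column order must be reused on several arrays.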

Example 5: Random Data Partitioning

import numpy as np

# Create a random generator object
rng = np.random.default_rng()

# Generate a dataset of 100 samples with 2 features, using the same generator
data = rng.random((100, 2))

# Generate a random partition index
partition_index = rng.permutation(data.shape[0])[:20]

# Extract a random subset of the data
subset = data[partition_index]

# Output
print(subset.shape)

In this final example, we use permutation() to draw a random subset of 20 rows from a 100-row dataset, the same idea that underlies training and test splits. Because the indices come from a permutation, none is repeated, so every row in the subset is distinct. The remaining 80 indices of the permutation identify the complementary subset.
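Extending this idea to a full training and test split takes only one more slice. The sketch below is a minimal illustration, assuming an 80/20 split and an arbitrary seed; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2024)  # arbitrary seed

data = rng.random((100, 2))  # 100 samples, 2 features

# Shuffle all row indices once, then cut the result in two
indices = rng.permutation(data.shape[0])
test_size = 20  # illustrative choice, matching the subset size above
test_idx, train_idx = indices[:test_size], indices[test_size:]

train, test = data[train_idx], data[test_idx]
print(train.shape, test.shape)  # (80, 2) (20, 2)
```

Since both index sets are slices of one permutation, no row can land in both the training and the test set.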

Conclusion

The Generator.permutation() method in NumPy offers a flexible way to generate randomized sequences and shuffle data without modifying the original. Through these examples, we’ve explored its use from basic array shuffling to column permutation and random subset selection. Whether you’re processing small arrays or large datasets, permutation() is a compact tool for randomizing the order of your data.