NumPy – Draw samples from a Gamma distribution (4 examples)

Updated: February 28, 2024 By: Guest Contributor

Overview

This tutorial is aimed at Python users, from beginners to advanced, who work in data science, statistics, or other areas that call for data simulation. NumPy, a cornerstone of the Python data science toolkit, provides the function numpy.random.gamma for sampling from gamma distributions. Knowing how to use it lets you simulate data under a range of conditions, which is useful for probabilistic modeling and Bayesian inference, among other applications.

What is a Gamma Distribution?

To start, let’s briefly look at what a gamma distribution is. It is a continuous distribution over positive values that is often used to model waiting times: for an integer shape k, it describes the time until the k-th event when events occur at a constant average rate. Two parameters define it, shape (k) and scale (theta). The shape controls the form of the curve and the scale stretches it along the horizontal axis, which together allow a wide range of phenomena to be modeled.
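The waiting-time reading can be checked numerically. The sketch below assumes an integer shape and uses NumPy's Generator interface (np.random.default_rng). The values k = 3, theta = 2.0 and the seed are chosen only for illustration. It sums k exponential gaps and compares the result with direct gamma draws.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch

k, theta = 3, 2.0   # illustrative values; integer k so the waiting-time reading applies
n = 100_000

# Waiting time until the k-th event: sum of k exponential gaps, each with mean theta
waiting_times = rng.exponential(scale=theta, size=(n, k)).sum(axis=1)

# Direct draws from a gamma distribution with the same shape and scale
gamma_draws = rng.gamma(shape=k, scale=theta, size=n)

print("Mean of summed exponentials:", waiting_times.mean())
print("Mean of gamma draws:        ", gamma_draws.mean())
print("Theoretical mean k*theta:   ", k * theta)
```

Both sample means should land near k*theta, and the two sets of samples should have similar variance as well.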

Now, let’s move on to the examples showing how to draw samples from this distribution using NumPy.

Example 1: Basic Sampling

import numpy as np

# Parameters
shape, scale = 2.0, 2.0 # k, theta
np.random.seed(0)  # For reproducibility

gamma_samples = np.random.gamma(shape, scale, 1000)

print("Sample mean: ", np.mean(gamma_samples))
print("Sample standard deviation: ", np.std(gamma_samples))

Output:

Sample mean:  3.9311453021789893
Sample standard deviation:  2.6954625994211785

In this basic example, we set both the shape and scale parameters to 2.0 and draw 1000 samples from the gamma distribution. The theoretical mean is k*theta = 4 and the theoretical variance is k*theta^2 = 8, which gives a standard deviation of about 2.83. The values reported by np.mean and np.std are close to these figures, as expected for a sample of this size.
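To make the comparison with theory explicit, the following sketch computes the theoretical mean and standard deviation for the parameters of Example 1 and prints how far the sample statistics fall from them at two sample sizes. It uses np.random.default_rng, the newer Generator interface, instead of np.random.seed, so the numbers will not match the output above. The sample sizes are arbitrary choices.

```python
import numpy as np

shape, scale = 2.0, 2.0  # k, theta, as in Example 1

theoretical_mean = shape * scale           # k * theta
theoretical_std = np.sqrt(shape) * scale   # sqrt(k * theta^2)

rng = np.random.default_rng(0)  # seed chosen arbitrarily
for n in (1_000, 100_000):
    samples = rng.gamma(shape, scale, n)
    print(
        f"n={n}: mean error={samples.mean() - theoretical_mean:+.4f}, "
        f"std error={samples.std() - theoretical_std:+.4f}"
    )
```

The errors typically shrink as n grows, since the sample statistics converge to the theoretical values.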

Example 2: Changing Parameters

import numpy as np

shape = 5.0
scale = 1.0 # Decreasing theta
np.random.seed(0)

gamma_samples = np.random.gamma(shape, scale, 1000)

print("Increased shape, decreased scale:\n", "Mean: ", np.mean(gamma_samples), "\nStandard deviation: ", np.std(gamma_samples))

Output:

Increased shape, decreased scale:
 Mean:  4.96119214863775 
Standard deviation:  2.1396329480421756

Adjusting the shape and scale parameters lets you control the distribution’s skewness and dispersion. The skewness of a gamma distribution is 2/sqrt(k), so raising the shape from 2 to 5 lowers it from about 1.41 to about 0.89. The variance k*theta^2 also drops from 8 to 5, which is reflected in the smaller sample standard deviation.
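The effect on skewness can be estimated directly from samples. This is a minimal sketch: sample_skewness is a hypothetical helper written for this illustration (not a NumPy function), and the sample size and seed are arbitrary. It compares the estimate with the theoretical value 2/sqrt(k) for the parameters of Examples 1 and 2.

```python
import numpy as np

def sample_skewness(x):
    """Biased sample skewness: third central moment divided by std cubed."""
    centered = x - x.mean()
    return np.mean(centered**3) / np.std(x) ** 3

rng = np.random.default_rng(0)  # seed chosen arbitrarily
results = {}
for k, theta in [(2.0, 2.0), (5.0, 1.0)]:
    draws = rng.gamma(k, theta, 200_000)
    results[k] = sample_skewness(draws)
    print(
        f"k={k}, theta={theta}: sample skewness={results[k]:.3f}, "
        f"theoretical={2 / np.sqrt(k):.3f}"
    )
```

A larger shape gives a smaller skewness, so the distribution looks more symmetric as k increases.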

Example 3: Multidimensional Sampling

import numpy as np

shape = [2, 5]
scale = [2, 3]
np.random.seed(0)

multi_samples = np.random.gamma(shape, scale, (1000, 2))

print("Multidimensional samples mean:\n", np.mean(multi_samples, axis=0))
print("Multidimensional samples standard deviation:\n", np.std(multi_samples, axis=0))

Output:

Multidimensional samples mean:
 [ 4.0487291  14.59007288]
Multidimensional samples standard deviation:
 [2.89590104 6.35959753]

This example demonstrates multidimensional sampling, which is useful for simulations involving several variables or processes. The lists of shape and scale parameters are broadcast against the requested 1000×2 output, so the first column holds samples from a gamma distribution with k = 2 and theta = 2, and the second column holds samples with k = 5 and theta = 3. The column means are accordingly close to k*theta, that is, 4 and 15.
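The per-column behavior can be confirmed with a short check. The sketch below reuses the shape and scale lists from Example 3 but draws more rows so the column statistics settle closer to theory. The row count and the use of np.random.default_rng are choices made for this illustration.

```python
import numpy as np

shape = np.array([2, 5])   # k for column 0 and column 1
scale = np.array([2, 3])   # theta for column 0 and column 1

rng = np.random.default_rng(0)  # seed chosen arbitrarily
samples = rng.gamma(shape, scale, size=(50_000, 2))

print("Array shape:", samples.shape)
print("Column means:          ", samples.mean(axis=0))
print("Expected k*theta:      ", shape * scale)           # 4 and 15
print("Column stds:           ", samples.std(axis=0))
print("Expected sqrt(k)*theta:", np.sqrt(shape) * scale)
```

Because the last axis of the requested size matches the length of the parameter lists, each column gets its own pair of parameters.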

Example 4: Plotting Sample Distributions

import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)
gamma_samples = np.random.gamma(2, 2, 1000)

# Plotting
plt.hist(gamma_samples, bins=30, density=True)
plt.title('Gamma Distribution Sample')
plt.xlabel('Value')
plt.ylabel('Density')
plt.show()

Visualizing data is a crucial part of understanding its distribution. In this example, we use matplotlib together with NumPy to plot a histogram of the sampled data. Passing density=True normalizes the histogram so that the bar areas sum to 1, which makes it comparable to a probability density. The plot gives an intuitive picture of the shape and spread of the gamma distribution.
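A histogram is easier to judge against the theoretical density. The sketch below is one way to make that comparison without plotting: gamma_pdf is a hypothetical helper written for this illustration, built on math.gamma from the standard library. It evaluates the density at the bin centers of a normalized histogram and reports the largest gap.

```python
import math
import numpy as np

def gamma_pdf(x, k, theta):
    """Gamma density x^(k-1) * exp(-x/theta) / (Gamma(k) * theta^k) for x > 0."""
    return x ** (k - 1) * np.exp(-x / theta) / (math.gamma(k) * theta**k)

k, theta = 2.0, 2.0  # same parameters as Example 4
rng = np.random.default_rng(0)  # seed chosen arbitrarily
samples = rng.gamma(k, theta, 100_000)

# Normalized histogram heights, comparable to a probability density
heights, edges = np.histogram(samples, bins=30, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

max_gap = np.max(np.abs(heights - gamma_pdf(centers, k, theta)))
print("Largest gap between histogram and density:", max_gap)
```

The same centers and gamma_pdf(centers, k, theta) values could be passed to plt.plot to overlay the curve on the histogram from Example 4.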

Conclusion

Through these examples, we demonstrated how to use NumPy to sample from gamma distributions. From basic sampling to per-column parameters and visualization, NumPy’s random.gamma function is a practical tool for statistical modeling and data simulation. Whether you’re a statistician, a data scientist, or simply curious about data distributions, these examples should serve as a solid foundation for further exploration and application.