SciPy – Using cluster.hierarchy.fcluster() function (3 examples)

Updated: March 4, 2024 By: Guest Contributor

Clustering is a powerful tool in data science, enabling the identification of intrinsic groupings within data. One of the most widely used implementations of hierarchical clustering is provided by SciPy, a Python library for scientific and technical computing. In this guide, we will explore the fcluster() function from SciPy’s cluster.hierarchy module and illustrate its use with three progressively more involved examples.

What Does cluster.hierarchy.fcluster() Do?

The fcluster() function in SciPy’s cluster.hierarchy module cuts a hierarchical clustering, encoded as a linkage matrix, into flat clusters. The following examples, ranging from simple to more advanced, walk through its main options and show how they behave in practice.
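In outline, fcluster() takes a linkage matrix Z, a threshold t, and a criterion that decides how t is interpreted. Here is a minimal sketch on a tiny made-up dataset (the values are arbitrary, chosen purely to show the call pattern):

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Four toy points forming two obvious groups (hypothetical values)
pts = np.array([[0, 0], [1, 1], [10, 10], [11, 11]])
Z = linkage(pts, 'ward')                       # build the hierarchy
print(fcluster(Z, 2, criterion='maxclust'))    # ask for exactly 2 flat clusters
print(fcluster(Z, 5.0, criterion='distance'))  # or cut the tree at distance 5.0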

Getting Started

Before diving into the examples, ensure SciPy, NumPy, and Matplotlib are installed in your environment. You can install them using pip:

pip install scipy numpy matplotlib

Once installed, you can import the necessary modules for our examples:

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
import matplotlib.pyplot as plt

Example 1: Basic clustering

In our first example, we start with a simple dataset:

data = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 30],
                 [85, 70], [71, 80], [60, 78], [70, 55], [80, 91]])

This example will demonstrate how to perform hierarchical clustering and use fcluster() to create flat clusters:

Z = linkage(data, 'ward')                        # build the hierarchy with Ward linkage
plt.figure()
dendrogram(Z)                                    # visualize the merge structure
plt.show()
clusters = fcluster(Z, 2, criterion='maxclust')  # cut the tree into exactly 2 flat clusters
print(clusters)

Output:

The output indicates the assignment of each data point to one of the two clusters.

The complete code for this example:

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
import matplotlib.pyplot as plt

data = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 30],
                 [85, 70], [71, 80], [60, 78], [70, 55], [80, 91]])

Z = linkage(data, 'ward')
plt.figure()
dendrogram(Z)
plt.show()
clusters = fcluster(Z, 2, criterion='maxclust')
print(clusters)
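One simple way to inspect the result (not part of the original listing, just a common follow-up) is to color the scatter plot by the labels that fcluster() returned. Continuing from the code above, where data and clusters are already defined:

plt.scatter(data[:, 0], data[:, 1], c=clusters, cmap='viridis')
plt.title('Flat clusters produced by fcluster()')
plt.show()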

Example 2: Choosing a distance threshold

In this example, we illustrate the usage of a distance threshold for clustering instead of specifying the number of clusters:

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

data = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 30],
                 [85, 70], [71, 80], [60, 78], [70, 55], [80, 91]])

Z = linkage(data, 'ward')
clusters = fcluster(Z, 10, criterion='distance')  # cut the tree at cophenetic distance 10
print(clusters)

Output:

[3 1 1 2 4 5 7 8 6 9]

This approach groups points by a distance cutoff rather than a fixed cluster count, so the number of clusters emerges from the specified threshold instead of being chosen in advance.
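If you are unsure which threshold to use, a quick sketch like the one below (reusing Z from the listing above; the candidate values are arbitrary) counts how many flat clusters each cutoff produces, which you can cross-check against the dendrogram:

for t in [5, 10, 20, 40]:  # candidate distance thresholds
    labels = fcluster(Z, t, criterion='distance')
    print(f't={t}: {np.unique(labels).size} clusters')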

Example 3: Advanced usage with custom distance metrics

For a more advanced scenario, you can compute the pairwise distances yourself with pdist() and pass the resulting condensed distance matrix to linkage(). This makes it straightforward to swap in whichever distance metric suits your data:

from scipy.spatial.distance import pdist
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

data = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 30],
                 [85, 70], [71, 80], [60, 78], [70, 55], [80, 91]])

Y = pdist(data, 'euclidean')                     # condensed pairwise distance matrix
Z = linkage(Y, 'ward')                           # linkage accepts the precomputed distances
clusters = fcluster(Z, 5, criterion='maxclust')  # cut into 5 flat clusters
print(clusters)

Output:

[1 1 1 1 2 3 4 4 3 5]

This method offers flexibility when dealing with complex data and custom definitions of distance.
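Since pdist() accepts any of its built-in metrics, and even a Python callable, you can plug in a genuinely different distance without changing the rest of the pipeline. Below is a hedged sketch using the Manhattan (city block) metric, which is just one possible choice; note that Ward linkage formally assumes Euclidean distances, so here it is paired with average linkage instead:

from scipy.spatial.distance import pdist
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

data = np.array([[5, 3], [10, 15], [15, 12], [24, 10], [30, 30],
                 [85, 70], [71, 80], [60, 78], [70, 55], [80, 91]])

Y = pdist(data, 'cityblock')   # Manhattan distances instead of Euclidean
Z = linkage(Y, 'average')      # average linkage pairs naturally with non-Euclidean metrics
clusters = fcluster(Z, 5, criterion='maxclust')
print(clusters)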

Conclusion

Through these examples, we’ve demonstrated the versatility and utility of SciPy’s fcluster() function in the realm of hierarchical clustering. Starting with simple applications and moving to more sophisticated scenarios, we’ve seen how fcluster() provides the tools to create both straightforward and complex cluster solutions. With practice, you can leverage these insights to analyze and interpret your own data. Remember, the beauty of hierarchical clustering lies in its flexibility and depth, revealing the underlying structure of data in a way that is both informative and visually compelling.