Sling Academy
Building a Statistics CLI Tool in Rust for Data Analysis

Last updated: January 03, 2025

In this guide, we will walk through creating a simple Command Line Interface (CLI) tool in Rust for data analysis. Rust is known for its safety and performance, making it a great choice for developing efficient CLI applications. Our tool will compute three basic statistics (mean, median, and mode) for a set of numbers passed on the command line.

Setting Up Your Environment

Before we start coding our CLI tool, ensure that you have Rust installed on your system. You can check this by running:

rustc --version

If Rust is not installed, you can follow the official installation guide.

Project Initialization

Create a new Rust project by running:

cargo new stats_cli --bin

This command will create a directory named stats_cli containing a basic Rust project with a binary crate.

Importing Dependencies

To handle command-line input robustly, we will use one external crate. Open Cargo.toml and in the [dependencies] section, add:


[dependencies]
clap = "4.0"

The clap crate is a widely used library for parsing command line arguments. It validates input and generates help and error messages, which will help us handle inputs gracefully.

Planning the CLI Application

Our application should achieve the following:

  • Accept a list of numbers.
  • Calculate mean, median, and mode.
  • Display the results clearly.

To handle these tasks, we’ll create separate functions to compute each statistic and integrate them into our CLI workflow.

Coding the CLI Application

Open src/main.rs and set up your base program structure with argument parsing. Here is the core logic:

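use clap::{value_parser, Arg, Command};

fn main() {
    let matches = Command::new("stats_cli")
        .version("1.0")
        .about("Calculates basic statistics")
        .arg(
            Arg::new("numbers")
                .help("List of numbers")
                .required(true)
                // Accept one or more values for this positional argument.
                .num_args(1..)
                // Let clap parse each value as f64 and report invalid input.
                .value_parser(value_parser!(f64)),
        )
        .get_matches();

    if let Some(values) = matches.get_many::<f64>("numbers") {
        // calculate_median sorts in place, so the vector must be mutable.
        let mut numbers: Vec<f64> = values.copied().collect();
        println!("Mean: {:.2}", calculate_mean(&numbers));
        println!("Median: {:.2}", calculate_median(&mut numbers));
        match calculate_mode(&numbers) {
            Some(mode) => println!("Mode: {:.2}", mode),
            None => println!("Mode: none"),
        }
    }
}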
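
Converting text to f64 can fail on malformed input. One way to surface that failure as an error message rather than a panic is a small helper that returns a Result. The following is a minimal sketch using only the standard library; parse_numbers is a hypothetical name that is not part of the tutorial's code, and rejecting non-finite values such as NaN is an assumption about what the tool should accept.

```rust
/// Hypothetical helper (not part of the tutorial's code): parses each token
/// as f64 and returns an error for the first token that is not a finite number.
fn parse_numbers(raw: &[&str]) -> Result<Vec<f64>, String> {
    raw.iter()
        .map(|token| match token.trim().parse::<f64>() {
            // "NaN" and "inf" parse successfully, so filter them explicitly.
            Ok(value) if value.is_finite() => Ok(value),
            Ok(_) => Err(format!("'{}' is not a finite number", token)),
            Err(err) => Err(format!("invalid number '{}': {}", token, err)),
        })
        // Collecting into Result stops at the first Err.
        .collect()
}
```

A caller could match on the result and print the error before exiting, instead of calling unwrap on each parsed value.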

Implementing Statistical Functions

Now, we need to implement calculate_mean, calculate_median, and calculate_mode. Note that calculate_median sorts its input in place, so it takes a mutable slice. Add the following below the main function:

fn calculate_mean(numbers: &[f64]) -> f64 {
    let sum: f64 = numbers.iter().sum();
    sum / numbers.len() as f64
}

fn calculate_median(numbers: &mut [f64]) -> f64 {
    // total_cmp defines a total order on f64, so sorting cannot panic on NaN.
    numbers.sort_by(|a, b| a.total_cmp(b));
    let mid = numbers.len() / 2;
    if numbers.len() % 2 == 0 {
        // Even count: average the two middle values.
        (numbers[mid - 1] + numbers[mid]) / 2.0
    } else {
        numbers[mid]
    }
}

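fn calculate_mode(numbers: &[f64]) -> Option<f64> {
    use std::collections::HashMap;
    // f64 implements neither Hash nor Eq, so it cannot be a HashMap key.
    // Count occurrences by bit pattern instead (0.0 and -0.0 count separately).
    let mut occurrences: HashMap<u64, usize> = HashMap::new();
    for &value in numbers {
        *occurrences.entry(value.to_bits()).or_insert(0) += 1;
    }
    // When several values tie, which one is returned is unspecified.
    occurrences
        .into_iter()
        .max_by_key(|&(_, count)| count)
        .map(|(bits, _)| f64::from_bits(bits))
}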
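
A data set can have more than one most frequent value, while calculate_mode above returns only one of them. As a hedged alternative, the sketch below sorts a copy of the data and counts runs of equal values, returning every value that shares the highest count. The name calculate_modes is hypothetical and not part of the tutorial's code.

```rust
/// Hypothetical variant: returns every value that shares the highest count,
/// in ascending order, so ties are reported deterministically.
fn calculate_modes(numbers: &[f64]) -> Vec<f64> {
    // Sort a copy so the caller's data is left untouched.
    let mut sorted = numbers.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));

    let mut modes = Vec::new();
    let mut best = 0usize;
    let mut i = 0;
    while i < sorted.len() {
        // Find the end of the run of values equal to sorted[i].
        let mut j = i + 1;
        while j < sorted.len() && sorted[j] == sorted[i] {
            j += 1;
        }
        let run = j - i;
        if run > best {
            // A longer run replaces all previous candidates.
            best = run;
            modes.clear();
            modes.push(sorted[i]);
        } else if run == best {
            // An equally long run is a tie.
            modes.push(sorted[i]);
        }
        i = j;
    }
    modes
}
```

For the input 1, 2, 2, 1, 3 this returns both 1 and 2. An empty input yields an empty vector.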

Running Your Application

With everything set up, you can now compile and run your statistics CLI tool by navigating to your project directory and using:

cargo run -- 5 4 2 7 5 5 8

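This command passes the numbers 5, 4, 2, 7, 5, 5, 8 to the program. Their sum is 36, so the mean is 36 / 7 ≈ 5.14. The sorted list is 2, 4, 5, 5, 5, 7, 8, so the median is 5. The value 5 occurs three times, more than any other, so the mode is 5.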

Conclusion

You now have a basic understanding of how to create a simple statistics CLI tool using Rust. This project can easily be expanded to include more advanced data analysis features such as variance, standard deviation, and even graphical plots. As you continue experimenting, consider adding more functionality and exploring other Rust crates.
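
As one possible starting point for the variance and standard deviation extension, here is a minimal sketch in the same style as the functions above. The names calculate_variance and calculate_std_dev are hypothetical, and the choice of the sample variance (dividing by n - 1 rather than n) is an assumption; use n instead if your data is the whole population.

```rust
/// Hypothetical extension: sample variance (divides by n - 1).
/// Returns None when fewer than two values are given.
fn calculate_variance(numbers: &[f64]) -> Option<f64> {
    if numbers.len() < 2 {
        return None;
    }
    let n = numbers.len() as f64;
    let mean = numbers.iter().sum::<f64>() / n;
    // Sum of squared deviations from the mean.
    let sum_sq: f64 = numbers.iter().map(|x| (x - mean).powi(2)).sum();
    Some(sum_sq / (n - 1.0))
}

/// Hypothetical extension: sample standard deviation, the square root
/// of the sample variance.
fn calculate_std_dev(numbers: &[f64]) -> Option<f64> {
    calculate_variance(numbers).map(f64::sqrt)
}
```

For the values 1, 2, 3, 4, 5 the mean is 3, the squared deviations sum to 10, and the sample variance is 10 / 4 = 2.5.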


Series: Math and Numbers in Rust
