How to Calculate Variance and Standard Deviation in Go

Introduction
Basic Example
Intermediate Example
Advanced Example
Conclusion

Introduction

In this article, we'll look at how to calculate variance and standard deviation in Go. Both are statistical measures of the spread, or dispersion, of a set of numbers. Variance is the average of the squared deviations from the mean, so it is expressed in squared units. The standard deviation is the square root of the variance and is therefore expressed in the same unit as the original data.
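
For reference, the population forms of these measures, which divide by the number of values, can be written as follows. The symbols $x_i$, $N$, $\mu$, and $\sigma$ are notation chosen here for illustration.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
\sigma = \sqrt{\sigma^2}
```

For the values 1, 2, 3, 4, 5, the mean is 3, the squared deviations are 4, 1, 0, 1, 4 (summing to 10), so the variance is 10 / 5 = 2 and the standard deviation is √2 ≈ 1.41.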

Basic Example

Let's start by calculating the variance and standard deviation of a small slice of numbers. The basic implementation below computes the population variance: it divides the sum of squared deviations by the number of values.


package main

import (
    "fmt"
    "math"
)

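// variance returns the population variance of data: the mean of the
// squared deviations from the mean. An empty slice yields NaN (0/0).
func variance(data []float64) float64 {
    m := mean(data) // named m so it does not shadow the mean function
    var sum float64
    for _, value := range data {
        diff := value - m
        sum += diff * diff
    }
    return sum / float64(len(data))
}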

func mean(data []float64) float64 {
    total := 0.0
    for _, value := range data {
        total += value
    }
    return total / float64(len(data))
}

func standardDeviation(data []float64) float64 {
    return math.Sqrt(variance(data))
}

func main() {
    data := []float64{1, 2, 3, 4, 5}
    fmt.Printf("Variance: %.2f\n", variance(data))
    fmt.Printf("Standard Deviation: %.2f\n", standardDeviation(data))
}
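
The variance function above divides by len(data), which gives the population variance. When the data is a sample drawn from a larger population, the usual estimator divides by n-1 instead (Bessel's correction). The following is a minimal sketch of that variant, not part of the original example: sampleVariance is a hypothetical name, and returning NaN for fewer than two values is a design choice assumed here.

```go
package main

import (
    "fmt"
    "math"
)

// sampleVariance is a hypothetical helper (not from the article).
// It divides the sum of squared deviations by n-1 instead of n.
// With fewer than two values the sample variance is undefined,
// so this sketch returns NaN in that case.
func sampleVariance(data []float64) float64 {
    n := len(data)
    if n < 2 {
        return math.NaN()
    }

    // First pass: compute the mean.
    var total float64
    for _, v := range data {
        total += v
    }
    m := total / float64(n)

    // Second pass: accumulate squared deviations from the mean.
    var sumSquares float64
    for _, v := range data {
        d := v - m
        sumSquares += d * d
    }
    return sumSquares / float64(n-1)
}

func main() {
    data := []float64{1, 2, 3, 4, 5}
    v := sampleVariance(data)
    fmt.Printf("Sample Variance: %.2f\n", v)                      // prints 2.50
    fmt.Printf("Sample Standard Deviation: %.2f\n", math.Sqrt(v)) // prints 1.58
}
```

For the same five values, the population variance is 2.00 while the sample variance is 2.50. The gap shrinks as the number of values grows.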

Intermediate Example

Now let's add error handling so the functions deal with edge cases such as an empty slice, which would otherwise produce NaN from a zero-by-zero division.


package main

import (
    "errors"
    "fmt"
    "math"
)

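// variance returns the population variance of data, or an error if
// data is empty.
func variance(data []float64) (float64, error) {
    if len(data) == 0 {
        return 0, errors.New("data slice is empty")
    }
    m := mean(data) // named m so it does not shadow the mean function
    var sum float64
    for _, value := range data {
        diff := value - m
        sum += diff * diff
    }
    return sum / float64(len(data)), nil
}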

func mean(data []float64) float64 {
    total := 0.0
    for _, value := range data {
        total += value
    }
    return total / float64(len(data))
}

func standardDeviation(data []float64) (float64, error) {
    varValue, err := variance(data)
    if err != nil {
        return 0, err
    }
    return math.Sqrt(varValue), nil
}

func main() {
    data := []float64{1, 2, 3, 4, 5}
    varValue, err := variance(data)
    if err != nil {
        fmt.Println("Error:", err)
    } else {
        fmt.Printf("Variance: %.2f\n", varValue)
    }
    stdDev, err := standardDeviation(data)
    if err != nil {
        fmt.Println("Error:", err)
    } else {
        fmt.Printf("Standard Deviation: %.2f\n", stdDev)
    }
}
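
The main function above only exercises the success path. The sketch below shows what a caller might see for an empty slice. It is an illustration under assumptions: ErrEmptyData is a hypothetical sentinel error introduced here so callers can check it with errors.Is, whereas the example above creates its error inline, and populationVariance is a hypothetical name for the same calculation.

```go
package main

import (
    "errors"
    "fmt"
)

// ErrEmptyData is a hypothetical sentinel error (not in the article's code).
var ErrEmptyData = errors.New("data slice is empty")

// populationVariance mirrors the article's variance function but returns
// the sentinel error for empty input.
func populationVariance(data []float64) (float64, error) {
    if len(data) == 0 {
        return 0, ErrEmptyData
    }
    var total float64
    for _, v := range data {
        total += v
    }
    m := total / float64(len(data))

    var sumSquares float64
    for _, v := range data {
        d := v - m
        sumSquares += d * d
    }
    return sumSquares / float64(len(data)), nil
}

func main() {
    // Empty input: the error path is taken.
    if _, err := populationVariance(nil); errors.Is(err, ErrEmptyData) {
        fmt.Println("Error:", err) // prints "Error: data slice is empty"
    }

    // Non-empty input: the success path is taken.
    v, err := populationVariance([]float64{1, 2, 3, 4, 5})
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    fmt.Printf("Variance: %.2f\n", v) // prints "Variance: 2.00"
}
```

A sentinel lets callers distinguish "no data yet" from other failures without comparing error strings.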

Advanced Example

For a more advanced scenario, let's refactor the code into a struct-based approach that accumulates values as they arrive. This suits data sets that grow over time, such as values read from a stream.


package main

import (
    "errors"
    "fmt"
    "math"
)

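// Statistic stores every value added so far.
type Statistic struct {
    data []float64
}

// NewStatistic returns an empty Statistic ready for use.
func NewStatistic() *Statistic {
    return &Statistic{data: []float64{}}
}

// AddData appends a single value to the data set.
func (s *Statistic) AddData(value float64) {
    s.data = append(s.data, value)
}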

func (s *Statistic) Mean() (float64, error) {
    if len(s.data) == 0 {
        return 0, errors.New("no data available")
    }
    mean := s.sum() / float64(len(s.data))
    return mean, nil
}

func (s *Statistic) sum() float64 {
    total := 0.0
    for _, value := range s.data {
        total += value
    }
    return total
}

func (s *Statistic) Variance() (float64, error) {
    mean, err := s.Mean()
    if err != nil {
        return 0, err
    }
    var sumSquares float64
    for _, value := range s.data {
        diff := value - mean
        sumSquares += diff * diff
    }
    return sumSquares / float64(len(s.data)), nil
}

func (s *Statistic) StandardDeviation() (float64, error) {
    varValue, err := s.Variance()
    if err != nil {
        return 0, err
    }
    return math.Sqrt(varValue), nil
}

func main() {
    stats := NewStatistic()
    stats.AddData(1)
    stats.AddData(2)
    stats.AddData(3)
    stats.AddData(4)
    stats.AddData(5)

    variance, err := stats.Variance()
    if err != nil {
        fmt.Println("Error:", err)
    } else {
        fmt.Printf("Variance: %.2f\n", variance)
    }
    stdDev, err := stats.StandardDeviation()
    if err != nil {
        fmt.Println("Error:", err)
    } else {
        fmt.Printf("Standard Deviation: %.2f\n", stdDev)
    }
}
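
The Statistic type above keeps every value in memory and rescans the slice on each call. For long-running streams, one common alternative is Welford's online algorithm, which updates a running mean and a running sum of squared deviations as each value arrives. The sketch below is a swapped-in technique rather than the approach used above, and RunningStats and its method names are hypothetical.

```go
package main

import (
    "errors"
    "fmt"
    "math"
)

// RunningStats is a hypothetical type (not from the article) that tracks
// population variance without storing the individual values.
type RunningStats struct {
    n    int     // number of values seen so far
    mean float64 // running mean
    m2   float64 // running sum of squared deviations from the mean
}

// Add folds one value into the running statistics (Welford's update).
func (r *RunningStats) Add(x float64) {
    r.n++
    delta := x - r.mean
    r.mean += delta / float64(r.n)
    r.m2 += delta * (x - r.mean) // uses the updated mean
}

// Variance returns the population variance of the values seen so far.
func (r *RunningStats) Variance() (float64, error) {
    if r.n == 0 {
        return 0, errors.New("no data available")
    }
    return r.m2 / float64(r.n), nil
}

// StandardDeviation returns the square root of the population variance.
func (r *RunningStats) StandardDeviation() (float64, error) {
    v, err := r.Variance()
    if err != nil {
        return 0, err
    }
    return math.Sqrt(v), nil
}

func main() {
    var rs RunningStats
    for _, x := range []float64{1, 2, 3, 4, 5} {
        rs.Add(x)
    }
    v, err := rs.Variance()
    if err != nil {
        fmt.Println("Error:", err)
        return
    }
    sd, _ := rs.StandardDeviation() // cannot fail: Variance already succeeded
    fmt.Printf("Variance: %.2f\n", v)            // prints 2.00
    fmt.Printf("Standard Deviation: %.2f\n", sd) // prints 1.41
}
```

The trade-off is that the individual values are no longer available afterwards, so this form fits cases where only the summary statistics are needed.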

Conclusion

We calculated variance and standard deviation in Go at three levels of complexity: a basic version, a version with error handling for empty input, and a struct-based version that accumulates values over time. The struct-based approach keeps the data and the statistics methods together, which helps code organization when values arrive incrementally.

Series: Working with Slices in Go
