Machine Learning (ML) is one of the most transformative fields in modern software development, powering applications like image recognition, recommendation systems, and predictive analytics. Traditionally, languages like Python and R have dominated this space due to their extensive libraries and community support. However, Rust — known for its speed, memory safety, and zero-cost abstractions — is quickly emerging as a powerful alternative for performance-critical machine learning tasks.

Rust combines the efficiency of C/C++ with the safety and expressiveness of a modern language. Its strong type system, thread safety, and high-performance capabilities make it an excellent choice for building fast and reliable ML pipelines — especially when models need to be deployed in production systems where performance and safety matter most.

In this tutorial, you’ll learn the fundamentals of Machine Learning in Rust through a hands-on example. We’ll walk you through setting up your Rust environment, exploring popular ML crates, and building a simple Iris flower classifier using the linfa ecosystem — a growing collection of Rust libraries inspired by Python’s scikit-learn.

By the end of this tutorial, you will have:

Learned how to set up Rust for machine learning development
Explored how Rust handles data and modeling using ndarray and linfa
Built, trained, and evaluated your first ML model in Rust

Let’s get started and see how Rust can bring both performance and safety to your machine learning workflow.

Prerequisites

Before we dive into coding, let’s make sure your development environment is ready to build and run machine learning applications in Rust. This section covers everything you need to get started smoothly.

Basic Knowledge

You should have:

A fundamental understanding of the Rust programming language, including how to create projects, use crates, and run code with Cargo.
A general idea of machine learning concepts, such as datasets, training, and evaluation. (Don’t worry if you’re new — we’ll explain the essentials as we go.)

1. Install Rust

If you haven’t installed Rust yet, the easiest way is through rustup, the official Rust installer and version manager.
Run the following command in your terminal or command prompt:

curl https://sh.rustup.rs -sSf | sh

Follow the on-screen instructions, then verify the installation:

rustc --version

This command should display the installed Rust compiler version.

2. Set Up Your Editor

You can use any text editor or IDE you prefer, but the recommended setup is:

VS Code with the Rust Analyzer extension for syntax highlighting, autocompletion, and in-editor diagnostics.
Alternatively, JetBrains CLion or RustRover also provides great Rust support.

3. Create a Working Directory

Choose a folder where you’ll store your Rust ML projects, then open it in your terminal. We’ll create our first project in the next section.

4. Dataset

We’ll use the well-known Iris dataset, which is commonly used for beginner-level machine learning tasks. The dataset is available directly from the linfa-datasets crate, so you don’t need to download it manually.

5. Cargo and Crates

Make sure you’re familiar with Cargo, Rust’s package manager and build tool. You’ll use it to:

Add dependencies (ML and data crates)
Run and build your project
Manage versions and configurations

Once you have Rust installed and your editor ready, you’re all set to start coding.

Setting Up the Project

Now that your environment is ready, let’s create a new Rust project and set up the dependencies needed for our machine learning example.

1. Create a New Rust Project

Open your terminal and run the following commands:

cargo new rust-ml-intro
cd rust-ml-intro

This creates a new directory named rust-ml-intro with the standard Rust project structure:

rust-ml-intro/
├── Cargo.toml
└── src/
    └── main.rs

Cargo.toml – The configuration file for your project, where you’ll define dependencies and metadata.
src/main.rs – The main entry point of your Rust application.

2. Add Dependencies

Next, open the Cargo.toml file and add the following dependencies under the [dependencies] section:

[dependencies]
linfa = "0.8.0"
linfa-datasets = "0.8.0"
linfa-trees = "0.8.0"

Here’s what each crate does:

linfa: The core machine learning toolkit for Rust, providing traits, data structures, and model-building utilities.
linfa-datasets: A companion crate that includes built-in datasets like Iris and utilities for loading your own.
linfa-trees: Implements decision tree models, which we’ll use to classify Iris flowers.
ndarray: Provides powerful N-dimensional array data structures similar to NumPy arrays in Python.

After saving the file, run:

cargo build

This command downloads and compiles all dependencies. The first build might take a few minutes, depending on your internet connection and system performance.

3. Project Verification

Once the build completes successfully, test that everything works by running:

cargo run

You should see the default “Hello, world!” message. This means your Rust environment is properly configured, and the dependencies are installed correctly.

4. Ready for Machine Learning

At this point, your Rust project is ready to handle datasets and build ML models.
In the next section, we’ll take a closer look at how machine learning works in Rust and the key crates that make it possible.

Understanding Machine Learning in Rust

Before we dive into coding, let’s take a moment to understand how machine learning fits into the Rust ecosystem. While Rust is still relatively new in the data science world, it’s rapidly gaining attention for its performance, safety, and reliability — three critical pillars for production-grade machine learning systems.

Why Machine Learning in Rust?

Rust brings unique strengths to the ML space that set it apart from traditional languages like Python or R:

Performance: Rust compiles to native code and delivers C/C++-level speed — ideal for computation-heavy ML workloads.
Memory Safety: Rust’s ownership system eliminates entire classes of bugs (like null pointer dereferences or data races) without a garbage collector.
Concurrency: Rust makes it easier to write safe, concurrent ML applications that can scale efficiently across multiple threads.
Integration: It’s easy to integrate Rust models or computations into existing Python or C++ systems, thanks to excellent FFI (Foreign Function Interface) support.

In short, Rust is not just about writing fast code, but writing safe and maintainable code — a huge advantage in machine learning pipelines that often process massive datasets.

The Rust ML Ecosystem

Although smaller compared to Python’s ecosystem, Rust’s machine learning landscape is growing fast. Here are some of the most notable crates:

Linfa: A suite of Rust ML algorithms inspired by scikit-learn. It offers a consistent API for tasks like classification, clustering, and regression.
SmartCore: Another powerful ML library implementing algorithms such as logistic regression, random forests, and support vector machines.
Burn: A deep learning framework for building neural networks with GPU acceleration, similar to PyTorch.
tch-rs: Rust bindings for PyTorch’s C++ API — useful for leveraging existing deep learning models.
ndarray: The Rust equivalent of NumPy, providing N-dimensional arrays and numerical operations.

For this tutorial, we’ll focus on Linfa, as it provides a beginner-friendly, modular approach to machine learning in Rust — perfect for learning the basics.

How Rust Handles Data

In Rust, datasets are typically represented using the ndarray crate. It provides two key structures:

Array1 for one-dimensional arrays (vectors)
Array2 for two-dimensional arrays (matrices)

For example:

use ndarray::array;

let x = array![[1., 2.], [3., 4.], [5., 6.]];
println!("{:?}", x);

This data structure is used internally by ML crates like Linfa to represent features and targets.

Building a Simple ML Model

Now it’s time to get hands-on and build your first machine learning model in Rust. We’ll use the Iris dataset, a classic dataset for beginners in ML, and train a Decision Tree classifier to predict the species of iris flowers based on their sepal and petal measurements.

1. About the Iris Dataset

The Iris dataset contains 150 samples of iris flowers, each with four features:

Sepal length
Sepal width
Petal length
Petal width

Each sample belongs to one of three species:

Iris setosa
Iris versicolor
Iris virginica

Our goal is to train a model that can accurately predict the flower’s species from its measurements.

2. Load and Explore the Dataset

Open the src/main.rs file and replace its content with the following:

use linfa::prelude::*;
use linfa_datasets::iris;
use linfa_trees::DecisionTree;
use std::collections::HashSet;

fn main() {
    // Load the Iris dataset
    let dataset = iris();

    // Display basic dataset information
    println!("Number of samples: {}", dataset.nsamples());
    println!("Number of features: {}", dataset.nfeatures());
    let unique_targets: HashSet<_> = dataset.targets().iter().cloned().collect();
    println!("Target names: {:?}", unique_targets);
}

Run the program with:

cargo run

You should see output similar to:

Number of samples: 150
Number of features: 4
Target names: {0, 2, 1}

This confirms that the dataset is successfully loaded and ready for training.

3. Split the Data into Training and Test Sets

To evaluate our model properly, we’ll split the dataset into training and testing portions — typically 80% for training and 20% for testing.

Add this to your code after loading the dataset:

    // Split the dataset into 80% training and 20% testing
    let (train, test) = dataset.split_with_ratio(0.8);

    println!("Training samples: {}", train.nsamples());
    println!("Testing samples: {}", test.nsamples());

4. Train the Decision Tree Model

Now we can create and train a decision tree classifier using the training data:

    // Train the Decision Tree model
    let model = DecisionTree::params().fit(&train).unwrap();

The .fit() method trains the model based on the provided training dataset.

5. Make Predictions

Once the model is trained, we can use it to make predictions on the unseen test data:

    // Predict using the test data
    let predictions = model.predict(&test);

6. Evaluate Model Accuracy

Finally, we’ll check how accurate our model is using the confusion matrix, which compares predicted and actual labels:

    // Evaluate model accuracy
    let cm = predictions.confusion_matrix(&test).unwrap();
    let accuracy = cm.accuracy();
    println!("Model accuracy: {:.2}%", accuracy * 100.0);

7. Full Code Example

Here’s the complete src/main.rs file:

use linfa::prelude::*;
use linfa_datasets::iris;
use linfa_trees::DecisionTree;
use std::collections::HashSet;

fn main() {
    // Load the Iris dataset
    let dataset = iris();

    // Display basic dataset information
    println!("Number of samples: {}", dataset.nsamples());
    println!("Number of features: {}", dataset.nfeatures());
    let unique_targets: HashSet<_> = dataset.targets().iter().cloned().collect();
    println!("Target names: {:?}", unique_targets);

    // Split the dataset into 80% training and 20% testing
    let (train, test) = dataset.split_with_ratio(0.8);

    println!("Training samples: {}", train.nsamples());
    println!("Testing samples: {}", test.nsamples());

    // Train the Decision Tree model
    let model = DecisionTree::params().fit(&train).unwrap();

    // Predict using the test data
    let predictions = model.predict(&test);

    // Evaluate model accuracy
    let cm = predictions.confusion_matrix(&test).unwrap();
    let accuracy = cm.accuracy();
    println!("Model accuracy: {:.2}%", accuracy * 100.0);
}

With this, you’ve successfully trained and evaluated your first machine learning model in Rust! 🎉

Running and Testing the Machine Learning Model

At this point, your Rust machine learning project is fully set up, and the model code is ready. Now, let’s run and test it to see everything in action.

1. Build the Project

First, compile the project to ensure all dependencies are resolved and there are no syntax issues:

cargo build

If everything goes well, you should see output similar to this:

Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.13s

This means the project was built successfully.

2. Run the Application

Now, execute your Rust program using:

cargo run

If you’re following the example from the previous section, the output should look something like:

Number of samples: 150
Number of features: 4
Target names: {0, 2, 1}
Training samples: 120
Testing samples: 30
Model accuracy: 70.00%

✅ What’s happening here:

The program loads the Iris dataset using Linfa.
It trains a K-Nearest Neighbors (KNN) classifier.
Then it evaluates the model’s accuracy and displays the confusion matrix.

You’ve just trained and tested a fully working ML model in Rust — impressive!

3. Try Running in Release Mode

To see how fast Rust can really be, run it in release mode:

cargo run --release

Rust’s optimized compiler will make the training and prediction lightning fast — especially compared to interpreted languages like Python.

4. Debugging Tips

If you encounter any errors or unexpected results:

Make sure your dependencies in Cargo.toml match exactly:

[dependencies]
linfa = "0.8.0"
linfa-datasets = { version = "0.8.0", features = ["iris"] }

Run cargo clean && cargo build to rebuild everything from scratch.
Use println! generously to inspect data shapes and contents.

5. Testing Your Model

You can add a simple unit test to verify that the model’s accuracy meets a minimum threshold:

#[cfg(test)]
mod tests {
    use super::*;
    use linfa_datasets::iris;
    use linfa_nn::{ LinearSearch, NearestNeighbour, distance::L2Dist };
    use ndarray::Array1;
    use std::collections::HashMap;

    #[test]
    fn test_knn_accuracy() {
        let dataset = iris();
        let (train, valid) = dataset.split_with_ratio(0.8);

        let train_records = train.records().to_owned();
        let train_targets = train.targets().to_owned();
        let valid_records = valid.records();

        let nn_index = LinearSearch::new()
            .from_batch(&train_records, L2Dist)
            .expect("Failed to build index");

        let mut predictions = Vec::with_capacity(valid_records.nrows());
        for sample in valid_records.outer_iter() {
            let neighbours = nn_index.k_nearest(sample, 3).unwrap();

            let mut votes = HashMap::new();
            for (_pt, idx) in neighbours {
                *votes.entry(train_targets[idx]).or_insert(0) += 1;
            }
            let predicted = votes
                .into_iter()
                .max_by_key(|(_, c)| *c)
                .unwrap().0;
            predictions.push(predicted);
        }

        let pred_ds = Dataset::new(valid_records.to_owned(), Array1::from(predictions));
        let acc = pred_ds.confusion_matrix(&valid).unwrap().accuracy();
        assert!(acc > 0.8, "Model accuracy too low: {}", acc);
    }
}

Run the test with:

cargo test

If the model performs as expected, you’ll see:

running 1 test
test tests::test_knn_accuracy ... ok

✅ At this point, you’ve:

Successfully built and run an ML model in Rust.
Evaluated its performance.
Written a test to validate it automatically.

Conclusion and Next Steps

In this tutorial, you learned how to implement a K-Nearest Neighbors (KNN) algorithm in Rust using the linfa ecosystem, specifically the linfa_nn crate for nearest-neighbor searches. You went through the essential steps — setting up dependencies, preparing and normalizing data, training the model, and making predictions using Euclidean distance (L2Dist).

By following this guide, you’ve gained insight into:

How to structure a machine learning workflow in Rust.
How to use ndarray for handling datasets efficiently.
How to perform nearest-neighbor queries with linfa_nn::LinearSearch.

The Linfa framework aims to provide a comprehensive suite of machine learning algorithms in Rust, and the KNN example is an excellent first step in exploring its power and simplicity.

Next Steps

To continue learning and expanding your skills, here are a few directions to explore:

Use a real dataset — try loading data from CSV files using the linfa-datasets crate.
Experiment with distance metrics — explore linfa_nn::distance::CosineDist or linfa_nn::distance::ManhattanDist.
Tune parameters — adjust k to find the optimal number of neighbors for better accuracy.
Integrate with other Linfa modules — such as linfa-bayes, linfa-trees, or linfa-clustering to build more complex ML pipelines.
Visualize results — integrate Rust with Python via PyO3 or output data for visualization in Jupyter notebooks.

Rust’s safety, concurrency, and performance make it a great choice for machine learning projects that require speed and reliability. With Linfa, Rust is quickly becoming a viable language for data science and AI development.

You can get the full source code on our GitHub.

That's just the basics. If you need more deep learning about Rust, you can take the following cheap course:

Thanks!

Getting Started with Machine Learning in Rust

Learn how to build a K-Nearest Neighbors (KNN) model in Rust using Linfa and linfa_nn for efficient nearest-neighbor search and classification.

Table of Contents:

Prerequisites

Basic Knowledge

1. Install Rust

2. Set Up Your Editor

3. Create a Working Directory

4. Dataset

5. Cargo and Crates

Setting Up the Project

1. Create a New Rust Project

2. Add Dependencies

3. Project Verification

4. Ready for Machine Learning

Understanding Machine Learning in Rust

Why Machine Learning in Rust?

The Rust ML Ecosystem

How Rust Handles Data

Building a Simple ML Model

1. About the Iris Dataset

2. Load and Explore the Dataset

3. Split the Data into Training and Test Sets

4. Train the Decision Tree Model

5. Make Predictions

6. Evaluate Model Accuracy

7. Full Code Example

Running and Testing the Machine Learning Model

1. Build the Project

2. Run the Application

3. Try Running in Release Mode

4. Debugging Tips

5. Testing Your Model

Conclusion and Next Steps

Next Steps

Getting Started with Machine Learning in Rust

Learn how to build a K-Nearest Neighbors (KNN) model in Rust using Linfa and linfa_nn for efficient nearest-neighbor search and classification.

Table of Contents:

Prerequisites

Basic Knowledge

1. Install Rust

2. Set Up Your Editor

3. Create a Working Directory

4. Dataset

5. Cargo and Crates

Setting Up the Project

1. Create a New Rust Project

2. Add Dependencies

3. Project Verification

4. Ready for Machine Learning

Understanding Machine Learning in Rust

Why Machine Learning in Rust?

The Rust ML Ecosystem

How Rust Handles Data

Building a Simple ML Model

1. About the Iris Dataset

2. Load and Explore the Dataset

3. Split the Data into Training and Test Sets

4. Train the Decision Tree Model

5. Make Predictions

6. Evaluate Model Accuracy

7. Full Code Example

Running and Testing the Machine Learning Model

1. Build the Project

2. Run the Application

3. Try Running in Release Mode

4. Debugging Tips

5. Testing Your Model

Conclusion and Next Steps

Next Steps

Related Articles