How to clone only a subdirectory from a Git repository

Updated: January 28, 2024 By: Guest Contributor

Introduction

Large Git repositories often contain far more than you need for a given task, and developers frequently want to work with just one subdirectory. This tutorial shows how to check out only a particular subdirectory of a Git repository, rather than the entire project, using Git's sparse-checkout feature. By the end of this guide, you will be able to limit your working directory to the paths you actually work on, saving disk space and checkout time.

Understanding the Sparse-checkout Feature

Introduced in Git version 1.7, the sparse-checkout feature lets you populate your working directory with only selected parts of a repository. Without it, the only way to work with a subdirectory is to clone and check out the entire repository and then navigate to the subdirectory. Note that sparse-checkout limits which files are written to the working directory; the repository history is still fetched in full. The dedicated git sparse-checkout command, used later in this guide, was added in Git 2.25.

Prerequisites

  • Git 2.25 or higher installed (required for the git sparse-checkout command; the manual steps also work on older versions).
  • A URL to the Git repository.
  • A basic understanding of Git commands.

Step-by-Step Instructions

Step 1: Initialize a New Repository

Initialize a new local repository and change into it using the following commands:

git init <repo-name>
cd <repo-name>

Step 2: Enable Sparse-checkout

To enable sparse-checkout, you can use the following command:

git config core.sparseCheckout true

Step 3: Add Remote Repository

Add the remote repository from which you wish to clone the subdirectory:

git remote add origin <repository-url>

Step 4: Define the Subdirectory to be Cloned

Create a file named sparse-checkout in the .git/info directory that lists the paths to check out, using the same pattern syntax as .gitignore. You can do so with the following command:

echo '<path-to-subdirectory>/*' > .git/info/sparse-checkout

Step 5: Pull the Subdirectory

Pull from the remote. Git fetches the branch and writes only the paths that match the sparse-checkout file into your working directory:

git pull origin <branch-name>

Replace <branch-name> with the name of the branch you want to pull from, usually master or main.
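
To see what the five steps actually produce, the following is a minimal sketch that runs them against a throwaway local repository instead of a real remote. The directory names docs and src, the file names, and the branch name main are assumptions made up for this demonstration; they do not come from any particular project.

```shell
set -e
# Build a throwaway "remote" with two directories (all names are illustrative).
demo=$(mktemp -d)
origin="$demo/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/main
mkdir -p "$origin/docs" "$origin/src"
echo "guide" > "$origin/docs/guide.md"
echo "code" > "$origin/src/main.c"
git -C "$origin" add .
git -C "$origin" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Steps 1-5 from above, pointed at the throwaway remote.
work="$demo/work"
git init -q "$work"
cd "$work"
git config core.sparseCheckout true
git remote add origin "$origin"
mkdir -p .git/info   # normally created by git init; harmless if it already exists
echo 'docs/*' > .git/info/sparse-checkout
git pull -q origin main

# Files present in the working directory (only docs/ is expected here).
checked_out=$(find . -path ./.git -prune -o -type f -print)
echo "working directory: $checked_out"

# Files recorded in the commit: the whole tree, including src/.
git ls-tree -r --name-only HEAD

# Index entries that were skipped during checkout are prefixed with "S".
skipped=$(git ls-files -t | grep '^S ' || true)
echo "skipped: $skipped"
```

The comparison between the working directory and git ls-tree illustrates that the history still contains every file; only the checkout is restricted.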

Practical Examples

Cloning a Single Subdirectory

The following is an example of cloning a single subdirectory from a repository:

git init my-repo
cd my-repo
git remote add origin https://github.com/user/repo.git
echo 'subdirectory/*' > .git/info/sparse-checkout
git config core.sparseCheckout true
git pull origin master
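
On Git 2.25 or higher, roughly the same result can be reached with git clone --sparse followed by git sparse-checkout set. The sketch below tries this on a throwaway local repository; the directory names docs and src and the branch name main are assumptions for the demonstration. On recent Git versions that default to cone mode, files in the repository root are checked out as well. Against a real server that supports partial clone, adding --filter=blob:none to git clone can also defer downloading file contents; that option is left out here because it depends on server support.

```shell
set -e
# Throwaway "remote" with two directories (all names are illustrative).
demo=$(mktemp -d)
origin="$demo/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/main
mkdir -p "$origin/docs" "$origin/src"
echo "guide" > "$origin/docs/guide.md"
echo "code" > "$origin/src/main.c"
git -C "$origin" add .
git -C "$origin" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Clone with sparse-checkout already enabled, then choose the directory to keep.
work="$demo/work"
git clone -q --sparse "$origin" "$work"
git -C "$work" sparse-checkout set docs

# List what ended up in the working directory.
find "$work" -path "$work/.git" -prune -o -type f -print
```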

Updating Sparse-Checkouts

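To replace the sparse-checkout list with a different directory, use the git sparse-checkout command (Git 2.25 or higher). It rewrites .git/info/sparse-checkout and updates the working directory immediately. Pass the directory name itself; an unquoted another-subdir/* may be expanded by the shell before Git sees it:

git sparse-checkout set another-subdir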
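
As a rough illustration of how set replaces the list rather than extending it, the sketch below switches a sparse working directory from one directory to another and prints the active list. It uses a throwaway local repository; the directory names docs and src and the branch name main are assumptions for the demonstration.

```shell
set -e
# Throwaway "remote" with two directories (all names are illustrative).
demo=$(mktemp -d)
origin="$demo/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/main
mkdir -p "$origin/docs" "$origin/src"
echo "guide" > "$origin/docs/guide.md"
echo "code" > "$origin/src/main.c"
git -C "$origin" add .
git -C "$origin" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

work="$demo/work"
git clone -q --sparse "$origin" "$work"
cd "$work"

# Start with docs only.
git sparse-checkout set docs

# Replace the list: docs is removed from the working directory, src appears.
git sparse-checkout set src

# Show the entries currently in effect.
git sparse-checkout list
```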

Complex Sparse-Checkouts

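For more complex sparse-checkouts involving multiple subdirectories or patterns, list each pattern on its own line in the sparse-checkout file. If the repository has already been checked out, re-apply the patterns to the working directory with git read-tree, because a git pull that finds nothing new will not re-apply them:

echo 'subdir1/*' >> .git/info/sparse-checkout
echo 'subdir2/*.txt' >> .git/info/sparse-checkout

git read-tree -mu HEAD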
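
The following sketch tries out a multi-pattern list on a throwaway local repository: it starts with one pattern, appends a second, and then re-applies the list to an existing checkout with git read-tree -mu HEAD. The file names, the directory named other, and the branch name main are assumptions invented for the demonstration.

```shell
set -e
# Throwaway "remote" with three directories (all names are illustrative).
demo=$(mktemp -d)
origin="$demo/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/main
mkdir -p "$origin/subdir1" "$origin/subdir2" "$origin/other"
echo "code" > "$origin/subdir1/app.c"
echo "notes" > "$origin/subdir2/notes.txt"
echo "binary" > "$origin/subdir2/data.bin"
echo "misc" > "$origin/other/readme.md"
git -C "$origin" add .
git -C "$origin" -c user.name=demo -c user.email=demo@example.com commit -q -m "initial"

# Initial sparse checkout with a single pattern.
work="$demo/work"
git init -q "$work"
cd "$work"
git config core.sparseCheckout true
git remote add origin "$origin"
mkdir -p .git/info   # normally created by git init; harmless if it already exists
echo 'subdir1/*' > .git/info/sparse-checkout
git pull -q origin main

# Append a second pattern, then re-apply the list to the existing checkout.
echo 'subdir2/*.txt' >> .git/info/sparse-checkout
git read-tree -mu HEAD

# List what ended up in the working directory.
find . -path ./.git -prune -o -type f -print
```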

Conclusion

As this tutorial has shown, checking out only a subdirectory of a Git repository is straightforward with Git's sparse-checkout feature. It helps you manage large repositories by keeping only the paths relevant to your task in the working directory. By following the steps and example commands in this guide, you can reduce overhead and focus on the parts of a code base that matter for your work.