Monday, March 10, 2025

Handling Large Git Repositories

Hello, dear DevOps, DevSecOps, and SRE team heroes! Here is a new challenge: solving a common problem with Git. You may be using GitHub, GitLab, or Bitbucket for source code management. Some projects, such as websites or mobile apps, need to store images, audio files, or video files that are large in size. When such repositories are transferred to client systems, teams run into the following issues:
  1. Slowness in git clone and fetch operations
  2. Sluggish commits and status checks
  3. Repository size bloat
  4. Complexity in managing multiple branches

Git LFS installation

On Debian or Ubuntu, install the package (as root or with sudo):
apt install -y git-lfs
Git LFS (Large File Storage) handles large files in a Git repository efficiently by replacing them with lightweight pointers while storing the actual file contents in a separate location. Here are various examples of using Git LFS, including tracking, untracking, checking status, and more:

1. Initialize Git LFS

Before using Git LFS in a repository, initialize it:
git lfs install
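This adds the Git LFS filters to your global Git configuration and, when run inside a repository, installs the LFS hooks there. It only needs to be run once per user account.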
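Because later steps fail in confusing ways when the extension is missing, a script can check for it first. The following is a minimal sketch, assuming a POSIX shell; the helper name `have_cmd` and the variable `lfs_available` are hypothetical and not part of Git or Git LFS.

```shell
#!/bin/sh
# have_cmd is a hypothetical helper: succeeds if the named program is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd git-lfs; then
    # Print the installed Git LFS version, e.g. for a build log.
    git lfs version
    lfs_available=yes
else
    echo "git-lfs not found; install it before running 'git lfs install'" >&2
    lfs_available=no
fi
```

A CI job could use the same check to stop early instead of committing large files as ordinary Git blobs.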

2. Track Large Files

To track specific file types, use:
git lfs track "*.psd"
or track a specific file:
git lfs track "bigfile.zip"
This updates the `.gitattributes` file to include:
*.psd filter=lfs diff=lfs merge=lfs -text
After tracking, commit the `.gitattributes` file:
git add .gitattributes
git commit -m "Track large files with Git LFS"
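To confirm that a pattern in `.gitattributes` really applies to a given path, `git check-attr` can be queried. This is a minimal sketch, assuming a throwaway repository created with `mktemp`; the file names `design.psd` and `notes.txt` are only illustrations, and the attribute line is written by hand so the sketch runs even where Git LFS is not installed (normally `git lfs track` writes it).

```shell
#!/bin/sh
set -e

# Work in a scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# The same line that tracking "*.psd" adds to .gitattributes.
printf '%s\n' '*.psd filter=lfs diff=lfs merge=lfs -text' > .gitattributes

# Ask Git which filter applies to each path; the files need not exist.
psd_attr=$(git check-attr filter -- design.psd)
txt_attr=$(git check-attr filter -- notes.txt)

echo "$psd_attr"   # design.psd: filter: lfs
echo "$txt_attr"   # notes.txt: filter: unspecified
```

A path that reports `filter: unspecified` would be stored as a regular Git blob, which is usually the first thing to check when a large file did not end up in LFS.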
---

3. Check Tracked Files

To see which files are being tracked by Git LFS:
git lfs track
---

4. Check LFS Status

To see which LFS files are staged for commit, modified but not yet staged, or waiting to be pushed:
git lfs status
---

5. Untrack a File

If you no longer want a file to be tracked by Git LFS:
git lfs untrack "bigfile.zip"
This removes it from `.gitattributes`. Then commit the change:
git add .gitattributes
git commit -m "Untrack bigfile.zip from Git LFS"
Important: This does not remove files from previous commits. Versions that were already committed through Git LFS stay in LFS storage.

6. List LFS Objects

To see which LFS files exist in your repository:
git lfs ls-files
---

7. Migrate Large Files (if added before tracking)

If you accidentally committed a large file before tracking it with Git LFS, migrate it. Note that this rewrites the affected commits, so their hashes change:
git lfs migrate import --include="*.zip"
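Before migrating, it helps to know which blobs in the history are actually large. The pipeline below is a sketch that uses only plain Git plumbing (`git rev-list` and `git cat-file --batch-check`); the function name `largest_blobs` and the demo file names are made up for illustration. `git lfs migrate info` gives a related summary grouped by file type.

```shell
#!/bin/sh
set -e

# largest_blobs N: print the N largest blobs reachable from any ref
# as "<size-in-bytes> <path>", biggest first. (Hypothetical helper.)
largest_blobs() {
    git rev-list --objects --all |
        git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
        sed -n 's/^blob //p' |
        sort -k1,1nr |
        head -n "$1"
}

# Demo in a scratch repository with one 4 KiB file and one tiny file.
repo=$(mktemp -d)
cd "$repo"
git init -q .
dd if=/dev/zero of=big.bin bs=1024 count=4 2>/dev/null   # stand-in for a large asset
echo "hello" > small.txt
git add big.bin small.txt
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add demo files"

top=$(largest_blobs 1)
echo "$top"   # 4096 big.bin
```

The paths it reports are natural candidates for the `--include` pattern passed to the migrate command.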
---

8. Push and Fetch LFS Files

After committing, push LFS files to the remote:
git push origin main
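To pull changes together with their LFS files, use git pull. To download LFS objects without updating the working tree, use git lfs fetch (git lfs pull also checks them out):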
git pull
git lfs fetch
---

9. Removing LFS Files From History (if needed)

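If a large file was added before tracking and you want to take its contents out of regular Git history, convert it to LFS pointers across every branch and tag:
git lfs migrate import --include="bigfile.zip" --everything
Then, force push the rewritten history. This overwrites the remote branches, so warn collaborators first: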
git push origin --force --all
---

10. Verify LFS Files in Remote Repository

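To list the LFS files in the current checkout with their full object IDs (this reads the local repository, so run it in a fresh clone to confirm what the remote actually serves):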
git lfs ls-files --long
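Git LFS identifies each object by the SHA-256 of its contents, so the ID printed by `git lfs ls-files --long` can be compared against a local copy of the file. This is a minimal sketch, assuming `sha256sum` (or `shasum` on macOS) is available; `lfs_oid` is a hypothetical helper name.

```shell
#!/bin/sh
# lfs_oid FILE: print the SHA-256 of FILE, which is the object ID
# Git LFS uses for that content. (Hypothetical helper.)
lfs_oid() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# Demo on an empty scratch file; its SHA-256 is a well-known constant.
tmp=$(mktemp)
empty_oid=$(lfs_oid "$tmp")
echo "$empty_oid"
rm -f "$tmp"
```

If the value for a working-tree file does not match the listed ID, the file has either been modified locally or is still a pointer rather than the real content.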

Common mistakes working with Git LFS

  • Not Tracking Files Properly: Forgetting to use git lfs track for large files can lead to them being stored in the repository instead of being managed by Git LFS.
  • Ignoring the .gitattributes File: The .gitattributes file is crucial for Git LFS to function correctly. Failing to commit this file can cause issues for collaborators.
  • Pushing Without Installing Git LFS: If Git LFS isn't installed on your system, pushing large files will fail or result in errors.
  • Exceeding Storage Limits: Platforms like GitHub have storage limits for Git LFS. Exceeding these limits can block further uploads.
  • Cloning Without Git LFS: If you clone a repository without Git LFS installed, you might end up with pointer files instead of the actual large files.
  • Using Git LFS for Small Files: Git LFS is designed for large files. Using it for small files can unnecessarily complicate your workflow.
  • Not Cleaning Up Old Files: Over time, unused large files can accumulate in the LFS storage, increasing costs or storage usage.
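One way to catch the "pointer files instead of the actual large files" situation is to inspect the file itself: an LFS pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`. The sketch below assumes that format; `is_lfs_pointer` and the sample files are hypothetical.

```shell
#!/bin/sh
# is_lfs_pointer FILE: succeed if FILE looks like a Git LFS pointer,
# i.e. its first line is the LFS spec version line. (Hypothetical helper.)
is_lfs_pointer() {
    first_line=$(head -n 1 "$1" 2>/dev/null | tr -d '\0')
    [ "$first_line" = "version https://git-lfs.github.com/spec/v1" ]
}

dir=$(mktemp -d)

# Hand-written pointer for illustration; the oid is the SHA-256 of empty content.
printf '%s\n' \
    'version https://git-lfs.github.com/spec/v1' \
    'oid sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855' \
    'size 0' > "$dir/pointer.bin"
printf 'real binary payload\n' > "$dir/real.bin"

for f in "$dir/pointer.bin" "$dir/real.bin"; do
    if is_lfs_pointer "$f"; then
        echo "${f##*/}: LFS pointer (run 'git lfs pull' to fetch the content)"
    else
        echo "${f##*/}: regular content"
    fi
done
```

Running a check like this over the expected asset paths after a clone makes a missing Git LFS installation visible immediately.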
Please write your learnings in the comments.
