This article was written by guest authors – and GitKon speakers – Nitish Garg and Andreas Hermann.
Nitish is a Staff Engineer at Rippling with over five years of experience specializing in Developer Productivity and Infrastructure Engineering. He has also previously worked at Bloomberg. Nitish’s educational background includes degrees from the University of Florida (UF) and the Indian Institute of Technology (IIT) Delhi.
Andreas is a physicist turned software engineer who currently leads the Scalable Builds Group at Tweag, contributing to Bazel, Buck2, and Pants. As Google’s first Bazel Community Expert, he maintains Bazel extensions and provides professional services.
Hear more about this topic in Andreas and Nitish’s GitKon session, Whose Repo is it Anyway? Dueling Developer Perspectives on Mono v Multi-Repo. Watch on-demand here (expand the video description and use the chapters to jump to their talk).
As monorepos continue to gain traction, especially in large-scale projects, it’s crucial to weigh the benefits against the potential costs. With the right tooling, many of those costs can be mitigated, making monorepos an attractive option for many organizations, particularly large ones.
Benefits of Monorepos
Unified Versioning: Everything at a single version. No mismatch between shared libraries. No cascading dependency and version updates. This streamlined approach to versioning significantly reduces integration issues, making development processes more efficient and predictable.
Single Source of Truth: All your codebase in one place, making it easier to oversee and manage. A centralized codebase enhances visibility and control, which is crucial for maintaining high-quality standards across projects.
Code Sharing: Shared utilities or libraries are instantly available to all projects without the need for separate package management. This immediate availability of shared resources fosters innovation and reduces redundancy in coding efforts.
Atomic Changes: Update multiple projects with a single commit, ensuring coordinated changes across modules. This ensures consistency and integrity of updates, which is vital for complex systems with interdependent components.
Simplified Dependency Management: All projects use the same versions of third-party dependencies, reducing the chances of conflicts. Consistent dependency management minimizes compatibility issues, streamlining the maintenance and update processes.
Collaboration: Encourages a cohesive developer experience, fostering better collaboration across teams. Improved collaboration leads to enhanced productivity and a more integrated approach to problem-solving.
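To make the unified versioning and dependency management points concrete, here is a minimal Python sketch of the kind of version drift that can build up when projects pin third-party dependencies independently. The project names, package pins, and the `find_version_conflicts` helper are all hypothetical and chosen purely for illustration; in a monorepo with a single shared set of pins, this check would have nothing to report.

```python
from collections import defaultdict

# Hypothetical per-project pins, as they might drift across separate repositories.
PROJECT_PINS = {
    "billing": {"requests": "2.31.0", "protobuf": "4.25.1"},
    "payroll": {"requests": "2.28.2", "protobuf": "4.25.1"},
    "reports": {"requests": "2.31.0"},
}


def find_version_conflicts(project_pins):
    """Return {dependency: {version: [projects]}} for dependencies pinned at more than one version."""
    versions = defaultdict(lambda: defaultdict(list))
    for project, pins in sorted(project_pins.items()):
        for dep, version in pins.items():
            versions[dep][version].append(project)
    # Keep only dependencies that appear with two or more distinct versions.
    return {
        dep: dict(by_version)
        for dep, by_version in versions.items()
        if len(by_version) > 1
    }


conflicts = find_version_conflicts(PROJECT_PINS)
for dep, by_version in conflicts.items():
    print(dep, by_version)
```

With the sample data, only `requests` is reported, because `protobuf` is pinned at the same version everywhere it is used.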
Costs of Monorepos
Size: As code accumulates, the repo can become large, potentially slowing down operations.
Complexity: More tools and custom infrastructure may be needed to handle a growing monorepo effectively.
Build Times: Without proper tooling, building or testing the entire repo could become slower over time.
Access Control: Finer-grained access controls become harder to provide as more teams share one repository. Implementing effective access control in a large monorepo can be complex but is essential for security and compliance.
Learning Curve: For new developers, navigating a large, unified codebase might seem daunting. A monorepo may also require dedicated tools, which adds to what new developers have to learn.
Essential Tools for Monorepos
Build tools: These support large, multi-language projects, manage builds and tests, and can accelerate builds through distributed execution and caching.
Bazel has a large and thriving ecosystem of extensions and tooling, and a strong focus on explicit dependency declarations.
Pants 2 is a build tool with a stronger focus on ease of adoption and use.
Buck2 is a new build tool with better support for languages that require dependency-ordered compilation, like Erlang, Haskell, and OCaml.
Remote Execution: All of the build systems mentioned above support distributed builds and tests through the same Remote Execution protocol, with multiple self-service or commercial options available.
Nix: A package manager for Linux and macOS to pin your system tool and library dependencies and achieve reproducible builds.
CI/CD: Monorepo build systems can express complex cross-component and cross-language dependencies for building, testing, and deploying targets. This allows engineers to execute most of these from their development machine for testing purposes. It can also greatly simplify the required CI/CD configuration by pushing more of this complexity into the build system and less into the CI/CD pipeline configuration. Monorepo build systems also ensure correctness through isolation to varying degrees. This opens up the possibility for persistent CI runners with frequently warm in-memory cache, without overly increasing the risk of unreliable CI/CD pipelines.
Source Control: Git is the most widely used source control system and will serve most use cases well; however, at very large project sizes, you may encounter limitations. In those situations, you may want to look at Scalar, Sapling, or other source control strategies.
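The explicit dependency declarations mentioned above are what allow a build tool or CI pipeline to avoid rebuilding the entire repo on every change. The Python sketch below illustrates the general idea only and is not how Bazel, Pants, or Buck2 are actually implemented: given a hypothetical map of targets to their direct dependencies, it walks the reverse dependency graph to find everything affected by a change. The target labels and the `affected_targets` helper are assumptions made for this example.

```python
from collections import defaultdict, deque

# Hypothetical targets and their declared direct dependencies.
DEPS = {
    "//libs/auth": [],
    "//libs/http": ["//libs/auth"],
    "//services/api": ["//libs/http"],
    "//services/batch": ["//libs/auth"],
    "//tools/lint": [],
}


def affected_targets(deps, changed):
    """Return the changed targets plus everything that transitively depends on them."""
    # Invert the graph: for each target, record which targets depend on it.
    reverse = defaultdict(set)
    for target, direct_deps in deps.items():
        for dep in direct_deps:
            reverse[dep].add(target)

    # Breadth-first walk over reverse edges, starting from the changed targets.
    seen = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in reverse[current]:
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


to_rebuild = affected_targets(DEPS, ["//libs/http"])
print(sorted(to_rebuild))
```

In this sample graph, a change to `//libs/http` affects only that library and `//services/api`, so `//services/batch` and `//tools/lint` would not need to be rebuilt or retested.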
Remember, the choice between a monorepo and a multi-repo setup often depends on the organization’s scale, the nature of the projects, and team preferences.