Unlocking Ultimate Argo CD Scalability

The Akuity Platform unlocks the ultimate scalability of Argo CD

Argo CD is a deployment tool for Kubernetes that follows the GitOps methodology. Besides being the best implementation of a GitOps controller, Argo CD also stands out with its enterprise and multi-tenancy features. SSO integration, flexible RBAC, and multi-cluster management allow a single Argo CD instance to serve multiple development teams. This is extremely attractive since it reduces operational overhead and provides great visibility into all deployments inside the organization. However, no software can scale infinitely, and Argo CD is no exception.



Scaling Challenges

The scalability question we hear most often from our customers is: how much is too much? At which point will Argo CD start to struggle? The question matters because the answer lets us define an architecture that accounts for future growth (there's even a whole Special Interest Group dedicated to Argo scalability on CNCF Slack: #argo-sig-scalability). The answer is tricky since it depends on usage patterns and differs from one organization to another. Let's walk through the most important factors that affect Argo CD scalability.

Number of Managed Clusters

The application controller is the heart of Argo CD. It's responsible for reconciling the desired state stored in Git against the actual state of the cluster. The number of managed Kubernetes clusters and, most importantly, the number of resources in those clusters determine how much memory and CPU the application controller requires. In our experience, the default configuration is enough to handle a dozen mid-size clusters, which is pretty good. As the number of clusters grows, you might need to give the controller more memory and CPU. When the number of clusters reaches the hundreds, you will have to use sharding to run multiple controller instances and fine-tune some configurations to save money on the compute required to run the controller.
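To make the idea of sharding concrete, here is a minimal sketch of how managed clusters could be split across several controller instances. It assumes a simple hash-modulo scheme; the function names, the example cluster URLs, and the choice of CRC32 are illustrative assumptions, not a description of how Argo CD assigns shards internally.

```python
import zlib
from collections import defaultdict


def shard_for_cluster(cluster_url: str, shard_count: int) -> int:
    """Map a cluster to a controller shard using a stable hash (hypothetical scheme)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    return zlib.crc32(cluster_url.encode("utf-8")) % shard_count


def assign_clusters(cluster_urls, shard_count):
    """Group clusters by the shard that would reconcile them."""
    shards = defaultdict(list)
    for url in cluster_urls:
        shards[shard_for_cluster(url, shard_count)].append(url)
    return dict(shards)


if __name__ == "__main__":
    # Made-up cluster URLs, used only to exercise the sketch.
    clusters = [f"https://cluster-{i}.example.com" for i in range(300)]
    for shard, members in sorted(assign_clusters(clusters, 3).items()):
        print(f"shard {shard}: {len(members)} clusters")
```

Each cluster always lands on the same shard for a given shard count, so every controller instance only has to watch and cache its own subset of clusters.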

Number of Git Repositories

Accessing manifests stored in Git and, more importantly, generating manifests is another resource-intensive operation. This work is performed by the repo server. Manifest generation is the most expensive part since it usually requires running Kustomize or Helm to produce the final manifests. As the number of repositories grows, you might need to increase the number of repo server replicas to handle the load. The good news is that the repo server is stateless and can be scaled horizontally.

Number of Applications

Finally, the number of Argo CD applications affects the performance of the presentation layer: the Argo CD UI and the API server. The more applications you have, the more time it takes to load the UI. Argo CD comfortably handles hundreds of applications, gets a little slower when the number of applications reaches ~3,000, and starts to struggle when the number of applications exceeds 5,000.

Given the above factors, we usually recommend running multiple Argo CD instances to account for future growth. The most typical approach is to run one Argo CD instance per team or department. This approach allows us to isolate teams from each other and provide a dedicated dashboard for each team. However, this is still a compromise since it introduces some management headaches and requires running multiple instances of Argo CD.

Akuity's Answer to Argo CD Scalability Issues

Akuity was designed by the creators of the Argo Project to address the above challenges and unlock the ultimate scalability of Argo CD. Our goal is to significantly push the limits of Argo CD and make it possible to run a single Argo CD instance for a huge organization. We tackled the backend bottlenecks first by introducing a unique agent-based architecture that allows running a dedicated application controller and repo server in each managed cluster. This approach significantly simplifies the scalability challenge since work is now naturally distributed between clusters. It's also cheaper since the resource requests of each controller can be tuned to match the size of the cluster. Often, running the agent in each cluster is free since its components use spare resources available in the cluster.

What about the frontend bottlenecks? We're happy to announce that we've found a solution. Akuity-hosted Argo CD now has long-awaited server-side pagination, which pretty much solves the problem and allows users to comfortably have tens of thousands of applications in a single Argo CD instance. To enable the feature on your Akuity-managed Argo CD instance, switch it to the Akuity version using the Akuity Portal:

  • Navigate to the instance details page
  • Select the Settings tab
  • Click the instance version dropdown and select the Akuity image, denoted by the -ak.X suffix (e.g. v2.7.6-ak.2), matching your desired Argo CD version. The Akuity image is available for all Argo CD versions starting from v2.7.3.

Instance Settings Page

Congrats! Server-side pagination is enabled, and you can take advantage of the enhanced Argo CD user experience.
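For readers unfamiliar with the technique, the sketch below shows what server-side pagination means in general: the server returns one slice of the application list plus the total count, rather than sending every application to the browser. The `Page` type, the `list_applications` function, and the offset/limit parameters are hypothetical and are not the Akuity or Argo CD API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    items: List[str]  # application names on this page
    total: int        # total number of applications on the server
    offset: int
    limit: int


def list_applications(all_apps: List[str], offset: int = 0, limit: int = 100) -> Page:
    """Return a single page of applications (hypothetical server-side handler)."""
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit must be >= 1")
    # Only the requested slice leaves the server; the client never holds the full list.
    return Page(items=all_apps[offset:offset + limit],
                total=len(all_apps), offset=offset, limit=limit)


if __name__ == "__main__":
    apps = [f"app-{i:05d}" for i in range(50_000)]
    page = list_applications(apps, offset=200, limit=100)
    print(len(page.items), page.items[0], page.total)
```

The payload size depends on the page size, not on the total number of applications, which is why the UI can stay responsive as the application count grows.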

Scalability Test

Scaling any software is a complex task and requires testing in a production-like environment. This is exactly what we did. We gathered the requirements of our customers and the open-source community and came up with numbers that would satisfy the most ambitious use cases: 1,000 clusters and 50,000 Argo CD applications. We used K3s to simulate managed Kubernetes clusters and save money, which worked perfectly. We used a Kustomize-based set of manifests to exercise the repo server. Finally, we deployed 50 applications in each cluster, which gave us 50,000 applications in total. The results are pretty impressive, and we are happy to conclude that Akuity can comfortably handle 1,000 clusters and 50,000 applications in a single Argo CD instance.
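The size of the test fleet follows directly from the numbers above. As a quick check, this sketch enumerates one name per application; the naming pattern is made up for illustration and is not the one used in the actual test.

```python
CLUSTERS = 1_000
APPS_PER_CLUSTER = 50


def fleet_app_names(clusters: int, apps_per_cluster: int):
    """Generate one hypothetical application name per (cluster, app) pair."""
    return [
        f"cluster-{c:04d}-app-{a:02d}"
        for c in range(clusters)
        for a in range(apps_per_cluster)
    ]


if __name__ == "__main__":
    names = fleet_app_names(CLUSTERS, APPS_PER_CLUSTER)
    print(len(names))  # 1,000 clusters x 50 applications = 50000
```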

Argo CD Application List Page

Summary

We are very pleased with the initial results and happy to offer the enhanced Argo CD UI to our customers. Server-side pagination is a very valuable feature, and we are committed to contributing it back to open source. Once we feel comfortable with the feature, we will open-source it and make it available to the community. Please give it a try and share your feedback! Do you have a use case that requires pushing the limits of Argo CD even further? We would love to hear from you and help you unlock the ultimate scalability of Argo CD!
