August 24, 2023
Christian Hernandez
Reducing Argo CD Operational Burden
As an Argo CD administrator, you probably care a lot about security, robustness, and best practices in implementing every aspect of GitOps with Argo CD. As your implementation grows, so does the complexity of managing and maintaining your Argo CD instances. At this point, you have probably gone through Nicholas Morey's blog about Argo CD Architectures trying to see how you can reduce your operational burden.
Although Argo CD is an easy-to-use, out-of-the-box solution that's easy to get started with, overhead still exists. As adoption grows, challenges of scale, reliability, and ideal implementation come up. While a small footprint is easy to handle, larger ones demand day-to-day care and feeding. You shouldn't have to worry about the operational burden of managing Argo CD; you should be reaping its benefits.
Here at Akuity, we love Argo CD! So much, in fact, that the original founders of the Argo Project created the Akuity platform from the lessons learned from running Argo CD, securely and at scale over at Intuit. Now these codified practices are delivered to each of our customers!
In this blog, I will go over how the Akuity Platform removes the overhead and burden of running and managing Argo CD by comparing common deployment strategies with the Akuity Platform.
Early Deployment Implementation
Many Argo CD implementations start off basically the same: you have a central cluster where you install Argo CD, log in with the built-in admin account, and off you go! From a "getting started" standpoint, Argo CD is easy to install and use. But as the footprint grows, you'll run into some challenges during this "growing pains" phase.
This diagram outlines some of the challenges you'll immediately encounter:
As you may have already encountered, once you start onboarding more and more applications across more and more clusters, operationalizing Argo CD is not only a challenge; it can be overwhelming.
The first thing you'll notice is that Argo CD needs open network access to each managed cluster's Kubernetes API endpoint (since Argo CD, by design, uses a push model). This means you need not only network access but also Kubernetes API access, which requires storing long-lived, highly privileged credentials for every cluster managed by that Argo CD instance. It also means Argo CD has visibility into other things, like Secrets and ConfigMaps, on the managed clusters.
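To make this concrete, here is roughly what one of those stored credentials looks like: Argo CD registers a managed cluster as a Kubernetes Secret labeled `argocd.argoproj.io/secret-type: cluster` on the hub. The cluster name, endpoint, and token below are placeholders, and the exact `config` fields depend on how you authenticate; this is a hedged sketch of the documented declarative format, not a copy of any real setup.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: prod-cluster                    # hypothetical cluster name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: prod-cluster
  server: https://prod.example.com:6443 # hypothetical API endpoint
  config: |
    {
      "bearerToken": "<long-lived service account token>",
      "tlsClientConfig": {
        "caData": "<base64-encoded CA certificate>"
      }
    }
```

Every managed cluster adds one of these Secrets to the hub, each holding a credential powerful enough to deploy workloads - which is exactly the blast radius concern described above.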
This leads to the next challenge: access to your Git repository. Argo CD needs network access to your Git repository, and if you're going to implement webhooks, you'll also need to allow inbound connections from your Git provider. On the subject of Git repositories, you'll also notice that the number of repositories you need will grow, since monorepos have been known to cause performance issues. Moving to a poly-repo design has its own drawbacks, the most notable being that you have many points of control. You end up with repos for cluster configs, Argo CD configs, and application configs, and all of these repositories need to orchestrate their Git workflows in order to properly roll out an application.
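Those repository credentials also live on the hub cluster, as Secrets labeled `argocd.argoproj.io/secret-type: repository`. The repository URL and account below are hypothetical; this is a sketch of the documented declarative format for HTTPS access, assuming token-based authentication (SSH-key setups use different fields).

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: team-app-repo                   # hypothetical name
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://git.example.com/team/app-configs.git  # hypothetical repo
  username: ci-bot                       # hypothetical account
  password: <personal access token>
```

Multiply this by every repo in a poly-repo design and the number of credentials the hub must hold (and you must rotate) grows quickly.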
Segueing over to performance, the only option (besides moving to a poly-repo design) to increase performance is to "scale up". You can do this by modifying the requests/limits on the individual Argo CD components, or by running multiple replicas of them. In the case of the argocd-application-controller, you can even set up sharding across your managed clusters; however, the shards will be imbalanced relative to the clusters' sizes. Another major drawback is that the hub cluster only has so many resources, and you'll soon hit a limit on how much a single hub cluster can manage.
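For reference, controller sharding is configured by scaling the argocd-application-controller StatefulSet and telling it how many replicas exist; each managed cluster is then assigned to a shard. The excerpt below is a hedged sketch of that documented setup (replica count is an example value, and field paths are abbreviated to the relevant parts). Note that shards are assigned per cluster, not per workload, which is why a shard holding one huge cluster ends up far busier than its peers.

```yaml
# Excerpt of the argocd-application-controller StatefulSet
spec:
  replicas: 3                           # example value: run three controller shards
  template:
    spec:
      containers:
        - name: argocd-application-controller
          env:
            - name: ARGOCD_CONTROLLER_REPLICAS
              value: "3"                # must match spec.replicas for sharding to work
```

Even with sharding in place, all three replicas still run on the hub cluster, so this only delays the point where the hub itself runs out of headroom.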
Growing Pains
Another design you may be familiar with is deploying an Argo CD instance per logical team, where each team has its own Argo CD instances. This feels like a good compromise between hub-and-spoke and trying to run one-instance-per-cluster; however these come with some challenges. See the diagram below for an overview:
What becomes immediately evident is that you actually compound the issues you run into with the hub-and-spoke design. While it's true that each team can scale further than with a central hub-and-spoke setup, the issue of "can only scale up" comes to a head for each team as adoption grows. Also, as an administrator, you need to configure each and every Argo CD instance for every team, which multiplies your responsibility for the care and feeding of these clusters.
None of the issues of the hub-and-spoke design go away. You still need the bidirectional network and API connections, the team instances still need connections to the Git repositories, and teams still need to store those credentials on each instance's cluster.
While you can use an Argo CD instance to manage these Argo CD instances, this design isn't very popular, because you end up with the question of "who is responsible for what?". There will be cases where end users try to change something, only to have it revert because it's managed by the "parent" Argo CD instance. There will be various points of demarcation that need to be communicated.
Akuity's Agent-Based Architecture to The Rescue!
So how do you deal with these challenges?
Using Akuity, you get the same great "easy-to-use" benefits of the open source Argo CD installation while eliminating a lot of the drawbacks. Take a look at the diagram as it outlines some of the benefits:
The Akuity Platform has a hybrid, agent-based design. Compared directly to an open source deployment of Argo CD, no long-lived cluster-admin credentials are stored in the control plane. No direct API server access is needed, and managed clusters only need outbound access to the Akuity SaaS platform.
Using this hybrid-agent approach, components are distributed for a more robust design. The control plane components run on the Akuity Platform, while the application and repository controllers run on the managed clusters. This means the controllers are distributed across the managed clusters, closer to where the work is being done, which results in less network traffic. This provides a robust design that scales well, even in a single-instance implementation.
This also allows for the highest level of flexibility: you can choose to stay with a monorepo deployment and still scale as you add more managed clusters. The Akuity Platform also gives you the option of delegating Git operations to a single cluster for more control over credential management. The hybrid design also means that the Akuity Platform doesn't need access to any of your secrets or credentials - especially not those for your Git repositories.
Some organizations try to emulate this "agent" design by deploying an Argo CD instance per cluster which, as you can imagine, becomes a management nightmare as the overhead grows with every cluster. To solve this, many organizations attempt an "Argo of Argos", managing the per-cluster Argo CD instances with a centralized Argo CD - which becomes especially challenging in multi-tenant environments.
Supercharging GitOps with Akuity
Managing your Argo CD implementation shouldn't be something that you spend a lot of time on. You should also be reaping the benefits of what each Argo CD implementation gives you and not have to worry about what the best practice is. To do so, use a platform that has best practices, security, and scalability built in. This way, you can focus on what matters most - deploying and managing your applications. Here at Akuity, we've taken care of managing Argo CD so that you can focus on just using it. We feel that Argo CD should be used to drive your GitOps implementation while still being flexible enough to support many architectural designs and deployments.
Try it out for yourself with the Akuity Platform free trial and let me know how it went. Once you see how easy it is to set up an Argo CD instance and connect your clusters, it will be hard to go back to setting up and configuring Argo CD by yourself ever again 😉