Infrastructure as Code (IaC) — What is it?
The explosion of public cloud platforms has made IT infrastructure easy to access and consume. The traditional IT infrastructure found in vast and expensive corporate data centers can now be consumed by anyone with an internet connection. As organizations start consuming public cloud platforms and their infrastructure, you often hear the expression “infrastructure as code” (IaC).
This article was originally published on Medium. Link to the Medium article can be found here.
If you have ever wondered about the what, the why, and the how of IaC, you have come to the right place.
Before we dive into the nuts and bolts of IaC, it helps to first understand how IT infrastructure works. Let’s start with static infrastructure: think server racks, mainframes, routers, switches, firewalls, and pretty much any equipment you expect to find in a traditional data center. In this static infrastructure environment, when you need more capacity you add it through physical provisioning, by scaling horizontally (more machines), vertically (bigger machines), or both. The need for physical provisioning, and the wait for the compute capacity to become available, is what makes this environment static.
Adding new equipment, increasing capacity, or enabling new functionality can take several weeks (10+ weeks) from the day the order is placed to the day the equipment is ready for use by product teams. In addition, maintaining this equipment requires time and effort from various IT professionals, on both the hardware and the software side. It’s not uncommon for physical server racks to be split up virtually through various platform technologies (VMs, Kubernetes, Pivotal, etc.). It is uncommon, however, for this environment to expose APIs for the underlying infrastructure. Let’s change gears now and look at dynamic infrastructure.
All public cloud platforms provide dynamic infrastructure: IT resources that are ephemeral in nature or, put simply, infrastructure that is “on-demand”. This type of infrastructure is consumed by requesting it through the user interface console or through APIs (more on this later). On-demand infrastructure comes with a consumption pricing model: you pay only for what you use, in many cases billed by the second. Unlike in static infrastructure environments, the consumer does not have to worry about having sufficient compute capacity or placing orders for physical servers. This is handled by the cloud provider on behalf of the consumer. In addition, all the infrastructure available on the platform is exposed through APIs, which makes automation possible.
What is IaC?
Infrastructure as Code is nothing more than replacing the traditional manual provisioning of infrastructure through admin consoles and GUIs with a programming-based approach (think scripting). Instead of clicking buttons and navigating through various screens to deploy or enable infrastructure, you achieve those actions through code.
IaC is heavily leveraged in dynamic infrastructure environments such as public cloud platforms due to the ability to provision and deprovision a large number of resources quickly through APIs. Without IaC this could be a tedious and arduous process. It’s important to note that IaC is not a new concept; it’s something that infrastructure analysts have done for many years through scripting and chaining commands together. What is different today is the code aspect of it.
How does IaC work?
The modern IaC approach leverages declarative programming rather than the traditional scripting approach of the past. Declarative programming is easy to get into, as you are simply telling the computer “what to do” by filling out values for the required input parameters that describe the end result. The computer will figure out the rest. Traditional scripting, or more accurately “imperative programming”, is associated with general programming. In the imperative approach you are telling the computer “how to do something” through programming logic. This tends to be a more intimidating challenge for those who lack programming experience or background.
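To make the contrast concrete, here is a minimal sketch in Python. It assumes a toy in-memory “cloud” (a plain list of server names) rather than any real provider, and the function names are made up for illustration. The imperative function spells out the steps, so running it twice adds servers twice; the declarative function only receives the desired end state and works out what needs to change.

```python
# Toy in-memory "cloud": just a list of server names (hypothetical, not a real API).
cloud = []


def imperative_add_servers(count):
    """Imperative: spell out HOW, step by step. Each run adds `count` more servers."""
    for _ in range(count):
        cloud.append(f"web-{len(cloud)}")


def declarative_apply(desired_count):
    """Declarative: state WHAT you want; the engine works out the steps."""
    while len(cloud) < desired_count:
        cloud.append(f"web-{len(cloud)}")
    while len(cloud) > desired_count:
        cloud.pop()


imperative_add_servers(2)
imperative_add_servers(2)             # running the script again doubles the fleet
count_after_imperative = len(cloud)

declarative_apply(3)
declarative_apply(3)                  # re-applying the same spec changes nothing
count_after_declarative = len(cloud)
```

Real declarative tools do far more than count servers, but the idea is similar: you describe the target, and the tool decides which create or delete actions are needed to get there.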
There are many IaC tools available today; some are declarative and others are imperative in nature. Terraform is one of the most common declarative IaC solutions. Both IaC flavors (imperative and declarative) operate the same way: the tools act as an abstraction layer for the infrastructure. Rather than writing the logic for the API calls behind the various infrastructure actions, users can focus on creating templates that define the desired infrastructure resources and state. At run time, the tool evaluates the templates/logic and executes the API call that corresponds to each infrastructure action specified.
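The abstraction-layer idea can be sketched roughly as follows. This is not how any particular tool is implemented; the “API calls” are hypothetical functions that only return a description of the request they would send, and the template format is invented for the example.

```python
# Hypothetical provider API: each function stands in for one real cloud API call.
def create_vm(name, size):
    return f"POST /vms name={name} size={size}"


def create_bucket(name):
    return f"POST /buckets name={name}"


# The tool's abstraction layer: maps a resource type to the API call behind it.
API_CALLS = {"vm": create_vm, "bucket": create_bucket}

# The user's template only describes the desired resources, not the calls.
template = [
    {"type": "vm", "args": {"name": "web-1", "size": "small"}},
    {"type": "bucket", "args": {"name": "logs"}},
]


def run(template):
    """Evaluate the template and execute the matching call for each resource."""
    results = []
    for resource in template:
        call = API_CALLS[resource["type"]]
        results.append(call(**resource["args"]))
    return results


requests_sent = run(template)
```

The user never touches `create_vm` or `create_bucket` directly. They edit the template, and the tool takes care of translating it into calls.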
The two images (above and below) both achieve the same end goal of deploying an EC2 instance into an AWS environment. It’s easy to think that one solution is better than the other due to the simplicity of declarative programming, but it’s important to understand that each tool has its own set of pros and cons.
Challenges with IaC
Like all things in life, IaC comes with challenges. First, IaC may require upskilling. To use IaC effectively, one has to adopt software engineering practices and common software development tools. This can be a major change if you are an infrastructure analyst with no prior scripting or programming experience. It takes time and practice to become comfortable writing IaC that adheres to basic software engineering principles (DRY, KISS…), and a coach is needed to provide guidance and direction.
The challenge software developers encounter is different from the one infrastructure analysts face. Software developers are now being asked to learn and understand the various infrastructure pieces required to host an application architecture. This includes networking, security, disaster recovery, compute, and so on. This is all part of the DevSecOps model that product teams are embracing. The networking piece tends to be the most common pain point for development teams, as teams rarely have individuals with a networking background or an understanding of how to integrate with existing network infrastructure (hybrid cloud and multi-cloud).
The need to learn, and to allocate time to practice, is easily the most important challenge and is what makes IaC difficult at first. Another challenge is the transition from manual infrastructure provisioning to IaC. If you try to mix manual provisioning and IaC, you will quickly run into issues that can be time-consuming to fix. Each IaC tool keeps track of the infrastructure it deployed. If you manually modify that infrastructure, the IaC tool will often error out and stop the next time you execute its deploy command, because the current state of the deployed infrastructure differs from the state the tool expects. Sometimes IaC tools are able to adjust to these discrepancies and self-heal; other times the changes are too large and result in errors.
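The state mismatch described above can be illustrated with a small sketch. It assumes the tool’s recorded state and the live infrastructure can both be represented as plain dictionaries; real tools use their own state formats, and the resource names here are invented.

```python
def find_drift(recorded, live):
    """Return {resource: (recorded_value, live_value)} for every mismatch."""
    drift = {}
    for name in sorted(set(recorded) | set(live)):
        if recorded.get(name) != live.get(name):
            drift[name] = (recorded.get(name), live.get(name))
    return drift


# What the tool recorded after its last deploy (hypothetical state file contents).
recorded_state = {"web-1": {"size": "small"}, "db-1": {"size": "large"}}

# What actually exists now: someone resized web-1 and deleted db-1 by hand.
live_state = {"web-1": {"size": "medium"}}

drift = find_drift(recorded_state, live_state)
```

Here `drift` reports both manual changes: `web-1` no longer matches its recorded size, and `db-1` is missing entirely. A small difference like the resize is the kind of thing a tool may be able to reconcile, while a missing resource that other resources depend on is more likely to end in an error.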
Other challenges worth quickly mentioning are:
- Structuring IaC state
- Integration with CI/CD pipeline
- Working collaboratively (remote state file)
- Lack of example code (decreasing)
Why use IaC?
This question is commonly asked by individuals first entering the dynamic infrastructure environment, and it is a fair one given the challenges IaC introduces. IaC offers many benefits that outweigh the cons. At a high level, IaC introduces the following benefits:
- Speed & efficiency
- Integration into CI/CD pipelines
- Manage infrastructure via source control
- Team collaboration
- Reduce technical debt
- Prevent human mistakes
- Simplify compliance
The real benefits of IaC are seen when large environments or a large number of infrastructure resources need to be deployed. Doing this manually can take a long time, depending on the number of unique resources.
The ability to work as a team is also an important benefit of IaC. By leveraging a version control system (VCS) such as git, various team members can work on different pieces of the infrastructure and roll out their changes in a controlled manner with merge requests, which also provide an auditable history.
The main benefit of IaC is the ability to automate and to integrate with continuous integration/continuous delivery (CI/CD) pipelines. Most commonly, teams deploy the infrastructure as the last step of their pipeline, after all tests and code scans are completed. This makes sense, as you wouldn’t want idle resources to be stood up and increase cost.
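As a rough sketch of that ordering, the snippet below models a pipeline as an ordered list of stages and stops at the first failure. The stage names and the pass/fail results are hypothetical; a real pipeline would be defined in your CI/CD system’s own configuration format.

```python
def run_pipeline(stages):
    """Run stages in order; stop at the first failure so later stages never run."""
    executed = []
    for name, step in stages:
        executed.append(name)
        if not step():
            break
    return executed


# Hypothetical stages; each returns True on success.
stages = [
    ("unit_tests", lambda: True),
    ("code_scan", lambda: False),          # simulate a failed scan
    ("deploy_infrastructure", lambda: True),
]

executed = run_pipeline(stages)
```

Because the deploy stage sits last, a failed test or scan means no infrastructure is stood up, and nothing sits idle accumulating cost.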
However, to truly benefit from IaC we need to change our behavior and how we treat infrastructure. Assuming we are in a dynamic infrastructure environment, such as a public cloud provider (AWS, GCP, or Azure), we need to treat infrastructure as cattle and not as a pet! What? Yes, the common industry analogy is pets vs. cattle. Allow me to explain.
With a VM instance, we normally load everything we need to run an application through an image. If something goes wrong with the server, we tend to spend time debugging in an attempt to address the issue. We are treating it as a pet at that point. If your pet is sick, you do everything you can to nurse it back to a healthy state.
In a dynamic infrastructure environment this is an inefficient approach. What we should do instead is deprovision the sick server and provision a new one that is in a healthy state. This is the cattle approach: on an industrial-sized cattle ranch, if one of the animals were to get sick or, say, break a leg, it wouldn’t be efficient to spend time and money to nurse it back to health. Instead it would go … well, we all know where it would go 😢.
The point is, because we have infrastructure available at our fingertips, we shouldn’t waste time on unhealthy infrastructure. Instead, provision new infrastructure that is healthy and deprovision the unhealthy resources. This is easily achievable with IaC in combination with Docker and images.
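A minimal sketch of the cattle approach might look like the following. The `provision` function is a stand-in for whatever your IaC tool does when it deploys a server from an image; the server records, IDs, and image name are all made up for illustration.

```python
import itertools

_ids = itertools.count(1)


def provision(image):
    """Stand-in for an IaC-driven deploy: returns a fresh, healthy server."""
    return {"id": f"srv-{next(_ids)}", "image": image, "healthy": True}


def replace_unhealthy(fleet, image):
    """Keep healthy servers; swap each unhealthy one for a newly provisioned one."""
    return [s if s["healthy"] else provision(image) for s in fleet]


fleet = [provision("app:1.0") for _ in range(3)]
fleet[1]["healthy"] = False            # one server gets "sick"
fleet = replace_unhealthy(fleet, "app:1.0")
```

Nobody logs in to debug the sick server. It is dropped from the fleet and a new one built from the same image takes its place.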
Infrastructure as Code has many benefits, but it also introduces a series of challenges. However, the pros outweigh the cons, and it is an investment well worth undertaking. Whichever IaC tool you and your team decide to use, just make sure that it is the right fit for the skill level of your team. If the team is lacking programming experience, then a declarative IaC solution might be the better approach at first. In the end, IaC is an investment that will continue to pay dividends to your team and organization.
To learn more about Infrastructure as Code and how to implement it, I recommend reading HashiCorp’s Workflow Overview.