23 Configuration Management Flashcards
Team Silos
Larger companies tend to be more organized by clearly defining IT team responsibilities and compartmentalizing all the work into silos. Calling a given department a silo is saying it’s only responsible for certain devices, tasks, and projects.
It’s the SysAdmins in the company who tend to be responsible for the actual Windows and Linux servers in the infrastructure. In really huge companies, there could even be whole, separate teams looking after the Windows and Linux environments.
The security team usually focuses on the company’s firewalls, but they can also be in charge of creating virtual private networks (VPNs) between companies for connectivity. Remote access solutions like a client VPN and things like Citrix, which allow access to the environment, can often be in their domain too. The security group is also tasked with monitoring for security events and making sure vulnerabilities are addressed as they’re discovered.
As usual, those neat little organizational silo boundaries end up being more of a suggestion over time because they’re just too limiting. Individual companies need customized solutions to meet their individual needs so, predictably, we’re seeing a lot of cross-over between the various silos now. For instance, a network team just might be tasked with setting up VPNs because of the more advanced networking knowledge needed to pull that off well. The waters get even muddier when we consider all the roles and services that run on servers and network devices. One company will say things like DHCP, DNS, and load balancers are the network team’s responsibility, while another will let their sysadmins take charge instead. Services can even be a shared responsibility between teams, with the sysadmin team getting custody of DHCP servers while the network team is responsible for DHCP traffic leaving network devices like wireless controllers.
The development silo is all about the developer’s complete focus on supporting their software releases. The other IT teams support them by building the infrastructure they need for their applications. For example, if a developer needs a new web server, the network team would make sure the server can get online. The systems team would install the server’s operating system and the actual web server application. The security team would check the server after it’s built to see it complies with their best practices, ensuring antimalware has been installed and the host firewall has been enabled.
DevOps
DevOps is basically a way to merge teams to solve some of the issues that silos create. A company would have DevOps engineers busy automating the infrastructure so that it can be spun up as needed for applications. The most obvious benefit here is that IT resources needed by the company can now be provisioned very quickly, instead of waiting maybe weeks for all the different IT teams to complete the various tasks required to get a vital resource online. Change control is also simplified because DevOps methodology lets us test the code in different environments. It also gives us the freedom to easily back out of changes when necessary.
Infrastructure as Code (IaC)
You can think of Infrastructure as Code (IaC) as the practical output that we’d get out of a DevOps practice. IaC is the fun “automate everything” part of DevOps, with the goal being to create and maintain our IT infrastructure using configuration management tools.
Aside from helping improve the speed of deployments, IaC can also help combat configuration drift, which begins anytime you make an unintended change on a network device or a server. Sadly, drift is a reality for companies over time. It happens when a server is removed from the network but its VLAN isn’t removed from the switches. Maybe a firewall rule is added to do some troubleshooting, but it’s forgotten afterwards, becoming one of hundreds of cobwebs in the rule base. Things just find their way onto servers that shouldn’t be there, and one day all these random cobwebs in the environment will cause problems.
Armed with IaC, we can skip all that troubleshooting chaos by just deleting the deviant server, then redeploying it instead. Yes, it can be fun to sleuth out exactly why Apache broke, but you can still have plenty of good times during the nice break you’ll get while the virtual machine rebuilds, right? So if you’re thinking that periodically deleting the created infrastructure and remaking it can drastically reduce the amount of drifting from the intended configuration, you’re right, because the new resources will be created exactly as described by the IaC solution. As a bonus, the new VMs can be automatically patched and up to date as they’re being spun up by the solution. And if an agent-based configuration manager like Puppet or Chef is being used, then drift can be automatically corrected since the server will periodically check in to ensure it isn’t going rogue on you.
An important principle of Infrastructure as Code is idempotence, a super fancy way of saying that the configuration will only be applied to a target environment if it will result in a change.
This means that if you make an Ansible playbook that installs NTP on a server, then starts the service, Ansible will only do the install task if NTP isn’t already installed on that server. Plus, it’ll only do the start service task if NTP isn’t currently running. Nice. Here’s how easy it is to install NTP in an Ansible playbook:
- name: Install NTP on RHEL
  yum: name=ntp state=present
The actual syntax doesn’t matter right now, but the gist is that we’re naming the task Install NTP on RHEL and calling the yum module to make sure the ntp package is present on the Linux box. Starting the service would be a separate task. Because of idempotence magic, Ansible will check to see if the package is installed and will skip the step if it already is. This saves us time when we run the playbook and makes the playbook’s footprint as small as possible.
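The idempotence idea can be sketched outside of Ansible in a few lines of Python. This is only an illustration of the principle, not how Ansible is implemented: the `ensure_state` function and the dictionary standing in for a server are hypothetical names made up for this example.

```python
def ensure_state(server, desired):
    """Apply only the settings that differ; return the list of keys changed."""
    changed = []
    for key, want in desired.items():
        if server.get(key) != want:
            server[key] = want      # a change is needed, so make it
            changed.append(key)
        # otherwise the setting already matches and the step is skipped
    return changed


# A pretend server where the ntp package is installed but the service is stopped.
server = {"ntp_package": "present", "ntp_service": "stopped"}
desired = {"ntp_package": "present", "ntp_service": "started"}

first_run = ensure_state(server, desired)   # only the service needs a change
second_run = ensure_state(server, desired)  # nothing left to do

print(first_run)
print(second_run)
```

Running the same desired state a second time reports no changes. The same comparison is what lets a configuration manager notice and correct drift when a setting has wandered away from its intended value.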
Ansible
Ansible is currently one of the most popular configuration management solutions for networking professionals. The tool is a Python-based solution that applies configuration found in playbook files via an SSH connection. The solution is easy to get up and running because target systems don’t require agents installed and registered to a central server before the solution can work. In fact, Ansible doesn’t even require a central server, so you can actually run playbooks directly off your office laptop. Even so, I recommend running Ansible from a central location because Ansible must be able to SSH to the target nodes to apply the configuration, and your network probably has some security restricting direct access.
Because Ansible primarily uses SSH to connect to nodes and doesn’t require agents, it supports almost all major vendor systems in the networking community. All the vendor needs to do is provide modules for Ansible to use. A module is basically a set of instructions written in Python to be performed by Ansible. Back in the Infrastructure as Code section I briefly mentioned the yum module to show how to install NTP on a Red Hat Linux system. Playbooks are just a set of instructions saying what should be applied to a target device. The file is written in the YAML format we talked about in the automation chapter. Here’s how Ansible and Ansible Tower/AWX compare to the other solutions:
Pros:
■■ Easy install and setup.
■■ Powerful orchestration.
■■ Supported by most enterprise-grade vendors on the market.
■■ Supports both push and pull models.
■■ Agentless model is faster and less complicated than the agent model.
■■ Sequential execution order makes deployments predictable.
Cons:
■■ Requires root SSH access to Linux nodes.
■■ As a newer platform, it’s not as mature as Puppet or Chef.
■■ Since Ansible has several scripting components, the syntax can vary.
■■ It’s focused on orchestration over configuration management so Ansible doesn’t protect against configuration drift.
■■ Troubleshooting can be tricky when compared to Puppet or Chef.
■■ No native config rollback on failure.
Ansible - Installation
Ansible is a very lightweight solution. The tool is installed with the pip install ansible command. You can also use your Linux box’s package manager to install Ansible, but yum and apt tend to lag behind the current release. Pip generally has the latest version:
[ansible@rhel01 ~]$ pip install ansible --user
Once Ansible is installed, there are only two files you’ll want on your computer before you try to run a playbook: a settings file called ansible.cfg and an inventory file typically called hosts. Once those two files are in place, we can run Ansible in ad-hoc mode or write a playbook.
Ansible - Settings
The ansible.cfg file is entirely optional because if Ansible can’t find the settings file, it’ll just go with default values when doing tasks. When you install something through pip, it doesn’t usually create folders or files for you, so you’ll need to create the file yourself if you want one. By default, Ansible looks for the settings file by first checking the ANSIBLE_CONFIG environment variable to see if a path is set. If not, it’ll check for ansible.cfg in the current directory that you’re running the playbook from. If Ansible still doesn’t find the file, it’ll try your home directory (where the file is named .ansible.cfg) and then finally /etc/ansible/. To keep things nice and easy to maintain, use the /etc/ansible folder if you can.
Even though there are loads of settings Ansible can use, we’re just going to create a file that sets host_key_checking to False under [defaults]. Doing this makes your lab easier because Ansible won’t try to verify the target node’s SSH keys. If you don’t turn this off, the system must have all the SSH host keys saved before the playbook will work correctly.
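The lookup order described above can be sketched as a small Python function. This is an approximation for study purposes, not Ansible’s own code: `find_ansible_cfg` and its parameters are hypothetical names, and the `exists` argument is there only so the logic can be tried without touching a real filesystem. The `LAB_CFG` string shows what the minimal lab settings file would contain.

```python
import os


def find_ansible_cfg(env, cwd, home, exists=os.path.isfile):
    """Return the first settings file found in the documented order, or None."""
    candidates = []
    if env.get("ANSIBLE_CONFIG"):
        candidates.append(env["ANSIBLE_CONFIG"])      # 1. environment variable
    candidates.append(f"{cwd}/ansible.cfg")           # 2. current directory
    candidates.append(f"{home}/.ansible.cfg")         # 3. home directory (dotfile)
    candidates.append("/etc/ansible/ansible.cfg")     # 4. system-wide location
    for path in candidates:
        if exists(path):
            return path
    return None  # nothing found, so built-in defaults would apply


# Minimal lab settings file from the text: skip SSH host key verification.
LAB_CFG = "[defaults]\nhost_key_checking = False\n"

if __name__ == "__main__":
    print(find_ansible_cfg(os.environ, os.getcwd(), os.path.expanduser("~")))
    print(LAB_CFG)
```

Disabling host key checking is a lab convenience only; in production you would normally leave it on and manage the known host keys instead.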
Ansible - Inventory
The inventory file tells Ansible which target nodes to connect to and gives it information on how it should make that connection. The file also allows you to group nodes for easier administration. By default, Ansible will check for the hosts file under /etc/ansible/ when trying to access a node. You can also specify another location when you run Ansible by using the -i command-line switch. Inside the hosts file we’ll create two groups, switch and router, plus a third group made up of those two groups for easy referencing. We’ll define the variables we want the connection to use by appending “:vars” to the name of the group that’s going to use them. There are three built-in variables we’ll use to tell Ansible how to connect and which username and password to use for the Cisco devices:
ansible_connection: This defines how Ansible connects to the nodes. Local means it will SSH from the Ansible computer.
ansible_user: This is the default username variable.
ansible_password: This is the default password variable.
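To make the group layout concrete, here is a rough sketch of what such a hosts file could look like, embedded in a short Python script that reads it back. Everything specific here is an assumption for illustration: the group name `ios`, the hostnames, and the credentials are placeholders, and Python’s `configparser` is only a stand-in that happens to handle this simple layout. Ansible’s real inventory parser accepts more than this does.

```python
import configparser

# Hypothetical inventory: two groups, a parent group, and the parent's variables.
INVENTORY = """\
[switch]
sw01.example.test

[router]
r01.example.test

[ios:children]
switch
router

[ios:vars]
ansible_connection=local
ansible_user=ansible
ansible_password=changeme
"""


def parse_inventory(text):
    """Split the inventory into host groups, child groups, and group variables."""
    parser = configparser.ConfigParser(
        allow_no_value=True, delimiters=("=",), interpolation=None
    )
    parser.optionxform = str  # keep names exactly as written
    parser.read_string(text)
    groups, children, group_vars = {}, {}, {}
    for section in parser.sections():
        name, _, kind = section.partition(":")
        if kind == "children":
            children[name] = list(parser[section])
        elif kind == "vars":
            group_vars[name] = dict(parser[section])
        else:
            groups[name] = list(parser[section])
    return groups, children, group_vars


def hosts_in(group, groups, children):
    """Expand a group into its hosts, following :children references."""
    hosts = list(groups.get(group, []))
    for child in children.get(group, []):
        hosts.extend(hosts_in(child, groups, children))
    return hosts


groups, children, group_vars = parse_inventory(INVENTORY)
print(hosts_in("ios", groups, children))
print(group_vars["ios"]["ansible_connection"])
```

Targeting the parent group reaches every switch and router at once, and the variables set under its “:vars” section apply to all of them.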
Ansible - Lab Setup
Ansible - Modules
Ansible version 2.9.0 is the version I’m using currently, and it comes with 3387 modules. We can check them out with the ansible-doc -l command. Because this is a CCNA book, I’ll filter the output to be just IOS-specific modules and trim it for brevity. The modules that tell Ansible how to do a task usually come from the vendor, but you can make your own custom ones in Python. For instance, you can use the ios_command module to push an exec-level command to a Cisco device so you can capture show output. The ios_config module is used to push configuration commands to the device. The thing that makes Ansible unique compared to a regular Python script that pushes commands is the more specific modules available, which let you complete specific tasks like adding a banner or enabling BGP just by calling the module. You’ll get to see this in action when we get to our playbook example.
Here are the terms that Ansible uses:
Inventory: Defines the nodes that Ansible knows about and groups them so they can be referenced. The inventory also includes connection information and variables.
Playbook: A file that contains a set of instructions to be executed.
Play: Multiple plays can exist in a playbook, which allows the playbook to apply configuration to different nodes in different sections.
Variables: You can use custom variables in your playbooks.
Templates: You can use Python’s Jinja2 templates with your playbooks, which is really helpful for network administration.
Tasks: An action the playbook applies, like installing Apache on a Linux box.
Handlers: These are a lot like tasks, except they’re only called in response to an event, like a service starting.
Roles: Allows you to spread out playbooks across several folders to make the configurations more modular, scalable, and flexible.
Modules: Built into Ansible, modules are files that describe how Ansible will achieve a given task. You can also write your own. Cisco has 69 modules in the current version of Ansible that cover everything from IOS to their UCS server platform.
Facts: These are global variables that contain a ton of information about the system, including vital stats like the system’s IP address.
Ansible Tower/AWX
Ansible has two more enterprise-focused solutions to go with if you want more central management and security controls.
■■ Ansible Tower: A paid version from Red Hat that adds central management to Ansible and improves security because you can control who can run playbooks through Role-Based Access Control (RBAC). Ansible Tower also provides a single point for integration with other tools. Red Hat offers a free version that supports 10 hosts if you want to try it.
■■ AWX: The upstream development version of Ansible Tower. The relationship is kind of like Fedora vs. Red Hat Enterprise Linux. You can use it for free, but there’s less reliability since there can be frequent changes with little testing. Be aware that there’s only limited support available.
Puppet
Puppet is more popular than Ansible with sysadmins because it’s agent based and provides stronger configuration management. Puppet used to require an agent, so it could only run on devices that support one, such as Cisco Nexus. This fact is important for the CCNA exam: Cisco will test to see if you know Puppet generally requires agents, even though the tool now supports agentless configuration in the newer versions.
Puppet is a Ruby-based tool and uses its own declarative language (a domain-specific language) for its manifest files. These files are like Ansible playbooks.
Here’s how Puppet compares to our other two solutions:
Pros:
■■ Prevents configuration drift through compliance automation and reporting tools because the agent continuously checks in with the Puppet master to make sure it’s compliant with the manifests.
■■ Strong web UI.
■■ More mature solution than Ansible.
■■ Easy setup.
Cons:
■■ You’ve got to learn Ruby to get good with Puppet, and Ruby isn’t as widely used by the
network community. It’s more of a niche skill.
■■ Lacks a push system, so changes occur only when nodes check in.
■■ The master-agent architecture complicates redundancy and scalability.
Important Puppet terms you’ll want to be familiar with:
■■ Puppet Master: The Master server that controls the configuration on managed nodes.
■■ Puppet Agent Node: A node controlled by a Puppet Master.
■■ Manifest: A file containing instructions to be executed.
■■ Resource: Declares a task that needs to be executed and how. For example, if we install Apache on a Linux box, we can declare Apache and make sure its state is set to “installed.”
■■ Module: A group of manifests and related files organized nicely to make life easier.
■■ Class: You can use classes to organize the manifest file, just as you can in programming languages like Python.
■■ Facts: Global variables that contain a ton of information about the system, like the system’s IP address.
■■ Services: Used to control services on a node.
Chef
Chef is by far the most complex configuration management tool in the box, but like Puppet, it’s based on the Ruby programming language and uses agents for communication. Chef is different because it has a much more distributed architecture plus a lot more features that make it a real favorite with developers!
Chef has a central server just like Puppet does, but it also includes the concept of a Workstation node. Basically, it’s a standard workstation with Chef tools installed. The idea is to build your Chef cookbooks then upload them to the “bookshelf” on the Chef Server so they can be used by all the Chef Nodes. Hungry yet? I am. The workstation creates a folder called the chef-repo (short for repository) to store cookbooks and recipes. A recipe is like an Ansible playbook, and a cookbook is the folder that holds the recipes. The idea behind all this is that a cookbook teaches you how to make several dishes. The Chef Cookbook contains directions on what to apply to server nodes. The Chef Workstation provides a tool that interacts with the Chef server called, you guessed it, knife.
The Chef Node is the computer that Chef will control through the agent. It has two components installed: the Chef-Client, which is the agent that registers the node and does all the work, and another one called Ohai that monitors the server for changes in configuration that must be reported. I know that’s a lot to swallow, so here’s a quick summary of Chef components:
■■ Cookbook: A collection of recipes and related files containing the instructions to be executed.
■■ Recipe: An action that the cookbook applies, like installing Apache on a Linux box.
■■ Chef Nodes: Computers that Chef manages.
■■ Knife: Command-line tool for managing Chef through the server API.
■■ Chef Server: The master server that manages all the nodes and cookbooks.
■■ Chef Manage: The web interface for Chef Server—the API is used for communication.
■■ Workstation: A computer you perform configuration-related tasks from, such as creating a cookbook.
■■ Bookshelf: A place to store cookbook content.
Configuration Management Tools
Ansible
Ansible is a Python-based configuration management tool that uses YAML playbooks to push configuration to nodes. It’s an agentless solution offering wide support for network devices because it uses SSH to reach nodes. Because there’s no agent, Ansible can only push configuration to nodes.
Puppet
Puppet is a Ruby-based configuration management tool that uses custom manifest files to configure devices. It requires an agent to be installed on the node, so it has less network support. Puppet also doesn’t support pushing configuration to nodes. Instead, the configuration is applied when the agent checks in. Puppet does support Cisco network devices that can install the Puppet agent.
Chef
Chef is a Ruby-based configuration tool that uses cookbooks to apply configuration. Chef is the most advanced solution in this chapter and is better suited for programmers because it’s more structured and has a strong developer focus. It also requires that nodes have an agent deployed for it to be able to manage them. Chef can’t push configurations.
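The push versus pull distinction that separates these three tools can be sketched as a toy simulation in Python. None of this reflects any tool’s real API: `Node`, `PushController`, and `PullServer` are hypothetical classes invented for the illustration, and a plain string stands in for a whole configuration.

```python
class Node:
    """A managed node that holds its currently applied configuration."""

    def __init__(self, name):
        self.name = name
        self.config = None


class PushController:
    """Agentless style: the controller connects out and applies config right away."""

    def apply(self, nodes, config):
        for node in nodes:
            node.config = config


class PullServer:
    """Agent style: the server only publishes; nodes fetch at their next check-in."""

    def __init__(self):
        self.published = None

    def publish(self, config):
        self.published = config

    def check_in(self, node):
        # Called by the node's agent on its own schedule.
        if node.config != self.published:
            node.config = self.published


pushed = Node("sw01")
pulled = Node("web01")

PushController().apply([pushed], "ntp-v2")

server = PullServer()
server.publish("ntp-v2")
before_check_in = pulled.config   # nothing has been pulled yet
server.check_in(pulled)
after_check_in = pulled.config

print(pushed.config, before_check_in, after_check_in)
```

In the push case the node changes the moment the controller runs. In the pull case the new configuration sits on the server until the agent checks in, which is also the moment any drift on that node would be corrected.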