Chapter #6: Decision Making in Ansible
This is the sixth chapter of the RHCE Ansible EX294 exam preparation series. Learn about using conditional statements in Ansible.
In this tutorial, you will learn how to add decision making skills to your Ansible playbooks.
You will learn to:
- Use when statements to run tasks conditionally.
- Use block statements to implement exception handling.
- Use Ansible handlers to trigger tasks upon change.
Needless to say, you should be familiar with Ansible playbooks, ad-hoc commands, and other Ansible basics to follow this tutorial. You may refer to the earlier chapters of this RHCE Ansible series.
This tutorial follows the same setup that was described in the first chapter of this series: one Red Hat control node, three CentOS nodes, and one Ubuntu node.
Choosing When to Run Tasks
Let's start to put conditions on when to run a certain task with Ansible.
Using when with facts
You can use when conditionals to run a task only when a certain condition is true. To demonstrate, create a new playbook named ubuntu-servers.yml that has the following content:
[elliot@control plays]$ cat ubuntu-servers.yml
---
- name: Using when with facts
  hosts: all
  tasks:
    - name: Detect Ubuntu Servers
      debug:
        msg: "This is an Ubuntu Server."
      when: ansible_facts['distribution'] == "Ubuntu"
Now go ahead and run the playbook:
[elliot@control plays]$ ansible-playbook ubuntu-servers.yml
PLAY [Using when with facts] *******************************************
TASK [Gathering Facts] *********************************************************
ok: [node4]
ok: [node1]
ok: [node3]
ok: [node2]
TASK [Detect Ubuntu Servers] ***************************************************
skipping: [node1]
skipping: [node2]
skipping: [node3]
ok: [node4] => {
"msg": "This is an Ubuntu Server."
}
PLAY RECAP *********************************************************************
node1 : ok=1 changed=0 unreachable=0 failed=0 skipped=1
node2 : ok=1 changed=0 unreachable=0 failed=0 skipped=1
node3 : ok=1 changed=0 unreachable=0 failed=0 skipped=1
node4 : ok=2 changed=0 unreachable=0 failed=0 skipped=0
Notice how I used the Ansible fact ansible_facts['distribution'] in the when condition to test which nodes are running Ubuntu. Also, notice that you don’t need to surround variables with curly brackets in when conditionals.
In the playbook output, notice how TASK [Detect Ubuntu Servers] skipped the first three nodes, as they are all running CentOS, and only ran on node4, which is running Ubuntu.
Using when with registers
You can also use when conditionals with registered variables. For example, the following playbook centos-servers.yml will reveal which nodes are running CentOS:
[elliot@control plays]$ cat centos-servers.yml
---
- name: Using when with registers
  hosts: all
  tasks:
    - name: Save the contents of /etc/os-release
      command: cat /etc/os-release
      register: os_release

    - name: Detect CentOS Servers
      debug:
        msg: "Running CentOS ..."
      when: os_release.stdout.find('CentOS') != -1
The playbook starts by saving the contents of the /etc/os-release file into the os_release variable. The second task then displays the message “Running CentOS …” only if the word ‘CentOS’ is found in the os_release standard output.
Go ahead and run the playbook:
[elliot@control plays]$ ansible-playbook centos-servers.yml
PLAY [Using when with registers] ***********************************************
TASK [Gathering Facts] *********************************************************
ok: [node4]
ok: [node1]
ok: [node3]
ok: [node2]
TASK [Save the contents of /etc/os-release] ************************************
changed: [node4]
changed: [node1]
changed: [node2]
changed: [node3]
TASK [Detect CentOS Servers] ***************************************************
ok: [node1] => {
"msg": "Running CentOS ..."
}
ok: [node2] => {
"msg": "Running CentOS ..."
}
ok: [node3] => {
"msg": "Running CentOS ..."
}
skipping: [node4]
PLAY RECAP *********************************************************************
node1 : ok=3 changed=1 unreachable=0 failed=0 skipped=0
node2 : ok=3 changed=1 unreachable=0 failed=0 skipped=0
node3 : ok=3 changed=1 unreachable=0 failed=0 skipped=0
node4 : ok=2 changed=1 unreachable=0 failed=0 skipped=1
Notice how TASK [Detect CentOS Servers] only ran on the first three nodes and skipped node4 (Ubuntu).
Testing multiple conditions with when
You can also test multiple conditions at once using logical operators. For example, the following reboot-centos8.yml playbook uses the logical and operator to reboot servers that are running CentOS version 8:
[elliot@control plays]$ cat reboot-centos8.yml
---
- name: Reboot Servers
  hosts: all
  tasks:
    - name: Reboot CentOS 8 servers
      reboot:
        msg: "Server is rebooting ..."
      when: ansible_facts['distribution'] == "CentOS" and ansible_facts['distribution_major_version'] == "8"
You can also use the logical or operator to run a task if any of the conditions is true. For example, the following task would reboot servers that are running either CentOS or RedHat:
tasks:
  - name: Reboot CentOS and RedHat Servers
    reboot:
      msg: "Server is rebooting ..."
    when: ansible_facts['distribution'] == "CentOS" or ansible_facts['distribution'] == "RedHat"
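As a side note, when also accepts a list of conditions, which Ansible combines with an implicit and. As a sketch, the CentOS 8 reboot condition above could equivalently be written as:

```yaml
tasks:
  - name: Reboot CentOS 8 servers
    reboot:
      msg: "Server is rebooting ..."
    # A list under when behaves like joining the conditions with "and"
    when:
      - ansible_facts['distribution'] == "CentOS"
      - ansible_facts['distribution_major_version'] == "8"
```

Many people find the list form easier to read when there are more than two conditions.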
Using when with loops
If you combine a when conditional statement with a loop, Ansible tests the condition for each item in the loop separately.
For example, the following print-even.yml playbook will print all the even numbers in the range(1,11):
[elliot@control plays]$ cat print-even.yml
---
- name: Print Some Numbers
  hosts: node1
  tasks:
    - name: Print Even Numbers
      debug:
        msg: Number {{ item }} is Even.
      loop: "{{ range(1, 11) | list }}"
      when: item % 2 == 0
Go ahead and run the playbook to see the list of all even numbers in the range(1,11):
[elliot@control plays]$ ansible-playbook print-even.yml
PLAY [Print Some Numbers] **********************************
TASK [Gathering Facts] ****************************
ok: [node1]
TASK [Print Even Numbers] ******************************
skipping: [node1] => (item=1)
ok: [node1] => (item=2) => {
"msg": "Number 2 is Even."
}
skipping: [node1] => (item=3)
ok: [node1] => (item=4) => {
"msg": "Number 4 is Even."
}
skipping: [node1] => (item=5)
ok: [node1] => (item=6) => {
"msg": "Number 6 is Even."
}
skipping: [node1] => (item=7)
ok: [node1] => (item=8) => {
"msg": "Number 8 is Even."
}
skipping: [node1] => (item=9)
ok: [node1] => (item=10) => {
"msg": "Number 10 is Even."
}
PLAY RECAP ***********************************
node1 : ok=2 changed=0 unreachable=0 failed=0 skipped=0
Using when with variables
You can also use when conditional statements with your own defined variables. Keep in mind that conditionals require boolean inputs; that is, a test must evaluate to true to trigger the condition, so you need to use the bool filter with non-boolean variables.
To demonstrate, take a look at the following isfree.yml playbook:
[elliot@control plays]$ cat isfree.yml
---
- name: Using when with variables
  hosts: node1
  vars:
    weekend: true
    on_call: "no"
  tasks:
    - name: Run if "weekend" is true and "on_call" is false
      debug:
        msg: "You are free!"
      when: weekend and not on_call | bool
Notice that I used the bool filter here to convert the on_call value to its boolean equivalent (no -> false).
Also, you should be well aware that not false is true and so the whole condition will evaluate to true in this case; you are free!
You can also test to see whether a variable has been set or not; for example, the following task will only run if the car variable is defined:
tasks:
  - name: Run only if you got a car
    debug:
      msg: "Let's go on a road trip ..."
    when: car is defined
The following task uses the fail module to fail if the keys variable is undefined:
tasks:
  - name: Fail if you got no keys
    fail:
      msg: "This play requires some keys"
    when: keys is undefined
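If you prefer a single task that both checks a requirement and reports the outcome, the assert module can express the same idea. This is a sketch, not part of the original playbook:

```yaml
tasks:
  - name: Fail if you got no keys
    assert:
      that:
        - keys is defined
      fail_msg: "This play requires some keys"
      success_msg: "Keys found; carrying on."
```

Every test listed under that must pass, otherwise the task fails with fail_msg.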
Handling Exceptions with Blocks
Let's talk about handling exceptions.
Grouping tasks with blocks
You can use blocks to group related tasks together. To demonstrate, take a look at the following install-apache.yml playbook:
[elliot@control plays]$ cat install-apache.yml
---
- name: Install and start Apache Play
  hosts: webservers
  tasks:
    - name: Install and start Apache
      block:
        - name: Install httpd
          yum:
            name: httpd
            state: latest

        - name: Start and enable httpd
          service:
            name: httpd
            state: started
            enabled: yes

    - name: This task is outside the block
      debug:
        msg: "I am outside the block now ..."
The playbook runs on the hosts in the webservers group and has one block named Install and start Apache that includes two tasks:
- Install httpd
- Start and enable httpd
The first task, Install httpd, uses the yum module to install the httpd Apache package. The second task, Start and enable httpd, uses the service module to start httpd and enable it to start on boot.
Notice that the playbook has a third task that doesn’t belong to the Install and start Apache block.
Now go ahead and run the playbook to install and start httpd on the webservers nodes:
[elliot@control plays]$ ansible-playbook install-apache.yml
PLAY [Install and start Apache Play] *******************************************
TASK [Gathering Facts] *********************************************************
ok: [node3]
ok: [node2]
TASK [Install httpd] ***********************************************************
changed: [node2]
changed: [node3]
TASK [Start and enable httpd] **************************************************
changed: [node3]
changed: [node2]
TASK [This task is outside the block] ******************************************
ok: [node2] => {
"msg": "I am outside the block now ..."
}
ok: [node3] => {
"msg": "I am outside the block now ..."
}
PLAY RECAP *********************************************************************
node2 : ok=4 changed=2 unreachable=0 failed=0 skipped=0
node3 : ok=4 changed=2 unreachable=0 failed=0 skipped=0
You can also follow up with an ad-hoc command to verify that httpd is indeed up and running:
[elliot@control plays]$ ansible webservers -m command -a "systemctl status httpd"
node3 | CHANGED | rc=0 >>
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2020-11-03 19:35:13 UTC; 1min 37s ago
Docs: man:httpd.service(8)
Main PID: 47122 (httpd)
Status: "Running, listening on: port 80"
Tasks: 213 (limit: 11935)
Memory: 25.1M
CGroup: /system.slice/httpd.service
├─47122 /usr/sbin/httpd -DFOREGROUND
├─47123 /usr/sbin/httpd -DFOREGROUND
├─47124 /usr/sbin/httpd -DFOREGROUND
├─47125 /usr/sbin/httpd -DFOREGROUND
└─47126 /usr/sbin/httpd -DFOREGROUND
Nov 03 19:35:13 node3 systemd[1]: Starting The Apache HTTP Server...
Nov 03 19:35:13 node3 systemd[1]: Started The Apache HTTP Server.
Nov 03 19:35:13 node3 httpd[47122]: Server configured, listening on: port 80
node2 | CHANGED | rc=0 >>
● httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Tue 2020-11-03 19:35:13 UTC; 1min 37s ago
Docs: man:httpd.service(8)
Main PID: 43695 (httpd)
Status: "Running, listening on: port 80"
Tasks: 213 (limit: 11935)
Memory: 25.1M
CGroup: /system.slice/httpd.service
├─43695 /usr/sbin/httpd -DFOREGROUND
├─43696 /usr/sbin/httpd -DFOREGROUND
├─43697 /usr/sbin/httpd -DFOREGROUND
├─43698 /usr/sbin/httpd -DFOREGROUND
└─43699 /usr/sbin/httpd -DFOREGROUND
Nov 03 19:35:13 node2 systemd[1]: Starting The Apache HTTP Server...
Nov 03 19:35:13 node2 systemd[1]: Started The Apache HTTP Server.
Nov 03 19:35:13 node2 httpd[43695]: Server configured, listening on: port 80
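Before moving on, it is worth knowing that a when condition can also be attached to the block itself, in which case Ansible applies it to every task inside the block. As a sketch, the Apache block above could be limited to CentOS hosts like this:

```yaml
tasks:
  - name: Install and start Apache
    block:
      - name: Install httpd
        yum:
          name: httpd
          state: latest

      - name: Start and enable httpd
        service:
          name: httpd
          state: started
          enabled: yes
    # The condition below guards both tasks in the block
    when: ansible_facts['distribution'] == "CentOS"
```

This saves you from repeating the same when line on every task.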
Handling Failure with Blocks
You can also use blocks to handle task errors via the rescue and always sections. This is pretty much similar to exception handling in programming languages, like try-catch in Java or try-except in Python.
You can use the rescue section to include all the tasks that you want to run in case one or more tasks in the block fail.
To demonstrate, let’s take a look at the following example:
tasks:
  - name: Handling error example
    block:
      - name: run a command
        command: uptime

      - name: run a bad command
        command: blabla

      - name: This task will not run
        debug:
          msg: "I never run because the above task failed."
    rescue:
      - name: Runs when the block failed
        debug:
          msg: "Block failed; let's try to fix it here ..."
Notice how the second task in the block, run a bad command, generates an error, and in turn the third task in the block never gets a chance to run. The tasks inside the rescue section run because the second task in the block has failed.
You can also use ignore_errors: yes to ensure that Ansible continues executing the tasks in the playbook even if a task has failed:
tasks:
  - name: Handling error example
    block:
      - name: run a command
        command: uptime

      - name: run a bad command
        command: blabla
        ignore_errors: yes

      - name: This task will run
        debug:
          msg: "I run because the above task errors were ignored."
    rescue:
      - name: This will not run
        debug:
          msg: "Errors were ignored! ... not going to run."
Notice that in this example, you ignored the errors of the second task in the block, run a bad command, and that’s why the third task was able to run. Also, the rescue section does not run, as you ignored the error in the second task of the block.
You can also add an always section to a block. Tasks in the always section will always run, regardless of whether the block has failed or not.
To demonstrate, take a look at the following handle-errors.yml playbook that has all three sections (block, rescue, always):
[elliot@control plays]$ cat handle-errors.yml
---
- name: Handling Errors with Blocks
  hosts: node1
  tasks:
    - name: Handling Errors Example
      block:
        - name: run a command
          command: uptime

        - name: run a bad command
          command: blabla

        - name: This task will not run
          debug:
            msg: "I never run because the task above has failed!"
      rescue:
        - name: Runs when the block fails
          debug:
            msg: "Block failed! let's try to fix it here ..."
      always:
        - name: This will always run
          debug:
            msg: "Whether the block has failed or not ... I will always run!"
Go ahead and run the playbook:
[elliot@control plays]$ ansible-playbook handle-errors.yml
PLAY [Handling Errors with Blocks] *********************************************
TASK [Gathering Facts] *********************************************************
ok: [node1]
TASK [run a command] ***********************************************************
changed: [node1]
TASK [run a bad command] *******************************************************
fatal: [node1]: FAILED! => {"changed": false, "cmd": "blabla", "msg": "[Errno 2] No such file or directory: b'blabla': b'blabla'", "rc": 2}
TASK [Runs when the block fails] ***********************************************
ok: [node1] => {
"msg": "Block failed! let's try to fix it here ..."
}
TASK [This will always run] ****************************************************
ok: [node1] => {
"msg": "Whether the block has failed or not ... I will always run!"
}
PLAY RECAP *********************************************************************
node1 : ok=4 changed=1 unreachable=0 failed=0 skipped=0
As you can see, the rescue section did run, as the second task in the block failed and you didn’t ignore the errors. Also, the always section did (and always will) run.
Running Tasks upon Change with Handlers
Let's see how to use handlers to run tasks when a change occurs.
Running your first handler
You can use handlers to trigger tasks upon a change on your managed nodes. To demonstrate, take a look at the following handler-example.yml playbook:
[elliot@control plays]$ cat handler-example.yml
---
- name: Simple Handler Example
  hosts: node1
  tasks:
    - name: Create engineers group
      group:
        name: engineers
      notify: add elliot

    - name: Another task in the play
      debug:
        msg: "I am just another task."

  handlers:
    - name: add elliot
      user:
        name: elliot
        groups: engineers
        append: yes
The first task, Create engineers group, creates the engineers group and also notifies the add elliot handler.
Let’s run the playbook to see what happens:
[elliot@control plays]$ ansible-playbook handler-example.yml
PLAY [Simple Handler Example] **************************************************
TASK [Gathering Facts] *********************************************************
ok: [node1]
TASK [Create engineers group] **************************************************
changed: [node1]
TASK [Another task in the play] ************************************************
ok: [node1] => {
"msg": "I am just another task."
}
RUNNING HANDLER [add elliot] ***************************************************
changed: [node1]
PLAY RECAP *********************************************************************
node1 : ok=4 changed=2 unreachable=0 failed=0 skipped=0
Notice that creating the engineers group caused a change on node1 and, as a result, triggered the add elliot handler.
You can also run a quick ad-hoc command to verify that user elliot is indeed a member of the engineers group:
[elliot@control plays]$ ansible node1 -m command -a "id elliot"
node1 | CHANGED | rc=0 >>
uid=1000(elliot) gid=1000(elliot) groups=1000(elliot),4(adm),190(systemd-journal),1004(engineers)
Ansible playbooks and modules are idempotent, which means that if the managed nodes are already in the desired state, Ansible will not redo the change again.
To fully understand the concept of Ansible’s idempotency; run the handler-example.yml playbook one more time:
[elliot@control plays]$ ansible-playbook handler-example.yml
PLAY [Simple Handler Example] **************************************************
TASK [Gathering Facts] *********************************************************
ok: [node1]
TASK [Create engineers group] **************************************************
ok: [node1]
TASK [Another task in the play] ************************************************
ok: [node1] => {
"msg": "I am just another task."
}
PLAY RECAP *********************************************************************
node1 : ok=3 changed=0 unreachable=0 failed=0 skipped=0
As you can see, the Create engineers group task didn’t cause or report a change this time because the engineers group already exists on node1, and as a result, the add elliot handler did not run.
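It’s also worth knowing that notify accepts a list, so a single task can trigger several handlers upon change. A minimal sketch (the add developers handler here is hypothetical, for illustration only):

```yaml
tasks:
  - name: Create engineers group
    group:
      name: engineers
    # All notified handlers run (in the order they are defined
    # under handlers) if this task reports a change
    notify:
      - add elliot
      - add developers
```

Remember that handlers run in the order they are defined in the handlers section, not in the order they are notified.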
Controlling when to report a change
You can use the changed_when keyword to control when a task should report a change. To demonstrate, take a look at the following control-change.yml playbook:
[elliot@control plays]$ cat control-change.yml
---
- name: Control Change
  hosts: node1
  tasks:
    - name: Run the date command
      command: date
      notify: handler1

    - name: Run the uptime command
      command: uptime

  handlers:
    - name: handler1
      debug:
        msg: "I can handle dates"
Notice how the first task, Run the date command, triggers handler1. Now go ahead and run the playbook:
[elliot@control plays]$ ansible-playbook control-change.yml
PLAY [Control Change] **********************************************************
TASK [Gathering Facts] *********************************************************
ok: [node1]
TASK [Run the date command] ****************************************************
changed: [node1]
TASK [Run the uptime command] **************************************************
changed: [node1]
RUNNING HANDLER [handler1] *****************************************************
ok: [node1] => {
"msg": "I can handle dates"
}
PLAY RECAP *********************************************************************
node1 : ok=4 changed=2 unreachable=0 failed=0 skipped=0
Both tasks, Run the date command and Run the uptime command, reported changes, and handler1 was triggered. You can argue that running the date and uptime commands doesn’t really change anything on the managed node, and you are totally right!
Now let’s edit the playbook to stop the Run the date command task from reporting changes:
[elliot@control plays]$ cat control-change.yml
---
- name: Control Change
  hosts: node1
  tasks:
    - name: Run the date command
      command: date
      notify: handler1
      changed_when: false

    - name: Run the uptime command
      command: uptime

  handlers:
    - name: handler1
      debug:
        msg: "I can handle dates"
Now run the playbook again:
[elliot@control plays]$ ansible-playbook control-change.yml
PLAY [Control Change] **********************************************************
TASK [Gathering Facts] *********************************************************
ok: [node1]
TASK [Run the date command] ****************************************************
ok: [node1]
TASK [Run the uptime command] **************************************************
changed: [node1]
PLAY RECAP *********************************************************************
node1 : ok=3 changed=1 unreachable=0 failed=0 skipped=0
As you can see, the Run the date command task didn’t report a change this time, and as a result, handler1 was not triggered.
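Besides the constant false, changed_when also accepts a test on a registered variable, so a command task reports a change only when its output indicates one. A hedged sketch, assuming a hypothetical script whose output contains the word "changed" whenever it modifies something:

```yaml
tasks:
  - name: Run a configuration script
    command: /usr/local/bin/configure-app.sh   # hypothetical script path
    register: script_result
    # Report a change only when the script's output says so
    changed_when: "'changed' in script_result.stdout"
    notify: handler1
```

With this pattern, the handler fires only on runs where the script actually changed something.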
Configuring services with handlers
Handlers are especially useful when you are editing service configurations with Ansible. That’s because you only want to restart a service when there is a change in its configuration.
To demonstrate, take a look at the following configure-ssh.yml playbook:
[elliot@control plays]$ cat configure-ssh.yml
---
- name: Configure SSH
  hosts: all
  tasks:
    - name: Edit SSH Configuration
      blockinfile:
        path: /etc/ssh/sshd_config
        block: |
          MaxAuthTries 4
          Banner /etc/motd
          X11Forwarding no
      notify: restart ssh

  handlers:
    - name: restart ssh
      service:
        name: sshd
        state: restarted
Notice that I used the blockinfile module to insert multiple lines of text into the /etc/ssh/sshd_config configuration file. The Edit SSH Configuration task also triggers the restart ssh handler upon change.
Go ahead and run the playbook:
[elliot@control plays]$ ansible-playbook configure-ssh.yml
PLAY [Configure SSH] ***********************************************************
TASK [Gathering Facts] *********************************************************
ok: [node4]
ok: [node3]
ok: [node1]
ok: [node2]
TASK [Edit SSH Configuration] **************************************************
changed: [node4]
changed: [node2]
changed: [node3]
changed: [node1]
RUNNING HANDLER [restart ssh] **************************************************
changed: [node4]
changed: [node3]
changed: [node2]
changed: [node1]
PLAY RECAP *********************************************************************
node1 : ok=3 changed=2 unreachable=0 failed=0 skipped=0
node2 : ok=3 changed=2 unreachable=0 failed=0 skipped=0
node3 : ok=3 changed=2 unreachable=0 failed=0 skipped=0
node4 : ok=3 changed=2 unreachable=0 failed=0 skipped=0
Everything looks good! Now let’s quickly take a look at the last few lines of the /etc/ssh/sshd_config file:
[elliot@control plays]$ ansible node1 -m command -a "tail -5 /etc/ssh/sshd_config"
node1 | CHANGED | rc=0 >>
# BEGIN ANSIBLE MANAGED BLOCK
MaxAuthTries 4
Banner /etc/motd
X11Forwarding no
# END ANSIBLE MANAGED BLOCK
Amazing! Exactly as you expected it to be. Keep in mind that if you rerun the configure-ssh.yml playbook, Ansible will not edit (or append) the /etc/ssh/sshd_config file. You can try it for yourself.
I also recommend you take a look at the blockinfile and lineinfile documentation pages to understand the differences and the use of each module:
[elliot@control plays]$ ansible-doc blockinfile
[elliot@control plays]$ ansible-doc lineinfile
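To give you a feel for the difference: lineinfile manages a single line (often matched with a regular expression), whereas blockinfile manages a multi-line block between marker comments. A sketch of a lineinfile task that would enforce one sshd option, shown for illustration only:

```yaml
tasks:
  - name: Ensure PermitRootLogin is disabled
    lineinfile:
      path: /etc/ssh/sshd_config
      # Match the option whether or not it is commented out
      regexp: '^#?PermitRootLogin'
      line: PermitRootLogin no
    notify: restart ssh
```

If the matching line already reads PermitRootLogin no, the task reports no change and the handler is not triggered.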
Alright! This takes us to the end of our Decision Making in Ansible tutorial. Stay tuned for the next tutorial, where you are going to learn how to use Jinja2 templates to manage files and configure services dynamically in Ansible.