How to Handle Long Running Tasks in Ansible
Facing timeouts? Here is how you can handle tasks that take a long time to complete in Ansible.
Handling long-running tasks in Ansible can be challenging because Ansible expects tasks to be completed within a certain period and may time out otherwise.
There are two main reasons why you'd want to handle long-running tasks specifically:
- Connection Timeouts: By default, Ansible keeps the connection to a remote machine open until the task running on that machine finishes. If the connection times out or drops before a long task completes, the task fails and the playbook stops on that host.
- Parallelization and Efficiency: Ansible typically runs tasks synchronously, one after the other. If a task takes a long time, all subsequent tasks have to wait until it finishes, which can be inefficient. You might want to run long-running tasks concurrently with other tasks or in the background to keep things moving.
This tutorial will guide you through the best practices and methods to handle long-running tasks effectively.
Understanding Timeouts in Ansible
Ansible uses different timeout settings to manage how long it waits for tasks to complete. You can use timeout settings to ensure that Ansible handles long-running tasks gracefully without failing prematurely.
Regular Commands: ansible_command_timeout or command_timeout
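The ansible_command_timeout variable (or the command_timeout setting in ansible.cfg) specifies the maximum time, in seconds, that Ansible will wait for a command to complete. The Ansible documentation describes this setting for persistent connections such as network_cli, so it is most relevant when the target is a network device.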
Here is an example:
- name: Run a long-running command
  command: your_command
  vars:
    ansible_command_timeout: 3600 # 1 hour
In this example, the command your_command is allowed to run for up to 1 hour before Ansible considers it a timeout.
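Because ansible_command_timeout is documented for persistent connections, it may help to see it where it is known to apply. The following is a minimal sketch, not a definitive setup: the switches inventory group, the choice of the cisco.ios collection, and the 120-second value are all assumptions made for illustration.

```yaml
---
# Sketch only: the "switches" group and the Cisco IOS collection are
# assumptions and do not come from the tutorial itself.
- name: Allow a slow device command more time to respond
  hosts: switches
  gather_facts: false
  connection: ansible.netcommon.network_cli
  vars:
    ansible_network_os: cisco.ios.ios
    ansible_command_timeout: 120 # seconds to wait for the device to answer
  tasks:
    - name: Collect a large show output
      cisco.ios.ios_command:
        commands:
          - show running-config
```

Setting the variable at play level applies it to every task in the play; placing it under a single task's vars, as in the example above, narrows it to that task.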
Commands Over SSH: ansible_ssh_timeout
When executing commands over SSH, you might encounter a different type of timeout setting. The ansible_ssh_timeout setting defines how long, in seconds, Ansible waits on the SSH connection. It applies mainly while the connection is being established, so treat it as a connection timeout rather than a limit on how long a command may run.
Here is an example:
- name: Execute command over SSH
  command: your_command
  vars:
    ansible_ssh_timeout: 7200 # 2 hours
In this example, the SSH timeout for this task is raised to 2 hours.
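The same variable can also live in the inventory, so that only hosts on slow links get a longer timeout. This is a sketch under assumptions: the file name inventory.yml, the slow_links group, the host names, and the 60-second value are hypothetical.

```yaml
# inventory.yml (hypothetical file name, group, and hosts)
all:
  children:
    slow_links:
      hosts:
        build01.example.com:
        build02.example.com:
      vars:
        ansible_ssh_timeout: 60 # seconds
```

You would then pass the inventory with the -i option, for example ansible-playbook -i inventory.yml playbook.yml. Keeping the value in the inventory means the playbooks themselves do not need to change.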
Implementing Timeouts in Playbooks
You can implement timeout settings directly within your playbooks to handle long-running tasks efficiently.
This section will demonstrate how to use these settings in practical scenarios.
Here is an example playbook that uses ansible_command_timeout:
---
- name: Example playbook for long-running tasks
  hosts: all
  vars:
    ansible_command_timeout: 3600 # Set command timeout to 1 hour
  tasks:
    - name: Run a long-running command
      command: "dd if=/dev/zero of=/mnt/test.img bs=2M count=2000"
The above playbook sets the command timeout to 1 hour for every task in the play, so a command may run for up to 1 hour before timing out. The playbook includes a task that creates a large file, test.img, in the /mnt directory by writing 2,000 blocks of 2 megabytes each (about 4 GB in total).
Run the above playbook using Ansible:
ansible-playbook playbook.yml
Here is an example playbook that uses ansible_ssh_timeout:
---
- name: Example playbook for SSH timeouts
  hosts: all
  vars:
    ansible_ssh_timeout: 7200 # Set SSH timeout to 2 hours
  tasks:
    - name: Execute a command over SSH
      command: "dd if=/dev/zero of=/mnt/test.img bs=2M count=2000"
The above playbook runs on all specified hosts and sets the SSH timeout to 2 hours, which gives the SSH connection up to 2 hours before Ansible treats it as timed out. The playbook contains a task that executes a command over SSH to create a large file, test.img, in the /mnt directory by writing 2,000 blocks of 2 megabytes each (about 4 GB in total).
Let's run this playbook:
ansible-playbook playbook.yml
Combining Timeout Settings
In some cases, you may need to combine both ansible_command_timeout and ansible_ssh_timeout to handle tasks that involve both command execution and SSH connections.
Here is an example playbook combining both settings:
---
- name: Playbook combining command and SSH timeouts
  hosts: all
  vars:
    ansible_command_timeout: 3600 # 1 hour for command execution
    ansible_ssh_timeout: 7200 # 2 hours for SSH operations
  tasks:
    - name: Run a long-running command with SSH
      command: "dd if=/dev/zero of=/mnt/test.img bs=2M count=2000"
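Separately from the connection-level variables, Ansible also has a task-level timeout keyword that caps how long a single task may run. The sketch below reuses the dd command from this tutorial and assumes Ansible 2.10 or later, where the keyword is available; the 1-hour value is only an illustration.

```yaml
---
- name: Limit how long a single task may run
  hosts: all
  tasks:
    - name: Write a large test file, but give up after 1 hour
      command: "dd if=/dev/zero of=/mnt/test.img bs=2M count=2000"
      timeout: 3600 # task keyword, in seconds; assumes Ansible 2.10+
```

If the task exceeds the limit, Ansible interrupts it and marks it as failed, which makes this keyword a direct way to bound the runtime of an ordinary command task.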
Use Asynchronous Actions for Handling Long Running Tasks
Ansible supports asynchronous task execution, which lets a task run in the background while the playbook either moves on to other tasks or waits for it to complete. This is achieved using the async and poll keywords.
- async specifies the maximum runtime in seconds.
- poll determines how often to check the status of the task.
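When you only want to avoid a connection timeout and do not need the playbook to move on, a positive poll value is enough. This is a minimal sketch that reuses the dd command from this tutorial; the 3600 and 30 values are illustrative assumptions.

```yaml
---
- name: Run a long task with periodic polling
  hosts: all
  tasks:
    - name: Write a large test file and poll every 30 seconds
      command: "dd if=/dev/zero of=/mnt/test.img bs=2M count=2000"
      async: 3600 # give up after 1 hour
      poll: 30 # check status every 30 seconds; the play waits here
```

With poll greater than 0 the task still blocks the play until it finishes, but Ansible checks in periodically instead of holding one connection open for the whole run. Setting poll to 0 starts the task and moves on immediately.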
Here is an example playbook:
---
- name: Run long-running task asynchronously
  hosts: all
  tasks:
    - name: Start a long-running task
      command: "dd if=/dev/zero of=/mnt/test.img bs=15M count=2000"
      async: 7200 # Run for up to 2 hours
      poll: 0 # Do not wait for the task to complete
      register: long_running_task
      ignore_errors: true # Ensure playbook continues even if task fails

    - name: Check on the long-running task
      async_status:
        jid: "{{ long_running_task.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 360
      delay: 30
This playbook runs a long-running task asynchronously on all hosts. The first task starts a command that creates a large file, test.img, in the /mnt directory. It may run for up to 2 hours (async: 7200) and does not block playbook execution (poll: 0). The result is registered, and errors are ignored so that the playbook continues regardless of success or failure. The second task then checks the job status every 30 seconds, for up to 360 retries (3 hours in total), until the job completes.
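The same pattern extends to several background jobs on each host: start them in a loop, then wait for each registered job. The sketch below is an assumption-laden illustration; the sleep commands stand in for real work, and the variable names background_jobs, job, and job_result are hypothetical.

```yaml
---
- name: Run several background jobs and wait for all of them
  hosts: all
  tasks:
    - name: Start one background job per duration
      command: "sleep {{ item }}"
      loop: [30, 60, 90] # placeholder durations in seconds
      async: 600
      poll: 0
      register: background_jobs

    - name: Wait for each background job to finish
      async_status:
        jid: "{{ job.ansible_job_id }}"
      loop: "{{ background_jobs.results }}"
      loop_control:
        loop_var: job
      register: job_result
      until: job_result.finished
      retries: 60
      delay: 10
```

Because every job is started with poll: 0, the three commands run concurrently, and the second task waits for each job in turn before the play continues.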
Run the above playbook using:
ansible-playbook playbook.yml
Conclusion
In this tutorial, I explained different ways to handle long-running tasks in Ansible with examples. I covered essential strategies such as asynchronous task execution and adjusting SSH and command timeouts.
By applying these methods, you can prevent your playbooks from hanging or timing out prematurely, thus enhancing the reliability and performance of your automation efforts.
If you are new to Ansible and want to learn it from scratch, our Ansible tutorial series will be of great help. It is written for the RHCE exam, but it is just as useful whether or not you are preparing for the exam.
LHB Community is made of readers like you who like to contribute to the portal by writing helpful Linux tutorials.