

Chapter 4: Mathematical Operations in AWK

Transform raw data into polished reports and perform complex calculations that would require separate tools elsewhere.


AWK handles numbers like a built-in calculator. I would say like a scientific calculator, as it has several built-in mathematical functions. You can perform mathematical operations directly on your data fields without any special setup.

Let me create some sample files for you to work with:

sales_report.txt:

Product Price Quantity Discount_Percent
Laptop 1299.99 5 10
Desktop 899.50 3 15
Tablet 599.00 8 5
Monitor 349.99 12 20
Keyboard 99.99 15 0
Mouse 49.99 25 12

server_metrics.txt with fields hostname, cpu_percent, memory_mb, disk_io, temp_celsius:

web01 75.5 4096 85.2 45
web02 82.1 2048 78.9 62
db01 68.9 8192 92.3 38
db02 91.2 4096 88.7 71
cache01 45.3 1024 65.4 22
backup01 88.8 2048 91.1 55
🚧 I'll be using the printf command a lot to format the output of the examples in this chapter. I explain the format strings briefly as they come up, but it helps if you are already familiar with the printf command in Bash.
Related reading: Bash printf Command Examples [Better Than Echo]

Basic Arithmetic Operations

To refresh your memory, here are the basic arithmetic operators in AWK:

| Operation | Operator | Example | Result | Description |
|---|---|---|---|---|
| Addition | + | 5 + 3 | 8 | Add two numbers |
| Subtraction | - | 10 - 4 | 6 | Subtract second from first |
| Multiplication | * | 6 * 7 | 42 | Multiply two numbers |
| Division | / | 15 / 3 | 5 | Divide first by second |
| Modulo | % | 10 % 3 | 1 | Remainder after division |
| Exponentiation | ^ or ** | 2 ^ 3 | 8 | Raise to power |
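
If you want to check the table yourself, here is a small sketch that evaluates each example in a BEGIN block, so no input file is needed. It uses ^ rather than ** because ** is not part of POSIX awk, although gawk accepts it. The extra 7 / 2 line is my own addition to show that division is not truncated to an integer.

```shell
awk 'BEGIN {
    print "5 + 3  =", 5 + 3      # 8
    print "10 - 4 =", 10 - 4     # 6
    print "6 * 7  =", 6 * 7      # 42
    print "15 / 3 =", 15 / 3     # 5
    print "10 % 3 =", 10 % 3     # 1
    print "2 ^ 3  =", 2 ^ 3      # 8
    print "7 / 2  =", 7 / 2      # 3.5, AWK arithmetic is floating point
}'
```

Because AWK treats every number as floating point, 7 / 2 gives 3.5. Wrap the expression in int() when you need a whole number.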

The order of evaluation matters in arithmetic, so here is AWK's operator precedence, from highest priority to lowest:

| Priority | Operations | Example |
|---|---|---|
| 1 | () Parentheses | (2 + 3) * 4 = 20 |
| 2 | ^ ** Exponentiation | 2 + 3 ^ 2 = 11 |
| 3 | - Unary minus | -5 * 2 = -10 |
| 4 | * / % Multiply/Divide/Modulo | 6 / 2 * 3 = 9 |
| 5 | + - Add/Subtract | 5 - 2 + 1 = 4 |
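
The precedence rules above can be verified with a quick sketch like the one below. The variable x and the two lines with -x ^ 2 and 2 ^ 3 ^ 2 are my own additions: they show that exponentiation binds tighter than unary minus and is evaluated right to left.

```shell
awk 'BEGIN {
    x = 2
    print "(2 + 3) * 4 =", (2 + 3) * 4   # 20: parentheses first
    print "2 + 3 ^ 2   =", 2 + 3 ^ 2     # 11: exponent before addition
    print "-x ^ 2      =", -x ^ 2        # -4: evaluated as -(x ^ 2)
    print "2 ^ 3 ^ 2   =", 2 ^ 3 ^ 2     # 512: right to left, 2 ^ (3 ^ 2)
    print "6 / 2 * 3   =", 6 / 2 * 3     # 9: same priority, left to right
    print "5 - 2 + 1   =", 5 - 2 + 1     # 4: same priority, left to right
}'
```

When in doubt, add parentheses. They cost nothing and make the intent obvious to the next reader.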

With the basics out of the way, let's do some calculations.

Calculate total revenue and discounted pricing

Calculate total revenue for each product in the sales_report.txt.

awk 'NR > 1 {total = $2 * $3; print $1, "generates $" total}' sales_report.txt

It multiplies price (second field $2) by quantity (third field $3) to show total revenue per product, skipping the header line with NR > 1 (line number greater than 1).

Laptop generates $6499.95
Desktop generates $2698.5
Tablet generates $4792
Monitor generates $4199.88
Keyboard generates $1499.85
Mouse generates $1249.75
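
Notice that plain print shows 2698.5 and 4792 without trailing zeros, which looks odd for currency. A possible variation, sketched below, uses printf with %.2f and also accumulates a grand total in an END block. The variable names revenue and total are my own, and the heredoc only recreates the sample file so the snippet runs on its own.

```shell
# Recreate the sample file from this chapter so the snippet is self-contained
cat > sales_report.txt <<'EOF'
Product Price Quantity Discount_Percent
Laptop 1299.99 5 10
Desktop 899.50 3 15
Tablet 599.00 8 5
Monitor 349.99 12 20
Keyboard 99.99 15 0
Mouse 49.99 25 12
EOF

awk 'NR > 1 {
    revenue = $2 * $3                      # price * quantity
    total += revenue                       # running sum across all rows
    printf "%s generates $%.2f\n", $1, revenue
}
END {
    printf "Total revenue: $%.2f\n", total
}' sales_report.txt
```

The END block runs once after the last line has been read, which makes it the natural place for sums and averages.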

Now, let's apply discounts to calculate final prices:

awk 'NR > 1 {
    discount_amount = ($2 * $4) / 100
    final_price = $2 - discount_amount
    printf "%-10s: $%.2f (was $%.2f, saved $%.2f)\n", $1, final_price, $2, discount_amount
}' sales_report.txt

The script above calculates the discount amount and the final price for each product, then prints the final price, the original price, and the savings as formatted output.

The tricky part here may be the format string passed to printf. This is why I suggested reading about printf at the beginning of this tutorial.

Quickly: %-10s prints a string in a field 10 characters wide, left-aligned (that is what the - does), and %.2f prints a floating-point number rounded to two decimal places.

Laptop    : $1169.99 (was $1299.99, saved $130.00)
Desktop   : $764.58 (was $899.50, saved $134.93)
Tablet    : $569.05 (was $599.00, saved $29.95)
Monitor   : $279.99 (was $349.99, saved $70.00)
Keyboard  : $99.99 (was $99.99, saved $0.00)
Mouse     : $43.99 (was $49.99, saved $6.00)
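
To see the format specifiers in isolation, here is a minimal sketch. The square brackets are only there to make the padding visible, and the sample values are arbitrary.

```shell
awk 'BEGIN {
    printf "[%-10s]\n", "Mouse"    # [Mouse     ]  left-aligned, width 10
    printf "[%10s]\n", "Mouse"     # [     Mouse]  right-aligned, width 10
    printf "[%.2f]\n", 5.9988      # [6.00]        rounded to two decimals
    printf "[%8.2f]\n", 5.9988     # [    6.00]    two decimals, width 8
    printf "[%02d]\n", 5           # [05]          integer, zero-padded to 2 digits
}'
```

Unlike print, printf does not add a newline on its own, which is why every format string ends with \n.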

Calculate server temperature in Fahrenheit

Let's take it to the next level with a temperature converter.

awk '{
    fahrenheit = ($5 * 9/5) + 32
    printf "%-10s: %.1f°C = %.1f°F", $1, $5, fahrenheit
    if (fahrenheit > 140) printf " (HOT!)"
    printf "\n"
}' server_metrics.txt

The script above converts the Celsius reading in the fifth field to Fahrenheit and flags any server hotter than 140°F in our sample file server_metrics.txt:

web01     : 45.0°C = 113.0°F
web02     : 62.0°C = 143.6°F (HOT!)
db01      : 38.0°C = 100.4°F
db02      : 71.0°C = 159.8°F (HOT!)
cache01   : 22.0°C = 71.6°F
backup01  : 55.0°C = 131.0°F
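
One detail worth checking is the 9/5 in the formula. In a language with integer division it would collapse to 1, but AWK computes it as 1.8. A small sketch, using web02's reading of 62 as a hard-coded sample and adding the reverse conversion as a sanity check (the variables c and f are my own):

```shell
awk 'BEGIN {
    c = 62                                   # sample reading, web02 in the file above
    f = (c * 9/5) + 32                       # same formula as the script above
    printf "%d°C = %.1f°F\n", c, f           # 62°C = 143.6°F
    printf "and back: %.1f°C\n", (f - 32) * 5/9
}'
```

The round trip returns the original value, which is a cheap way to confirm that a conversion formula was typed correctly.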

Advanced Mathematical Functions

AWK provides built-in mathematical functions for more complex calculations and you'll see some of them in this section.

| Function | Purpose | Example | Result |
|---|---|---|---|
| sqrt(x) | Square root | sqrt(16) | 4 |
| sin(x) | Sine (x in radians) | sin(1.57) | ≈ 1 (1.57 rad ≈ 90°) |
| cos(x) | Cosine (x in radians) | cos(0) | 1 (0°) |
| atan2(y,x) | Arc tangent of y/x, in radians | atan2(1,1) | 0.785 (45°) |
| exp(x) | e^x (exponential) | exp(1) | 2.718 |
| log(x) | Natural logarithm | log(2.718) | ≈ 1 |
| int(x) | Integer part (truncates, does not round) | int(3.14) | 3 |
| rand() | Random number n, 0 ≤ n < 1 | rand() | 0.423 (varies) |
| srand(x) | Set random seed | srand(42) | Sets seed to 42 |
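
A quick way to get a feel for these functions is to call them in a BEGIN block. The sketch below is my own illustration. It derives pi with atan2(0, -1), a common idiom since AWK has no pi constant, and adds int(-3.99) to show that int() truncates toward zero instead of rounding.

```shell
awk 'BEGIN {
    pi = atan2(0, -1)                                # AWK has no built-in pi constant
    printf "sqrt(16)     = %d\n", sqrt(16)           # 4
    printf "pi           = %.5f\n", pi               # 3.14159
    printf "sin(pi / 2)  = %.3f\n", sin(pi / 2)      # 1.000
    printf "atan2(1, 1)  = %.3f\n", atan2(1, 1)      # 0.785
    printf "exp(1)       = %.3f\n", exp(1)           # 2.718
    printf "log(exp(1))  = %.3f\n", log(exp(1))      # 1.000
    printf "int(3.99)    = %d\n", int(3.99)          # 3
    printf "int(-3.99)   = %d\n", int(-3.99)         # -3
}'
```

If you need rounding to the nearest integer rather than truncation, printf "%.0f" or int(x + 0.5) for positive x are the usual workarounds.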

Calculate the square root for performance metrics

Let's create a performance index from the server metrics file using the sqrt function:

awk '{
    performance_index = sqrt($2 * $4)
    printf "%-10s: Performance Index = %.1f\n", $1, performance_index
}' server_metrics.txt

It builds a composite performance metric: the square root of the product of CPU usage ($2) and disk I/O ($4), which is the geometric mean of the two values.

web01     : Performance Index = 80.2
web02     : Performance Index = 80.5
db01      : Performance Index = 79.7
db02      : Performance Index = 89.9
cache01   : Performance Index = 54.4
backup01  : Performance Index = 89.9

Random number generation

Let's generate random server maintenance schedules:

awk '{
    maintenance_day = int(rand() * 30) + 1
    maintenance_hour = int(rand() * 24)
    printf "%-10s: Schedule maintenance on day %d at %02d:00\n", $1, maintenance_day, maintenance_hour
}' server_metrics.txt

Remember, rand() returns a random number between 0 and 1 (1 itself is never returned). Multiplying by 30 and keeping only the integer part with int() gives 0 to 29, so adding 1 yields a day from 1 to 30. Multiplying by 24 and truncating gives an hour from 0 to 23.

Thus we have a script that assigns random maintenance windows within a 30-day period.

web01     : Schedule maintenance on day 15 at 08:00
web02     : Schedule maintenance on day 3 at 14:00
db01      : Schedule maintenance on day 22 at 02:00
db02      : Schedule maintenance on day 8 at 19:00
cache01   : Schedule maintenance on day 11 at 05:00
backup01  : Schedule maintenance on day 27 at 16:00

⚠️ rand() is not so random in subsequent runs.

Run the script a few times. Do you notice something weird? The output stays the same. You would expect rand() to produce different values in each run and thus a different schedule each time, right? But that doesn't happen here.

You see, rand() is a pseudo-random generator. Unless you seed it, it starts from the same default seed every time, so every run produces the same sequence of numbers. (Some implementations, such as mawk, seed themselves from the clock at startup, so you may not see this behavior everywhere.)

To get different random numbers in each run, set the seed with srand(), typically once in a BEGIN block. Called without an argument, srand() uses the current time as the seed.
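
Here is one way the maintenance script could be seeded, as a sketch rather than the only approach. The heredoc simply recreates the sample file so the snippet runs on its own. The second command uses a fixed seed (42 is arbitrary) to show the opposite case: a sequence that is deliberately repeatable.

```shell
# Recreate the sample file from this chapter so the snippet is self-contained
cat > server_metrics.txt <<'EOF'
web01 75.5 4096 85.2 45
web02 82.1 2048 78.9 62
db01 68.9 8192 92.3 38
db02 91.2 4096 88.7 71
cache01 45.3 1024 65.4 22
backup01 88.8 2048 91.1 55
EOF

# Seed from the current time: the schedule changes from run to run
awk 'BEGIN { srand() }   # seed once, before any input is read
{
    maintenance_day = int(rand() * 30) + 1
    maintenance_hour = int(rand() * 24)
    printf "%-10s: Schedule maintenance on day %d at %02d:00\n", $1, maintenance_day, maintenance_hour
}' server_metrics.txt

# Fixed seed: random-looking but identical on every run
awk 'BEGIN { srand(42); print int(rand() * 30) + 1 }'
```

Keep in mind that the time-based seed has one-second resolution, so two runs started within the same second can still produce the same schedule. A fixed seed is handy for tests where you want repeatable results.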

🪧 Time to recall

You now have essential mathematical capabilities:

  • Arithmetic operations: Perform calculations directly on data fields
  • Mathematical functions: Use sqrt, int, rand for complex calculations

Practice Exercises

1. Create a formatted sales report from sales_report.txt as a table with aligned columns showing product, price, quantity, discount, and final revenue (price × quantity after the discount). The final output should look like this:

| Laptop     |  1299.99 |    5 |  10.00% |    5849.95 |
| Desktop    |   899.50 |    3 |  15.00% |    2293.73 |
| Tablet     |   599.00 |    8 |   5.00% |    4552.40 |
| Monitor    |   349.99 |   12 |  20.00% |    3359.90 |
| Keyboard   |    99.99 |   15 |   0.00% |    1499.85 |
| Mouse      |    49.99 |   25 |  12.00% |    1099.78 |

2. Calculate the average price of all products in the sales report

3. Convert all temperatures to Kelvin (K = C + 273.15)

4. Find which server has the highest CPU usage and by how much

In the next chapter, you'll learn about string manipulation in AWK.

Abhishek Prakash