Chapter 4: Mathematical Operations in AWK
Transform raw data into polished reports and perform complex calculations that would require separate tools elsewhere.

AWK handles numbers like a built-in calculator. I would say like a scientific calculator, as it has several built-in mathematical functions. You can perform mathematical operations directly on your data fields without any special setup.
Let me create some sample files for you to work with:
sales_report.txt:
Product Price Quantity Discount_Percent
Laptop 1299.99 5 10
Desktop 899.50 3 15
Tablet 599.00 8 5
Monitor 349.99 12 20
Keyboard 99.99 15 0
Mouse 49.99 25 12
server_metrics.txt with fields hostname, cpu_percent, memory_mb, disk_io, temp_celsius:
web01 75.5 4096 85.2 45
web02 82.1 2048 78.9 62
db01 68.9 8192 92.3 38
db02 91.2 4096 88.7 71
cache01 45.3 1024 65.4 22
backup01 88.8 2048 91.1 55

Basic Arithmetic Operations
To refresh your memory, here are the basic arithmetic operators in AWK:
Operation | Operator | Example | Result | Description |
---|---|---|---|---|
Addition | + | 5 + 3 | 8 | Add two numbers |
Subtraction | - | 10 - 4 | 6 | Subtract second from first |
Multiplication | * | 6 * 7 | 42 | Multiply two numbers |
Division | / | 15 / 3 | 5 | Divide first by second |
Modulo | % | 10 % 3 | 1 | Remainder after division |
Exponentiation | ^ or ** | 2 ^ 3 | 8 | Raise to power (^ is the portable form; ** is an extension that is not part of POSIX) |
You already know that the order of evaluation matters in arithmetic. So, let's review that as well.
Priority | Operations | Example |
---|---|---|
1 | () Parentheses | (2 + 3) * 4 = 20 |
2 | ^ ** Exponentiation | 2 + 3 ^ 2 = 11 |
3 | - Unary minus | -5 * 2 = -10 |
4 | * / % Multiply/Divide/Modulo | 6 / 2 * 3 = 9 |
5 | + - Add/Subtract | 5 - 2 + 1 = 4 |
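A quick way to check these rules yourself is to evaluate a few expressions in a BEGIN block, which runs without any input file. The sketch below wraps the check in a shell function; the name precedence_demo is only illustrative and not part of AWK.

```shell
# precedence_demo is a hypothetical helper name used only for this example.
precedence_demo() {
    awk 'BEGIN {
        print 4 * (2 + 3)    # parentheses first: 20
        print 2 + 3 ^ 2      # exponent before addition: 11
        print -2 ^ 2         # exponent before unary minus: -4
        print 6 / 2 * 3      # same priority, left to right: 9
        print 10 % 3 + 1     # modulo before addition: 2
        print 7 / 2          # division is floating point: 3.5
    }'
}

precedence_demo
```

Note the last line: AWK does floating-point division, so 7 / 2 gives 3.5 rather than 3. Use int() when you want the whole-number part only.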
With the basics out of the way, let's do some calculations.
Calculate total revenue and discounted pricing
Calculate the total revenue for each product in sales_report.txt.
awk 'NR > 1 {total = $2 * $3; print $1, "generates $" total}' sales_report.txt
It multiplies price (second field $2) by quantity (third field $3) to show total revenue per product, skipping the header line with NR > 1 (line number greater than 1).
Laptop generates $6499.95
Desktop generates $2698.5
Tablet generates $4792
Monitor generates $4199.88
Keyboard generates $1499.85
Mouse generates $1249.75
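If you also want one grand total for the whole report, a running sum plus an END block is one way to get it. This is a minimal sketch, assuming the same column layout as sales_report.txt; total_revenue is a hypothetical function name, and the data is fed through a here-document so the example runs on its own.

```shell
# total_revenue is a hypothetical helper name; it reads the report
# from a file argument or, if none is given, from standard input.
total_revenue() {
    awk 'NR > 1 {
        total += $2 * $3          # add the revenue of this product to the running sum
    }
    END {
        printf "Total revenue: $%.2f\n", total
    }' "$@"
}

total_revenue <<'EOF'
Product Price Quantity Discount_Percent
Laptop 1299.99 5 10
Desktop 899.50 3 15
Tablet 599.00 8 5
Monitor 349.99 12 20
Keyboard 99.99 15 0
Mouse 49.99 25 12
EOF
```

With the sample data, this should print Total revenue: $20939.93, which is the sum of the six per-product figures above. You can also call it as total_revenue sales_report.txt.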

Now, let's apply discounts to calculate final prices:
awk 'NR > 1 {
    discount_amount = ($2 * $4) / 100
    final_price = $2 - discount_amount
    printf "%-10s: $%.2f (was $%.2f, saved $%.2f)\n", $1, final_price, $2, discount_amount
}' sales_report.txt
The script above calculates the discount amount and the final price for each product, then prints the original price and the savings with formatted output.
The tricky part here may be the formatting I used with printf. This is why I suggested reading about it at the beginning of this tutorial.
Quickly: %-10s pads the string to a width of 10 characters with left alignment (-), and %.2f prints a floating-point number rounded to two decimal places.
Laptop : $1169.99 (was $1299.99, saved $130.00)
Desktop : $764.58 (was $899.50, saved $134.93)
Tablet : $569.05 (was $599.00, saved $29.95)
Monitor : $279.99 (was $349.99, saved $70.00)
Keyboard : $99.99 (was $99.99, saved $0.00)
Mouse : $43.99 (was $49.99, saved $6.00)
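To see the format specifiers from the discount example in isolation, here is a small sketch that prints the same value with different widths and alignments. The square brackets are only there to make the padding visible, and format_demo is a hypothetical function name.

```shell
# format_demo is a hypothetical helper name used only for this example.
format_demo() {
    awk 'BEGIN {
        printf "[%-10s]\n", "Mouse"     # left-aligned, padded to 10 characters
        printf "[%10s]\n", "Mouse"      # right-aligned, padded to 10 characters
        printf "[%.2f]\n", 43.9912      # rounded to two decimal places
        printf "[%8.2f]\n", 43.9912     # two decimals, padded to a width of 8
    }'
}

format_demo
```

Combining a width with a precision, as in %8.2f, is handy when you want numeric columns to line up on the right.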

Calculate server temperature in Fahrenheit
Let's take it to the next level with a temperature converter.
awk '{
    fahrenheit = ($5 * 9/5) + 32
    printf "%-10s: %.1f°C = %.1f°F", $1, $5, fahrenheit
    if (fahrenheit > 140) printf " (HOT!)"
    printf "\n"
}' server_metrics.txt
The script above converts Celsius to Fahrenheit using the conversion formula F = C × 9/5 + 32 and flags any server hotter than 140°F in our sample file server_metrics.txt:
web01 : 45.0°C = 113.0°F
web02 : 62.0°C = 143.6°F (HOT!)
db01 : 38.0°C = 100.4°F
db02 : 71.0°C = 159.8°F (HOT!)
cache01 : 22.0°C = 71.6°F
backup01 : 55.0°C = 131.0°F

Advanced Mathematical Functions
AWK provides built-in mathematical functions for more complex calculations and you'll see some of them in this section.
Function | Purpose | Example | Result |
---|---|---|---|
sqrt(x) | Square root | sqrt(16) | 4 |
sin(x) | Sine (radians) | sin(1.57) | ≈ 1 (1.57 rad ≈ 90°) |
cos(x) | Cosine (radians) | cos(0) | 1 (0°) |
atan2(y,x) | Arc tangent of y/x (radians) | atan2(1,1) | 0.785 (45°) |
exp(x) | e^x (exponential) | exp(1) | 2.718 |
log(x) | Natural logarithm | log(2.718) | ≈ 1 |
int(x) | Integer part (truncates toward zero) | int(3.14) | 3 |
rand() | Random number n, 0 ≤ n < 1 | rand() | 0.423 (varies) |
srand(x) | Set random seed | srand(42) | Sets seed to 42 |
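A few of these functions combine into idioms you will see often. The sketch below is only an illustration (math_demo is a hypothetical function name): it derives pi from atan2, computes a base-10 logarithm from the natural log, and shows that int() truncates rather than rounds.

```shell
# math_demo is a hypothetical helper name used only for this example.
math_demo() {
    awk 'BEGIN {
        pi = atan2(0, -1)                    # common idiom for pi
        printf "%.4f\n", pi                  # 3.1416
        printf "%.3f\n", sin(pi / 2)         # 90 degrees in radians: 1.000
        printf "%.3f\n", log(100) / log(10)  # base-10 log from natural log: 2.000
        print int(3.99), int(-3.99)          # truncates toward zero: 3 -3
        print int(3.5 + 0.5)                 # rounding a positive number: 4
    }'
}

math_demo
```

Since the trigonometric functions work in radians, multiply degrees by pi / 180 before passing them in.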
Calculate the square root for performance metrics
Let's create a performance index from the server metrics file. It will use the sqrt function:
awk '{
    performance_index = sqrt($2 * $4)
    printf "%-10s: Performance Index = %.1f\n", $1, performance_index
}' server_metrics.txt
It creates a composite performance metric by taking the square root of the product of CPU usage ($2) and disk I/O ($4), which is the geometric mean of the two values.
web01 : Performance Index = 80.2
web02 : Performance Index = 80.5
db01 : Performance Index = 79.7
db02 : Performance Index = 89.9
cache01 : Performance Index = 54.4
backup01 : Performance Index = 89.9

Random number generation
Let's generate random server maintenance schedules:
awk '{
    maintenance_day = int(rand() * 30) + 1
    maintenance_hour = int(rand() * 24)
    printf "%-10s: Schedule maintenance on day %d at %02d:00\n", $1, maintenance_day, maintenance_hour
}' server_metrics.txt
Remember, rand() generates a random number between 0 and 1 (0 included, 1 excluded). So, I multiplied it by 30 (for days of the month) and 24 (for hours of the day) and only took the integer part with int(). Adding 1 to the day shifts the range from 0-29 to 1-30.
Thus we have a script that assigns random maintenance windows within a 30-day period.
web01 : Schedule maintenance on day 15 at 08:00
web02 : Schedule maintenance on day 3 at 14:00
db01 : Schedule maintenance on day 22 at 02:00
db02 : Schedule maintenance on day 8 at 19:00
cache01 : Schedule maintenance on day 11 at 05:00
backup01 : Schedule maintenance on day 27 at 16:00

⚠️ rand() is not so random in subsequent runs.
Run the script a few times. Do you notice something weird? The output stays the same. What's the big deal? Well, you would expect rand() to generate different values in each run and thus give a different schedule each time, right? But that doesn't happen here.
You see, unless you set a seed, rand() starts from the same default seed every time the program starts (this is what gawk does, for example). The numbers look random within one run, but every run replays the same sequence.
To make it generate different random numbers in each run, set the seed with srand(). Called without an argument, srand() uses the current time as the seed.
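Here is one way the maintenance scheduler could be seeded, as a minimal sketch. The function name maintenance_schedule and its optional seed argument are my own additions for illustration: with no argument it calls srand() so the schedule changes between runs, and with a number it calls srand(seed) so the schedule is repeatable.

```shell
# maintenance_schedule is a hypothetical helper name. An optional first
# argument is used as a fixed seed; without it, srand() seeds from the clock.
maintenance_schedule() {
    awk -v seed="${1:-}" 'BEGIN {
        if (seed != "")
            srand(seed)                # fixed seed: same schedule every run
        else
            srand()                    # time-based seed: schedule changes
    }
    {
        day = int(rand() * 30) + 1     # 1 to 30
        hour = int(rand() * 24)        # 0 to 23
        printf "%-10s: day %d at %02d:00\n", $1, day, hour
    }'
}

printf 'web01\nweb02\ndb01\n' | maintenance_schedule       # varies between runs
printf 'web01\nweb02\ndb01\n' | maintenance_schedule 42    # repeatable
```

Seeding belongs in BEGIN so it happens once, not once per line. Keep in mind that the time-based seed typically has one-second resolution, so two runs started within the same second can still produce the same numbers.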
🪧 Time to recall
You now have essential mathematical capabilities:
- Arithmetic operations: Perform calculations directly on data fields
- Mathematical functions: Use sqrt, int, rand for complex calculations
Practice Exercises
1. Create a formatted sales report from sales_report.txt in table format with aligned columns showing product, price, quantity, discount, and final revenue. The final output should look like this:
| Laptop | 1299.99 | 5 | 10.00% | 5849.95 |
| Desktop | 899.50 | 3 | 15.00% | 2293.73 |
| Tablet | 599.00 | 8 | 5.00% | 4552.40 |
| Monitor | 349.99 | 12 | 20.00% | 3359.90 |
| Keyboard | 99.99 | 15 | 0.00% | 1499.85 |
| Mouse | 49.99 | 25 | 12.00% | 1099.78 |
2. Calculate the average price of all products in the sales report
3. Convert all temperatures to Kelvin (K = C + 273.15)
4. Find which server has the highest CPU usage and by how much
In the next chapter, you'll learn about string manipulation in AWK.