Chapter 5: Resources, Data Sources, and Dependencies
Understand how Terraform actually builds infrastructure graphs and manages dependencies.
Think of Terraform as a construction manager. Resources are the buildings you construct. Data sources are the surveys you conduct before building. Dependencies are the order in which construction must happen. You can’t build the roof before the walls, right?
Resources: The Heart of Everything
If Terraform were a programming language, resources would be the objects. They’re things you create, modify, and delete. Every piece of infrastructure (servers, databases, networks, load balancers) starts as a resource in your code.
The anatomy of a resource: Two parts matter most. The type tells Terraform what kind of thing to create. The name is how you refer to it in your code. That’s it.
resource "aws_instance" "web" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}
Here’s what beginners often miss: the name web isn’t the name your server gets in AWS. It’s just a label for your Terraform code. Think of it like a variable name in programming. The actual AWS resource might be named something completely different (usually via tags).
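As a minimal sketch of that difference, the block below gives the instance a display name through a tag. The tag value production-web-01 is a made-up example, not something from a real environment.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-12345678" # placeholder AMI ID, as in the example above
  instance_type = "t2.micro"

  # The Name tag is what the AWS console shows. The label "web" above
  # exists only inside this Terraform configuration.
  tags = {
    Name = "production-web-01" # hypothetical display name
  }
}
```

Renaming the label web changes how your code refers to the instance; changing the Name tag changes what people see in AWS.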
Arguments vs Attributes - the key distinction: You provide arguments (the input values). Terraform gives you attributes (the output values). You tell Terraform instance_type = "t2.micro". Terraform tells you back id = "i-1234567890abcdef0" and public_ip = "54.123.45.67" after creation.
This distinction is crucial because attribute values are only known after Terraform creates the resource. You can write a reference to an instance’s IP address anywhere in your configuration, but Terraform can only resolve it once the instance exists, so it orders its operations accordingly.
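One way to see attributes in practice is to expose them as outputs. This sketch assumes the aws_instance.web resource shown earlier; the output names are illustrative.

```hcl
# Assumes the aws_instance.web resource defined earlier in this chapter.

output "web_instance_id" {
  # Attribute assigned by AWS at creation time
  value = aws_instance.web.id
}

output "web_public_ip" {
  # Also assigned at creation; not something you pass in as an argument
  value = aws_instance.web.public_ip
}
```

On the first terraform plan, both values appear as "(known after apply)", which is Terraform's way of saying the attribute does not exist yet.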
References connect everything: When you write aws_instance.web.id, you’re doing three things:
- Referencing the resource type (aws_instance)
- Referencing your local name for it (web)
- Accessing an attribute it exposes (id)
This is how infrastructure connects. One resource references another’s attributes. VPC ID goes into subnet configuration. Subnet ID goes into instance configuration. These references tell Terraform the construction order.
Why the two-part naming? Because you might create multiple instances of the same type. You could have aws_instance.web, aws_instance.db, and aws_instance.cache. The type describes what it is. The name describes which one.
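A short sketch of two resources sharing a type but not a name. The AMI ID is a placeholder and the instance sizes are arbitrary choices for illustration.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = "t2.micro"
}

resource "aws_instance" "db" {
  ami           = "ami-12345678" # placeholder AMI ID
  instance_type = "t3.medium"    # assumed size, for illustration only
}

# Same type, different names, so each reference is unambiguous.
output "web_id" {
  value = aws_instance.web.id
}

output "db_id" {
  value = aws_instance.db.id
}
```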
Data Sources: Reading the Existing World
Resources create. Data sources read. That’s the fundamental difference.
Real infrastructure doesn’t exist in a vacuum. You’re deploying into an existing VPC someone else created. You need the latest Ubuntu AMI, which is rebuilt regularly. You’re reading a secret from a vault. You shouldn’t create any of these things yourself; you only need to reference them.
Data sources are queries: Think of them as SELECT statements in SQL. You’re querying existing infrastructure and pulling information into your Terraform code.
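data "aws_ami" "ubuntu" {
  most_recent = true
  owners      = ["099720109477"] # Canonical's AWS account ID

  filter {
    name = "name"
    # Pin the architecture so most_recent can't pick an arm64 image
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }
}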
This doesn’t create an AMI. It searches for one that already exists and gives you its ID.
Why data sources matter for infrastructure code: Imagine hardcoding AMI IDs. Next month, Canonical publishes a new Ubuntu image with security patches. You have to find the new AMI ID and update your code. Or, use a data source that always finds the latest. Code stays the same, infrastructure stays updated.
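Plugging the query result into a resource might look like the sketch below. It assumes the data "aws_ami" "ubuntu" block shown above is in the same configuration.

```hcl
resource "aws_instance" "web" {
  # The ID comes from the query, not from a hardcoded string.
  ami           = data.aws_ami.ubuntu.id
  instance_type = "t2.micro"
}
```

One thing to keep in mind: when the query starts returning a newer AMI, the next plan proposes replacing the instance, because changing ami forces a new resource. Review the plan before applying.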
The same principle applies to everything external: VPCs, DNS zones, availability zones, TLS certificates, secrets. If it exists before your Terraform code runs, use a data source.
The reference difference: Resources are type.name.attribute. Data sources are data.type.name.attribute. That extra data. prefix tells Terraform and you that this is a read operation, not a create operation.
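Here is a hedged sketch of both reference styles side by side. The VPC lookup assumes someone else tagged their VPC with Name = shared-network; that tag value and the CIDR block are hypothetical.

```hcl
# Read an existing VPC (nothing is created here).
data "aws_vpc" "shared" {
  tags = {
    Name = "shared-network" # hypothetical tag on the pre-existing VPC
  }
}

# Create a subnet inside it.
resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.shared.id # data source reference: data.type.name.attribute
  cidr_block = "10.0.1.0/24"          # assumed to fall inside the VPC's range
}

output "subnet_id" {
  value = aws_subnet.app.id # resource reference: type.name.attribute
}
```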
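Data sources usually run first: Terraform reads a data source during planning, before it creates anything, whenever the data source’s arguments are already known. This makes sense, since you need to read information before you can use it to create things. The exception is a data source whose arguments depend on a resource that doesn’t exist yet; Terraform defers that read until the apply phase.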
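One caveat worth illustrating: a data source can itself depend on a resource. In this sketch (the names main and in_main are arbitrary), the filter value is not known until the VPC exists, so Terraform has to postpone the query until apply time on the first run.

```hcl
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

# The filter value is a resource attribute that is unknown until the VPC
# is created, so this read cannot happen during the initial plan.
data "aws_subnets" "in_main" {
  filter {
    name   = "vpc-id"
    values = [aws_vpc.main.id]
  }
}
```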
String Interpolation: Building Dynamic Infrastructure
Infrastructure can’t be static. You need bucket names that include environment names. Server names that include region. Tags that reference other resources. String interpolation is how you build these dynamic values.
The rule is simple: Use ${} when building strings. Don’t use it for direct references.
bucket = "myapp-${var.environment}-data" # String building - USE ${}
ami = data.aws_ami.ubuntu.id # Direct reference - NO ${}
Why the distinction? In Terraform’s early days (before version 0.12), you needed "${var.name}" everywhere. It was verbose and ugly. Modern Terraform is cleaner — interpolation only when actually building strings.
What you can put inside interpolation: Any expression. Variables, resource attributes, conditional expressions, function calls. If it produces a value, you can interpolate it.
name = "${var.project}-${var.environment}-${count.index + 1}"
Common beginner mistake: Writing instance_type = "${var.instance_type}". The ${} is unnecessary here — you’re not building a string, just referencing a variable. Just write instance_type = var.instance_type.
When interpolation shines: Multi-part names. Constructing URLs. Building complex strings from multiple sources. Any time “I need to combine these values into text.”
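A small sketch of a function call and a conditional inside interpolation. The variable defaults and the example.com hostnames are assumptions chosen for illustration.

```hcl
variable "project" {
  type    = string
  default = "MyApp"
}

variable "environment" {
  type    = string
  default = "dev"
}

locals {
  # Function call inside interpolation: lower() normalizes the project name.
  # With the defaults above this evaluates to "myapp-dev-data".
  bucket_name = "${lower(var.project)}-${var.environment}-data"

  # Conditional expression inside interpolation, used to build a URL.
  # With the defaults above this evaluates to "https://api-staging.example.com".
  api_url = "https://${var.environment == "prod" ? "api" : "api-staging"}.example.com"
}
```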
Dependencies: The Hidden Graph
This is where Terraform’s magic happens. You write resources in any order. Terraform figures out the correct creation order automatically. How? By analyzing dependencies.
Implicit Dependencies: The Automatic Kind
When you reference one resource’s attribute in another resource, you’ve created a dependency. Terraform sees the reference and knows the order.
Mental model: Think of dependencies as arrows in a diagram. VPC -> Subnet -> Instance. Each arrow means “must exist before.” Terraform builds this diagram automatically by finding all the attribute references in your code.
resource "aws_vpc" "main" {
  cidr_block = "10.0.0.0/16"
}

resource "aws_subnet" "app" {
  vpc_id     = aws_vpc.main.id # Reference creates dependency
  cidr_block = "10.0.1.0/24"
}

resource "aws_instance" "web" {
  subnet_id     = aws_subnet.app.id # Another dependency
  ami           = "ami-12345678"
  instance_type = "t2.micro"
}
You can write these in any order in your files. Terraform sees aws_vpc.main.id referenced in the subnet, and aws_subnet.app.id referenced in the instance. It builds the dependency graph: VPC -> Subnet -> Instance.
Why this matters: Terraform creates things in parallel when possible (up to 10 operations at a time by default). If you define 10 S3 buckets with no dependencies, Terraform creates all 10 simultaneously. If you define a VPC with 10 subnets, it creates the VPC first, then all 10 subnets in parallel.
The key insight: Every attribute reference is a dependency. type.name.attribute means “I need this resource to exist first.”
Explicit Dependencies: The Manual Kind
Sometimes Terraform can’t detect dependencies automatically. The relationship exists, but there’s no attribute reference to signal it.
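Classic example - IAM: You create an IAM role. You attach a policy to it. You launch an instance with that role (through an instance profile). The instance references the profile, and through it the role, but nothing on the instance references the policy. Terraform might launch the instance before the policy attaches, so the instance can hit permission errors when it first boots.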
resource "aws_instance" "app" {
  ami           = "ami-12345678"
  instance_type = "t2.micro"

  depends_on = [aws_iam_role_policy.app_policy]
}
The depends_on argument says “don’t create this until that other thing exists,” even though we’re not referencing any of its attributes.
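For context, a fuller version of the IAM scenario could look like the sketch below. The role, policy, and profile names, the S3 bucket ARN, and the permissions are all assumptions made up for illustration.

```hcl
resource "aws_iam_role" "app" {
  name = "app-role" # hypothetical name

  # Trust policy: lets EC2 instances assume this role.
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "app_policy" {
  name = "app-s3-read" # hypothetical name
  role = aws_iam_role.app.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::example-bucket/*" # placeholder bucket
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "app-profile" # hypothetical name
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = "ami-12345678" # placeholder AMI ID
  instance_type        = "t2.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # Implicit chain: instance -> profile -> role. Nothing here points at
  # the policy, so the ordering has to be stated explicitly.
  depends_on = [aws_iam_role_policy.app_policy]
}
```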
When you need explicit dependencies:
- Timing matters but there’s no direct attribute reference
- Resources must exist in a certain order for external reasons
- You’re working around provider bugs or limitations
Use sparingly: Explicit dependencies reduce parallelism. Terraform must wait for the dependency before proceeding. Only use them when implicit dependencies won’t work.
The Dependency Graph
Behind the scenes, Terraform builds a directed acyclic graph (DAG) of all your resources. Nodes are resources. Edges are dependencies. This graph determines everything:
- What to create first
- What can be created in parallel
- What to destroy first when tearing down
Directed: Dependencies have direction. A depends on B, not the other way around.
Acyclic: No loops allowed. If A depends on B, B can’t depend on A (even indirectly). Terraform will error on circular dependencies—they’re impossible to resolve.
Why you should care: Understanding the dependency graph helps you debug. If Terraform is creating things in a weird order, check the references. If it’s failing on circular dependencies, look for cycles in your attribute references.
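A common source of cycles is two security groups whose inline rules point at each other. The sketch below shows the cyclic shape in comments and one possible way to break it by moving the rules into separate resources. Group names and ports are assumptions.

```hcl
# Cyclic version (commented out): each group's inline ingress rule
# references the other group's ID, so neither can be created first.
#
#   resource "aws_security_group" "app" {
#     ingress { security_groups = [aws_security_group.db.id] ... }
#   }
#   resource "aws_security_group" "db" {
#     ingress { security_groups = [aws_security_group.app.id] ... }
#   }

# Acyclic version: create both groups with no rules, then attach rules.
resource "aws_security_group" "app" {
  name = "app" # hypothetical name
}

resource "aws_security_group" "db" {
  name = "db" # hypothetical name
}

resource "aws_security_group_rule" "app_from_db" {
  type                     = "ingress"
  from_port                = 8080 # assumed application port
  to_port                  = 8080
  protocol                 = "tcp"
  security_group_id        = aws_security_group.app.id
  source_security_group_id = aws_security_group.db.id
}

resource "aws_security_group_rule" "db_from_app" {
  type                     = "ingress"
  from_port                = 5432 # assumed database port
  to_port                  = 5432
  protocol                 = "tcp"
  security_group_id        = aws_security_group.db.id
  source_security_group_id = aws_security_group.app.id
}
```

Each rule depends on both groups, but the groups no longer depend on each other, so the graph has a valid order again.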
Viewing the graph: Run terraform graph to print the graph Terraform built in DOT format, which Graphviz can render as an image. It’s mostly useful for debugging complex configurations.
How It All Fits Together
Every Terraform configuration is a combination of these concepts:
- Resources define what to create
- Data sources query what exists
- Interpolation builds dynamic values
- Dependencies determine the order
The workflow: Terraform analyzes all resource and data source definitions and builds the dependency graph. Data sources whose arguments are already known are read during planning (they’re just queries). Terraform then creates resources in the correct order, parallelizing when possible. References between resources become the glue.
The mental shift: You’re not writing a script that executes top-to-bottom. You’re describing desired state. Terraform figures out how to achieve it. That’s declarative infrastructure.
Why beginners struggle: They think procedurally. “First create this, then create that.” Terraform doesn’t work that way. You declare everything you want. Terraform analyzes the dependencies and figures out the procedure.
Common Mistakes and How to Avoid Them
Mistake 1: Using resource names as identifiers - Resource names in Terraform are local to your code. They’re not the names resources get in your cloud provider. Use tags or name attributes for that.
Mistake 2: Trying to reference attributes where Terraform doesn’t allow it - You can’t use aws_instance.web.public_ip in a variable default value. Variable defaults must be literal values and cannot reference resources or any other object in the configuration. Use locals or outputs instead.
Mistake 3: Over-using explicit dependencies - If you’re writing lots of depends_on, you’re probably doing something wrong. Most dependencies should be implicit through attribute references.
Mistake 4: Confusing data sources with resources - Data sources don’t create anything. If you need to create something, use a resource, not a data source.
Mistake 5: Hardcoding values that data sources should provide - Don’t hardcode AMI IDs, availability zones, or other values that change. Use data sources to query them dynamically.
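To make Mistakes 2 and 5 concrete, here is a sketch that queries availability zones instead of hardcoding one, and derives a value from a resource attribute in a local rather than a variable default. It assumes the data "aws_ami" "ubuntu" block from earlier in the chapter; the local and output names are illustrative.

```hcl
# Mistake 5: query the zones instead of hardcoding something like "us-east-1a".
data "aws_availability_zones" "available" {
  state = "available"
}

resource "aws_instance" "web" {
  ami               = data.aws_ami.ubuntu.id # data source defined earlier
  instance_type     = "t2.micro"
  availability_zone = data.aws_availability_zones.available.names[0]
}

# Mistake 2: a variable default cannot reference a resource,
# but a local value can.
locals {
  web_url = "http://${aws_instance.web.public_ip}"
}

output "web_url" {
  value = local.web_url
}
```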
Quick Reference
Resources:
resource "type" "name" {
  argument = "value"
}
# Reference: type.name.attribute
Data Sources:
data "type" "name" {
  argument = "value" # lookup criteria, such as a filter or tag
}
# Reference: data.type.name.attribute
String Interpolation:
"prefix-${var.name}-suffix" # Building strings
var.name # Direct reference
Dependencies:
# Implicit (automatic)
subnet_id = aws_subnet.main.id
# Explicit (manual)
depends_on = [aws_iam_role_policy.app_policy]
Master these four concepts and you’ll understand 80% of Terraform. Everything else builds on this foundation.
You now understand the core building blocks: resources, data sources, and dependencies. But what if you need to create multiple similar resources? Copy-pasting code isn’t the answer. In the next chapter, we’ll explore count, for_each, and conditionals: the tools that make your infrastructure code truly dynamic and scalable.