Explaining Terraform State Data and Illustrating Four Drift Scenarios

The aim of this page📝 is to describe the concept of state data in Terraform and to illustrate four drift scenarios. I handle drift on a daily basis, so knowing the mechanics is foundational. One constraint: our statefiles are stored remotely in places I don't have access to, so my illustrations come from a sandbox. Also, we manage the config files, yet clients have access to their own target environments.

Oct 5, 2023

State_data is an essential, `JSON`-formatted component of Terraform: it sits between `config_files` and `target_environment` and maps one to the other

state data is stored in a json file terraform.tfstate (the previous version is kept in terraform.tfstate.backup) with a configurable (local/remote) location
state data contains entries for resources, datasources, and outputs
state data is what maps
..terraform configuration <> state_file <> target environment
..identifiers from configuration <> state_file <> identifiers in the target environment
state data is
..⟹ how TF knows about the objects it is managing
..⟹ how TF knows which CRUD operation it needs to perform on a resource
rule: do not alter state files by hand
the unique identifier in the target environment depends on the object type
..e.g. an EC2 instance has its own unique instance ID
the state data also contains metadata about the version of Terraform used, the version of the state data format, and the serial number of the current state data
when TF executes an operation that could alter state data, it tries to place a lock on the state data so that no other instance of Terraform can make changes at the same time
as for the location, you can save the state
..locally
..remotely (AWS, Azure, NFS, Terraform Cloud)
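
To make the locking rule above more concrete, here is a minimal sketch of mutual exclusion via an exclusively created lock file. This is only an illustration of the idea, not how Terraform or any of its backends implement locking; the helper names and the `state.lock` file name are hypothetical.

```python
import os
import tempfile


def acquire_lock(lock_path: str) -> bool:
    """Try to take the lock by creating the file exclusively.

    Returns False if another holder already created the file.
    """
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


def release_lock(lock_path: str) -> None:
    """Give the lock back by removing the file."""
    os.remove(lock_path)


workdir = tempfile.mkdtemp()
lock_path = os.path.join(workdir, "state.lock")  # hypothetical lock file name

first = acquire_lock(lock_path)   # first writer gets the lock
second = acquire_lock(lock_path)  # a concurrent writer is refused
release_lock(lock_path)
third = acquire_lock(lock_path)   # after release, the lock is free again
release_lock(lock_path)

print(first, second, third)
```

The point is only that the second writer is refused while the first one holds the lock, which is the behaviour the rule describes.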

Each time TF generates a `plan`, it first loads the statefile, then refreshes it by querying the environment, and finally compares the refreshed state with the configuration

[load] state data from a file into memory. the empty statefile looks like this

{
  "version": 4,
  "terraform_version": "1.5.5",
  "serial": 70,
  "lineage": "07e6371c-b553-b77b-f49f-fd03fab89d1d",
  "outputs": {},
  "resources": [],
  "check_results": null
}
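
As a small illustration of the [load] step, the state above can be read like any other JSON document. This is a sketch in Python rather than anything Terraform itself runs; `summarize_state` is a hypothetical helper and the state is embedded as a string so the example runs on its own.

```python
import json

# The empty state shown above, embedded so the sketch is self-contained.
EMPTY_STATE = """
{
  "version": 4,
  "terraform_version": "1.5.5",
  "serial": 70,
  "lineage": "07e6371c-b553-b77b-f49f-fd03fab89d1d",
  "outputs": {},
  "resources": [],
  "check_results": null
}
"""


def summarize_state(raw: str) -> dict:
    """Pull the metadata fields and simple counts out of state JSON."""
    state = json.loads(raw)
    return {
        "format_version": state["version"],
        "terraform_version": state["terraform_version"],
        "serial": state["serial"],
        "lineage": state["lineage"],
        "resource_count": len(state.get("resources", [])),
        "output_count": len(state.get("outputs", {})),
    }


summary = summarize_state(EMPTY_STATE)
print(summary)
```

Reading the file this way is harmless; the rule about not altering state files by hand still applies to writing.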

[refresh] the values in the state data by querying the target environment
Here, some values are known up front, and some are known only after apply. For the latter, the plan output says

data "google_compute_network_endpoint_group" "collector_neg"  {
      ~ default_port          = 0 -> (known after apply)

[compare] the values (run diff) in the state data and the values in the configuration
⟹ announce the plan

Plan: 0 to add, 1 to change, 0 to destroy.
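
The three steps above can be sketched as a toy model. This is not Terraform's implementation: config, state, and environment are plain dictionaries keyed by a resource address, and the address `example_neg.collector_neg`, the port values, and all function names are assumptions made for illustration.

```python
def refresh(state: dict, environment: dict) -> dict:
    """[refresh] Live values win; objects that vanished are dropped from state."""
    return {addr: dict(environment[addr]) for addr in state if addr in environment}


def plan(config: dict, state: dict, environment: dict) -> dict:
    """[load] is implied by passing `state` in; then refresh and compare."""
    refreshed = refresh(state, environment)
    actions = {"add": [], "change": [], "destroy": []}
    # [compare] desired configuration against the refreshed state
    for addr, desired in config.items():
        if addr not in refreshed:
            actions["add"].append(addr)
        elif refreshed[addr] != desired:
            actions["change"].append(addr)
    for addr in refreshed:
        if addr not in config:
            actions["destroy"].append(addr)
    return actions


def announce(actions: dict) -> str:
    """Format the counts in the style of the plan summary line."""
    return (
        f"Plan: {len(actions['add'])} to add, "
        f"{len(actions['change'])} to change, "
        f"{len(actions['destroy'])} to destroy."
    )


# Hypothetical data: the config asks for port 8080, state and environment have 0.
config = {"example_neg.collector_neg": {"default_port": 8080}}
state = {"example_neg.collector_neg": {"default_port": 0}}
environment = {"example_neg.collector_neg": {"default_port": 0}}

actions = plan(config, state, environment)
print(announce(actions))
```

In this toy run the only difference is one attribute on an object that exists everywhere, so the result is a single change and nothing to add or destroy.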

If the plan looks right, you deploy the config by running `apply`, which first changes the environment and then the statefile

[add/update/destroy] environment resources
[add/update/destroy] the statefile accordingly
⟹ resolves drift between config and environment
this is how TF knows what needs to be changed, added or removed
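
The ordering described above (environment first, statefile second) can be sketched with the same kind of toy model. Again, this is an illustration under simplifying assumptions and not Terraform's code: everything is a dictionary keyed by a made-up address, and `toy_apply` is a hypothetical name.

```python
def toy_apply(config: dict, state: dict, environment: dict) -> dict:
    """Change the environment to match config, then return the new state."""
    # Refresh: keep only managed objects that still exist, with live values.
    refreshed = {a: dict(environment[a]) for a in state if a in environment}

    # Step 1: [add/update/destroy] environment resources
    for addr, desired in config.items():
        if refreshed.get(addr) != desired:
            environment[addr] = dict(desired)  # add or update
    for addr in refreshed:
        if addr not in config:
            del environment[addr]  # destroy only what the state knows about

    # Step 2: [add/update/destroy] the state accordingly
    return {addr: dict(environment[addr]) for addr in config}


config = {"vm.a": {"size": 2}}
state = {"vm.a": {"size": 1}, "vm.b": {"size": 1}}
# vm.c exists in the environment but was never managed, so it is left alone.
environment = {"vm.a": {"size": 1}, "vm.b": {"size": 1}, "vm.c": {"size": 9}}

new_state = toy_apply(config, state, environment)
print(environment)
print(new_state)
```

Note that the unmanaged object survives: without an entry in the state, the toy model (like the mapping described earlier) has no way to know it should touch it.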

Scenario #1: New record added to configuration

Scenario #2: Drift with resource present in config and state, but missing from environment

Scenario #3: Removing resources from the configuration will destroy resources in the environment and delete data from the statefile

Scenario #4: Manually manipulating the state data either returns an error or adds an additional resource
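
The four scenarios can be approximated with a toy plan function. This is a sketch under simplifying assumptions, not real Terraform output: the addresses `vm.web` and `vm.db` are made up, and for Scenario #4 I assume the manual edit was deleting a record from the state by hand, which covers only the "additional resource" outcome and not the error case.

```python
def toy_plan(config: dict, state: dict, environment: dict) -> dict:
    """Toy plan: refresh the state from the environment, then diff against config."""
    refreshed = {a: dict(environment[a]) for a in state if a in environment}
    return {
        "add": sorted(a for a in config if a not in refreshed),
        "change": sorted(
            a for a in config if a in refreshed and refreshed[a] != config[a]
        ),
        "destroy": sorted(a for a in refreshed if a not in config),
    }


web = {"size": "small"}
db = {"size": "large"}

# Scenario #1: a new record is added to the configuration.
s1 = toy_plan(
    config={"vm.web": web, "vm.db": db},
    state={"vm.web": web},
    environment={"vm.web": web},
)

# Scenario #2: present in config and state, but missing from the environment.
s2 = toy_plan(config={"vm.web": web}, state={"vm.web": web}, environment={})

# Scenario #3: removed from the configuration, still in state and environment.
s3 = toy_plan(config={}, state={"vm.web": web}, environment={"vm.web": web})

# Scenario #4 (assumed hand edit): the record was deleted from the state,
# although the object still exists in the environment.
s4 = toy_plan(config={"vm.web": web}, state={}, environment={"vm.web": web})

for name, result in [("#1", s1), ("#2", s2), ("#3", s3), ("#4", s4)]:
    print(name, result)
```

In this model Scenarios #2 and #4 both end in an add, but for different reasons: in #2 the object is really gone, while in #4 it still exists and the plan would create a second one next to it.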

Written by Pavol Kutaj
