Create Cloud Composer environments


This page explains how to create a Cloud Composer environment.

Before you begin

Step 1. Create or choose an environment's service account

When you create an environment, you specify a service account. This service account is called the environment's service account. Your environment uses this service account to perform most operations.

The service account for your environment is not a user account. A service account is a special kind of account used by an application or a virtual machine (VM) instance, not a person.

You can't change the service account of your environment later.

If you don't have a service account for Cloud Composer environments in your project yet, create it.

See Create environments (Terraform) for an extended example of creating a service account for your environment in Terraform.

To create a new service account for your environment:

  1. Create a new service account as described in the Identity and Access Management documentation.

  2. Grant a role to it, as described in the Identity and Access Management documentation. The required role is Composer Worker (composer.worker).

  3. To access other resources in your Google Cloud project, grant extra permissions to access those resources to this service account. The Composer Worker (composer.worker) role provides this required set of permissions in most cases. Add extra permissions to this service account only when it's necessary for the operation of your DAGs.
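For example, the following gcloud commands sketch these steps. The account name, project ID, and display name are placeholders for illustration, not values from this guide:

# A minimal sketch, assuming a hypothetical account and project:
gcloud iam service-accounts create example-account \
    --display-name "Example environment service account"

gcloud projects add-iam-policy-binding example-project \
    --member "serviceAccount:example-account@example-project.iam.gserviceaccount.com" \
    --role "roles/composer.worker"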

Step 2. Basic setup

This step creates a Cloud Composer environment with default parameters in the specified location.

Console

  1. In the Google Cloud console, go to the Create environment page.

    Go to Create environment

  2. In the Name field, enter a name for your environment.

    The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.

  3. In the Location drop-down list, choose a location for your environment.

    A location is the region where the environment is located.

  4. In the Image version drop-down list, select a Cloud Composer image with the required version of Airflow.

  5. In the Service account drop-down list, select a service account for your environment.

    If you don't have a service account for your environment yet, see Create or choose an environment's service account.

gcloud

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version IMAGE_VERSION \
    --service-account "SERVICE_ACCOUNT"

Replace:

  • ENVIRONMENT_NAME with the name of the environment.

    The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.

  • LOCATION with the region for the environment.

    A location is the region where the environment is located.

  • SERVICE_ACCOUNT with the service account for your environment.

  • IMAGE_VERSION with the name of a Cloud Composer image.

Example:

gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "
example-account@example-project.iam.gserviceaccount.com
"

API

Construct an environments.create API request. Specify the configuration in the Environment resource.

{
  "name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
  "config": {
    "softwareConfig": {
      "imageVersion": "IMAGE_VERSION"
    },
    "nodeConfig": {
      "serviceAccount": "SERVICE_ACCOUNT"
    }
  }
}

Replace:

  • PROJECT_ID with the Project ID.

  • LOCATION with the region for the environment.

    A location is the region where the environment is located.

  • ENVIRONMENT_NAME with the environment name.

    The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.

  • IMAGE_VERSION with the name of a Cloud Composer image.

  • SERVICE_ACCOUNT with the service account for your environment.

Example:

// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments

{
  "name": "projects/example-project/locations/us-central1/environments/example-environment",
  "config": {
    "softwareConfig": {
      "imageVersion": "composer-3-airflow-2.10.5-build.19"
    },
    "nodeConfig": {
      "serviceAccount": "
example-account@example-project.iam.gserviceaccount.com
"
    }
  }
}
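
One way to send this request is with curl, using your gcloud credentials. This is a sketch that assumes the JSON body is saved in a file named request.json:

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    -d @request.json \
    "https://composer.googleapis.com/v1/projects/example-project/locations/us-central1/environments"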

Terraform

To create an environment with default parameters in a specified location, add the following resource block to your Terraform configuration and run terraform apply.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {
    software_config {
      image_version = "IMAGE_VERSION"
    }
    node_config {
      service_account = "SERVICE_ACCOUNT"
    }
  }
}

Replace:

  • ENVIRONMENT_NAME with the name of the environment.

    The name must start with a lowercase letter followed by up to 62 lowercase letters, numbers, or hyphens, and can't end with a hyphen. The environment name is used to create subcomponents for the environment, so you must provide a name that is also valid as a Cloud Storage bucket name. See Bucket naming guidelines for a list of restrictions.

  • LOCATION with the region for the environment.

    A location is the region where the environment is located.

  • IMAGE_VERSION with the name of a Cloud Composer image.

  • SERVICE_ACCOUNT with the service account for your environment.

Example:

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "example-environment"
  region = "us-central1"

  config {
    software_config {
      image_version = "composer-3-airflow-2.10.5-build.19"
    }
    node_config {
      service_account = "
example-account@example-project.iam.gserviceaccount.com
"
    }
  }
}
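
To apply the configuration, run the usual Terraform workflow from the directory that contains it:

terraform init
terraform plan
terraform apply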

Step 3. (Optional) Configure environment scale and performance parameters

To specify the scale and performance configuration for your environment, select the environment size and workloads configuration.

You can change all performance and scale parameters after you create an environment.
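
For example, the following gcloud sketch updates two of these parameters on an existing environment; the flags are described in the gcloud section later in this step:

gcloud composer environments update example-environment \
    --location us-central1 \
    --scheduler-count 2 \
    --max-workers 6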

The following parameters control the scale and performance:

  • Environment size. Controls the performance parameters of the managed Cloud Composer infrastructure that includes the Airflow database. Consider selecting a larger environment size if you want to run a large number of DAGs and tasks with higher infrastructure performance. For example, a larger environment size increases the amount of Airflow task log entries that your environment can process with minimal delay.

  • Workloads configuration. Controls the scale and performance of Airflow components that run in a GKE cluster of your environment.

    • Airflow scheduler. Parses DAG definition files, schedules DAG runs based on the schedule interval, and queues tasks for execution by Airflow workers.

      Your environment can run more than one Airflow scheduler at the same time. Use multiple schedulers to distribute load between several scheduler instances for better performance and reliability.

      Increasing the number of schedulers does not always improve Airflow performance. For example, having only one scheduler might provide better performance than having two. This might happen when the extra scheduler is not utilized, and thus consumes resources of your environment without contributing to overall performance. The actual scheduler performance depends on the number of Airflow workers, the number of DAGs and tasks that run in your environment, and the configuration of both Airflow and the environment.

      We recommend starting with two schedulers and then monitoring the performance of your environment. If you change the number of schedulers, you can always scale your environment back to the original number of schedulers.

      For more information about configuring multiple schedulers, see Airflow documentation.

    • Airflow triggerer. Asynchronously monitors all deferred tasks in your environment. If you have at least one triggerer instance in your environment (or at least two in highly resilient environments), you can use deferrable operators in your DAGs.

      In Cloud Composer 3, the Airflow triggerer is enabled by default. If you want to create an environment without a triggerer, set the number of triggerers to zero.

    • Airflow DAG processor. Processes DAG files and turns them into DAG objects. In Cloud Composer 3, this part of the scheduler runs as a separate environment component.

    • Airflow web server. Runs the Airflow web interface where you can monitor, manage, and visualize your DAGs.

    • Airflow workers. Execute tasks that are scheduled by Airflow schedulers. The number of workers in your environment scales dynamically between the configured minimum and maximum, depending on the number of tasks in the queue.

Console

You can select a preset for your environment. When you select a preset, the scale and performance parameters for that preset are automatically selected. You also have an option to select a custom preset and specify all scale and performance parameters for your environment.

To select the scale and performance configuration for your environment, on the Create environment page:

  • To use predefined values, in the Environment resources section, click Small, Medium, or Large.

  • To specify custom values for the scale and performance parameters:

    1. In the Environment resources section, click Custom.

    2. In the Scheduler section, set the number of schedulers you want to use, and the resource allocation for their CPU, memory, and storage.

    3. In the Triggerer section, use the Number of triggerers field to enter the number of triggerers in your environment.

      If you don't want to use deferrable operators in your DAGs, set the number of triggerers to zero.

      If you set at least one triggerer for your environment, use the CPU and Memory fields to configure resource allocation for your triggerers.

    4. In the DAG processor section, specify the number of DAG processors in your environment and the amount of CPUs, memory, and storage for each DAG processor.

      Highly resilient environments require at least two DAG processors.

    5. In the Web server section, specify the amount of CPUs, memory, and storage for the web server.

    6. In the Worker section, specify:

      • The minimum and maximum number of workers for autoscaling limits in your environment.
      • The CPU, memory, and storage allocation for your workers.
    7. In the Core infrastructure section, in the Environment size drop-down list, select the environment size.

gcloud

When you create an environment, the following arguments control the scale and performance parameters of your environment.

  • --environment-size specifies the environment size.
  • --scheduler-count specifies the number of schedulers.
  • --scheduler-cpu specifies the number of CPUs for an Airflow scheduler.
  • --scheduler-memory specifies the amount of memory for an Airflow scheduler.
  • --scheduler-storage specifies the amount of disk space for an Airflow scheduler.

  • --triggerer-count specifies the number of Airflow triggerers in your environment. The default value for this flag is 0. You need triggerers if you want to use deferrable operators in your DAGs.

    • For standard resilience environments, use a value between 0 and 10.
    • For highly resilient environments, use 0 or a value between 2 and 10.
  • --triggerer-cpu specifies the number of CPUs for an Airflow triggerer, in vCPU units. Allowed values: 0.5, 0.75, 1. The default value is 0.5.

  • --triggerer-memory specifies the amount of memory for an Airflow triggerer, in GB. The default value is 0.5.

    The minimum required memory is equal to the number of CPUs allocated for the triggerers. The maximum allowed value is equal to the number of triggerer CPUs multiplied by 6.5.

    For example, if you set the --triggerer-cpu flag to 1, the minimum value for --triggerer-memory is 1 and the maximum value is 6.5.

  • --dag-processor-count specifies the number of DAG processors in your environment.

    Highly resilient environments require at least two DAG processors.

  • --dag-processor-cpu specifies the number of CPUs for the DAG processor.

  • --dag-processor-memory specifies the amount of memory for the DAG processor.

  • --dag-processor-storage specifies the amount of disk space for the DAG processor.

  • --web-server-cpu specifies the number of CPUs for the Airflow web server.

  • --web-server-memory specifies the amount of memory for the Airflow web server.

  • --web-server-storage specifies the amount of disk space for the Airflow web server.

  • --worker-cpu specifies the number of CPUs for an Airflow worker.

  • --worker-memory specifies the amount of memory for an Airflow worker.

  • --worker-storage specifies the amount of disk space for an Airflow worker.

  • --min-workers specifies the minimum number of Airflow workers. Your environment's cluster runs at least this number of workers.

  • --max-workers specifies the maximum number of Airflow workers. Your environment's cluster runs at most this number of workers.

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "SERVICE_ACCOUNT" \
    --environment-size ENVIRONMENT_SIZE \
    --scheduler-count SCHEDULER_COUNT \
    --scheduler-cpu SCHEDULER_CPU \
    --scheduler-memory SCHEDULER_MEMORY \
    --scheduler-storage SCHEDULER_STORAGE \
    --triggerer-count TRIGGERER_COUNT \
    --triggerer-cpu TRIGGERER_CPU \
    --triggerer-memory TRIGGERER_MEMORY \
    --dag-processor-count DAG_PROCESSOR_COUNT \
    --dag-processor-cpu DAG_PROCESSOR_CPU \
    --dag-processor-memory DAG_PROCESSOR_MEMORY \
    --dag-processor-storage DAG_PROCESSOR_STORAGE \
    --web-server-cpu WEB_SERVER_CPU \
    --web-server-memory WEB_SERVER_MEMORY \
    --web-server-storage WEB_SERVER_STORAGE \
    --worker-cpu WORKER_CPU \
    --worker-memory WORKER_MEMORY \
    --worker-storage WORKER_STORAGE \
    --min-workers WORKERS_MIN \
    --max-workers WORKERS_MAX

Replace:

  • ENVIRONMENT_SIZE with small, medium, or large.
  • SCHEDULER_COUNT with the number of schedulers.
  • SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
  • SCHEDULER_MEMORY with the amount of memory for a scheduler.
  • SCHEDULER_STORAGE with the disk size for a scheduler.
  • TRIGGERER_COUNT with the number of triggerers.
  • TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
  • TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.

  • DAG_PROCESSOR_COUNT with the number of DAG processors.

  • DAG_PROCESSOR_CPU with the number of CPUs for the DAG processor.

  • DAG_PROCESSOR_MEMORY with the amount of memory for the DAG processor.

  • DAG_PROCESSOR_STORAGE with the amount of disk space for the DAG processor.

  • WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.

  • WEB_SERVER_MEMORY with the amount of memory for the web server.

  • WEB_SERVER_STORAGE with the disk size for the web server.

  • WORKER_CPU with the number of CPUs for a worker, in vCPU units.

  • WORKER_MEMORY with the amount of memory for a worker.

  • WORKER_STORAGE with the disk size for a worker.

  • WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.

  • WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.

Example:

gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "
example-account@example-project.iam.gserviceaccount.com
" \
    --environment-size small \
    --scheduler-count 1 \
    --scheduler-cpu 0.5 \
    --scheduler-memory 2.5GB \
    --scheduler-storage 2GB \
    --triggerer-count 1 \
    --triggerer-cpu 0.5 \
    --triggerer-memory 0.5GB \
    --dag-processor-count 1 \
    --dag-processor-cpu 0.5 \
    --dag-processor-memory 2GB \
    --dag-processor-storage 1GB \
    --web-server-cpu 1 \
    --web-server-memory 2.5GB \
    --web-server-storage 2GB \
    --worker-cpu 1 \
    --worker-memory 2GB \
    --worker-storage 2GB \
    --min-workers 2 \
    --max-workers 4
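
After creation, you can optionally inspect the resulting configuration, for example:

gcloud composer environments describe example-environment \
    --location us-central1 \
    --format="yaml(config.workloadsConfig)"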

API

When you create an environment, in the Environment > EnvironmentConfig > WorkloadsConfig resource, specify environment scale and performance parameters.

{
  "name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
  "config": {
    "workloadsConfig": {
      "scheduler": {
        "cpu": SCHEDULER_CPU,
        "memoryGb": SCHEDULER_MEMORY,
        "storageGb": SCHEDULER_STORAGE,
        "count": SCHEDULER_COUNT
      },
      "triggerer": {
        "count": TRIGGERER_COUNT,
        "cpu": TRIGGERER_CPU,
        "memoryGb": TRIGGERER_MEMORY
      },
      "dagProcessor": {
        "count": DAG_PROCESSOR_COUNT,
        "cpu": DAG_PROCESSOR_CPU,
        "memoryGb": DAG_PROCESSOR_MEMORY,
        "storageGb": DAG_PROCESSOR_STORAGE
      },
      "webServer": {
        "cpu": WEB_SERVER_CPU,
        "memoryGb": WEB_SERVER_MEMORY,
        "storageGb": WEB_SERVER_STORAGE
      },
      "worker": {
        "cpu": WORKER_CPU,
        "memoryGb": WORKER_MEMORY,
        "storageGb": WORKER_STORAGE,
        "minCount": WORKERS_MIN,
        "maxCount": WORKERS_MAX
      }
    },
    "environmentSize": "ENVIRONMENT_SIZE",
    "nodeConfig": {
      "serviceAccount": "SERVICE_ACCOUNT"
    }
  }
}

Replace:

  • SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
  • SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
  • SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
  • SCHEDULER_COUNT with the number of schedulers.

  • TRIGGERER_COUNT with the number of triggerers. The default value is 0. You need triggerers if you want to use deferrable operators in your DAGs.

    • For standard resilience environments, use a value between 0 and 10.
    • For highly resilient environments, use 0 or a value between 2 and 10.

    If you use at least one triggerer, you must also specify the TRIGGERER_CPU and TRIGGERER_MEMORY parameters:

  • TRIGGERER_CPU specifies the number of CPUs for a triggerer, in vCPU units. Allowed values: 0.5, 0.75, 1.

  • TRIGGERER_MEMORY configures the amount of memory for a triggerer. The minimum required memory is equal to the number of CPUs allocated for the triggerers. The maximum allowed value is equal to the number of triggerer CPUs multiplied by 6.5.

    For example, if you set the TRIGGERER_CPU to 1, the minimum value for TRIGGERER_MEMORY is 1 and the maximum value is 6.5.

  • DAG_PROCESSOR_COUNT with the number of DAG processors.

    Highly resilient environments require at least two DAG processors.

  • DAG_PROCESSOR_CPU with the number of CPUs for the DAG processor, in vCPU units.

  • DAG_PROCESSOR_MEMORY with the amount of memory for the DAG processor, in GB.

  • DAG_PROCESSOR_STORAGE with the amount of disk space for the DAG processor, in GB.

  • WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.

  • WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.

  • WEB_SERVER_STORAGE with the disk size for the web server, in GB.

  • WORKER_CPU with the number of CPUs for a worker, in vCPU units.

  • WORKER_MEMORY with the amount of memory for a worker, in GB.

  • WORKER_STORAGE with the disk size for a worker, in GB.

  • WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.

  • WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.

  • ENVIRONMENT_SIZE with the environment size, ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.

Example:

// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments

{
  "name": "projects/example-project/locations/us-central1/environments/example-environment",
  "config": {
    "workloadsConfig": {
      "scheduler": {
        "cpu": 2.5,
        "memoryGb": 2.5,
        "storageGb": 2,
        "count": 1
      },
      "triggerer": {
        "cpu": 0.5,
        "memoryGb": 0.5,
        "count": 1
      },
      "dagProcessor": {
        "count": 1,
        "cpu": 0.5,
        "memoryGb": 2,
        "storageGb": 1
      },
      "webServer": {
        "cpu": 1,
        "memoryGb": 2.5,
        "storageGb": 2
      },
      "worker": {
        "cpu": 1,
        "memoryGb": 2,
        "storageGb": 2,
        "minCount": 2,
        "maxCount": 4
      }
    },
    "environmentSize": "ENVIRONMENT_SIZE_SMALL",
    "nodeConfig": {
      "serviceAccount": "
example-account@example-project.iam.gserviceaccount.com
"
    }
  }
}

Terraform

When you create an environment, the following fields control the scale and performance parameters of your environment.

  • In the config block:

    • The environment_size field controls the environment size.
  • In the workloads_config block:

    • The scheduler.cpu field specifies the number of CPUs for an Airflow scheduler.
    • The scheduler.memory_gb field specifies the amount of memory for an Airflow scheduler.
    • The scheduler.storage_gb field specifies the amount of disk space for a scheduler.
    • The scheduler.count field specifies the number of schedulers in your environment.
    • The triggerer.cpu field specifies the number of CPUs for an Airflow triggerer.
    • The triggerer.memory_gb field specifies the amount of memory for an Airflow triggerer.
    • The triggerer.count field specifies the number of triggerers in your environment.

    • The dag_processor.cpu field specifies the number of CPUs for a DAG processor.

    • The dag_processor.memory_gb field specifies the amount of memory for a DAG processor.

    • The dag_processor.storage_gb field specifies the amount of disk space for a DAG processor.

    • The dag_processor.count field specifies the number of DAG processors.

      Highly resilient environments require at least two DAG processors.

    • The web_server.cpu field specifies the number of CPUs for the Airflow web server.

    • The web_server.memory_gb field specifies the amount of memory for the Airflow web server.

    • The web_server.storage_gb field specifies the amount of disk space for the Airflow web server.

    • The worker.cpu field specifies the number of CPUs for an Airflow worker.

    • The worker.memory_gb field specifies the amount of memory for an Airflow worker.

    • The worker.storage_gb field specifies the amount of disk space for an Airflow worker.

    • The worker.min_count field specifies the minimum number of workers in your environment.

    • The worker.max_count field specifies the maximum number of workers in your environment.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {

    workloads_config {

      scheduler {
        cpu = SCHEDULER_CPU
        memory_gb = SCHEDULER_MEMORY
        storage_gb = SCHEDULER_STORAGE
        count = SCHEDULER_COUNT
      }
      triggerer {
        count = TRIGGERER_COUNT
        cpu = TRIGGERER_CPU
        memory_gb = TRIGGERER_MEMORY
      }
      dag_processor {
        cpu = DAG_PROCESSOR_CPU
        memory_gb = DAG_PROCESSOR_MEMORY
        storage_gb = DAG_PROCESSOR_STORAGE
        count = DAG_PROCESSOR_COUNT
      }
      web_server {
        cpu = WEB_SERVER_CPU
        memory_gb = WEB_SERVER_MEMORY
        storage_gb = WEB_SERVER_STORAGE
      }
      worker {
        cpu = WORKER_CPU
        memory_gb = WORKER_MEMORY
        storage_gb = WORKER_STORAGE
        min_count = WORKERS_MIN
        max_count = WORKERS_MAX
      }
    }

    environment_size = "ENVIRONMENT_SIZE"

    node_config {
      service_account = "SERVICE_ACCOUNT"
    }
  }
}

Replace:

  • ENVIRONMENT_NAME with the name of the environment.
  • LOCATION with the region where the environment is located.
  • SERVICE_ACCOUNT with the service account for your environment.
  • SCHEDULER_CPU with the number of CPUs for a scheduler, in vCPU units.
  • SCHEDULER_MEMORY with the amount of memory for a scheduler, in GB.
  • SCHEDULER_STORAGE with the disk size for a scheduler, in GB.
  • SCHEDULER_COUNT with the number of schedulers.
  • TRIGGERER_COUNT with the number of triggerers.
  • TRIGGERER_CPU with the number of CPUs for a triggerer, in vCPU units.
  • TRIGGERER_MEMORY with the amount of memory for a triggerer, in GB.

  • DAG_PROCESSOR_CPU with the number of CPUs for the DAG processor, in vCPU units.

  • DAG_PROCESSOR_MEMORY with the amount of memory for the DAG processor, in GB.

  • DAG_PROCESSOR_STORAGE with the amount of disk space for the DAG processor, in GB.

  • DAG_PROCESSOR_COUNT with the number of DAG processors.

  • WEB_SERVER_CPU with the number of CPUs for the web server, in vCPU units.

  • WEB_SERVER_MEMORY with the amount of memory for the web server, in GB.

  • WEB_SERVER_STORAGE with the disk size for the web server, in GB.

  • WORKER_CPU with the number of CPUs for a worker, in vCPU units.

  • WORKER_MEMORY with the amount of memory for a worker, in GB.

  • WORKER_STORAGE with the disk size for a worker, in GB.

  • WORKERS_MIN with the minimum number of Airflow workers that your environment can run. The number of workers in your environment does not go below this number, even if a lower number of workers can handle the load.

  • WORKERS_MAX with the maximum number of Airflow workers that your environment can run. The number of workers in your environment does not go above this number, even if a higher number of workers is required to handle the load.

  • ENVIRONMENT_SIZE with the environment size, ENVIRONMENT_SIZE_SMALL, ENVIRONMENT_SIZE_MEDIUM, or ENVIRONMENT_SIZE_LARGE.

Example:

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "example-environment"
  region = "us-central1"

  config {

    workloads_config {

      scheduler {
        cpu = 2.5
        memory_gb = 2.5
        storage_gb = 2
        count = 1
      }
      triggerer {
        count = 1
        cpu = 0.5
        memory_gb = 0.5
      }
      dag_processor {
        cpu = 1
        memory_gb = 2
        storage_gb = 1
        count = 1
      }
      web_server {
        cpu = 1
        memory_gb = 2.5
        storage_gb = 2
      }
      worker {
        cpu = 1
        memory_gb = 2
        storage_gb = 2
        min_count = 2
        max_count = 4
      }
    }

    environment_size = "ENVIRONMENT_SIZE_SMALL"

    node_config {
      service_account = "
example-account@example-project.iam.gserviceaccount.com
"
    }

  }
}

Step 4. (Optional) Enable high resilience mode

Highly resilient (Highly Available) Cloud Composer environments are environments that use built-in redundancy and failover mechanisms that reduce the environment's susceptibility to zonal failures and single point of failure outages.

In Cloud Composer 3, highly resilient environments are available starting from Airflow builds composer-3-airflow-2.10.2-build.13 and composer-3-airflow-2.9.3-build.20.

A highly resilient environment is multi-zonal and runs across at least two zones of a selected region, with its redundant components distributed between those zones.

The minimum number of workers is set to two, and your environment's cluster distributes worker instances between zones. In case of a zonal outage, affected worker instances are rescheduled in a different zone. The Cloud SQL component of a highly resilient environment has a primary instance and a standby instance that are distributed between zones.

Console

On the Create environment page:

  1. In the Resilience mode section, select High resilience.

  2. In the Environment resources section, select scale parameters for a highly resilient environment. Highly resilient environments require exactly two schedulers, zero or between two and ten triggerers, and at least two workers:

    1. Click Custom.

    2. In the Number of schedulers drop-down list, select 2.

    3. In the Number of triggerers drop-down list, select 0, or a value between 2 and 10. Configure the CPU and Memory allocation for your triggerers.

    4. In the Minimum number of workers drop-down list, select 2 or more, depending on the required number of workers.

  3. In the Network configuration section:

    1. In the Networking type section, select Private IP environment.

    2. If required, specify other networking parameters.

gcloud

When you create an environment, the --enable-high-resilience argument enables the high resilience mode.

Set the following arguments:

  • --enable-high-resilience
  • --enable-private-environment, and other networking parameters for a Private IP environment, if required
  • --scheduler-count to 2
  • --triggerer-count to 0 or a value between 2 and 10. If you use triggerers, the --triggerer-cpu and --triggerer-memory flags are also required for environment creation.

    For more information about --triggerer-count, --triggerer-cpu, and --triggerer-memory flags, see Configure environment scale and performance parameters.

  • --min-workers to 2 or more

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "SERVICE_ACCOUNT" \
    --enable-high-resilience \
    --enable-private-environment \
    --scheduler-count 2 \
    --triggerer-count 2 \
    --triggerer-cpu 0.5 \
    --triggerer-memory 0.5 \
    --min-workers 2
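
To check that the mode is set on the created environment, you can, for example, read its resilience mode field:

gcloud composer environments describe example-environment \
    --location us-central1 \
    --format="value(config.resilienceMode)"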

API

When you create an environment, in the Environment > EnvironmentConfig resource, enable the high resilience mode.

{
  "name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
  "config": {
    "resilience_mode": "HIGH_RESILIENCE",
    "nodeConfig": {
      "serviceAccount": "SERVICE_ACCOUNT"
    }

  }
}

Example:


// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments

{
  "name": "projects/example-project/locations/us-central1/environments/example-environment",
  "config": {
    "resilience_mode": "HIGH_RESILIENCE",
    "nodeConfig": {
      "serviceAccount": "
example-account@example-project.iam.gserviceaccount.com
"
    }

  }
}

Terraform

When you create an environment, the resilience_mode field in the config block enables the high resilience mode.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {

    resilience_mode = "HIGH_RESILIENCE"

    node_config {
      service_account = "SERVICE_ACCOUNT"
    }

  }
}

Example:

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "example-environment"
  region = "us-central1"

  config {

    resilience_mode = "HIGH_RESILIENCE"

    node_config {
      service_account = "
example-account@example-project.iam.gserviceaccount.com
"
    }
  }
}

Step 5. (Optional) Specify a zone for the environment's database

You can specify a preferred Cloud SQL zone when creating a standard resilience environment.

Console

On the Create environment page:

  1. In the Advanced configuration section, expand the Show advanced configuration item.

  2. In the Airflow database zone list, select a preferred Cloud SQL zone.

gcloud

When you create an environment, the --cloud-sql-preferred-zone argument specifies a preferred Cloud SQL zone.

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "SERVICE_ACCOUNT" \
    --cloud-sql-preferred-zone SQL_ZONE

Replace the following:

  • SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.

Example:

gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "
example-account@example-project.iam.gserviceaccount.com
" \
    --cloud-sql-preferred-zone us-central1-a
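
The specified zone must be in the environment's region. For example, to list the zones available in us-central1:

gcloud compute zones list --filter="region:us-central1"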

API

When you create an environment, in the Environment > DatabaseConfig resource, specify the preferred Cloud SQL zone.

{
  "name": "projects/PROJECT_ID/locations/LOCATION/environments/ENVIRONMENT_NAME",
  "config": {
    "databaseConfig": {
      "zone": "SQL_ZONE"
    },
      "nodeConfig": {
      "serviceAccount": "SERVICE_ACCOUNT"
    }
  }
}

Replace the following:

  • SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.

Example:


// POST https://composer.googleapis.com/v1/{parent=projects/*/locations/*}/environments

{
  "name": "projects/example-project/locations/us-central1/environments/example-environment",
  "config": {
    "databaseConfig": {
      "zone": "us-central1-a"
    },
    "nodeConfig": {
      "serviceAccount": "
example-account@example-project.iam.gserviceaccount.com
"
    }
  }
}

Terraform

When you create an environment, the zone field in the database_config block specifies the preferred Cloud SQL zone.

resource "google_composer_environment" "example" {
  provider = google-beta
  name = "ENVIRONMENT_NAME"
  region = "LOCATION"

  config {
    database_config {
      zone = "SQL_ZONE"
    }

    node_config {
      service_account = "SERVICE_ACCOUNT"
    }
  }
}

Replace the following:

  • SQL_ZONE: preferred Cloud SQL zone. This zone must be located in the region where the environment is located.

Step 6. (Optional) Configure your environment's networking

You can configure Cloud Composer 3 networking in several ways, such as connecting your environment to a VPC network, creating a Private IP environment, or adding network tags.

Console

  1. Make sure that your networking is configured for the type of environment that you want to create.

  2. In the Network configuration section, expand the Show network configuration item.

  3. If you want to connect your environment to a VPC network, in the Network attachment field, select a network attachment. You can also create a new network attachment. For more information, see Connect an environment to a VPC network.

  4. If you want to create a Private IP environment, in the Networking type section, select the Private IP environment option.

  5. If you want to add network tags, see Add network tags for more information.

gcloud

Make sure that your networking is configured for the type of environment that you want to create.

When you create an environment, the following arguments control the networking parameters. If you omit a parameter, the default value is used.

  • --enable-private-environment enables a Private IP environment.

  • --network specifies your VPC network ID.

  • --subnetwork specifies your VPC subnetwork ID.

Example (Private IP environment with a connected VPC network)

gcloud composer environments create ENVIRONMENT_NAME \
    --location LOCATION \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "SERVICE_ACCOUNT" \
    --enable-private-environment \
    --network NETWORK_ID \
    --subnetwork SUBNETWORK_ID

Replace:

  • NETWORK_ID with your VPC network ID.
  • SUBNETWORK_ID with your VPC subnetwork ID.
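
For example, a complete command for a Private IP environment connected to a VPC network might look like the following. The network and subnetwork names are placeholders:

gcloud composer environments create example-environment \
    --location us-central1 \
    --image-version composer-3-airflow-2.10.5-build.19 \
    --service-account "example-account@example-project.iam.gserviceaccount.com" \
    --enable-private-environment \
    --network projects/example-project/global/networks/example-network \
    --subnetwork projects/example-project/regions/us-central1/subnetworks/example-subnetwork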