
Shipping Lambda Logs to OpenSearch Across AWS Accounts with Terraform

Multi-account log centralization is table stakes for any platform team. If your Lambda functions live in one AWS account and your observability tooling in another, you need a production-grade pipeline that ships logs across account boundaries without compromising security. Here is the complete Terraform setup.

Every mature AWS organization eventually separates workloads from shared services into distinct accounts. Compute in Account A, OpenSearch in Account B. The challenge is getting CloudWatch logs to cross that account boundary reliably, with least-privilege IAM and no hardcoded credentials anywhere.

The canonical pattern is: Lambda writes structured logs to CloudWatch, a subscription filter forwards them to Kinesis Data Firehose via a cross-account destination, and Firehose delivers to OpenSearch. Kinesis provides buffering, retry, and a dead-letter path that direct Lambda-to-OpenSearch calls do not.
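None of the pipeline machinery is visible to the function itself; it only has to emit one JSON object per log line so the downstream filter and index have structure to work with. A minimal sketch of what `src/handler.py` (packaged by the Terraform below) could look like; the field names are illustrative assumptions, not a fixed schema:

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Render every record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            # Lambda sets AWS_LAMBDA_FUNCTION_NAME in the runtime environment.
            "function_name": os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "local"),
            "message": record.getMessage(),
        })


logger = logging.getLogger("my-service")
_handler = logging.StreamHandler()
_handler.setFormatter(JsonFormatter())
logger.addHandler(_handler)
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))


def lambda_handler(event, context):
    # One structured line per event; CloudWatch stores it verbatim, so a
    # JSON filter pattern like { $.level = "ERROR" } can match on it later.
    logger.info("request received")
    return {"statusCode": 200}
```

Keeping the schema flat (no nested objects) makes the later OpenSearch index mapping and Dashboards fields much simpler.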

Architecture

flowchart LR
    subgraph AccountA["Account A (Workloads)"]
        Lambda["Lambda Function\nPython 3.12"]
        CWL["CloudWatch\nLog Group\n14-day retention"]
        Sub["Subscription Filter"]
        Dest["CW Logs\nDestination ARN"]
        Lambda --> CWL
        CWL --> Sub
        Sub --> Dest
    end

    subgraph AccountB["Account B (Shared Services)"]
        Firehose["Kinesis Firehose\nbuffer: 5MB / 300s"]
        OS["OpenSearch Domain\n2.11"]
        DLQ["S3 Bucket\nFailed records"]
        Firehose --> OS
        Firehose --> DLQ
    end

    Dest -->|"cross-account IAM role"| Firehose
Note

The CloudWatch Logs destination is a resource in Account B that accepts log data from Account A. It is the bridge between subscription filters and Kinesis. Account A never directly writes to OpenSearch.
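For debugging it helps to know what actually crosses that boundary: each record the destination forwards is a gzip-compressed JSON envelope carrying messageType, owner, logGroup, logStream, and a logEvents array. A small sketch that unwraps one such record, exercised against a synthetic payload (all values made up):

```python
import base64
import gzip
import json


def unwrap_subscription_record(data_b64: str) -> list[dict]:
    """Decode one CloudWatch Logs subscription record:
    base64 -> gzip -> JSON envelope -> list of log events."""
    envelope = json.loads(gzip.decompress(base64.b64decode(data_b64)))
    if envelope.get("messageType") != "DATA_MESSAGE":
        # CloudWatch also sends CONTROL_MESSAGE records, e.g. when the
        # subscription is first established; they carry no log data.
        return []
    return envelope["logEvents"]


# Synthetic payload in the shape CloudWatch Logs produces.
_envelope = {
    "messageType": "DATA_MESSAGE",
    "owner": "111111111111",
    "logGroup": "/aws/lambda/my-service",
    "logStream": "2024/01/01/[$LATEST]abcdef",
    "logEvents": [
        {"id": "1", "timestamp": 1700000000000,
         "message": "{\"level\": \"ERROR\", \"message\": \"boom\"}"},
    ],
}
_record = base64.b64encode(gzip.compress(json.dumps(_envelope).encode())).decode()

events = unwrap_subscription_record(_record)
```

Whether these envelopes are unwrapped before indexing depends on your Firehose configuration (record decompression or a processing step), so inspect the first documents that land in OpenSearch to confirm they have the shape your dashboards expect.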

Account A: Lambda, Log Group, Subscription Filter

account_a/lambda.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "eu-central-1"
}

resource "aws_iam_role" "lambda_exec" {
  name = "lambda-exec-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_lambda_function" "app" {
  function_name    = "my-service"
  filename         = data.archive_file.lambda_zip.output_path
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  handler          = "handler.lambda_handler"
  runtime          = "python3.12"
  role             = aws_iam_role.lambda_exec.arn

  environment {
    variables = {
      LOG_LEVEL = "INFO"
    }
  }
}

data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/src/handler.py"
  output_path = "${path.module}/build/handler.zip"
}

resource "aws_cloudwatch_log_group" "app" {
  name              = "/aws/lambda/${aws_lambda_function.app.function_name}"
  retention_in_days = 14
}

resource "aws_cloudwatch_log_subscription_filter" "to_opensearch" {
  name            = "forward-to-opensearch"
  log_group_name  = aws_cloudwatch_log_group.app.name
  filter_pattern  = ""  # empty string = forward all events
  destination_arn = var.cloudwatch_destination_arn  # Account B destination ARN

  depends_on = [aws_cloudwatch_log_group.app]
}

The cloudwatch_destination_arn variable holds the ARN of the CloudWatch Logs destination created in Account B (shown below). An empty filter_pattern forwards everything; use "ERROR" or a JSON filter like { $.level = "ERROR" } to reduce volume.

Account B: OpenSearch Domain and Cross-Account Delivery

account_b/opensearch.tf
provider "aws" {
  region = "eu-central-1"
}

data "aws_caller_identity" "b" {}

resource "aws_opensearch_domain" "logs" {
  domain_name    = "platform-logs"
  engine_version = "OpenSearch_2.11"

  cluster_config {
    instance_type  = "t3.small.search"
    instance_count = 1
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 20
    volume_type = "gp3"
  }

  encrypt_at_rest { enabled = true }
  node_to_node_encryption { enabled = true }

  domain_endpoint_options {
    enforce_https       = true
    tls_security_policy = "Policy-Min-TLS-1-2-2019-07"
  }

  tags = { Environment = "shared" }
}

# Scoped down: the write actions Firehose needs. Per the AWS docs, Firehose
# may additionally need es:ESHttpGet on a few cluster/index read paths.
resource "aws_opensearch_domain_policy" "firehose_access" {
  domain_name = aws_opensearch_domain.logs.domain_name

  access_policies = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = aws_iam_role.firehose_delivery.arn }
      Action    = ["es:ESHttpPut", "es:ESHttpPost"]
      Resource  = "${aws_opensearch_domain.logs.arn}/*"
    }]
  })
}

# Cross-account IAM role trusted by Account A's CloudWatch Logs service
resource "aws_iam_role" "cloudwatch_destination" {
  name = "cloudwatch-logs-cross-account-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "logs.amazonaws.com" }
      Action    = "sts:AssumeRole"
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = var.account_a_id
        }
      }
    }]
  })
}

resource "aws_iam_role_policy" "cloudwatch_destination_firehose" {
  name = "put-firehose-records"
  role = aws_iam_role.cloudwatch_destination.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["firehose:PutRecord", "firehose:PutRecordBatch"]
      Resource = aws_kinesis_firehose_delivery_stream.logs.arn
    }]
  })
}

resource "aws_cloudwatch_log_destination" "logs" {
  name       = "lambda-logs-to-opensearch"
  role_arn   = aws_iam_role.cloudwatch_destination.arn
  target_arn = aws_kinesis_firehose_delivery_stream.logs.arn
}

resource "aws_cloudwatch_log_destination_policy" "logs" {
  destination_name = aws_cloudwatch_log_destination.logs.name

  access_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::${var.account_a_id}:root" }
      Action    = "logs:PutSubscriptionFilter"
      Resource  = aws_cloudwatch_log_destination.logs.arn
    }]
  })
}

Kinesis Data Firehose: Buffered Delivery to OpenSearch

Direct CloudWatch-to-OpenSearch delivery exists but lacks retry logic and dead-letter handling. Firehose solves both.

account_b/firehose.tf
resource "aws_s3_bucket" "firehose_dlq" {
  bucket = "platform-logs-firehose-dlq-${data.aws_caller_identity.b.account_id}"
}

resource "aws_iam_role" "firehose_delivery" {
  name = "firehose-opensearch-delivery"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "firehose.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "firehose_delivery" {
  name = "firehose-delivery-policy"
  role = aws_iam_role.firehose_delivery.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["es:ESHttpPut", "es:ESHttpPost", "es:DescribeDomain"]
        Resource = [aws_opensearch_domain.logs.arn, "${aws_opensearch_domain.logs.arn}/*"]
      },
      {
        Effect   = "Allow"
        Action   = ["s3:PutObject", "s3:GetBucketLocation"]
        Resource = [aws_s3_bucket.firehose_dlq.arn, "${aws_s3_bucket.firehose_dlq.arn}/*"]
      }
    ]
  })
}

resource "aws_kinesis_firehose_delivery_stream" "logs" {
  name        = "lambda-logs-to-opensearch"
  destination = "opensearch"

  opensearch_configuration {
    domain_arn            = aws_opensearch_domain.logs.arn
    role_arn              = aws_iam_role.firehose_delivery.arn
    index_name            = "lambda-logs"
    index_rotation_period = "OneDay"

    buffering_interval = 300  # seconds
    buffering_size     = 5    # MB

    retry_duration = 300

    s3_backup_mode = "FailedDocumentsOnly"

    s3_configuration {
      role_arn   = aws_iam_role.firehose_delivery.arn
      bucket_arn = aws_s3_bucket.firehose_dlq.arn
      prefix     = "failed-logs/"
    }
  }
}
Important

Set buffering_interval and buffering_size based on your actual ingestion rate. The values above (300s / 5MB, also the service defaults) are reasonable for low-to-medium volume. For low-volume services the interval is what triggers the flush, so reduce it if five-minute-old data in dashboards is a problem; high-throughput services fill the size buffer long before the interval expires.
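Which condition fires first is simple arithmetic: the buffer flushes after whichever comes sooner, size divided by ingest rate or the interval. A quick sketch for sanity-checking your own numbers (the rates below are illustrative):

```python
def flush_latency_seconds(ingest_bytes_per_sec: float,
                          buffer_mb: float = 5,
                          interval_sec: float = 300) -> float:
    """Seconds until Firehose flushes the buffer: whichever of the
    size condition or the time condition is reached first."""
    time_to_fill = buffer_mb * 1024 * 1024 / ingest_bytes_per_sec
    return min(time_to_fill, interval_sec)


# ~1 MB/s fills 5 MB in ~5 s: the size condition triggers first.
chatty = flush_latency_seconds(1024 * 1024)

# 1 KB/s would need ~87 min to fill 5 MB, so the 300 s interval is
# what bounds dashboard latency for quiet services.
quiet = flush_latency_seconds(1024)
```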

CI/CD with OIDC (No Hardcoded Credentials)

The original version of this post stored AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY directly in CI variables. This is a security anti-pattern. GitHub Actions supports OIDC token exchange: the runner proves its identity to AWS and receives a short-lived role session, with no long-lived credentials stored anywhere.

account_b/github_oidc.tf
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

resource "aws_iam_role" "github_actions_terraform" {
  name = "github-actions-terraform"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:my-org/my-repo:*"
        }
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}
.github/workflows/terraform.yml
name: Terraform Apply

on:
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::ACCOUNT_B_ID:role/github-actions-terraform
          aws-region: eu-central-1

      - uses: hashicorp/setup-terraform@v3

      - run: terraform init
      - run: terraform validate
      - run: terraform plan
      - run: terraform apply -auto-approve
Warning

Pin the OIDC trust condition to a specific subject like repo:org/repo:ref:refs/heads/main rather than a wildcard. A wildcard lets any branch or fork assume the role.
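StringLike uses glob-style wildcards, so the difference between the two conditions is easy to demonstrate (approximating the matching with Python's fnmatch; the org/repo names and subjects are placeholders):

```python
from fnmatch import fnmatch

# Trust policy conditions: one wildcarded, one pinned to main.
wildcard_cond = "repo:my-org/my-repo:*"
pinned_cond = "repo:my-org/my-repo:ref:refs/heads/main"

# Subjects GitHub presents for different trigger contexts.
main_push = "repo:my-org/my-repo:ref:refs/heads/main"
feature_branch = "repo:my-org/my-repo:ref:refs/heads/feature-x"
pull_request = "repo:my-org/my-repo:pull_request"

# The wildcard accepts every subject from the repo, including
# pull_request runs; the pinned condition accepts only pushes to main.
wildcard_accepts = [s for s in (main_push, feature_branch, pull_request)
                    if fnmatch(s, wildcard_cond)]
pinned_accepts = [s for s in (main_push, feature_branch, pull_request)
                  if fnmatch(s, pinned_cond)]
```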

Using the Logs in OpenSearch Dashboards

Once data flows, create an index pattern matching lambda-logs-* in OpenSearch Dashboards. With index_rotation_period = "OneDay", Firehose appends the date to the index name, so a new index is created each day and the wildcard pattern picks each one up.

A useful starter query to find errors in the last hour grouped by function (run it in Dev Tools, since the Discover tab does not display aggregations):

{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "errors_per_function": {
      "terms": { "field": "function_name.keyword", "size": 10 }
    }
  }
}

Build a dashboard with four panels: error rate over time (line chart on level:ERROR), p99 duration (from the duration field Lambda emits in REPORT lines), cold start rate (filter on Init Duration present), and top error messages (terms aggregation on errorMessage.keyword).

Common mistakes and how to avoid them

Using aws_elasticsearch_domain instead of aws_opensearch_domain. The Elasticsearch resource was deprecated in the AWS Terraform provider 4.x release. It still works but creates OpenSearch resources with an Elasticsearch-compatible API and will eventually be removed. Use aws_opensearch_domain for all new code.

Hardcoding CI credentials. Storing AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in CI variables means those credentials live in your CI provider's secret store forever, rotate manually, and can appear in logs when masking is misconfigured. OIDC eliminates the long-lived credential entirely.

Over-permissive IAM (es:*). The Firehose delivery role needs only the scoped HTTP write actions (plus a handful of describe and read calls), not es:*, which also grants the ability to delete the domain, modify access policies, and trigger snapshots, none of which the delivery path requires.

No retention policy on the CloudWatch log group. Without retention_in_days, CloudWatch keeps logs forever and you pay for storage indefinitely. 14 days covers most incident investigation windows; adjust based on compliance requirements.

No Firehose buffer configuration. The default Firehose buffer is 300s or 5MB, whichever comes first. For very low-volume services this means 5-minute gaps in dashboards. Always set explicit values and test them against your expected ingestion rate.

Missing dependency between log group and subscription filter. If Terraform creates the subscription filter before the log group exists, the apply errors. Referencing aws_cloudwatch_log_group.app.name in the filter (as above) already gives Terraform an implicit dependency; if you hardcode the log group name as a string instead, add an explicit depends_on = [aws_cloudwatch_log_group.app].


If you want to go deeper on any of this, I offer 1:1 coaching sessions for engineers working on AI integration, cloud architecture, and platform engineering. Book a session (50 EUR / 60 min) or reach out at manuel.fedele+website@gmail.com.