Every mature AWS organization eventually separates workloads from shared services into distinct accounts. Compute in Account A, OpenSearch in Account B. The challenge is getting CloudWatch logs to cross that account boundary reliably, with least-privilege IAM and no hardcoded credentials anywhere.
The canonical pattern: Lambda writes structured logs to CloudWatch, a subscription filter forwards them to Kinesis Data Firehose via a cross-account destination, and Firehose delivers to OpenSearch. Firehose provides buffering, retries, and a dead-letter path that direct Lambda-to-OpenSearch calls do not.
## Architecture
```mermaid
flowchart LR
  subgraph AccountA["Account A (Workloads)"]
    Lambda["Lambda Function\nPython 3.12"]
    CWL["CloudWatch\nLog Group\n14-day retention"]
    Sub["Subscription Filter"]
    Dest["CW Logs\nDestination ARN"]
    Lambda --> CWL
    CWL --> Sub
    Sub --> Dest
  end
  subgraph AccountB["Account B (Shared Services)"]
    Firehose["Kinesis Firehose\nbuffer: 5MB / 300s"]
    OS["OpenSearch Domain\nOpenSearch 2.11"]
    DLQ["S3 Bucket\nFailed records"]
    Firehose --> OS
    Firehose --> DLQ
  end
  Dest -->|"cross-account IAM role"| Firehose
```
The CloudWatch Logs destination is a resource in Account B that accepts log data from Account A; it is the bridge between Account A's subscription filter and Account B's Firehose stream. Account A never writes to OpenSearch directly.
## Account A: Lambda, Log Group, Subscription Filter
```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "eu-central-1"
}

resource "aws_iam_role" "lambda_exec" {
  name = "lambda-exec-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "lambda.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy_attachment" "lambda_logs" {
  role       = aws_iam_role.lambda_exec.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

resource "aws_lambda_function" "app" {
  function_name    = "my-service"
  filename         = data.archive_file.lambda_zip.output_path
  source_code_hash = data.archive_file.lambda_zip.output_base64sha256
  handler          = "handler.lambda_handler"
  runtime          = "python3.12"
  role             = aws_iam_role.lambda_exec.arn

  environment {
    variables = {
      LOG_LEVEL = "INFO"
    }
  }
}

data "archive_file" "lambda_zip" {
  type        = "zip"
  source_file = "${path.module}/src/handler.py"
  output_path = "${path.module}/build/handler.zip"
}

resource "aws_cloudwatch_log_group" "app" {
  name              = "/aws/lambda/${aws_lambda_function.app.function_name}"
  retention_in_days = 14
}

resource "aws_cloudwatch_log_subscription_filter" "to_opensearch" {
  name            = "forward-to-opensearch"
  log_group_name  = aws_cloudwatch_log_group.app.name
  filter_pattern  = "" # empty string = forward all events
  destination_arn = var.cloudwatch_destination_arn # Account B destination ARN

  depends_on = [aws_cloudwatch_log_group.app]
}
```

The `destination_arn` variable holds the ARN of the CloudWatch Logs destination created in Account B (shown below). An empty `filter_pattern` forwards everything; use `"ERROR"` or a JSON pattern like `{ $.level = "ERROR" }` to reduce volume.
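A JSON filter pattern like `{ $.level = "ERROR" }` only matches if the Lambda emits one JSON object per log line. As a minimal sketch of what `src/handler.py` could look like (the field names `level`, `message`, and `function_name` are illustrative choices, not taken from the original post):

```python
import json
import logging
import os


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object so that
    CloudWatch Logs JSON filter patterns can match on its fields."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # Lambda sets AWS_LAMBDA_FUNCTION_NAME in its runtime environment
            "function_name": os.environ.get("AWS_LAMBDA_FUNCTION_NAME", "local"),
        })


_handler = logging.StreamHandler()
_handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(_handler)
logger.setLevel(os.environ.get("LOG_LEVEL", "INFO"))


def lambda_handler(event, context):
    logger.info("request received")
    return {"statusCode": 200}
```

Each record then lands in CloudWatch as one parseable JSON document, which is also what makes the OpenSearch queries later in this post possible.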
## Account B: OpenSearch Domain and Cross-Account Delivery
```hcl
provider "aws" {
  region = "eu-central-1"
}

data "aws_caller_identity" "b" {}

resource "aws_opensearch_domain" "logs" {
  domain_name    = "platform-logs"
  engine_version = "OpenSearch_2.11"

  cluster_config {
    instance_type  = "t3.small.search"
    instance_count = 1
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 20
    volume_type = "gp3"
  }

  encrypt_at_rest { enabled = true }
  node_to_node_encryption { enabled = true }

  domain_endpoint_options {
    enforce_https       = true
    tls_security_policy = "Policy-Min-TLS-1-2-2019-07"
  }

  tags = { Environment = "shared" }
}

# Least-privilege: only the two actions Firehose actually needs
resource "aws_opensearch_domain_policy" "firehose_access" {
  domain_name = aws_opensearch_domain.logs.domain_name

  access_policies = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = aws_iam_role.firehose_delivery.arn }
      Action    = ["es:ESHttpPut", "es:ESHttpPost"]
      Resource  = "${aws_opensearch_domain.logs.arn}/*"
    }]
  })
}

# Cross-account IAM role trusted by Account A's CloudWatch Logs service
resource "aws_iam_role" "cloudwatch_destination" {
  name = "cloudwatch-logs-cross-account-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "logs.amazonaws.com" }
      Action    = "sts:AssumeRole"
      Condition = {
        StringEquals = {
          "aws:SourceAccount" = var.account_a_id
        }
      }
    }]
  })
}

resource "aws_iam_role_policy" "cloudwatch_destination_firehose" {
  name = "put-firehose-records"
  role = aws_iam_role.cloudwatch_destination.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["firehose:PutRecord", "firehose:PutRecordBatch"]
      Resource = aws_kinesis_firehose_delivery_stream.logs.arn
    }]
  })
}

resource "aws_cloudwatch_log_destination" "logs" {
  name       = "lambda-logs-to-opensearch"
  role_arn   = aws_iam_role.cloudwatch_destination.arn
  target_arn = aws_kinesis_firehose_delivery_stream.logs.arn
}

resource "aws_cloudwatch_log_destination_policy" "logs" {
  destination_name = aws_cloudwatch_log_destination.logs.name

  access_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { AWS = "arn:aws:iam::${var.account_a_id}:root" }
      Action    = "logs:PutSubscriptionFilter"
      Resource  = aws_cloudwatch_log_destination.logs.arn
    }]
  })
}
```

## Kinesis Data Firehose: Buffered Delivery to OpenSearch
Direct CloudWatch-to-OpenSearch delivery exists but lacks retry logic and dead-letter handling. Firehose solves both.
```hcl
resource "aws_s3_bucket" "firehose_dlq" {
  bucket = "platform-logs-firehose-dlq-${data.aws_caller_identity.b.account_id}"
}

resource "aws_iam_role" "firehose_delivery" {
  name = "firehose-opensearch-delivery"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Principal = { Service = "firehose.amazonaws.com" }
      Action    = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "firehose_delivery" {
  name = "firehose-delivery-policy"
  role = aws_iam_role.firehose_delivery.id

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["es:ESHttpPut", "es:ESHttpPost", "es:DescribeDomain"]
        Resource = [aws_opensearch_domain.logs.arn, "${aws_opensearch_domain.logs.arn}/*"]
      },
      {
        Effect   = "Allow"
        Action   = ["s3:PutObject", "s3:GetBucketLocation"]
        Resource = [aws_s3_bucket.firehose_dlq.arn, "${aws_s3_bucket.firehose_dlq.arn}/*"]
      }
    ]
  })
}

resource "aws_kinesis_firehose_delivery_stream" "logs" {
  name        = "lambda-logs-to-opensearch"
  destination = "opensearch"

  opensearch_configuration {
    domain_arn            = aws_opensearch_domain.logs.arn
    role_arn              = aws_iam_role.firehose_delivery.arn
    index_name            = "lambda-logs"
    index_rotation_period = "OneDay"
    buffering_interval    = 300 # seconds
    buffering_size        = 5   # MB
    retry_duration        = 300
    s3_backup_mode        = "FailedDocumentsOnly"

    s3_configuration {
      role_arn   = aws_iam_role.firehose_delivery.arn
      bucket_arn = aws_s3_bucket.firehose_dlq.arn
      prefix     = "failed-logs/"
    }
  }
}
```

Set `buffering_interval` and `buffering_size` based on your actual ingestion rate. The defaults (300 s / 5 MB) are reasonable for low-to-medium volume. At low volume the time trigger dominates, so logs can trail dashboards by up to five minutes; shorten the interval if freshness matters.
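Firehose flushes when either threshold is reached, whichever comes first. A back-of-envelope helper (illustrative only, not part of any Firehose API) makes the tradeoff concrete:

```python
def flush_trigger(rate_kb_per_s: float, buffer_mb: int = 5, interval_s: int = 300):
    """Return which Firehose buffer condition fires first at a steady log rate:
    ("size", seconds_until_full) or ("time", interval_s)."""
    seconds_to_fill = (buffer_mb * 1024) / rate_kb_per_s
    if seconds_to_fill < interval_s:
        return ("size", round(seconds_to_fill, 1))
    return ("time", float(interval_s))

# A chatty service at 100 KB/s fills 5 MB in ~51 s, so size triggers first;
# a quiet service at 1 KB/s would need ~85 min, so the 300 s timer wins.
```

Plugging in your own expected rate tells you whether dashboard latency is bounded by the timer or by the buffer size.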
## CI/CD with OIDC (No Hardcoded Credentials)
The original version of this post stored AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY directly in CI variables. This is a security anti-pattern. GitHub Actions supports OIDC token exchange: the runner proves its identity to AWS and receives a short-lived role session, with no long-lived credentials stored anywhere.
```hcl
resource "aws_iam_openid_connect_provider" "github" {
  url             = "https://token.actions.githubusercontent.com"
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = ["6938fd4d98bab03faadb97b34396831e3780aea1"]
}

resource "aws_iam_role" "github_actions_terraform" {
  name = "github-actions-terraform"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Federated = aws_iam_openid_connect_provider.github.arn
      }
      Action = "sts:AssumeRoleWithWebIdentity"
      Condition = {
        StringLike = {
          "token.actions.githubusercontent.com:sub" = "repo:my-org/my-repo:*"
        }
        StringEquals = {
          "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
        }
      }
    }]
  })
}
```

The workflow then exchanges its OIDC token for a short-lived role session:

```yaml
name: Terraform Apply

on:
  push:
    branches: [main]

permissions:
  id-token: write
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::ACCOUNT_B_ID:role/github-actions-terraform
          aws-region: eu-central-1
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
      - run: terraform validate
      - run: terraform plan
      - run: terraform apply -auto-approve
```

Pin the OIDC trust condition to a specific subject such as `repo:org/repo:ref:refs/heads/main` rather than a wildcard: `repo:my-org/my-repo:*` lets a workflow on any branch or tag of the repository assume the role.
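A tightened trust block could look like the following sketch, replacing the wildcard condition above (the org, repo, and branch names are placeholders):

```hcl
# Only workflows on main in my-org/my-repo may assume the role.
Condition = {
  StringEquals = {
    "token.actions.githubusercontent.com:aud" = "sts.amazonaws.com"
    "token.actions.githubusercontent.com:sub" = "repo:my-org/my-repo:ref:refs/heads/main"
  }
}
```

With an exact `StringEquals` match there is no `StringLike` wildcard left in the trust policy at all.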
## Using the Logs in OpenSearch Dashboards
Once data flows, create an index pattern matching `lambda-logs-*` in OpenSearch Dashboards. With `index_rotation_period = "OneDay"`, Firehose writes to a fresh index each day.

A useful starter query to find errors grouped by function (run it from the Dev Tools console, since Discover does not execute raw aggregation queries):
```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "level": "ERROR" } },
        { "range": { "@timestamp": { "gte": "now-1h" } } }
      ]
    }
  },
  "aggs": {
    "errors_per_function": {
      "terms": { "field": "function_name.keyword", "size": 10 }
    }
  }
}
```

Build a dashboard with four panels: error rate over time (line chart on `level:ERROR`), p99 duration (from the `duration` field Lambda emits in REPORT lines), cold start rate (filter on `Init Duration` being present), and top error messages (terms aggregation on `errorMessage.keyword`).
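As a starting point for the cold start panel, a hedged sketch of the filter query, assuming the REPORT line's `Init Duration` has already been parsed into a field named `init_duration` (Firehose does not extract it for you; that field name is an assumption, not from the original post):

```json
{
  "query": {
    "bool": {
      "filter": [
        { "exists": { "field": "init_duration" } },
        { "range": { "@timestamp": { "gte": "now-24h" } } }
      ]
    }
  }
}
```

Dividing the hit count of this query by the total invocation count over the same window gives the cold start rate.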
## Common mistakes and how to avoid them

### Using `aws_elasticsearch_domain` instead of `aws_opensearch_domain`

The `aws_elasticsearch_domain` resource was deprecated in the 4.x release of the AWS Terraform provider. It still works, but it creates OpenSearch resources behind an Elasticsearch-compatible API and will eventually be removed. Use `aws_opensearch_domain` for all new code.

### Hardcoding CI credentials

Storing `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` in CI variables means those credentials live in your CI provider's secret store indefinitely, rotate only when someone remembers to, and can leak into logs when masking is misconfigured. OIDC eliminates the long-lived credential entirely.

### Over-permissive IAM (`es:*`)

The Firehose delivery role needs only `es:ESHttpPut` and `es:ESHttpPost` on the specific domain. `es:*` also grants the ability to delete the domain, modify access policies, and trigger snapshots, none of which the delivery path requires.

### No retention policy on the CloudWatch log group

Without `retention_in_days`, CloudWatch keeps logs forever and you pay for storage indefinitely. 14 days covers most incident investigation windows; adjust based on compliance requirements.

### No explicit Firehose buffer configuration

The default Firehose buffer is 300 seconds or 5 MB, whichever fills first. For very low-volume services this means up to five-minute gaps in dashboards. Always set explicit values and test them against your expected ingestion rate.

### Missing `depends_on` between log group and subscription filter

If Terraform creates the subscription filter before the log group exists, the apply fails. Referencing `aws_cloudwatch_log_group.app.name` already gives Terraform an implicit ordering, so the explicit `depends_on` is belt-and-braces; it matters when the log group name is passed in as a plain string rather than a resource reference.
If you want to go deeper on any of this, I offer 1:1 coaching sessions for engineers working on AI integration, cloud architecture, and platform engineering. Book a session (50 EUR / 60 min) or reach out at manuel.fedele+website@gmail.com.