Public API Cloudflare Migration — As-Built Runbook
Document type: Implementation history + reference runbook

Last updated: 2026-04-29

Status: Phases 1–7 complete for dev. Public API Worker live at https://api.adventive.dev. Phase 8 (cutover) and stg/prd extension remain.
This document captures the actual sequence used to deploy Phase 1 of the Public API Cloudflare migration, including the gotchas hit and the fixes applied. It exists so that if everything had to be rebuilt — fresh AWS account, fresh Cloudflare account — a competent engineer could follow this end-to-end and arrive at the same working state.
It is a companion to 06-implementation-steps.md (the planning document). Where 06 describes intent, this document describes what was actually executed and observed.
1. Architecture summary
The deployment is composed of four cooperating systems:

- EC2 Image Builder produces a hardened Ubuntu 22.04 AMI (`adv-cflared-*`) that has `cloudflared` pre-installed plus a boot-time bootstrap script. The pipeline runs weekly, publishes the AMI ID to SSM Parameter Store, and serves as the source of truth for the tunnel fleet.
- Cloudflare Tunnel is provisioned per environment via Terraform (currently dev only). The tunnel UUID, secret, and rendered config are stored in AWS Secrets Manager at `/adventive/cloudflared/<env>` in the JSON shape the AMI’s bootstrap script expects.
- AWS Secrets Manager holds the tunnel credentials and config. The bootstrap script on every booting AMI instance reads from a path keyed by the instance’s `adv:env` IMDS tag, decoupling the AMI from any specific environment.
- EC2 (Phase 1.5: ASG) runs the `cloudflared` daemon. Each instance, on boot, resolves its `adv:env` tag, fetches the matching secret, writes the config + credentials to `/etc/cloudflared/`, then `cloudflared.service` starts and registers with Cloudflare’s edge.
The AMI is environment-agnostic. The runtime instance’s tag determines which env it joins. This is the standing pattern for all Adventive Cloudflare Tunnel deployments.
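The tag-to-secret resolution described above can be sketched in Python. This is an illustration only: the real logic lives in the shell script `cflared-bootstrap.sh`, and the helper names (`secret_id_for_env`, `files_to_write`) are hypothetical. It assumes the secret payload carries `config_yaml` and `credentials_json` keys, as described in §5.2.

```python
import json

SECRET_PATH_PREFIX = "/adventive/cloudflared"  # default prefix from the tunnels module
CONFIG_DIR = "/etc/cloudflared"


def secret_id_for_env(env: str, prefix: str = SECRET_PATH_PREFIX) -> str:
    """Map the instance's adv:env tag to the Secrets Manager secret name."""
    if not env:
        # Mirrors the documented behavior: a missing tag aborts the bootstrap.
        raise ValueError("adv:env tag is missing; cannot resolve secret")
    return f"{prefix}/{env}"


def files_to_write(secret_string: str) -> dict:
    """Turn the secret payload into {path: content} for /etc/cloudflared/."""
    payload = json.loads(secret_string)
    creds = json.loads(payload["credentials_json"])
    tunnel_id = creds["TunnelID"]
    return {
        f"{CONFIG_DIR}/config.yml": payload["config_yaml"],
        f"{CONFIG_DIR}/{tunnel_id}.json": payload["credentials_json"],
    }
```

Because the environment only enters through the tag, the same functions serve dev, stg, and prd without any AMI change.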
2. Reference data (current state)
| Resource | Value |
|---|---|
| AWS account ID | 201161205241 |
| AWS region | us-east-1 |
| Cloudflare account ID | 46a873457665355ba02a85e61d7200a7 |
| Cloudflare account name | Adventive Tech, Inc. |
| Dev DNS zone | adventive.dev |
| Dev zone ID (Cloudflare) | c737bae5b535c0ec3daa72c809721e7d |
| VPC | vpc-e1636084 |
| Build subnet | subnet-2ef28677 |
| Build security group | sg-0968b424c847f142c (egress: TCP 80 + 443) |
| Artifacts S3 bucket | adventive-platform-artifacts |
| Image Builder pipeline | arn:aws:imagebuilder:us-east-1:201161205241:image-pipeline/adv-cflared-pipeline |
| Current recipe version | adv-cflared-recipe/1.0.2 |
| Current cflared component | adv-cflared-cloudflared/1.0.2 |
| Current bootstrap component | adv-cflared-bootstrap-script/1.0.0 |
| Current AMI | ami-02f2f244d3dd56cb4 (built 2026-04-29 07:13 UTC) |
| SSM parameter | /adventive/cloudflared/ami-id-latest |
| Dev tunnel UUID | cb0c2ef4-3426-4968-8175-a3a053ecc5ea |
| Dev tunnel name | adv-cflared-dev |
| Dev tunnel hostname | tunnel.adventive.dev (CNAME, proxied) |
| Dev tunnel secret | arn:aws:secretsmanager:us-east-1:201161205241:secret:/adventive/cloudflared/dev-7lUBaV |
| Lambda function | adv-cflared-publish-ami |
| EventBridge rule | adv-cflared-pipeline-success |
| Dev ASG | adv-cflared-dev (1× t3.micro, single-AZ in subnet-2ef28677) |
| Dev runtime IAM role | adv-cflared-runtime-dev |
| Dev runtime SG | adv-cflared-runtime-dev (egress: TCP+UDP 7844, TCP 443; no inbound) |
| Dev launch template | adv-cflared-dev |
| Existing Warp tunnel (do not touch) | cf-tunnel.us-east1.aws.adventive.com (id 604363f7-e1b9-4dc7-a485-4a3df8b2f751) |
| Dev RDS instance | development.coi6rcntfbgg.us-east-1.rds.amazonaws.com:3306 (us-east-1a, MySQL) |
| Dev RDS security group | adv-development-database (sg-0b940e4bfc388f9be) |
| Dev databases on that instance | console, aggregate, billing, vast |
| Console Hyperdrive | adv-svc-public-api-console-dev (id 059838c4abb64a92a4aece2a6a533a29) |
| Aggregate Hyperdrive | adv-svc-public-api-aggregate-dev (id c1b18833b07347daa77b56a2d19ef508) |
| Dev Access service token | adv-hyperdrive-dev (client_id 99fc0a3d3df90300224e08b37fd04f5b.access) |
| Console DB hostname | db-console-dev.adventive.dev (CNAME, proxied) |
| Aggregate DB hostname | db-aggregate-dev.adventive.dev (CNAME, proxied) |
| Auth helper Worker | adv-svc-auth-helper-dev |
| Auth helper URL | https://adv-svc-auth-helper-dev.adventive.workers.dev |
| Auth helper repo | ~/Repositories/GitHub/Adventive/adventive-auth-helper-worker/ |
| Auth helper KV namespace | kv-adv-svc-auth-helper-cache-dev (id 22db913488484a46a7f60ebb9b8c1704) |
| Cloudflare workers.dev subdomain | adventive.workers.dev |
| Public API Worker | adv-public-api-dev (live at https://api.adventive.dev) |
| Public API repo | ~/Repositories/GitHub/Adventive/adventive-public-api-worker/ (GitHub: adventive/adventive-public-api-worker, private) |
| Public API Postman collection | postman/adventive-public-api.json (28 requests, dev/stg/prd envs) |
3. Prerequisites
- AWS CLI v2 (`aws --version`)
- Terraform 1.6+
- `jq`
- `dig`
- `curl`
- `brew install --cask session-manager-plugin` (for shell access to instances)

AWS access

- Authenticated AWS CLI with permission to create EC2, IAM, Secrets Manager, Image Builder, Lambda, EventBridge, SSM, S3 resources

Cloudflare access

- Account-owned API token at https://dash.cloudflare.com → Manage Account → Account API Tokens, with policies:
  - Entire Account scope: `Cloudflare One Connector: cloudflared` → Edit
  - Specified Domains scope (adventive.dev only): `DNS` → Edit, `Zone` → Read
- Exported as `CLOUDFLARE_API_TOKEN` in any shell that runs Terraform on the cloudflare-tunnels module

- Repo: `Adventive/adventive-platform-infra`
- Local clone path: `~/Repositories/GitHub/Adventive/adventive-platform-infra/`
4. Phase 1.1 — AMI build pipeline
4.1 Network preparation

The build instance runs in a VPC that already exists. We did not create a new VPC. The script `scripts/setup-builder-network.sh` is idempotent and handles:

- Verifying the VPC exists and resolving its CIDR
- Creating (or finding existing) security group `adv-imagebuilder-builder` in `vpc-e1636084` with egress on TCP 443 (HTTPS: pkg.cloudflare.com, S3, AWS APIs) and TCP 80 (apt mirror redirects). No inbound rules.
- Listing all subnets in the VPC, classifying each as public/private/isolated by inspecting their route tables, and printing the values for `infra/imagebuilder/terraform.tfvars`

To reproduce:

    cd ~/Repositories/GitHub/Adventive/adventive-platform-infra/scripts
    VPC_ID=vpc-e1636084 bash setup-builder-network.sh

The script prints two tfvars-formatted lines: `build_subnet_id` and `build_security_group_ids`.
4.2 S3 artifacts bucket
Created manually before `terraform apply`:

    aws s3 mb s3://adventive-platform-artifacts --region us-east-1

This bucket holds:

- `cflared/cflared-bootstrap.sh`: the runtime bootstrap script (uploaded once before first build)
- `imagebuilder-logs/cflared/...`: per-build logs (written by the build instance during each pipeline run)

The bootstrap script lives at `scripts/cflared-bootstrap.sh` in the repo. Upload to S3 with:

    aws s3 cp scripts/cflared-bootstrap.sh s3://adventive-platform-artifacts/cflared/cflared-bootstrap.sh

If the script content changes, re-upload to the same key. The Image Builder component `adv-cflared-bootstrap-script` will pull the latest version on each build.
4.3 Image Builder components (YAML)
Two custom components live in infra/imagebuilder/components/:
cflared.yml — installs cloudflared from Cloudflare’s APT repo and drops a primary systemd unit at /etc/systemd/system/cloudflared.service. Critical: the cloudflared deb package does not ship a systemd unit file. We write our own with Type=simple, --no-autoupdate, and Requires=cflared-bootstrap.service so cloudflared cannot start before the bootstrap finishes writing config.
bootstrap.yml — drops the cflared-bootstrap.sh script from S3 to /usr/local/sbin/, installs jq, and creates the cflared-bootstrap.service systemd unit (Type=oneshot, runs before cloudflared.service).
Component versioning is immutable per version. Any YAML edit requires bumping the version attribute in components.tf AND the recipe version in pipeline.tf. Old versions accumulate because AWS Image Builder retains build history; the lifecycle pattern is create_before_destroy = true so that plan/apply doesn’t trip on AWS rejecting deletes of versions that still have downstream image references.
4.4 Terraform module — infra/imagebuilder/
Files:

- `versions.tf`: Terraform 1.6+, AWS provider ~> 5.50, archive provider, default tags (`adv:owner`, `adv:project`, `adv:repo`, `adv:module`)
- `variables.tf`: region, artifacts_bucket, build_subnet_id, build_security_group_ids, ssm_parameter_name, schedule_expression, log_retention_days
- `locals.tf`: Ubuntu 22.04 AMI lookup (Canonical 099720109477), AWS-managed Image Builder component data sources
- `iam.tf`: builder instance role + instance profile (3 managed policies + inline S3 read/write for `cflared/*` and `imagebuilder-logs/*`), publish-ami Lambda role
- `components.tf`: two `aws_imagebuilder_component` resources sourcing from `components/*.yml`, both with `lifecycle { create_before_destroy = true }`
- `pipeline.tf`: recipe (with `create_before_destroy`), infrastructure config (t3.small, references var.build_subnet_id), distribution config, pipeline with cron `cron(0 6 ? * TUE *)`
- `ami_publish.tf`: SSM parameter (`ignore_changes = [value, description]`), Lambda function (Python 3.12), EventBridge rule + target + Lambda permission
- `outputs.tf`: pipeline_arn, recipe_arn, ssm_parameter_name, parent_ami_id, etc.
- `lambda/publish_ami_to_ssm.py`: Lambda that handles EC2 Image Builder Image State Change events, extracts AMI ID, writes to SSM
- `terraform.tfvars`: actual values (gitignored)
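As a rough illustration of what `lambda/publish_ami_to_ssm.py` is described as doing, the sketch below handles an event of the shape used for direct invocation in §7.2. It is not the repository's actual code: the function name, the injected `imagebuilder` and `ssm` clients, and the return values are assumptions made so the example runs without AWS access. With boto3, the injected objects would be `boto3.client("imagebuilder")` and `boto3.client("ssm")`.

```python
SSM_PARAMETER = "/adventive/cloudflared/ami-id-latest"


def publish_ami(event, imagebuilder, ssm, parameter_name=SSM_PARAMETER):
    """Write the AMI ID of an AVAILABLE image to SSM; return the AMI ID or None."""
    detail = event.get("detail", {})
    status = detail.get("state", {}).get("status")
    image_arn = detail.get("image-arn")
    if status != "AVAILABLE" or not image_arn:
        return None  # ignore intermediate states and malformed events

    image = imagebuilder.get_image(imageBuildVersionArn=image_arn)["image"]
    amis = image.get("outputResources", {}).get("amis", [])
    if not amis:
        return None
    ami_id = amis[0]["image"]

    ssm.put_parameter(Name=parameter_name, Value=ami_id, Type="String", Overwrite=True)
    return ami_id


class FakeImageBuilder:
    """Stand-in for the imagebuilder client, for local testing only."""

    def get_image(self, imageBuildVersionArn):
        return {"image": {"outputResources": {"amis": [{"image": "ami-0123456789abcdef0"}]}}}


class FakeSSM:
    """Stand-in for the ssm client that records put_parameter calls."""

    def __init__(self):
        self.calls = []

    def put_parameter(self, **kwargs):
        self.calls.append(kwargs)
```

Injecting the clients keeps the event-parsing logic testable on a laptop, which helps when the event shape itself is in question.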
4.5 Apply procedure
    cd ~/Repositories/GitHub/Adventive/adventive-platform-infra/infra/imagebuilder

    terraform init
    terraform plan -out=tfplan
    terraform apply tfplan

First apply creates ~16 resources. None touch existing AWS resources outside the new `adv-cflared-*` namespace and the `adventive-platform-artifacts` bucket policies on the builder role.
4.6 Triggering a build
Pipeline runs on its weekly schedule, but you can fire one manually:

    aws imagebuilder start-image-pipeline-execution \
      --image-pipeline-arn $(terraform output -raw pipeline_arn)

Build duration: 6–10 minutes total (BUILD workflow ~6 min, TEST workflow ~3 min).
4.7 Validation
After a successful build:

    # Image is AVAILABLE and has an AMI ID
    aws imagebuilder get-image \
      --image-build-version-arn arn:aws:imagebuilder:us-east-1:201161205241:image/adv-cflared-recipe/1.0.2/<N> \
      --query 'image.{state:state.status,ami:outputResources.amis[0].image}' --output table

    # SSM parameter holds that AMI ID
    aws ssm get-parameter --name /adventive/cloudflared/ami-id-latest \
      --query 'Parameter.{value:Value,modified:LastModifiedDate}' --output table

If SSM doesn’t update automatically (see deficiency #21), publish manually:

    LATEST_AMI=$(aws imagebuilder get-image \
      --image-build-version-arn arn:aws:imagebuilder:us-east-1:201161205241:image/adv-cflared-recipe/1.0.2/<N> \
      --query 'image.outputResources.amis[0].image' --output text)
    aws ssm put-parameter --name /adventive/cloudflared/ami-id-latest \
      --value "$LATEST_AMI" --type String --overwrite \
      --description "adv-cflared AMI manually published"

4.8 Gotchas encountered (so you don’t repeat them)
- `one()` rejected multiple ARNs. AWS-managed Image Builder components (`update-linux`, `aws-cli-version-2-linux`, etc.) accumulate versions over time. Using Terraform’s `one()` function on the `aws_imagebuilder_components.X.arns` set fails when two or more versions exist. Fix: `reverse(sort(tolist(...)))[0]` to pick one ARN deterministically. Note that `sort` is lexicographic, so this yields the highest version only while every version segment stays single-digit (`1.0.10` sorts below `1.0.9`).
- Two managed components don’t exist with the names I guessed. `amazon-ssm-agent-linux` and `cis-level-1-ubuntu-22-04-lts` returned empty result sets in `us-east-1`. We dropped them from the recipe; Ubuntu 22.04 ships SSM agent via snap, and CIS hardening was nice-to-have. Confirm exact names via `aws imagebuilder list-components --owner Amazon` if you want to add them back.
- `CreateFile` does not auto-create parent dirs in Image Builder workflows. Initially set up a systemd override at `/etc/systemd/system/cloudflared.service.d/override.conf`, but the `.d` parent dir was missing on a fresh install. Then realized the deeper issue: cloudflared deb has no base unit file to override. Fix: drop a primary unit at `/etc/systemd/system/cloudflared.service` instead.
- Image Builder recipes/components are immutable per version, and old versions can’t be deleted while images reference them. Default Terraform behavior is destroy-then-create on a version bump, which fails when prior failed/successful builds hold references. Fix: `lifecycle { create_before_destroy = true }` on both `aws_imagebuilder_component.cflared` and `aws_imagebuilder_image_recipe.cflared`. Old versions accumulate harmlessly in AWS as orphaned history.
- Build instance’s IAM role needs S3 PutObject for the `imagebuilder-logs/*` prefix, because the Image Builder TOE (Task Orchestrator and Executor) uploads per-step logs there. We initially granted only Read on `cflared/*`. Symptom: first build failed with “User is not authorized to perform: s3:PutObject” trying to upload `D0__update-linux__1.0.2_1.yml`.
- EventBridge rule’s `source-pipeline-arn` filter doesn’t match Image State Change events. That field is on a different event type (Pipeline Execution Status Change) which uses different statuses (`COMPLETED`/`FAILED`). Image State Change carries `image-arn`, not `source-pipeline-arn`. Fix: prefix-match on `image-arn` instead. Note: even after this fix, builds 1.0.2/2 and 1.0.2/3 did not trigger the Lambda; the actual event field shape differs from documentation. Tracked as task #21.
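One caveat on the `reverse(sort(...))` fix in the first item: Terraform's `sort` orders strings lexicographically, so a version such as `1.0.10` sorts before `1.0.9`. The Python sketch below contrasts a lexicographic pick with a version-aware pick. The ARNs are illustrative values rather than output from this account, and `version_key` / `latest_arn` are hypothetical helpers.

```python
def version_key(arn: str) -> tuple:
    """Numeric sort key for an ARN shaped like ...:component/<name>/<x.y.z>[/<build>]."""
    tail = arn.split(":component/", 1)[1].split("/")  # [name, "x.y.z", optional build]
    semver = tuple(int(part) for part in tail[1].split("."))
    build = int(tail[2]) if len(tail) > 2 else 0
    return semver + (build,)


def latest_arn(arns):
    """Pick the ARN with the highest version, comparing segments numerically."""
    return max(arns, key=version_key)


# Illustrative ARNs only.
arns = [
    "arn:aws:imagebuilder:us-east-1:aws:component/update-linux/1.0.9",
    "arn:aws:imagebuilder:us-east-1:aws:component/update-linux/1.0.10",
]
lexicographic_pick = sorted(arns, reverse=True)[0]  # what reverse(sort(...))[0] does
numeric_pick = latest_arn(arns)
```

Here `lexicographic_pick` ends in `1.0.9` while `numeric_pick` ends in `1.0.10`, so the Terraform expression is worth revisiting before any managed component reaches a two-digit version segment.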
5. Phase 1.2 — Cloudflare Tunnel + Secrets Manager
5.1 API token creation

Account-owned token via dashboard at Manage Account → Account API Tokens → Create Token. Name: `adv-platform-infra-tunnels-dev`. Two policies on the same token:
| Policy scope | Permission group | Access |
|---|---|---|
| Entire Account | Cloudflare One Connector: cloudflared | Edit |
| Specified Domains → adventive.dev | DNS | Edit |
| Specified Domains → adventive.dev | Zone | Read |
Critical: the tunnel permission lives at account scope. If you only set up “Specified Domains” scope, the tunnel permission group is invisible — that’s why scope matters here.
Verify with:
    export CLOUDFLARE_API_TOKEN='<the cfat_... token>'

    curl -sS "https://api.cloudflare.com/client/v4/accounts/46a873457665355ba02a85e61d7200a7/cfd_tunnel" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      | jq '{success, count: (.result | length), errors}'
    # Expected: { "success": true, "count": <N>, "errors": [] }

    curl -sS "https://api.cloudflare.com/client/v4/zones/c737bae5b535c0ec3daa72c809721e7d/dns_records?per_page=1" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" \
      | jq '{success, errors}'
    # Expected: { "success": true, "errors": [] }

5.2 Terraform module — infra/cloudflare-tunnels/
Files:

- `versions.tf`: Terraform 1.6+, cloudflare ~> 4.50, aws ~> 5.50, random ~> 3.6
- `variables.tf`: `cloudflare_account_id`, `environments` map (with `zone_id` + `apex` per env), `tunnel_subdomain` (default `tunnel`), `tunnel_name_prefix` (default `adv-cflared`), `secret_path_prefix` (default `/adventive/cloudflared`)
- `tunnels.tf`: for each env: `random_bytes` (32-byte secret), `cloudflare_zero_trust_tunnel_cloudflared`, `cloudflare_record` (CNAME, proxied), `aws_secretsmanager_secret`, `aws_secretsmanager_secret_version`
- `outputs.tf`: `tunnel_ids`, `tunnel_names`, `tunnel_hostnames`, `secret_arns`, `secret_names`
- `terraform.tfvars`: currently `dev` only; adding `stg`/`prd` later is a tfvars edit, no code change

The secret payload structure exactly matches what `cflared-bootstrap.sh` expects:

    {
      "config_yaml": "tunnel: <uuid>\ncredentials-file: /etc/cloudflared/<uuid>.json\nno-autoupdate: true\nmetrics: 0.0.0.0:2000\n\ningress:\n  - hostname: tunnel.<apex>\n    service: http_status:503\n  - service: http_status:404\n",
      "credentials_json": "{\"AccountTag\":\"...\",\"TunnelSecret\":\"...\",\"TunnelID\":\"...\"}"
    }

The ingress uses a 503 placeholder until real services are added. Replace via secret version bump when there’s an origin to route to.
5.3 Apply procedure
    cd ~/Repositories/GitHub/Adventive/adventive-platform-infra/infra/cloudflare-tunnels

    cp terraform.tfvars.example terraform.tfvars   # then fill in real IDs
    export CLOUDFLARE_API_TOKEN='<cfat_... token>'

    terraform init
    terraform plan -out=tfplan
    terraform apply tfplan

Five resources per env. None touch existing tunnels (the pre-existing Warp tunnel `cf-tunnel.us-east1.aws.adventive.com` is not affected because we create with a different name).
5.4 Validation
    # Tunnel registered
    TUNNEL_ID=$(terraform output -json tunnel_ids | jq -r '.dev')
    curl -sS "https://api.cloudflare.com/client/v4/accounts/46a873457665355ba02a85e61d7200a7/cfd_tunnel/$TUNNEL_ID" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq '.result | {id, name, status, deleted_at}'

    # DNS resolves through Cloudflare
    dig +short tunnel.adventive.dev

    # Secret payload self-consistent
    aws secretsmanager get-secret-value --secret-id /adventive/cloudflared/dev \
      --query SecretString --output text \
      | jq '{tunnel_id_in_creds: (.credentials_json | fromjson | .TunnelID), config_starts_with_tunnel: (.config_yaml | startswith("tunnel: "))}'

5.5 Gotchas encountered
- Cloudflare provider v4 schema field is `secret`, not `tunnel_secret`. Initially used `tunnel_secret = ...` per a hallucinated argument name; provider v4.52.7 rejects with “argument is not expected here”. Fix: `secret = random_bytes.tunnel_secret[each.key].base64`.
- `/user/tokens/verify` endpoint doesn’t validate account-owned tokens; that endpoint is user-token specific. Account-owned tokens (the `cfat_*` prefix) authenticate fine on real API endpoints but return `success: false` on the user-token verify endpoint. Don’t use that endpoint as a token-validity test for account-owned tokens.
- The pre-existing Warp tunnel uses the same Cloudflare account. Be careful not to delete it during cleanup. Filter by name (`adv-cflared-*`) when scripting tunnel operations.
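The name filter recommended in the last item can be sketched as a small guard in Python. The input mimics the list returned by the `cfd_tunnel` endpoint (objects with `id`, `name`, and `deleted_at`, the same fields queried in §5.4). The helper name `tunnels_safe_to_touch` and the deleted example entry are made up for illustration.

```python
MANAGED_PREFIX = "adv-cflared-"


def tunnels_safe_to_touch(tunnels, prefix=MANAGED_PREFIX):
    """Return only live tunnels whose name starts with the managed prefix."""
    return [
        t for t in tunnels
        if t.get("name", "").startswith(prefix) and t.get("deleted_at") is None
    ]


example_tunnels = [
    {"id": "cb0c2ef4-3426-4968-8175-a3a053ecc5ea", "name": "adv-cflared-dev", "deleted_at": None},
    {"id": "604363f7-e1b9-4dc7-a485-4a3df8b2f751", "name": "cf-tunnel.us-east1.aws.adventive.com", "deleted_at": None},
    # Hypothetical, already-deleted entry to show the second condition.
    {"id": "00000000-0000-0000-0000-000000000000", "name": "adv-cflared-old", "deleted_at": "2026-01-01T00:00:00Z"},
]
selected = [t["name"] for t in tunnels_safe_to_touch(example_tunnels)]
```

Running any destructive operation only over the filtered list keeps the Warp tunnel out of scope by construction.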
6. End-to-end smoke test (Phase 1.1 + 1.2 validation)
Manually launched a single EC2 instance from the AMI to validate that the entire chain works. This is one-shot, not part of the persistent infrastructure; Phase 1.5 replaces it with a proper ASG.
6.1 IAM role + instance profile (manual)
    ROLE_NAME="adv-cflared-tunnel-smoketest"
    SECRET_ARN_PATTERN="arn:aws:secretsmanager:us-east-1:201161205241:secret:/adventive/cloudflared/dev-*"

    aws iam create-role --role-name "$ROLE_NAME" \
      --assume-role-policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Service":"ec2.amazonaws.com"},"Action":"sts:AssumeRole"}]}'

    aws iam attach-role-policy --role-name "$ROLE_NAME" \
      --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore

    aws iam put-role-policy --role-name "$ROLE_NAME" --policy-name read-cflared-dev-secret \
      --policy-document "{\"Version\":\"2012-10-17\",\"Statement\":[{\"Effect\":\"Allow\",\"Action\":\"secretsmanager:GetSecretValue\",\"Resource\":\"$SECRET_ARN_PATTERN\"}]}"

    aws iam create-instance-profile --instance-profile-name "$ROLE_NAME"
    aws iam add-role-to-instance-profile --instance-profile-name "$ROLE_NAME" --role-name "$ROLE_NAME"
    sleep 10  # IAM eventual consistency

6.2 Launch parameters

    aws ec2 run-instances \
      --image-id $(aws ssm get-parameter --name /adventive/cloudflared/ami-id-latest --query 'Parameter.Value' --output text) \
      --instance-type t3.micro \
      --subnet-id subnet-2ef28677 \
      --security-group-ids sg-0968b424c847f142c \
      --iam-instance-profile Name=adv-cflared-tunnel-smoketest \
      --metadata-options 'HttpTokens=required,HttpEndpoint=enabled,InstanceMetadataTags=enabled' \
      --tag-specifications 'ResourceType=instance,Tags=[{Key=Name,Value=adv-cflared-tunnel-smoketest},{Key=adv:env,Value=dev},{Key=adv:project,Value=cloudflare-tunnel}]'

Two non-obvious requirements:

- `InstanceMetadataTags=enabled`: without this, the bootstrap script can’t read `adv:env` from IMDS, can’t resolve which secret to fetch, and the tunnel never starts. This is per-instance, not part of the AMI.
- The `adv:env` tag must be present at launch: the bootstrap script reads it via IMDS; missing tag → bootstrap exits 1.
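Both requirements can be checked before calling `run-instances`. The following Python sketch validates a launch specification expressed as a plain dict. The dict layout and the function name `preflight_errors` are assumptions for illustration, not an AWS API shape.

```python
def preflight_errors(spec):
    """Return a list of problems that would keep the bootstrap from resolving its env."""
    errors = []
    metadata = spec.get("metadata_options", {})
    if metadata.get("InstanceMetadataTags") != "enabled":
        errors.append("InstanceMetadataTags must be 'enabled' so adv:env is readable via IMDS")
    if not spec.get("tags", {}).get("adv:env"):
        errors.append("adv:env tag must be set at launch")
    return errors


good_spec = {
    "metadata_options": {"HttpTokens": "required", "InstanceMetadataTags": "enabled"},
    "tags": {"Name": "adv-cflared-tunnel-smoketest", "adv:env": "dev"},
}
bad_spec = {"metadata_options": {"HttpTokens": "required"}, "tags": {"Name": "x"}}
```

An empty list from `preflight_errors(good_spec)` means both conditions hold; `bad_spec` yields one message per missing requirement.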
6.3 Expected boot sequence
The bootstrap completes in ~25 seconds after the instance reaches running:

1. `cloud-init` finishes
2. `cflared-bootstrap.service` starts, fetches IMDS tag, fetches Secrets Manager value, writes `/etc/cloudflared/config.yml` and `/etc/cloudflared/<tunnel-uuid>.json`
3. `cflared-bootstrap.service` finishes (Type=oneshot, RemainAfterExit=yes)
4. `cloudflared.service` starts (`Requires=cflared-bootstrap.service` pulls in the bootstrap unit; in systemd the ordering itself comes from `After=`/`Before=`, which is why the bootstrap unit is set to run before `cloudflared.service`)
5. cloudflared dials Cloudflare edge on TCP 7844 (or UDP 7844 / QUIC), authenticates with the tunnel ID + secret, registers as a connector
6. Cloudflare’s tunnel status flips from `inactive` to `healthy`
6.4 Validation
    # Tunnel goes healthy within ~60s of instance running
    INSTANCE_ID=<from run-instances output>
    for i in {1..15}; do
      STATUS=$(curl -sS "https://api.cloudflare.com/client/v4/accounts/46a873457665355ba02a85e61d7200a7/cfd_tunnel/cb0c2ef4-3426-4968-8175-a3a053ecc5ea" \
        -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN" | jq -r '.result.status')
      echo "[$(date +%H:%M:%S)] tunnel status: $STATUS"
      [ "$STATUS" = "healthy" ] && break
      sleep 10
    done

6.5 Critical SG gap discovered

The build SG (`sg-0968b424c847f142c`) only opens TCP 80 + 443. Cloudflare Tunnel uses TCP+UDP 7844 to dial the edge. Without that egress, cloudflared timed out repeatedly:

    ERR Unable to establish connection with Cloudflare edge
    error="DialContext error: dial tcp 198.41.200.13:7844: i/o timeout"

Fix during smoke test:

    aws ec2 authorize-security-group-egress --group-id sg-0968b424c847f142c \
      --ip-permissions \
      'IpProtocol=tcp,FromPort=7844,ToPort=7844,IpRanges=[{CidrIp=0.0.0.0/0,Description="cloudflared edge TCP"}]' \
      'IpProtocol=udp,FromPort=7844,ToPort=7844,IpRanges=[{CidrIp=0.0.0.0/0,Description="cloudflared edge UDP/QUIC"}]'

This rule was reverted after the smoke test. The runtime SG in Phase 1.5 owns the 7844 egress rule; the build SG only needs 80/443 for AMI builds. Two distinct SGs for two distinct workloads.
6.6 Diagnostics via SSM (no shell plugin needed)
If the tunnel fails to register, fetch service journal logs via `aws ssm send-command` (works without session-manager-plugin):

    CMD_ID=$(aws ssm send-command \
      --instance-ids "$INSTANCE_ID" \
      --document-name AWS-RunShellScript \
      --parameters 'commands=[
        "echo === cflared-bootstrap.service ===",
        "journalctl -u cflared-bootstrap.service --no-pager -n 100",
        "echo === cloudflared.service ===",
        "journalctl -u cloudflared.service --no-pager -n 60",
        "ls -la /etc/cloudflared/"
      ]' --query 'Command.CommandId' --output text)
    sleep 8
    aws ssm get-command-invocation --command-id "$CMD_ID" --instance-id "$INSTANCE_ID" \
      --query '{Status: Status, StdOut: StandardOutputContent}' --output text

6.7 Smoke test teardown

    INSTANCE_ID=<the smoke test instance ID>
    ROLE=adv-cflared-tunnel-smoketest

    aws ec2 terminate-instances --instance-ids $INSTANCE_ID
    aws ec2 wait instance-terminated --instance-ids $INSTANCE_ID

    aws iam remove-role-from-instance-profile --instance-profile-name $ROLE --role-name $ROLE
    aws iam delete-instance-profile --instance-profile-name $ROLE
    aws iam detach-role-policy --role-name $ROLE \
      --policy-arn arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
    aws iam delete-role-policy --role-name $ROLE --policy-name read-cflared-dev-secret
    aws iam delete-role --role-name $ROLE

    aws ec2 revoke-security-group-egress --group-id sg-0968b424c847f142c \
      --ip-permissions \
      'IpProtocol=tcp,FromPort=7844,ToPort=7844,IpRanges=[{CidrIp=0.0.0.0/0}]' \
      'IpProtocol=udp,FromPort=7844,ToPort=7844,IpRanges=[{CidrIp=0.0.0.0/0}]'

7. Operational procedures
7.1 Bumping an Image Builder component

Edit the YAML in `infra/imagebuilder/components/`. Then in the same commit:

- Bump the matching component’s `version` in `components.tf` (e.g., 1.0.2 → 1.0.3)
- Bump the recipe’s `version` in `pipeline.tf` to match (recipes are immutable per version)
- `terraform plan -out=tfplan && terraform apply tfplan`

The `lifecycle { create_before_destroy = true }` on both resources handles the version transition cleanly. Old versions remain in AWS as orphan history (free).
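The version arithmetic is simple, but it is easy to forget one of the two bumps when editing by hand. A minimal Python helper, with hypothetical names and assuming plain `x.y.z` version strings as used in this runbook, could look like this:

```python
def bump_patch(version: str) -> str:
    """Increment the patch segment of an x.y.z version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def plan_bump(component_version: str, recipe_version: str) -> dict:
    """Return the new versions for both files so neither bump is skipped."""
    return {
        "components.tf": bump_patch(component_version),
        "pipeline.tf": bump_patch(recipe_version),
    }


next_versions = plan_bump("1.0.2", "1.0.2")
```

Parsing the segments as integers avoids the string-increment mistake of turning `1.0.9` into anything other than `1.0.10`.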
7.2 Manually publishing an AMI to SSM
Until the EventBridge auto-publish bug is fixed (task #21), publish manually after each build:

    LATEST_AMI=$(aws imagebuilder get-image \
      --image-build-version-arn <image-build-version-arn from list-image-build-versions> \
      --query 'image.outputResources.amis[0].image' --output text)

    aws ssm put-parameter --name /adventive/cloudflared/ami-id-latest \
      --value "$LATEST_AMI" --type String --overwrite \
      --description "adv-cflared AMI manually published"

The Lambda is wired correctly and works on direct invocation:

    aws lambda invoke --function-name adv-cflared-publish-ami \
      --cli-binary-format raw-in-base64-out \
      --payload '{"detail":{"image-arn":"<image-build-version-arn>","state":{"status":"AVAILABLE"}}}' \
      /tmp/lambda-out.json
    cat /tmp/lambda-out.json

7.3 Cleaning up old AMIs

Until the lifecycle policy lands (task #20), periodic manual sweep:

    # Keep the AMI currently published in SSM; deregister the rest and delete their snapshots
    KEEP_AMI=$(aws ssm get-parameter --name /adventive/cloudflared/ami-id-latest --query 'Parameter.Value' --output text)
    for ami in $(aws ec2 describe-images --owners self --filters 'Name=tag:adv:image,Values=cflared' --query 'Images[].ImageId' --output text); do
      if [ "$ami" = "$KEEP_AMI" ]; then continue; fi
      SNAPS=$(aws ec2 describe-images --image-ids "$ami" --query 'Images[0].BlockDeviceMappings[?Ebs].Ebs.SnapshotId' --output text)
      aws ec2 deregister-image --image-id "$ami"
      for snap in $SNAPS; do aws ec2 delete-snapshot --snapshot-id "$snap"; done
    done

7.4 Adding a new environment to the tunnels module

In `infra/cloudflare-tunnels/terraform.tfvars`:

    environments = {
      dev = { zone_id = "c737bae5b535c0ec3daa72c809721e7d", apex = "adventive.dev" }
      stg = { zone_id = "<adventivestg.com zone ID>", apex = "adventivestg.com" }
    }

`terraform plan -out=tfplan` should show 5 new resources (1 random_bytes + 1 tunnel + 1 CNAME + 1 secret + 1 secret_version). Then run `terraform apply tfplan`.
The API token must have permissions on the new zone. If using the dev-only token, create a new token with broader scope or extend the existing one.
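To preview what a new environment produces, the naming rules implied by the module defaults in §5.2 (`tunnel_subdomain = tunnel`, `tunnel_name_prefix = adv-cflared`, `secret_path_prefix = /adventive/cloudflared`) can be sketched in Python. This mirrors the documented defaults only. The function name and output layout are hypothetical, and the Terraform module remains the source of truth.

```python
RESOURCES_PER_ENV = 5  # random_bytes, tunnel, CNAME, secret, secret_version


def expected_names(environments, tunnel_subdomain="tunnel",
                   tunnel_name_prefix="adv-cflared",
                   secret_path_prefix="/adventive/cloudflared"):
    """Derive tunnel name, hostname, and secret name for each environment."""
    return {
        env: {
            "tunnel_name": f"{tunnel_name_prefix}-{env}",
            "hostname": f"{tunnel_subdomain}.{cfg['apex']}",
            "secret_name": f"{secret_path_prefix}/{env}",
        }
        for env, cfg in environments.items()
    }


environments = {
    "dev": {"apex": "adventive.dev"},
    "stg": {"apex": "adventivestg.com"},
}
names = expected_names(environments)
expected_new_resources = RESOURCES_PER_ENV * (len(environments) - 1)  # dev already exists
```

For the dev entry this reproduces the values in §2 (`adv-cflared-dev`, `tunnel.adventive.dev`, `/adventive/cloudflared/dev`), which is a quick sanity check before trusting the stg output.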
8. Known deficiencies (tracked as backlog)
| ID | Description | Impact | Mitigation |
|---|---|---|---|
| #20 | No Image Builder lifecycle policy — old AMIs accumulate | Cost: ~$1/month per orphan AMI’s snapshot | Manual sweep script in §7.3; full fix is aws_imagebuilder_lifecycle_policy |
| #21 | EventBridge → Lambda auto-publish doesn’t fire on Image State Change events | New AMIs don’t update SSM automatically | Manual aws ssm put-parameter per §7.2; root cause is event field shape mismatch |
| Token type | Cloudflare auth uses user-owned token, not account-owned | Token tied to user account; dies if user leaves | Migrate to account-owned token when its UI properly exposes Tunnel permission |
| Local TF state | Both modules use local state | State files contain tunnel secrets; state lives on developer’s laptop | Mandatory before stg/prd: encrypted S3 backend + DynamoDB locking; backend stub already in versions.tf |
| Build SG dual-purpose | (resolved in Phase 1.5 design) | — | Phase 1.5 ASG module owns the runtime SG with 7844 egress |
9. Phase 1.5 — Per-env ASG module
Module location: `infra/cflared-asg/`. Replaces the manual smoke test with a declarative, self-healing fleet.

9.1 Files

- `versions.tf`: Terraform 1.6+, AWS provider ~> 5.50, default tags, S3 backend stub (commented)
- `variables.tf`: `aws_region`, `ssm_ami_parameter`, `secret_path_prefix`, `environments` map (per-env: `subnet_ids`, `instance_type`, `desired_capacity`, `min_healthy_percentage`, `health_check_grace_period`)
- `iam.tf`: per-env runtime role (EC2 trust) + AmazonSSMManagedInstanceCore + inline `secretsmanager:GetSecretValue` on `/adventive/cloudflared/<env>-*` + instance profile
- `security_group.tf`: per-env runtime SG (no inbound) + 3 egress rules: TCP 7844, UDP 7844, TCP 443
- `launch_template.tf`: `image_id` from SSM (`data.aws_ssm_parameter.ami_id.insecure_value`), IMDSv2 required, `instance_metadata_tags=enabled`, instance/volume tag specifications including `adv:env`
- `asg.tf`: ASG with rolling instance refresh policy, `min_healthy_percentage=100` for dev (launch new before terminating old), references LT via `latest_version`
- `outputs.tf`: `ami_id_used`, `asg_names`, `launch_template_ids`, `launch_template_versions`, `runtime_role_arns`, `runtime_security_group_ids`, `instance_profile_names`
- `terraform.tfvars`: dev only: `subnet_ids = ["subnet-2ef28677"]`, `instance_type = "t3.micro"`, `desired_capacity = 1`
9.2 Apply procedure
    cd ~/Repositories/GitHub/Adventive/adventive-platform-infra/infra/cflared-asg

    terraform init
    terraform plan -out=tfplan
    terraform apply tfplan

Creates 10 resources for dev: 1 IAM role + 1 instance profile + 1 inline policy + 1 managed policy attachment + 1 SG + 3 SG egress rules + 1 LT + 1 ASG.

9.3 As-built validation (2026-04-29)

    ASG:        adv-cflared-dev
    AMI in use: ami-02f2f244d3dd56cb4
    Instance:   i-060419d0a98c98422 (Healthy / InService)

Tunnel status progression after `terraform apply`:

    [12:57:14] down
    [12:57:24] down
    [12:57:34] down
    [12:57:44] down
    [12:57:55] down
    [12:58:05] healthy   ← registered

~60 seconds from instance launch to tunnel registration, comparable to the manual smoke test.
down (vs inactive) is the right state when the tunnel has prior connection history. inactive would mean Cloudflare has never seen a connector; down means it’s seen one before but it’s currently absent.
9.4 Gotchas encountered
- `data.aws_ssm_parameter.value` is sensitive by default. SSM parameters can hold SecureString secrets, so the AWS provider marks `.value` sensitive even when the parameter is type=String. This makes the `image_id` field on the launch template render as `(sensitive value)` and breaks any output that references it. Fix: use `data.aws_ssm_parameter.ami_id.insecure_value` (sibling attribute, only valid for type=String, returns the same content without the sensitive flag). Applied in both `launch_template.tf` and `outputs.tf`.
- VPC has no private subnets. Survey of `vpc-e1636084` showed all 6 subnets are public (IGW route, no NAT). Pragmatic decision: use the public `subnet-2ef28677` for dev with strict SG (no inbound). Tracked as deficiency for prd planning. See §8 and `project_phase15_subnet_deficiency.md` in memory.
- `min_healthy_percentage=100` for desired=1 ASG is the right setting for dev. Forces ASG to launch the replacement first, wait healthy, then terminate the old instance, briefly running 2 instances during refresh and achieving zero downtime. With `=50` on a single-instance ASG, AWS rounds 50% of 1 down to 0 and may terminate before launching, causing brief downtime. Prd with `desired=2` will use `=50` correctly (1 always healthy).
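The arithmetic behind the last item can be written out. This Python sketch models the number of instances that must stay healthy during a refresh as the percentage of desired capacity rounded down, which is the behavior this runbook describes. Treat the rounding rule as an assumption taken from the text above rather than a statement of the exact AWS algorithm.

```python
def min_healthy_instances(desired: int, min_healthy_percentage: int) -> int:
    """Instances that must remain healthy during a refresh (rounded down)."""
    return (desired * min_healthy_percentage) // 100


dev_strict = min_healthy_instances(1, 100)   # dev as built
dev_relaxed = min_healthy_instances(1, 50)   # the setting to avoid on desired=1
prd_planned = min_healthy_instances(2, 50)   # planned prd shape
```

A result of 0 means the ASG is allowed to terminate the only instance before its replacement is healthy, which is the downtime case described above.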
9.5 Operational implications
- Rolling AMI updates: when Image Builder publishes a new AMI to SSM, run `terraform plan`/`apply` in this module. Plan shows LT `image_id` change → new LT version → ASG instance refresh → rolling replacement.
- Self-healing: if the instance crashes or fails health checks, ASG launches a replacement automatically. No human action needed.
- Tunnel state continuity: Cloudflare tunnel UUIDs and secrets persist across instance replacements (they live in the cloudflare-tunnels module’s state). Any number of cloudflared connectors can register against the same tunnel UUID — instance churn doesn’t affect the tunnel’s identity in Cloudflare’s view.
10. Phase 2 — Hyperdrive provisioning
Module location: infra/cloudflare-hyperdrive/. Creates Cloudflare Hyperdrive configs that the Public API Worker will bind to for reaching the dev RDS through the existing tunnel.
10.1 Architecture
    Worker → Hyperdrive → Cloudflare edge → Access (service token auth) → Tunnel → cloudflared on adv-cflared-dev (ASG) → TCP 3306 → RDS MySQL (development.coi6rcntfbgg…)

Hyperdrive resolves the public hostname (db-<db>-dev.adventive.dev) through Cloudflare. The Access application protecting that hostname requires a service-token credential, which Hyperdrive presents on every connection. cloudflared then proxies the TCP stream through the tunnel to the actual RDS endpoint.
Two Hyperdrives created:
- `adv-svc-public-api-console-dev` → MySQL database `console`
- `adv-svc-public-api-aggregate-dev` → MySQL database `aggregate`
10.2 Module dependencies (changes made to upstream modules)
infra/cloudflare-tunnels/ was refactored to drive ingress + DNS records from a per-env list in tfvars rather than a single hardcoded ingress. Each entry creates a CNAME under the env’s apex zone and adds an ingress stanza to the tunnel’s config_yaml. Existing single-tunnel resource keys (["dev"]) became compound keys (["dev/tunnel"]); the resource refresh during plan handled this without a destroy.
infra/cflared-asg/ added one egress rule on the runtime SG: TCP 3306 to 0.0.0.0/0. Without it cloudflared can dial the RDS hostname but the runtime SG drops the SYN.
10.3 New module: infra/cloudflare-hyperdrive/
Files:

- `versions.tf`: Cloudflare ~> 4.50, AWS ~> 5.50, default tags.
- `variables.tf`: `environments` map (non-sensitive structure: rds_sg_id, rds_host, databases) + `database_passwords` (sensitive; keyed `[env][db]`). Split because Terraform forbids sensitive values in `for_each` keys.
- `locals.tf`: flattens (env, db) pairs for uniform iteration; data source looking up the cflared runtime SG by name.
- `secrets.tf`: Secrets Manager secret + version per (env, db). JSON shape: `{username, password, host, port, database}`. Hyperdrive doesn’t read from Secrets Manager directly; this is for any other tooling that wants the canonical home for these creds.
- `access.tf`: one service token per env + one Access application per (env, db) hostname + one Access policy per app allowing the env’s service token (`decision = "non_identity"`).
- `hyperdrive.tf`: `cloudflare_hyperdrive_config` per (env, db). Note `origin = { ... }` (attribute, not block), and `port` MUST be omitted when `access_client_id`/`access_client_secret` are set (the cloudflared ingress determines the port).
- `db_security_group.tf`: adds an ingress rule to the existing RDS SG allowing 3306 from the runtime SG (referenced by ID via the data source).
- `outputs.tf`: Hyperdrive IDs (used in `wrangler.toml`), service token client IDs, secret ARNs.
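The flattening that locals.tf performs, and the reason variables.tf splits structure from passwords, can be illustrated outside Terraform. The TypeScript below is a sketch, not the module's code: the `env/db` compound key format is an assumption (by analogy with the `dev/tunnel` keys in §10.2), the type and function names are hypothetical, and the sample values are placeholders. Iteration keys come only from the non-sensitive structure; the password is looked up afterwards and never becomes a key.

```typescript
// Illustrative only: mirrors the (env, db) flattening described for locals.tf.
// The "env/db" key format and all sample values are assumptions.
interface EnvironmentConfig {
  rds_sg_id: string;
  rds_host: string;
  databases: string[];
}

interface FlatEntry {
  key: string; // compound key, assumed to be "env/db"
  env: string;
  db: string;
  host: string;
}

// Produce one entry per (env, db) pair from the non-sensitive structure.
function flattenEnvDb(environments: Record<string, EnvironmentConfig>): FlatEntry[] {
  const out: FlatEntry[] = [];
  for (const [env, cfg] of Object.entries(environments)) {
    for (const db of cfg.databases) {
      out.push({ key: `${env}/${db}`, env, db, host: cfg.rds_host });
    }
  }
  return out;
}

// Passwords live in a separate map keyed [env][db]; they are values, never keys.
function passwordFor(
  passwords: Record<string, Record<string, string>>,
  entry: FlatEntry,
): string {
  const value = passwords[entry.env]?.[entry.db];
  if (value === undefined) {
    throw new Error(`no password configured for ${entry.key}`);
  }
  return value;
}

const sampleEnvironments: Record<string, EnvironmentConfig> = {
  dev: {
    rds_sg_id: "sg-EXAMPLE",
    rds_host: "rds.example.internal",
    databases: ["console", "aggregate"],
  },
};
const flatEntries = flattenEnvDb(sampleEnvironments);
```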
10.4 Apply procedure
    cd ~/Repositories/GitHub/Adventive/adventive-platform-infra/infra/cloudflare-tunnels
    terraform plan -out=tfplan && terraform apply tfplan

    cd ../cflared-asg
    terraform plan -out=tfplan && terraform apply tfplan

    aws autoscaling start-instance-refresh --auto-scaling-group-name adv-cflared-dev \
      --preferences '{"MinHealthyPercentage": 100, "InstanceWarmup": 90}'
    # Wait for refresh to complete; the new instance picks up the new tunnel ingress

    cd ../cloudflare-hyperdrive
    terraform init
    terraform plan -out=tfplan && terraform apply tfplan

The instance refresh between cflared-asg and cloudflare-hyperdrive is mandatory: Hyperdrive validates the database connection at creation time, and that validation goes through the live cloudflared instance. If the instance still has the old config without the database ingress rules, Hyperdrive create fails with 404 Not Found (2015).
10.5 Token permissions (cumulative through Phase 2)
The adv-platform-infra-tunnels-dev Cloudflare API token now has these policies:
| Scope | Permission group | Access |
|---|---|---|
| Entire Account | Cloudflare One Connector: cloudflared | Edit |
| Entire Account | Access: Apps and Policies | Edit |
| Entire Account | Access: Service Tokens | Edit |
| Entire Account | Hyperdrive | Edit |
| Specified Domains → adventive.dev | DNS | Edit |
| Specified Domains → adventive.dev | Zone | Read |
10.6 Gotchas encountered
- Sensitive variables can’t be used in `for_each`. Marking a variable `sensitive = true` makes everything derived from it sensitive, including its keys. Terraform forbids sensitive resource addresses (would leak structure metadata). Fix: split into a non-sensitive structure variable + a sensitive password-only map keyed identically.
- Cloudflare provider v4: `origin` is an attribute, not a block. `origin = { ... }` (with `=`), not `origin { ... }`.
- `origin.port` cannot coexist with `origin.access_client_id`. When using Access service-token auth, port routing is determined by the cloudflared ingress rule’s TCP service URL, not by the Hyperdrive config. Drop `port` from the origin attribute when using Access.
- MySQL non-existent users on RDS trigger AuthSwitchRequest. Hyperdrive doesn’t support AuthSwitch; if the username in the Hyperdrive config doesn’t exist on the database, the error reads as a Hyperdrive-side limitation rather than “user not found.” Always verify the user exists first via `SELECT User, Host, plugin FROM mysql.user`.
- Token permission granularity is finer than expected. Cloudflare splits Access into `Access: Apps and Policies` (apps + policies) and `Access: Service Tokens` (service tokens); these are separate permission groups. Hyperdrive needs its own permission group too. All three must be added to the token before the Phase 2 module applies cleanly.
- Hyperdrive validates the DB connection at creation. This is a feature: `terraform apply` won’t lie to you about connectivity. If the tunnel ingress is wrong, MySQL credentials are wrong, or the SG path is broken, `cloudflare_hyperdrive_config` create will fail and you’ll know immediately. The flip side: the tunnel’s runtime instance must be running the current ingress config before applying this module.
10.7 Worker integration (Phase 4 territory)
When the Public API Worker is built (Phase 4), its `wrangler.toml` binds the Hyperdrive IDs:

    [[hyperdrive]]
    binding = "DB_CONSOLE"
    id = "059838c4abb64a92a4aece2a6a533a29"

    [[hyperdrive]]
    binding = "DB_AGGREGATE"
    id = "c1b18833b07347daa77b56a2d19ef508"

Standard mysql2/promise connection inside the Worker:

    import mysql from 'mysql2/promise';
    const c = await mysql.createConnection(env.DB_CONSOLE.connectionString);
    const [rows] = await c.query('SELECT 1');

Phase 3 later found that this connection-string form fails inside Workers; see §11.7 for the config-object form with `disableEval: true`.

11. Phase 3 — Auth helper Worker
Repo: ~/Repositories/GitHub/Adventive/adventive-auth-helper-worker/. New TypeScript+Hono Worker that any Adventive Worker can call over a service binding to validate X-Api-Key + X-Integration-Key against the console.api table.
11.1 Architecture
    caller Worker --(service binding "AUTH")--> adv-svc-auth-helper-dev
                                                  │
                                                  ├── KV CACHE (5-min TTL) ── hit ──► return
                                                  │
                                                  └── miss
                                                        │
                                                        └─► Hyperdrive DB_CONSOLE
                                                              │
                                                              └─► Cloudflare edge → Access (service token) → Tunnel
                                                                    │
                                                                    └─► cloudflared on adv-cflared-dev → RDS console.api

11.2 Endpoints
- `GET /__health` → `{status, commit_sha, environment}`
- `POST /auth` with `X-Api-Key` + `X-Integration-Key` headers → `{valid, accountId, rph, name}` or `{valid:false}` on bad keys, 503 on DB unreachable
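The `/auth` response shapes can be pinned down with a small mapping function. This is a sketch rather than the code in src/auth.ts: the `ApiRow` type and `toAuthResult` name are hypothetical, the column names come from the query in §11.4, and the zeroed fields on failure follow the smoke-test output in §11.6.

```typescript
// Hypothetical helper: maps an optional console.api row to the /auth response body.
interface ApiRow {
  account_id: number;
  name: string;
  rph: number;
}

type AuthResult =
  | { valid: true; accountId: number; rph: number; name: string }
  | { valid: false; accountId: 0; rph: 0 };

function toAuthResult(row: ApiRow | undefined): AuthResult {
  if (row === undefined) {
    // No matching, non-deleted key pair: still an HTTP 200, with valid:false.
    return { valid: false, accountId: 0, rph: 0 };
  }
  return { valid: true, accountId: row.account_id, rph: row.rph, name: row.name };
}

const hit = toAuthResult({ account_id: 246, name: "CLOUDFLAREDEVAPITEST", rph: 500 });
const miss = toAuthResult(undefined);
```

Keeping "bad keys" as a normal response body, and reserving 503 for an unreachable database, lets callers tell a rejected credential apart from an outage.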
11.3 Repo layout
    adventive-auth-helper-worker/
    ├── package.json     hono + mysql2 deps; npm scripts for typecheck/deploy/tail
    ├── tsconfig.json    strict, Workers types
    ├── wrangler.toml    [env.dev] only; stg/prd stubbed
    ├── src/
    │   ├── env.ts       Bindings interface (DB_CONSOLE Hyperdrive + CACHE KV)
    │   ├── logger.ts    Structured JSON logger; events match plan vocabulary
    │   ├── auth.ts      SHA-256 cache key + KV lookup + mysql2 query
    │   └── index.ts     Hono app
    ├── tests/           Stub vitest; full suite is Phase 4 territory
    ├── README.md
    ├── RUNBOOK.md
    └── CHANGELOG.md

11.4 SQL query (verbatim from legacy)

    SELECT account_id, name, rph
    FROM api
    WHERE int_key = ? AND api_key = ? AND !is_deleted
    LIMIT 1

The legacy CodeIgniter Api_model::validateKeys() was the source. The legacy code also runs UPDATE api SET r_count = r_count + 1 for rate limiting. Per the plan, that responsibility moves to a Durable Object in the public API Worker (Phase 5), leaving the auth helper a pure validator.
11.5 Deploy procedure
    cd ~/Repositories/GitHub/Adventive/adventive-auth-helper-worker
    npm install
    npm run typecheck
    npx wrangler deploy --env dev

After deploy, tail logs:

    npx wrangler tail --env dev

11.6 Smoke test results (validated 2026-04-29)

    GET /__health → 200 OK with commit_sha
    POST /auth (real keys) → {valid:true, accountId:246, rph:500, name:"CLOUDFLAREDEVAPITEST"}
    POST /auth (real keys, second hit) → same JSON; logs show auth.cache.hit (no DB query)
    POST /auth (bogus keys) → {valid:false, accountId:0, rph:0}

11.7 Gotchas encountered
- mysql2 + Cloudflare Workers requires `disableEval: true`. By default mysql2 uses `eval`/`new Function` to generate optimized result parsers; CF Workers’ V8 isolate blocks runtime code generation (`Code generation from strings disallowed for this context`). Fix: pass connection config as an object (host/user/password/database/port + `disableEval: true`) instead of a connection string. The connection string path doesn’t accept extra options.
- wrangler hits `/memberships` even with account-owned tokens despite `whoami` succeeding. Tracked Cloudflare quirk. Workaround: `wrangler login` for OAuth-based local dev, separate auth context from the API token Terraform uses. The two coexist; switch by setting/unsetting `CLOUDFLARE_API_TOKEN`.
- Cloudflare account workers.dev subdomain is unique per account and not obvious from the dashboard. Pull from `wrangler deploy` output (printed at end of every deploy) or from the Worker’s page on dash.cloudflare.com. For Adventive: `adventive.workers.dev`.
- KV binding name vs namespace title. When you create a KV namespace via `wrangler kv namespace create kv-adv-svc-auth-helper-cache-dev`, wrangler suggests a binding name matching the title. We use `CACHE` as the binding (clean, code-side), with the descriptive name as the namespace title. The two are independent.
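The `disableEval` fix amounts to building the mysql2 options object from the Hyperdrive binding's individual fields. A minimal sketch, assuming the binding exposes host, port, user, password, and database alongside connectionString; `HyperdriveLike`, `MysqlOptions`, and `mysqlOptionsFrom` are illustrative names, not identifiers from the repo, and the sample values are placeholders.

```typescript
// Local stand-in for the Hyperdrive binding's connection fields (assumed shape).
interface HyperdriveLike {
  host: string;
  port: number;
  user: string;
  password: string;
  database: string;
}

interface MysqlOptions extends HyperdriveLike {
  disableEval: true;
}

// Builds the object form of the connection config.
// disableEval stops mysql2 from generating code at runtime, which Workers forbid.
function mysqlOptionsFrom(binding: HyperdriveLike): MysqlOptions {
  return {
    host: binding.host,
    port: binding.port,
    user: binding.user,
    password: binding.password,
    database: binding.database,
    disableEval: true,
  };
}

const exampleOptions = mysqlOptionsFrom({
  host: "hyperdrive.example",
  port: 3306,
  user: "example_user",
  password: "example_password",
  database: "console",
});
```

Inside the Worker this would be used as `mysql.createConnection(mysqlOptionsFrom(env.DB_CONSOLE))`, keeping the option in one place so no call site falls back to the connection string.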
11.8 Token permissions added in Phase 3
The adv-platform-infra-tunnels-dev token didn’t get new permissions; we used wrangler login (OAuth) for Worker deploys instead. If we later want CI to deploy the auth helper, we’ll need to add to that token (or a CI-specific token):
| Permission group | Access |
|---|---|
| Workers Scripts | Edit |
| Workers KV Storage | Edit |
| User Details | Read (so wrangler’s /memberships probe succeeds) |
12. Phases 4–7 — Public API Worker (driven by Claude Code)
Repo: ~/Repositories/GitHub/Adventive/adventive-public-api-worker/. Live: https://api.adventive.dev.
12.1 Architecture (in one paragraph)
Hono router on Cloudflare Workers. Auth via service binding to adv-svc-auth-helper-dev. Two Hyperdrive MySQL connections: DB_CONSOLE (campaign / advertiser / placement structure) and DB_AGGREGATE (kpi / engagement / quartile / clickthrough metrics). Rate limiting via a RateLimiter Durable Object per API key, hourly window, alarm-based reset. All DB queries use db.query(), not db.execute() (Hyperdrive rejects COM_STMT_PREPARE). placement_name is a computed CASE expression joining ad_html5.ad_name or asset.asset_name, not a real column.
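The per-key hourly window can be modelled without any Workers APIs. The class below is a simplified stand-in for the RateLimiter Durable Object, not its implementation: it resets by comparing timestamps instead of using an alarm, keeps state in memory, and all names are illustrative. The limit would correspond to the `rph` value returned by the auth helper. It also computes the time left in the window, whereas the deployed Worker sends a fixed `Retry-After: 3600` (§12.3).

```typescript
const HOUR_MS = 60 * 60 * 1000;

interface Decision {
  allowed: boolean;
  remaining: number;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Fixed-window counter: at most `limit` requests per one-hour window.
class HourlyWindow {
  private readonly limit: number;
  private windowStart: number | null = null;
  private count = 0;

  constructor(limit: number) {
    this.limit = limit;
  }

  // nowMs is passed in so the behaviour is deterministic and testable.
  take(nowMs: number): Decision {
    let start = this.windowStart;
    if (start === null || nowMs - start >= HOUR_MS) {
      // Open a fresh window; the real object would do this from an alarm.
      start = nowMs;
      this.windowStart = start;
      this.count = 0;
    }
    if (this.count >= this.limit) {
      const msLeft = start + HOUR_MS - nowMs;
      return { allowed: false, remaining: 0, retryAfterSeconds: Math.ceil(msLeft / 1000) };
    }
    this.count += 1;
    return { allowed: true, remaining: this.limit - this.count, retryAfterSeconds: 0 };
  }
}
```

One instance per API key mirrors the "Durable Object per API key" layout: each key gets its own counter and its own window start.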
12.2 Endpoints (all 7 verified working against real dev DBs)
| Method | Path | Returns |
|---|---|---|
| GET | /credentialscheck | { status: true } |
| GET | /advertisers | { data: [{id, name}] } |
| GET | /advertisers/:id | { data: AdvertiserDetail } (with contacts[]) |
| GET | /campaigns | { data: [CampaignListItem] } (no default date filter) |
| GET | /campaigns/:id | { data: CampaignDetail } (sites→placements→ad_units→delivery_groups) |
| GET | /analytics/:campaignId | { data: CampaignAnalytics } (scalar, not array) |
| GET | /clickthroughs | { data: [ClickthroughRow] } (requires advertiser_id; 4-month default) |
| GET | /connector | { data: [ConnectorRowV1\|V2] } (requires advertiser_id; 2 bulk queries) |
| GET | /dataconnector | { data: [ConnectorRowV1\|V2] } (account-wide; 2 bulk queries) |
All routes accept /v{N}.{x}/ prefix. Major version ≥ 2 activates v2 schema (adds engagement + video quartile data on analytics / connector / dataconnector).
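The version-prefix rule can be sketched as a small parser. This is illustrative rather than the Worker's routing code: it assumes both `N` and `x` are decimal integers and that a path with no prefix is treated as v1.0, neither of which is stated above, and the function and type names are hypothetical.

```typescript
interface VersionedPath {
  major: number;
  minor: number;
  path: string; // the path with the version prefix removed
  useV2Schema: boolean;
}

// Matches a leading "/v{N}.{x}/" segment, e.g. "/v2.0/analytics/42".
const VERSION_PREFIX = /^\/v(\d+)\.(\d+)(\/.*)$/;

function parseVersionedPath(pathname: string): VersionedPath {
  const m = VERSION_PREFIX.exec(pathname);
  if (m === null) {
    // Assumption: unprefixed paths fall back to v1.0.
    return { major: 1, minor: 0, path: pathname, useV2Schema: false };
  }
  const major = Number(m[1]);
  const minor = Number(m[2]);
  // Any major version of 2 or higher selects the v2 schema.
  return { major, minor, path: m[3] ?? "/", useV2Schema: major >= 2 };
}
```

For example, `parseVersionedPath("/v2.0/analytics/42")` yields the path `/analytics/42` with `useV2Schema` set, while `/v1.3/campaigns` routes to `/campaigns` with the v1 schema.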
12.3 Error format
RFC 7807. Body: { status, title, detail }. Content-Type: application/problem+json. 429 responses include Retry-After: 3600.
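A minimal sketch of an error builder that follows this format is shown below. The `problem` function and its types are hypothetical, not taken from the Worker; the builder returns plain data so it can be exercised without the Workers runtime.

```typescript
interface ProblemBody {
  status: number;
  title: string;
  detail: string;
}

interface ProblemResponse {
  status: number;
  headers: Record<string, string>;
  body: string; // JSON-encoded ProblemBody
}

function problem(status: number, title: string, detail: string): ProblemResponse {
  const headers: Record<string, string> = {
    "Content-Type": "application/problem+json",
  };
  if (status === 429) {
    // The rate-limit window is hourly, so clients are told to wait one hour.
    headers["Retry-After"] = "3600";
  }
  const body: ProblemBody = { status, title, detail };
  return { status, headers, body: JSON.stringify(body) };
}

const tooMany = problem(429, "Too Many Requests", "Hourly request limit reached.");
const unauthorized = problem(401, "Unauthorized", "Invalid API key or integration key.");
```

In a Worker the result would be wrapped as `new Response(p.body, { status: p.status, headers: p.headers })`.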
12.4 Per-commit gate
    npm run typecheck
    npm run test
    npx wrangler deploy --dry-run --env dev

Set CLOUDFLARE_ACCOUNT_ID=46a873457665355ba02a85e61d7200a7 for dry-run/deploy.
12.5 Gotchas captured during Phases 4–7
- Cloudflare does NOT auto-create DNS records when a Worker route deploys. Manually add an `AAAA` record pointing at `100::` (proxied) for each custom hostname before the first deploy to that env. ADR at `decisions/2026-04-29-worker-dns.md` in the public-api-worker repo. This will recur at stg/prd setup: pre-create `api.adventivestg.com` and `api.adventive.com` AAAA records before deploying.
- mysql2 is pinned; do not upgrade without testing. Hyperdrive compatibility is the constraint. Pin notes are in `src/lib/db.ts`. The `disableEval: true` requirement (already documented in memory) still applies.
- Hyperdrive rejects `COM_STMT_PREPARE`. Always use `db.query()`, never `db.execute()`. Documented in the public API Worker but worth surfacing for any new MySQL Worker.
- AUTH helper interface contract: POST with `{ apiKey, integrationKey }` headers → `{ accountId, rph }` plus `valid` flag. The auth helper is in a separate repo (`adventive-auth-helper-worker`); its `valid:false` (200) and DB-unreachable (503) semantics MUST be respected on the consumer side. Public API Worker translates `valid:false` to its own 401 response shape for end users.
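The consumer side of the AUTH contract can be sketched as a single decision function. This is an illustration, not the Public API Worker's code: the type and function names are hypothetical, and passing a helper outage through as a 503 (and treating any other non-200 status the same way) is an assumption, since the text above only fixes the `valid:false` → 401 translation.

```typescript
// Body returned by the auth helper on HTTP 200.
interface AuthHelperBody {
  valid: boolean;
  accountId: number;
  rph: number;
}

type AuthOutcome =
  | { kind: "ok"; accountId: number; rph: number }
  | { kind: "unauthorized"; status: 401 }
  | { kind: "unavailable"; status: 503 };

function interpretAuthResponse(httpStatus: number, body: AuthHelperBody | null): AuthOutcome {
  if (httpStatus !== 200 || body === null) {
    // The helper could not answer (e.g. DB unreachable): an outage, not a bad credential.
    return { kind: "unavailable", status: 503 };
  }
  if (!body.valid) {
    // A 200 with valid:false means the keys were checked and rejected.
    return { kind: "unauthorized", status: 401 };
  }
  return { kind: "ok", accountId: body.accountId, rph: body.rph };
}
```

Checking the HTTP status before the `valid` flag matters: a 503 has no trustworthy body, so treating it as `valid:false` would turn an outage into spurious 401s for legitimate callers.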
12.6 Outstanding for stg / prd extension
Per Claude Code’s handoff:

- Phase 1–3 extension to stg / prd: Hyperdrive provisioning (4 more configs: console / aggregate × stg / prd), service binding deploys of `adv-svc-auth-helper-stg`/`adv-svc-auth-helper-prd`, AAAA records for `api.adventivestg.com` and `api.adventive.com`, RDS-side SG ingress for the runtime SGs that don’t exist yet.
- Phase 8 (cutover): Traffic move from PHP app to this Worker.
- Public API Worker stg / prd deploy: `wrangler deploy --env stg`/`--env prd` only after Hyperdrive IDs are filled in `wrangler.toml` and the upstream infra is up.
12.7 Post-Phase-7 hardening cycle (2026-04-29)
After the initial Phase 7 deploy, the following landed in subsequent commits per Claude Code’s report:

- OpenAPI spec compliance fixes (analysis ADR, schema sync)
- RFC 7807 error envelopes
- Banner headers on responses
- Flexible version routing (`/v{N}.{x}/` prefix matching)
- Compliance gap closure: `Retry-After` on 429, `removeZeros` v1 query parameter, analytics scalar rather than array
- Bulk connector implementation (2 bulk queries)
- Postman collection at `postman/adventive-public-api.json` with all 28 requests + pre-request auth scripting
13. Outstanding work (tracked as task backlog)
| Item | Owner | Notes |
|---|---|---|
| Image Builder lifecycle policy | task #20 | Auto-delete old AMIs |
| EventBridge → Lambda auto-publish | task #21 | Runtime event-shape mismatch |
| Account-owned token migration | memory project_phase12_token_deficiency | Wait for Cloudflare UI fix |
| Local TF state → encrypted S3 backend | runbook §8 | Required before stg/prd |
| Public subnet → NAT + private subnet | memory project_phase15_subnet_deficiency | Required before prd |
| `-ro` suffix vs actual user names | runbook §8 | Cosmetic; reflects future-cluster-split intent |
| Looker / Data Studio integration guide | task #33 | End-user docs |
| TapClicks integration guide | task #34 | End-user docs |
| Google Sheets integration guide | task #35 | End-user docs |
| Windsor.ai integration guide | task #41 | End-user docs — marketing data pipeline / connector platform |
| `developer.adventive.com` docs site | task #36 | Public-facing developer portal |
| WAF policies on Public API | task #37 | Layered defense beyond per-key rate limit |
| New Relic observability + alerting | task #38 | Per standing observability memory |
| Integrations admin-dashboard screen | task #39 | Product idea |
| Adventive MCP server | task #40 | Product idea |
| Stg / prd extension of Phases 1–3 | not yet ticketed | Multi-phase workstream |
| Phase 8 — cutover | planning doc | After stg / prd burn-in |
14. Change log
| Date | Change |
|---|---|
| 2026-04-29 | Initial document. Phase 1.1 + 1.2 complete and validated. Smoke test torn down. Phase 1.5 starting. |
| 2026-04-29 | Phase 1.5 complete — infra/cflared-asg/ applied, ASG adv-cflared-dev running one t3.micro from ami-02f2f244d3dd56cb4, tunnel adv-cflared-dev registered as healthy with Cloudflare. Phase 1 fully done. |
| 2026-04-29 | Phase 2 complete — cloudflare-tunnels refactored for multi-ingress, cflared-asg added 3306 egress, new cloudflare-hyperdrive module created. Two Hyperdrive configs (console + aggregate) for dev validated end-to-end. Worker bindings ready for Phase 4. |
| 2026-04-29 | Phase 3 complete — adventive-auth-helper-worker repo scaffolded, KV namespace provisioned, deployed to dev, smoke-tested against real and bogus key pairs. Public API Worker can now consume AUTH service binding in Phase 4. |
| 2026-04-29 | Phases 4–7 complete (Claude Code) — Public API Worker deployed live at https://api.adventive.dev. All 7 endpoint handlers verified against real dev DBs. RFC 7807 errors, /v{N}.{x}/ version routing, Postman collection. Backlog updated with documentation, WAF, observability, product items. |
| 2026-04-29 | Engineering review presentation delivered: Adventive_Public_API_DEV_Status.pptx (18 slides, Adventive engineering style). Save point — public API workstream paused for the night. Resume tomorrow on stg/prd extension, WAF, or any of the backlog items. |