Production AWS Pipeline: Serverless Intelligence, End to End
Designed, built, and maintained a fully serverless AWS system that
replaced a manual weekly reporting process, pulling live ad data,
generating AI-driven analysis via Claude API, and delivering structured
HTML reports to the leadership team every Monday at 7am.
No manual steps. No server to maintain. Built as the sole technical
owner.
The reporting process was entirely manual. I replaced it.
The company manages digital advertising across multiple client
properties. There was no automated system, no historical data store,
no consistent format, and no way to compare performance week-over-week
without doing it manually.
I designed and shipped the replacement entirely on my own. It has run
in production, without manual intervention, since the day I deployed
it.
The Problem, Precisely
No pipeline. No history. No consistency.
01
No historical data store
Week-over-week comparisons required manual lookups. Nothing
tracked trends automatically, no baseline, no variance
detection, no way to surface a pattern without someone doing the
math.
02
No standardized format
Reports varied by who produced them. No consistent KPI
structure. No repeatable template. Stakeholders had to interpret
different formats each week.
03
No lead attribution
Reports showed impressions and clicks but not which ad sets
generated leads. CPL calculations used total account spend
instead of campaign-level spend, making costs look artificially
high.
04
No actionable analysis
Data existed, but didn't surface decisions. Stakeholders
interpreted raw numbers with no guidance on what to act on.
Architecture Decisions
Seven services. Every choice deliberate.
I designed the architecture from scratch with one constraint that
shaped every decision:
no one would maintain this system after I left. It
had to be self-running, auditable, and extensible without me.
EventBridge over a cron job
A managed schedule (cron(0 12 ? * MON *)) means no
server to maintain, no cron daemon to monitor, automatic retry on
failure, and full visibility in the AWS console. A non-engineer can
inspect or modify the schedule without touching code.
DynamoDB for time-series metrics
Needed week-over-week comparison without a relational database.
DynamoDB with campaign_id as partition key and
week_start_date as sort key gives efficient
point-in-time lookups at effectively zero cost at this data volume.
No schema migrations. No maintenance window.
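A minimal sketch of how that key design supports point-in-time lookups (the date helper and key shape here are illustrative; the production table uses the same partition/sort pair):

```python
from datetime import date, timedelta

def week_start(d: date) -> str:
    # Monday of the week containing d, ISO-formatted (the sort key value)
    return (d - timedelta(days=d.weekday())).isoformat()

def metrics_key(campaign_id: str, d: date) -> dict:
    # DynamoDB primary key: campaign_id (partition) + week_start_date (sort)
    # in low-level attribute-value form, as a GetItem call expects
    return {
        "campaign_id": {"S": campaign_id},
        "week_start_date": {"S": week_start(d)},
    }
```

Fetching last week's row for a campaign is a single GetItem against this key, no scan and no index needed.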
Secrets Manager over environment variables
Three API tokens (Meta, Anthropic, HubSpot) need rotation without
redeployment. Secrets Manager makes access auditable via CloudWatch
and decouples credential management from the Lambda entirely. If a
token rotates, no code changes.
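A sketch of the retrieval path (the secret name is illustrative; all three tokens live in one JSON secret):

```python
import json

def parse_secret(secret_string: str) -> dict:
    # Secrets Manager returns the payload as a JSON string
    return json.loads(secret_string)

def get_secrets(secret_id: str = "ad-report/api-tokens") -> dict:
    # secret_id is illustrative; boto3 import is deferred so this module
    # can be unit-tested without AWS credentials or the SDK installed
    import boto3
    resp = boto3.client("secretsmanager").get_secret_value(SecretId=secret_id)
    return parse_secret(resp["SecretString"])
```

Because the Lambda reads the secret at runtime, rotating a token is a one-field update in the console.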
SES over third-party email
Kept everything in the AWS ecosystem, enabled full HTML email
formatting, and cost fractions of a cent per report. Verified the
[company domain] domain and all recipient addresses
before go-live, a step that would have blocked production silently
if skipped.
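A sketch of the SES delivery call, split so the message shape is testable without AWS (sender and recipient addresses are placeholders):

```python
def build_email(subject: str, html: str, sender: str, recipients: list) -> dict:
    # Argument shape expected by SES send_email; the Html body is what
    # enables full HTML report formatting
    return {
        "Source": sender,
        "Destination": {"ToAddresses": recipients},
        "Message": {
            "Subject": {"Data": subject},
            "Body": {"Html": {"Data": html}},
        },
    }

def send_via_ses(subject, html, sender, recipients):
    # Deferred import; SES rejects the call unless sender and recipients
    # are verified, which is why verification is a go-live checklist item
    import boto3
    boto3.client("ses").send_email(**build_email(subject, html, sender, recipients))
```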
S3 for raw data archival
Every API response gets written to S3 with a 90-day lifecycle expiration before any
transformation. If the Lambda logic has a bug downstream, the raw
data still exists and can be replayed. Immutable inputs, mutable
processing.
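A sketch of the archival write (bucket name and key layout are illustrative; expiry comes from a bucket lifecycle rule, not from code):

```python
import json
from datetime import date

def archive_key(source: str, run_date: date) -> str:
    # Date-partitioned key per upstream source; a bucket lifecycle rule
    # expires objects under raw/ after 90 days
    return f"raw/{source}/{run_date.isoformat()}.json"

def write_raw_to_s3(bucket: str, source: str, payload: dict, run_date: date):
    import boto3  # deferred so the module imports without the SDK
    boto3.client("s3").put_object(
        Bucket=bucket,
        Key=archive_key(source, run_date),
        Body=json.dumps(payload),
        ContentType="application/json",
    )
```

Writing before any transformation is what makes replay possible: rerun the Lambda against the archived object instead of re-calling the upstream API.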
Code: Lambda Orchestrator
What the entry point looks like.
The Lambda handler orchestrates the full pipeline in sequence: fetch
credentials, pull API data, persist to DynamoDB, generate AI analysis,
render HTML, deliver via SES. Each step is independently testable.
lambda_function.py · handler excerpt (Python 3.13)

def lambda_handler(event, context):
    # 1. Pull credentials from Secrets Manager
    secrets = get_secrets()

    # 2. Fetch this week's campaign data from Meta + HubSpot
    meta_data = fetch_meta_campaigns(secrets["META_TOKEN"])
    hs_data = fetch_hubspot_leads(secrets["HUBSPOT_TOKEN"])

    # 3. Archive raw payloads to S3 (immutable, 90-day TTL)
    write_to_s3(meta_data, hs_data)

    # 4. Persist week-over-week metrics to DynamoDB
    prev_week = get_previous_metrics(week_start)
    write_metrics(meta_data, week_start)

    # 5. Generate AI analysis (3-pass extended thinking)
    analysis = generate_claude_analysis(
        meta_data,
        hs_data,
        prev_week,
        api_key=secrets["ANTHROPIC_KEY"],
    )

    # 6. Render HTML report + deliver via SES to 6 recipients
    html = render_report(meta_data, hs_data, analysis)
    send_via_ses(html, recipients=STAKEHOLDER_LIST)

    return {"statusCode": 200, "body": "Report delivered"}
AI Design
Three-pass reasoning. Stability-first philosophy.
The Claude API integration was the part I iterated on most. Early
versions produced fluent but inconsistent analysis, sometimes dramatic
about normal variance, sometimes vague about real anomalies. The
prompt went through multiple rewrites before landing on what worked.
Stability-first philosophy
The insight: marketing stakeholders don't want to
be alarmed by normal week-to-week variance. The prompt explicitly
instructs Claude to treat swings under a threshold as noise, not
signal. Only surface an alert if the trend is directional and
sustained across multiple weeks.
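The same rule expressed in code (the 15% threshold here is an illustrative value, not the production one):

```python
def classify_change(pct_change: float, threshold: float = 0.15) -> str:
    # Swings inside the threshold are treated as noise, not signal
    if abs(pct_change) < threshold:
        return "FLAT"
    return "UP" if pct_change > 0 else "DOWN"

def sustained_alert(weekly_changes: list, threshold: float = 0.15) -> bool:
    # Alert only when every week in the window moved the same direction,
    # i.e. the trend is directional and sustained, not a one-week spike
    directions = [classify_change(c, threshold) for c in weekly_changes]
    return len(set(directions)) == 1 and directions[0] != "FLAT"
```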
Three-pass extended thinking
Pass 1: read the raw data. Pass 2: identify patterns and anomalies
against the 4-week rolling trend. Pass 3: generate
department-specific recommendations for Marketing, Digital/Dev, and
Leadership. Extended thinking gives the model room to reason before
committing to output, reducing confident-sounding errors.
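A sketch of the call shape with extended thinking enabled (the prompt here is heavily abbreviated; the model id, token budgets, and helper names are illustrative, not the production values):

```python
def build_analysis_prompt(meta_data: dict, hs_data: dict, trend: dict) -> str:
    # The three passes are spelled out in the prompt itself; the real
    # prompt is much longer and includes KPI definitions and thresholds
    return (
        "Pass 1: read the raw data.\n"
        "Pass 2: identify patterns and anomalies against the 4-week trend.\n"
        "Pass 3: write recommendations for Marketing, Digital/Dev, Leadership.\n\n"
        f"This week: {meta_data}\nLeads: {hs_data}\nTrend: {trend}"
    )

def generate_claude_analysis(meta_data, hs_data, trend, api_key):
    import anthropic  # deferred so the module imports without the SDK
    client = anthropic.Anthropic(api_key=api_key)
    resp = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model id
        max_tokens=16000,  # must exceed the thinking budget
        thinking={"type": "enabled", "budget_tokens": 8000},
        messages=[{"role": "user",
                   "content": build_analysis_prompt(meta_data, hs_data, trend)}],
    )
    # The final text block follows the thinking block(s) in the response
    return next(b.text for b in resp.content if b.type == "text")
```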
Structured scannable output
Replaced paragraph analysis with [UP] /
[DOWN] / [FLAT] / [ALERT] /
[OK] rendered as colored HTML indicators. A stakeholder
can scan the full report in under 30 seconds and know exactly what
needs attention.
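A sketch of how the tags map to colored HTML indicators (the hex colors are illustrative):

```python
INDICATOR_COLORS = {
    "UP": "#1a7f37",     # green
    "OK": "#1a7f37",
    "FLAT": "#6e7781",   # gray
    "DOWN": "#cf222e",   # red
    "ALERT": "#cf222e",
}

def indicator_html(tag: str) -> str:
    # Inline styles only: HTML email clients strip <style> blocks
    color = INDICATOR_COLORS[tag]
    return f'<span style="color:{color};font-weight:bold">[{tag}]</span>'
```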
Added CloudWatch structured logging for Claude's raw response and
parser output, essential for diagnosing output deviations in a
serverless system where there is no console to inspect.
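The logging itself is a thin helper; the discipline is one JSON object per line (field names here are illustrative):

```python
import json
import logging

logger = logging.getLogger("weekly_report")

def structured_log(event: str, **fields) -> str:
    # One JSON object per log line makes the output queryable in
    # CloudWatch Logs Insights instead of grep-only free text
    line = json.dumps({"event": event, **fields})
    logger.info(line)
    return line
```

For example, logging the raw Claude response length and the parser's extracted keys makes a silent parse failure visible in one query.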
What Broke, What Got Fixed, What Got Added
Real production iteration.
Fix
Meta app in Development mode, blocked all API calls
Lambda returned OAuthException code 200; every API call was
blocked silently. Root cause: the Meta app was in
Development (Unpublished) mode, restricting access to app admins
only. Fix: published app to Live, generated a new system user
token, updated the secret in Secrets Manager. Cost: two days of
debugging. Fix time: five minutes. Now a first-deploy checklist
item.
Fix
KPI cards triple-rendering in Outlook
KPI summary cards appeared three times in Outlook due to nested
HTML tables. Replaced with flat
<td width=25%> layout, Outlook-compatible, no
nesting. HTML email rendering is its own compatibility layer
entirely.
Fix
CPL calculated against wrong spend denominator
Cost per lead was calculated against total account spend instead
of campaign-level spend. Produced inflated CPL figures that
overstated advertising costs. Fixed to use only spend attributed
to lead-generating campaigns.
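The corrected calculation, sketched (input shapes are illustrative):

```python
def cost_per_lead(campaign_spend: dict, leads_by_campaign: dict) -> dict:
    # Denominator fix: divide each campaign's own spend by its own leads,
    # skipping campaigns with zero leads instead of inflating their CPL
    cpl = {}
    for campaign_id, leads in leads_by_campaign.items():
        if leads:
            cpl[campaign_id] = round(campaign_spend.get(campaign_id, 0.0) / leads, 2)
    return cpl
```

Dividing total account spend by one campaign's leads is what produced the inflated figures; scoping both numerator and denominator to the same campaign fixes it.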
Added
4-week rolling trend window
Extended get_previous_week_metrics to return a
4-week dictionary and added get_all_time_totals.
Trend data feeds into Claude so analysis reflects directional
movement across a month, not just a single point-in-time
comparison.
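The window itself is just the four preceding Mondays, which key the DynamoDB lookups (a sketch; the production helper returns the metric rows, not just the dates):

```python
from datetime import date, timedelta

def previous_week_starts(week_start: date, n: int = 4) -> list:
    # The n Mondays preceding week_start, most recent first; each date
    # is a sort-key value for a per-campaign DynamoDB lookup
    return [(week_start - timedelta(weeks=i)).isoformat() for i in range(1, n + 1)]
```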
Added
HubSpot lead attribution integration
New Lead Activity section pulls from HubSpot's API, showing
which campaign and ad set generated each lead, with calculated
CPL and landing page views. Green status when leads exist that
week, yellow when none, visible at a glance.
Outcomes
Running in production. Zero manual steps.
7
AWS services integrated in a single cohesive pipeline
0
Manual steps required to produce a report after deployment
6
Leadership team members receiving structured AI-generated analysis
weekly
This system runs in production at a real company. Every Monday at 7am,
it fetches live data, reasons over it, and delivers a formatted
report, without anyone touching it. That's the outcome that matters.
What I'd Do Differently
Three things I'd change from day one.
01
Add structured CloudWatch logging from the start
Debug logging for the Claude response parser was added after the
first production issues. In a serverless system with no
interactive console, structured logging is your only visibility
into what actually happened inside a function. It should be a
first-commit item, not an afterthought.
02
Test the Meta app publishing state before go-live
The Development mode issue cost two days. The fix was five
minutes. App publishing state, system user token scope, and API
permission tiers should be checklist items before any first
deploy involving the Meta Graph API.
03
Design the AI prompt around the reader, not the data
Early prompts asked Claude to analyze data. Better prompts told
Claude who would read the output, what decisions they needed to
make, and what useful vs. alarming looked like in this context.
The stability-first philosophy came from asking "what would make
a marketing director trust this report?"