Platform Engineering · Data Infrastructure · Sole Owner · 8,000+ students · 3 programs

NUCEE Data Platform: From Fragmented Silos to Repeatable Infrastructure

Built the first centralized data platform for Northeastern University's Center for Entrepreneurship Education (NUCEE) from scratch: designed the schema, standardized definitions across three programs, migrated 800+ users to a new platform, and automated 10+ workflows that had previously required hours of manual work each week. The deliverable wasn't a report. It was a system that keeps working after I leave.

8K+
Students served across VMN, IDEA, and Mosaic
−50%
Reduction in data retrieval time
800+
Mentors and students migrated to new platform
Role
Data & Impact Fellow · Sole Owner
Timeline
Jan 2025 – Present
Organization
NUCEE · Northeastern University
Stack
Salesforce · Airtable · Chronus · Tableau · Python · Excel
Data Architecture · Unified schema across 3 programs

Source systems, previously siloed
VMN Program · spreadsheets · ad hoc
IDEA Program · Airtable · Salesforce
Mosaic Program · Chronus · spreadsheets

Unified Data Platform
canonical schema · shared definitions · 10+ automated workflows
Salesforce (source of truth) · Airtable (working layer) · Chronus (mentor tracking)

Outputs
FY25 Impact Report · 200+ stakeholders · first ever
Princeton Review Data · 20yr process redesigned
Mentor/Student Platform · 800+ users migrated

Documentation, data dictionaries, and onboarding guides included as part of the platform deliverable
Unified Data Schema · Shared across VMN, IDEA, Mosaic

students entity
student_id · PK · uuid
program · enum
cohort_year · integer
venture_id · FK · uuid
stage · enum
active · boolean

ventures entity
venture_id · PK · uuid
name · string
stage · enum
program · FK · enum
funding_raised · decimal
mentor_id · FK · uuid

engagements entity
engagement_id · PK · uuid
mentor_id · FK · uuid
venture_id · FK · uuid
type · enum
session_date · date
duration_min · integer
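
As a rough illustration of how the three entities above relate, here is a minimal Python sketch of the schema. The field names and types follow the schema listing and the program names come from this write-up. Everything else is an assumption made for the example: the `Stage` and `EngagementType` values, the nullable foreign keys, and the `orphaned_engagements` helper. The real platform lives in Salesforce and Airtable, not in Python classes.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from enum import Enum
from typing import List, Optional
from uuid import UUID, uuid4


class Program(Enum):
    VMN = "VMN"
    IDEA = "IDEA"
    MOSAIC = "Mosaic"


class Stage(Enum):
    # Hypothetical values: the actual stage enum is not listed in this write-up.
    IDEATION = "ideation"
    VALIDATION = "validation"
    LAUNCH = "launch"


class EngagementType(Enum):
    # Hypothetical values for the engagements.type enum.
    MENTOR_SESSION = "mentor_session"
    WORKSHOP = "workshop"


@dataclass(frozen=True)
class Venture:
    venture_id: UUID
    name: str
    stage: Stage
    program: Program
    funding_raised: Decimal
    mentor_id: Optional[UUID]  # assumption: a venture may not have a mentor yet


@dataclass(frozen=True)
class Student:
    student_id: UUID
    program: Program
    cohort_year: int
    venture_id: Optional[UUID]  # assumption: a student may not have a venture yet
    stage: Stage
    active: bool


@dataclass(frozen=True)
class Engagement:
    engagement_id: UUID
    mentor_id: UUID
    venture_id: UUID
    type: EngagementType
    session_date: date
    duration_min: int


def orphaned_engagements(
    engagements: List[Engagement], ventures: List[Venture]
) -> List[Engagement]:
    """Return engagements whose venture_id matches no known venture."""
    known_ids = {v.venture_id for v in ventures}
    return [e for e in engagements if e.venture_id not in known_ids]
```

A check like `orphaned_engagements` is the kind of foreign-key sanity test a shared schema makes possible: with one `venture_id` convention across programs, a dangling reference can be detected mechanically instead of by eye.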
Context

Three programs. No shared model. No single source of truth.

The NU Center for Entrepreneurship Education runs three programs (VMN, IDEA, and Mosaic), each serving students, ventures, and mentors across different stages of the entrepreneurship pipeline. When I joined, the programs operated in silos: separate tracking systems, inconsistent definitions for the same metrics, and no way to produce a unified view of NUCEE's impact.

The Princeton Review ranking submission process had run the same way for 20 years: manual, fragmented, and dependent on institutional memory rather than documented process. Every time someone needed a number (how many active ventures, how many mentor sessions this quarter, what stage students are at), they had to go hunting across multiple tools and hope the definitions matched.

My job was to build the infrastructure that makes those questions answerable in seconds, not hours.

What I Built

Platform design first, automation second.

01
Designed NUCEE's first centralized data architecture
Mapped every data source across the three programs (Salesforce, Airtable, Chronus, and various ad hoc spreadsheets) and designed a unified schema that could represent students, ventures, mentors, and engagements consistently. Defined shared field names, data types, and enums so the same concept meant the same thing in every program.
02
Standardized definitions across three programs, before building anything
What counts as an "active venture"? What's a mentor "session"? These had different answers in different programs. Building automations on top of inconsistent definitions would have locked in the inconsistency. Getting alignment on canonical definitions first meant every workflow built afterward was working from the same foundation. This took weeks. The technical work took days.
03
Built 10+ automated workflows replacing manual processes
Built automations in Airtable and Salesforce to replace manual data collection, tracking, and reporting processes. Automated status updates, reporting triggers, and data sync flows took over work that previously required hours of manual effort each week. Reduced manual data retrieval time by approximately 50%.
04
Delivered NUCEE's first unified annual impact report
Produced the organization's first report consolidating data across all three programs into a single coherent narrative covering venture counts, mentor engagement, student reach, and funding outcomes. Used by 200+ stakeholders including university leadership, program staff, and external partners.
05
Rebuilt the Princeton Review ranking submission from scratch
Led a ground-up redesign of a process that hadn't changed in 20 years. Documented the new workflow, identified the required data points, and built infrastructure to pull them reliably and repeatably, ending the cycle of manual scrambling each submission year.
06
Migrated 800+ mentors and students to a new platform
Directed the data migration and operations redesign for a NUCEE-affiliated program, rebuilding the mentor and student tracking infrastructure for 300+ mentors and 500+ students on an entirely new platform. Scoped the migration, mapped field relationships, and validated data fidelity post-migration.
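
The "mapped field relationships, validated data fidelity" step in the migration above can be sketched roughly as follows. This is an illustrative outline, not the actual migration tooling: the column names in `FIELD_MAP`, the choice of `email` as the matching key, and both function names are hypothetical.

```python
from typing import Dict, List

# Hypothetical mapping from legacy column names to new-platform field names.
FIELD_MAP = {
    "Full Name": "name",
    "Email Address": "email",
    "Role": "role",
}


def map_record(legacy: Dict[str, str]) -> Dict[str, str]:
    """Rename legacy columns to the new field names and normalize values."""
    mapped = {new: legacy.get(old, "").strip() for old, new in FIELD_MAP.items()}
    mapped["email"] = mapped["email"].lower()  # emails are compared case-insensitively
    return mapped


def validate_migration(
    legacy_rows: List[Dict[str, str]],
    migrated_rows: List[Dict[str, str]],
    key: str = "email",
) -> Dict[str, List[str]]:
    """Compare what should have been migrated against what actually landed.

    Returns the keys of records that are missing from the new platform,
    present there unexpectedly, or present with differing field values.
    """
    expected = {row[key]: row for row in map(map_record, legacy_rows)}
    actual = {row[key]: row for row in migrated_rows}

    missing = sorted(expected.keys() - actual.keys())
    unexpected = sorted(actual.keys() - expected.keys())
    mismatched = sorted(
        k
        for k in expected.keys() & actual.keys()
        # Only the mapped fields are compared; extra target fields are ignored.
        if any(actual[k].get(field) != value for field, value in expected[k].items())
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}
```

An empty result in all three lists is what "validated data fidelity" would mean under these assumptions: every legacy record arrived, nothing extra appeared, and every mapped field survived unchanged.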
Platform Architecture Decisions

Why these choices, not others.

Salesforce as source of truth, Airtable as working layer
Salesforce held the authoritative contact and venture records, but program staff weren't living in Salesforce; they were in spreadsheets. Airtable bridged the gap: familiar enough for non-technical staff to use daily, structured enough to support automations and reporting. The architecture kept Salesforce as the canonical source and Airtable as the operational surface.
Definitions before tooling, always
The biggest bottleneck wasn't technical; it was semantic. VMN, IDEA, and Mosaic each had informal, incompatible definitions for shared concepts. Building automations on inconsistent definitions would have locked in the inconsistency at scale. Alignment on shared definitions first meant every workflow built afterward was trustworthy from day one.
Documentation as a platform deliverable
A co-op engagement that leaves behind undocumented systems is a liability, not an asset. Every workflow, every schema decision, and every automation was documented in data dictionaries, onboarding guides, and workflow maps, with enough detail that someone who wasn't there when it was built can maintain, extend, or debug it. That was a design constraint from the start, not an afterthought.
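
To make the "definitions before tooling" decision above concrete, here is a small sketch of what a single canonical definition looks like once it is written down in one place. The rule itself is invented for illustration: the 90-day window, the `withdrawn` flag, and all names below are assumptions, not NUCEE's agreed definition of an "active venture".

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

# Hypothetical threshold: a venture counts as active if it had an
# engagement within this many days. The real agreed rule is not stated here.
ACTIVE_WINDOW_DAYS = 90


@dataclass
class VentureActivity:
    program: str
    last_engagement: Optional[date]
    withdrawn: bool = False


def is_active_venture(venture: VentureActivity, today: date) -> bool:
    """The one shared definition every report and automation would call."""
    if venture.withdrawn or venture.last_engagement is None:
        return False
    return today - venture.last_engagement <= timedelta(days=ACTIVE_WINDOW_DAYS)


def active_counts_by_program(
    ventures: List[VentureActivity], today: date
) -> Dict[str, int]:
    """Count active ventures per program using the shared definition."""
    counts: Dict[str, int] = {}
    for venture in ventures:
        counts.setdefault(venture.program, 0)
        if is_active_venture(venture, today):
            counts[venture.program] += 1
    return counts
```

The point of the design is that no program gets its own copy of the rule. If the agreed window changes, it changes in one constant, and every cross-program count moves together.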
Outcomes

A platform that keeps working.

−50%
Reduction in data retrieval time via automated workflows
800+
Users migrated to new platform with validated data fidelity
10+
Manual workflows replaced by automated systems

The most important metric isn't in the list above: this platform will keep producing accurate, consistent data after I leave, because it was designed to. That's the measure of infrastructure that actually works.

What I'd Do Differently

Three things that shaped how I think about platform work.

01
Data infrastructure is an organizational problem first
The fragmentation at NUCEE wasn't a technical failure; it was an organizational one. Three programs grew independently with their own tooling and conventions. Fixing the infrastructure required changing how people across the organization thought about shared data, not just building better pipelines. The technical work was the easy half.
02
The most valuable deliverable is the one that keeps working
A report someone has to recreate manually every year isn't infrastructure; it's a recurring project. The goal was always to build systems that produce value after I leave. That means automation over one-off analysis, documentation over tribal knowledge, and repeatability over elegance.
03
Standardization is a leadership challenge, not a technical one
Getting three program teams to agree on shared definitions required more facilitation than I expected. The technical work of building the unified schema took days. The organizational work of getting buy-in took weeks. That ratio was right: a foundation built on contested definitions would have collapsed the moment someone tried to produce a cross-program report.
Salesforce · Airtable · Tableau · Chronus · Jupyter · Python · Excel · Data Modeling · Process Automation · Stakeholder Reporting