Platform Engineering · Data Infrastructure
Sole Owner
8,000+ students · 3 programs
NUCEE Data Platform:
From Fragmented Silos
to
Repeatable Infrastructure
Built Northeastern University's Center for Entrepreneurship Education's
first centralized data platform from scratch, designing the schema,
standardizing definitions across three programs, migrating 800+ users to
a new platform, and automating 10+ workflows that had previously
required hours of manual work each week.
The deliverable wasn't a report. It was a system that keeps working
after I leave.
8K+
Students served across VMN, IDEA, and Mosaic
−50%
Reduction in data retrieval time
800+
Mentors and students migrated to new platform
Unified Data Schema · Shared across VMN, IDEA, Mosaic
Students
  student_id (PK, uuid)
  program (enum)
  cohort_year (integer)
  venture_id (FK, uuid)
  stage (enum)
  active (boolean)

Ventures
  venture_id (PK, uuid)
  name (string)
  stage (enum)
  program (FK, enum)
  funding_raised (decimal)
  mentor_id (FK, uuid)

Engagements
  engagement_id (PK, uuid)
  mentor_id (FK, uuid)
  venture_id (FK, uuid)
  type (enum)
  session_date (date)
  duration_min (integer)
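The shared schema above can be sketched as typed records. A minimal Python sketch, with enum values that are illustrative placeholders rather than the production vocabulary:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional
from uuid import UUID

class Program(Enum):
    VMN = "vmn"
    IDEA = "idea"
    MOSAIC = "mosaic"

class Stage(Enum):
    # Illustrative values; the real stage enum lived in the data dictionary.
    IDEATION = "ideation"
    VALIDATION = "validation"
    LAUNCH = "launch"

@dataclass
class Student:
    student_id: UUID            # PK
    program: Program
    cohort_year: int
    venture_id: Optional[UUID]  # FK -> Venture
    stage: Stage
    active: bool

@dataclass
class Venture:
    venture_id: UUID            # PK
    name: str
    stage: Stage
    program: Program            # FK (shared program enum)
    funding_raised: float       # decimal in the schema; float for brevity here
    mentor_id: Optional[UUID]   # FK -> mentor record

@dataclass
class Engagement:
    engagement_id: UUID         # PK
    mentor_id: UUID             # FK
    venture_id: UUID            # FK
    type: str                   # enum in the schema (e.g. session type)
    session_date: date
    duration_min: int
```

The point of the shared types is that `stage` or `program` means the same thing whether a record originates in VMN, IDEA, or Mosaic.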
Context
Three programs. No shared model. No single source of truth.
The NU Center for Entrepreneurship Education runs three programs (VMN,
IDEA, and Mosaic), each serving students, ventures, and mentors across
different stages of the entrepreneurship pipeline. When I joined, the
programs operated in silos: separate tracking systems, inconsistent
definitions for the same metrics, and no way to produce a unified view
of NUCEE's impact.
The Princeton Review ranking submission process had run the same way
for 20 years: manual, fragmented, and dependent on institutional
memory rather than documented process. Every time someone needed a
number (how many active ventures, how many mentor sessions this
quarter, what stage students were at), they had to go hunting across
multiple tools and hope the definitions matched.
My job was to build the infrastructure that makes those questions
answerable in seconds, not hours.
What I Built
Platform design first, automation second.
01
Designed NUCEE's first centralized data architecture
Mapped every data source across the three programs (Salesforce,
Airtable, Chronus, and various ad-hoc spreadsheets) and designed
a unified schema that could represent students, ventures,
mentors, and engagements consistently. Defined shared field
names, data types, and enums so the same concept meant the same
thing everywhere, across every program.
02
Standardized definitions across three programs, before building
anything
What counts as an "active venture"? What's a mentor "session"?
These had different answers in different programs. Building
automations on top of inconsistent definitions would have locked
in the inconsistency. Getting alignment on canonical definitions
first meant every workflow built afterward was working from the
same foundation. This took weeks. The technical work took days.
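Canonical definitions only pay off if each one is encoded exactly once and reused by every workflow. A hedged sketch of what that looks like; the actual rule and threshold were negotiated with the three program teams, so the values below are placeholders:

```python
from datetime import date, timedelta
from typing import Optional

# Placeholder rule: a venture counts as "active" if it logged a mentor
# session within the last 180 days. The real cutoff was agreed across
# VMN, IDEA, and Mosaic; what matters is that it lives in one place.
ACTIVE_WINDOW_DAYS = 180

def is_active_venture(last_session: Optional[date], today: date) -> bool:
    """The single shared definition of 'active venture' that every
    downstream workflow and report calls."""
    if last_session is None:
        return False
    return (today - last_session) <= timedelta(days=ACTIVE_WINDOW_DAYS)
```

Any report, dashboard, or automation that asks "how many active ventures?" calls this one function, so the answer cannot drift between programs.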
03
Built 10+ automated workflows replacing manual processes
Built automations in Airtable and Salesforce to replace manual
data collection, tracking, and reporting processes: automated
status updates, reporting triggers, and data sync flows that
previously required hours of manual work each week. Reduced
manual data retrieval time by approximately 50%.
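As an illustration of the shape of those sync flows, a minimal sketch of a status-update step: it diffs each record's stored status against the derived one and emits only the patches that need pushing. Field names are hypothetical; in production this ran against Airtable and Salesforce records rather than plain dicts:

```python
def stale_status_updates(records, compute_status):
    """Return (record_id, new_status) pairs for records whose stored
    status no longer matches the derived one -- i.e. the minimal set
    of patches an automated sync flow would push upstream."""
    updates = []
    for rec in records:
        new = compute_status(rec)
        if rec["status"] != new:
            updates.append((rec["id"], new))
    return updates

# Example: derive status from the shared 'active' flag.
records = [
    {"id": "v1", "active": True,  "status": "dormant"},
    {"id": "v2", "active": False, "status": "dormant"},
]
patches = stale_status_updates(
    records, lambda r: "active" if r["active"] else "dormant"
)
# patches == [("v1", "active")]
```

Emitting only the diff (rather than rewriting every record) keeps the automation idempotent and cheap to run on a schedule.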
04
Delivered NUCEE's first unified annual impact report
Produced the organization's first report consolidating data
across all three programs into a single coherent narrative:
venture counts, mentor engagement, student reach, and funding
outcomes. Used by 200+ stakeholders, including university
leadership, program staff, and external partners.
05
Rebuilt the Princeton Review ranking submission from scratch
Led a ground-up redesign of a process that hadn't changed in 20
years. Documented the new workflow, identified the required data
points, and built infrastructure to pull them reliably and
repeatably, ending the cycle of manual scrambling each
submission year.
06
Migrated 800+ mentors and students to a new platform
Directed the data migration and operations redesign for a
NUCEE-affiliated program, rebuilding the mentor and student
tracking infrastructure for 300+ mentors and 500+ students on an
entirely new platform. Scoped the migration, mapped field
relationships, and validated data fidelity post-migration.
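Post-migration fidelity checks can be sketched as a keyed, field-by-field diff between source and destination exports. Record shapes here are illustrative:

```python
def validate_migration(source, dest, key="id"):
    """Compare source and destination record sets keyed by `key`.
    Returns (missing_ids, mismatched): ids absent from the destination,
    and a map from id to the list of fields whose values differ."""
    dest_by_id = {r[key]: r for r in dest}
    missing, mismatched = [], {}
    for rec in source:
        rid = rec[key]
        if rid not in dest_by_id:
            missing.append(rid)
            continue
        diffs = [f for f, v in rec.items() if dest_by_id[rid].get(f) != v]
        if diffs:
            mismatched[rid] = diffs
    return missing, mismatched

src = [{"id": "m1", "email": "a@x.edu"}, {"id": "m2", "email": "b@x.edu"}]
dst = [{"id": "m1", "email": "a@x.edu"}]
missing, bad = validate_migration(src, dst)
# missing == ["m2"], bad == {}
```

Running a check like this against both platforms' exports turns "the migration looks fine" into an auditable list of exactly which records and fields disagree.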
Platform Architecture Decisions
Why these choices, not others.
Salesforce as source of truth, Airtable as working layer
Salesforce held the authoritative contact and venture records, but
program staff weren't living in Salesforce, they were in
spreadsheets. Airtable bridged the gap: familiar enough for
non-technical staff to use daily, structured enough to support
automations and reporting. The architecture kept Salesforce as the
canonical source and Airtable as the operational surface.
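A minimal sketch of that precedence rule, assuming simple dict records: fields owned by the canonical source always take the Salesforce value, while operational fields keep the Airtable working copy. Field names are hypothetical:

```python
def reconcile(salesforce_rec, airtable_rec, sf_owned_fields):
    """Merge an operational (Airtable) record with the canonical
    (Salesforce) record. Fields in `sf_owned_fields` always take the
    Salesforce value; everything else keeps the Airtable copy."""
    merged = dict(airtable_rec)
    for field in sf_owned_fields:
        if field in salesforce_rec:
            merged[field] = salesforce_rec[field]
    return merged

sf = {"venture_id": "v1", "stage": "launch"}
at = {"venture_id": "v1", "stage": "validation", "notes": "weekly check-in"}
merged = reconcile(sf, at, sf_owned_fields={"stage"})
# merged keeps the Airtable notes but takes the Salesforce stage
```

Making the ownership rule explicit per field is what lets staff work freely in Airtable without ever overwriting the system of record.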
Definitions before tooling, always
The biggest bottleneck wasn't technical; it was semantic.
VMN, IDEA, and Mosaic each had informal, incompatible definitions
for shared concepts. Building automations on inconsistent
definitions would have locked in the inconsistency at scale.
Aligning on shared definitions first meant every workflow built
afterward was trustworthy from day one.
Documentation as a platform deliverable
A co-op engagement that leaves behind undocumented systems is a
liability, not an asset. Every workflow, every schema decision, and
every automation was documented (data dictionaries, onboarding
guides, workflow maps) with enough detail that someone who wasn't
there when it was built can maintain, extend, or debug it. That was
a design constraint from the start, not an afterthought.
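One way to keep a data dictionary from drifting out of date is to generate it from the schema itself. A sketch assuming the schema is expressed as Python dataclasses; the model here is a small illustrative example, not the production schema:

```python
from dataclasses import dataclass, fields

@dataclass
class MentorSession:
    session_id: str
    mentor_id: str
    duration_min: int

def data_dictionary(model):
    """Emit one 'field: type' line per schema field, so the documented
    dictionary always matches the code it describes."""
    return [f"{f.name}: {f.type.__name__}" for f in fields(model)]

# data_dictionary(MentorSession)
# -> ["session_id: str", "mentor_id: str", "duration_min: int"]
```

Generated documentation can't lie about the schema, which is exactly the property you want from a deliverable meant to outlive its author.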
Outcomes
A platform that keeps working.
−50%
Reduction in data retrieval time via automated workflows
800+
Users migrated to new platform with validated data fidelity
10+
Manual workflows replaced by automated systems
The most important metric isn't in the list above: this platform will
keep producing accurate, consistent data after I leave, because it was
designed to. That's the measure of infrastructure that actually works.
What I'd Do Differently
Three things that shaped how I think about platform work.
01
Data infrastructure is an organizational problem first
The fragmentation at NUCEE wasn't a technical failure; it was an
organizational one. Three programs grew independently with their
own tooling and conventions. Fixing the infrastructure required
changing how people across the organization thought about shared
data, not just building better pipelines. The technical work was
the easy half.
02
The most valuable deliverable is the one that keeps working
A report someone has to recreate manually every year isn't
infrastructure; it's a recurring project. The goal was always to
build systems that produce value after I leave. That means
automation over one-off analysis, documentation over tribal
knowledge, and repeatability over elegance.
03
Standardization is a leadership challenge, not a technical one
Getting three program teams to agree on shared definitions
required more facilitation than I expected. The technical work
of building the unified schema took days. The organizational
work of getting buy-in took weeks. That ratio was right: a
foundation built on contested definitions would have collapsed
the moment someone tried to produce a cross-program report.
Salesforce
Airtable
Tableau
Chronus
Jupyter
Python
Excel
Data Modeling
Process Automation
Stakeholder Reporting