
The New Product Development Flow

How we ship fast without losing alignment — code as the source of truth, AI-driven feedback loops, and lightweight quality gates that compound over time.

tech@pelles

February 8, 2025
9 min read

Every engineering team faces the same tension: ship fast or stay aligned. Move quickly and you lose context, accumulate drift, and end up with a product that nobody fully understands. Move carefully and you drown in process — PRDs nobody reads, standups that recap what Slack already said, and planning sessions that produce plans nobody follows.

We decided to reject the tradeoff. Over the past year we've built a development flow that lets us ship daily while keeping the entire team — engineers, product, leadership — on the same page. It's not magic. It's a set of lightweight practices that compound.

Here's the honest version — what works, what we learned the hard way, and where we're still figuring it out.


Part 1: How We Work

Code Is the Source of Truth

Last quarter, an engineer asked why our notification service was using a particular retry strategy. The old way: dig through Confluence, find a page from 2023 that half-describes the original design, realize it doesn't match the current implementation, then ping three people on Slack. The new way: open /docs/decisions/notification-retry-strategy.md, read the ADR, done. Three minutes instead of three hours.

We don't maintain separate documentation systems that drift from reality. The code is the system of record. Every deployment, every configuration, every integration — it's all in the repo. Architecture decisions, product rationales, and context that spans multiple features live in /docs inside the repo. These are version-controlled, reviewed in PRs, and discoverable by anyone — including AI agents. When someone asks "why did we build it this way?", the answer is one grep away.
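"One grep away" can be sketched in a few lines. This is a hypothetical helper, not tooling we actually ship — the `docs/` layout mirrors the structure described above:

```python
# Minimal sketch: scan the in-repo docs for a keyword, the way an engineer
# (or an AI agent) would when asking "why did we build it this way?"
from pathlib import Path

def find_decisions(keyword: str, docs: Path = Path("docs")) -> list[Path]:
    """Return every markdown doc under `docs` that mentions `keyword`."""
    return sorted(
        p for p in docs.rglob("*.md")
        if keyword.lower() in p.read_text(encoding="utf-8").lower()
    )
```

Calling `find_decisions("retry")` against the layout above would surface `docs/decisions/notification-retry-strategy.md` — the same three-minute lookup, scriptable.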

Linear + Claude Code: The Actual Workflow

Our task management lives in Linear. But Linear issues aren't just tickets — they're the interface between product intent and engineering execution. The PM writes the what and why. The how lives in the code.

Here's what the daily workflow actually looks like: an engineer picks up a Linear issue, opens Claude Code, and says "read this issue and the relevant codebase, then propose an implementation plan." Claude Code scans the repo, understands the existing patterns — how we structure services, what conventions we follow, where similar features live — and proposes a plan. The engineer reviews, adjusts, then says "implement it." Claude writes the code; the engineer reviews the PR, tests edge cases, and merges.

This isn't "AI writes all the code." It's closer to pair programming with an engineer who has perfect recall of your entire codebase but no product judgment. The human stays responsible for the decisions — the AI handles the mechanical translation from intent to code.

The result: tasks that used to take a full day from branch to merge now take about half that. We measured it: month-over-month averages over the past year show a consistent ~50% reduction in cycle time.

Match Process Weight to Feature Weight

Not every feature needs the same process:

Small features (< 1 day): Linear issue → branch → Claude Code implementation → PR → merge. Minimal overhead, maximum velocity.

Big features (multi-day): Product one-pager → Linear project with sub-issues → ADR if architectural → implementation with checkpoints → staged rollout. More structure, but still lightweight.

A copy change doesn't need an ADR. A new data pipeline does. We learned this the hard way — early on, we tried applying the same lightweight process to everything. A two-week infrastructure migration got the "small feature" treatment, and we ended up rebuilding half of it when we realized midway that we'd made an architectural assumption nobody had validated. Now the rule is simple: if it touches more than one system or takes more than a day, it gets a one-pager first.


Part 2: How We Decide

The one-pager defines what we're building. If the how involves an architectural bet, that's when an ADR gets written. Together with a clear Definition of Done, these three lightweight artifacts keep the entire team aligned without the overhead of formal review boards or multi-week planning cycles.

Product One-Pager

Before building anything substantial, the PM writes a one-pager. Not a PRD — a single page that answers:

Who — Who is this for? Which user segment, which persona?

Problem — What problem are we solving? What's the evidence it's a real problem?

Scope — What's in scope? Equally important: what's explicitly out of scope?

Success metrics — How will we know this worked? What's the target?

Rollout plan — How do we ship this? All at once, staged, feature-flagged?

The one-pager forces clarity. If you can't fit it on a page, you probably don't understand the problem well enough yet. It's also the artifact that keeps product and engineering aligned — everyone reads the same page before work begins. And it takes 30 minutes to write, not three days.

Architecture Decision Records

When the one-pager leads to a technical decision that could go multiple ways, the engineering lead writes an ADR in /docs/decisions/. It follows a simple structure:

Context — What's the situation? What problem are we solving?

Options — What did we consider? Brief pros/cons for each.

Decision — What did we choose and why?

Consequences — What are the tradeoffs? What do we gain, what do we give up?

ADRs are short — typically one page, written as PRs, reviewed with the team. They're not meant to be comprehensive analysis documents. They're meant to capture the reasoning so that six months from now, when someone asks "why PostgreSQL instead of DynamoDB?", the answer exists and is findable. We have about 20 of these now. New engineers read through the recent ones in their first week — it's the fastest way to understand not just how the system works, but why it works that way.
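Because the structure is fixed, scaffolding a new ADR is trivially scriptable. A minimal sketch, assuming the `/docs/decisions/` layout described above — the filename slug and `Status` line are illustrative conventions, not something the team prescribes:

```python
# Scaffold a skeleton ADR with the Context / Options / Decision / Consequences
# sections described above. Paths and template wording are assumptions.
from datetime import date
from pathlib import Path

ADR_TEMPLATE = """# {title}

Date: {today}
Status: proposed

## Context
<!-- What's the situation? What problem are we solving? -->

## Options
<!-- What did we consider? Brief pros/cons for each. -->

## Decision
<!-- What did we choose and why? -->

## Consequences
<!-- What are the tradeoffs? What do we gain, what do we give up? -->
"""

def new_adr(title: str, decisions_dir: Path = Path("docs/decisions")) -> Path:
    """Create a skeleton ADR file and return its path."""
    decisions_dir.mkdir(parents=True, exist_ok=True)
    slug = title.lower().replace(" ", "-")
    path = decisions_dir / f"{slug}.md"
    path.write_text(ADR_TEMPLATE.format(title=title, today=date.today()))
    return path
```

Running `new_adr("Notification retry strategy")` would drop a ready-to-fill `notification-retry-strategy.md` into the decisions folder, to be completed and opened as a PR like any other change.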

Definition of Done

We keep quality high without bureaucracy by defining "done" clearly and consistently:

Acceptance criteria — Every Linear issue has explicit criteria before work begins. The PM writes them. Not a novel — two to three bullet points that define what "working" means.

Tests — Critical paths have automated tests. We don't aim for 100% coverage — we aim for 100% coverage of things that would wake someone up at 3am.

Lightweight release checklist — Before merge: Does it work? Did you test the edge cases? Is it instrumented? Would you be comfortable if this shipped while you're on vacation? Four questions, not forty.

The goal is to make quality a habit, not a gate. When the bar is clear and consistent, people hit it naturally.


Part 3: How We Learn

The AI-Driven Feedback Loop

This is the part of our flow that's most actively evolving. The core loop is running today; some pieces are mature, others are still being built.

Observe — This is solid. Every feature ships with instrumentation. User behavior, feature adoption, error rates, performance metrics — raw data flows into our analytics pipeline automatically. This part is non-negotiable: nothing ships without a tracking plan.

Infer — This is where the AI comes in, and where we're still iterating. Today, we run weekly analysis jobs that process usage data and surface anomalies — features with unexpected drop-off patterns, error rate spikes correlated with specific user segments, adoption curves that diverge from predictions. It's not a magic "AI tells you what to build" system. It's more like having an analyst who never sleeps and always checks the data.

Decide — The team reviews AI-generated insights alongside their own product intuition. We decide what to act on, what to investigate further, and what to ignore. Human judgment stays in the loop — the AI surfaces signals, humans make calls.

Ship — Implementation happens fast because the codebase is clean, the tooling is good, and Claude Code accelerates the work. Small changes ship same-day. Larger experiments get a rollout plan.

Measure — Every change gets instrumented before it ships. We define success metrics upfront and measure against baseline. No shipping without knowing how we'll evaluate.

Repeat — The loop runs continuously. Each cycle makes the product better and generates more data for the next cycle.

This isn't a quarterly planning exercise. It's a daily rhythm. The honest caveat: the "Infer" step is the youngest part of the system. We started with manual dashboard reviews, graduated to automated anomaly detection, and are working toward AI-generated hypotheses with suggested actions. We're maybe 60% of the way to where we want to be.
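To make the "Infer" step concrete, here is a toy version of the kind of check a weekly analysis job can run — flag metrics whose latest value deviates sharply from recent history. The metric names, threshold, and z-score approach are illustrative assumptions, not our production pipeline:

```python
# Flag metrics whose most recent weekly value sits far outside the
# distribution of the preceding weeks (a simple z-score test).
from statistics import mean, stdev

def flag_anomalies(weekly_metrics: dict[str, list[float]],
                   z_threshold: float = 2.0) -> list[str]:
    """Return metric names whose latest value is more than z_threshold
    standard deviations away from the mean of the preceding weeks."""
    flagged = []
    for name, series in weekly_metrics.items():
        history, latest = series[:-1], series[-1]
        if len(history) < 2:
            continue  # not enough data to establish a baseline
        sigma = stdev(history)
        if sigma == 0:
            continue  # flat history; no meaningful deviation scale
        z = abs(latest - mean(history)) / sigma
        if z > z_threshold:
            flagged.append(name)
    return flagged

signups = [120, 118, 125, 122, 119, 60]   # sudden drop-off in the latest week
checkout_errors = [3, 4, 2, 3, 4, 3]      # steady
print(flag_anomalies({"signups": signups, "checkout_errors": checkout_errors}))
# → ['signups']
```

The real jobs do more (segment correlation, adoption-curve comparison), but the shape is the same: compute a deviation, surface it, and leave the "what does this mean?" call to a human.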

Instrumentation & Measurement

We treat instrumentation as a first-class engineering concern, not an afterthought:

Tracking plan — Before building a feature, the engineer defines what they'll measure. Events, properties, funnels. This goes into the PR description alongside the code. It takes five minutes and prevents the "we shipped it but have no idea if it's working" problem.

Baseline + target — Every metric has a current value (baseline) and a target. "Improve onboarding" is not a goal. "Increase day-7 retention from 23% to 30%" is. The PM owns the target; engineering owns the measurement.

Dashboard — Key metrics are visible on a shared dashboard. Not buried in an analytics tool that requires a SQL query to access. If it matters, it's on the wall.

Post-launch review — After a feature ships and has enough data, the PM and eng lead do a 15-minute review. Did it hit the target? If yes, what can we learn? If no, why not? What's the next move? These reviews are short but they're the most valuable meetings we have — they close the loop and feed the next cycle.
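A tracking plan doesn't need heavy tooling — it can be a checked-in schema that the emit path validates against. A sketch, with hypothetical event and property names (not our real schema):

```python
# The tracking plan as data: every event and its required properties.
# Emitting an event not in the plan, or missing properties, fails loudly.
TRACKING_PLAN = {
    "onboarding_step_completed": {"step", "user_segment"},
    "notification_sent": {"channel", "retry_count"},
}

def track(event: str, **properties):
    """Validate an analytics event against the plan and return the payload.

    In production this would forward to the analytics pipeline; here it
    just enforces the "no shipping without a tracking plan" rule.
    """
    if event not in TRACKING_PLAN:
        raise ValueError(f"event '{event}' is not in the tracking plan")
    missing = TRACKING_PLAN[event] - properties.keys()
    if missing:
        raise ValueError(f"'{event}' missing required properties: {sorted(missing)}")
    return {"event": event, "properties": properties}

payload = track("notification_sent", channel="email", retry_count=2)
print(payload["event"])
# → notification_sent
```

Keeping the plan next to the code means the PR that adds a feature and the PR that defines its measurement are the same PR.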
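The baseline-plus-target discipline reduces the post-launch review to a single comparison. A sketch using the day-7 retention numbers from above — the function and its output format are illustrative, not a tool we ship:

```python
# Summarize a post-launch review: did the measured value beat the target?
def review(metric: str, baseline: float, target: float, measured: float) -> str:
    """Render a one-line verdict comparing measured value to baseline and target."""
    lift = measured - baseline
    verdict = "hit target" if measured >= target else "missed target"
    return (f"{metric}: {baseline:.0%} -> {measured:.0%} "
            f"({lift:+.0%} vs baseline, {verdict} of {target:.0%})")

print(review("day-7 retention", baseline=0.23, target=0.30, measured=0.27))
# → day-7 retention: 23% -> 27% (+4% vs baseline, missed target of 30%)
```

A "missed target" line isn't a failure report; it's the input to the next cycle's "why not, and what's the next move?" discussion.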


Why This Scales

Each of these practices is lightweight on its own. A one-pager takes 30 minutes. An ADR takes 20 minutes. Acceptance criteria take 5 minutes per issue. Instrumentation adds maybe 10% to implementation time.

But they compound. Six months in, we have a searchable history of every decision, clear metrics on every feature, and a codebase where the "why" is as accessible as the "what." A new engineer who joined last month told us they felt productive in their first week — they read the recent ADRs, browsed the one-pagers, and understood not just what the system does but why it's built that way.

The teams that ship fast and stay aligned aren't the ones with the most process. They're the ones with the right process — lightweight practices that create compounding returns.

We're not claiming this is perfect. The AI feedback loop is still maturing. We occasionally skip a one-pager for something that really should have had one. But the system is self-correcting: when we skip a step and feel the pain later, the team voluntarily adds it back. That's how you know a process is working — people follow it because it helps, not because it's mandated.

The Core Principle

Speed and alignment aren't opposing forces. With the right lightweight practices — code as truth, AI-driven feedback loops, and clear quality gates — they reinforce each other. Each practice takes minutes but compounds over months.
