Are mandatory code reviews harmful? The speed vs. stability trade-off is a myth; AI leadership: trust over speed
Issue #70 Bytes
Dive into Learning-Rich Sundays with groCTO
Article of the Week
"If you do code reviews, at least do them fast enough to not block people. Teams reviewing PRs in under 3 hours are 2.1x more productive than teams taking 8+ hours!"
The price of mandatory code reviews
In software engineering, "every PR must be reviewed" has become dogma. Anton Zaides challenges that rule with new data from 400+ companies and 3,000 engineers using Weave's productivity metrics, and finds that while code reviews do slow teams down, they're still worth it if you do them right.
Some high-performing startups, like Pylon, now let engineers merge their own code. Reviews are optional and used only for risky changes or onboarding. Their idea is to hire great engineers and trust them by removing bottlenecks. But all that glitters is not gold; this approach has clear trade-offs:
Teams with reviews: ~31 expert hours per dev/month, 3.7 bugs per dev.
Teams without reviews: ~59 expert hours per dev/month (1.9× faster) but 8.9 bugs per dev (2.4× buggier).
Even after normalizing for output, skipping reviews leads to roughly 25% more bugs per unit of work. So yes, removing reviews speeds you up, but you pay for it in defect density. That's not all though… let's take a closer look.
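A quick back-of-the-envelope check of that normalization, treating "expert hours" as the unit of output (Weave's actual methodology may differ):

```python
# Sanity check of the "~25% more bugs per unit of work" claim, using the
# figures quoted above and treating expert hours as the unit of output.
with_reviews = {"hours": 31, "bugs": 3.7}
without_reviews = {"hours": 59, "bugs": 8.9}

speedup = without_reviews["hours"] / with_reviews["hours"]            # ~1.9x faster
density_ratio = (without_reviews["bugs"] / without_reviews["hours"]) / \
                (with_reviews["bugs"] / with_reviews["hours"])        # ~1.26

print(f"Speed-up without reviews: {speedup:.1f}x")
print(f"Extra bugs per unit of work: {density_ratio - 1:.0%}")        # ~26%
```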
Diminishing returns
Going from zero to some code review dramatically reduces bugs. After about one review per two PRs, the quality gains flatten. Trivial PRs (like changing a log line) probably don't need mandatory review. The sweet spot is selective review: use judgment to balance speed and safety.
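What might that judgment look like encoded as tooling? A minimal, entirely hypothetical sketch (the article doesn't prescribe an implementation, and the thresholds here are illustrative):

```python
# Hypothetical selective-review triage: trivial PRs merge unreviewed,
# risky or onboarding PRs always get a reviewer. Thresholds are illustrative.
RISKY_PATHS = ("payments/", "auth/", "migrations/")

def needs_review(changed_files: list[str], lines_changed: int, author_is_new: bool) -> bool:
    if author_is_new:                                    # onboarding: always review
        return True
    if any(f.startswith(RISKY_PATHS) for f in changed_files):
        return True                                      # risky area: always review
    return lines_changed > 20                            # trivial diffs skip review

needs_review(["api/logging.py"], lines_changed=2, author_is_new=False)      # False
needs_review(["payments/charge.py"], lines_changed=2, author_is_new=False)  # True
```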
Quality reviews > frequent reviews
Not all reviews are equal. Teams doing high-quality reviews (depth, clarity, actionable feedback) ship 38% slower but with 61% fewer bugs. If you're going to spend time reviewing, make it count. Keep in mind, though, that the PR's author is the main beneficiary, through learning; if most of the code was authored by an AI, or by someone who can't respond to the feedback, the review is wasted.
Speed still matters
The costliest mistake is doing reviews slowly.
PRs reviewed within 3 hours → 2.1× higher productivity
PRs waiting 8+ hours → steep morale drop and velocity loss
The best teams follow a simple rule: review fast unless you're in the middle of deep focus work.
Culture compounds quality
Engineers who give thoughtful reviews tend to get thoughtful reviews back. Teams with low-effort reviews rarely escape that gravity. Code review quality, like code quality itself, reflects the culture.
Top 10% teams show the paradox in action: they ship 2.7× faster while keeping their bugs per feature 33% lower than average teams. They optimize reviews by walking the fine edge between speed and quality: small PRs, fast turnaround, high trust, and high context sharing.
Other highlights
The speed-versus-stability trade-off is a myth. The best-performing teams ("Pragmatic Performers" and "Harmonious High-Achievers") achieve both high throughput and high stability simultaneously, while struggling teams often fail at both dimensions.
What are the seven team profiles of engineering delivery performance?
The DORA metrics tell us what's happening, not why. Two teams with identical numbers might look equally "high performing" on paper, yet one could be a stable, healthy team and the other an exhausted group barely keeping up.
This new 2025 DORA research dives beneath the surface, clustering data from nearly 5,000 developers to identify seven distinct team profiles that reveal the human and systemic factors shaping delivery outcomes.
The Seven Profiles
Foundational Challenges (10%): Teams struggling with basic capability gaps and weak engineering practices.
Legacy Bottleneck (11%): Constant firefighting due to brittle systems; burnout and reactivity dominate.
Constrained by Process (17%): Technology is fine, but process inefficiency creates friction and exhaustion.
High Impact, Low Cadence (7%): Great outcomes and strong product performance, but achieved through unsustainable intensity.
Stable and Methodical (15%): Reliable and predictable, but sometimes overly cautious or slow.
Pragmatic Performers (20%): Balanced throughput and stability, focused on value and learning.
Harmonious High-Achievers (20%): The healthiest cluster: high speed, high stability, low burnout, high satisfaction.
This meta-study shatters the old belief that teams must choose between speed and stability. When a team slows down, the cause isn't always technical. Legacy drag often disguises itself as reactivity and fatigue. Process inefficiency can quietly erode morale, even when the software itself is stable. And heroic, high-impact cultures might produce strong short-term results while silently draining people's energy.
Diagnosis before intervention
If deployment frequency is down, don't rush to rewrite systems or enforce new processes. Instead, ask: which pattern are we living in? Are you fighting legacy complexity, trapped by process overhead, or running on unsustainable effort?
The seven profiles give teams a shared language to discuss these realities. Saying "we're process-constrained" or "we're legacy-bottlenecked" opens clearer conversations than vague complaints about being "too slow." The real work of improvement starts not by chasing better metrics but by understanding the shape of your system, which is what enables reshaping it so performance and well-being rise together.
Improvement begins when teams stop optimizing the numbers and start addressing which kind of team they are: metrics describe the outcome, but culture and constraints explain the cause.
Why trust, not speed, defines software leadership in the AI era
For decades, engineering leaders have measured success through output: lines of code, sprint velocity, story points. But as Rahul Chandel discovered leading teams at Twilio and Coinbase, that mindset collapses in the AI era. When code can be generated in seconds, AI's amplification of speed makes trust in the generated output a headache for many teams.
AI can speed up delivery, but without strong systems of review, observability, and safety, it can just as easily amplify risk. The new job of software leaders is to orchestrate resilience across complex human-AI ecosystems, not just to squeeze more features out of their existing processes.
From Output to Outcomes
When generative AI took over repetitive coding work, Chandel saw output skyrocket while quality plummeted. What mattered now was how well the system behaved under pressure. At Twilio, he learned to anchor success to service-level objectives (SLOs) and error budgets, concrete indicators of reliability tied to customer trust, rather than delivery cadence. At Coinbase, he brought the same discipline to trading systems, tracking mean time to resolution (MTTR) as the new pulse of engineering health.
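For readers unfamiliar with error budgets, the arithmetic is standard SRE practice rather than anything Twilio-specific: the SLO implies how much unreliability you can "spend" per window.

```python
# Standard error-budget arithmetic (illustrative numbers, not Twilio's):
# a 99.9% availability SLO over a 30-day window allows ~43 minutes of downtime.
slo = 0.999
window_minutes = 30 * 24 * 60                     # 43,200 minutes in the window

error_budget = (1 - slo) * window_minutes
print(f"Error budget: {error_budget:.1f} min/month")             # 43.2

downtime_so_far = 10                              # minutes of incidents this month
print(f"Budget consumed: {downtime_so_far / error_budget:.0%}")  # ~23%
```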
Reframing Leadership
AI's acceleration comes with cognitive overhead. It creates more output to review, more complexity to reason about, and more opportunities for silent failure. Chandel introduced three cultural practices to absorb that complexity without losing safety:
Tiered review standards for AI-generated code, raising scrutiny on critical paths and hot spots (a sketch follows this list).
Critical inquiry coaching to teach engineers how to interrogate AI output by asking what assumptions it made and where it might break.
Psychological safety so engineers feel free to question both each other and the machine.
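Here is a rough sketch of how tiered review standards might be encoded, with the caveat that the tier names, paths, and reviewer counts are hypothetical assumptions, not Chandel's actual policy:

```python
# Hypothetical tiered review policy for AI-generated code; tier names,
# paths, and reviewer counts are illustrative assumptions.
REVIEW_TIERS = {
    "critical": {"paths": ("trading/", "payments/"), "reviewers": 2},
    "standard": {"paths": ("services/",), "reviewers": 1},
    "low":      {"paths": ("docs/", "scripts/"), "reviewers": 0},
}

def review_policy(path: str, ai_generated: bool) -> dict:
    for tier, rules in REVIEW_TIERS.items():
        if path.startswith(rules["paths"]):
            # AI-generated code on critical paths gets one extra reviewer.
            extra = 1 if ai_generated and tier == "critical" else 0
            return {"tier": tier, "reviewers": rules["reviewers"] + extra}
    return {"tier": "standard", "reviewers": 1}

review_policy("trading/engine.py", ai_generated=True)
# -> {'tier': 'critical', 'reviewers': 3}
```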
The New Bottleneck: Trust
When his team considered migrating from Redis to Valkey, the technical migration wasn't nearly as challenging as the orchestration risk. AI could rewrite code, but it couldn't assess latency trade-offs or plan rollback strategies. The leader's role was to underwrite trustworthiness: designing phased rollouts, setting risk thresholds, validating monitoring.
Are your code reviews missing the story behind the code?
Most code reviews miss the bigger picture. Reviewers see diffs, not context: why a change was made, what it impacts, or how risky it is. That lack of narrative turns reviews into guesswork and slows teams down. AI can close that gap by generating contextual summaries that explain intent, highlight dependencies, and surface hidden issues before humans even start reading the code. It makes reviews faster, cleaner, and more focused. Typo's new AI-generated PR summaries bring exactly that context into your workflow.
Find Yourself
That's it for today!
Whether youâre innovating on new projects, staying ahead of tech trends, or taking a strategic pause to recharge, may your day be as impactful and inspiring as your leadership.
See you next week(end). Ciao!
Credits
Curators - Diligently curated by our community members Denis & Varun
Featured Authors - Anton Zaides, Lizzie Matusov, Rahul Chandel (for CIO)
Sponsors - This newsletter is sponsored by Typo AI - Engineering Intelligence Platform for the AI Era.
1) Subscribe → If you aren't already, consider becoming a groCTO subscriber.
2) Share → Spread the word amongst fellow Engineering Leaders and CTOs! Your referral empowers & builds our groCTO community.



