Are mandatory code reviews harmful? The speed vs. stability trade-off is a myth; AI leadership: trust over speed
Issue #70 Bytes
Dive into Learning-Rich Sundays with groCTO
Article of the Week
"If you do code reviews, at least do them fast enough to not block people. Teams reviewing PRs in under 3 hours are 2.1x more productive than teams taking 8+ hours!"
The price of mandatory code reviews
In software engineering, "every PR must be reviewed" has become dogma. Anton Zaides challenges that rule with new data from 400+ companies and 3,000 engineers using Weave's productivity metrics, and finds that while code reviews do slow teams down, they're still worth it if you do them right.
Some high-performing startups, like Pylon, now let engineers merge their own code. Reviews are optional and used only for risky changes or onboarding. Their idea is to hire great engineers and trust them by removing bottlenecks. But all that glitters is not gold; this approach has clear trade-offs:
Teams with reviews: ~31 expert hours per dev/month, 3.7 bugs per dev.
Teams without reviews: ~59 expert hours per dev/month (1.9× faster) but 8.9 bugs per dev (2.4× buggier).
Even after normalizing for output, skipping reviews leads to roughly 25% more bugs per unit of work. So yes, removing reviews speeds you up, but you pay for it in defect density. That's not all though… let's take a closer look.
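A quick back-of-the-envelope check of that normalization, treating "expert hours" as the unit of output (Weave's actual methodology may differ):

```python
# Sanity check of the "~25% more bugs per unit of work" claim, using the
# figures quoted above and treating expert hours as the unit of output.
with_reviews = {"hours": 31, "bugs": 3.7}
without_reviews = {"hours": 59, "bugs": 8.9}

speedup = without_reviews["hours"] / with_reviews["hours"]            # ~1.9x faster
density_ratio = (without_reviews["bugs"] / without_reviews["hours"]) / \
                (with_reviews["bugs"] / with_reviews["hours"])        # ~1.26

print(f"Speed-up without reviews: {speedup:.1f}x")
print(f"Extra bugs per unit of work: {density_ratio - 1:.0%}")        # ~26%
```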
Diminishing returns
Going from zero to some code review dramatically reduces bugs. After about one review per two PRs, the quality gains flatten. Trivial PRs (like changing a log line) probably don't need mandatory review. The sweet spot is selective review: use judgment to balance speed and safety.
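What might that judgment look like encoded as tooling? A minimal, entirely hypothetical sketch (the article doesn't prescribe an implementation, and the thresholds here are illustrative):

```python
# Hypothetical selective-review triage: trivial PRs merge unreviewed,
# risky or onboarding PRs always get a reviewer. Thresholds are illustrative.
RISKY_PATHS = ("payments/", "auth/", "migrations/")

def needs_review(changed_files: list[str], lines_changed: int, author_is_new: bool) -> bool:
    if author_is_new:                                    # onboarding: always review
        return True
    if any(f.startswith(RISKY_PATHS) for f in changed_files):
        return True                                      # risky area: always review
    return lines_changed > 20                            # trivial diffs skip review

needs_review(["api/logging.py"], lines_changed=2, author_is_new=False)      # False
needs_review(["payments/charge.py"], lines_changed=2, author_is_new=False)  # True
```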
Quality reviews > frequent reviews
Not all reviews are equal. Teams doing high-quality reviews (depth, clarity, actionable feedback) ship 38% slower but with 61% fewer bugs. If you're going to spend time reviewing, make it count. Keep in mind, though, that the PR's author is the main beneficiary, through learning; if most of the code was authored by an AI, or by someone who can't respond to the feedback, the review is wasted.
Speed still matters
The costliest mistake is doing reviews slowly.
PRs reviewed within 3 hours → 2.1× higher productivity
PRs waiting 8+ hours → steep morale drop and velocity loss
The best teams follow a simple rule: review fast unless you're in the middle of deep focus work.
Culture compounds quality
Engineers who give thoughtful reviews tend to get thoughtful reviews back. Teams with low-effort reviews rarely escape that gravity. Code review quality, like code quality itself, reflects the culture.
Top 10% teams show the paradox in action: they ship 2.7× faster while keeping their bugs per feature 33% lower than average teams. They optimize reviews by walking the fine edge between speed and quality: small PRs, fast turnaround, high trust, and high context sharing.
Other highlights
The speed-versus-stability trade-off is a myth. The best-performing teams ("Pragmatic Performers" and "Harmonious High-Achievers") achieve both high throughput and high stability simultaneously, while struggling teams often fail at both dimensions.
What are the seven team profiles of engineering delivery performance?
The DORA metrics tell us what's happening, not why. Two teams with identical numbers might look equally "high performing" on paper, yet one could be a stable, healthy team and the other an exhausted group barely keeping up.
This new 2025 DORA research dives beneath the surface, clustering data from nearly 5,000 developers to identify seven distinct team profiles that reveal the human and systemic factors shaping delivery outcomes.
The Seven Profiles
Foundational Challenges (10%): Teams struggling with basic capability gaps and weak engineering practices.
Legacy Bottleneck (11%): Constant firefighting due to brittle systems; burnout and reactivity dominate.
Constrained by Process (17%): Technology is fine, but process inefficiency creates friction and exhaustion.
High Impact, Low Cadence (7%): Great outcomes and strong product performance, but achieved through unsustainable intensity.
Stable and Methodical (15%): Reliable and predictable, but sometimes overly cautious or slow.
Pragmatic Performers (20%): Balanced throughput and stability, focused on value and learning.
Harmonious High-Achievers (20%): The healthiest cluster: high speed, high stability, low burnout, high satisfaction.
This meta-study shatters the old belief that teams must choose between speed and stability. When a team slows down, the cause isn't always technical. Legacy drag often disguises itself as reactivity and fatigue. Process inefficiency can quietly erode morale, even when the software itself is stable. And heroic, high-impact cultures might produce strong short-term results while silently draining people's energy.
Diagnosis before intervention
If deployment frequency is down, don't rush to rewrite systems or enforce new processes. Instead, ask: which pattern are we living in? Are you fighting legacy complexity, trapped by process overhead, or running on unsustainable effort?
The seven profiles give teams a shared language to discuss these realities. Saying "we're process-constrained" or "we're legacy-bottlenecked" opens clearer conversations than vague complaints about being "too slow." The real work of improvement starts not by chasing better metrics but by understanding the shape of your system, which is what enables reshaping it so performance and well-being rise together.
Improvement begins when teams stop optimizing the numbers and start addressing which kind of team they are: metrics describe the outcome, but culture and constraints explain the cause.
Why trust, not speed, defines software leadership in the AI era
For decades, engineering leaders have measured success through output: lines of code, sprint velocity, story points. But as Rahul Chandel discovered leading teams at Twilio and Coinbase, that mindset collapses in the AI era. When code can be generated in seconds, AI's amplification of speed makes trust in the generated output a headache for many teams.
AI can speed up delivery, but without strong systems of review, observability, and safety, it can just as easily amplify risk. The new job of software leaders is to orchestrate resilience across complex human-AI ecosystems, not just to squeeze more features out of their existing processes.
From Output to Outcomes
When generative AI took over repetitive coding work, Chandel saw output skyrocket while quality plummeted. What mattered now was how well the system behaved under pressure. At Twilio, he learned to anchor success to service-level objectives (SLOs) and error budgets, concrete indicators of reliability tied to customer trust, rather than delivery cadence. At Coinbase, he brought the same discipline to trading systems, tracking mean time to resolution (MTTR) as the new pulse of engineering health.
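For readers unfamiliar with error budgets, the arithmetic is standard SRE practice rather than anything Twilio-specific: the SLO implies how much unreliability you can "spend" per window.

```python
# Standard error-budget arithmetic (illustrative numbers, not Twilio's):
# a 99.9% availability SLO over a 30-day window allows ~43 minutes of downtime.
slo = 0.999
window_minutes = 30 * 24 * 60                     # 43,200 minutes in the window

error_budget = (1 - slo) * window_minutes
print(f"Error budget: {error_budget:.1f} min/month")             # 43.2

downtime_so_far = 10                              # minutes of incidents this month
print(f"Budget consumed: {downtime_so_far / error_budget:.0%}")  # ~23%
```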
Reframing Leadership
AI's acceleration comes with cognitive overhead. It creates more output to review, more complexity to reason about, and more opportunities for silent failure. Chandel introduced three cultural practices to absorb that complexity without losing safety:
Tiered review standards for AI-generated code, raising scrutiny on critical paths and hot spots (a sketch follows this list).
Critical inquiry coaching to teach engineers how to interrogate AI output by asking what assumptions it made and where it might break.
Psychological safety so engineers feel free to question both each other and the machine.
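Here is a rough sketch of how tiered review standards might be encoded, with the caveat that the tier names, paths, and reviewer counts are hypothetical assumptions, not Chandel's actual policy:

```python
# Hypothetical tiered review policy for AI-generated code; tier names,
# paths, and reviewer counts are illustrative assumptions.
REVIEW_TIERS = {
    "critical": {"paths": ("trading/", "payments/"), "reviewers": 2},
    "standard": {"paths": ("services/",), "reviewers": 1},
    "low":      {"paths": ("docs/", "scripts/"), "reviewers": 0},
}

def review_policy(path: str, ai_generated: bool) -> dict:
    for tier, rules in REVIEW_TIERS.items():
        if path.startswith(rules["paths"]):
            # AI-generated code on critical paths gets one extra reviewer.
            extra = 1 if ai_generated and tier == "critical" else 0
            return {"tier": tier, "reviewers": rules["reviewers"] + extra}
    return {"tier": "standard", "reviewers": 1}

review_policy("trading/engine.py", ai_generated=True)
# -> {'tier': 'critical', 'reviewers': 3}
```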
The New Bottleneck: Trust
When his team considered migrating from Redis to Valkey, the technical migration wasn't nearly as challenging as the orchestration risk. AI could rewrite code, but it couldn't assess latency trade-offs or plan rollback strategies. The leader's role was to underwrite trustworthiness: designing phased rollouts, setting risk thresholds, validating monitoring.
Are your code reviews missing the story behind the code?
Most code reviews miss the bigger picture. Reviewers see diffs, not context: why a change was made, what it impacts, or how risky it is. That lack of narrative turns reviews into guesswork and slows teams down. AI can close that gap by generating contextual summaries that explain intent, highlight dependencies, and surface hidden issues before humans even start reading the code. It makes reviews faster, cleaner, and more focused. Typo's new AI-generated PR summaries bring exactly that context into your workflow.
Find Yourself
That's it for today!
Whether youâre innovating on new projects, staying ahead of tech trends, or taking a strategic pause to recharge, may your day be as impactful and inspiring as your leadership.
See you next week(end). Ciao!
Credits
Curators - Diligently curated by our community members Denis & Varun
Featured Authors - Anton Zaides, Lizzie Matusov, Rahul Chandel (for CIO)
Sponsors - This newsletter is sponsored by Typo AI - Engineering Intelligence Platform for the AI Era.
1) Subscribe → If you aren't already, consider becoming a groCTO subscriber.
2) Share → Spread the word amongst fellow Engineering Leaders and CTOs! Your referral empowers & builds our groCTO community.



