Vibe Coding Thoughts; AI-Driven Productivity; Measuring AI Impact; LLM Reality Check
Issue #57 Bytes
🌱 Dive into Learning-Rich Sundays with groCTO ⤵️
Vibe Coding for Teams, Thoughts to Date
Ever scratch your head wondering what LLMs are doing to how we write code? This blog dives right in. It's got this super interesting take: while AI is shaking things up big time, one old truth still stands – reading code is way harder than writing it.
Kellan Elliott-McCrea spills the beans on how LLMs, despite their magic, are actually creating new headaches. Think about it: getting an AI to understand your existing codebase is tough, and then there's this weird "Not Invented Here" thing where they just want to write new stuff instead of reusing what's already there. The result? Our codebases are getting… well, bushier and kinda wild.
If you're an engineer or lead trying to figure out how to navigate this brave new AI world without drowning in a sea of messy code, you have to read this. It'll give you a fresh perspective on what's really happening under the hood.👇
Article of the Week ⭐
“[…] only 6% of the 617 engineering leaders surveyed for LeadDev’s Engineering Leadership Report in March [2025] reported a significant boost in productivity.”
AI coding assistants aren’t really making devs feel more productive
Despite the AI hype cycle, only 6% of engineering leaders in LeadDev’s 2025 survey report a significant productivity gain from AI tools.
That’s a stark contrast to marketing claims from GitHub, Cursor, and JPMorgan touting big boosts. While most developers now use AI tools like Copilot, their day-to-day experience tells a different story: minor gains, if any.
Why the Disconnect?
The wrong pain points
Most AI tools focus on code generation, not on where developers actually lose time: test feedback cycles, flaky CI, dependency deadlocks, and slow deploys. Refactoring and doc-gen help, but they don't move the real bottlenecks in a significant way.
Tooling vs. workflow
Top-down AI rollouts are often decoupled from how dev teams actually work. Leaders are buying licenses on promise and brand, not on developer-validated problems. That gap shows up in stalled adoption and underwhelming returns.
The Shift That’s Needed
Stop chasing AI as the fix and start tracking where friction lives. Bring developers into the conversation early. Co-design usage with the people doing the work.
Use AI tools to subtract waste, wait time, and guesswork, and invest continuously in keeping LLM-generated outputs accurate and helpful.
Rebecca Murphey from Swarmia nailed it: focus on the biggest bottleneck, fix it, then move to the next. AI can help—but only if it’s aligned with reality on the ground.
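The "fix the biggest bottleneck first" advice is easy to operationalize once you collect stage timings. A minimal sketch in Python, where the stage names and wait times are entirely hypothetical placeholders for data you'd pull from CI and PR telemetry:

```python
# Hypothetical median wait times (minutes) per delivery stage,
# as might be collected from CI logs and PR timestamps.
stage_wait_minutes = {
    "code_review_pickup": 180,
    "ci_test_feedback": 45,
    "flaky_test_retries": 30,
    "deploy_queue": 25,
    "dependency_updates": 15,
}

def biggest_bottleneck(waits):
    """Return the stage costing the most time, i.e. the one to fix first."""
    return max(waits, key=waits.get)

print(biggest_bottleneck(stage_wait_minutes))  # -> code_review_pickup
```

The point isn't the code; it's that "where friction lives" becomes an empirical ranking instead of a hunch, and AI spend can target the top of the list.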
Other highlights 👇
How to Measure AI Impact in Engineering Teams
Too many teams are adopting AI tools without knowing whether they’re helping, hurting, or just adding noise. Headlines like “AI writes 50% of our code” are often meaningless without context.
Agents Aren’t Developers
AI agents accelerate humans and shouldn't be measured like individual contributors. Their impact rolls up into team-level output and, ideally, outcomes. Research shows that tracking output purely by lines of code per engineer gives the wrong picture.
How to Measure
Telemetry: Usage data from source control, AI tools, PRs.
Surveys: Developer satisfaction, perceived gains.
Experience Sampling: Real-time prompts during work to uncover use cases that matter.
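The telemetry and survey signals above only become useful when rolled up together per team. A minimal sketch, with made-up field names and numbers, combining AI-assisted PR share from telemetry with perceived-gain survey scores:

```python
# Hypothetical per-team data: PR telemetry plus 1-5 survey responses.
teams = {
    "platform": {"prs_total": 120, "prs_ai_assisted": 48, "survey_scores": [4, 3, 5, 4]},
    "mobile":   {"prs_total": 80,  "prs_ai_assisted": 12, "survey_scores": [2, 3, 3]},
}

def summarize(team):
    """Pair adoption telemetry with sentiment, at team (not individual) level."""
    adoption = team["prs_ai_assisted"] / team["prs_total"]
    sentiment = sum(team["survey_scores"]) / len(team["survey_scores"])
    return {"ai_pr_share": round(adoption, 2), "avg_sentiment": round(sentiment, 1)}

for name, data in teams.items():
    print(name, summarize(data))
```

High adoption with low sentiment (or vice versa) is exactly the kind of disconnect the article warns about, and it's invisible if you track either signal alone.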
Don’t Let Metrics Backfire
Start small, baseline early.
Avoid using metrics to judge individuals.
Communicate clearly: you're measuring to guide investment, not to surveil teams.
Software engineering with LLMs in 2025: reality check
AI Dev Tools Startups
Teams building AI tools are, unsurprisingly, the most aggressive users.
Anthropic: 90% of Claude Code is written by Claude itself. Productivity gains internally range from 2x to 10x, depending on the developer.
Windsurf: 95% of code written using its own Cascade agent and passive tool, Windsurf Tab.
Cursor: Used daily across the company, generating ~40–50% of production code.
These companies are iterating fast on agent-based flows and seeing meaningful leverage in day-to-day work.
Big Tech
At Google and Amazon, AI tools are increasingly embedded—but adoption looks different.
Google has deeply integrated Gemini into internal tools like Cider (IDE), Critique (code review), and Code Search. Their approach is cautious, aiming to build trust. Internally, teams expect a 10x increase in code volume, and infra teams are preparing accordingly.
Amazon has invested heavily in Q Developer. After shaky early performance, it improved significantly by removing its dependency on Amazon’s older in-house model (Nova). The CLI version is rapidly growing in popularity. Amazon is also building out MCP infrastructure at scale—reminiscent of its “API-first” mandate from 2002, enabling agent and tool chaining across thousands of services.
AI Startups (Non-tooling)
Some are all-in, others still skeptical.
incident.io: Embracing AI deeply, with Claude Code used across the team and learnings shared regularly in Slack.
A biotech AI startup: Tried multiple tools but found most unhelpful. The team saw more benefit from non-AI productivity tools like the ruff linter and uv package manager.
Veteran Engineers
Perhaps the most surprising trend is how many respected engineers have changed their minds.
Armin Ronacher (Flask): Went from skeptic to full adopter. Claude Code now handles much of his refactoring and daily dev work.
Kent Beck (XP) and Martin Fowler both see this shift as on par with the move from assembly to high-level languages.
Simon Willison and Peter Steinberger point to agentic tools—models that run code, test, and iterate—as a major unlock in recent months.
Adoption Data
DX’s unpublished study (38,000+ devs): ~50% of developers use AI tools weekly.
Time savings are real but modest: a median of ~4 hours/week, roughly 10% of a 40-hour work week.
Many tools shine for individuals, but organizational-level benefit is still catching up. Metrics like deployment frequency and test coverage still matter, and aren’t automatically lifted by AI coding gains.
Open Questions
Why are founders and execs consistently more bullish than devs?
Why hasn’t org-wide adoption matched individual enthusiasm?
How much value do these tools really add outside of writing code?
Impact is uneven. And in many places, the most exciting work is just getting started.
Find Yourself 🌻
That’s it for Today!
Whether you’re innovating on new projects, staying ahead of tech trends, or taking a strategic pause to recharge, may your day be as impactful and inspiring as your leadership.
See you next week(end), Ciao 👋
Credits 🙏
Curators - Diligently curated by our community members Denis & Kovid
Featured Authors - Kellan Elliott-McCrea, Chantal Kapani, Laura Tacho, Gergely Orosz
Sponsors - This newsletter is sponsored by Typo AI - Ship reliable software faster.
1) Subscribe — If you aren’t already, consider becoming a groCTO subscriber.
2) Share — Spread the word amongst fellow Engineering Leaders and CTOs! Your referral empowers & builds our groCTO community.