You Can't Vibe Code A Payments Platform, But Even Regulated Industries Are Rethinking Writing By Hand
Sohil Shah, Staff Software Engineer at PayPal, on why AI-assisted development in regulated industries inverts the skill stack from writing to reading.

To survive going forward, engineers need to get really good at reading code even more so than writing it.
Most conversations about AI and software engineering happen in a world where the worst outcome of a bad deploy is a rolled-back feature and an apologetic Slack message. In Sohil Shah's world, a bad deploy moves money.
Shah is a Staff Software Engineer at PayPal, where he builds agentic AI systems that respond to production incidents on the company's financial infrastructure. Before PayPal, he spent nearly three years at ByteDance as a backend engineer on TikTok's WebRTC systems, including the U.S. data compliance initiative driven by regulatory pressure known as Project Texas. Before that, he spent nearly eight years at JPMorgan Chase, where he rose to VP of Software Engineering managing cross-regional teams with budgets over $5 million, building payments platforms in an environment where SOC 2, PCI-DSS, and federal oversight are architectural constraints that dictate exactly what you can build and how.
Every stop on Shah's resume is a place where compliance shaped the engineering. That background colors everything he thinks about AI-assisted development, and it's why his read on where the industry is heading sounds different from the usual Silicon Valley optimism.
"To survive going forward, engineers need to get really good at reading code even more so than writing it," Shah told The Read Replica. "The writing is probably going to be done almost entirely by AI at some point in the near future."
In a startup, that's an interesting thesis about developer productivity. In a regulated enterprise, it's a statement about risk management. When AI writes code that touches financial transactions, patient records, or compliance-sensitive data, success comes down to whether a human reviewed it with enough depth to catch what the model got wrong, and whether there's an audit trail proving they did.
Less writing, more reading
"AI code is sloppy. It is error prone. It is never going to be 100% perfectly accurate," Shah said. He's not speculating. Veracode's 2025 GenAI Code Security Report tested more than 100 large language models and found that 45% of AI-generated code introduced OWASP Top 10 vulnerabilities. The Spring 2026 update tested the flagship releases from OpenAI, Google, and Anthropic and landed on the same number. Syntax correctness has climbed past 95%, but security pass rates haven't moved. The code compiles, the tests pass, and the controls a senior developer would include on instinct are missing.
In most of tech, that's a quality problem. In financial services, it's a PCI-DSS violation waiting for an auditor.
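The failure mode is easy to picture. Here is a minimal, hypothetical sketch (the table, function names, and data are invented for illustration, not taken from the Veracode report) of code that compiles and passes a happy-path test while carrying an OWASP Top 10 injection flaw, next to the control a senior reviewer would insist on:

```python
import sqlite3

def lookup_unsafe(conn, email):
    # String interpolation into SQL: works for normal input,
    # but is injectable (OWASP A03: Injection).
    return conn.execute(
        f"SELECT id FROM users WHERE email = '{email}'"
    ).fetchall()

def lookup_safe(conn, email):
    # The missing control: a bound parameter instead of interpolation.
    return conn.execute(
        "SELECT id FROM users WHERE email = ?", (email,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, email TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "a@example.com"), (2, "b@example.com")])

payload = "' OR '1'='1"
print(lookup_unsafe(conn, payload))  # leaks every row: [(1,), (2,)]
print(lookup_safe(conn, payload))    # returns nothing: []
```

Both functions return identical results on well-formed input, which is exactly why the flaw survives a passing test suite and only a careful read catches it.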
Shah is still pro-tool. But he argues AI has permanently altered what it means to be good at the job, and the shift hits harder in places where "move fast and break things" was never on the table. "If you provide the right prompts, the right guardrails, the right tools, then AI can perform to expected levels," he said. "That's where experience comes in. You need experienced developers who can provide feedback to the models and improve from that point on."
The market is arriving at a version of the same conclusion. In December 2025, Cursor acquired Graphite, the code review platform used by Shopify, Snowflake, and Figma, for a sum reportedly well above Graphite's $290 million valuation. Cursor CEO Michael Truell told Fortune the reasoning behind the deal was that "the way engineering teams review code is increasingly becoming a bottleneck to them moving even faster as AI has been deployed more broadly within engineering teams." Faros AI's telemetry report from more than 10,000 developers put numbers to it: high-AI-adoption teams merged 98% more pull requests, review time ballooned 91%, and company-level delivery metrics stayed flat. The writing accelerated. Reading it didn't.
The work nobody misses
If reading is the new bottleneck, the nature of engineering work has to follow. At PayPal, Shah has watched the shift start with the disappearance of the most dreaded line item on every team's backlog.
"I used to spend a lot of time on code refactoring," he said. "You push code out as soon as possible for business needs, and you incur a lot of technical debt." The traditional playbook of hiring contractors, bringing on temps, or routing the work to interns was never efficient. In a regulated environment, it cost you twice. Every new pair of hands on production code meant another access review, another compliance onboarding, another set of credentials to provision and revoke. "All of that is now gone," Shah said. "I can ask the agent to refactor specific pieces of code, and in most cases it writes better code than a human, or at least at the same level."
The agents don't need provisioned database access. They don't need compliance onboarding. They don't generate the operational overhead that temporary human labor creates in regulated environments. The tedious middle layer of engineering work is getting absorbed: the refactoring sprints deprioritized quarter after quarter, the tickets nobody volunteered for.
"The quality of work that engineers have to do is now a lot richer," Shah said. "A lot more is expected out of them." The agents may not have eliminated the headcount, but they've raised the floor on what counts as engineering work.
When the agents run the investigation
The clearest window into how this inversion plays out operationally is Shah's core domain: incident response on financial infrastructure.
The traditional model is synchronous and manual. An incident fires, a manager assembles a call, SMEs join one by one, and everyone attempts a live diagnosis while the clock ticks. In financial services, the clock runs faster than in most industries. Regulatory reporting obligations can kick in within hours, and prolonged outages on payment systems carry reputational and compliance exposure that compounds by the minute.
Shah's team is replacing the investigation layer with agents. "The heavy lift like looking at the metrics, the data, all the changes that have happened in production are getting automated," he said. "These agents are capable enough to figure out the issues and give a full report. The incident managers can then be in the driving seat to take decisions rather than waiting for someone to join the call."
The pattern is starting to show up across the industry. AWS, Microsoft, and New Relic have all shipped agentic SRE products in the past year. But what Shah describes is the same shift in miniature: synchronous investigation becomes asynchronous, agent-driven analysis. The humans move up the loop. Instead of spending the first stretch of an incident playing detective, the incident manager starts with a report and makes calls. The agent does the legwork. The human reads, evaluates, and decides.
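The shape of that asynchronous layer fits in a few lines. This is a hypothetical sketch, not PayPal's system: the service names, time window, and correlation logic are all invented to show the pattern of an agent correlating an incident with recent production changes and handing the incident manager a pre-written report.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Change:
    service: str
    deployed_at: datetime
    summary: str

def investigate(incident_at: datetime, changes: list[Change],
                window: timedelta = timedelta(hours=2)) -> str:
    """Correlate an incident with recent production changes so the
    human starts from a report rather than a live detective hunt."""
    suspects = sorted(
        (c for c in changes
         if timedelta(0) <= incident_at - c.deployed_at <= window),
        key=lambda c: incident_at - c.deployed_at)  # most recent first
    report = [f"{len(suspects)} change(s) in the {window} before the incident:"]
    report += [f"  {c.service} at {c.deployed_at:%H:%M}: {c.summary}"
               for c in suspects]
    return "\n".join(report)

changes = [
    Change("payments-api", datetime(2026, 1, 5, 15, 47), "config rollout"),
    Change("ledger", datetime(2026, 1, 5, 9, 10), "schema migration"),
]
print(investigate(datetime(2026, 1, 5, 16, 0), changes))
```

A real agent would pull from metrics, logs, and deploy pipelines instead of an in-memory list, but the division of labor is the same: the agent does the legwork, the human reads and decides.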
Why "usually right" isn't good enough
The need for an auditable reasoning trail is what pushed Shah's team away from standard RAG toward something more structured. "General RAG is not sufficient for the models to make a decision," Shah said. "It hallucinates too much. So we've focused on GraphRAG, where you have specified nodes and relationships that are predefined. We've found a better success rate in deterministic decision-making."
The distinction matters in regulated contexts. Standard RAG retrieves documents by semantic similarity and lets the model reason over them. Powerful, but opaque. You can't easily trace why the model reached a particular conclusion, which makes it difficult to defend to an auditor. Graph-based approaches traverse explicit, predefined relationships between entities. The reasoning path is inspectable. When an agent says "Service X caused the incident because it depends on Service Y, which was updated at 3:47 PM," you can follow that chain back through the graph and verify every step.
Microsoft open-sourced GraphRAG in 2024, and the approach has matured fast. LazyGraphRAG brought indexing costs down to 0.1% of the original. The core insight that drove both projects, that agents traversing logical chains need explicit relationships rather than inferred ones, is the same conclusion Shah's team reached independently from inside a payments company.
The Postgres ecosystem is heading in the same direction. Platforms building on Postgres are making database-level access controls a default rather than an opt-in, treating enforcement at the data layer as infrastructure rather than configuration — exactly the kind of guardrail Shah's team needs when agents are querying production systems autonomously.
"For zero-to-one applications, you'd start from prototyping and even vibe coding to set up the project," he said. "But for applications already running in production, you don't want to vibe any decisions. You need a very deterministic path."
Day-one contributors, decade-old constraints
The same tools that are clearing the backlog are also collapsing the onboarding curve. Shah described a pattern every engineering leader recognizes: tenured engineers accumulating tribal knowledge that makes them indispensable while creating a barrier for newcomers. "It's always been very hard for someone coming in from outside to establish themselves as a core contributor," he said. AI tools have largely dissolved that barrier. "Now, it's very easy for someone coming in to become a core contributor on day one."
In a startup, faster onboarding is an unqualified win. Regulated environments are more complicated. The tribal knowledge that senior engineers carried wasn't just about the codebase, but also the compliance context surrounding the codebase. Which services handle PII? Where are the encryption boundaries? Why does that particular data flow exist in that particular shape? The answer is often that a regulatory requirement from years ago mandated it. Even if AI can make the code legible on day one, it can't transfer the institutional memory of why the code is shaped the way it is.
Shah acknowledged the gap. Agents are "really good at reading code and finding things that experienced developers would miss," he said. "But they're not good at making architecture decisions. If you ask them to improve the existing architecture, they're not going to make good decisions in the long run."
Where the return actually lives
For teams in regulated industries, automating existing processes means navigating the full weight of existing controls, audit requirements, and regulatory expectations. Those are the very things that make the processes slow in the first place, and the overhead doesn't shrink just because an agent is doing the work. If anything, it grows: now you need to prove the agent's decisions are auditable too.
Building something new is a different calculus. A product or revenue stream that didn't exist before doesn't inherit a decade of accumulated compliance architecture. The constraints are real, but they're scoped from scratch rather than layered on top of legacy controls nobody fully remembers the origin of.
"I get asked this a lot. 'Is there enough ROI for existing teams to automate their existing processes?'" he said. "And my answer is usually, no. You might not find a high ROI for it." The value, he argued, lives somewhere else. "If you are looking at these models to create new products, new experiences, new revenue streams, that's where the highest value differentiator is going to be."
The writing gets cheaper by the quarter. The judgment to evaluate it, and the institutional knowledge to know which compliance constraints the agent can't see, gets more expensive by the day.