Latest News

You Can't Outrun Governance, but Reducing Hurdles to Production-Readiness Starts With Condensing AI Pipelines

Joe Gem Reyes, Cloud Technical Engineer at Westcoast Cloud, shares how he reduces the hurdles to production-ready AI deployments with Fabric.

Working with MCP and AI means instead of ten hurdles, you're just going through two, making it much easier to capture insights and deliver what the customer wants.

Joe Gem Reyes

Cloud Technical Engineer

Westcoast Cloud

MCP turned one year old in November. Despite a rocky start in the hearts and minds of devs, the protocol that was supposed to standardize how LLMs talk to enterprise systems has begun to see real enterprise adoption, hinting at its longevity. OpenAI, Google DeepMind, and Microsoft have all adopted it, the Linux Foundation now governs the spec, and wiring an agent to a production database has gone from a dedicated integration sprint to an afternoon of configuration.

But an afternoon doesn't leave much room for governance. Bitsight's TRACE researchers found roughly 1,000 MCP servers sitting on the public internet with no authorization controls, some wired directly into Kubernetes clusters, CRM platforms, and production databases. The only thing between that exposure and a production incident is whether someone mapped the data layer before the agent went live.

Joe Gem Reyes, Master of Data Science, is one of the people operating in that gap. A Cloud Technical Engineer at Westcoast Cloud, the UK's largest Microsoft distributor, Reyes has spent nearly a decade bridging financial operations and data science across Google, RingCentral, and the Bank of the Philippine Islands. He's now the sole data professional for an office of 70, building AI agents on Microsoft Fabric and shipping insights for ~700 resellers at a velocity that would have required a dedicated team two years ago.

His stack on Fabric runs LLMs directly against semantic data models, and what used to split across data engineering, modeling, and application logic lives in a single pipeline he operates alone. "Working with MCP and AI means instead of ten hurdles, you're just going through two, making it much easier to capture insights and deliver what the customer wants." With query construction, schema navigation, and API orchestration abstracted away, the bottleneck has shifted to data quality and access control. But most teams are still staffed for the plumbing MCP just replaced.

Ten hurdles down to two

"An AI agent should never touch production data," Reyes said. "You isolate it, you put a read-only copy in front of the agent, and you treat everything the model generates as untrusted until a human says otherwise. That's not optional. That's the baseline."

OpenAI runs this pattern at scale: a single Postgres primary with nearly 50 read replicas across multiple regions serving 800 million ChatGPT users. If it's good enough for that workload, the argument for pointing an unsandboxed agent at a production database gets harder to make.
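The isolation pattern Reyes describes reduces to a few lines of configuration. The sketch below uses SQLite's read-only URI mode as a stand-in for a production read replica; the table and values are hypothetical, and in a real deployment the same guarantee would come from a replica endpoint plus a role granted only SELECT.

```python
import os
import sqlite3
import tempfile

# Hypothetical data; SQLite stands in for the replica the agent reads from.
db_path = os.path.join(tempfile.mkdtemp(), "replica.db")

# The "primary" writes data the agent will later consume.
with sqlite3.connect(db_path) as primary:
    primary.execute("CREATE TABLE sales (reseller TEXT, amount REAL)")
    primary.execute("INSERT INTO sales VALUES ('acme', 1200.0)")

# The agent only ever gets a read-only handle (mode=ro).
agent_conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
rows = agent_conn.execute("SELECT reseller, amount FROM sales").fetchall()

# Any write the agent attempts fails at the driver level, not in review.
write_blocked = False
try:
    agent_conn.execute("INSERT INTO sales VALUES ('evil', 0)")
except sqlite3.OperationalError:
    write_blocked = True
```

The point is where the enforcement lives: the agent's connection physically cannot write, so "treat everything the model generates as untrusted" is backed by the database, not by convention.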

"My days are incredibly busy generating insights. Using tools like Copilot and OpenAI makes it easier for me," Reyes told The Read Replica. "I am a data scientist, but I am slowly transitioning into an AI developer because I also build agents. I am accepting AI as my companion now."

For Reyes, the real work happens before any of it gets turned on: mapping where datasets live, scoping RLS policies, and budgeting for validation that nobody wants to fund until something breaks. Done right, the same tools that collapsed ten hurdles down to two become a governed system of record. Done without that groundwork, they stay a demo.

Champagne taste, open-source budget

"Many companies say they want AI, but they're consistently unprepared for the financial cost or the long testing phases required to do it right," Reyes said.

Model licensing is the line item teams plan for, but the real cost lands during validation: spinning up parallel test environments, benchmarking LLM outputs against ground truth datasets, and hardening pipelines against the edge cases that surface once a model hits production data. Hallucinated joins. Malformed aggregations. DAX expressions that return plausible numbers from the wrong grain. Those are the failure modes that burn weeks of engineering time, and they don't show up in anyone's initial business case.
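Benchmarking against ground truth can be as simple as running the model-generated query and a hand-written reference query over the same data and comparing results. A minimal sketch, with SQLite standing in for the warehouse and hypothetical table names; note the second candidate runs without error and still fails, which is exactly the failure mode that makes this check worth funding:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (reseller TEXT, region TEXT, amount REAL);
INSERT INTO orders VALUES ('acme','uk',100),('acme','eu',50),('bolt','eu',70);
""")

# Hand-written ground truth: total revenue per reseller.
GROUND_TRUTH_SQL = "SELECT reseller, SUM(amount) FROM orders GROUP BY reseller"

def validate(candidate_sql: str) -> bool:
    """Run a model-generated query and compare it to the ground-truth result."""
    expected = sorted(conn.execute(GROUND_TRUTH_SQL).fetchall())
    try:
        actual = sorted(conn.execute(candidate_sql).fetchall())
    except sqlite3.Error:
        return False  # query didn't even run
    return actual == expected

# Correct query passes; a plausible-looking query at the wrong grain fails,
# even though it executes cleanly and returns reasonable-looking numbers.
good = validate("SELECT reseller, SUM(amount) FROM orders GROUP BY reseller")
bad = validate(
    "SELECT reseller, SUM(amount) FROM orders GROUP BY reseller, region"
)
```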

Reyes runs this gauntlet as a team of one, sitting on datasets large enough that manual processing stopped being viable a long time ago. The pressure to skip validation and ship is constant. He argues the testing is where the value actually gets locked in.

Shadow agents

In March, an internal AI agent at Meta was asked to analyze a technical question posted on a company forum. Instead of returning a private response, the agent posted its (incorrect) answer publicly, and a colleague acted on it. For two hours, employees without clearance had access to sensitive company and user data. Meta classified it SEV1. No attacker or exploit; just an agent acting outside its scope with nobody watching.

The governance problem, as Reyes sees it, starts with people like him. He can curate and launch his own LLM tomorrow. No infrastructure team, no approval gate. Nobody tracking what it touches once it's live. "When there are no enforced controls at the data layer, you end up with ad hoc systems hitting sensitive datasets and nobody knows they're there," he said.

Meta had the resources, the security team, and the infrastructure to contain that incident in two hours. Most organizations operating at Reyes's scale don't. With the barrier to entry at zero, strict data contracts and curation are the only real differentiators between a POC that stalls and a tool that makes it to production. The shadow IT problem from a decade ago is back, but now the unauthorized tool can execute code against your production data.

One-minute risk windows

"You cannot disregard hallucinations or misinterpretations when you're working in an AI environment," Reyes said. "I've had LLMs generate DAX that looks clean, runs without errors, and pulls from the wrong table relationships. The numbers come back looking reasonable, and if you're not checking the grain, you'd ship that to a reseller without blinking."
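The grain check Reyes describes is mechanical to automate: if a result is supposed to be one row per reseller, any duplicated key is the signature of a join fanning out or an aggregation at the wrong level. An illustrative sketch (names and values hypothetical):

```python
from collections import Counter

def check_grain(rows, key_index=0):
    """Return keys that appear more than once in a result that is
    declared to be one row per key -- a wrong-grain red flag."""
    counts = Counter(row[key_index] for row in rows)
    return [key for key, n in counts.items() if n > 1]

# Numbers that "look reasonable" but carry a duplicated key from a bad join.
suspect = check_grain([("acme", 1200.0), ("bolt", 800.0), ("acme", 1200.0)])

# A clean result: one row per reseller, nothing flagged.
clean = check_grain([("acme", 2400.0), ("bolt", 800.0)])
```

A check like this costs nothing to run on every LLM-generated result and catches exactly the class of error that "runs without errors" hides.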

The accidental failures are bad enough. The deliberate ones are worse. An unguarded MCP connection to a semantic model is an open channel, and a crafted prompt can escalate into arbitrary query execution. Injections don't need to break anything to succeed. They just need the model to trust the input.
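One concrete defense is to refuse anything but reads at the database layer, so a smuggled statement fails before it executes rather than after someone notices. The sketch below uses SQLite's authorizer hook as a stand-in for the kind of scoped, read-only access the article describes; the table, values, and injected statement are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
conn.execute("INSERT INTO accounts VALUES (1, 500.0)")
conn.commit()

# Permit only read actions; everything else is refused when the
# statement is prepared, before it can touch any data.
READ_ONLY = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}

def deny_writes(action, arg1, arg2, db_name, trigger):
    return sqlite3.SQLITE_OK if action in READ_ONLY else sqlite3.SQLITE_DENY

conn.set_authorizer(deny_writes)

# Legitimate reads still work through the guarded connection.
rows = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchall()

# A write smuggled in via a crafted prompt is rejected outright.
injection_blocked = False
try:
    conn.execute("UPDATE accounts SET balance = 0")
except sqlite3.DatabaseError:
    injection_blocked = True
```

The injected query doesn't need to look malicious to be dangerous, which is why the allow list is enforced by the connection rather than by inspecting the model's output.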

Reyes's pipeline scopes datasets tightly, enforces RLS at the Fabric layer, and routes nothing to production without human review. In practice, he sees injection attempts regularly across SQL, DAX, and Python workloads. Even a one-minute exposure window is enough for InfoSec to escalate. "You will not always see these vulnerabilities at face value. The query runs, the output looks right, and before anyone notices, the injection is already three layers deep."

Process dictates plumbing

Reyes's tactical advice for teams juggling multiple models: cut from five to two. Fewer models means a smaller validation surface, lower licensing costs, and governance that a lean team can actually enforce.

The architectural advice goes deeper. Whether a team runs Fabric, Databricks, or dbt Labs, the same mistake keeps surfacing: teams adopt whatever architecture is trending instead of mapping their own data flows first. "If we know where our data resides and understand our specific processes, we can decide on the best architecture to use," he said. "I have seen many companies go down the drain because they didn't plan ahead, used the wrong data models, or were not ready to embrace data and AI together."

The wreckage Reyes is describing has nothing to do with model selection. Every one of those companies pointed a perfectly good model at plumbing that wasn't ready for it.