
Why API Gateways Don't Solve AI Governance

2025-12-10

i’ve spent the last few months talking to teams deploying large language models in production. and there’s a pattern i keep seeing: they try to drop their LLM behind an existing API gateway—kong, apigee, aws api gateway, whatever—and expect it to Just Work.

it doesn’t. and it’s not because traditional gateways are bad. they’re actually pretty great at what they were designed for: routing traffic, rate limiting, authentication, basic transformation. they’ve been doing that reliably for over a decade. the problem is that AI workloads are fundamentally different, and trying to force AI through a traditional gateway architecture is like driving a screw with a hammer.

let me break down why.

the identity problem

here’s where it gets weird. in a traditional API setup, a client calls your gateway with auth credentials, and you route to a backend service. the backend knows who made the request because the auth token the gateway forwards carries the caller’s identity.

but with AI, the architecture is different. your user authenticates with the LLM, and the LLM calls your backend with a shared API key. your backend now has no idea who actually triggered that request. it just sees a call from “the LLM service.” that’s what i call the three-party problem, and it breaks everything downstream.

you can’t track usage per user. you can’t enforce per-user budget limits. you can’t audit who called what. and if something goes wrong, you have no idea whose request caused it.

traditional gateways have no mechanism for binding identity through the model layer, because they’ve never needed one. they authenticate the client, and the model was always inside your own boundary. now it’s not.
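to make that concrete, here’s a minimal sketch of what identity binding could look like, using PyJWT. the idea: the gateway mints a short-lived delegation token when the user’s request comes in, that token rides along on every downstream call the LLM makes, and the backend recovers the original user from it. the names here (mint_delegation_token, verify_delegation) are made up for illustration, not any real gateway’s API.

```python
# sketch: binding end-user identity through the model layer with a
# short-lived delegation token (PyJWT). all names are illustrative.
import time
import jwt  # pip install PyJWT

SIGNING_KEY = "replace-with-a-real-secret"

def mint_delegation_token(user_id: str, allowed_tools: list[str]) -> str:
    """gateway side: minted when the user's request enters the gateway.
    travels with every call the LLM makes on the user's behalf."""
    claims = {
        "sub": user_id,                 # who actually triggered this
        "act": "llm-service",           # the service acting on their behalf
        "tools": allowed_tools,         # what the delegation covers
        "exp": int(time.time()) + 300,  # short-lived: 5 minutes
    }
    return jwt.encode(claims, SIGNING_KEY, algorithm="HS256")

def verify_delegation(token: str) -> dict:
    """backend side: instead of an anonymous shared API key, the backend
    recovers the original user (and fails on expired tokens)."""
    return jwt.decode(token, SIGNING_KEY, algorithms=["HS256"])

# the gateway mints once per user request...
token = mint_delegation_token("user-123", ["invoices.read"])
# ...and the backend can now attribute the call:
claims = verify_delegation(token)
print(claims["sub"])  # -> "user-123", not "the LLM service"
```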

policy enforcement isn’t per-endpoint anymore

with a REST API, you enforce policy at the endpoint level. this endpoint requires read access, that one requires admin, etc. pretty straightforward.

but with AI, you need policy enforcement at the tool level. the user can call the LLM, which can call your payment API, which calls your database, which calls your logging service. the LLM might only be allowed to call read endpoints. a different user might be allowed to call write endpoints but not delete. you need to enforce this per-tool, per-user, and dynamically.

a traditional gateway can’t do this. you’d have to write custom plugins for every tool, every policy combination. suddenly you’ve got a maintenance nightmare, and you’re essentially building a second product inside your gateway’s plugin layer.
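to show what i mean, here’s a toy sketch of per-tool, per-user policy evaluation. everything in it is hypothetical (the policy table, the names); the point is only that the authorization decision takes (user, tool, action) as input, not an endpoint path:

```python
# sketch: policy decided per (user, tool, action) rather than per endpoint.
# the policy table and names are hypothetical, for illustration only.
from dataclasses import dataclass, field

@dataclass
class ToolPolicy:
    allowed_actions: set[str] = field(default_factory=set)

# per-user, per-tool grants; in practice this comes from your identity
# provider or a policy engine, not a hardcoded dict.
POLICIES: dict[str, dict[str, ToolPolicy]] = {
    "user-123": {
        "payments": ToolPolicy({"read"}),           # read-only
        "database": ToolPolicy({"read", "write"}),  # no delete
    },
    "user-456": {
        "payments": ToolPolicy({"read", "write"}),
    },
}

def authorize(user_id: str, tool: str, action: str) -> bool:
    """called on every tool invocation the LLM attempts, using the user
    identity recovered from the delegation token above."""
    policy = POLICIES.get(user_id, {}).get(tool)
    return policy is not None and action in policy.allowed_actions

assert authorize("user-123", "payments", "read")
assert not authorize("user-123", "payments", "write")
assert not authorize("user-456", "database", "delete")
```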

content inspection before it reaches the model

here’s something that keeps security teams up at night: PII leakage. your user sends a request like “hey, process this invoice” and the invoice contains their credit card number. that gets sent to the LLM, and now your prompt is carrying sensitive data.

traditional gateways can do basic request/response inspection. but they’re not designed to understand context. they don’t know that a credit card number in an invoice is different from a credit card number in a test payload. they can’t parse unstructured data and make intelligent decisions about what’s safe to send to the model.

you need content transformation and inspection specifically built for AI use cases. mask PII before it reaches the model. validate that tool responses don’t contain sensitive data. log what data touched your model for compliance audits. traditional gateways weren’t built for this.
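as a rough sketch of the masking step, here’s one narrow case: credit card numbers, caught with a regex plus a Luhn checksum to cut down on false positives. a real inspection layer needs far more than regexes (entity detection, document-type awareness), but the placement is the point: this runs before the prompt ever leaves for the model.

```python
# sketch: mask card numbers before the prompt reaches the model.
# regex + Luhn check only; a real pipeline would use stronger detectors.
import re

CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")

def luhn_valid(digits: str) -> bool:
    """standard Luhn checksum, to avoid masking random digit runs."""
    total, parity = 0, len(digits) % 2
    for i, ch in enumerate(digits):
        d = int(ch)
        if i % 2 == parity:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def mask_cards(prompt: str) -> str:
    def repl(m: re.Match) -> str:
        digits = re.sub(r"[ -]", "", m.group())
        if luhn_valid(digits):
            return "[CARD-" + digits[-4:] + "]"  # keep last 4 for context
        return m.group()  # not a real card number; leave it alone
    return CARD_RE.sub(repl, prompt)

print(mask_cards("process this invoice, card 4242 4242 4242 4242, total $90"))
# -> "process this invoice, card [CARD-4242], total $90"
```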

budget tracking is per-service, not per-user

rate limiting in traditional gateways works great: 1000 requests per minute per API key, done.

but with AI, you need per-user budget tracking. user A gets $100/month, user B gets $500/month. the LLM doesn’t know about these limits, so the gateway has to enforce them. and it has to do it while the LLM is making multiple calls on behalf of a single user. one user request can trigger 5 model calls, each of which might call multiple backend tools.

traditional gateways have no native concept of “this user’s monthly quota.” you’d have to hack it in with plugins, caching layers, and custom logic. messy.
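for contrast, here’s a toy sketch of the accounting a purpose-built gateway has to do instead: attribute every model call back to the originating user and check their monthly budget before each one. it’s in-memory for illustration only; a real implementation needs shared, durable storage, since one user request fans out into many calls across processes.

```python
# sketch: per-user monthly budget enforcement. in-memory only; real
# gateways need shared storage because one user request fans out into
# many model and tool calls across processes.
from collections import defaultdict
from datetime import datetime, timezone

MONTHLY_BUDGET_USD = {"user-a": 100.0, "user-b": 500.0}
_spend: dict[tuple[str, str], float] = defaultdict(float)

def _month_key() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m")

def charge(user_id: str, cost_usd: float) -> bool:
    """called before each model call made on the user's behalf. returns
    False (and doesn't charge) if the call would exceed the monthly budget."""
    key = (user_id, _month_key())
    budget = MONTHLY_BUDGET_USD.get(user_id, 0.0)
    if _spend[key] + cost_usd > budget:
        return False
    _spend[key] += cost_usd
    return True

# one user request -> five model calls, all attributed to the same user
for _ in range(5):
    assert charge("user-a", 0.04)  # ~$0.20 total against the $100 budget
print(_spend)
```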

the alternative

what you really need is a gateway purpose-built for AI: one that handles identity binding across model layers, enforces policy on a per-tool basis, inspects content before it reaches the model, tracks budgets per user (not per service), and gives you audit logs that actually make sense.

this is what we built gatewaystack to solve. not because traditional gateways are bad, but because AI has genuinely different requirements.

so before you spend the next six months writing custom plugins for your gateway, ask yourself: am i solving AI-specific problems with a tool that was built for REST APIs?

what’s your biggest headache trying to run AI workloads through a traditional gateway?
