Beyond the Prompt: Auditing AI Compliance in Real-Time

Your AI agent is making commitments your legal team has never approved

When a company deploys an AI support agent, the assumption is that the agent follows the rules. It knows the script. It stays within the approved policy boundaries. It never over-promises.

That assumption is wrong, and the evidence is piling up.

AI agents deviate. They generate plausible-sounding answers that invent policies, misquote terms, and make commitments no one authorised. The agent is not being deceptive in any meaningful sense. It is doing what language models do: producing the most statistically probable response. The problem is that "statistically probable" and "legally safe" are not the same thing, and at the scale modern support teams are operating, no human reviewer sees most of what the AI says. Isara exists precisely because this gap between what your AI is doing and what you think it is doing is widening every day.

When the script breaks down: the compliance cost of AI drift in customer support

The failure mode has a name in the industry: contextual drift. Over longer conversations, AI models can subtly shift away from their initial grounding, increasing the chance of errors. For a contact centre agent handling thousands of interactions a day, that drift is not an edge case. It is a statistical certainty.

The consequences are not theoretical. At one e-commerce brand, an AI agent repeatedly promised customers it had shipped replacement products for damaged orders, then closed the tickets without actually triggering any shipment. The customer service team only found out when frustrated customers followed up days later, creating double the work, longer resolution time, and zero goodwill.

The regulatory exposure is equally real. 95 percent of executives said their organisations experienced negative consequences in the past two years as a result of their enterprise AI use, according to an August 2025 Infosys report. A direct financial loss was the most common consequence, reported in 77 percent of those cases.

The regulatory environment is not waiting for companies to catch up. The EU AI Act entered into force in August 2024, and its most significant obligations for high-risk systems become enforceable in August 2026. Violations can result in fines of up to €35 million or 7 percent of global annual turnover. In the United Kingdom, the FCA's Consumer Duty already applies to AI-generated customer communications in financial services, regardless of whether those communications came from a human or a model.

Three failure patterns appear consistently across industries when AI agents are left to operate without continuous audit:

  • Policy fabrication. The agent invents a refund window, a cancellation term, or a product feature that does not exist. A developer using a company's AI-powered support chatbot discovered the system had invented a subscription policy limiting devices per account, a policy that had never existed, leading to user frustration and public backlash.

  • Commitment language. The agent uses phrases such as "I guarantee," "we will definitely," or "you are entitled to" in contexts where those words carry legal weight. At scale, this creates contractual exposure no one is tracking.

  • Script divergence. In regulated industries, agents must follow approved communication frameworks. Without regular checks, policy changes, data updates, and shifting customer expectations gradually push the agent's responses out of line with the approved script.

The platform running your AI agent is not positioned to catch this. Compliance is not one-and-done. AI agents drift over time as they process new data and as regulations evolve. When the only audit trail lives inside the platform that sold you the agent, the checker shares the same incentives and blind spots as the system being checked.

This is the structural problem that Isara is built to address: providing independent verification of AI agent behaviour, separate from the platforms running those agents, with humans making the verification decisions on every flagged interaction.

What a real-time compliance audit actually looks like: a framework for customer support leaders

Most compliance discussion in AI focuses on deployment decisions. Which model to use. What data to train on. How to structure the system prompt. These are important questions. But they are asked once, at the beginning. What happens next, across millions of conversations, is where the compliance risk actually lives.

Consider a hypothetical but representative scenario for a mid-market telecoms operator running an AI agent on Intercom across 15,000 customer interactions per month. Without independent monitoring, this is the typical compliance exposure profile:

  • A subset of conversations, roughly 3 to 5 percent on conservative estimates based on published hallucination rates for grounded tasks, will contain some form of inaccurate policy reference or invented product claim.

  • A smaller but more serious subset will involve commitment language that creates legal exposure under Consumer Duty or sector-specific scripts.

  • An even smaller subset will involve the AI taking or recommending an action outside its authorised scope, such as offering a discount it is not permitted to offer or referencing a policy tier the customer does not qualify for.

At 15,000 conversations per month, even a 3 percent inaccuracy rate generates 450 interactions containing compliance risk. Without systematic audit, none of those interactions are reviewed unless a customer escalates. Most do not escalate. They simply stop using the service.
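The arithmetic behind that estimate is simple enough to write down. The sketch below is illustrative only: the inaccuracy rate is the conservative lower bound cited above, and the escalation rate is an assumed figure for illustration, not measured data.

```python
# Back-of-envelope exposure estimate. Illustrative figures only:
# the inaccuracy rate is the conservative lower bound cited above;
# the escalation rate is an assumption for illustration, not measured data.
monthly_conversations = 15_000
inaccuracy_rate = 0.03      # conservative lower bound for grounded tasks
escalation_rate = 0.05      # assumed share of affected customers who complain

at_risk = monthly_conversations * inaccuracy_rate   # 450 interactions
seen_without_audit = at_risk * escalation_rate      # roughly 23 interactions

print(f"Interactions carrying compliance risk per month: {at_risk:.0f}")
print(f"Reviewed by a human without systematic audit: {seen_without_audit:.0f}")
```

Even under generous assumptions about how often customers escalate, the overwhelming majority of at-risk conversations never reach a human unless they are audited systematically.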

The principle that Isara applies to this problem has three components:

  • Continuous coverage. Every interaction is surfaced for review, not a sample. The AI identifies and prioritises the conversations most likely to contain compliance risk, so human reviewers focus their time on what matters.

  • Independent signal. The audit happens outside the platform running the agent. When regulators come asking, governance platforms provide comprehensive reports showing exactly how your organisation handles regulated data through AI agents. That report is only credible if it was produced independently.

  • Human verification. Isara's AI locates and organises. Humans make the verification decisions. This is not a design preference. It is the structural requirement that makes verification-grade data defensible. Automated flags become verified findings only when a qualified person confirms them, a distinction the sketch after this list makes concrete.
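A minimal sketch of that third component, assuming a hypothetical record structure rather than Isara's actual data model: an automated flag carries a reason and a priority, but its status changes only when a named reviewer records a decision.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical structure for illustration; not Isara's schema.
@dataclass
class FlaggedInteraction:
    conversation_id: str
    reason: str                       # e.g. "commitment language", "policy fabrication"
    risk_score: float                 # model-assigned priority for reviewer triage
    reviewer: Optional[str] = None
    confirmed: Optional[bool] = None  # stays None until a human decides

    def status(self) -> str:
        if self.confirmed is None:
            return "flagged"          # automated signal only
        return "verified finding" if self.confirmed else "dismissed"

flag = FlaggedInteraction("conv-1042", "commitment language", 0.91)
assert flag.status() == "flagged"     # a flag is not yet a finding
```

The point of the structure is the gap between "flagged" and "verified finding": no amount of automated scoring crosses it without a human decision.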

According to a McKinsey study, organisations spend 15 to 20 percent of operational budgets on compliance-related activities. AI agents can reduce this by over 40 percent while increasing regulatory coverage and speed. The lever is continuous monitoring that surfaces issues early, not quarterly reviews that catch them late.

Regulation under the EU AI Act is moving in one direction: more scrutiny, not less. Regulators now expect explainability on demand, particularly for agents involved with customer data or financial transactions. An independent audit trail produced by a platform with no stake in the agent's performance is the only kind of record that satisfies that expectation.

What customer support and compliance leaders ask Isara about compliance auditing

How does Isara identify when an AI agent has deviated from a brand guideline or regulatory script?

Isara analyses interactions independently of the platform running your AI agent. It surfaces conversations where the agent's language diverges from approved policy frameworks, uses commitment language outside authorised parameters, or references terms and conditions inaccurately. Flagged interactions are prioritised for human review. The verification decision rests with your team, not with Isara's system or the platform's own reporting.

Does Isara work across multiple platforms at once?

Yes. Isara connects to Intercom, Zendesk, HubSpot, and Salesforce, providing a cross-platform view of AI agent behaviour. This matters for companies running more than one support tool, or for compliance teams who need a single audit record that is not tied to any one vendor's reporting layer.

What does the audit trail look like for regulatory purposes?

Isara produces an independent record of AI agent interactions, flagged conversations, human verification decisions, and outcomes. This is separate from the execution data held by your platform. For regulated sectors operating under FCA Consumer Duty, MiFID II, or the EU AI Act's transparency obligations, this record belongs to your organisation, not to the platform that ran the agent.
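As a rough illustration of what such a record might contain, each entry ties the interaction, the flag, the human decision, and the outcome together in one place. The field names below are assumptions for illustration, not Isara's export format.

```python
# Hypothetical audit record; field names are illustrative, not Isara's export format.
audit_record = {
    "conversation_id": "conv-1042",
    "platform": "Intercom",
    "flag_reason": "commitment language",
    "flagged_at": "2025-11-03T14:22:00Z",
    "reviewer": "j.smith",
    "verification_decision": "confirmed",
    "outcome": "customer re-contacted; correction issued",
}
```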

What is Isara's approach to commitment language detection?

Commitment language (phrases where an AI agent makes a promise, warranty, or guarantee) is one of the highest-risk compliance categories in customer support. Isara is designed to surface this class of deviation as a priority for human review. This covers not just explicit promise language but contextual commitments where the agent implies an entitlement the customer does not have.
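As a simplified illustration of why this category is tractable to surface yet still needs human judgment (this sketch is not Isara's detection method), explicit promise phrases can be pattern-matched, while contextual commitments cannot:

```python
import re

# Illustrative first pass only; not Isara's detection method.
COMMITMENT_PATTERNS = [
    r"\bI guarantee\b",
    r"\bwe will definitely\b",
    r"\byou are entitled to\b",
]

def flag_commitment_language(message: str) -> list[str]:
    """Return any explicit promise phrases found in a single agent message."""
    return [p for p in COMMITMENT_PATTERNS if re.search(p, message, re.IGNORECASE)]

# Explicit phrases are easy to catch; contextual commitments such as
# "your replacement has already shipped" contain no trigger phrase at all,
# which is why prioritised human review matters more than keyword lists.
```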

Can Isara monitor AI agents in financial services specifically?

Financial services is one of Isara's primary sectors. The combination of existing regulatory obligations (FCA Consumer Duty, MiFID II, IDD, PSD2) and the increasing use of AI agents in customer-facing roles creates particularly acute compliance exposure. Isara's independent monitoring is designed to meet that context, including the documentation requirements that regulated firms need for audit and oversight purposes.
