AI Governance · Data Privacy · June 2026

AI Optimization and Data Privacy — Questions Every Company Should Ask Before They Start

Established companies with solid vendor contracts and data practices are moving into AI optimization and finding that what protected them before does not cover what AI actually does with their data. These are not naive questions. They are exactly the right ones — and most companies are not asking them.

Monte Fisher, CPA (Ret.), CFE — Former Shell GRA Manager, North America Forensic Analytics · AI Governance · VCAnalytics.ai · Makati, Philippines · June 2026

Length: 12 min read | Audience: SMB & Enterprise | Frameworks: NIST · COSO · ISO 42001 | Independent: No vendor agenda

The exposure that already exists — before any AI project begins

Right now, today, your employees are almost certainly using AI tools that expose your company data — and most of them have no idea they are doing anything risky. They are pasting client contracts into ChatGPT to get a summary. They are running financial projections through an AI assistant to save time. They are drafting proposals in Copilot with proprietary pricing and client details inline. None of this is malicious. It is convenient. And it is happening in virtually every company regardless of whether a formal AI project has been approved. An AI optimization project is not creating risk from zero — it is adding structure to risk that is already there. The right way to think about it: this project is your opportunity to audit what is already happening, build the controls that should exist, and come out materially more secure than when you started.

Start Here: Your Risk Assessment Drives Everything

Before you decide what safeguards to build, what vendor to choose, or what architecture to use, you need to answer one foundational question: what data are you actually putting at risk, and what does it cost you if that data is exposed?

Not every company needs the same solution. A company processing public marketing data through an AI optimization tool faces a fundamentally different risk profile than a law firm running confidential client contracts through the same tool. The safeguards that make sense — and the budget that justifies them — follow directly from that assessment.

The risk assessment that drives the budget

Think of it this way. You would not put a Ferrari security system on a basic commuter car — but you also would not protect a Ferrari with a bicycle lock. The question is not "what is the maximum security we could build?" It is "what level of protection is proportionate to what we are actually protecting, and what does a breach actually cost us?"

For a small logistics company using AI to optimize delivery routes, with no customer PII and no regulated data, a basic Data Processing Agreement (DPA) and a standard-tier enterprise agreement may be entirely sufficient. Total governance cost: a few hundred dollars a month and a one-page incident response procedure.

A financial services firm using AI to analyze client portfolios, processing account numbers, SSNs, and investment strategies, needs considerably more: PII masking before data leaves, egress monitoring, sandboxing on return, independent controls validation annually, and a full-time designated governance owner. Total governance cost: significant, and justified by the exposure.

Most companies sit somewhere in between. The risk assessment tells you where you are on that spectrum — and therefore what you need to spend and what you can manage with lighter controls.

The risk assessment does not need to be a lengthy document. It needs to answer four questions honestly:

The four questions your risk assessment must answer

1. What data will the AI system touch? List it specifically — customer PII, financial records, contracts, employee data, proprietary processes.

2. What is the cost of exposure? Regulatory fines, client contract liability, reputational damage, competitive harm. Put numbers on it where possible.

3. What data is critical versus manageable? Not all data needs the same protection level. Identify what would cause serious harm if exposed versus what is inconvenient but recoverable.

4. What controls are proportionate to that cost? This is where budget decisions get made — driven by the risk, not by what a vendor is trying to sell you.

This assessment is the foundation. Every safeguard discussed in this article has a cost — in money, in time, in operational complexity. The risk assessment is what tells you which ones your situation actually requires.
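To make "put numbers on it" concrete, here is a minimal Python sketch of the arithmetic behind the four questions. It is an illustration under stated assumptions, not a prescribed method: the `DataCategory` fields, the likelihood figures, and both dollar thresholds are hypothetical placeholders you would replace with your own estimates. Question 3 (critical versus manageable) is approximated here by the size of the cost estimate.

```python
from dataclasses import dataclass

# Hypothetical thresholds in dollars of expected annual loss; set your own.
MEDIUM_THRESHOLD = 5_000
HIGH_THRESHOLD = 25_000

@dataclass
class DataCategory:
    name: str                 # question 1: what data the AI system touches
    exposure_cost: float      # question 2: estimated dollar cost if exposed
    annual_likelihood: float  # assumed yearly probability of exposure (0.0 to 1.0)

def expected_annual_loss(categories):
    """Sum of cost times likelihood across all data categories."""
    return sum(c.exposure_cost * c.annual_likelihood for c in categories)

def risk_level(categories):
    """Map expected annual loss onto a low / medium / high risk level."""
    loss = expected_annual_loss(categories)
    if loss >= HIGH_THRESHOLD:
        return "high"
    if loss >= MEDIUM_THRESHOLD:
        return "medium"
    return "low"

# Illustrative figures only, loosely following the two examples above.
logistics = [DataCategory("delivery routes", 5_000, 0.05)]
financial = [
    DataCategory("client SSNs and account numbers", 2_000_000, 0.02),
    DataCategory("investment strategies", 500_000, 0.02),
]
print(risk_level(logistics))  # low
print(risk_level(financial))  # high
```

The value of writing it down this way is less the output than the discipline: every number is visible, so a CFO or compliance lead can challenge an estimate instead of a conclusion. The resulting level is what answers question 4, because it indicates how much control spending the exposure can justify.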

Why Existing Protections Often Fall Short

Most companies entering AI optimization are not new to data protection. They have NDAs. They have vendor agreements. They have IT policies and legal review processes that have served them well for years. The issue is not carelessness — it is that those protections were designed for a different kind of vendor relationship, one where data goes in, a service comes out, and the vendor retains nothing meaningful in between.

AI does not work that way. A standard vendor NDA says the vendor will not disclose your confidential information. That language was written for a world where a vendor receives a file, does work, and returns a deliverable. It says nothing about whether your data is used to improve a model that then serves the vendor's other customers. It says nothing about sub-processors. It says nothing about what happens to data patterns after the contract ends, even when the raw data has been deleted.

The core gap

Traditional vendor contracts protect against disclosure of your data. AI engagements require agreements that also address training use, model retention, sub-processor chains, and what survives data deletion. Most standard templates do not cover these — and most companies do not realize that until they specifically look.

Real Life: What "Data Used for Training" Actually Means

Real-world example — Law firm and AI contract drafting tool

A mid-size law firm starts using an AI tool to draft and review contracts. Attorneys paste in client names, deal terms, acquisition prices, confidential business structures, and proprietary transaction strategies. The AI tool is on a standard subscription tier. The vendor's terms of service on that tier permit using customer inputs to improve the underlying model.

Six months later, a competitor's attorney uses the same AI tool to draft a similar acquisition agreement. The model — trained partly on the first firm's confidential inputs — suggests contract language and deal structure terms that are suspiciously specific to the first firm's client situation. The competitor's attorney has no idea where those suggestions came from. Neither does the AI vendor, because the model has absorbed thousands of inputs and cannot trace any single output back to its source.

The first firm's client never consented to their deal terms being used this way. The NDA between the firm and the AI vendor said "do not disclose confidential information" — but the vendor never disclosed it directly to anyone. They used it to improve a model. Most NDAs do not address that distinction.

What would have prevented this: A signed Data Processing Agreement on an enterprise tier explicitly prohibiting training use. The cost difference between standard and enterprise tier for most tools: a few hundred dollars a month. The cost of the alternative: potentially the client relationship, bar complaints, and litigation.

What applies to every company: A manufacturer pasting proprietary process specifications into an AI assistant. A financial services firm running client portfolio data through an AI analysis tool. An HR team using AI to process employee performance records. The category of data changes. The mechanism of risk is identical.

The Five Questions to Ask Every AI Vendor

Question 1 — Model Training

"Is our data used to train or improve your models — including models operated by your sub-processors?"

This needs to be asked explicitly and answered in writing. "We take data privacy very seriously" is not an answer. The prohibition on training use needs to be in the signed Data Processing Agreement and must extend to every sub-processor downstream, not just the primary vendor.

If the vendor cannot answer this with specifics, or if the DPA does not address it explicitly, assume training use is permitted under current terms until proven otherwise.

Question 2 — Sub-Processors

"Who are your sub-processors, where are they located, and do your data handling obligations extend to them contractually?"

When you hand data to an AI vendor, that data rarely stays with just that vendor. It flows through cloud infrastructure providers, model hosting services, logging tools, and sometimes human review processes. Each is a sub-processor and each is a potential exposure point. Ask for the list. Ask where each stores data geographically. Ask whether your negotiated terms flow down to each sub-processor contractually. If the answer is no, your protections stop at the first handoff.

Question 3 — Offshore Development Access

"Who has access to production data during development, debugging, and maintenance — and where are those people located?"

AI optimization products are frequently built and maintained by development teams in countries with different privacy laws and different legal frameworks for accountability. That is a legitimate cost and talent decision. It becomes a risk question when those teams have access to production data without controls equivalent to those applied to domestic employees. Ask who has access, whether access is logged and auditable, and what your practical recourse is if something goes wrong across that jurisdiction gap.

Question 4 — Jurisdiction and Accountability

"Where are you incorporated, which jurisdiction governs our contract, and what is your dispute resolution mechanism?"

AI startups are frequently incorporated in jurisdictions chosen for ease of formation or tax efficiency. If a vendor is incorporated offshore with contract terms specifying arbitration in a neutral jurisdiction and professional indemnity coverage capped well below potential breach liability, the practical path to recovery after a data incident is long and uncertain. Your clients hold you accountable for what your vendors do with their data. That accountability runs upstream to you regardless of what your internal vendor contract says.

Question 5 — Insurance and Bonding

"Do you carry cyber liability or professional indemnity insurance, and does coverage extend to incidents caused by your sub-processors?"

This question most quickly reveals the gap between stated commitments and actual financial accountability. Many AI vendors — particularly earlier-stage companies — carry limited coverage relative to the scale of data they handle. Ask for documentation. If the vendor cannot produce it, there is no meaningful financial backstop behind their data handling commitments. Bonding is worth raising separately, particularly in regulated industries where vendors handling sensitive data may be required to be bonded as a condition of engagement.

Safeguards: Match the Control to the Risk

The table below maps each safeguard to the risk level that justifies it. This is the cost-versus-benefit framework in practice. Your risk assessment tells you which row you are in. That row tells you what you need to build and what you can defer.

Safeguard	Risk Level	Relative Cost	What It Addresses
Signed Data Processing Agreement	All levels	Low — administrative	Training use, deletion, breach notification
Data minimization policy	All levels	Low — process change	Limits exposure at source
Internal AI tool inventory	All levels	Low — one-time exercise	Surfaces shadow AI use by employees
Written incident response procedure	All levels	Low — one page	Defines response before it is needed
Egress monitoring and access logging	Medium & high	Medium — configuration	Real-time visibility, self-owned audit trail
PII masking and tokenization	Medium & high	Medium — technical build	Vendor never sees real identities
Data validation and sandboxing on return	Medium & high	Medium — technical build	Blocks malformed or malicious returned data
Designated AI governance role	Medium & high	Medium — assignment or hire	Owns monitoring, alerts, vendor oversight
Periodic internal controls self-testing	Medium & high	Low — scheduled internally	Confirms controls actually work, not just exist
On-premise hardware deployment	High only	High — infrastructure	Data never leaves your network
Independent third-party controls audit	High only	High — external engagement	Independent validation for clients and regulators

The Safeguards in Detail

Signed Data Processing Agreement — required at every risk level, no exceptions. Specifies training use prohibition, sub-processor obligations, deletion timelines, and breach notification. Request before any data sharing begins.

PII Masking and Tokenization — before sending data to any AI vendor, real identifying information is replaced with anonymous tokens. The AI works on masked data, returns outputs referencing the same tokens, and your internal system re-links them. The vendor completes the work without ever seeing client names, account numbers, or contract values. If they are breached, the attacker gets tokens that map to nothing outside your system.
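The mask, send, and re-link cycle described above can be sketched in a few lines of Python. This is a simplified illustration, not a production design: the `Tokenizer` class, the `TOK_` prefix, and the field names are all hypothetical, and a real deployment would keep the mapping in a secured, access-controlled store instead of in memory.

```python
import secrets

class Tokenizer:
    """Swaps sensitive values for random tokens; the mapping never leaves your system."""

    def __init__(self):
        self._value_to_token = {}
        self._token_to_value = {}

    def mask(self, value: str) -> str:
        # Reuse the same token for a repeated value so the vendor's output stays consistent.
        if value not in self._value_to_token:
            token = f"TOK_{secrets.token_hex(8)}"
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def mask_record(self, record: dict, sensitive_fields: set) -> dict:
        # Only the named fields are replaced; everything else passes through unchanged.
        return {
            key: self.mask(str(value)) if key in sensitive_fields else value
            for key, value in record.items()
        }

    def unmask_text(self, text: str) -> str:
        # Re-link tokens in the vendor's response to the real values, inside your network.
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text

tokenizer = Tokenizer()
record = {"client_name": "Acme Corp", "account_number": "12345678", "note": "renewal due"}
outbound = tokenizer.mask_record(record, {"client_name", "account_number"})

# Pretend the vendor returned a summary that references the token it was given.
vendor_reply = f"Renewal reminder drafted for {outbound['client_name']}."
print(tokenizer.unmask_text(vendor_reply))
```

Random tokens are used here instead of hashes because a hash of a short value such as an account number can be reversed by guessing; a random token carries no information about the value it stands for.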

Data Validation and Sandboxing on Return — AI outputs re-entering your systems are received into an isolated environment first, scanned against expected formats and content rules, and only then allowed into your live systems. Antivirus scanning of returned data files is the baseline. Schema validation adds a second layer. This addresses the risk most companies never consider: not what leaves, but what comes back.
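As a rough sketch of the schema validation layer, the Python below checks a returned record against an expected shape before anything is allowed into live systems. The field names, the length limit, and the function name are assumptions made for illustration; the point is that anything unexpected is rejected, not repaired.

```python
import json

# Hypothetical schema for one returned record: field name -> required type.
EXPECTED_FIELDS = {"record_token": str, "summary": str, "risk_score": float}
MAX_SUMMARY_CHARS = 2000  # assumed content rule

def validate_returned_record(raw):
    """Return (record, errors). The record is None unless every check passes."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return None, ["top-level value must be an object"]

    errors = []
    unexpected = set(record) - set(EXPECTED_FIELDS)
    if unexpected:
        errors.append(f"unexpected fields: {sorted(unexpected)}")
    for field, expected_type in EXPECTED_FIELDS.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            # Strict on purpose: a JSON number without a decimal point is rejected for a float field.
            errors.append(f"wrong type for {field}")
    summary = record.get("summary")
    if isinstance(summary, str) and len(summary) > MAX_SUMMARY_CHARS:
        errors.append("summary exceeds length limit")
    return (record if not errors else None), errors

good = '{"record_token": "TOK_1", "summary": "No issues found.", "risk_score": 0.4}'
bad = '{"record_token": "TOK_1", "summary": "ok", "risk_score": 0.4, "script": "<script>"}'
print(validate_returned_record(good)[1])
print(validate_returned_record(bad)[1])
```

Rejected records would stay in the isolated environment for the governance owner to review. This check complements antivirus scanning; it does not replace it.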

On-Premise Hardware — for organizations where data sensitivity justifies it, running the AI model locally on your own servers eliminates vendor-side data exposure entirely. Higher upfront cost, zero vendor data risk. The right architectural choice for legal, financial, healthcare, and defense contractor environments handling their most sensitive data.

Egress Monitoring and Access Logging: owned by you, run by you. Egress monitoring tracks data leaving your own network: volume, destination, frequency, content type. You configure it. You receive the alerts. You review the reports. When an anomaly appears, such as an unusually large transfer to an AI vendor endpoint or a transfer outside business hours, your team sees it in real time and decides how to respond. This is your visibility into your own data flows, not a vendor watching you. Access logging inside the AI system creates an audit trail that you own and can produce on demand if a client or regulator asks what happened to their data.
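The two anomalies named above, an unusually large transfer and a transfer outside business hours, reduce to simple rules. The Python sketch below shows the idea under loudly stated assumptions: the vendor hostname, the 50 MB limit, and the 08:00 to 18:00 weekday window are hypothetical values, and in practice these rules would live in whatever firewall, proxy, or monitoring product you already run.

```python
from dataclasses import dataclass
from datetime import datetime

AI_VENDOR_HOSTS = {"api.example-ai-vendor.com"}  # hypothetical endpoint list
MAX_BYTES_PER_TRANSFER = 50 * 1024 * 1024        # hypothetical 50 MB ceiling
BUSINESS_HOURS = range(8, 18)                    # assumed 08:00 to 17:59 local time

@dataclass
class EgressEvent:
    timestamp: datetime
    destination_host: str
    bytes_sent: int

def flag_anomalies(event):
    """Return the list of reasons this outbound transfer deserves a human look."""
    if event.destination_host not in AI_VENDOR_HOSTS:
        return []  # this sketch only watches traffic to known AI vendor endpoints
    reasons = []
    if event.bytes_sent > MAX_BYTES_PER_TRANSFER:
        reasons.append("unusually large transfer")
    is_weekend = event.timestamp.weekday() >= 5
    if is_weekend or event.timestamp.hour not in BUSINESS_HOURS:
        reasons.append("outside business hours")
    return reasons

late_night_bulk = EgressEvent(datetime(2026, 6, 3, 2, 30), "api.example-ai-vendor.com", 80 * 1024 * 1024)
print(flag_anomalies(late_night_bulk))
```

Each flagged event would be routed to the designated governance owner. Fixed thresholds are a starting point; a baseline built from your own normal traffic will produce fewer false alarms over time.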

Periodic Internal Controls Self-Testing: on a defined schedule (quarterly is typical), your team runs the controls to confirm they actually work. Did the egress alert fire when it should have? Did tokenization replace all PII fields correctly? Did the incident response procedure get followed when there was a minor issue last month? This is a fire drill, not an audit. You run it internally. No outside vendor required. It converts theoretical controls into verified ones.
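One of those fire-drill questions, whether tokenization replaced every PII field, can be partly automated by scanning a sample of outbound payloads for anything that still looks like PII. The Python sketch below is a starting point only: the two regular expressions (US-style SSNs and email addresses) are deliberately simple assumptions and will miss many formats, so a clean result is evidence, not proof.

```python
import re

# Illustrative patterns only; extend with the identifiers your own data contains.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def find_residual_pii(masked_payload: str) -> dict:
    """Return {pattern_name: [matches]} for anything that survived masking."""
    findings = {}
    for name, pattern in PII_PATTERNS.items():
        matches = pattern.findall(masked_payload)
        if matches:
            findings[name] = matches
    return findings

clean_sample = "Client TOK_ab12cd34 has a renewal due next quarter."
leaky_sample = "Client TOK_ab12cd34, SSN 123-45-6789, has a renewal due."

print(find_residual_pii(clean_sample))
print(find_residual_pii(leaky_sample))
```

A non-empty result on any sample means the masking control failed the drill and should be fixed before the next transfer, then retested.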

Independent Third-Party Controls Audit — a separate exercise at a higher level of formality. A penetration test, a SOC 2 audit, a formal AI governance review conducted by an outside firm. More expensive, less frequent, but carries independent credibility that self-attestation does not. When a major client or regulator asks "who validated your controls?" an independent review answers that question in a way your own testing cannot. Start with internal self-testing and build toward independent validation when client contracts or regulatory requirements demand it.

Every Company Deploying AI Needs a Designated Governance Role

This is the piece most companies miss entirely — and it is where informal AI adoption by employees turns into a managed, auditable, defensible program.

This does not need to be a new hire. It does not need to be a full-time position at lower risk levels. It almost certainly should not be a traditional IT manager whose focus is network maintenance and hardware. That skill set is valuable but it is not data governance.

What an AI governance role looks like in practice

Receives and reviews real-time alerts from egress monitoring and access logging, because data privacy incidents require awareness and response in hours, not the next IT ticket cycle.

Owns the vendor DPA review process and maintains the AI tool inventory.

Reviews validation and exception reports from AI data pipelines on a defined schedule.

Serves as the internal point of contact when a vendor reports a data incident.

Has sufficient authority to pause a vendor relationship or restrict employee access to an AI tool when a risk is identified.

Reports to senior leadership, not just to IT.

In many organizations this person already exists under a different title. A compliance manager. A senior operations director. A CFO in a smaller company who already owns vendor relationships and risk oversight. The gap is not headcount — it is formal assignment, defined scope, the right tools and alerts configured, and the authority to act when something requires action.

The reason this matters specifically for employee AI use: when your team uses AI tools informally — which they are doing right now — there is currently no one whose job it is to know which tools are in use, what data those tools are seeing, whether those tools have signed DPAs, or whether any of them represent a material exposure. The governance role closes that gap and converts invisible informal AI use into a managed program where someone is paying attention.
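An AI tool inventory does not need special software. As a minimal sketch, the Python below records the facts the governance owner needs to know about each tool and flags the ones that see sensitive data without contractual protection. The `AITool` fields, the category labels, and the example entries are all hypothetical; a spreadsheet with the same columns would serve equally well.

```python
from dataclasses import dataclass

# Assumed labels for data that warrants a signed DPA before any sharing.
SENSITIVE_CATEGORIES = {"customer PII", "financial records", "contracts", "employee data"}

@dataclass
class AITool:
    name: str
    data_categories: list          # what employees actually put into the tool
    dpa_signed: bool               # is there a signed Data Processing Agreement?
    training_use_prohibited: bool  # does the agreement rule out training on your data?

def tools_needing_review(inventory):
    """Names of tools that touch sensitive data without a DPA that prohibits training use."""
    flagged = []
    for tool in inventory:
        touches_sensitive = bool(SENSITIVE_CATEGORIES & set(tool.data_categories))
        protected = tool.dpa_signed and tool.training_use_prohibited
        if touches_sensitive and not protected:
            flagged.append(tool.name)
    return flagged

inventory = [
    AITool("Chat assistant (standard tier)", ["contracts", "customer PII"], False, False),
    AITool("Route optimizer", ["delivery routes"], False, False),
    AITool("Contract review tool (enterprise tier)", ["contracts"], True, True),
]
print(tools_needing_review(inventory))  # ['Chat assistant (standard tier)']
```

The hard part is not the check but the data: the inventory is only as good as the survey of what employees are really using, which is why it belongs to a named owner.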

This Is an Opportunity, Not Just a Risk Checklist

The companies that come out of an AI governance project in the strongest position are not the ones that treated it as a compliance exercise. They are the ones that used it as the forcing function to finally audit their existing data practices, close the gaps they already had, and build controls that protect them across every vendor relationship — not just the AI one. — Monte Fisher, CPA (Ret.), CFE

Most of the safeguards in this article are not specific to AI. They are sound data governance practices that apply to every vendor relationship involving sensitive data. Your AI project is the reason to finally put them in place. The payoff extends well beyond the AI project itself.

The internal risk audit that precedes a well-governed AI deployment frequently uncovers things that were already there: vendor contracts not reviewed in years, employee access to data exceeding their job requirements, outbound data flows through tools never formally approved. Finding and fixing those things is permanent value. The AI project is the trigger. The security improvement is lasting.

What good looks like coming out the other side

A risk assessment that drove the budget and architecture decisions.

A signed DPA with every AI vendor touching company data.

PII masking configured for external vendors where the risk justifies it.

Egress monitoring active, alerts configured, reviewed by a designated governance owner.

Returned data passing through validation before entering live systems.

Periodic internal self-tests on a defined schedule confirming controls actually work.

A one-page incident response procedure your team has reviewed.

An AI tool inventory accounting for every tool employees are using.

The right level of all of this for your risk profile: not a Ferrari where a reliable sedan does the job, and not a bicycle lock on a Ferrari.

How This Connects to the FAIG Assessment

The Fisher AI Implementation Gauge (FAIG), the free 15-question self-assessment at vcanalytics.ai/ai-governance.html, measures data governance and vendor due diligence as two of its five scoring categories. The questions in this article expand, in greater detail, on what the FAIG surfaces at a summary level.

If you have taken the FAIG and scored low on vendor due diligence or data governance, this article describes specifically what that gap looks like and what closes it. We are expanding the FAIG to include a dedicated data privacy and vendor accountability module covering offshore talent risk, foreign vendor jurisdiction, bonding and insurance verification, DPA completeness, and organizational governance role definition. If you are working through these questions for a specific AI project, trying to define what an AI governance function should look like in your organization, or simply trying to figure out whether your current exposure is low, medium, or high — message me directly. The initial conversation is always free.

Not sure where your organization sits on the risk spectrum?

The risk assessment is where everything starts. Monte personally reviews every message. The initial conversation is always free — no obligation to proceed. If your situation is low risk and a basic DPA is sufficient, he will tell you that. If gaps need closing, he will tell you that too — with specifics, not a sales pitch.

Monte Fisher, CPA (Ret.), CFE. Former Shell Governance Risk and Assurance Manager, North America. Certified Fraud Examiner with 25+ years of forensic analytics and compliance experience. Monte advises small and medium businesses on AI governance, vendor due diligence, and AI transformation, with no vendor agenda and no product to sell. Based in Makati, Philippines.

Disclaimer: This article is for general educational purposes only. It is not legal advice, technical consulting advice, or a formal governance audit. For formal AI governance audits, legal compliance opinions, or technology implementation decisions, always engage qualified professionals in your jurisdiction. Monte Fisher's CPA license is retired and inactive. Facilitation fees on AI partner introductions are always disclosed before any introduction is made. © 2026 VCAnalytics.ai