How to Build an AI Vendor Inventory When Nobody Owns the Full List

Ask your procurement team how many AI vendors your organization is using. Then ask your IT security team. Then ask three engineering leads. You will get five different answers, and none of them will be complete.

This isn't negligence. It's a structural consequence of how AI tools proliferated — fast, cheap, and through every purchasing channel simultaneously. The enterprise SaaS that came through procurement. The API keys that engineering bought directly. The credit card charges from individual contributors. The free-tier tools that never touched a payment system at all. Each channel is visible to a different function. No function sees all channels.

The result is an AI vendor inventory problem that nobody owns because nobody can own it from their existing vantage point. Building a complete inventory requires deliberately combining four discovery methods that each see a different part of the problem. This is that process.

Why the Gaps Exist Where They Do

Understanding the gap structure helps you prioritize your discovery effort and anticipate what you'll miss if you rely on any single method.

Procurement sees: formally negotiated contracts, purchase orders, approved vendor payments processed through AP. Procurement does not see: API keys purchased with a credit card, free-tier accounts, tools bundled inside other vendor products, anything that bypassed the procurement process.

IT and security see: devices enrolled in MDM, SaaS accounts federated through SSO, network traffic through managed endpoints. IT does not see: browser-based tools accessed without a native app, API calls made from engineering environments not in MDM, tools on personal devices used for company work, accounts authenticated with personal email addresses.

Engineering sees: dependencies in version-controlled code, integrations they personally built or reviewed, tools they're actively using. Engineering does not see: tools used in adjacent teams, tools used for non-engineering workflows (content, analysis, research), tools that live in spreadsheets and notebooks rather than application code.

Finance sees: charges that appear in expense reports, corporate card statements, and vendor invoices. Finance does not see: free-tier tools, tools billed to personal cards that were never expensed, costs folded into the bills for other services.

The complete picture requires all four vantage points. None is sufficient alone.

Discovery Method 1: Financial Scanning

Start with money. Financial scanning is the fastest way to build an initial vendor list because payment records are structured, searchable, and exist in systems you already have access to.

Pull three data sources:

Accounts payable records: Search for known AI vendor names — OpenAI, Anthropic, Cohere, Mistral, Hugging Face, Replicate, Together AI, Perplexity, AWS Bedrock-related charges, Azure AI, Google Vertex AI. Also search for terms like "API," "model," "inference," and "AI" in vendor names and invoice descriptions.
Corporate credit card statements: The same search, applied to credit card transactions. Pay particular attention to recurring small charges that look like subscription or usage billing. These are the shadow AI tool signatures.
Expense reports: Individual expense submissions will surface personal-card purchases that got reimbursed. Search the same vendor list plus a broader set of terms. Also look for "software" and "tools" categories.

Financial scanning will find tools that cost money. It will undercount the AI footprint by missing free-tier usage, which is significant — many tools are being used at free or trial tiers that haven't yet converted to paid. But it gives you a solid foundation to start from and it's fast.
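As a rough illustration, the search described above can be sketched as a keyword pass over an exported transaction file. The column names (vendor, description) and the sample rows are assumptions for the example, not the layout of any particular AP or card system, and "Bedrock" stands in for the AWS Bedrock-related charges mentioned earlier. Generic terms are kept separate from vendor names because they are noisy and should only queue a row for human review.

```python
import csv
import io
import re

# Vendor names from the search list above; extend as new providers appear.
VENDOR_NAMES = [
    "OpenAI", "Anthropic", "Cohere", "Mistral", "Hugging Face", "Replicate",
    "Together AI", "Perplexity", "Bedrock", "Azure AI", "Vertex AI",
]
# Generic terms produce false positives, so they are reported as a weaker signal.
GENERIC_TERMS = ["API", "model", "inference", "AI"]


def _word_pattern(terms):
    # Word boundaries keep "AI" from matching inside words like "paid".
    escaped = "|".join(re.escape(term) for term in terms)
    return re.compile(rf"\b(?:{escaped})\b", re.IGNORECASE)


VENDOR_RE = _word_pattern(VENDOR_NAMES)
GENERIC_RE = _word_pattern(GENERIC_TERMS)


def scan_transactions(rows):
    """Yield (row, match_type, matched_text) for rows that look AI-related."""
    for row in rows:
        text = f"{row.get('vendor', '')} {row.get('description', '')}"
        match = VENDOR_RE.search(text)
        if match:
            yield row, "vendor", match.group(0)
            continue
        match = GENERIC_RE.search(text)
        if match:
            yield row, "generic", match.group(0)


# Hypothetical export; real statements will have different columns.
SAMPLE_CSV = """date,vendor,description,amount
2024-03-01,OPENAI *CHATGPT SUBSCR,Monthly subscription,20.00
2024-03-02,Acme Office Supplies,Paper and toner,84.10
2024-03-05,Initech Labs,Inference API usage,312.45
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = [(row["vendor"], kind, text) for row, kind, text in scan_transactions(rows)]
for hit in hits:
    print(hit)
```

The same function can be pointed at AP exports, card statements, and expense reports in turn, as long as each is mapped to the same two columns first.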

Discovery Method 2: Network and API Layer Analysis

Network analysis finds tools in active use regardless of payment status. Work with your platform or security team to analyze DNS queries and HTTPS traffic to known AI API endpoints.

The target domains to look for:

api.openai.com
api.anthropic.com
api.cohere.com (older integrations may still call api.cohere.ai)
api.mistral.ai
api.together.xyz
api.replicate.com
api.perplexity.ai
bedrock.<region>.amazonaws.com and bedrock-runtime.<region>.amazonaws.com
aiplatform.googleapis.com and <region>-aiplatform.googleapis.com
<resource-name>.openai.azure.com

This list grows constantly. Build a process to update it quarterly as new providers emerge.
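One way to apply the domain list is a small matcher run over exported DNS logs. The sketch below assumes a whitespace-separated log line of the form "timestamp client_ip hostname", which is a made-up format; adapt the parsing to whatever your resolver or proxy actually emits. The regional patterns for Bedrock and Vertex AI are likewise assumptions about hostname shape and worth checking against your own traffic.

```python
import re
from collections import Counter

# Hosts from the target list; a match is the host itself or any subdomain of it.
AI_HOSTS = {
    "api.openai.com": "OpenAI",
    "api.anthropic.com": "Anthropic",
    "api.cohere.ai": "Cohere",
    "api.cohere.com": "Cohere",
    "api.mistral.ai": "Mistral",
    "api.together.xyz": "Together AI",
    "api.replicate.com": "Replicate",
    "api.perplexity.ai": "Perplexity",
    "aiplatform.googleapis.com": "Google Vertex AI",
    "openai.azure.com": "Azure OpenAI",
}
# Regional endpoints embed the region in the hostname, so they need patterns.
REGIONAL_PATTERNS = [
    (re.compile(r"^bedrock(-runtime)?\.[a-z0-9-]+\.amazonaws\.com$"), "AWS Bedrock"),
    (re.compile(r"^[a-z0-9-]+-aiplatform\.googleapis\.com$"), "Google Vertex AI"),
]


def classify_host(hostname):
    """Return the vendor label for a hostname, or None if it is not on the list."""
    host = hostname.strip().lower().rstrip(".")
    for known, vendor in AI_HOSTS.items():
        if host == known or host.endswith("." + known):
            return vendor
    for pattern, vendor in REGIONAL_PATTERNS:
        if pattern.match(host):
            return vendor
    return None


def summarize_dns_log(lines):
    """Count lookups per (vendor, client_ip) from 'timestamp client_ip hostname' lines."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        client_ip, hostname = parts[1], parts[2]
        vendor = classify_host(hostname)
        if vendor:
            counts[(vendor, client_ip)] += 1
    return counts
```

Keeping the host table as plain data makes the quarterly update a one-line change rather than a code change.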

Network analysis has two important limitations. First, it only sees traffic that crosses managed corporate network infrastructure. Traffic from engineers working from home, from personal devices, or over a corporate VPN with split tunneling may be only partially captured or missing entirely. Second, it shows you what's being called but not who's calling it or why. You know a domain is being accessed; you need additional investigation to understand the usage pattern.

Despite these limitations, network analysis frequently surfaces tools that didn't appear in financial scanning — tools on free tiers, tools paid for personally, tools where only one team member knows the account exists. It's the discovery method most likely to surprise you.

Discovery Method 3: Codebase Search

Code in version control is the ground truth for what AI tools are embedded in your products and infrastructure. A codebase search finds AI vendor dependencies that are in production or development, which is the highest-risk category — these aren't analyst tools or productivity apps, they're systems processing company and customer data.

Run searches for:

AI SDK package imports: openai, anthropic, cohere, mistralai, langchain, llama_index (published as llama-index on PyPI and llamaindex on npm), huggingface_hub
API endpoint strings: the same domain list from network analysis
Model name references: gpt-4, claude, gemini, llama, mistral
Common variable patterns: OPENAI_API_KEY, ANTHROPIC_API_KEY, and similar environment variable names (not the values — the names)

Codebase search will find things the other methods miss: SDKs installed but not yet in production, multiple different AI integrations in different services, AI tools embedded inside dependencies rather than called directly. It will also surface the data handling questions — which AI vendors are receiving which types of data — that your security and privacy teams need to answer.
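The four search categories can be sketched as a set of regular expressions run line by line over a checkout. This is a minimal example under a few assumptions: the import pattern only understands Python-style import statements, model names are only matched when they appear as quoted strings, and the environment-variable prefixes and file suffixes are illustrative rather than exhaustive.

```python
import re
from pathlib import Path

PATTERNS = {
    # Python-style imports only; other languages need their own pattern.
    "sdk_import": re.compile(
        r"^\s*(?:import|from)\s+(?:openai|anthropic|cohere|mistralai|langchain"
        r"|llama_index|huggingface_hub)(?![A-Za-z0-9])"
    ),
    "endpoint": re.compile(
        r"api\.(?:openai|anthropic|replicate)\.com"
        r"|api\.(?:cohere|mistral|perplexity)\.ai|api\.together\.xyz"
    ),
    # Quoted strings that start with a known model family name.
    "model_name": re.compile(
        r"""["'](?:gpt-4|claude|gemini|llama|mistral)[\w.:-]*["']""", re.IGNORECASE
    ),
    # Variable names only; the scan never needs the secret values.
    "env_var": re.compile(r"\b(?:OPENAI|ANTHROPIC|COHERE|MISTRAL)_API_KEY\b"),
}


def scan_text(text):
    """Return (line_number, kind, matched_text) findings for one file's contents."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for kind, pattern in PATTERNS.items():
            for match in pattern.finditer(line):
                findings.append((line_no, kind, match.group(0).strip()))
    return findings


def scan_repo(root, suffixes=(".py", ".ts", ".js", ".yaml", ".yml", ".env.example")):
    """Map relative file path -> findings for every matching file under root."""
    results = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        if not path.name.endswith(suffixes):
            continue
        findings = scan_text(path.read_text(encoding="utf-8", errors="ignore"))
        if findings:
            results[str(path.relative_to(root))] = findings
    return results


# Small in-memory example of what a finding list looks like.
SAMPLE = "\n".join([
    "import os",
    "from anthropic import Anthropic",
    "",
    'client = Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])',
    'MODEL = "gpt-4"',
    'URL = "https://api.openai.com/v1/chat/completions"',
])
sample_findings = scan_text(SAMPLE)
```

Recording the finding kind alongside the file path helps later, because an import in a production service and a model name in a notebook usually call for different follow-up.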

One practical note: don't just look at the main branch. Feature branches and older branches may reveal AI experiments that were tried and abandoned — those matter for your inventory because the accounts may still exist and data may have been sent even if the code was never merged.

Discovery Method 4: Team Survey

The survey is the most underused method because it feels less rigorous than technical discovery. In practice it reaches further than any technical method, because it captures tools used in workflows that never touch managed infrastructure or version-controlled code: the ChatGPT subscription an analyst is using for research, the Notion AI features a product manager uses daily, the Grammarly Business account a communications team bought two years ago that now has AI features nobody thought to flag.

Survey design is critical. The framing determines the response rate and the honesty of the responses.

Do not ask: "Are you using any AI tools that haven't been formally approved?" That question positions the survey as a compliance audit and will get defensive non-answers.

Do ask: "We're building a registry of AI tools to make sure teams have access to what they need and that we can support tools properly. What AI tools do you use in your work, including tools you pay for yourself or use on a free plan?"

Include a structured question asking about purpose: productivity and writing assistance, code generation, research and summarization, data analysis, customer-facing features, internal tooling. This categorization will help you prioritize what to govern first.

Offer a follow-up option for respondents who want to discuss tools they're using but are uncertain about. This creates a safe path for people who have already been thinking "I'm not sure if I should be using this", which is a significant population in most organizations.

Reconciling and Deduplicating What You Find

The four methods will produce overlapping results with inconsistencies. The same tool may appear under different names in financial records versus network traffic versus survey responses. The same vendor may have multiple products being used independently. The reconciliation phase takes the raw data from all four methods and produces a clean, deduplicated inventory.

For each vendor, capture:

Vendor name and product name
How it was discovered (which methods found it)
Who is using it (team, individual, or embedded in application)
What it's being used for
Current monthly cost or estimated cost
Data classification of what it touches
Account owner or API key owner
Formal approval status

The data classification and approval status fields are the ones that drive action. A tool processing PII without a formal data processing agreement is a different priority than a tool an individual uses for personal productivity.
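The reconciliation step can be sketched as a record type plus a merge keyed on a canonical vendor and product pair. Everything named here is hypothetical: the alias table, the finding dictionaries, and the default field values are illustrations of the idea, and a real alias table would be built up by hand as unmatched names are reviewed.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical alias table: raw strings seen in each source, lowercased,
# mapped to one canonical (vendor, product) pair.
ALIASES = {
    "openai *chatgpt subscr": ("OpenAI", "ChatGPT"),
    "chatgpt": ("OpenAI", "ChatGPT"),
    "api.openai.com": ("OpenAI", "API"),
}


@dataclass
class InventoryRecord:
    vendor: str
    product: str
    discovered_by: set = field(default_factory=set)  # which methods found it
    users: set = field(default_factory=set)          # teams, people, or services
    purpose: str = ""
    monthly_cost: Optional[float] = None
    data_classification: str = "unknown"
    owner: str = ""
    approval_status: str = "unreviewed"


def reconcile(findings):
    """Merge raw findings into one record per (vendor, product).

    Returns (records, unresolved); unresolved findings need a human to
    extend the alias table or create a new entry.
    """
    records, unresolved = {}, []
    for finding in findings:
        key = ALIASES.get(finding["raw_name"].strip().lower())
        if key is None:
            unresolved.append(finding)
            continue
        record = records.setdefault(key, InventoryRecord(*key))
        record.discovered_by.add(finding["method"])
        if finding.get("user"):
            record.users.add(finding["user"])
    return records, unresolved
```

Cost, data classification, owner, and approval status are deliberately left at their defaults here; those fields are filled in during review rather than inferred from raw discovery data.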

What to Do With What You Find

The inventory is not the end state. It's the input to a governance decision for each tool. A practical three-tier disposition:

Disposition	Criteria	Action
Approve and formalize	Legitimate use, acceptable risk, reasonable cost	Add to approved tool registry, assign cost owner, document data handling
Review required	Higher data sensitivity, unclear data handling, significant cost	Conduct security and privacy review, pause use pending outcome if high risk
Deprecate or deny	Duplicate of approved tool, unacceptable risk, no business case	Communicate deprecation timeline, migrate workflows if needed
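A first-pass triage over the table's criteria might look like the sketch below. The classification labels, the cost threshold, and the field names are all assumptions chosen for illustration; the point is only that deny criteria are checked first, review criteria second, and approval is what remains. A human still makes the final call on each tool.

```python
from dataclasses import dataclass

# Assumed labels and threshold; substitute your own classification scheme.
SENSITIVE_CLASSES = {"pii", "confidential"}
COST_REVIEW_THRESHOLD = 1000.0  # monthly cost that counts as "significant"


@dataclass
class ToolFacts:
    name: str
    data_classification: str  # e.g. "public", "internal", "confidential", "pii", "unknown"
    monthly_cost: float
    has_business_case: bool = True
    duplicates_approved_tool: bool = False
    data_handling_documented: bool = True
    risk_unacceptable: bool = False


def triage(tool):
    """Suggest a disposition for one tool; a reviewer confirms or overrides it."""
    if tool.duplicates_approved_tool or tool.risk_unacceptable or not tool.has_business_case:
        return "deprecate_or_deny"
    needs_review = (
        tool.data_classification in SENSITIVE_CLASSES
        or tool.data_classification == "unknown"
        or not tool.data_handling_documented
        or tool.monthly_cost >= COST_REVIEW_THRESHOLD
    )
    if needs_review:
        return "review_required"
    return "approve_and_formalize"
```

Treating an unknown data classification as a review trigger is a conservative choice: a tool nobody has classified yet is not the same as a tool known to be low risk.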

The goal is not to shrink the inventory to a minimal approved list. The goal is to understand what's in use, get it into a framework, and provide engineers with a path to get new tools approved quickly so they don't route around the process.

Maintaining the Inventory Without Creating Bureaucracy

An inventory that requires a manual update process will be out of date within weeks. The maintenance model has to be mostly automated, with human review only for exceptions.

Financial scanning should run monthly as part of the AP review process. Network monitoring should run continuously. Codebase search can be automated through a lightweight CI check that flags new AI package imports for review. The survey should run annually or semi-annually.
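The CI check can be as small as a script that reads a unified diff and reports added lines that import an AI package. The sketch below assumes Python-style imports and a diff in standard unified format; the package list mirrors the codebase search list and the function names are hypothetical.

```python
import re

AI_PACKAGES = (
    "openai", "anthropic", "cohere", "mistralai",
    "langchain", "llama_index", "huggingface_hub",
)
# An added line ("+" prefix) that imports one of the packages, including
# sub-packages such as langchain_openai or llama_index.core.
IMPORT_RE = re.compile(
    r"^\+\s*(?:import|from)\s+(" + "|".join(AI_PACKAGES) + r")(?![A-Za-z0-9])"
)


def flag_new_ai_imports(diff_text):
    """Return (file_path, package) for each added AI import in a unified diff."""
    flagged = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # "+++ b/path/to/file.py" -> "path/to/file.py"
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
            continue
        match = IMPORT_RE.match(line)
        if match:
            flagged.append((current_file, match.group(1)))
    return flagged


def run_check(diff_text):
    """Print findings and return a process exit code (1 means review needed)."""
    flagged = flag_new_ai_imports(diff_text)
    for path, package in flagged:
        print(f"{path}: new import of '{package}' needs AI tool registry review")
    return 1 if flagged else 0
```

In a pipeline, the diff would typically come from something like git diff against the target branch, passed to run_check, with the return value used as the job's exit status. Whether a finding should block the merge or only notify a reviewer is a policy choice; a notification keeps the check lightweight.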

The approved tool registry should have a lightweight intake process — ideally a form that takes 10 minutes to fill out and produces a decision within a week. If the process takes longer, engineers will skip it. The value of the registry depends entirely on it being the easier path, not the harder one.

Platforms like Oberhahn address the ongoing maintenance problem by providing a centralized layer where AI API spend and usage data aggregate automatically — meaning your financial scan is always current and your vendor list reflects what's actually being called, not just what someone remembered to update.

The AI vendor inventory is never finished. New tools launch constantly, engineering teams find new use cases, vendors bundle AI into products that didn't previously have it. What you're building is not a static document — it's a living capability to know what you're running and why. The four discovery methods, run systematically and continuously, give you that capability. The governance layer built on top of it gives it value.