The Problem With Trademark Data (And Why We Built an API to Fix It)

There are 93.2 million active trademark registrations worldwide, spread across more than 200 intellectual property offices. In 2024 alone, 11.7 million new applications were filed globally — roughly 32,000 every single day.

That is an extraordinary amount of data. And almost none of it is easy to access programmatically.

The State of Trademark Data Access

If you want to build software that works with trademark data, you quickly discover that the landscape looks nothing like modern data infrastructure. Each office operates independently, with its own systems, formats, and access policies.

The USPTO is among the more progressive offices. Its Open Data Portal provides several APIs — including the Trademark Status Document Retrieval (TSDR) system — and offers bulk XML downloads of trademark application data. But even here, developers need API keys (required since October 2020), must navigate multiple endpoints for different data types, and work with XML schemas specific to the USPTO.

EUIPO offers a trademark search API through its developer portal, and WIPO maintains an API catalog alongside its Global Brand Database, which holds over 28 million records from multiple international and national sources. These are legitimate, well-maintained systems.

But they are also completely different from each other.

The USPTO publishes daily XML files (TDXF format). EUIPO exposes REST endpoints with its own response schema. WIPO's Global Brand Database uses yet another query format. Dozens of national offices — from INPI in Brazil to the IP Office of India — have their own systems, or in many cases, no public API at all. Some offices still rely on periodic bulk data dumps rather than real-time access.

This is the fragmentation problem. It is not theoretical. It is the daily reality for anyone building trademark technology.

Where the Filings Are

Understanding why this matters requires looking at where trademark activity is actually happening. The distribution is heavily concentrated — and rapidly shifting.

[Figure: Trademark application class counts by origin, 2024 (thousands). Source: WIPO]

Asian offices received nearly two-thirds (65.6%) of all filings worldwide in 2024, with China alone accounting for 45.8% of global class counts. The fastest-growing origins were Brazil (+10.4%), India (+7.4%), and Russia (+2.9%), while both China and the US saw slight declines of 1.5%.

For any trademark search, clearance, or monitoring tool that claims global coverage, these numbers cannot be ignored. You cannot build a credible product that covers only the USPTO and EUIPO. You need access to the offices where filings are actually growing, and those are precisely the offices with the least accessible data infrastructure.

The Fragmentation Tax

Every company building on trademark data pays what we call the fragmentation tax: the engineering time spent normalizing, deduplicating, and reconciling data across sources rather than building the features their users actually need.

Consider what it takes to run a single multi-jurisdictional trademark search today without a unified API:

  • Authentication: Separate credential systems for each office API. Some require OAuth tokens, others use API keys, a few still rely on session-based authentication.
  • Query translation: A search for "APEX" in Nice Class 9 needs to be expressed differently for the USPTO, EUIPO, WIPO Madrid Monitor, and each national office.
  • Response normalization: Status codes, owner entity formats, date formats, classification structures, and goods/services descriptions all vary between offices.
  • Update frequency: Some offices update in real-time, others daily, some weekly. Knowing the freshness of your data matters for clearance opinions.
  • Error handling: Each system has different rate limits, downtime patterns, and failure modes.
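
The response-normalization step in the list above can be sketched as follows. This is a minimal illustration rather than any office's real schema: the source names (`officeA`, `officeB`), the raw field names, the status mappings, and the `normalizeRecord` helper are all hypothetical, chosen only to show how per-office adapters might map differing records into one shape.

```javascript
// Hypothetical per-office adapters. The raw field names and status values
// below are invented for illustration and do not mirror real office schemas.
const adapters = {
  officeA: (raw) => ({
    mark: raw.markText,
    status: { LIVE: 'registered', DEAD: 'expired' }[raw.statusCode] ?? 'unknown',
    filingDate: toIsoDate(raw.filingDt, 'YYYYMMDD'),
  }),
  officeB: (raw) => ({
    mark: raw.name,
    status: raw.state.toLowerCase(),
    filingDate: toIsoDate(raw.applicationDate, 'DD/MM/YYYY'),
  }),
};

// Convert a date string in one of two assumed layouts to ISO 8601 (YYYY-MM-DD).
function toIsoDate(value, layout) {
  if (layout === 'YYYYMMDD') {
    return `${value.slice(0, 4)}-${value.slice(4, 6)}-${value.slice(6, 8)}`;
  }
  if (layout === 'DD/MM/YYYY') {
    const [day, month, year] = value.split('/');
    return `${year}-${month}-${day}`;
  }
  throw new Error(`Unsupported date layout: ${layout}`);
}

// Map a raw record from a named source into one common shape.
function normalizeRecord(source, raw) {
  const adapt = adapters[source];
  if (!adapt) throw new Error(`No adapter for source: ${source}`);
  return { source, ...adapt(raw) };
}
```

Every additional office means another adapter of this kind, plus the authentication, query translation, and error handling around it, which is where the engineering time goes.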

The trademark search software market was valued at approximately $1.2 billion in 2024, with the AI-powered segment projected to reach $7.98 billion by 2033 at a 20.8% CAGR. A meaningful share of that spending goes not toward search intelligence or user experience, but toward solving the same data plumbing problems that every player in the market faces independently.

What Signa's API Does

Signa normalizes data from 200+ trademark offices into a single REST API with consistent authentication, query format, and response schema. Here is what that looks like in practice.

Search any jurisdiction with the same request shape (this example filters to the US):

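// Send one search request; the request shape is the same for every office.
const response = await fetch('https://api.signa.so/v1/search', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer sig_live_...',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    query: { text: 'APEX' },
    filters: {
      jurisdiction: 'US',
      nice_classes: [9, 42],
      status: 'registered'
    },
    // Blend exact-text and phonetic matching, weighting text higher.
    strategies: {
      text: { enabled: true, weight: 1.0 },
      phonetic: { enabled: true, weight: 0.8 }
    }
  })
});
// fetch resolves to a Response; parse the JSON body to get the results.
const results = await response.json();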

Every result comes back in the same format regardless of source office — prefixed IDs (tm_, own_), standardized status codes, ISO date formats, normalized owner entities, and consistent classification data. The response follows Stripe-style conventions: single resources at top level, collections with { object: "list", data: [...], has_more, pagination }, and RFC 9457 error format.
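
A client consuming that list format might page through results roughly as sketched below. The envelope fields `object`, `data`, `has_more`, and `pagination` come from the description above, and `type`, `title`, `status`, and `detail` are standard RFC 9457 problem-detail members. Everything else is an assumption made for illustration and is not documented Signa behavior: the location of the next cursor at `pagination.cursor`, the `fetchPage` callback, and the `collectAll` helper.

```javascript
// Turn an RFC 9457 problem-details body into an Error.
// `type`, `title`, `status`, and `detail` are standard RFC 9457 members.
function problemToError(problem) {
  const err = new Error(problem.detail || problem.title || 'Request failed');
  err.status = problem.status;
  err.type = problem.type;
  return err;
}

// Collect every item from a paginated list endpoint.
// `fetchPage(cursor)` is a hypothetical callback that resolves to
// { ok, body }, where body is either a list envelope or a problem document.
// ASSUMPTION: the next cursor lives at body.pagination.cursor.
async function collectAll(fetchPage, maxPages = 50) {
  const items = [];
  let cursor = null;
  for (let page = 0; page < maxPages; page++) {
    const { ok, body } = await fetchPage(cursor);
    if (!ok) throw problemToError(body);
    items.push(...body.data);
    if (!body.has_more) break;
    cursor = body.pagination.cursor;
  }
  return items;
}
```

Injecting `fetchPage` keeps the paging logic independent of the HTTP layer, and the `maxPages` cap guards against a loop that never sees `has_more: false`.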

Retrieve a mark's complete record with a single call:

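// Retrieve full trademark detail: all canonical data in one call
const response = await fetch('https://api.signa.so/v1/trademarks/tm_7d4e1f2a', {
  headers: { 'Authorization': 'Bearer sig_live_...' }
});
// fetch resolves to a Response; parse the JSON body to get the record.
const mark = await response.json();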

// Returns: classifications, owners, attorneys, events, proceedings,
// text variants, design codes, media, priorities — everything.
// No ?include= parameters needed. One shape, always complete.

The detail endpoint returns all canonical data by default — classifications, owners, attorneys, prosecution events, proceedings, text variants, design codes, and media — in a single predictable response shape. No ?include= parameter gymnastics. One endpoint, one cache key, one response shape to learn.
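
One practical consequence of a single response shape is that a client-side cache can be keyed on the trademark ID alone. Here is a minimal sketch, assuming a hypothetical `fetchDetail(id)` function that wraps the detail request and returns the parsed record. The `cachedDetail` helper and the 60-second time-to-live are illustrative placeholders, not part of the API.

```javascript
// Wrap a detail fetcher with an in-memory cache keyed only by trademark ID.
// `fetchDetail` is a hypothetical async function: (id) => parsed detail record.
// `now` is injectable so expiry can be exercised without real waiting.
function cachedDetail(fetchDetail, ttlMs = 60_000, now = Date.now) {
  const cache = new Map(); // id -> { value, expiresAt }
  return async function getDetail(id) {
    const hit = cache.get(id);
    if (hit && hit.expiresAt > now()) return hit.value;
    const value = await fetchDetail(id);
    cache.set(id, { value, expiresAt: now() + ttlMs });
    return value;
  };
}
```

With an API that varies its response by `?include=` parameters, the cache key would also need to encode the requested expansions.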

A Growing Market With Infrastructure Gaps

The IP software market is expanding rapidly — projected to grow from $15.9 billion in 2026 to $42.5 billion by 2034 at a 13.1% CAGR. Legal technology as a whole is following a similar trajectory, with a growing majority of organizations adopting API-first approaches to their software infrastructure.

[Figure: Global IP software market size ($B). Source: Fortune Business Insights]

Yet much of this growth is concentrated in platforms that are closed ecosystems. Corsearch (which acquired TrademarkNow), Clarivate (which owns CompuMark), and other incumbents provide trademark search and monitoring tools — but primarily as end-user applications, not as developer infrastructure.

The result is that companies building new trademark technology often face a choice: license data from an incumbent (expensive, restrictive), scrape public databases (unreliable, legally questionable), or build their own multi-office integration from scratch (slow, maintenance-heavy).

We think there should be a fourth option: an API that treats trademark data as developer infrastructure.

Where This Is Heading

Three trends are converging that will reshape how trademark data is consumed:

AI agents as API consumers. In 2026, APIs are increasingly being consumed not by human-written application code but by AI agents and LLMs. Trademark data is a natural fit — structured, high-volume, and requiring cross-referencing across jurisdictions. We are building with this in mind, including MCP server support for AI-native discovery.

Real-time over batch. The industry has historically operated on periodic data dumps and weekly update cycles. As trademark filing volumes grow and clearance timelines compress, the expectation is shifting toward real-time data access. Our API reflects this — live data from source offices, not stale snapshots.

Global by default. With Brazil (+10.4%), India (+7.4%), and other emerging markets posting the fastest growth in filings, building for a single jurisdiction is no longer viable for most use cases. The 8.3 million registrations recorded in 2024, a 9.2% increase that ended two years of decline, signal that the global trademark register is accelerating again.

We built Signa because we believe trademark data should be as accessible to developers as financial data or weather data. The information is public. The offices are publishing it. The missing piece is infrastructure that normalizes 200+ office formats into a single, Stripe-style API with prefixed IDs, cursor pagination, and RFC 9457 errors — so developers can build on trademark data instead of wrestling with it.

That is what we are building — and we are just getting started. Join the waitlist to get early access.


Sources: