The language of AI search — defined clearly.
The terminology around AI, search, and web visibility is evolving fast. This glossary defines the concepts that matter for your business — without the jargon.
A
An AI agent is a software system that goes beyond answering questions — it acts. It can browse websites, fill forms, compare products, book appointments, and complete multi-step tasks autonomously. Unlike a chatbot that responds to prompts, an agent operates with a goal and navigates the web to achieve it. The rise of AI agents is driving the need for websites to be machine-usable, not just machine-readable.
An AI citation occurs when a generative AI system selects your content as a source for its answer. Unlike a traditional search ranking, an AI citation means the AI has evaluated your content as accurate, authoritative, and relevant enough to surface to a user. Earning AI citations depends on structured content, entity clarity, and topical authority.
AI crawlers (such as GPTBot for OpenAI, Google-Extended for Google's AI features, PerplexityBot, and ClaudeBot and Anthropic-AI for Anthropic) crawl websites to gather content for AI training data or real-time knowledge retrieval. Website owners can explicitly allow or block these crawlers via robots.txt; blocking them prevents AI systems from using your content in their responses.
Google AI Overviews are automatically generated summaries that appear above traditional search results. They cite multiple sources and synthesize answers to user queries. Appearing in an AI Overview drives visibility even for zero-click searches, and requires content that is structured, authoritative, and clearly answers specific questions.
ARIA (Accessible Rich Internet Applications) attributes add semantic meaning to HTML elements that don't communicate their purpose through their tag alone. With an ARIA label, a button is exposed as "Submit consultation request" rather than as a generic, unnamed control. These labels improve accessibility for users of assistive technology and are equally important for AI agents interpreting page structure.
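As a minimal sketch (the button label and icon are invented for illustration), the difference looks like this:

```html
<!-- Without ARIA: an icon-only button gives an agent no clue to its purpose -->
<button type="submit">
  <svg aria-hidden="true" width="16" height="16"><!-- send icon --></svg>
</button>

<!-- With ARIA: the purpose is stated explicitly -->
<button type="submit" aria-label="Submit consultation request">
  <svg aria-hidden="true" width="16" height="16"><!-- send icon --></svg>
</button>
```

Note that `aria-hidden="true"` on the decorative icon keeps it out of the accessibility tree, so only the meaningful label is exposed.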
An action map catalogs each critical user action on a website — booking forms, search tools, checkout flows, intake forms — and documents what inputs each action requires, what it returns, and whether it is currently accessible to AI agents. It serves as the blueprint for making a site usable by automated systems and as a reference for future development.
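There is no standard file format for an action map; one hedged sketch of a single entry, in a format invented here for illustration, might be:

```yaml
# Illustrative action-map entry (the format and field names are
# our own convention, not an industry standard)
- action: bookConsultation
  page: /contact
  inputs:
    email: { required: true, type: email }
    date:  { required: true, type: date }
  returns: confirmation email with a booking reference
  agent_accessible: false  # form relies on an unlabeled JavaScript date picker
```

An inventory like this makes it easy to see at a glance which actions an AI agent could complete today and which need remediation.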
The agentic web refers to the emerging paradigm where AI agents — software that can autonomously browse websites, fill forms, compare products, and execute tasks — become a primary type of web user. Unlike a search engine that indexes and ranks, an agentic system interacts with websites functionally, completing goals rather than returning links. Websites must be structured for machine usability, not just human readability.
AEO (Answer Engine Optimization) focuses on formatting content to be extracted as authoritative answers in AI-generated responses, featured snippets, and voice search results. It involves writing clear, concise answers to specific questions, using a proper heading hierarchy, and implementing FAQ and HowTo schema markup.
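As a minimal sketch (the question and answer text are invented for illustration), FAQ markup is typically embedded as JSON-LD:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is answer engine optimization?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "AEO is the practice of formatting content so AI systems and search engines can extract it as a direct answer to a specific question."
    }
  }]
}
</script>
```

Each question on the page gets its own entry in the `mainEntity` array.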
C
Search engines and AI crawlers allocate a limited crawl budget per domain, a cap on how many pages they will process in a given period. Sites with thousands of low-quality or duplicate pages waste that budget on content that doesn't contribute to visibility. Optimizing crawl budget means ensuring crawlers spend their time on your highest-value pages.
E
E-E-A-T is Google's framework for assessing content reliability. Experience refers to first-hand involvement in the topic. Expertise means demonstrable knowledge. Authoritativeness is earned through recognition by other credible sources. Trustworthiness relates to transparency, accuracy, and site security. AI systems use similar signals when deciding which sources to cite.
Entity SEO focuses on establishing your brand, people, products, and locations as named entities that AI systems can identify unambiguously. This involves consistent NAP (name, address, phone) data, structured data markup, Wikipedia/Wikidata presence, and content that clearly associates your brand with specific topics and industries. Entity clarity is a foundational signal for AI citations.
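One hedged sketch of entity markup (every name, URL, and identifier below is a placeholder, including the Wikidata ID) ties NAP data and external profiles together in a single Organization declaration:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Consulting",
  "url": "https://www.example.com",
  "telephone": "+1-555-0100",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main St",
    "addressLocality": "Springfield",
    "addressRegion": "IL",
    "postalCode": "62701",
    "addressCountry": "US"
  },
  "sameAs": [
    "https://www.wikidata.org/wiki/Q0000000",
    "https://www.linkedin.com/company/example-consulting"
  ]
}
</script>
```

The `sameAs` links are what let AI systems connect your site to the same entity on Wikidata, LinkedIn, and other authoritative profiles.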
G
GEO (Generative Engine Optimization) is the emerging discipline of making your content the preferred source for AI-generated answers. Unlike traditional SEO, which targets rankings in blue-link results, GEO focuses on structured content, entity authority, schema markup, and topical depth that AI systems use when composing responses. A GEO strategy typically includes content audits, structured data implementation, and citation tracking across major AI platforms.
H
HowTo schema is a Schema.org markup type that explicitly identifies a page as containing step-by-step instructions. It allows AI systems and search engines to extract and display your steps directly in results or generated responses, improving your visibility for procedural queries. It's particularly effective for tutorials, guides, and process documentation.
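A minimal sketch of HowTo markup as JSON-LD (the task name and step text are invented for illustration):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to request a consultation",
  "step": [
    {
      "@type": "HowToStep",
      "position": 1,
      "name": "Open the contact form",
      "text": "Go to the contact page and open the consultation request form."
    },
    {
      "@type": "HowToStep",
      "position": 2,
      "name": "Describe your project",
      "text": "Summarize your goals and timeline in the message field, then submit."
    }
  ]
}
</script>
```

Each step maps to one `HowToStep` object, which is what allows an AI system to present your instructions step by step.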
K
A Google Knowledge Panel is a structured information display shown for recognized entities. It pulls from structured data, Google Business Profile, Wikipedia, and other authoritative sources. Appearing in a knowledge panel signals to AI systems that your entity is well-established and trustworthy, which correlates with higher AI citation rates.
L
Large language models (LLMs) are the underlying technology behind ChatGPT, Claude, Gemini, and similar AI systems. They are trained on massive text datasets and learn to predict and generate natural language. When an AI assistant answers a question about your business, it's doing so based on patterns learned during training and, in some systems, from live web retrieval.
Similar to robots.txt for search crawlers, llms.txt is an emerging standard that lets website owners provide a structured summary of their site's purpose, key pages, and content priorities to AI systems. It sits at the root of your domain (e.g., yoursite.com/llms.txt) and helps LLMs better represent your business in their responses.
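The llms.txt proposal suggests plain Markdown: an H1 with the site name, a blockquote summary, and sections of annotated links. A hedged sketch (all names and URLs are illustrative):

```markdown
# Example Consulting

> Boutique consultancy helping small businesses prepare their websites
> for AI search and AI agents.

## Key pages

- [Services](https://www.example.com/services): What we offer and how engagements work
- [Glossary](https://www.example.com/glossary): Definitions of AI search terminology
```

Because the format is still an emerging proposal, it's worth checking the current spec before publishing; support varies across AI systems.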
R
robots.txt is a text file at the root of your domain that uses the Robots Exclusion Protocol to instruct automated crawlers. For AI readiness, it's important to explicitly allow AI-specific crawlers such as GPTBot, Google-Extended, PerplexityBot, ClaudeBot, and Anthropic-AI, which many sites unknowingly block through overly restrictive rules.
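A hedged example of AI-aware robots.txt rules (user-agent tokens are current as of writing but can change, and the `/private/` path is a placeholder):

```text
# Explicitly allow AI crawlers
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: ClaudeBot
Allow: /

# Example of opting out of Google's AI features
# while remaining in regular search
User-agent: Google-Extended
Disallow: /

# Default rule for all other crawlers
User-agent: *
Disallow: /private/
```

Note that crawlers match the most specific `User-agent` group, so an overly broad `Disallow: /` under `User-agent: *` is a common way sites block AI crawlers without realizing it.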
S
Schema.org is a collaborative project by Google, Microsoft, Yahoo, and Yandex that provides standardized markup for describing content types — businesses, products, events, people, articles, FAQs, and more. Adding Schema.org markup to your pages helps AI systems understand what your content represents, not just what words it contains. It is a foundational layer for both traditional SEO and AI visibility.
Semantic HTML uses elements like nav, header, main, section, article, form, label, and button that tell browsers and AI systems what role each element plays. Non-semantic HTML (divs and spans for everything) is visually fine for human users but leaves AI agents guessing about structure and purpose. Semantic markup is essential for agent-readable websites.
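A short contrast (the page content is invented for illustration):

```html
<!-- Non-semantic: an agent sees only anonymous boxes -->
<div class="header"><div class="nav">Services | Contact</div></div>
<div class="btn" onclick="send()">Request audit</div>

<!-- Semantic: roles and purpose are explicit -->
<header>
  <nav><a href="/services">Services</a> <a href="/contact">Contact</a></nav>
</header>
<main>
  <form action="/contact" method="post">
    <label for="email">Email</label>
    <input id="email" name="email" type="email" required>
    <button type="submit">Request audit</button>
  </form>
</main>
```

Both versions can be styled to look identical to a human; only the second tells a machine what each element does.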
Structured data is code (typically JSON-LD following Schema.org vocabulary) that you add to your pages to give AI systems unambiguous information about your content. Rather than inferring that "Dr. Jane Smith, Cardiologist" means you're a medical practice, structured data states it explicitly. It powers rich results in search, AI citation accuracy, and eventually WebMCP tool declarations.
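Using the "Dr. Jane Smith, Cardiologist" example from this entry, a minimal JSON-LD sketch (name and URL are placeholders) might be:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Physician",
  "name": "Dr. Jane Smith",
  "medicalSpecialty": "Cardiovascular",
  "url": "https://www.example.com"
}
</script>
```

With this in place, an AI system no longer has to infer the entity type from prose; the page declares it outright.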
T
Technical SEO covers the non-content aspects of search optimization: site speed, crawlability, indexability, structured data, canonical tags, sitemaps, robots.txt, and mobile performance. For AI-era websites, technical SEO has expanded to include AI crawler configuration, render-path optimization, and semantic HTML structure.
Topical authority is earned when a website comprehensively covers a subject area — not just a single keyword, but an entire topic cluster with related subtopics, definitions, guides, and case studies. AI systems preferentially cite sources with demonstrated topical authority. It is built through content strategy, internal linking, and consistent publication over time.
V
Voice search queries are conversational, specific, and often local ("What AI consultant is near me?"). They differ from typed queries in length and phrasing, and are answered by AI systems that pull from structured, question-answering content. Optimizing for voice search overlaps heavily with AEO — clear answers to specific questions, FAQ schema, and natural language content structure.
W
WebMCP (Web Model Context Protocol) is a proposed standard under development by Google's Chrome team and Microsoft's Edge team through the W3C. It adds a parallel, machine-executable layer to websites alongside the existing human-readable layer. With WebMCP, a booking form doesn't just wait for a human to fill it out; it declares itself as a "bookAppointment" tool with specific inputs that an AI agent can call directly. Broader browser support is expected by mid-to-late 2026.
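Because the specification is still in flux, any code here is speculative. Conceptually, a tool declaration might resemble the following hypothetical sketch (the API name, method, and schema shape are our guesses at the idea, not the published WebMCP interface):

```javascript
// HYPOTHETICAL sketch only — the real WebMCP API may differ substantially.
navigator.modelContext?.registerTool({
  name: "bookAppointment",
  description: "Book a consultation appointment",
  inputSchema: {
    type: "object",
    properties: {
      date:  { type: "string", format: "date" },
      email: { type: "string", format: "email" }
    },
    required: ["date", "email"]
  },
  async execute({ date, email }) {
    // Reuse the same backend endpoint the human-facing form submits to
    const res = await fetch("/api/appointments", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ date, email })
    });
    return res.json();
  }
});
```

The key idea is that the tool wraps the site's existing functionality, so humans and agents drive the same backend rather than parallel systems.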
Z
Zero-click searches occur when Google (or another search engine) displays enough information in AI Overviews, featured snippets, knowledge panels, or answer boxes that the user never needs to visit a website. While this can reduce traffic, appearing as the cited source still builds brand authority. For AI-era SEO, the goal shifts from "rank to get clicks" to "be the cited source even when there's no click."