Clay Tables: Building Scalable Lead Lists and Research Workflows

When you open Clay for the first time, the interface looks deceptively simple: a spreadsheet. Rows and columns. But Clay tables are not spreadsheets. They're programmable data pipelines disguised as a grid, and understanding how to structure them is the difference between a brittle lead list and a scalable research workflow.

This guide covers everything you need to know about Clay tables: column types, formulas, enrichment integrations, templates, and the design patterns that separate fragile one-off tables from production-grade GTM infrastructure. Whether you're building your first lead list or architecting multi-step workflows that qualify, enrich, and sequence thousands of prospects, this is the reference you'll keep coming back to.

We'll also cover where Clay tables hit their limits—and how tools like Octave can extend them with structured GTM context so your workflows stay maintainable as your team and playbooks grow.

What Are Clay Tables?

A Clay table is the core workspace where you build lead lists, run enrichments, apply formulas, and orchestrate multi-step GTM workflows. Each table holds a list of records—typically people, companies, or accounts—and each column represents either a data field, an enrichment, a formula, or an AI-generated output.

Think of a Clay table as a programmable spreadsheet with API access. Every column can trigger an action: pull data from an enrichment provider, run a formula, call an AI model, or push data to a downstream tool. Rows process independently, which means you can run different enrichment chains per row based on conditional logic.

Key Concept

Clay tables are not static data stores. They are execution environments. Each column can be an enrichment, a formula, or an integration call—and columns execute in dependency order, left to right, creating a data pipeline within the grid.
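
To make the "execution environment" idea concrete, here is a minimal sketch in plain JavaScript of rows passing through columns from left to right. It does not use Clay's API. The column names, the runRow helper, and the stubbed employee-count lookup are all hypothetical stand-ins.

```javascript
// Hypothetical model of a table: an ordered list of columns.
// Each column computes its value from the row built so far.
const pipelineColumns = [
  { name: "domain", run: (row) => row.email.split("@")[1].toLowerCase() },
  // Stub standing in for an enrichment provider call.
  { name: "employee_count", run: (row) => (row.domain === "acme.com" ? 250 : null) },
  {
    name: "is_qualified",
    run: (row) => row.employee_count !== null && row.employee_count >= 100,
  },
];

// Run every column in order for a single row; rows never read each other.
function runRow(input, cols) {
  const row = { ...input };
  for (const col of cols) {
    row[col.name] = col.run(row);
  }
  return row;
}

const sampleRows = [{ email: "jo@Acme.com" }, { email: "sam@tiny.io" }];
const processed = sampleRows.map((r) => runRow(r, pipelineColumns));
console.log(processed);
```

Because each later column reads only values produced by earlier ones, reordering or swapping a column changes one step without touching the rest.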

Tables vs. Traditional Spreadsheets

Feature	Traditional Spreadsheet	Clay Table
Data entry	Manual input	Manual, CSV import, CRM sync, webhook triggers, or API
Formulas	Cell references (A1, B2)	JavaScript-based with column references and AI formula generation
Enrichment	Not available	150+ data providers with waterfall logic
AI integration	Limited or add-on	Native AI columns (Claygent, HTTP API calls to LLMs)
Downstream actions	Export to CSV	Push to CRM, sequencer, Slack, webhooks
Conditional logic	IF functions	Column-level conditions that skip enrichments for unqualified rows

Clay Table Column Types

Columns are the building blocks of every Clay table. Understanding the column types available—and when to use each—is fundamental to building workflows that scale. Here is a breakdown of every major column type and its role in a typical GTM workflow.

Input Columns

Input columns hold the raw data you start with: names, domains, job titles, LinkedIn URLs, company names. You populate them via CSV upload, CRM import, webhook trigger, or manual entry. These are the seed data that downstream columns consume.

Common input columns for a lead list:

First Name / Last Name — Contact identifiers
Company Name — Account-level identifier
Company Domain — Used by most enrichment providers as the primary lookup key
Job Title — Persona matching and qualification
LinkedIn Profile URL — Required for person-level enrichments and AI research
Email — If already known; otherwise, enriched downstream

Enrichment Columns

Enrichment columns call external data providers to fill in information you don't already have. Clay connects to 150+ providers, and its waterfall enrichment feature lets you chain multiple providers for the same data point—if Provider A returns empty, it falls through to Provider B, then C.

Typical enrichment columns include:

Email finder — Waterfall across multiple providers for maximum coverage
Phone number — Direct dials and mobile numbers
Company firmographics — Employee count, revenue, industry, funding stage
Technographics — Tech stack detection (what tools a company uses)
Intent signals — Job postings, funding events, news mentions

Credit Optimization

Enrichments consume Clay credits. Use conditional execution to run expensive enrichments only on rows that pass initial qualification. For example, run waterfall email lookups only on companies that match your ICP firmographics. This pattern is covered in depth in our guide on chaining research to qualification to sequences.

Formula Columns

Formula columns use JavaScript syntax to transform, filter, and compute values from other columns. Clay also provides an AI formula generator that converts natural language into formulas—useful when you need quick transformations but don't want to write JS manually.

Common formula use cases:

Domain extraction: Parse a company domain from an email address or website URL
String formatting: Standardize job titles, clean company names, or extract first names from full names
Conditional scoring: Build basic lead scores based on firmographic data (employee count, industry, funding)
Data normalization: Convert revenue strings to numbers, standardize country codes, or merge duplicate fields
Boolean flags: Create true/false columns that downstream enrichments use as run conditions
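
As a rough illustration of the first two use cases, the helpers below are written as ordinary JavaScript functions. Clay's own column-reference syntax is not shown; the function names are hypothetical, and in a real formula column the arguments would come from other columns.

```javascript
// Parse the domain from an email address; returns null if the input is not usable.
function domainFromEmail(email) {
  if (typeof email !== "string") return null;
  const trimmed = email.trim();
  const at = trimmed.lastIndexOf("@");
  if (at < 1 || at === trimmed.length - 1) return null;
  return trimmed.slice(at + 1).toLowerCase();
}

// Parse the host from a website URL, dropping a leading "www.".
function domainFromUrl(url) {
  if (typeof url !== "string" || url.trim() === "") return null;
  const trimmed = url.trim();
  try {
    // Add a scheme when it is missing so the URL parser accepts the input.
    const withScheme = /^https?:\/\//i.test(trimmed) ? trimmed : `https://${trimmed}`;
    return new URL(withScheme).hostname.replace(/^www\./, "");
  } catch {
    return null;
  }
}

// Take the first whitespace-separated token of a full name.
function firstName(fullName) {
  const parts = String(fullName ?? "").trim().split(/\s+/);
  return parts[0] || null;
}

console.log(domainFromEmail("Jo@Acme.com"));
console.log(domainFromUrl("https://www.example.com/about"));
console.log(firstName("  Ada Lovelace "));
```

Returning null for unusable input, rather than an empty string or a thrown error, keeps downstream run conditions simple: a column can check for null and skip the row.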

AI Columns

AI columns call language models to classify, summarize, or generate content based on data in other columns. Clay's native AI integration (Claygent) can browse the web and answer research questions, while HTTP API columns let you connect to external AI services.

This is where Clay tables get powerful, and also where they get fragile. An AI column that generates personalized email copy needs context: who your company is, what your value prop is, what this persona cares about. That context has to come from somewhere. In a standalone Clay table, it typically gets embedded directly in the prompt. In a more structured setup, it comes from an external context layer like Octave.

Integration Columns

Integration columns push or pull data from external tools: CRMs (Salesforce, HubSpot), sequencers (Outreach, Salesloft, Instantly, Smartlead), Slack, webhooks, and more. These are the output stage of your Clay table—where enriched, qualified, and generated data gets pushed to the tools that act on it.

Building a Scalable Lead List in Clay

A lead list is the most common Clay table. But "scalable" means more than just adding rows. It means designing the table so you can reuse it across segments, swap enrichment providers without rebuilding, and add new columns without breaking downstream dependencies.

Step-by-Step: Your First Lead List

Define Your Source

Decide where your initial records come from: a CSV export from your CRM, a Clay "Find Companies" or "Find People" search, a webhook from an intent provider, or a manual list of target accounts. The source determines your starting columns.

Set Up Input Columns

At minimum, you need Company Domain and Company Name for account-level workflows, or First Name, Last Name, Job Title, LinkedIn Profile URL, and Company Domain for person-level workflows. These are the keys that enrichment providers use for lookups.

Add Firmographic Enrichments

Add columns for employee count, industry, revenue, and funding stage. These let you filter and score before running more expensive enrichments downstream. Use waterfall enrichment to chain multiple providers for higher data coverage.

Apply Qualification Logic

Use formula columns to flag records that match your ICP. A simple approach: create a boolean column that checks employee count, industry, and funding stage against your criteria. Only rows that pass this filter should trigger downstream enrichments.

For more sophisticated qualification that evaluates fit against your full ICP—including qualitative criteria like business model, operating environment, and strategic priorities—you can use an external qualification agent instead of formula-based scoring.
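
The formula-based flag described in this step might look something like the following sketch. The criteria object, its thresholds, and the field names (employee_count, industry, funding_stage) are assumptions for illustration, not values from Clay or from any particular ICP.

```javascript
// Hypothetical ICP criteria; replace with your own thresholds and values.
const icp = {
  minEmployees: 50,
  maxEmployees: 1000,
  industries: ["software", "fintech"],
  fundingStages: ["series b", "series c"],
};

// Return true only when every firmographic check passes.
// Missing data fails the check, so unknown rows do not trigger paid enrichments.
function matchesIcp(row, criteria) {
  const employees = Number(row.employee_count);
  const industry = String(row.industry ?? "").trim().toLowerCase();
  const stage = String(row.funding_stage ?? "").trim().toLowerCase();
  return (
    Number.isFinite(employees) &&
    employees >= criteria.minEmployees &&
    employees <= criteria.maxEmployees &&
    criteria.industries.includes(industry) &&
    criteria.fundingStages.includes(stage)
  );
}

console.log(
  matchesIcp({ employee_count: 250, industry: "Software", funding_stage: "Series B" }, icp)
);
```

Keeping the criteria in one object, separate from the check itself, makes it easier to adjust thresholds per segment without rewriting the logic.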

Enrich Contact Data

For qualified rows, add email finder and phone enrichment columns. Set these to only run when your qualification flag is true. This saves credits by skipping records that don't match your ICP.

Add Research and Personalization

Add AI or enrichment columns that gather the signals you need for personalization: recent company news, job postings, tech stack, LinkedIn activity. These become the inputs for your messaging layer.

Push to Downstream Tools

Add integration columns to push qualified, enriched records to your CRM and sequencer. Map the fields and configure the push conditions.

Formulas in Clay: A Practical Reference

Clay formulas use JavaScript syntax with some Clay-specific conventions. You reference column values by their column names, and Clay provides helper functions for common operations. You can also use the AI formula generator to describe what you want in plain English and let Clay write the JavaScript.

Common Formula Patterns

Use Case	Description	When to Use
Domain extraction	Parse domain from email or URL	You have emails but need domains for company enrichments
String cleanup	Trim whitespace, standardize casing, remove prefixes	Imported data has inconsistent formatting
Conditional flags	Return true/false based on multi-column criteria	Gate expensive enrichments behind qualification checks
Score calculation	Weighted numeric score from multiple columns	Basic ICP scoring before deeper qualification
Text concatenation	Combine multiple fields into a single string	Building context strings to pass to AI columns or runtime context
Array operations	Filter, map, or join array values	Working with tech stack lists, job titles, or multi-value enrichment fields
Date parsing	Calculate days since event, format dates	Timing-based triggers (days since funding, job change recency)
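
A few of the patterns in the table, sketched as standalone JavaScript functions. These are illustrative only: the function names, the accepted revenue formats, and the score weights are assumptions, and a real formula column would reference Clay columns rather than function arguments.

```javascript
// Convert strings such as "$12.5M" or "3k" into a number; returns null if unparseable.
function parseRevenue(text) {
  const match = /^\$?\s*([\d,]*\.?\d+)\s*([kmb])?$/i.exec(String(text ?? "").trim());
  if (!match) return null;
  const multipliers = { k: 1e3, m: 1e6, b: 1e9 };
  const base = Number(match[1].replace(/,/g, ""));
  const unit = (match[2] || "").toLowerCase();
  return base * (multipliers[unit] || 1);
}

// Whole days between a date string and a reference date (defaults to now).
function daysSince(dateText, now = new Date()) {
  const then = new Date(dateText);
  if (Number.isNaN(then.getTime())) return null;
  return Math.floor((now.getTime() - then.getTime()) / 86400000);
}

// Weighted score from a few signals; the weights are illustrative.
function leadScore(row) {
  let score = 0;
  if (row.employee_count >= 100) score += 40;
  if (row.recently_funded) score += 35;
  if ((row.tech_stack || []).includes("Salesforce")) score += 25;
  return score;
}

console.log(parseRevenue("$12.5M"));
console.log(daysSince("2024-01-01", new Date("2024-01-31")));
console.log(leadScore({ employee_count: 150, recently_funded: false, tech_stack: ["Salesforce"] }));
```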

AI Formula Generator

If you're not comfortable writing JavaScript, Clay's AI formula generator converts plain-English descriptions into working formulas. Describe what you want ("extract the domain from the email column, removing any subdomains") and Clay generates the code. It handles most common transformations well, though complex multi-column logic may need manual refinement.

Formula Limitations

Formulas are excellent for data transformation and basic scoring, but they hit walls quickly when you try to embed real GTM logic in them. ICP definitions that start as clean conditionals become nested, unmaintainable blocks as your segmentation grows. If you find yourself with formula columns that span 20+ lines of conditional logic, that's a signal you need a structured qualification layer—either a dedicated scoring system or a context engine that handles qualification externally.

Enrichment Integrations and Waterfall Logic

Clay's enrichment ecosystem is its core differentiator. With access to 150+ data providers, you can chain multiple sources for the same data point using waterfall logic. If one provider returns empty, the next one in the chain fires automatically.

Setting Up Waterfall Enrichment

Waterfall enrichment is Clay's approach to maximizing data coverage. Instead of relying on a single email finder or company data provider, you stack multiple providers in priority order. Clay tries the first, and if it returns nothing (or returns low-confidence data), it falls through to the next.
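
The fall-through behavior can be sketched as a short loop. This is not Clay's implementation: the provider objects are hypothetical stubs, the 0.7 confidence threshold is an arbitrary assumption, and real provider calls would be asynchronous.

```javascript
// Try each provider in priority order and return the first usable result.
// Each stub provider takes a row and returns { value, confidence } or null.
function waterfall(row, providers, minConfidence = 0.7) {
  for (const provider of providers) {
    const result = provider.lookup(row);
    if (result && result.value && result.confidence >= minConfidence) {
      return { ...result, source: provider.name };
    }
  }
  return null; // every provider came back empty or below the threshold
}

const emailProviders = [
  { name: "providerA", lookup: () => null },
  { name: "providerB", lookup: () => ({ value: "jo@acme.com", confidence: 0.4 }) },
  { name: "providerC", lookup: (row) => ({ value: `jo@${row.domain}`, confidence: 0.9 }) },
];

const found = waterfall({ domain: "acme.com" }, emailProviders);
console.log(found);
```

Here providerA returns nothing and providerB returns a low-confidence match, so the result comes from providerC. Recording the source alongside the value makes it possible to audit which provider filled each row.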

A typical email waterfall might chain three to four providers. The result is often email coverage of 70-80% or more, compared to 30-50% from any single provider. The same pattern applies to phone numbers, technographics, and company data.
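
For intuition on why stacking providers raises coverage, here is a back-of-the-envelope calculation. It assumes each provider's hits are statistically independent, which is optimistic because real providers overlap heavily, so treat the output as an upper bound. The 40% hit rates are made-up inputs, not measured figures.

```javascript
// Probability that at least one provider returns a result,
// assuming independent hit rates (an idealized assumption).
function combinedCoverage(hitRates) {
  const missAll = hitRates.reduce((miss, p) => miss * (1 - p), 1);
  return 1 - missAll;
}

// Three hypothetical providers that each find 40% of emails.
const coverage = combinedCoverage([0.4, 0.4, 0.4]);
console.log(coverage.toFixed(3)); // roughly 0.784
```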

Conditional Enrichment

Not every row needs every enrichment. Clay lets you set conditions on enrichment columns so they only run when specific criteria are met. This is critical for credit management and workflow efficiency.

Common conditions:

Only enrich emails for qualified accounts: Gate email waterfall behind a firmographic qualification formula
Only run tech stack enrichment for target industries: Skip if industry doesn't match your ICP
Only run AI research for high-score leads: If you're using a qualification agent, only run person-level enrichment on leads scoring above your threshold

This conditional execution pattern becomes especially powerful when you chain qualification into sequence generation—you qualify companies first, then only spend credits on enrichment and content generation for prospects that pass.
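
A run condition can be thought of as a wrapper around the enrichment. The sketch below is a plain JavaScript analogy rather than Clay configuration: runIf, findEmail, and the is_qualified field are hypothetical, and the counter exists only to show that the stubbed provider is never called for unqualified rows.

```javascript
// Run an enrichment only when the row passes its condition; otherwise record a skip.
function runIf(condition, enrich) {
  return (row) => (condition(row) ? enrich(row) : { skipped: true });
}

let providerCalls = 0;
// Stub standing in for a paid email lookup.
const findEmail = (row) => {
  providerCalls += 1;
  return { email: `hello@${row.domain}` };
};

const gatedFindEmail = runIf((row) => row.is_qualified === true, findEmail);

const gateResults = [
  { domain: "acme.com", is_qualified: true },
  { domain: "tiny.io", is_qualified: false },
].map((row) => gatedFindEmail(row));

console.log(gateResults, providerCalls);
```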

Clay Templates: Starting Points, Not Endpoints

Clay's template library offers pre-built tables for common workflows: lead enrichment, account research, competitor monitoring, job change tracking, and more. Templates give you a solid starting point with columns, enrichments, and formulas already configured.

Popular Template Categories

Category	What It Does	Best For
Lead enrichment	Takes a list of names/domains and fills in emails, phones, firmographics	Building outbound lists from scratch
Account research	Deep research on target accounts: news, funding, hiring, tech stack	ABM plays and account prioritization
Job change tracking	Monitors contacts for role changes and triggers outreach	Re-engaging churned champions at new companies
Competitor monitoring	Tracks competitor customers, hires, and product changes	Competitive displacement campaigns
CRM enrichment	Pulls records from your CRM, enriches, and pushes back	Cleaning and enriching existing CRM data

Templates Are Starting Points

Templates give you the column structure and enrichment setup, but they don't include your GTM context: your ICP definitions, your value props, your qualification criteria. You'll need to add those yourself—either as formula columns and prompt logic within the table, or by connecting a context layer like Octave that provides that context via API. See our guide on using Clay with Octave for the full setup.

Table Design Patterns and Best Practices

The difference between a Clay table that works for a one-off campaign and one that scales across quarters comes down to structure. Here are the patterns that keep tables maintainable.

1. Left-to-Right Pipeline Design

Organize columns in execution order: inputs on the left, enrichments in the middle, outputs on the right. This mirrors the data pipeline and makes the workflow readable at a glance. Anyone who opens the table can follow the logic from source data through enrichment, qualification, and action.

2. Gate Expensive Operations

Every enrichment and AI call costs credits and time. Add qualification gates before expensive operations. A firmographic check (employee count, industry) costs almost nothing compared to a person-level AI enrichment or sequence generation. Structure your table so cheap checks run first and expensive operations only fire on qualified rows.
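
A quick worked comparison shows how much a gate can save. All numbers here are hypothetical: the per-row credit costs and the 20% qualification rate are placeholders, so substitute your own.

```javascript
// Total credits for a table where every row gets a cheap check and
// either all rows (ungated) or only qualified rows (gated) get the expensive steps.
function creditsUsed({ rows, qualifiedShare, cheapCost, expensiveCost, gated }) {
  const expensiveRows = gated ? Math.round(rows * qualifiedShare) : rows;
  return rows * cheapCost + expensiveRows * expensiveCost;
}

// Hypothetical scenario: 1,000 rows, 20% qualify,
// 1 credit for the firmographic check, 15 credits for email plus AI research.
const scenario = { rows: 1000, qualifiedShare: 0.2, cheapCost: 1, expensiveCost: 15 };

const ungatedCredits = creditsUsed({ ...scenario, gated: false });
const gatedCredits = creditsUsed({ ...scenario, gated: true });
console.log({ ungatedCredits, gatedCredits }); // 16000 vs 4000
```

The lower the share of rows that qualify, the larger the gap, which is why the cheap check belongs as far left in the table as possible.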

3. Separate Research from Action

Don't try to do everything in one table. Complex workflows often benefit from a multi-table architecture:

Table 1: Account qualification — Company-level enrichment and scoring
Table 2: Contact finding — Find people at qualified accounts
Table 3: Outreach generation — Person-level research and sequence creation

This mirrors the ABM workflow pattern where you qualify accounts, prospect contacts, and generate sequences as distinct stages.
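
One way to picture the three-table split is as three small functions, each consuming the previous one's output. This is an analogy only: the function names, the 100-employee cutoff, and the in-memory contact directory are hypothetical, and Clay tables pass rows between each other rather than calling functions.

```javascript
// Stage 1: keep accounts that pass a company-level check.
const qualifyAccounts = (accounts) => accounts.filter((a) => a.employee_count >= 100);

// Stage 2: expand each qualified account into its contacts (stubbed lookup).
const findContacts = (accounts, directory) =>
  accounts.flatMap((a) =>
    (directory[a.domain] || []).map((c) => ({ ...c, domain: a.domain }))
  );

// Stage 3: produce one outreach record per contact.
const buildOutreach = (contacts) =>
  contacts.map((c) => ({ to: c.email, subject: `Quick question for ${c.first_name}` }));

const accounts = [
  { domain: "acme.com", employee_count: 250 },
  { domain: "tiny.io", employee_count: 8 },
];
const directory = {
  "acme.com": [{ first_name: "Jo", email: "jo@acme.com" }],
  "tiny.io": [{ first_name: "Sam", email: "sam@tiny.io" }],
};

const outreach = buildOutreach(findContacts(qualifyAccounts(accounts), directory));
console.log(outreach);
```

Only contacts at the qualified account reach the final stage, so the most expensive work runs on the smallest set of rows.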

4. Use Naming Conventions

Name columns descriptively: firmographic_employee_count instead of employees, waterfall_email_result instead of email. When you have 30+ columns, clear naming prevents confusion about which column feeds into which formula or enrichment.

5. Document with Notes

Add column descriptions for any non-obvious logic. Future you (or your teammate who inherits the table) will thank you when they need to understand why a formula column checks for both "Series B" and "Series C" funding but not "Series A."

6. Version Your Tables

Before making major structural changes, duplicate the table. Clay doesn't have native version control, so table duplication is your undo button. Label duplicates with dates or version numbers.

When Clay Tables Hit Their Limits

Clay tables are powerful execution environments, but they have structural limits that emerge as your GTM operation scales. Understanding these limits helps you plan for them rather than hitting them mid-campaign.

The Prompt Swamp

When you use AI columns for qualification, personalization, and content generation, the prompts embedded in those columns accumulate your GTM knowledge: ICP definitions, value props, competitive positioning, persona details. This works for one table. It breaks when you have ten tables, three segments, and two product lines—each with its own copy of your positioning.

When your ICP shifts, you're updating prompts across every table. When a new product launches, you're duplicating and modifying entire workflows. This is the "prompt swamp" that GTM Engineers often encounter as they scale.

Qualification Fragility

Formula-based lead scoring in Clay is deterministic and transparent, which is good. But it's limited to structured data you already have in columns. It can't evaluate qualitative fit: "Does this company's business model align with our use cases?" or "Would this person's priorities make them receptive to our positioning?" Those questions require context that formulas can't encode.

Context Drift

As tables multiply, each one develops its own version of your GTM context. The ICP definition in Table A diverges from the one in Table B. The competitive positioning in your cold outreach table doesn't reflect the updated messaging from last month's positioning sprint. There's no single source of truth.

The Pattern

These limits all stem from the same root cause: GTM context embedded in execution. Clay is an excellent execution engine, but it was built for orchestration, not for storing and versioning your go-to-market knowledge. The solution is separating context from execution—maintaining your GTM knowledge in a dedicated layer and consuming it from Clay at runtime. This is exactly the pattern that an Octave + Clay integration enables.
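
The separation can be illustrated with a small sketch in which prompts are assembled from one shared context object instead of being hand-written per table. This does not represent Octave's API or Clay's AI column configuration. The gtmContext shape, the persona keys, and buildPrompt are all hypothetical.

```javascript
// Hypothetical single source of truth for GTM context.
const gtmContext = {
  company: "ExampleCo",
  valueProp: "cuts list-building time for outbound teams",
  personas: {
    revops: "cares about data quality and CRM hygiene",
    sales_leader: "cares about pipeline coverage and rep productivity",
  },
};

// Each table's prompt is assembled from the shared context plus row data.
function buildPrompt(context, row) {
  const personaNote = context.personas[row.persona] ?? "is not defined in the context yet";
  return [
    `You are writing on behalf of ${context.company}, which ${context.valueProp}.`,
    `The recipient is ${row.first_name}, ${row.job_title}. This persona ${personaNote}.`,
    "Write a two-sentence opener.",
  ].join("\n");
}

const prompt = buildPrompt(gtmContext, {
  first_name: "Jo",
  job_title: "Head of RevOps",
  persona: "revops",
});
console.log(prompt);
```

When positioning changes, only gtmContext is edited, and every prompt built from it picks up the change on the next run.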

Advanced Table Workflows

Once you're comfortable with single-table lead lists, these advanced patterns unlock more sophisticated GTM motions.

Trigger-Based Tables

Instead of building static lists, configure tables that fire on real-time signals: a webhook from your CRM when a lead hits a scoring threshold, a job change alert from a monitoring provider, or a funding event from a news API. The table processes each incoming record through your enrichment and qualification pipeline automatically.

Multi-Table Chaining

Use Clay's table-to-table features to chain workflows. A company qualification table outputs qualified accounts to a contact finding table, which outputs contacts to a sequence generation table. Each table is focused and maintainable, and you can swap out components without rebuilding the whole chain.

Conditional Branching

Use formula columns and conditional enrichment to route leads through different paths based on their data. A lead at a company using a competitor tool gets routed to a competitive displacement workflow. A lead at a fast-growing company with recent funding gets routed to an expansion play. The table becomes a routing engine, not just a list.
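
A routing formula along these lines might look like the sketch below. The competitor tool names, the 20% growth threshold, the field names, and the route labels (including the default_nurture fallback) are assumptions made up for the example.

```javascript
// Hypothetical list of competitor tools to look for in a tech stack.
const COMPETITOR_TOOLS = ["RivalCRM", "OtherTool"];

// Return the name of the workflow a lead should enter.
function routeLead(row) {
  const stack = row.tech_stack || [];
  if (stack.some((tool) => COMPETITOR_TOOLS.includes(tool))) {
    return "competitive_displacement";
  }
  if (row.recently_funded && row.headcount_growth_pct >= 20) {
    return "expansion_play";
  }
  return "default_nurture";
}

console.log(routeLead({ tech_stack: ["RivalCRM", "Slack"] }));
console.log(routeLead({ tech_stack: [], recently_funded: true, headcount_growth_pct: 35 }));
console.log(routeLead({ tech_stack: ["Slack"] }));
```

Downstream columns can then use the returned label as their run condition, so each path only executes its own enrichments.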

CRM Bidirectional Sync

Pull records from your CRM into Clay for enrichment, then push the enriched data back. This keeps your CRM as the system of record while leveraging Clay's enrichment ecosystem. The key is field mapping: make sure you're updating existing CRM records, not creating duplicates. Our guide on coordinating Clay, CRM, and sequencer in one flow covers this pattern in detail.
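
The update-versus-create decision can be sketched as a lookup on a shared key. This is a generic illustration, not a Salesforce or HubSpot API call: planCrmSync, the record shapes, and the choice of email as the match key are assumptions.

```javascript
// Split enriched rows into updates (matched to an existing CRM record)
// and creates (no match), using lowercase email as the key.
function planCrmSync(crmRecords, enrichedRows) {
  const byEmail = new Map(crmRecords.map((r) => [r.email.toLowerCase(), r]));
  const updates = [];
  const creates = [];
  for (const row of enrichedRows) {
    const existing = byEmail.get(row.email.toLowerCase());
    if (existing) {
      updates.push({ id: existing.id, fields: row });
    } else {
      creates.push(row);
    }
  }
  return { updates, creates };
}

const syncPlan = planCrmSync(
  [{ id: "001", email: "jo@acme.com" }],
  [
    { email: "Jo@Acme.com", phone: "555-0100" },
    { email: "sam@tiny.io", phone: "555-0101" },
  ]
);
console.log(syncPlan);
```

Normalizing the key before matching is what prevents "Jo@Acme.com" from being pushed as a second record.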

FAQ

How many rows can a Clay table handle?

Clay tables can handle thousands of rows, but performance depends on the number of enrichment columns and their execution conditions. For very large lists (10,000+ rows), consider splitting into batches or using multi-table architectures where a qualification table filters before passing to enrichment-heavy downstream tables.
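
If you prepare the list outside Clay before importing, splitting it into batches is a few lines of code. A minimal sketch, with the 10,000-row batch size taken from the threshold mentioned above rather than from any documented Clay limit:

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const allRows = Array.from({ length: 25000 }, (_, i) => ({ id: i }));
const batches = chunk(allRows, 10000);
console.log(batches.map((b) => b.length)); // [10000, 10000, 5000]
```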

How do Clay credits work with enrichments?

Each enrichment provider consumes a certain number of Clay credits per row. Waterfall enrichments charge only for the provider that returns a result, not for every provider in the chain. Use conditional execution and qualification gates to avoid running enrichments on rows that don't meet your criteria.

Can I use Clay tables with external AI tools instead of Claygent?

Yes. Clay supports HTTP API columns that can call any external service. This is how you connect tools like Octave's agents—you add an enrichment column, configure the API connection, and map your Clay columns to the required inputs. The response fields get parsed into new columns.

What's the best way to organize columns in a large table?

Follow the left-to-right pipeline pattern: input columns on the left, enrichments in the middle (ordered by dependency), qualification formulas after enrichments, AI/generation columns after qualification, and integration/push columns on the far right. Use clear naming conventions and hide columns that are intermediate calculations to keep the view clean.

How do I avoid duplicate records across multiple tables?

Use company domain as the primary dedup key for account-level tables and LinkedIn URL or email for person-level tables. Clay has a built-in deduplication feature you can run on any column. For cross-table dedup, use your CRM as the single source of truth and check for existing records before creating new ones in downstream pushes.
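
For lists you clean before import, or for checking logic outside Clay, deduplication by a normalized key can be sketched as follows. This is not Clay's built-in dedup feature; normalizeDomain and dedupeBy are hypothetical helpers.

```javascript
// Lowercase, trim, and drop a leading "www." so equivalent domains compare equal.
function normalizeDomain(domain) {
  return String(domain ?? "").trim().toLowerCase().replace(/^www\./, "");
}

// Keep the first row for each key. Rows with an empty key are dropped here;
// a real workflow might set them aside for review instead.
function dedupeBy(rows, keyFn) {
  const seen = new Set();
  return rows.filter((row) => {
    const key = keyFn(row);
    if (!key || seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}

const uniqueAccounts = dedupeBy(
  [{ domain: "Acme.com" }, { domain: "www.acme.com" }, { domain: "tiny.io" }],
  (row) => normalizeDomain(row.domain)
);
console.log(uniqueAccounts);
```

For person-level tables the same helper works with a different key function, for example one that lowercases the email or LinkedIn URL.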

Can I schedule Clay tables to run automatically?

Clay supports scheduled runs and webhook-triggered tables. Scheduled runs are useful for recurring enrichment jobs (like weekly CRM data refresh). Webhook triggers are better for real-time workflows (like processing inbound leads as they arrive). Both can be configured from the table settings.

How do I share a Clay table with my team?

Clay workspaces support team access. You can share tables within your workspace, and team members can view, edit, or duplicate them. For cross-team sharing, export the table as a template or share the workspace link. Note that enrichment columns are tied to your workspace's credit balance.

When should I use a formula vs. an enrichment vs. an AI column?

Use formulas for data transformation, cleanup, and basic calculations on data you already have. Use enrichments to pull new data from external providers. Use AI columns when you need classification, summarization, or generation that requires reasoning over unstructured data. In general: formulas are cheapest, enrichments cost credits, and AI columns cost the most but handle the most complex tasks.

Next Steps

Clay tables are the foundation of any modern GTM workflow. Start with a clean lead list, apply the design patterns above, and iterate. As your workflows scale, keep an eye on where context starts leaking into your tables—long prompt columns, complex scoring formulas, duplicated qualification logic—and consider whether a dedicated context layer would make your setup more maintainable.

If you're ready to separate your GTM context from your Clay execution, Octave gives you a structured Library, testable Playbooks, and AI agents that connect directly to Clay. Build it once, consume it everywhere.