a

ES|QL LOOKUP JOIN Limitations: What Elastic’s New Join Feature Can’t Do

ES|QL LOOKUP JOIN Limitations

TL;DR

ES|QL LOOKUP JOIN (GA in Elasticsearch 8.17+) brings joins to Elasticsearch, but with major constraints: 2 billion row limit on lookup indices, single-field equality joins only, no wildcards or dynamic index references, no cross-cluster use after aggregations, memory circuit breaker failures under load, and Elasticsearch-only (no external database joins). For complex or cross-source joins, use data federation platforms instead.

Table of Contents

Introduction

Elastic finally added joins to Elasticsearch. The ES|QL LOOKUP JOIN feature, released in late 2024, lets you join data across indices within Elasticsearch.

This is genuinely useful progress. But before you redesign your data architecture around LOOKUP JOIN, understand its documented limitations—some of which may be dealbreakers for your use case.

What ES|QL LOOKUP JOIN Does

LOOKUP JOIN enables enriching query results with data from a second index, similar to a SQL LEFT JOIN:

FROM logs-*
| WHERE @timestamp > NOW() - 1 DAY
| LOOKUP JOIN customers ON customer_id
| STATS count = COUNT(*) BY customer_name, plan_tier

This query:

1. Queries the logs-* indices

2. Joins each row with the customers index on customer_id

3. Brings in customer_name and plan_tier from the lookup index

4. Aggregates results

For teams who’ve wanted joins in Elasticsearch, this is a significant step forward.

ES|QL LOOKUP JOIN Version Compatibility

LOOKUP JOIN isn’t available in all Elasticsearch versions. Here’s what’s supported where:

Elasticsearch VersionLOOKUP JOIN SupportCross-Cluster LOOKUP JOINNotes
8.x and earlierNot availableNot availableUse ENRICH policies for enrichment
8.16 (Preview)Technical previewNot availableInitial release, not production-ready
8.17Generally availableNot availableSingle-cluster only
9.0 – 9.1Generally availableNot availableStable, single-cluster joins
9.2.0+Generally availableSupported (with restrictions)Cannot use after STATS, SORT, LIMIT
ES|QL LOOKUP JOIN version compatibility timeline showing feature availability across Elasticsearch releases LOOKUP JOIN became generally available in Elasticsearch 8.17. Cross-cluster support arrived in 9.2.0 with restrictions.

Key version notes:

  • If you’re on Elasticsearch 8.x below 8.16, LOOKUP JOIN is unavailable, use ENRICH policies or external federation instead
  • The 8.16 technical preview has different syntax and behavior than the GA release – don’t rely on preview-era examples
  • Cross-cluster LOOKUP JOIN (9.2.0+) has additional constraints covered below

The Documented Limitations

According to Elastic’s official documentation and Elasticsearch Labs, LOOKUP JOIN has specific constraints:

Limitation 1: Lookup Index Size Cap

The constraint: Lookup indices are limited to a single shard and 2 billion documents maximum (a Lucene limitation for one shard).

From the docs: > “Indices in lookup mode are always single-sharded. Lookup indices are limited to 2 billion documents, a Lucene limit for one shard.”

Impact:

  • Your lookup index cannot exceed 2 billion rows
  • For high-cardinality reference data, this may be insufficient
  • No horizontal scaling of lookup indices

Workaround: None within ES|QL. If your reference data exceeds this limit, you need a different approach.

Limitation 2: Single Match Field Only

The constraint: LOOKUP JOIN only supports joining on a single field.

From the docs: > “LOOKUP JOIN can only use a single match field and a single index.”

What this means:

-- Supported: single field join
| LOOKUP JOIN customers ON customer_id

-- NOT supported: composite key join
| LOOKUP JOIN orders ON customer_id AND order_date

-- NOT supported: multiple lookup indices
| LOOKUP JOIN customers ON customer_id
| LOOKUP JOIN products ON product_id  -- Can chain, but each is limited

Impact:

  • Composite key joins require workarounds
  • Cannot join on multiple fields simultaneously
  • Complex data models may not translate directly

Limitation 3: No Wildcards or Dynamic Index References

The constraint: You must specify exact index names, no wildcards, aliases, date math, or data streams.

From the docs: > “Only specific index names or aliases are supported; wildcards and remote cluster prefixes are not supported.”

What fails:

-- These don't work for lookup indices
| LOOKUP JOIN logs-*       -- Wildcard: not supported
| LOOKUP JOIN logs-2024.*  -- Date pattern: not supported
| LOOKUP JOIN @timestamp   -- Dynamic: not supported

Impact:

  • Cannot join against time-partitioned lookup indices
  • Must know exact index names at query time
  • Rolling indices require query updates

Limitation 4: Cross-Cluster Limitations

The constraint: Cross-cluster LOOKUP JOIN has additional restrictions and version requirements.

From the docs: > “Cross cluster search is unsupported in versions prior to 9.2.0. Additionally, cross-cluster LOOKUP JOIN can not be used after aggregations (STATS), SORT and LIMIT commands, and coordinator-side ENRICH commands.”

What this means:

-- Cross-cluster LOOKUP JOIN restrictions
FROM remote_cluster:logs-*
| STATS count = COUNT(*) BY customer_id
| LOOKUP JOIN customers ON customer_id  -- Fails: can't use after STATS

-- Must restructure query
FROM remote_cluster:logs-*
| LOOKUP JOIN customers ON customer_id  -- Before aggregation
| STATS count = COUNT(*) BY customer_name

Impact:

  • Multi-cluster architectures have significant constraints
  • Query order matters and limits flexibility
  • Aggregations and joins don’t compose freely
Diagram showing ES|QL LOOKUP JOIN constraints including single-field join, single-shard lookup index, and equality-only matching
ES|QL LOOKUP JOIN has specific constraints around index size, join fields, cross-cluster usage, and matching types that may limit its usefulness for complex analytics.

Limitation 5: Equality Matching Only

The constraint: Joins only support exact equality matches.

From the docs: > “Only matching on equality is supported.”

What’s not supported:

-- NOT supported: range joins
| LOOKUP JOIN time_ranges
    WHERE event_time BETWEEN start_time AND end_time

-- NOT supported: fuzzy matching
| LOOKUP JOIN products
    WHERE SIMILARITY(product_name, search_term) > 0.8

-- NOT supported: inequality joins
| LOOKUP JOIN pricing WHERE tier >= customer_tier

Impact:

  • Time-window joins (common in event correlation) aren’t possible
  • Approximate matching requires preprocessing
  • Band joins for ranges need workarounds

Limitation 6: Output Ordering Not Guaranteed

The constraint: LOOKUP JOIN may not preserve row ordering.

From the docs: > “The output rows produced by LOOKUP JOIN can be in any order and may not respect preceding SORTs. To guarantee a certain ordering, place a SORT after any LOOKUP JOINs.”

What this means:

-- Ordering not preserved
FROM logs-*
| SORT @timestamp DESC
| LOOKUP JOIN customers ON customer_id  -- May scramble order
| LIMIT 100  -- Wrong 100 rows?

-- Correct approach
FROM logs-*
| LOOKUP JOIN customers ON customer_id
| SORT @timestamp DESC  -- Sort AFTER join
| LIMIT 100

Impact:

  • Existing ES|QL queries may produce unexpected results when adding LOOKUP JOIN
  • Must carefully consider query structure
  • Pagination behavior changes

Limitation 7: Memory and Document Size Constraints

The constraint: Large matches or large documents can cause circuit breaker failures.

From the docs: > “The query will circuit break if there are too many matching documents in the lookup index or if documents are too large.”

“LOOKUP JOIN works in batches of normally about 10,000 rows and requires large amounts of heap space if matching documents are multiple megabytes or larger.”

Impact:

  • High-cardinality joins may fail
  • Large reference documents are problematic
  • Must monitor and tune memory settings
  • Unpredictable failures under load

What LOOKUP JOIN Can’t Replace

Given these limitations, LOOKUP JOIN doesn’t solve several common scenarios:

1. Joining Elasticsearch with External Databases

LOOKUP JOIN is Elasticsearch-to-Elasticsearch only. It cannot:

-- NOT possible with LOOKUP JOIN
FROM elasticsearch.logs
| JOIN mysql.customers ON customer_id

-- Also NOT possible
FROM elasticsearch.events
| JOIN postgresql.accounts ON account_id

For cross-system joins, you need data federation or ETL.

2. Complex Multi-Table Joins

While you can chain LOOKUP JOINs, each has the single-field limitation:

-- Technically possible but increasingly constrained
FROM orders
| LOOKUP JOIN customers ON customer_id
| LOOKUP JOIN products ON product_id
| LOOKUP JOIN regions ON region_id
-- Each join adds memory pressure and latency
-- Still single-field equality only

For complex analytical queries, purpose-built analytics platforms handle multi-source joins more flexibly.

3. Real-Time Enrichment at Scale

The memory constraints and circuit breaker behavior make LOOKUP JOIN risky for:

  • High-throughput streaming scenarios
  • Large lookup tables (approaching 2B limit)
  • Documents with large nested structures

4. Time-Window Correlations

Security and observability use cases often need:

-- "Find all events within 5 minutes of login failure"
-- Cannot express in LOOKUP JOIN (no range predicates)

These require either:

  • Pre-computed enrichments via ENRICH policies
  • External correlation engines
  • Post-query processing

Alternatives for Complex Join Scenarios

Option 1: ENRICH Processor (Pre-Computed Joins)

For relatively static reference data, ENRICH policies pre-compute joins at ingest time:

PUT /_enrich/policy/customer-policy
{
  "match": {
    "indices": "customers",
    "match_field": "customer_id",
    "enrich_fields": ["customer_name", "plan_tier"]
  }
}

Trade-offs: – Data baked into documents at ingest – Stale when reference data changes – No dynamic filtering

Option 2: Data Federation Platforms

Platforms like Knowi join Elasticsearch with external sources at query time:

-- Cross-source federation
SELECT
  es.event_type,
  mysql.customer_name,
  pg.account_balance
FROM elasticsearch.events es
JOIN mysql.customers mysql ON es.customer_id = mysql.id
JOIN postgresql.accounts pg ON mysql.account_id = pg.id

Advantages:

  • No Elasticsearch limitations
  • Join any data source
  • Real-time, no pre-computation

See How to Join Elasticsearch with Other Datasources for implementation details.

Option 3: Denormalization at Ingest

Embed related data directly in documents during ingestion:

{
  "event_type": "purchase",
  "customer": {
    "id": "cust_123",
    "name": "Acme Corp",
    "tier": "enterprise"
  },
  "product": {
    "id": "prod_456",
    "name": "Widget Pro"
  }
}

Trade-offs:

  • Larger documents
  • Updates require re-indexing
  • Works well for read-heavy, slowly-changing data

When to Use LOOKUP JOIN

LOOKUP JOIN works well when:

  • Small lookup tables: Reference data well under 2B rows
  • Single-field joins: Simple equality matching sufficient
  • Elasticsearch-only: No need to join external databases
  • Moderate scale: Memory constraints aren’t limiting
  • Static references: Lookup data doesn’t change frequently

Good use case example:

-- Enrich log events with error code descriptions
FROM application_logs
| WHERE level = 'ERROR'
| LOOKUP JOIN error_codes ON error_code
| STATS count = COUNT(*) BY error_description

Here, error_codes is small, static, and single-field joinable.

The Bottom Line

ES|QL LOOKUP JOIN is a welcome addition to Elasticsearch, enabling enrichment scenarios that previously required ENRICH policies or external processing.

But it’s not a general-purpose join solution:

  • 2 billion row limit on lookup indices
  • Single-field equality only
  • No cross-cluster with aggregations
  • Memory constraints can cause failures
  • Elasticsearch-only—no external database joins

For complex analytical joins, multi-source queries, or high-scale enrichment, evaluate purpose-built analytics platforms that handle joins without these constraints.

Frequently Asked Questions

What is ES|QL LOOKUP JOIN in Elasticsearch?

ES|QL LOOKUP JOIN is a feature released in late 2024 that lets you enrich query results with data from a second Elasticsearch index, similar to a SQL LEFT JOIN. You specify a main query and a lookup index, and LOOKUP JOIN brings in additional fields based on a matching key. It works only within Elasticsearch, not across external databases.

What is the maximum size for a LOOKUP JOIN index?

Lookup indices are limited to a single shard with a maximum of 2 billion documents (a Lucene per-shard limitation). There’s no way to horizontally scale a lookup index across multiple shards. If your reference data exceeds this limit, you need an alternative approach like data federation or denormalization.

Can ES|QL LOOKUP JOIN join on multiple fields?

No. LOOKUP JOIN only supports joining on a single field with exact equality matching. Composite key joins (joining on two or more fields simultaneously), range joins, and fuzzy matching are not supported. If you need to join on a composite key, you’d need to create a combined field in both indices as a workaround.

Does ES|QL LOOKUP JOIN work across clusters?

Cross-cluster LOOKUP JOIN is supported starting in Elasticsearch 9.2.0, but with additional restrictions. You cannot use cross-cluster LOOKUP JOIN after aggregations (STATS), SORT, LIMIT commands, or coordinator-side ENRICH commands. This limits query flexibility in multi-cluster architectures.

Can LOOKUP JOIN replace ETL for enriching Elasticsearch data?

For simple, small-scale enrichment within Elasticsearch, yes. But LOOKUP JOIN has constraints that make it unsuitable as a general ETL replacement: the 2B row limit on lookup indices, single-field joins only, equality matching only, memory constraints that can cause circuit breaker failures, and no ability to join with external databases like MySQL or PostgreSQL.

Why does LOOKUP JOIN not preserve row ordering?

By design, LOOKUP JOIN may return rows in any order regardless of preceding SORT commands. If you need ordered results, you must place SORT after the LOOKUP JOIN, not before. This is a documented behavior that can produce unexpected results when adding LOOKUP JOIN to existing ES|QL queries.

Next Steps

Need joins beyond LOOKUP JOIN’s capabilities?

Request a Demo Meet pur team to see live how you can go beyond the limitations of Elasticsearch joins.


Share This Post

Share on facebook
Share on linkedin
Share on twitter
Share on email
About the Author:

RELATED POSTS