Elasticsearch Aggregations Guide: Analytics and Reporting Queries

TL;DR 

Aggregations turn Elasticsearch from a search engine into a real-time analytics engine. 

  • Use bucket aggs to group data
  • metric aggs to compute numbers, 
  • pipeline aggs to transform results (moving averages, derivatives), and 
  • matrix aggs for correlations. 

Combine with Kibana or a BI layer (like Knowi) to build dashboards, alerts, and embedded analytics.

In this guide you’ll learn:

  1. What aggregations are and how they compare to SQL.
  2. The main aggregation types – bucket, metric, pipeline, and matrix.
  3. Real reporting queries and analytics examples.
  4. Best practices for performance and memory optimization.
  5. How to visualize data using Kibana or third-party dashboards like Knowi.
  6. Real-world use cases and FAQs.

Table of Contents

  1. What is Elasticsearch?
  2. What Are Aggregations in Elasticsearch?
    1. Why Are Aggregations Essential for Analytics and Reporting?
    2. Common Use Cases for Elasticsearch Aggregations
    3. Dashboards
    4. Trend Analysis
    5. Anomaly Detection
    6. Funnel Analysis
    7. Log Analytics
    8. The Four Types of Aggregations
      1. Bucket Aggregations
        1. Terms Aggregation
        2. Range Aggregation
        3. Date Histogram
        4. Filters Aggregation
        5. Composite Aggregation
        6. RECAP: Quick Summary
      2. Metric Aggregations: “What numbers do we want?”
        1. Basic Metric Aggregations
        2. Cardinality (Distinct Counts)
        3. Extended Stats
        4. Percentiles
        5. Percentile Ranks
      3. Pipeline Aggregations
        1. Moving Average (moving_fn / moving_avg)
        2. Derivative
        3. Cumulative Sum
        4. Bucket Script
        5. Bucket Sort
      4. Matrix Aggregations
        1. Matrix Stats Aggregation
        2. When to Use Matrix Aggregations
      5. Summary of Aggregation Types So Far
    9. Common Reporting Queries
      1. Top N Values – “Top Products, Users, or Regions”
      2. Trend Over Time – “How Does This Metric Change Each Week or Month?”
        1. Adding Trend Metrics
      3. Funnel or Sequential Analysis
        1. Simple Funnel Example – Using Filters Aggregation
        2. Calculating Conversion Rates
        3. Common Uses
        4. Key takeaway:
      4. Time Series Anomaly Detection
        1. Pattern 1 – Compare to a Moving Average
        2. Pattern 2 – Rate of Change (Derivative)
        3. Pattern 3 – Standard Deviation Bands
        4. Practical Tips
    10. Best Practices for Using Aggregations
      1. High cardinality hurts memory
      2. Filters vs post_filter
      3. terms tuning
      4. Error visibility
      5. Cache & filter context
      6. Reduce buckets first
      7. Mind global limits
      8. Mapping hygiene
      9. Tiny checklist before you run a big agg
    11. Limitations of Elasticsearch for Analytics
      1. 1. No Joins Across Indices
      2. 2. Complexity in Multi-Dimensional Grouping
      3. 3. Limited Visualization & Reporting
      4. Quick Recap
    12. How to Build Dashboards on Elasticsearch
      1. Option 1: Kibana (Official Visualization Tool)
        1. What it is:
        2. Key Features:
        3. How it works:
        4. Example Use Cases:
        5. Pros:
        6. Cons:
      2. Option 2: Custom Dashboards Using Elasticsearch APIs
        1. What it is:
        2. Example Workflow:
        3. Pros:
        4. Cons:
      3. Option 3: Third-Party Analytics Platforms
        1. What they do:
        2. Examples:
        3. Pros:
        4. Cons:
      4. Which Option to Choose?
    13. Real-World Use Cases of Elasticsearch Aggregations
      1. 1. Log Analytics: Errors Grouped by Type or Service
      2. 2. E-commerce: Sales by Region / Product / Time
      3. 3. IoT Analytics: Sensor Summaries & Anomalies
      4. 4. SaaS Product Analytics: Feature Usage Trends
      5. Summary of Use Cases
    14. What aggregations unlock:
    15. Where ES fits in the analytics stack:
    16. When to reach for something else:
    17. Elasticsearch handles search. Knowi handles analytics.
    18. Related Resources
    19. Frequently Asked Questions
      1. What’s the difference between bucket and metric aggregations?
      2. Can Elasticsearch aggregations replace SQL analytics?
      3. How do I aggregate across multiple indices?
      4. How do I speed up slow aggregations?
      5. How do I get “Top N” accurately?
      6. What’s the best way to do anomaly detection?

What is Elasticsearch?

Elasticsearch is a search and analytics engine – it’s often used to search through large amounts of text (like logs, documents, or records), but it’s also very powerful for analytics because it can summarize data quickly.

For example:
Imagine you have millions of sales records.
You could use Elasticsearch to answer:

  • “What’s the average sale amount per region?”
  • “How many sales happened this week?”
  • “Which products have the highest revenue?”

These kinds of summary questions are done using something called Aggregations.

What Are Aggregations in Elasticsearch?

Aggregations are Elasticsearch’s analytics engine – they let you group, filter, and calculate metrics on data stored in your index.

Aggregations are Elasticsearch’s version of “GROUP BY” and “SUM/AVG/COUNT” in SQL.

They let you:

  • Group data (like by country or category)
  • Calculate numbers (like totals, averages, max, min)
  • Build charts (like histograms, date histograms, pie charts, etc.)

In simple terms:

Search = “Find matching documents.”

Aggregation = “Summarize data across those documents.”

Why Are Aggregations Essential for Analytics and Reporting?

Because they let you:

  • Analyze large datasets without exporting to SQL or BI tools.
  • Build dashboards that auto-update as data changes.
  • Perform real-time analytics on logs, metrics, or business data.
  • Do multi-level grouping (e.g., sales by country → by product → by month).

They’re especially powerful when speed matters – Elasticsearch can aggregate across millions of documents in milliseconds.
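
For example, the multi-level grouping above can be written as nested aggregations. A minimal sketch, assuming an index named sales with fields country.keyword, product.keyword, order_date, and revenue:

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "by_country": {
      "terms": { "field": "country.keyword" },
      "aggs": {
        "by_product": {
          "terms": { "field": "product.keyword" },
          "aggs": {
            "by_month": {
              "date_histogram": { "field": "order_date", "calendar_interval": "month" },
              "aggs": {
                "monthly_revenue": { "sum": { "field": "revenue" } }
              }
            }
          }
        }
      }
    }
  }
}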

Common Use Cases for Elasticsearch Aggregations

Dashboards

Goal: Monitor business or operational KPIs in real time.

Example:
An e-commerce company stores each order as a document:

{ "region": "US", "category": "Electronics", "revenue": 200, "status": "Delivered" }

Using terms and sum aggregations, they can track:

  • Total revenue by region
  • Average order value by category
  • Count of pending vs delivered orders

These results feed directly into a dashboard (like Kibana or Knowi) that updates instantly as new orders come in.
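
As a rough sketch, one request could feed several of these dashboard tiles at once (assuming the orders live in an index named orders with the fields shown above):

GET /orders/_search
{
  "size": 0,
  "aggs": {
    "revenue_by_region": {
      "terms": { "field": "region.keyword" },
      "aggs": { "total_revenue": { "sum": { "field": "revenue" } } }
    },
    "avg_order_value_by_category": {
      "terms": { "field": "category.keyword" },
      "aggs": { "avg_order_value": { "avg": { "field": "revenue" } } }
    },
    "orders_by_status": {
      "terms": { "field": "status.keyword" }
    }
  }
}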

Trend Analysis

Goal: Understand how something changes over time.

Example:
A SaaS company wants to see user signups per week.

They use a date_histogram on the signup_date field to group users by week; each bucket’s doc_count (or a value_count sub-aggregation) gives the weekly total.

This reveals seasonal spikes, growth trends, or declines.

You can also combine it with avg session time or revenue metrics to correlate trends.

Anomaly Detection

Goal: Spot unexpected changes – spikes, drops, or unusual behavior.

Example:
A monitoring system tracks server errors per minute.

It uses:

  • date_histogram → group logs by minute
  • count → total errors per minute
  • moving_avg → rolling average over time
  • derivative → rate of change between buckets

If today’s error count suddenly doubles compared to the moving average, the system flags an anomaly.

Funnel Analysis

Goal: Measure how users progress through sequential stages.

Example:
An online store tracks users through stages:

  1. Visited homepage
  2. Added item to cart
  3. Completed purchase

Using filters or composite aggregations, Elasticsearch counts how many users reached each stage.

By comparing stage-to-stage percentages, you can identify where users drop off (e.g., only 30% complete checkout).

Log Analytics

Goal: Make sense of application or system logs.

Example:
Each log record includes:

{ "status": 500, "service": "checkout", "response_time": 120 }

With terms and avg aggregations, you can answer:

  • Which services throw the most 5xx errors?
  • What’s the average response time per service?
  • How many errors occurred per hour?

This helps teams troubleshoot, spot slow APIs, or detect outages – all inside Elasticsearch, no external SQL or ETL needed.
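
A minimal sketch of the first two questions in a single query, assuming the logs live in an index named logs with the fields shown above:

GET /logs/_search
{
  "size": 0,
  "aggs": {
    "by_service": {
      "terms": { "field": "service.keyword" },
      "aggs": {
        "avg_response_time": { "avg": { "field": "response_time" } },
        "errors_5xx": { "filter": { "range": { "status": { "gte": 500 } } } }
      }
    }
  }
}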

The Four Types of Aggregations

There are four main families of aggregations in Elasticsearch.

Bucket Aggregations

Think of these as “grouping” operations.

They divide your documents into sets (buckets) based on shared criteria. Each bucket can then have its own sub-aggregations (metrics, filters, etc.).

Let’s say you have documents like this:

{ "category": "electronics", "price": 100 }
{ "category": "books", "price": 20 }
{ "category": "electronics", "price": 200 }

You want to know: What’s the average price per category?

You’d use a terms aggregation on “category”, and an avg aggregation on “price”.

Here’s the Elasticsearch query for: “average price per category.”

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "avg_price_by_category": {
      "terms": {
        "field": "category.keyword"
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}
  1. "size": 0
    Means: don’t return the raw documents, only the aggregated results.
    (We’re doing analytics, not search results.)
  2. "aggs": { … }
    This section defines your aggregations.
  3. "avg_price_by_category": { "terms": { "field": "category.keyword" } }
    Creates buckets for each unique category (like “electronics”, “books”, etc.).
    • "terms" = bucket type
    • "field": "category.keyword" = the field you’re grouping by
  4. Inside that, another "aggs":
    "average_price": { "avg": { "field": "price" } }
    This calculates the average of price within each category bucket.

The results might look something like:

"aggregations": {
  "avg_price_by_category": {
    "buckets": [
      { "key": "electronics", "doc_count": 2, "average_price": { "value": 150 } },
      { "key": "books", "doc_count": 1, "average_price": { "value": 20 } }
    ]
  }
}

That means:

  • Electronics → avg price = 150
  • Books → avg price = 20

Terms Aggregation

What it does: Groups by unique field values (like SQL GROUP BY).
Example use: Revenue by region or category.

"aggs": {
  "revenue_by_region": {
    "terms": { "field": "region.keyword" },
    "aggs": {
      "total_revenue": { "sum": { "field": "revenue" } }
    }
  }
}

Range Aggregation

What it does: Groups numeric or date values into defined ranges.
Example use: Price bands or age groups.

"aggs": {
  "price_ranges": {
    "range": {
      "field": "price",
      "ranges": [
        { "to": 100 },
        { "from": 100, "to": 500 },
        { "from": 500 }
      ]
    },
    "aggs": {
      "avg_price": { "avg": { "field": "price" } }
    }
  }
}

Output might show:

  • < 100 USD → 200 items
  • 100 – 500 USD → 350 items
  • > 500 USD → 50 items

Date Histogram

What it does: Buckets by time intervals – day, week, month, etc.
Example use: Orders per week, log counts per hour.

"aggs": {
  "orders_per_week": {
    "date_histogram": {
      "field": "order_date",
      "calendar_interval": "week"
    },
    "aggs": {
      "revenue": { "sum": { "field": "price" } }
    }
  }
}

This is useful for trend analysis and time-series charts.

Filters Aggregation

What it does: Creates fixed buckets based on specific conditions.
Example use: Compare different regions or product categories in one query.

"aggs": {
  "region_comparison": {
    "filters": {
      "filters": {
        "US":  { "term": { "region.keyword": "US" } },
        "EU":  { "term": { "region.keyword": "EU" } },
        "APAC":{ "term": { "region.keyword": "APAC" } }
      }
    },
    "aggs": { "revenue": { "sum": { "field": "revenue" } } }
  }
}

Composite Aggregation

What it does: Combines multiple fields for multi-level grouping and supports pagination (fetching results in batches).
Example use: Revenue per (country, month) pair when data is huge.

"aggs": {
  "sales_by_country_month": {
    "composite": {
      "sources": [
        { "country": { "terms": { "field": "country.keyword" } } },
        { "month":   { "date_histogram": { "field": "order_date", "calendar_interval": "month" } } }
      ]
    },
    "aggs": {
      "total_revenue": { "sum": { "field": "revenue" } }
    }
  }
}

Composite aggregations are great for scrolling through large result sets without overloading memory.
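
To page through the buckets, read after_key from each response and pass it back as after in the next request. A minimal sketch of a follow-up page (the after values are placeholders copied from the previous response’s after_key):

"aggs": {
  "sales_by_country_month": {
    "composite": {
      "size": 100,
      "sources": [
        { "country": { "terms": { "field": "country.keyword" } } },
        { "month":   { "date_histogram": { "field": "order_date", "calendar_interval": "month" } } }
      ],
      "after": { "country": "US", "month": 1735689600000 }
    },
    "aggs": {
      "total_revenue": { "sum": { "field": "revenue" } }
    }
  }
}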

RECAP: Quick Summary

Aggregation Type | Purpose / What It Groups By | Typical Use Cases | Example Output
Terms | Groups by unique field values | Top categories, users, countries | “US – 500 orders”, “UK – 200 orders”
Range | Groups numeric fields into custom ranges | Price ranges, age groups, salary bands | “0–100: 200 items”, “100–500: 350 items”
Date Histogram | Groups by time intervals | Weekly sales, monthly signups, hourly logs | “Jan: 1000 orders”, “Feb: 1500 orders”
Filters | Groups documents into predefined filters | Compare KPIs by region, plan, or environment | “US: $20k revenue”, “EU: $18k revenue”
Composite | Multi-field grouping, supports pagination | Large datasets (e.g., region + month + category) | “US-Jan: $10k”, “UK-Feb: $5k”

Metric Aggregations: “What numbers do we want?”

These aggregations calculate numeric values – sums, averages, unique counts, percentiles, and more.

They’re the building blocks for KPIs, dashboards, and summaries.

Basic Metric Aggregations 

Metric Type | What It Does | Example
avg | Calculates average | Average product price
sum | Adds up values | Total revenue
min / max | Finds smallest or largest | Lowest or highest order value
value_count | Counts documents with a value in a field | Count of orders that have a price
stats / extended_stats | Gives multiple metrics (min, max, avg, sum, count, std deviation) | Full price summary

Example 1: Average Price

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "average_price": {
      "avg": { "field": "price" }
    }
  }
}

Output:

"aggregations": {
  "average_price": { "value": 150.0 }
}

Example 2: Total Revenue

GET /sales/_search
{
  "size": 0,
  "aggs": {
    "total_revenue": {
      "sum": { "field": "price" }
    }
  }
}

If you had 3 sales of $10, $20, $30 → the total is $60.

So, metrics tell you “how much”, while buckets tell you “per what” (like per region, per month, per category).

Cardinality (Distinct Counts)

What it does: Counts unique values in a field – like COUNT(DISTINCT user_id) in SQL.

Use case: Count unique users, customers, or sessions.

"aggs": {
  "unique_users": {
    "cardinality": { "field": "user_id" }
  }
}

Example result:

"unique_users": { "value": 10423 }

⚠️ Note: It’s an approximate count using a hashing algorithm (HyperLogLog++). It’s fast and memory-efficient but may differ by 1–2%.

Extended Stats

What it does: Returns extra statistical values beyond the basic stats – like variance, standard deviation, and sum of squares.

Use case: Useful for detecting spread or consistency in data (e.g., revenue variance across stores).

"aggs": {
  "price_distribution": {
    "extended_stats": { "field": "price" }
  }
}

Output includes:

{
  "count": 1000,
  "min": 10,
  "max": 950,
  "avg": 420,
  "sum": 420000,
  "std_deviation": 85,
  "variance": 7225
}

Percentiles

What it does: Shows values at given percentiles (e.g., 50th, 90th, 99th). Tells you thresholds – “90% of requests took less than X ms.”

Use case: Performance monitoring, response time analytics.

"aggs": {
  "response_time_percentiles": {
    "percentiles": {
      "field": "response_time",
      "percents": [50, 90, 99]
    }
  }
}

Example Output:

"values": {
  "50.0": 120,
  "90.0": 250,
  "99.0": 900
}

Percentile Ranks

What it does: Opposite of Percentiles – given a value, it tells what percent of data falls below it.

Use case: “What percent of requests complete in under 200 ms?”

"aggs": {
  "fast_requests": {
    "percentile_ranks": {
      "field": "response_time",
      "values": [200]
    }
  }
}

Output:

"values": { "200.0": 85.7 }

Means: 85.7% of requests finish within 200ms.

To Summarize: 

Metric aggregations = your numerical summaries.

  • Use cardinality for unique counts.
  • Use percentiles for performance analysis.
  • Use extended_stats for deeper statistical modeling.

Pipeline Aggregations

Pipeline aggregations take the output of another aggregation and perform calculations on those results. They are a bit like Excel formulas applied to an existing chart.

So instead of working directly on documents, they work on aggregated data.

Moving Average (moving_fn / moving_avg)

What it does: Smooths out fluctuations in time-series data.
Use case: Detect trends in sales, traffic, or errors over time.

"aggs": {
  "sales_over_time": {
    "date_histogram": { "field": "order_date", "calendar_interval": "day" },
    "aggs": {
      "daily_revenue": { "sum": { "field": "price" } },
      "moving_avg_revenue": {
        "moving_fn": {
          "buckets_path": "daily_revenue",
          "window": 7,
          "script": "MovingFunctions.unweightedAvg(values)"
        }
      }
    }
  }
}

Result: A 7-day rolling average that smooths daily sales spikes.

Derivative

What it does: Calculates the rate of change between buckets.
Use case: Find growth rates or sudden jumps in metrics.

"aggs": {
  "users_over_time": {
    "date_histogram": { "field": "signup_date", "calendar_interval": "week" },
    "aggs": {
      "weekly_signups": { "value_count": { "field": "user_id" } },
      "growth_rate": {
        "derivative": { "buckets_path": "weekly_signups" }
      }
    }
  }
}

Result: Tells you how signups increased or decreased week-to-week.

Cumulative Sum

What it does: Adds up values across buckets to create a running total.
Use case: Show total sales or revenue accumulated over time.

"aggs": {
  "revenue_over_time": {
    "date_histogram": { "field": "order_date", "calendar_interval": "month" },
    "aggs": {
      "monthly_sales": { "sum": { "field": "revenue" } },
      "total_revenue": {
        "cumulative_sum": { "buckets_path": "monthly_sales" }
      }
    }
  }
}

Result: “Month 1: $5K, Month 2: $11K, Month 3: $18K…” (running total)

Bucket Script

What it does: Lets you do math between multiple aggregations.
Use case: Calculate ratios like conversion rate, profit margin, or error rate.

"aggs": {
  "orders": {
    "date_histogram": { "field": "order_date", "calendar_interval": "month" },
    "aggs": {
      "total_orders": { "value_count": { "field": "order_id" } },
      "completed_orders": {
        "filter": { "term": { "status": "completed" } }
      },
      "completion_rate": {
        "bucket_script": {
          "buckets_path": {
            "completed": "completed_orders>_count",
            "total": "total_orders"
          },
          "script": "params.completed / params.total * 100"
        }
      }
    }
  }
}

Result: % of orders completed each month.

Bucket Sort

What it does: Sorts or filters bucket results after they’ve been calculated.
Use case: Get the Top N months, regions, or users after applying metrics.

"aggs": {
  "region_revenue": {
    "terms": { "field": "region.keyword" },
    "aggs": {
      "total_sales": { "sum": { "field": "revenue" } },
      "top_regions": {
        "bucket_sort": {
          "sort": [{ "total_sales": { "order": "desc" } }],
          "size": 5
        }
      }
    }
  }
}

Result: Top 5 regions by total revenue.

To summarize

Pipeline Type | Purpose | Example
moving_fn / moving_avg | Smooth time series | Rolling 7-day sales
derivative | Calculate change | Weekly growth
cumulative_sum | Running total | Total revenue to date
bucket_script | Math between metrics | Conversion rates
bucket_sort | Sort or filter results | Top 5 by revenue

Matrix Aggregations

Matrix aggregations are used for statistical or correlation analysis when you have multiple numeric fields and want to understand how they relate to each other.

Unlike the other types (which group or summarize data), matrix aggregations compute relationships between fields – kind of like what you’d do in data science or ML preprocessing.

Matrix Stats Aggregation

What it does:
Calculates a statistical matrix including:

  • count
  • mean
  • variance
  • skewness
  • kurtosis
  • covariance and correlation between fields

Use case:
You want to see if two metrics are related – e.g., are higher ad spends correlated with higher conversions?

Example:

"aggs": {
  "campaign_stats": {
    "matrix_stats": {
      "fields": ["ad_spend", "conversions", "clicks"]
    }
  }
}

Example Output:

"aggregations": {
  "campaign_stats": {
    "fields": [
      { "name": "ad_spend", "count": 100, "mean": 2500, "variance": 50000 },
      { "name": "conversions", "count": 100, "mean": 60, "variance": 300 },
      { "name": "clicks", "count": 100, "mean": 1200, "variance": 20000 }
    ],
    "correlations": {
      "ad_spend": { "conversions": 0.87, "clicks": 0.92 }
    }
  }
}

Here, a correlation of 0.87 between ad_spend and conversions means they’re strongly positively related – when one goes up, so does the other.

When to Use Matrix Aggregations

  • Data science prep: To understand feature relationships before modeling.
  • Finance: To find correlations between stock prices or revenue streams.
  • Marketing analytics: To link spend, reach, clicks, and conversions.

But they’re not typically used for dashboards or business KPIs – they’re more for statistical analysis.

Summary of Aggregation Types So Far

Type | Example Use | Works On
Bucket | Group by category, date, range | Documents
Metric | Calculate totals, averages, unique counts | Document fields
Pipeline | Do math on other aggregations | Aggregated results
Matrix | Correlate multiple numeric fields | Statistical analysis

Common Reporting Queries

We’ll now see how everything you’ve learned about aggregations gets applied to real analytics questions.

We’ll start with Top N Values (one of the most common patterns for dashboards).

Top N Values – “Top Products, Users, or Regions”

Goal: Find which items contribute the most – top-selling products, busiest users, most-visited pages, etc.

Example Query: Top 5 products by total sales

GET /sales/_search
{ 
  "size": 0,
  "aggs": {
    "top_products": {
      "terms": {
        "field": "product.keyword",
        "order": { "total_sales": "desc" },
        "size": 5
      },
      "aggs": {
        "total_sales": { "sum": { "field": "price" } }
      }
    }
  }
}

Result:

"buckets": [
  { "key": "Laptop", "total_sales": { "value": 90000 } },
  { "key": "Phone",  "total_sales": { "value": 82000 } },
  ...
]

Use this pattern for “Top 10 customers by revenue,” “Top regions by traffic,” etc.

Trend Over Time – “How Does This Metric Change Each Week or Month?”

Goal: See how a number (like sales, users, or errors) changes across time periods – daily, weekly, monthly, etc.

This is where the date_histogram bucket really shines.

Example: Weekly Signups Over Time

GET /users/_search
{
  "size": 0,
  "aggs": {
    "signups_over_time": {
      "date_histogram": {
        "field": "signup_date",
        "calendar_interval": "week"
      },
      "aggs": {
        "weekly_signups": {
          "value_count": { "field": "user_id" }
        }
      }
    }
  }
}

What happens:

  1. date_histogram groups users by week.
  2. value_count counts how many users signed up in each week.

Sample Output:

"buckets": [
  { "key_as_string": "2025-09-01", "doc_count": 240 },
  { "key_as_string": "2025-09-08", "doc_count": 310 },
  { "key_as_string": "2025-09-15", "doc_count": 450 }
]

Perfect for plotting line charts in Kibana or Knowi.

Adding Trend Metrics

You can easily combine this with a derivative or moving average to show growth rate or smoothed trends.

Example: Weekly Growth Rate

"aggs": {
  "signups_over_time": {
    "date_histogram": { "field": "signup_date", "calendar_interval": "week" },
    "aggs": {
      "weekly_signups": { "value_count": { "field": "user_id" } },
      "growth_rate": {
        "derivative": { "buckets_path": "weekly_signups" }
      }
    }
  }
}

This shows how much signups increased or decreased compared to the previous week.

In short:

  • Use date_histogram for grouping by time.
  • Use sum / count / avg to measure something per interval.
  • Optionally use derivative / moving_avg for growth and trends.

Funnel or Sequential Analysis

Goal: Track how users progress through stages of a journey – like a marketing funnel or signup process.

Example:

Visitors → Signups → Purchases

Each stage filters the data differently, and you compare how many users reach each.

Simple Funnel Example – Using Filters Aggregation

Let’s say your index stores user actions like:

{ "user_id": 1, "event": "visit" }
{ "user_id": 1, "event": "add_to_cart" }
{ "user_id": 1, "event": "purchase" }

You can count how many users did each action:

GET /events/_search
{
  "size": 0,
  "aggs": {
    "funnel_stages": {
      "filters": {
        "filters": {
          "Visited": { "term": { "event.keyword": "visit" } },
          "Added_to_Cart": { "term": { "event.keyword": "add_to_cart" } },
          "Purchased": { "term": { "event.keyword": "purchase" } }
        }
      },
      "aggs": {
        "unique_users": { "cardinality": { "field": "user_id" } }
      }
    }
  }
}

Result:

"buckets": {
  "Visited":        { "unique_users": { "value": 5000 } },
  "Added_to_Cart":  { "unique_users": { "value": 2000 } },
  "Purchased":      { "unique_users": { "value": 700 } }
}

This gives you stage-by-stage counts.

Calculating Conversion Rates

You can use a bucket_script to compute the ratio. One caveat: bucket_script is a parent pipeline aggregation, so it has to sit next to funnel_stages inside a multi-bucket parent (for example, a date_histogram over time); for a single overall number it’s often simplest to divide the two bucket counts client-side. The buckets_path can point at the keyed filter buckets directly:

"aggs": {
  "conversion_rate": {
    "bucket_script": {
      "buckets_path": {
        "purchased": "funnel_stages['Purchased']>unique_users",
        "visited": "funnel_stages['Visited']>unique_users"
      },
      "script": "params.purchased / params.visited * 100"
    }
  }
}

That’ll give you something like: “14% of visitors completed a purchase.”

Common Uses

  • Marketing funnels: ad clicks → signups → paid subscriptions
  • Product usage funnels: trial started → dashboard viewed → report shared
  • Sales funnels: leads → opportunities → closed deals

Key takeaway:

  • Use filters for stage-based counts.
  • Combine with cardinality for unique users.
  • Optionally add bucket_script for conversion %.

Time Series Anomaly Detection

Goal: Flag unusual spikes or dips in a metric over time (errors, latency, signups, sales).

Pattern 1 – Compare to a Moving Average

  1. Bucket by time
  2. Compute the metric per bucket
  3. Smooth it with a moving window
  4. Alert when actual deviates a lot from the smoothed value

GET /logs/_search
{
  "size": 0,
  "aggs": {
    "per_min": {
      "date_histogram": { "field": "ts", "fixed_interval": "1m" },
      "aggs": {
        "err_count": { "filter": { "range": { "status": { "gte": 500 } } } },
        "count": { "value_count": { "field": "status" } },
        "moving_avg": {
          "moving_fn": {
            "buckets_path": "count",
            "window": 15,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        },
        "delta": {
          "bucket_script": {
            "buckets_path": { "c": "count", "m": "moving_avg" },
            "script": "params.c - params.m"
          }
        },
        "spike": {
          "bucket_selector": {
            "buckets_path": { "d": "delta", "m": "moving_avg" },
            "script": "params.m > 0 && params.d > 3 * Math.sqrt(params.m)"
          }
        }
      }
    }
  }
}

What this does

  • date_histogram: 1-minute buckets
  • err_count: 5xx responses per minute (not used by the selector below, but handy if you’d rather alert on errors than on total events)
  • count: events per minute
  • moving_fn: rolling average (window=15)
  • delta: how far current is from the rolling average
  • bucket_selector: keeps only points where delta is > 3×√mean (a quick Poisson-ish spike check)

Result: Only “spiky” minutes remain in per_min.buckets.

Pattern 2 – Rate of Change (Derivative)

Detect sudden changes vs. the last bucket.

{
  "aggs": {
    "per_min": {
      "date_histogram": { "field": "ts", "fixed_interval": "1m" },
      "aggs": {
        "count": { "value_count": { "field": "id" } },
        "change": { "derivative": { "buckets_path": "count" } },
        "big_jump": {
          "bucket_selector": {
            "buckets_path": { "chg": "change" },
            "script": "params.chg != null && params.chg > 100"
          }
        }
      }
    }
  }
}

Use when: Big jumps matter more than absolute values.

Pattern 3 – Standard Deviation Bands

Compare to mean ± N×std dev over a moving window.

{
  "aggs": {
    "per_5m": {
      "date_histogram": { "field": "ts", "fixed_interval": "5m" },
      "aggs": {
        "latency_p50": { "percentiles": { "field": "latency_ms", "percents": [50] } },
        "roll": {
          "moving_fn": {
            "buckets_path": "latency_p50[50.0]",
            "window": 12,
            "script": "MovingFunctions.unweightedAvg(values)"
          }
        },
        "diff": {
          "bucket_script": {
            "buckets_path": {
              "p50": "latency_p50[50.0]",
              "avg": "roll"
            },
            "script": "params.p50 - params.avg"
          }
        },
        "outlier": {
          "bucket_selector": {
            "buckets_path": { "d": "diff" },
            "script": "Math.abs(params.d) > 75"
          }
        }
      }
    }
  }
}

Use when: You monitor a latency percentile (p50/p95) and want to flag big deviations. Tune the threshold (75 ms here) to what’s normal for your data.

Practical Tips

  • Pick the right interval: too small = noisy; too big = slow to react.
  • Window size: 7–20 buckets often works; tune per data’s seasonality.
  • Guardrails: add a min doc count per bucket to avoid false positives from sparse data.
  • Post-filtering: bucket_selector is your friend for alert-style queries.
  • Visualize: plot actual vs. moving average and highlight kept buckets.

Best Practices for Using Aggregations

High cardinality hurts memory

  • Prefer lower-cardinality group-bys; for uniques use cardinality with a sensible precision_threshold.
  • Don’t aggregate on text; use .keyword.
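
For example, a small sketch of tuning that trade-off on a unique-user count (the 10000 below is illustrative; higher values are more accurate but use more memory):

"aggs": {
  "unique_users": {
    "cardinality": {
      "field": "user_id",
      "precision_threshold": 10000
    }
  }
}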

Filters vs post_filter

  • Filter/query affects hits and aggs.
  • post_filter narrows hits only (keep facet counts stable).
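
A minimal sketch of the difference: the bool.filter below narrows both hits and aggregations, while post_filter trims only the returned hits, so the category counts stay stable (index and field names are illustrative):

GET /sales/_search
{
  "query": {
    "bool": {
      "filter": [
        { "range": { "order_date": { "gte": "now-30d/d" } } }
      ]
    }
  },
  "aggs": {
    "all_categories": { "terms": { "field": "category.keyword" } }
  },
  "post_filter": { "term": { "category.keyword": "electronics" } }
}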

terms tuning

  • size = final top buckets; start with 10–50.
  • Increase shard_size if top terms look wrong.
  • Use min_doc_count to drop noise.
  • For “all buckets”/paging, use composite.
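
Putting those knobs together in one sketch (the values are illustrative starting points, not recommendations):

"aggs": {
  "top_products": {
    "terms": {
      "field": "product.keyword",
      "size": 20,
      "shard_size": 100,
      "min_doc_count": 5
    }
  }
}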

Error visibility

  • Turn on show_term_doc_count_error; inspect doc_count_error_upper_bound.
  • If large, increase shard_size or prefilter.
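
For example, a small sketch of enabling the per-term error estimate on a terms aggregation; the response then reports doc_count_error_upper_bound for each bucket:

"aggs": {
  "top_regions": {
    "terms": {
      "field": "region.keyword",
      "size": 10,
      "show_term_doc_count_error": true
    }
  }
}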

Cache & filter context

  • Put repeatable clauses in bool.filter (cacheable).
  • Request cache works best with size: 0 and steady data.
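
A sketch of a cache-friendly shape: repeatable clauses in bool.filter, a fixed date range (requests that use now generally aren’t cached), and size: 0 (index and field names are illustrative):

GET /logs/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "service.keyword": "checkout" } },
        { "range": { "ts": { "gte": "2025-10-01", "lt": "2025-10-08" } } }
      ]
    }
  },
  "aggs": {
    "errors_per_day": {
      "date_histogram": { "field": "ts", "calendar_interval": "day" }
    }
  }
}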

Reduce buckets first

  • Reasonable date_histogram intervals; avoid 1s unless needed.
  • Add pipelines after you’ve trimmed buckets.

Mind global limits

  • search.max_buckets (~65k) caps total buckets; tighten intervals or switch to composite.

Mapping hygiene

  • Group-by fields as keyword/numeric/date; avoid enabling fielddata on text.

Tiny checklist before you run a big agg

  • Is the group-by field low/medium cardinality?
  • Can I prefilter the time/window?
  • Can I reduce interval / size / sub-aggs?
  • Should I use composite instead of massive terms?
  • Do I need accurate counts, or are approximate OK (cardinality)?
  • Am I in filter context for cacheability?

Limitations of Elasticsearch for Analytics

1. No Joins Across Indices

Elasticsearch isn’t a relational database.
You can’t easily do queries that join data from multiple indices (tables), such as:

“Show total sales per customer where customer info is in another index.”

Why:
Each index is designed for fast, distributed searching – not for relational linking.

Workarounds:

  • Denormalize data: store related fields together during ingestion.
  • Use parent-child or nested fields, though they have limits and higher query cost.
  • External join tools: Some platforms (like Knowi) handle cross-index joins automatically before pushing the query to Elasticsearch.

2. Complexity in Multi-Dimensional Grouping

Elasticsearch can group data easily on one or two fields,
but multi-level “cube-style” aggregations (e.g., region → category → product → month) can explode in size.

Why:
Each level creates buckets, and the total buckets grow exponentially with dimensions – quickly hitting the search.max_buckets limit (≈65,000).

Workarounds:

  • Use composite aggregations to scroll through results in manageable chunks.
  • Pre-aggregate data at ingest time (e.g., daily summaries).
  • Keep dashboards simple: limit to 2–3 grouping dimensions.
  • Consider rollup indices for time-series compression.

3. Limited Visualization & Reporting

Out of the box, Elasticsearch only returns JSON data – no charts, dashboards, or formatted reports.

Why:
It’s an engine, not a visualization tool.

Workarounds:

  • Kibana: Free, native tool from Elastic for visualization and dashboards.
  • Third-party tools: Knowi, Grafana, Tableau (via connectors) provide richer visualization, SQL-like query layers, and alerting.
  • Custom dashboards: Use Elasticsearch APIs + charting libraries (D3, Plotly, etc.) if you want a bespoke setup.

Quick Recap

Limitation | What It Means | Common Fix
No joins | Can’t combine data from multiple indices like SQL joins | Denormalize or use a BI tool that handles joins
Multi-dimensional grouping | Too many buckets = heavy memory & limits | Use composite or rollups
No built-in visualization | Raw JSON output only | Use Kibana or external analytics platforms

How to Build Dashboards on Elasticsearch

Elasticsearch by itself doesn’t render charts – it just provides the data. To visualize it, you need a layer on top. You have three main choices:

Option 1: Kibana (Official Visualization Tool)

What it is:

Kibana is the native visualization tool built by Elastic, designed specifically for Elasticsearch.

Key Features:

  • Real-time dashboards (auto-refreshing)
  • Charts: bar, line, pie, maps, heatmaps, gauges
  • Powerful search bar using Lucene / KQL syntax
  • Alerting and anomaly detection via Watcher / Machine Learning plugin

How it works:

  • You connect Kibana to your Elasticsearch cluster.
  • You use “Visualize Library” or “Lens” to drag and drop fields.
  • You build dashboards by combining multiple visualizations.

Example Use Cases:

  • Log analytics dashboards (ELK Stack)
  • Security monitoring (SIEM)
  • Application performance metrics

Pros:

  • Tight integration with Elasticsearch
  • Real-time updates
  • Open-source and free (basic tier)

Cons:

  • Limited for business KPIs or multi-source data
  • Harder to customize for executive-style dashboards
  • Requires Elastic Stack setup (and some admin overhead)

Option 2: Custom Dashboards Using Elasticsearch APIs

What it is:

You can query Elasticsearch directly through its REST API and render results in your own web app using libraries like Chart.js, Plotly, or D3.js.

Example Workflow:

  1. Send an aggregation query:

POST /sales/_search
{
  "size": 0,
  "aggs": { "sales_by_month": { "date_histogram": { "field": "date", "calendar_interval": "month" } } }
}

  2. Parse the JSON response.
  3. Feed the data into your chart library in your frontend app.

Pros:

  • Fully customizable look and feel.
  • Can integrate with any front-end (React, Angular, Vue).
  • Ideal for embedding analytics inside a product.

Cons:

  • Requires engineering work.
  • No built-in dashboard management, sharing, or alerting.

Option 3: Third-Party Analytics Platforms

What they do:

These platforms connect to Elasticsearch (and other data sources) to provide unified analytics, joins, AI capabilities, and visualization without extra setup.

Examples:

  • Knowi: Purpose-built for Elasticsearch, supports cross-index joins, NLP queries (“Ask Data Anything”), and embedded dashboards.
  • Grafana: Great for time-series metrics and DevOps monitoring.
  • Tableau / Power BI: Possible via connectors, but less efficient with NoSQL or nested data.

Pros:

  • Multi-source integration (Elasticsearch + SQL + APIs + files)
  • Friendly UI for non-technical users
  • Built-in scheduling, alerts, and AI insights
  • Easier to share dashboards across teams

Cons:

  • May require licensing or cloud setup
  • Some abstraction over raw queries (less control)

Which Option to Choose?

Use Case | Best Option
Log or system monitoring | Kibana
Developer-built app or portal | Custom via APIs
Unified business dashboards or embedded analytics | Knowi
Real-time time-series metrics | Grafana

Real-World Use Cases of Elasticsearch Aggregations

1. Log Analytics: Errors Grouped by Type or Service

Goal: Identify which services or error types occur most frequently.

Data Example:

{ "timestamp": "2025-10-08", "service": "checkout", "status": 500, "error_type": "TimeoutError" }

Aggregation Query:

{
  "size": 0,
  "aggs": {
    "errors_by_service": {
      "terms": { "field": "service.keyword" },
      "aggs": {
        "error_types": { "terms": { "field": "error_type.keyword" } },
        "error_count": { "value_count": { "field": "status" } }
      }
    }
  }
}

Insight:

  • Checkout → 300 errors, mostly “TimeoutError”
  • Payments → 250 errors, mostly “AuthFailure”

Business Value: Quickly identify failing services or common error causes.

2. E-commerce: Sales by Region / Product / Time

Goal: Track revenue trends and top-performing products.

Data Example:

{ "order_id": 123, "region": "US", "category": "Electronics", "price": 450, "order_date": "2025-09-15" }

Aggregation Query:

{
  "size": 0,
  "aggs": {
    "sales_by_region": {
      "terms": { "field": "region.keyword" },
      "aggs": {
        "sales_by_month": {
          "date_histogram": { "field": "order_date", "calendar_interval": "month" },
          "aggs": { "total_sales": { "sum": { "field": "price" } } }
        }
      }
    }
  }
}

Insight:

  • US revenue trending up 20% month-over-month.
  • Electronics and Apparel are top-grossing categories.

Business Value: Track KPIs like revenue growth and product performance.

3. IoT Analytics: Sensor Summaries & Anomalies

Goal: Detect abnormal sensor readings and summarize device data.

Data Example:

{ "device_id": "A100", "temperature": 82.5, "timestamp": "2025-10-08T10:00:00Z" }

Aggregation Query:

{
  "size": 0,
  "aggs": {
    "avg_temp_per_device": {
      "terms": { "field": "device_id.keyword" },
      "aggs": {
        "avg_temp": { "avg": { "field": "temperature" } },
        "max_temp": { "max": { "field": "temperature" } }
      }
    }
  }
}

Insight:

  • Device A100 average = 82.5°F
  • Device B200 average = 97.2°F (potential overheating anomaly)

Business Value: Enables proactive maintenance and real-time alerts.

4. SaaS Product Analytics: Feature Usage Trends

Goal: Measure adoption and retention across product features.

Data Example:

{ "user_id": 22, "feature": "dashboard_export", "event": "used", "timestamp": "2025-09-20" }

Aggregation Query:

{
  "size": 0,
  "aggs": {
    "usage_over_time": {
      "date_histogram": { "field": "timestamp", "calendar_interval": "week" },
      "aggs": {
        "features": { "terms": { "field": "feature.keyword" } }
      }
    }
  }
}

Insight:

  • Dashboard exports up 40% in the last month.
  • “Alert setup” feature adoption is declining.

Business Value: Product teams can prioritize improvements or redesign underused features.

Summary of Use Cases

Domain | Key Metrics | Example Aggregations | Insights
Logs | Error count, type | terms, filters, count | Identify failing services
E-commerce | Revenue, orders | terms, date_histogram, sum | Revenue by region & product
IoT | Sensor stats | terms, avg, max, derivative | Detect anomalies
SaaS | Feature usage, retention | terms, date_histogram, cardinality | Track adoption trends

What aggregations unlock:

Elasticsearch aggregations turn raw documents into live analytics – from simple KPIs (sum, avg) to time-series insights (date_histogram + pipelines) and even multi-field stats (matrix). You can answer “how much,” “per what,” and “how it’s changing” without exporting data.

Where ES fits in the analytics stack:

  • Ingest + Store: Elasticsearch (fast, scalable, search + analytics in one).
  • Analyze: Aggregations (bucket + metric + pipeline + matrix).
  • Visualize/Act: Kibana, custom apps, or a BI layer (e.g., Knowi) for joins, sharing, and embedded workflows.

When to reach for something else:

  • Heavy cross-index joins or complex multi-dimensional cubes → consider denormalization, rollups, or a BI layer that handles joins.
  • Pixel-perfect reporting or blended data from many systems → purpose-built BI/analytics tools.

Elasticsearch handles search. Knowi handles analytics.

Visualize any aggregation instantly, run joins across indices, and use AI-powered NLQ to ask your Elasticsearch data anything.

From terms and date_histogram to cross-index joins, alerts, and embedded dashboards, Knowi brings complete analytics on top of Elasticsearch – without ETL or complex setup.

Book a demo here –  https://www.knowi.com/lp/elasticsearch-analytics/.

Related Resources

If you enjoyed this guide, you’ll find these next reads helpful:

  1. Best Tools for Elasticsearch Analytics: Kibana vs Knowi vs Grafana: A detailed comparison of the top visualization and analytics platforms for Elasticsearch.
  2. Kibana Tutorial 2025: Elasticsearch Visualization Made Simple: A hands-on walkthrough of building dashboards in Kibana – from setting up index patterns to using visual builder and Lens for metrics and trends.

Frequently Asked Questions

What’s the difference between bucket and metric aggregations?

  • Bucket aggs group documents (by term, time, range, filters, composite).

Metric aggs compute numbers inside those groups (sum, avg, min/max, percentiles, cardinality).

Can Elasticsearch aggregations replace SQL analytics?

  • For many dashboards and time-series analyses, yes.

For multi-table joins, complex window functions, and ad-hoc SQL across many sources, you’ll likely combine ES with a BI/ETL layer or denormalize data upfront.

How do I aggregate across multiple indices?

  • Use index patterns (e.g., logs-*) when the schema matches.

True joins across unrelated indices aren’t native; use denormalization, parent-child/nested models (with trade-offs), or a BI layer that joins before/after the ES query.
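
For example, when daily log indices share a mapping, a single pattern query can aggregate across all of them (index and field names are illustrative):

GET /logs-*/_search
{
  "size": 0,
  "aggs": {
    "errors_by_service": {
      "terms": { "field": "service.keyword" }
    }
  }
}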

How do I speed up slow aggregations?

  • Narrow the time window or add pre-filters.
  • Prefer keyword/numeric/date fields for group-bys (not text).
  • Tune size/shard_size (or switch to composite for huge group-bys).
  • Use filter context (cacheable), sensible date_histogram intervals, and avoid heavy runtime scripts.

Watch cardinality; use approximate cardinality metrics and pre-aggregate where possible.

How do I get “Top N” accurately?

  • Use a terms agg with order by your metric and reasonable size (e.g., 10–50).
  • If results seem off (skewed shards), increase shard_size or prefilter.

For full lists or stable paging, use composite.

What’s the best way to do anomaly detection?

  • Build a date_histogram, compute your metric per bucket, then add pipeline aggs:
    • moving_fn (rolling average)
    • derivative (rate of change)
    • bucket_selector (flag outliers)
  • Visualize actual vs moving average and alert on deviations.
