Interview Prep

Elite Edge

Cutting-edge Elastic capabilities, capacity planning methodology, and the differentiator: this lab itself as proof of hands-on competency.

ES|QL: Next-Generation Query Language

ES|QL (Elasticsearch Query Language) is a new pipe-based query language that provides a SQL-like experience for Elasticsearch. Unlike Query DSL (which is JSON-based and can be verbose), ES|QL uses a concise pipe syntax inspired by SPL and KQL.

ES|QL examples
// Find critical support tickets with high resolution time
FROM support-tickets
| WHERE severity == "critical" AND resolution_hours > 2
| SORT resolution_hours DESC
| KEEP ticket_id, title, severity, resolution_hours, customer_name

// Aggregate log levels with stats
FROM maclab-logs-*
| STATS count = COUNT(*), avg_size = AVG(message_length) BY log_level
| SORT count DESC

// Time-series analysis
FROM maclab-logs-*
| WHERE @timestamp > NOW() - 24 hours
| EVAL hour = DATE_EXTRACT("hour_of_day", @timestamp)
| STATS error_count = COUNT(*) BY hour
| SORT hour

// Join-like enrichment with ENRICH
FROM support-tickets
| ENRICH customer-info ON customer_name
| WHERE region == "APJ"
| STATS total_tickets = COUNT(*), avg_resolution = AVG(resolution_hours) BY customer_name

Key Advantages Over Query DSL

Readable pipe syntax: data flows left to right through transformations

Computed columns with EVAL: create new fields on-the-fly

No JSON nesting: dramatically simpler for complex queries

Built-in ENRICH: lookup enrichment without runtime fields

Runs on a new compute engine optimized for columnar processing

Vector Search & Semantic Search

Elasticsearch supports approximate k-nearest-neighbor (kNN) search using the HNSW (Hierarchical Navigable Small World) algorithm. This enables semantic search — finding documents by meaning rather than exact keyword match.

Vector search mapping and query
// Create index with dense_vector field
PUT /semantic-tickets
{
  "mappings": {
    "properties": {
      "title": { "type": "text" },
      "title_vector": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      }
    }
  }
}

// kNN search — find semantically similar tickets
POST /semantic-tickets/_search
{
  "knn": {
    "field": "title_vector",
    "query_vector": [0.12, -0.34, ...],  // 384-dim embedding
    "k": 5,
    "num_candidates": 100
  }
}

// Hybrid search — combine kNN with BM25
POST /semantic-tickets/_search
{
  "query": {
    "match": { "title": "cluster health red" }
  },
  "knn": {
    "field": "title_vector",
    "query_vector": [0.12, -0.34, ...],
    "k": 5,
    "num_candidates": 100
  },
  "rank": {
    "rrf": {}  // Reciprocal Rank Fusion to combine scores
  }
}

HNSW Algorithm

Builds a multi-layer graph of vectors. Each layer is a navigable small world graph where nodes connect to nearby neighbors. Search starts at the top layer (sparse, long-range connections) and descends to the bottom layer (dense, precise connections). This gives O(log N) search time.
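The descent described above can be sketched as a toy two-layer graph search. This is a deliberately simplified model, not Lucene's implementation: real HNSW keeps a beam of candidates (which is what `num_candidates` controls in the kNN query above) rather than a single current node, and builds the graph incrementally.

```python
import math

def greedy_search(layer_adj, points, entry, query):
    """Greedy walk within one layer: hop to any neighbor closer to the query."""
    current = entry
    improved = True
    while improved:
        improved = False
        for nb in layer_adj.get(current, []):
            if math.dist(points[nb], query) < math.dist(points[current], query):
                current, improved = nb, True
    return current

def hnsw_search(layers, points, entry, query):
    """Descend layer by layer (top = sparse, bottom = dense), carrying the
    best node found so far down as the next layer's entry point."""
    for layer_adj in layers:  # ordered top -> bottom
        entry = greedy_search(layer_adj, points, entry, query)
    return entry

# Toy 2-layer graph over 2-D points (IDs 0-4)
points = {0: (0, 0), 1: (5, 5), 2: (9, 9), 3: (4, 6), 4: (6, 4)}
top    = {0: [2], 2: [0]}                                 # sparse, long-range links
bottom = {0: [1, 3], 1: [0, 2, 3, 4], 2: [1, 4], 3: [0, 1], 4: [1, 2]}
nearest = hnsw_search([top, bottom], points, entry=0, query=(6.2, 4.1))  # -> 4
```

The top layer makes the long jump from node 0 toward node 2; the bottom layer refines that to the true nearest neighbor, node 4.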

RRF: Reciprocal Rank Fusion

Combines results from different ranking methods (BM25 + kNN) without needing to normalize scores. Each result is scored based on its rank position: score = 1 / (k + rank). Results appearing in both lists get combined scores, naturally surfacing the most relevant documents.
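The fusion formula above is simple enough to sketch directly. This is a minimal standalone version of the math, assuming k = 60 (the value Elasticsearch uses as the default `rank_constant`), not the server's actual implementation:

```python
def rrf_fuse(result_lists, k=60):
    """Reciprocal Rank Fusion: each doc scores sum(1 / (k + rank)) over every
    ranked list it appears in (rank 1 = best). No score normalization needed."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

bm25 = ["t3", "t1", "t7"]   # lexical (BM25) ranking
knn  = ["t1", "t9", "t3"]   # vector (kNN) ranking
fused = rrf_fuse([bm25, knn])  # t1 wins: ranked 2nd and 1st across the lists
```

Note how t1 and t3 appear in both lists and float to the top, while t7 and t9 (each in only one list) sink — exactly the "naturally surfacing" behavior described above.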

Elastic 9.x & Beyond

BBQ (Better Binary Quantization) Vectors

A new vector quantization technique that reduces vector storage by up to 32x while maintaining search quality. Critical for scaling vector search to billions of documents without prohibitive memory costs.
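The 32x figure follows directly from going from 32-bit floats to 1 bit per dimension. The sign-bit quantizer below is a toy illustration of that storage math only — Elastic's actual BBQ scheme is more sophisticated (it corrects for quantization error to preserve ranking quality):

```python
def binary_quantize(vec):
    """Toy 1-bit-per-dimension quantizer: keep only the sign of each component."""
    bits = 0
    for i, x in enumerate(vec):
        if x > 0:
            bits |= 1 << i
    return bits

def hamming_sim(a, b, dims):
    """Fraction of matching bits between two quantized vectors."""
    return 1 - bin(a ^ b).count("1") / dims

dims = 384                     # matches the dense_vector mapping above
float32_bytes = dims * 4       # 1536 bytes per raw vector
binary_bytes = dims // 8       # 48 bytes per quantized vector
reduction = float32_bytes // binary_bytes  # -> 32
```

At a billion 384-dim vectors, that is the difference between ~1.5 TB and ~48 GB of vector data.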

Jina AI Acquisition

Elastic acquired Jina AI's embedding model technology, enabling native embedding generation within Elasticsearch. This removes the need for external embedding services, simplifying the vector search pipeline.

Serverless Elasticsearch

Elastic's serverless offering decouples compute from storage. Customers pay for usage rather than for provisioned nodes, with automatic scaling. The indexing and search tiers scale independently.

Logsdb Index Mode

A new index mode optimized for log data that uses synthetic _source (reconstructed from doc_values) to reduce storage by 50-70%. Trades _source retrieval speed for significant storage savings.
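The synthetic `_source` idea can be modeled in a few lines: instead of storing each document's original JSON blob, rebuild it on read from per-field columns. A toy model of the trade-off, not Elasticsearch's storage format:

```python
# Columnar doc_values for three log documents (toy model)
doc_values = {
    "log_level":      ["INFO", "ERROR", "WARN"],
    "message_length": [42, 128, 87],
}

def synthesize_source(doc_id):
    """Rebuild a document from per-field columns instead of a stored _source.
    Mirrors the real trade-off: no duplicate storage of the original JSON,
    but reads cost a lookup per field and field order is normalized."""
    return {field: column[doc_id] for field, column in doc_values.items()}

synthesize_source(1)  # {'log_level': 'ERROR', 'message_length': 128}
```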

Elastic AI Assistant

GenAI-powered assistant integrated into Kibana for observability and security workflows. Analyzes alerts, suggests root causes, and generates response playbooks using RAG over Elastic documentation.

Capacity Planning Methodology

Shard Sizing Rules

Target shard size: 10-50 GB for time-series data

Maximum recommended: 50 GB per shard (larger shards slow recovery)

Minimum meaningful: 1 GB per shard (smaller creates overhead)

Shards per GB heap: max 20 shards per GB of heap memory

For our cluster: 1 GB heap x 3 nodes x 20 = max 60 shards total
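The rules above reduce to simple arithmetic. A quick sizing helper, with the heap default matching this lab's cluster (1 GB heap x 3 nodes):

```python
import math

def shard_plan(daily_gb, retention_days, target_shard_gb=50, heap_gb_total=3):
    """Size shards toward the 50 GB ceiling, then check the total against the
    ~20-shards-per-GB-heap guideline. Returns (shards, max_shards, fits)."""
    total_gb = daily_gb * retention_days
    shards = max(1, math.ceil(total_gb / target_shard_gb))
    max_shards = heap_gb_total * 20
    return shards, max_shards, shards <= max_shards

plan = shard_plan(daily_gb=10, retention_days=30)  # 300 GB -> (6, 60, True)
```

10 GB/day for 30 days is 300 GB, so six 50 GB shards — comfortably inside this cluster's 60-shard budget.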

JVM Heap Rules

Never exceed 50% of available RAM

Never exceed ~30-31 GB (the compressed OOPs threshold: beyond it the JVM falls back to uncompressed 64-bit object pointers)

Xms and Xmx must be equal (prevent heap resizing)

G1GC is default since ES 7.x — CMS is deprecated

Circuit Breakers

Circuit breaker settings
{
  "indices.breaker.total.limit": "95%",      // Total heap usage limit
  "indices.breaker.fielddata.limit": "40%",   // Fielddata cache limit
  "indices.breaker.request.limit": "60%",     // Per-request limit
  "network.breaker.inflight_requests.limit": "100%"  // In-flight requests
}

// When a circuit breaker trips:
// HTTP 429 — CircuitBreakingException
// "Data too large, data for [<field>] would be [X] bytes,
//  which is larger than the limit of [Y]"
// Fix: reduce query scope, add more nodes, increase heap (if under 50%)
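The trip decision itself is a threshold check. A simplified model of a single breaker (the real implementation tracks a running estimate per breaker plus a parent breaker over real memory), sized here to this lab's 1 GB heap:

```python
def breaker_trips(estimated_bytes, used_bytes, heap_bytes, limit_pct):
    """Model of a circuit breaker: reject (HTTP 429) when the running usage
    plus the new estimate would exceed limit_pct of the heap."""
    limit = heap_bytes * limit_pct / 100
    return used_bytes + estimated_bytes > limit

heap = 1 * 1024**3  # 1 GB heap, as on this lab's nodes
# Loading ~450 MB of fielddata against the 40% fielddata limit (~410 MB):
tripped = breaker_trips(450 * 1024**2, 0, heap, 40)  # -> True, request rejected
```

This also shows why "add more nodes / increase heap" fixes a tripping breaker: the limit is a percentage of heap, so the absolute budget grows with it.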

The Differentiator: This Lab

I didn't just read about Elasticsearch — I built it.

This MacLab Elasticsearch deployment represents more than interview preparation. It's evidence of the approach I bring to every technical challenge: deploy it, break it, fix it, document it.

Cluster: 3 nodes, TLS encrypted
Languages: KR + EN, Nori tokenizer
Drills: 3 (node failure, mapping conflict, snapshot restore)

Deployed a 3-node cluster with TLS inter-node and HTTP encryption

Configured Korean language support with Nori tokenizer (decompound_mode: mixed)

Built ingest pipelines with grok, date, and lowercase processors

Created bilingual support tickets (Korean and English)

Ran failure drills: node failure, mapping conflict, snapshot restore

Validated Korean text search: "클러스터 상태" ("cluster status") returned 3 relevant results

Built this interview prep site as a Next.js 16 app on the same platform

Closing Statement

The Elastic Senior Support Engineer role asks for curiosity, technical depth, native Korean, and the ability to help enterprise customers solve complex distributed systems problems. This lab demonstrates all four: I was curious enough to build a complete cluster from scratch, deep enough to understand TLS certificate chains and Nori morphological analysis, fluent enough to create bilingual test data and Korean search queries, and systematic enough to run failure drills and document every finding.

The best support engineers don't just know the product — they've broken it and fixed it. That's exactly what this lab represents.