How Elasticsearch Clusters Manage Node Memory and Storage Layout 🧠

πŸ’‘ Amazon OpenSearch Service (formerly Amazon Elasticsearch Service) manages each node’s memory and storage much like a traditional server, but layers its own internal management mechanisms and operational workflows on top.

The diagram below shows, in a simplified layout, how a node's physical RAM is divided:

[ Physical RAM ]

   β”‚
   β”œβ”€β”€ [ JVM Process (Elasticsearch) ]
   β”‚         β”‚
   β”‚         β”œβ”€β”€ [ JVM Heap Memory ]
   β”‚         β”‚         β”œβ”€β”€ Java objects
   β”‚         β”‚         β”œβ”€β”€ Threadpools
   β”‚         β”‚         β”œβ”€β”€ Indexing buffers
   β”‚         β”‚         └── (Optional) Heap Data Structure object
   β”‚         β”‚
   β”‚         └── [ Other JVM Memory: Stack, Metaspace, Native Memory ]
   β”‚
   └── [ OS Page Cache, Other Processes, etc. ]
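
To see this split on a live node, you can query the node stats API. Below is a minimal sketch in Python, assuming the requests library and a cluster reachable at http://localhost:9200 (a managed OpenSearch Service domain would need its own endpoint and signed requests):

    import requests

    # Fetch JVM and OS memory stats for every node in the cluster.
    resp = requests.get("http://localhost:9200/_nodes/stats/jvm,os")
    resp.raise_for_status()

    for node in resp.json()["nodes"].values():
        heap_used = node["jvm"]["mem"]["heap_used_in_bytes"]
        heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
        os_total = node["os"]["mem"]["total_in_bytes"]
        # RAM not claimed by the JVM heap is what the OS page cache
        # (and everything else on the box) gets to work with.
        print(f"{node['name']}: heap {heap_used >> 20}/{heap_max >> 20} MiB, "
              f"total RAM {os_total >> 20} MiB")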

Now let's walk through the two working paths: the Indexing Path and the Search Path.

Indexing Path

    Client
    ↓
    Threadpool (write/bulk)  [JVM Heap]
    ↓
    Indexing buffer          [JVM Heap]
    ↓
    Lucene in-memory segment
    ↓ (refresh)
    OS Page Cache            [RAM outside JVM heap]
    ↓ (flush / fsync)
    EBS Volume               [Persistent disk storage]
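
To make this path concrete, here is a sketch that bulk-indexes two documents and then triggers an explicit refresh, the step that turns the heap-resident indexing buffer into a searchable segment in the page cache. The index name my-index is made up, and the endpoint assumptions are the same as above:

    import requests

    # The bulk API takes newline-delimited JSON: one action line per document.
    bulk_body = (
        '{"index":{"_index":"my-index"}}\n'
        '{"message":"first doc"}\n'
        '{"index":{"_index":"my-index"}}\n'
        '{"message":"second doc"}\n'
    )
    requests.post(
        "http://localhost:9200/_bulk",
        data=bulk_body,
        headers={"Content-Type": "application/x-ndjson"},
    ).raise_for_status()

    # The documents now sit in the JVM-heap indexing buffer. A refresh writes
    # them into a new Lucene segment (landing in the OS page cache); a later
    # flush/fsync makes them durable on the EBS volume.
    requests.post("http://localhost:9200/my-index/_refresh").raise_for_status()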

Search Path

    Client
    ↓
    Threadpool (search)      [JVM Heap]
    ↓
    Lucene segment read
    ↓ (hits OS page cache if warm, else EBS read)
    OS Page Cache            [RAM outside JVM heap]
    ↓
    Optional: Query Cache    [JVM Heap]
    ↓
    Response sent to client
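
And the matching sketch for the search path: run a query with the shard request cache enabled, then check how much heap the query and request caches occupy (same assumed index and endpoint; note the request cache only stores size=0 results by default):

    import requests

    # request_cache=true asks each shard to cache this request's result on heap.
    resp = requests.post(
        "http://localhost:9200/my-index/_search",
        params={"request_cache": "true"},
        json={"size": 0, "query": {"match_all": {}}},
    )
    resp.raise_for_status()
    print("hits:", resp.json()["hits"]["total"])

    # Both caches live on the JVM heap and are visible in node stats.
    stats = requests.get(
        "http://localhost:9200/_nodes/stats/indices/query_cache,request_cache"
    ).json()
    for node in stats["nodes"].values():
        qc = node["indices"]["query_cache"]["memory_size_in_bytes"]
        rc = node["indices"]["request_cache"]["memory_size_in_bytes"]
        print(f"{node['name']}: query cache {qc} B, request cache {rc} B")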

πŸ“Œ Key Takeaways

  • Threadpools & queues β†’ JVM heap.
  • In-memory indexing buffers & caches β†’ JVM heap.
  • Segment file content for searches β†’ OS page cache (outside JVM heap, still RAM).
  • EBS volume is persistent storage; OS page cache + Lucene memory-mapped files keep hot data in RAM.
  β€’ Rule of thumb: give the JVM heap at most 50% of RAM (and keep it below ~32 GB so compressed object pointers stay enabled); the rest feeds the OS page cache. See the sizing sketch below.
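
To put numbers on that rule of thumb, here is a tiny sketch that turns a node's RAM into matching -Xms/-Xmx settings. The ~31 GB ceiling is the usual compressed-oops cutoff; on the managed OpenSearch Service the heap is sized for you, so this applies to self-managed nodes:

    # Rule of thumb: heap = min(RAM / 2, ~31 GiB) so compressed oops stay enabled.
    def recommended_heap_gib(total_ram_gib: int) -> int:
        return int(min(total_ram_gib / 2, 31))

    for ram in (8, 16, 32, 64, 128):
        heap = recommended_heap_gib(ram)
        # Set -Xms and -Xmx to the same value to avoid heap-resize pauses.
        print(f"{ram:>3} GiB RAM -> -Xms{heap}g -Xmx{heap}g "
              f"(~{ram - heap} GiB left for the OS page cache)")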

Happy learning! πŸ“š