How Elasticsearch Clusters Manage Node Memory and Storage Layout 🧠
💡 Amazon OpenSearch Service (the successor to Amazon Elasticsearch Service) manages each node's memory and storage much like a traditional server, but layers its own management mechanisms and operational workflows on top.
The diagram below shows a simplified, easy-to-understand layout.
In short, it illustrates the following:
[ Physical RAM ]
 │
 ├── [ JVM Process (Elasticsearch) ]
 │     │
 │     ├── [ JVM Heap Memory ]
 │     │     ├── Java objects
 │     │     ├── Threadpools
 │     │     ├── Indexing buffers
 │     │     └── (Optional) Heap data structure objects
 │     │
 │     └── [ Other JVM Memory: Stack, Metaspace, Native Memory ]
 │
 └── [ OS Page Cache, Other Processes, etc. ]
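You can observe this heap-vs-RAM split on a live node with the node stats API (`GET _nodes/stats/jvm,os`). The sketch below parses a minimal excerpt of that response; the field names match the real Elasticsearch/OpenSearch API, but the node name and byte values are made-up illustration data.

```python
import json

# Minimal excerpt of a `GET _nodes/stats/jvm,os` response.
# Field names follow the real node stats API; the numbers are illustrative.
stats = json.loads("""
{
  "nodes": {
    "node-1": {
      "jvm": {"mem": {"heap_used_in_bytes": 8589934592,
                      "heap_max_in_bytes": 17179869184}},
      "os":  {"mem": {"total_in_bytes": 34359738368}}
    }
  }
}
""")

for node_id, node in stats["nodes"].items():
    heap_max = node["jvm"]["mem"]["heap_max_in_bytes"]
    total_ram = node["os"]["mem"]["total_in_bytes"]
    # RAM not claimed by the JVM heap is what the OS page cache can use.
    print(f"{node_id}: heap {heap_max // 2**30} GiB of {total_ram // 2**30} GiB RAM, "
          f"~{(total_ram - heap_max) // 2**30} GiB left for the OS page cache")
```

With the illustrative numbers above, this reports a 16 GiB heap on a 32 GiB node, leaving roughly 16 GiB for the page cache.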
Now let's walk through the two working paths: the indexing path and the search path.
Indexing Path
Client
 ↓
Threadpool (write/bulk) [JVM Heap]
 ↓
Indexing buffer [JVM Heap]
 ↓
Lucene in-memory segment
 ↓ (refresh writes the segment file)
OS Page Cache [RAM outside JVM heap]
 ↓ (flush / fsync)
EBS Volume [Persistent disk storage]
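The size of the heap-resident indexing buffer in this path is governed by a real node setting, `indices.memory.index_buffer_size` (it defaults to 10% of heap). A minimal `elasticsearch.yml` fragment, with an illustrative value:

```yaml
# elasticsearch.yml -- the setting name is real, the value is illustrative.
# Shared across all shards on the node; raise it for heavy bulk-indexing workloads.
indices.memory.index_buffer_size: 15%
```

Note that on the managed Amazon OpenSearch Service most such node-level settings are controlled by AWS; this fragment applies to self-managed clusters.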
Search Path
Client
 ↓
Threadpool (search) [JVM Heap]
 ↓
Lucene segment read
 ↓ (hits OS page cache if warm, else EBS read)
OS Page Cache [RAM outside JVM heap]
 ↓
Optional: Query Cache [JVM Heap]
 ↓
Response sent to client
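The heap-resident caches in the search path are also sized by real node settings. A minimal `elasticsearch.yml` fragment (setting names are real; the values shown are the documented defaults, included here for illustration):

```yaml
# elasticsearch.yml -- both caches live in the JVM heap.
indices.queries.cache.size: 10%    # query cache (caches frequently used filter results)
indices.requests.cache.size: 1%    # shard request cache (caches whole shard-level responses)
```

As with the indexing settings, these apply to self-managed clusters; the managed Amazon OpenSearch Service controls them for you.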
📌 Key Takeaways
- Threadpools & queues → JVM heap.
- In-memory indexing buffers & caches → JVM heap.
- Segment file content read during searches → OS page cache (outside the JVM heap, but still RAM).
- The EBS volume is persistent storage; the OS page cache plus Lucene's memory-mapped files keep hot data in RAM.
- The rule of thumb: give the JVM heap at most half of the node's RAM (and keep it below ~32 GB so compressed object pointers stay enabled), leaving the rest for the OS page cache.
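On a self-managed node, the half-RAM rule is applied via the standard JVM heap flags in `jvm.options`; min and max should be set equal so the heap never resizes at runtime. The flags are real; the 16g value is illustrative (for a 32 GB node). The managed Amazon OpenSearch Service sizes the heap automatically.

```text
# jvm.options -- equal min/max heap, at most ~50% of RAM, below ~32 GB
-Xms16g
-Xmx16g
```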
Happy learning! 🚀