Heap and thread monitoring


While Java takes care of most of the details of memory and heap management, situations can occur where memory is never reclaimed. This recipe uses two Java tools, jmap and jhat, to capture a heap dump and examine it.

Steps to complete it
Determine the PID of a running Java process.
$ ps -ef | grep java
edward    3736     1  1 01:46 pts/0    00:00:06

Use the jmap tool to dump the heap to a file.
$ jmap -dump:file=b 3736
Dumping heap to /home/edward/hpcas/b …
Heap dump file created

Start a jhat web server on port 7001 (jhat defaults to port 7000, which is also the Cassandra storage port).
$ jhat -port 7001 /home/edward/hpcas/b
Chasing references, expect 15 dots……………
Eliminating duplicate references……………
Snapshot resolved.
Started HTTP server on port 7001
Server is ready.

jmap and jhat allow you to capture and review heap dumps. They are valuable tools when chasing down memory leaks or unexplained memory usage.
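
The steps above can be wrapped in a small shell function. This is only a sketch: the function name dump_and_browse and the default file name heap.bin are made up for illustration, and it assumes a JDK that still ships jhat (the tool was removed in JDK 9). It also passes format=b, the binary dump format documented for jmap, which the command in the recipe omits.

```shell
# Sketch: dump the heap of a Java process and open the dump in jhat.
# dump_and_browse and heap.bin are hypothetical names, not part of any tool.
dump_and_browse() {
    pid=$1
    dump_file=${2:-heap.bin}   # where jmap writes the dump
    port=${3:-7001}            # avoid jhat's default port 7000

    # Reject a missing or non-numeric PID before calling any JDK tool.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: dump_and_browse <pid> [dump_file] [port]" >&2
            return 2
            ;;
    esac

    # Write a binary heap dump, then serve it over HTTP for browsing.
    jmap -dump:format=b,file="$dump_file" "$pid" || return 1
    jhat -port "$port" "$dump_file"
}
```

With the PID from the recipe, the call would be: dump_and_browse 3736 /home/edward/hpcas/b 7001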

Tuning Java Heap

Because Cassandra is a database, it spends significant time interacting with the operating system’s I/O infrastructure through the JVM, so a well-tuned Java heap size is important. Cassandra’s default configuration starts the JVM with a heap size that is based on the total amount of system memory:

System Memory        Heap Size
Less than 2GB        1/2 of system memory
2GB to 4GB           1GB
Greater than 4GB     1/4 of system memory, but not more than 8GB
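
The sizing table above can be expressed as a short shell function. This is a sketch of the table only, not necessarily the logic shipped in any particular cassandra-env.sh; the name default_heap_mb is hypothetical, and all sizes are in megabytes.

```shell
# Sketch: default heap size (MB) for a given amount of system memory (MB),
# following the table above. default_heap_mb is a hypothetical name.
default_heap_mb() {
    mem_mb=$1
    if [ "$mem_mb" -lt 2048 ]; then
        # Less than 2GB: half of system memory
        echo $((mem_mb / 2))
    elif [ "$mem_mb" -le 4096 ]; then
        # 2GB to 4GB: fixed 1GB heap
        echo 1024
    else
        # Greater than 4GB: a quarter of system memory, capped at 8GB
        quarter=$((mem_mb / 4))
        if [ "$quarter" -gt 8192 ]; then
            echo 8192
        else
            echo "$quarter"
        fi
    fi
}
```

For example, default_heap_mb 16384 gives the heap size for a machine with 16GB of RAM.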

General Guidelines

Many users new to Cassandra are tempted to turn up Java heap size too high, which consumes the majority of the underlying system’s RAM. In most cases, increasing the Java heap size is actually detrimental for these reasons:

  • The capability of Java 6 to gracefully handle garbage collection diminishes quickly above 8GB.
  • Modern operating systems maintain the OS page cache for frequently accessed data and are very good at keeping this data in memory, but an elevated Java heap size can prevent the page cache from doing its job.

To change a JVM setting, modify the cassandra-env.sh file.
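
As a rough illustration, an explicit heap size in cassandra-env.sh might look like the fragment below. The values are placeholders chosen for the example, not recommendations; MAX_HEAP_SIZE and HEAP_NEWSIZE are the variables the stock script uses for this, and it generally expects them to be set or left unset together.

```shell
# Illustrative values only; choose sizes that fit the node's RAM.
MAX_HEAP_SIZE="4G"     # total JVM heap
HEAP_NEWSIZE="800M"    # young generation size
```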

Because MapReduce runs outside the JVM, changes to the JVM do not affect Hadoop operations directly.

Thread Pool Statistics

Cassandra maintains distinct thread pools for different stages of execution. Each of these thread pools provides statistics on the number of tasks that are active, pending, and completed. A rising trend in the pending tasks column is an excellent indicator of the need to add capacity. Once a baseline is established, alarms should be configured for any increase past normal in the pending tasks column. Details on each thread pool are listed below.

Thread Pool              Description
AE_SERVICE_STAGE         Shows anti-entropy tasks
CONSISTENCY-MANAGER      Handles the background consistency checks if they were triggered from the client’s consistency level
FLUSH-SORTER-POOL        Sorts flushes that have been submitted
FLUSH-WRITER-POOL        Writes the sorted flushes
GOSSIP_STAGE             Activity of the Gossip protocol on the ring
LB-OPERATIONS            The number of load balancing operations
LB-TARGET                Used by nodes leaving the ring
MEMTABLE-POST-FLUSHER    Memtable flushes that are waiting to be written to the commit log
MESSAGE-STREAMING-POOL   Streaming operations, usually triggered by bootstrapping or decommissioning nodes
MIGRATION_STAGE          Tasks resulting from the call of system_* methods in the API that have modified the schema
MISC_STAGE
MUTATION_STAGE           API calls that are modifying data
READ_STAGE               API calls that have read data
RESPONSE_STAGE           Response tasks from other nodes to message streaming from this node
STREAM_STAGE             Stream tasks from this node
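
One way to watch the pending column is to filter the output of nodetool tpstats. The function below is a minimal sketch: the name pending_over and the threshold in the usage line are hypothetical, and it assumes the layout of pool name, active, pending, completed, with the pending count in the third whitespace-separated field.

```shell
# Sketch: read tpstats-style text on stdin and print every pool whose
# pending count exceeds the given threshold. pending_over is a hypothetical name.
pending_over() {
    threshold=$1
    # Keep only rows with at least four fields and a numeric third field;
    # this skips the header line and any shorter summary rows.
    awk -v t="$threshold" 'NF >= 4 && $3 ~ /^[0-9]+$/ && $3 + 0 > t { print $1, $3 }'
}
```

A possible invocation, assuming nodetool is on the PATH and a threshold of 100 suits the established baseline: nodetool -h localhost tpstats | pending_over 100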
