Heap and thread monitoring
While Java takes care of most of the details of memory and heap management, situations can occur where memory is never reclaimed. This recipe uses two Java tools, jmap and jhat, to capture a heap dump and examine it.
Steps to complete it
Determine the PID of the running Java process.
$ ps -ef | grep java
edward 3736 1 1 01:46 pts/0 00:00:06
Use the jmap tool to dump the heap to a file.
$ jmap -dump:file=b 3736
Dumping heap to /home/edward/hpcas/b …
Heap dump file created
Start a jhat web server on port 7001. jhat defaults to port 7000, which is also the Cassandra storage port, so a different port is needed.
$ jhat -port 7001 /home/edward/hpcas/b
Chasing references, expect 15 dots……………
Eliminating duplicate references……………
Snapshot resolved.
Started HTTP server on port 7001
Server is ready.
jmap and jhat allow you to capture and review heap dumps. They are valuable tools when chasing down memory leaks or unexplained memory usage.
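The first step of the recipe, finding the PID, can be scripted. The following is a minimal sketch, not part of the original recipe: `java_pids` is a hypothetical helper name, and it assumes the standard `ps -ef` layout in which the PID is the second column.

```shell
# java_pids is a hypothetical helper (not a standard tool).
# It reads `ps -ef` output on stdin and prints the PID (second column)
# of every line that mentions java, skipping the grep process itself.
java_pids() {
    awk '/java/ && !/grep/ { print $2 }'
}
```

It could be used as `ps -ef | java_pids`, and the resulting PID passed to `jmap -dump:file=b` in place of the hard-coded 3736. If several Java processes are running, each PID is printed on its own line, so check which one belongs to Cassandra first.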
Tuning Java Heap
Because Cassandra is a database, it spends significant time interacting with the operating system’s I/O infrastructure through the JVM, so a well-tuned Java heap size is important. Cassandra’s default configuration starts the JVM with a heap size that is based on the total amount of system memory:
System Memory | Heap Size |
---|---|
Less than 2GB | 1/2 of system memory |
2GB to 4GB | 1GB |
Greater than 4GB | 1/4 of system memory, but not more than 8GB |
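The table above can be read as a small function. The following sketch only mirrors the table and is not taken from Cassandra itself: `default_heap_mb` is a hypothetical name, sizes are in megabytes, and the 4GB boundary is assumed to fall in the "2GB to 4GB" row.

```shell
# default_heap_mb is a hypothetical helper that mirrors the table above.
# Argument: total system memory in MB. Output: default heap size in MB.
default_heap_mb() {
    system_mb=$1
    if [ "$system_mb" -lt 2048 ]; then
        # Less than 2GB: half of system memory
        echo $((system_mb / 2))
    elif [ "$system_mb" -le 4096 ]; then
        # 2GB to 4GB: fixed 1GB heap (4GB itself is assumed to belong here)
        echo 1024
    else
        # Greater than 4GB: a quarter of system memory, capped at 8GB
        quarter=$((system_mb / 4))
        if [ "$quarter" -gt 8192 ]; then
            quarter=8192
        fi
        echo "$quarter"
    fi
}
```

For example, `default_heap_mb 16384` gives a quarter of a 16GB machine, while any machine with more than 32GB of memory hits the 8GB cap.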
General Guidelines
Many users new to Cassandra are tempted to set the Java heap size too high, so that it consumes the majority of the underlying system’s RAM. In most cases, increasing the Java heap size is actually detrimental for these reasons:
- The ability of Java 6 to handle garbage collection gracefully diminishes quickly for heaps above 8GB.
- Modern operating systems maintain the OS page cache for frequently accessed data and are very good at keeping this data in memory, but an elevated Java heap size can prevent the page cache from doing its job.
To change a JVM setting, modify the cassandra-env.sh file.
Because MapReduce runs outside the JVM, changes to the JVM do not affect Hadoop operations directly.
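As an illustration of changing a JVM setting, the fragment below shows what a heap override in cassandra-env.sh might look like. It assumes your copy of cassandra-env.sh exposes the `MAX_HEAP_SIZE` and `HEAP_NEWSIZE` variables (check the file shipped with your version), and the values are placeholders rather than recommendations.

```shell
# Fragment for cassandra-env.sh: override the automatically calculated heap.
# Assumption: this version of the script reads these two variables.
# The sizes are illustrative placeholders only.
MAX_HEAP_SIZE="4G"
HEAP_NEWSIZE="800M"
```

Leaving both variables unset keeps the default sizing described earlier. Restart the node for the change to take effect.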
Thread Pool Statistics
Cassandra maintains distinct thread pools for different stages of execution. Each of these thread pools provides statistics on the number of tasks that are active, pending, and completed. A rising trend in the pending tasks column of these pools is an excellent indicator that additional capacity is needed. Once a baseline is established, alarms should be configured for any increase beyond normal in the pending tasks column. Details on each thread pool are listed in the following table:
Thread Pool | Description |
---|---|
AE_SERVICE_STAGE | Shows anti-entropy tasks |
CONSISTENCY-MANAGER | Handles the background consistency checks if they were triggered from the client’s consistency level |
FLUSH-SORTER-POOL | Sorts flushes that have been submitted |
FLUSH-WRITER-POOL | Writes the sorted flushes |
GOSSIP_STAGE | Activity of the Gossip protocol on the ring |
LB-OPERATIONS | The number of load balancing operations |
LB-TARGET | Used by nodes leaving the ring |
MEMTABLE-POST-FLUSHER | Memtable flushes that are waiting to be written to the commit log. |
MESSAGE-STREAMING-POOL | Streaming operations. Usually triggered by bootstrapping or decommissioning nodes. |
MIGRATION_STAGE | Tasks resulting from the call of system_* methods in the API that have modified the schema |
MISC_STAGE | |
MUTATION_STAGE | API calls that are modifying data |
READ_STAGE | API calls that have read data |
RESPONSE_STAGE | Response tasks from other nodes to message streaming from this node |
STREAM_STAGE | Stream tasks from this node |
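The advice to alarm on growth in the pending tasks column can be sketched as a small filter. This is an illustration under assumptions: `pending_over` is a hypothetical helper name, and the input is assumed to list pool name, active, pending, and completed counts in that column order, as `nodetool tpstats` prints them. Verify the layout on your version before relying on it.

```shell
# pending_over is a hypothetical helper (not part of Cassandra).
# It reads thread pool statistics on stdin, assuming the columns are:
#   pool name, active, pending, completed
# and prints the name and pending count of every pool whose pending
# count is greater than the threshold given as the first argument.
pending_over() {
    threshold=$1
    awk -v limit="$threshold" '
        # Ignore the header and any line whose third field is not a number
        $3 ~ /^[0-9]+$/ && $3 + 0 > limit + 0 { print $1, $3 }
    '
}
```

A monitoring job could run something like `nodetool tpstats | pending_over 10` on a schedule and raise an alert whenever the output is not empty. The threshold of 10 is arbitrary; choose it from the baseline observed on your own cluster.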