This chapter describes how to tune Spotfire Streaming applications. Application and system parameters are described.
The Spotfire Streaming runtime supports multiple processes communicating through shared memory, or a memory mapped file. When a JVM is started using the deployment tool, all runtime resources required by the JVM are available in the same process space. There are cases where multiple JVMs on a single node may be appropriate for an application (see Multiple JVMs), but there is a performance impact for dispatching between JVMs.
By default, the Spotfire Streaming runtime does not modify the JVM heap (-Xms<size> and -Xmx<size>) or stack (-Xss<size>) memory options. If testing shows that the JVM runs short of, or out of, memory, these options can be modified by setting them as arguments to the deployment tool.
Both JConsole and VisualVM can be used to examine heap memory utilization.
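Heap utilization can also be read from inside the JVM through the standard java.lang.management API, which is useful when a graphical tool cannot be attached. The following is a minimal sketch and not part of Spotfire Streaming; the class name HeapReport and its methods are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapReport {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a one-line summary of the current heap usage of this JVM. */
    static String describeHeap() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when the maximum heap size is undefined.
        String max = heap.getMax() < 0 ? "undefined" : toMiB(heap.getMax()) + " MiB";
        return "heap used=" + toMiB(heap.getUsed()) + " MiB"
                + ", committed=" + toMiB(heap.getCommitted()) + " MiB"
                + ", max=" + max;
    }

    public static void main(String[] args) {
        System.out.println(describeHeap());
    }
}
```

The max value reflects -Xmx<size> when that option is set, so logging this line periodically gives a rough view of how close the application runs to its heap limit.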
By default, the Spotfire Streaming runtime does not modify any of the JVM garbage collection parameters.
For production systems deployed using the Oracle JVM, we recommend that you enable garbage collection logging using the following deployment options:
- -XX:+PrintGCDateStamps
- -XX:+PrintGCDetails
- -Xloggc:gc.log
Note: replace gc.log with a name unique to your deployed JVM so that multiple JVMs do not collide by writing to the same log file.
This provides relatively low-overhead logging that can be used to look for memory issues. Using the timestamps, the log can also be correlated with other application logging (for example, request/response latency).
Another useful set of Oracle JVM options controls GC log file rotation. See (Java HotSpot VM Options).
- -XX:+UseGCLogFileRotation
- -XX:NumberOfGCLogFiles=<number>
- -XX:GCLogFileSize=<size>
Garbage collection tuning is a complex subject with dependencies on the application, the target load, and the desired balance of application throughput, latency, and footprint. Because there is no best one-size-fits-all answer, most JVMs offer a variety of options for modifying the behavior of the garbage collector. An Internet search shows a large selection of writings on the subject. One book with good coverage on the implementation and tuning of garbage collection in Oracle JVMs is Java Performance by Charlie Hunt and Binu John.
When deploying using the Oracle JVM, we recommend setting the following JVM deploy option, which causes a JVM heap dump to be logged on an out of memory error within the JVM:
-XX:+HeapDumpOnOutOfMemoryError
Typically, a Spotfire Streaming deployment consists of a single JVM per node. However, there may be cases where multiple JVMs per node are required (for example, exceeding a per-process limit on the number of file descriptors).
The Spotfire Streaming runtime supports multiple JVMs deployed within a single node. These JVMs may all access the same managed objects.
- Size
Shared memory needs to be large enough to contain the application's managed objects, the runtime state, and any in-flight transactions. See the System Sizing Guide for information on how to determine the correct size.
When caching managed objects, shared memory only needs to be large enough to store the subset of cached managed objects.
- mmap
By default, the Spotfire Streaming runtime uses a normal file in the file system. The mmap(2) system call is used to map it into the address space of the Spotfire Streaming processes. In a development environment, this is very convenient. Many developers may share a machine, and the operating system only allocates memory as it is actually used in the shared memory files. Cleanup of stranded deployments (where the processes are gone but the shared memory file remains) may be as simple as removing file system directories.
A performance disadvantage of using mmapped files for shared memory is that the operating system spends cycles writing the memory image of the file to disk. As the size of the shared memory file and the amount of shared memory accessed by the application increase, the operating system spends more and more time writing the contents to disk.
Warning
Deploying a shared memory file on a networked file system (such as NFS) is not supported for production deployments. The I/O performance is not sufficient to support the required throughput. Use System V Shared Memory instead.
- System V Shared Memory
Spotfire Streaming also supports using System V Shared Memory for its shared memory.
Note
To reclaim System V Shared Memory, the Spotfire Streaming node must be stopped and removed using the epadmin remove node command. The shared memory is not released by removing the node deployment directory.
An advantage of using System V Shared Memory is that the operating system does not spend any cycles attempting to write the memory to disk.
The operating system allocates the memory all at once, and it cannot be swapped, which is another advantage. In some cases, this also allows the operating system to allocate the physical memory contiguously and use the CPU's TLB (translation lookaside buffer) more efficiently. See Linux Huge Page TLB support for Linux tuning information.
See Linux System V Shared Memory Kernel Tuning for details on tuning Linux System V Shared Memory kernel parameters and macOS System V Shared Memory Kernel Tuning for details on tuning macOS System V Shared Memory kernel parameters.
Managed objects support caching of a subset of the object data in shared memory. The cache size should be large enough to hold a working set of objects in shared memory. This avoids constantly refreshing object data from a remote node or an external data store, which negatively impacts performance. Spotfire Streaming uses an LRU (least recently used) algorithm to evict objects from shared memory, so the objects that are accessed most often remain cached in shared memory.
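To illustrate how a least recently used policy keeps the working set resident, here is a small sketch built on the JDK's LinkedHashMap in access order. It shows the eviction policy in general, not the Spotfire Streaming cache implementation; the class name LruSketch and the capacity of 2 are hypothetical choices made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {
    /** Creates a map that evicts its least recently used entry once it exceeds capacity. */
    static <K, V> Map<K, V> create(final int capacity) {
        // accessOrder=true keeps entries ordered from least to most recently accessed.
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = create(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");     // "a" becomes the most recently used entry
        cache.put("c", 3);  // capacity exceeded, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

If the capacity is smaller than the working set, every access evicts an entry that is needed again shortly afterward, which is the constant refreshing that the sizing guidance above is meant to avoid.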
The machine where a Spotfire Streaming node runs should always have enough available physical memory so that no swapping occurs on the system. Spotfire Streaming gains much of its performance by caching as much as possible in memory. If this memory becomes swapped, or simply paged out, the cost to access it increases by many orders of magnitude.
On Linux, you can check whether swapping has occurred using the following command. A nonzero used value on the Swap line indicates that memory has been swapped out:
$ /usr/bin/free
             total       used       free     shared    buffers     cached
Mem:       3354568    3102912     251656          0     140068    1343832
-/+ buffers/cache:    1619012    1735556
Swap:      6385796          0    6385796
The BIOS for many hardware platforms includes power savings and performance settings. Significant performance differences may be seen based on these settings. For the best Spotfire Streaming performance, we recommend setting them to their maximum performance and lowest latency values.
Operating system kernels typically enforce configurable limits on System V Shared Memory usage. On Linux, these limits can be seen by running the following command:
$ ipcs -lm

------ Shared Memory Limits --------
max number of segments = 4096
max seg size (kbytes) = 67108864
max total shared memory (kbytes) = 67108864
min seg size (bytes) = 1
The tunable values that affect shared memory are:
- SHMMAX - This parameter defines the maximum size, in bytes, of a single shared memory segment. It should be set to at least the largest desired memory size for nodes using System V Shared Memory.
- SHMALL - This parameter sets the total number of shared memory pages that can be used system wide. It should be set to at least SHMMAX/page size. To see the page size for a particular system, run the following command:
$ getconf PAGE_SIZE
4096
- SHMMNI - This parameter sets the system-wide maximum number of shared memory segments. It should be set to at least the number of nodes that are to be run on the system using System V Shared Memory.
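The SHMALL guidance above is a simple division, worked here for the values used in this section: a SHMMAX of 17179869184 bytes and a page size of 4096 bytes. This is a minimal sketch; the class name ShmSizing and its method are hypothetical helpers, not part of any Spotfire Streaming or Linux tool.

```java
public class ShmSizing {
    /**
     * Returns the smallest SHMALL value, in pages, that can hold a single
     * shared memory segment of shmmaxBytes bytes.
     */
    static long minShmallPages(long shmmaxBytes, long pageSizeBytes) {
        if (shmmaxBytes < 0 || pageSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Ceiling division: a partially used page still counts as a whole page.
        return (shmmaxBytes + pageSizeBytes - 1) / pageSizeBytes;
    }

    public static void main(String[] args) {
        long shmmax = 17179869184L; // bytes
        long pageSize = 4096L;      // bytes, as reported by getconf PAGE_SIZE
        // prints: SHMALL should be at least 4194304 pages
        System.out.println("SHMALL should be at least "
                + minShmallPages(shmmax, pageSize) + " pages");
    }
}
```

Because SHMALL is a system-wide total, a machine that runs several nodes using System V Shared Memory needs the sum of their segment sizes divided by the page size, not only the largest one.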
These values may be changed either at runtime (in several different ways) or system boot time.
Change SHMMAX to 16 gigabytes (17179869184 bytes), at runtime, as root, by setting the value directly in /proc:
# echo 17179869184 > /proc/sys/kernel/shmmax
Change SHMALL to 4 million pages, at runtime, as root, via the sysctl program:
# sysctl -w kernel.shmall=4194304
Change SHMMNI to 4096 automatically at boot time:
# echo "kernel.shmmni=4096" >> /etc/sysctl.conf
On Linux, the runtime attempts to use huge page TLB support when allocating System V Shared Memory for sizes that are even multiples of 256 megabytes. If the support is not present, or not sufficiently configured, the runtime automatically falls back to normal System V Shared Memory allocation.
- The kernel must have the hugepagetlb support enabled. This is present in 2.6 kernels and later. See (http://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt).
- The system must have huge pages available. They can be reserved:
  At boot time via /etc/sysctl.conf:
  vm.nr_hugepages = 512
  Or at runtime:
  echo 512 > /proc/sys/vm/nr_hugepages
  Or the kernel can attempt to allocate them from the normal memory pools as needed:
  At boot time via /etc/sysctl.conf:
  vm.nr_overcommit_hugepages = 512
  Or at runtime:
  echo 512 > /proc/sys/vm/nr_overcommit_hugepages
- Non-root users require group permission. This can be granted:
  At boot time via /etc/sysctl.conf:
  vm.hugetlb_shm_group = 1000
  Or at runtime by:
  echo 1000 > /proc/sys/vm/hugetlb_shm_group
  where 1000 is the desired group id.
- On earlier kernels in the 2.6 series, the user ulimit on maximum locked memory (memlock) must also be raised to a level equal to or greater than the System V Shared Memory size. On RedHat systems, this involves changing /etc/security/limits.conf, and then enabling PAM support for limits on whatever login mechanism is being used. See the operating system vendor documentation for details.
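The number of huge pages to reserve follows from the shared memory size. The sketch below assumes a huge page size of 2 MiB, which is common on x86-64 but not universal (check the Hugepagesize line in /proc/meminfo), and assumes that "256 megabytes" means 256 × 2^20 bytes. The class name HugePageSizing and its methods are hypothetical.

```java
public class HugePageSizing {
    static final long MIB = 1024L * 1024L;

    /** True when the size is a positive exact multiple of 256 MiB. */
    static boolean isHugePageCandidate(long shmBytes) {
        return shmBytes > 0 && shmBytes % (256L * MIB) == 0;
    }

    /** Number of huge pages needed to back shmBytes of shared memory. */
    static long hugePagesNeeded(long shmBytes, long hugePageBytes) {
        if (shmBytes < 0 || hugePageBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        // Ceiling division: a partially used huge page still counts as a whole page.
        return (shmBytes + hugePageBytes - 1) / hugePageBytes;
    }

    public static void main(String[] args) {
        long shmBytes = 1024L * MIB;   // a 1 GiB shared memory segment
        long hugePageBytes = 2L * MIB; // assumed huge page size
        System.out.println("candidate=" + isHugePageCandidate(shmBytes)
                + ", pages=" + hugePagesNeeded(shmBytes, hugePageBytes));
    }
}
```

Under these assumptions, the value 512 used in the settings above backs a 1 GiB shared memory segment; larger segments need a proportionally larger reservation.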
A system imposed user limit on the maximum number of processes may impact the ability to deploy multiple JVMs concurrently to the same machine, or even a single JVM if it uses a large number of threads. The limit for the current user may be seen by running:
$ ulimit -u
16384
Many RedHat systems include a limit of 1024:
$ cat /etc/security/limits.d/90-nproc.conf
# Default limit for number of user's processes to prevent
# accidental fork bombs.
# See rhbz #432903 for reasoning.
*          -    nproc     1024
Raise this limit if you see errors like the following:
EAGAIN The system lacked the necessary resources to create another thread, or the system-imposed limit on the total number of threads in a process {PTHREAD_THREADS_MAX} would be exceeded.
Operating system kernels typically enforce configurable limits on System V Shared Memory usage. On macOS, these limits can be seen by running the following command:
ipcs -M
IPC status from <running system> as of Sun Apr 29 05:38:52 PDT 2018
shminfo:
    shmmax: 1073741824 (max shared memory segment size)
    shmmin: 1 (min shared memory segment size)
    shmmni: 32 (max number of shared memory identifiers)
    shmseg: 8 (max shared memory segments per process)
    shmall: 2097152 (max amount of shared memory in pages)
The tunable variables that affect shared memory are:
- kern.sysv.shmmax - This variable defines the maximum size, in bytes, of a single shared memory segment. It should be set to at least the largest desired memory size for nodes using System V Shared Memory.
- kern.sysv.shmall - This variable sets the total number of shared memory pages that can be used system wide. It should be set to at least kern.sysv.shmmax/pagesize. The current page size can be seen using the pagesize command:
pagesize
4096
- kern.sysv.shmmni - This variable sets the system-wide maximum number of shared memory segments. It should be set to at least the number of nodes that are to be run on the system using System V Shared Memory.
These variables can be changed at runtime using sysctl, or at system boot time using /etc/sysctl.conf.
These sysctl commands change the kernel to support two million System V shared memory pages with a maximum shared memory segment size of 8 gigabytes. These changes take effect immediately, but are not maintained across system reboots.
sudo sysctl kern.sysv.shmall=2097152
sudo sysctl kern.sysv.shmmax=8589934592
The /etc/sysctl.conf file must be updated for the changed variables to be maintained across system reboots. Changes to /etc/sysctl.conf require a system reboot to take effect.
#
# Maximum shared memory segment size of 1 GB
# Maximum of 2 million shared memory pages
#
kern.sysv.shmmax=1073741824
kern.sysv.shmall=2097152
A system imposed user limit on the maximum number of processes and threads may impact the ability to deploy multiple JVMs concurrently to the same machine, or even a single JVM if it uses a large number of threads. The current process limit is displayed using:
ulimit -u
2837
There are two tunable variables that control this value:
- kern.maxproc - The maximum number of processes system wide.
- kern.maxprocperuid - The maximum number of processes per user.
These variables can be changed at runtime using sysctl, or at system boot time using /etc/sysctl.conf.
These sysctl commands change the kernel to support a total of 4096 processes system wide and 2048 processes per user. These changes take effect immediately, but are not maintained across system reboots.
sudo sysctl kern.maxproc=4096
sudo sysctl kern.maxprocperuid=2048
The /etc/sysctl.conf file must be updated for the changed variables to be maintained across system reboots. Changes to /etc/sysctl.conf require a system reboot to take effect.
#
# Support 4096 total processes and 2048 per user
#
kern.maxproc=4096
kern.maxprocperuid=2048
A Spotfire Streaming application can be, and often is, run on a single node. With High-availability and Distribution features, Spotfire Streaming can run distributed applications across multiple nodes. From an operational point of view, there are very few benefits from running multiple nodes on a single machine. This document recommends and assumes that each node is run on its own machine.
When an application reaches its throughput limit on a single node, additional performance can be gained by adding multiple nodes. This is called horizontal scaling. For an application that is not designed for distribution, this often poses a problem. Sometimes this can be addressed by adding a routing device outside of the nodes. But sometimes this can only be addressed by rewriting the application.
A distributed Spotfire Streaming application can be spread across an arbitrary number of nodes at the High-availability data partition boundary. If the active node for a set of partitions has reached throughput saturation, one or more of the partitions may be migrated to other nodes.
When Spotfire Streaming detects a deadlock, a detailed trace is sent to the node's deadlock.log file. The deadlock trace shows information about the transaction that deadlocked, which resource deadlocked, transaction stacks, thread stack traces, and other transactions involved in the deadlock.
A lock order deadlock can occur when two or more transactions lock the same two or more objects in different orders. An illustration of this can be found in the Deadlock Detection section of the Architects Guide.
The program below generates a single transaction lock ordering deadlock between two threads, running in a single JVM, in a single node.
package com.tibco.ep.dtm.snippets.tuning;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Deadlock Example from the Tuning Guide.
 */
public class Deadlock
{
    private static MyManagedObject object1;
    private static MyManagedObject object2;

    /**
     * Main entry point
     * @param args Not used
     * @throws InterruptedException Execution interrupted
     */
    public static void main(String[] args) throws InterruptedException
    {
        //
        // Create a pair of Managed objects.
        //
        new Transaction("Create Objects")
        {
            @Override
            public void run()
            {
                object1 = new MyManagedObject();
                object2 = new MyManagedObject();
            }
        }.execute();

        //
        // Create a pair of transaction classes to lock them.
        // Giving the object parameters in reverse order will
        // cause two different locking orders, resulting in a deadlock.
        //
        Deadlocker deadlocker1 = new Deadlocker(object1, object2);
        Deadlocker deadlocker2 = new Deadlocker(object2, object1);

        //
        // Run them in separate threads until a deadlock is seen.
        //
        while ((deadlocker1.getNumberDeadlocks() == 0)
            && (deadlocker2.getNumberDeadlocks() == 0))
        {
            MyThread thread1 = new MyThread(deadlocker1);
            MyThread thread2 = new MyThread(deadlocker2);

            thread1.start();
            thread2.start();

            thread1.join();
            thread2.join();
        }
    }

    @Managed
    private static class MyManagedObject
    {
        int value;
    }

    private static class MyThread extends Thread
    {
        private final Deadlocker m_deadlocker;

        MyThread(Deadlocker deadlocker)
        {
            m_deadlocker = deadlocker;
        }

        @Override
        public void run()
        {
            m_deadlocker.execute();
        }
    }

    private static class Deadlocker extends Transaction
    {
        private final MyManagedObject m_object1;
        private final MyManagedObject m_object2;

        Deadlocker(MyManagedObject object1, MyManagedObject object2)
        {
            m_object1 = object1;
            m_object2 = object2;
        }

        @Override
        public void run()
        {
            //
            // This will take a transaction read lock on the first object.
            //
            @SuppressWarnings("unused")
            int value = m_object1.value;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();

            //
            // This will take a transaction write lock on the second object.
            //
            m_object2.value = 42;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();
        }

        private void blockForAMoment()
        {
            try
            {
                Thread.sleep(500);
            }
            catch (InterruptedException ex)
            {
            }
        }
    }
}
The program generates a deadlock trace in the deadlock.log file, similar to the annotated trace shown below.
A deadlock trace begins with a separator:
============================================================
Followed by a timestamp and a short description of the deadlock.
2016-06-17 11:02:22.746084 Deadlock detected in transaction 109:1 by engine application::com_intellij_rt_execution_application_AppMain1 running on node A.snippets.
Next there is more detailed information about the deadlock transaction.
TransactionID = 109:1
Node = A.snippets
Name = com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker
Begin Time = 2016-06-17 11:02:22.245182
State = deadlocked
Followed by a description of the object and lock type for the deadlock. This example shows that the deadlock occurred in transaction 109:1 attempting to take a write lock on an object ...MyManagedObject:43.
Lock Type = write lock
Target Object = com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:43 (3184101770:178056336:270224610788623:43)
Followed by a list of the transaction locks held on the target object at the time of the deadlock. This example shows that transaction 108:1 has a read lock on the target object.
Locks on Target Object:
    read lock held by transaction 108:1
Number of Target Object Write Lock Waiters = 0
Next is a list of locks held by the deadlock transaction. Note that this example shows the deadlock transaction holding a read lock on ...MyManagedObject:39.
Locks held by transaction 109:1:
    com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:39 (3184101770:178056336:270224610788623:39) read lock
The next section shows a transaction callstack for the deadlock transaction. A transaction callstack contains transaction life cycle entries and entries showing the transaction's thread/process usage. A transaction callstack is read from bottom to top and always starts with a begin transaction entry. This example shows a transaction that deadlocked while using a single thread (thread ID 28488, engine 107).
Transaction callstack for 109:1:
TranId    Engine    ThreadId    Method
109:1     107       28488       deadlock on com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:43
109:1     107       28488       begin transaction
Next are thread stack traces for each of the threads being used by the transaction at the time of the deadlock.
Thread stack traces are read from bottom to top.
Thread stacks for transaction 109:1:
TranId    Engine    ThreadId    Stack type    Method
109:1     107       28488       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
109:1     107       28488       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:115)
109:1     107       28488       Java          com.kabira.platform.Transaction.execute(Transaction.java:478)
109:1     107       28488       Java          com.kabira.platform.Transaction.execute(Transaction.java:560)
109:1     107       28488       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:81)
The next section shows a list of engines installed in the node and their IDs. This maps to the Engine column in the transaction and thread sections.
Engines installed on node A.snippets:
ID     Name
100    System::swcoordadmin
101    System::kssl
102    System::administration
103    Dtm::distribution
107    application::com_intellij_rt_execution_application_AppMain1
The next sections show the same transaction information (when available) for each of the other transactions involved in the deadlock.
Other involved transactions:
TransactionID = 108:1
Node = A.snippets
Name = com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker
Begin Time = 2016-06-17 11:02:22.245172
This section shows that transaction 108:1 is blocked waiting for a write lock on object ...MyManagedObject:39, which is held with a read lock by 109:1, the deadlocked transaction.
State = blocked
Lock Type = write lock
Target Object = com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:39 (3184101770:178056336:270224610788623:39)
Locks on Target Object:
    read lock held by transaction 109:1
Number of Target Object Write Lock Waiters = 1
Transaction callstack for 108:1:
TranId    Engine    ThreadId    Method
108:1     107       28489       begin transaction
Thread stacks for transaction 108:1:
TranId    Engine    ThreadId    Stack type    Method
108:1     107       28489       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
108:1     107       28489       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$Deadlocker.run(Deadlock.java:115)
108:1     107       28489       Java          com.kabira.platform.Transaction.execute(Transaction.java:478)
108:1     107       28489       Java          com.kabira.platform.Transaction.execute(Transaction.java:560)
108:1     107       28489       Java          com.tibco.ep.dtm.snippets.tuning.Deadlock$MyThread.run(Deadlock.java:81)
Locks held by transaction 108:1:
    com.tibco.ep.dtm.snippets.tuning.Deadlock$MyManagedObject:43 (3184101770:178056336:270224610788623:43) read lock
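The lock order deadlock analyzed above arises because the two transactions touch the same pair of objects in opposite orders. A common remedy is to have every code path acquire its locks in one agreed order. The sketch below illustrates that idea with plain java.util.concurrent locks rather than Spotfire Streaming transaction locks, so it is an analogy only; OrderedLocking, Resource, and rank are hypothetical names.

```java
import java.util.concurrent.locks.ReentrantLock;

public class OrderedLocking {
    /** A lockable resource with a fixed rank that defines the locking order. */
    static final class Resource {
        final int rank;
        final ReentrantLock lock = new ReentrantLock();
        int value;

        Resource(int rank) {
            this.rank = rank;
        }
    }

    /** Locks both resources in ascending rank order, whatever the argument order. */
    static void updateBoth(Resource a, Resource b, int newValue) {
        Resource first = (a.rank <= b.rank) ? a : b;
        Resource second = (first == a) ? b : a;
        first.lock.lock();
        try {
            second.lock.lock();
            try {
                a.value = newValue;
                b.value = newValue;
            } finally {
                second.lock.unlock();
            }
        } finally {
            first.lock.unlock();
        }
    }

    /** Runs two threads that pass the resources in opposite orders; true if both finish. */
    static boolean runDemo() {
        final Resource r1 = new Resource(1);
        final Resource r2 = new Resource(2);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) updateBoth(r1, r2, i); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 1000; i++) updateBoth(r2, r1, i); });
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            t1.join(5000);
            t2.join(5000);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runDemo());
    }
}
```

In a Spotfire Streaming transaction, the equivalent change would be to access the managed objects in the same order in every transaction; whether that is practical depends on the application's object model.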
Lock promotion occurs when a transaction that currently holds a read lock on an object attempts to acquire a write lock on the same object (that is, promoting the read lock to a write lock). If blocking for this write lock would result in a deadlock, it is called a promotion deadlock.
The program below generates a single promotion deadlock between two threads, running in a single JVM, in a single node.
package com.tibco.ep.dtm.snippets.tuning;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Promotion deadlock Example from the Tuning Guide.
 */
public class PromotionDeadlock
{
    private static MyManagedObject targetObject;

    /**
     * Main entry point
     * @param args Not used
     * @throws InterruptedException Execution interrupted
     */
    public static void main(String[] args) throws InterruptedException
    {
        //
        // Create a Managed object.
        //
        new Transaction("Create Objects")
        {
            @Override
            public void run()
            {
                targetObject = new MyManagedObject();
            }
        }.execute();

        //
        // Create a pair of transaction classes that will both
        // promote lock the Managed object, resulting in a
        // promotion deadlock.
        //
        Deadlocker deadlocker1 = new Deadlocker(targetObject);
        Deadlocker deadlocker2 = new Deadlocker(targetObject);

        //
        // Run them in separate threads until a deadlock is seen.
        //
        while ((deadlocker1.getNumberDeadlocks() == 0)
            && (deadlocker2.getNumberDeadlocks() == 0))
        {
            MyThread thread1 = new MyThread(deadlocker1);
            MyThread thread2 = new MyThread(deadlocker2);

            thread1.start();
            thread2.start();

            thread1.join();
            thread2.join();
        }
    }

    @Managed
    private static class MyManagedObject
    {
        int value;
    }

    private static class MyThread extends Thread
    {
        private final Deadlocker m_deadlocker;

        MyThread(Deadlocker deadlocker)
        {
            m_deadlocker = deadlocker;
        }

        @Override
        public void run()
        {
            m_deadlocker.execute();
        }
    }

    private static class Deadlocker extends Transaction
    {
        private final MyManagedObject m_targetObject;

        Deadlocker(MyManagedObject targetObject)
        {
            m_targetObject = targetObject;
        }

        @Override
        public void run()
        {
            //
            // This will take a transaction read lock on the object.
            //
            @SuppressWarnings("unused")
            int value = m_targetObject.value;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();

            //
            // This will take a transaction write lock on the object
            // (promoting the read lock).
            //
            m_targetObject.value = 42;

            //
            // Wait a while to maximize the possibility of contention.
            //
            blockForAMoment();
        }

        private void blockForAMoment()
        {
            try
            {
                Thread.sleep(500);
            }
            catch (InterruptedException ex)
            {
            }
        }
    }
}
The trace messages are similar to those shown in the previous section for a lock order deadlock, with the difference being that promotion deadlock is mentioned:
============================================================
2016-06-17 10:52:46.948868 Deadlock detected in transaction 86:1 by engine application::com_intellij_rt_execution_application_AppMain0 running on node A.snippets.
TransactionID = 86:1
Node = A.snippets
Name = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker
Begin Time = 2016-06-17 10:52:46.448477
State = deadlocked
Lock Type = promote lock
Target Object = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:11 (3184101770:8762792:270224610788623:11)
Locks on Target Object:
    read lock (and promote waiter) held by transaction 85:1
    read lock held by transaction 86:1
Number of Target Object Write Lock Waiters = 0
Locks held by transaction 86:1:
    com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:11 (3184101770:8762792:270224610788623:11) read lock
Transaction callstack for 86:1:
TranId    Engine    ThreadId    Method
86:1      105       27318       promotion deadlock on com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:11
86:1      105       27318       begin transaction
Thread stacks for transaction 86:1:
TranId    Engine    ThreadId    Stack type    Method
86:1      105       27318       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
86:1      105       27318       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker.run(PromotionDeadlock.java:116)
86:1      105       27318       Java          com.kabira.platform.Transaction.execute(Transaction.java:478)
86:1      105       27318       Java          com.kabira.platform.Transaction.execute(Transaction.java:560)
86:1      105       27318       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyThread.run(PromotionDeadlock.java:83)
Engines installed on node A.snippets:
ID     Name
100    System::swcoordadmin
101    System::kssl
102    System::administration
103    Dtm::distribution
105    application::com_intellij_rt_execution_application_AppMain0
Other involved transactions:
TransactionID = 85:1
Node = A.snippets
Name = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker
Begin Time = 2016-06-17 10:52:46.448434
State = blocked
Lock Type = promote lock
Target Object = com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:11 (3184101770:8762792:270224610788623:11)
Locks on Target Object:
    read lock (and promote waiter) held by transaction 85:1
    read lock held by transaction 86:1
Number of Target Object Write Lock Waiters = 0
Transaction callstack for 85:1:
TranId    Engine    ThreadId    Method
85:1      105       27317       begin transaction
Thread stacks for transaction 85:1:
TranId    Engine    ThreadId    Stack type    Method
85:1      105       27317       Java          com.kabira.platform.NativeRuntime.setInteger(Native Method)
85:1      105       27317       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$Deadlocker.run(PromotionDeadlock.java:116)
85:1      105       27317       Java          com.kabira.platform.Transaction.execute(Transaction.java:478)
85:1      105       27317       Java          com.kabira.platform.Transaction.execute(Transaction.java:560)
85:1      105       27317       Java          com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyThread.run(PromotionDeadlock.java:83)
Locks held by transaction 85:1:
    com.tibco.ep.dtm.snippets.tuning.PromotionDeadlock$MyManagedObject:11 (3184101770:8762792:270224610788623:11) read lock
The previous examples showed simple deadlocks occurring between two transactions. More complex deadlocks, involving more than two transactions, are possible. For example, transaction 1 deadlocks trying to acquire a lock on an object held by transaction 2, which is blocked waiting on an object held by transaction 3.
To aid in analyzing complex deadlocks, the following is found in the trace messages:
For each contended object, a display of the locks is included, including any promotion waiters.
If the runtime detects that a deadlock happens due to a read lock being blocked, it includes the transaction blocked waiting for the promotion.
Single node deadlocks are bad for performance because they are a source of contention, leading to lower throughput, higher latency, and higher CPU cost. But the deadlocks are detected immediately, because each node has a built-in transaction lock manager.
Distributed deadlocks are extremely bad for performance because they use a timeout mechanism for deadlock detection. The default setting for this timeout is 60 seconds in a production build.
The program below will generate a distributed transaction lock ordering deadlock between two transactions running across multiple nodes.
package com.tibco.ep.dtm.snippets.tuning;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;
import com.kabira.platform.highavailability.PartitionManager;
import com.kabira.platform.highavailability.PartitionManager.EnableAction;
import com.kabira.platform.highavailability.PartitionMapper;
import com.kabira.platform.highavailability.ReplicaNode;
import static com.kabira.platform.highavailability.ReplicaNode.ReplicationType.*;
import com.kabira.platform.property.Status;

/**
 * Distributed deadlock example from the Tuning Guide
 * <h2> Target Nodes</h2>
 * <ul>
 * <li> <b>servicename</b>=snippets
 * </ul>
 * Note this sample blocks on the B.snippets and C.snippets nodes,
 * and needs to be explicitly stopped.
 */
public class DistributedDeadlock
{
    private static TestObject object1;
    private static TestObject object2;
    private static final String nodeName = System.getProperty(Status.NODE_NAME);
    private static final String NODE_A = "A.snippets";
    private static final String NODE_B = "B.snippets";
    private static final String NODE_C = "C.snippets";

    /**
     * Main entry point
     * @param args Not used
     * @throws InterruptedException Execution interrupted
     */
    public static void main(String[] args) throws InterruptedException
    {
        //
        // Install a partition mapper on each node
        //
        AssignPartitions.installPartitionMapper();

        //
        // Block all but the A node.
        //
        new NodeChecker().blockAllButA();

        //
        // Define the partitions to be used by this snippet
        //
        new PartitionCreator().createPartitions();

        //
        // Create a pair of objects, one active on node B,
        // and the other active on node C.
        //
        new Transaction("Create Objects")
        {
            @Override
            public void run()
            {
                object1 = new TestObject();
                object2 = new TestObject();

                //
                // For each distributed object, assign it a
                // reference to the other.
                //
                object1.otherObject = object2;
                object2.otherObject = object1;
            }
        }.execute();

        //
        // Spawn a Deadlocker thread for each of the two objects.
        //
        new Transaction("Spawn Deadlockers")
        {
            @Override
            public void run()
            {
                //
                // Ask them each to spawn a Deadlocker thread.
                // This should execute on node B for one of them
                // and node C for the other.
                //
                object1.spawnDeadlocker();
                object2.spawnDeadlocker();
            }
        }.execute();

        //
        // Now block main in the A node to keep the JVM from exiting.
        //
        new NodeChecker().block();
    }

    private static class PartitionCreator
    {
        void createPartitions()
        {
            new Transaction("Partition Definition")
            {
                @Override
                protected void run() throws Rollback
                {
                    //
                    // Set up the node lists - notice that the odd node list
                    // has node B as the active node, while the even
                    // node list has node C as the active node.
                    //
                    ReplicaNode [] evenReplicaList = new ReplicaNode []
                    {
                        new ReplicaNode(NODE_C, SYNCHRONOUS),
                        new ReplicaNode(NODE_A, SYNCHRONOUS)
                    };
                    ReplicaNode [] oddReplicaList = new ReplicaNode []
                    {
                        new ReplicaNode(NODE_B, SYNCHRONOUS),
                        new ReplicaNode(NODE_A, SYNCHRONOUS)
                    };

                    //
                    // Define two partitions
                    //
                    PartitionManager.definePartition("Even", null, NODE_B, evenReplicaList);
                    PartitionManager.definePartition("Odd", null, NODE_C, oddReplicaList);

                    //
                    // Enable the partitions
                    //
                    PartitionManager.enablePartitions(
                        EnableAction.JOIN_CLUSTER_PURGE);
                }
            }.execute();
        }
    }

    //
    // Partition mapper that maps objects to either Even or Odd
    //
    private static class AssignPartitions extends PartitionMapper
    {
        private Integer m_count = 0;

        @Override
        public String getPartition(Object obj)
        {
            this.m_count++;
            String partition = "Even";

            if ((this.m_count % 2) == 1)
            {
                partition = "Odd";
            }
            return partition;
        }

        static void installPartitionMapper()
        {
            new Transaction("installPartitionMapper")
            {
                @Override
                protected void run()
                {
                    //
                    // Install the partition mapper
                    //
                    PartitionManager.setMapper(
                        TestObject.class, new AssignPartitions());
                }
            }.execute();
        }
    }

    @Managed
    private static class TestObject
    {
        TestObject otherObject;
        @SuppressWarnings("unused")
        private String m_data;

        public void lockObjects()
        {
            Transaction.setTransactionDescription("locking first object");
            doWork();

            //
            // Delay longer on the B node to try to force the deadlock
            // to occur on the C. Otherwise, both sides could see
            // deadlocks at the same time, making the log files less clear
            // for this snippet.
            //
            if (nodeName.equals(NODE_B))
            {
                block(10000);
            }
            else
            {
                block(500);
            }

            Transaction.setTransactionDescription("locking second object");
            otherObject.doWork();

            block(500);
        }

        public void spawnDeadlocker()
        {
            new DeadlockThread(this).start();
        }

        private void block(int milliseconds)
        {
            try
            {
                Thread.sleep(milliseconds);
            }
            catch (InterruptedException ex)
            {
            }
        }

        private void doWork()
        {
            m_data = "work";
        }
    }

    private static class DeadlockThread extends Thread
    {
        private final Transaction m_deadlockTransaction;

        DeadlockThread(TestObject object)
        {
            m_deadlockTransaction = new DeadlockTransaction("DeadlockThread", object);
        }

        @Override
        public void run()
        {
            while (true)
            {
                if (m_deadlockTransaction.execute() == Transaction.Result.ROLLBACK)
                {
                    return;
                }
            }
        }
    }

    private static class DeadlockTransaction extends Transaction
    {
        private final TestObject m_object;

        DeadlockTransaction(final String name, TestObject object)
        {
            super(name);
            m_object = object;
        }

        @Override
        public void run() throws Rollback
        {
            if (getNumberDeadlocks() != 0)
            {
                System.out.println("A deadlock has been seen, "
                    + "you may now stop the distributed application");
                throw new Transaction.Rollback();
            }
            m_object.lockObjects();
        }
    }

    private static class NodeChecker
    {
        //
        // If we are not the A node, block here forever
        //
        void blockAllButA()
        {
            while (!nodeName.equals(NODE_A))
            {
                block();
            }
        }

        public void block()
        {
            while (true)
            {
                try
                {
                    Thread.sleep(500);
                }
                catch (InterruptedException ex)
                {
                }
            }
        }
    }
}
The program should produce a deadlock that is processed on node C and recorded in the node C deadlock.log file. The log output looks similar to the following:
============================================================
The deadlock trace is generated on the node where the distributed transaction was started. This is not the node where the deadlock timeout occurred.
2016-06-17 11:51:32.618439 Global transaction deadlock processed on by engine Dtm::distribution running on node C.snippets in transaction 141:1

TransactionID = 141:1
GlobalTransactionID = serializable:3080819280765915:141:1:272780508690721
Node = C.snippets
Name = DeadlockThread
Description = locking second object
Begin Time = 2016-06-17 11:50:31.830473
State = distributed deadlock

Locks held by transaction 141:1:
    com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46 (3184101770:3037728096:270224610788623:46) write lock

Transaction callstack for 141:1:
TranId  Engine  ThreadId  Method
141:1   103     30698     distribution calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60
141:1   103     30698     dispatch calling [distributed dispatch] on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60
141:1   109     32695     begin transaction

Thread stacks for transaction 141:1:
TranId  Engine  ThreadId  Stack type  Method
141:1   109     32695     Java        com.kabira.platform.NativeRuntime.sendTwoWay(Native Method)
141:1   109     32695     Java        com.kabira.platform.NativeRuntime.sendTwoWay(NativeRuntime.java:111)
141:1   109     32695     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.doWork(DistributedDeadlock.java)
141:1   109     32695     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$lockObjectsImpl(DistributedDeadlock.java:207)
141:1   109     32695     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.lockObjects(DistributedDeadlock.java)
141:1   109     32695     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$DeadlockTransaction.run(DistributedDeadlock.java:279)
141:1   109     32695     Java        com.kabira.platform.Transaction.execute(Transaction.java:478)
141:1   109     32695     Java        com.kabira.platform.Transaction.execute(Transaction.java:560)
141:1   109     32695     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$DeadlockThread.run(DistributedDeadlock.java:250)
141:1   103     30698     Native      SWProcessManager::stackTrace()
141:1   103     30698     Native      OSDispStackTraceNotifier::stackTrace()
141:1   103     30698     Native      OSCallstack::collectCallstack()
141:1   103     30698     Native      OSDeadlockReport::loadThreadStacks()
141:1   103     30698     Native      OSDeadlockReport::distributedDeadlockReport()
141:1   103     30698     Native      CSComm::handleDeadlockError()
141:1   103     30698     Native      CSComm::handleRetryableError()
141:1   103     30698     Native      CSComm::sendTwoWay()
141:1   103     30698     Native      CSMetaDispatcher()
141:1   103     30698     Native      OSDispChannel::callTwoWay()
141:1   103     30698     Native      OSDispChannel::callDispatchFunc()
141:1   103     30698     Native      OSThreadedDispChannel::dispatchUserEvent()
141:1   103     30698     Native      OSThreadedDispChannel::start()
141:1   103     30698     Native      startFunction()
141:1   103     30698     Native      clone

Engines installed on node C.snippets:
ID   Name
100  System::swcoordadmin
101  System::kssl
102  System::administration
103  Dtm::distribution
109  application::com_intellij_rt_execution_application_AppMain2
Next comes information from the remote node, where the deadlock timeout occurred.
Remote deadlock information:

com.kabira.ktvm.transaction.DeadlockError:
2016-06-17 11:51:32.363282 Deadlock detected in transaction 139:4 by engine application::com_intellij_rt_execution_application_AppMain2 running on node B.snippets.

TransactionID = 139:4
GlobalTransactionID = serializable:3080819280765915:141:1:272780508690721
Node = B.snippets
Begin Time = 2016-06-17 11:50:32.336391
State = time out, distributed deadlock

Lock Type = write lock
Target Object = com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60 (3184101770:3037728096:270224610788623:60)

Locks on Target Object:
    write lock held by transaction 144:1
Number of Target Object Write Lock Waiters = 1

Locks held by transaction 139:4:
    com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46 (3184101770:3037728096:270224610788623:46) write lock

Transaction callstack for 139:4:
TranId  Engine  ThreadId  Method
139:4   109     32600     distributed deadlock on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60
139:4   109     32600     dispatch calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60
139:4   103     32029     begin transaction

Thread stacks for transaction 139:4:
TranId  Engine  ThreadId  Stack type  Method
139:4   103     32029     Native      SWQCB::queueTwoWayEvent()
139:4   103     32029     Native      SWEventChan::sendTwoWayEvent()
139:4   103     32029     Native      OSDispatch::sendTwoWayViaEventBus()
139:4   103     32029     Native      OSDispatch::sendTwoWayRequest()
139:4   103     32029     Native      CSReadChannel::processTwoWayRequest()
139:4   103     32029     Native      CSReadChannel::processRequest()
139:4   103     32029     Native      CSNetReader::execute()
139:4   103     32029     Native      SWEngineThreadHandler::start()
139:4   103     32029     Native      startFunction()
139:4   103     32029     Native      clone
139:4   109     32600     Java        com.kabira.platform.NativeRuntime.setReference(Native Method)
139:4   109     32600     Java        com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl(DistributedDeadlock.java:230)

Engines installed on node B.snippets:
ID   Name
100  System::swcoordadmin
101  System::kssl
102  System::administration
103  Dtm::distribution
109  application::com_intellij_rt_execution_application_AppMain2

Other involved transactions:

TransactionID = 144:1
GlobalTransactionID = serializable:3124420528571642:144:1:272698692647770
Node = B.snippets
Name = DeadlockThread
Description = locking second object
Begin Time = 2016-06-17 11:50:31.839979
State = state not available, transaction may be running

Transaction callstack for 144:1:
TranId  Engine  ThreadId  Method
144:1   103     30462     distribution calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46
144:1   103     30462     dispatch calling [distributed dispatch] on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46
144:1   109     32696     begin transaction

Locks held by transaction 144:1:
    com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60 (3184101770:3037728096:270224610788623:60) write lock

    at com.kabira.platform.NativeRuntime.setReference(Native Method)
    at com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl(DistributedDeadlock.java:230)
Also included from the remote node is a list of all transactions on that node that were blocked at the time of the deadlock.
All local blocked transactions on node B.snippets:

Transaction [serializable:3124420528571642:144:1:272698692647770, tid 30718], started at 2016-06-17 11:50:41.841842, is blocked waiting for a write lock on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46 (3184101770:3037728096:270224610788623:46)
    locks write { 'DeadlockThread'[serializable:3080819280765915:141:1:272780508690721, tid 32695, locking second object] } {1 write waiters }

Transaction callstack for transaction 142:1:
    Engine 103 Thread 30718 begin transaction
    Engine 109 Thread 32642 dispatch calling com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject.$doWorkImpl()V on com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:46

Objects currently locked in transaction [serializable:3124420528571642:144:1:272698692647770, tid 30718]
    com.tibco.ep.dtm.snippets.tuning.DistributedDeadlock$TestObject:60 (3184101770:3037728096:270224610788623:60) write lock
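In the snippet, each transaction write-locks its own object first and the other object second, so the two transactions take the same pair of locks in opposite order. The following is a minimal plain-JDK sketch of the usual remedy, acquiring locks in one agreed order. It is an analogy only: it uses java.util.concurrent locks rather than the runtime's transaction locks, and the names OrderedLocking, Item, and updateBoth are hypothetical. It also assumes each item has a unique id.

```java
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical illustration: acquire two locks in a fixed (id) order. */
final class OrderedLocking {

    static final class Item {
        final int id;                 // assumed unique per item
        final ReentrantLock lock = new ReentrantLock();
        int updates;                  // guarded by lock

        Item(int id) {
            this.id = id;
        }
    }

    /** Updates both items, always locking the lower id first. */
    static void updateBoth(Item first, Item second) {
        Item low = (first.id <= second.id) ? first : second;
        Item high = (low == first) ? second : first;

        low.lock.lock();
        try {
            high.lock.lock();
            try {
                first.updates++;
                second.updates++;
            } finally {
                high.lock.unlock();
            }
        } finally {
            low.lock.unlock();
        }
    }

    /**
     * Runs two threads that name the items in opposite order.
     * Returns true if both threads finished and all updates were applied.
     */
    static boolean runDemo(int iterations) {
        Item a = new Item(1);
        Item b = new Item(2);

        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                updateBoth(a, b);
            }
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                updateBoth(b, a);
            }
        });

        t1.start();
        t2.start();
        try {
            t1.join(5000);
            t2.join(5000);
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t1.isAlive() && !t2.isAlive()
            && a.updates == 2 * iterations
            && b.updates == 2 * iterations;
    }

    public static void main(String[] args) {
        System.out.println("completed without deadlock: " + runDemo(1000));
    }
}
```

Because both threads lock the lower id first, neither can hold one lock while waiting for a lock the other already holds. Whether a comparable ordering is practical for managed objects depends on the application.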
The transaction statistic can show which classes are involved in transaction lock contention. Often, this is sufficient to help a developer who is already familiar with the application identify changes that reduce the contention. For cases where the code paths involved in the contention are not already known, the transactioncontention statistic can be useful.
Enabling the transactioncontention statistic causes the Spotfire Streaming runtime to collect a stack backtrace each time a transaction lock encounters contention. The stacks are saved per managed class name.
Note
The collection of transaction contention statistics is very expensive computationally and should only be used in development or test systems.
To use transaction contention statistics, enable them with the epadmin enable statistics --statistics=transactioncontention command.
If your application is not already running, start it. This example uses the TransactionContention snippet shown below.
package com.tibco.ep.dtm.snippets.tuning;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Simple transaction contention generator
 * <p>
 * Note this sample needs to be explicitly stopped.
 */
public class TransactionContention
{
    /**
     * Main entry point
     * @param args Not used
     */
    public static void main(String[] args)
    {
        //
        // Create a managed object to use for
        // generating transaction lock contention
        //
        final MyManaged myManaged = createMyManaged();

        //
        // Create/start a thread which will
        // transactionally contend for the object.
        //
        new MyThread(myManaged).start();

        while (true)
        {
            //
            // Contend for the object here from
            // the main thread (competing
            // with the thread started above).
            //
            generateContention(myManaged);
            nap(200);
        }
    }

    private static MyManaged createMyManaged()
    {
        return new Transaction("createMyManaged")
        {
            MyManaged m_object;

            @Override
            protected void run()
            {
                m_object = new MyManaged();
            }

            MyManaged create()
            {
                execute();
                return m_object;
            }
        }.create();
    }

    private static void generateContention(final MyManaged myManaged)
    {
        new Transaction("generateContention")
        {
            @Override
            protected void run()
            {
                writeLockObject(myManaged);
            }
        }.execute();
    }

    @Managed
    private static class MyManaged
    {
    }

    private static void nap(int milliseconds)
    {
        try
        {
            Thread.sleep(milliseconds);
        }
        catch (InterruptedException e)
        {
        }
    }

    private static class MyThread extends Thread
    {
        MyManaged m_object;

        MyThread(MyManaged myManaged)
        {
            m_object = myManaged;
        }

        @Override
        public void run()
        {
            while (true)
            {
                generateContention(m_object);
                nap(200);
            }
        }
    }
}
After your application has run long enough to generate some transaction lock contention, stop the data collection with the epadmin disable statistics --statistics=transactioncontention command.
Display the collected data with the epadmin display statistics --statistics=transactioncontention command.
======== transaction contention report for A ========

24 occurrences on type com.kabira.snippets.tuning.TransactionContention$MyManaged of stack:

    com.kabira.platform.Transaction.lockObject(Native Method)
    com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
    com.kabira.snippets.tuning.TransactionContention$2.run(TransactionContention.java:48)
    com.kabira.platform.Transaction.execute(Transaction.java:484)
    com.kabira.platform.Transaction.execute(Transaction.java:542)
    com.kabira.snippets.tuning.TransactionContention.generateContention(TransactionContention.java:43)
    com.kabira.snippets.tuning.TransactionContention$MyThread.run(TransactionContention.java:84)

57 occurrences on type com.kabira.snippets.tuning.TransactionContention$MyManaged of stack:

    com.kabira.platform.Transaction.lockObject(Native Method)
    com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
    com.kabira.snippets.tuning.TransactionContention$2.run(TransactionContention.java:48)
    com.kabira.platform.Transaction.execute(Transaction.java:484)
    com.kabira.platform.Transaction.execute(Transaction.java:542)
    com.kabira.snippets.tuning.TransactionContention.generateContention(TransactionContention.java:43)
    com.kabira.snippets.tuning.TransactionContention.main(TransactionContention.java:16)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:483)
    com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:483)
    com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:65)
This output shows the two call paths that experienced contention: one through the MyThread thread and one through the main thread.
The collected data may be cleared with the epadmin clear statistics --statistics=transactioncontention command.
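The report groups identical stack backtraces under the managed class name and counts how often each one was seen. The following is a minimal sketch of that kind of bookkeeping in plain JDK code. It is not the runtime's implementation; the class ContentionTally and its methods are hypothetical names used only to illustrate the shape of the data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration: counts identical stack backtraces
 * per class name. Not the Spotfire Streaming implementation.
 */
final class ContentionTally {
    // class name -> (stack frames -> number of occurrences)
    private final Map<String, Map<List<String>, Integer>> m_counts =
        new LinkedHashMap<>();

    /** Records one contention event for the class with the given stack. */
    synchronized void record(String className, List<String> frames) {
        m_counts.computeIfAbsent(className, k -> new LinkedHashMap<>())
                .merge(new ArrayList<>(frames), 1, Integer::sum);
    }

    /** Returns how many times this exact stack was recorded for the class. */
    synchronized int occurrences(String className, List<String> frames) {
        Map<List<String>, Integer> stacks = m_counts.get(className);
        if (stacks == null) {
            return 0;
        }
        Integer count = stacks.get(frames);
        return (count == null) ? 0 : count;
    }

    /** Formats the tallies in a layout loosely modeled on the report. */
    synchronized String report() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Map<List<String>, Integer>> byClass
                : m_counts.entrySet()) {
            for (Map.Entry<List<String>, Integer> byStack
                    : byClass.getValue().entrySet()) {
                sb.append(byStack.getValue())
                  .append(" occurrences on type ")
                  .append(byClass.getKey())
                  .append(" of stack:\n");
                for (String frame : byStack.getKey()) {
                    sb.append("    ").append(frame).append('\n');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ContentionTally tally = new ContentionTally();
        // Placeholder frame names, not real backtraces
        List<String> fromThread = Arrays.asList("lockObject", "MyThread.run");
        List<String> fromMain = Arrays.asList("lockObject", "main");

        tally.record("MyManaged", fromThread);
        tally.record("MyManaged", fromMain);
        tally.record("MyManaged", fromMain);
        System.out.print(tally.report());
    }
}
```

Keying the inner map on the full list of frames means two events count as the same call path only when every frame matches, which is why the report above lists the thread path and the main path separately.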
Transaction lock promotion can lead to deadlocks. The transaction statistic can show which classes are involved in transaction lock promotion. Often, this is sufficient to help a developer who is already familiar with the application identify changes that remove the lock promotions. For cases where the code paths involved in the promotion are not already known, the transactionpromotion statistic can be useful.
Enabling the transactionpromotion statistic causes the Spotfire Streaming runtime to collect a stack backtrace each time a transaction lock is promoted from read to write. The stacks are saved per managed class name.
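Promotion is deadlock-prone in general because two holders of a read lock on the same object can each wait for the other to release it before either can obtain the write lock. The following plain-JDK sketch illustrates the read-then-write pattern and the common alternative of taking the write lock first. It is an analogy only: java.util.concurrent.locks.ReentrantReadWriteLock is not the runtime's transaction lock, and unlike the runtime it refuses promotion outright rather than performing it. The class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration of read-to-write promotion using a JDK lock. */
final class PromotionSketch {

    /**
     * Takes the read lock, then attempts to take the write lock.
     * Returns true only if the write lock was obtained.
     */
    static boolean tryPromote(ReentrantReadWriteLock lock) {
        lock.readLock().lock();
        try {
            // The JDK lock does not support upgrading: the write lock
            // cannot be obtained while a read lock is held.
            boolean promoted = lock.writeLock().tryLock();
            if (promoted) {
                lock.writeLock().unlock();
            }
            return promoted;
        } finally {
            lock.readLock().unlock();
        }
    }

    /**
     * Takes the write lock up front, so no promotion is needed.
     * Returns true if the write lock was obtained.
     */
    static boolean writeFirst(ReentrantReadWriteLock lock) {
        boolean locked = lock.writeLock().tryLock();
        if (locked) {
            try {
                // Reading is permitted while holding the write lock.
                lock.readLock().lock();
                lock.readLock().unlock();
            } finally {
                lock.writeLock().unlock();
            }
        }
        return locked;
    }

    public static void main(String[] args) {
        System.out.println("read then write: "
            + tryPromote(new ReentrantReadWriteLock()));
        System.out.println("write first:     "
            + writeFirst(new ReentrantReadWriteLock()));
    }
}
```

By analogy, code that knows it will modify a managed object may be able to avoid promotion by taking the write lock at the start of the transaction. Whether that trade-off is appropriate depends on how much read concurrency the application needs.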
Note
The collection of transaction promotion statistics is very expensive computationally and should only be used in development or test systems.
To use transaction promotion statistics, enable them with the epadmin enable statistics --statistics=transactionpromotion command.
If your application is not already running, start it. This example uses the TransactionPromotion snippet shown below.
package com.tibco.ep.dtm.snippets.tuning;

import com.kabira.platform.Transaction;
import com.kabira.platform.annotation.Managed;

/**
 * Simple transaction promotion generator
 */
public class TransactionPromotion
{
    private static final MyManaged m_myManaged = createObject();

    /**
     * Main entry point
     * @param args Not used
     */
    public static void main(String[] args)
    {
        new Transaction("promotion")
        {
            @Override
            protected void run()
            {
                readLockObject(m_myManaged);

                // Do promotion
                writeLockObject(m_myManaged);
            }
        }.execute();
    }

    private static MyManaged createObject()
    {
        return new Transaction("createObject")
        {
            MyManaged m_object;

            @Override
            protected void run()
            {
                m_object = new MyManaged();
            }

            MyManaged create()
            {
                execute();
                return m_object;
            }
        }.create();
    }

    @Managed
    private static class MyManaged
    {
    }
}
After your application has run, stop the data collection with the epadmin disable statistics --statistics=transactionpromotion command.
Display the collected data with the epadmin display statistics --statistics=transactionpromotion command.
======== Transaction Promotion report for A ========

Data gathered between 2015-03-20 10:27:18 PDT and 2015-03-20 10:28:04 PDT.

1 occurrence on type com.kabira.snippets.tuning.TransactionPromotion$MyManaged of stack:

    com.kabira.platform.Transaction.lockObject(Native Method)
    com.kabira.platform.Transaction.writeLockObject(Transaction.java:706)
    com.kabira.snippets.tuning.TransactionPromotion$1.run(TransactionPromotion.java:29)
    com.kabira.platform.Transaction.execute(Transaction.java:484)
    com.kabira.platform.Transaction.execute(Transaction.java:542)
    com.kabira.snippets.tuning.TransactionPromotion.main(TransactionPromotion.java:22)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:483)
    com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
    sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    java.lang.reflect.Method.invoke(Method.java:483)
    com.kabira.platform.MainWrapper.invokeMain(MainWrapper.java:65)
This output shows the call path where the promotion occurred.
The collected data may be cleared with the epadmin clear statistics --statistics=transactionpromotion command.