Profiling

Profiling Overview

You can profile a StreamBase application while it is running to extract and format statistics about the operators, queues, and threads in that application, and about system resources consumed by that application. The extracted statistics are described below in Meanings of the Profiling Statistics Collected. You can also collect raw statistics from a running server into a file and can extract and format profiling statistics from the raw statistics file after the server has stopped.

Use the sbprofile command to extract and format statistics for you to analyze. The sbprofile command does not analyze and draw conclusions from the system statistics. Instead, its purpose is to extract useful information from the system statistics stream and format it for manual or automated inspection.

You can profile interactively to see statistics while an application is actively processing tuples, or you can gather raw statistics into a file for later review and analysis.

Note

The profiling system shows meaningful statistics only for operators, queues, and threads that have run for a while. It is common to have StreamBase applications with operators that use no measurable amount of CPU time when the application has not run very long or has not run with much data. For low latency applications running on fast CPUs, it could take hundreds of thousands of tuples to register statistics greater than zero.

When a StreamBase application starts, the JVM associated with StreamBase Server interprets the bytecode at the start of execution, compiles it into native CPU instructions, and then optimizes it as the application is run. Thus, when profiling, statistics gathered at the start of the run do not reflect the optimization timing that will be automatically applied as the application continues to run. Therefore, the longer the application runs, the more accurate the profile.

The Snapshot Interval and Profile Resolution

Statistics are extracted for formatting and analysis from the statv2 output stream emitted from the system container. Every running StreamBase application has a system container, and the system.statv2 stream is automatically generated. (In StreamBase releases before 7.0, the statistics stream was named system.stat.)

The system.statv2 stream emits statistics for the active operators and queues during each snapshot interval. By default, the snapshot interval is once every second. You cannot change the snapshot interval interactively while the server is running.

The snapshot interval is not the resolution of the statistics sampler in the server, it is the period over which gathered statistics are reported. The statistic sampler's default resolution is to sample the state of the CPU every 10 milliseconds. You can change this default by setting the streambase.profile.intervalMS system property. For example, set streambase.profile.intervalMS=1 to specify CPU sampling every millisecond, which provides roughly millisecond sampling accuracy. StreamBase Manager and sbprofile both benefit from the increased accuracy, but only sbprofile reports microsecond values in timestamps.

The following summarizes the difference between profile resolution and the snapshot interval:

  • Profile resolution, set with the streambase.profile.intervalMS system property, sets how often statistics are actually sampled, independent of whether they are monitored and reported or not.

  • The snapshot interval sets the size of the aggregate period for which statistics are reported and the statistics tuple is emitted.

Profiling While StreamBase Server is Running

You can view operator and queue statistics while StreamBase Server is running, using the following tools:

You can also view operator statistics — but not queue or thread statistics — in the Profiler View of the Test/Debug perspective in StreamBase Studio while running or testing a module. However, remember that StreamBase Studio was designed as a development environment, and is not configured or intended for benchmarking. Always make decisions about your application's performance based on running it with the epadmin command, not when running it in Studio. See Studio Performance Note.

Real-Time Profiling with sbprofile

By default, sbprofile with no arguments connects to the default server URI, sb://localhost:10000, and dequeues from the system.statv2 stream at that URI. By default, sbprofile prints to stdout a set of CSV formatted statistics lines, extracting one line for each active operator, queue, and thread, and printing one set of system statistics for every system.statv2 snapshot interval. The command extracts and formats statistics until it is stopped with Ctrl+C or until the server shuts down.

Collecting Raw Statistics for Profile Analysis

You can gather raw statistics from the system.statv2 stream of a running StreamBase application, and save those statistics to a file for later analysis and review. Use a command like the following example:

sbc -u sburi dequeue system.statv2 > rawstatfile.txt

For StreamBase Server releases before 7.0, use this version of the command:

sbc -u sburi dequeue system.stat > rawstatfile.txt

Substitute the URI of your running StreamBase application using the standard StreamBase URI format, as described in the sburi(5) reference page. Run this sbc command to collect raw statistics until either StreamBase Server shuts down, or you type Ctrl+C to stop the sbc command.

Raw statistics files can grow very large very quickly. Do not try to capture more than an hour or two of statistics for a production server processing real data. You can use the gzip command to compress the raw statistics file.

Meanings of the Profiling Statistics Collected

The tables in this section describe each field of each type of statistics information line in the CSV output format of the sbprofile command. The first time a new statistics type is emitted, it includes a header line that shows a description of the fields of that statistics type.

For most statistic types, sbprofile reports deltas since the previous snapshot interval. There is an exception for garbage collection (GC) statistics, which are passed through as reported by the JVM. GC statistics are reported as the total accumulated collection counts and time since the start of the JVM that is hosting StreamBase Server.

In the default CSV output format, each emitted line begins with a field containing a single letter that distinguishes the output type. The values in the first field are one of the following:

O for operator statistics lines.
D for operator description lines.
G for JVM garbage collection lines.
Q for queue statistics lines.
T for thread statistics lines.
S for system information lines.
s for stream statistics lines, for input or output streams.

The statistics information line types are described in the following subsections:

Operator Statistics
Operator Description
Garbage Collection Statistics
Queue Statistics
Thread Statistics
System Statistics
Stream Statistics

Operator Statistics

For each active operator, the profiling report produced by sbprofile contains the following fields:

Field Order Field for Operator Rows Description
1 Type This field is O for operator statistic lines.
2 Operator Name The name of the operator, usually qualified with the name of its container.
3 Operator In The number of tuples on the input port of this operator in the most recent snapshot interval.
4 Operator Out The number of tuples on the output port of this operator in the most recent snapshot interval.
5 Operator Percent Time The cumulative percentage of CPU time used by this operator since the monitored StreamBase Server instance started.
6 Operator Time The time in microseconds used by this operator in the most recent snapshot interval. This field is likely to be zero or show very small values for most operators in a healthy application.
7 Operator Size The size of the operator, if applicable for this operator. Size is an operator-dependent value, and is only valid for operators that maintain a state. This field shows zero or N/A for all stateless operators. See Meaning of the Operator Size Field for a table listing the possible meanings of the size column for stateful operators.
8 Operator Timestamp The timestamp value, including milliseconds, for this operator statistic event.

Operator Description

For each operator, the first time the operator is seen, a descriptive record can be optionally delivered. The description record contains the following fields:

Field Order Field for Operator Rows Description
1 Type This field is D for operator description lines.
2 Operator name The name of the operator, qualified with the name of its container.
3 Operator type The name of operator type, such as Map, Filter, input adapter, and so on.
4 Operator properties If the operator has any properties defined, they are shown in this field.

Garbage Collection Statistics

Lines beginning with G record JVM garbage collection (GC) events.

Garbage collection statistics are passed through as reported by the JVM hosting StreamBase Server. GC statistics are reported as the total accumulated collection counts and time since the start of the hosting JVM. Notice that this is in contrast to the statistics reported for other statistic types, which are reported as deltas since the previous snapshot interval.

The GC statistic record contains the following fields:

Field Order Field for GC Rows Description
1 Type This field is G for garbage collection event lines.
2 JVM collector name The JVM name for the garbage collector.
3 GC number of collections The number of collections since the JVM started.
4 GC collection time The amount of time taken by the garbage collection event.
5 GC timestamp Approximate timestamp of the garbage collection event.

Queue Statistics

Tuple queues are managed automatically by StreamBase Server. A queue is created when conditions arise that call for a queue; in general, a queue is created for transitions from one thread to another. In practical terms, this occurs when you specify concurrency options for operators, or when a custom operator or adapter uses the sendOutputAsync() API.

For each active queue, the profiling report produced by sbprofile contains the following fields:

Field Order Field for Queue Rows Description
1 Type This field is Q for queue statistic lines.
2 Queue Name The name of the queue.
3 Queue Max Size The maximum queue backlog reached since the current application was started.
4 Queue Current Size The number of tuples passing through this queue in the most recent snapshot interval. In a healthy application, this field should be near zero most of the time. Large numbers in this field point you to a portion of your application that might need to be optimized.
5 Region Load The percentage of the time a region was processing tuples since the last profiler report. A Region Load only applies to top level region queues, and the Region Load for a sub queue will always be zero. A high Region Load (near 100) might indicate that the region is saturated and that optimisation and tuning is required.
6 Queue Timestamp The timestamp including milliseconds value for this queue statistic event.

Thread Statistics

For each active thread, the profiling report produced by sbprofile contains the following fields:

Field Order Field for Thread Rows Description
1 Type This field is T for thread statistic lines.
2 Thread Name The name of the thread.
3 User Time The number of microseconds this thread ran in user mode in the most recent snapshot interval.
4 System Time The number of microseconds this thread ran in system mode in the most recent snapshot interval.
5 Thread Timestamp The timestamp including milliseconds value for this thread statistic event.

System Statistics

For each snapshot interval, the profiling report produced by sbprofile contains one system statistic line with the following fields:

Field Order Field for System Rows Description
1 Type This field is S for system information lines.
2 Maximum Memory The maximum amount of memory which the StreamBase Server process is allowed to allocate.
3 Used Memory The total amount of memory currently in use in the StreamBase Server process.
4 Free Memory The amount of free memory in the StreamBase Server process.
5 Total number of threads The total number of threads running in the JVM (including non-StreamBase threads).
6 Dequeue Client Count The current number of dequeuing clients.
7 Number of seconds running The time that StreamBase Server has been running, in seconds.

Stream Statistics

For each active input and output stream, the CSV-format profiling report produced by sbprofile contains one stream statistic line with the following fields:

Field Order Field for System Rows Description
1 Type This field is lowercase s for stream statistics lines.
2 Stream Name The name of the stream.
3 Input or Output indicator Uppercase I indicates that this line is for an Input Stream, while uppercase O indicates this line is for an Output Stream.
4 Total number of tuples The total number of tuples processed on this stream since server start.
5 Stream timestamp The timestamp including milliseconds value for this stream statistic event.

Meaning of the Operator Size Field

For operators that do not maintain a state, the Operator Size column shows zero or N/A. For the operators in the following table, the Operator Size column has the meaning shown.

Operator Meaning of Size for This Operator
Aggregate The number of open windows. The value may vary as the aggregate windows open and close.
BSort The size of the queue.
Gather The number of pending gather operations.
Java Operators or Adapters The return value of Operator.size(), which can be overridden by implementation. If you program your own Java operators, you can control the meaning of the Operator Size column by using or extending the Operator.size() method in your operator's code.
Join The total number of tuples in both buffers.
Log Adapter The Spotfire Streaming Log Output adapter reports the size of the queue of tuples waiting to be sent to the current logging system, but only when you enable the Log Asynchronously option. This queue is expected to have size 1 under normal circumstances. See The Log Asynchronously Option.
Merge The total number of tuples in both buffers.
Query The number of rows in the associated Query Table.

Module Reference operators show zero or N/A in the Operator Size column. This means that statistics gathering does not descend into the referenced module.

Decision Table Statistics

Decision table statistics are available using --decision-table option in sbprofile. These statistics report:

Field Order Field for System Rows Description
1 Number of conditions evaluated The number of conditions evaluated to true or to false.
2 Time spent evaluating conditions The time used to evaluate all conditions.
3 Number of output tuples created The number of output tuples created after evaluation.
4 Time spent creating output tuples The time used to create output tuples after evaluation.
5 Number of non-matching tuples emitted. The total numbers of tuples do not match after evaluation.

For example:

# d=Decision table name, Conditions Evaluated, Evaluation microseconds, Actions Executed, 
Action Execution microseconds, Non-matching Tuples Emitted
d, default.DecisionTable, 51, 231, 10, 684, 0, 2022-03-25 14:55:48.960
d, default.DecisionTable, 48, 261, 8, 552, 0, 2022-03-25 14:55:49.960
d, default.DecisionTable, 51, 247, 10, 668, 0, 2022-03-25 14:55:50.959

Generating a Profiling Report

Generate a statistics report from your saved raw statistics file with a command like the following:

sbprofile -s rawstatfile.txt -o profile-report.csv

Use -c instead of -o to write the output file in gzip compressed format, or -b to use bzip compression:

sbprofile -s rawstatfile.txt -c profile-report.csv.gz

The report is in CSV format like the examples in this section.

Append the --roll-size or --roll-time option to automatically roll over the output file when it reaches a specified size or when a certain time has elapsed. See sbprofile for syntax details. For example:

sbprofile -s rawstatfile.txt -o profile-report.csv --roll-size 1000
sbprofile -s rawstatfile.txt -c profile-report.csv.gz --roll-time 2H

The following example shows statistics lines of each type, adding operator description lines with sbprofile -d. This is from the Best Bids and Asks sample with Run this component in a parallel region specified for both Filter operators.

D, default.Update_Bids_and_Asks, query
O, default.Update_Bids_and_Asks, 41651, 41651, 9.59, 36426, 2960, 2010-12-09 13:37:28.362
D, default.PreserveBidInfo, map
O, default.PreserveBidInfo, 18380, 18380, 5.66, 55022, N/A, 2010-12-09 13:37:28.362
D, default.PreserveAskInfo, map
O, default.PreserveAskInfo, 10089, 10089, 2.64, 26155, N/A, 2010-12-09 13:37:28.362
D, default.IsNewBestBid:0.IsNewBestBid, filter
O, default.IsNewBestBid:0.IsNewBestBid, 41643, 9048, 1.1, 17791, N/A, 2010-12-09 13:37:28.362
D, default.IsNewBestBid:1.IsNewBestBid, filter
O, default.IsNewBestBid:1.IsNewBestBid, 41701, 9054, 0.44, 0, N/A, 2010-12-09 13:37:28.362
D, default.IsNewBestAsk.IsNewBestAsk, filter
O, default.IsNewBestAsk.IsNewBestAsk, 41701, 9868, 0.27, 0, N/A, 2010-12-09 13:37:28.362
Q, default.IsNewBestBid:0::default:ParallelSequence, 2048, 186, 2010-12-09 13:37:28.362
Q, default.IsNewBestAsk::default:ParallelSequence, 2048, 179, 2010-12-09 13:37:28.362
Q, default::default.IsNewBestAsk:ParallelSequence, 3943, 432, 2010-12-09 13:37:28.362
Q, default::default.IsNewBestBid:1:ParallelSequence, 3349, 433, 2010-12-09 13:37:28.362
Q, default::default.IsNewBestBid:0:ParallelSequence, 8192, 1, 2010-12-09 13:37:28.362
Q, default.IsNewBestBid:1::default:ParallelSequence, 2048, 196, 2010-12-09 13:37:28.362
Q, default.IsNewBestAsk.out:IsNewBestAsk_1::out:IsNewBestAsk_1, 2048, 179, 2010-12-09 13:37:28.362
Q, default.IsNewBestAsk.out:Update_Bids_and_Asks_1:0, 3943, 432, 2010-12-09 13:37:28.362
Q, default.IsNewBestBid.out:IsNewBestBid_1::out:IsNewBestBid_1, 2048, 196, 2010-12-09 13:37:28.362
Q, default.IsNewBestBid:0.out:Update_Bids_and_Asks_1:0, 8192, 1, 2010-12-09 13:37:28.362
Q, default.IsNewBestBid:1.out:Update_Bids_and_Asks_1:1, 4096, 433, 2010-12-09 13:37:28.362
T, ThreadPool-2, 327602, 234001, 2010-12-09 13:37:28.362
T, ThreadPool-4, 15600, 15600, 2010-12-09 13:37:28.362
T, ThreadPool-6, 15600, 0, 2010-12-09 13:37:28.362
T, default.IsNewBestAsk: Queue handler, 15600, 46800, 2010-12-09 13:37:28.362
T, default.IsNewBestBid:0: Queue handler, 15600, 0, 2010-12-09 13:37:28.362
T, default.IsNewBestBid:1: Queue handler, 0, 31200, 2010-12-09 13:37:28.362
T, default: Queue handler, 15600, 0, 2010-12-09 13:37:28.362
S, 477233152, 61640704, 78278656, 24, 2, 130, 2010-12-09 13:37:28.362
O, default.Update_Bids_and_Asks, 37683, 37683, 9.65, 167923, 2960, 2010-12-09 13:37:29.299
O, default.PreserveBidInfo, 16507, 16507, 5.65, 28615, N/A, 2010-12-09 13:37:29.299
O, default.PreserveAskInfo, 9107, 9107, 2.63, 18814, N/A, 2010-12-09 13:37:29.299
...

The following example shows statistics of the G line type that show JVM garbage collection statistics:

G, PS MarkSweep, 0, 0, 2011-04-28 20:41:25.399
G, PS Scavenge, 5, 27, 2011-04-28 20:41:25.399
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:26.399
G, PS Scavenge, 5, 27, 2011-04-28 20:41:26.399
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:27.399
G, PS Scavenge, 5, 27, 2011-04-28 20:41:27.399
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:28.399
G, PS Scavenge, 5, 27, 2011-04-28 20:41:28.399
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:29.399
G, PS Scavenge, 5, 27, 2011-04-28 20:41:29.399
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:30.399

Related Topics

Back to Top ^