Contents
- Profiling Overview
- The Snapshot Interval and Profile Resolution
- Profiling While StreamBase Server is Running
- Real-Time Profiling with sbprofile
- Collecting Raw Statistics for Profile Analysis
- Meanings of the Profiling Statistics Collected
- Meaning of the Operator Size Field
- Generating a Profiling Report
- Related Topics
You can profile a StreamBase application while it is running to extract and format statistics about the operators, queues, and threads in that application, and about system resources consumed by that application. The extracted statistics are described below in Meanings of the Profiling Statistics Collected. You can also collect raw statistics from a running server into a file and can extract and format profiling statistics from the raw statistics file after the server has stopped.
Use the sbprofile command to extract and format statistics for you to analyze. The sbprofile command does not analyze and draw conclusions from the system statistics. Instead, its purpose is to extract useful information from the system statistics stream and format it for manual or automated inspection.
You can profile interactively to see statistics while an application is actively processing tuples, or you can gather raw statistics into a file for later review and analysis.
Note
The profiling system shows meaningful statistics only for operators, queues, and threads that have run for a while. It is common to have StreamBase applications with operators that use no measurable amount of CPU time when the application has not run very long or has not run with much data. For low latency applications running on fast CPUs, it could take hundreds of thousands of tuples to register statistics greater than zero.
When a StreamBase application starts, the JVM associated with StreamBase Server interprets the bytecode at the start of execution, compiles it into native CPU instructions, and then optimizes it as the application is run. Thus, when profiling, statistics gathered at the start of the run do not reflect the optimization timing that will be automatically applied as the application continues to run. Therefore, the longer the application runs, the more accurate the profile.
Statistics are extracted for formatting and analysis from the statv2
output stream emitted from the system
container. Every running StreamBase application has a
system
container, and the system.statv2
stream is automatically generated. (In StreamBase
releases before 7.0, the statistics stream was named system.stat
.)
The system.statv2
stream emits statistics for the active
operators and queues during each snapshot interval. By default, the snapshot interval
is once every second. You cannot change the snapshot interval interactively while the
server is running.
The snapshot interval is not the resolution of the statistics sampler in the server,
it is the period over which gathered statistics are reported. The statistic sampler's
default resolution is to sample the state of the CPU every 10 milliseconds. You can
change this default by setting the streambase.profile.intervalMS
system property. For example, set
streambase.profile.intervalMS=1
to specify CPU sampling
every millisecond, which provides roughly millisecond sampling accuracy. StreamBase
Manager and sbprofile both benefit from
the increased accuracy, but only sbprofile reports microsecond values in timestamps.
The following summarizes the difference between profile resolution and the snapshot interval:
-
Profile resolution, set with the
streambase.profile.intervalMS
system property, sets how often statistics are actually sampled, independent of whether they are monitored and reported or not. -
The snapshot interval sets the size of the aggregate period for which statistics are reported and the statistics tuple is emitted.
You can view operator and queue statistics while StreamBase Server is running, using the following tools:
-
Run sbprofile in a UNIX terminal window or StreamBase Command Prompt.
-
Run StreamBase Manager and review the Operators, Queues, and Threads views.
-
Run your own monitoring tool written with the StreamBase Monitor API, available for Java and .NET.
You can also view operator statistics — but not queue or thread statistics — in the Profiler View of the Test/Debug perspective in StreamBase Studio while running or testing a module. However, remember that StreamBase Studio was designed as a development environment, and is not configured or intended for benchmarking. Always make decisions about your application's performance based on running it with the epadmin command, not when running it in Studio. See Studio Performance Note.
By default, sbprofile with no arguments
connects to the default server URI, sb://localhost:10000
, and dequeues from the system.statv2
stream at that URI. By default, sbprofile prints to stdout
a set of CSV formatted statistics lines, extracting one line
for each active operator, queue, and thread, and printing one set of system
statistics for every system.statv2
snapshot interval. The command
extracts and formats statistics until it is stopped with Ctrl+C
or until the server shuts down.
You can gather raw statistics from the system.statv2
stream of a running StreamBase application, and save those statistics to a file for
later analysis and review. Use a command like the following example:
sbc -u sburi
dequeue system.statv2 > rawstatfile.txt
For StreamBase Server releases before 7.0, use this version of the command:
sbc -u sburi
dequeue system.stat > rawstatfile.txt
Substitute the URI of your running StreamBase application using the standard StreamBase URI format, as described in the sburi(5) reference page. Run this sbc command to collect raw statistics until either StreamBase Server shuts down, or you type Ctrl+C to stop the sbc command.
Raw statistics files can grow very large very quickly. Do not try to capture more than an hour or two of statistics for a production server processing real data. You can use the gzip command to compress the raw statistics file.
The tables in this section describe each field of each type of statistics information line in the CSV output format of the sbprofile command. The first time a new statistics type is emitted, it includes a header line that shows a description of the fields of that statistics type.
For most statistic types, sbprofile reports deltas since the previous snapshot interval. There is an exception for garbage collection (GC) statistics, which are passed through as reported by the JVM. GC statistics are reported as the total accumulated collection counts and time since the start of the JVM that is hosting StreamBase Server.
In the default CSV output format, each emitted line begins with a field containing a single letter that distinguishes the output type. The values in the first field are one of the following:
O for operator statistics lines. |
D for operator description lines. |
G for JVM garbage collection lines. |
Q for queue statistics lines. |
T for thread statistics lines. |
S for system information lines. |
s for stream statistics lines, for input or output streams. |
The statistics information line types are described in the following subsections:
Operator Statistics |
Operator Description |
Garbage Collection Statistics |
Queue Statistics |
Thread Statistics |
System Statistics |
Stream Statistics |
For each active operator, the profiling report produced by sbprofile contains the following fields:
Field Order | Field for Operator Rows | Description |
---|---|---|
1 | Type |
This field is O for operator statistic lines.
|
2 | Operator Name | The name of the operator, usually qualified with the name of its container. |
3 | Operator In | The number of tuples on the input port of this operator in the most recent snapshot interval. |
4 | Operator Out | The number of tuples on the output port of this operator in the most recent snapshot interval. |
5 | Operator Percent Time | The cumulative percentage of CPU time used by this operator since the monitored StreamBase Server instance started. |
6 | Operator Time | The time in microseconds used by this operator in the most recent snapshot interval. This field is likely to be zero or show very small values for most operators in a healthy application. |
7 | Operator Size |
The size of the operator, if applicable for this operator. Size is an
operator-dependent value, and is only valid for operators that maintain a
state. This field shows zero or N/A for all
stateless operators. See Meaning of the Operator Size Field for a table listing
the possible meanings of the size column for stateful operators.
|
8 | Operator Timestamp | The timestamp value, including milliseconds, for this operator statistic event. |
For each operator, the first time the operator is seen, a descriptive record can be optionally delivered. The description record contains the following fields:
Field Order | Field for Operator Rows | Description |
---|---|---|
1 | Type |
This field is D for operator description
lines.
|
2 | Operator name | The name of the operator, qualified with the name of its container. |
3 | Operator type | The name of operator type, such as Map, Filter, input adapter, and so on. |
4 | Operator properties | If the operator has any properties defined, they are shown in this field. |
Lines beginning with G record JVM garbage collection (GC) events.
Garbage collection statistics are passed through as reported by the JVM hosting StreamBase Server. GC statistics are reported as the total accumulated collection counts and time since the start of the hosting JVM. Notice that this is in contrast to the statistics reported for other statistic types, which are reported as deltas since the previous snapshot interval.
The GC statistic record contains the following fields:
Field Order | Field for GC Rows | Description |
---|---|---|
1 | Type |
This field is G for garbage collection event
lines.
|
2 | JVM collector name | The JVM name for the garbage collector. |
3 | GC number of collections | The number of collections since the JVM started. |
4 | GC collection time | The amount of time taken by the garbage collection event. |
5 | GC timestamp | Approximate timestamp of the garbage collection event. |
Tuple queues are managed automatically by StreamBase Server. A queue is created
when conditions arise that call for a queue; in general, a queue is created for
transitions from one thread to another. In practical terms, this occurs when you
specify concurrency options for operators, or when a custom operator or adapter
uses the sendOutputAsync()
API.
For each active queue, the profiling report produced by sbprofile contains the following fields:
Field Order | Field for Queue Rows | Description |
---|---|---|
1 | Type |
This field is Q for queue statistic lines.
|
2 | Queue Name | The name of the queue. |
3 | Queue Max Size | The maximum queue backlog reached since the current application was started. |
4 | Queue Current Size | The number of tuples passing through this queue in the most recent snapshot interval. In a healthy application, this field should be near zero most of the time. Large numbers in this field point you to a portion of your application that might need to be optimized. |
5 | Queue Timestamp | The timestamp including milliseconds value for this queue statistic event. |
For each active thread, the profiling report produced by sbprofile contains the following fields:
Field Order | Field for Thread Rows | Description |
---|---|---|
1 | Type |
This field is T for thread statistic lines.
|
2 | Thread Name | The name of the thread. |
3 | User Time | The number of microseconds this thread ran in user mode in the most recent snapshot interval. |
4 | System Time | The number of microseconds this thread ran in system mode in the most recent snapshot interval. |
5 | Thread Timestamp | The timestamp including milliseconds value for this thread statistic event. |
For each snapshot interval, the profiling report produced by sbprofile contains one system statistic line with the following fields:
Field Order | Field for System Rows | Description |
---|---|---|
1 | Type |
This field is upperase S for system
information lines.
|
2 | Maximum Memory | The maximum amount of memory which the StreamBase Server process is allowed to allocate. |
3 | Used Memory | The total amount of memory currently in use in the StreamBase Server process. |
4 | Free Memory | The amount of free memory in the StreamBase Server process. |
5 | Total number of threads | The total number of threads running in the JVM (including non-StreamBase threads). |
6 | Dequeue Client Count | The current number of dequeuing clients. |
7 | Number of seconds running | The time that StreamBase Server has been running, in seconds. |
For each active input and output stream, the CSV-format profiling report produced by sbprofile contains one stream statistic line with the following fields:
Field Order | Field for System Rows | Description |
---|---|---|
1 | Type |
This field is lowercase s for stream
statistics lines.
|
2 | Stream Name | The name of the stream. |
3 | Input or Output indicator |
Uppercase I indicates that this line is for an
Input Stream, while uppercase O indicates this
line is for an Output Stream.
|
4 | Total number of tuples | The total number of tuples processed on this stream since server start. |
5 | Stream timestamp | The timestamp including milliseconds value for this stream statistic event. |
For operators that do not maintain a state, the Operator Size column shows zero or
N/A
. For the operators in the following table, the
Operator Size column has the meaning shown.
Operator | Meaning of Size for This Operator |
---|---|
Aggregate | The number of open windows. The value may vary as the aggregate windows open and close. |
BSort | The size of the queue. |
Gather | The number of pending gather operations. |
Java Operators or Adapters |
The return value of Operator.size() , which
can be overridden by implementation. If you program your own Java operators,
you can control the meaning of the Operator Size column by using or extending
the Operator.size() method in your operator's
code.
|
Join | The total number of tuples in both buffers. |
Log Adapter | The StreamBase Log Output adapter reports the size of the queue of tuples waiting to be sent to the current logging system, but only when you enable the Log Asynchronously option. This queue is expected to have size 1 under normal circumstances. See The Log Asynchronously Option. |
Merge | The total number of tuples in both buffers. |
Query | The number of rows in the associated Query Table. |
Module Reference operators show zero or N/A
in the
Operator Size column. This means that statistics gathering does not descend into the
referenced module.
Generate a statistics report from your saved raw statistics file with a command like the following:
sbprofile -s rawstatfile.txt -o profile-report.csv
Use -c
instead of -o
to write
the output file in gzip compressed format, or -b
to use
bzip compression:
sbprofile -s rawstatfile.txt -c profile-report.csv.gz
The report is in CSV format like the examples in this section.
Append the --roll-size
or --roll-time
option to automatically roll over the output file when it
reaches a specified size or when a certain time has elapsed. See sbprofile for syntax details. For
example:
sbprofile -s rawstatfile.txt -o profile-report.csv --roll-size 1000 sbprofile -s rawstatfile.txt -c profile-report.csv.gz --roll-time 2H
The following example shows statistics lines of each type, adding operator
description lines with sbprofile -d
. This is from the Best Bids and
Asks sample with Run this component in a parallel
region specified for both Filter operators.
D, default.Update_Bids_and_Asks, query O, default.Update_Bids_and_Asks, 41651, 41651, 9.59, 36426, 2960, 2010-12-09 13:37:28.362 D, default.PreserveBidInfo, map O, default.PreserveBidInfo, 18380, 18380, 5.66, 55022, N/A, 2010-12-09 13:37:28.362 D, default.PreserveAskInfo, map O, default.PreserveAskInfo, 10089, 10089, 2.64, 26155, N/A, 2010-12-09 13:37:28.362 D, default.IsNewBestBid:0.IsNewBestBid, filter O, default.IsNewBestBid:0.IsNewBestBid, 41643, 9048, 1.1, 17791, N/A, 2010-12-09 13:37:28.362 D, default.IsNewBestBid:1.IsNewBestBid, filter O, default.IsNewBestBid:1.IsNewBestBid, 41701, 9054, 0.44, 0, N/A, 2010-12-09 13:37:28.362 D, default.IsNewBestAsk.IsNewBestAsk, filter O, default.IsNewBestAsk.IsNewBestAsk, 41701, 9868, 0.27, 0, N/A, 2010-12-09 13:37:28.362 Q, default.IsNewBestBid:0::default:ParallelSequence, 2048, 186, 2010-12-09 13:37:28.362 Q, default.IsNewBestAsk::default:ParallelSequence, 2048, 179, 2010-12-09 13:37:28.362 Q, default::default.IsNewBestAsk:ParallelSequence, 3943, 432, 2010-12-09 13:37:28.362 Q, default::default.IsNewBestBid:1:ParallelSequence, 3349, 433, 2010-12-09 13:37:28.362 Q, default::default.IsNewBestBid:0:ParallelSequence, 8192, 1, 2010-12-09 13:37:28.362 Q, default.IsNewBestBid:1::default:ParallelSequence, 2048, 196, 2010-12-09 13:37:28.362 Q, default.IsNewBestAsk.out:IsNewBestAsk_1::out:IsNewBestAsk_1, 2048, 179, 2010-12-09 13:37:28.362 Q, default.IsNewBestAsk.out:Update_Bids_and_Asks_1:0, 3943, 432, 2010-12-09 13:37:28.362 Q, default.IsNewBestBid.out:IsNewBestBid_1::out:IsNewBestBid_1, 2048, 196, 2010-12-09 13:37:28.362 Q, default.IsNewBestBid:0.out:Update_Bids_and_Asks_1:0, 8192, 1, 2010-12-09 13:37:28.362 Q, default.IsNewBestBid:1.out:Update_Bids_and_Asks_1:1, 4096, 433, 2010-12-09 13:37:28.362 T, ThreadPool-2, 327602, 234001, 2010-12-09 13:37:28.362 T, ThreadPool-4, 15600, 15600, 2010-12-09 13:37:28.362 T, ThreadPool-6, 15600, 0, 2010-12-09 13:37:28.362 T, default.IsNewBestAsk: Queue handler, 15600, 46800, 2010-12-09 13:37:28.362 T, default.IsNewBestBid:0: Queue handler, 15600, 0, 2010-12-09 13:37:28.362 T, default.IsNewBestBid:1: Queue handler, 0, 31200, 2010-12-09 13:37:28.362 T, default: Queue handler, 15600, 0, 2010-12-09 13:37:28.362 S, 477233152, 61640704, 78278656, 24, 2, 130, 2010-12-09 13:37:28.362 O, default.Update_Bids_and_Asks, 37683, 37683, 9.65, 167923, 2960, 2010-12-09 13:37:29.299 O, default.PreserveBidInfo, 16507, 16507, 5.65, 28615, N/A, 2010-12-09 13:37:29.299 O, default.PreserveAskInfo, 9107, 9107, 2.63, 18814, N/A, 2010-12-09 13:37:29.299 ...
The following example shows statistics of the G line type that show JVM garbage collection statistics:
G, PS MarkSweep, 0, 0, 2011-04-28 20:41:25.399 G, PS Scavenge, 5, 27, 2011-04-28 20:41:25.399 G, PS MarkSweep, 0, 0, 2011-04-28 20:41:26.399 G, PS Scavenge, 5, 27, 2011-04-28 20:41:26.399 G, PS MarkSweep, 0, 0, 2011-04-28 20:41:27.399 G, PS Scavenge, 5, 27, 2011-04-28 20:41:27.399 G, PS MarkSweep, 0, 0, 2011-04-28 20:41:28.399 G, PS Scavenge, 5, 27, 2011-04-28 20:41:28.399 G, PS MarkSweep, 0, 0, 2011-04-28 20:41:29.399 G, PS Scavenge, 5, 27, 2011-04-28 20:41:29.399 G, PS MarkSweep, 0, 0, 2011-04-28 20:41:30.399