Intermediate Stream Dequeuing

Introduction

For debugging purposes only, you can dequeue data from intermediate streams in your application, not only from output streams. This debugging feature allows you to examine the value of data at an intermediate point in the application before it has been fully processed and before it has arrived on an output stream. This feature is not to be used for inserting new data into the middle of an application.

Caution

When intermediate stream dequeuing is enabled, the StreamBase Server instance hosting your application consumes considerably more memory, to the point that you might not be able to successfully launch large applications.

Intermediate stream dequeuing (ISD) is disabled by default. You can enable it for a particular run or debug session as described below in Enabling ISD.

ISD Compared to Other Features

Do not confuse intermediate stream dequeuing with the always expose feature of input and output streams.

Always Expose Stream Feature

The General tab of the Properties view for input and output streams has a check box labeled Always expose stream for enqueue or dequeue, respectively. This feature is an attribute of explicit input and output streams that lets you designate a stream as available for enqueuing or dequeuing no matter how deeply its containing module is nested. It affects only input and output streams.

Intermediate Stream Dequeue

ISD is a property setting of the hosting server, and affects all modules in an application. This feature allows you to dequeue not only from output streams, but from the output ports of Map, Filter, Union, or other operators in any module. You can limit the number of operators exposed for dequeuing by specifying a regular expression to match against the names of intermediate streams.

For another debugging approach, consider using runtime tracing in place of, or in addition to, ISD. Runtime tracing is described in Runtime Tracing and Creating Trace Files, and includes a comparison table in Tracing Versus Dequeuing.

What Streams Are Exposed

The following table clarifies what streams are exposed for dequeuing for various settings of intermediate stream dequeuing.

ISD Setting: ISD disabled (the default for run and debug modes)
Regular Expression Setting: N/A
Dequeuable Streams:

For the top-level module only:

  • All explicit output streams.

  • Any sub-module output stream individually exposed for dequeuing in its Properties view.

ISD Setting: ISD enabled
Regular Expression Setting: None specified.
Dequeuable Streams:

For the top-level module and all sub-modules:

  • All explicit input and output streams.

  • All explicit error input and error output streams.

  • All intermediate streams in all modules.

ISD Setting: ISD enabled
Regular Expression Setting: Regular expression substring specified.
Dequeuable Streams:

For the top-level module:

  • All explicit input and output streams.

  • All explicit error input and error output streams.

For all sub-modules:

  • Any input, output, error, or intermediate stream whose name matches the supplied regular expression substring.

  • Any sub-module output stream individually exposed for dequeuing in its Properties view.

Note

Error output streams are considered intermediate streams for purposes of dequeuing. To dequeue directly from an error output stream, you must enable ISD. This is useful only for debugging: a well-designed application does not need to read the tuples emitted on error streams, which are passed automatically to the next higher module or to the hosting Server.

Enabling ISD

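You can enable intermediate stream dequeuing in several ways: in StreamBase Studio, with an sbd command-line option, or in the server configuration file. Each method is described in its own section below.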

With all methods, you can enable either full or partial ISD:

  • Full ISD means enabling dequeuing on all available intermediate streams in all modules in the application.

  • Partial ISD means limiting the intermediate streams enabled for dequeuing by specifying a regular expression to match against the names of streams. Only intermediate streams whose name matches the expression as a substring are enabled for dequeuing.

Enable ISD in StreamBase Studio

Follow these steps to expose intermediate streams when running or debugging an application in Studio:

  1. Configure and save a run or debug configuration for the application whose intermediate streams you want to expose. Run configurations are described in Editing Launch Configurations.

  2. On the Advanced tab of the run configuration dialog, check the Enable Intermediate Stream Dequeue check box. When used without a filter pattern, this specifies full ISD.

  3. Optional: to specify partial ISD and restrict the number of intermediate streams exposed, enter a regular expression pattern to match against stream names. Intermediate streams whose name matches the provided expression as a substring are exposed for dequeuing; all other intermediate streams are not.

  4. Run the application by running the launch configuration. Intermediate streams now appear as selectable output streams in the Application Output view's list of streams.

The following illustration shows the Output Stream Selector dialog from the Application Output view for two cases of running the Bollinger Band sample installed with StreamBase.

  • You see the dialog on the left when running with the default run configuration. It shows only the three output streams defined in the application.

  • You see the dialog on the right when running with full intermediate stream dequeuing enabled. It shows the same three output streams plus the output port of all intermediate components as selectable streams.

Enable ISD for Command Line sbd

To enable intermediate stream dequeuing when starting StreamBase Server from the command line, use sbd with its --intermediate-stream-dequeue option. For example:

sbd --intermediate-stream-dequeue BestBidsAsks.sbapp

The --intermediate-stream-dequeue option specifies full ISD, with all intermediate streams enabled. To specify partial ISD for sbd, use the configuration file method described in the next section.

Enable ISD in Configuration Files

To enable intermediate stream dequeuing in the server configuration file, set the streambase.codegen.intermediate-stream-dequeue property to true, using the <jvm-args> or <sysproperty> element of the server configuration file:

<sysproperty 
    name="streambase.codegen.intermediate-stream-dequeue" 
    value="true" />

The default for this property is false.

To enable partial ISD, specify a regular expression that matches the names of the streams you want to expose. Make this specification with the streambase.codegen.intermediate-stream-dequeue-regex property, using the <jvm-args> or <sysproperty> element of the server configuration file:

<sysproperty 
    name="streambase.codegen.intermediate-stream-dequeue-regex" 
    value="pattern" />

For example, the regular expression pattern Map\\d allows any intermediate stream whose name contains the string "Map" followed by a single digit to be exposed as a dequeuable stream. All other intermediate streams are not available.

<sysproperty 
    name="streambase.codegen.intermediate-stream-dequeue-regex" 
    value="Map\\d" />
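The substring matching described above can be previewed outside StreamBase before you commit a pattern to the configuration file. The following is a minimal Python sketch, not StreamBase code. It assumes that the doubled backslash in the configuration value is an escaping layer, so that the effective pattern is Map\d, and that Python's re.search behaves like the server's substring matching for simple ASCII patterns such as this one. The candidate stream names are hypothetical.

```python
import re

# Effective pattern from the example above. Assumption: the doubled backslash
# in the configuration value is an escaping layer for the regex token \d.
PATTERN = re.compile(r"Map\d")

# Hypothetical intermediate stream names, for illustration only.
candidates = [
    "out:Map1_1",
    "out:PriceMap2_1",
    "out:Map_1",
    "out:Filter1_1",
]


def is_exposed(stream_name: str) -> bool:
    """Return True if the pattern occurs anywhere in the stream name."""
    # re.search scans the whole string, which mirrors substring matching.
    # re.fullmatch would instead require the entire name to match.
    return PATTERN.search(stream_name) is not None


exposed = [name for name in candidates if is_exposed(name)]
print(exposed)
```

Because the match is a substring match, a short pattern can expose more streams than intended. Anchoring the pattern or making it more specific narrows the result.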

Intermediate Stream Naming Convention

With intermediate stream dequeuing enabled, you can dequeue data from any output port of any operator using a stream name of the form:

out:operatorName_N

where operatorName is the name of the operator and N is the number of the output port for that operator.
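As an illustration of this convention, the following Python sketch builds a default intermediate stream name from an operator name and port number, and splits such a name back into its parts. The helper names are hypothetical and are not part of any StreamBase API. The sketch covers only the default convention, which some streams do not follow.

```python
import re

# Default convention described above: out:<operatorName>_<N>
_DEFAULT_NAME = re.compile(r"^out:(?P<operator>.+)_(?P<port>\d+)$")


def default_stream_name(operator_name: str, port: int) -> str:
    """Build the default intermediate stream name for an operator's output port."""
    return f"out:{operator_name}_{port}"


def split_default_name(stream_name: str):
    """Return (operator_name, port) if the name follows the default
    convention, or None if it does not."""
    match = _DEFAULT_NAME.match(stream_name)
    if match is None:
        return None
    # The greedy operator group keeps any underscores inside the operator
    # name; only the final _<digits> suffix is treated as the port number.
    return match.group("operator"), int(match.group("port"))


print(default_stream_name("ProcessIPFirst", 2))
print(split_default_name("out:CheckIPsForAUser_1"))
```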

Consider the Split.sbapp application, which is one of the operator samples installed with the StreamBase kit:

To verify the intermediate stream names, run sbc list when the application is running with ISD enabled. For example:

container   default
container   system
stream      INTRUSION_TooManyIPsForUser
stream      INTRUSION_TooManyUsersForIP
stream      IPandUserLogin
stream      out:CheckIPsForAUser_1
stream      out:CheckUsersInAnIP_1
stream      out:ProcessIPFirst_1
stream      out:ProcessIPFirst_2
schema      schema:IPandUserLogin
operator    CheckIPsForAUser
operator    CheckUsersInAnIP
operator    IPCountExceeded
operator    ProcessIPFirst
operator    UserCountExceeded

Compare the EventFlow diagram above with the sbc list output. Notice the following:

  • Intermediate streams are shown because we are running the server with ISD enabled, as described above.

  • Stream names such as out:ProcessIPFirst_2 illustrate the default naming convention for intermediate streams.

  • The output ports for the Filter operators, UserCountExceeded and IPCountExceeded, are connected to actual output streams, and thus do not have intermediate streams. Of course, you can dequeue from both intermediate streams and explicit output streams.
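If you capture the sbc list output as text, a short script can separate the intermediate streams from the explicit ones. The following Python sketch works on the output shown above, pasted in as a string. It assumes that each line consists of an entity kind followed by a name, and it treats the out: prefix as a heuristic that finds only intermediate streams with default names.

```python
from collections import defaultdict

# The sbc list output shown above, pasted as text.
SBC_LIST_OUTPUT = """\
container   default
container   system
stream      INTRUSION_TooManyIPsForUser
stream      INTRUSION_TooManyUsersForIP
stream      IPandUserLogin
stream      out:CheckIPsForAUser_1
stream      out:CheckUsersInAnIP_1
stream      out:ProcessIPFirst_1
stream      out:ProcessIPFirst_2
schema      schema:IPandUserLogin
operator    CheckIPsForAUser
operator    CheckUsersInAnIP
operator    IPCountExceeded
operator    ProcessIPFirst
operator    UserCountExceeded
"""


def parse_sbc_list(text: str) -> dict:
    """Group entity names by kind (container, stream, schema, operator)."""
    entities = defaultdict(list)
    for line in text.splitlines():
        parts = line.split(None, 1)  # split on the first run of whitespace
        if len(parts) == 2:
            kind, name = parts
            entities[kind].append(name.strip())
    return dict(entities)


entities = parse_sbc_list(SBC_LIST_OUTPUT)

# Heuristic: default-named intermediate streams start with "out:".
intermediate = [s for s in entities["stream"] if s.startswith("out:")]
explicit = [s for s in entities["stream"] if not s.startswith("out:")]

print(intermediate)
print(explicit)
```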

You can also determine intermediate stream names by viewing the XML source for your EventFlow application. Look for the stream values associated with output ports. For example:

<box name="ProcessIPFirst" type="split">
  <input port="1" stream="IPandUserLogin"/>
  <output port="1" stream="out:ProcessIPFirst_1"/>
  <output port="2" stream="out:ProcessIPFirst_2"/>
  <param name="output-count" value="2"/>
...
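Reading the stream attributes by eye is practical for small applications; for larger ones, a script can list them. The following Python sketch uses the standard xml.etree.ElementTree module to collect the stream name of each output port. It parses only the box element from the fragment above, closed here so that it is well-formed on its own. It assumes that output elements are direct children of box elements and that the file uses no XML namespaces, which you should verify against your own EventFlow source.

```python
import xml.etree.ElementTree as ET

# The box element from the fragment above, closed so that it parses on its
# own. A real EventFlow file contains more elements around and inside it.
BOX_XML = """\
<box name="ProcessIPFirst" type="split">
  <input port="1" stream="IPandUserLogin"/>
  <output port="1" stream="out:ProcessIPFirst_1"/>
  <output port="2" stream="out:ProcessIPFirst_2"/>
  <param name="output-count" value="2"/>
</box>
"""


def output_streams(xml_text: str):
    """Return (operator name, port number, stream name) for every output port."""
    root = ET.fromstring(xml_text)
    results = []
    # iter("box") visits the root element too, if it is itself a box.
    for box in root.iter("box"):
        for out in box.findall("output"):
            results.append(
                (box.get("name"), int(out.get("port")), out.get("stream"))
            )
    return results


for operator, port, stream in output_streams(BOX_XML):
    print(operator, port, stream)
```

Because this reads the stream name actually recorded in the source, it also reports names that do not follow the default convention.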

Unconventional Intermediate Stream Names

Some intermediate stream names do not appear to follow the naming convention described in the previous section. Before dequeuing from an intermediate stream, confirm the name actually used in your application, either with an sbc list command or by viewing the XML source of your EventFlow application.

In some cases, the name of an intermediate stream can vary from the default convention, depending on the history of edits to the application. For example, consider the following installed sample, AggregateByDim.sbapp:

When you run this sample in debug mode (or with the JVM argument set as above), an sbc list command returns:

container  default
container  system
stream     AvgPricePSOut
stream     OutputStream1
stream     TradesIn
schema     schema:TradesIn
operator   Aggregate2Dimensions
operator   ConvertTimeToSeconds

By default, the intermediate stream between the Aggregate2Dimensions and ConvertTimeToSeconds operators would have been named out:Aggregate2Dimensions_1. But at some point in this application's history, the Aggregate2Dimensions operator was connected to an output stream named OutputStream1, and was subsequently disconnected from that output stream (which was either removed or renamed).

Despite the edits, the original output stream name (OutputStream1) defined for Aggregate2Dimensions is still in use. In other words, do not assume that each operator's output stream name always follows the default convention.
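One way to spot such cases is to look for operators that have no stream with a default name in the sbc list output. The following Python sketch does this for the AggregateByDim output shown above. It is a heuristic under stated assumptions: an operator is also reported when its output port is connected to an explicit output stream, so each reported operator is only a candidate to check in the XML source. The function name is hypothetical.

```python
import re

# The sbc list output for AggregateByDim.sbapp shown above, pasted as text.
SBC_LIST_OUTPUT = """\
container  default
container  system
stream     AvgPricePSOut
stream     OutputStream1
stream     TradesIn
schema     schema:TradesIn
operator   Aggregate2Dimensions
operator   ConvertTimeToSeconds
"""


def operators_without_default_streams(text: str):
    """Return operators that have no stream named out:<operator>_<N>."""
    streams, operators = set(), []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        kind, name = parts
        if kind == "stream":
            streams.add(name)
        elif kind == "operator":
            operators.append(name)

    missing = []
    for operator in operators:
        # re.escape guards against regex metacharacters in operator names.
        pattern = re.compile(rf"^out:{re.escape(operator)}_\d+$")
        if not any(pattern.match(stream) for stream in streams):
            missing.append(operator)
    return missing


print(operators_without_default_streams(SBC_LIST_OUTPUT))
```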