Concurrency Options

Introduction

Most Spotfire Streaming operators support one or more concurrency options that, in some cases, allow you to speed up your StreamBase applications. This page describes the basics of setting the concurrency options for operators. See Execution Order and Concurrency for an in-depth discussion of the principles supporting the concurrency options.

Most Spotfire Streaming operators and adapters can be configured:

  • To run each instance of an operator, module, or adapter in its own parallel region

  • To run with multiple copies of itself for parallel processing

  • For both options

Concurrency options are configured in the Concurrency tab of an operator's Properties view.

EventFlow Icon Decorations

The EventFlow Editor shows that the parallel region concurrency option is enabled for a component by marking that component's icon with a circle within its tile boundaries.

When the Multiplicity setting is set to multiple with two or more instances, the component's icon is overlaid with a number that reflects the number of instances.

Extension Point operator icons can also show a numeric decoration, but that does not come from a Concurrency tab setting. In this case, the circled number represents the number of module instances defined in the operator's Modules tab in its Properties view.

When you use a parameter to specify the number of instances for a Module Reference, its icon decoration shows ${x}.

Parallel Region Option

Operators that support parallel regions have a checkbox at the top of the Concurrency tab of their Properties view. The wording of the label for this checkbox changes as follows:

  • When the Multiplicity option is set to single (the default setting), the parallel region option is labeled Run this component in a parallel region.

  • When Multiplicity is set to multiple, the parallel region option is labeled Run each instance of this component in a parallel region.

This option can apply to:

  • A single operator, which would run in a parallel region independent of the main application.

  • Multiple instances of an operator, if the operator supports both multiplicity and parallel regions. In this case, each instance of the operator runs in its own parallel region.

  • A module referenced by a Module Reference operator.

  • One or more module instances in an Extension Point operator, each of which would run in a separate parallel region.

A component is a candidate for parallel regions if the component is long-running or compute-intensive, can run without data dependencies on other StreamBase components, and would not cause the containing module to block while waiting for the component to return. If your component meets these criteria, you may be able to improve performance of the overall application by specifying parallel regions.

This option causes StreamBase Server to process the component, module, or module instances concurrently with other processing in the application. The operating systems supported by StreamBase automatically distribute the processing of threads across multiple processors.
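The structure described above can be pictured with a minimal sketch in plain Python. This is an analogy only and not StreamBase code: the names `slow_enrich`, `run_region`, and `process` are hypothetical, and a standard library queue and thread stand in for the region's queue and its thread of execution. Note that Python threads illustrate the hand-off pattern rather than true multi-processor speedup.

```python
import queue
import threading
import time


def slow_enrich(value: int) -> int:
    """Stand-in for a long-running, self-contained component (hypothetical)."""
    time.sleep(0.01)
    return value * value


def run_region(inbox: queue.Queue, results: list) -> None:
    """Drain the region's queue on its own thread until a None sentinel arrives."""
    while True:
        item = inbox.get()
        if item is None:
            break
        results.append(slow_enrich(item))


def process(values):
    """Hand each value to the region and return what the region produced."""
    inbox = queue.Queue()
    results = []
    worker = threading.Thread(target=run_region, args=(inbox, results))
    worker.start()
    for v in values:
        inbox.put(v)   # the upstream side only enqueues, then moves on
    inbox.put(None)    # sentinel: no more input
    worker.join()
    return results
```

Because a single worker drains a first-in, first-out queue, results come back in input order. The upstream loop never waits for `slow_enrich` to finish, which is the property that makes a component a candidate for a parallel region.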

The Concurrency tab also contains several queue configuration settings, as follows:

  • Initial Buffer Size: This is the initial buffer size for a parallel region's queue. It must be a power of two. Setting it is not strictly necessary because these buffers can grow. This key is optional and its default value is 16.

  • Max Buffer Size: This is the maximum queue size in tuples for a parallel region. This must be a power of two. This key is optional and its default value is 2^30.

  • Max Processing Batch Size: This is the maximum batch size in tuples to process from a queue at once. This key is optional and its default value is zero, meaning unlimited.

  • Max Outstanding Tuples: This is the number of tuples to process from a queue before blocking the caller. This key is optional and its default value is zero, meaning unlimited.

  • Data Distribution Policy: This is the distribution policy to use for parallel region queues. This key is only valid if the parallel region in question uses transactional memory for storage, in which case it enables high availability of that storage. Its value must be a valid distribution policy as declared in your application configuration. This key is optional and has no default value. If it is not set, then high availability for this region's queues is not enabled.

  • Wait Strategy: This is the default wait strategy used by the disruptor that controls parallel stream queuing. Valid values are BLOCKING, BUSY_SPIN, SLEEPING, and YIELDING. This key is optional and its default value is BLOCKING.
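The constraints listed above can be checked mechanically. The following is a minimal sketch, not a StreamBase API: `QueueSettings` and its field names are hypothetical, the default maximum is taken as 2^30 on the assumption that the documented default must itself be a power of two, and the rejection of negative values is an assumption as well.

```python
from dataclasses import dataclass
from typing import List

# Wait strategy names as listed in the settings above.
WAIT_STRATEGIES = {"BLOCKING", "BUSY_SPIN", "SLEEPING", "YIELDING"}


def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive integer power of two."""
    return n > 0 and (n & (n - 1)) == 0


@dataclass
class QueueSettings:
    """Hypothetical container mirroring the documented settings and defaults."""
    initial_buffer_size: int = 16
    max_buffer_size: int = 2 ** 30       # assumed reading of the default
    max_processing_batch_size: int = 0   # 0 means unlimited
    max_outstanding_tuples: int = 0      # 0 means unlimited
    wait_strategy: str = "BLOCKING"

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty means valid."""
        issues = []
        if not is_power_of_two(self.initial_buffer_size):
            issues.append("initial buffer size must be a power of two")
        if not is_power_of_two(self.max_buffer_size):
            issues.append("max buffer size must be a power of two")
        # Assumption: negative limits are not meaningful.
        if self.max_processing_batch_size < 0:
            issues.append("max processing batch size must not be negative")
        if self.max_outstanding_tuples < 0:
            issues.append("max outstanding tuples must not be negative")
        if self.wait_strategy not in WAIT_STRATEGIES:
            issues.append("unknown wait strategy: " + self.wait_strategy)
        return issues
```

For example, `QueueSettings(initial_buffer_size=24).problems()` reports one problem, because 24 is not a power of two, while `QueueSettings().problems()` returns an empty list.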

Caution

The parallel region setting is not suitable for every application, and using this setting requires a thorough analysis of your application. For a background discussion, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Multiplicity Options

Multiplicity refers to the number of instances of an operator or module. To specify more than one instance of a component, select multiple in the Multiplicity control, and enter an integer number of instances, or enter the name of a parameter in the form ${param-name}. When you change Multiplicity from single to multiple, Studio starts with 2 in the Number of instances field. Edit this as your application requires.

For example, for a Filter operator with a complex, compute-intensive filtering algorithm, you can specify that the Filter operator run with a multiplicity of 2. In this case, two instances of the Filter operator run in parallel.

You can combine a multiplicity setting with parallel regions. In this case, each instance of the operator runs in its own parallel region.

Multiplicity can apply to:

  • A single operator.

  • A module referenced by a Module Reference.

A multiplicity setting greater than one does not automatically improve your application's performance. The situations in which a component can take advantage of multiplicity are complex. See Execution Order and Concurrency for background information and guidelines.

Caution

As with parallel regions, a multiplicity setting of 2 or more is not suitable for every application, and using this setting requires a thorough analysis of your application. For a background discussion, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Important

StreamBase does not support exporting a Query Table from a Module Reference with a multiplicity setting greater than 1.

When an operator or module is set to run with multiple instances, or when two or more module instances of an Extension Point are run, StreamBase's default behavior is to broadcast each incoming tuple to every instance of the operator or module. However, you can optionally specify non-default styles for dispatching incoming tuples to operator or module instances. This subject is discussed in Dispatch Styles.
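To make the default behavior concrete, the sketch below contrasts broadcast dispatch with one possible non-default policy. It is an illustration in plain Python, not StreamBase code. The function names are hypothetical, and round-robin is shown only as a generic contrast; see Dispatch Styles for the styles that are actually available.

```python
def broadcast(tuples, instance_count):
    """Give every instance its own copy of every incoming tuple."""
    inboxes = [[] for _ in range(instance_count)]
    for t in tuples:
        for inbox in inboxes:
            inbox.append(t)
    return inboxes


def round_robin(tuples, instance_count):
    """Give each incoming tuple to exactly one instance, taking turns."""
    inboxes = [[] for _ in range(instance_count)]
    for i, t in enumerate(tuples):
        inboxes[i % instance_count].append(t)
    return inboxes
```

With three tuples and two instances, `broadcast` delivers all three tuples to both instances, so the total work doubles. A policy such as `round_robin` instead divides the tuples between the instances. This difference is one reason a multiplicity greater than one does not by itself improve performance.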

The Number of instances field can take a parameter in the form ${param}, where param is defined in the Definitions tab of the EventFlow Editor of the same module, or is otherwise passed to the containing module.
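The two accepted forms of the Number of instances field, a literal integer or a ${param} reference, can be sketched as a small resolver. This is a hypothetical helper for illustration only: `resolve_instances` and the `params` dictionary are assumptions and do not reflect how Studio or StreamBase Server resolves parameters internally.

```python
import re

# Matches a whole-field parameter reference such as ${x}.
_PARAM = re.compile(r"^\$\{([^}]+)\}$")


def resolve_instances(field: str, params: dict) -> int:
    """Resolve a field value to a positive instance count."""
    text = field.strip()
    match = _PARAM.match(text)
    if match:
        name = match.group(1)
        if name not in params:
            raise KeyError("undefined parameter: " + name)
        text = str(params[name])
    count = int(text)  # raises ValueError for non-integer text
    if count < 1:
        raise ValueError("number of instances must be at least 1")
    return count
```

For example, `resolve_instances("${x}", {"x": "4"})` yields 4, and `resolve_instances("2", {})` yields 2. A reference to a parameter that is not defined raises `KeyError`.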

Operator Support for Concurrency Options

The following table shows the varying operator support for concurrency features:

Component                               Parallel Regions   Multiplicity   Dispatch Styles
Aggregate                               yes                yes            yes
BSort                                   yes                yes            yes
Decision Table                          yes                no             no
Distributed Router                      yes                yes            yes
Extension Point                         yes                no             yes
Filter                                  yes                yes            yes
Gather                                  yes                yes            yes
Heartbeat                               yes                yes            yes
Iterate                                 yes                no             no
Java Adapters, all custom and global    yes                yes            yes
Java Operators, all custom and global   yes                yes            yes
Join                                    yes                yes            yes
Map                                     yes                yes            yes
Merge                                   yes                yes            yes
Metronome                               yes                no             no
Module Reference                        yes                yes            yes
Pattern                                 yes                yes            yes
Query with Query Table                  no                 no             no
Query with JDBC Table                   yes                yes            yes
Sequence                                no                 no             no
Split                                   yes                no             no
Union                                   yes                no             no

The following lists show the same information as the table above, grouped by level of support.

Operators that support Parallel Regions, Multiplicity, and Dispatch Styles

Aggregate, BSort, Distributed Router, Filter, Gather, Heartbeat, Join, Map, Merge, Module Reference, Pattern, Query operator when associated with a JDBC Table, all custom or global Java operators, all custom or global Java adapters.

Operators that support Parallel Regions only

Decision Table, Iterate, Metronome, Split, Union

Special Case: Operator that supports Parallel Regions and Dispatch Styles

Extension Point

Operators with no support for concurrency options

Sequence, Query associated with a Query Table
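The groupings above follow directly from the support table. As a cross-check, the table can be encoded as data and the groups derived from it. The sketch below is plain Python for illustration; the `SUPPORT` mapping and the flag letters (P for Parallel Regions, M for Multiplicity, D for Dispatch Styles) are a hypothetical encoding, not a StreamBase API.

```python
# Support flags per component: P = Parallel Regions,
# M = Multiplicity, D = Dispatch Styles.
SUPPORT = {
    "Aggregate": "PMD",
    "BSort": "PMD",
    "Decision Table": "P",
    "Distributed Router": "PMD",
    "Extension Point": "PD",
    "Filter": "PMD",
    "Gather": "PMD",
    "Heartbeat": "PMD",
    "Iterate": "P",
    "Java Adapters": "PMD",
    "Java Operators": "PMD",
    "Join": "PMD",
    "Map": "PMD",
    "Merge": "PMD",
    "Metronome": "P",
    "Module Reference": "PMD",
    "Pattern": "PMD",
    "Query with Query Table": "",
    "Query with JDBC Table": "PMD",
    "Sequence": "",
    "Split": "P",
    "Union": "P",
}


def components_with(flags: str) -> list:
    """Return components whose support is exactly the given flag set."""
    wanted = set(flags)
    return [name for name, have in SUPPORT.items() if set(have) == wanted]
```

For example, `components_with("P")` returns the five components that support parallel regions only, and `components_with("PD")` returns only Extension Point.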
