Using the Iterate Operator

Introduction

Use the Iterate operator to iterate over a list. You specify the name of an incoming field of type list, or an expression that resolves to type list. For each input tuple processed, the operator iterates over the list, potentially emitting one tuple for each element in the list. Since lists can contain more than one element, this operator can emit multiple outgoing tuples for each input tuple it processes.

Default Behavior

By default, Iterate emits one tuple for each element in the list.
By default, the emitted tuples contain all of the input tuple's fields plus one additional field named element that contains one element of the list, in the order of its position in the list.
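
The default behavior can be modeled outside StreamBase with a short Python sketch. This is only an illustration of the semantics described above, not StreamBase code: tuples are represented as dictionaries, and the function name iterate_default and the sample field names id and prices are hypothetical.

```python
from typing import Any, Dict, Iterator


def iterate_default(input_tuple: Dict[str, Any], list_field: str) -> Iterator[Dict[str, Any]]:
    """Model of the default behavior: one output tuple per list element."""
    for value in input_tuple[list_field]:
        out = dict(input_tuple)   # all input fields pass through unchanged
        out["element"] = value    # plus one extra field named 'element'
        yield out


# Hypothetical input tuple with a three-element list field.
outputs = list(iterate_default({"id": 7, "prices": [1.5, 2.0, 2.5]}, "prices"))
for row in outputs:
    print(row)
```

In this model, the three outputs carry the elements in list order, and each one still contains the original id and prices fields.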

Customizations

You can terminate the iteration over the list when a predicate condition becomes false. The condition can be a simple expression or can be a running aggregation over previous list elements.

You can selectively emit tuples only for those list elements that satisfy a filter predicate.

The iterate filter is applied after the iterate condition has been acted on. That is, if the iterate condition is false and the iterate filter is true, the iteration for the current input tuple being processed terminates, and no tuple is output by the operator for the current list element.
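
The ordering of the two checks can be sketched in Python as follows. This is a minimal model under the assumption that lists and predicates behave as described above; the function iterate and its parameters condition and keep are hypothetical names chosen for the illustration.

```python
from typing import Any, Callable, List


def iterate(source: List[Any],
            condition: Callable[[Any], bool],
            keep: Callable[[Any], bool]) -> List[Any]:
    """Apply the iterate condition first, then the iterate filter."""
    emitted = []
    for element in source:
        if not condition(element):
            break                    # condition false: stop, emit nothing for this element
        if keep(element):
            emitted.append(element)  # condition true and filter true: emit
    return emitted


# Condition: continue while the element is below 10. Filter: keep even values.
result = iterate([4, 7, 10, 3, 8], condition=lambda x: x < 10, keep=lambda x: x % 2 == 0)
print(result)  # [4]
```

The element 10 satisfies the filter but fails the condition, so the iteration ends there without emitting it, and the later element 8 is never examined.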

You can specify or define fields to be emitted from the operator using the field grid on the Output Settings tab, much like in the Map operator.

The StreamSQL equivalent of the Iterate operator is the FOREACH clause of the SELECT statement. See SELECT Statement in the StreamSQL Guide. Note that iteration conditions are not supported in StreamSQL, but iteration filters are supported by means of WHERE clauses.

StreamBase Studio provides an Iterate Operator Sample as part of the Operator Sample Group loaded with File → Load StreamBase Sample.

The each Tuple

The Iterate operator creates a tuple named each that is available for use in expressions in the Iterate Condition and Iterate Filter properties of the Operation Settings tab, as well as in any expression in the Output Settings tab.

The each tuple has three fields:

Field Name	Data Type	Description
sourcelist	list(T)	The value of the expression in the Iterate Over property.
element	T	The current list element being processed.
index	int	The zero-based index of the current element within `each.sourcelist`.
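
As an illustration of the table above, the following Python sketch builds a value shaped like the each tuple for every element of a list. The class Each and the function each_tuples are hypothetical names for this model and are not part of any StreamBase API.

```python
from typing import Any, Iterator, List, NamedTuple


class Each(NamedTuple):
    sourcelist: List[Any]  # the whole list being iterated over
    element: Any           # the current list element
    index: int             # zero-based position of the element in sourcelist


def each_tuples(source: List[Any]) -> Iterator[Each]:
    """Yield one Each value per list element, in list order."""
    for index, element in enumerate(source):
        yield Each(source, element, index)


rows = list(each_tuples(["a", "b", "c"]))
last = rows[-1]
# The last element can be recognized by comparing index with the list length.
is_last = last.index == len(last.sourcelist) - 1
print(rows[1].element, rows[1].index, is_last)  # b 1 True
```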

Properties: General Tab

Name: Use this required field to specify or change the name of this instance of this component, which must be unique in the current EventFlow module. The name must contain only alphabetic characters, numbers, and underscores, and no hyphens or other special characters. The first character must be alphabetic or an underscore.

Enable Error Output Port: Select this check box to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.

Description: Optionally enter text to briefly describe the component's purpose and function. In the EventFlow canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Properties: Operation Settings Tab

The properties view for the Operation Settings tab has the following appearance.

Use the Operation Settings tab to specify the following parameters:

Property	Description	Allowed Expression Entries
Iterate Over	Required. The name of a field of type list in the incoming stream, or an expression that resolves to a value of type list.	Simple expressions whose value is of type list.
Iterate Condition	A boolean simple or aggregate expression that, when it evaluates to `true` for a value of `each.element`, continues the iteration over the list. If the expression evaluates to `false`, the iteration terminates for the input tuple. If left empty, the operator iterates over all elements as if `true` had been specified.	Simple or aggregate expressions that can include the `each` tuple.
Iterate Filter	A boolean simple expression that, when it evaluates to `true` for a value of `each.element`, emits a tuple based on the element, if the Iterate Condition expression was true for the element. If the expression’s value is `false`, no output tuple is emitted based on the element. If this field is left empty, the operator behaves as if the expression `true` had been specified; that is, all list values cause the operator to emit a tuple.	Simple expressions that can include the `each` tuple.

Specify an Iterate Condition expression to have the operator stop processing a list as soon as one or more logical criteria are met. For example, you might limit output to the first n list items, or stop iterating when a flag in some field has or does not have a certain value.

Specify an Iterate Filter expression to eliminate list items, such as those that are null or that fail to meet some criterion. You can combine the filter with output expressions that transform list item values, letting you follow the Iterate operator with an Aggregate operator to assemble the emitted tuples into a new list.

Properties: Output Settings Tab

In the Output Settings tab, the Input Fields grid has All selected by default. Change this to None to suppress the output of the input tuple's fields.

In the image above, the operator emits three fields. The first two are renames of their corresponding input fields. (This is shown only to illustrate the use of the each construct; to pass the incoming fields through without the rename, use the Input Fields grid.) The third output field calculates the total cost of the stock purchases in the input list.

Properties: Concurrency Tab

Use the Concurrency tab to specify the use of parallel regions for this instance of this component. Consider selecting the parallel regions check box if this component instance is long-running or compute-intensive, can run without data dependencies on other StreamBase components, and would not cause the containing module to block while waiting for a thread to return. In this case, you may be able to improve performance by selecting this option. This option directs StreamBase Server to process this component concurrently with other processing in the application. The operating systems supported by StreamBase automatically distribute the processing of threads across multiple processors.

Caution

The parallel regions setting is not suitable for every application, and using this setting requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.

Iterate Operator Examples

This section provides examples of using the Iterate operator.

Input List Field Values of Varying Lengths

This example has three components:

The input schema contains a list(string) field named Items.

The ItemsIterate operator is configured to Iterate Over the input tuple's Items field without conditions or filters.

The ItemsIterate operator's Output Settings suppresses the Input Fields but adds a single field named item that contains each.element.

In incoming tuple 1, the Items list has three elements. The ItemsIterate operator emits three tuples that each contain an item field, one tuple for each element of the Items field.

In incoming tuple 2, Items has two elements, resulting in two tuples emitted.
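
The tuple counts in this example can be reproduced with a small Python model. The list contents below are invented sample data, and the function items_iterate is a hypothetical stand-in for the configured ItemsIterate operator.

```python
from typing import Any, Dict, List


def items_iterate(input_tuple: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Input Fields are suppressed, so each output holds only the 'item' field."""
    return [{"item": element} for element in input_tuple["Items"]]


tuple1 = {"Items": ["apple", "pear", "plum"]}  # three elements (sample data)
tuple2 = {"Items": ["red", "green"]}           # two elements (sample data)

out1 = items_iterate(tuple1)
out2 = items_iterate(tuple2)
print(len(out1), len(out2))  # 3 2
```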

Filtering and Transforming Lists

This example illustrates how to create a new list whose contents are a filtered and transformed version of an input list, using a single Iterate operator followed by an Aggregate operator.

The Input Stream's schema has a list of ints and a single field multiplier of type int.

Configure the IterateFilter operator to Iterate Over the incoming list, and to filter out any null elements.

Use the IterateFilter operator's Output Settings tab to transform each element of the incoming list by multiplying it by the multiplier field in the input tuple. Also emit the name of the emitted list and the name of its index field, which we will use in the following Aggregate operator.

Configure the AggregateList operator with the following Predicate dimension settings. This fills an aggregate window with the contents of the tuples emitted from the Iterate operator. The Aggregate operator uses this dimension to open a window when the first element of a list is processed, and closes the window and emits a tuple when the last element of the list is processed. The Aggregate operator detects the first and last elements of the input list using the each.index value as well as the length of the each.sourcelist value.

The Aggregate operator uses the aggregatelist() expression language function to assemble the tuples that the Iterate operator emits into the new filtered and transformed list.

For an input tuple whose list field contains the elements 5, null, and 9, and whose multiplier field is 3, the Iterate operator emits two tuples: it multiplies 5 by 3 and emits 15; it filters out the null element; and then it multiplies 9 by 3 and emits 27. The follow-on Aggregate operator then creates a list containing both values and emits the list as a field in a single output tuple.
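
The same data flow can be traced in a Python sketch. It models only the end result, not the Aggregate operator's window handling; the field name values and the function names iterate_filter and aggregate_list are hypothetical, and None stands in for a null list element.

```python
from typing import Any, Dict, List, Optional


def iterate_filter(input_tuple: Dict[str, Any]) -> List[Dict[str, int]]:
    """Drop null elements and multiply the rest by the tuple's multiplier."""
    emitted = []
    for element in input_tuple["values"]:
        if element is None:
            continue  # the filter rejects null elements
        emitted.append({"value": element * input_tuple["multiplier"]})
    return emitted


def aggregate_list(tuples: List[Dict[str, int]]) -> List[int]:
    """Collect the emitted values back into a single list."""
    return [t["value"] for t in tuples]


source: Dict[str, Any] = {"values": [5, None, 9], "multiplier": 3}
emitted = iterate_filter(source)
new_list = aggregate_list(emitted)
print(new_list)  # [15, 27]
```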

Terminating Iterations with Iterate Conditions

You can configure the IterateFilter operator of the previous example to add an Iterate Condition setting, which tells the operator to stop processing the input list as soon as one or more logical criteria are met. For example, you can limit the output to the first 5 list items:

Iterate Condition: each.index <= 4

Or you can stop iterating when a flag in some field has or does not have a certain value.
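
The following Python sketch models how such a condition interacts with the null filter from the previous example. It assumes the condition is evaluated on list positions, as each.index implies; the function name iterate_first_n and the sample list are hypothetical.

```python
from typing import List, Optional


def iterate_first_n(source: List[Optional[int]], n: int) -> List[int]:
    """Stop after the first n list positions, then drop null elements."""
    emitted = []
    for index, element in enumerate(source):
        if not (index <= n - 1):   # models each.index <= 4 when n is 5
            break                  # condition false: iteration terminates
        if element is not None:    # the null filter from the previous example
            emitted.append(element)
    return emitted


values = [10, None, 30, 40, 50, 60, 70]
first_five = iterate_first_n(values, 5)
print(first_five)  # [10, 30, 40, 50]
```

Because the condition counts list positions rather than emitted tuples, the null at index 1 still uses up one of the five positions, so only four values are emitted in this model.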

Iterating Over a Range: Simulating a For Loop

The three previous examples iterate over lists that arrive from input stream tuple fields. However, the list to iterate over does not need to come from the input: it can be any list that is a result of the Iterate Over property’s expression. Thus, the list could be a literal, a constant, or some parametric list that is determined based on the value from any input tuple field.

This feature allows us to use the Iterate operator to create iteration structures resembling more specific iteration and looping constructs found in other programming languages. For example, we can use the range() expression language function to create iterations that resemble for loops, as in the following pseudo-code:

for i = 0 to 9 by 2
    output(i);

To achieve this result in EventFlow, set the Iterate Over expression to range(0,10,2), and add an output field i whose expression is each.element.

The range() function creates a list using its arguments; thus range(0,10,2) returns [0,2,4,6,8], a five-element list, and the Iterate operator iterates over that list.

range() always creates the entire list it is asked for all at once, so be cautious when using range() to control very long iterations. For example, range(0, 1000000) creates a list with one million elements, which would not be the most memory-efficient way to control an iteration.
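
The for-loop equivalence can be checked with a short Python sketch. Python's built-in range accepts the same start, exclusive end, and step arguments used in this example, but it is lazy, so list() is applied here to mirror the fully built list described above. The function name for_loop_outputs is hypothetical.

```python
from typing import Dict, List


def for_loop_outputs(start: int, end: int, step: int) -> List[Dict[str, int]]:
    """Build the whole list first, then emit one tuple with field 'i' per element."""
    source = list(range(start, end, step))  # materialized all at once
    return [{"i": element} for element in source]


outputs = for_loop_outputs(0, 10, 2)
print([row["i"] for row in outputs])  # [0, 2, 4, 6, 8]
```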

Iterating Over a Range: Make N Copies of a Tuple

You can use a parametric range invocation such as range(0, input1.N), and have the Iterate operator emit all input fields and nothing else. In this case, the Iterate operator creates N copies of each of its input tuples, with the value N potentially varying with each input tuple. Specifically, for a tuple whose N field = 3, the Iterate operator emits 3 tuples that are indistinguishable replicas of the input tuple.
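
A minimal Python model of this replication pattern follows. The function name replicate and the sample field symbol are hypothetical; only the N field comes from the description above.

```python
from typing import Any, Dict, List


def replicate(input_tuple: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Emit one copy of the input tuple per element of range(0, N)."""
    return [dict(input_tuple) for _ in range(0, input_tuple["N"])]


source = {"symbol": "ABC", "N": 3}  # sample data
copies = replicate(source)
print(len(copies))  # 3
```

Every copy is equal to the input tuple because only the input fields are emitted and the range elements themselves are ignored.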
