GATHER Statement

Syntax

GATHER [stream_identifier_1.]field_identifier_1 AS output_field_identifier_1[, ...]
  FROM stream_identifier_1, stream_identifier_2[, ...]
  USING field_identifier_gather
  [TIMEOUT value ON field_identifier_timeout]
  [ERROR INTO stream_identifier];

Substitutable Fields

stream_identifier_n

The unique identifier (name) of a stream.

field_identifier_n

A tuple field to be included in the output stream.

output_field_identifier_n

The name for a tuple field that is included in the output stream.

field_identifier_gather

A tuple field present in all incoming streams that will be used as the gather key field.

value

The timeout criterion: the maximum difference between the specified field's value in the unmatched tuples and its value in the most recently emitted tuple.

field_identifier_timeout

The tuple field monitored for timeout.

... in GATHER clause

Additional output field entries of the form: [stream_identifier_n.]field_identifier_n AS output_field_identifier_n.

... in FROM clause

Additional input stream entries of the form: stream_identifier_n.

ERROR INTO Clause

You can append an ERROR INTO clause just before the closing semicolon. The StreamSQL ERROR INTO clause is analogous to the Enable Error Output Port check box for operators and adapters in EventFlow applications.

Use ERROR INTO with the name of a stream, which must already exist. This sets up an Error Port for this operator, which is much like a local catch mechanism for errors from this operator.

See Using Error Ports and Error Streams for a discussion of StreamBase error handling mechanisms.
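
As a hedged sketch of where the clause sits, the following fragment routes errors from a GATHER to a previously declared stream. The stream names (Orders, Fills, GatherErrors, MatchedOrders), the field names, and the input schemas are hypothetical. Declaring the error stream with a bare CREATE STREAM statement is an assumption; the only requirement stated above is that the named stream already exists.

```sql
-- Hypothetical input streams that share the key field order_id.
CREATE INPUT STREAM Orders (order_id int, symbol string);
CREATE INPUT STREAM Fills (order_id int, fill_price double);

-- Assumed declaration of the stream that receives error tuples.
CREATE STREAM GatherErrors;

-- ERROR INTO is placed last, just before the closing semicolon.
CREATE OUTPUT STREAM MatchedOrders AS
  GATHER Orders.order_id AS order_id,
         Orders.symbol AS symbol,
         Fills.fill_price AS fill_price
  FROM Orders, Fills
  USING order_id
  ERROR INTO GatherErrors;
```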

Discussion

GATHER combines its input streams by matching tuples one-to-one across all of them on the key field, and produces one output tuple whose values can be drawn from any of the matched input tuples. By default, GATHER buffers the input tuples for each key value until that value has been seen on every input stream; only then is the output tuple released. Matching tuples can arrive in any order (the inputs need not be synchronized), and output tuples are emitted in the order in which they become fully matched.

The target list following the GATHER keyword can include field values from any of the matching tuples. The streams being gathered are listed after the FROM keyword; USING identifies the key field.
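
As a minimal sketch of how these clauses fit together, the following fragment gathers two streams on a shared key. The stream names (Orders, Fills, MatchedOrders), the field names, and the input schemas are hypothetical and are not taken from this page.

```sql
-- Hypothetical input streams; both carry the key field order_id.
CREATE INPUT STREAM Orders (order_id int, symbol string, quantity int);
CREATE INPUT STREAM Fills (order_id int, fill_price double);

-- One output tuple is emitted per order_id value once that value
-- has been seen on both Orders and Fills.
CREATE OUTPUT STREAM MatchedOrders AS
  GATHER Orders.order_id AS order_id,
         Orders.symbol AS symbol,
         Fills.fill_price AS fill_price
  FROM Orders, Fills
  USING order_id;
```

In this sketch the target list draws order_id and symbol from Orders and fill_price from Fills, so each output tuple mixes values from both matched input tuples.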

Sometimes it is not desirable to wait until every input stream has supplied a matching tuple; instead, you want to generate output from the input tuples that are currently buffered. To support this requirement, GATHER includes an optional TIMEOUT clause that collects the available input tuples into an output tuple under certain conditions. The ON keyword identifies the tuple field that the timeout monitors; value sets the timeout criterion, which is the maximum difference between the specified field's value in the unmatched tuples and its value in the most recently emitted tuple. For example, TIMEOUT 200 ON int_field specifies that once each stream has received a tuple with an int_field value that is 200 higher than the int_field value of the unmatched tuples, the unmatched tuples are emitted. Null values are inserted for any target list entries that cannot be populated from the buffered tuples.
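
The TIMEOUT 200 ON int_field case described above might be written as in the following sketch. The stream names (StreamA, StreamB, Gathered), the id key field, the a_value and b_value fields, and the input schemas are hypothetical; only the TIMEOUT clause itself comes from the example in the text.

```sql
-- Hypothetical input streams; both carry the key field id and
-- the monitored field int_field.
CREATE INPUT STREAM StreamA (id int, int_field int, a_value string);
CREATE INPUT STREAM StreamB (id int, int_field int, b_value string);

-- Tuples that remain unmatched are emitted once the timeout
-- criterion on int_field is met.
CREATE OUTPUT STREAM Gathered AS
  GATHER StreamA.id AS id,
         StreamA.a_value AS a_value,
         StreamB.b_value AS b_value
  FROM StreamA, StreamB
  USING id
  TIMEOUT 200 ON int_field;
```

If a tuple is emitted by timeout while only StreamB has supplied its id value, the entries drawn from StreamA (here id and a_value) would be null in the output tuple, following the rule stated above.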

GATHER generates a stream and can be used anywhere a stream expression is acceptable. As an alternative, the output could be captured in a stream, as illustrated in the following code fragments.

CREATE [OUTPUT] STREAM stream_identifier;
GATHER ... INTO stream_identifier;

CREATE [OUTPUT] STREAM stream_identifier AS
  GATHER ...;

Or, for an OUTPUT STREAM:

GATHER ... => CREATE OUTPUT STREAM stream_identifier;