VJOIN Statement

Syntax

 VJOIN target_list
  FROM {stream_identifier_1 | stream_expression_1} 
    [[AS] stream_alias_1], {stream_identifier_2 | stream_expression_2} 
    [[AS] stream_alias_2]
  WHERE join_field_identifier_2
    BETWEEN join_field_identifier_1 - value_1 AND join_field_identifier_1 + value_2
    [AND predicate]
  [TIMEOUT timeout_value]
  [ERROR INTO stream_identifier];

Substitutable Fields

target_list

One or more entries, separated by commas, of the format target_list_entry.

target_list_entry

A value, of the format scalar_expression [AS output_field_identifier], to be included in the result set returned by the statement. Values can be obtained from a tuple field or from a simple function. Optionally, the name for a value can be modified through an AS clause.

scalar_expression

A value that will be included in the output tuple. Fields from either input tuple can be included in the output tuple. If desired, the output field can be renamed using the AS clause.

stream_identifier

The streams containing the input tuples.

stream_expression

A StreamSQL statement that produces a stream. The statement must be enclosed within parentheses.

stream_alias

A unique identifier or name (alias) used only within this statement.

join_field_identifier

The fields to be compared as the basis of the join; join_field_identifier_1 is name of the join field contained in the first stream while join_field_identifier_2 is name of the join field contained in the second stream. These fields can have different names but they must be of the same type. For tuples in the two input streams to be joined, the value of the join field in the second input stream must be within the range around the value of the join field in the first input stream [Low end of range = (value of join field in stream 1) - value_1; High end of range = (value of join field in stream 1) + value_2].

value

value_1 sets the lower end of the join range; value_2 sets the upper value of the join range. These entries, which are the same data type as the join_field_identifier, can be equivalent or different values.

predicate

A conditional statement that establishes additional constraints for the join, for example, stream_identifier_1.field_identifier==stream_identifier_2.field_identifier.

timeout_value

An integer value, in seconds, after which tuples will be flushed from the buffer. The TIMEOUT functionality is not enabled until at least two tuples arrive on a stream.

ERROR INTO Clause

You can append an ERROR INTO clause just before the closing semicolon. The StreamSQL ERROR INTO clause is analogous to the Enable Error Output Port checkbox for operators and adapters in EventFlow applications.

Use ERROR INTO with the name of a stream, which must already exist. This sets up an Error Port for this operator, which is much like a local catch mechanism for errors from this operator.

See Using Error Ports and Error Streams for a discussion of StreamBase error handling mechanisms.

Discussion

With a value based join, matches will only occur if the tuples satisfy the predicate condition and the values of their ordering fields are within a specified range. For the functionality to run correctly, the value of the ordering field in each stream must be increasing; undefined behavior will result if the value of the ordering field in either stream decreases.

All incoming tuples are buffered and a tuple will remain in the buffer as long as its ordering field value remains within the join range of each newly arriving tuple on the other inputs stream. If the value of the ordering field in a newly arriving tuple causes the join range to be repositioned, then tuples whose ordering field values are outside of the repositioned range will be dropped from the operator.

VJOIN generates a stream and can be used anywhere a stream expression is acceptable. As an alternative, the output could be captured in a stream, as illustrated in the following code fragments.

CREATE [OUTPUT] STREAM stream_identifier;
VJOIN ... INTO stream_identifier;
CREATE [OUTPUT] STREAM stream_identifier AS
  VJOIN ...;

Or, for an OUTPUT STREAM

VJOIN ... => CREATE OUTPUT STREAM stream_identifier;