SELECT Statement

Syntax

SELECT target_list_entry [, target_list_entry...]
  FROM event_source [...] |
  FROM PATTERN template [pattern_operator template ...] 
  WITHIN (interval TIME | value ON field)
  [FOREACH listfield AS field_identifier_foreach]
  [WHERE predicate]
  [HAVING predicate]
  [GROUP BY field_identifier_grouping [, ...]]
  [ORDER BY field_identifier_ordering [, ...] [DESC] [LIMIT number]]
  [INTO stream_identifier]
  [ERROR INTO stream_identifier]

Substitutable Fields

target_list_entry

A rule that defines a value to be included in the result set that is returned by the statement. The target list can contain one or more comma-separated target list entries. You can use either standard field identifiers, or wildcard rules:

  • expression [AS field_alias]
    expression

    An expression that identifies a field in a stream. The field's value will be included in the result set. Fields can be obtained from an event field, a table entry, or from a simple or aggregate function.

    field_alias

    Renames the field_identifier in the result set. Alias names must be used for values derived from functions and from named schemas; otherwise they are optional.

    Here are four examples of target list entry expressions:

    userID, userName
    il.Symbol AS symbol, il.Price
    tuple(point.x + point.y) AS pointsum
    point(1,2) AS p

    The first expression above selects the values of two fields in some stream. The second selects two fields from a stream identified as il and renames one of them. The third example assembles two fields from a named schema (called point) into a tuple field called pointsum. The fourth example uses a tuple constructor to specify constant values, which must conform to the named schema (point in this case).

  • wildcard_expression [AS wildcard_target]
    wildcard_expression

    A field identifier that uses a wildcard (*) as a variable name. The wildcard is expanded to a literal value. Examples of wildcard expressions:

    *
    f(g(*))
    f(*) + g(*) + 2
    wildcard_target

    After the AS keyword, the name to which the expanded value in the expression will be bound. The format is a wildcard with an optional string prefix and postfix. For example:

    *
    foo*
    *bar
    foo*bar

    A wildcard rule with no AS keyword and explicit target is assumed to have the default target, *. For example, these two wildcard rules are equivalent:

    symbol(*)
    symbol(*) AS *

    For details about using wildcards, how they are expanded, and some examples, see Wildcard Rules in StreamSQL SELECT Statements.

event_source

In a FROM clause: a stream, windowed stream, materialized window, or table from which the statement obtains values.

A single windowed stream does not require a comma between the stream identifier and the window specification. A simple join of a stream and table is indicated by separating their identifiers with a comma, while an outer join of a stream and a table or a stream and a materialized window is indicated by separating the identifiers with the OUTER JOIN keywords. Use a comma to separate the two windowed streams used in a tuple-based join.

  • Single unwindowed stream:

    FROM {stream_identifier | subquery} [AS source_alias]
  • Single windowed stream:

    FROM  {stream_identifier | subquery} '['window_specification | window_identifier']' 
      [AS source_alias]
  • Table query:

    FROM {stream_identifier | subquery} [AS source_alias] {, | OUTER JOIN } table_identifier 
      [AS table_alias]
  • Materialized window:

    FROM {stream_identifier | subquery} [AS source_alias]  OUTER JOIN  
      materialized_window_identifier [AS table_alias]
  • Tuple-based join:

    FROM {stream_identifier_1 | subquery_1} '['window_specification_1 | window_identifier_1']' 
      [AS source_alias_1], {stream_identifier_2 | subquery_2} 
      '['window_specification_2 | window_identifier_2']' [AS source_alias_2]
  • Pattern expression:

    FROM PATTERN {stream_identifier_1 | subquery_1} [AS source_alias_1]  
      [pattern_operator [NOT] ({stream_identifier_2 | subquery_2} [AS source_alias_2]),...] 
      WITHIN (interval TIME | value ON field)

    See Detecting Patterns for more information about pattern expressions.

stream_identifier

The unique identifier (name) of the stream.

subquery

A parenthesized statement that produces a stream.

source_alias

A unique identifier or name (alias) used only within this statement for an event source providing input used by this statement.

window_identifier

A named window specification previously declared with a CREATE WINDOW statement. Note that the square brackets are a required part of the syntax.

window_specification

A window specification defined within the FROM clause. Note that the square brackets are a required part of the syntax.

table_identifier

A named table previously declared with a CREATE TABLE statement.

materialized_window_identifier

A named materialized window previously declared with a CREATE MATERIALIZED WINDOW statement.

template

In the second (PATTERN) form of the FROM clause, a pattern that evaluates to a stream identifier or alias. The pattern can include nested templates, or templates combined using logical operators.

pattern_operator

Logical operator (or equivalent keyword) that relates a pair of templates in a FROM PATTERN clause. For example, !A or NOT A or !A, A THEN B or A -> B.

interval

For a time-based FROM PATTERN clause, defines a timeout (in seconds). All the events in the entire pattern (including subpatterns) must be received within the given amount of time.

value ON field

For a value-based FROM PATTERN clause, the maximum size of the range of values in the specifed field . The order field must be a top-level numeric or timestamp field, as described in StreamBase Pattern Matching Language. For example, a window size of 5 for an int field means that the window will close when the range of values of that field, across all streams, equals or exceeds 5.

predicate

A clause that limits the result set returned by the SELECT statement. That is, to be selected, the stream, event, or table row identified in the SELECT statement must satisfy the restrictions in the predicate. In a WHERE clause, if the predicate evaluates to true, the stream, event, or row is selected. In a HAVING clause, the predicate sets limits on which rows will be returned. The predicate can contain logical operators, mathematical operators, and/or a BETWEEN-AND clause.

SELECT * FROM InputStream1
  WHERE someField >= 3 && someField <= 6
  INTO OutputStream1;

Or

SELECT * FROM InputStream1
  WHERE someField BETWEEN 3 AND 6
  INTO OutputStream1;
listfield

A field of type list in the schema of the incoming tuple, whose elements are directed to the field_identifier_foreach field for each outgoing tuple of a SELECT statement with a FOREACH clause.

field_identifier_foreach

An output field prepended to the schema of the incoming tuple by the FOREACH clause. This field has the same data type as the list element type of the list field specified in the FOREACH clause. It is populated by the elements of the specified list, one per outgoing tuple.

field_identifier_grouping

An output field used to group the entries in the result set returned by the statement. When the SELECT statement is applied to a windowed stream, the GROUP BY clause must include each target_list_entry that is not derived from an aggregate function.

field_identifier_ordering

An output field used to order the entries in the result set returned by the statement. If the SELECT statement is being used as a Table query (that is, to read from a table), the ordering column(s) must be indexed.

number

The number of entries to include in the result set returned by the statement. The LIMIT keyword is only valid within the context of the ORDER BY clause.

ERROR INTO Clause

You can append an ERROR INTO clause just before the closing semicolon. The StreamSQL ERROR INTO clause is analogous to the Enable Error Output Port check box for operators and adapters in EventFlow applications.

Use ERROR INTO with the name of a stream, which must already exist. This sets up an Error Port for this operator, which is much like a local catch mechanism for errors from this operator.

See Using Error Ports and Error Streams for a discussion of StreamBase error handling mechanisms.

Discussion

SELECT is used to retrieve events from an unwindowed stream, one or two windowed streams, a materialized window, or a table. A SELECT statement includes required subclauses. There are two forms of the FROM clause: the first identifies the streams, materialized window or table from which the events are extracted. The second form (FROM PATTERN) identifies one or more streams in which events satisfy a specified pattern. The optional WHERE subclause adds additional restrictions to the SELECT result, such as a range of values or a limit to the number of events received. A SELECT statement can also include nested SELECT statements (also called subqueries).

A target_list_entry is a rule that expands to a value that will be included in each row of the result set. An entry can be extracted from an event present on a stream, from an event in a materialized window, from a row in a table, or from the return from a StreamBase function or expression. You can use wildcards in expressions. If a function or expression is used to generate a value, a field_alias must be specified; if a wildcard expression is used, a wildcard target must be specified.

If the target list includes values derived from aggregate functions, the FROM clause must specify a windowed stream, materialized window, or a table. If the target list of a SELECT statement run against a windowed stream includes event field values, simple function values or scalar expressions, and values derived from aggregate functions, each of the non-aggregate values must be listed in the GROUP BY clause.

When a target_list_entry is an aggregate function applied to a window definition, for example, the openval function, the syntax varies depending on whether the window was defined in a separate CREATE WINDOW statement or listed with the stream identifier in the FROM clause. If a CREATE WINDOW statement is used, then the window identifier is included as a parameter: openval(window_identifier). If the window specification is included within the FROM clause, it does not have a window identifier. In this case, a parameter is not required to invoke the openval() function.

When you use the first (non-PATTERN) form of the FROM clause to extract values from an unwindowed stream, the FROM clause includes only one unwindowed stream event source. If the event source is a windowed stream, the FROM clause includes a stream and window specification. To access a materialized window, the FROM clause must include a stream event source and a materialized window event source, and when accessing a table, the FROM clause must include a stream event source and a table event source. Finally, when performing an event based join, the FROM clause must include two windowed streams.

With a materialized window or table read, one or more field values from each event on the incoming stream are used to select rows from the window or table. Consequently, the SELECT statement can return a result set containing content from multiple window or table entries.

The FROM PATTERN clause must also contain a WITHIN subclause. This subclause defines a window within which the pattern is to be detected, so that the operation is finite and guaranteed to terminate. The window can be either time-based or value-based:

Time

The window will be managed and evaluated based on the passage of time. For example, an event containing the query result may be emitted and the window closed when a specified time has elapsed, followed by the arrival of a new event. For example, in this clause, the pattern query will compare the two input streams shown over a period of one day:

FROM PATTERN (SoldInput THEN BoughtInput) WITHIN 86400 TIME
Value

The query window closes when the range of values compared in a specified field exceeds a specified limit. The order field must be a numeric or timestamp field. For example, in this clause the query evaluates the id field in both input streams. The window closes and query results are output when input on the first stream is followed by input on the second stream, and the second stream's id is within 2 of the first stream's id:

FROM PATTERN (SoldInput THEN BoughtInput) WITHIN 2 ON id

If the WHERE clause is omitted from the SELECT statement, all events or rows are selected from the stream, materialized window, or table.

The HAVING clause is only valid when a SELECT statement is run against a windowed stream.

The ORDER BY field_identifier entries are the stream fields or table columns to use in ordering the result set. The optional LIMIT clause restricts the size of the result set. Note that the LIMIT clause must be used in conjunction with the ORDER BY clause.

Use the FOREACH clause to iterate over the elements of a list field, outputting one tuple for each element in listfield. When using FOREACH, you must specify listfield, a field of type list in the incoming tuple, and must specify a name for a field, field_identifier_foreach, that will be prepended to each outgoing tuple. The incoming tuple is resolved into multiple outgoing tuples, one per element of listfield. The schema of each outgoing tuple is the same as the incoming tuple, but with field_identifier_foreach prepended. The prepended field has the same data type as the list element type of listfield. The prepended field is populated with each element of listfield, in list order, one per outgoing tuple.

The following example outputs one tuple for each element of field prices, which has data type list(double). This statement sends to stream OutStream one tuple for each element in the prices field, with an extra field, eachprice, prepended.

SELECT * FROM InStream
  FOREACH prices AS eachprice
  INTO OutStream;

The return from a SELECT statement must be captured into a stream. The CREATE STREAM Statement statement (or possibly a CREATE OUTPUT STREAM Statement statement) can be combined with an embedded SELECT statement or written as separate statements, as the following fragments illustrate.

CREATE STREAM stream_idenfifier AS
  SELECT ...;

Or

CREATE STREAM stream_identifier;
SELECT ... INTO stream_identifier;