StreamBase Expression Language Functions

This topic describes the syntax and use of each function in the StreamBase expression language. See StreamBase Expression Language Features for information on using the expression language in general. See StreamBase Data Types for usage information on StreamBase data types.

This page is organized into the following sections:

Index of Expression Language Functions

The following table is a complete, alphabetized index into the functions provided in the StreamBase expression language, with links to the description for each function.

Function Name and Link to Section Category Simple or Aggregate Function
abs Math Simple function
acos Math Simple function
addbson BSON Simple function
advanceTimeBy System Simple function
advanceTimeTo System Simple function
aggregatelist Aggregate to List Aggregate function
aggregatelistnonnull Aggregate to List Aggregate function
alpha Statistical calculations Aggregate function
andall (simple) Math Simple function
andall (aggregate) Statistical calculations Aggregate function
append Lists Simple function
asin Math Simple function
atan Math Simple function
atan2 Math Simple function
avg (simple) Lists Simple function
avg (aggregate) Statistical calculations Aggregate function
avg_list (simple) Lists Simple function
avg_list (aggregate) Statistical calculations Aggregate function
avgif (aggregate) Statistical calculations Aggregate function
beta Statistical calculations Aggregate function
bitand Math Simple function
bitnot Math Simple function
bitor Math Simple function
bitxor Math Simple function
black_scholes Financial Simple function
blob Type conversions Simple function
bool Type conversions Simple function
bsontojson BSON Simple function
calljava (simple) External Functions (simple) Simple function
calljava (aggregate) External Functions (aggregate) Aggregate function
catchexception Errors Simple function
cbrt Math Simple function
ceil Math Simple function
choose Math Simple function
closeval Windowing Aggregate function
coalesce Utilities Simple function
coalesce_tuples Utilities Simple function
compile Utilities Simple function
compound_interest Financial Simple function
concat (simple) Lists Simple function
concat (aggregate) Aggregate to List Aggregate function
contains (list item) Lists Simple function
contains (strings) Strings Simple function
correlation_coefficient Statistical calculations Simple function
correlation_coefficient Statistical calculations Aggregate function
correlation_coefficientp Statistical calculations Aggregate function
cos Math Simple function
cosh Math Simple function
count Statistical calculations Aggregate function
countlong Statistical calculations Aggregate function
count_distinct Statistical calculations Aggregate function
count_distinct_elements Lists Simple function
count_list (simple) Lists Simple function
count_list (aggregate) Statistical calculations Aggregate function
countif (aggregate) Statistical calculations Aggregate function
covariance Statistical calculations Aggregate function
covariancep Statistical calculations Aggregate function
date Timestamps Simple function
days Timestamps Simple function
dotproduct Lists Simple function
double Type conversions Simple function
emptylist Lists Simple function
endswith Strings Simple function
epoch Timestamps Simple function
error Errors Simple function
eval Utilities Simple function
exp Math Simple function
exp_moving_avg Statistical calculations Aggregate function
expm1 Math Simple function
factorial Math Simple function
filterlist Lists Simple function
filternull Lists Simple function
findbson BSON Simple function
firstelement Lists Simple function
firstn Windowing Aggregate function
firstnonnullval Windowing Aggregate function
firstval Windowing Aggregate function
floor Math Simple function
foldleft Lists Simple function
foldright Lists Simple function
format Strings Simple function
format_time Timestamps Simple function
fromCodePoint Utilities Simple function
from_gmtime Timestamps Simple function
from_localtime Timestamps Simple function
from_unixtime Timestamps Simple function
frombson BSON Simple function
gaussianrandom Math Simple function
getAllHostIPs Runtime Simple function
get_conf_param Runtime Simple function
get_day_of_month Timestamps Simple function
get_day_of_week Timestamps Simple function
get_day_of_year() Timestamps Simple function
get_hour Timestamps Simple function
get_millisecond Timestamps Simple function
get_minute Timestamps Simple function
get_month Timestamps Simple function
getOldestWindowID() Windowing Aggregate function
get_quarter Timestamps Simple function
get_second Timestamps Simple function
getWindowID() Windowing Aggregate function
get_year Timestamps Simple function
getClientIP Runtime Simple function
getContainer Runtime Simple function
getHostName Runtime Simple function
getNodeName Runtime Simple function
getParallelRoot Runtime Simple function
getPath Runtime Simple function
getResourceAbsolutePath Runtime Simple function
getServerURI Runtime Simple function
getTableSize Runtime Simple function
getTargetTime() System Simple function
goodman_kruskal_gamma Statistical calculations Simple function
goodman_kruskal_gamma Statistical calculations Aggregate function
great_circle_distance Utilities Simple function
has_intersection Lists Simple function
hash Utilities Simple function
hours Timestamps Simple function
indexof (for lists) Lists Simple function
indexof (for strings) Strings Simple function
inet_aton Internet Simple function
inet_ntoa Internet Simple function
insertelement Lists Simple function
int Type conversions Simple function
intercept Statistical calculations Aggregate function
intersect Lists Simple function
interval Timestamps Simple function
isempty Strings Simple function
isinterval Timestamps Simple function
isnan NaN (not a number) Simple function
isnull Utilities Simple function
isOldestWindow() Windowing Aggregate function
join, joinlist Lists Simple function
join, joinlist Aggregate to List Aggregate function
joinbson BSON Simple function
jsontobson BSON Simple function
jsonpath Utilities Simple function
kendall_tau Statistical calculations Simple function
kendall_tau Statistical calculations Aggregate function
lag Windowing Aggregate function
lastelement Lists Simple function
lastindexof (for lists) Lists Simple function
lastindexof (for strings) Strings Simple function
lastn Windowing Aggregate function
lastnonnullval Windowing Aggregate function
lastval Windowing Aggregate function
lenbson BSON Simple function
length Utilities Simple function
length (for lists) Lists Simple function
length (for strings) Strings Simple function
list Type conversions Simple function
listbson BSON Simple function
ln Math Simple function
log10 Math Simple function
log1p Math Simple function
long Type conversions Simple function
lower Strings Simple function
lshift Math Simple function
ltrim Strings Simple function
maplist Lists Simple function
maplist_loose Lists Simple function
max (simple) Math Simple function
max (aggregate) Statistical calculations Aggregate function
maxn (aggregate) Statistical calculations Aggregate function
maxdouble Math Simple function
maxelement Lists Simple function
maxint Math Simple function
maxlong Math Simple function
md5 Utilities Simple function
median (simple) Lists Simple function
median (aggregate) Statistical calculations Aggregate function
mergelist Lists Simple function
milliseconds Timestamps Simple function
min (simple) Math Simple function
min (aggregate) Statistical calculations Aggregate function
minn (aggregate) Statistical calculations Aggregate function
mindouble Math Simple function
minelement Lists Simple function
minint Math Simple function
minlong Math Simple function
minutes Timestamps Simple function
namebson BSON Simple function
named schema constructor function Type conversions Simple function
nan Math Simple function
nanotime System Simple function
new_tuple Utilities Simple function
new_tuple_subset Utilities Simple function
new_tuple_subset_loose Utilities Simple function
notnan NaN (not a number) Simple function
notnull Utilities Simple function
now Timestamps Simple function
nullif Utilities Simple function
nulllist Lists Simple function
openval Windowing Aggregate function
orall (simple) Math Simple function
orall (aggregate) Statistical calculations Aggregate function
parsecsv Utilities Simple function
parsejson Utilities Simple function
parsejson_loose Utilities Simple function
parse_time Timestamps Simple function
pearson Statistical calculations Simple function
pearson Statistical calculations Aggregate function
percentile Statistical calculations Aggregate function
permute Math Simple function
pow Math Simple function
prepend Lists Simple function
product (simple) Lists Simple function
product (aggregate) Statistical calculations Aggregate function
putbson BSON Simple function
random Math Simple function
random_tuple Utilities Simple function
randomstring Strings Simple function
range Lists Simple function
rank (aggregate) Aggregate to List Aggregate function
regexmatch Strings Simple function
regexmatch_ignorecase Strings Simple function
regexreplace Strings Simple function
regexsplit Strings Simple function
regextuple Strings Simple function
removebson BSON Simple function
removeelement Lists Simple function
replace Strings Simple function
replaceelement Lists Simple function
reverse Lists Simple function
round Math Simple function
rshift Math Simple function
rtrim Strings Simple function
seconds Timestamps Simple function
securerandom Math Simple function
set_day_of_month Timestamps Simple function
set_day_of_week Timestamps Simple function
set_hour Timestamps Simple function
set_minute Timestamps Simple function
set_month Timestamps Simple function
set_second Timestamps Simple function
set_year Timestamps Simple function
sign Math Simple function
sin Math Simple function
sinh Math Simple function
shuffle() Lists Simple function
sleep System Simple function
slope Statistical calculations Aggregate function
sort Lists Simple function
spearmans_rank Statistical calculations Simple function
spearmans_rank Statistical calculations Aggregate function
split Strings Simple function
splitbson BSON Simple function
sqrt Math Simple function
startswith Strings Simple function
stdev (simple) Lists Simple function
stdev (aggregate) Statistical calculations Aggregate function
stdevp (simple) Lists Simple function
stdevp (aggregate) Statistical calculations Aggregate function
strftime Timestamps Simple function
string Type conversions Simple function
stringbson BSON Simple function
strlen Strings Simple function
strpinterval Timestamps Simple function
strptime Timestamps Simple function
strresize Strings Simple function
strresizetrunc Strings Simple function
sublist Lists Simple function
substr Strings Simple function
sum (simple) Lists Simple function
sum (aggregate) Statistical calculations Aggregate function
sum_list (simple) Lists Simple function
sum_list (aggregate) Statistical calculations Aggregate function
sumif (aggregate) Statistical calculations Aggregate function
systemenv System Simple function
systemproperty System Simple function
tan Math Simple function
tanh Math Simple function
throw Errors Simple function
time Timestamps Simple function
timezoneoffset Timestamps Simple function
timestamp Type conversions Simple function
tobson BSON Simple function
today Timestamps Simple function
today_utc Timestamps Simple function
toCodePoint Utilities Simple function
toCSV Utilities Simple function
tojson Utilities Simple function
to_degrees Math Simple function
to_milliseconds Timestamps Simple function
to_radians Math Simple function
to_seconds Timestamps Simple function
trim Strings Simple function
tuplebson BSON Simple function
tuple Type conversions Simple function
unique Lists Aggregate function
unique Aggregate to List Simple function
unsignedrshift Math Simple function
unzip Lists Simple function
upper Strings Simple function
uuid Utilities Simple function
variance (simple) Lists Simple function
variance (aggregate) Statistical calculations Aggregate function
variancep (simple) Lists Simple function
variancep (aggregate) Statistical calculations Aggregate function
version System Simple function
vwap Statistical calculations Aggregate function
weeks Timestamps Simple function
withmax Statistical calculations Aggregate function
withmin Statistical calculations Aggregate function
xorall Math Simple function
zip Lists Simple function

Back to Top ^

Simple Functions Overview

The majority of functions in the StreamBase expression language are simple functions, which operate on a single tuple field at a time. You can use simple functions in expressions for any StreamBase operator (except the Heartbeat, Metronome, and Union operators, which do not accept expressions).

The expression language also supports aggregate functions, which are described in Aggregate Functions Overview.

Simple functions are organized into the following categories:

Back to Top ^

Simple Functions: BSON

This category includes the following functions:

These functions create, manipulate and extract contents of BSON blobs. BSON blobs, as defined by bsonspec.org, are maintained in a binary format in which zero or more values, arrays, and objects with fields (key/value pairs) are stored as a single entity. A variety of implementations exist.

BSON is short for Binary JSON. BSON blobs serialize data in the manner of JSON, and like JSON, BSON data may include nested objects. JSON, as defined on http://www.json.org is text consisting of colon-separated name-value pairs (with field name quoted), and each pair separated by a comma. A BSON-encoded blob is sometimes referred to as a document.

The following table indicates how StreamBase types correspond to BSON types.

StreamBase Type BSON Type
int int32
long int64
double double
bool Boolean
string UTF-8 string
null null
timestamp timestamp*
blob binary
list array
tuple object

*The StreamBase BSON Java implementation for timestamps differs from the official BSON specification. They are both int64 but the StreamBase format is used internally rather than the bsonspec.org format.

addbson()

Function syntax:

blob addbson(blob bsonblob-array, T element)

Adds a new field at the end of a BSON array. Accepts a BSON blob that contains an array and an element to be appended to it. element can be of any type compatible with the contents of the array bsonblob-array. Returns a BSON object for the expanded array.

Back to Top ^

bsontojson()

Function syntax:

string bsontojson(blob bsonblob)

Converts BSON blob to JSON-formatted string. Accepts one BSON blob and returns its contents as a JSON object of key-value pairs.

Back to Top ^

findbson()

Function syntax:

blob findbson(blob bsonblob, int index)
blob findbson(blob bsonblob, string fieldname)

Searches a BSON object for a specified array index or field name. Accepts a BSON blob assumed to contain either a BSON array or a BSON object and an index into the array or a string that should match a field name. Returns a blob containing the found field or throws an error if not found or index is out of bounds. index is a zero-based index. fieldname is a string, interpreted as a field name.

Back to Top ^

frombson()

Function syntax:

T frombson(blob bsonblob, T type)

Converts a BSON blob to the requested StreamBase type. Accepts one BSON blob and returns its contents as the requested StreamBase type. The second argument provides an example of the return type and is not evaluated. If the type does not match, throws an error.

Back to Top ^

joinbson()

Function syntax:

blob joinbson(list(blob) L)

Concatenates a list of BSON fields into a BSON object. Accepts a list of BSON blobs containing fields and concatenates them. If the list argument is empty, returns an empty blob. If the list is null, returns null. To test for null blobs, use the isnull function. You can use splitbson to create a list of BSON fields.

Back to Top ^

jsontobson()

Function syntax:

blob jsontobson(string jsonstring)

Converts a JSON string to a BSON blob. Accepts one JSON string and returns its contents as a BSON blob. The argument jsonstring is any JSON string. For example:

jsontobson('{"symbol":"IBM","price":75.91,"quantity":150, "date":"2013-04-28 16:00:00.000-0400"}')

Back to Top ^

lenbson()

Function syntax:

int lenbson(blob bsonblob)

Returns the length of a BSON object or array.

If bsonblob contains an array, returns the number of elements in it.

If bsonblob is an object, returns the number of fields in it.

If bsonblob is any other scalar type, throws an error.

Back to Top ^

listbson()

Function syntax:

blob listbson(T1 value1, T2 value2, ... Tn valuen)

Accelerator for constructing a (possibly heterogeneous) BSON array, similar to a StreamBase list. Accepts any number of arguments and returns a BSON array containing them. Values are individually interpreted by tobson and can be of any type.

Back to Top ^

namebson()

Function syntax:

string namebson(blob bsonblob)

Returns the name of a BSON field

If bsonblob is a BSON field, returns the field name, otherwise returns an empty string.

Back to Top ^

putbson()

Function syntax:

blob putbson(blob bsonblob-object, string name, T value)

Adds a new field to a BSON object. Returns the blob bsonblob-object with added field name having value value. If field name already exists, its current value is replaced with value.

Back to Top ^

removebson()

Function syntax:

blob removebson(blob bson-list,
                int element-index1, ... int element-indexN)
blob removebson(blob bson-tuple,
                int element-index1|string fieldname1, ...
                int element-index2|string fieldnameN) 

Removes fields or list elements from BSON blobs. The BSON blob can contain an array or an object. A blob is returned with fields or elements that match the specified element-index or fieldname arguments removed. Fields can be removed from objects by index or field name.

The order of indexes and names in arguments does not matter. Any out-of-bounds indexes and non-matching field names are ignored.

Back to Top ^

splitbson()

Function syntax:

list splitbson(blob bsonblob-containing-list)
list splitbson(blob bsonblob-containing-object) 

Returns a list of BSON fields from a BSON object or a list of BSON values from a BSON array. Use frombson to convert individual returned values to StreamBase types.

Back to Top ^

stringbson()

Function syntax:

string stringbson(blob bsonblob-string)

Converts a BSON blob to a string. The BSON blob must contain one string. Unlike the string function, no type conversion is performed.

Back to Top ^

tobson()

Function syntax:

blob tobson(T arg)

Converts StreamBase values to a BSON blob. Each StreamBase int, long, double, string, timestamp, and blob maps to a respective BSON type, nulls map to nulls, lists map to BSON arrays, and tuples map to BSON objects.

Back to Top ^

tuplebson()

Function syntax:

blob tuplebson(AS-expr1, AS-expr2, ... AS-exprN)

Accelerator for constructing a BSON object from comma-separated AS expressions (comma-separated value expressions, each field named with the AS keyword). The resulting blob contains an object having the specified fields and values.

The following example creates a BSON object with three fields:

tuplebson("IBM" AS symbol, 87.3875 AS price, 47 AS quantity)

Back to Top ^

Simple Functions: Errors

This category includes the following functions:

catchexception()

Function syntax:

T catchexception(T arg1 [, ... [, T argn]])

Attempts to evaluate all arguments in order, returning the first one that evaluates without an error, or null if all arguments evaluate to an error. Accepts all data types, but all arguments must have the same type. The returned value has the same data type, T, as the arguments.

For example, the following example attempts to parse str as an interval, but returns a null timestamp if parsing fails:

catchexception(interval(str))

The following example attempts to divide a by b, but returns -1 if the division fails:

catchexception(a/b, -1)

Back to Top ^

error()

Function syntax:

bool error(string message)

Throws an error with message text message that can be caught using the StreamBase error streams mechanism. In the default configuration, this produces an error message on the console, and the server continues processing.

Because error() returns bool, it works well as a predicate expression in operators. For example, use error("Error condition") as the last predicate expression for a Filter operator to produce an error message when no earlier predicate expression was matched.

Back to Top ^

throw()

Function syntax:

T throw(string message)

Throws an error with message text message that can be caught using the StreamBase error streams mechanism. In the default configuration, this produces an error message on the console, and the server continues processing.

Because throw() returns an untyped null by default, it works well in if statements. For example: if (test-expr) then (5) else throw("Failed the test")

For the purposes of typechecking, cast the return of this function with one of the casting functions. For example, use bool(throw("Unmatched filter option")) as the last predicate expression of a Filter operator to produce an error message when no earlier predicate expression was matched.

Back to Top ^

Simple Functions: External Functions

This category includes the following functions:

calljava()

Function syntax:

T calljava(string class, string method [, arg1 , ..., argn])

Use calljava() to run a Java method directly from any StreamBase operator that uses an expression. The return type, T, of calljava() is the same as the return type of the called method. The called method can be from a standard Java library class, such as java.lang.Math, or can be from a custom class of your own in the current project or in a referenced project.

Custom Java functions are functions you build with the StreamBase Java Client library. You can use this simple form of calljava() in any expression except aggregate or Query Table contexts. (To use calljava() in an aggregate expression, refer to the aggregate calljava().)

You can use calljava()without any configuration file entries if the method called is in a class in the current project or in a Maven dependency of this project.

As an alternative to using calljava(), you can define an alias for your custom function, and call the alias in your operator expressions. See Custom Functions in Java. Aliases must be defined in configuration files.

The calljava() function distinguishes simple from aggregate functions by the number of arguments. For custom simple functions, you must specify both class and method names; for custom aggregate functions, you specify only the class name.

To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide.

Return Types and Argument Types

The function you write to be called with calljava() can have any number of arguments, including none. Variable-length arguments are accepted by both simple and aggregate custom functions. Each calljava() argument arg1 through argn associates a StreamBase data type with one of the following primitive or Java object types:

StreamBase Data Type Java Primitive Java Object
blob com.streambase.sb.ByteArrayView
bool boolean java.lang.Boolean
double double java.lang.Double
int int java.lang.Integer
long long java.lang.Long
list java.util.List
string byte[] java.lang.String
timestamp com.streambase.sb.Timestamp
tuple com.streambase.sb.Tuple

In writing your Java function, use the Java primitive if the execution speed of your function is paramount. However, if you pass a null for any argument that resolves in your function to a Java primitive, calljava() returns an error and does not call your function. Only use Java primitives if your application otherwise insures that null values cannot be passed to your function through calljava().

Use the Java object when your application might pass null to your function through calljava(), and when no Java primitive is available.

Back to Top ^

Simple Functions: Financial

This category includes the following functions:

black_scholes()

Function syntax:

double black_scholes(string type, double underlying, double strike, 
    double dividendYield, double riskFreeInterestRate, 
    double Volatility, timestamp exerciseDate [, timestamp dealDate])

The black_scholes function calculates fair-value and risk statistics (delta, gamma, vega) for European style options on securities with continuous dividend yields. This is known as the Black-Scholes Generalized model. It also calculates implied volatility.

The black_scholes function takes eight input arguments. The first seven arguments are required, while the eighth argument, dealDate, is optional.

The arguments are:

  1. type: Required. Should be set to one of the following string values:

    • Call (The price of an option.)

    • Put (The price of an option.)

    • DeltaCall (The sensitivity of the price of an option to changes in the price of the stock.)

    • DeltaPut (The sensitivity of the price of an option to changes in the price of the stock.)

    • ThetaCall (Theta measures how the price of an option changes with time.)

    • ThetaPut (Theta measures how the price of an option changes with time.)

    • Gamma (The sensitivity of a stock's delta to the stock price.)

    • Vega (The rate of change of the value of an option with respect to the volatility of the stock's price.)

    • RhoCall (The sensitivity of the price of an option with respect to the risk-free interest rate.)

    • RhoPut (The sensitivity of the price of an option with respect to the risk-free interest rate.)

    • ImpliedVolatilityCall (Implied volatility of the underlying stock for a given price.)

    • ImpliedVolatilityPut (Implied volatility of the underlying stock for a given price.)

  2. underlying: Required. A double. The price of the underlying stock.

  3. strike: Required. A double. The strike price of the stock on the exercise date.

  4. dividendYield: Required. A double. For example, 0.03 for 3.0%.

  5. riskFreeInterestRate: Required. A double. For example, 0.05 for 5.0%.

  6. Volatility (or value): Required. A double. This input argument is overloaded. When computing anything except the implied volatility for a given call or put price this argument is the volatility of the stock. For example, 0.2 for 20%. When computing implied volatility this argument is the option price.

  7. exerciseDate: Required. An absolute or interval timestamp. An absolute timestamp represents the option exercise date, while an interval timestamp represents the number of days to exercise.

  8. dealDate: Optional. A timestamp representing the option deal date. If not provided, the current date is the default.

Back to Top ^

compound_interest()

Function syntax:

double compound_interest(double principalValue, double matureValue, 
    double numberOfPeriods [, double guess])

Returns the compound interest given the principal value, mature value, and number of periods. An optional fourth argument, guess, can be used to set the initial compound interest used by the function. The compound_interest function uses the Newton-Raphson algorithm to compute compound interest rate. This algorithm performs the same calculations repetitively. Each iteration results in a compound interest that is closer to the final result. The number of iterations required to get the results within an acceptable inaccuracy (1.0e-4) depends on the number of periods and the initial compound interest used to start the calculation.

If the guess argument is not provided, the default initial value is 0.01 (1%). If the function returns a NaN (not a number), you can provide the initial compound interest in this fourth argument. If the initial value is closer to the actual rate of the computation, it can be sped up significantly.

Back to Top ^

Simple Functions: Internet

This category includes the following functions:

inet_aton()

Function syntax:

long inet_aton(string address)

Takes a string containing an IPv4 address in dotted-quad notation ("nnn.nnn.nnn.nnn") and returns a long containing that address encoded in network byte order.

Back to Top ^

inet_ntoa()

Function syntax:

string inet_ntoa(int address)
string inet_ntoa(long address)

Takes an int or long containing an IPv4 address encoded in network byte order and returns a string containing that address in dotted-quad notation ("nnn.nnn.nnn.nnn").

Back to Top ^

Simple Functions: Lists

A list is an ordered collection of values (called elements), each of which is of the same StreamBase data type, called the list's element type. The element type can be any StreamBase data type, such as an int, a tuple, or even a list (thus allowing for constructions such as a list of list of int).

Individual elements in a list can be accessed using their zero-based integer position (their index) in the list. In any expression in an EventFlow program, use brackets to address individual elements of a list. Thus, for a list field named L, use L[0] to address the first element in the list, L[1] for the second element, and L[length(L)-1] to address the last element.

In most list-related functions that take an index, you can also use a negative index to count backward from the end of the list. Thus, for a list L, L[-1] is equivalent to L[length(L)-1].

The number of elements in a list is determined at application runtime. A null list is not the same as a list with zero elements (an empty list), which is not null.

Several functions that manipulate lists can take user-written functions as arguments. These are filterlist(), foldleft(), foldright(), maplist(), mergelist(), and sort().

append()

Function syntax:

list(T) append(list(T) L, T e1, T e2, ..., T en)

Returns its argument list, L, with elements e1 through en added to the end of the list in the order specified. The data types of the appended elements e and the element type of list L must be the same, except that element types int and long can be coerced to long and double following the rules in Data Type Coercion and Conversion. The element type of the returned list is the same as the element type of the argument list and of the appended elements.

The function takes any number of element arguments, including zero. That is, append(L) returns list L unchanged. If any list element is null, a null element is appended to the list in the position specified.

Back to Top ^

avg()

Function syntax:

double    avg(list(double) L)
double    avg(list(int) L)
double    avg(list(long) L)
timestamp avg(list(timestamp) L)

Given a list, L, of doubles, ints, longs, or timestamps, returns the average of all members of the list as a double. The argument list can have a mix of ints, longs, and doubles, which are coerced according to the standard coercion rules.

For a returned average of timestamp values to make sense, the list should contain all interval timestamps or all absolute timestamps, but not both.

See also the aggregate version of avg().

Back to Top ^

avg_list()

Function syntax:

list(double) avg_list( list(T) LL (list(T) L1, list(T) L2, ...) )

Accepts a list of lists as an argument, where each member list of the containing list LL must have the same numeric element type coercible to double (int, long, or double). Returns a list containing the element-wise average of the elements of each member list in LL. The returned list is a list(double).

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the average.

For example:

avg_list( [ [1,2,3], [4,5,6] ] )
avg_list( list( list(1,2,3), list(4,5,6) ) )
avg_list( list( list(1L,null,3L), list(4L,5L,6L,7L) ) )
avg_list( list( list(4.2,67,88.8), list(2.3,5.66,98.6) )

return:

[2.5, 3.5, 4.5]
[2.5, 3.5, 4.5]
[2.5, 5.0, 4.5, 7.0]
[3.25, 36.33, 98.735]

See also the avg_list() function for aggregates.

Back to Top ^

concat()

Function syntax:

list(T) concat(list(T) L1, list(T) L2 [, ..., list(T) Ln])

Returns a list produced by concatenating the elements of its argument lists in the order specified. Takes two or more list arguments, which must have the same element type. The element type, T, of the returned list is the same as the element types of the argument lists.

See also the aggregate version of concat().

Back to Top ^

contains()

Function syntax:

bool contains(list(T) L, T item)

Takes two arguments: a list with element type T, and a value, item, of the same type. Returns true if item is a member of the specified list, or false if not.

If the entire list is null, returns null. If a member of the list is null, and item is null, returns true. If no member of the list is null, and item is null, returns false.

If L is a list of strings, item must match an element exactly. Substrings of elements are not matched. Thus, the following expression returns false:

'foo' in ['there','is','food','here']

The string version of contains() does match substrings, so to determine if a substring exists in a list, you can use join() to convert a list of strings to a string. The following expression thus returns true:

'foo' in join(' ',['there','is','food','here'])

The IN infix operator provides an alternate implementation for contains().

Back to Top ^

count_distinct_elements()

Function syntax:

int count_distinct_elements(list(T) L)

Returns the number of unique elements in the argument list, L, with any element type, T.

For example:

count_distinct_elements(list(1, 2, 3, 3, 4, 4, 5, 6, 6, 7))

returns 7.

See also the related function, unique() and the aggregate function count_distinct().

Back to Top ^

count_list()

Function syntax:

list(int) count_list(list(T) LL(list(T) L1, list(T) L2, ...))

Accepts a list of lists as an argument, where each member list L1, L2, and so on of the containing list LL must have the same element type or a coercible type. Returns a list containing an element-wise count of the corresponding elements of each member list in LL. The list element type, T, of the returned list is the same as the element type of the argument lists, or a coerced superset type. For example:

list_list( [ [1,2,3], [4,5,6,7,8] ] )

returns:

[2, 2, 2, 1, 1]

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

See also the count_list() function for aggregates.

Back to Top ^

dotproduct()

Function syntax:

T dotproduct(list(T) L1, list(T) L2, ...)

Returns the sum of the element-wise product of its argument lists, whose element types must be int, double, or long. Accepts any number of lists, but is typically used with two list arguments. The data type of the returned value is the same as the element type of the argument lists. If the element type of the argument lists do not match, they are promoted using the rules in Data Type Coercion and Conversion.

This function produces the dot product, also known as the scalar product, of two or more numeric lists. The function multiplies the elements of each list in element order, L1.first times L2.first, then L1.second times L2.second, and so on. The resulting list of element products is summed and returned. If an argument list is null, the result is null. If any element in any argument list is null, the result is null. If an argument list is shorter than another argument list, the extra positions are filled in with 1 (or 1.0 or 1L), which preserves the extra position elements of the longer list unchanged.

Back to Top ^

emptylist()

Function syntax:

list(T) emptylist(T x)

Returns an empty list with the same element type as the argument. The argument x's value is ignored, and only its type is considered. This function is typically used like these examples:

emptylist(int())

emptylist(string())

You can also use an argument with a literal value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:

emptylist(17)

emptylist("IBM")

Back to Top ^

filterlist()

Function syntax:

list(T) filterlist(inputlist(T), 
    inFunction(arg T) {boolean function})

The filterlist function takes a list as its first argument and a function as its second argument. The inputs to that function must be coercible to its declared input types. If the input argument inputlist contains tuples, the number, order, and names of its fields must match exactly, and its field types must be coercible to the field types of the input function's argument fields.

filterlist returns a list composed of only the elements of the argument inputlist for which the function provided in the second argument returns true when applied to elements of inputlist. The element type, T, of the returned list is the same as the element type of the argument inputlist.

The input function inFunction must take in a single argument of the same type as the input list element type, and must output a boolean value. The input function can be a built-in function or one defined using the expression language. Null values in the input list are not treated specially and are included in the output list if the function returns true for nulls.

Examples:

filterlist( [14,22,8,19,38,18,6,28,31,25,6], function myrange(x int) { x > 19 && x < 30} )

Returns:

[22, 28, 25]

In the next example, citytemps is a named schema of daily temperature records with schema {city string, mintemp int, maxtemp int}. We want to filter from this a list of tuples, each containing a city that had a temperature range of more than 20 degrees.

filterlist( list(citytemps("Phoenix", 77, 104), citytemps("Fargo", 58, 89), citytemps("Miami", 82, 96), citytemps("Seattle", 62, 73), citytemps("Boston", 54, 76)), function over20(rec citytemps) { rec.maxtemp - rec.mintemp > 20 } )

Returns:

[{"Phoenix", 77, 104}, {"Fargo", 58, 89}, {"Boston", 54, 76}]

Back to Top ^

filternull()

Function syntax:

list(T) filternull(list(T) L)

Returns a list composed of the elements of the argument list, L, with any null elements removed. The element type, T, of the returned list is the same as the element type of the argument list.

Back to Top ^

firstelement()

Function syntax:

T firstelement(list(T) L)

Returns the first element of argument list L. The data type, T, of the returned value is the same as the element type of the argument list.

The usage firstelement(L) is the equivalent of L[0].

Back to Top ^

foldleft()

Function syntax (with line breaks added for clarity):

T foldleft(function(arg1 T, arg2 T, arg3 T, ... , argn T) 
    {expression}, arg1_initial, List<arg2> list1, List<arg3> list2, ... , 
    List<argn> listn)
T foldleft(function name (arg1 T, arg2 T, arg3 T, ... , argn T) 
    {expression}, arg1_initial, List<arg2> list1, List<arg3> list2, ... , 
    List<argn> listn)

Executes {expression} to accumulate values on the elements of a set of input lists, working left to right, first to last, through the input list elements, and returns the accumulated result. All the input lists must have the same number of elements, which must be numeric types. Input lists that are null (nulllist(<type>)) are treated as lists of nulls.

Although normally unnamed, the function can be given a name, as the second syntax above and the second example below illustrate.

The number of arguments to the input function must be one more than the number of input lists. The first input argument, arg1, initializes the accumulator. Subsequent arguments must be coercible to arg1's data type, as must be the return data type.

To calculate the output, the function first sets the accumulator to the initial value of arg1. The input function is applied on this accumulator value along with the first elements of the lists (listn[0]) to produce a new accumulator value. The input function is then applied on this new accumulator value and the second element of the lists (listn[1]) to calculate a new accumulator value, which are used with the third elements of the lists (listn[2]) and so on, until the end of the lists is reached. The accumulator value at this point is returned as the output.

Examples:

Sum the elements of one list, starting at 10.

foldleft(function (a int, b int) { a+b }, 10, list(1, 2, 3))

This evaluates to 16.

Construct a list of the maximum values of corresponding elements of three lists leftward (line breaks added for clarity):

foldleft(function maxOfLists(incoming list(int), a int, b int, c int) 
    { append(incoming,(max(a, b, c))) },
    emptylist(int()), list(88, 78, 65), list(45, 90, 23), 
    list(890, 89, 78) )

Evaluates to:

list(890, 90, 78)

using this sequence of operations:

maxOfLists([], 88, 45, 890) yields [890]
maxOfLists([890], 78, 90, 89) yields  [890, 90]
maxOfLists([890, 90], 65, 23, 78) yields [890, 90, 78]

Back to Top ^

foldright()

Function syntax (line break added for presentation clarity):

T foldright(function(arg1 T, arg2 T, arg3 T, ... , argn T) 
    {expression}, arg1_initial, List<arg2> list1, List<arg3> list2, 
    ... , List<argn> listn)
T foldright(function name(arg1 T, arg2 T, arg3 T, ... , argn T) 
    {expression}, arg1_initial, List<arg2> list1, List<arg3> list2, 
    ... , List<argn> listn)

Executes {expression} to accumulate values on the elements of a set of input lists, working right to left, last to first, through the input list elements, and returns the accumulated result. All the input lists must have the same number of elements, which must be numeric types. Input lists that are null (nulllist(<type>)) are treated as lists of nulls.

Although normally anonymous, the function can be given a name, as the second syntax above and the second example below illustrate.

The number of arguments to the input function must be one more than the number of input lists. The first input argument, arg1, initializes the accumulator. Subsequent arguments must be coercible to arg1's data type, as must be the return data type.

To calculate the output, the function must set the first accumulator value to initial value arg1. The input function is applied on this accumulator value along with the last elements of the lists (listn[n-1]) to produce a new accumulator value. The input function is then applied on this new accumulator value and the second-from-last element of the lists (listn[n-2]) to calculate a new accumulator value, which is used with the third-from-last elements of the lists (listn[n-3]) and so on, until the beginning of the lists is reached. The accumulator value at this point is returned as the output.

Examples:

Sum the elements of two lists:

foldright(function (a int, b list(int), c list(int)) { a+b+c }, 0, list(1, 2, 3), list(4, 5, 6))

This evaluates to 21.

Construct a list of the maximum values of corresponding elements of three lists rightward (with line breaks added for clarity):

foldright(function maxOfLists(incoming list(int), a list(int), 
    b list(int), c list(int)) { append(incoming,(max(a, b, c))) },
    emptylist(int()), list(88, 78, 65), list(45, 90, 23), 
    list(890, 89, 78) )

Evaluates to:

list(78, 90, 890)

Using this sequence of operations:

maxOfLists([], 65, 23, 78) yields [78]
maxOfLists([78], 78, 90, 89) yields  [78, 90]
maxOfLists([78, 90], 88, 45, 890) yields [78, 90, 890]

Back to Top ^

has_intersection()

Function syntax:

bool has_intersection(list(T) L1, list(T) L2)

Returns true if lists L1 and L2, which must have the same type, contain any values in common and false if they do not. The input lists can be of different lengths. For example:

has_intersection([2,4,6,8,10],[1,3,5])

returns false, but

has_intersection([0,],[1,3,5,0])

returns true.

See also the intersect() function, which returns the set of values that two lists have in common.

indexof()

Function syntax:

int indexof(list(T) L, T e [, int start])

Returns the index of the first element of the list L whose value is e, starting at index start. The argument list can have any element type, T. The element e must have the same element type as the argument list. For list arguments, if e is not found, the function returns a value of null.

If the search is successful, the value returned is always greater than or equal to start. If start is unspecified, it is taken to be 0, which designates the first character of the string or the first element of the list.

See also the indexof() function that operates on strings.

insertelement()

Function syntax:

list(T) insertelement(list(T) L, int index, T e)

Returns a list composed of the elements of the argument list, L, with element e inserted before index position index in the argument list. index is zero-based. The element type, T, of the returned list is the same as the element type of the argument list and of the inserted element, e.

The usage insertelement(L, length(L), e) is the equivalent of append(). The usage insertelement(L, 0, e) is the equivalent of prepend().

Back to Top ^

intersect()

Function syntax:

list<T> intersect(list(T) L1, list(T) L2)

Returns a list containing the values common to lists L1 and L2, which must have the same type, or an empty list if they share no values in common. The input lists can be of different lengths. For example:

intersect([2,4,6,8,10,12,14,16,18,20],[3,6,9,12,15,18])

returns:

[6,12,18]

See also the has_intersection() function, which determines whether or not two lists have common values.

join(), joinlist()

Function syntax:

string join(list(T) L)
string joinlist(list(T) L)
string join(string separator, list(T) L)
string joinlist(string separator, list(T) L)

Returns a string composed of the elements of list L, having any element type. With a single argument, elements are concatenated. With two arguments, the separator character is inserted between each element. If the list argument is empty, join() returns an empty string. If list is null, join() returns null. If the separator string is null, join() returns null.

You can use joinlist() as an alias for join() in all contexts. In LiveView contexts, which has the reserved word JOIN, you must either use the joinlist() alias, or use escaped identifier syntax.

Also see the aggregate function join().

Back to Top ^

lastelement()

Function syntax:

T lastelement(list(T) L)

Returns the last element of argument list L. The data type of the returned value is the same as the element type of the argument list.

The usage lastelement(L) is the equivalent of L[-1].

Back to Top ^

lastindexof()

Function syntax:

int lastindexof(list(T) L, T e [, int lastStart])

Returns the index of the last element of the list L whose value is e, for which the index is less than or equal to lastStart. The argument list can have any element type, T. The element e must have the same element type as the argument list. For list arguments, if e is not found, the function returns a value of null.

If the search is successful, the value returned is always less than or equal to lastStart. If lastStart is unspecified, it is taken to be the index of the last grapheme of the string or the last element of the list.

See also the lastindexof() function that operates on strings.

length()

Function syntax:

int length(list L)

Returns the number of elements in a list, including any null elements. If the argument is a null list, the returned value is null.

See also the length() functions that operate on lists and blobs.

list()

Returns a list containing its arguments as elements. See list() in the Simple Functions: Type Conversions section.

maplist()

Function syntax:

list(T) maplist(function(arg1 T, arg2 T, ... , argn T)
  {function-body}, list1 T, list2 T, ... , listn T)

The maplist function maps the elements of one or more lists onto each other, according to a function() declared as the first argument. The maplist() function takes a declared function() as its first argument and one or more lists as additional arguments. It applies the function to the elements of each argument list, then returns a single list that contains the results of applying the function on each element position, in element order.

To take a simple example, the following maplist() instance declares a function that has two integer arguments and a function body that adds the arguments. When applied to the input list(int) lists, the result is that the first element of the first argument list is added to the first element of the second argument list. Then the second element of the first list is added to the second element of the second list, and so on.

maplist(function(x int, y int) { x + y }, list(1, 2, 3), list(4, 5, 6))

The result is:

(list(int)) [5, 7, 9]

The input function can be defined using the expression language or a built-in function. The number of arguments to the declared function must be at least equal to the number of input argument lists. If the argument lists have more elements than the number of function arguments, the function-body expression continues to be applied.

The inputs to the function — that is, the elements of the input lists — must be coercible to the declared data type of the function arguments. If an input argument is a tuple, the number, order, and names of the fields must match exactly, and the field types of the input arguments must be coercible to the field types of the input function's argument fields.

The element type of the returned list is the same as the element type of the input lists or one that is coercible from them.

Input lists with null elements are valid lists. The result element corresponding to a null element's position can be a null, depending on the operation declared in the function-body. The declared function must handle any nulls. The maplist() function as a whole does not do so. For example:

maplist(function(x int, y int) { x + y }, list(1, 2, 3), list(4, null, 6))

returns:

(list(int)) [5, null, 9]

To take another example:

maplist(function(n1 int, n2 int, n3 int) {n1+n2+n3}, [1,2,3,4], [2,4,6,8], [3,6,9,12])

returns:

(list(int)) [6, 12, 18, 24]

Input lists that are null (nulllist(<type>)) are treated as lists of nulls.

Back to Top ^

maplist_loose()

Function syntax:

list(T) maplist_loose(function(T1, T2, ... , Tn){...}, list<T1>, list<T2>, ... , list<Tn>)

The maplist_loose() function takes the same function() first argument and one or more lists as additional arguments as the maplist() function, and returns a list the same as maplist(). The difference is that this function allows lists of any length, padding shorter lists with nulls to the same length as the longest list encountered.

See maplist() for instructions on using this function.

Back to Top ^

maxelement()

Function syntax:

T maxelement(list(T) L)

This function returns the maximum non-null element value using the greater-than relational operator across all elements in the argument list, L. The data type of the returned value is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate version of max(), and the aggregate maxn() function. Use the simple function max(list, , list) to determine the larger of two or more lists. In an 2, max(list) returns the maximum list in the aggregate window. By contrast, use maxelement() to compare the element values of a single list.

Back to Top ^

median()

Function syntax:

double median(list(double) L)
double median(list(int) L)
double median(list(long) L)

Given a list, L, of doubles, ints, or longs, returns a double, the median of the elements of L. If the list has an even number of elements, returns the average of the middle two.

See also the aggregate version of median().

Back to Top ^

mergelist()

Function syntax:

list(T) mergelist(list<list<T>> listOfLists)
list(T) mergelist(list<list<T>> listOfLists, bool ascending)
list(T) mergelist(list<list<T>> listOfLists, function(arg1 T, 
    arg2 T) {int compareFunction})
list(T) mergelist(list<list<T>> listOfLists, bool ascending, 
    function(arg1 T, arg2 T) {int compareFunction})

The mergelist function merges a list of sorted input lists into a single sorted output list. It takes one, two, or three arguments:

  1. The single argument form takes a required list of lists, which is assumed to be presorted in the desired order. mergelist does not sort its input lists.

  2. The optional second argument is a boolean value indicating that the output is to be sorted: true for ascending or false for descending.

  3. The third argument is an optional function that compares the values of two inputs (see below).

If a boolean argument is used, it must be the second argument. If absent, the output list is sorted ascending.

When the input lists consist of simple types, default StreamBase expression language comparison operators are normally adequate to obtain properly sorted output.

When the input list elements are tuples, you can provide a compare function if you need to sort on a particular field or fields of input tuples. The compare function can be a built-in function or declared using the expression language. It must accept two input arguments and output an integer. mergelist expects the function's return value to be negative if the first argument is less than the second argument, positive if the first argument is greater than the second argument, and 0 if they are equal. The inner list type must be coercible to the function argument types.

For example, by default a list of lists of tuples with a schema {price int, volume int, symbol string}, would sort on the first field (price). When prices are equal, the second field (volume) is used to resolve any ties, and then the third field (symbol), as needed. If, however, you want to sort only on symbol, you can provide a function to do so, such as the following, where mytuple is the name of a named schema used as a data type:

function symbolCompare(t1 mytuple, t2 mytuple)

  if (t1.symbol > t2.symbol) then 1
  else if (t1.symbol < t2.symbol) then -1
  else 0 
}

If the input list of lists contains only one inner list, the output list's elements are the contents of that inner list in the same order. When null values exist in input lists, how they sort in the output list (like other values) is determined by the compare function. For more information see Null Literals.

A call to mergelist that uses the above compare function to sort an enqueued list of lists might take this form (line breaks were added for clarity):

mergelist( list(list()) input.trades, 
  function symbolCompare(t1 mytuple, t2 mytuple)
  { if (t1.symbol > t2.symbol) then 1 else if 
      (t1.symbol < t2.symbol) then -1 else 0 } )

Back to Top ^

minelement()

Function syntax:

T minelement(list(T) L)

This function returns the minimum non-null value using the less-than relational operator across all elements in the argument list, L. The data type of the returned value is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate version of min(), and the aggregate minn() function. Use the simple function min(list, , list) to determine the smaller of two lists. In an aggregate context, min(list) returns the minimum list in the aggregate window. By contrast, use minelement() to compare the element values of a single list.

Back to Top ^

nulllist()

Function syntax:

list(T) nulllist(T x)

Notice that the name of the function has three lowercase letter L's. Returns a null list with the same element type as the argument. The argument x's value is ignored, and only its type is considered. This function is typically used like these examples:

nulllist(double())

nulllist(timestamp())

You can also use an argument with a value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:

nulllist(17)

nulllist("IBM")

Back to Top ^

prepend()

Function syntax:

list(T) prepend(list(T) L, T e)

Returns a list containing element e at the start, then all of list L. The data type of element e and the element type of list L must be the same, except that argument types int and long can be coerced to long and double following the rules in Data Type Coercion and Conversion. The element type of the returned list is the same as the element type of the argument list and of the prepended element, e.

Back to Top ^

product()

Function syntax:

double product(list(double) L)
int    product(list(int) L)
long   product(list(long) L)

Given a list, L, of doubles, ints, or longs, returns the product of each member of the list. The data type of the returned value is the same as the element type of the argument list.

See also the aggregate version of product().

Back to Top ^

range()

Function syntax:

list(int) range(int start, int end [, int step])

Returns a list of ints from value start up to (or down to) but not including value end. That is, the range includes the first item in the list and excludes the last item. Thus, range(1, 10) returns a list with nine elements, not ten:

list(int) [1, 2, 3, 4, 5, 6, 7, 8, 9]

The default value for step is 1. step cannot be 0, but you can specify a step value greater than 1 to produce a list with skipped elements. For example, a step of 2 returns every other number:

range(0, 8, 2)

returns

list(int) [0, 2, 4, 6]

If you specify a negative value for step, the list generates in reverse. For example:

range(3, 0, -1)

returns

list(int) [3, 2, 1]

while

range(6, 1, -2)

returns

list(int) [6, 4, 2]

When step is negative and end is greater than start, the result is an empty list. Likewise, an empty list also results when start is greater than end and step is positive.

Back to Top ^

regexsplit()

Parses a string into tokens delimited by a character matching a regular expression, and returns a list of strings with the tokens as elements. See regexsplit() in the Simple Functions: Strings section.

removeelement()

Function syntax:

list(T) removeelement(list(T) L, int index)

Returns a list composed of the elements of the argument list, L, with the element at index position index removed. The element type, T, of the returned list is the same as the element type of the argument list.

Back to Top ^

replaceelement()

Function syntax:

list(T) replaceelement(list(T) L, int index, T e)
list(T) replaceelement(list(T) L, updatefunction(arg T, index int)->T)
list(T) replaceelement(list(T) L, predicatefunction(arg T)-> bool, 
    replacement T)

Returns a list composed of the elements of the argument list, L, with the element at index position index replaced by element e. The element type of the returned list is the same as the element type of the argument list.

In the second form, instead of specifying an index into the list of interest, specify an updatefunction to be called with every element of the list, which function can return either the current value or some new value determined at runtime.

In the third form, specify a predicatefunction to be called with every element of the list, which returns replacement for all values that match the predicate function.

Back to Top ^

reverse()

Function syntax:

list(T) reverse(list(T) L)

Returns its argument list, L, in reverse order. The element type, T, of the returned list is the same as the element type of the argument list.

Back to Top ^

shuffle()

Function syntax:

list(T) shuffle(list(T) L)

Returns its argument list, L, with its elements shuffled in random order. Calling shuffle() again on the same list returns a different sort order. The element type, T, of the returned list is the same as the element type of the argument list.

Back to Top ^

sort()

Function syntax:

list(T) sort(list(T) L [, bool ascending])
list(T) sort(list(T) L , 
    function compareFunction(arg 1 T, arg2 T)->boolean)
list(T) sort(list(T) L , bool, 
    function compareFunction(arg1 T, arg2 T)->boolean)

Returns its argument list, L, sorted. Sorting occurs in ascending order if the optional ascending argument is true or omitted, or in descending order if ascending is false. If the ascending argument is used, it must be the second argument. The element type, T, of the returned list is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types. Null values sort as the lowest possible values for the element type of the list. For doubles, NaN sorts as the highest possible value, which yields the following relations: Null < –Infinity < 0 < +Infinity < NaN.

To exercise control over sorting, especially when list elements are tuples, you can provide your own comparison function that compares two input values and outputs an integer value. The output value must be negative if the first argument is less than the second argument, positive if the first argument is greater than the second argument, and 0 if they are equal.

For further details and an example of using a comparison function, see the description of the mergelist function.

Back to Top ^

split()

Parses a string into tokens delimited by a specified character and returns a list of strings with the tokens as elements. See split() in the Simple Functions: Strings section.

stdev()

Function syntax:

double    stdev(list(double) L)
double    stdev(list(int) L)
double    stdev(list(long) L)
timestamp stdev(list(timestamp) L)

Given a list, L, of doubles, ints, longs, or timestamps, returns the standard deviation for all members of the list. If the argument list is a list of doubles, ints, or longs, returns a double. If the argument list is a list of timestamps, returns an interval timestamp. For timestamp values, the list should contain all interval timestamps or all absolute timestamps, but not both. For lists of length 1, returns null.

See also the aggregate version of stdev().

Back to Top ^

stdevp()

Function syntax:

double    stdevp(list(double) L)
double    stdevp(list(int) L)
double    stdevp(list(long) L)
timestamp stdevp(list(timestamp) L)

Given a list, L, of doubles, ints, longs, or timestamps, returns the standard deviation for all members of the list. If the argument list is a list of doubles, ints, or longs, returns a double. If the argument list is a list of timestamps, returns an interval timestamp. For timestamp values, the list should contain all interval timestamps or all absolute timestamps, but not both. For lists of length 1, returns 0.

With stdevp() the data provided is the entire population, while with stdev(), the data provided is treated as a random sample. The stdevp() function is calculated using the biased (or n) method. The stdev() function is calculated using the unbiased (or n-1) method.

See also the aggregate version of stdevp().

Back to Top ^

sublist()

Function syntax:

list(T) sublist(list(T) L, int start [, int end [, int step]])

Returns a list composed of a section of the argument list, L, starting with index start, which is included in the result list. If end is specified, it is not included in the result list. The default value for step is 1. You can supply an alternate step value to skip elements in the argument list. For example, a step of 2 returns every other value in the list. You must specify an end value in order to specify a step value. The element type, T, of the returned list is the same as the element type of the argument list.

In contrast with the range() function, step cannot be negative.

Back to Top ^

sum()

Function syntax:

double    sum(list(double) L)
int       sum(list(int) L)
long      sum(list(long) L)
timestamp sum(list(timestamp) L)

Given a list, L, of doubles, ints, longs, or timestamps, returns the sum of all members of the list. The element type of the returned list is the same as the element type of the argument list.

When the list element type is timestamp, summing elements of the list follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum a list of two or more absolute timestamps. However, you can sum a list composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.

See also the aggregate version of sum().

Back to Top ^

sum_list()

Function syntax:

list(T) sum_list(list(T) LL(list(T) L1, list(T) L2, ...))

Accepts a list of lists as an argument, where each member list L1, L2, and so on of the containing list LL must have the same element type or a coercible type. Returns a list containing the element-wise sum of the elements of each member list in LL. The list element type, T, of the returned list is the same as the element type of the argument lists, or a coerced superset type. For example:

sum_list( [ [1,2,3], [4,5,6] ] )

returns:

[5, 7, 9]

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

See also the sum_list() function for aggregates.

Back to Top ^

unique()

Function syntax:

list(T) unique(list(T) L)

Returns a list composed of the unique elements in the argument list, L, in ascending (lexicographical) order. The element type, T, of the returned list is the same as the element type of the argument list.

For example:

unique([4, 8, 3, 7, 1, 3, 4, 4, 5, 6, 2, 6, 7])

returns:

list(int) [1, 2, 3, 4, 5, 6, 7, 8]

Be aware that lexicographic ordering of strings differs from numeric types. For example, with ints,

unique([400, 8, 3, 7, 1, 3, 4, 5, 400, 6, 6, 7])

returns:

(list(int)) [1, 3, 4, 5, 6, 7, 8, 400]

but given string tokens,

unique(['400', '8', '3', '7', '1', '3', '4', '5', '400', '6', '6', '7'])

returns a list in "dictionary" order:

(list(string)) [1, 3, 4, 400, 5, 6, 7, 8]

To find unique list elements and return them in descending order, use reverse(unique(<list>)).

See also the related function, count_distinct_elements().

Back to Top ^

unzip()

Function syntax:

tuple unzip(list(tuple) listOfTuples)

Takes a list of tuples and returns a tuple of lists.

An example will clarify the function. Let's say your application extracts a list of symbol and price tuples from a stream. One list might contain the following:

[ {prices: 102.51, symbols: AAPL},
  {prices: 96.52, symbols: IBM},
  {prices: 36.33, symbols: HPQ} ]

You can run this list through unzip(), using an expression like the following:

unzip(list(tuple(102.51 as prices,'AAPL' as symbols), 
           tuple(96.82 as prices,'IBM' as symbols), 
           tuple(36.33 as prices,'HPQ' as symbols)))

The returned value is a single tuple whose fields are lists, as follows:

((prices list(double), symbols list(string))) "[102.51,96.82,36.33]","[AAPL,IBM,HPQ]"

Back to Top ^

variance()

Function syntax:

double variance(list(double) L)
double variance(list(int) L)
double variance(list(long) L)

Given a list, L, of doubles, ints, or longs, returns a double, the variance for all members of the list. Variance is a measure of the dispersion of a set of data points around their mean value.

For lists of length 1, returns null. See the aggregate version of variance() for more on variance.

Back to Top ^

variancep()

Function syntax:

double variancep(list(double) L)
double variancep(list(int) L)
double variancep(list(long) L)

Given a list, L, of doubles, ints, or longs, returns a double, the variance for all members of the list. Variance is a measure of the dispersion of a set of data points around their mean value.

The variancep() function is similar to the variance() function: with variancep() the data provided is the entire population, while with variance(), the data provided is treated as a random sample. The variancep() function is calculated using the biased (or n) method. The variance() function is calculated using the unbiased (or n-1) method.

For lists of length 1, returns 0. See also the aggregate version of variancep() for more on variance.

Back to Top ^

zip()

Function syntax:

list(tuple) zip(tuple(list(T1), ..., list(Tn)) tupleOfLists)

In the spirit of the zip function in Python, zip() takes a tuple of lists, collates it, and returns it as a list of tuples, where the nth tuple contains the nth element from each of the argument lists. The element types of the lists in the argument tuple can be of any type, and do not need to be the same.

An example will clarify the function. Let's say you have a stream whose schema accepts lists of securities price data: { prices list(double), symbols list(string) }. One tuple might contain the following:

{ 
  prices:  [102.51, 96.82, 36.33], 
  symbols: [AAPL, IBM, HPQ]
}

You can run this tuple through zip(), using an expression like the following:

zip(tuple(list(102.51, 96.82, 36.33) as prices, list('AAPL', 'IBM', 'HPQ') as symbols))

The returned value is a list of tuples:

(list((prices double, symbols string))) [102.51,AAPL, 96.82,IBM, 36.33,HPQ]

Back to Top ^

Simple Functions: Math

This category includes the following functions:

The following functions provide bitwise operations:

The following functions return the maximum and minimum values for numeric data types:

abs()

Function syntax:

int    abs(int e)
double abs(double e)
long   abs(long e)

Returns the absolute value of an int, double, or long expression e. The return type is the same as the expression's type.

Back to Top ^

acos()

Function syntax:

double acos(int x)
double acos(double x)
double acos(long x)

Returns in radians the arc cosine of x, which is the value whose cosine is x. Undefined outside of the range –1 to 1.

Back to Top ^

andall()

Function syntax:

bool andall(bool b1, bool b2, ...)

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical AND operation on all arguments. For example, andall(true, true, false, true, false) returns false. Use andall() as a truth detector function.

Null arguments have special handling. A null argument does not change the result of the evaluation if a false value is among the arguments: andall(null, true, false, null, true) still returns false. However, if one or more arguments is null while all other arguments are true, the function returns null: andall(null, true, true, null, true) returns null.

The andall() function follows the logic expressed in this statement:

   if any argument is false, return false
   else if any argument is null, return null
   else return true

See also the orall() simple function and the aggregate versions of andall() and orall().

Back to Top ^

asin()

Function syntax:

double asin(int x)
double asin(double x)
double asin(long x)

Returns in radians the arc sine of x, which is the value whose sine is x. Undefined outside of the range –1 to 1.

Back to Top ^

atan()

Function syntax:

double atan(int x)
double atan(double x)
double atan(long x)

Returns in radians the arc tangent of x, which is the value whose tangent is x.

Back to Top ^

atan2()

Function syntax:

double atan2(int x, int y)
double atan2(double x, double y)
double atan2(long x, long y)

Returns in radians the arc tangent of two numeric variables, x and y, either of which can be int, long, or double. This is similar to calculating the arc tangent of (y/x), except that the signs of both arguments are used to determine the quadrant of the result tangent is x.

Back to Top ^

bitand()

int  bitand(int x, int y)
long bitand(long x, long y)

Returns the result of a bitwise AND operation on values x and y. Compares each bit in x to the corresponding bit in y. If both bits are 1, sets the corresponding result bit to 1. Thus, bitand(3,6) returns 2. (0011 AND 0110 returns 0010)

Back to Top ^

bitnot()

int  bitnot(int x)
long bitnot(long x)

Returns the result of a bitwise NOT operation on value x. Compares each bit in x and sets its bitwise complement in the corresponding result bit, switching 0 for 1 and 1 for 0. Thus, bitnot(5) returns -6. (0...0101 returns 1...1010)

Back to Top ^

bitor()

int  bitor(int x, int y)
long bitor(long x, long y)

Returns the result of a bitwise inclusive OR operation on values x and y. Compares each bit in x to the corresponding bit in y. If either bit is 1 or both bits are 1, sets the corresponding result bit to 1; otherwise, sets the corresponding result bit to 0. Thus, bitor(12,7) returns 15. (1100 OR 0111 returns 1111)

Back to Top ^

bitxor()

int  bitxor(int x, int y)
long bitxor(long x, long y)

Returns the result of a bitwise XOR (exclusive OR) operation on values x and y. Compares each bit in x to the corresponding bit in y. If either bit is 1 but not both bits, sets the corresponding result bit to 1; otherwise, sets the corresponding result bit to 0. Thus, bitxor(12,7) returns 11. (1100 OR 0111 returns 1011)

Back to Top ^

cbrt()

Function syntax:

double cbrt(int x)
double cbrt(double x)
double cbrt(long x)

Returns the (real) cube root of x.

Back to Top ^

ceil()

Function syntax:

double ceil(int x)
double ceil(double x)
double ceil(long x)

Returns the smallest double value that is greater than or equal to the argument and is equal to a mathematical integer. The function name, ceil, is shorthand for ceiling. The argument can be an int, double or long. Examples:

  • ceil(7.01) returns 8.0.

  • ceil(8.0) returns 8.0.

  • ceil(8) returns 8.0.

  • ceil(8L) returns 8.0.

See also the related function, floor().

Back to Top ^

choose()

Function syntax:

double choose(int n, int k)

Returns the n choose k binomial coefficient, which represents the number of ways to choose k unordered elements from a set of n elements. Specify n as a positive nonzero integer and k as a power of n, where k is greater than or equal to 1 and less than or equal to n–1. If n < 0 or k > n, 0.0 is returned.

Compare the permute() function.

Back to Top ^

cos()

Function syntax:

double cos(int x)
double cos(double x)
double cos(long x)

Returns the cosine of x, where x is given in radians.

Back to Top ^

cosh()

Function syntax:

double cosh(int x)
double cosh(double x)
double cosh(long x)

Returns the hyperbolic cosine of its argument.

Back to Top ^

exp()

Function syntax:

double exp(int x)
double exp(double x)
double exp(long x)

Returns the value of e (the base of natural logarithms) raised to the power of x. The argument x is a small value of type int, double, or long.

Back to Top ^

expm1()

Function syntax:

double expm1(int x)
double expm1(double x)
double expm1(long x)

Returns (e^x) -1. The argument x is a small value of type int, double, or long.

Back to Top ^

factorial()

Function syntax:

double factorial(int n)

Returns the factorial of the provided integer between 0 and 170. Arguments below zero return NaN; arguments above 170 exceed the size of a double and return Infinity.

Back to Top ^

floor()

Function syntax:

double floor(int x)
double floor(double x)
double floor(long x)

Returns the largest double value that is less than or equal to the argument x, and equal to a mathematical integer. The argument can be an int, double, or long. Examples:

  • floor(7.01) returns 7.0.

  • floor(8.0) returns 8.0.

  • floor(8) returns 8.0.

  • floor(8L) returns 8.0.

See also the related function, ceil().

Back to Top ^

gaussianrandom()

Function syntax:

double gaussianrandom(double mean, double stdev)

Takes two arguments of type double, the mean and standard deviation of a Gaussian distribution, and returns a double randomly chosen from that distribution.

Gaussian distributions are also known as normal distributions (in statistics) or bell curves (in testing). The function is useful for simulating data from any population that tends to follow a normal distribution.

For example, to simulate the SAT score for a random student, we could call gaussianrandom(1500, 100), where 1500 and 100 might be the national mean and standard deviation respectively.

Back to Top ^

ln()

Function syntax:

double ln(int x)
double ln(double x)
double ln(long x)

Returns the natural logarithm of x.

Back to Top ^

log10()

Function syntax:

double log10(int x)
double log10(double x)
double log10(long x)

Returns the base-10 logarithm of x.

Back to Top ^

log1p()

Function syntax:

double log1p(int x)
double log1p(double x)
double log1p(long x)

Returns the natural logarithm of the sum of the argument and 1.

Back to Top ^

lshift()

int  lshift(int x, int y)
long lshift(long x, long y)

Returns the result of a bitwise left shift of the bits in x by the number of bit positions y. Thus, lshift(5,1) returns 10. (0101 returns 1010)

Back to Top ^

max()

Function syntax:

T max(T e1, ..., T en)

This function returns the maximum non-null value using the greater-than relational operator across all its arguments e1 through en, which must all be expressions resolving to the same data type. The data type, T, of the returned value is the same as the arguments.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the aggregate version of max(), the maxelement() function for lists, and the aggregate maxn() function. Use this simple function max(list, , list) to determine the larger of two or more lists. In an aggregate context, max(list) returns the maximum list in the aggregate window. By contrast, use maxelement() to compare the element values of a single list.

Back to Top ^

maxdouble()

Function syntax:

double maxdouble()

This function returns the maximum value for the double data type.

Back to Top ^

maxint()

Function syntax:

int maxint()

This function returns the maximum value for the int data type.

Back to Top ^

maxlong()

Function syntax:

long maxlong()

This function returns the maximum value for the long data type.

Back to Top ^

min()

Function syntax:

T min(T e1, ..., T en)

This function returns the minimum non-null value using the less-than relational operator across all its arguments e1 through en, which must all be expressions resolving to the same data type. The data type of the returned value is the same as the arguments.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the aggregate version of min(), the minelement() function for lists, and the aggregate minn() function. Use this simple function min(list, , list) to determine the smaller of two or more lists. In an aggregate context, min(list) returns the minimum list in the aggregate window. By contrast, use minelement() to compare the element values of a single list.

Back to Top ^

mindouble()

Function syntax:

double mindouble()

This function returns the minimum value for the double data type.

Back to Top ^

minint()

Function syntax:

int minint()

This function returns the minimum value for the int data type.

Back to Top ^

minlong()

Function syntax:

long minlong()

This function returns the minimum value for the long data type.

Back to Top ^

orall()

Function syntax:

bool orall(bool b1, bool b2, ...)

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical OR operation on all arguments. For example, orall(true, true, false, true, false) returns true. Use orall() as a falsity detector function.

Null arguments have special handling. A null argument does not change the result of the evaluation if a true value is among the arguments: orall(null, false, true, null, false) still returns true. However, if one or more arguments is null while all other arguments are false, the function returns null: orall(null, false, false, null, false) returns null.

The orall() function follows the logic expressed in this statement:

   if any argument is true, return true
   else if any argument is null, return null
   else return false

See also the andall() simple function and the aggregate versions of andall() and orall().

Back to Top ^

permute()

Function syntax:

double permute(int n, int k)

Using permutation in the combinatorics sense, this function returns the number of ordered arrangements of k distinct elements that can be select from a set of n elements. Specify n as a positive nonzero integer and k as a number of elements of n, where k is between zero and n. If n < 0 or k > n, 0.0 is returned.

Compare the choose() function.

Back to Top ^

pow()

Function syntax:

double pow(int x, int y)
double pow(double x, double y)
double pow(long x, long y)

Returns the value of x raised to the power of y. Accepts an int, double, or long, and returns a double.

Back to Top ^

random()

Function syntax:

double random()
int    random(int maxrange)

Without arguments, returns a pseudo-random double value with positive sign, greater than or equal to 0.0 and less than 1.0.

With an integer argument maxrange, returns a pseudo-random integer with positive sign, greater than or equal to 0 and less than maxrange. Use this form to pick a random number between 0 and maxrange—1, inclusive. For example, to simulate rolling a six-sided die, the expression maxrange(6) + 1 returns a random integer between 1 and 6 inclusive.

Back to Top ^

round()

Function syntax:

long round(double e)
long round(int e, int n)
long round(double e, int n)
long round(long e, int n)

With one argument, a double, returns the closest long to the argument. Floating-point numbers are NOT supported.

With two arguments, round() is similar to the round() found in Microsoft Excel, but not identical to it. Specify a double, long, or int in the first argument. Specify an int in the second argument to designate the number of digits to round the first argument to. Use 0 to specify rounding to an integer, a positive integer to round the digits after the decimal point, or a negative number to specify rounding digits before the decimal point. If the first argument is an int or long, the second argument must be negative to actually perform rounding; otherwise the first argument is returned unchanged.

Back to Top ^

rshift()

int  rshift(int x, int y)
long rshift(long x, long y)

Returns the result of a logical bitwise right shift of the bits in x by the number of bit positions y. This is the equivalent of the Java >> operator. Thus, rshift(5,1) returns 2. (0101 returns 0010)

Back to Top ^

securerandom()

Function syntax:

double securerandom()

A more secure form of random() that generates random bytes, converted to return a double.

Back to Top ^

sign()

Function syntax:

int sign(int x)
int sign(double x)
int sign(long x)

Returns the sign of its argument: –1 if less than zero, 0 if equal to zero, and 1 if greater than zero.

Back to Top ^

sin()

Function syntax:

double sin(int x)
double sin(double x)
double sin(long x)

Returns the sine of x, where x is given in radians.

Back to Top ^

sinh()

Function syntax:

double sinh(int x)
double sinh(double x)
double sinh(long x)

Returns the hyperbolic sine of its argument.

Back to Top ^

sqrt()

Function syntax:

double sqrt(int x)
double sqrt(double x)
double sqrt(long x)

Returns the nonnegative square root of x.

Back to Top ^

tan()

Function syntax:

double tan(int x)
double tan(double x)
double tan(long x)

Returns the tangent of x, where x is given in radians.

Back to Top ^double

tanh()

Function syntax:

double tanh(int x)
double tanh(double x)
double tanh(long x)

Returns the hyperbolic tangent of a double value.

Back to Top ^

to_degrees()

Function syntax:

double to_degrees(double x)

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

Back to Top ^

to_radians()

Function syntax:

double to_radians(double x)

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

Back to Top ^

unsignedrshift()

int  unsignedrshift(int x, int y)
long unsignedrshift(long x, long y)

Returns the result of a bitwise right shift of the bits in x by the number of bit positions y, filling the leftmost bit position with 0. This is the equivalent of the Java >>> operator. Thus, rshift(-1,1) returns -1 (1111 1111 1111 1111), but unsignedrshift(-1,1) returns 2147483647 (0111 1111 1111 1111).

Back to Top ^

xorall()

bool  xorall(bool a, bool b, ...)

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical XOR (exclusive OR) operation on all arguments.

If any argument is null, the function returns null. Otherwise, it returns true if an odd number of arguments are true and returns false if an even number of arguments are true. Conceptually, xorall evaluates successive arguments pairwise to produce its result.

xorall(true, true) returns false.

xorall(true, false) returns true.

xorall(false, false) returns false.

xorall(true, null) returns null.

xorall(true, true, true) returns true, but

xorall(true, true, true, true) returns false.

The xorall function follows the logic expressed in this statement:

   if all arguments are false, return false
   else if an odd number of arguments are true, return true
   else return false

See also the andall() and orall() simple functions and the aggregate versions of andall() and orall().

Back to Top ^

Simple Functions: NaN

This category includes the following functions:

The functions in this group detect NaN (not a number) values.

isnan()

Function syntax:

bool isnan(double value)
bool isnan(int value)
bool isnan(long value)
bool isnan(timestamp value)

Returns true if the argument is NaN (not a number), or returns null if the argument is null. Returns false otherwise.

Back to Top ^

nan()

Function syntax:

double nan()

This function returns a NaN literal as type double.

Back to Top ^

notnan()

Function syntax:

bool notnan(double value)
bool notnan(int value)
bool notnan(long value)
bool notnan(timestamp value)

Returns false if the argument is NaN (not a number), or returns null if the argument is null. Returns true otherwise.

Notice that notnan() returns the opposite of isnan(), except if the argument is null, in which case they both return null.

Back to Top ^

Simple Functions: Runtime

This category includes the following functions:

getAllHostIPs()

Function syntax:

list(string) getAllHostIPs()

Returns as a list(string) all the non-loopback IP addresses assigned to the machine on which StreamBase Server is running.

Back to Top ^

getClientIP()

Function syntax:

string getClientIP(string connectionID)

Returns as a string the IP address associated with the specified connectionID. Obtain the connectionID string from the first field returned by an sbadmin listConnections command, or from the connection stream in the system container.

Use this function to determine by IP address the actual computer connecting to, or recently disconnected from, a running instance of StreamBase Server.

Back to Top ^

getCompilerProperty()

Function syntax:

string getCompilerProperty(string sysprop)

Returns the value of the specified system property that StreamBase Server is actually using.

By contrast, the function getSetCompilerProperty() answers the question of whether this value was explicitly set, and systemproperty() only returns values explicitly set in a configuration file.

Compare the output of the following, using a Map operator's Additional Expressions grid:

systemproperty("streambase.querytable.default-memory-tables")
null
getSetCompilerProperty("streambase.querytable.default-memory-tables")
UNSET
getCompilerProperty("streambase.querytable.default-memory-tables")
HEAP

These results tell you that the streambase.querytable.default-memory-tables system property was not explicitly set and is therefore using the compiled-in default value, which is HEAP.

Back to Top ^

get_conf_param()

Function syntax:

string  get_conf_param(default-value, path1, path2, path3, ...)
boolean get_boolean_conf_param(default-value, path1, path2, path3, ...)
int     get_int_conf_param(default-value, path1, path2, path3, ...)
long    get_long_conf_param(default-value, path1, path2, path3, ...)
double  get_double_conf_param(default-value, path1, path2, path3, ...)

These functions were provided in StreamBase 7 to return the value portion of a name-value pair specified with a <param> element in a project's currently active sbconf file.

These functions remain valid in StreamBase 10 to aid customers migrating a StreamBase 7 project to 10. In StreamBase 10, these functions always return the specified default-value.

These functions are deprecated as of January 2020 for all Spotfire Streaming primary or maintenance releases, and are expected to be removed in a future release.

Back to Top ^

getContainer()

Function syntax:

string getContainer()

Returns as a string the name of the StreamBase container in which the current application is loaded, and can be used in expressions like this example:

if getContainer() == "default" ...

Back to Top ^

getHostName()

Function syntax:

string getHostName()

Returns as a string the machine name of the host on which StreamBase Server is running, and can be used in expressions like this example:

if getHostName() == "fasthost" ...

Back to Top ^

getNodeName()

Function syntax:

string getNodeName()

Returns as a string the fully qualified name of the StreamBase Runtime node in which the current fragment is loaded.

Back to Top ^

getParallelRoot()

Function syntax:

string getParallelRoot()

Returns the qualified StreamBase path name of the top-level container or Module Reference that is the root of the current parallel region. See Parallel Region Defined.

If this function is called in a simple module without concurrency settings, it returns the path name of the StreamBase container that holds the operator that calls this function. If this function is called in a module whose Module Reference has a concurrency setting of Run in a parallel region, it returns the qualified path to that Module Reference. If the Module Reference also has a Multiplicity setting of multiple, and the number of instances is set to 2 or more, the returned path is appended with a colon and a zero-based integer representing the instance number of the calling module.

Thus:

  • When called in a simple module in the default container, this function returns default.

  • When called in a module, B, that has a Module Reference named B_ref in the top-level module, A, this function returns default.B_ref if the Module Reference is marked with the Run in a parallel region concurrency option.

  • In addition to the parallel region setting, if B_ref is also configured with a Multiplicity setting of 2, this function returns either default.B_ref:0 or default.B_ref:1.

See StreamBase Path Notation for more on the qualified StreamBase path names that can be returned.

Back to Top ^

getPath()

Function syntax:

string getPath(string componentname)

Returns the qualified StreamBase path of the specified component of an EventFlow module, and can be used for verifying the StreamBase container name for the specified component at run time, in expressions like this example:

if getPath("Symbol") == "default.Symbol" ...

See StreamBase Path Notation for details on the qualified path names that can be returned.

Back to Top ^

getResourceAbsolutePath()

Function syntax:

string getResourceAbsolutePath(string resourceName)

Returns the absolute file system path to the specified resourceName.

Back to Top ^

getServerURI()

Function syntax:

string getServerURI()

Returns the StreamBase URI of the currently running StreamBase Server. The value returned is from the point of view of the server, and thus always returns the hostname localhost. Use this function to obtain the port number and any authentication parameters appended to the URI.

Use this function instead of systemproperty("streambase.uri") to obtain the same information.

Back to Top ^

getSetCompilerProperty()

Function syntax:

string getSetCompilerProperty(string sysprop)

Returns the value of the specified system property if you explicitly set it, or returns UNSET otherwise. The UNSET result means the Server is using the default value for the specified property.

By contrast, getCompilerProperty() returns the actual value of the specified property, and systemproperty() only returns values explicitly set with the systemProperties object in a HOCON configuration file whose type is com.tibco.ep.streambase.configuration.sbengine.

See comparison examples in getCompilerProperty().

Back to Top ^

getTableSize()

Function syntax:

int getTableSize(string QTname)

Returns the number of rows in the specified Query Table. When used in a Query operator that is in the process of adding or deleting table rows, the result is only approximate.

Back to Top ^

Simple Functions: Statistical Calculations

This category includes the following functions:

correlation_coefficient()

Function syntax:

list(double) correlation_coefficient(string name, int index, 
                                     list(double) var1, list(double) var2 ...)

Returns a list of correlation coefficients, containing as many members as there are list arguments (there must be at least two). The output list of doubles is in the same order as the list arguments. Given n variables, the index argument specifies which of those n lists holds the primary variable (for which the returned correlation coefficient is 1.0).

Note

  • The correlation_coefficient of 0 items is null.

  • The correlation_coefficient of 1 item is 1.0.

  • The correlation_coefficient of N identical items is 1.0.

The name argument specifies the type of correlation coefficient to compute, taken from this list:

Click any name above to see the description of the statistic. You can abbreviate and approximate the spellings you supply for the name. You can call any of these functions directly by its full name, as well as using the above syntax. However, when you call these functions directly, only two variables can be correlated per call, and the output is a scalar double.

If any input list contains null values, the function returns null for its correlation coefficient(s).

Back to Top ^

goodman_kruskal_gamma()

Function syntax:

double goodman_kruskal_gamma(list(double) var1, list(double) var2)

Returns the Goodman-Kruskal Gamma correlation coefficient between values of variables in two lists as a double value. The Goodman-Kruskal Gamma statistic is a rank correlation of concordant pairs. If any field values are null, the returned value is null.

To perform multivariate Goodman-Kruskal Gamma correlations, use the simple correlation_coefficient function.

Back to Top ^

kendall_tau()

Function syntax:

double kendall_tau(list(double) var1, list(double) var2)

Returns the normalized Kendall Tau distance metric between values of variables in two lists as a double value. The lists must have the same number of elements. The Kendall Tau distance statistic counts the number of pairwise disagreements between the ranks of corresponding items in two fields. Larger values indicate greater disagreement in rank ordering, such that a return value of 0 means complete agreement and 1.0 means complete disagreement.

To perform multivariate Kendall Tau correlations, use the simple correlation_coefficient function.

Back to Top ^

pearson()

Function syntax:

double pearson(list(double) var1, list(double) var2)

Returns the Pearson product-moment correlation coefficient between values of variables in two lists as a double value that indicates the degree of linear dependence between two variables. The Pearson statistic is defined as the covariance of the two variables divided by the product of their standard deviations. The term product-moment refers to the mean (the first moment about the origin) of the product of the mean-adjusted random variables. A return value of -1 indicates a perfect negative linear correlation. A value of 0 indicates no correlation (random variates). A value of 1 indicates perfect positive linear correlation. If any list elements are null, the returned value is null. If the variance of either input variable is zero, the function returns NaN.

To perform multivariate Pearson correlations, use the simple correlation_coefficient function.

Back to Top ^

spearmans_rank()

Function syntax:

double spearmans_rank(list(double) var1, list(double) var2)

Returns the Spearman's Rank correlation coefficient between values of variables in two lists as a double value. Spearman's statistic is defined as the Pearson correlation coefficient between the ranked variables. A return value of -1 indicates a perfect negative correlation (reverse ordering). A value of 0 indicates no concordance of ranks. A value of 1 indicates identical rank orderings. If any list values are null, the returned value is null.

To perform multivariate Spearman's Rank correlations, use the simple correlation_coefficient function.

Back to Top ^

Simple Functions: Strings

This category includes the following functions:

The behavior of functions that deal with strings changes when Unicode support is enabled for StreamBase Server as described in Unicode Support. These cases are noted for each function in this section.

contains()

Function syntax:

bool contains(string haystack, string needle)

Takes two arguments: a string and a search string. Returns true if needle is a substring of haystack, or false if not.

If either argument is null, returns null.

The IN infix operator is an alternate syntax for contains().

Also see the list version of contains().

Back to Top ^

endswith()

Function syntax:

bool endswith(String biggerString, String smallerString)

Takes two arguments: a string to search within and a (sub)string to search for. Returns true if biggerString ends with smallerString, or false if not.

If either argument is null, returns null.

Back to Top ^

format()

Function syntax:

string format(string format-string, arg0, arg1, ..., argn)

Returns a formatted string for arguments arg0 through argn, each formatted according to a specification in format-string.

Format specifications are patterns specified in the manner of class java.text.MessageFormat.

For example, an incoming stream containing NYSE stock ticks might have a field named symbol containing four-character stock symbols. Let's say a downstream operator expects symbols to be prefixed with a string identifying their source stock exchange. In this case, use an expression like the following in a Map operator:

format("NYSE:{0}", symbol)

An incoming stream might have a price field specified to four decimal places. When the flow of tuples reaches an output stream that must be formatted for human readability, you can use an expression like the following to round the price field to two decimal places. This example assumes that the price field contains values of type double, which matches the format specifier ###.##.

format("{0,number,###.##}", price)

The format for argument arg0 is specified with the portion of the format-string that defines {0}. The format for arg1 is in the format-string portion that defines {1}, and so on. The data type of arg0 must match the type expected by the format-string specifier for {0}, and so on for each argument.

If you use format() to modify a tuple, use {0}, {1}, {2} and so on to specify the format for each field.

See the documentation for java.text.MessageFormat for further details and more examples.

Back to Top ^

indexof()

Function syntax:random

int indexof(string haystack, string needle [, int start])

Returns the index of the first instance of the string needle within the string haystack, for which the index is greater than or equal to start. For all supported Unicode character sets, this function returns the index in number of graphemes, not bytes. For string arguments, if needle is not found, the function returns a value of -1.

If the search is successful, the value returned is always greater than or equal to start. If start is unspecified, it is taken to be 0, which designates the first character of the string or the first element of the list.

See also the indexof() function that operates on lists.

Back to Top ^

isempty()

Function syntax:

bool isempty(T arg)

Takes one string or list argument, arg. Returns true if arg is null, or if it is a string and it is empty, or if it is a list and its size is zero; false in any other case.

Back to Top ^

lastindexof()

Function syntax:

int lastindexof(string haystack, string needle [, int lastStart])

Returns the index within string haystack of the rightmost instance of string needle, for which the index is less than or equal to lastStart. For all supported Unicode character sets, this function returns the index in number of graphemes, not bytes. For string arguments, if needle is not found, the function returns a value of -1.

If the search is successful, the value returned is always less than or equal to lastStart. If lastStart is unspecified, it is taken to be the index of the last grapheme of the string or the last element of the list.

See also the lastindexof() function that operates on lists.

Back to Top ^

length()

Function syntax:

int length(string x)

Returns the number of graphemes in the string, for ASCII and all supported Unicode character sets. If the argument is a null string, the returned value is null.

See also the length() functions that operate on lists and blobs.

Back to Top ^

lower()

Function syntax:

string lower(string str)

Returns its argument in lowercase letters.

Back to Top ^

ltrim()

Function syntax:

string ltrim(string str)
string ltrim(string s1, string s2)

The same as trim(), except that the first argument is trimmed on the left side only.

Back to Top ^

randomstring()

Function syntax:

string randomstring(int len)

Returns a string of size len containing random ASCII characters A-Z.

Back to Top ^

regexmatch()

Function syntax:

bool regexmatch(string regex, string str)

Attempts to match the specified regular expression in regex against the entire specified string in str. To match any substring of the specified string, use wildcards before and after your regular expression. For example, the regex IBM matches only when the entire string is "IBM", while the regex .*IBM.* matches if "IBM" is anywhere in the supplied string str.

Back to Top ^

regexmatch_ignorecase()

Function syntax:

bool regexmatch_ignorecase(string regex, string str)

Same as regexmatch(), but the supplied regex matches str without regard to case.

Back to Top ^

regexreplace()

Function syntax:

string regexreplace(string input_str, string regex_str, string replace_str [, int limit])

Accepts an input string, a regular expression pattern, a replacement string, and an optional integer, limit. Returns input_str modified by replacing occurrences of regex_str in input_str with replace_str up to limit times. If limit is negative or omitted, all occurrences in input_str are replaced.

See Oracle's Pattern class documentation for the supported regular expression syntax. The replace_str argument can contain references to fields in a regular expression capture group, in the same way as java.util.regex.Matcher.appendReplacement().

Usage note: the similar replace() command operates on string literals, and substitutes a string literal for a literal search string. The regexreplace() command operates on string literals, but substitutes its string literal for a regular expression search, and has support for capture groups. It would appear at first glance that regexreplace() without the optional limit argument is the same as replace(), but the difference in the second argument means the two commands have different semantics. For example, consider the following commands and their results:

replace("this.string", ".", ",") returns this,string

regexreplace("this.string", ".", ",") returns ,,,,,,,,,,,

regexreplace("this.string", "\\.", ",") returns this,string

To specify a capture group, surround an expression with parentheses (). This saves the section of the input string matching the capturing group for later recall via a backreference that you specify in replace_str.

A backreference is specified in the regular expression as a dollar sign ($) followed by a digit indicating the number of the group to be recalled, with a replacement string. For example:

regexreplace('a-b','([a-zA-Z])-([a-zA-Z])','$1_$2') returns a_b

Back to Top ^

regexsplit()

Function syntax:

list(string) regexsplit(string input_str, string regex_delimiter [, int limit])

Use the regexsplit() function in the same ways as the split() function, except that you can specify a regular expression as the field delimiter. See Oracle's Pattern class documentation for the supported regular expression syntax.

The following example parses input_str into list elements using either colon, semicolon, or a space as the field delimiter:

regexsplit("aaa:bbb;ccc:ddd;eee fff", "[:; ]")

returns

list(string) [aaa, bbb, ccc, ddd, eee, fff]

See split() for further details and for instructions on using the optional limit argument.

Back to Top ^

regextuple()

Function syntax:

tuple regextuple(tuple tuplespec, string regex , sring input)

Use the regextuple() function to parse a string into a tuple.

For the first argument, provide a tuple specification in the form of an empty tuple. To do so, use type conversion functions in a tuple() function to define the data type of each field, and at the same time to create a null field of each type. The first argument of the example below creates a tuple with the required schema, and with each field filled with null: ((symbol string, price double)) null,null.

The second argument is a Java regular expression that uses named capturing groups, where the capture group names are the same as, and in the same order as, the tuple fields specified in the first argument.

The third argument is the string you want to parse into tuple format, possibly extracted from CSV or other plain text input upstream.

See Oracle's Pattern class documentation for the supported regular expression syntax.

In the following example, notice that you must escape the predefned character class strings with double backslashes:

regextuple(tuple(string() as symbol, double() as price), "(?<symbol>[A-Z]+)\\s+(?<price>\\d+(\\.\\d+)?)", "STP 42.00")

This returns:

STP,42.0

Back to Top ^

replace()

Function syntax:

string replace(string main_str, string search_str, string replace_str)

Returns main_str after locating all instances of search_str and replacing them with replace_str. The search_str and replace_str arguments are read as case-sensitive, literal strings. Any wildcard character is read as the literal character, not a wildcard.

Examples:

For a string field named UStext in the incoming stream:  replace(UStext, "honor", "honour")

replace('Open the pod bay doors, IBM.', 'IBM' , 'HAL')

replace('1,2,3,4,5', ',' , '-')

See also the regexreplace() command and the usage note there comparing with replace() syntax.

Back to Top ^

rtrim()

Function syntax:

string rtrim(string str)
string rtrim(string s1, string s2)

The same as trim(), except that the first argument is trimmed on the right side only.

Back to Top ^

split()

Function syntax:

list(string) split(string input_str, string delimiter [, int limit])

Returns a list of strings parsed from input_str, splitting the input string into pieces delimited by the character or phrase specified as delimiter. The delimiter character or phrase itself is not preserved in the resulting list when a match is found. For example:

split("aaa:bbb:ccc:ddd:eee", ":")

returns

list(string) [aaa, bbb, ccc, ddd, eee]

If the delimiter character or phrase is not found in input_str, then input_str is returned verbatim as the only element of the resulting list. If the delimiter character or phrase occurs adjacent to another instance of the delimiter, the result is an empty list element. For example, split("aaa::bbb", ":") results in the following three-element list: [aaa,  ,bbb].

You can specify an optional third argument, limit, an integer specifying the number of list elements in the result list. Use the limit argument in the following ways:

  • If limit is any negative number, input_str is split into as many list elements as necessary, with no limit on the length of the list. All empty list elements are preserved, including any trailing empty elements. For example, split("aaa::bbb::", ":", -1) results in a five-element list: [aaa,  ,bbb,  ,  ].

  • If limit is zero or unspecified, input_str is split into as many list elements as necessary, but any trailing empty elements are discarded. For example, split("aaa::bbb::", ":", 0) and split("aaa:bbb::", ":") both result in a three-element list: [aaa,  ,bbb].

  • If limit is positive, input_str is split into at most limit-1 elements, plus one more element containing the remainder of input_str. For example, split("aaa:bbb:ccc:ddd:eee", ":", 3) results in a three-element list: [aaa, bbb, ccc:ddd:eee].

    Use this feature to split a string into the components you want to process, followed by a single catch-all element containing the remainder.

See also the regexsplit() function, a variation of split() that allows you to use a regular expression to specify the delimiter.

Back to Top ^

startswith()

Function syntax:

bool startswith(String biggerString, String smallerString)

Takes two arguments: a string to search within and a (sub)string to search for. Returns true if biggerString begins with smallerString, or false if not.

If either argument is null, returns null.

Back to Top ^

string()

Converts its argument into a string. See string() in the Simple Functions: Type Conversions section.

strlen()

Function syntax:

int strlen(string str)

Returns the length of the string str. This function returns the number of graphemes, not bytes, for all supported Unicode character sets.

Back to Top ^

strresize()

Function syntax:

string strresize(string str, int length)

The strresize() function performs no action: it simply returns its str argument and ignores the length argument. The function is preserved for compatibility with previous StreamBase releases, in which the StreamBase string data type had a fixed maximum size. To perform string truncation, use substr() or strresizetrunc().

Back to Top ^

strresizetrunc()

Function syntax:

string strresizetrunc(string str, int length)

Returns its argument str, truncated to the number of bytes specified in length.

Note

When using strresizetrunc() with Unicode multibyte character sets, count the length argument in bytes, not graphemes.

Back to Top ^

substr()

Function syntax:

string substr(string str, int beginchar, [int length])

Returns a substring of the string str, starting at the zero-indexed character beginchar with optional size of length characters.

For all supported Unicode character sets, specify length in number of graphemes, not bytes.

Back to Top ^

trim()

Function syntax:

string trim(string str)
string trim(string s1, string s2)

With one argument, returns the specified string with leading and trailing spaces removed.

With two arguments, the second argument contains the characters to be removed from the string specified in the first argument.

Back to Top ^

upper()

Function syntax:

string upper(string str)

Returns its argument in uppercase letters.

Back to Top ^

Simple Functions: System

This category includes the following functions:

advanceTimeBy

Function syntax:

bool advanceTimeBy(long millis)

Use this function to increment the system time of an instance of StreamBase server running in a controlled time environment by the specified number of milliseconds.

This function only works if you have configured the current project's sbd.sbconf file for testing purposes with elements like the following. See Using the EventFlow Fragment Unit TimeService for further information.

<time-service-configurations>
  <type>CONTROLLABLE</type>
</time-service-configurations>

Back to Top ^

advanceTimeTo

Function syntax:

bool advanceTimeTo(timestamp timevalue)

Use this function to increment the system time of an instance of sbd running in a controlled time environment to the specified timestamp's date and time.

This function only works if you have configured the current project's sbd.sbconf file for testing purposes with elements like the following. See Using the EventFlow Fragment Unit TimeService for further information.

<time-service-configurations>
  <type>CONTROLLABLE</type>
</time-service-configurations>

Back to Top ^

getTargetTime

Function syntax:

timestamp getTargetTime()

Use this function to retrieve a timestamp representing the time StreamBase Server thinks it is right now. This function is useful in a controlled time environment to retrieve a custom starting point time you have set with elements like the following in the project's sbd.sbconf file. Notice that the getTargetTime() timestamp is the same as that returned by now(), because in a controlled time environment, the server's sense of what current time means has been deliberately changed, and is temporarily not the same as wall clock time.

<time-service-configurations>
  <type>CONTROLLABLE</type>
  <target-time>2015-10-06 12:00</target-time>
</time-service-configurations>

Back to Top ^

nanotime()

Function syntax:

long nanotime()

Returns the current value of the system timer as a long value in nanoseconds. This function provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes). Therefore, use this function only to measure elapsed time between two invocations, like the following example:

long startcount = nanotime()
// process tuple data
long elapsedtime = nanotime() - startcount

Differences in two calls to nanotime() are not accurate for a span greater than 263 (about 292 years). Do not compare nanotime() values between different JVM instances or even across JVM restart boundaries on the same machine.

Back to Top ^

sleep()

Function syntax:

timestamp sleep(timestamp t)

Pauses execution of the current thread for the specified time period. The argument must be an interval timestamp, and the return value is the argument. This function can be useful in conjunction with operators that run in a separate thread, and is provided for debugging or for rare uses. The sleep() function blocks the concurrent portion of an application, if concurrency is enabled, or blocks the StreamBase container holding the application, if concurrency is not enabled. Therefore, use sleep() with care in production applications. To learn about threading in StreamBase applications, refer to Execution Order and Concurrency.

Back to Top ^

systemenv()

Function syntax:

string systemenv(string key)

Calls the standard Java method java.lang.System.getenv(). Returns the string value of the environment variable specified by the key argument.

Back to Top ^

systemproperty()

Function syntax:

string systemproperty(string key [, string defaultstring])

Calls the standard java.lang.System.getProperty() method. Returns the string value of the system property specified by the key argument. Use the standard set of system property keys described in Java documentation for java.lang.System.getProperties() (for example, os.name, java.home, and user.name), or specify a StreamBase-specific Java property. Remember that this function returns values only for the Java engine in the node that contains the EventFlow fragment and EventFlow module that invokes this function.

If the optional defaultstring argument is used, and there is no system property for the specified key, the function returns the string specified in the defaultstring argument. If no default string is provided, null is returned.

Compare with getCompilerProperty() and getSetCompilerProperty().

Back to Top ^

version()

Function syntax:

string version()

Returns as a string the three-position release number from the currently running release of Spotfire StreamBase, such as 10.2.0.

Back to Top ^

Simple Functions: Timestamp Overview

The timestamp data type can hold either an absolute timestamp or an interval timestamp.

An absolute timestamp represents a date and time. Its value is the number of seconds between the epoch and that date and time, with a maximum precision of milliseconds. The epoch is defined as 00:00:00 of January 1, 1970 UTC.

An interval timestamp represents a duration. Its value is the number of seconds in the interval, with a maximum precision of milliseconds.

Absolute timestamps are expressed in the time format patterns of the java.text.SimpleDateFormat class and those shown in the description of the format_time() function below.

See timestamp Data Type for information on the range of timestamp values, on adding and subtracting timestamp values, and on comparing timestamps with relational operators.

Timestamp functions are organized into the following categories, described in the next three sections:

Timestamp Functions: Absolute Timestamps

This category includes the following functions, which return an absolute timestamp or a string converted from an absolute timestamp:

See also the nanotime() function in the System category.

Usage Note

StreamBase offers two similar functions to convert a StreamBase timestamp to a string: format_time() and strftime(). These functions use different libraries to provide support for their time format strings. Both functions are implemented in Java but are based on Java and C++ formatting standards respectively.

Be aware that these libraries make opposite use of the z and Z format strings, as shown in this table:

function Meaning of lowercase z or %z Meaning of uppercase Z or %Z
format_time() Time zone string (EST, PST8PDT, America/New_York) Time zone signed number (+0100, -0500, -0800)
strftime() Time zone signed number (+0100, -0500, -0800) Time zone string (EST, PST8PDT, America/New_York)

StreamBase also provides two similar functions for parsing a time format string in any format and converting it to a StreamBase absolute timestamp: parse_time() and strptime(). These are again both implemented in Java, but based on the Java and C++ formatting standards, respectively.

The parse_time() function interprets the z and Z format strings equally. The input timestring can be in either text, abbreviation, or signed number format, and either case of Z interprets them as time zone specifiers.

The strptime() function interprets the %z and %Z format strings as separate identifiers. When using strptime(), specify lowercase z in the formatstring if the incoming timestring expresses the time zone as a signed number, such as +0100, -0500, or -0800. Specify uppercase Z in the formatstring in the incoming timestring expresses the time zone as text or a standard abbreviation: EST, PST8PDT, or America/New_York. For more information on specifying time zone IDs, see Specifying Time Zones.

date()

Function syntax:

timestamp date(timestamp ts)

Accepts a timestamp ts and returns an absolute timestamp for the year, month, and day portion of ts, with all time values set to zero. This, in effect, extracts and returns the timestamp for midnight local time (00:00:00) for the date in the argument ts.

For example, the function now(), run on 12 Feb 2009 in the EST time zone, returns a full timestamp like this one:

2019-05-22 17:32:55.598-0400

But the same function used as an argument for date():

date(now())

returns the following. Notice that all time information was changed to zero to designate midnight:

2019-05-22 00:00:00.000-0400

Back to Top ^  Back to Timestamp Overview ^

epoch()

Function syntax:

timestamp epoch()

Returns an absolute timestamp representing the zero point of counted time. For both Windows and UNIX, the epoch is 00:00:00 of January 1st, 1970. The value returned is adjusted for the current time zone. To see the absolute epoch, use:

epoch() - timezoneoffset()

Because the epoch occurred during standard time in the northern hemisphere, you may need to add a one-hour offset if you are retrieving the absolute epoch during daylight savings time:

epoch() - timezoneoffset() + hours(1)

Incoming tuples from a market data provider or an exchange might include a timestamp field in the form of a long value (or integer value) representing the number of milliseconds (or number of seconds) since the epoch. Use epoch() to convert such field values to an absolute timestamp.

Many functions, including java.util.Date.getTime(), return a long value representing the number of milliseconds since the epoch. The timestamp value in market data from many exchanges use this format. To convert such long values, TSms, to an absolute timestamp, use an expression like the following:

milliseconds(TSms) + epoch()

Functions that return UNIX time, including the UNIX date +%s command, generally return an integer number of seconds. To convert such integer values, TSsec, to an absolute timestamp, use an expression like the following:

seconds(TSsec) + epoch()

Use seconds() as above, for double values that include a fractional milliseconds component, such as the value returned from the Python time.time() function.

Use to_seconds() or to_milliseconds() to take a formatted timestamp string and return the number of seconds or milliseconds since the epoch, respectively.

Back to Top ^  Back to Timestamp Overview ^

format_time()

Function syntax:

string format_time(timestamp ts, string formatstring)
string format_time(timestamp ts, string formatstring, string tz)

Converts the specified timestamp field, ts, to a string, using the specified formatstring pattern. The timestamp field can then be manipulated as a string instead of a timestamp.

Optional string argument tz is a time zone ID for specifying a string that specifies the location of a time zone that the formatted time should indicate. Location strings can follow canonical format as listed on this Wikipedia page, such as "Europe/Zurich", "America/New_York", "America/Los_Angeles", "America/Santiago", or "Etc/UTC"). You can also use a deprecated but more familiar string as listed on that page, such as CET, EST, EST5EDT, PST, PST8PDT, CLT, or GMT.

When used, time zone IDs must be specified exactly and are case-sensitive. When given an invalid timezone string, the function sets the time zone to the default, "Greenwich Mean Time". For more information on specifying time zone IDs, see Specifying Time Zones.

This function uses the time format patterns from the SimpleDateFormat class and summarized below.

For example, the function now(), run on 22 May 2019 in the EST time zone, returns a timestamp like this one:

2019-05-22 17:39:36.459-0400

The same function used as an argument for format_time():

format_time(now(), "EEEE, MMM d, yyyy HH:mm zzzz")

returns the following:

"Wednesday, May 22, 2019 17:40 Eastern Daylight Time"

If you supply a third argument, you can shift the time zone. For example, to obtain Greenwich Mean Time:

format_time(now(), "EEEE, MMM d, yyyy HH:mm zzzz","Etc/UTC")

returns:

"Wednesday, May 22, 2019 17:40 Coordinated Universal Time"

You can use any combination of time format designators from the SimpleDateFormat class. For example:

format_time(now(), "yyyy-MM-dd G, HH:mm a")

returns:

"2019-05-22 AD, 17:43 PM"

This parse_time() function performs the same task as strftime(), but uses a different historical source for its format pattern strings. See the Usage Note about the different ways this function and strftime() interpret the z and Z formatstring specifiers.

The following table shows the format strings supported by format_time(). See the SimpleDateFormat documentation for further details.

formatstring Component Meaning Examples
a AM or PM AM
d Day in the month 11, 31
D Day in the year 234
E Day of the week. Use EEEE for the full day spelling, EEE or fewer for the abbreviation. Friday, Fri
F Number of the day of the week. 2 for Tuesday
G The era. AD, BC
h The one-based hour on a 12-hour clock, 1 to 12. 12
H The zero-based hour on a 24-hour clock, 0 to 23. 23
k The one-based hour on a 24-hour clock, 1 to 24. 24
K The zero-based hour on a 12-hour clock, 0 to 11. 11
m The minute in the hour. 25
M The month of the year. Use MM for a two-digit month, MMM for an abbreviated month name, MMMM for the full month name. 11, Nov, November
s The second in the minute. 23
S The millisecond fraction of the second 456
w The number of the week in the year. 27
W The number of the week in the month. 3
y The year. Use yy for a two-digit year, interpreted as belonging to the century 80 years prior to or 20 years ahead of the use of this function. Use more than two y's to interpret the year literally. In 2019, yy = 19, yyyy = 2019.
z The time zone shown as text. Use three or fewer z's for the time zone abbreviation, four z's for the full time zone name. EST, Eastern Standard Time, EDT, Eastern Daylight Time
Z The time zone offset from UTC, shown as a positive or negative number as in RFC 822. -0500

Back to Top ^  Back to Timestamp Overview ^

from_gmtime()

Function syntax:

timestamp from_gmtime(int year, int month,
    int dayOfMonth, int hour, int minute, double seconds)

Creates an absolute timestamp in the UTC (GMT) time zone from the specified integer components of a date and time. All six arguments are required. You can express the seconds argument as a double with precision down to milliseconds, or you can use an integer, which is promoted automatically to a double by the rules of StreamBase Data Type Coercion and Conversion.

You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.

Notice that the returned timestamp shows a moment in the UTC time zone, but that moment is translated and displayed in the local time zone. For example, compare the same arguments given to from_gmtime() and from_localtime() when run in the EST time zone:

from_gmtime(2019, 3, 22, 18, 30, 05)

returns:

2019-03-22 14:30:05.000-0400

while

from_localtime(2019, 3, 22, 18, 30, 05)

returns:

2019-03-22 18:30:05.000-0400

Back to Top ^  Back to Timestamp Overview ^

from_localtime()

Function syntax:

timestamp from_localtime(int year, int month,
    int dayOfMonth, int hour, int minute, double second)

Creates an absolute timestamp in the local time zone from the specified integer components of a date and time. All six arguments are required. You can express the seconds argument as a double with precision down to milliseconds, or you can use an integer, which is promoted automatically to a double by the rules of StreamBase Data Type Coercion and Conversion.

You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.

For example:

from_localtime(2019, 3, 22, 18, 30, 05)

returns:

2019-03-17 18:30:05.000-0400

Back to Top ^  Back to Timestamp Overview ^

from_unixtime()

Function syntax:

timestamp from_unixtime(double u_time)

Creates an absolute timestamp in UTC from a Unix timestamp ("POSIX time"). Unix time is the number of seconds that have elapsed in the Unix epoch (since 00:00:00 UTC, Thursday, 1 January 1970), not counting leap seconds. Consequently, the output is encoded only to four decimal digits.

For example:

from_unixtime(0.0)

returns:

1969-12-31 19:00:00.000-0500

To get closer to exact Unix time, subtract your time zone offset and DST offset, if one is in effect. Also add an offset for the number of hours from your current moment to midnight:

from_unixtime(0)-timezoneoffset() - hours(1) + hours(2)

returns:

1970-01-01 00:00:00.000-0500

Provide an exact number of seconds, perhaps one returned from the to_milliseconds() function, to return a recent time:

from_unixtime(1.5585e9)

returns:

2019-05-22 00:40:00.000-0400

Back to Top ^  Back to Timestamp Overview ^

now()

Function syntax:

timestamp now()

Returns an absolute timestamp value representing the current time in the local time zone.

You can specify an alternate now() implementation in the StreamBase Server configuration file. The value is set as a child element of the <runtime> element:

<param name="now-implementation" value="system" />

Specify one of these values, as appropriate for your StreamBase application:

Value Meaning
system Directs the now() function to use Java's System.currentTimeMillis(). This is the default.
thread Directs the now() function to use a background thread that checks the time approximately every millisecond. This option results in decreased accuracy, but may be more efficient than system if you call now() more frequently than 1000 times per second.

Back to Top ^  Back to Timestamp Overview ^

parse_time()

Function syntax:

timestamp parse_time(string timestring, string formatstring)
timestamp parse_time(string timestring, string formatString, 
   string language, string region, string variant)
timestamp parse_time(string timestring, string formatstring, bool lenient)
timestamp parse_time(string timestring, string formatstring, 
   string language, string region, string variant, bool lenient)

Parses a string of time and date information and returns an absolute timestamp value. The timestring argument can be any string representation of date and time, as long as its components are individually parsable.

You must specify a formatstring argument to specify how the timestring argument is to be interpreted. The format string must specify a time format pattern as defined in the SimpleDateFormat class. See the format_time() function for a summary of the time format pattern strings. For information on specifying time zone IDs, see Specifying Time Zones.

The arguments added in the second syntax provide information to customize the timestamp for a locale, and may be necessary if timestrings include localized abbreviations. The Java class Locale specifies these arguments as:

  • language — lowercase two-letter ISO-639 code

  • country — uppercase two-letter ISO-3166 code

  • variant — vendor and browser specific code, such as WIN for Windows, MAC for macOS, and POSIX for POSIX

When using the second syntax, set any unused argument to an empty string.

For the third and fourth syntax, setting the lenient argument to false will strictly enforce that timestring will match the formatstring. For more information on lenient parsing, refer https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/text/DateFormat.html#setLenient(boolean).

The functions without the lenient argument will work as if the lenient argument is set to true.

Use this function to read a time and date string in any format in an incoming stream, and still interpret that string as an absolute timestamp. For example, an incoming field, TradeTime, might contain the time and date in the following string format:

TradeTime = "11:17 Fri Feb 13, 19"

In this case, interpret the field TradeTime as a StreamBase timestamp with an expression like the following:

parse_time(TradeTime, "HH:mm EE MMM dd, yy")

which returns:

2019-02-13 11:17:00.000-0500

The input timestring can be in either text, abbreviation, or signed number format. If the timestring argument includes a time zone specifier, this function interprets the zone and returns the timestamp converted to local time. The time zone specifier can be in standard text format, such as EST, EST5EDT, or PST8PDT, or signed number format, such as -0500, -0800, +0100. You can specify either z or Z to match the time zone specifier. They are interpreted equally.

This function interprets the z and Z format strings equally. The input timestring can be in either text, abbreviation, or signed number format, and both cases of Z interpret them as time zone specifiers.

For example, we can add Central European Time to the TradeTime field using its signed number designation:

TradeTime = "11:17 Fri Feb 13, 19 +0100"

In this case, add a z or Z to the formatstring argument:

parse_time(TradeTime, "HH:mm EE MMM dd, yy Z")

When run in the EST zone, this function returns the following. You can use the timezoneoffset() function to force the interpretation for a different time zone.

2019-02-13 05:17:00.000-0500

Notice that it is inconsistent for a time string to contain an hours field based on a 24 hour clock and at the same time to specify AM or PM in the same string. Thus, it is inconsistent to specify both HH for the hours field and a for AM/PM in the same format string. If your time string contains AM or PM, use lowercase hh to interpret the hours field. StreamBase does not flag an error if you specify HH and a in the same format string, but the hours interpretation will be 12 hours offset for afternoon hours.

This function, parse_time() performs the same task as strptime(), but uses a different historical source for its time format strings. Both functions are implemented in Java but are based on Java and C++ formatting standards respectively. See the Usage Note about the different ways this function and strptime() interpret the z and Z formatstring specifiers.

See also the timestamp() function.

Back to Top ^  Back to Timestamp Overview ^

strftime()

Function syntax:

string strftime(string formatstring, timestamp ts)
string strftime(string formatstring, timestamp ts, string tz)

Converts the specified timestamp field, ts, to a string, formatted according to the specified formatstring pattern, for the local time zone. The timestamp field can then be displayed and manipulated as a string instead of a timestamp.

Optional string argument tz is a time zone ID for specifying a standard name (for example "Etc/UTC") or location (for example, "Europe/Zurich") of a time zone that the formatted time should indicate. When used, time zone IDs must be specified exactly and are case-sensitive. For information on specifying time zone IDs, see Specifying Time Zones.

This function uses time format patterns described in the table below, which are based on the strftime library, part of the Open Group's Single UNIX Specification.

As an example of using strftime(), consider that the function now(), when run on 22 May 2019 in the EDT time zone, returns a timestamp like this one:

2019-05-22 18:27:31.375-0400

Here is the same function used as an argument for strftime():

strftime("Traded at %H:%M and %S.%f seconds on %b %d, %Y", now())

which returns the following:

"Traded at 18:27 and 31.375 seconds on May 22, 2019"

If you wanted the return value to be expressed in UTC time, add its time zone ID as a third argument:

strftime("Traded at %H:%M and %S.%f seconds on %b %d, %Y", now(),"Etc/UTC")

which changes the result to:

"Traded at 22:27 and 31.375 seconds on May 22, 2019"

This function performs the same task as format_time(), but uses a different library source for its format pattern strings. The strftime() function is faster than format_time(), and should be used when performance is critical in your application. The format pattern strings used by format_time() are easier to use, but note that format_time() does not recognize time zone ID arguments. See the Usage Note about the different ways this function and format_time() interpret the z and Z formatstring specifiers.

The following table shows the format strings supported by strftime(). See the strftime library documentation for further details. An asterisk in the formatstring column denotes format strings added by StreamBase.

formatstring Component Meaning
%a The locale's abbreviated weekday name: Mon, Wed.
%A The locale's full weekday name: Monday, Wednesday.
%b The locale's abbreviated month name: Jan, Nov.
%B The locale's full month name: January, November.
%c The locale's default time and date string.
%C The year divided by 100 and truncated to an integer: 99, 00, 02, 11
%d The two-digit day of the month, from 01 to 31.
%D The month, day, and year with slash separator. Equivalent to specifying %m/%d/%y.
%e The one- or two-digit day of the month, from 1 to 31. Like %d, except single digit days are preceded by a space.
%f    * The milliseconds portion of timestamp.
%F The month, day, and year with hyphen separator. Equivalent to specifying %m-%d-%y.
%g The last two digits of the year.
%G The four-digit year.
%h Same as %b.
%H The two-digit hour on a 24-hour clock, 00 to 23.
%I The two-digit hour on a 12-hour clock, 01 through 12.
%j The three-digit day of the year, 1 through 366.
%k The one or two digit hour on a 24-hour clock, with leading space if < 10.
%l The one or two digit hour on a 12-hour clock, with leading space if < 10.
%m The two-digit number of the month, 01 to 12.
%M The two-digit minute, from 00 to 59.
%n Replaced with a new line character.
%p The locale's equivalent of AM or PM.
%r The time in AM or PM notation. For the POSIX locale, the same as %I:%M:%S %p
%R (Not supported by the StreamBase strftime() function.)
%s    * The time value of timestamp ts in milliseconds divided by 1000.
%S The two-digit second, from 00 to 60. (60 is available in case of leap seconds)
%t Replaced with a tab character.
%T The time, equivalent to %H:%M:%S
%u The numeric day of the week, 1 through 7, with Monday = 1.
%U The two-digit number of the week of the year, from 00 through 53. The first January Sunday begins week 1; days in the new year before that Sunday are in week 0.
%V The two-digit number of the week of the year, from 01 through 53, with Monday as the first day. If the first week has four or more January days, it is week 1; otherwise it is the last week of the previous year. That is, January 4th and the first Thursday of January are always in week 1.
%w The numeric day of the week, 0 through 6, with Sunday = 0.
%W The two-digit number of the week of the year, from 00 through 53. The first January Monday begins week 1; days in the new year before that Monday are in week 0.
%x The date, using the locale's default date format.
%X The time, using the locale's default time format.
%y The year in two digits, from 00 through 99.
%Y The year in four digits.
%z The time zone offset from UTC in ISO 8601 standard format, such as +hhmm or -hhmm, or empty if no time zone can be determined.
%Z The time zone name or abbreviation, or empty if no zone information is in the timestamp being converted.
%% Replaced by a literal %.
%+    * Returns the current time formatted as if you had specified "%a %b %d %H:%M:%S %Z %Y". For example, strftime("%+", now()) returns a string like the following:
Thu Feb 18 15:42:35 EST 2010
* These format patterns are StreamBase extensions that are either not found in the Open Group's strftime() function, or are used differently.

Some format strings can be modified with E or O between the percent sign and the format character. See the strftime library documentation for further details.

Back to Top ^  Back to Timestamp Overview ^

strptime()

Function syntax:

timestamp strptime(string timestring, string formatstring)

Parses a string of time and date information and returns an absolute timestamp value. The timestring argument can be any string representation of date and time, as long as its components are individually parsable.

You must also specify a formatstring argument to specify how the timestring argument is to be interpreted. This function is implemented in Java, but for backwards compatibility, honors the syntax of a time format pattern using a subset of the patterns defined for the strptime library (part of the Open Group's Single UNIX Specification), as listed in the table below.

This function interprets the %z and %Z format strings as separate identifiers. Specify lowercase z in the formatstring if the incoming timestring expresses the time zone as a signed number, such as +0100, -0500, or -0800. Specify uppercase Z in the formatstring in the incoming timestring expresses the time zone as text or a standard abbreviation: Europe/London, EST5EDT, PST8PDT, or America/New_York.

Use this function to read a time and date string in any format in an incoming stream, and still interpret that string as an absolute timestamp. For example, an incoming field, TradeTime, might contain the time and date in the following string format:

TradeTime = "15-51-26.984 May 5, 19"

In this case, interpret the field TradeTime as a StreamBase timestamp with an expression like the following:

strptime(TradeTime, "%H-%M-%S.%f %b %d, %y")

which returns:

2019-05-05 15:51:26.984-0400

This function performs the same task as parse_time(), but uses a different library source for its format pattern strings. This strptime() function is faster than parse_time(), and should be used when performance is critical in your application. The format pattern strings used by parse_time() are easier to use. See the Usage Note about the different ways this function and parse_time() interpret the z and Z formatstring specifiers.

See also the timestamp() function, and strpinterval(), a function similar to strptime(), but for interval timestamps.

The following table shows the format strings supported by strptime(). See the strptime library documentation for further details.

formatstring Component Meaning
%b The month, using the current locale's month names. Abbreviated and full month names are recognized.
%B Same as %b.
%d The day of the month, from 1 to 31.
%D The month, day, and year with slash separator. Equivalent to specifying %m/%d/%y.
%f   * The milliseconds portion of the seconds field.
%H The hour, using a 12-hour clock, 0 to 11, or a 24-hour clock, 0 to 23. If the incoming string expresses hours on a 12-hour clock, be sure to include the %p format string. If an incoming string's hours value is from 12 to 23 and the format string includes %p, the %p is ignored.
%m The number of the month, 1 to 12.
%M The minute, from 0 to 59.
%p The strings AM or PM.
%S The second, from 0 to 59.
%x The date, using the locale's default date format.
%y The year in two digits. Values in the range 69 to 99, inclusive, are interpreted as 1969 through 1999. Values in the range 00 through 68, inclusive, are interpreted as 2000 through 2068.
%Y The year in four digits.
%z   * A time zone designator in the form +hhmm, -hhmm, +hh:mm, or -hh:mm. Examples: -0500, +1430, -08:00
%Z   * A standard time zone string or abbreviation, such as EST, GMT-05:00, America/Los_Angeles.
! Use an exclamation point in the format string to indicate that the remainder of the input string is optional.
For %d, %H, %m, %M, and %S, leading zeros are recognized but not required.
* These format patterns are StreamBase extensions not found in the Open Group's strptime() function, or are used differently.

Back to Top ^  Back to Timestamp Overview ^

today()

Function syntax:

timestamp today()

Returns an absolute timestamp value for the beginning of the current day, 00:00:00, in the local time zone. For example, when run on 22 May 2019 in the EST5EDT time zone.

2019-05-22 00:00:00.000-0400

Back to Top ^  Back to Timestamp Overview ^

today_utc()

Function syntax:

timestamp today_utc()

Returns an absolute timestamp value for the beginning of the current day, 00:00:00, in the UTC (GMT) time zone, translated for the local time zone.

Thus, when today() and today_utc() are run in sequence in the US Eastern time zone during DST, they return:

2019-05-22 00:00:00.000-0400
2019-05-21 20:00:00.000-0400

The expressions today() and today_utc() - timezoneoffset() return the same value:

2019-05-22 00:00:00.000-0400
2019-05-22 00:00:00.000-0400

On UNIX, today_utc() requires the TZ environment variable to be set in order to translate correctly to the local time zone. On Windows, the TZ environment variable is ignored if set.

Back to Top ^  Back to Timestamp Overview ^

Timestamp Functions: Interval Timestamps

This category includes the following functions, which return an interval timestamp:

days()

Function syntax:

timestamp days(int x)
timestamp days(double x)

Returns an interval timestamp representing an interval of x days.

Back to Top ^  Back to Timestamp Overview ^

hours()

Function syntax:

timestamp hours(int x)
timestamp hours(double x)

Returns an interval timestamp representing an interval of x hours.

Back to Top ^  Back to Timestamp Overview ^

interval()

Function syntax:

timestamp interval(string formatstring)

Parses the argument formatstring as a time format string and returns an interval timestamp. This function uses the time format patterns from strftime library, described in the Open Group's Single UNIX Specification. See the table of format patterns listed for the strftime() function.

The time format string is in the form %H:%M:%S!.!%f. The exclamation points show where the remainder of the string is optional. Thus, you can enter strings in any of the following formats, where hh = hours, mm = minutes, ss = seconds, and ff = a fractional second value in milliseconds:

  • hh:mm:ss

  • hh:mm:ss.

  • hh:mm:ss.ff

For example:

interval("2:00:01")

interval("2:00:01.")

both return (timestamp) 7201.0, the number of seconds in the specification of two hours plus one second.

interval("2:00:01.333")

returns (timestamp) 7201.333, the number of seconds in the specification of two hours plus one and a third seconds.

Back to Top ^  Back to Timestamp Overview ^

isinterval()

Function syntax:

bool isinterval(timestamp ts)

Returns true if the argument ts is an interval timestamp. See Simple Functions: Timestamp Overview for a discussion of interval versus absolute timestamp types.

Back to Top ^  Back to Timestamp Overview ^

milliseconds()

Function syntax:

timestamp milliseconds(int x)
timestamp milliseconds(double x)

Returns an interval timestamp representing an interval of x milliseconds expressed in seconds.

Use this function in conjunction with epoch() to convert a tuple field value representing the number of milliseconds since the epoch to an absolute timestamp. See epoch().

Back to Top ^  Back to Timestamp Overview ^

minutes()

Function syntax:

timestamp minutes(int x)
timestamp minutes(double x)

Returns an interval timestamp representing an interval of x minutes expressed in seconds.

Back to Top ^  Back to Timestamp Overview ^

seconds()

Function syntax:

timestamp seconds(int x)
timestamp seconds(double x)

Returns an interval timestamp representing an interval of x seconds.

Use this function in conjunction with epoch() to convert a tuple field value representing the number of seconds since the epoch to an absolute timestamp. See epoch().

Back to Top ^  Back to Timestamp Overview ^

strpinterval()

Function syntax:

timestamp strpinterval(string timestring, string formatstring)

Parses a string of time interval information and returns an interval timestamp value. The timestring argument can be any string representation of a time interval, as long as its components are individually parsable.

You must also specify a formatstring argument to specify how the timestring argument is to be interpreted. This function is implemented in Java, but for backwards compatibility, the format string must specify a time format pattern using the following subset of the patterns defined for the strptime library (part of the Open Group's Single UNIX Specification):

formatstring Component Meaning
%f The milliseconds portion of the seconds field. (This is a StreamBase extension not found in the strptime library.)
%H The hour, using a 24-hour clock, 0 to 23.
%M The minute, from 0 to 59.
%S The second, from 0 to 59.
! Use an exclamation point in the format string to indicate that the remainder of the input string is optional.
For %H, %M, and %S, leading zeros are recognized but not required.

Use this function to read a string representing a time interval in an incoming stream, and still interpret that string as an interval timestamp. For example, an incoming field, ElapsedTime, might contain a time interval expressed in following string format:

ElapsedTime = "1, 7, 5.250"

which you know from the data vendor's documentation means one hour and 7 minutes, plus five and a quarter seconds. In this case, interpret the field ElapsedTime as a StreamBase interval timestamp with an expression like the following:

strptime(ElapsedTime, "%H, %M, %S.%f")

which returns:

4025.250

If the ElapsedTime field in the incoming stream intermittently includes the seconds, use an exclamation point to show that everything after the minutes pattern is optional:

strptime(ElapsedTime, "%H, %M!, %S.%f")

Then when a new ElapsedTime value comes through set to "2, 4", it is still interpreted and converted to an interval timestamp:

7440.0

See also the strptime(), which performs for absolute timestamps what this function does for interval timestamps.

Back to Top ^  Back to Timestamp Overview ^

time()

Function syntax:

timestamp time(timestamp ts)

Returns as an interval timestamp the time of day expressed in the argument, ts – that is, the number of seconds between the start of the day expressed by ts and the moment expressed by ts. Thus, if run at the same instant, the following expressions return the same result: time(now()) and now() - today().

Back to Top ^  Back to Timestamp Overview ^

timezoneoffset()

timestamp timezoneoffset()
timestamp timezoneoffset(timestamp ts)

Without an argument, returns a signed interval timestamp that expresses in seconds the offset of the local time zone from UTC. Thus, timezoneoffset() returns as an interval the same value that format_time(now(), "Z") returns as a string. Because interval timestamps are expressed in seconds, you can divide the results by 3600 to obtain for display purposes the number of hours in the offset.

The returned time zone offset value is calculated relative to the system time for Windows, and relative to the system time or to the time zone specified in the TZ environment variable for UNIX. On UNIX, make sure that the environment from which the sbd process is opened either does not have a TZ variable or that it is correctly set for the local system's time zone. The TZ environment variable is ignored on Windows.

You can subtract the return of timezoneoffset() from any absolute timestamp value to generate a new absolute timestamp that represents the same moment in the UTC time zone. For example, now() - timezoneoffset() returns the present moment in the UTC time zone, and returns the same result when run from any time zone.

Because the value returned from timezoneoffset() is signed, you do not need to consider whether to add or subtract a positive or negative offset value from the timestamp of interest. Always use a subtraction operation, which insures that negative timezoneoffset() values (that is, those from time zones west of UTC) are effectively added, while positive values (from time zones east of UTC) are subtracted.

Once you have a timestamp value representing UTC time, you can convert it to a timestamp for another time zone by adding or subtracting a fixed value representing the offset of the time zone of interest. For example, the following expression returns the current time for Sydney, Australia for April to October, when Sydney is on standard time:

(now() - timezoneoffset()) - hours(10.0)

(For October to April, when Sydney is on daylight savings time, use hours(11.0).)

If you provide an absolute timestamp as an argument, timezoneoffset() returns the UTC offset of the system's current time zone at the date and time of the specified timestamp. The offset is calculated based only on the date and time information in the specified timestamp, relative to the system's current time zone (or, on UNIX, to the zone in TZ). You can use the argument form of timezoneoffset() to determine the UTC offset for future or historical dates.

For example, the following commands show that on the US East Coast, the UTC offset changed at 2:00 AM on March 10, 2019, when daylight savings time began.

  timezoneoffset(timestamp('2019-03-10 01:59:59'))/3600

returns -5.00.

  timezoneoffset(timestamp('2019-03-10 02:00:00'))/3600

returns -4.00.

  timezoneoffset(timestamp('2019-03-10 03:00:00'))/3600

returns -4.00.

(Actually, at 2:00 AM clocks were turned ahead one hour to 3:00 AM, but timezoneoffset() correctly calculates the UTC offset for both 2:00 and 3:00 AM.)

You can also extract specific fields of a timestamp in either local time or for a specific time zone using functions such as get_hour(), which take a string time zone ID as an optional argument. For more information, see Timestamp Functions: Timestamp Fields.

Back to Top ^  Back to Timestamp Overview ^

weeks()

Function syntax:

timestamp weeks(int x)
timestamp weeks(double x)

Returns an interval timestamp representing an interval of x weeks.

Back to Top ^  Back to Timestamp Overview ^

Timestamp Functions: Timestamp Fields

This category includes the following functions, used to get and set individual fields in absolute timestamps:

get_millisecond()

Function syntax:

double get_millisecond(timestamp ts)

Returns the milliseconds portion of the absolute timestamp ts. The returned value is a double, and includes only the fractional portion of current second's value.

Back to Top ^  Back to Timestamp Overview ^

get_second()

Function syntax:

double get_second(timestamp ts)
double get_second(timestamp ts, string tz)

Returns the seconds portion of the absolute timestamp ts for the local time zone. The returned value is a double, and includes a fractional value in milliseconds.

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

Back to Top ^  Back to Timestamp Overview ^

get_minute()

Function syntax:

int get_minute(timestamp ts)
int get_minute(timestamp ts, string tz)

Returns the minutes portion of the absolute timestamp ts for the local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

Back to Top ^  Back to Timestamp Overview ^

get_hour()

Function syntax:

int get_hour(timestamp ts)
int get_hour(timestamp ts, string tz)

Returns the hours portion of the absolute timestamp ts for the local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

Back to Top ^  Back to Timestamp Overview ^

get_day_of_week()

Function syntax:

int get_day_of_week(timestamp ts)
int get_day_of_week(int year, int month, int day)

Returns an integer representing the day of the week given an absolute timestamp ts for the local time zone or given three integers representing a year, month, and day. The returned value is between 0 (Sunday) and 6 (Saturday).

Back to Top ^  Back to Timestamp Overview ^

get_day_of_month()

Function syntax:

int get_day_of_month(timestamp ts)

Returns an integer representing the day of the month in the absolute timestamp ts for the local time zone. The returned value is one-based, and thus between 1 and 31.

Back to Top ^  Back to Timestamp Overview ^

get_day_of_year()

Function syntax:

int get_day_of_year()
int get_day_of_year(int year, int month, int day)
int get_day_of_year(timestamp ts)

Returns an integer representing the Julian day of the year for the local time zone. The returned value is between 1 and 365 (or 366 for leap years).

get_day_of_year with three integer arguments returns the Julian day of the year for the specified year, month, and day; with one timestamp argument, for the date expressed by that timestamp.

get_day_of_year with no arguments is equivalent to get_day_of_year(now()).

Back to Top ^  Back to Timestamp Overview ^

get_month()

Function syntax:

int get_month(timestamp ts)
int get_month(timestamp ts, string tz)

Returns a one-based integer representing the month in the absolute timestamp ts for the local time zone. The returned value is between 1 (January) and 12 (December).

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

Back to Top ^  Back to Timestamp Overview ^

get_quarter()

Function syntax:

int get_quarter(timestamp ts)
int get_quarter(timestamp ts, string tz)
int get_quarter(timestamp ts, int startMonth)
int get_quarter(timestamp ts, string tz, int startMonth)

Returns an integer representing the quarter of the year in which the absolute timestamp ts falls. The returned value is 1, 2, 3, or 4.

With a single timestamp ts argument, the function defaults to January as the start month of the year, and defaults to the local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

To specify a fiscal year different from the calendar year, provide an integer startMonth value from 1 to 12 as the second argument.

To specify both non-local time zone and a different start month, specify both tz and startMonth values as second and third arguments.

Back to Top ^  Back to Timestamp Overview ^

get_year()

Function syntax:

int get_year(timestamp ts)
int get_year(timestamp ts, string tz)

Returns the year portion of the absolute timestamp ts for the local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, tz, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

Back to Top ^  Back to Timestamp Overview ^

set_second()

Function syntax:

timestamp set_second(timestamp ts, int x)
timestamp set_second(timestamp ts, long x)
timestamp set_second(timestamp ts, double x)

Returns a modified version of the absolute timestamp ts, with the seconds component replaced with x. If you supply an int or long value for x, the milliseconds portion of x is set to 000. If you supply a double value, you can replace the seconds component down to the millisecond.

Back to Top ^  Back to Timestamp Overview ^

set_minute()

Function syntax:

timestamp set_minute(timestamp ts, int x)

Returns a modified version of the absolute timestamp ts, with the minutes component replaced with x.

Back to Top ^  Back to Timestamp Overview ^

set_hour()

Function syntax:

timestamp set_hour(timestamp ts, int x)

Returns a modified version of the absolute timestamp ts, with the hours component replaced with x.

Back to Top ^  Back to Timestamp Overview ^

set_day_of_week()

Function syntax:

timestamp set_day_of_week(timestamp ts, int x)

Returns a new timestamp that has the same value as absolute timestamp ts, but with the day of the week component replaced with the value of x. Days of the week are numbered from 0 to 6, Sunday through Saturday. Day 7 is also considered to be Sunday.

Back to Top ^  Back to Timestamp Overview ^

set_day_of_month()

Function syntax:

timestamp set_day_of_month(timestamp ts, int x)

Returns a modified version of the absolute timestamp ts, with the day of month component replaced with x.

Back to Top ^  Back to Timestamp Overview ^

set_month()

Function syntax:

timestamp set_month(timestamp ts, int x)

Returns a modified version of the absolute timestamp ts, with the month component replaced with x.

Back to Top ^  Back to Timestamp Overview ^

set_year()

Function syntax:

timestamp set_year(timestamp ts, int x)

Returns a modified version of the absolute timestamp ts, with the year component replaced with x.

Back to Top ^  Back to Timestamp Overview ^

to_milliseconds()

Function syntax:

long to_milliseconds(timestamp ts)

Converts an interval or absolute timestamp to a long value representing milliseconds. This function is similar to to_seconds(), but is more precise and faster. If the argument ts is an interval, returns the number of milliseconds in the interval. For example, to_milliseconds(now() - today()) returns the number of milliseconds since midnight (00:00:00).

If the argument ts is an absolute time, returns the number of milliseconds since the epoch, Jan 1, 1970. For example, to_milliseconds(timestamp("2019-05-22 21:32")) returns 1558575120000.

Back to Top ^  Back to Timestamp Overview ^

to_seconds()

Function syntax:

double to_seconds(timestamp ts)

Converts an interval or absolute timestamp to a double in seconds. If the argument ts is an interval, returns the number of seconds in the interval. For example, to_seconds(now() - today()) returns the number of seconds since midnight (00:00:00).

If the argument ts is an absolute time, returns the number of seconds since the epoch, Jan 1, 1970. For example, to_seconds(timestamp("2019-05-22 21:32")) returns 1.55857512E9.

Back to Top ^  Back to Timestamp Overview ^

Simple Functions: Type Conversions

This category includes the following functions:

blob()

Function syntax:

blob blob(string arg)
blob blob()

Converts arg to a value of type blob where arg is a string. Returns a blob representation of the string. With no argument, returns a null blob as if you had used blob(null).

Back to Top ^

For additional conversion functions involving blobs, see Simple Functions: BSON.

bool()

Function syntax:

bool bool(bool arg)
bool bool(double arg)
bool bool(int arg)
bool bool(long arg)
bool bool(string arg)
bool bool(blob arg)
bool bool()

Converts arg to a value of type bool where arg is a bool, double, int, long, or string.

  • An int argument of 0 returns the Boolean false. All other int values, positive or negative, return true.

  • A double argument of 0 returns the Boolean false. All other double values, positive or negative, return true.

  • A string argument false returns a Boolean false and the string true returns Boolean true. The string is compared in case-insensitive manner. For example, bool("FALSE") and bool("false") are equivalent. Any string other than false or true is an error.

  • A blob argument returns the content of a BSON blob containing a bool.

  • A bool value returns itself.

  • With no argument, returns a null bool as if you had used bool(null).

Back to Top ^

double()

Function syntax:

double double(bool arg)
double double(double arg)
double double(int arg)
double double(long arg)
double double(string arg)
double double(blob arg)
double double()

Converts arg to a value of type double where arg is a bool, double, int, long, or string.

  • A bool argument returns 1.0 if true, or 0.0 if false.

  • A double argument returns itself.

  • An int argument returns the same value converted to type double. For example, double(3) returns 3.0.

  • A long argument returns the same value converted to type double. For example, double(15L) returns 15.0.

  • A string argument is parsed as a decimal number. For example, double("123.456") returns 123.456, but double("7abc") returns an error. Scientific notation is supported, so double("1.2E4") returns 12000.0.

  • A blob argument returns the content of a BSON blob containing a double.

  • With no argument, returns a null double as if you had used double(null).

Also See: The to_seconds() function, used to convert timestamps to double.

Back to Top ^

int()

Function syntax:

int int(bool arg)
int int(double arg)
int int(int arg)
int int(long arg)
int int(string arg)
int int(blob arg)
int int()

Converts arg to a value of type int where arg is a bool, double, int, long, or string.

  • A bool argument returns 1 if true, or 0 if false.

  • A double argument has its fractional part truncated. For example, int(3.4) returns 3.

  • An int argument returns itself.

  • A long argument returns the int same as a long value. For example, int(150L) returns 150.

  • A string argument is parsed as a decimal number. For example, int("123") returns 123, but int("7abc") is an error.

  • A blob argument returns the content of a BSON blob containing an int.

  • With no argument, returns a null int as if you had used int(null).

Back to Top ^

list()

Function syntax:

list list(e1, e2, e3, ...)

Returns a list containing each of the specified elements e1, e2, and so on, where each specified element is of the same data type. The specified elements can be literals or expressions, as long as each resolves to (or can be coerced to) the same data type. You can create a list of any StreamBase data type.

Use emptylist() to return an empty list, and use nulllist() to return a null list. (See Null Lists for a discussion of null lists compared to empty lists.)

Back to Top ^

long()

Function syntax:

long long(bool arg)
long long(double arg)
long long(int arg)
long long(long arg)
long long(blob arg)
long long()

Converts expression arg to a value of type long where arg evaluates to a bool, double, int, or long.

  • An int argument returns the same value, but with type long.

  • A double argument has its fractional part truncated, and the remainder returned with type long.

  • A bool argument returns 1 if true, or 0 if false.

  • A blob argument returns the content of a BSON blob containing a long.

  • A long value returns itself.

  • With no argument, returns a null long as if you had used long(null).

Back to Top ^

Named Schema Constructor Function

Function syntax:

tuple schema_name(fieldspec [, fieldspec2...]) [as tuplename]

For every named schema you define or import into a module, StreamBase automatically generates a new function in the StreamBase expression language that allows you to construct tuples with that schema. The generated function's name is the name of the named schema, and it takes one or more comma-separated expressions as value arguments. The result is a single tuple with the same schema as the named schema.

For example, for the named schema point, whose schema is (x double, y double), you can use a function named point() anywhere in the module that defines the named schema:

point(32.0, 44.5)

point(32.0 as x, 44.5 as y)

You can specify the list of fieldspec field specifications in two ways:

  • By position

  • By name

The following examples presume a named schema with the name Quotes, having the following schema:

(Symbol string, Price double, NumShares int)

When specifying by position, you do not need to specify AS keywords and field names, but you must specify all fields in schema order. For example:

Quotes("IBM", 132.42, 5000)

When specifying by name, you must use the AS keyword to specify field names, which must exactly match the spelling of the named schema's fields. Any fields you do not specify are automatically filled with nulls:

Quotes("IBM" as Symbol, 132.42 as Price)

When specifying by name, you can specify fields in any order:

Quotes(132.42 as Price, "IBM" as Symbol)

You cannot mix field specifications with and without the AS keyword.

Specifying fields by name provides flexibility when you use the same named schema in several places in a large application. If you later edit the named schema itself to add a new field, you do not need to add the new field everywhere the schema is used. If the fields are specified by name in all locations, the new field does not cause an error condition, and is automatically filled with null.

Use the name of a named schema with no arguments to construct a null tuple (which is not the same as a tuple whose fields are all null). For example: point(), Quotes()

Avoid naming a schema the same as an existing expression language function. If you do, the generated constructor function would mask and prevent the use of the same-named built-in function. See Named Schema Name Collisions.

Use the * AS * syntax for tuples defined with a named schema to copy the entire tuple into a single field of type tuple.

For example, let's say the tuple arriving at the input port of a Map operator was defined upstream with the NYSE_FeedSchema named schema. To preserve the input tuple unmodified for separate processing, the Map operator could add a field of type tuple using settings like the following in the Additional Expressions grid. When using the * AS * syntax in the Expression column, the name of the tuple field in the Field Name column has an implied asterisk for all of its fields.

Action Field Name Expression
Add OriginalOrder NYSE_FeedSchema(input1.* as *)

Because the Map operator has only one input port, the port does not need to be named:

Action Field Name Expression
Add OriginalOrder NYSE_FeedSchema(* as *)

string()

Function syntax:

string string(blob arg)
string string(bool arg)
string string(double arg)
string string(int arg)
string string(long arg)
string string(string arg)
string string(timestamp arg)
string string()

Converts arg to a value of type string where arg is a blob, bool, double, int, long, string, or timestamp.

  • An int, double or long argument returns the string representation of arg.

  • A string argument returns itself.

  • A bool argument returns the string true if true, or false if false.

  • A timestamp argument, if represented as a date and time, returns a string that includes the explicit time zone. A timestamp represented as a numeric value returns the string representation of that value.

  • A blob argument returns the string representation of the blob.

  • With no argument, returns a null string as if you had used string(null).

Back to Top ^

timestamp()

Function syntax:

timestamp timestamp(string timestring)
timestamp timestamp()

Returns an absolute timestamp parsed from the argument timestring, a string representation of a date with optional time. With no argument, returns a null timestamp as if you had used timestamp(null).

The timestring argument must be in the form of a time format pattern as defined in the java.text.SimpleDateFormat class. See the format_time() function for a summary of the time format pattern strings.

You can specify only the date portion:

timestamp("2009-02-12")

If you specify any part of the time value, you must specify at least the hours and minutes:

timestamp("2009-02-12 14:28") or timestamp("2009-02-12 14:28:05.250")

You can also specify the entire SimpleDateFormat string, including time zone specification:

timestamp("2009-02-12 14:28:05.250-0800")

See also the strptime() and parse_time() functions, used to convert strings to timestamps.

Back to Top ^

tuple()

Function syntax:

tuple tuple(value as fieldname [, value2 as fieldname2 [,...]]) [as tuplename]
tuple tuple()

Takes one or more comma-separated value expressions as arguments, each field named with the AS keyword. The result is a single tuple with the specified contents. For example:

tuple(126.50 as price, "IBM" as symbol)

When using tuple() to create individual tuples, to be compliant with a particular schema, you must specify all fields in the schema's field order. The tuple() function has no knowledge of the schema you are trying to match. For these reasons, we recommend using named schemas and the named schema constructor function.

You can duplicate any tuple field into another field of type tuple without using wildcards. For example, a Map operator might have an entry like the following in its Additional Expressions grid, where both IncomingTuple and CopyOfIncomingTuple are the names of tuple fields:

Action Field Name Expression
Add CopyOfIncomingTuple IncomingTuple

Use the .* syntax to flatten a tuple field into the top level of a stream.

For example, a Map operator might define an entry like the following in its Additional Expressions grid. When using this syntax, you must have an asterisk in both Field Name and Expression columns.

Action Field Name Expression
Add * IncomingTuple.*

Back to Top ^

Simple Functions: Utilities

This category includes the following functions:

coalesce()

Function syntax:

T coalesce(T arg1, ..., T argn)

Returns the first non-null argument, or a null value if all arguments are null. Accepts all data types, but all arguments must have the same type. The returned value has the same data type as the arguments. For tuple arguments, returns the first tuple that is non-null. This function stops evaluating arguments after the first non-null argument is detected.

Since literal values are never null, you can specify a literal value as the last argument to provide an effective default value for the list. In this way, coalesce() can be used to emulate the NVL() function provided by Oracle PL/SQL and the two-argument ISNULL() function provided by Microsoft T/SQL. For example, the following expression returns the value of fieldA if it is non-null, or 0 if it is null:

coalesce(fieldA, 0)

The following example returns the first non-null field among fieldA, fieldB, and fieldC in that order, or returns -99999 if all three fields are null:

coalesce(fieldA, fieldB, fieldC, -99999)

Back to Top ^

coalesce_tuples()

Function syntax:

T coalesce_tuples(T arg1, ..., T argn)

For non-hierarchical data types, coalesce_tuples() acts the same as the coalesce() function. For hierarchical types, this function coalesces each sub-field, recursing into further nested fields as necessary.

For example, the following function:

coalesce_tuples(
   tuple(double() AS x, 1.2 AS y, double() AS z),
   tuple(1.0 AS x, 1.3 AS y, double() AS z),
   tuple(double() AS x, double() AS y, 1.4 AS z)
)

returns the following tuple:

((x double, y double, z double)) 1.0, 1.2, 1.4

Remember that double() evaluates to double(null). Thus, the coalesce_tuples function in this example selected:

  • The first non-null value for x, 1.0, which was found in the second input tuple.

  • The first non-null value for y, 1.2, which was found in the first input tuple.

  • The first non-null value for z, 1.4, which was found in the third input tuple.

Back to Top ^

compile()

Function syntax:

function compile(string expr, tuple varfields, T typespec)
function compile(string expr, tuple varfields, T typespec, 
    string name)
function compile(string expr, tuple varfields, T typespec, 
    string name, tuple environment)

Converts a string defining an expression and a tuple defining its arguments into an executable StreamBase function at runtime. Compiles the function body expr specified as the first argument, providing it with the arguments specified in the second argument, varfields. The third argument defines the data type the function is to return when executed.

The first argument, expr, can be a string literal, a string constant, or the name of a field passed in from a stream. The second argument, varfields, can define a tuple using the tuple() function, or a named schema, or a parameter specifying a tuple. Typical usage is to provide the data types and field names that are to be used in the supplied expression, such as:

tuple(double() as factor)

More generally, if the context that the function will run in has the same fields as the current input fields, you can specify all current field names:

tuple(* as *)

Always provide the third argument, typespec, to define the data type that the function returns, which can be any StreamBase data type, such as int(), double(), list(), string(), or even function().

For example, compile a function that converts temperature in degrees Fahrenheit to degrees Celsius:

compile( "(fahr_temp - 32) * 1.8", tuple(double() as fahr_temp), double() )

The second syntax adds string name, an optional name for the function. Specify a name if the function is recursive, so that it can refer to itself. StreamBase expression language identifier naming rules apply, and the scope of the identifier is the function itself.

The third syntax adds tuple environment, which provides a set of fields that the function expression can use as names and values for local variables. When using this third syntax, you can specify an empty name argument ("") if it is not referenced within the function body.

You can define an expression within a stream's schema as a string and use it to define a function. For example, the input stream could contain a string field named converter with the value "fahr_temp - 32) * 1.8". The above example could then be rewritten as:

compile( converter, tuple(double() as fahr_temp), double() )

You can generalize the above example to convert Fahrenheit temperatures to either Celsius or Kelvin units by specifying an environment variable, which could either be defined on the input stream or as a constant. If the value of that variable is 0.0, the function returns degrees Celsius. If its value is 273.15, the function returns degrees Kelvin. Defining converter as "(fahr_temp - 32) * 1.8 + k_offset" and providing name and environment arguments changes the call to compile to:

compile( converter, tuple(double() as fahr_temp), double(), "", k_offset )

When environment is defined as (double(273.15) as k_offset), the function returns degrees Kelvin. When it is defined as (double(0.0) as k_offset), the function returns degrees Celsius.

Back to Top ^

eval()

Function syntax:

T eval(string expr, T typespec)
T eval(string expr, tuple varfields, T typespec [, int cachesize])

Evaluates the expression provided as the first argument and returns the result, whose data type is provided in the typespec argument. The expression must resolve to the data type specified as typespec; the function does not coerce the return's type.

The typespec argument is evaluated only for its data type, and its value is ignored. You can specify literal values, as long as their data type is unambiguous (1, 1L, 1.0, "string", and so on); or specify the null value function for the expected data type (int(), long(), double(), string(), and so on); or provide an expression that unambiguously resolves to a value with the expected data type.

For the two-argument form, the expr argument is a simple StreamBase expression that does not depend on any variables for its resolution. For the three-argument form, the expr argument can reference fields in the second argument's tuple.

The function returns an evaluation exception under the following circumstances:

  • If you provide a typespec argument with a different data type than actually returned from the evaluation of the first argument.

  • The expression in the expr argument fails to pass typechecking or resolves to null.

  • In the three-argument form, you provide a null tuple for the second argument.

In any field that takes expressions, you can, of course, enter any valid expression to be evaluated without needing to use eval(). The eval() function provides a way for you to build an expression string dynamically or have it based on input:

eval(input.exprfield, int())

The three-argument form can be used to pass an entire input tuple or table row for evaluation against an expression:

eval("price > 200.00", input, bool())

eval("remainingQuantity == 0", current, bool())

The expression in the expr argument is evaluated in a self-contained environment. Anything the expression needs for its evaluation must be passed in by means of the varfields argument. For example, to reference a constant or dynamic variable in the expr argument, make sure that constant or dynamic variable is included in the input tuple. Let's say a module defines the constant discount, and has a dynamic variable named curr_rate. You want to multiply field price in the input tuple by both values. Use the tuple() function to create a tuple consisting of the entire input tuple, plus the constant and variable:

eval("input.price * discount * curr_rate", tuple(input.* as *, discount, curr_rate), double())

The optional fourth argument is an advanced setting that specifies the size of the evaluation cache as the integer number of unique compiled evaluations to be saved in the cache. In the absence of this argument, the evaluation cache is the same as the current setting of the system property streambase.codegen.eval-cache-size, which defaults to 100 for StreamBase Server and to 1000 for LiveView Server. The cache setting is StreamBase container-wide, and setting it during runtime can only increase the cache, and never shrink it. This prevents a single use of eval() at runtime from reducing the cache size for everyone.

This argument must be in the fourth position; to use it with the two-argument form of this function, add an empty tuple for the second argument. (Remember that an empty tuple is not the same as a null tuple, which is disallowed for the second argument; see Null Tuples, Empty Tuples, No-Fields Tuples.)

Back to Top ^

fromCodePoint()

Function syntax:

string fromCodePoint(int value)

Accepts an integer value and returns the corresponding ASCII character.

Back to Top ^

great_circle_distance()

Function syntax:

double great_circle_distance(lat1 double, long1 double, lat2 double, long2 double)

Returns a double in radians representing the shortest distance between two points, calculated on the surface of a unit sphere. The arguments are the latitude and longitude of the beginning and ending coordinates (lat1, long1, lat2, long2), expressed in degrees.

To obtain the result in a certain unit, multiply the returned value by the radius of the sphere of interest in the desired unit. For example, to obtain a great circle distance on the earth's surface, multiply the returned value by the radius of the earth, which is approximately 6371 km or 3959 miles (but see Wikipedia on that subject).

The first example below gives the distance between two cities expressed in kilometers and the second example gives it in miles.

6371*great_circle_distance(42.3656, -71.0096, 53.4264, -6.2499)
3959*great_circle_distance(42.3656, -71.0096, 53.4264, -6.2499)

Back to Top ^

hash()

Function syntax:

int hash(T arg)

Returns a hashed integer value derived from the argument, which can be any StreamBase data type.

Back to Top ^

isnull()

Function syntax:

bool isnull(T value)

Accepts all data types. Returns true if the argument is a null value.

For list and tuple arguments, isnull() returns true if and only if the top-level list or tuple is null. If a list contains any elements at all, even if all their values are null, it is not a null list. If a tuple contains any fields at all, it is not null even if all field values are null.

Back to Top ^

jsonpath()

Function syntax:

string jsonpath(string json, string jsonpath)

Accepts a JSON-formatted string and a JSONPath expression and returns the result of applying the jsonpath expression as a filter on the input JSON string. JSONPath expressions are XPath-like pattern strings originally defined in JavaScript. This function is based on the Java implementation of JSONPath. See either of these pages for instructions on crafting JSONPath expressions.

For example, given the following JSON document:

{
  "store": {
    "book": [
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby Dick",
        "ISBN": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J. R. R. Tolkien",
        "title": "The Lord of the Rings",
        "ISBN": "0-395-19395-8",
        "price": 22.99
      }
    ]
  }
}

and passing the JSONPath string $.store.book[1].title as the second argument, the result is:

["The Lord of the Rings"]

(Remember that JSON array counts its membership starting with zero.)

Back to Top ^

length()

Function syntax:

int length(blob x)
int length(list L)
int length(string x)

Returns the length of its argument for argument types blob, list, and string. For a blob, returns a count of bytes. For a non-null list, returns the number of elements in the list, including any null elements. For a string, returns the number of graphemes in the string, for ASCII and all supported Unicode character sets.

If the argument is a null blob, null list, or null string, the returned value is null.

Back to Top ^

md5()

Function syntax:

string md5(string input)

Returns the md5 fingerprint of the provided input string.

Back to Top ^

new_tuple()

Function syntax:

tuple new_tuple(tuple t [, updated_value1 AS field1_name [, 
    updated_value2 AS field2_name [, ... [, 
    updated_valueN AS fieldN_name]]]])

Use the new_tuple() function to modify the values of a small number of fields in a tuple, leaving the other fields unchanged.

The required first argument t is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form value AS field_name. You can use input.* AS * to fill in the fields in t with input stream fields that have matching field names.

You do not need to specify all fields in t, but you cannot specify fields not in t.

The following example both defines a tuple and immediately replaces the value of its x field. This example is impractical but can illustrate new_tuple():

new_tuple(tuple(1.0 AS x, 2.0 AS y, 3.0 AS z), 5 AS x)

For a more practical example, an input stream may have a schema defined as (Symbol string, Price double, NumSh double). The following expression in a Map operator's Additional Expressions grid inflates the incoming value of the Price field by 10%:

Action Field Name Expression
Add InflatedInput new_tuple(input, input.Price*1.10 as Price)

Back to Top ^

new_tuple_subset()

Function syntax:

tuple new_tuple_subset(tuple t, value1 AS field1_name[, 
    value2 AS field2_name[, ... [, 
    valueN AS fieldN_name]]])

This function returns a tuple with the schema of the tuple argument t, and with the specified new field values substituted.

The required first argument t is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form value AS field_name. You can use input.* AS * to fill in the fields in t with input stream fields that have matching field names.

Unlike new_tuple(), this function allows but ignores any update field names not in tuple t.

Unlike new_tuple(), this function requires all field names in t to be specified.

Let's say an input stream has a schema defined as (Symbol string, Price double, NumSh double, CustID int), while the named schema Trade has a subset of that schema: (Symbol string, Price double, NumSh double). The following expression in a Map operator's Additional Expressions grid accepts the four-field input tuple and extracts only the three-field trade basics into a new field named TradeOnly.

Action Field Name Expression
Add TradeOnly new_tuple_subset(Trade(), input.* AS *)

Back to Top ^

new_tuple_subset_loose()

Function syntax:

tuple new_tuple_subset_loose(tuple t, updated_value1 AS field1_name[, 
    updated_value2 AS field2_name[, ... [, 
    updated_valueN AS fieldN_name]]])

This function returns a tuple with the schema of the tuple argument t, and with the specified new field values substituted.

The required first argument t is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form value AS field_name. You can use input.* AS * to fill in the fields in t with input stream fields that have matching field names.

Unlike new_tuple(), this function allows but ignores any update field names not in tuple t.

Like new_tuple(), this function does not require all field names in t to be specified.

This function is especially useful when using a named schema to specify a declared output stream schema. In a Map operator immediately upstream of the output stream, use this function to reconcile a varying set of input fields to the output stream's declared schema. If the output stream's declared schema is named TargetSchema, use an Additional Expressions grid like the following:

Action Field Name Expression
Add TempTuple new_tuple_subset_loose(TargetSchema(), input.* AS *)

Then add a second Map to expand TempTuple's fields back to the top level:

Action Field Name Expression
Add * TempTuple.*

As an alternative, you can collapse both Map operators into one that uses the following:

Action Field Name Expression
Add * new_tuple_subset_loose(TargetSchema(), input.* AS *).*

This setup allows you to change the definition of TargetSchema elsewhere in a large application without having to also change the preceding Map operator.

Back to Top ^

notnull()

Function syntax:

bool notnull(T value)

Returns true if the argument is not a null value. Accepts all data types.

This function always returns the opposite of isnull().

Back to Top ^

nullif()

Function syntax:

T nullif(T arg1, T arg2)

Takes two arguments of any type (the same type for both). If the resolved value of both arguments is the same, returns null. If the resolved value of the two arguments differs, returns the value of the first argument. This function can be used to evaluate fields before and after an event, and to send a field upstream only if it has changed, otherwise null.

Back to Top ^

parsecsv()

Function syntax:

tuple parsecsv(tuple tuplespec, tuple tupleoptions, string csv-string)

Accepts a string formatted as comma-separated values (CSV) and returns a tuple. It takes three arguments:

  • tuplespec - a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition

  • tupleoptions - a tuple defining one or more of the processing options described in the following table, or null if none are passed in

  • csv-string - A string with a list of items delimited by commas (default) or by the delimiter character you specified in tupleoptions

The tuple options are:

Option Name Description Data Type Default Value
delimiter Delimiter character separating individual fields string , (comma)
quotechar Quote character for string fields string " (double quote)
header A CSV string, which may be optionally used to specify the order of tuple fields in the inputs to the function. string The order of the fields in the output Schema type
nullstring String that specifies what null values in the input CSV string look like string "null" (Tuple.NULL_STRING)
timestampformat Timestamp format to be used to parse timestamp fields in the input strings string yyyy-MM-dd hh:mm:ss.SSSZ
lenientTimestampParsing Whether to attempt to parse timestamps using the default datetime format ("yyyy-MM-dd hh:mm:ss.SSSZ") or interval format ("ss.SSS") when a timestamp field cannot be parsed using the specified timestampformat boolean true

No value is set for fields that do not have a value specified in the input string. If an input string has more fields than the field order specifies, the additional fields are ignored with a warning.

You can specify how values in CSV string inputs are ordered using the header option. The header field names must match those in the output schema. If the header contains any field names that are not in the schema, those fields are ignored, and a debug level log message is issued. If the header option is not specified, input values must be ordered as specified by the schema in the first argument.

Specify sub-tuples as escaped CSV strings (using \" or the designated quote character). There is no option to specify the field order within fields that are tuples themselves. See the second example below.

Example, with lines broken for publication clarity:

parsecsv(tuple ("a" AS FirstName, "b" AS LastName, "c" AS StreetNum, 
    "d" AS StreetName, 1 AS Zip, "e" AS email), null, 
    "John,Doe,30,Main Street,43210,jdoe666@myisp.com")

Output:

Result schema: ((FirstName string, LastName string, StreetNum string, StreetName string, 
Zip int, email string)) 
Result: John,Doe,30,Main Street,43210,jdoe666@myisp.com 

Note

All three input arguments (tuple, tuple, string) are required. The tuple options can be null, but must be included.

The same example with street address formatted as a sub-tuple, and with delimiter set to semicolon:

parsecsv(tuple ("a" AS FirstName, "b" AS LastName, 
    tuple ("c" AS StreetNum, "d" AS StreetName, 1 AS Zip) AS Address, 
    "e" AS email), tuple(";" AS delimiter),  
    "John;Doe;\"30;Main Street;43210\";jdoe666@myisp.com")

Output:

Result schema: ((FirstName string, LastName string, Address (StreetNum string, 
StreetName string, Zip int), email string))
Result: John,Doe,"30,Main Street,43210",jdoe666@myisp.com

Back to Top ^

parsejson()

Function syntax:

tuple parsejson(tuple tuplespec, string json-string [, string timestampformat]
[, string emptyStringJsonValue])

Accepts a JSON-formatted string, in either JSON object or JSON array format (as defined on http://www.json.org) and returns a tuple. The first argument, tuplespec, is a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition.

Use the optional third argument to specify a timestamp format with which to parse the JSON string, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string yyyy-MM-dd HH:mm:ss.SSSZ.

Use the optional fourth argument to specify what the value "" or '' should be evaluated to be replaced in the processing of the JSON tuple. For instance, if this parameter is set to 'test', then all empty strings in the JSON that pass through that processor will be replaced with the string 'test'. By default, this parameter is set to "".

For example, for the named schema point(), defined as (int x, int y), the following expressions both return the tuple (x: 88, y: 99).

parsejson(point(), "[88, 99]")
parsejson(point(), "{x:88, y:99}")

The following example returns the tuple (test).

parsejson(tuple('' AS a),"{a:''}",null,'test')

The following example returns the tuple (1,test).

parsejson(tuple('' AS a,'' AS b),"{a:'1',b:''}",null,'test')

Back to Top ^

parsejson_loose()

Function syntax:

tuple parsejson_loose(tuple tuplespec, string json-string [, string timestampformat]
[, string emptyStringJsonValue])

Works similarly to parsejson(), but ignores json elements that do not fit the schema definition provided. Accepts a JSON-formatted string, in either JSON object or JSON array format (as defined on http://www.json.org) and returns a tuple. The first argument, tuplespec, is a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition.

Use the optional third argument to specify a timestamp format with which to parse the JSON string, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string yyyy-MM-dd HH:mm:ss.SSSZ.

Use the optional fourth argument to specify what the value "" or '' should be evaluated to be replaced in the processing of the JSON tuple. For instance, if this parameter is set to 'test', then all empty strings in the JSON that pass through that processor will be replaced with the string 'test'. By default, this parameter is set to "".

For the named schema point(), defined as (int x, int y), parsejson_loose() returns the tuple (x: 88, y: 99) from both the following expressions.

parsejson_loose(point(), "[88, 99]")
parsejson_loose(point(), "{x:88, y:99}")

In addition, it would return the same result from a JSON string such as {x:88, y:99, z:101}, ignoring the extra element.

In the following example, the initial JSON element z is ignored, and the missing tuple element z is set to null, to return the tuple 1,"2,""null,3""" (line break added for clarity):

parsejson_loose(tuple('' AS x, tuple('' AS y, tuple ( '' AS z, '' AS t) AS q) AS r),"{z:'a', x:'1', r:{y:2, q:{t:'3'}}}")

The following example returns the tuple (test).

parsejson_loose(tuple('' AS a),"{a:''}",null,'test')

The following example returns the tuple (1,test).

parsejson_loose(tuple('' AS a,'' AS b),"{a:'1',b:''}",null,'test')

Back to Top ^

random_tuple()

Function syntax:

tuple random_tuple(schema-definition)

Given a schema definition, usually in the form of a named schema constructor function, returns a tuple with each field filled with random data of the correct data type for each field. For example, for the named schema, point, with schema (double x, double y, double z), the named schema constructor function point() creates a single tuple with that schema. Use point() as the schema definition argument for random_tuple():

random_tuple(point())

This returns three double values selected at random, any of which might be null. Numeric values returned can be fractional or negative. Each time this function is run, another three random values are returned.

You can also use random_tuple() using tuple() to define a schema. In this case, you must provide value arguments and field names, as in the following example:

random_tuple(tuple(132.34 as price, "IBM" as symb, now() as tdate))

This at first appears to be a contradiction, because you provide values to fields that you then ask random_tuple() to replace with new, random values. However, in this context, the values you provide to tuple() are evaluated by random_tuple() only to determine the data type for each field. For this example, random_tuple() returns three random fields of type double, string, and timestamp.

Back to Top ^

toCodePoint()

Function syntax:

int toCodePoint(string characters [, int index])

Accepts a string of one or more characters and returns the corresponding ASCII code. If more than one character is provided with no index argument, the code corresponding to the first character is returned. You can also provide a zero-based index value to specify which character in the provided string whose code you want returned.

Back to Top ^

toCSV()

Function syntax:

string toCSV(tuple fieldname, [tuple tupleoptions])

Accepts a tuple argument and returns the tuple's field values as a CSV string delimited by commas (default) or by the delimiter character specified in tupleoptions. The argument fieldname is likely to be a field of type tuple in the incoming stream, but can be any expression that resolves to a single tuple. tupleoptions is a tuple defining one or more of the following fields: delimiter, quotechar, timestampformat, header. The header value is a comma separated string of field names to include in the output, in the desired field order.

Back to Top ^

tojson()

Function syntax:

string tojson(tuple fieldname [, bool verbosity] [, string timestampformat])

Accepts a tuple argument and returns the tuple's field values as a JSON object of key-value pairs, or optionally as a JSON array of field values. The argument fieldname is likely to be a field of type tuple in the incoming stream, but can be any expression that resolves to a single tuple.

Use the optional third argument to specify a timestamp format with which to parse timestamp fields in the incoming tuple, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string yyyy-MM-dd HH:mm:ss.SSSZ. When specifying the third argument, the verbosity second argument is no longer optional, and must be explicitly set.

By default, or with the optional verbosity argument expressed as true, the input tuple's field and value pairs are returned as a string formatted as a JSON object, as defined on http://www.json.org: a left brace followed by colon-separated name-value pairs (with field name quoted), and each pair separated by a comma, followed by a closing right brace.

Set the second argument to false to return a JSON array of field values instead: a left bracket following by comma-separated field values, followed by a closing right bracket.

For example, let's say an incoming tuple field named trade has the following schema and values:

Field Data Type Value
A double 2.0
B int 333
C string "IBM"
D list(int) range(0, 8, 2) = [0, 2, 4, 6]

In this case, the expressions tojson(trade) and tojson(trade, true) return the following string:

{"A":2.0,"B":333,"C":"IBM","D":[0,2,4,6]}

Test this function in a Map operator's Additional Expressions grid with a construction like the following:

tojson(tuple(2.0 as A, 333 as B, 'IBM' as C, range(0, 8, 2) as D))

The expression tojson(trade, false) returns the following string:

[2.0,333,"IBM",[0,2,4,6]]

Back to Top ^

uuid()

Function syntax:

string uuid()

Returns a pseudo randomly generated 128-bit universally unique identifier string in the manner of java.util.UUID.randomUUID().

Back to Top ^

Aggregate Functions Overview

Aggregate functions are used on sets of data to return a single result. Aggregate functions evaluate columns of data from windows or tables. In EventFlow applications, aggregate functions can only be used in the following aggregate contexts:

  • In Aggregate operators, in aggregate expressions that apply to an aggregate window.

  • In Query operators that perform Read or Delete operations, in expressions that apply to all rows read.

  • In Query operators that perform Read or Delete operations, in aggregate expressions used in conjunction with a Group-By field.

Expressions in aggregate contexts can contain a single aggregate function, or they can combine multiple aggregate functions together with constants and fields from an input stream or an associated Query Table. Aggregate expressions can include simple functions if the value of the simple function does not change over the lifetime of the Aggregate window.

Aggregate functions are organized into the following categories:

Aggregate Functions: Aggregate to List

This category includes the following functions:

aggregatelist()

Function syntax:

list(T) aggregatelist(T e)

Returns a list containing one element for each expression, e, in the Aggregate context. The list element type, T, of the returned list is the same as the data type of e.

Back to Top ^  Back to Aggregate Overview ^

aggregatelistnonnull()

Function syntax:

list(T) aggregatelistnonnull(T e)

Returns a list containing one element for each expression, e, in the Aggregate context. The list element type, T, of the returned list is the same as the data type of e. Unlike aggregatelist(), this function excludes nulls from the constructed list.

Back to Top ^  Back to Aggregate Overview ^

avg_list()

Function syntax:

list(double) avg_list(list(T) L)

Returns a list of doubles containing the element-wise average of the elements of the argument list field in the Aggregate context. Argument lists must be of a numeric type coercible to double (int, long, or double). For example, an Aggregate operator might specify avg_list(intlistfield) as an aggregation expression. With the following input tuples:

{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6], otherfield: y}

When the Aggregate operator emits, the result is:

[2.5, 3.5, 4.5]

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the average.

See also the avg_list() function for list elements.

Back to Top ^  Back to Aggregate Overview ^

concat()

Function syntax:

list(T) concat(list(T) L)

Returns a list containing the elements of the field of type list, L, in the Aggregate context. The list element type, T, of the returned list is the same as the element type of the argument list.

See also the concat() function for list elements.

Back to Top ^  Back to Aggregate Overview ^

count_list()

Function syntax:

list(int) count_list(list(T) L)

Returns a list containing the element-wise count of the elements of the field of type list, L, in the Aggregate context, returning a list of type int. For example, an Aggregate operator might specify count_list(intlistfield) as an aggregation expression. With the following input tuples:

{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6, 7, 8], otherfield: y}

When the Aggregate operator emits, the result is:

[2, 2, 2, 1, 1]

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the count.

See also the simple function count_list().

Back to Top ^

join(), joinlist()

Function syntax:

string join(string separator, string S)
string joinlist(string separator, string S)

Returns a string formed by concatenating string field S in the Aggregate context. The separator string is inserted between each string. If the separator string is empty, no separators are inserted. If the separator string is null, join() returns null.

You can use joinlist() as an alias for join() in all contexts. In LiveView contexts, which has the reserved word JOIN, you must either use the joinlist() alias, or use escaped identifier syntax.

Also see the simple function join().

Back to Top ^  Back to Aggregate Overview ^

rank()

Function syntax:

list(double) rank(double field)

Returns a list containing the rank ordering of values of a field of type double in the Aggregate context, in ascending order. The input type must be double or be cast to double. If the input field contains duplicated values, the rank returned for each of them is all the average of their ranks. For example, if a field called priority contained the values 5, 1, 6, 2 in the aggregate window, then rank(priority) returns [3, 1, 4, 2]. If the values for priority instead were 5, 2, 6, 2, then rank(priority) returns [3, 1.5, 4, 1.5].

Back to Top ^  Back to Aggregate Overview ^

sum_list()

Function syntax:

list(T) sum_list(list(T) L)

Returns a list containing the element-wise sum of the elements of the field of type list, L, in the Aggregate context. The list element type, T, of the returned list is the same as the element type of the argument list. For example, an Aggregate operator might specify sum_list(intlistfield) as an aggregation expression. With the following input tuples:

{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6], otherfield: y}

When the Aggregate operator emits, the result is:

[5, 7, 9]

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

See also the sum_list() function for list elements.

Back to Top ^  Back to Aggregate Overview ^

unique()

Function syntax:

list(T) unique(T field, int numuniq)

Returns a list containing the unique values in the specified field in the Aggregate context. The data type of the returned list is the same as that of the field.

The unique() function maintains an exact list of unique values until numuniq is reached, and does not report changes as the number of values added exceeds numuniq. If a value in the reported unique list is removed, it is replaced by some other unique value, if one exists. If there are more than numuniq values, there is no guarantee which numuniq values are reported.

Also see the simple function unique().

Back to Top ^  Back to Aggregate Overview ^

Aggregate Functions: External Functions

This category includes the following functions:

calljava()

Function syntax:

T calljava(string class [, arg1, ..., argn])

Use calljava() to run a custom Java aggregate function directly from a StreamBase operator that uses an expression. The return type, T, of calljava() is the same as the return type of the called function. Custom Java aggregate functions are functions you build with the StreamBase Java Client library, specifically those that extend the AggregateWindow class. You can use the aggregate form of calljava() in any aggregate expression and in the Query Table Group Options tab. (To use calljava() in simple expressions, refer to the simple calljava().)

As an alternative to using calljava(), you can define an alias for your custom functions. See Custom Functions in Java.

The calljava() function distinguishes simple from aggregate functions by the number of arguments. For custom simple functions, you must specify both class and method names. For custom aggregate functions, you specify only the class name, because these functions must inherit from AggregateWindow, which has an init() method that is automatically run.

To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide.

See Return Types and Argument Types for a discussion of using Java primitives or Java objects in the function you write to be called with calljava().

Back to Top ^  Back to Aggregate Overview ^

Aggregate Functions: Statistical Calculations

This category includes the following functions:

alpha()

Function syntax:

double alpha(double index, double price, double dividend)

Returns a double that may indicate the part of a stock's movement that is independent of the index's movement.

Examples of stocks increasing in alpha could be those with takeover rumors, under strong syndicate manipulation, or having strong expectations of good results; that is, factors that make them increasingly move independently of the index.

The beta of a stock is defined as the slope of a regression line in a scatter graph of paired data points representing percentage changes of an index (return of an index) and the corresponding change (return of stock) in the price of a stock. The StreamBase beta() function also includes the stock dividend in the calculation of the stock return.

The alpha is the point where this regression line cuts the Y axis. To reiterate, a stock's beta can be described as that part of a stock's movement that is influenced by the index, while a stock's alpha can be regarded as that part of a stock's movement that is independent of the index's movement.

The alpha function's arguments are:

  1. index: the end-of-period market index (double)

  2. price: the end-of-period stock price (double)

  3. dividend: the stock dividend of the period (double)

The stock price must have already figured in any stock split (or reverse split). The alpha is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, first normalize the yield (return), and then use the intercept function to calculate alpha.

Back to Top ^  Back to Aggregate Overview ^

avg()

Function syntax:

double    avg(double field)
double    avg(int field)
double    avg(long field)
timestamp avg(timestamp field)

Returns the average value of a numeric field, field, in all tuples in the Aggregate context. The field's type can be a double, int, long, or timestamp.

For a returned average of timestamp values to make sense, the aggregate field must contain all interval timestamps or all absolute timestamps, but not both.

See also the avg() function for lists.

Back to Top ^  Back to Aggregate Overview ^

avgif()

Function syntax:

int avgif(T fieldname, bool predicate)

Returns the average of the values for which the specified numeric field is not null and for which the predicate evaluates to true. The predicate expression can be any valid expression on any field, and is not restricted to values from the specified field.

For example: avgif(input.income, input.age < 35)

Back to Top ^  Back to Aggregate Overview ^

andall()

Function syntax:

bool andall(bool f)

The argument f represents a field of type boolean in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function evaluates the values in the specified field, and returns the results of a logical AND operation on all values. For example, suppose field BoolField contains the following values in successive rows of an Aggregate window or Query Table column: true, true, false, true, false. In that case, andall(BoolField) returns false. Use andall() as an aggregate truth detector function.

Null arguments have special handling. A null argument does not change the result of the evaluation if a false value is among the arguments. Thus, if BoolField contains null, true, false, null, true, then andall(BoolField) still returns false. However, if one or more arguments is null while all other arguments are true, the function returns null. So if BoolField contains null, true, true, null, true, then andall(BoolField) returns null.

The andall() function follows the logic expressed in this statement:

   if any argument is false, return false
   else if any argument is null, return null
   else return true

See also the orall() aggregate function and the simple versions of andall() and orall().

Back to Top ^

beta()

Function syntax:

double beta(double index, double price, double dividend)

Returns a double that may indicate the tendency of a security's returns to respond to swings in the market. A beta of 1 indicates that the security's price will move with the market. A beta of less than 1 means that the security will be less volatile than the market. A beta value that is greater than 1 indicates that the security's price will be more volatile than the market.

For related information, see the description of the alpha function.

Beta is a measure of a security's or portfolio's volatility, or systematic risk, in comparison to the market as a whole. For example, if a stock's beta is 1.2, it is theoretically 20% more volatile than the market.

The function's arguments are:

  1. index: the end-of-period market index (double)

  2. price: the end-of-period stock price (double)

  3. dividend: the stock dividend of the period (double)

The stock price must have already figured in any stock split (or reverse split). The beta is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, the yield should be normalized first and then use the slope function to calculate beta.

Back to Top ^  Back to Aggregate Overview ^

correlation_coefficient()

Function syntax:

double correlation_coefficient(double price, double index)
double correlation_coefficient(int price, int index)
list(double) correlation_coefficient(string name, int index, 
    double field1, double field2, ...)

The first two syntaxes return the Pearson correlation coefficient for two fields, price and index, for all tuples in the Aggregate context. This function takes two input arguments (both can be an int or a double) and returns a double. The correlations coefficient is a measure that determines the degree to which two variable's movements are associated. The correlation coefficient varies from -1.0 to 1.0. The value -1.0 indicates perfect negative correlation, and 1.0 indicates perfect positive correlation. 0 means no correlation.

Note

  • The correlation_coefficient of 0 items is null.

  • The correlation_coefficient of 1 item is 1.0.

  • The correlation_coefficient of N identical items is 1.0.

The third syntax, list(double) correlation_coefficient(string name, int index, double field1, double field2 ...), returns a list of correlation coefficients, containing as many members as there are field arguments (there must be at least two). The output list of doubles is in the same order as the field arguments. Given n field names, the index argument specifies which of those n fields holds the primary variable (for which field the returned correlation coefficient is 1.0). The value of index must not change over the life of the operator.

The name argument specifies the type of correlation coefficient to compute, taken from this list:

Click any name above to see the description of the statistic. You can abbreviate and approximate the spellings you supply for the name, which must not change over the life of the operator. You can call any of these functions directly by its full name, as well as using the above syntax. However, when you call these functions directly, only two variables can be correlated per call and the output is a scalar double.

If any input field contains null values, the function returns null for its correlation coefficient.

Back to Top ^  Back to Aggregate Overview ^

correlation_coefficientp()

Function syntax:

double correlation_coefficientp(double price, double index)
double correlation_coefficientp(int price, int index)

Returns the correlation coefficient for two fields, price and index, for all tuples in the Aggregate context. The correlation_coefficientp function is similar to the correlation_coefficient function, but use correlation_coefficientp when the data provided is the entire population, while the correlation_coefficient function is used for a random sample. The correlation_coefficientp function is calculated using the biased (or n) method. The correlation_coefficient function is calculated using the unbiased (or n-1) method.

This function takes two input arguments (both can be an int or a double) and returns a double. The correlations coefficient is a measure that determines the degree to which two variable's movements are associated. The correlation coefficient varies from -1.0 to 1.0. The value -1.0 indicates perfect negative correlation, and 1.0 indicates perfect positive correlation. 0 means no correlation.

Note

  • The correlation_coefficientp of 0 items is null.

  • The correlation_coefficientp of 1 item is 1.0.

  • The correlation_coefficientp of N identical items is 1.0.

Back to Top ^  Back to Aggregate Overview ^

count(), countlong()

Function syntax:

int  count([T expr])
long countlong([T expr])

Returns the number of tuples in the Aggregate context. If this function is called with an argument, then tuples for which the argument evaluates to null do not contribute to the count. The argument can be of any data type, T, but its actual value is ignored, other than the fact that it is non-null. If no argument is specified, then all tuples are included in the count.

countlong() works exactly as count() does, except that it returns a long value rather than an int.

To clarify the operation of count(), consider the following sequence of values for x:

x {1,2,3,4,5,6,7,8,9,10}

Calling count(x) returns 10. There are ten items in the list.

Calling count(x > 5) also returns 10. This is because it is counting the sequence {f,f,f,f,f,t,t,t,t,t} (five false, five true). All elements in this sequence are non-null, so all are counted.

Calling count(if x > 5 then 0 else null) returns 5, because it is counting the sequence {null, null, null, null, null, 0, 0, 0, 0, 0} (five nulls, five zeros). The five non-null elements are counted. Any non-null value other than 0 would also count in this expression.

The following example clarifies the count() function's behavior with query tables.

Let's say we have a table named flattop with the following schema and contents:

Field Name Data Type Contents of Three Rows
alpha int 2, 4, 6
beta long 3.0, 5.0

The expression select count(flattop.*) from in, tablet ... returns

3, 2 

which is the number of non-null values in the alpha field, plus the number of non-null values in the beta field.

If you want a count of the number of items in the window, use an expression like the following: select count() as c from in, flattop ...

Back to Top ^  Back to Aggregate Overview ^

count_distinct()

Function syntax:

int count_distinct(T arg1 [, T arg2 ...]])

Returns the number of tuples in the Aggregate context with specified values for the parameters. This function must be called with at least one argument of any data type. Multiple arguments can be of different types. Use commas to separate subsequent arguments. For the purpose of this function, null values of a field are considered equivalent to each other.

See also the count_distinct_elements() function for lists.

Back to Top ^  Back to Aggregate Overview ^

countif()

Function syntax:

int countif(bool predicate)
int countif(T fieldname, bool predicate)

With one argument, returns the number of rows for which the specified predicate argument evaluates to true. When used with two arguments, it returns the number of rows for which the specified field is not null and the predicate evaluates to true.

Back to Top ^  Back to Aggregate Overview ^

covariance()

Function syntax:

double covariance(double price, double index)
double covariance(int price, int index)
double covariance(long price, long index)

Returns the covariance for two fields, price and index, for all tuples in the Aggregate context. This function takes two input arguments (both can be a double, int, or long) and returns a double. In financial applications, covariance can be used to measure the degree to which returns on two risky assets move in tandem. A positive covariance means that asset returns move together. A negative covariance means returns vary inversely.

In the arguments, the price field is a stock price, and the index is the industry segment index.

One method of calculating covariance is to look at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.

For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.

Back to Top ^  Back to Aggregate Overview ^

covariancep()

Function syntax:

double covariancep(double price, double index)
double covariancep(int price, int index)
double covariancep(long price, long index)

Returns the covariance for two fields, price and index, for all tuples in the Aggregate context. The covariancep function is similar to the covariance function, but use covariancep when the data provided is the entire population, while the covariance function is used for a random sample. The covariancep function is calculated using the biased (or n) method. The covariance function is calculated using the unbiased (or n-1) method.

This function takes two input arguments (both can be a double, int, or long) and returns a double. In financial applications, covariance can be used to measure the degree to which returns on two risky assets move in tandem. A positive covariance means that asset returns move together. A negative covariance means returns vary inversely.

In the arguments, the price field is a stock price, and the index is the industry segment index.

One method of calculating covariance is by looking at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.

For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.

Back to Top ^  Back to Aggregate Overview ^

exp_moving_avg()

Function syntax:

double exp_moving_avg(double value, int count [, double weight])
double exp_moving_avg(int value, int count [, double weight])
double exp_moving_avg(long value, int count [, double weight])

Returns a weighted average where a percentage of each successive value is used with a percentage of the existing average to construct the new average. The effect of any one value on the average diminishes exponentially as new values are added.

The function takes these arguments:

value

The value (double, int, or long) to be averaged.

count

The integer number of periods over which a simple average is used prior to starting the EMA.

weight

An optional weighting value (a double between 0 and 1) that is to be applied to both the current period's value (CPV) and prior period's average (PPA) when the Exponential Moving Average (EMA) is calculated.

If the weight is not between 0 and 1 or is not specified, it is calculated from the following formula:

weight = 2 / (count + 1)

This function returns a double. It is most commonly used with an unlimited aggregate window size (although it does not have to be). When configuring an Aggregate operator for an unlimited window, do not configure the Advance and Size fields in StreamBase Studio, which results in unlimited window size. Instead, set the Emit field to 1 so that the aggregate emits for every tuple.

The exp_moving_avg function is calculated using the following formula:

EMA = (CPV * weight) + (PPA * (1 - weight))
  • EMA: exponential moving average so far

  • PPA: exponential moving average after the previous tuple

  • CPV: value of current tuple

Initially there is no PPA. A simple moving average (SMA) is computed instead over as many tuples as specified by the second parameter. When the SMA is computed, a Null value is emitted for each input tuple until the Aggregate operator receives enough tuples to compute the SMA. Following this, the EMA is calculated.

Back to Top ^  Back to Aggregate Overview ^

goodman_kruskal_gamma()

Function syntax:

double goodman_kruskal_gamma(double field1, double field2)

Returns the Goodman-Kruskal Gamma correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits. The Goodman-Kruskal Gamma statistic is a rank correlation of concordant pairs. If any field values are null, the returned value is null.

To perform multivariate Goodman-Kruskal Gamma correlations, use the aggregate correlation_coefficient function.

Back to Top ^

intercept()

Function syntax:

double intercept(double x, double y)
double intercept(int x, int y)

For the data set of fields x and y for all tuples in the Aggregate context, a best-fit linear regression line is calculated, then the function returns the value where the line intersects the y axis. The method used to find the line that best fits a group of points is called least squares (or linear least squares). This function takes two input arguments of the same type (both can be an int or a double) and returns a double.

This function returns zero if the Aggregate window contains one row or duplicate rows containing the same values.

Back to Top ^  Back to Aggregate Overview ^

kendall_tau()

Function syntax:

double kendall_tau(double field1, double field2)

Returns the normalized Kendall Tau distance metric between values of two fields in the Aggregate context as a double value when the window emits. The Kendall Tau distance statistic counts the number of pairwise disagreements between the ranks of corresponding items in two fields. Larger values indicate greater disagreement in rank ordering, such that a return value of 0 means complete agreement and 1.0 means complete disagreement.

To perform multivariate Kendall Tau correlations, use the aggregate correlation_coefficient function.

Back to Top ^

max()

Function syntax:

T max(T f)

The argument f represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function returns the maximum non-null value using the greater-than relational operation appropriate for the data type of f. The data type, T, of the returned value is the same as the argument.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate function maxn() below, and the maxelement() function for lists. In an aggregate context, this aggregate function max(list) returns the maximum list in the aggregate window. Use the simple function max(list, , list) to determine the larger of two or more lists. By contrast, use maxelement() to compare the element values of a single list.

Back to Top ^  Back to Aggregate Overview ^

maxn()

Function syntax:

list(T) maxn(int count, T f)

The argument f represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate context.

Returns a list with count elements containing the count number of maximum non-null values for field f in the Aggregate context. Accepts fields of all data types and returns a list whose elements are the same as the type of field f. The value of count must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For count=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

Maximum values are determined with the greater-than relational operation appropriate for the data type of f. StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate version of max(), and the maxelement() function for lists.

Back to Top ^  Back to Aggregate Overview ^

median()

Function syntax:

double median(double x)
double median(int x)
double median(long x)

Returns the median value of x for this Aggregate context. If the window has an even number of elements, returns the average of the middle two.

See also the median() function for lists.

Back to Top ^  Back to Aggregate Overview ^

min()

Function syntax:

T min(T f)

The argument f represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function returns the minimum non-null value using the less-than relational operation appropriate for the data type of f. The data type, T, of the returned value is the same as the argument.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate function minn() below, and the minelement() function for lists. In an aggregate context, this aggregate function min(list) returns the minimum list in the aggregate window. Use the simple function min(list, , list) to determine the smaller of two or more lists. By contrast, use minelement() to compare the element values of a single list.

Back to Top ^  Back to Aggregate Overview ^

minn()

Function syntax:

list(T) minn(int count, T f)

The argument f represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate context.

Returns a list with count elements containing the count number of minimum non-null values for field f in the Aggregate context. Accepts fields of all data types and returns a list whose elements are the same as the type of field f. The value of count must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For count=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

Minimum values are determined with the less-than relational operation appropriate for the data type of f. StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate version of min(), and the minelement() function for lists.

Back to Top ^  Back to Aggregate Overview ^

orall()

Function syntax:

bool orall(bool f)

The argument f represents a field of type boolean in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function evaluates the values in the specified field, and returns the results of a logical OR operation on all values. For example, suppose field BoolField contains the following values in successive rows of an Aggregate window or Query Table column: true, true, false, true, false. In that case, orall(BoolField) returns true. Use orall() as an aggregate falsity detector function.

Null arguments have special handling. A null argument does not change the result of the evaluation if a true value is among the arguments. Thus, if BoolField contains null, false, true, null, false, then orall(BoolField) still returns true. However, if one or more arguments is null while all other arguments are true, the function returns null. So if BoolField contains null, false, false, null, false, then orall(BoolField) returns null.

The orall() function follows the logic expressed in this statement:

   if any argument is true, return true
   else if any argument is null, return null
   else return false

See also the andall() aggregate function and the simple versions of andall() and orall().

Back to Top ^

pearson()

Function syntax:

double pearson(double field1, double field2)

Returns the Pearson product-moment correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits that indicates the degree of linear dependence between two variables. The Pearson statistic is defined as the covariance of the two variables divided by the product of their standard deviations. The term product-moment refers to the mean (the first moment about the origin) of the product of the mean-adjusted random variables. A return value of -1 indicates a perfect negative linear correlation. A value of 0 indicates no correlation (random variates). A value of 1 indicates perfect positive linear correlation. If any field values are null, the returned value is null. If the variance of either input variable is zero, the function returns NaN.

To perform multivariate Pearson correlations, use the aggregate correlation_coefficient function or the aggregate correlation_coefficientp function.

Back to Top ^

percentile()

Function syntax:

T percentile(T field, double pct)

Returns the value of the specified field with percentile rank equal to the specified pct value. If there is no value in the Aggregate context at exactly pct percentile, then the value with the greatest percentile less than pct is returned. The data type of the specified field (and of the corresponding result) must be one of the comparable types in the expression language, as listed on the Data Types page.

Back to Top ^

product()

Function syntax:

double product(double f)
int    product(int f)
long   product(long f)

Returns the multiplication product computed for field f for all tuples in the Aggregate context. Supports data types double, int, and long.

See also the product() function for lists.

Back to Top ^  Back to Aggregate Overview ^

slope()

Function syntax:

double slope(double x, double y)
double slope(int x, int y)

For the data set of fields x and y for all tuples in the Aggregate context, a best-fit linear regression line is calculated, and then the function returns the line's slope. The method that is used to find the line that best fits a group of points is called least squares (or linear least squares). The slope function takes two input arguments of the same type (both can be an int or a double) and returns a double.

This function returns zero if the Aggregate window contains one row or duplicate rows containing the same values.

The slope() function returns a unitless double value that does not represent an angle, and is therefore in neither degrees nor radians. The internal calculation takes the tan(theta), where theta is the angle of incline of the best linear fit to the data. This calculation returns the same value whether theta is internally expressed in degrees or radians.

Back to Top ^  Back to Aggregate Overview ^

spearmans_rank()

Function syntax:

double spearmans_rank(double field1, double field2)

Returns the Spearman's Rank correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits. Spearman's statistic is defined as the Pearson correlation coefficient between the ranked variables. A return value of -1 indicates a perfect negative correlation (reverse ordering). A value of 0 indicates no concordance of ranks. A value of 1 indicates identical rank orderings. If any field values are null, the returned value is null.

To perform multivariate Spearman's Rank correlations, use the aggregate correlation_coefficient function.

Back to Top ^

stdev()

Function syntax:

double    stdev(double f)
double    stdev(int f)
double    stdev(long f)
timestamp stdev(timestamp f)

Returns the standard deviation for field f for all tuples in the Aggregate context. The function takes one input argument. If the input argument is an int, double, or long, it returns a double. If the input argument is a timestamp, it returns an interval timestamp. For timestamp values, the aggregate field should contain all interval timestamps or all absolute timestamps, but not both. For windows with size < 2, returns null.

Standard deviation is a measure of the dispersion of a set of data from its mean. The more spread apart the data is, the higher the deviation. For example, in financial applications, standard deviation could be applied to the annual rate of return of an investment to measure the investment's volatility (risk). A volatile stock would have a high standard deviation. In mutual funds, the standard deviation indicates how much the return on the fund is deviating from the expected normal returns.

See also the stdev() function for lists.

Back to Top ^  Back to Aggregate Overview ^

stdevp()

Function syntax:

double    stdevp(double f)
double    stdevp(int f)
double    stdevp(long f)
timestamp stdevp(timestamp f)

Returns the standard deviation for field f for all tuples in the Aggregate context. The function takes one input argument. If the input argument is an int, double, or long, it returns a double. If the input argument is a timestamp, it returns an interval timestamp. For timestamp values, the aggregate field should contain all interval timestamps or all absolute timestamps, but not both. For windows with size < 2, returns 0.

The stdevp() function is similar to the stdev() function: with stdevp() the data provided is the entire population, while with stdev(), the data provided is treated as a random sample. The stdevp() function is calculated using the biased (or n) method. The stdev() function is calculated using the unbiased (or n-1) method.

See also the stdevp() function for lists.

Back to Top ^  Back to Aggregate Overview ^

sum()

Function syntax:

double    sum(double f)
int       sum(int f)
long      sum(long f)
timestamp sum(timestamp f)

Returns the sum of field f computed for all tuples in the Aggregate context. The field can be an double, int, long, or timestamp.

When the field type is timestamp, summing members of the field follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum an aggregate of two or more absolute timestamps. However, you can sum an aggregate field composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.

See also the sum() function for lists.

Back to Top ^  Back to Aggregate Overview ^

sumif()

Function syntax:

int sumif(T fieldname, bool predicate)

Returns the sum of the values for which the specified numeric field is not null and for which the predicate evaluates to true. The predicate expression can be any valid expression on any field, is not restricted to values from the specified field.

For example: sumif(input.income, input.age < 35)

Back to Top ^  Back to Aggregate Overview ^

variance()

Function syntax:

double variance(double f)
double variance(int f)
double variance(long f)

Returns the variance for field f for all tuples in the Aggregate context. The function takes one input argument (an int, double, or long) and returns a double. For windows with size < 2, returns null.

Variance is a measure of the dispersion of a set of data points around their mean value. It is a mathematical expectation of the average squared deviations from the mean. Variance measures the variability (volatility) from an average. Volatility is a measure of risk. So for example, this statistic can help determine the risk an investor might take on when purchasing a specific security.

See also the variance() function for lists.

Back to Top ^  Back to Aggregate Overview ^

variancep()

Function syntax:

double variancep(double f)
double variancep(int f)
double variancep(long f)

Returns the variance for field f for all tuples in the Aggregate context. The function takes one input argument (an int, double, or long) and returns a double. For windows with size < 2, returns 0.

The variancep() function is similar to the variance() function: with variancep() the data provided is the entire population, while with variance(), the data provided is treated as a random sample. The variancep() function is calculated using the biased (or n) method. The variance() function is calculated using the unbiased (or n-1) method.

See also the variancep() function for lists.

Back to Top ^  Back to Aggregate Overview ^

vwap()

Function syntax:

double vwap(double price, double volume)

Returns the volume-weighted average-price value from (typically) the price and volume values, for all tuples in the Aggregate context. Both arguments are doubles.

Back to Top ^  Back to Aggregate Overview ^

withmax()

Function syntax:

T1 withmax(T compare_expr, T1 result_expr)

Returns the result_expr for this Aggregate context that has the maximum corresponding compare_expr using the greater-than relational operator. The data type, T1, of the returned value is the same as the result_expr. The data type of the compare_expr can be any data type, not necessarily the same as the result_expr's type.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

Back to Top ^  Back to Aggregate Overview ^

withmin()

Function syntax:

T1 withmin(T compare_expr, T1 result_expr)

Returns the result_expr for this Aggregate operator's window that has the minimum corresponding compare_expr using the less-than relational operator. The data type, T1, of the returned value is the same as the result_expr. The data type of the compare_expr can be any data type, not necessarily the same as the result_expr's type.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

Back to Top ^  Back to Aggregate Overview ^

Aggregate Functions: Windowing

This category includes the following functions:

Function Use in Aggregate Operator? Use in Query Read on Query Tables?
closeval() yes no
firstn() yes yes
firstnonnullval() yes yes
firstval() yes yes
getOldestWindowID() yes no
getWindowID() yes no
isOldestWindow() yes no
lag() yes yes
lastn() yes yes
lastnonnullval() yes yes
lastval() yes yes
openval() yes no

All windowing functions work on columns of data in Aggregate windows, and most windowing functions also work on table columns returned by Query Read operations on Query Tables.

Note

When used with a Query operator, these functions are specified in the Output Settings tab of the Query operator's Properties view, and operate on the initial selection of rows from the Query Table as specified in the Query Settings tab for a Query-Read operation. Remember that the order of rows returned from the Query Table depends on the Read predicate expression and on the sort order, if specified.

For example, if the Query-Read operation specifies Read All Rows on a table whose primary index is a string field, and you specify running the firstn() or lastn() functions on that selection, those functions return fields from the first n rows (or last n rows) in alphabetical order by primary index.

The three functions getWindowID(), isOldestWindow(), and getOldestWindowID() are pseudo-aggregate functions in that they only operate in the context of an Aggregate operator, but they do not query or evaluate the tuples flowing through the Aggregate operator. Instead, they report on the state of the various windows within the Aggregate.

closeval()

Function syntax:

double    closeval([string dimension])
int       closeval([string dimension])
long      closeval([string dimension])
timestamp closeval([string dimension])

The closeval() function behaves the same as openval() except that it returns the upper limit of the specified dimension when the window is closed. Like openval(), this function works in Aggregate operators, and not in Query operators.

The value returned by closeval() might differ from any of the actual tuple values included in the window. For example, consider an Aggregate operator with a dimension CountDim, where the windows created by that dimension include a window with a range of 5 to 10. If the actual tuple values in the window are 6, 7, and 9, then closeval(CountDim) returns 10 for that window, not 9.

If the dimension name is omitted and the aggregate has only one dimension, then closeval() uses that dimension. If the dimension name is omitted and the aggregate has more than one dimension, the result is a typecheck error.

Back to Top ^  Back to Aggregate Overview ^

firstn()

Function syntax:

list(T) firstn(int count, T f)

Returns a list with count elements containing the first count values for field f in an Aggregate operator's window or in a Query Table column. Accepts fields of all data types and returns a list whose elements are the same type as the type of field f. Field f can contain null values, and such fields are returned in the list.

The value of count must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For count=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

firstnonnullval()

Function syntax:

T firstnonnullval(T f)

Returns the first non-null value for field f in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

firstval()

Function syntax:

T firstval(T f)

Returns the first value for field f in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

getOldestWindowID()

Function syntax:

int getOldestWindowID()

Returns an integer that represents the ID of the oldest window still open of the current aggregation group.

This function works only in conjunction with the Aggregate operator.

Back to Top ^  Back to Aggregate Overview ^

getWindowID()

Function syntax:

int getWindowID()

Returns an integer representing the ID of the current window. All windows of each Aggregate operator have a unique ID. The larger the value of the window ID, the younger the window is.

This function works only in conjunction with the Aggregate operator.

Back to Top ^  Back to Aggregate Overview ^

isOldestWindow()

Function syntax:

bool isOldestWindow()

Returns true if the window in which this function is called is the oldest of the current aggregation group.

This function works only in conjunction with the Aggregate operator.

Back to Top ^  Back to Aggregate Overview ^

lag()

Function syntax:

T lag(T f)

Returns the next-to-last value for field f in the Aggregate operator's window or in a table column. If the window or table column has only one row or zero rows, lag() returns null. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Example usage in an aggregate context: avg(fieldname/lag(fieldname)) where fieldname is a numeric value such as a price field.

Back to Top ^  Back to Aggregate Overview ^

lastn()

Function syntax:

list(T) lastn(int count, T f)

Returns a list with count elements containing the last count values for field f in an Aggregate operator's window or in a Query Table column. Accepts fields of all data types and returns a list whose elements are the same type as the type of field f. Field f can contain null values, and such fields are returned in the list.

The value of count must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For count=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

lastnonnullval()

Function syntax:

T lastnonnullval(T f)

Returns the last non-null value for field f in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

lastval()

Function syntax:

T lastval(T f)

Returns the last value for field f in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Back to Top ^  Back to Aggregate Overview ^

openval()

Function syntax:

double    openval([string dimension])
int       openval([string dimension])
long      openval([string dimension])
timestamp openval([string dimension])

Can only be used in an Aggregate operator, not in a Query Operator. Returns the lower limit of the specified dimension for the current window. The dimension name is passed as a quoted string, and must match the name of a dimension in the Aggregate operator. This function works even if the dimension specified is not the one that actually opened the current window. In that case, the value returned is the value of the dimension at the time the window was created.

The value returned by openval might differ from any of the actual tuple values included in the window. For example, consider an Aggregate operator with a dimension CountDim, where the windows created by that dimension include a window with a range of 5 to 10. If the actual tuple values in the window are 6, 7, and 9, then openval(CountDim) returns 5 for that window, not 6.

If the dimension name is omitted and the aggregate has only one dimension, then openval() uses that dimension. If the dimension name is omitted and the aggregate has more than one dimension, the result is a typecheck error.

Back to Top ^  Back to Aggregate Overview ^