# StreamBase Expression Language Functions

This topic describes the syntax and use of each function in the StreamBase expression language. See StreamBase Expression Language Features for information on using the expression language in general. See StreamBase Data Types for usage information on StreamBase data types.

 Alphabetical Index of StreamBase Functions Simple Functions: NaN Simple Functions: Type Conversions Simple Functions Overview Simple Functions: Runtime Simple Functions: Utilities Simple Functions: BSON Simple Functions: Statistical Calculations Aggregate Functions Overview Simple Functions: Errors Simple Functions: Strings Aggregate Functions: Aggregate to List Simple Functions: External Functions Simple Functions: System Aggregate Functions: External Functions Simple Functions: Financial Simple Functions: Timestamp Overview Aggregate Functions: Statistical Calculations Simple Functions: Internet Timestamp Functions: Absolute Timestamps Aggregate Functions: Windowing Simple Functions: Lists Timestamp Functions: Interval Timestamps Simple Functions: Math Timestamp Functions: Timestamp Fields

## Alphabetical Index of StreamBase Functions

The following table is a complete, alphabetized index into the functions provided in the StreamBase expression language, with links to the description for each function.

Function Name and Link to Section Category Simple or Aggregate Function
abs Math Simple function
acos Math Simple function
aggregatelist Aggregate to List Aggregate function
aggregatelistnonnull Aggregate to List Aggregate function
alpha Statistical calculations Aggregate function
andall (simple) Math Simple function
andall (aggregate) Statistical calculations Aggregate function
append Lists Simple function
asin Math Simple function
atan Math Simple function
atan2 Math Simple function
avg (simple) Lists Simple function
avg (aggregate) Statistical calculations Aggregate function
avg_list (simple) Lists Simple function
avg_list (aggregate) Statistical calculations Aggregate function
avgif (aggregate) Statistical calculations Aggregate function
beta Statistical calculations Aggregate function
bitand Math Simple function
bitnot Math Simple function
bitor Math Simple function
bitxor Math Simple function
black_scholes Financial Simple function
blob Type conversions Simple function
bool Type conversions Simple function
bsontojson BSON Simple function
callcpp (simple) External Functions (simple) Simple function
callcpp (aggregate) External Functions (aggregate) Aggregate function
calljava (simple) External Functions (simple) Simple function
calljava (aggregate) External Functions (aggregate) Aggregate function
catchexception Errors Simple function
cbrt Math Simple function
ceil Math Simple function
choose Math Simple function
closeval Windowing Aggregate function
coalesce Utilities Simple function
coalesce_tuples Utilities Simple function
compile Utilities Simple function
compound_interest Financial Simple function
concat (simple) Lists Simple function
concat (aggregate) Aggregate to List Aggregate function
contains (list item) Lists Simple function
contains (strings) Strings Simple function
correlation_coefficient Statistical calculations Simple function
correlation_coefficient Statistical calculations Aggregate function
correlation_coefficientp Statistical calculations Aggregate function
cos Math Simple function
cosh Math Simple function
count Statistical calculations Aggregate function
countlong Statistical calculations Aggregate function
count_distinct Statistical calculations Aggregate function
count_distinct_elements Lists Simple function
count_list (simple) Lists Simple function
count_list (aggregate) Statistical calculations Aggregate function
countif (aggregate) Statistical calculations Aggregate function
covariance Statistical calculations Aggregate function
covariancep Statistical calculations Aggregate function
date Timestamps Simple function
days Timestamps Simple function
dotproduct Lists Simple function
double Type conversions Simple function
emptylist Lists Simple function
endswith Strings Simple function
epoch Timestamps Simple function
error Errors Simple function
eval Utilities Simple function
exp Math Simple function
exp_moving_avg Statistical calculations Aggregate function
expm1 Math Simple function
factorial Math Simple function
filterlist Lists Simple function
filternull Lists Simple function
findbson BSON Simple function
firstelement Lists Simple function
firstn Windowing Aggregate function
firstnonnullval Windowing Aggregate function
firstval Windowing Aggregate function
floor Math Simple function
foldleft Lists Simple function
foldright Lists Simple function
format Strings Simple function
format_time Timestamps Simple function
fromCodePoint Utilities Simple function
from_gmtime Timestamps Simple function
from_localtime Timestamps Simple function
from_unixtime Timestamps Simple function
frombson BSON Simple function
gaussianrandom Math Simple function
getAllHostIPs Runtime Simple function
get_boolean_conf_param Runtime Simple function
get_conf_param Runtime Simple function
get_double_conf_param Runtime Simple function
get_int_conf_param Runtime Simple function
get_long_conf_param Runtime Simple function
get_conf_param Runtime Simple function
get_day_of_month Timestamps Simple function
get_day_of_week Timestamps Simple function
get_day_of_year() Timestamps Simple function
get_hour Timestamps Simple function
get_millisecond Timestamps Simple function
get_minute Timestamps Simple function
get_month Timestamps Simple function
getOldestWindowID() Windowing Aggregate function
get_second Timestamps Simple function
getWindowID() Windowing Aggregate function
get_year Timestamps Simple function
getClientIP Runtime Simple function
getContainer Runtime Simple function
getHostName Runtime Simple function
getNodeName Runtime Simple function
getParallelRoot Runtime Simple function
getPath Runtime Simple function
getServerURI Runtime Simple function
getTableSize Runtime Simple function
getTargetTime() System Simple function
goodman_kruskal_gamma Statistical calculations Simple function
goodman_kruskal_gamma Statistical calculations Aggregate function
has_intersection Lists Simple function
hash Utilities Simple function
hours Timestamps Simple function
indexof (for lists) Lists Simple function
indexof (for strings) Strings Simple function
inet_aton Internet Simple function
inet_ntoa Internet Simple function
insertelement Lists Simple function
int Type conversions Simple function
intercept Statistical calculations Aggregate function
intersect Lists Simple function
interval Timestamps Simple function
isempty Strings Simple function
isinterval Timestamps Simple function
isnan NaN (not a number) Simple function
isnull Utilities Simple function
isOldestWindow() Windowing Aggregate function
join, joinlist Lists Simple function
join, joinlist Aggregate to List Aggregate function
joinbson BSON Simple function
jsontobson BSON Simple function
jsonpath Utilities Simple function
kendall_tau Statistical calculations Simple function
kendall_tau Statistical calculations Aggregate function
lag Windowing Aggregate function
lastelement Lists Simple function
lastindexof (for lists) Lists Simple function
lastindexof (for strings) Strings Simple function
lastn Windowing Aggregate function
lastnonnullval Windowing Aggregate function
lastval Windowing Aggregate function
lenbson BSON Simple function
length Utilities Simple function
length (for lists) Lists Simple function
length (for strings) Strings Simple function
list Type conversions Simple function
listbson BSON Simple function
ln Math Simple function
log10 Math Simple function
log1p Math Simple function
long Type conversions Simple function
lower Strings Simple function
lshift Math Simple function
ltrim Strings Simple function
maplist Lists Simple function
maplist_loose Lists Simple function
max (simple) Math Simple function
max (aggregate) Statistical calculations Aggregate function
maxn (aggregate) Statistical calculations Aggregate function
maxdouble Math Simple function
maxelement Lists Simple function
maxint Math Simple function
maxlong Math Simple function
md5 Utilities Simple function
median (simple) Lists Simple function
median (aggregate) Statistical calculations Aggregate function
mergelist Lists Simple function
milliseconds Timestamps Simple function
min (simple) Math Simple function
min (aggregate) Statistical calculations Aggregate function
minn (aggregate) Statistical calculations Aggregate function
mindouble Math Simple function
minelement Lists Simple function
minint Math Simple function
minlong Math Simple function
minutes Timestamps Simple function
namebson BSON Simple function
named schema constructor function Type conversions Simple function
nan Math Simple function
nanotime System Simple function
new_tuple Utilities Simple function
new_tupkendall_taule_subset Utilities Simple function
new_tuple_subset_loose Utilities Simple function
notnan NaN (not a number) Simple function
notnull Utilities Simple function
now Timestamps Simple function
nullif Utilities Simple function
nulllist Lists Simple function
openval Windowing Aggregate function
orall (simple) Math Simple function
orall (aggregate) Statistical calculations Aggregate function
parsecsv Utilities Simple function
parsejson Utilities Simple function
parsejson_loose Utilities Simple function
parse_time Timestamps Simple function
pearson Statistical calculations Simple function
pearson Statistical calculations Aggregate function
percentile Statistical calculations Aggregate function
permute Math Simple function
pow Math Simple function
prepend Lists Simple function
product (simple) Lists Simple function
product (aggregate) Statistical calculations Aggregate function
putbson BSON Simple function
random Math Simple function
random_tuple Utilities Simple function
randomstring Strings Simple function
range Lists Simple function
rank (aggregate) Aggregate to List Aggregate function
regexmatch Strings Simple function
regexmatch_ignorecase Strings Simple function
regexreplace Strings Simple function
regexsplit Strings Simple function
regextuple Strings Simple function
removebson BSON Simple function
removeelement Lists Simple function
replace Strings Simple function
replaceelement Lists Simple function
reverse Lists Simple function
round Math Simple function
rshift Math Simple function
rtrim Strings Simple function
seconds Timestamps Simple function
securerandom Math Simple function
set_day_of_month Timestamps Simple function
set_day_of_week Timestamps Simple function
set_hour Timestamps Simple function
set_minute Timestamps Simple function
set_month Timestamps Simple function
set_second Timestamps Simple function
set_year Timestamps Simple function
sign Math Simple function
sin Math Simple function
sinh Math Simple function
shuffle() Lists Simple function
sleep System Simple function
slope Statistical calculations Aggregate function
sort Lists Simple function
spearmans_rank Statistical calculations Simple function
spearmans_rank Statistical calculations Aggregate function
split Strings Simple function
splitbson BSON Simple function
sqrt Math Simple function
startswith Strings Simple function
stdev (simple) Lists Simple function
stdev (aggregate) Statistical calculations Aggregate function
stdevp (simple) Lists Simple function
stdevp (aggregate) Statistical calculations Aggregate function
strftime Timestamps Simple function
string Type conversions Simple function
stringbson BSON Simple function
strlen Strings Simple function
strpinterval Timestamps Simple function
strptime Timestamps Simple function
strresize Strings Simple function
strresizetrunc Strings Simple function
sublist Lists Simple function
substr Strings Simple function
sum (simple) Lists Simple function
sum (aggregate) Statistical calculations Aggregate function
sum_list (simple) Lists Simple function
sum_list (aggregate) Statistical calculations Aggregate function
sumif (aggregate) Statistical calculations Aggregate function
systemenv System Simple function
systemproperty System Simple function
tan Math Simple function
tanh Math Simple function
throw Errors Simple function
time Timestamps Simple function
timezoneoffset Timestamps Simple function
timestamp Type conversions Simple function
tobson BSON Simple function
today Timestamps Simple function
today_utc Timestamps Simple function
toCodePoint Utilities Simple function
tojson Utilities Simple function
to_degrees Math Simple function
to_milliseconds Timestamps Simple function
to_seconds Timestamps Simple function
trim Strings Simple function
tuplebson BSON Simple function
tuple Type conversions Simple function
unique Lists Aggregate function
unique Aggregate to List Simple function
unsignedrshift Math Simple function
unzip Lists Simple function
upper Strings Simple function
uuid Utilities Simple function
variance (simple) Lists Simple function
variance (aggregate) Statistical calculations Aggregate function
variancep (simple) Lists Simple function
variancep (aggregate) Statistical calculations Aggregate function
version System Simple function
vwap Statistical calculations Aggregate function
weeks Timestamps Simple function
withmax Statistical calculations Aggregate function
withmin Statistical calculations Aggregate function
xorall Math Simple function
zip Lists Simple function

## Simple Functions Overview

The majority of functions in the StreamBase expression language are simple functions, which operate on a single tuple field at a time. You can use simple functions in expressions for any StreamBase operator (except the Heartbeat, Metronome, and Union operators, which do not accept expressions).

The expression language also supports aggregate functions, which are described in Aggregate Functions Overview.

Simple functions are organized into the following categories:

 Simple Functions: BSON Simple Functions: Runtime Simple Functions: Errors Simple Functions: Statistical Calculations Simple Functions: External Functions Simple Functions: Strings Simple Functions: Financial Simple Functions: System Simple Functions: Internet Simple Functions: Timestamp Overview Simple Functions: Lists Simple Functions: Type Conversions Simple Functions: Math Simple Functions: Utilities Simple Functions: NaN

## Simple Functions: BSON

This category includes the following functions:

 addbson() bsontojson() findbson() frombson() joinbson() jsontobson() lenbson() listbson() namebson() putbson() removebson() splitbson() stringbson() tobson() tuplebson()

These functions create, manipulate and extract contents of BSON blobs. BSON blobs, as defined by bsonspec.org, are maintained in a binary format in which zero or more values, arrays, and objects with fields (key/value pairs) are stored as a single entity. A variety of implementations exist.

BSON is short for Binary JSON. BSON blobs serialize data in the manner of JSON, and like JSON, BSON data may include nested objects. JSON, as defined on http://www.json.org is text consisting of colon-separated name-value pairs (with field name quoted), and each pair separated by a comma. A BSON-encoded blob is sometimes referred to as a document.

The following table indicates how StreamBase types correspond to BSON types.

 StreamBase Type BSON Type int int32 long int64 double double bool Boolean string UTF-8 string null null timestamp timestamp* blob binary list array tuple object

*The StreamBase BSON Java implementation for timestamps differs from the official BSON specification. They are both int64 but the StreamBase format is used internally rather than the bsonspec.org format.

Function syntax:

`blob addbson(blob bsonblob-array, T element)`

Adds a new field at the end of a BSON array. Accepts a BSON blob that contains an array and an element to be appended to it. `element` can be of any type compatible with the contents of the array `bsonblob-array`. Returns a BSON object for the expanded array.

### bsontojson()

Function syntax:

`string bsontojson(blob bsonblob)`

Converts BSON blob to JSON-formatted string. Accepts one BSON blob and returns its contents as a JSON object of key-value pairs.

### findbson()

Function syntax:

```blob findbson(blob bsonblob, int index)
blob findbson(blob bsonblob, string fieldname)```

Searches a BSON object for a specified array index or field name. Accepts a BSON blob assumed to contain either a BSON array or a BSON object and an index into the array or a string that should match a field name. Returns a blob containing the found field or throws an error if not found or `index` is out of bounds. `index` is a zero-based index. `fieldname` is a string, interpreted as a field name.

### frombson()

Function syntax:

`T frombson(blob bsonblob, T type)`

Converts a BSON blob to the requested StreamBase type. Accepts one BSON blob and returns its contents as the requested StreamBase type. The second argument provides an example of the return type and is not evaluated. If the type does not match, throws an error.

### joinbson()

Function syntax:

`blob joinbson(list(blob) L)`

Concatenates a list of BSON fields into a BSON object. Accepts a list of BSON blobs containing fields and concatenates them. If the list argument is empty, returns an empty blob. If the list is null, returns null. To test for null blobs, use the `isnull` function. You can use `splitbson` to create a list of BSON fields.

### jsontobson()

Function syntax:

`blob jsontobson(string jsonstring)`

Converts a JSON string to a BSON blob. Accepts one JSON string and returns its contents as a BSON blob. The argument `jsonstring` is any JSON string. For example:

`jsontobson('{"symbol":"IBM","price":75.91,"quantity":150, "date":"2013-04-28 16:00:00.000-0400"}')`

### lenbson()

Function syntax:

`int lenbson(blob bsonblob)`

Returns the length of a BSON object or array.

If `bsonblob` contains an array, returns the number of elements in it.

If `bsonblob` is an object, returns the number of fields in it.

If `bsonblob` is any other scalar type, throws an error.

### listbson()

Function syntax:

`blob listbson(T1 value1, T2 value2, ... Tn valuen)`

Accelerator for constructing a (possibly heterogeneous) BSON array, similar to a StreamBase list. Accepts any number of arguments and returns a BSON array containing them. Values are individually interpreted by `tobson` and can be of any type.

### namebson()

Function syntax:

`string namebson(blob bsonblob)`

Returns the name of a BSON field

If `bsonblob` is a BSON field, returns the field name, otherwise returns an empty string.

### putbson()

Function syntax:

`blob putbson(blob bsonblob-object, string name, T value)`

Adds a new field to a BSON object. Returns the blob `bsonblob-object` with added field `name` having value `value`. If field `name` already exists, its current value is replaced with `value`.

### removebson()

Function syntax:

```blob removebson(blob bson-list,
int element-index1, ... int element-indexN)
blob removebson(blob bson-tuple,
int element-index1|string fieldname1, ...
int element-index2|string fieldnameN) ```

Removes fields or list elements from BSON blobs. The BSON blob can contain an array or an object. A blob is returned with fields or elements that match the specified element-index or fieldname arguments removed. Fields can be removed from objects by index or field name.

The order of indexes and names in arguments does not matter. Any out-of-bounds indexes and non-matching field names are ignored.

### splitbson()

Function syntax:

```list splitbson(blob bsonblob-containing-list)
list splitbson(blob bsonblob-containing-object) ```

Returns a list of BSON fields from a BSON object or a list of BSON values from a BSON array. Use `frombson` to convert individual returned values to StreamBase types.

### stringbson()

Function syntax:

`string stringbson(blob bsonblob-string)`

Converts a BSON blob to a string. The BSON blob must contain one string. Unlike the `string` function, no type conversion is performed.

### tobson()

Function syntax:

`blob tobson(T arg)`

Converts StreamBase values to a BSON blob. Each StreamBase int, long, double, string, timestamp, and blob maps to a respective BSON type, nulls map to nulls, lists map to BSON arrays, and tuples map to BSON objects.

### tuplebson()

Function syntax:

`blob tuplebson(AS-expr1, AS-expr2, ... AS-exprN)`

Accelerator for constructing a BSON object from comma-separated AS expressions (comma-separated value expressions, each field named with the AS keyword). The resulting blob contains an object having the specified fields and values.

The following example creates a BSON object with three fields:

`tuplebson("IBM" AS symbol, 87.3875 AS price, 47 AS quantity)`

## Simple Functions: Errors

This category includes the following functions:

 catchexception() error() throw()

### catchexception()

Function syntax:

`T catchexception(T `arg1` [, ... [, T `argn`]])`

Attempts to evaluate all arguments in order, returning the first one that evaluates without an error, or null if all arguments evaluate to an error. Accepts all data types, but all arguments must have the same type. The returned value has the same data type, T, as the arguments.

For example, the following example attempts to parse `str` as an interval, but returns a null timestamp if parsing fails:

`catchexception(interval(str))`

The following example attempts to divide `a` by `b`, but returns `-1` if the division fails:

`catchexception(a/b, -1)`

### error()

Function syntax:

`bool error(string `message`)`

Throws an error with message text `message` that can be caught using the StreamBase error streams mechanism. In the default configuration, this produces an error message on the console, and the server continues processing.

Because error() returns bool, it works well as a predicate expression in operators. For example, use `error("Error condition")` as the last predicate expression for a Filter operator to produce an error message when no earlier predicate expression was matched.

### throw()

Function syntax:

`T throw(string `message`)`

Throws an error with message text `message` that can be caught using the StreamBase error streams mechanism. In the default configuration, this produces an error message on the console, and the server continues processing.

Because throw() returns an untyped null by default, it works well in `if` statements. For example: `if (test-expr) then (5) else throw("Failed the test")`

For the purposes of typechecking, cast the return of this function with one of the casting functions. For example, use `bool(throw("Unmatched filter option"))` as the last predicate expression of a Filter operator to produce an error message when no earlier predicate expression was matched.

## Simple Functions: External Functions

This category includes the following functions:

 callcpp() calljava()

### callcpp()

Function syntax:

`T callcpp(string `class` [, `arg1 `, ... [, `argn`]])`

As an alternative to using `callcpp()`, you can define an alias for your custom functions. See Using Function Aliases.

Use an alias or `callcpp()` to run a custom C++ function directly from a StreamBase operator that uses an expression. The return type, T, of `callcpp()` is the same as the return type of the called function. Custom C++ functions are functions you build with the StreamBase C++ Client API. You can use the simple form of `callcpp()` in any expression except aggregate expressions. (To use `callcpp()` in aggregate expressions, refer to the aggregate callcpp().)

For `callcpp()` to locate the function being called, you must specify the location of the containing DLL or `.so` file, and must register the function with a `<custom-function>` element in the StreamBase Server configuration file. See C++ Function Overview for details.

To learn about coding custom C++ functions, refer to Creating Custom C++ Functions in the API Guide.

### calljava()

Function syntax:

`T calljava(string `class`, string `method` [, `arg1 `, ..., `argn`])`

Use `calljava()` to run a Java method directly from any StreamBase operator that uses an expression. The return type, T, of `calljava()` is the same as the return type of the called method. The called method can be from a standard Java library class, such as `java.lang.Math`, or can be from a custom class of your own in the current project or in a referenced project.

Custom Java functions are functions you build with the StreamBase Java Client library. You can use this simple form of `calljava()` in any expression except aggregate or Query Table contexts. (To use `calljava()` in an aggregate expression, refer to the aggregate calljava().)

You can use `calljava()`without any configuration file entries if the method called is in an accessible standard library class, or in a class on the classpath of the StreamBase Server. To call a method in a custom class in the current project, the project's `sbconf` file must contain an entry like the following:

```<java-vm>
<dir path="./java-bin"/>
</java-vm>```

Projects created with FileNewStreamBase Project have this `sbconf` setting configured by default.

As an alternative to using `calljava()`, you can define an alias for your custom function, and call the alias in your operator expressions. See Defining Functions in Java and C++. Aliases must be defined in `sbconf` configuration files.

The `calljava()` function distinguishes simple from aggregate functions by the number of arguments. For custom simple functions, you must specify both class and method names; for custom aggregate functions, you specify only the class name.

To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide. For information on the classpath requirements for custom Java functions, see Java Function Overview.

#### Return Types and Argument Types

The function you write to be called with `calljava()` can have any number of arguments, including none. Variable-length arguments are accepted by both simple and aggregate custom functions. Each `calljava()` argument `arg1` through `argn` associates a StreamBase data type with one of the following primitive or Java object types:

StreamBase Data Type Java Primitive Java Object
blob com.streambase.sb.ByteArrayView
bool boolean java.lang.Boolean
double double java.lang.Double
int int java.lang.Integer
long long java.lang.Long
list java.util.List
string byte[] java.lang.String
timestamp com.streambase.sb.Timestamp
tuple com.streambase.sb.Tuple

In writing your Java function, use the Java primitive if the execution speed of your function is paramount. However, if you pass a null for any argument that resolves in your function to a Java primitive, `calljava()` returns an error and does not call your function. Only use Java primitives if your application otherwise insures that null values cannot be passed to your function through `calljava()`.

Use the Java object when your application might pass null to your function through `calljava()`, and when no Java primitive is available.

## Simple Functions: Financial

This category includes the following functions:

 black_scholes() compound_interest()

### black_scholes()

Function syntax:

```double black_scholes(string type, double underlying, double strike,
double dividendYield, double riskFreeInterestRate,
double Volatility, timestamp exerciseDate [, timestamp dealDate])```

The `black_scholes` function calculates fair-value and risk statistics (delta, gamma, vega) for European style options on securities with continuous dividend yields. This is known as the Black-Scholes Generalized model. It also calculates implied volatility.

The `black_scholes` function takes eight input arguments; the first seven arguments are required, while the eighth argument, `dealDate`, is optional.

The arguments are:

1. `type`: Required. Should be set to one of the following string values:

• Call (The price of an option.)

• Put (The price of an option.)

• DeltaCall (The sensitivity of the price of an option to changes in the price of the stock.)

• DeltaPut (The sensitivity of the price of an option to changes in the price of the stock.)

• ThetaCall (Theta measures how the price of an option changes with time.)

• ThetaPut (Theta measures how the price of an option changes with time.)

• Gamma (The sensitivity of a stock's delta to the stock price.)

• Vega (The rate of change of the value of an option with respect to the volatility of the stock's price.)

• RhoCall (The sensitivity of the price of an option with respect to the risk-free interest rate.)

• RhoPut (The sensitivity of the price of an option with respect to the risk-free interest rate.)

• ImpliedVolatilityCall (Implied volatility of the underlying stock for a given price.)

• ImpliedVolatilityPut (Implied volatility of the underlying stock for a given price.)

2. `underlying`: Required. A double. The price of the underlying stock.

3. `strike`: Required. A double. The strike price of the stock on the exercise date.

4. `dividendYield`: Required. A double. For example, 0.03 for 3.0%.

5. `riskFreeInterestRate`: Required. A double. For example, 0.05 for 5.0%.

6. `Volatility` (or value): Required. A double. This input argument is overloaded. When computing anything except the implied volatility for a given call or put price this argument is the volatility of the stock. For example, 0.2 for 20%. When computing implied volatility this argument is the option price.

7. `exerciseDate`: Required. An absolute or interval timestamp. An absolute timestamp represents the option exercise date, while an interval timestamp represents the number of days to exercise.

8. `dealDate`: Optional. A timestamp representing the option deal date. If not provided, the current date is the default.

### compound_interest()

Function syntax:

```double compound_interest(double principalValue, double matureValue,
double numberOfPeriods [, double guess])```

Returns the compound interest given the principal value, mature value, and number of periods. An optional fourth argument, `guess`, can be used to set the initial compound interest used by the function. The `compound_interest` function uses the Newton-Raphson algorithm to compute compound interest rate. This algorithm performs the same calculations repetitively. Each iteration results in a compound interest that is closer to the final result. The number of iterations required to get the results within an acceptable inaccuracy (`1.0e-4`) depends on the number of periods and the initial compound interest used to start the calculation.

If the `guess` argument is not provided, the default initial value is `0.01` (1%). If the function returns a NaN (not a number), you can provide the initial compound interest in this fourth argument. If the initial value is closer to the actual rate of the computation, it can be sped up significantly.

## Simple Functions: Internet

This category includes the following functions:

 inet_aton() inet_ntoa()

### inet_aton()

Function syntax:

`long inet_aton(string `address`)`

Takes a string containing an IPv4 address in dotted-quad notation ("nnn.nnn.nnn.nnn") and returns a long containing that address encoded in network byte order.

### inet_ntoa()

Function syntax:

```string inet_ntoa(int `address`)
string inet_ntoa(long `address`)```

Takes an int or long containing an IPv4 address encoded in network byte order and returns a string containing that address in dotted-quad notation ("nnn.nnn.nnn.nnn").

## Simple Functions: Lists

A list is an ordered collection of values (called elements), each of which is of the same StreamBase data type, called the list's element type. The element type can be any StreamBase data type, such as an int, a tuple, or even a list (thus allowing for constructions such as a list of list of int).

Individual elements in a list can be accessed using their zero-based integer position (their index) in the list. In any expression in an EventFlow or StreamSQL program, use brackets to address individual elements of a list. Thus, for a list field named L, use `L[0]` to address the first element in the list, `L[1]` for the second element, and `L[length(L)-1]` to address the last element.

In most list-related functions that take an index, you can also use a negative index to count backward from the end of the list. Thus, for a list L, `L[-1]` is equivalent to `L[length(L)-1]`.

The number of elements in a list is determined at application runtime. A null list is not the same as a list with zero elements (an empty list), which is not null.

Several functions that manipulate lists can take user-written functions as arguments. These are `filterlist()`, `foldleft()`, ` foldright()`, ` maplist()`, ` mergelist()`, and `sort()`.

 append() avg() avg_list() concat() contains() count_distinct_elements() count_list() dotproduct() emptylist() filterlist() filternull() firstelement() foldleft() foldright() has_intersection() indexof() (for lists) insertelement() intersect() join() and joinlist() lastelement() lastindexof() (for lists) length() (for lists) list() maplist() maplist_loose() maxelement() median() mergelist() minelement() nulllist() prepend() product() range() regexsplit() removeelement() replaceelement() reverse() shuffle() sort() split() stdev() stdevp() sublist() sum() sum_list() unique() unzip() variance() variancep() zip()

### append()

Function syntax:

`list(T) append(list(T) `L`, T `e1`, T `e2`, ..., T `en`)`

Returns its argument list, `L`, with elements `e1` through `en` added to the end of the list in the order specified. The data types of the appended elements `e` and the element type of list `L` must be the same, except that element types int and long can be coerced to long and double following the rules in Data Type Coercion and Conversion. The element type of the returned list is the same as the element type of the argument list and of the appended elements.

The function takes any number of element arguments, including zero. That is, `append(L)` returns list L unchanged. If any list element is null, a null element is appended to the list in the position specified.

### avg()

Function syntax:

```double    avg(list(double) `L`)
int       avg(list(int) `L`)
long      avg(list(long) `L`)
timestamp avg(list(timestamp) `L`)```

Given a list, `L,` of doubles, ints, longs, or timestamps, returns the average of all members of the list. For a returned average of timestamp values to make sense, the list should contain all interval timestamps or all absolute timestamps, but not both. The data type of the returned value is the same as the element type of the argument list.

### avg_list()

Function syntax:

`list(T) avg_list(list(T) LL(list(T) `L1`, list(T) `L2`, ...))`

Accepts a list of lists as an argument, where each member list `L1`, `L2`, and so on of the containing list `LL` must have the same element type or a coercible type. Returns a list containing the element-wise average of the elements of each member list in LL. The list element type, T, of the returned list is the same as the element type of the argument lists. For example:

`avg_list( [ [1,2,3], [4,5,6] ] )`

returns:

`[2.5, 3.5, 4.5`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the average.

### concat()

Function syntax:

`list(T) concat(list(T) `L1`, list(T) `L2` [, ..., list(T) `Ln`])`

Returns a list produced by concatenating the elements of its argument lists in the order specified. Takes two or more list arguments, which must have the same element type. The element type, T, of the returned list is the same as the element types of the argument lists.

### contains()

Function syntax:

`bool contains(list(T) `L`, T `item`)`

Takes two arguments: a list with element type T, and a value, `item`, of the same type. Returns `true` if `item` is a member of the specified list, or `false` if not.

If the entire list is null, returns `null`. If a member of the list is null, and `item` is null, returns `true`. If no member of the list is null, and `item` is null, returns `false`.

If `L` is a list of strings, `item` must match an element exactly. Substrings of elements are not matched. Thus, the following expression returns `false`:

`'foo' in ['there','is','food','here']`

The string version of contains() does match substrings, so to determine if a substring exists in a list, you can use join() to convert a list of strings to a string. The following expression thus returns `true`:

`'foo' in join(' ',['there','is','food','here'])`

The IN infix operator provides an alternate implementation for contains().

### count_distinct_elements()

Function syntax:

`int count_distinct_elements(list(T) `L`)`

Returns the number of unique elements in the argument list, `L`, with any element type, T.

For example:

`count_distinct_elements(list(1, 2, 3, 3, 4, 4, 5, 6, 6, 7))`

returns 7.

### count_list()

Function syntax:

`list(int) count_list(list(T) LL(list(T) `L1`, list(T) `L2`, ...))`

Accepts a list of lists as an argument, where each member list `L1`, `L2`, and so on of the containing list `LL` must have the same element type or a coercible type. Returns a list containing an element-wise count of the corresponding elements of each member list in LL. The list element type, T, of the returned list is the same as the element type of the argument lists, or a coerced superset type. For example:

`list_list( [ [1,2,3], [4,5,6,7,8] ] )`

returns:

`[2, 2, 2, 1, 1]`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

### dotproduct()

Function syntax:

`T dotproduct(list(T) `L1`, list(T) `L2`, ...)`

Returns the sum of the element-wise product of its argument lists, whose element types must be int, double, or long. Accepts any number of lists, but is typically used with two list arguments. The data type of the returned value is the same as the element type of the argument lists. If the element type of the argument lists do not match, they are promoted using the rules in Data Type Coercion and Conversion.

This function produces the dot product, also known as the scalar product, of two or more numeric lists. The function multiplies the elements of each list in element order, L1.first times L2.first, then L1.second times L2.second, and so on. The resulting list of element products is summed and returned. If an argument list is null, the result is null. If any element in any argument list is null, the result is null. If an argument list is shorter than another argument list, the extra positions are filled in with 1 (or 1.0 or 1L), which preserves the extra position elements of the longer list unchanged.

### emptylist()

Function syntax:

`list(T) emptylist(T `x`)`

Returns an empty list with the same element type as the argument. The argument `x`'s value is ignored, and only its type is considered. This function is typically used like these examples:

`emptylist(int())`

`emptylist(string())`

You can also use an argument with a literal value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:

`emptylist(17)`

`emptylist("IBM")`

### filterlist()

Function syntax:

```list(T) filterlist(inputlist(T),
inFunction(arg T) {boolean function})```

The `filterlist` function takes a list as its first argument and a function as its second argument. The inputs to that function must be coercible to its declared input types. If the input argument inputlist contains tuples, the number, order, and names of its fields must match exactly, and its field types must be coercible to the field types of the input function's argument fields.

`filterlist` returns a list composed of only the elements of the argument `inputlist` for which the function provided in the second argument returns `true` when applied to elements of `inputlist`. The element type, T, of the returned list is the same as the element type of the argument `inputlist`.

The input function inFunction must take in a single argument of the same type as the input list element type, and must output a boolean value. The input function can be a built-in function or one defined using the expression language. Null values in the input list are not treated specially and are included in the output list if the function returns `true` for nulls.

Examples:

`filterlist( [14,22,8,19,38,18,6,28,31,25,6], function myrange(x int) { x > 19 && x < 30} )`

Returns:

`[22, 28, 25]`

In the next example, `citytemps` is a named schema of daily temperature records with schema {city string, mintemp int, maxtemp int}. We want to filter from this a list of tuples, each containing a city that had a temperature range of more than 20 degrees.

```filterlist( list(citytemps("Phoenix", 77, 104), citytemps("Fargo", 58, 89), citytemps("Miami", 82, 96), citytemps("Seattle", 62, 73), citytemps("Boston", 54, 76)), function over20(rec citytemps) { rec.maxtemp - rec.mintemp > 20 } )```

Returns:

`[{"Phoenix", 77, 104}, {"Fargo", 58, 89}, {"Boston", 54, 76}]`

### filternull()

Function syntax:

`list(T) filternull(list(T) `L`)`

Returns a list composed of the elements of the argument list, `L`, with any null elements removed. The element type, T, of the returned list is the same as the element type of the argument list.

### firstelement()

Function syntax:

`T firstelement(list(T) `L`)`

Returns the first element of argument list `L`. The data type, T, of the returned value is the same as the element type of the argument list.

The usage `firstelement(L)` is the equivalent of `L[0]`.

### foldleft()

Function syntax (with line breaks added for clarity):

```T foldleft(function(arg1 T, arg2 T, arg3 T, ... , argn T)
{`expression`}, arg1_initial, List<arg2> list1, List<arg3> list2, ... ,
List<argn> listn)
T foldleft(function `name` (arg1 T, arg2 T, arg3 T, ... , argn T)
{`expression`}, arg1_initial, List<arg2> list1, List<arg3> list2, ... ,
List<argn> listn)```

Executes `{expression}` to accumulate values on the elements of a set of input lists, working left to right, first to last, through the input list elements, and returns the accumulated result. All the input lists must have the same number of elements, which must be numeric types. Input lists that are null (`nulllist(<type>)`) are treated as lists of nulls.

Although normally unnamed, the function can be given a name, as the second syntax above and the second example below illustrate.

The number of arguments to the input function must be one more than the number of input lists. The first input argument, `arg1`, initializes the accumulator. Subsequent arguments must be coercible to arg1's data type, as must be the return data type.

To calculate the output, the function first sets the accumulator to the initial value of `arg1`. The input function is applied on this accumulator value along with the first elements of the lists (`listn[0]`) to produce a new accumulator value. The input function is then applied on this new accumulator value and the second element of the lists (`listn[1]`) to calculate a new accumulator value, which are used with the third elements of the lists (`listn[2]`) and so on, until the end of the lists is reached. The accumulator value at this point is returned as the output.

Examples:

Sum the elements of one list, starting at 10.

`foldleft(function (a int, b int) { a+b }, 10, list(1, 2, 3))`

This evaluates to 16.

Construct a list of the maximum values of corresponding elements of three lists leftward (line breaks added for clarity):

```foldleft(function maxOfLists(incoming list(int), a int, b int, c int)      { append(incoming,(max(a, b, c))) },     emptylist(int()), list(88, 78, 65), list(45, 90, 23),      list(890, 89, 78) )```

Evaluates to:

`list(890, 90, 78)`

using this sequence of operations:

```maxOfLists([], 88, 45, 890) yields [890] maxOfLists([890], 78, 90, 89) yields  [890, 90] maxOfLists([890, 90], 65, 23, 78) yields [890, 90, 78]```

### foldright()

Function syntax (line break added for presentation clarity):

```T foldright(function(arg1 T, arg2 T, arg3 T, ... , argn T)
{`expression`}, arg1_initial, List<arg2> list1, List<arg3> list2,
... , List<argn> listn)
T foldright(function `name`(arg1 T, arg2 T, arg3 T, ... , argn T)
{expression}, arg1_initial, List<arg2> list1, List<arg3> list2,
... , List<argn> listn)```

Executes `{expression}` to accumulate values on the elements of a set of input lists, working right to left, last to first, through the input list elements, and returns the accumulated result. All the input lists must have the same number of elements, which must be numeric types. Input lists that are null (`nulllist(<type>)`) are treated as lists of nulls.

Although normally anonymous, the function can be given a name, as the second syntax above and the second example below illustrate.

The number of arguments to the input function must be one more than the number of input lists. The first input argument, `arg1`, initializes the accumulator. Subsequent arguments must be coercible to arg1's data type, as must be the return data type.

To calculate the output, the function must set the first accumulator value to initial value `arg1`. The input function is applied on this accumulator value along with the last elements of the lists (`listn[n-1]`) to produce a new accumulator value. The input function is then applied on this new accumulator value and the second-from-last element of the lists (`listn[n-2]`) to calculate a new accumulator value, which is used with the third-from-last elements of the lists (`listn[n-3]`) and so on, until the beginning of the lists is reached. The accumulator value at this point is returned as the output.

Examples:

Sum the elements of two lists:

`foldright(function (a int, b list(int), c list(int)) { a+b+c }, 0, list(1, 2, 3), list(4, 5, 6))`

This evaluates to 21.

Construct a list of the maximum values of corresponding elements of three lists rightward (with line breaks added for clarity):

```foldright(function maxOfLists(incoming list(int), a list(int),      b list(int), c list(int)) { append(incoming,(max(a, b, c))) },     emptylist(int()), list(88, 78, 65), list(45, 90, 23),      list(890, 89, 78) )```

Evaluates to:

`list(78, 90, 890)`

Using this sequence of operations:

```maxOfLists([], 65, 23, 78) yields [78] maxOfLists([78], 78, 90, 89) yields  [78, 90] maxOfLists([78, 90], 88, 45, 890) yields [78, 90, 890]```

### has_intersection()

Function syntax:

`bool has_intersection(list(T) `L1`, list(T) `L2`)`

Returns true if lists L1 and L2, which must have the same type, contain any values in common and false if they do not. The input lists can be of different lengths. For example:

`has_intersection([2,4,6,8,10],[1,3,5])`

returns false, but

`has_intersection([0,],[1,3,5,0])`

returns true.

See also the intersect() function, which returns the set of values that two lists have in common.

### indexof()

Function syntax:

`int indexof(list(T) `L`, T `e` [, int `start`])`

Returns the index of the first element of the list `L` whose value is `e`, starting at index `start`. The argument list can have any element type, `T`. The element `e` must have the same element type as the argument list. For list arguments, if `e` is not found, the function returns a value of `null`.

If the search is successful, the value returned is always greater than or equal to `start`. If `start` is unspecified, it is taken to be 0, which designates the first character of the string or the first element of the list.

### insertelement()

Function syntax:

`list(T) insertelement(list(T) `L`, int `index`, T `e`)`

Returns a list composed of the elements of the argument list, `L`, with element `e` inserted before index position `index` in the argument list. `index` is zero-based. The element type, T, of the returned list is the same as the element type of the argument list and of the inserted element, `e`.

The usage `insertelement(L, length(L), e)` is the equivalent of append(). The usage `insertelement(L, 0, e)` is the equivalent of prepend().

### intersect()

Function syntax:

`list<T> intersect(list(T) `L1`, list(T) `L2`)`

Returns a list containing the values common to lists L1 and L2, which must have the same type, or an empty list if they share no values in common. The input lists can be of different lengths. For example:

`intersect([2,4,6,8,10,12,14,16,18,20],[3,6,9,12,15,18])`

returns:

`[6,12,18]`

See also the has_intersection() function, which determines whether or not two lists have common values.

### join(), joinlist()

Function syntax:

```string join(list(T) `L`)
string joinlist(list(T) `L`)
string join(string `separator`, list(T) `L`)
string joinlist(string `separator`, list(T) `L`)```

Returns a string composed of the elements of list `L`, having any element type. With a single argument, elements are concatenated. With two arguments, the `separator` character is inserted between each element. If the `list` argument is empty, join() returns an empty string. If `list` is null, join() returns null. If the `separator` string is null, join() returns null.

You can use `joinlist()` as an alias for `join()` in all contexts. In StreamSQL contexts, which has the reserved word JOIN, and in LiveView LiveQL, which borrows its parser from StreamSQL, you must either use the `joinlist()` alias, or use escaped identifier syntax.

Also see the aggregate function join().

### lastelement()

Function syntax:

`T lastelement(list(T) `L`)`

Returns the last element of argument list `L`. The data type of the returned value is the same as the element type of the argument list.

The usage `lastelement(L)` is the equivalent of `L[-1]`.

### lastindexof()

Function syntax:

`int lastindexof(list(T) `L`, T `e` [, int `lastStart`])`

Returns the index of the last element of the list `L` whose value is `e`, for which the index is less than or equal to `lastStart`. The argument list can have any element type, T. The element `e` must have the same element type as the argument list. For list arguments, if `e` is not found, the function returns a value of `null`.

If the search is successful, the value returned is always less than or equal to `lastStart`. If `lastStart` is unspecified, it is taken to be the index of the last grapheme of the string or the last element of the list.

### length()

Function syntax:

`int length(list `L`)`

Returns the number of elements in a list, including any null elements. If the argument is a null list, the returned value is `null`.

### list()

Returns a list containing its arguments as elements. See list() in the Simple Functions: Type Conversions section.

### maplist()

Function syntax:

```list(T) maplist(function(arg1 T, arg2 T, ... , argn T)
{`function-body`}, list1 T, list2 T, ... , listn T)```

The `maplist` function maps the elements of one or more lists onto each other, according to a function() declared as the first argument. The maplist() function takes a declared function() as its first argument and one or more lists as additional arguments. It applies the function to the elements of each argument list, then returns a single list that contains the results of applying the function on each element position, in element order.

To take a simple example, the following maplist() instance declares a function that has two integer arguments and a function body that adds the arguments. When applied to the input list(int) lists, the result is that the first element of the first argument list is added to the first element of the second argument list. Then the second element of the first list is added to the second element of the second list, and so on.

`maplist(function(x int, y int) { x + y }, list(1, 2, 3), list(4, 5, 6)))`

The result is:

``(list(int) [5, 7, 9]``

The input function can be defined using the expression language or a built-in function. The number of arguments to the declared function must be at least equal to the number of input argument lists. If the argument lists have more elements than the number of function arguments, the `function-body` expression continues to be applied.

The inputs to the function — that is, the elements of the input lists — must be coercible to the declared data type of the function arguments. If an input argument is a tuple, the number, order, and names of the fields must match exactly, and the field types of the input arguments must be coercible to the field types of the input function's argument fields.

The element type of the returned list is the same as the element type of the input lists or one that is coercible from them.

Input lists with null elements are valid lists. The result element corresponding to a null element's position can be a null, depending on the operation declared in the `function-body.` The declared function must handle any nulls; the maplist() function as a whole does not do so. For example:

`maplist(function(x int, y int) { x + y }, list(1, 2, 3), list(4, null, 6)))`

returns:

``(list(int) [5, null, 9]``

To take another example:

`maplist(function(n1 int, n2 int, n3 int) {n1+n2+n3}, [1,2,3,4], [2,4,6,8], [3,6,9,12])`

returns:

`(list(int)) [6, 12, 18, 24]`

Input lists that are null (`nulllist(<type>)`) are treated as lists of nulls.

### maplist_loose()

Function syntax:

`list(T) maplist_loose(function(T1, T2, ... , Tn){...}, list<T1>, list<T2>, ... , list<Tn>)`

The maplist_loose() function takes the same function first argument and one or more lists as additional arguments as the maplist() function, and returns a list the same as maplist(). The difference is that this function allows lists of any length, padding shorter lists with nulls to the same length as the longest list encountered.

See maplist() for instructions on using this function.

### maxelement()

Function syntax:

`T maxelement(list(T) `L`)`

This function returns the maximum non-null element value using the greater-than relational operator across all elements in the argument list, `L`. The data type of the returned value is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate version of max(), and the aggregate maxn() function. Use the simple function `max(list, , list)` to determine the larger of two or more lists. In an 2, `max(list)` returns the maximum list in the aggregate window. By contrast, use `maxelement()` to compare the element values of a single list.

### median()

Function syntax:

```double median(list(double) `L`)
double median(list(int) `L`)
double median(list(long) `L`)```

Given a list, `L,` of doubles, ints, or longs, returns a double, the median of the elements of `L`. If the list has an even number of elements, returns the average of the middle two.

### mergelist()

Function syntax:

```list(T) mergelist(list<list<T>> `listOfLists`)
list(T) mergelist(list<list<T>> `listOfLists`, bool `ascending`)
list(T) mergelist(list<list<T>> `listOfLists`, function(arg1 T,
arg2 T) {int `compareFunction`})
list(T) mergelist(list<list<T>> `listOfLists`, bool `ascending`,
function(`arg1` T, `arg2` T) {int `compareFunction`})```

The `mergelist` function merges a list of sorted input lists into a single sorted output list. It takes one, two, or three arguments:

1. The single argument form takes a required list of lists, which is assumed to be presorted in the desired order. `mergelist` does not sort its input lists.

2. The optional second argument is a boolean value indicating that the output is to be sorted: true for ascending or false for descending.

3. The third argument is an optional function that compares the values of two inputs (see below).

If a boolean argument is used, it must be the second argument. If absent, the output list is sorted ascending.

When the input lists consist of simple types, default StreamBase expression language comparison operators are normally adequate to obtain properly sorted output.

When the input list elements are tuples, you can provide a compare function if you need to sort on a particular field or fields of input tuples. The compare function can be a built-in function or declared using the expression language. It must accept two input arguments and output an integer. `mergelist` expects the function's return value to be negative if the first argument is less than the second argument, positive if the first argument is greater than the second argument, and 0 if they are equal. The inner list type must be coercible to the function argument types.

For example, by default a list of lists of tuples with a schema {price int, volume int, symbol string}, would sort on the first field (price). When prices are equal, the second field (volume) is used to resolve any ties, and then the third field (symbol), as needed. If, however, you want to sort only on symbol, you can provide a function to do so, such as the following, where `mytuple` is the name of a named schema used as a data type:

```function symbolCompare(t1 mytuple, t2 mytuple) {    if (t1.symbol > t2.symbol) then 1   else if (t1.symbol < t2.symbol) then -1   else 0  }```

If the input list of lists contains only one inner list, the output list's elements are the contents of that inner list in the same order. When null values exist in input lists, how they sort in the output list (like other values) is determined by the compare function. For more information see Null Literals.

A call to `mergelist` that uses the above compare function to sort an enqueued list of lists might take this form (line breaks were added for clarity):

```mergelist( list(list()) input.trades,    function symbolCompare(t1 mytuple, t2 mytuple)   { if (t1.symbol > t2.symbol) then 1 else if        (t1.symbol < t2.symbol) then -1 else 0 } )```

### minelement()

Function syntax:

`T minelement(list(T) `L`)`

This function returns the minimum non-null value using the less-than relational operator across all elements in the argument list, `L`. The data type of the returned value is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate version of min(), and the aggregate minn() function. Use the simple function `min(list, , list)` to determine the smaller of two lists. In an aggregate context, `min(list)` returns the minimum list in the aggregate window. By contrast, use `minelement()` to compare the element values of a single list.

### nulllist()

Function syntax:

`list(T) nulllist(T `x`)`

Notice that the name of the function has three lowercase letter L's. Returns a null list with the same element type as the argument. The argument `x`'s value is ignored, and only its type is considered. This function is typically used like these examples:

`nulllist(double())`

`nulllist(timestamp())`

You can also use an argument with a value, or an expression that resolves to a value, like the following examples. Only the type of the argument is considered:

`nulllist(17)`

`nulllist("IBM")`

### prepend()

Function syntax:

`list(T) prepend(list(T) `L`, T `e`)`

Returns a list containing element `e` at the start, then all of list `L`. The data type of element `e` and the element type of list `L` must be the same, except that argument types int and long can be coerced to long and double following the rules in Data Type Coercion and Conversion. The element type of the returned list is the same as the element type of the argument list and of the prepended element, `e`.

### product()

Function syntax:

```double product(list(double) `L`)
int    product(list(int) `L`)
long   product(list(long) `L`)```

Given a list, `L`, of doubles, ints, or longs, returns the product of each member of the list. The data type of the returned value is the same as the element type of the argument list.

### range()

Function syntax:

`list(int) range(int `start`, int `end` [, int `step`])`

Returns a list of ints from value `start` up to (or down to) but not including value `end`. That is, the range includes the first item in the list and excludes the last item. Thus, `range(1, 10)` returns a list with nine elements, not ten:

`list(int) [1, 2, 3, 4, 5, 6, 7, 8, 9]`

The default value for `step` is 1. `step` cannot be 0, but you can specify a `step` value greater than 1 to produce a list with skipped elements. For example, a `step` of 2 returns every other number:

`range(0, 8, 2)`

returns

`list(int) [0, 2, 4, 6]`

If you specify a negative value for `step`, the list generates in reverse. For example:

`range(3, 0, -1)`

returns

`list(int) [3, 2, 1]`

while

`range(6, 1, -2)`

returns

`list(int) [6, 4, 2]`

When `step` is negative and `end` is greater than `start`, the result is an empty list. Likewise, an empty list also results when `start` is greater than `end` and `step` is positive.

### regexsplit()

Parses a string into tokens delimited by a character matching a regular expression, and returns a list of strings with the tokens as elements. See regexsplit() in the Simple Functions: Strings section.

### removeelement()

Function syntax:

`list(T) removeelement(list(T) `L`, int `index`)`

Returns a list composed of the elements of the argument list, `L`, with the element at index position `index` removed. The element type, T, of the returned list is the same as the element type of the argument list.

### replaceelement()

Function syntax:

```list(T) replaceelement(list(T) `L`, int `index`, T `e`)
list(T) replaceelement(list(T) `L`, `updatefunction`(arg T, index int)->T)
list(T) replaceelement(list(T) `L`, `predicatefunction`(arg T)-> bool,
replacement T)```

Returns a list composed of the elements of the argument list, `L`, with the element at index position `index` replaced by element `e`. The element type of the returned list is the same as the element type of the argument list.

In the second form, instead of specifying an index into the list of interest, specify an `updatefunction` to be called with every element of the list, which function can return either the current value or some new value determined at runtime.

In the third form, specify a `predicatefunction` to be called with every element of the list, which returns `replacement` for all values that match the predicate function.

### reverse()

Function syntax:

`list(T) reverse(list(T) `L`)`

Returns its argument list, `L`, in reverse order. The element type, T, of the returned list is the same as the element type of the argument list.

### shuffle()

Function syntax:

`list(T) shuffle(list(T) `L`)`

Returns its argument list, `L`, with its elements shuffled in random order. Calling shuffle() again on the same list returns a different sort order. The element type, T, of the returned list is the same as the element type of the argument list.

### sort()

Function syntax:

```list(T) sort(list(T) `L` [, bool `ascending`])
list(T) sort(list(T) `L` ,
function `compareFunction`(arg 1 T, arg2 T)->boolean)
list(T) sort(list(T) `L` , bool,
function `compareFunction`(arg1 T, arg2 T)->boolean)```

Returns its argument list, `L`, sorted. Sorting occurs in ascending order if the optional `ascending` argument is `true` or omitted, or in descending order if `ascending` is `false`. If the `ascending` argument is used, it must be the second argument. The element type, T, of the returned list is the same as the element type of the argument list.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types. Null values sort as the lowest possible values for the element type of the list. For doubles, NaN sorts as the highest possible value, which yields the following relations: Null < –Infinity < 0 < +Infinity < NaN.

To exercise control over sorting, especially when list elements are tuples, you can provide your own comparison function that compares two input values and outputs an integer value. The output value must be negative if the first argument is less than the second argument, positive if the first argument is greater than the second argument, and 0 if they are equal.

For further details and an example of using a comparison function, see the description of the mergelist function.

### split()

Parses a string into tokens delimited by a specified character and returns a list of strings with the tokens as elements. See split() in the Simple Functions: Strings section.

### stdev()

Function syntax:

```double    stdev(list(double) `L`)
double    stdev(list(int) `L`)
double    stdev(list(long) `L`)
timestamp stdev(list(timestamp) `L`)```

Given a list, `L,` of doubles, ints, longs, or timestamps, returns the standard deviation for all members of the list. If the argument list is a list of doubles, ints, or longs, returns a double. If the argument list is a list of timestamps, returns an interval timestamp. For timestamp values, the list should contain all interval timestamps or all absolute timestamps, but not both. For lists of length 1, returns null.

### stdevp()

Function syntax:

```double    stdevp(list(double) `L`)
double    stdevp(list(int) `L`)
double    stdevp(list(long) `L`)
timestamp stdevp(list(timestamp) `L`)```

Given a list, `L,` of doubles, ints, longs, or timestamps, returns the standard deviation for all members of the list. If the argument list is a list of doubles, ints, or longs, returns a double. If the argument list is a list of timestamps, returns an interval timestamp. For timestamp values, the list should contain all interval timestamps or all absolute timestamps, but not both. For lists of length 1, returns 0.

With `stdevp()` the data provided is the entire population, while with `stdev()`, the data provided is treated as a random sample. The `stdevp()` function is calculated using the biased (or n) method. The `stdev()` function is calculated using the unbiased (or n-1) method.

### sublist()

Function syntax:

`list(T) sublist(list(T) `L`, int `start `[, int `end` [, int `step`]])`

Returns a list composed of a section of the argument list, `L`, starting with index `start`, which is included in the result list. If `end` is specified, it is not included in the result list. The default value for `step` is 1. You can supply an alternate `step` value to skip elements in the argument list. For example, a `step` of 2 returns every other value in the list. You must specify an `end` value in order to specify a `step` value. The element type, T, of the returned list is the same as the element type of the argument list.

In contrast with the range() function, `step` cannot be negative.

### sum()

Function syntax:

```double    sum(list(double) `L`)
int       sum(list(int) `L`)
long      sum(list(long) `L`)
timestamp sum(list(timestamp) `L`)```

Given a list, `L,` of doubles, ints, longs, or timestamps, returns the sum of all members of the list. The element type of the returned list is the same as the element type of the argument list.

When the list element type is timestamp, summing elements of the list follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum a list of two or more absolute timestamps. However, you can sum a list composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.

### sum_list()

Function syntax:

`list(T) sum_list(list(T) LL(list(T) `L1`, list(T) `L2`, ...))`

Accepts a list of lists as an argument, where each member list `L1`, `L2`, and so on of the containing list `LL` must have the same element type or a coercible type. Returns a list containing the element-wise sum of the elements of each member list in LL. The list element type, T, of the returned list is the same as the element type of the argument lists, or a coerced superset type. For example:

`sum_list( [ [1,2,3], [4,5,6] ] )`

returns:

`[5, 7, 9]`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

### unique()

Function syntax:

`list(T) unique(list(T) `L`)`

Returns a list composed of the unique elements in the argument list, `L`, in ascending (lexicographical) order. The element type, T, of the returned list is the same as the element type of the argument list.

For example:

`unique([4, 8, 3, 7, 1, 3, 4, 4, 5, 6, 2, 6, 7])`

returns:

`list(int) [1, 2, 3, 4, 5, 6, 7, 8]`

Be aware that lexicographic ordering of strings differs from numeric types. For example, with ints,

`unique([400, 8, 3, 7, 1, 3, 4, 5, 400, 6, 6, 7])`

returns:

`(list(int)) [1, 3, 4, 5, 6, 7, 8, 400]`

but given string tokens,

`unique(['400', '8', '3', '7', '1', '3', '4', '5', '400', '6', '6', '7'])`

returns a list in "dictionary" order:

`(list(string)) [1, 3, 4, 400, 5, 6, 7, 8]`

To find unique list elements and return them in descending order, use `reverse(unique(<list>))`.

### unzip()

Function syntax:

`tuple unzip(list(tuple) listOfTuples)`

Takes a list of tuples and returns a tuple of lists.

An example will clarify the function. Let's say your application extracts a list of symbol and price tuples from a stream. One list might contain the following:

```[ {prices: 102.51, symbols: AAPL},
{prices: 96.52, symbols: IBM},
{prices: 36.33, symbols: HPQ} ]```

You can run this list through `unzip()`, using an expression like the following:

```unzip(list(tuple(102.51 as prices,'AAPL' as symbols),             tuple(96.82 as prices,'IBM' as symbols),             tuple(36.33 as prices,'HPQ' as symbols)))```

The returned value is a single tuple whose fields are lists, as follows:

`((prices list(double), symbols list(string))) "[102.51,96.82,36.33]","[AAPL,IBM,HPQ]"`

### variance()

Function syntax:

```double variance(list(double) `L`)
double variance(list(int) `L`)
double variance(list(long) `L`)```

Given a list, `L,` of doubles, ints, or longs, returns a double, the variance for all members of the list. Variance is a measure of the dispersion of a set of data points around their mean value.

For lists of length 1, returns null. See the aggregate version of variance() for more on variance.

### variancep()

Function syntax:

```double variancep(list(double) `L`)
double variancep(list(int) `L`)
double variancep(list(long) `L`)```

Given a list, `L,` of doubles, ints, or longs, returns a double, the variance for all members of the list. Variance is a measure of the dispersion of a set of data points around their mean value.

The `variancep()` function is similar to the `variance()` function: with `variancep()` the data provided is the entire population, while with `variance()`, the data provided is treated as a random sample. The `variancep()` function is calculated using the biased (or n) method. The `variance()` function is calculated using the unbiased (or n-1) method.

For lists of length 1, returns 0. See also the aggregate version of variancep() for more on variance.

### zip()

Function syntax:

`list(tuple) zip(tuple(list(T1), ..., list(Tn)) tupleOfLists)`

In the spirit of the zip function in Python, `zip()` takes a tuple of lists, collates it, and returns it as a list of tuples, where the `n`th tuple contains the `n`th element from each of the argument lists. The element types of the lists in the argument tuple can be of any type, and do not need to be the same.

An example will clarify the function. Let's say you have a stream whose schema accepts lists of securities price data: `{ prices list(double), symbols list(string) }`. One tuple might contain the following:

```{
prices:  [102.51, 96.82, 36.33],
symbols: [AAPL, IBM, HPQ]
}```

You can run this tuple through `zip()`, using an expression like the following:

`zip(tuple(list(102.51, 96.82, 36.33) as prices, list('AAPL', 'IBM', 'HPQ') as symbols))`

The returned value is a list of tuples:

`(list((prices double, symbols string))) [102.51,AAPL, 96.82,IBM, 36.33,HPQ]`

## Simple Functions: Math

This category includes the following functions:

 abs() acos() andall() asin() atan() atan2() cbrt() ceil() choose() cos() cosh() exp() expm1() factorial() floor() gaussianrandom() ln() log10() log1p() max() min() orall() permute() pow() random() round() securerandom() sign() sin() sinh() sqrt() tan() tanh() to_degrees() to_radians() xorall()

The following functions provide bitwise operations:

 bitand() bitnot() bitor() bitxor() lshift() rshift() unsignedrshift()

The following functions return the maximum and minimum values for numeric data types:

 maxint() maxlong() maxdouble() minint() minlong() mindouble()

### abs()

Function syntax:

```int    abs(int `e`)
double abs(double `e`)
long   abs(long `e`)```

Returns the absolute value of an int, double, or long expression `e`. The return type is the same as the expression's type.

### acos()

Function syntax:

```double acos(int `x`)
double acos(double `x`)
double acos(long `x`)```

Returns in radians the arc cosine of `x`, which is the value whose cosine is `x`. Undefined outside of the range –1 to 1.

### andall()

Function syntax:

`bool andall(bool `b1`, bool `b`2, ...)`

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical AND operation on all arguments. For example, `andall(true, true, false, true, false)` returns `false`. Use andall() as a truth detector function.

Null arguments have special handling. A `null` argument does not change the result of the evaluation if a `false` value is among the arguments: `andall(null, true, false, null, true)` still returns `false`. However, if one or more arguments is null while all other arguments are `true`, the function returns `null`: `andall(null, true, true, null, true)` returns `null`.

The andall() function follows the logic expressed in this statement:

if any argument is false, return false
else if any argument is null, return null
else return true

See also the orall() simple function and the aggregate versions of andall() and orall().

### asin()

Function syntax:

```double asin(int `x`)
double asin(double x)
double asin(long x)```

Returns in radians the arc sine of `x`, which is the value whose sine is x. Undefined outside of the range –1 to 1.

### atan()

Function syntax:

```double atan(int `x`)
double atan(double `x`)
double atan(long `x`)```

Returns in radians the arc tangent of `x`, which is the value whose tangent is `x`.

### atan2()

Function syntax:

```double atan2(int `x`, int `y`)
double atan2(double `x`, double `y`)
double atan2(long `x`, long `y`)```

Returns in radians the arc tangent of two numeric variables, `x` and `y`, either of which can be int, long, or double. This is similar to calculating the arc tangent of (`y/x`), except that the signs of both arguments are used to determine the quadrant of the result tangent is `x`.

### bitand()

```int  bitand(int `x`, int `y`)
long bitand(long `x`, long `y`)```

Returns the result of a bitwise AND operation on values `x` and `y`. Compares each bit in `x` to the corresponding bit in `y`. If both bits are 1, sets the corresponding result bit to 1. Thus, `bitand(3,6)` returns 2. (0011 AND 0110 returns 0010)

### bitnot()

```int  bitnot(int `x`)
long bitnot(long `x`)```

Returns the result of a bitwise NOT operation on value `x`. Compares each bit in `x` and sets its bitwise complement in the corresponding result bit, switching 0 for 1 and 1 for 0. Thus, `bitnot(5)` returns -6. (0...0101 returns 1...1010)

### bitor()

```int  bitor(int `x`, int `y`)
long bitor(long `x`, long `y`)```

Returns the result of a bitwise inclusive OR operation on values `x` and `y`. Compares each bit in `x` to the corresponding bit in `y`. If either bit is 1 or both bits are 1, sets the corresponding result bit to 1; otherwise, sets the corresponding result bit to 0. Thus, `bitor(12,7)` returns 15. (1100 OR 0111 returns 1111)

### bitxor()

```int  bitxor(int `x`, int `y`)
long bitxor(long `x`, long `y`)```

Returns the result of a bitwise XOR (exclusive OR) operation on values `x` and `y`. Compares each bit in `x` to the corresponding bit in `y`. If either bit is 1 but not both bits, sets the corresponding result bit to 1; otherwise, sets the corresponding result bit to 0. Thus, `bitxor(12,7)` returns 11. (1100 OR 0111 returns 1011)

### cbrt()

Function syntax:

```double cbrt(int `x`)
double cbrt(double `x`)
double cbrt(long `x`)```

Returns the (real) cube root of `x`.

### ceil()

Function syntax:

```double ceil(int `x`)
double ceil(double `x`)
double ceil(long `x`)```

Returns the smallest double value that is greater than or equal to the argument and is equal to a mathematical integer. The function name, `ceil`, is shorthand for ceiling. The argument can be an int, double or long. Examples:

• `ceil(7.01)` returns `8.0`.

• `ceil(8.0)` returns `8.0`.

• `ceil(8)` returns `8.0`.

• `ceil(8L)` returns `8.0`.

### choose()

Function syntax:

`double choose(int `n`, int `k`)`

Returns the n choose k binomial coefficient, which represents the number of ways to choose `k` unordered elements from a set of `n` elements. Specify `n` as a positive nonzero integer and `k` as a power of `n`, where `k` is greater than or equal to 1 and less than or equal to `n`–1. If `n` < 0 or `k` > n, 0.0 is returned.

Compare the permute() function.

### cos()

Function syntax:

```double cos(int `x`)
double cos(double `x`)
double cos(long `x`)```

Returns the cosine of `x`, where `x` is given in radians.

### cosh()

Function syntax:

```double cosh(int `x`)
double cosh(double `x`)
double cosh(long `x`)```

Returns the hyperbolic cosine of its argument.

### exp()

Function syntax:

```double exp(int `x`)
double exp(double `x`)
double exp(long `x`)```

Returns the value of e (the base of natural logarithms) raised to the power of `x`. The argument `x` is a small value of type int, double, or long.

### expm1()

Function syntax:

```double expm1(int `x`)
double expm1(double `x`)
double expm1(long `x`)```

Returns (e^x) -1. The argument `x` is a small value of type int, double, or long.

### factorial()

Function syntax:

`double factorial(int `n`)`

Returns the factorial of the provided integer between 0 and 170. Arguments below zero return NaN; arguments above 170 exceed the size of a double and return Infinity.

### floor()

Function syntax:

```double floor(int `x`)
double floor(double `x`)
double floor(long `x`)```

Returns the largest double value that is less than or equal to the argument `x`, and equal to a mathematical integer. The argument can be an int, double, or long. Examples:

• `floor(7.01)` returns `7.0`.

• `floor(8.0)` returns `8.0`.

• `floor(8)` returns `8.0`.

• `floor(8L)` returns `8.0`.

### gaussianrandom()

Function syntax:

`double gaussianrandom(double `mean`, double `stdev`)`

Takes two arguments of type double, the mean and standard deviation of a Gaussian distribution, and returns a double randomly chosen from that distribution.

Gaussian distributions are also known as normal distributions (in statistics) or bell curves (in testing). The function is useful for simulating data from any population that tends to follow a normal distribution.

For example, to simulate the SAT score for a random student, we could call gaussianrandom(1500, 100), where 1500 and 100 might be the national mean and standard deviation respectively.

### ln()

Function syntax:

```double ln(int `x`)
double ln(double `x`)
double ln(long `x`)```

Returns the natural logarithm of `x`.

### log10()

Function syntax:

```double log10(int `x`)
double log10(double `x`)
double log10(long `x`)```

Returns the base-10 logarithm of `x`.

### log1p()

Function syntax:

```double log1p(int `x`)
double log1p(double `x`)
double log1p(long `x`)```

Returns the natural logarithm of the sum of the argument and 1.

### lshift()

```int  lshift(int `x`, int `y`)
long lshift(long `x`, long `y`)```

Returns the result of a bitwise left shift of the bits in `x` by the number of bit positions `y`. Thus, `lshift(5,1)` returns 10. (0101 returns 1010)

### max()

Function syntax:

`T max(T `e1`, ..., T `en`)`

This function returns the maximum non-null value using the greater-than relational operator across all its arguments `e`1 through `en`, which must all be expressions resolving to the same data type. The data type, T, of the returned value is the same as the arguments.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the aggregate version of max(), the maxelement() function for lists, and the aggregate maxn() function. Use this simple function `max(list, , list)` to determine the larger of two or more lists. In an aggregate context, `max(list)` returns the maximum list in the aggregate window. By contrast, use `maxelement()` to compare the element values of a single list.

### maxdouble()

Function syntax:

`double maxdouble()`

This function returns the maximum value for the double data type.

### maxint()

Function syntax:

`int maxint()`

This function returns the maximum value for the int data type.

### maxlong()

Function syntax:

`long maxlong()`

This function returns the maximum value for the long data type.

### min()

Function syntax:

`T min(T `e1`, ..., T `en`)`

This function returns the minimum non-null value using the less-than relational operator across all its arguments `e`1 through `en`, which must all be expressions resolving to the same data type. The data type of the returned value is the same as the arguments.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the aggregate version of min(), the minelement() function for lists, and the aggregate minn() function. Use this simple function `min(list, , list)` to determine the smaller of two or more lists. In an aggregate context, `min(list)` returns the minimum list in the aggregate window. By contrast, use `minelement()` to compare the element values of a single list.

### mindouble()

Function syntax:

`double mindouble()`

This function returns the minimum value for the double data type.

### minint()

Function syntax:

`int minint()`

This function returns the minimum value for the int data type.

### minlong()

Function syntax:

`long minlong()`

This function returns the minimum value for the long data type.

### orall()

Function syntax:

`bool orall(bool `b1`, bool `b2`, ...)`

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical OR operation on all arguments. For example, `orall(true, true, false, true, false)` returns `true`. Use orall() as a falsity detector function.

Null arguments have special handling. A null argument does not change the result of the evaluation if a `true` value is among the arguments: `orall(null, false, true, null, false)` still returns `true`. However, if one or more arguments is `null` while all other arguments are `false`, the function returns `null`: `orall(null, false, false, null, false)` returns `null`.

The orall() function follows the logic expressed in this statement:

if any argument is true, return true
else if any argument is null, return null
else return false

See also the andall() simple function and the aggregate versions of andall() and orall().

### permute()

Function syntax:

`double permute(int `n`, int `k`)`

Using permutation in the combinatorics sense, this function returns the number of ordered arrangements of `k` distinct elements that can be select from a set of `n` elements. Specify `n` as a positive nonzero integer and `k` as a number of elements of `n`, where `k` is between zero and `n`. If `n` < 0 or `k` > n, 0.0 is returned.

Compare the choose() function.

### pow()

Function syntax:

```double pow(int `x`, int `y`)
double pow(double `x`, double `y`)
double pow(long `x`, long `y`)```

Returns the value of `x` raised to the power of `y`. Accepts an int, double, or long, and returns a double.

### random()

Function syntax:

```double random()
int    random(int `maxrange`)```

Without arguments, returns a pseudo-random double value with positive sign, greater than or equal to 0.0 and less than 1.0.

With an integer argument `maxrange`, returns a pseudo-random integer with positive sign, greater than or equal to 0 and less than `maxrange`. Use this form to pick a random number between 0 and `maxrange`—1, inclusive. For example, to simulate rolling a six-sided die, the expression `maxrange(6) + 1` returns a random integer between 1 and 6 inclusive.

### round()

Function syntax:

```long round(double `e`)
long round(int `e`, int `n`)
long round(double `e`, int `n`)
long round(long `e`, int `n`)```

With one argument, a double, returns the closest long to the argument. Floating-point numbers are NOT supported.

With two arguments, round() is similar to the round() found in Microsoft Excel, but not identical to it. Specify a double, long, or int in the first argument. Specify an int in the second argument to designate the number of digits to round the first argument to. Use 0 to specify rounding to an integer, a positive integer to round the digits after the decimal point, or a negative number to specify rounding digits before the decimal point. If the first argument is an int or long, the second argument must be negative to actually perform rounding; otherwise the first argument is returned unchanged.

### rshift()

```int  rshift(int `x`, int `y`)
long rshift(long `x`, long `y`)```

Returns the result of a logical bitwise right shift of the bits in `x` by the number of bit positions `y`. This is the equivalent of the Java `>>` operator. Thus, `rshift(5,1)` returns 2. (0101 returns 0010)

### securerandom()

Function syntax:

`double securerandom()`

A more secure form of random() that generates random bytes, converted to return a double.

### sign()

Function syntax:

```int sign(int `x`)
int sign(double `x`)
int sign(long `x`)```

Returns the sign of its argument: –1 if less than zero, 0 if equal to zero, and 1 if greater than zero.

### sin()

Function syntax:

```double sin(int `x`)
double sin(double `x`)
double sin(long `x`)```

Returns the sine of `x`, where `x` is given in radians.

### sinh()

Function syntax:

```double sinh(int `x`)
double sinh(double `x`)
double sinh(long `x`)```

Returns the hyperbolic sine of its argument.

### sqrt()

Function syntax:

```double sqrt(int `x`)
double sqrt(double `x`)
double sqrt(long `x`)```

Returns the nonnegative square root of `x`.

### tan()

Function syntax:

```double tan(int `x`)
double tan(double `x`)
double tan(long `x`)```

Returns the tangent of `x`, where `x` is given in radians.

### tanh()

Function syntax:

```double tanh(int `x`)
double tanh(double `x`)
double tanh(long `x`)```

Returns the hyperbolic tangent of a double value.

### to_degrees()

Function syntax:

`double to_degrees(double `x`)`

Converts an angle measured in radians to an approximately equivalent angle measured in degrees.

Function syntax:

`double to_radians(double `x`)`

Converts an angle measured in degrees to an approximately equivalent angle measured in radians.

### unsignedrshift()

```int  unsignedrshift(int `x`, int `y`)
long unsignedrshift(long `x`, long `y`)```

Returns the result of a bitwise right shift of the bits in `x` by the number of bit positions `y`, filling the leftmost bit position with 0. This is the equivalent of the Java `>>>` operator. Thus, `rshift(-1,1)` returns -1 (1111 1111 1111 1111), but `unsignedrshift(-1,1)` returns 2147483647 (0111 1111 1111 1111).

### xorall()

`bool  xorall(bool `a`, bool `b`, ...)`

Takes two or more values of type boolean, or expressions that resolve to type boolean, and returns the results of a logical XOR (exclusive OR) operation on all arguments.

If any argument is null, the function returns null. Otherwise, it returns true if an odd number of arguments are true and returns false if an even number of arguments are true. Conceptually, xorall evaluates successive arguments pairwise to produce its result.

`xorall(true, true)` returns `false`.

`xorall(true, false)` returns `true`.

`xorall(false, false)` returns `false`.

`xorall(true, null)` returns `null`.

`xorall(true, true, true)` returns `true`, but

`xorall(true, true, true, true)` returns `false`.

The xorall function follows the logic expressed in this statement:

if all arguments are false, return false
else if an odd number of arguments are true, return true
else return false

See also the andall() and orall() simple functions and the aggregate versions of andall() and orall().

## Simple Functions: NaN

This category includes the following functions:

 isnan() nan() notnan()

The functions in this group detect NaN (not a number) values.

### isnan()

Function syntax:

```bool isnan(double `value`)
bool isnan(int `value`)
bool isnan(long `value`)
bool isnan(timestamp `value`)```

Returns `true` if the argument is NaN (not a number), or returns `null` if the argument is null. Returns `false` otherwise.

### nan()

Function syntax:

`double nan()`

This function returns a NaN literal as type double.

### notnan()

Function syntax:

```bool notnan(double `value`)
bool notnan(int `value`)
bool notnan(long `value`)
bool notnan(timestamp `value`)```

Returns `false` if the argument is NaN (not a number), or returns `null` if the argument is null. Returns `true` otherwise.

Notice that `notnan()` returns the opposite of `isnan()`, except if the argument is null, in which case they both return null.

## Simple Functions: Runtime

This category includes the following functions:

 getAllHostIPs() getClientIP() getCompilerProperty() get_conf_param(), get_`datatype`_conf_param() getContainer() getHostName() getLeadershipStatus() getNodeName() getParallelRoot() getPath() getServerURI() getSetCompilerProperty() getTableSize

### getAllHostIPs()

Function syntax:

`list(string) getAllHostIPs()`

Returns as a list(string) all the non-loopback IP addresses assigned to the machine on which StreamBase Server is running.

### getClientIP()

Function syntax:

`string getClientIP(string `connectionID`)`

Returns as a string the IP address associated with the specified `connectionID`. Obtain the `connectionID` string from the first field returned by an sbadmin listConnections command, or from the `connection` stream in the `system` container.

Use this function to determine by IP address the actual computer connecting to, or recently disconnected from, a running instance of StreamBase Server.

This function is not expected to work at the command prompt with sbd --eval.

### getCompilerProperty()

Function syntax:

`string getCompilerProperty(string `sysprop`)`

Returns the value of the specified system property that StreamBase Server is actually using. By contrast, getSetCompilerProperty() answers the question of whether this value was explicitly set, and systemproperty() only returns values explicitly set with `-Dsome.system.property=value`.

Compare the following:

```sbd -e "systemproperty('streambase.querytable.default-memory-tables') (string) null sbd -e "getSetCompilerProperty('streambase.querytable.default-memory-tables') (string) UNSET sbd -e "getCompilerProperty('streambase.querytable.default-memory-tables') (string) MEMORY```

These results tell you that the `streambase.querytable.default-memory-tables` system property was not explicitly set at the command line with `-D`, was not explicitly set in a property table and is therefore using the compiled-in default value, and that this default value is MEMORY.

### get_conf_param()

Function syntax:

```string  get_conf_param(`default-value`, `path1`, `path2`, `path3`, ...)
boolean get_boolean_conf_param(`default-value`, `path1`, `path2`, `path3`, ...)
int     get_int_conf_param(`default-value`, `path1`, `path2`, `path3`, ...)
long    get_long_conf_param(`default-value`, `path1`, `path2`, `path3`, ...)
double  get_double_conf_param(`default-value`, `path1`, `path2`, `path3`, ...)```

Returns the value portion of a name-value pair specified with a `<param>` element in the currently active server configuration file, usually named `sbd.sbconf`. Use the typed versions of the function to convert the returned value to the specified data type.

These functions only return values from `<param>` elements. Other configuration file elements cannot be accessed with these functions.

The first argument, `default-value`, specifies a value to return if parsing the configuration file returns nothing. The arguments `path1`, `path2` ... `pathn` specify element and attribute names in the XML path to the value of interest (without counting the `<streambase-configuration>` top-level element).

For example, the value of the TCP port might be set in the configuration file with an entry like the following:

```<server>   <param name="tcp-port" value="9900" /> </server>```

To extract the TCP port value from the configuration file, use the int version of the `get_conf_param()` function as follows:

`get_int_conf_param(10000, "server", "tcp-port")`

In another example, some of the values set in the configuration file's `<security>` element are nested deeper:

```<security>   <ssl-authentication>     <param name="keystore" value="../security/signed.keystore"/>     <param name="keystore-password" value="secret"/>     <param name="key-password" value="secret1"/>   </ssl-authentication> </security>```

To extract the keystore-password value, use the following function:

`get_conf_param("verysecret", "security", "ssl-authentication", "keystore-password")`

These functions are not expected to work at the command prompt with sbd --eval.

### getContainer()

Function syntax:

`string getContainer()`

Returns as a string the name of the container in which the current application is loaded. Can be used in expressions like this example:

`if getContainer() == "default" ...`

This function is not expected to work at the command prompt with sbd --eval.

### getHostName()

Function syntax:

`string getHostName()`

Returns as a string the machine name of the host on which StreamBase Server is running. Can be used in expressions like this example:

`if getHostName() == "fasthost" ...`

Function syntax:

`string getLeadershipStatus()`

Returns the string LEADER or NON_LEADER, which represent the current StreamBase Server's leadership status as a member of a high availability cluster. Servers have LEADER status by default on startup, which can be changed with the sbadmin setLeadershipStatus command.

This function is not expected to work at the command prompt with sbd --eval.

### getNodeName()

Function syntax:

`string getNodeName()`

Returns as a string the name of the node in which the current application is loaded. Can be used in expressions like this example:

`if getNodeName() == "primaryserver" ...`

This function is not expected to work at the command prompt with sbd --eval.

### getParallelRoot()

Function syntax:

`string getParallelRoot()`

Returns the qualified path name of the top-level container or Module Reference that is the root of the current parallel region. See Parallel Region Defined.

If this function is called in a simple module without concurrency settings, it returns the path name of the container that holds the operator that calls this function. If this function is called in a module whose Module Reference has a concurrency setting of Run in a parallel region, it returns the qualified path to that Module Reference. If the Module Reference also has a Multiplicity setting of multiple, and the number of instances is set to 2 or more, the returned path is appended with a colon and a zero-based integer representing the instance number of the calling module.

Thus:

• When called in a simple module in the default container, this function returns `default`.

• When called in a module, B, that has a Module Reference named `B_ref` in the top-level module, A, this function returns `default.B_ref` if the Module Reference is marked with the Run in a parallel region concurrency option.

• In addition to the parallel region setting, if `B_ref` is also configured with a Multiplicity setting of 2, this function returns either `default.B_ref:0` or `default.B_ref:1`.

See StreamBase Path Notation for details on the qualified path names that can be returned.

This function is not expected to work at the command prompt with sbd --eval.

### getPath()

Function syntax:

`string getPath(string `componentname`)`

Returns the qualified path of the specified component of a StreamBase application. Can be used for verifying the container name for the specified component at run time, in expressions like this example:

`if getPath("Symbol") == "default.Symbol" ...`

See StreamBase Path Notation for details on the qualified path names that can be returned.

This function is not expected to work at the command prompt with sbd --eval.

### getServerURI()

Function syntax:

`string getServerURI()`

Returns the StreamBase URI of the currently running StreamBase Server. The value returned is from the point of view of the server, and thus always returns the hostname `localhost`. Use this function to obtain the port number and any authentication parameters appended to the URI.

Use this function instead of `systemproperty("streambase.uri")` to obtain the same information.

This function is not expected to work at the command prompt with sbd --eval.

### getSetCompilerProperty()

Function syntax:

`string getSetCompilerProperty(string `sysprop`)`

Returns the value of the specified system property if you explicitly set it, or returns UNSET otherwise. The UNSET result means the Server is using the default value for the specified property.

By contrast, getCompilerProperty() returns the actual value of the specified property, and systemproperty() only returns values explicitly set with `-Dsome.system.property=value`.

Compare the following:

```sbd -e "systemproperty('streambase.querytable.default-memory-tables') (string) null sbd -e "getSetCompilerProperty('streambase.querytable.default-memory-tables') (string) UNSET sbd -e "getCompilerProperty('streambase.querytable.default-memory-tables') (string) MEMORY```

These results tell you that the `streambase.querytable.default-memory-tables` system property was not explicitly set at the command line with `-D`, was not explicitly set in a property table and is therefore using the compiled-in default value, and that this default value is MEMORY.

### getTableSize()

Function syntax:

`int getTableSize(string `QTname`)`

Returns the number of rows in the specified Query Table. When used in a Query operator that is in the process of adding or deleting table rows, the result is only approximate.

This function is not expected to work at the command prompt with sbd --eval.

## Simple Functions: Statistical Calculations

This category includes the following functions:

 correlation_coefficient() goodman_kruskal_gamma() kendall_tau() pearson() spearmans_rank()

### correlation_coefficient()

Function syntax:

```list(double) correlation_coefficient(string name, int index,
list(double) var1, list(double) var2 ...)```

Returns a list of correlation coefficients, containing as many members as there are list arguments (there must be at least two). The output list of doubles is in the same order as the list arguments. Given `n` variables, the index argument specifies which of those `n` lists holds the primary variable (for which the returned correlation coefficient is `1.0`).

### Note

• The correlation_coefficient of 0 items is null.

• The correlation_coefficient of 1 item is 1.0.

• The correlation_coefficient of N identical items is 1.0.

The name argument specifies the type of correlation coefficient to compute, taken from this list:

• `goodman_kruskal_gamma`

• `kendall_tau`

• `pearson`

• `spearmans_rank`

Click any name above to see the description of the statistic. You can abbreviate and approximate the spellings you supply for the name. You can call any of these functions directly by its full name, as well as using the above syntax. However, when you call these functions directly, only two variables can be correlated per call, and the output is a scalar double.

If any input list contains null values, the function returns null for its correlation coefficient(s).

### goodman_kruskal_gamma()

Function syntax:

`double goodman_kruskal_gamma(list(double) var1, list(double) var2)`

Returns the Goodman-Kruskal Gamma correlation coefficient between values of variables in two lists as a double value. The Goodman-Kruskal Gamma statistic is a rank correlation of concordant pairs. If any field values are null, the returned value is null.

To perform multivariate Goodman-Kruskal Gamma correlations, use the simple `correlation_coefficient` function.

### kendall_tau()

Function syntax:

`double kendall_tau(list(double) var1, list(double) var2)`

Returns the normalized Kendall Tau distance metric between values of variables in two lists as a double value. The lists must have the same number of elements. The Kendall Tau distance statistic counts the number of pairwise disagreements between the ranks of corresponding items in two fields. Larger values indicate greater disagreement in rank ordering, such that a return value of 0 means complete agreement and 1.0 means complete disagreement.

To perform multivariate Kendall Tau correlations, use the simple `correlation_coefficient` function.

### pearson()

Function syntax:

`double pearson(list(double) var1, list(double) var2)`

Returns the Pearson product-moment correlation coefficient between values of variables in two lists as a double value that indicates the degree of linear dependence between two variables. The Pearson statistic is defined as the covariance of the two variables divided by the product of their standard deviations. The term product-moment refers to the mean (the first moment about the origin) of the product of the mean-adjusted random variables. A return value of `-1` indicates a perfect negative linear correlation. A value of `0` indicates no correlation (random variates). A value of `1` indicates perfect positive linear correlation. If any list elements are null, the returned value is null. If the variance of either input variable is zero, the function returns `NaN`.

To perform multivariate Pearson correlations, use the simple `correlation_coefficient` function.

### spearmans_rank()

Function syntax:

`double spearmans_rank(list(double) var1, list(double) var2)`

Returns the Spearman's Rank correlation coefficient between values of variables in two lists as a double value. Spearman's statistic is defined as the Pearson correlation coefficient between the ranked variables. A return value of `-1` indicates a perfect negative correlation (reverse ordering). A value of `0` indicates no concordance of ranks. A value of `1` indicates identical rank orderings. If any list values are null, the returned value is null.

To perform multivariate Spearman's Rank correlations, use the simple `correlation_coefficient` function.

## Simple Functions: Strings

This category includes the following functions:

 contains() endswith() format() indexof() (for strings) isempty() (for strings) lastindexof() (for strings) length() (for strings) lower() ltrim() randomstring() regexmatch() regexmatch_ignorecase() regexreplace() regexsplit() regextuple() rtrim() split() startswith() string() strlen() strresize() strresizetrunc() substr() trim() upper()

The behavior of functions that deal with strings changes when Unicode support is enabled for StreamBase Server as described in Unicode Support. These cases are noted for each function in this section.

### contains()

Function syntax:

`bool contains(string haystack, string needle)`

Takes two arguments: a string and a search string. Returns `true` if needle is a substring of haystack, or `false` if not.

If either argument is null, returns `null`.

The IN infix operator is an alternate syntax for contains().

Also see the list version of contains().

### endswith()

Function syntax:

`bool endswith(String biggerString, String smallerString)`

Takes two arguments: a string to search within and a (sub)string to search for. Returns `true` if biggerString ends with smallerString, or `false` if not.

If either argument is null, returns `null`.

### format()

Function syntax:

`string format(string `format-string`, `arg0`, `arg1`, ..., `argn`)`

Returns a formatted string for arguments `arg0` through `argn`, each formatted according to a specification in `format-string`.

Format specifications are patterns specified in the manner of class `java.text.MessageFormat`.

For example, an incoming stream containing NYSE stock ticks might have a field named `symbol` containing four-character stock symbols. Let's say a downstream operator expects symbols to be prefixed with a string identifying their source stock exchange. In this case, use an expression like the following in a Map operator:

`format("NYSE:{0}", symbol)`

An incoming stream might have a `price` field specified to four decimal places. When the flow of tuples reaches an output stream that must be formatted for human readability, you can use an expression like the following to round the `price` field to two decimal places. This example assumes that the `price` field contains values of type double, which matches the format specifier ###.##.

`format("{0,number,###.##}", price)`

The format for argument `arg0` is specified with the portion of the `format-string` that defines `{0}`. The format for `arg1` is in the `format-string` portion that defines `{1}`, and so on. The data type of `arg0` must match the type expected by the `format-string` specifier for `{0}`, and so on for each argument.

If you use `format()` to modify a tuple, use {0}, {1}, {2} and so on to specify the format for each field.

See the documentation for `java.text.MessageFormat` for further details and more examples.

### indexof()

Function syntax:random

`int indexof(string `haystack`, string `needle` [, int `start`])`

Returns the index of the first instance of the string `needle` within the string `haystack`, for which the index is greater than or equal to `start`. For all supported Unicode character sets, this function returns the index in number of graphemes, not bytes. For string arguments, if `needle` is not found, the function returns a value of -1.

If the search is successful, the value returned is always greater than or equal to `start`. If `start` is unspecified, it is taken to be 0, which designates the first character of the string or the first element of the list.

### isempty()

Function syntax:

`bool isempty(String str)`

Takes one string argument, str. Returns `true` if str is null or empty; `false` if not.

### lastindexof()

Function syntax:

`int lastindexof(string `haystack`, string `needle` [, int `lastStart`])`

Returns the index within string `haystack` of the rightmost instance of string `needle`, for which the index is less than or equal to `lastStart`. For all supported Unicode character sets, this function returns the index in number of graphemes, not bytes. For string arguments, if `needle` is not found, the function returns a value of -1.

If the search is successful, the value returned is always less than or equal to `lastStart`. If `lastStart` is unspecified, it is taken to be the index of the last grapheme of the string or the last element of the list.

### length()

Function syntax:

`int length(string `x`)`

Returns the number of graphemes in the string, for ASCII and all supported Unicode character sets. If the argument is a null string, the returned value is `null`.

### lower()

Function syntax:

`string lower(string `str`)`

Returns its argument in lowercase letters.

### ltrim()

Function syntax:

```string ltrim(string `str`)
string ltrim(string `s1`, string `s2`)```

The same as trim(), except that the first argument is trimmed on the left side only.

### randomstring()

Function syntax:

`string randomstring(int `len`)`

Returns a string of size len containing random ASCII characters A-Z.

### regexmatch()

Function syntax:

`bool regexmatch(string `regex`, string `str`)`

Attempts to match the specified regular expression in `regex` against the entire specified string in `str`. To match any substring of the specified string, use wildcards before and after your regular expression. For example, the regex `IBM` matches only when the entire string is "IBM", while the regex `.*IBM.*` matches if "IBM" is anywhere in the supplied string `str`.

### regexmatch_ignorecase()

Function syntax:

`bool regexmatch_ignorecase(string `regex`, string `str`)`

Same as regexmatch(), but the supplied `regex` matches `str` without regard to case.

### regexreplace()

Function syntax:

`string regexreplace(string `input_str`, string `regex_str`, string `replace_str` [, int `limit`])`

Accepts an input string, a regular expression pattern, a replacement string, and an optional integer, `limit`. Returns `input_str` modified by replacing occurrences of `regex_str` in `input_str` with `replace_str` up to `limit` times. If `limit` is negative or omitted, all occurrences in `input_str` are replaced.

See Oracle's Pattern class documentation for the supported regular expression syntax. The `replace_str` argument can contain references to fields in a regular expression capture group, in the same way as java.util.regex.Matcher.appendReplacement().

Usage note: the similar replace() command operates on string literals, and substitutes a string literal for a literal search string. The regexreplace() command operates on string literals, but substitutes its string literal for a regular expression search, and has support for capture groups. It would appear at first glance that regexreplace() without the optional `limit` argument is the same as replace(), but the difference in the second argument means the two commands have different semantics. For example, consider the following commands and their results:

`replace("this.string", ".", ",")` returns `this,string`

`regexreplace("this.string", ".", ",")` returns `,,,,,,,,,,,`

`regexreplace("this.string", "\\.", ",")` returns `this,string`

To specify a capture group, surround an expression with parentheses `()`. This saves the section of the input string matching the capturing group for later recall via a backreference that you specify in `replace_str`.

A backreference is specified in the regular expression as a dollar sign (`\$`) followed by a digit indicating the number of the group to be recalled, with a replacement string. For example:

`regexreplace('a-b','([a-zA-Z])-([a-zA-Z])','\$1_\$2')` returns `a_b`

### regexsplit()

Function syntax:

`list(string) regexsplit(string `input_str`, string `regex_delimiter` [, int `limit`])`

Use the regexsplit() function in the same ways as the split() function, except that you can specify a regular expression as the field delimiter. See Oracle's Pattern class documentation for the supported regular expression syntax.

The following example parses `input_str` into list elements using either colon, semicolon, or a space as the field delimiter:

`regexsplit("aaa:bbb;ccc:ddd;eee fff", "[:; ]")`

returns

`list(string) [aaa, bbb, ccc, ddd, eee, fff]`

See split() for further details and for instructions on using the optional `limit` argument.

### regextuple()

Function syntax:

`tuple regextuple(tuple `tuplespec`, string `regex` , sring `input`)`

Use the regextuple() function to parse a string into a tuple.

For the first argument, provide a tuple specification in the form of an empty tuple. To do so, use type conversion functions in a tuple() function to define the data type of each field, and at the same time to create a null field of each type. The first argument of the example below creates a tuple with the required schema, and with each field filled with null: `((symbol string, price double)) null,null`.

The second argument is a Java regular expression that uses named capturing groups, where the capture group names are the same as, and in the same order as, the tuple fields specified in the first argument.

The third argument is the string you want to parse into tuple format, possibly extracted from CSV or other plain text input upstream.

See Oracle's Pattern class documentation for the supported regular expression syntax.

In the following example, notice that you must escape the predefned character class strings with double backslashes:

`regextuple(tuple(string() as symbol, double() as price), "(?<symbol>[A-Z]+)\\s+(?<price>\\d+(\\.\\d+)?)", "STP 42.00")`

This returns:

`((symbol string, price double)) STP,42.0`

### Tip

To run the above example at the command prompt with sbd --eval, surround the entire regextuple() function with single quotes. (The following single-line command is broken onto two lines for publication clarity.)

```sbd --eval 'regextuple(tuple(string() as symbol, double() as price),  "(?<symbol>[A-Z]+)\\s+(?<price>\\d+(\\.\\d+)?)", "STP 42.00")'```

### replace()

Function syntax:

`string replace(string `main_str`, string `search_str`, string `replace_str`)`

Returns `main_str` after locating all instances of `search_str` and replacing them with `replace_str`. The `search_str` and `replace_str` arguments are read as case-sensitive, literal strings. Any wildcard character is read as the literal character, not a wildcard.

Examples:

For a string field named `UStext` in the incoming stream:  `replace(UStext, "honor", "honour")`

`replace('Open the pod bay doors, IBM.', 'IBM' , 'HAL')`

`replace('1,2,3,4,5', ',' , '-')`

See also the regexreplace() command and the usage note there comparing with replace() syntax.

### rtrim()

Function syntax:

```string rtrim(string `str`)
string rtrim(string `s1`, string `s2`)```

The same as trim(), except that the first argument is trimmed on the right side only.

### split()

Function syntax:

`list(string) split(string `input_str`, string `delimiter` [, int `limit`])`

Returns a list of strings parsed from `input_str`, splitting the input string into pieces delimited by the character or phrase specified as `delimiter`. The `delimiter` character or phrase itself is not preserved in the resulting list when a match is found. For example:

`split("aaa:bbb:ccc:ddd:eee", ":")`

returns

`list(string) [aaa, bbb, ccc, ddd, eee]`

If the delimiter character or phrase is not found in `input_str`, then `input_str` is returned verbatim as the only element of the resulting list. If the delimiter character or phrase occurs adjacent to another instance of the delimiter, the result is an empty list element. For example, `split("aaa::bbb", ":")` results in the following three-element list: [aaa,  ,bbb].

You can specify an optional third argument, `limit`, an integer specifying the number of list elements in the result list. Use the limit argument in the following ways:

• If `limit` is any negative number, `input_str` is split into as many list elements as necessary, with no limit on the length of the list. All empty list elements are preserved, including any trailing empty elements. For example, `split("aaa::bbb::", ":", -1)` results in a five-element list: [aaa,  ,bbb,  ,  ].

• If `limit` is zero or unspecified, `input_str` is split into as many list elements as necessary, but any trailing empty elements are discarded. For example, `split("aaa::bbb::", ":", 0)` and `split("aaa:bbb::", ":")` both result in a three-element list: [aaa,  ,bbb].

• If `limit` is positive, `input_str` is split into at most limit-1 elements, plus one more element containing the remainder of `input_str`. For example, `split("aaa:bbb:ccc:ddd:eee", ":", 3)` results in a three-element list: [aaa, bbb, ccc:ddd:eee].

Use this feature to split a string into the components you want to process, followed by a single catch-all element containing the remainder.

See also the regexsplit() function, a variation of split() that allows you to use a regular expression to specify the delimiter.

### startswith()

Function syntax:

`bool startswith(String biggerString, String smallerString)`

Takes two arguments: a string to search within and a (sub)string to search for. Returns `true` if biggerString begins with smallerString, or `false` if not.

If either argument is null, returns `null`.

### string()

Converts its argument into a string. See string() in the Simple Functions: Type Conversions section.

### strlen()

Function syntax:

`int strlen(string `str`)`

Returns the length of the string `str`. This function returns the number of graphemes, not bytes, for all supported Unicode character sets.

### strresize()

Function syntax:

`string strresize(string `str`, int `length`)`

The `strresize()` function performs no action: it simply returns its `str` argument and ignores the `length` argument. The function is preserved for compatibility with previous StreamBase releases, in which the StreamBase string data type had a fixed maximum size. To perform string truncation, use substr() or strresizetrunc().

### strresizetrunc()

Function syntax:

`string strresizetrunc(string `str`, int `length`)`

Returns its argument `str`, truncated to the number of bytes specified in `length`.

### Note

When using `strresizetrunc()` with Unicode multibyte character sets, count the `length` argument in bytes, not graphemes.

### substr()

Function syntax:

`string substr(string `str`, int `beginchar`, [int `length]`)`

Returns a substring of the string `str`, starting at the zero-indexed character `beginchar` with optional size of `length` characters.

For all supported Unicode character sets, specify `length` in number of graphemes, not bytes.

### trim()

Function syntax:

```string trim(string `str`)
string trim(string `s1`, string `s2`)```

With one argument, returns the specified string with leading and trailing spaces removed.

With two arguments, the second argument contains the characters to be removed from the string specified in the first argument.

### upper()

Function syntax:

`string upper(string `str`)`

Returns its argument in uppercase letters.

## Simple Functions: System

This category includes the following functions:

Function syntax:

`bool advanceTimeBy(long `millis`)`

Use this function to increment the system time of an instance of sbd running in a controlled time environment by the specified number of milliseconds.

This function only works if you have configured the current project's `sbd.sbconf` file for testing purposes with elements like the following. See Using the StreamBase JUnit TimeService for further information.

```<time-service-configurations>   <type>CONTROLLABLE</type> </time-service-configurations>```

Function syntax:

`bool advanceTimeTo(timestamp `timevalue`)`

Use this function to increment the system time of an instance of sbd running in a controlled time environment to the specified timestamp's date and time.

This function only works if you have configured the current project's `sbd.sbconf` file for testing purposes with elements like the following. See Using the StreamBase JUnit TimeService for further information.

```<time-service-configurations>   <type>CONTROLLABLE</type> </time-service-configurations>```

### getTargetTime

Function syntax:

`timestamp getTargetTime()`

Use this function to retrieve a timestamp representing the time StreamBase Server thinks it is right now. This function is useful in a controlled time environment to retrieve a custom starting point time you have set with elements like the following in the project's `sbd.sbconf` file. Notice that the getTargetTime() timestamp is the same as that returned by now(), because in a controlled time environment, the server's sense of what current time means has been deliberately changed, and is temporarily not the same as wall clock time.

```<time-service-configurations>   <type>CONTROLLABLE</type>   <target-time>2015-10-06 12:00</target-time> </time-service-configurations>```

This function is not expected to work at the command prompt with sbd --eval.

### nanotime()

Function syntax:

`long nanotime()`

Returns the current value of the system timer as a long value in nanoseconds. This function provides nanosecond precision, but not necessarily nanosecond resolution (that is, how frequently the value changes). Therefore, use this function only to measure elapsed time between two invocations, like the following example:

```long startcount = nanotime() // process tuple data long elapsedtime = nanotime() - startcount```

Differences in two calls to nanotime() are not accurate for a span greater than 263 (about 292 years). Do not compare nanotime() values between different JVM instances or even across JVM restart boundaries on the same machine.

### sleep()

Function syntax:

`timestamp sleep(timestamp `t`)`

Pauses execution of the current thread for the specified time period. The argument must be an interval timestamp, and the return value is the argument. This function can be useful in conjunction with operators that run in a separate thread, and is provided for debugging or for rare uses. The `sleep()` function blocks the concurrent portion of an application, if concurrency is enabled, or blocks the container holding the application, if concurrency is not enabled. Therefore, use `sleep()` with care in production applications. To learn about threading in StreamBase applications, refer to Execution Order and Concurrency.

### systemenv()

Function syntax:

`string systemenv(string `key`)`

Calls the standard Java method `java.lang.System.getenv()`. Returns the string value of the environment variable specified by the `key` argument.

### systemproperty()

Function syntax:

`string systemproperty(string `key` [, string `defaultstring`])`

Calls the standard `java.lang.System.getProperty()` method. Returns the string value of the system property specified by the `key` argument. Use the standard set of system property keys described in Java documentation for `java.lang.System.getProperties()` (for example, `os.name`, `java.home`, and `user.name`), or specify a StreamBase-specific Java property.

If the optional `defaultstring` argument is used, and there is no system property for the specified key, the function returns the string specified in the `defaultstring` argument. If no default string is provided, null is returned.

### version()

Function syntax:

`string version()`

Returns as a string the three-position release number from the currently running release of TIBCO StreamBase, such as 7.5.4.

## Simple Functions: Timestamp Overview

The timestamp data type can hold either an absolute timestamp or an interval timestamp.

An absolute timestamp represents a date and time. Its value is the number of seconds between the epoch and that date and time, with a maximum precision of milliseconds. The epoch is defined as 00:00:00 of January 1, 1970 UTC.

An interval timestamp represents a duration. Its value is the number of seconds in the interval, with a maximum precision of milliseconds.

Absolute timestamps are expressed in the time format patterns of the `java.text.SimpleDateFormat` class and those shown in the description of the format_time() function below. Examples on this page are taken from the results of running sbd --eval commands.

See timestamp Data Type for information on the range of timestamp values, on adding and subtracting timestamp values, and on comparing timestamps with relational operators.

Timestamp functions are organized into the following categories, described in the next three sections:

 Timestamp Functions: Absolute Timestamps Timestamp Functions: Interval Timestamps Timestamp Functions: Timestamp Fields

## Timestamp Functions: Absolute Timestamps

This category includes the following functions, which return an absolute timestamp or a string converted from an absolute timestamp:

 date() epoch() format_time() from_gmtime() from_localtime() from_unixtime now() parse_time() strftime() strptime() today() today_utc()

### Usage Note

StreamBase offers two similar functions to convert a StreamBase timestamp to a string: format_time() and strftime(). These functions use different libraries, Java and C++ respectively, to provide support for their time format strings.

Be aware that these libraries make opposite use of the `z` and `Z` format strings, as shown in this table:

function Meaning of lowercase z or %z Meaning of uppercase Z or %Z
format_time() Time zone abbreviation (BST, EST, PDT) Time zone signed number (+0100, -0500, -0800)
strftime() Time zone signed number (+0100, -0500, -0800) Time zone abbreviation (BST, EST, PDT)

StreamBase also provides two similar functions for parsing a time format string in any format and converting it to a StreamBase absolute timestamp: parse_time() and strptime(). These are based on the same Java and C++ libraries, respectively.

The parse_time() function interprets the `z` and `Z` format strings equally. The input `timestring` can be in either text, abbreviation, or signed number format, and either case of Z interprets them as time zone specifiers.

The strptime() function interprets the `%z` and `%Z` format strings as separate identifiers. When using strptime(), specify lowercase `z` in the `formatstring` if the incoming `timestring` expresses the time zone as a signed number, such as `+0100`, `-0500`, or `-0800`. Specify uppercase `Z` in the `formatstring` in the incoming `timestring` expresses the time zone as text or a standard abbreviation: `BST`, `EST`, `PDT`, or `America/New York`.

### date()

Function syntax:

`timestamp date(timestamp `ts`)`

Accepts a timestamp `ts` and returns an absolute timestamp for the year, month, and day portion of `ts`, with all time values set to zero. This, in effect, extracts and returns the timestamp for midnight local time (00:00:00) for the date in the argument `ts`.

For example, the function now(), run on 12 Feb 2009 in the EST time zone, returns a full timestamp like this one:

`(timestamp) 2009-02-12 15:48:45.679-0500`

But the same function used as an argument for date():

`date(now())`

returns the following. Notice that all time information was changed to zero to designate midnight:

`(timestamp) 2009-02-12 00:00:00.000-0500`

### epoch()

Function syntax:

`timestamp epoch()`

Returns an absolute timestamp representing the zero point of counted time. For both Windows and UNIX, the epoch is 00:00:00 of January 1st, 1970. The value returned is adjusted for the current time zone. To see the absolute epoch, use:

`epoch() - timezoneoffset()`

Because the epoch occurred during standard time in the northern hemisphere, you may need to add a one-hour offset if you are retrieving the absolute epoch during daylight savings time:

`epoch() - timezoneoffset() + hours(1)`

Incoming tuples from a market data provider or an exchange might include a timestamp field in the form of a long value (or integer value) representing the number of milliseconds (or number of seconds) since the epoch. Use epoch() to convert such field values to an absolute timestamp.

Many functions, including `java.util.Date.getTime()`, return a long value representing the number of milliseconds since the epoch. The timestamp value in market data from many exchanges use this format. To convert such long values, `TSms`, to an absolute timestamp, use an expression like the following:

`milliseconds(TSms) + epoch()`

Functions that return UNIX time, including the UNIX `date +%s` command, generally return an integer number of seconds. To convert such integer values, `TSsec`, to an absolute timestamp, use an expression like the following:

`seconds(TSsec) + epoch()`

Use seconds() as above, for double values that include a fractional milliseconds component, such as the value returned from the Python `time.time()` function.

Use to_seconds() or to_milliseconds() to take a formatted timestamp string and return the number of seconds or milliseconds since the epoch, respectively.

### format_time()

Function syntax:

```string format_time(timestamp `ts`, string `formatstring`)
string format_time(timestamp `ts`, string `formatstring`, string `tz`)```

Converts the specified timestamp field, `ts`, to a string, using the specified `formatstring` pattern. The timestamp field can then be manipulated as a string instead of a timestamp.

Optional string argument `tz` is a time zone ID for specifying a standard name (for example `"UTC"`) or location (for example, `"Europe/Zurich"`) of a time zone that the formatted time should indicate. When used, time zone IDs must be specified exactly and are case-sensitive. When given an invalid timezone string, the function sets the time zone to the default, "GMT". For information on specifying time zone IDs, see Specifying Time Zones.

This function uses the time format patterns from the `SimpleDateFormat` class and summarized below.

For example, the function now(), run on 8 September 2014 in the EST time zone, returns a timestamp like this one:

`(timestamp) 2014-09-08 14:36:37.235-0400`

The same function used as an argument for `format_time()`:

`format_time(now(), "EEEE, MMM d, yyyy HH:mm zzzz")`

returns the following:

`(string) Monday, Sep 8, 2014 14:36 Eastern Daylight Time`

If you supply a third argument, you can shift the time zone. For example, to obtain Greenwich Mean Time:

`format_time(now(), "EEEE, MMM d, yyyy HH:mm zzzz","GMT")`

returns:

`(string) Monday, Sep 8, 2014 18:36 Greenwich Mean Time`

You can use any combination of time format designators from the `SimpleDateFormat` class. For example:

`format_time(now(), "yyyy-MM-dd G, HH:mm a")`

returns:

`(string) 2009-02-12 AD, 15:48 PM`

This function performs the same task as `strftime()`, but uses a different library source for its format pattern strings. The `strftime()` function is faster than `format_time()`, and should be used when performance is critical in your application. The format pattern strings used by `format_time()` are easier to use. See the Usage Note about the different ways this function and strftime() interpret the z and Z `formatstring` specifiers.

The following table shows the format strings supported by `format_time()`. See the `SimpleDateFormat` documentation for further details.

`formatstring` Component Meaning Examples
a AM or PM AM
d Day in the month 11, 31
D Day in the year 234
E Day of the week. Use EEEE for the full day spelling, EEE or fewer for the abbreviation. Friday, Fri
F Number of the day of the week. 2 for Tuesday
h The one-based hour on a 12-hour clock, 1 to 12. 12
H The zero-based hour on a 24-hour clock, 0 to 23. 23
k The one-based hour on a 24-hour clock, 1 to 24. 24
K The zero-based hour on a 12-hour clock, 0 to 11. 11
m The minute in the hour. 25
M The month of the year. Use MM for a two-digit month, MMM for an abbreviated month name, MMMM for the full month name. 11, Nov, November
s The second in the minute. 23
S The millisecond fraction of the second 456
w The number of the week in the year. 27
W The number of the week in the month. 3
y The year. Use yy for a two-digit year, interpreted as belonging to the century 80 years prior to or 20 years ahead of the use of this function. Use more than two y's to interpret the year literally. In 2011, yy = 11, yyyy = 2011.
z The time zone shown as text. Use three or fewer z's for the time zone abbreviation, four z's for the full time zone name. EST, Eastern Standard Time
Z The time zone offset from UTC, shown as a positive or negative number as in RFC 822. -0500

### from_gmtime()

Function syntax:

```timestamp from_gmtime(int `year`, int `month`,
int `dayOfMonth`, int `hour`, int `minute`, double `seconds`)```

Creates an absolute timestamp in the UTC (GMT) time zone from the specified integer components of a date and time. All six arguments are required. You can express the `seconds` argument as a double with precision down to milliseconds, or you can use an integer, which is promoted automatically to a double by the rules of StreamBase Data Type Coercion and Conversion.

You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.

Notice that the returned timestamp shows a moment in the UTC time zone, but that moment is translated and displayed in the local time zone. For example, compare the same arguments given to `from_gmtime()` and `from_localtime()` when run in the EST time zone:

`from_gmtime(2009, 3, 17, 18, 30, 05)`

returns:

`(timestamp) 2009-03-17 14:30:05.000-0400`

while

`from_localtime(2009, 3, 17, 18, 30, 05)`

returns:

`(timestamp) 2009-03-17 18:30:05.000-0400`

### from_localtime()

Function syntax:

```timestamp from_localtime(int `year`, int `month`,
int `dayOfMonth`, int `hour`, int `minute`, double `second`)```

Creates an absolute timestamp in the local time zone from the specified integer components of a date and time. All six arguments are required. You can express the `seconds` argument as a double with precision down to milliseconds, or you can use an integer, which is promoted automatically to a double by the rules of StreamBase Data Type Coercion and Conversion.

You might use this function on a stream that has the date and time broken into integer fields to concatenate and translate those fields into a single timestamp value.

For example:

`from_localtime(2009, 3, 17, 18, 30, 05)`

returns:

`(timestamp) 2009-03-17 18:30:05.000-0400`

### from_unixtime()

Function syntax:

`timestamp from_unixtime(double `u_time`)`

Creates an absolute timestamp in UTC from a Unix timestamp ("POSIX time"). Unix time is the number of seconds that have elapsed in the Unix epoch (since 00:00:00 UTC, Thursday, 1 January 1970), not counting leap seconds. Consequently, the output is encoded only to four decimal digits.

For example:

`from_unixtime(0.0)`

returns:

`(timestamp) 1969-12-31 19:00:00.000-0500`

`from_unixtime(1.2e9)`

returns:

`(timestamp) 2008-01-10 16:20:00.000-0500`

### now()

Function syntax:

`timestamp now()`

Returns an absolute timestamp value representing the current time in the local time zone.

You can specify an alternate `now()` implementation in the StreamBase Server configuration file. The value is set as a child element of the `<runtime>` element:

`<param name="now-implementation" value="system" />`

Specify one of these values, as appropriate for your StreamBase application:

Value Meaning
system Directs the `now()` function to use Java's `System.currentTimeMillis()`. This is the default.
thread Directs the `now()` function to use a background thread that checks the time approximately every millisecond. This option results in decreased accuracy, but may be more efficient than `system` if you call `now()` more frequently than 1000 times per second.

### parse_time()

Function syntax:

```timestamp parse_time(string `timestring`, string `formatstring`)
timestamp parse_time(string `timestring`, string `formatString`,
string `language`, string `region`, string `variant`)```

Parses a string of time and date information and returns an absolute timestamp value. The `timestring` argument can be any string representation of date and time, as long as its components are individually parsable.

You must specify a `formatstring` argument to specify how the `timestring` argument is to be interpreted. The format string must specify a time format pattern as defined in the `SimpleDateFormat` class. See the format_time() function for a summary of the time format pattern strings. For information on specifying time zone IDs, see Specifying Time Zones.

The arguments added in the second syntax provide information to customize the timestamp for a locale, and may be necessary if `timestrings` include localized abbreviations. The Java class Locale specifies these arguments as:

• language — lowercase two-letter ISO-639 code

• country — uppercase two-letter ISO-3166 code

• variant — vendor and browser specific code, such as WIN for Windows, MAC for Macintosh, and POSIX for POSIX

When using the second syntax, set any unused argument to an empty string.

Use this function to read a time and date string in any format in an incoming stream, and still interpret that string as an absolute timestamp. For example, an incoming field, `TradeTime`, might contain the time and date in the following string format:

TradeTime = "11:17 Fri Feb 13, 09"

In this case, interpret the field `TradeTime` as a StreamBase timestamp with an expression like the following:

`parse_time(TradeTime, "HH:mm EE MMM dd, yy")`

which returns:

`(timestamp) 2009-02-13 11:17:00.000-0500`

The input `timestring` can be in either text, abbreviation, or signed number format. If the `timestring` argument includes a time zone specifier, this function interprets the zone and returns the timestamp converted to local time. The time zone specifier can be in standard text format, such as `EST`, `EDT`, or `PDT`, or signed number format, such as `-0500`, `-0800`, `+0100`. You can specify either `z` or `Z` to match the time zone specifier. They are interpreted equally.

This function interprets the `z` and `Z` format strings equally. The input `timestring` can be in either text, abbreviation, or signed number format, and both cases of Z interpret them as time zone specifiers.

For example, we can add Central European Time to the `TradeTime` field using its signed number designation:

TradeTime = "11:17 Fri Feb 13, 09 +0100"

In this case, add a `z` or `Z` to the `formatstring` argument:

`parse_time(TradeTime, "HH:mm EE MMM dd, yy Z")`

When run in the EST zone, this function returns the following. You can use the timezoneoffset() function to force the interpretation for a different time zone.

`(timestamp) 2009-02-13 05:17:00.000-0500`

Notice that it is inconsistent for a time string to contain an hours field based on a 24 hour clock and at the same time to specify AM or PM in the same string. Thus, it is inconsistent to specify both `HH` for the hours field and `a` for AM/PM in the same format string. If your time string contains AM or PM, use lowercase `hh` to interpret the hours field. StreamBase does not flag an error if you specify `HH` and `a` in the same format string, but the hours interpretation will be 12 hours offset for afternoon hours.

This function performs the same task as `strptime()`, but uses a different library source for its format pattern strings. The `strptime()` function is faster than `parse_time()`, and should be used when performance is critical in your application. The format pattern strings used by `parse_time()` are easier to use. See the Usage Note about the different ways this function and strptime() interpret the z and Z `formatstring` specifiers.

### strftime()

Function syntax:

```string strftime(string `formatstring`, timestamp `ts`)
string strftime(string `formatstring`, timestamp `ts`, string `tz`)```

Converts the specified timestamp field, `ts`, to a string, formatted according to the specified `formatstring` pattern, for the local time zone. The timestamp field can then be displayed and manipulated as a string instead of a timestamp.

Optional string argument `tz` is a time zone ID for specifying a standard name (for example `"UTC"`) or location (for example, `"Europe/Zurich"`) of a time zone that the formatted time should indicate. When used, time zone IDs must be specified exactly and are case-sensitive. For information on specifying time zone IDs, see Specifying Time Zones.

This function uses time format patterns described in the table below, which are based on the strftime library, part of the Open Group's Single UNIX Specification.

As an example of using strftime(), consider that the function now(), when run on 05 May 2014 in the EDT time zone, returns a timestamp like this one:

`(timestamp) 2014-05-05 10:52:44.273-0400`

Here is the same function used as an argument for `strftime()`:

`strftime("Traded at %H:%M and %S.%f seconds on %b %d, %Y", now())`

which returns the following:

`(string) Traded at 10:52 and 44.273 seconds on May 05, 2014`

If you wanted the return value to be expressed in UTC time, add its time zone ID as a third argument:

`strftime("Traded at %H:%M and %S.%f seconds on %b %d, %Y", now(),"UTC")`

which changes the result to:

`(string) Traded at 14:52 and 44.273 seconds on May 05, 2014`

This function performs the same task as `format_time()`, but uses a different library source for its format pattern strings. The `strftime()` function is faster than `format_time()`, and should be used when performance is critical in your application. The format pattern strings used by `format_time()` are easier to use, but note that `format_time()` does not recognize time zone ID arguments. See the Usage Note about the different ways this function and `format_time()` interpret the z and Z `formatstring` specifiers.

The following table shows the format strings supported by `strftime()`. See the strftime library documentation for further details. An asterisk in the `formatstring` column denotes format strings added by StreamBase.

`formatstring` Component Meaning
%a The locale's abbreviated weekday name: Mon, Wed.
%A The locale's full weekday name: Monday, Wednesday.
%b The locale's abbreviated month name: Jan, Nov.
%B The locale's full month name: January, November.
%c The locale's default time and date string.
%C The year divided by 100 and truncated to an integer: 99, 00, 02, 11
%d The two-digit day of the month, from 01 to 31.
%D The month, day, and year with slash separator. Equivalent to specifying `%m/%d/%y`.
%e The one- or two-digit day of the month, from 1 to 31. Like %d, except single digit days are preceded by a space.
%f    * The milliseconds portion of timestamp.
%F The month, day, and year with hyphen separator. Equivalent to specifying `%m-%d-%y`.
%g The last two digits of the year.
%G The four-digit year.
%h Same as %b.
%H The two-digit hour on a 24-hour clock, 00 to 23.
%I The two-digit hour on a 12-hour clock, 01 through 12.
%j The three-digit day of the year, 1 through 366.
%k The one or two digit hour on a 24-hour clock, with leading space if < 10.
%l The one or two digit hour on a 12-hour clock, with leading space if < 10.
%m The two-digit number of the month, 01 to 12.
%M The two-digit minute, from 00 to 59.
%n Replaced with a new line character.
%p The locale's equivalent of AM or PM.
%r The time in AM or PM notation. For the POSIX locale, the same as `%I:%M:%S %p`
%R (Not supported by the StreamBase `strftime()` function.)
%s    * The time value of timestamp `ts` in milliseconds divided by 1000.
%S The two-digit second, from 00 to 60. (60 is available in case of leap seconds)
%t Replaced with a tab character.
%T The time, equivalent to `%H:%M:%S`
%u The numeric day of the week, 1 through 7, with Monday = 1.
%U The two-digit number of the week of the year, from 00 through 53. The first January Sunday begins week 1; days in the new year before that Sunday are in week 0.
%V The two-digit number of the week of the year, from 01 through 53, with Monday as the first day. If the first week has four or more January days, it is week 1; otherwise it is the last week of the previous year. That is, January 4th and the first Thursday of January are always in week 1.
%w The numeric day of the week, 0 through 6, with Sunday = 0.
%W The two-digit number of the week of the year, from 00 through 53. The first January Monday begins week 1; days in the new year before that Monday are in week 0.
%x The date, using the locale's default date format.
%X The time, using the locale's default time format.
%y The year in two digits, from 00 through 99.
%Y The year in four digits.
%z The time zone offset from UTC in ISO 8601:2000 standard format, such as `+hhmm` or `-hhmm`, or empty if no time zone can be determined.
%Z The time zone name or abbreviation, or empty if no zone information is in the timestamp being converted.
%% Replaced by a literal %.
%+    * Returns the current time formatted as if you had specified `"%a %b %d %H:%M:%S %Z %Y"`. For example, `strftime("%+", now())` returns a string like the following:
`Thu Feb 18 15:42:35 EST 2010`
* These format patterns are StreamBase extensions that are either not found in the Open Group's strftime() function, or are used differently.

Some format strings can be modified with E or O between the percent sign and the format character. See the strftime library documentation for further details.

### strptime()

Function syntax:

`timestamp strptime(string `timestring`, string `formatstring`)`

Parses a string of time and date information and returns an absolute timestamp value. The `timestring` argument can be any string representation of date and time, as long as its components are individually parsable.

You must also specify a `formatstring` argument to specify how the `timestring` argument is to be interpreted. The format string must specify a time format pattern using a subset of the patterns defined for the strptime library (part of the Open Group's Single UNIX Specification), as listed in the table below.

This function interprets the `%z` and `%Z` format strings as separate identifiers. Specify lowercase `z` in the `formatstring` if the incoming `timestring` expresses the time zone as a signed number, such as `+0100`, `-0500`, or `-0800`. Specify uppercase `Z` in the `formatstring` in the incoming `timestring` expresses the time zone as text or a standard abbreviation: `BST`, `EST`, `PDT`, or `America/New York`.

Use this function to read a time and date string in any format in an incoming stream, and still interpret that string as an absolute timestamp. For example, an incoming field, `TradeTime`, might contain the time and date in the following string format:

TradeTime = "15-51-26.984 May 5, 14"

In this case, interpret the field `TradeTime` as a StreamBase timestamp with an expression like the following:

`strptime(TradeTime, "%H-%M-%S.%f %b %d, %y")`

which returns:

`(timestamp)  2014-05-05 15:51:26.984-0400`

This function performs the same task as `parse_time()`, but uses a different library source for its format pattern strings. This `strptime()` function is faster than `parse_time()`, and should be used when performance is critical in your application. The format pattern strings used by `parse_time()` are easier to use. See the Usage Note about the different ways this function and parse_time() interpret the z and Z `formatstring` specifiers.

See also the timestamp() function, and strpinterval(), a function similar to `strptime()`, but for interval timestamps.

The following table shows the format strings supported by `strptime()`. See the strptime library documentation for further details.

`formatstring` Component Meaning
%b The month, using the current locale's month names. Abbreviated and full month names are recognized.
%B Same as %b.
%d The day of the month, from 1 to 31.
%D The month, day, and year with slash separator. Equivalent to specifying `%m/%d/%y`.
%f   * The milliseconds portion of the seconds field.
%H The hour, using a 12-hour clock, 0 to 11, or a 24-hour clock, 0 to 23. If the incoming string expresses hours on a 12-hour clock, be sure to include the %p format string. If an incoming string's hours value is from 12 to 23 and the format string includes %p, the %p is ignored.
%m The number of the month, 1 to 12.
%M The minute, from 0 to 59.
%p The strings AM or PM.
%S The second, from 0 to 59.
%x The date, using the locale's default date format.
%y The year in two digits. Values in the range 69 to 99, inclusive, are interpreted as 1969 through 1999. Values in the range 00 through 68, inclusive, are interpreted as 2000 through 2068.
%Y The year in four digits.
%z   * A time zone designator in the form `+hhmm`, `-hhmm`, `+hh:mm`, or `-hh:mm`. Examples: -0500, +1430, -08:00
%Z   * A standard time zone string or abbreviation, such as `EST`, `GMT-05:00`, `America/Los_Angeles`.
! Use an exclamation point in the format string to indicate that the remainder of the input string is optional.
For %d, %H, %m, %M, and %S, leading zeros are recognized but not required.
* These format patterns are StreamBase extensions not found in the Open Group's strptime() function, or are used differently.

### today()

Function syntax:

`timestamp today()`

Returns an absolute timestamp value for the beginning of the current day, 00:00:00, in the local time zone.

### today_utc()

Function syntax:

`timestamp today_utc()`

Returns an absolute timestamp value for the beginning of the current day, 00:00:00, in the UTC (GMT) time zone, translated for the local time zone.

Thus, when `today()` and `today_utc()` are run in sequence in the US Eastern time zone during standard time, they return:

```2009-02-13 00:00:00.000-0500
2009-02-12 19:00:00.000-0500```

The expressions `today()` and `today_utc() - timezoneoffset()` return the same value:

```2009-02-13 00:00:00.000-0500
2009-02-13 00:00:00.000-0500```

On UNIX, `today_utc()` requires the TZ environment variable to be set in order to translate correctly to the local time zone. On Windows, the TZ environment variable is ignored if set.

## Timestamp Functions: Interval Timestamps

This category includes the following functions, which return an interval timestamp:

 days() hours() interval() isinterval() milliseconds() minutes() seconds() strpinterval() time() timezoneoffset() weeks()

### days()

Function syntax:

```timestamp days(int `x`)
timestamp days(double `x`)```

Returns an interval timestamp representing an interval of `x` days.

### hours()

Function syntax:

```timestamp hours(int `x`)
timestamp hours(double `x`)```

Returns an interval timestamp representing an interval of `x` hours.

### interval()

Function syntax:

`timestamp interval(string `formatstring`)`

Parses the argument `formatstring` as a time format string and returns an interval timestamp. This function uses the time format patterns from strftime library, described in the Open Group's Single UNIX Specification. See the table of format patterns listed for the strftime() function.

The time format string is in the form `%H:%M:%S!.!%f`. The exclamation points show where the remainder of the string is optional. Thus, you can enter strings in any of the following formats, where hh = hours, mm = minutes, ss = seconds, and ff = a fractional second value in milliseconds:

• hh:mm:ss

• hh:mm:ss.

• hh:mm:ss.ff

For example:

`interval("2:00:01")`

`interval("2:00:01.")`

both return `(timestamp) 7201.0`, the number of seconds in the specification of two hours plus one second.

`interval("2:00:01.333")`

returns `(timestamp) 7201.333`, the number of seconds in the specification of two hours plus one and a third seconds.

### isinterval()

Function syntax:

`bool isinterval(timestamp `ts`)`

Returns true if the argument `ts` is an interval timestamp. See Simple Functions: Timestamp Overview for a discussion of interval versus absolute timestamp types.

### milliseconds()

Function syntax:

```timestamp milliseconds(int `x`)
timestamp milliseconds(double `x`)```

Returns an interval timestamp representing an interval of `x` milliseconds.

Use this function in conjunction with epoch() to convert a tuple field value representing the number of milliseconds since the epoch to an absolute timestamp. See epoch().

### minutes()

Function syntax:

```timestamp minutes(int `x`)
timestamp minutes(double `x`)```

Returns an interval timestamp representing an interval of `x` minutes.

### seconds()

Function syntax:

```timestamp seconds(int `x`)
timestamp seconds(double `x`)```

Returns an interval timestamp representing an interval of `x` seconds.

Use this function in conjunction with epoch() to convert a tuple field value representing the number of seconds since the epoch to an absolute timestamp. See epoch().

### strpinterval()

Function syntax:

`timestamp strpinterval(string `timestring`, string `formatstring`)`

Parses a string of time interval information and returns an interval timestamp value. The `timestring` argument can be any string representation of a time interval, as long as its components are individually parsable.

You must also specify a `formatstring` argument to specify how the `timestring` argument is to be interpreted. The format string must specify a time format pattern using the following subset of the patterns defined for the strptime library (part of the Open Group's Single UNIX Specification):

`formatstring` Component Meaning
%f The milliseconds portion of the seconds field. (This is a StreamBase extension not found in the strptime library.)
%H The hour, using a 24-hour clock, 0 to 23.
%M The minute, from 0 to 59.
%S The second, from 0 to 59.
! Use an exclamation point in the format string to indicate that the remainder of the input string is optional.
For %H, %M, and %S, leading zeros are recognized but not required.

Use this function to read a string representing a time interval in an incoming stream, and still interpret that string as an interval timestamp. For example, an incoming field, `ElapsedTime`, might contain a time interval expressed in following string format:

ElapsedTime = "1, 7, 5.250"

which you know from the data vendor's documentation means one hour and 7 minutes, plus five and a quarter seconds. In this case, interpret the field `ElapsedTime` as a StreamBase interval timestamp with an expression like the following:

`strptime(ElapsedTime, "%H, %M, %S.%f")`

which returns:

`(timestamp) 4025.25`

If the `ElapsedTime` field in the incoming stream intermittently includes the seconds, use an exclamation point to show that everything after the minutes pattern is optional:

`strptime(ElapsedTime, "%H, %M!, %S.%f")`

Then when a new `ElapsedTime` value comes through set to `"2, 4"`, it is still interpreted and converted to an interval timestamp:

`(timestamp) 7440.0`

See also the strptime(), which performs for absolute timestamps what this function does for interval timestamps.

### time()

Function syntax:

`timestamp time(timestamp `ts`)`

Returns as an interval timestamp the time of day expressed in the argument, `ts` – that is, the number of seconds between the start of the day expressed by `ts` and the moment expressed by `ts`. Thus, if run at the same instant, the following expressions return the same result: `time(now())` and `now() - today()`.

### timezoneoffset()

```timestamp timezoneoffset()
timestamp timezoneoffset(timestamp `ts`)```

Without an argument, returns a signed interval timestamp that expresses in seconds the offset of the local time zone from UTC. Thus, `timezoneoffset()` returns as an interval the same value that `format_time(now(), "Z")` returns as a string. Because interval timestamps are expressed in seconds, you can divide the results by 3600 to obtain for display purposes the number of hours in the offset.

The returned time zone offset value is calculated relative to the system time for Windows, and relative to the system time or to the time zone specified in the `TZ` environment variable for UNIX. On UNIX, make sure that the environment from which the sbd process is launched either does not have a `TZ` variable or that it is correctly set for the local system's time zone. The `TZ` environment variable is ignored on Windows.

You can subtract the return of `timezoneoffset()` from any absolute timestamp value to generate a new absolute timestamp that represents the same moment in the UTC time zone. For example, `now() - timezoneoffset()` returns the present moment in the UTC time zone, and returns the same result when run from any time zone.

Because the value returned from `timezoneoffset()` is signed, you do not need to consider whether to add or subtract a positive or negative offset value from the timestamp of interest. Always use a subtraction operation, which insures that negative `timezoneoffset()` values (that is, those from time zones west of UTC) are effectively added, while positive values (from time zones east of UTC) are subtracted.

Once you have a timestamp value representing UTC time, you can convert it to a timestamp for another time zone by adding or subtracting a fixed value representing the offset of the time zone of interest. For example, the following expression returns the current time for Sydney, Australia for April to October, when Sydney is on standard time:

`(now() - timezoneoffset()) - hours(10.0)`

(For October to April, when Sydney is on daylight savings time, use `hours(11.0)`.)

If you provide an absolute timestamp as an argument, `timezoneoffset()` returns the UTC offset of the system's current time zone at the date and time of the specified timestamp. The offset is calculated based only on the date and time information in the specified timestamp, relative to the system's current time zone (or, on UNIX, to the zone in `TZ`). You can use the argument form of `timezoneoffset()` to determine the UTC offset for future or historical dates.

For example, the following commands show that on the US East Coast, the UTC offset changed at 2:00 AM on March 8, 2009, when daylight savings time began.

sbd --eval "timezoneoffset(timestamp('2009-03-08 01:59:59'))/3600"
timestamp (-5.0)
sbd --eval "timezoneoffset(timestamp('2009-03-08 02:00:00'))/3600"
timestamp (-4.0)
sbd --eval "timezoneoffset(timestamp('2009-03-08 03:00:00'))/3600"
timestamp (-4.0)

(Actually, at 2:00 AM clocks were turned ahead one hour to 3:00 AM, but `timezoneoffset()` correctly calculates the UTC offset for both 2:00 and 3:00 AM.)

You can also extract specific fields of a timestamp in either local time or for a specific time zone using functions such as `get_hour()`, which take a string time zone ID as an optional argument. For more information, see Timestamp Functions: Timestamp Fields.

### weeks()

Function syntax:

```timestamp weeks(int `x`)
timestamp weeks(double `x`)```

Returns an interval timestamp representing an interval of `x` weeks.

## Timestamp Functions: Timestamp Fields

This category includes the following functions, used to get and set individual fields in absolute timestamps:

 get_millisecond() get_year() to_milliseconds() get_second() set_second() to_seconds() get_minute() set_minute() get_hour() set_hour() get_day_of_week() set_day_of_week() get_day_of_month() set_day_of_month() get_day_of_year() set_month() get_month() set_year()

### get_millisecond()

Function syntax:

`double get_millisecond(timestamp `ts`)`

Returns the milliseconds portion of the absolute timestamp `ts`. The returned value is a double, and includes only the fractional value of the milliseconds of the current second.

### get_second()

Function syntax:

```double get_second(timestamp `ts`)
double get_second(timestamp `ts`, string `tz`)```

Returns the seconds portion of the absolute timestamp `ts` for local time zone. The returned value is a double, and includes a fractional value in milliseconds.

To return a value for a particular time zone, provide the text of its ID as a second argument, `tz`, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

### get_minute()

Function syntax:

```int get_minute(timestamp `ts`)
int get_minute(timestamp `ts`, string `tz`)```

Returns the minutes portion of the absolute timestamp `ts` for local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, `tz`, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

### get_hour()

Function syntax:

```int get_hour(timestamp `ts`)
int get_hour(timestamp `ts`, string `tz`)```

Returns the hours portion of the absolute timestamp `ts` for local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, `tz`, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

### get_day_of_week()

Function syntax:

```int get_day_of_week(timestamp `ts`)
int get_day_of_week(int year, int month, int day)```

Returns an integer representing the day of the week given an absolute timestamp `ts` for local time zone or three integers representing a year, month and day. The returned value is between `0` (Sunday) and `6` (Saturday).

### get_day_of_month()

Function syntax:

`int get_day_of_month(timestamp `ts`)`

Returns an integer representing the day of the month in the absolute timestamp `ts` for local time zone. The returned value is between `1` and `31` (is one-based).

### get_day_of_year()

Function syntax:

```int get_day_of_year()
int get_day_of_year(int year, int month, int day)
int get_day_of_year(timestamp `ts`)```

Returns an integer representing the Julian day of the year for local time zone. The returned value is between `1` and `365` (or `366` for leap years).

`get_day_of_year` with three integer arguments or one timestamp argument returns the Julian date for the specified year, month and day.

`get_day_of_year` with no arguments is equivalent to `get_day_of_year(today())`.

### get_month()

Function syntax:

```int get_month(timestamp `ts`)
int get_month(timestamp `ts`, string `tz`)```

Returns an integer representing the month in the absolute timestamp `ts` for local time zone. The returned value is between `1` (January) and `12` (December).

To return a value for a particular time zone, provide the text of its ID as a second argument, `tz`, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

### get_year()

Function syntax:

```int get_year(timestamp `ts`)
int get_year(timestamp `ts`, string `tz`)```

Returns the year portion of the absolute timestamp `ts` for local time zone.

To return a value for a particular time zone, provide the text of its ID as a second argument, `tz`, which is case-sensitive. Unrecognized time zone IDs default to UTC. For information on specifying time zone IDs, see Specifying Time Zones.

### set_second()

Function syntax:

```timestamp set_second(timestamp `ts`, int `x`)
timestamp set_second(timestamp `ts`, long `x`)
timestamp set_second(timestamp `ts`, double `x`)```

Returns a modified version of the absolute timestamp `t`s, with the seconds component replaced with `x`. If you supply an int or long value for `x`, the milliseconds portion of `x` is set to 000. If you supply a double value, you can replace the seconds component down to the millisecond.

### set_minute()

Function syntax:

`timestamp set_minute(timestamp `ts`, int `x`)`

Returns a modified version of the absolute timestamp `t`s, with the minutes component replaced with `x`.

### set_hour()

Function syntax:

`timestamp set_hour(timestamp `ts`, int `x`)`

Returns a modified version of the absolute timestamp `t`s, with the hours component replaced with `x`.

### set_day_of_week()

Function syntax:

`timestamp set_day_of_week(timestamp `ts`, int `x`)`

Returns a new timestamp that has the same value as absolute timestamp `t`s, but with the day of the week component replaced with the value of `x`. Days of the week are numbered from `0` to `6`, Sunday through Saturday. Day 7 is also considered to be Sunday.

### set_day_of_month()

Function syntax:

`timestamp set_day_of_month(timestamp `ts`, int `x`)`

Returns a modified version of the absolute timestamp `t`s, with the day of month component replaced with `x`.

### set_month()

Function syntax:

`timestamp set_month(timestamp `ts`, int `x`)`

Returns a modified version of the absolute timestamp `t`s, with the month component replaced with `x`.

### set_year()

Function syntax:

`timestamp set_year(timestamp `ts`, int `x`)`

Returns a modified version of the absolute timestamp `t`s, with the year component replaced with `x`.

### to_milliseconds()

Function syntax:

`long to_milliseconds(timestamp `ts`)`

Converts an interval or absolute timestamp to a long value representing milliseconds. This function is similar to `to_seconds()`, but is more precise and faster. If the argument `ts` is an interval, returns the number of milliseconds in the interval. For example, `to_milliseconds(now() - today())` returns the number of milliseconds since midnight (00:00:00).

If the argument `ts` is an absolute time, returns the number of milliseconds since the epoch, Jan 1, 1970. For example, `to_milliseconds(timestamp("2007-11-29 17:32"))` returns `1196375520000`.

### to_seconds()

Function syntax:

`double to_seconds(timestamp `ts`)`

Converts an interval or absolute timestamp to a double in seconds. If the argument `ts` is an interval, returns the number of seconds in the interval. For example, `to_seconds(now() - today())` returns the number of seconds since midnight (00:00:00).

If the argument `ts` is an absolute time, returns the number of seconds since the epoch, Jan 1, 1970. For example, `to_seconds(timestamp("2007-11-29 17:32"))` returns `1.19637552E9`.

## Simple Functions: Type Conversions

This category includes the following functions:

 blob() bool() double() int() list() long() named schema constructor function string() timestamp() tuple()

### blob()

Function syntax:

```blob blob(string arg)
blob blob()```

Converts `arg` to a value of type blob where `arg` is a string. Returns a blob representation of the string. With no argument, returns a null blob as if you had used `blob(null)`.

For additional conversion functions involving blobs, see Simple Functions: BSON.

### bool()

Function syntax:

```bool bool(bool arg)
bool bool(double arg)
bool bool(int arg)
bool bool(long arg)
bool bool(string arg)
bool bool(blob arg)
bool bool()```

Converts `arg` to a value of type bool where `arg` is a bool, double, int, long, or string.

• An int argument of 0 returns the Boolean `false`. All other int values, positive or negative, return `true`.

• A double argument of 0 returns the Boolean `false`. All other double values, positive or negative, return `true`.

• A string argument `false` returns a Boolean `false` and the string `true` returns Boolean `true`. The string is compared in a case-insensitive manner. For example, `bool("FALSE")` and `bool("false")` are equivalent. Any string other than `false` or `true` is an error.

• A blob argument returns the content of a BSON blob containing a bool

• A bool value returns itself.

• With no argument, returns a null bool as if you had used `bool(null)`.

### double()

Function syntax:

```double double(bool arg)
double double(double arg)
double double(int arg)
double double(long arg)
double double(string arg)
double double(blob arg)
double double()```

Converts `arg` to a value of type double where `arg` is a bool, double, int, long, or string.

• A bool argument returns `1.0` if true, or `0.0` if false.

• A double argument returns itself.

• An int argument returns the same value converted to type double. For example, `double(3)` returns `3.0`.

• A long argument returns the same value converted to type double. For example, double(15L) returns 15.0.

• A string argument is parsed as a decimal number. For example, `double("123.456")` returns `123.456`, but `double("7abc")` returns an error. Scientific notation is supported, so `double("1.2E4")` returns `12000.0`.

• A blob argument returns the content of a BSON blob containing a double

• With no argument, returns a null double as if you had used `double(null)`.

Also See: The to_seconds() function, used to convert timestamps to double.

### int()

Function syntax:

```int int(bool arg)
int int(double arg)
int int(int arg)
int int(long arg)
int int(string arg)
int int(blob arg)
int int()```

Converts `arg` to a value of type int where `arg` is a bool, double, int, long, or string.

• A bool argument returns `1` if true, or `0` if false.

• A double argument has its fractional part truncated. For example, `int(3.4)` returns `3`.

• An int argument returns itself.

• A long argument returns the int same as a long value. For example, int(150L) returns 150.

• A string argument is parsed as a decimal number. For example, `int("123")` returns `123`, but `int("7abc")` is an error.

• A blob argument returns the content of a BSON blob containing an int

• With no argument, returns a null int as if you had used `int(null)`.

### list()

Function syntax:

`list list(`e1`, `e2`, `e3`, ...)`

Returns a list containing each of the specified elements `e1`, `e2`, and so on, where each specified element is of the same data type. The specified elements can be literals or expressions, as long as each resolves to (or can be coerced to) the same data type. You can create a list of any StreamBase data type.

Use emptylist() to return an empty list, and use nulllist() to return a null list. (See Null Lists for a discussion of null lists compared to empty lists.)

### long()

Function syntax:

```long long(bool `arg`)
long long(double `arg`)
long long(int `arg`)
long long(long `arg`)
long long(blob `arg`)
long long()```

Converts expression `arg` to a value of type long where `arg` evaluates to a bool, double, int, or long.

• An int argument returns the same value, but with type long.

• A double argument has its fractional part truncated, and the remainder returned with type long.

• A bool argument returns `1` if true, or `0` if false.

• A blob argument returns the content of a BSON blob containing a long

• A long value returns itself.

• With no argument, returns a null long as if you had used `long(null)`.

### Named Schema Constructor Function

Function syntax:

`tuple `schema_name`(`fieldspec` [, `fieldspec2`...]) [as tuplename]`

For every named schema you define or import into a module, StreamBase automatically generates a new function in the StreamBase expression language that allows you to construct tuples with that schema. The generated function's name is the name of the named schema, and it takes one or more comma-separated expressions as value arguments. The result is a single tuple with the same schema as the named schema.

For example, for the named schema `point`, whose schema is `(x double, y double)`, you can use a function named `point()` anywhere in the module that defines the named schema:

`point(32.0, 44.5)`

`point(32.0 as x, 44.5 as y)`

In StreamSQL, a final AS keyword is required to name the resulting tuple:

`point(32.0, 44.5) AS p2`

`point(32.0 as x, 44.5 as y) AS p2`

You can specify the list of `fieldspec` field specifications in two ways:

• By position

• By name

The following examples presume a named schema with the name `Quotes`, having the following schema:

`(Symbol string, Price double, NumShares int)`

When specifying by position, you do not need to specify AS keywords and field names, but you must specify all fields in schema order. For example:

`Quotes("IBM", 132.42, 5000)`

When specifying by name, you must use the AS keyword to specify field names, which must exactly match the spelling of the named schema's fields. Any fields you do not specify are automatically filled with nulls:

`Quotes("IBM" as Symbol, 132.42 as Price)`

When specifying by name, you can specify fields in any order:

`Quotes(132.42 as Price, "IBM" as Symbol)`

You cannot mix field specifications with and without the AS keyword.

Specifying fields by name provides flexibility when you use the same named schema in several places in a large application. If you later edit the named schema itself to add a new field, you do not need to add the new field everywhere the schema is used. If the fields are specified by name in all locations, the new field does not cause an error condition, and is automatically filled with null.

Use the name of a named schema with no arguments to construct a null tuple (which is not the same as a tuple whose fields are all null). For example: `point(), Quotes()`

Avoid naming a schema the same as an existing expression language function. If you do, the generated constructor function would mask and prevent the use of the same-named built-in function. See Named Schema Name Collisions.

Use the `* AS *` syntax for tuples defined with a named schema to copy the entire tuple into a single field of type tuple.

For example, let's say the tuple arriving at the input port of a Map operator was defined upstream with the `NYSE_FeedSchema` named schema. To preserve the input tuple unmodified for separate processing, the Map operator could add a field of type tuple using settings like the following in the Additional Expressions grid. When using the `* AS *` syntax in the Expression column, the name of the tuple field in the Field Name column has an implied asterisk for all of its fields.

Action Field Name Expression

Because the Map operator has only one input port, the port does not need to be named:

Action Field Name Expression

### string()

Function syntax:

```string string(blob `arg`)
string string(bool `arg`)
string string(double `arg`)
string string(int `arg`)
string string(long `arg`)
string string(string `arg`)
string string(timestamp `arg`)
string string()```

Converts `arg` to a value of type string where `arg` is a blob, bool, double, int, long, string, or timestamp.

• An int, double or long argument returns the string representation of `arg`.

• A string argument returns itself.

• A bool argument returns the string `true` if true, or `false` if false.

• A timestamp argument, if represented as a date and time, returns a string that includes the explicit time zone. A timestamp represented as a numeric value returns the string representation of that value.

• A blob argument returns the string representation of the blob.

• With no argument, returns a null string as if you had used `string(null)`.

### timestamp()

Function syntax:

```timestamp timestamp(string `timestring`)
timestamp timestamp()```

Returns an absolute timestamp parsed from the argument `timestring`, a string representation of a date with optional time. With no argument, returns a null timestamp as if you had used `timestamp(null)`.

The `timestring` argument must be in the form of a time format pattern as defined in the `java.text.SimpleDateFormat` class. See the format_time() function for a summary of the time format pattern strings.

You can specify only the date portion:

`timestamp("2009-02-12")`

If you specify any part of the time value, you must specify at least the hours and minutes:

`timestamp("2009-02-12 14:28")` or `timestamp("2009-02-12 14:28:05.250")`

You can also specify the entire `SimpleDateFormat` string, including time zone specification:

`timestamp("2009-02-12 14:28:05.250-0800")`

See also the strptime() and parse_time() functions, used to convert strings to timestamps.

### tuple()

Function syntax:

```tuple tuple(`value` as `fieldname` [, `value2` as `fieldname2` [,...]]) [as tuplename]
tuple tuple()```

Takes one or more comma-separated value expressions as arguments, each field named with the AS keyword. The result is a single tuple with the specified contents. For example:

`tuple(126.50 as price, "IBM" as symbol)`

In StreamSQL, you must specify a name for the resulting tuple with a final AS keyword:

`tuple(126.50 as price, "IBM" as symbol) as tick1;`

When using tuple() to create individual tuples, to be compliant with a particular schema, you must specify all fields in the schema's field order. The tuple() function has no knowledge of the schema you are trying to match. For these reasons, we recommend using named schemas and the named schema constructor function.

You can duplicate any tuple field into another field of type tuple without using wildcards. For example, a Map operator might have an entry like the following in its Additional Expressions grid, where both `IncomingTuple` and `CopyOfIncomingTuple` are the names of tuple fields:

Action Field Name Expression

Use the `.*` syntax to flatten a tuple field into the top level of a stream.

For example, a Map operator might define an entry like the following in its Additional Expressions grid. When using this syntax, you must have an asterisk in both Field Name and Expression columns.

Action Field Name Expression

## Simple Functions: Utilities

This category includes the following functions:

 coalesce() coalesce_tuples() compile() eval() fromCodePoint() hash() isnull() jsonpath() length() md5() new_tuple() new_tuple_subset() new_tuple_subset_loose() notnull() nullif() parsecsv() parsejson() random_tuple() parsejson_loose() random_tuple() toCodePoint() tojson() uuid()

### coalesce()

Function syntax:

`T coalesce(T `arg1`, ..., T `argn`)`

Returns the first non-null argument, or a null value if all arguments are null. Accepts all data types, but all arguments must have the same type. The returned value has the same data type as the arguments. For tuple arguments, returns the first tuple that is non-null. This function stops evaluating arguments after the first non-null argument is detected.

Since literal values are never null, you can specify a literal value as the last argument to provide an effective default value for the list. In this way, `coalesce()` can be used to emulate the `NVL()` function provided by Oracle PL/SQL and the two-argument ISNULL() function provided by Microsoft T/SQL. For example, the following expression returns the value of `fieldA` if it is non-null, or `0` if it is null:

`coalesce(fieldA, 0)`

The following example returns the first non-null field among `fieldA`, `fieldB`, and `fieldC` in that order, or returns `-99999` if all three fields are null:

`coalesce(fieldA, fieldB, fieldC, -99999)`

### coalesce_tuples()

Function syntax:

`T coalesce_tuples(T `arg1`, ..., T `argn`)`

For non-hierarchical data types, coalesce_tuples() acts the same as the coalesce() function. For hierarchical types, this function coalesces each sub-field, recursing into further nested fields as necessary.

For example, the following function:

```coalesce_tuples(
tuple(double() AS x, 1.2 AS y, double() AS z),
tuple(1.0 AS x, 1.3 AS y, double() AS z),
tuple(double() AS x, double() AS y, 1.4 AS z)
)```

returns the following tuple:

`((x double, y double, z double)) 1.0, 1.2, 1.4`

Remember that double() evaluates to double(null). Thus, the `coalesce_tuples` function in this example selected:

• The first non-null value for x, 1.0, which was found in the second input tuple.

• The first non-null value for y, 1.2, which was found in the first input tuple.

• The first non-null value for z, 1.4, which was found in the third input tuple.

### compile()

Function syntax:

```function compile(string `expr`, tuple `varfields`, T `typespec`)
function compile(string `expr`, tuple `varfields`, T `typespec`,
string `name`)
function compile(string `expr`, tuple `varfields`, T `typespec`,
string `name`, tuple `environment`)```

Converts a string defining an expression and a tuple defining its arguments into an executable StreamBase function at runtime. Compiles the function body provided as the first argument, providing it with arguments specified in the second argument. The third argument defines the data type the function is to return when executed.

The first argument, `expr`, can be a string literal, a string constant, or the name of a field passed in from a stream. The second argument, `varfields`, can define a tuple using the tuple() function or be a named schema or parameter specifying a tuple. Typical usage is to provide the data types and field names that are to be used in the supplied expression, like:

`tuple(double() as factor)`

More generally, if the context that the function wil execute in has the same fields as the current input fields, you can specify all current field names:

`tuple(* as *)`

Always provide the third argument, `typespec`, to define the data type that the function returns, which can be any StreamBase data type, such as int(), double(), list(), string(), or even function().

As an example, compile a function that converts temperature in degrees Fahrenheit to degrees Celsius:

`compile( "(fahr_temp - 32) * 1.8", tuple(double() as fahr_temp), double() )`

The second syntax adds string `name`, an optional name for the function. Specify a name if the function is recursive, so that it can refer to itself. StreamBase expression language identifier naming rules apply, and the scope of the identifier is the function itself.

The third syntax adds tuple `environment`, which provides a set of fields that the function expression can use as names and values for local variables. When using this syntax, you can specify an empty `name` argument ("") if it is not referenced within the function body.

You can define an expression within a stream's schema as a string and use it to define a function. For example, the input stream could contain a string field named converter with the value `"(fahr_temp - 32) * 1.8"`. The above example could then be:

`compile( converter, tuple(double() as fahr_temp), double() )`

You could generalize the above example to convert Fahrenheit temperatures to either Celsius or Kelvin units by specifying an `environment` variable, which could either be defined on the input stream or as a constant. If the value of that variable is 0.0, the function returns degrees Celsius. If its value is 273.15, the function returns degrees Kelvin. Defining converter as `"(fahr_temp - 32) * 1.8 + k_offset"` and providing `name` and `environment` arguments changes the call to compile to:

`compile( converter, tuple(double() as fahr_temp), double(), "", k_offset )`

When `environment` is defined as (double(273.15) as k_offset) the function returns degrees Kelvin. When it is defined as (double(0.0) as k_offset), the function returns degrees Celsius.

### eval()

Function syntax:

```T eval(string expr, T typespec)
T eval(string expr, tuple varfields, T typespec [, int `cachesize`])```

Evaluates the expression provided as the first argument and returns the result, whose data type is provided in the `typespec` argument. The expression must resolve to the data type specified as `typespec`; the function does not coerce the return's type.

The `typespec` argument is evaluated only for its data type, and its value is ignored. You can specify literal values, as long as their data type is unambiguous (1, 1L, 1.0, "string", and so on); or specify the null value function for the expected data type (int(), long(), double(), string(), and so on); or provide an expression that unambiguously resolves to a value with the expected data type.

For the two-argument form, the `expr` argument is a simple StreamBase expression that does not depend on any variables for its resolution. For the three-argument form, the `expr` argument can reference fields in the second argument's tuple.

The function returns an evaluation exception under the following circumstances:

• If you provide a `typespec` argument with a different data type than actually returned from the evaluation of the first argument.

• The expression in the `expr` argument fails to pass typechecking or resolves to null.

• In the three-argument form, you provide a null tuple for the second argument.

In any field that takes expressions, you can, of course, enter any valid expression to be evaluated without needing to use eval(). The eval() function provides a way for you to build an expression string dynamically or have it based on input:

`eval(input.exprfield, int())`

The three-argument form can be used to pass an entire input tuple or table row for evaluation against an expression:

`eval("price > 200.00", input, bool())`

`eval("remainingQuantity == 0", current, bool())`

The expression in the `expr` argument is evaluated in a self-contained environment. Anything the expression needs for its evaluation must be passed in by means of the `varfields` argument. For example, to reference a constant or dynamic variable in the `expr` argument, make sure that constant or dynamic variable is included in the input tuple. Let's say a module defines the constant `discount`, and has a dynamic variable named `curr_rate`. You want to multiply field `price` in the input tuple by both values. Use the tuple() function to create a tuple consisting of the entire input tuple, plus the constant and variable:

`eval("input.price * discount * curr_rate", tuple(input.* as *, discount, curr_rate), double())`

The optional fourth argument is an advanced setting that specifies the size of the evaluation cache as the integer number of unique compiled evaluations to be saved in the cache. In the absence of this argument, the evaluation cache is the same as the current setting of the system property `streambase.codegen.eval-cache-size`, which defaults to 100 for StreamBase Server and to 1000 for LiveView Server. The cache setting is container-wide, and setting it during runtime can only increase the cache, and never shrink it. This prevents a single use of eval() at runtime from reducing the cache size for everyone.

This argument must be in the fourth position; to use it with the two-argument form of this function, add an empty tuple for the second argument. (Remember that an empty tuple is not the same as a null tuple, which is disallowed for the second argument; see Null Tuples, Empty Tuples, No-Fields Tuples.)

This function is not expected to work at the command prompt with sbd --eval.

### fromCodePoint()

Function syntax:

`string fromCodePoint(int `value`)`

Accepts an integer value and returns the corresponding ASCII character.

### hash()

Function syntax:

`int hash(T `arg`)`

Returns a hashed integer value derived from the argument, which can be any StreamBase data type.

### isnull()

Function syntax:

`bool isnull(T `value`)`

Accepts all data types. Returns true if the argument is a null value.

For list and tuple arguments, isnull() returns true if and only if the top-level list or tuple is null. If a list contains any elements at all, even if all their values are null, it is not a null list. If a tuple contains any fields at all, it is not null even if all field values are null.

### jsonpath()

Function syntax:

`string jsonpath(string `json`, string `jsonpath`)`

Accepts a JSON-formatted string and a JSONPath expression and returns the result of applying the `jsonpath` expression as a filter on the input JSON string. JSONPath expressions are XPath-like pattern strings originally defined in JavaScript. This function is based on the Java implementation of JSONPath. See either of these pages for instructions on crafting JSONPath expressions.

For example, given the following JSON document:

{
"store": {
"book": [
{
"category": "fiction",
"author": "Herman Melville",
"title": "Moby Dick",
"ISBN": "0-553-21311-3",
"price": 8.99
},
{
"category": "fiction",
"author": "J. R. R. Tolkien",
"title": "The Lord of the Rings",
"ISBN": "0-395-19395-8",
"price": 22.99
}
]
}
}

and passing the JSONPath string `\$.store.book[1].title` as the second argument, the result is:

`(string) ["The Lord of the Rings"]`

These input arguments work at the command prompt with `sbd -e jsonpath(...)`, but remember to escape the quotes with `\"`.

### length()

Function syntax:

```int length(blob `x`)
int length(list `L`)
int length(string `x`)```

Returns the length of its argument for argument types blob, list, and string. For a blob, returns a count of bytes. For a non-null list, returns the number of elements in the list, including any null elements. For a string, returns the number of graphemes in the string, for ASCII and all supported Unicode character sets.

If the argument is a null blob, null list, or null string, the returned value is `null`.

### md5()

Function syntax:

`string md5(string `input`)`

Returns the md5 fingerprint of the provided input string.

### new_tuple()

Function syntax:

```tuple new_tuple(tuple `t` [, `updated_value1` AS `field1_name` [,
`updated_value2` AS `field2_name` [, ... [,
`updated_valueN` AS `fieldN_name`]]]])```

Use the `new_tuple()` function to modify the values of a small number of fields in a tuple, leaving the other fields unchanged.

The required first argument `t` is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form `value AS field_name`. You can use `input.* AS *` to fill in the fields in `t` with input stream fields that have matching field names.

You do not need to specify all fields in `t`, but you cannot specify fields not in `t`.

The following example both defines a tuple and immediately replaces the value of its `x` field. This example is impractical but can illustrate `new_tuple()` using sbd --eval:

`new_tuple(tuple(1.0 AS x, 2.0 AS y, 3.0 AS z), 5 AS x)`

For a more practical example, an input stream may have a schema defined as `(Symbol string, Price double, NumSh double)`. The following expression in a Map operator's Additional Expressions grid inflates the incoming value of the Price field by 10%:

Action Field Name Expression
Add InflatedInput new_tuple(input, input.Price*1.10 as Price)

### new_tuple_subset()

Function syntax:

```tuple new_tuple_subset(tuple `t`, `value1` AS `field1_name`[,
`value2` AS `field2_name`[, ... [,
`valueN` AS `fieldN_name`]]])```

This function returns a tuple with the schema of the tuple argument `t`, and with the specified new field values substituted.

The required first argument `t` is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form `value AS field_name`. You can use `input.* AS *` to fill in the fields in `t` with input stream fields that have matching field names.

Unlike `new_tuple()`, this function allows but ignores any update field names not in tuple `t`.

Unlike `new_tuple()`, this function requires all field names in `t` to be specified.

Let's say an input stream has a schema defined as `(Symbol string, Price double, NumSh double, CustID int)`, while the named schema Trade has a subset of that schema: `(Symbol string, Price double, NumSh double)`. The following expression in a Map operator's Additional Expressions grid accepts the four-field input tuple and extracts only the three-field trade basics into a new field named TradeOnly.

Action Field Name Expression

### new_tuple_subset_loose()

Function syntax:

```tuple new_tuple_subset_loose(tuple `t`, `updated_value1` AS `field1_name`[,
`updated_value2` AS `field2_name`[, ... [,
`updated_valueN` AS `fieldN_name`]]])```

This function returns a tuple with the schema of the tuple argument `t`, and with the specified new field values substituted.

The required first argument `t` is a tuple value such as a named schema or the name of a tuple field in the incoming stream. Use one or more field update arguments of the form `value AS field_name`. You can use `input.* AS *` to fill in the fields in `t` with input stream fields that have matching field names.

Unlike `new_tuple()`, this function allows but ignores any update field names not in tuple `t`.

Like `new_tuple()`, this function does not require all field names in `t` to be specified.

This function is especially useful when using a named schema to specify a declared output stream schema. In a Map operator immediately upstream of the output stream, use this function to reconcile a varying set of input fields to the output stream's declared schema. If the output stream's declared schema is named `TargetSchema`, use an Additional Expressions grid like the following:

Action Field Name Expression
Add TempTuple new_tuple_subset_loose(TargetSchema(), input.* AS *)

Then add a second Map to expand TempTuple's fields back to the top level:

Action Field Name Expression

As an alternative, you can collapse both Map operators into one that uses the following:

Action Field Name Expression
Add * new_tuple_subset_loose(TargetSchema(), input.* AS *).*

This setup allows you to change the definition of `TargetSchema` elsewhere in a large application without having to also change the preceding Map operator.

### notnull()

Function syntax:

`bool notnull(T `value`)`

Returns true if the argument is not a null value. Accepts all data types.

This function always returns the opposite of `isnull()`.

### nullif()

Function syntax:

`T nullif(T `arg1`, T `arg2`)`

Takes two arguments of any type (the same type for both). If the resolved value of both arguments is the same, returns null. If the resolved value of the two arguments differs, returns the value of the first argument. This function can be used to evaluate fields before and after an event, and to send a field upstream only if it has changed, otherwise null.

### parsecsv()

Function syntax:

`tuple parsecsv(tuple `tuplespec`, tuple `tupleoptions`, string `csv-string`)`

Accepts a string formatted as comma-separated values (CSV) and returns a tuple. It takes three arguments:

• `tuplespec` - a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition

• `tupleoptions` - a tuple defining one or more of the processing options described in the following table, or `null` if none are passed in

• `csv-string` - A string with a list of items delimited by commas (default) or by the delimiter character you specified in tupleoptions

The tuple options are:

Option Name Description Data Type Default Value
delimiter Delimiter character separating individual fields string , (comma)
quotechar Quote character for string fields string " (double quote)
header A CSV string, which may be optionally used to specify the order of tuple fields in the inputs to the function. string The order of the fields in the output Schema type
nullstring String that specifies what null values in the input CSV string look like string "null" (Tuple.NULL_STRING)
timestampformat Timestamp format to be used to parse timestamp fields in the input strings string yyyy-MM-dd hh:mm:ss.SSSZ
lenientTimestampParsing Whether to attempt to parse timestamps using the default datetime format ("yyyy-MM-dd hh:mm:ss.SSSZ") or interval format ("ss.SSS") when a timestamp field cannot be parsed using the specified timestampformat boolean true

No value is set for fields that do not have a value specified in the input string. If an input string has more fields than the field order specifies, the additional fields are ignored with a warning.

You can specify how values in CSV string inputs are ordered using the header option. The header field names must match those in the output schema. If the header contains any field names that are not in the schema, those fields are ignored, and a debug level log message is issued. If the header option is not specified, input values must be ordered as specified by the schema in the first argument.

Specify sub-tuples as escaped CSV strings (using `\"` or the designated quote character). There is no option to specify the field order within fields that are tuples themselves. See the second example below.

Example (can be run with `sbd --eval` in StreamBase 7.x; line broken for publication clarity):

```parsecsv(tuple ("a" AS FirstName, "b" AS LastName, "c" AS StreetNum,
"d" AS StreetName, 1 AS Zip, "e" AS email), null,
"John,Doe,30,Main Street,43210,jdoe666@myisp.com")```

Output:

```((FirstName string, LastName string, StreetNum string, StreetName string,
Zip int, email string)) John,Doe,30,Main Street,43210,jdoe666@myisp.com ```

### Note

All three input arguments (tuple, tuple, string) are required. The tuple options can be `null` but must be included.

The same example with street address formatted as a sub-tuple with delimiter set to semicolon:

```sbd --eval 'parsecsv(tuple ("a" AS FirstName, "b" AS LastName,
tuple ("c" AS StreetNum, "d" AS StreetName, 1 AS Zip) AS Address,
"e" AS email), tuple(";" AS delimiter),
"John;Doe;\"30;Main Street;43210\";jdoe666@myisp.com")```

Output:

```((FirstName string, LastName string, Address (StreetNum string,
StreetName string, Zip int), email string))
John,Doe,"30,Main Street,43210",jdoe666@myisp.com```

### parsejson()

Function syntax:

`tuple parsejson(tuple `tuplespec`, string `json-string` [, string `timestampformat`])`

Accepts a JSON-formatted string, in either JSON object or JSON array format (as defined on http://www.json.org) and returns a tuple. The first argument, `tuplespec`, is a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition.

Use the optional third argument to specify a timestamp format with which to parse the JSON string, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string `yyyy-MM-dd HH:mm:ss.SSSZ`.

For example, for the named schema point(), defined as `(int x, int y)`, the following expressions both return the tuple `(x: 88, y: 99)`.

```parsejson(point(), "[88, 99]") parsejson(point(), "{x:88, y:99}")```

### parsejson_loose()

Function syntax:

`tuple parsejson_loose(tuple `tuplespec`, string `json-string` [, string `timestampformat`])`

Works similarly to `parsejson()`, but ignores json elements that do not fit the schema definition provided. Accepts a JSON-formatted string, in either JSON object or JSON array format (as defined on http://www.json.org) and returns a tuple. The first argument, `tuplespec`, is a specification for the returned tuple's schema, in the form of a named schema constructor function, or the tuple() function, or an expression that resolves to a schema definition.

Use the optional third argument to specify a timestamp format with which to parse the JSON string, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string `yyyy-MM-dd HH:mm:ss.SSSZ`.

For the named schema point(), defined as `(int x, int y)`, `parsejson_loose()` returns the tuple `(x: 88, y: 99)` from both the following expressions.

```parsejson(point(), "[88, 99]") parsejson(point(), "{x:88, y:99}")```

In addition, it would return the same result from a JSON string such as `{x:88, y:99, z:101}`, ignoring the extra element.

In the following example, the initial JSON element `z` is ignored, and the missing tuple element `z` is set to null, to return the tuple `1,"2,""null,3""`" (line break added for clarity):

```parsejson_loose(tuple('' AS x, tuple('' AS y, tuple ( '' AS z, '' AS t) AS q) AS r),   \"{z:'a', x:'1', r:{y:2, q:{t:'3'}}}\")```

### random_tuple()

Function syntax:

`tuple random_tuple(`schema-definition`)`

Given a schema definition, usually in the form of a named schema constructor function, returns a tuple with each field filled with random data of the correct data type for each field. For example, for the named schema, `point`, with schema `(double x, double y, double z)`, the named schema constructor function `point()` creates a single tuple with that schema. Use `point()` as the schema definition argument for random_tuple():

`random_tuple(point())`

This returns three double values selected at random, any of which might be null. Numeric values returned can be fractional or negative. Each time this function is run, another three random values are returned.

You can also use random_tuple() using tuple() to define a schema. In this case, you must provide value arguments and field names, as in the following example:

`random_tuple(tuple(132.34 as price, "IBM" as symb, now() as tdate))`

This at first appears to be a contradiction, because you provide values to fields that you then ask random_tuple() to replace with new, random values. However, in this context, the values you provide to tuple() are evaluated by random_tuple() only to determine the data type for each field. For this example, random_tuple() returns three random fields of type double, string, and timestamp.

### toCodePoint()

Function syntax:

`int toCodePoint(string `characters` [, int `index`])`

Accepts a string of one or more characters and returns the corresponding ASCII code. If more than one character is provided with no `index` argument, the code corresponding to the first character is returned. You can also provide a zero-based `index` value to specify which character in the provided string whose code you want returned.

### tojson()

Function syntax:

`string tojson(tuple `fieldname` [, bool `verbosity`] [, string `timestampformat`])`

Accepts a tuple argument and returns the tuple's field values as a JSON object of key-value pairs, or optionally as a JSON array of field values. The argument `fieldname` is likely to be a field of type tuple in the incoming stream, but can be any expression that resolves to a single tuple.

Use the optional third argument to specify a timestamp format with which to parse timestamp fields in the incoming tuple, using format patterns in the manner of java.text.SimpleDateFormat. In the absence of a third argument, timestamps are parsed using the default format string `yyyy-MM-dd HH:mm:ss.SSSZ`. When specifying the third argument, the `verbosity` second argument is no longer optional, and must be explicitly set.

By default, or with the optional `verbosity` argument expressed as `true`, the input tuple's field and value pairs are returned as a string formatted as a JSON object, as defined on http://www.json.org: a left brace followed by colon-separated name-value pairs (with field name quoted), and each pair separated by a comma, followed by a closing right brace.

Set the second argument to `false` to return a JSON array of field values instead: a left bracket following by comma-separated field values, followed by a closing right bracket.

For example, let's say an incoming tuple field named `trade` has the following schema and values:

Field Data Type Value
A double 2.0
B int 333
C string "IBM"
D list(int) range(0, 8, 2) = [0, 2, 4, 6]

In this case, the expressions `tojson(trade)` and `tojson(trade, true)` return the following string:

`{"A":2.0,"B":333,"C":"IBM","D":[0,2,4,6]}`

To test this function at the command prompt with sbd --eval, use a construction like the following:

`sbd --eval "tojson(tuple(2.0 as A, 333 as B, 'IBM' as C, range(0, 8, 2) as D))"`

The expression `tojson(trade, false)` returns the following string:

`[2.0,333,"IBM",[0,2,4,6]]`

### uuid()

Function syntax:

`string uuid()`

Returns a pseudo randomly generated 128-bit universally unique identifier string in the manner of java.util.UUID.randomUUID().

## Aggregate Functions Overview

Aggregate functions are used on sets of data to return a single result. Aggregate functions evaluate columns of data from windows or tables. In EventFlow applications, aggregate functions can only be used in the following aggregate contexts:

• In Aggregate operators, in aggregate expressions that apply to an aggregate window.

• In Query operators that perform Read or Delete operations, in expressions that apply to all rows read.

• In Query operators that perform Read or Delete operations, in aggregate expressions used in conjunction with a Group-By field.

In StreamSQL applications, aggregate functions can be used in SELECT statements related to aggregate or query read operations.

Expressions in aggregate contexts can contain a single aggregate function, or they can combine multiple aggregate functions together with constants and fields from an input stream or an associated Query Table. Aggregate expressions can include simple functions if the value of the simple function does not change over the lifetime of the Aggregate window.

Aggregate functions are organized into the following categories:

 Aggregate Functions: Aggregate to List Aggregate Functions: External Functions Aggregate Functions: Statistical Calculations Aggregate Functions: Windowing

## Aggregate Functions: Aggregate to List

This category includes the following functions:

 aggregatelist() aggregatelistnonnull() avg_list() concat() count_list() join(), joinlist() rank() sum_list() unique()

### aggregatelist()

Function syntax:

`list(T) aggregatelist(T e)`

Returns a list containing one element for each expression, `e`, in the Aggregate context. The list element type, T, of the returned list is the same as the data type of `e`.

### aggregatelistnonnull()

Function syntax:

`list(T) aggregatelistnonnull(T e)`

Returns a list containing one element for each expression, `e`, in the Aggregate context. The list element type, T, of the returned list is the same as the data type of `e`. Unlike `aggregatelist()`, this function excludes nulls from the constructed list.

### avg_list()

Function syntax:

`list(T) avg_list(list(T) `L`)`

Returns a list containing the element-wise average of the elements of the field of type list, `L`, in the Aggregate context. The list element type, T, of the returned list is the same as the element type of the argument list, or a coerced superset type. For example, an Aggregate operator might specify `avg_list(intlistfield)` as an aggregation expression. With the following input tuples:

```{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6], otherfield: y}```

When the Aggregate operator emits, the result is:

`[2.5, 3.5, 4.5]`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the average.

### concat()

Function syntax:

`list(T) concat(list(T) `L`)`

Returns a list containing the elements of the field of type list, `L`, in the Aggregate context. The list element type, T, of the returned list is the same as the element type of the argument list.

### count_list()

Function syntax:

`list(int) count_list(list(T) `L`)`

Returns a list containing the element-wise count of the elements of the field of type list, `L`, in the Aggregate context, returning a list of type int. For example, an Aggregate operator might specify `count_list(intlistfield)` as an aggregation expression. With the following input tuples:

```{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6, 7, 8], otherfield: y}```

When the Aggregate operator emits, the result is:

`[2, 2, 2, 1, 1]`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the count.

### join(), joinlist()

Function syntax:

```string join(string `separator`, string `S`)
string joinlist(string `separator`, string `S`)```

Returns a string formed by concatenating string field `S` in the Aggregate context. The `separator` string is inserted between each string. If the `separator` string is empty, no separators are inserted. If the `separator` string is null, join() returns null.

You can use `joinlist()` as an alias for `join()` in all contexts. In StreamSQL contexts, which has the reserved word JOIN, and in LiveView LiveQL, which partly inherits its parser from StreamSQL, you must either use the `joinlist()` alias, or use escaped identifier syntax.

Also see the simple function join().

### rank()

Function syntax:

`list(double) rank(double `field`)`

Returns a list containing the rank ordering of values of a field of type double in the Aggregate context, in ascending order. The input type must be double or be cast to double. If the input field contains duplicated values, the rank returned for each of them is all the average of their ranks. For example, if a field called priority contained the values `5, 1, 6, 2` in the aggregate window, then `rank(priority)` returns `[3, 1, 4, 2]`. If the values for priority instead were `5, 2, 6, 2`, then `rank(priority)` returns `[3, 1.5, 4, 1.5`].

### sum_list()

Function syntax:

`list(T) sum_list(list(T) `L`)`

Returns a list containing the element-wise sum of the elements of the field of type list, `L`, in the Aggregate context. The list element type, T, of the returned list is the same as the element type of the argument list. For example, an Aggregate operator might specify `sum_list(intlistfield)` as an aggregation expression. With the following input tuples:

```{intlistfield: [1, 2, 3], otherfield: x}
{intlistfield: [4, 5, 6], otherfield: y}```

When the Aggregate operator emits, the result is:

`[5, 7, 9]`

The length of component lists can vary. Short lists are padded with nulls to match the length of the longest list seen in the current context. Null values are ignored for the computation of the sum.

### unique()

Function syntax:

`list(T) unique(T `field`, int `numuniq`)`

Returns a list containing the unique values in the specified `field` in the Aggregate context. The data type of the returned list is the same as that of the `field`.

The unique() function maintains an exact list of unique values until `numuniq` is reached, and does not report changes as the number of values added exceeds `numuniq`. If a value in the reported unique list is removed, it is replaced by some other unique value, if one exists. If there are more than `numuniq` values, there is no guarantee which `numuniq` values are reported.

Also see the simple function unique().

## Aggregate Functions: External Functions

This category includes the following functions:

 callcpp() calljava()

### callcpp()

Function syntax:

`T callcpp(string `class` [, `arg1`, ..., `argn`])`

As an alternative to using `callcpp()`, you can define an alias for your custom functions. See Using Function Aliases.

Use an alias or `callcpp()` to run a custom C++ aggregate function directly from a StreamBase operator that uses an expression. The return type, T, of `callcpp()` is the same as the return type of the called function. Custom C++ aggregate functions are functions you build with the StreamBase C++ Client API, specifically those that inherit from the `sb::PluginAggregate` class. You can use the aggregate form of `callcpp()` in any aggregate expression. (To use `callcpp()` in simple expressions, refer to the simple callcpp().)

For `callcpp()` to locate the function being called, you must specify the location of the containing DLL or `.so` file, and must register the function with a `<custom-function>` element in the StreamBase Server configuration file. You specify the function as an aggregate function by means of the `type` attribute of `<custom-function>`. See C++ Function Overview for details.

To learn about coding custom C++ functions, refer to Creating Custom C++ Functions in the API Guide.

### calljava()

Function syntax:

`T calljava(string `class` [, `arg1`, ..., `argn`])`

Use `calljava()` to run a custom Java aggregate function directly from a StreamBase operator that uses an expression. The return type, T, of `calljava()` is the same as the return type of the called function. Custom Java aggregate functions are functions you build with the StreamBase Java Client library, specifically those that extend the `AggregateWindow` class. You can use the aggregate form of `calljava()` in any aggregate expression and in the Query Table Group Options tab. (To use `calljava()` in simple expressions, refer to the simple calljava().)

As an alternative to using `calljava()`, you can define an alias for your custom functions. See Defining Functions in Java and C++.

The `calljava()` function distinguishes simple from aggregate functions by the number of arguments. For custom simple functions, you must specify both class and method names. For custom aggregate functions, you specify only the class name, because these functions must inherit from `AggregateWindow`, which has an `init()` method that is automatically run.

As an alternative to using `calljava()`, you can define an alias for your custom functions. See Using Function Aliases.

To learn about coding custom Java functions, refer to Using the StreamBase Java Function Wizard in the API Guide. For information on the classpath requirements for custom Java functions, see Java Function Overview.

See Return Types and Argument Types for a discussion of using Java primitives or Java objects in the function you write to be called with `calljava()`.

## Aggregate Functions: Statistical Calculations

This category includes the following functions:

 alpha() andall() avg() avgif beta() correlation_coefficient() correlation_coefficientp() count_distinct() count() countif countlong() covariance() covariancep() exp_moving_avg() goodman_kruskal_gamma() intercept() kendall_tau() max() maxn() median() min() minn() orall() pearson() percentile() product() slope() spearmans_rank() stdev() stdevp() sum() variance() variancep() sumif() vwap() withmax() withmin()

### alpha()

Function syntax:

`double alpha(double `index`, double `price`, double `dividend`)`

Returns a double that may indicate the part of a stock's movement that is independent of the index's movement.

Examples of stocks increasing in alpha could be those with takeover rumors, under strong syndicate manipulation, or having strong expectations of good results; that is, factors that make them increasingly move independently of the index.

The beta of a stock is defined as the slope of a regression line in a scatter graph of paired data points representing percentage changes of an index (return of an index) and the corresponding change (return of stock) in the price of a stock. The StreamBase `beta()` function also includes the stock dividend in the calculation of the stock return.

The alpha is the point where this regression line cuts the Y axis. To reiterate, a stock's beta can be described as that part of a stock's movement that is influenced by the index, while a stock's alpha can be regarded as that part of a stock's movement that is independent of the index's movement.

The `alpha` function's arguments are:

1. `index`: the end-of-period market index (double)

2. `price`: the end-of-period stock price (double)

3. `dividend`: the stock dividend of the period (double)

The stock price must have already figured in any stock split (or reverse split). The alpha is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, first normalize the yield (return), and then use the intercept function to calculate alpha.

### avg()

Function syntax:

```double    avg(double `field`)
int       avg(int `field`)
long      avg(long `field`)
timestamp avg(timestamp `field`)```

Returns the average value of a numeric field, `field`, in all tuples in the Aggregate context. The field's type can be a double, int, long, or timestamp. For a returned average of timestamp values to make sense, the aggregate field must contain all interval timestamps or all absolute timestamps, but not both.

### avgif()

Function syntax:

`int avgif(T `fieldname`, bool `predicate`)`

Returns the average of the values for which the specified numeric field is not null and for which the predicate evaluates to true. The predicate expression can be any valid expression on any field, and is not restricted to values from the specified field.

For example: `avgif(input.income, input.age < 35)`

### andall()

Function syntax:

`bool andall(bool `f`)`

The argument `f` represents a field of type boolean in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function evaluates the values in the specified field, and returns the results of a logical AND operation on all values. For example, suppose field `BoolField` contains the following values in successive rows of an Aggregate window or Query Table column: `true, true, false, true, false`. In that case, `andall(BoolField)` returns `false`. Use andall() as an aggregate truth detector function.

Null arguments have special handling. A `null` argument does not change the result of the evaluation if a `false` value is among the arguments. Thus, if `BoolField` contains `null, true, false, null, true`, then `andall(BoolField)` still returns `false`. However, if one or more arguments is null while all other arguments are `true`, the function returns `null`. So if `BoolField` contains `null, true, true, null, true`, then `andall(BoolField)` returns `null`.

The andall() function follows the logic expressed in this statement:

if any argument is false, return false
else if any argument is null, return null
else return true

See also the orall() aggregate function and the simple versions of andall() and orall().

### beta()

Function syntax:

`double beta(double `index`, double `price`, double `dividend`)`

Returns a double that may indicate the tendency of a security's returns to respond to swings in the market. A beta of `1` indicates that the security's price will move with the market. A beta of less than 1 means that the security will be less volatile than the market. A beta value that is greater than 1 indicates that the security's price will be more volatile than the market.

For related information, see the description of the alpha function.

Beta is a measure of a security's or portfolio's volatility, or systematic risk, in comparison to the market as a whole. For example, if a stock's beta is `1.2`, it is theoretically 20% more volatile than the market.

The function's arguments are:

1. `index`: the end-of-period market index (double)

2. `price`: the end-of-period stock price (double)

3. `dividend`: the stock dividend of the period (double)

The stock price must have already figured in any stock split (or reverse split). The beta is usually calculated over a period of 61 months. This aggregate function assumes that the input data is already normalized. That is, index, price, and dividend are of the same period. If they are of different periods, the yield should be normalized first and then use the slope function to calculate beta.

### correlation_coefficient()

Function syntax:

```double correlation_coefficient(double `price`, double `index`)
double correlation_coefficient(int `price`, int `index`)
list(double) correlation_coefficient(string `name`, int `index`,
double `field1`, double `field2`, ...)```

The first two syntaxes return the Pearson correlation coefficient for two fields, `price` and `index`, for all tuples in the Aggregate context. This function takes two input arguments (both can be an int or a double) and returns a double. The correlations coefficient is a measure that determines the degree to which two variable's movements are associated. The correlation coefficient varies from `-1.0` to `1.0`. The value `-1.0` indicates perfect negative correlation, and `1.0` indicates perfect positive correlation. `0` means no correlation.

### Note

• The correlation_coefficient of 0 items is null.

• The correlation_coefficient of 1 item is 1.0.

• The correlation_coefficient of N identical items is 1.0.

The third syntax, list(double) correlation_coefficient(string `name`, int `index`, double `field1`, double `field2` ...), returns a list of correlation coefficients, containing as many members as there are field arguments (there must be at least two). The output list of doubles is in the same order as the field arguments. Given `n` field names, the index argument specifies which of those `n` fields holds the primary variable (for which field the returned correlation coefficient is 1.0). The value of index must not change over the life of the operator.

The name argument specifies the type of correlation coefficient to compute, taken from this list:

• `goodman_kruskal_gamma`

• `kendall_tau`

• `pearson`

• `spearmans_rank`

Click any name above to see the description of the statistic. You can abbreviate and approximate the spellings you supply for the name, which must not change over the life of the operator. You can call any of these functions directly by its full name, as well as using the above syntax. However, when you call these functions directly, only two variables can be correlated per call and the output is a scalar double.

If any input field contains null values, the function returns null for its correlation coefficient.

### correlation_coefficientp()

Function syntax:

```double correlation_coefficientp(double `price`, double `index`)
double correlation_coefficientp(int `price`, int `index`)```

Returns the correlation coefficient for two fields, `price` and `index`, for all tuples in the Aggregate context. The `correlation_coefficientp` function is similar to the `correlation_coefficient` function, but use `correlation_coefficientp` when the data provided is the entire population, while the `correlation_coefficient` function is used for a random sample. The `correlation_coefficientp` function is calculated using the biased (or n) method. The `correlation_coefficient` function is calculated using the unbiased (or n-1) method.

This function takes two input arguments (both can be an int or a double) and returns a double. The correlations coefficient is a measure that determines the degree to which two variable's movements are associated. The correlation coefficient varies from `-1.0` to `1.0`. The value `-1.0` indicates perfect negative correlation, and `1.0` indicates perfect positive correlation. 0 means no correlation.

### Note

• The correlation_coefficientp of 0 items is null.

• The correlation_coefficientp of 1 item is 1.0.

• The correlation_coefficientp of N identical items is 1.0.

### count(), countlong()

Function syntax:

```int  count([T `expr`])
long countlong([T `expr`])```

Returns the number of tuples in the Aggregate context. If this function is called with an argument, then tuples for which the argument evaluates to null do not contribute to the count. The argument can be of any data type, T, but its actual value is ignored, other than the fact that it is non-null. If no argument is specified, then all tuples are included in the count.

`countlong()` works exactly as `count()` does, except that it returns a long value rather than an int.

To clarify the operation of `count()`, consider the following sequence of values for x:

`x {1,2,3,4,5,6,7,8,9,10}`

Calling `count(x)` returns 10. There are ten items in the list.

Calling `count(x > 5)` also returns 10. This is because it is counting the sequence {f,f,f,f,f,t,t,t,t,t} (five false, five true). All elements in this sequence are non-null, so all are counted.

Calling `count(if x > 5 then 0 else null)` returns 5, because it is counting the sequence {null, null, null, null, null, 0, 0, 0, 0, 0} (five nulls, five zeros). The five non-null elements are counted. Any non-null value other than 0 would also count in this expression.

The following example clarifies the `count()` function's behavior with query tables.

Let's say we have a table named flattop with the following schema and contents:

Field Name Data Type Contents of Three Rows
alpha int 2, 4, 6
beta long 3.0, 5.0

The expression `select count(flattop.*) from in, tablet ...` returns

`3, 2 `

which is the number of non-null values in the `alpha` field, plus the number of non-null values in the `beta` field.

If you want a count of the number of items in the window, use an expression like the following: `select count() as c from in, flattop ...`

### count_distinct()

Function syntax:

`int count_distinct(T arg1 [, T arg2 ...]])`

Returns the number of tuples in the Aggregate context with specified values for the parameters. This function must be called with at least one argument of any data type. Multiple arguments can be of different types. Use commas to separate subsequent arguments. For the purpose of this function, null values of a field are considered equivalent to each other.

### countif()

Function syntax:

```int countif(bool `predicate`)
int countif(T `fieldname`, bool `predicate`)```

With one argument, returns the number of rows for which the specified predicate argument evaluates to true. When used with two arguments, it returns the number of rows for which the specified field is not null and the predicate evaluates to true.

### covariance()

Function syntax:

```double covariance(double `price`, double `index`)
double covariance(int `price`, int `index`)
double covariance(long `price`, long `index`)```

Returns the covariance for two fields, `price` and `index`, for all tuples in the Aggregate context. This function takes two input arguments (both can be a double, int, or long) and returns a double. In financial applications, covariance can be used to measure the degree to which returns on two risky assets move in tandem. A positive covariance means that asset returns move together. A negative covariance means returns vary inversely.

In the arguments, the `price` field is a stock price, and the `index` is the industry segment index.

One method of calculating covariance is to look at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.

For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.

### covariancep()

Function syntax:

```double covariancep(double `price`, double `index`)
double covariancep(int `price`, int `index`)
double covariancep(long `price`, long `index`)```

Returns the covariance for two fields, `price` and `index`, for all tuples in the Aggregate context. The `covariancep` function is similar to the `covariance` function, but use `covariancep` when the data provided is the entire population, while the `covariance` function is used for a random sample. The `covariancep` function is calculated using the biased (or n) method. The `covariance` function is calculated using the unbiased (or n-1) method.

This function takes two input arguments (both can be a double, int, or long) and returns a double. In financial applications, covariance can be used to measure the degree to which returns on two risky assets move in tandem. A positive covariance means that asset returns move together. A negative covariance means returns vary inversely.

In the arguments, the `price` field is a stock price, and the `index` is the industry segment index.

One method of calculating covariance is by looking at return deviations from the expected return in each scenario. Another method is to multiply correlation between the two variables by the standard deviation of each variable.

For example, if you owned one asset that had a high covariance with another asset that you did not own, then you would receive very little increased diversification by adding the second asset. Of course, the opposite is true as well, adding assets with low covariance to your portfolio would lower the overall portfolio risk.

### exp_moving_avg()

Function syntax:

```double exp_moving_avg(double `value`, int `count` [, double `weight`])
double exp_moving_avg(int `value`, int `count` [, double `weight`])
double exp_moving_avg(long `value`, int `count` [, double `weight`])```

Returns a weighted average where a percentage of each successive value is used with a percentage of the existing average to construct the new average. The effect of any one value on the average diminishes exponentially as new values are added.

The function takes these arguments:

`value`

The value (double, int, or long) to be averaged.

`count`

The integer number of periods over which a simple average is used prior to starting the EMA.

`weight`

An optional weighting value (a double between 0 and 1) that is to be applied to both the current period's value (CPV) and prior period's average (PPA) when the Exponential Moving Average (EMA) is calculated.

If the weight is not between 0 and 1 or is not specified, it is calculated from the following formula:

`weight = 2 / (count + 1)`

This function returns a double. It is most commonly used with an unlimited aggregate window size (although it does not have to be). When configuring an Aggregate operator for an unlimited window, do not configure the Advance and Size fields in StreamBase Studio, which results in unlimited window size. Instead, set the Emit field to `1` so that the aggregate emits for every tuple.

The `exp_moving_avg` function is calculated using the following formula:

``EMA` = (`CPV` * `weight`) + (`PPA` * (1 - `weight`))`
• `EMA`: exponential moving average so far

• `PPA`: exponential moving average after the previous tuple

• `CPV`: value of current tuple

Initially there is no PPA. A simple moving average (SMA) is computed instead over as many tuples as specified by the second parameter. When the SMA is computed, a Null value is emitted for each input tuple until the Aggregate operator receives enough tuples to compute the SMA. Following this, the EMA is calculated.

### goodman_kruskal_gamma()

Function syntax:

`double goodman_kruskal_gamma(double field1, double field2)`

Returns the Goodman-Kruskal Gamma correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits. The Goodman-Kruskal Gamma statistic is a rank correlation of concordant pairs. If any field values are null, the returned value is null.

To perform multivariate Goodman-Kruskal Gamma correlations, use the aggregate `correlation_coefficient` function.

### intercept()

Function syntax:

```double intercept(double `x`, double `y`)
double intercept(int `x`, int `y`)```

For the data set of fields `x` and `y` for all tuples in the Aggregate context, a best-fit linear regression line is calculated, then the function returns the value where the line intersects the y axis. The method used to find the line that best fits a group of points is called least squares (or linear least squares). This function takes two input arguments of the same type (both can be an int or a double) and returns a double.

This function returns zero if the Aggregate window contains one row or duplicate rows containing the same values.

### kendall_tau()

Function syntax:

`double kendall_tau(double field1, double field2)`

Returns the normalized Kendall Tau distance metric between values of two fields in the Aggregate context as a double value when the window emits. The Kendall Tau distance statistic counts the number of pairwise disagreements between the ranks of corresponding items in two fields. Larger values indicate greater disagreement in rank ordering, such that a return value of 0 means complete agreement and 1.0 means complete disagreement.

To perform multivariate Kendall Tau correlations, use the aggregate `correlation_coefficient` function.

### max()

Function syntax:

`T max(T `f`)`

The argument `f` represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function returns the maximum non-null value using the greater-than relational operation appropriate for the data type of `f`. The data type, T, of the returned value is the same as the argument.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate function maxn() below, and the maxelement() function for lists. In an aggregate context, this aggregate function `max(list)` returns the maximum list in the aggregate window. Use the simple function `max(list, , list)` to determine the larger of two or more lists. By contrast, use `maxelement()` to compare the element values of a single list.

### maxn()

Function syntax:

`list(T) maxn(int `count`, T `f`)`

The argument `f` represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate context.

Returns a list with `count` elements containing the `count` number of maximum non-null values for field `f` in the Aggregate context. Accepts fields of all data types and returns a list whose elements are the same as the type of field `f`. The value of `count` must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For `count`=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

Maximum values are determined with the greater-than relational operation appropriate for the data type of `f`. StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of max(), the aggregate version of max(), and the maxelement() function for lists.

### median()

Function syntax:

```double median(double `x`)
double median(int `x`)
double median(long `x`)```

Returns the median value of `x` for this Aggregate context. If the window has an even number of elements, returns the average of the middle two.

### min()

Function syntax:

`T min(T `f`)`

The argument `f` represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function returns the minimum non-null value using the less-than relational operation appropriate for the data type of `f`. The data type, T, of the returned value is the same as the argument.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate function minn() below, and the minelement() function for lists. In an aggregate context, this aggregate function `min(list)` returns the minimum list in the aggregate window. Use the simple function `min(list, , list)` to determine the smaller of two or more lists. By contrast, use `minelement()` to compare the element values of a single list.

### minn()

Function syntax:

`list(T) minn(int `count`, T `f`)`

The argument `f` represents a field in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate context.

Returns a list with `count` elements containing the `count` number of minimum non-null values for field `f` in the Aggregate context. Accepts fields of all data types and returns a list whose elements are the same as the type of field `f`. The value of `count` must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For `count`=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

Minimum values are determined with the less-than relational operation appropriate for the data type of `f`. StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

See also the simple version of min(), the aggregate version of min(), and the minelement() function for lists.

### orall()

Function syntax:

`bool orall(bool `f`)`

The argument `f` represents a field of type boolean in an Aggregate operator, a Query Table column, or an expression that depends on one or more fields in an Aggregate operator or Query Table.

This function evaluates the values in the specified field, and returns the results of a logical OR operation on all values. For example, suppose field `BoolField` contains the following values in successive rows of an Aggregate window or Query Table column: `true, true, false, true, false`. In that case, `orall(BoolField)` returns `true`. Use orall() as an aggregate falsity detector function.

Null arguments have special handling. A `null` argument does not change the result of the evaluation if a `true` value is among the arguments. Thus, if `BoolField` contains `null, false, true, null, false`, then `orall(BoolField)` still returns `true`. However, if one or more arguments is `null` while all other arguments are `true`, the function returns `null`. So if `BoolField` contains `null, false, false, null, false`, then `orall(BoolField)` returns `null`.

The orall() function follows the logic expressed in this statement:

if any argument is true, return true
else if any argument is null, return null
else return false

See also the andall() aggregate function and the simple versions of andall() and orall().

### pearson()

Function syntax:

`double pearson(double field1, double field2)`

Returns the Pearson product-moment correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits that indicates the degree of linear dependence between two variables. The Pearson statistic is defined as the covariance of the two variables divided by the product of their standard deviations. The term product-moment refers to the mean (the first moment about the origin) of the product of the mean-adjusted random variables. A return value of `-1` indicates a perfect negative linear correlation. A value of `0` indicates no correlation (random variates). A value of `1` indicates perfect positive linear correlation. If any field values are null, the returned value is null. If the variance of either input variable is zero, the function returns `NaN`.

To perform multivariate Pearson correlations, use the aggregate `correlation_coefficient` function or the aggregate `correlation_coefficientp` function.

### percentile()

Function syntax:

`T percentile(T `field`, double `pct`)`

Returns the value of the specified field with percentile rank equal to the specified `pct` value. If there is no value in the Aggregate context at exactly `pct` percentile, then the value with the greatest percentile less than `pct` is returned. The data type of the specified `field` (and of the corresponding result) must be one of the comparable types in the expression language, as listed on the Data Types page.

### product()

Function syntax:

```double product(double `f`)
int    product(int `f`)
long   product(long `f`)```

Returns the multiplication product computed for field `f` for all tuples in the Aggregate context. Supports data types double, int, and long.

### slope()

Function syntax:

```double slope(double `x`, double `y`)
double slope(int `x`, int `y`)```

For the data set of fields `x` and `y` for all tuples in the Aggregate context, a best-fit linear regression line is calculated, and then the function returns the line's slope. The method that is used to find the line that best fits a group of points is called least squares (or linear least squares). The `slope` function takes two input arguments of the same type (both can be an int or a double) and returns a double.

This function returns zero if the Aggregate window contains one row or duplicate rows containing the same values.

The slope() function returns a unitless double value that does not represent an angle, and is therefore in neither degrees nor radians. The internal calculation takes the tan(theta), where theta is the angle of incline of the best linear fit to the data. This calculation returns the same value whether theta is internally expressed in degrees or radians.

### spearmans_rank()

Function syntax:

`double spearmans_rank(double field1, double field2)`

Returns the Spearman's Rank correlation coefficient between values of two fields in the Aggregate context as a double value when the window emits. Spearman's statistic is defined as the Pearson correlation coefficient between the ranked variables. A return value of `-1` indicates a perfect negative correlation (reverse ordering). A value of `0` indicates no concordance of ranks. A value of `1` indicates identical rank orderings. If any field values are null, the returned value is null.

To perform multivariate Spearman's Rank correlations, use the aggregate `correlation_coefficient` function.

### stdev()

Function syntax:

```double    stdev(double `f`)
double    stdev(int `f`)
double    stdev(long `f`)
timestamp stdev(timestamp `f`)```

Returns the standard deviation for field `f` for all tuples in the Aggregate context. The function takes one input argument. If the input argument is an int, double, or long, it returns a double. If the input argument is a timestamp, it returns an interval timestamp. For timestamp values, the aggregate field should contain all interval timestamps or all absolute timestamps, but not both. For windows with size < 2, returns null.

Standard deviation is a measure of the dispersion of a set of data from its mean. The more spread apart the data is, the higher the deviation. For example, in financial applications, standard deviation could be applied to the annual rate of return of an investment to measure the investment's volatility (risk). A volatile stock would have a high standard deviation. In mutual funds, the standard deviation indicates how much the return on the fund is deviating from the expected normal returns.

### stdevp()

Function syntax:

```double    stdevp(double `f`)
double    stdevp(int `f`)
double    stdevp(long `f`)
timestamp stdevp(timestamp `f`)```

Returns the standard deviation for field `f` for all tuples in the Aggregate context. The function takes one input argument. If the input argument is an int, double, or long, it returns a double. If the input argument is a timestamp, it returns an interval timestamp. For timestamp values, the aggregate field should contain all interval timestamps or all absolute timestamps, but not both. For windows with size < 2, returns 0.

The `stdevp()` function is similar to the `stdev()` function: with `stdevp()` the data provided is the entire population, while with `stdev()`, the data provided is treated as a random sample. The `stdevp()` function is calculated using the biased (or n) method. The `stdev()` function is calculated using the unbiased (or n-1) method.

### sum()

Function syntax:

```double    sum(double `f`)
int       sum(int `f`)
long      sum(long `f`)
timestamp sum(timestamp `f`)```

Returns the sum of field `f` computed for all tuples in the Aggregate context. The field can be an double, int, long, or timestamp.

When the field type is timestamp, summing members of the field follows the rules for adding timestamps as shown in the table in timestamp Data Type. That is, you cannot sum an aggregate of two or more absolute timestamps. However, you can sum an aggregate field composed of all interval timestamps, or one composed of exactly one absolute timestamp plus one or more interval timestamps.

### sumif()

Function syntax:

`int sumif(T `fieldname`, bool `predicate`)`

Returns the sum of the values for which the specified numeric field is not null and for which the predicate evaluates to true. The predicate expression can be any valid expression on any field, is not restricted to values from the specified field.

For example: `sumif(input.income, input.age < 35)`

### variance()

Function syntax:

```double variance(double `f`)
double variance(int `f`)
double variance(long `f`)```

Returns the variance for field `f` for all tuples in the Aggregate context. The function takes one input argument (an int, double, or long) and returns a double. For windows with size < 2, returns null.

Variance is a measure of the dispersion of a set of data points around their mean value. It is a mathematical expectation of the average squared deviations from the mean. Variance measures the variability (volatility) from an average. Volatility is a measure of risk. So for example, this statistic can help determine the risk an investor might take on when purchasing a specific security.

### variancep()

Function syntax:

```double variancep(double `f`)
double variancep(int `f`)
double variancep(long `f`)```

Returns the variance for field `f` for all tuples in the Aggregate context. The function takes one input argument (an int, double, or long) and returns a double. For windows with size < 2, returns 0.

The `variancep()` function is similar to the `variance()` function: with `variancep()` the data provided is the entire population, while with `variance()`, the data provided is treated as a random sample. The `variancep()` function is calculated using the biased (or n) method. The `variance()` function is calculated using the unbiased (or n-1) method.

### vwap()

Function syntax:

`double vwap(double price, double volume)`

Returns the volume-weighted average-price value from (typically) the `price` and `volume` values, for all tuples in the Aggregate context. Both arguments are doubles.

### withmax()

Function syntax:

`T1 withmax(T `compare_expr`, T1 `result_expr`)`

Returns the `result_expr` for this Aggregate context that has the maximum corresponding `compare_expr` using the greater-than relational operator. The data type, T1, of the returned value is the same as the `result_expr`. The data type of the `compare_expr` can be any data type, not necessarily the same as the `result_expr's` type.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

### withmin()

Function syntax:

`T1 withmin(T `compare_expr`, T1 `result_expr`)`

Returns the `result_expr` for this Aggregate operator's window that has the minimum corresponding `compare_expr` using the less-than relational operator. The data type, T1, of the returned value is the same as the `result_expr`. The data type of the `compare_expr` can be any data type, not necessarily the same as the `result_expr's` type.

StreamBase data types are comparable with relational operators in different ways, as listed in the entry for each data type on StreamBase Data Types.

## Aggregate Functions: Windowing

This category includes the following functions:

Function Use in Aggregate Operator? Use in Query Read on Query Tables?
closeval() yes no
firstn() yes yes
firstnonnullval() yes yes
firstval() yes yes
getOldestWindowID() yes no
getWindowID() yes no
isOldestWindow() yes no
lag() yes yes
lastn() yes yes
lastnonnullval() yes yes
lastval() yes yes
openval() yes no

All windowing functions work on columns of data in Aggregate windows, and most windowing functions also work on table columns returned by Query Read operations on Query Tables.

### Note

When used with a Query operator, these functions are specified in the Output Settings tab of the Query operator's Properties view, and operate on the initial selection of rows from the Query Table as specified in the Query Settings tab for a Query-Read operation. Remember that the order of rows returned from the Query Table depends on the Read predicate expression and on the sort order, if specified.

For example, if the Query-Read operation specifies Read All Rows on a table whose primary index is a string field, and you specify running the firstn() or lastn() functions on that selection, those functions return fields from the first `n` rows (or last `n` rows) in alphabetical order by primary index.

The three functions getWindowID(), isOldestWindow(), and getOldestWindowID() are pseudo-aggregate functions in that they only operate in the context of an Aggregate operator, but they do not query or evaluate the tuples flowing through the Aggregate operator. Instead, they report on the state of the various windows within the Aggregate.

### closeval()

Function syntax:

```double    closeval([string `dimension`])
int       closeval([string `dimension`])
long      closeval([string `dimension]`)
timestamp closeval([string `dimension`])```

The `closeval()` function behaves the same as openval() except that it returns the upper limit of the specified dimension when the window is closed. Like openval(), this function works in Aggregate operators, and not in Query operators.

The value returned by `closeval()` might differ from any of the actual tuple values included in the window. For example, consider an Aggregate operator with a dimension `CountDim`, where the windows created by that dimension include a window with a range of 5 to 10. If the actual tuple values in the window are 6, 7, and 9, then `closeval(CountDim)` returns `10` for that window, not `9`.

If the dimension name is omitted and the aggregate has only one dimension, then `closeval()` uses that dimension. If the dimension name is omitted and the aggregate has more than one dimension, the result is a typecheck error.

### firstn()

Function syntax:

`list(T) firstn(int `count`, T `f`)`

Returns a list with `count` elements containing the first `count` values for field `f` in an Aggregate operator's window or in a Query Table column. Accepts fields of all data types and returns a list whose elements are the same type as the type of field `f`. Field `f` can contain null values, and such fields are returned in the list.

The value of `count` must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For `count`=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### firstnonnullval()

Function syntax:

`T firstnonnullval(T `f`)`

Returns the first non-null value for field `f` in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### firstval()

Function syntax:

`T firstval(T `f`)`

Returns the first value for field `f` in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### getOldestWindowID()

Function syntax:

`int getOldestWindowID()`

Returns an integer that represents the ID of the oldest window still open of the current aggregation group.

This function works only in conjunction with the Aggregate operator.

### getWindowID()

Function syntax:

`int getWindowID()`

Returns an integer representing the ID of the current window. All windows of each Aggregate operator have a unique ID. The larger the value of the window ID, the younger the window is.

This function works only in conjunction with the Aggregate operator.

### isOldestWindow()

Function syntax:

`bool isOldestWindow()`

Returns `true` if the window in which this function is called is the oldest of the current aggregation group.

This function works only in conjunction with the Aggregate operator.

### lag()

Function syntax:

`T lag(T `f`)`

Returns the next-to-last value for field f in the Aggregate operator's window or in a table column. If the window or table column has only one row or zero rows, lag() returns null. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

Example usage in an aggregate context: `avg(fieldname/lag(fieldname))` where `fieldname` is a numeric value such as a price field.

### lastn()

Function syntax:

`list(T) lastn(int `count`, T `f`)`

Returns a list with `count` elements containing the last `count` values for field `f` in an Aggregate operator's window or in a Query Table column. Accepts fields of all data types and returns a list whose elements are the same type as the type of field `f`. Field `f` can contain null values, and such fields are returned in the list.

The value of `count` must be nonnegative, cannot be null, and must remain the same for the duration of the Aggregate window. For `count`=0, an empty Aggregate window, or an empty Query Table column, an empty list is returned.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### lastnonnullval()

Function syntax:

`T lastnonnullval(T `f`)`

Returns the last non-null value for field `f` in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### lastval()

Function syntax:

`T lastval(T `f`)`

Returns the last value for field `f` in the Aggregate operator's window or in a Query Table column. Accepts all data types and returns the same type as its argument.

This function works best in conjunction with the Aggregate operator. See the note above about using this function with Query operators.

### openval()

Function syntax:

```double    openval([string `dimension`])
int       openval([string `dimension`])
long      openval([string `dimension`])
timestamp openval([string `dimension`])```

Can only be used in an Aggregate operator, not in a Query Operator. Returns the lower limit of the specified dimension for the current window. The dimension name is passed as a quoted string, and must match the name of a dimension in the Aggregate operator. This function works even if the dimension specified is not the one that actually opened the current window. In that case, the value returned is the value of the dimension at the time the window was created.

The value returned by `openval` might differ from any of the actual tuple values included in the window. For example, consider an Aggregate operator with a dimension `CountDim`, where the windows created by that dimension include a window with a range of 5 to 10. If the actual tuple values in the window are 6, 7, and 9, then `openval(CountDim)` returns `5` for that window, not `6`.

If the dimension name is omitted and the aggregate has only one dimension, then `openval()` uses that dimension. If the dimension name is omitted and the aggregate has more than one dimension, the result is a typecheck error.