Contents
StreamBase supports the use of null values in applications. Nulls can be used to explicitly represent data that is missing or unknown.
Use the reserved value null
to indicate that the value of
a tuple field is null. When you test your application in the StreamBase Studio SB
Test/Debug perspective, the Manual Input View allows you to enter either null
or actual values for the input
fields.
In most cases with StreamBase expressions, any time you apply an arithmetic operator
or function with a field that is null
, the result is
null. The four exceptions are the isnull()
and
notnull()
functions, which test whether a field is
null, and the coalesce()
and coalesce_tuples()
functions, which select a non-null value from a
set of potentially null arguments.
To use an expression that intentionally sets the result to null, use one of the data type-specific null literals, as shown in the list below.
blob(null) bool(null) double(null) int(null) long(null) string(null) timestamp(null)
Use the nulllist() function to create a null list, as described in Null Lists. Null tuples are described in Null Tuples.
The data type of each null is never implicit: you must specify which type of null you are using in the expression.
Expression Example | Result |
---|---|
3 + int(null) | A null int |
int(null) + int(null) | A null int |
int(null) + bool(null) | A typecheck error. You cannot add an int and a bool. |
if boolean(null) then 3 else 4 | A null int |
int(null) == int(null)
or int(null) = int(null) |
A null bool, because null is not equal to itself |
int(null) != int(null) | A null bool, because null is not equal to itself |
isnull(int(null)) | A bool that evaluates to true |
notnull(int(null)) | A bool that evaluates to false |
In general, expressions involving null evaluate to null. However, some Boolean logic statements can contain unknowns or nulls, but the rest of the expression fully determines the truth value. In the following table, the order of the arguments does not matter.
Expression | Result |
---|---|
bool(null) && bool(null) | A null bool |
bool(null) AND true | A null bool |
bool(null) && false | false |
bool(null) || bool(null) | A null bool |
bool(null) OR true | true |
bool(null) || false | A null bool |
This is three-value Boolean logic as
used in standard SQL. Null is not a value in the same sense as true
and false
, but is instead a
placeholder for the absence of value. Some RDMS systems refer to this placeholder as
Unknown
; StreamBase uses the term null
.
StreamBase provides the isnull()
and notnull()
functions, allowing you to test whether fields in
expressions are null.
-
isnull(
value
) -
Returns true if the argument is a null value.
-
notnull(
value
) -
Returns true if the argument is not a null value. This function always returns the opposite of
isnull(
.value
)
For example, an expression in a StreamBase operator could use:
if (isnull(fieldA)) then notnull(fieldB) else true
The coalesce()
function returns the first non-null
value from its list of arguments.
-
coalesce(
value1
,value2
...) -
Returns the first argument that is non-null, or a null value if all arguments are null. All arguments must have the same type. For tuple arguments, returns the first tuple that is non-null.
Since literal values are never null, you can specify a literal value as the last
argument to provide an effective default value for the list. In this way,
coalesce()
can be used to emulate the NVL()
function provided by Oracle PL/SQL and the two-argument
ISNULL() function provided by Microsoft T/SQL. For example, the following expression
returns the value of fieldA
if it is non-null, or 0
if it is null:
coalesce(fieldA, 0)
The following example returns the first non-null field among fieldA
, fieldB
, and fieldC
in that order, or returns
-99999
if all three fields are null.
coalesce(fieldA, fieldB, fieldC, -99999)
To illustrate how to specify nulls, assume an example where the schema for a tuple contains these fields:
-
fieldA, an int
-
fieldB, a string
-
fieldC, an int
In the input data CSV file:
-
To designate all three values of a tuple row to be null, enter:
null,null,null
. -
To indicate that just fieldB is null, enter:
120,null,40
. -
To indicate that fieldA has a value, fieldB is an empty string, and fieldC is null, enter:
120,"",null
.
The same rules apply when you use the Manual Input view in the SB Test/Debug perspective.
A null list is a list with no elements, where the entire list is null. To specify a
null list, use the nulllist()
function.
A null list is not the same as an empty list, which is a list with zero elements. In
this case, the list itself is not null. To specify an empty list, use the emptylist()
function.
To understand the difference between empty and null lists, consider the following four expressions evaluated with a Map operator's Additional Expressions grid or with the Evaluate StreamBase Expression dialog.
list(99)
Result type: (list(int) Result: [99]
list(int())
Result type: (list(int)) Result: [null]
emptylist(int())
Result type: (list(int)) Result: []
nulllist(int())
Result type: (list(int)) Result: null
The four commands in this example demonstrate the following:
-
list(99)
returns a list of type int with a single element, the integer 99. -
list(int())
returns a list of type int with a single null element. -
emptylist(int())
returns a list of type int with zero elements. -
nulllist(int())
returns a null list of type int.
A null tuple results when the entire tuple is set to null, not just the fields of the tuple. A null tuple is not the same as:
-
An empty tuple, which is a tuple with each individual field set to null.
-
A no-fields tuple, which is the input format for an input stream with an empty schema defined.
To specify a null tuple that uses a named schema, specify the schema name with empty
parentheses. For example, for the schema named nyse_data
, the expression nyse_data()
creates a null tuple.
When using an unnamed schema, you can only create a null tuple by coercion. For example, consider the following two expressions evaluated with a Map operator's Additional Expressions grid or with the Evaluate StreamBase Expression dialog:
if true then tuple(int() as x, double() as y) else null
Result type: ((x int, y double)) Result: null,null
if false then tuple(int() as x, double() as y) else null
Result type: ((x int, y double)) Result: null
The commands in this example demonstrate the following:
-
In the first command, since the
if
test is set totrue
, thethen
clause is returned. The returned value is a tuple with schema (int, double), with two fields, both set tonull
. -
In the second command, since the
if
test is set tofalse
, theelse
clause is returned. The returned value is a null tuple whose schema is (int, double).
The following table explains how each StreamBase operator handles null values in tuple fields.
Operator or Component | Result |
---|---|
Group-by parameter | Key fields that are null are accepted as a group. |
Order-by parameter | Key fields that are null are dropped. StreamBase throws a NullValueException. For example, if a Merge operator encounters a null value on the Order-by field, the Error Log View in the SB Test/Debug perspective contains a message such as: error Non-Shutdown Error on Error Stream: (time=2006-02-20 12:17:44.579-0400,tupleid=1, subsystem="merge::Merge1", description="Null order-by value",streamname="InputStream1",nodename="mymachine", type="non-fatal-error",action="continue", time=2006-02-20 12:17:44.579-0400) |
Filter operator | The evaluation of an expression in a Filter predicate that results in a NullValueException causes the tuple to pass to the StreamBase error port or error stream in effect for the containing module. |
Map operator |
A Map operator assigns a value of null to a
field whose expression evaluates to null. That is, if the evaluation of the
field results in a NullValueException, then
the value of the field is null .
|
Join operator | Boolean expressions return null. If tuple fields with null values are used in the boolean expression or the Order-by field, the tuple is ignored in the Join operation. |
Merge operator |
Any tuple with a null value in the ordering
field is ignored.
|
Gather operator |
Any tuple with a null value in the ordering
field is ignored.
|
BSort operator |
Any tuple with a null value in the ordering
field is ignored. Note that a null value in the
Group-by is grouped.
|
Query operator and Query Table data construct |
If the key field for a tuple record being written to a Query Table is
null , the tuple IS stored. In a Query Table with
a sorted (btree) index, the null-keyed stored
records are evaluated as less (in value) than other non-null records. On a
subsequent read (lookup) operation, the null-keyed tuples can be located.
|
Aggregate operator |
In Aggregate windows, if a field's value is null , the null value is not included in the calculation. For
example: an average, such as avg(price) of ten
tuples in an aggregate's window, if one of the values is null, then the
average is calculated for the nine values.
As noted previously, key fields that are null will be accepted as a group. In a aggregate's window, the aggregating function calculates a value, but it may be for a grouped set of tuples that have an unknown value (for example, if the stock's Symbol was missing). Also as noted previously, tuples with the key fields set to null are dropped (not figured into the aggregate's calculation). |
Union operator | No impact. |
Custom functions |
Custom C++ functions that you write should handle the processing of the
NullValueException thrown by get() methods in Value.hpp of the StreamBase C++ API.
|
When developing StreamBase client applications, you will generally interact with nulls in the StreamBase Client Libraries when you are either creating tuples to be enqueued, or receiving dequeued tuples.
When creating a tuple, it is important to note that all the values in the tuple are
initialized to null
, and it is your responsibility to
set the value. When dequeuing a tuple, determine whether a tuple field value is null
before accessing the value. The following client code snippets demonstrate these
concepts:
- Java
-
Tuple t1 = s1.createTuple(); for ( int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) ); t1.setBoolean("fbool",true); t1.setInt("fint",1); t1.setDouble("fdouble",1.11); t1.setTimestamp("ftimestamp", Timestamp.now()); t1.setString("fstring","AA"); for ( int i = 0; i < s1.getNumFields(); i++ ) assert( !t1.isNull(i) ); for ( int i = 0; i < s1.getNumFields(); i++ ) t1.setNull(i); for ( int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) ); t1.setBoolean("fbool",true); t1.setInt("fint",1); t1.setDouble("fdouble",1.11); t1.setTimestamp("ftimestamp", Timestamp.now()); t1.setString("fstring","AA"); for ( int i = 0; i < s1.getNumFields(); i++ ) assert( !t1.isNull(i) ); t1.clear(); for ( int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) );
- C++
-
// assume that s1 is a Schema with 5 fields BufferedTuple t1(s1); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) ); t1.setBool("fbool",true); t1.setInt("fint",1); t1.setDouble("fdouble",1.11); t1.setTimestamp("ftimestamp", Timestamp::now()); t1.setString("fstring","AA"); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) assert( !t1.isNull(i) ); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) t1.setNull(i); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) ); t1.setBool("fbool",true); t1.setInt("fint",1); t1.setDouble("fdouble",1.11); t1.setTimestamp("ftimestamp", Timestamp::now()); t1.setString("fstring","AA"); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) assert( !t1.isNull(i) ); t1.clear(); for ( unsigned int i = 0; i < s1.getNumFields(); i++ ) assert( t1.isNull(i) );
For more information on using nulls in the Client library, see:
By default, the SB Test/Debug perspective of StreamBase Studio uses the keyword
null
to indicate a null value. You can specify a
different null token by changing the value in the Indicate
nulls using field in the > pane of the > dialog.
The null token is used in editors and views when a field contains no value. You can
substitute any string to indicate nulls. This change does not affect the use of null
literals in expressions, which must use the reserved keyword shown in the previous
sections, such as int(null)
.