Using Nulls

StreamBase supports the use of null values in applications. Nulls can be used to explicitly represent data that is missing or unknown.

Use the reserved value null to indicate that the value of a tuple field is null. When you test your application in the StreamBase Studio SB Test/Debug perspective, the Manual Input View allows you to enter either null or actual values for the input fields.

Effect of Null Values in Expressions

In most cases with StreamBase expressions, any time you apply an arithmetic operator or function with a field that is null, the result is null. The four exceptions are the isnull() and notnull() functions, which test whether a field is null, and the coalesce() and coalesce_tuples() functions, which select a non-null value from a set of potentially null arguments.

To use an expression that intentionally sets the result to null, use one of the data type-specific null literals, as shown in the list below.

blob(null)
bool(null)
double(null)
int(null)
long(null)
string(null)
timestamp(null)

Use the nulllist() function to create a null list, as described in Null Lists. Null tuples are described in Null Tuples.

The data type of each null is never implicit: you must specify which type of null you are using in the expression.

Expression Example Result
3 + int(null) A null int
int(null) + int(null) A null int
int(null) + bool(null) A typecheck error. You cannot add an int and a bool.
if boolean(null) then 3 else 4 A null int
int(null) == int(null)

or

int(null) = int(null)

A null bool, because null is not equal to itself
int(null) != int(null) A null bool, because null is not equal to itself
isnull(int(null)) A bool that evaluates to true
notnull(int(null)) A bool that evaluates to false

Effect of Null Values in Boolean Logic

In general, expressions involving null evaluate to null. However, some Boolean logic statements can contain unknowns or nulls, but the rest of the expression fully determines the truth value. In the following table, the order of the arguments does not matter.

Expression Result
bool(null) && bool(null) A null bool
bool(null) AND true A null bool
bool(null) && false false
bool(null) || bool(null) A null bool
bool(null) OR true true
bool(null) || false A null bool

This is three-value Boolean logic as used in standard SQL. Null is not a value in the same sense as true and false, but is instead a placeholder for the absence of value. Some RDMS systems refer to this placeholder as Unknown; StreamBase uses the term null.

The isnull and notnull Functions

StreamBase provides the isnull() and notnull() functions, allowing you to test whether fields in expressions are null.

isnull(value)

Returns true if the argument is a null value.

notnull(value)

Returns true if the argument is not a null value. This function always returns the opposite of isnull(value).

For example, an expression in a StreamBase operator could use:

if (isnull(fieldA)) then notnull(fieldB) else true

The coalesce Functions

The coalesce() function returns the first non-null value from its list of arguments.

coalesce(value1, value2...)

Returns the first argument that is non-null, or a null value if all arguments are null. All arguments must have the same type. For tuple arguments, returns the first tuple that is non-null.

Since literal values are never null, you can specify a literal value as the last argument to provide an effective default value for the list. In this way, coalesce() can be used to emulate the NVL() function provided by Oracle PL/SQL and the two-argument ISNULL() function provided by Microsoft T/SQL. For example, the following expression returns the value of fieldA if it is non-null, or 0 if it is null:

coalesce(fieldA, 0)

The following example returns the first non-null field among fieldA, fieldB, and fieldC in that order, or returns -99999 if all three fields are null.

coalesce(fieldA, fieldB, fieldC, -99999)

How to Specify Nulls in Input Data

To illustrate how to specify nulls, assume an example where the schema for a tuple contains these fields:

  • fieldA, an int

  • fieldB, a string

  • fieldC, an int

In the input data CSV file:

  • To designate all three values of a tuple row to be null, enter: null,null,null.

  • To indicate that just fieldB is null, enter: 120,null,40.

  • To indicate that fieldA has a value, fieldB is an empty string, and fieldC is null, enter: 120,"",null.

The same rules apply when you use the Manual Input view in the SB Test/Debug perspective.

Null Lists

A null list is a list with no elements, where the entire list is null. To specify a null list, use the nulllist() function.

A null list is not the same as an empty list, which is a list with zero elements. In this case, the list itself is not null. To specify an empty list, use the emptylist() function.

To understand the difference between empty and null lists, consider the following four expressions evaluated at the command prompt with sbd --eval:

sbd --eval "list(99)"
(list(int)) [99]
sbd --eval "list(int())"
(list(int)) [null]
sbd --eval "emptylist(int())"
(list(int)) []
sbd --eval "nulllist(int())"
(list(int)) null

The four commands in this example demonstrate the following:

  • list(99) returns a list of type int with a single element, the integer 99.

  • list(int()) returns a list of type int with a single null element.

  • emptylist(int()) returns a list of type int with zero elements.

  • nulllist(int()) returns a null list of type int.

Null Tuples

A null tuple results when the entire tuple is set to null, not just the fields of the tuple. A null tuple is not the same as:

  • An empty tuple, which is a tuple with each individual field set to null.

  • A no-fields tuple, which is the input format for an input stream with an empty schema defined.

To specify a null tuple that uses a named schema, specify the schema name with empty parentheses. For example, for the schema named nyse_data, the expression nyse_data() creates a null tuple.

When using an unnamed schema, you can only create a null tuple by coercion. For example, consider the following two expressions evaluated at the command prompt with sbd --eval:

sbd --eval "if true then tuple(int() as x, double() as y) else null"
((int, double)) null,null
sbd --eval "if false then tuple(int() as x, double() as y) else null"
((int, double)) null

The commands in this example demonstrate the following:

  • In the first command, since the if test is set to true, the then clause is returned. The returned value is a tuple with schema (int, double), with two fields, both set to null.

  • In the second command, since the if test is set to false, the else clause is returned. The returned value is a null tuple whose schema is (int, double).

How Each Operator Handles Nulls

The following table explains how each StreamBase operator handles null values in tuple fields.

Operator or Component Result
Group-by parameter Key fields that are null are accepted as a group.
Order-by parameter Key fields that are null are dropped. StreamBase throws a NullValueException. For example, if a Merge operator encounters a null value on the Order-by field, the Error Log View in the SB Test/Debug perspective contains a message such as: error Non-Shutdown Error on Error Stream: (time=2006-02-20 12:17:44.579-0400,tupleid=1, subsystem="merge::Merge1", description="Null order-by value",streamname="InputStream1",nodename="mymachine", type="non-fatal-error",action="continue", time=2006-02-20 12:17:44.579-0400)
Filter operator The evaluation of an expression in a Filter predicate that results in a NullValueException causes the tuple to pass to the StreamBase error port or error stream in effect for the containing module.
Map operator A Map operator assigns a value of null to a field whose expression evaluates to null. That is, if the evaluation of the field results in a NullValueException, then the value of the field is null.
Join operator Boolean expressions return null. If tuple fields with null values are used in the boolean expression or the Order-by field, the tuple is ignored in the Join operation.
Merge operator Any tuple with a null value in the ordering field is ignored.
Gather operator Any tuple with a null value in the ordering field is ignored.
BSort operator Any tuple with a null value in the ordering field is ignored. Note that a null value in the Group-by is grouped.
Query operator and Query Table data construct If the key field for a tuple record being written to a Query Table is null, the tuple IS stored. In a Query Table with a sorted (btree) index, the null-keyed stored records are evaluated as less (in value) than other non-null records. On a subsequent read (lookup) operation, the null-keyed tuples can be located.
Aggregate operator In Aggregate windows, if a field's value is null, the null value is not included in the calculation. For example: an average, such as avg(price) of ten tuples in an aggregate's window, if one of the values is null, then the average is calculated for the nine values.

As noted previously, key fields that are null will be accepted as a group. In a aggregate's window, the aggregating function calculates a value, but it may be for a grouped set of tuples that have an unknown value (for example, if the stock's Symbol was missing).

Also as noted previously, tuples with the key fields set to null are dropped (not figured into the aggregate's calculation).

Union operator No impact.
Custom functions Custom C++ functions that you write should handle the processing of the NullValueException thrown by get() methods in Value.hpp of the StreamBase C++ API.

Using Nulls in the Client Libraries

When developing StreamBase client applications, you will generally interact with nulls in the StreamBase Client Libraries when you are either creating tuples to be enqueued, or receiving dequeued tuples.

When creating a tuple, it is important to note that all the values in the tuple are initialized to null, and it is your responsibility to set the value. When dequeuing a tuple, determine whether a tuple field value is null before accessing the value. The following client code snippets demonstrate these concepts:

Java
Tuple t1 = s1.createTuple();
for ( int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) );
t1.setBoolean("fbool",true);
t1.setInt("fint",1);
t1.setDouble("fdouble",1.11);
t1.setTimestamp("ftimestamp", Timestamp.now());
t1.setString("fstring","AA");
for ( int i = 0; i < s1.getNumFields(); i++ )
  assert( !t1.isNull(i) );
for ( int i = 0; i < s1.getNumFields(); i++ )
t1.setNull(i);
for ( int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) );
t1.setBoolean("fbool",true);
t1.setInt("fint",1);
t1.setDouble("fdouble",1.11);
t1.setTimestamp("ftimestamp", Timestamp.now());
t1.setString("fstring","AA");
for ( int i = 0; i < s1.getNumFields(); i++ )
  assert( !t1.isNull(i) );
t1.clear();
for ( int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) ); 
C++
// assume that s1 is a Schema with 5 fields
BufferedTuple t1(s1);
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) );
t1.setBool("fbool",true);
t1.setInt("fint",1);
t1.setDouble("fdouble",1.11);
t1.setTimestamp("ftimestamp", Timestamp::now());
t1.setString("fstring","AA");
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  assert( !t1.isNull(i) );
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  t1.setNull(i);
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) );
t1.setBool("fbool",true);
t1.setInt("fint",1);
t1.setDouble("fdouble",1.11);
t1.setTimestamp("ftimestamp", Timestamp::now());
t1.setString("fstring","AA");
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  assert( !t1.isNull(i) );
t1.clear();
for ( unsigned int i = 0; i < s1.getNumFields(); i++ )
  assert( t1.isNull(i) );

For more information on using nulls in the Client library, see:

Changing the Displayed Null Token

By default, the SB Test/Debug perspective of StreamBase Studio uses the keyword null to indicate a null value. You can specify a different null token by changing the value in the Indicate nulls using field in the StreamBase Studio>Test/Debug pane of the Window>Preferences dialog.

The null token is used in editors and views when a field contains no value. You can substitute any string to indicate nulls. This change does not affect the use of null literals in expressions, which must use the reserved keyword shown in the previous sections, such as int(null).

Back to Top ^