Using the Heartbeat Operator

Introduction

The Heartbeat operator adds timer tuples to the same stream as your data tuples. Its usual purpose is to detect late or missing tuples, so that downstream operations can proceed even when incoming data pauses. Like the Metronome operator, it uses the system clock and emits output tuples periodically, but Heartbeat can also emit tuples based on information in the input stream, independent of the system clock.

The Heartbeat operator passes any tuples on its input stream directly through to the output stream, updating its internal clock as it does so. If an expected input tuple does not arrive within the configured interval plus a timeout value, the Heartbeat operator synthesizes a tuple in which all fields are null except for the timestamp, and emits it.

A Heartbeat operator has one input port and one output port. The schema of the output port is the same as the schema of the input port. The Heartbeat operator emits two different kinds of tuples:

  • Data tuples

  • Timer tuples

Data tuples are tuples received on the input port. All data tuples are emitted unmodified on the output port as soon as they are received.

Timer tuples are inserted in between data tuples, and are controlled by two things: the timestamp field in the input stream, and the passage of time as measured by the machine clock. The Heartbeat operator emits a timer tuple whenever the timestamp crosses a multiple of the heartbeat interval. For example, if the timer is set to emit every minute, data tuple #1 has a timestamp of 59 seconds, and data tuple #2 has a timestamp of 0 seconds in the following minute, then a timer tuple is emitted just before data tuple #2.

Timer tuples can also be emitted when enough time passes on the machine clock without any input. Continuing the above example, if data tuple #1 is received with a timestamp of 59 seconds, and then several seconds go by without receiving any more tuples, then a timer tuple is eventually emitted, claiming that it is now 0 seconds again. The Heartbeat operator waits a certain amount of time beyond the point at which it would have expected to receive a tuple with a 0-second timestamp. This amount of time is called slack. If slack is set to 10 seconds, the Heartbeat operator does not emit the 0-second timer tuple until 11 seconds after receiving the 59-second tuple: 1 second to reach the minute boundary, plus 10 seconds of slack.
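The slack arithmetic in this example can be checked with a few lines of Python. This is only a sketch of the timing rule as described here, not StreamBase code: the function name timer_wait and its parameters are made up for illustration, and it assumes the last data tuple arrived at the wall-clock moment its timestamp names.

```python
import math

def timer_wait(last_ts: float, interval: float, slack: float) -> float:
    """Seconds to wait after the last data tuple before a timer tuple is emitted.

    Hypothetical helper: models "time to the next interval boundary, plus slack".
    """
    # Next multiple of the interval strictly after the last timestamp.
    next_boundary = (math.floor(last_ts / interval) + 1) * interval
    # Time remaining until that boundary, plus the configured slack.
    return (next_boundary - last_ts) + slack

# One-minute interval, last tuple stamped at 59 seconds, 10 seconds of slack.
print(timer_wait(59, 60, 10))  # 11
```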

Timer tuples are emitted with all fields null, except for the timestamp field. To distinguish a timer tuple from a data tuple, check whether a field whose value should never be null is in fact null. If there is no such field, and it is necessary to distinguish timer tuples from data tuples, you can add an upstream Map operator that adds a non-null field.
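The null-field check can be illustrated outside StreamBase with plain dictionaries standing in for tuples. In this sketch, add_marker plays the role of the upstream Map operator, and the field name is_data is an assumption chosen for the example, not a StreamBase convention.

```python
def add_marker(t: dict) -> dict:
    """Stand-in for the upstream Map operator: adds a field that is never null."""
    return {**t, "is_data": True}

def is_timer_tuple(t: dict, never_null_field: str = "is_data") -> bool:
    """Treat a tuple as a timer tuple when the never-null field is null."""
    return t.get(never_null_field) is None

data = add_marker({"ts": 59, "price": 10.5})
# A timer tuple as described above: every field null except the timestamp.
timer = {"ts": 60, "price": None, "is_data": None}

print(is_timer_tuple(data), is_timer_tuple(timer))  # False True
```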

If the Heartbeat operator receives a data tuple whose timestamp is null, or whose timestamp is earlier than a previously seen timestamp, then that data tuple is still emitted, but its timestamp is ignored for the purpose of emitting timer tuples.

If the Heartbeat operator receives data tuples whose timestamps are out of order, it does not re-emit any timer tuples that have already been emitted. In other words, the timestamps on timer tuples are always strictly increasing.
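The timestamp-driven rules above can be modeled with a small Python generator. This is a minimal sketch under stated assumptions, not the operator's implementation: it ignores the machine clock and slack entirely, uses dictionaries as tuples, assumes no timer tuple precedes the first data tuple, and assumes one timer tuple per interval boundary crossed. The names heartbeat and ts_field are hypothetical.

```python
from typing import Iterable, Iterator, Optional

def heartbeat(tuples: Iterable[dict], interval: float,
              ts_field: str = "ts") -> Iterator[dict]:
    """Toy model of timestamp-driven timer tuples (no wall clock, no slack)."""
    last_ts: Optional[float] = None        # latest usable timestamp seen so far
    next_boundary: Optional[float] = None  # next multiple of interval to cross
    for t in tuples:
        ts = t.get(ts_field)
        # Null or earlier-than-seen timestamps are ignored for timer purposes.
        if ts is not None and (last_ts is None or ts >= last_ts):
            if next_boundary is None:
                next_boundary = (ts // interval + 1) * interval
            while ts >= next_boundary:
                # Timer tuple: all fields null except the timestamp.
                timer = {k: None for k in t}
                timer[ts_field] = next_boundary
                yield timer
                next_boundary += interval  # keeps timer timestamps increasing
            last_ts = ts
        # Data tuples always pass through unmodified.
        yield t

stream = [{"ts": 59, "v": 1}, {"ts": 60, "v": 2},
          {"ts": 30, "v": 3}, {"ts": None, "v": 4}]
for out in heartbeat(stream, interval=60):
    print(out)
```

With a 60-second interval, the only timer tuple appears just before the tuple stamped 60; the out-of-order tuple (30) and the null-timestamp tuple pass through without triggering anything.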

Specify a number or an expression that evaluates to seconds for the Tuple output interval and Maximum delay properties. You can also enter these values as a parameter in the form ${parameter}.

The following example EventFlow application uses two Heartbeat operators to prevent a Merge operator from starving when either of its inputs stops receiving tuples. Without the Heartbeat operators, the Merge operator would stop outputting tuples as soon as either of its two input streams became inactive, even if it continued to receive tuples on the other input stream.
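To see why a timer tuple on a quiet input keeps the Merge from starving, consider a toy timestamp-ordered merge in Python. This is a simplified model, assuming the merge can release a tuple only while both inputs have a tuple waiting; it is not the StreamBase Merge implementation, and ordered_merge is a hypothetical name.

```python
from collections import deque

def ordered_merge(left, right, ts_field="ts"):
    """Release tuples in timestamp order while both inputs have a waiting tuple.

    Returns (released, held): tuples sent downstream and tuples still blocked.
    """
    a, b = deque(left), deque(right)
    released = []
    while a and b:
        # Release whichever head tuple has the smaller timestamp.
        src = a if a[0][ts_field] <= b[0][ts_field] else b
        released.append(src.popleft())
    return released, list(a) + list(b)

busy = [{"ts": 10, "v": "a"}, {"ts": 20, "v": "b"}]

# The other input is silent: nothing can be released.
released, held = ordered_merge(busy, [])
print(len(released), len(held))  # 0 2

# A timer tuple on the silent input lets the busy input drain.
timer = {"ts": 60, "v": None}
released, held = ordered_merge(busy, [timer])
print(len(released), len(held))  # 2 1
```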

Note

If you submit a Feed Simulation or play back a StreamBase recording at an accelerated speed, the rate at which timer tuples are emitted by the Heartbeat operator will match the acceleration factor.

A StreamBase application can have one or more Heartbeat operators, each with different timing values. You can also have a StreamBase application that uses one or more Heartbeat operators plus one or more Metronome operators. Think of a Spotfire Streaming Heartbeat operator as another type of timing mechanism. Unlike the Metronome operator, which does not vary from its set timing interval, a Heartbeat operator's pulse can vary, like a person's beating heart, which can speed up or slow down. You can specify the amount of variance or slack allowed by the Heartbeat operator.

Properties: General Tab

Name: Use this required field to specify or change the name of this instance of this component. The name must be unique within the current EventFlow module. The name can contain alphanumeric characters, underscores, and escaped special characters. Special characters can be escaped as described in Identifier Naming Rules. The first character must be alphabetic or an underscore.

Storage method: Specify settings for this operator with the same in heap and in transactional memory options described for the Annotations tab of the EventFlow Editor. The default setting is Inherit from containing module.

Enable Error Output Port: Select this checkbox to add an Error Port to this component. In the EventFlow canvas, the Error Port shows as a red output port, always the last port for the component. See Using Error Ports to learn about Error Ports.

Description: Optionally, enter text to briefly describe the purpose and function of the component. In the EventFlow Editor canvas, you can see the description by pressing Ctrl while the component's tooltip is displayed.

Properties: Output Settings Tab

The Output Settings tab allows you to specify:

  • Tuple output interval: Specify how often a timer tuple should be added on the stream, to prevent lulls in the activity of downstream components. You can enter integers, or decimal values such as .10, meaning every 100 milliseconds. The lowest supported granularity is 10 milliseconds, or .01. If you enter a lower value, such as .009, StreamBase Studio displays the following warning: The Tuple Output Interval you entered is too fine-grained to be guaranteed. If you enter 0 (zero), StreamBase Studio displays the following error: The Tuple Output Interval must be a numeric value greater than zero.

  • Input tuple timestamp field: Identify the field that contains the inbound timestamp value.

  • Maximum delay: In this field, you define the maximum delay (or slack) when waiting for a data tuple to arrive in the Heartbeat operator.

    In the maximum delay parameter, you can enter an integer or a decimal value, again with the smallest supported value being 10 milliseconds, or .01. A specified maximum delay value factors into the Heartbeat operator's runtime behavior in the following ways:

    • If a data tuple is received by the Heartbeat operator before the scheduled emit time, then that data tuple should be emitted before any timer tuple.

    • If a data tuple is received by the Heartbeat operator after the scheduled emit time, but before the slack period expires, and that data tuple's timestamp is before the scheduled emit time, then the Heartbeat operator's time is resynchronized, and the slack period is restarted.

    • If a data tuple is received by the Heartbeat operator after the scheduled emit time, but before the slack period expires, and that tuple's timestamp is after the scheduled emit time, then a timer tuple is emitted, followed by the data tuple.

    • If no data tuple is received by the Heartbeat operator before the slack period expires, then a timer tuple is emitted. If a data tuple is then received whose timestamp is before the timestamp of the timer tuple just emitted, then the Heartbeat operator adjusts its sense of the current time, but does not re-emit the timer tuple that was previously emitted.

    • If no data tuple is received by the Heartbeat operator before the slack period expires, then a timer tuple is emitted. If more time passes, then another timer tuple is emitted. The time between the first and second timer tuples will not include additional slack.
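The maximum delay rules above can be summarized as a small decision function. This is a sketch of the documented cases only, with hypothetical names, and it assumes that arrival times, tuple timestamps, and the scheduled emit time all share one timeline measured in seconds. A timestamp exactly equal to the scheduled emit time is treated as "after", which is an assumption.

```python
def on_data_arrival(arrival: float, ts: float,
                    scheduled: float, slack: float) -> str:
    """Describe what the model does when a data tuple arrives."""
    if arrival < scheduled:
        return "emit data tuple before any timer tuple"
    if arrival < scheduled + slack:
        if ts < scheduled:
            return "resynchronize time, restart slack"
        return "emit timer tuple, then data tuple"
    # Slack expired: the timer tuple has already gone out and is not re-emitted.
    return "timer tuple already emitted, not re-emitted"

def silent_timer_times(scheduled: float, interval: float,
                       slack: float, count: int) -> list:
    """Emit times of consecutive timer tuples when no data arrives at all.

    Only the first timer tuple waits out the slack; later ones follow at
    the plain interval, with no additional slack between them.
    """
    first = scheduled + slack
    return [first + i * interval for i in range(count)]

print(on_data_arrival(arrival=55, ts=55, scheduled=60, slack=10))
print(on_data_arrival(arrival=65, ts=58, scheduled=60, slack=10))
print(on_data_arrival(arrival=65, ts=61, scheduled=60, slack=10))
print(silent_timer_times(scheduled=60, interval=60, slack=10, count=3))  # [70, 130, 190]
```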

If you have not already done so, please see this topic's Introduction for additional information.
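The limits described for the Tuple output interval field can be expressed as a simple check. This sketch mirrors the documented Studio behavior using the messages quoted above; the function name is hypothetical, and treating negative values like zero is an assumption based on the wording of the error message.

```python
MIN_GRANULARITY = 0.01  # 10 milliseconds, the lowest supported granularity

def check_output_interval(seconds: float) -> str:
    """Return 'ok', or the Studio message the documentation describes."""
    if seconds <= 0:
        # Assumption: negative values are rejected the same way as zero.
        return ("The Tuple Output Interval must be a numeric value "
                "greater than zero.")
    if seconds < MIN_GRANULARITY:
        return ("The Tuple Output Interval you entered is too fine-grained "
                "to be guaranteed.")
    return "ok"

for value in (0.10, 0.01, 0.009, 0):
    print(value, "->", check_output_interval(value))
```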

Properties: Concurrency Tab

Use the Concurrency tab to specify parallel regions for this instance of this component, or multiplicity options, or both. The Concurrency tab settings are described in Concurrency Options, and dispatch styles are described in Dispatch Styles.

Caution

Concurrency settings are not suitable for every application, and using these settings requires a thorough analysis of your application. For details, see Execution Order and Concurrency, which includes important guidelines for using the concurrency options.