Predictive Modeling Sample: Regression Trees

About This Sample

This sample demonstrates the use of the Spotfire Streaming Regression Trees operator.

The provided StreamBase module uses the Boston Housing 2 data set (BostonHousing2.csv). ValueofOccupiedHomes is selected as the response, and the remaining fields are selected as predictors. Data is fed into the Matrix operator, which collects rows and emits them in batches of 500. Once the Matrix operator has collected the required number of rows, the SetReady output stream sets a dynamic variable indicating that the Threshold condition has been met. The Regression Trees operator takes the collected data and the options (from the provided schema) as inputs. Tuples that arrive before the first batch has been collected are not scored because the operator has not yet been trained.
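The collect-then-train-then-score flow described above can be sketched outside StreamBase. The Python below is an illustration only and is not the Spotfire Streaming operator or its API. The names BatchTrainedScorer, fit_stump, and sse are hypothetical, a one-split regression tree stands in for the full C&RT model, and retraining on every full batch is an assumption about how the batches are used.

```python
from statistics import mean

BATCH_SIZE = 500  # rows collected before training, matching the sample's setting


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def fit_stump(rows, targets):
    """Fit a one-split regression tree (a stand-in for a full C&RT tree).

    Tries every observed value of every predictor as a threshold and keeps
    the split with the lowest total squared error across the two leaves.
    Returns (feature_index, threshold, left_mean, right_mean).
    """
    best = None  # (error, feature_index, threshold, left_mean, right_mean)
    for j in range(len(rows[0])):
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, targets) if r[j] <= t]
            right = [y for r, y in zip(rows, targets) if r[j] > t]
            if not left or not right:
                continue  # a split must leave rows on both sides
            err = sse(left) + sse(right)
            if best is None or err < best[0]:
                best = (err, j, t, mean(left), mean(right))
    if best is None:
        # No valid split exists; fall back to predicting the overall mean.
        m = mean(targets)
        return (0, float("inf"), m, m)
    return best[1:]


class BatchTrainedScorer:
    """Hypothetical helper: buffers rows, trains on each full batch, then scores."""

    def __init__(self, batch_size=BATCH_SIZE):
        self.batch_size = batch_size
        self.rows = []
        self.targets = []
        self.ready = False  # plays the role of the "threshold met" flag
        self.model = None

    def score(self, predictors):
        j, t, left_mean, right_mean = self.model
        return left_mean if predictors[j] <= t else right_mean

    def process(self, predictors, response):
        """Return a prediction once trained, or None while still collecting."""
        prediction = self.score(predictors) if self.ready else None
        self.rows.append(predictors)
        self.targets.append(response)
        if len(self.rows) == self.batch_size:
            # Assumption: the model is refit on every full batch.
            self.model = fit_stump(self.rows, self.targets)
            self.ready = True
            self.rows, self.targets = [], []
        return prediction
```

Calling process() once per incoming row returns None until the first batch has been collected, which mirrors the unscored leading tuples in the sample. After that, each call returns the mean response of the leaf the row falls into.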

Importing This Sample into StreamBase Studio

In StreamBase Studio, import this sample with the following steps:

  • From the top-level menu, click File > Import Samples and Community Content.

  • In the search field, type regressiontrees to narrow the list of options.

  • Select Standard regression trees (C&RT) from the Streaming Datascience Operators category.

  • Click Import Now.

StreamBase Studio creates a single project containing the sample files.

Running This Sample in StreamBase Studio

  1. In the Project Explorer view, expand the sample_datascience_regressiontrees project and double-click RegressionTrees.sbapp to open the application. Make sure the application is the currently active tab in the EventFlow Editor.

  2. Click the Run button. This opens the SB Test/Debug perspective and starts the application.

  3. Click the Feed Simulations tab, select regressiontrees.sbfs, then click the Run button to start feeding the data.

  4. The Regression Trees operator starts taking data from the feed simulation and emits results once 500 rows have been collected.

  5. When done, press F9 or click the Stop Running Application button.

Expected Output Stream