The data science operator sample group provides one small EventFlow application for each operator and most data constructs in the Palette view in StreamBase Studio.

An operator is a StreamBase processing unit that performs predefined work on streaming data, such as aggregating windows of data streams, merging streams, or retrieving shared data from a table.

A data construct is a component that stores information from a stream or from an external data source for use by an associated Spotfire Streaming operator.
Each sample has a separate README that describes the steps to run that sample.
Component Sample  Description 

Data Science Operator Samples  
ANOVA Operator Sample  Uses an ANOVA Operator to compute the analysis of variance, which is a generalization of the t-test for comparing two or more groups with respect to equality of means.
Chi-square Test Operator Sample  Uses a Chi-square Test of Independence Operator to compute the chi-square test of independence between two categorical or discrete random variables, along with other relevant summary information such as cross-tabulation frequencies and relative frequencies, as well as the Cramer's V statistic.
Predictive Modeling Sample: Classification Trees  Uses a Classification Trees Operator to build classification tree models. It uses the IRIS Flower data (irisdat.csv). The SEPALLEN, SEPALLWID, PETALLEN, and PETALWID features are selected as predictors. IRISTYPE is selected as the response.
Correlations Operator Sample  Uses a Correlations Operator to gather tuples using various output styles, such as over time or by selected values. The operator creates a matrix (a list of tuples) in which the tuple fields are the columns of the matrix.
Descriptive Statistics Operator Sample  Uses a Descriptive Statistics Operator to provide basic statistical information for each specified variable, including measures of central tendency (for example, the mean) and of dispersion (for example, the standard deviation).
Frequency Tables Operator Sample  Uses a Frequency Tables Operator to compute a contingency table that shows item and combination counts.
Kolmogorov-Smirnov Two Sample Test  Uses a Kolmogorov-Smirnov Test Operator to compute the two-sample Kolmogorov-Smirnov test. This is the nonparametric analogue of the two-sample t-test. Instead of comparing means between two groups, the test can be used to assess any difference between the two distributions.
Predictive Modeling Sample: Linear Regression  Uses a Linear Regression Operator to build linear regression models. Ordinary least squares, ridge regression, and lasso regression models are supported.
Predictive Modeling Sample: Logistic Regression  Uses a Logistic Regression Operator to build binary logistic regression models. 
Predictive Modeling Sample: Multilayer Perceptron Classification  Uses a Multilayer Perceptron Classification Operator to build multilayer perceptron neural networks. It uses the IRIS Flower data (irisdat.csv). The SEPALLEN, SEPALLWID, PETALLEN, and PETALWID features are selected as predictors. IRISTYPE is selected as the response.
Predictive Modeling Sample: Multilayer Perceptron Regression  Uses a Multilayer Perceptron Regression Operator to build multilayer perceptron neural networks. It uses the Boston Housing 2 data (BostonHousing2.csv). ValueofOccupiedHomes is selected as the response. The remaining fields are selected as predictors.
Paired T-test Sample  Uses a Paired T-Test Operator to compute the two-sample dependent t-test, which tests the null hypothesis that the population means of two dependent groups, as measured on a single variable, are equal.
Predictive Modeling Sample: Regression Trees  Uses a Regression Trees Operator to build regression tree models. The operator starts taking data from the feed simulation and emits results after 300 rows have been collected.
Single Sample T-Test Operator  Uses a Single Sample T-Test Operator to compute the single-sample t-test.
Predictive Modeling Sample: Support Vector Machine Classifier  Uses an SVM Classification Operator to build support vector machine classification models.
Predictive Modeling Sample: Support Vector Machine Regression  Uses an SVM Regression Operator to build support vector machine regression models.
Two Sample T-test Sample  Uses a Two Sample T-Test Operator to compute the two-sample independent t-test, which tests the null hypothesis that the population means of two groups, as measured on a single variable, are equal.
Two Sample T-Test by Groups Operator  Uses a T-Test By Groups Operator to compute the two-sample independent t-test, which tests the null hypothesis that the population means of two groups, as measured on a single variable, are equal.
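
Several of the samples above compute a t statistic for two groups. The following is a minimal Python sketch of what the paired and independent two-sample statistics measure. It is not how the operators are implemented: the function names are hypothetical, the independent version assumes equal variances (pooled variance), and only the statistic is computed, not the p-value.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """t statistic for a paired (dependent) two-sample t-test.

    Hypothetical helper for illustration; not part of any StreamBase API.
    """
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    # Work on the per-pair differences.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(n))


def pooled_t_statistic(x, y):
    """t statistic for an independent two-sample t-test.

    Assumes equal variances in both groups (pooled variance).
    """
    nx, ny = len(x), len(y)
    pooled_var = (
        (nx - 1) * statistics.variance(x) + (ny - 1) * statistics.variance(y)
    ) / (nx + ny - 2)
    std_err = math.sqrt(pooled_var * (1 / nx + 1 / ny))
    return (statistics.mean(x) - statistics.mean(y)) / std_err
```

Under the null hypothesis that the group means are equal, a statistic far from zero is evidence against that hypothesis. The paired version has n - 1 degrees of freedom and the pooled version has nx + ny - 2.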