Suppose you want to extract a range of values from a query table, such as the ten highest- or lowest-priced stocks, or the best- or worst-selling items in an inventory. One method is to use a Query operator to repeatedly read all the rows in the table, and then use a series of other operators to process the result, comparing prices across stock symbols or products to find the top n.
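As a rough illustration of this first method, the following Python sketch reads every row on each request and then picks out the largest values. It is not EventFlow code; the `rows` data and the `top_n_full_scan` function are hypothetical stand-ins for the query table and the downstream operators.

```python
import heapq

# Hypothetical table contents: (symbol, value) pairs standing in for table rows.
rows = [("AAA", 12.5), ("BBB", 48.0), ("CCC", 7.25), ("DDD", 31.0), ("EEE", 48.5)]


def top_n_full_scan(table, n):
    """Read every row, then keep the n rows with the largest value."""
    # nlargest examines the whole table on every call, which is the cost
    # this approach pays each time the top n is requested.
    return heapq.nlargest(n, table, key=lambda row: row[1])


print(top_n_full_scan(rows, 3))
```

Every request pays for a pass over the whole table, even when only a few rows are wanted.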
Alternatively, you could pre-process tuples upstream from a query table that has just one row containing n fields. Each time a tuple arrives on the input stream, compare its value against the stored values to see whether the table needs to be updated.
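The second method can be sketched in Python as a fixed set of n slots that is checked on every arrival. This is only an analogy for the one-row table: the `TopNSlots` class and its method names are hypothetical, and the min-heap is one possible way to do the comparison, not something the sample prescribes.

```python
import heapq


class TopNSlots:
    """Holds at most n values, standing in for the one-row table with n fields."""

    def __init__(self, n):
        self.n = n
        self._heap = []  # min-heap: the smallest retained value is at index 0

    def offer(self, value):
        """Compare an arriving value with the stored ones.

        Returns True if the stored values changed, False otherwise.
        """
        if self.n <= 0:
            return False
        if len(self._heap) < self.n:
            heapq.heappush(self._heap, value)
            return True
        if value > self._heap[0]:
            # The new value displaces the smallest stored value.
            heapq.heapreplace(self._heap, value)
            return True
        return False

    def snapshot(self):
        """Return the stored values, largest first."""
        return sorted(self._heap, reverse=True)


slots = TopNSlots(3)
for arriving in [5, 1, 9, 3, 7]:
    slots.offer(arriving)
print(slots.snapshot())
```

Note that n is fixed when the slots are created, so asking for a different n later means rebuilding the structure.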
Both of these options require complicated processing to compare the values and continually calculate the top n values. The QueryTopN sample application demonstrates a simpler method, using the Query operator's built-in option to limit the number of output rows and b-tree indexing.
The sample has one input stream, enterN, in which you enter the value of n as the int field howMany. The table is populated from a CSV data file containing randomly generated values, but in a real application data from an input stream or another module would update the table dynamically. The table is indexed by field value in descending order, so that the first n values of the index are always the largest ones.
Sending a tuple on the enterN input stream triggers a read operation that outputs just the current n highest values, by setting the Limit field in the Query operator to the input value howMany.
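The combination of a descending index and a row limit can be approximated in Python with a sorted list and a slice. This is a conceptual sketch only: the `DescendingIndexTable` class is hypothetical, a sorted list stands in for the b-tree index, and the `limit` argument plays the role of the Limit setting fed by howMany.

```python
import bisect


class DescendingIndexTable:
    """Toy table whose only index is ordered by value, largest first."""

    def __init__(self):
        # Storing (-value, symbol) keeps the list sorted ascending on the
        # negated value, which is descending order on the value itself.
        self._index = []

    def insert(self, symbol, value):
        """Add a row while keeping the index ordered."""
        bisect.insort(self._index, (-value, symbol))

    def read(self, limit):
        """Return at most `limit` rows, starting from the head of the index."""
        head = self._index[:max(limit, 0)]
        return [{"symbol": symbol, "value": -negated} for negated, symbol in head]


table = DescendingIndexTable()
for symbol, value in [("AAA", 12.5), ("BBB", 48.0), ("CCC", 7.25), ("EEE", 48.5)]:
    table.insert(symbol, value)

how_many = 2  # plays the role of the howMany input field
print(table.read(how_many))
```

Because the ordering is maintained as rows are inserted, each read only touches the first rows of the index rather than the whole table.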
The tuples output from the table are split into two streams and processed by an Aggregate operator and a Map operator, respectively. The Aggregate operator uses aggregatelist(tuple(...)) in a predicate dimension to generate a list of the top n tuples. To do this, the dimension has only a Close expression, count()=howMany. The Map operator restores the original field names and drops the input field howMany to output n individual tuples on the lower stream.
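The two downstream branches can be mimicked in Python as follows. This is a sketch under stated assumptions, not the sample's EventFlow logic: the field names `table_symbol` and `table_value` are hypothetical placeholders for whatever names the Query operator emits, and the shape of the list elements is assumed.

```python
def aggregate_to_list(rows):
    """Collect rows into a window and emit one list when count == howMany."""
    window = []
    for row in rows:
        window.append({"symbol": row["table_symbol"], "value": row["table_value"]})
        # Analogue of the Close expression count()=howMany.
        if len(window) == row["howMany"]:
            yield list(window)
            window = []


def map_to_tuples(rows):
    """Restore the original field names and drop howMany, one tuple per row."""
    for row in rows:
        yield {"symbol": row["table_symbol"], "value": row["table_value"]}


# Hypothetical Query output for howMany = 2.
query_output = [
    {"table_symbol": "EEE", "table_value": 48.5, "howMany": 2},
    {"table_symbol": "BBB", "table_value": 48.0, "howMany": 2},
]

print(list(aggregate_to_list(query_output)))  # one list holding both tuples
print(list(map_to_tuples(query_output)))      # two individual tuples
```

In this sketch, a window that receives fewer than howMany rows never closes and so emits nothing.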
This sample is part of the operator samples. In StreamBase Studio, import the operator samples with the following steps:
- From the top-level menu, click > .
- Enter sample group to narrow the list of options.
- Select Operator sample group from the Data Constructs and Operators category.
- Click .
StreamBase Studio creates a single project containing all the operator samples.
- In the Project Explorer view, open the sample you just loaded.
  If you see red marks on a project folder, wait a moment for the project to load its features.
  If the red marks do not resolve themselves after a minute, select the project, right-click, and select > from the context menu.
- Open the src/main/eventflow/com.tibco.sb.sample.operator folder.
- Open the QueryTopN.sbapp file and click the Run button. This opens the SB Test/Debug perspective and starts the module.
- In the Output Streams view, make sure that All Output Streams is selected in the Output stream control.
- Enter 1 for howMany (the number of values you want to output) in the Manual Input view and click .
- Observe the output streams in the Output Streams view. Note that:
  - The topNtuples stream contains one tuple having fields value and symbol. It is the highest value in the table.
  - The topNlist stream contains one tuple, a list containing the above tuple.
- Repeat steps 4 and 5, increasing howMany to 2, 3, ..., to see the set of top n values grow.
- When done, press F9 or click the Terminate EventFlow Fragment button.
When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.
Important
Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.
Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:
studio-workspace/sample_operator
See Default Installation Directories for the default location of studio-workspace on your system.