HDFS CSV File Writer Output Adapter Sample

About This Sample

This sample demonstrates how to use the Spotfire Streaming CSV File Writer Adapter for Apache Hadoop Distributed File System (HDFS).

Initial Setup

Before running this sample, open the CSVWriterTest.sbapp file in the src/main/eventflow/packageName folder. Select the Parameters tab and edit the HDFS_FILE_PATH and HDFS_USER values to match your current HDFS setup and the location where you want to store the sample data.

Importing This Sample into StreamBase Studio

In StreamBase Studio, import this sample with the following steps:

  • From the top-level menu, click File>Import Samples and Community Content.

  • Enter csv to narrow the list of options.

  • Select HDFS CSV file output adapter from the Large Data Storage and Analysis category.

  • Click Import Now.

StreamBase Studio creates a project for this sample.

Running This Sample in StreamBase Studio

  1. In the Project Explorer view, open the sample you just loaded.

    If you see red marks on a project folder, wait a moment for the project to load its features.

    If the red marks do not resolve themselves after a minute, select the project, right-click, and select Maven>Update Project from the context menu.

  2. Open the src/main/eventflow/packageName folder.

  3. Open the CSVWriterTest.sbapp file and click the Run button. This opens the SB Test/Debug perspective and starts the module.

  4. Select the Manual Input tab.

  5. Enter the value 10 for a and click Send Data.

  6. Press F9 or click the Terminate EventFlow Fragment button.

  7. The file specified in the HDFS_FILE_PATH parameter value should now contain tuples formatted as shown in this example:

    a,b,Timestamp
    10,100,2013-06-15 22:48:44.502-0400

    To view this information, open the file from your HDFS file system with an HDFS file browser (such as Hue).
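
Once you have the file contents as text, the rows shown in step 7 can be checked programmatically. The following is a minimal Python sketch, not part of the sample itself. It assumes that a and b are integers and that the timestamp layout matches the single example row above; the names SAMPLE, TIMESTAMP_FORMAT, and parse_rows are hypothetical and chosen only for illustration.

```python
import csv
import io
from datetime import datetime

# Text copied from the example output in step 7.
SAMPLE = "a,b,Timestamp\n10,100,2013-06-15 22:48:44.502-0400\n"

# Assumed layout, inferred from the one example row:
# date, time with fractional seconds, then a UTC offset such as -0400.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f%z"


def parse_rows(text):
    """Parse the writer's CSV text into a list of dictionaries.

    Assumes a header row naming the fields a, b, and Timestamp,
    and that a and b hold integer values.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        rows.append({
            "a": int(record["a"]),
            "b": int(record["b"]),
            "Timestamp": datetime.strptime(record["Timestamp"], TIMESTAMP_FORMAT),
        })
    return rows


rows = parse_rows(SAMPLE)
for row in rows:
    print(row["a"], row["b"], row["Timestamp"].isoformat())
```

To apply this to real output, replace SAMPLE with the text of the file named by HDFS_FILE_PATH, for example after downloading it from your HDFS file browser.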

Sample Location

When you load the sample into StreamBase® Studio, Studio copies the sample project's files to your Studio workspace, which is normally located in your home directory, where you have full access rights.

Important

Load this sample in StreamBase® Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.

Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:

studio-workspace/sample_hdfscsvwriter

See Default Installation Directories for the default location of studio-workspace on your system.