HDFS File Writer Output Adapter Samples

About The Samples

The basic adapter sample illustrates the use of the TIBCO StreamBase® File Writer for Apache Hadoop Distributed File System (HDFS) by taking in a tuple and writing the contents of one of its fields to a file.

The advanced adapter sample illustrates reading from a sample input file multiple times and writing that data back out in various compression formats.

Initial Setup

Before running either sample, open the FileWriterBasic.sbapp or the FileWriterAdvanced.sbapp file, select the Parameters tab, and edit the values to match both your current HDFS setup and the location where you would like to store the sample data. The .sbapp files are located in sample_adapter_embedded_hdfsfilewriter/src/main/eventflow/com.sample.adapter.hdfsfilewriter.

The SampleIn.txt file used in the FileWriterAdvanced.sbapp sample must be placed on your HDFS file system before this sample can run.
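One way to stage the file is with the standard hdfs dfs command-line client. The following Python sketch builds the two commands and, by default, only prints them. It assumes the hdfs client is on your PATH and configured for your cluster; the target directory /user/streambase/samples is a hypothetical placeholder and should match the location you set on the Parameters tab.

```python
import subprocess


def build_upload_commands(local_file, hdfs_dir):
    """Return the hdfs CLI commands that create hdfs_dir and copy local_file into it."""
    return [
        # -p creates parent directories and does not fail if the directory exists.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # -f overwrites the destination file if it is already present.
        ["hdfs", "dfs", "-put", "-f", local_file, hdfs_dir],
    ]


def upload(local_file, hdfs_dir, dry_run=True):
    """Print the commands (dry run) or execute them against the cluster."""
    commands = build_upload_commands(local_file, hdfs_dir)
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical HDFS directory; replace it with your own location.
    upload("SampleIn.txt", "/user/streambase/samples")
```

Call upload with dry_run=False once the printed commands look right for your environment.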

Importing This Sample into StreamBase Studio

In StreamBase Studio, import this sample with the following steps:

  • From the top-level menu, select File > Load StreamBase Sample.

  • Type hdfs to narrow the list of options.

  • Select hdfsfilewriter from the Large Data Storage and Analysis category.

  • Click OK.

StreamBase Studio creates a single project containing the sample files.

Running The Basic Sample in StreamBase Studio

  1. In the Project Explorer, open the sample you just loaded.

  2. Open the src/main/eventflow folder.

  3. Open the package folder. Most samples contain a single package folder; if your sample contains more than one, open the top-level package folder.

  4. Open the FileWriterBasic.sbapp application file and click the Run button. This opens the SB Test/Debug perspective and starts the application.

    If you see red marks, wait a moment for the project in Studio to load its features.

    If the red marks do not resolve themselves in a moment, select the project, right-click, and select Maven > Update Project from the context menu.

  5. In the Manual Input view, switch the Stream to Data, enter a string value such as 'test', and then click Send Data to send a data tuple to be written to the file. Repeat for as many lines as you wish.

  6. In the Application Output view, observe the tuples emitted on the Status output stream, which indicate the actions performed on the file.

  7. In the Manual Input view, switch the Stream to Control, enter 'Close' into the Command field, and then click Send Data to send a control tuple that closes the current file for writing.

  8. Press F9 or click the Stop Running Application button.

  9. The sample has now created a file in your project called SampleOut.txt containing the lines of data you submitted.
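The data and control flow in the steps above can be sketched locally in Python. This is a toy stand-in, not the StreamBase adapter API: the class name ToyFileWriter, the field name text, and the returned event names are all hypothetical, and the file is written to a temporary local directory rather than to HDFS.

```python
import os
import tempfile


class ToyFileWriter:
    """Local stand-in for the sample's behavior; not the StreamBase adapter API."""

    def __init__(self, path):
        self.path = path
        self._handle = None

    def on_data(self, data_tuple, field="text"):
        """Append one field of the tuple as a line, opening the file on first use."""
        events = []
        if self._handle is None:
            self._handle = open(self.path, "a", encoding="utf-8")
            events.append("open")
        self._handle.write(str(data_tuple[field]) + "\n")
        events.append("write")
        return events

    def on_control(self, command):
        """Handle a control command; only Close is modeled in this sketch."""
        if command == "Close" and self._handle is not None:
            self._handle.close()
            self._handle = None
            return ["close"]
        return []


def run_demo(lines):
    """Send each line as a data tuple, send Close, and return the file contents."""
    path = os.path.join(tempfile.mkdtemp(), "SampleOut.txt")
    writer = ToyFileWriter(path)
    for line in lines:
        writer.on_data({"text": line})
    writer.on_control("Close")
    with open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    print(run_demo(["test", "another line"]))
```

The returned event lists play the role of the status tuples: each data tuple produces a write event, and the Close command produces a close event only when a file is open.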

Running The Advanced Sample in StreamBase Studio

  1. In the Project Explorer, open the sample you just loaded.

  2. Open the src/main/eventflow folder.

  3. Open the package folder. Most samples contain a single package folder; if your sample contains more than one, open the top-level package folder.

  4. Open the FileWriterAdvanced.sbapp application file and click the Run button. This opens the SB Test/Debug perspective and starts the application.

    If you see red marks, wait a moment for the project in Studio to load its features.

    If the red marks do not resolve themselves in a moment, select the project, right-click, and select Maven > Update Project from the context menu.

  5. In the Application Output view, observe the tuples emitted on the Status output streams, which indicate the actions performed on the files.

  6. Press F9 or click the Stop Running Application button.

  7. The sample has now created multiple files in your project:

    1. Sample.gz - This file is a GZip compressed file created from the SampleIn.txt file.

    2. Sample.bz2 - This file is a BZip2 compressed file created from the SampleIn.txt file.

    3. Sample.zip - This file is a Zip compressed file created from the SampleIn.txt file.

    4. SampleOut.txt - This file is an uncompressed file created from the SampleIn.txt file.
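To see what the four output formats look like outside of StreamBase, the following Python sketch writes the same input bytes as GZip, BZip2 (using the conventional .bz2 extension), Zip, and uncompressed files, then reads them back. It uses only the Python standard library on the local file system, not the adapter or HDFS; the function names and the entry name used inside the Zip archive are assumptions made for illustration.

```python
import bz2
import gzip
import os
import tempfile
import zipfile


def write_all_formats(data, out_dir):
    """Write data (bytes) in four formats and return a dict of the paths."""
    paths = {
        "gzip": os.path.join(out_dir, "Sample.gz"),
        "bzip2": os.path.join(out_dir, "Sample.bz2"),
        "zip": os.path.join(out_dir, "Sample.zip"),
        "none": os.path.join(out_dir, "SampleOut.txt"),
    }
    with gzip.open(paths["gzip"], "wb") as handle:
        handle.write(data)
    with bz2.open(paths["bzip2"], "wb") as handle:
        handle.write(data)
    with zipfile.ZipFile(paths["zip"], "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # A Zip archive holds named entries; this entry name is an assumption.
        archive.writestr("SampleIn.txt", data)
    with open(paths["none"], "wb") as handle:
        handle.write(data)
    return paths


def read_back(paths):
    """Decompress each file again so the round trip can be checked."""
    with gzip.open(paths["gzip"], "rb") as handle:
        from_gzip = handle.read()
    with bz2.open(paths["bzip2"], "rb") as handle:
        from_bzip2 = handle.read()
    with zipfile.ZipFile(paths["zip"]) as archive:
        from_zip = archive.read("SampleIn.txt")
    with open(paths["none"], "rb") as handle:
        from_plain = handle.read()
    return {"gzip": from_gzip, "bzip2": from_bzip2, "zip": from_zip, "none": from_plain}


if __name__ == "__main__":
    sample = b"line of sample data\n" * 100
    written = write_all_formats(sample, tempfile.mkdtemp())
    for name, path in written.items():
        print(name, os.path.getsize(path), "bytes")
```

Every format round-trips to the same bytes; only the on-disk size and container differ, which is what the four output files of the advanced sample illustrate.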

Sample Location

When you load the sample into StreamBase Studio, Studio copies the sample project's files to your Studio workspace, which is normally part of your home directory, with full access rights.

Important

Load this sample in StreamBase Studio, and thereafter use the Studio workspace copy of the sample to run and test it, even when running from the command prompt.

Using the workspace copy of the sample avoids permission problems. The default workspace location for this sample is:

studio-workspace/sample_adapter_embedded_hdfsfilewriter

See Default Installation Directories for the default location of studio-workspace on your system.