# File streaming

File streaming is a technique used to process a large file by dividing it into smaller pieces, transferring one piece at a time, and then putting the pieces back in order.

For example, when Workato uses file streaming to transfer a file between systems, each piece is downloaded from the source and uploaded to the destination in sequence.

File streaming is useful when processing a large file is impractical or impossible due to size or memory limits at the source or destination app. With file streaming, you can stay within each app's constraints because Workato processes smaller chunks one at a time. This makes it possible for Workato to transfer files of any size between apps that support streaming. Workato also supports file streaming for large volume storage and data transformations.

The following diagram illustrates how file streaming works:

*How file streaming works*
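
For illustration, the following Python sketch shows the general chunked-transfer pattern: download one piece, upload it, then move on to the next piece so that only one chunk is held in memory at a time. This is a conceptual example only, not Workato's implementation; the URLs, the 10 MB chunk size, and the use of HTTP `Range`/`Content-Range` headers are assumptions made for the sake of the example, and the source is assumed to support HTTP range requests.

```python
import requests  # third-party HTTP client, used here for illustration

CHUNK_SIZE = 10 * 1024 * 1024  # 10 MB per piece (illustrative value)


def stream_file(source_url: str, upload_url: str) -> None:
    """Transfer a file piece by piece so only one chunk is in memory at a time."""
    offset = 0
    while True:
        # Download the next piece with an HTTP Range request.
        resp = requests.get(
            source_url,
            headers={"Range": f"bytes={offset}-{offset + CHUNK_SIZE - 1}"},
            timeout=60,
        )
        if resp.status_code == 416 or not resp.content:
            break  # the requested range is past the end of the file
        chunk = resp.content
        # Upload this piece before requesting the next one, preserving order.
        requests.put(
            upload_url,
            data=chunk,
            headers={"Content-Range": f"bytes {offset}-{offset + len(chunk) - 1}/*"},
            timeout=60,
        )
        offset += len(chunk)
        if len(chunk) < CHUNK_SIZE:
            break  # a short chunk means we reached the end of the file
```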

# Use cases

You can use Workato's file streaming for use cases such as the following:

  • Send data in bulk to applications
  • Receive data from applications using bulk triggers or bulk actions
  • Store large files in Workato FileStorage or other file systems such as SFTP and S3
  • Send or receive data from SQL Transformations

For example, with file streaming, you can:

  • Extract a file with 1 million rows of leads data from an on-prem file server and load it into a Google BigQuery table
  • Accumulate 100,000 rows from a non-streaming source such as HubSpot to Workato FileStorage using batch actions, and then stream the rows from FileStorage to Salesforce bulk APIs all at once
  • Extract new or updated contact records from Marketo using bulk triggers and aggregate the records with other data sources in SQL Transformations

# Comparison with batch processing

In batch processing, multiple jobs each process multiple records. The batch size determines how many records a single job can process. Workato creates as many jobs as needed to process all relevant records, which makes it possible to process very large files. However, due to batch size and memory limits, this approach can result in a high volume of API calls, and it does not scale well as the volume of data increases.

In contrast, file streaming uses a single job to process all relevant records in a single call to a streaming API, chunk by chunk. This results in fewer API calls and faster processing, even when the file is very large.
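
The difference in call volume can be sketched in a few lines of Python. This is a conceptual comparison only, not Workato code; `send_batch` and `open_upload` are hypothetical stand-ins for an app's batch API and bulk/streaming API.

```python
def load_with_batches(records, batch_size, send_batch):
    """Batch processing: one job and one API call per batch of records."""
    calls = 0
    for start in range(0, len(records), batch_size):
        send_batch(records[start:start + batch_size])
        calls += 1
    return calls  # grows with data volume: one call per batch


def load_with_streaming(chunks, open_upload):
    """File streaming: a single job feeds one streaming call chunk by chunk."""
    upload = open_upload()    # one streaming/bulk API call
    for chunk in chunks:      # chunks are transferred in sequence
        upload.write(chunk)
    upload.close()
    return 1                  # call count stays constant regardless of file size
```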

Batch triggers and actions should typically be paired with other batch actions. Likewise, streaming-compatible data producers (such as file downloads, bulk triggers, and bulk export actions) must be paired with streaming-compatible data consumers (such as file uploads and bulk create, bulk upsert, and bulk import actions).

We recommend using triggers and actions that support file streaming whenever possible to help:

  • Simplify recipe design
  • Reduce task count
  • Reduce the number of API calls made between sources and destinations

# How to use file streaming

If both the source (such as a file download, bulk trigger, or bulk action) and the destination (such as a file upload or bulk action) support file streaming, Workato automatically initiates file streaming when you pass a File contents or CSV contents datapill to the File contents input field. No additional configuration is required in the recipe.

*Passing a File contents datapill to a File contents input field*

The streaming transfer begins when the job reaches the consumer action step in the recipe. File streaming actions are usually long-running or deferred actions, so they do not time out, no matter how long the transfer takes.

# File connector triggers and actions that support file streaming

The following table lists all file connector triggers and actions that support file streaming:

| Connector name | Triggers and actions that support file streaming |
| --- | --- |
| Amazon S3 | Upload file streaming, Download file contents, New file trigger |
| Azure Blob Storage | Upload blob, Download blob contents |
| BIM360 | Download document from project, Upload document to project |
| Box | Download file using file contents, Upload file |
| Dropbox | Download file, Upload file using file contents |
| Egnyte | Upload file, Download file from selected folder |
| FileStorage | Create file, Append to file, Get file contents |
| Files by Workato | Get file from URL |
| FTP | Upload file, Download large file |
| Google Cloud Storage | Download object, Upload object |
| Google Drive | New file or folder trigger, Download file, Upload large file |
| OneDrive | Upload file from URL, Upload file from file contents, Download file |
| On-prem | Upload file, Download file, New file in folder trigger |
| Percolate | Download asset |
| SFTP | Upload file, Download file |
| SharePoint | Upload file in library |

**Build custom connectors**

To learn how to build custom connectors that use file streaming, refer to the Connector SDK file streaming guide.

**See file streaming in action**

View a sample recipe that uses file streaming to transfer files from an on-prem file system to Amazon S3.

