StreamSets Data Collector v2.5 Adds IoT, Spark, Performance and Scale

We’re thrilled to announce version 2.5 of StreamSets Data Collector, a major release which includes important functionality related to the Internet of Things (IoT), high-performance database ingest, integration with Apache Spark and integration into your enterprise infrastructure. You can download the latest open source release here.

This release has over 22 new features, 95 improvements and 150 bug fixes.

Reduce the Cost of Internet of Things

This release includes new connectors for MQTT and WebSocket. Both are offered as origins and destinations, so you can use SDC as your IoT ingestion gateway, potentially saving substantial money compared with proprietary solutions, particularly metered solutions from cloud providers. Also included in this release is an HTTP Client destination that can write a full batch of data to an HTTP endpoint.
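To illustrate what "writing a batch to an HTTP endpoint" means in practice, here is a minimal Python sketch that packs a batch of records into a single POST request as newline-delimited JSON. It is not how the HTTP Client destination is implemented or configured; the function name, the URL, and the choice of newline-delimited JSON are assumptions made for the example.

```python
import json
import urllib.request


def build_batch_request(records, url):
    """Build one POST request carrying a whole batch of records.

    Assumption for this sketch: records are dicts, serialized as
    newline-delimited JSON (one record per line).
    """
    body = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )


if __name__ == "__main__":
    # Hypothetical sensor readings and a hypothetical endpoint URL.
    batch = [
        {"sensor": "t-1", "temp_c": 21.5},
        {"sensor": "t-2", "temp_c": 19.0},
    ]
    req = build_batch_request(batch, "http://localhost:8080/ingest")
    print(req.get_method(), req.full_url, len(req.data), "bytes")
    # To actually send it: urllib.request.urlopen(req)
```

Sending one request per batch rather than one per record keeps the number of round trips low, which matters when many devices report frequently.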

We will soon be adding other IoT gateway connectors such as OPC-UA and CoAP. If you'd like to see other IoT-related functionality, please tell us through this survey.

High-performance Database Ingest

We have added a multithreaded, multi-table JDBC origin in this release to speed up ingest from databases. Reading several tables in parallel increases throughput and shortens the time it takes to re-platform your data warehouse to Hadoop.
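The general idea behind multithreaded, multi-table ingest can be sketched in a few lines of Python. This is an illustration of the technique only, not the origin's implementation: it uses SQLite from the standard library in place of a JDBC source, and the table names, helper functions, and batch size are all hypothetical.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor


def make_demo_db(path, tables, rows_per_table):
    """Create a small demo database (stand-in for a real source database)."""
    conn = sqlite3.connect(path)
    for table in tables:
        conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, payload TEXT)")
        conn.executemany(
            f"INSERT INTO {table} (payload) VALUES (?)",
            [(f"{table}-{i}",) for i in range(rows_per_table)],
        )
    conn.commit()
    conn.close()


def read_table(path, table, batch_size=100):
    """Read one table in batches; each worker opens its own connection."""
    conn = sqlite3.connect(path)
    try:
        cursor = conn.execute(f"SELECT id, payload FROM {table} ORDER BY id")
        rows_read = 0
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            rows_read += len(batch)  # a real pipeline would hand the batch downstream
        return table, rows_read
    finally:
        conn.close()


def ingest_all(path, tables, threads=4):
    """Read all tables concurrently, one table per worker at a time."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return dict(pool.map(lambda t: read_table(path, t), tables))


if __name__ == "__main__":
    tables = ["orders", "customers", "invoices"]
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.db")
        make_demo_db(db_path, tables, rows_per_table=250)
        print(ingest_all(db_path, tables))
```

Each worker holds its own connection and owns one table at a time, so the tables never contend for a shared cursor. The thread count then becomes the main tuning knob against the load the source database can tolerate.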

Scale Up Processing

In the last release we introduced a scale-up architecture that allows pipelines to run in a multithreaded mode. With this release we’ve added a few more origins that can use this architecture: the Elasticsearch origin, the JDBC Multitable origin, the Kinesis Consumer and the WebSocket origin.

Spark Executor

As part of our Dataflow Trigger framework we have added a Spark Executor, which allows you to kick off Spark jobs ranging from simple format conversions and file compression all the way to sophisticated machine learning algorithms. The Spark job can run on YARN or on the Databricks cloud.

StreamSets can greatly simplify developing ingest pipelines when using the Databricks cloud. The Spark Executor can trigger a Spark job, or even a Databricks notebook.

Cluster Mode Spark Evaluator

The Spark evaluator in StreamSets lets you inject Spark within the pipeline. If you are writing algorithms in Spark for machine learning, you need not worry about the semantics or plumbing of reading and writing data from a myriad of sources and destinations; the pipeline will give your Spark job an RDD to work with, and will similarly read out an RDD and guarantee delivery to whatever destinations you choose. When the pipeline is run in standalone mode, it uses Spark local mode, but if you run the pipeline in cluster streaming mode, it runs on the Spark cluster and your Spark transformation code will be able to see the data available to the entire cluster.

Elastic Origin for Easy Offload

The new Elasticsearch origin lets you get data out of an Elastic index. Customers use this in architectures where they want to migrate data off Elastic into other databases or lower-cost systems such as Hadoop, move to cloud-hosted systems, or archive data to the cloud.

Improved Integration

Two new features make it easier to integrate StreamSets Data Collector into your data workflow. First, we have added the ability to pass parameters to a pipeline via an API or CLI, giving external scripts fine-grained control of pipelines. Second, we have added support for alerting via webhooks, meaning that you can trigger actions on external systems based on system, data and drift alerts within the pipeline.
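On the receiving side, a webhook is just an HTTP POST that your system turns into an action. The Python sketch below shows one way an external service might route an incoming alert body to a handler. The payload shape is an assumption made for the example (the `alert_type` field and its values are hypothetical, not the format Data Collector sends), so check the documentation for the actual payload before adapting it.

```python
import json


def route_alert(raw_body, handlers):
    """Parse a webhook body and dispatch it to the matching handler.

    Assumption for this sketch: the body is JSON with an "alert_type"
    field (hypothetical name). Unknown types fall back to "default".
    """
    payload = json.loads(raw_body)
    kind = payload.get("alert_type", "unknown")
    handler = handlers.get(kind, handlers.get("default"))
    if handler is None:
        return None
    return handler(payload)


if __name__ == "__main__":
    # Hypothetical handlers: page someone for drift, log everything else.
    handlers = {
        "drift": lambda p: "page on-call: " + p.get("message", ""),
        "default": lambda p: "logged: " + p.get("message", ""),
    }
    body = b'{"alert_type": "drift", "message": "new field detected"}'
    print(route_alert(body, handlers))
```

In a real deployment this function would sit behind whatever HTTP server or serverless endpoint receives the POST; keeping the routing logic separate from the transport makes it easy to test.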

Please be sure to check out the Release Notes for detailed information about this release. And download the Data Collector now.

The post StreamSets Data Collector v2.5 Adds IoT, Spark, Performance and Scale appeared first on StreamSets.
