Announcing StreamSets Data Collector 1.2.0.0

We are very excited to announce the next version of the StreamSets Data Collector. This release covers more than 250 JIRA issues, with a host of new features, performance enhancements, and bug fixes.

Without further ado, here’s what’s new in 1.2.0.0:

Versioning Scheme

To keep up with an incredible volume of feature requests from customers and the community, we are introducing a new versioning scheme. From here on, we will use a 4-digit scheme that looks like so: W.X.Y.Z. Please see the FAQ for an explanation of this new scheme.

Core Functionality

Alert ELs for Detecting Data Drift
One of the core problems the Data Collector is designed to solve is data drift. This new version adds yet another powerful feature to deal with drift. We have new expression language (EL) primitives that detect when fields change. This is particularly useful for alerting you when fields are added, deleted, or changed in the incoming datasets.
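
To make the idea concrete, here is a minimal Python sketch of the kind of check such an alert performs. It is an illustration only, not the Data Collector's EL syntax or implementation: the function names `detect_field_drift` and `has_drift` and the sample records are hypothetical, and it assumes flat records represented as dictionaries.

```python
from typing import Any, Dict, List


def detect_field_drift(previous: Dict[str, Any],
                       current: Dict[str, Any]) -> Dict[str, List[str]]:
    """Compare two flat records and report which fields drifted.

    Hypothetical helper for illustration; not part of the Data Collector.
    """
    prev_names = set(previous)
    curr_names = set(current)
    # Fields present only in the newer record.
    added = sorted(curr_names - prev_names)
    # Fields that disappeared from the newer record.
    removed = sorted(prev_names - curr_names)
    # Fields present in both records whose value type changed.
    retyped = sorted(
        name for name in prev_names & curr_names
        if type(previous[name]) is not type(current[name])
    )
    return {"added": added, "removed": removed, "retyped": retyped}


def has_drift(report: Dict[str, List[str]]) -> bool:
    """Return True if any kind of drift was reported."""
    return any(report.values())


if __name__ == "__main__":
    old_record = {"id": 1, "name": "a", "zip": "94105"}
    new_record = {"id": "1", "name": "b", "city": "SF"}
    report = detect_field_drift(old_record, new_record)
    if has_drift(report):
        # An alerting rule would fire here instead of printing.
        print("Drift detected:", report)
```

In this sketch a change in a field's value alone (such as `name` going from "a" to "b") is not treated as drift; only added, removed, or retyped fields are.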

Access Record Header Attributes via ELs
A much-requested feature was access to record header attributes through the expression language. Check the documentation on how to use this feature.

First Class Protocol Buffer Support
Adding to our existing support of serialization formats such as Avro, the Data Collector can now read and write protocol buffer data. Read more about the innovative way that we handle protobuf support here <link to blog post>.

SDC-Basic
To further simplify the installation and management of Data Collector, we now offer a new packaging paradigm. Users can download a basic version of the SDC, and add support for additional stages on an individual basis as needed.

Note: We created this feature for customers who requested a smaller download. The full TAR, RPM, and Cloudera parcel files with all stages enabled are still available.

File Destination
Another highly requested feature, and one that’s very useful for quick-and-dirty testing on local machines: we now support a local file system destination stage. You can write data to a local file system or to a mounted file system.

Compressed File Support for Amazon S3 and File Directory
The Data Collector now natively reads and writes compressed files. We support a large number of compression formats including common ones such as zip and gzip. This enables easily reading compressed and archived files, such as rotated webserver logs.

LDAP/AD Support
An often-requested feature from our enterprise customers is LDAP or AD support. In our continued commitment to the open source community, we have decided to create and open source LDAP/AD support.

Cluster Mode Support

Mesos
Adding to our existing rich support for clustering, the Data Collector now has out-of-the-box support for Apache Spark on Mesos. Configure the Mesos Dispatcher URL and Checkpoint directory in the Data Collector and hit Start. That’s all it takes to fire up StreamSets on Mesos.

Integrations

CDH 5.5 Support + Cloudera Parcels
Cloudera CDH 5.5 is now supported. In a previous release we announced support for Cloudera parcels. Cloudera customers can now also use Cloudera Manager to create and maintain Kerberos principals and keytabs for StreamSets.

RabbitMQ Origin
RabbitMQ is a lightweight message queue system that is very popular with our community. We are happy to announce support for reading from RabbitMQ.

Kafka 0.9 Support
We now support the latest Kafka release, version 0.9. With this version, users can use Data Collector with the new security features within Kafka (SSL encryption for data over the wire, authentication using TLS client certificates, and authentication using Kerberos).
* Please note: The Kafka community considers these security features beta quality. Please use caution when using these features in a production setting.

Amazon S3 Destination
We now support writing data to Amazon S3 buckets. Drag and drop an Amazon S3 destination to your canvas, specify the API keys and that’s all you need to start sending data to the cloud.

First Class Support for Amazon Kinesis
We’ve updated our support for Amazon Kinesis. You can now read and write multiple data formats and configure a number of partitioning strategies, and the integration performs better overall.

Tutorials

We’ve created a new tutorials repository on GitHub with sample data, pipelines, and guides on how to use some of the basic features of the Data Collector.

We will keep adding to this repository over time. Please feel free to suggest topics or contribute.

What’s Next

We are planning a host of new integrations, new core features and improvements to the project over the next few months. We continue to be humbled by the positive responses from our customers and the community and we’d love to hear your feedback on what else you’d like to see included and how you’d like to participate.

Please connect with us on GitHub, our mailing list, or JIRA and let us know.

Download Data Collector

The post Announcing StreamSets Data Collector 1.2.0.0 appeared first on StreamSets.
