Channel: StreamSets
Browsing all 475 articles

Data Collecting for Snowflake

Mike Fuller, a consultant at Red Pill Analytics, has been working on ingesting data into Snowflake's cloud data warehouse using StreamSets for Snowflake. In this guest blog post, Mike explains how he...


Ingesting Data from Relational Databases to Cassandra with StreamSets

This post is summarized content from a full tutorial at https://academy.datastax.com/content/ingesting-data-relational-databases-cassandra-streamsets How do you ingest from an existing...


Creating Dataflow Pipelines with Amazon Kinesis

Although the recent public preview of Amazon Managed Streaming for Kafka (MSK) certainly made headlines, Kinesis remains Amazon's supported, production, real-time streaming service. In this blog post,...


Execute Machine Learning Jobs in MS Azure Databricks from StreamSets

In my previous blog post, I demonstrated how to achieve low-latency inference using Databricks ML models in StreamSets. Now let's say you have a dataflow pipeline that is ingesting data, enriching it,...


Scaling Data Collectors on Azure Kubernetes Service

In this blog post, I will present a step-by-step guide on how to scale Data Collector instances on Azure Kubernetes Service (AKS) using provisioning agents—which help automate upgrading and scaling...


Solving Data Quality in Streaming Data Flows

Vinu Kumar is Chief Technologist at HorizonX, based in Sydney, Australia. Vinu helps businesses unify data, focusing on a centralized data architecture. In this guest post, reposted from the...


Join Us at the First Annual DataOps Summit

StreamSets is proud to be hosting the first annual DataOps Summit in San Francisco, California at the Hilton San Francisco Financial District on September 3rd-5th. The summit will feature a full day...


Building a Slack Slash Command as a Microservice Pipeline

One of the drivers behind Slack's rise as an enterprise collaboration tool is its rich set of integration mechanisms. Bots can monitor channels for messages and reply as if they were users, apps can...


Ingestion for a Cyber Security Data Lake with Oracle and StreamSets

If you were lucky enough to get the gift of replacing your existing security information and event management (SIEM) system this year, then there is a chance your organization has considered building a cybersecurity...


A Cost Comparison of a Cloudera Hadoop Cluster with StreamSets Ingestion...

Introduction: It should come as no surprise that a Hadoop cluster and the public cloud go together like peanut butter and jelly because of scale, agility, and economy. It should come as even less of a...


Sensor data from Azure Event Hub to ADLS Gen2 and Azure SQL DWH

Data that’s in flight is perceived to be more challenging to work with than “landed” data sitting quietly in some storage platform. With the high volume and variety of data constantly streaming in...


Binary Classification of Streaming Data using TensorFlow to ADLS Gen1 and...

Over the past decade, digital transformation has evolved such that every system and device has a digital trail: from IT servers, to factory equipment, to consumer electronics, to buildings, to cars....


Field Mapper Processor: The Swiss Army Knife of Bulk Field Manipulation

Guest post by Jeff Evans, Senior Software Engineer, StreamSets. The Field Mapper processor, introduced in Data Collector version 3.8.0, provides a flexible and powerful way to manipulate fields en...


Creating the OmniSci F1 Demo: Real-Time Data Ingestion With StreamSets

Randy Zwitch is a Senior Director of Developer Advocacy at OmniSci, enabling customers and community users alike to utilize OmniSci to its fullest potential. With broad industry experience in energy,...


Replicating Oracle to MySQL and JSON

Yannick Jaquier is a Database Technical Architect at STMicroelectronics in Geneva, Switzerland. Recently, Yannick started experimenting with StreamSets Data Collector's Oracle CDC Client origin,...


Ingesting Data from Apache Kafka to TimescaleDB

The Glue Conference (better known as GlueCon) is always a treat for me. I've been speaking there since 2012, and this year I presented a session explaining how I use StreamSets Data Collector to ingest...


Announcing StreamSets Data Collector 3.9.0 and StreamSets Data Collector Edge...

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.9.0 and StreamSets Data Collector Edge 3.9.0. StreamSets Data Collector is open source under Apache License...


From Zero to Production ETL in 30 minutes with StreamSets

Jeff Schmitz has been working with big data for over a decade: at Shell, Sanchez Energy, MapR and, currently, as a senior solutions architect at MongoDB. Here, in a guest post reposted with permission...


Enhanced Error Diagnostics in StreamSets Data Collector 3.9.0

StreamSets Data Collector reads from and writes to a wide variety of data stores and messaging platforms. Any interaction with an external system brings with it the risk of an error, and error messages...


A New Definition of DataOps

This is a short post, but a relevant one. Ever since DataOps got started (about 5 years ago), it hasn’t had a well-adopted, common definition. Wikipedia is partially OK with: DataOps is an automated,...
