Channel: StreamSets
Browsing all 475 articles

How DataOps is Adding Value to Data Lakes

For those of you who joined us on June 6th, you dialed into a forward-thinking conversation between three industry experts. They waxed poetic about topics including big data, DataOps, governance, data...

Announcing StreamSets Data Collector 3.10.0 and StreamSets Data Collector...

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.10.0 and StreamSets Data Collector Edge 3.10.0. StreamSets Data Collector is open source under Apache License...

Automating Kerberos KeyTab Generation for Kubernetes-based Deployments

A major challenge when deploying dataflow pipelines to run on Kubernetes is how to handle Kerberos principals and keytabs needed when pipelines write to secure Hadoop. One approach, of using Kerberos...

StreamSets Transformer is Here

The emerging practice of DataOps encompasses many activities that an enterprise may execute today including ingestion, ETL, and real-time stream processing. The trick to DataOps is to execute these...

StreamSets Transformer Extensibility: Spark and Machine Learning

Apache Spark has been on the rise for the past few years and it continues to dominate the landscape when it comes to in-memory and distributed computing, real-time analysis and machine learning use...

StreamSets Transformer Extensibility — Part 2: Spark MLeap Bundles to S3

In part 1, you learned how to extend StreamSets Transformer to train a Spark ML RandomForestRegressor model. In part 2, you will learn how to create a Spark MLeap bundle to serialize the...

The StreamSets Cloud Beta is Open for Participation!

Today we are opening the StreamSets Cloud Beta program, inviting you to experience and give feedback on the latest addition to the StreamSets product family. StreamSets Cloud is a cloud service for...

StreamSets Cloud Unlocking Insights: Amazon S3 to Snowflake

StreamSets Cloud is a cloud service for designing, deploying and operating smart data pipelines, combining ease and scalability with the flexibility to execute pipelines anywhere – on-premise, or in a...

Announcing StreamSets Data Collector 3.11.0 and StreamSets Data Collector...

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.11.0 and StreamSets Data Collector Edge 3.11.0. StreamSets Data Collector is open source under Apache License...

StreamSets Transformer: Your Questions Answered

StreamSets Transformer, a powerful tool for creating highly instrumented Apache Spark applications for modern ETL, is the newest addition to the StreamSets DataOps Platform. StreamSets enables...

StreamSets Data Collector: Simple Network Management Protocol And Management...

This is a guest post by Clark Bradley, Solutions Engineer, StreamSets. SNMP stands for Simple Network Management Protocol and allows network devices to share information. SNMP is supported across a...

Ingest data into Azure Synapse Analytics (formerly SQL DW) with StreamSets Cloud

Azure Synapse Analytics, the next evolution of Azure SQL Data Warehouse, combines enterprise data warehousing and big data analytics into a single analytics service. StreamSets Cloud’s new Azure SQL...

StreamSets Transformer: Design Patterns For Slowly Changing Dimensions

In this blog, we will look at a few design patterns for Slowly Changing Dimensions (SCD) Type 2 and see how StreamSets Transformer, the newest addition to the StreamSets DataOps Platform, makes it easy...

Announcing StreamSets Data Collector 3.12.0 and StreamSets Data Collector...

StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.12.0 and StreamSets Data Collector Edge 3.12.0. StreamSets Data Collector is open source under Apache License...

StreamSets Transformer: Natural Language Processing in PySpark

In two of my previous blogs I illustrated how easily you can extend StreamSets Transformer using Scala: 1) to train a Spark ML RandomForestRegressor model, and 2) to serialize the trained model and save...
