StreamSets Data Collector: Simple Network Management Protocol And Management...
This is a guest post by Clark Bradley, Solutions Engineer, StreamSets. SNMP stands for Simple Network Management Protocol and allows network devices to share information. SNMP is supported across a...
Ingest data into Azure Synapse Analytics (formerly SQL DW) with StreamSets Cloud
Azure Synapse Analytics, the next evolution of Azure SQL Data Warehouse, combines enterprise data warehousing and big data analytics into a single analytics service. StreamSets Cloud’s new Azure SQL...
StreamSets Transformer: Design Patterns For Slowly Changing Dimensions
In this blog, we will look at a few design patterns for Slowly Changing Dimensions (SCD) Type 2 and see how StreamSets Transformer, the newest addition to the StreamSets DataOps Platform, makes it easy...
Announcing StreamSets Data Collector 3.12.0 and StreamSets Data Collector...
StreamSets is excited to announce the immediate availability of StreamSets Data Collector 3.12.0 and StreamSets Data Collector Edge 3.12.0. StreamSets Data Collector is open source under Apache License...
StreamSets Transformer: Natural Language Processing in PySpark
In two of my previous blogs I illustrated how easily you can extend StreamSets Transformer using Scala: 1) to train a Spark ML RandomForestRegressor model, and 2) to serialize the trained model and save...
Access Relational Databases Anywhere with StreamSets Cloud and SSH Tunnel
StreamSets Cloud makes it easy to build data pipelines integrating with cloud-based relational database services such as Amazon RDS, Google Cloud SQL and Azure’s databases. Now, with the January 13,...
Sentiment Analysis: Microsoft SQL Server 2019 Big Data Cluster And StreamSets...
Here’s how to enable greater flexibility in interacting with Big Data and managing it in a way that makes it easy to use the data for machine learning and analytical tasks. It allows data team members...
Digging Deeper into StreamSets Cloud and SSH Tunnel
The recent blog post, Access Relational Databases Anywhere with StreamSets Cloud and SSH Tunnel, presented an overview of StreamSets Cloud’s SSH tunnel feature. In this article, aimed at IT...
Databricks Bulk Ingest of Salesforce Data Into Delta Lake
Learn how to leverage the newly released Databricks COPY command for bulk ingest into Delta Lake using the hosted StreamSets Cloud service. StreamSets is proud to announce an expansion of its partnership...
Fun with Spark and Kafka: Slot Car Performance Tracking
Antonin Bruneau is a Solution Engineer at StreamSets, based in Paris, France. He recently created a fun demo system showing how the StreamSets DataOps Platform can collect, ingest, and transform IoT...
The Hidden Complexity of Data Operations
Akin to art and music, the most appealing part of the design process is always the ideation, the canvases where you get to explore new ideas that usher in new value and opportunity. Consider The Space...
10 (More) Tips for Working from Home
The StreamSets team will be working hard and working remotely for the next few weeks. As I headed back across the San Francisco Bay to my home in Marin County, I posed the question on our...
Coping in the Time of COVID19 – A CPO’s (and Mom’s) Perspective
This is my first blog for StreamSets and I’m going to do it all wrong. I should be writing a thought leadership piece on the DataOps category, drop hints about how uniquely differentiated our product...
COVID-19 Open Research Dataset Analysis on StreamSets DataOps Platform
Recently I attended an inspirational tech talk hosted by Databricks where the presenters shared some great tips and techniques around analyzing the COVID-19 Open Research Dataset (CORD-19) freely available...
StreamSets Live: Demos with Dash — Your Questions Answered
On March 25, 2020 I hosted my first StreamSets Live: Demos with Dash and I want to thank everyone that took the time to join despite what might have been going on in your lives. Times are definitely...
Streaming Analysis Using Spark ML in StreamSets DataOps Platform
Learn how to load a serialized Spark ML model stored in MLeap bundle format on Databricks File System (DBFS), and use it for classification on new, streaming data flowing through the StreamSets DataOps...
Action in the Face of Uncertainty: Leading through Crisis
“These are the times that try men’s souls.” —Thomas Paine, The American Crisis Many of you, reading this blog, weathered the systemic disruptions of the dot-com bubble burst and the great financial...
StreamSets Named in InsideBigData Impact 50 List
Every year the insideBigData team puts together a list of companies making the biggest impact on this space, and for the 3rd year in a row StreamSets is one of these 50 prominent disruptors....
Scenario 3: The Overnight Spike
Dramatic Demand Spike Requires Major Operational Changes. You’ve seen a spike in demand due to the pandemic. Your organization is stretched to the extreme while making major operational changes, and...
Scenario 2: Temporary Hibernation
Maintain Essential Operations When Demand Drops. Your business has dropped significantly. You’re still operating but with major downsizing or other changes. First, I’m so sorry that you and your...