Hey everyone, it’s been an incredible journey over the past ten years since we open-sourced Apache NiFi. Right from the beginning, our mission with NiFi was crystal clear: to make it easier for all of you to gather data from anywhere and transport it to wherever it’s needed, all while ensuring the data is prepared for consumption. We wanted to achieve this in a way that allowed you to build powerful data pipelines quickly and with complete transparency through a visual interface.
Today, I want to share with you how, with the advent of large language models, we’re taking data integration to the next level. We’re making it even more accessible and helping you create robust data flows in record time. Many of you have experienced firsthand how NiFi has transformed your data processing tasks, turning what used to take weeks or months into mere hours or days. And now, with the introduction of Datavolo, we’re on a mission to make it happen even faster.
Our goal is simple: to reduce the time it takes to process data from hours or days down to minutes, or even less. At Datavolo, we’re reimagining the user experience to make it incredibly user-friendly. This encompasses everything from setting up and scaling NiFi clusters to building and monitoring data flows. And let me tell you, our all-new AI-powered flow generation capabilities are just the beginning.
Let’s dive into a real-world demonstration to see Datavolo in action:
In this video, I take you to the Datavolo community workspace on Slack, where we interact with our FlowGen user. We request the creation of a data flow that extracts JSON data from an S3 bucket we call “NiFi notes.” Since some of the data might be compressed, we specify that the flow should include a decompression step. We also want to reformat the transaction date field and convert all field names to uppercase. Finally, the data should be pushed into the transactions table in a Postgres database.
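The request itself is just a plain-English Slack message. It read along these lines (my paraphrase of the request described above, not a verbatim transcript):

```text
Please build me a flow that pulls JSON data from the "NiFi notes" S3
bucket. Some of the files may be compressed, so decompress them when
needed. Reformat the transaction date field, convert all field names
to uppercase, and push the records into the "transactions" table in
a Postgres database.
```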
As soon as we send the request, we receive a quick acknowledgment, signaling that Datavolo is starting to work on it. Sometimes the messages from Datavolo can be a bit quirky, but they’re always entertaining!
In no time, we receive the generated flow, which we can download and add to our NiFi canvas. It’s named “S3 JSON to Postgres,” and our first task is to verify its accuracy; because the flow is AI-generated, there’s always a possibility of minor errors. The overall structure, however, appears sound: listing the S3 bucket’s contents, fetching files, identifying and decompressing compressed content, reformatting dates, converting field names to uppercase, and finally pushing the data into Postgres.
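For readers who want a concrete picture of what a flow like this does, here is the same logic sketched as a standalone Python script. This is an illustration only, not the generated flow itself: the field name `transaction_date`, the date formats, and the connection details are all assumptions I’ve made for the example. In NiFi, each of these steps would typically be its own processor (ListS3, FetchS3Object, CompressContent, a record transform, PutDatabaseRecord) wired together visually on the canvas.

```python
# A minimal sketch of the flow's logic in plain Python, for illustration.
# Assumed: bucket name, field names, date formats, and connection details.
import gzip
import json
from datetime import datetime

import boto3
import psycopg2

BUCKET = "nifi-notes"   # the "NiFi notes" bucket from the request
TABLE = "transactions"  # target Postgres table from the request

s3 = boto3.client("s3")
conn = psycopg2.connect("dbname=demo user=demo password=demo host=localhost")


def maybe_decompress(data: bytes) -> bytes:
    """Decompress only when the payload is gzip (magic bytes 0x1f 0x8b)."""
    if data[:2] == b"\x1f\x8b":
        return gzip.decompress(data)
    return data


def transform(record: dict) -> dict:
    """Reformat the transaction date and convert every field name to uppercase."""
    if "transaction_date" in record:  # assumed field name and source format
        parsed = datetime.strptime(record["transaction_date"], "%m/%d/%Y")
        record["transaction_date"] = parsed.strftime("%Y-%m-%d")
    return {key.upper(): value for key, value in record.items()}


def run():
    # List every object in the bucket, fetch it, and process its records.
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=BUCKET, Key=obj["Key"])["Body"].read()
            records = json.loads(maybe_decompress(body))
            if isinstance(records, dict):
                records = [records]  # normalize a single object to a list
            with conn.cursor() as cur:
                for record in map(transform, records):
                    columns = ", ".join(record)
                    placeholders = ", ".join(["%s"] * len(record))
                    cur.execute(
                        f"INSERT INTO {TABLE} ({columns}) VALUES ({placeholders})",
                        list(record.values()),
                    )
            conn.commit()


if __name__ == "__main__":
    run()
```

Seeing the steps spelled out this way also makes it clear what we need to review in the generated flow: the source bucket, the decompression condition, the two transformations, and the database target.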
Configuring parameters is a breeze, with Datavolo smartly assisting in their creation. We enable controller services, and the flow is ready to go.
Remarkably, it takes just about two minutes to build and configure this fully functional data flow, and it executes all the specified tasks, from sourcing data in S3 to loading it into Postgres, in mere milliseconds. This incredible speed and simplicity demonstrate that you don’t need to be a NiFi expert to harness Datavolo’s capabilities.
This development in data integration is nothing short of revolutionary, and I couldn’t be more excited to share it with all of you. Datavolo’s ability to expedite and simplify data processing tasks holds immense promise for businesses and individuals alike. If you’re eager to learn more and experience this capability for yourself, please visit Datavolo.io and reach out to us. We can’t wait to hear from you and help you revolutionize your data processing workflows.