The Open Worldwide Application Security Project (OWASP) states that insecure output handling neglects to validate large language model (LLM) outputs that may lead to downstream security exploits, including code execution that compromises systems and exposes data. This...
Security
Prompt Injection Attack Explained
By now, it’s no surprise that we’ve all heard about prompt injection attacks affecting Large Language Models (LLMs). Since November 2023, prompt injection attacks have been wreaking havoc on many in house built chatbots and homegrown large language models. But what is...
Secure Data Pipeline Observability in Minutes
Monitoring data flows for Apache NiFi has evolved quite a bit since its inception. What started generally with logs and processors sprinkled throughout the pipeline grew to Prometheus REST APIs and a variety of Reporting Tasks. These components pushed NiFi closer to...
How to Package and Deploy Python Processors for Apache NiFi
Introduction Support for Processors in native Python is one of the most notable new features in Apache NiFi 2. Each milestone version of NiFi 2.0.0 has enhanced Python integration, with milestone 3 introducing support for loading Python Processors from NiFi Archive...
Troubleshooting Custom NiFi Processors with Data Provenance and Logs
We at Datavolo like to drink our own champagne, building internal tooling and operational workflows on top of the Datavolo Runtime, our distribution of Apache NiFi. We’ve written about several of these services, including our observability pipeline and Slack chatbots....
Collecting Logs with Apache NiFi and OpenTelemetry
Introduction OpenTelemetry has become a unifying force for software observability, providing a common vocabulary for describing logs, metrics, and traces. With interfaces and instrumentation capabilities in multiple programming languages, OTel presents a compelling...
How custom code can add security risk to enterprise AI projects and LLMs
Data teams are actively delivering new architectures to propel AI innovation at a rapid pace. In this blog, we’ll explore how Datavolo empowers these teams to accelerate while addressing the critical aspects of security, observability, and maintenance for their data...
What is Data Observability for AI?
In today's data-driven world, understanding and measuring what is happening within and between disparate IT systems is paramount. Modern distributed application systems utilizing complex architectures with microservices and cloud-based infrastructure require a...
Reducing Observability Costs and Improving Operational Support at Datavolo
Finding the Observability Balance Through our evaluation of observability options at Datavolo, we’ve seen a lot of strong vendors providing real-time dashboards, ML-driven alerting, and every feature our engineers would use to evaluate our services across the three...
Seven Strategies for Securing Data Ingest Pipelines
Introduction Information security is an elusive but essential quality of modern computer systems. Implementing secure design principles involves different techniques depending on the domain, but core concepts apply regardless of architecture, language, or layers of...