Constructing Apache NiFi Clusters on Kubernetes

Introduction

Clustering is a core capability of Apache NiFi. Clustered deployments support centralized configuration and distributed processing. NiFi 1.0.0 introduced clustering based on Apache ZooKeeper for coordinated leader election and shared state tracking. Among the many technical advances in NiFi 2.0.0, clustering based on Kubernetes capabilities provides a native foundation for scalable data pipeline deployments. Kubernetes clustering unlocks better monitoring strategies, optimized resource allocation, and streamlined configuration. In addition to service coordination, robust NiFi cluster deployments on Kubernetes also require storage provisioning, certificate management, access control, and ingress configuration. Understanding both the foundational features and the supporting infrastructure for NiFi clustering on Kubernetes is essential to deploying resilient data processing solutions.

Native Clustering Features for Kubernetes

Deploying NiFi clusters on Kubernetes is not a revolutionary strategy. Following the implementation of clustering with ZooKeeper for NiFi 1.0.0 in 2016, both commercial and open source projects have provided options for deployments on Kubernetes. ZooKeeper is a capable solution in a variety of environments, but with Kubernetes offering native coordination primitives, running ZooKeeper as a separate service adds resource and configuration overhead.

Recognizing the need for improvement, NiFi 2.0.0-M1 introduced direct support for Kubernetes clustering without the need for ZooKeeper. Building on restructured NiFi interfaces for leader election, the new implementation uses Kubernetes Leases to manage the Cluster Coordinator role. Based on existing framework abstractions, native Kubernetes support also includes shared state tracking using ConfigMaps. NiFi 2.0.0 packages these components together in the nifi-framework-kubernetes-nar library. Both implementations build on the Fabric8 Kubernetes Client, which streamlines Kubernetes client configuration and supports multiple versions of the Kubernetes API.
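
As a rough sketch of where these implementations are selected, the leader election manager is chosen in nifi.properties and the cluster state provider in state-management.xml. The property and class names below are taken from the NiFi 2.x documentation and should be verified against the specific distribution in use.

# Leader election manager selected in conf/nifi.properties
grep 'nifi.cluster.leader.election.implementation' conf/nifi.properties
nifi.cluster.leader.election.implementation=KubernetesLeaderElectionManager

# ConfigMap-backed provider referenced in conf/state-management.xml
grep 'KubernetesConfigMapStateProvider' conf/state-management.xml
<class>org.apache.nifi.kubernetes.state.provider.KubernetesConfigMapStateProvider</class>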

Kubernetes Leases for Leader Election

Leases in Kubernetes provide a first-class resource for coordinated leader election across multiple instances of integrated services. Kubernetes itself uses Leases to coordinate control plane components such as the scheduler, providing high availability for essential services.

NiFi uses Kubernetes Leases to track the Cluster Coordinator and Primary Node roles within a cluster. The Cluster Coordinator is responsible for maintaining and distributing the flow configuration. The Primary Node is responsible for running Processors configured for single-node execution, such as a source Processor listing items in an object storage service. The NiFi server elected as the leader for these roles can change during the life of a cluster. In the event of a single server failure, cluster members elect a new leader to perform these functions.

Kubernetes Leases include specific properties that identify the current lease holder as well as lease duration and modification timestamps. Based on these properties, cluster members can determine the current Cluster Coordinator and Primary Node, and also decide when to initiate a new leader election. The acquire time and renew time properties on each Lease object also provide helpful details when troubleshooting the behavior of cluster leader election.

Standard tools such as kubectl provide straightforward access to cluster leader information when NiFi is configured for Kubernetes clustering. Listing the Leases for a specific namespace provides a quick summary of the current Cluster Coordinator and Primary Node.

kubectl -n nifi get leases

The listing includes the name of each NiFi role and the current holder, along with the age, which indicates when the first pod created the Lease resource.

NAME                   HOLDER                                AGE
cluster-coordinator    node-0.nifi.svc.cluster.local:8445    12h
primary-node           node-0.nifi.svc.cluster.local:8445    12h
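
The acquire and renew timestamps described above can be read directly from an individual Lease when troubleshooting leader election.

kubectl -n nifi get lease cluster-coordinator -o yaml

The spec section of the returned Lease resembles the following, with illustrative and abridged values:

spec:
  acquireTime: "2024-10-14T08:30:00.000000Z"
  holderIdentity: node-0.nifi.svc.cluster.local:8445
  renewTime: "2024-10-14T20:30:00.000000Z"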

Kubernetes ConfigMaps for State Tracking

NiFi clustering requires not only coordinated leadership but also shared state tracking. Multiple NiFi Processors depend on shared state to determine offsets for iterative operations, such as consuming events or detecting changes. This feature is critical for resilient processing in scenarios where one NiFi server fails and cluster members elect a new Primary Node. With shared state tracking, the new Primary Node continues processing based on the last successful state recorded.

Building on the NiFi State Provider abstraction, the Kubernetes implementation stores key and value pairs in ConfigMaps for identified components. Processors such as ListS3 and ListGoogleDrive integrate with cluster state tracking for the last item or timestamp observed to avoid processing duplicate information. The Kubernetes implementation translates programmatic key and value information to persistent properties in a ConfigMap named according to the NiFi Processor identifier. This identification mapping supports straightforward correlation between a list of NiFi Processors and a list of Kubernetes ConfigMaps.

The NiFi State Map interface supports keys and values as Unicode strings, providing significant flexibility for integrating Processors. Although Kubernetes ConfigMaps support arbitrary string values, keys are restricted to alphanumeric characters, dashes, underscores, and periods. To avoid potential errors, the NiFi State Provider for Kubernetes applies Base64 encoding to ConfigMap keys. This encoding obfuscates keys, requiring an additional decoding step, but ensures compatibility with the NiFi State Map API. Kubernetes ConfigMaps also have a size restriction, limiting stored keys and values to a total of 1 MiB. Most NiFi Processors integrating with cluster state tracking use a very small set of keys and values, so the size limit for ConfigMaps is not a concern.

Reviewing Component State in ConfigMaps

Listing the ConfigMaps for a specific namespace provides an indication of the number of Processors using shared state information.

kubectl -n nifi get configmaps

The ConfigMaps listed may not include any state tracking entries if the flow configuration does not include any Processors that require cluster state information. For flow configurations that use cluster state, the list of ConfigMaps will include one or more entries with names starting with nifi-component followed by the UUID of the Processor.

NAME                                                  DATA   AGE
nifi-component-9341f491-633a-33dc-80af-662531f95dfb   1      12h
nifi-component-9aefc196-466a-3ad0-976f-3eb857ba80d6   2      12h
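
To review the state recorded for a particular Processor, the ConfigMap data can be retrieved directly. The command below uses one of the component identifiers from the listing above.

kubectl -n nifi get configmap nifi-component-9341f491-633a-33dc-80af-662531f95dfb -o jsonpath='{.data}'

Because the keys in the data section are Base64 encoded, recovering the original NiFi state key requires decoding; the encoded key below is a placeholder.

echo '<encoded-key>' | base64 --decode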

Supporting Infrastructure for Kubernetes Clustering

Native Kubernetes clustering in NiFi 2.0.0 provides an important foundation for declarative configuration and scalable deployments. Leader election and shared state management are required for NiFi clustering, but a maintainable deployment solution requires additional capabilities.

Providing persistent storage for NiFi repositories is essential for resilient processing. Automated certificate distribution and parameterized authorization settings are also key elements of a secure cluster. External access to NiFi interfaces through an ingress controller is required for managing and monitoring flow configurations. Having an architecture design and implementation strategy that brings together all of these elements is necessary for a successful deployment of NiFi on Kubernetes.

Persistent Volumes for Configuration and Repositories

NiFi is not a data storage service, but it uses persistent storage for maintaining the content and metadata of files being processed. NiFi also maintains a serialized representation of the flow configuration. Given these requirements, Kubernetes Persistent Volumes are necessary to maintain flow configuration as well as Content, FlowFile, and Provenance repository records. Although NiFi can be configured with memory-backed implementations for some repositories, this approach exposes pipelines to potential data loss in the event of application restarts.

Selecting the appropriate Kubernetes Storage Class is also an important part of the persistence configuration. Different Kubernetes providers offer different Storage Classes, requiring careful evaluation of what is available for a particular distribution. Storage selection, often categorized according to the number of IOPS, has a direct impact on data pipeline throughput. Robust data pipeline designs should support the ability to replay data and process potential duplicates, which reduces but does not remove the importance of persistent storage for NiFi repositories and local configuration.
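
As an illustration of pairing a repository with dedicated storage, the following PersistentVolumeClaim is a minimal sketch; the claim name, Storage Class, and requested size are assumptions that depend on the Kubernetes distribution and expected repository growth. In practice, a StatefulSet volumeClaimTemplate typically generates one such claim per cluster member so that each node keeps its Content, FlowFile, and Provenance repositories on its own volumes.

kubectl -n nifi apply -f - <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: content-repository-node-0   # illustrative claim name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: fast-ssd        # assumed Storage Class; varies by provider
  resources:
    requests:
      storage: 100Gi                # assumed size based on expected repository volume
EOF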

Certificate Distribution for Secure Communication

Encrypted communication with Transport Layer Security is a central feature of NiFi clustering. NiFi cluster members communicate with each other using mutual authentication with TLS, requiring each cluster member to validate peer certificates against a store of trusted authorities. Deploying NiFi clusters on Kubernetes requires each cluster member to have access to trusted certificate authorities along with an identity certificate and key.

NiFi configuration properties support file-based access to key store and trust store information, providing a natural integration with Kubernetes Secrets mounted as virtual files. The cert-manager project is a popular solution for certificate creation and distribution in Kubernetes. As a member of the Cloud Native Computing Foundation, cert-manager provides declarative certificate management and distribution to applications through Kubernetes Secrets. Automated certificate provisioning simplifies certificate rotation and also enables scaling NiFi clusters with additional members. NiFi 2.0.0-M4 and later releases include improvements to automated key store reloading, removing the need to restart cluster members when rotating certificates.
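
The following cert-manager Certificate is a minimal sketch of this pattern; the issuer, DNS names, and Secret names are assumptions, and the resulting Secret would be mounted into each NiFi pod to back the key store and trust store properties.

kubectl -n nifi apply -f - <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: nifi-node-tls
spec:
  secretName: nifi-node-tls         # Secret mounted into NiFi pods
  issuerRef:
    name: nifi-ca-issuer            # assumes an Issuer backed by a cluster CA
    kind: Issuer
  dnsNames:
    - "*.nifi.svc.cluster.local"    # assumed names covering each cluster member
  keystores:
    pkcs12:
      create: true                  # emit a PKCS12 key store alongside the PEM entries
      passwordSecretRef:
        name: nifi-keystore-password
        key: password
EOF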

Authorization for Scalable Clusters

Building on strong authentication strategies, fine-grained authorization is necessary for a complete security solution. NiFi enforces authorization requirements for both end users and cluster members, requiring cluster members to have the permissions required to make requests on behalf of authenticated users. Automated maintenance of cluster membership, integrated with authorization handling, is essential for deploying scalable NiFi clusters on Kubernetes.

Standard authorization in NiFi requires cluster node identities to be specified in the managed configuration. Although this solution is sufficient for manual additions to cluster membership, it is not designed for automated scaling.
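
For reference, the stock file-based configuration lists each node identity explicitly in authorizers.xml, which is why scaling out normally involves editing the managed configuration. The property names below follow the standard FileAccessPolicyProvider format, and the identity values are illustrative.

grep 'Node Identity' conf/authorizers.xml
<property name="Node Identity 1">CN=node-0.nifi.svc.cluster.local</property>
<property name="Node Identity 2">CN=node-1.nifi.svc.cluster.local</property>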

As a framework-level extension, the NiFi Authorizer interface provides the opportunity for integrating alternative solutions. Datavolo deployments of NiFi include a custom Authorizer implementation based on the Cedar Policy Language, bringing together support for fine-grained authorization, dynamic reloading, and statement-based policy decisions. The Cedar Policy implementation enables both user access control and cluster member authorization using a straightforward domain-specific language. This approach allows authorization handling that scales together with Kubernetes clusters while providing the level of control necessary for enterprise deployments.

Centralized Gateway for Cluster Access

With integrated configuration and monitoring, the NiFi web user interface presents data engineers with a compelling method for building data pipelines. Based on dynamic leader election, it is possible for any NiFi cluster member to provide user interface access. When deployed in Kubernetes, providing a single stable URL for access is an important usability and security concern.

Kubernetes Ingress is an established pattern for exposing access to cluster resources over HTTP. With an Ingress Controller such as nginx, access to the NiFi user interface and REST API is possible with encrypted HTTPS from the client browser to the NiFi cluster member. NiFi provides direct support for running behind a reverse proxy using specialized HTTP request headers. This configuration strategy also requires enabling session affinity for HTTP requests. Session affinity, also known as sticky sessions, ensures that the gateway sends multiple asynchronous transactions to the same cluster member. A gateway running in Kubernetes must also participate in automated certificate distribution to ensure that NiFi cluster members accept TLS requests.
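
A minimal Ingress sketch for the nginx Ingress Controller is shown below, assuming a Service named nifi exposing HTTPS on port 8443; the host name and TLS Secret are placeholders. The annotations enable cookie-based session affinity and HTTPS from the controller to the backend, and the ingress host typically also needs to be added to the nifi.web.proxy.host property so that NiFi accepts proxied requests.

kubectl -n nifi apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nifi
  annotations:
    nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
    nginx.ingress.kubernetes.io/affinity: "cookie"
    nginx.ingress.kubernetes.io/session-cookie-name: "nifi-ingress"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - nifi.example.com            # placeholder host name
      secretName: nifi-ingress-tls    # placeholder Secret for the gateway certificate
  rules:
    - host: nifi.example.com          # placeholder host name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nifi            # assumed Service fronting the NiFi pods
                port:
                  number: 8443
EOF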

Datavolo for Clustering NiFi on Kubernetes

Building on the years of experience maintaining Apache NiFi, Datavolo has developed robust solutions for running scalable clusters on Kubernetes. The Datavolo distribution of NiFi incorporates best practices for performance and security, enabling customers to focus on creating multimodal data pipelines. With a custom service implementing the Kubernetes Operator Pattern, it is possible to deploy NiFi with secure settings and elastic scaling. The operator solution combines both core clustering configuration and supporting infrastructure requirements, providing a repeatable process for maximum resource utilization and optimal resource segmentation.

Conclusion

Apache NiFi 2.0.0 brings together years of development effort. The latest version embraces numerous technical advances in Java and cloud native computing. With support for native clustering on Kubernetes, NiFi 2 provides a strong foundation for building scalable data pipelines. Unlocking the potential of NiFi on Kubernetes requires both foundational capabilities and supporting services, along with the knowledge to bring these elements together. Contact us at Datavolo to start building your next scalable data processing system!

