Humility is the first lesson
In the machine learning era of software engineering, one persistent truth has emerged: engineers are increasingly submitting to the will of the machine. A significant milestone in the transition from classical machine learning to deep learning was the replacement of hand-designed features with learned features—representations learned by the model itself. It became evident that, for many tasks, applying vast amounts of data and computational power could yield greater breakthroughs than human algorithmic insight alone.
We have swiftly transitioned from crafting algorithms to training models, from manually engineered features to learned features in neural networks, and most recently, from hand-crafted loss functions to learned ones, thanks to techniques like Reinforcement Learning from Human Feedback (RLHF). The profound impact of RL optimization on top of Large Language Models (LLMs) has been crucial for achieving breakthroughs on increasingly complex tasks. Surrendering control over the definition of the loss function is arguably one of the key breakthroughs associated with RLHF.
For engineers developing AI applications, we are witnessing a recapitulation of this pattern. As we converge on standardized approaches for building AI applications, which entail integrating and customizing LLM behavior, a continuum of approaches with associated trade-offs and payoffs has emerged. These methods range from basic prompt engineering, to naive RAG, to sophisticated RAG pipelines (which require navigating challenging Information Retrieval problems), to fine-tuning and parameter-efficient fine-tuning methods, and even the creation of LLMs from the ground up, such as BloombergGPT.
When the model decides
One of the most exciting developments in recent waves of LLM innovation is the ability for models to invoke functions and expose an agent API capable of utilizing tools created and registered by AI engineers. This is an instantiation of the idea that LLMs can improve their responses by using tools external to their language generation capability. These tools could be for Math, Coding, Search and Information Retrieval, and ultimately the use of any external API.
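To make this concrete, here is a minimal sketch of the pattern: an engineer registers a tool (a specification plus an implementation), and the model—rather than the engineer—emits a function call that is routed to that tool. All names here (`register_tool`, `dispatch`, the `calculator` tool) are hypothetical illustrations, not any particular vendor's API.

```python
import json

# Hypothetical tool registry: maps a tool name to a JSON-schema-style
# specification plus the Python callable that implements it.
TOOLS = {}

def register_tool(name, description, parameters, fn):
    """Register a tool the model may choose to invoke."""
    TOOLS[name] = {
        "spec": {"name": name, "description": description, "parameters": parameters},
        "fn": fn,
    }

def calculator(expression: str) -> str:
    """Toy math tool; a real one would use a safe expression parser."""
    return str(eval(expression, {"__builtins__": {}}))

register_tool(
    name="calculator",
    description="Evaluate a basic arithmetic expression.",
    parameters={
        "type": "object",
        "properties": {"expression": {"type": "string"}},
        "required": ["expression"],
    },
    fn=calculator,
)

def dispatch(model_response: str) -> str:
    """Route a (simulated) model function call to the registered tool."""
    call = json.loads(model_response)
    tool = TOOLS[call["name"]]
    return tool["fn"](**call["arguments"])

# In production, this JSON would be emitted by the model, not hard-coded:
simulated_call = json.dumps(
    {"name": "calculator", "arguments": {"expression": "17 * 24"}}
)
print(dispatch(simulated_call))  # → 408
```

The key inversion of control is in `dispatch`: the engineer supplies capabilities and their descriptions, but the decision of which tool to call, and with what arguments, belongs to the model.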
In this framework, the LLM becomes an intelligent router, possibly even a data OS as suggested by Andrej Karpathy, capable of directing the user’s intent to the most suitable tool to provide the best response. This evolution began with Meta’s Toolformer paper, which introduced the ability for LLMs to make function calls, and has continued with OpenAI’s announcement at their Dev Day of the Assistants API, which facilitates the integration of custom tools with GPT-4 and its descendants.
The agent as the user persona
These agentic design patterns are once again empowering AI engineers to relinquish some control, with an eye toward new breakthroughs. Unlike RAG patterns, where the engineer determines when and how to incorporate more context into the prompt, the model now decides when to leverage Retrieval as a tool to deliver the highest-quality response. As engineers explore this pattern, product designers must embrace the concept that they may be writing tool specifications for an agent API, which effectively becomes the user story they are delivering!
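A sketch of what that looks like in practice: the tool specification below exposes Retrieval to the agent, and its description reads like a user story—it tells the model *when* this capability should be used, while the engineer no longer controls *whether* it is used on any given request. The tool name `search_documents`, its parameters, and the stub corpus are all hypothetical, assumed only for illustration.

```python
# Hypothetical tool specification for Retrieval, written the way an
# agent API expects it. The description is, in effect, the user story.
retrieval_tool_spec = {
    "name": "search_documents",
    "description": (
        "As an assistant answering enterprise questions, use this tool "
        "whenever the answer may depend on internal documents rather "
        "than general knowledge."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Natural-language query"},
            "top_k": {"type": "integer", "description": "Passages to return"},
        },
        "required": ["query"],
    },
}

def search_documents(query: str, top_k: int = 3):
    """Stub retriever: a real implementation would query a vector store."""
    corpus = {
        "onboarding": "New hires complete security training in week one.",
        "expenses": "Receipts are required for expenses over $25.",
    }
    # Crude keyword match stands in for embedding similarity search.
    hits = [text for key, text in corpus.items() if key in query.lower()]
    return hits[:top_k]

# The engineer ships the spec; the agent decides when to invoke it.
print(search_documents("What is the expenses policy?"))
```

Note how the trade-off from the surrounding discussion shows up directly in the `description` field: writing it well is the new design surface, because it is the only lever the engineer retains over when Retrieval happens.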
At Datavolo, we view the path forward as a continuum of approaches, each with its own trade-offs and payoffs, spanning these core design patterns. Enterprises will need to carefully consider critical trade-offs across dimensions such as system coupling and vendor lock-in, cost, complexity, and security and privacy. Furthermore, in each of these core design patterns, we believe Datavolo will play a pivotal role.
For in-context learning, Datavolo will prove invaluable for the data engineering steps essential in constructing effective RAG applications—acquiring, extracting, chunking & structuring, transforming, and loading multimodal data. In fine-tuning, Datavolo will assist data engineers in building staging environments with multimodal training data and assessing the performance of models. As for agent APIs, Datavolo will one day evolve into a tool that agents use themselves!
Datavolo offers the flexibility users need, ensuring that, regardless of the chosen design pattern, they will be well-equipped to develop multimodal AI applications that are impactful within the enterprise. Please stay tuned for future blogs where we will continue to explore this theme!