Tapestry's Audrey Miller presents her curated shortlist of the most useful frameworks to better understand the growing world of LLMs, plus some bonus resources to explore the topic further.
The first public lecture to mention computer intelligence was given by Alan Turing in London in 1947, where he discussed the “possibility of letting the machine alter its own instructions,” an idea introduced in his unpublished paper “Intelligent Machinery” [1]. Since then we’ve seen the invention of ELIZA (1964), Watson (2004-2011), and an explosion of new language models beginning in 2018 with OpenAI’s GPT-1. I’m no history buff, but the internet is a wonderful place where one can find a timeline of AI model development and a compiled list of models sorted by lab, parameter size, and date announced (no gatekeeping!).
Though development has rapidly accelerated, limitations inherent to the technology still slow broader adoption. Hazy memory, a constrained ability to operate in real environments, and sensitivity to inputs limit use cases to situations where imperfect outputs can be tolerated. In response, many projects have emerged to confront these obstacles to implementation.
In an effort to better understand these emerging tools, I set out to explore the frameworks already proposed for mapping this growing world of LLM tooling. I originally intended to propose my own, perhaps a combination of the inputs I ingested, but found the existing ones quite robust. Rather than reinvent the wheel, I present here my curated shortlist of the best frameworks I have found, my views on the areas I’m excited about, plus the most interesting other resources I’ve come across that have informed my thinking.
Years ago, I developed an internal framework for categorizing companies depending on where they sit in the value chain. I called it my “four-layer company cake”. In this framework, every company can be described as sitting in one of four layers: the data layer, the “highway” layer (think APIs), the orchestration layer, or the consumption layer. Later, I realized this was oddly similar to (if an overly simplified version of) the OSI model – a conceptual model that describes communication between systems as levels of abstraction. The idea of using a framework as shorthand for categorizing emerging technologies isn’t novel, and various frameworks have been proposed for the surge of AI companies in the market.
Similar to my four-layer cake, Tomasz Tunguz of Theory Ventures proposes a four-part mental model for exploring the LLM stack: (1) data layer, (2) model layer, (3) deployment layer, (4) interface layer. I like many things about this framework. In particular, I find the delineation of the four categories within the “deployment layer” helpful. To my mind, however, the categorization of his “model layer” ignores circularities inherent in the training, routing, and fine-tuning process. Few applications, categorized in this framework as “interface layers”, can run linearly with a model integrated directly into a user interface. There is an interdependence between these layers that is lost in an overly linear framework; though helpful for simplification, it can actually make the stack harder to visualize for laypeople like me!
Heather Miller of Two Sigma recently published a Guide to Large Language Model Abstractions that digs even deeper into the proposed “third layer” above. In it, they offer two frameworks for dissecting this layer. The first is the Language Model System Interface Model (LMSI), a seven-layer framework for thinking about LLM abstractions. Ordered by level of abstraction, it starts with layer one – neural networks that directly access the LLM architecture – and moves up through layers focused on prompt input, rules, circular loops, optimizations, and lastly, applications. The final layer, “user”, describes where humans interact with the system to perform tasks.
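To make these levels concrete, here is a toy sketch of my own (not from the guide), with a hypothetical `complete()` function standing in for a raw model call; each function wraps the one below it at a higher level of abstraction, ending with a rule-enforcing loop:

```python
# Hypothetical stand-in for a raw model call (the lowest LMSI layers).
# In practice this would wrap a provider SDK or a local model.
def complete(prompt: str) -> str:
    return "A placeholder completion."

# Prompt-input layer: a reusable template around the raw call.
def summarize(text: str) -> str:
    return complete(f"Summarize the following in one sentence:\n\n{text}")

# Rules + circular-loop layers: re-invoke the model until the output
# satisfies a constraint (here, a simple word-count rule).
def summarize_with_rule(text: str, max_words: int = 25, retries: int = 3) -> str:
    draft = summarize(text)
    for _ in range(retries):
        if len(draft.split()) <= max_words:
            break
        draft = summarize(text)
    return draft
```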
Miller also proposes a secondary framework that categorizes projects not by theoretical level of abstraction but by functionality. Divided into five groups, the framework outlines (1) Controlled Generation - defining constraints on model output, (2) Schema-Driven Generation - output conforming to user-defined types, (3) Compilation - automatically generated, high-quality prompt chains, (4) Prompt Engineering Tools with Pre-Packaged Modules - tools for generating prompts for more meaningful LLM interactions, and (5) Open-Ended Agents or Multi-Agents - orchestrating LLMs for general-purpose problem solving.
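As an illustration of the first two categories, here is a minimal sketch of my own (not Miller’s) of controlled, schema-driven generation, with a hypothetical `call_llm()` standing in for any model call: the model is asked for JSON matching a declared type, and the output is validated before it reaches application code:

```python
import json
from dataclasses import dataclass

# The schema the application expects, declared at the type level.
@dataclass
class Invoice:
    vendor: str
    total: float

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; a real tool would call a model API here.
    return '{"vendor": "Acme Corp", "total": 1299.50}'

def extract_invoice(document: str) -> Invoice:
    prompt = (
        "Extract the invoice from the text below as JSON with exactly "
        'two keys, "vendor" (string) and "total" (number):\n\n' + document
    )
    raw = call_llm(prompt)
    data = json.loads(raw)  # fails loudly on non-JSON output
    # Coerce and validate against the declared schema; real tools
    # typically re-prompt or repair the output on failure instead.
    return Invoice(vendor=str(data["vendor"]), total=float(data["total"]))

print(extract_invoice("Acme Corp invoice #12, total due $1,299.50"))
```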
Menlo Ventures’s Naomi Pilosof defined the building blocks of the modern AI stack across four layers in her State of Generative AI in the Enterprise report last year. Pilosof’s first layer, compute and foundation, groups foundation models with GPU providers and training, fine-tuning, and deployment tools – spanning the first three layers of Tunguz’s approach. The data layer sits above the model layer in this framework and is split into pre-processing, databases, and pipelines. While I agree that this layer is an interesting space to spend time in, I disagree with the placement, as I find it more intuitive to think about data sitting below the model layers. Another great read is Menlo’s updated “The Modern AI Stack: Design Principles for the Future of Enterprise AI Architectures”.
Other frameworks I like include Andre Retterath’s at Earlybird VC, though I found his earlier piece on value accrual in AI even more interesting. Felix Becker of Heartcore Capital also outlines a blueprint of a modern AI app that neatly describes the building blocks of AI applications. His framework looks very similar to the one Matt Bornstein and Rajko Radovanovic of A16Z propose in their “Emerging Architectures for LLM Applications” piece, which highlights the circularity of the stack: app hosting tools (which in the earlier frameworks would sit between the deployment and interface layers) act as both inputs and outputs of orchestration tools. It is, however, less “full stack” than Tunguz’s framework, focusing only on the “middle” model and deployment layers.
Lastly, Sonya Huang and Pat Grady at Sequoia have been prolific in their effort to position themselves at the helm of the AI narrative. The frameworks they’ve published are robust yet simple. I particularly liked their landscape map organized by modality and split into model versus application layers. Their proposed Gen AI infrastructure stack includes non-AI observability tools at the top and places data labeling alongside the model layer. They also include synthetic data as its own category, which, though small today, is an interesting hint at how they may see that space evolving. Last but not least, I appreciate the simplicity of Index’s proposed three-layer stack (models, infra, applications). At the end of the day, that’s what it boils down to.
Rather than propose my own framework, which would sit somewhere in the middle of all of these, I’d like to share a few questions I’ve been thinking through and, for those who make it to the end, my favorite resources for playing in the space, which have helped refine my thinking.
Lastly, as promised, my favorite fun resources: