Introduction

Large Language Models have fundamentally changed what computers can do. For decades, software could store, retrieve, and process structured data with precision. But it could not understand natural language, extract meaning from unstructured text, or generate coherent prose. LLMs changed that.

When organizations adopt LLMs, they typically start by using them for discrete tasks: summarize this document, draft this email, answer this question. This works well and delivers immediate value. But as ambitions grow, a critical architectural question emerges that most teams fail to ask explicitly: how should LLMs fit into larger systems that accomplish complex goals?

The answer is that LLMs serve two fundamentally different roles in AI systems. Conflating these roles, or failing to recognize the distinction at all, is one of the most common sources of poorly designed AI products. Understanding where each role belongs is essential for building systems that work reliably.

Role 1: The Worker

The first role is the one most people encounter initially. The LLM executes discrete knowledge operations that previously required human cognition.

Consider what happens when a knowledge worker processes information. They read a document and extract key points. They synthesize information from multiple sources. They draft communications. They evaluate whether something meets certain criteria. They translate between formats or languages. Each of these is an atomic operation — a single cognitive task with a defined input and a defined output.

Before LLMs, computers could retrieve the document and present it to a user, but could not perform the reading and extraction. They could store the draft, but could not write it. They could route the information, but could not evaluate it. The human had to perform every operation that required understanding or generating natural language.

LLMs change this equation. They can execute these atomic knowledge operations programmatically. For the first time, we can build automated workflows that include steps like: read this contract and extract the key terms; evaluate whether this claim matches the policy criteria; synthesize these three reports into a summary; draft a response to this customer inquiry; assess the sentiment and urgency of this message.

Each of these is a discrete task. The LLM receives an input, performs a cognitive operation, and produces an output. The system designer determines what operations occur and in what sequence. The LLM is the worker executing each step — powerful, but operating within a structure defined externally.
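As a minimal sketch, the worker role looks like this in code, assuming a hypothetical `call_llm` function standing in for any model API (stubbed here so the example runs without a model):

```python
# Sketch of the worker role: a fixed pipeline where each step is one
# atomic LLM operation. `call_llm` is a hypothetical stand-in for any
# model API; the stub below lets the example run without a model.
def call_llm(instruction: str, text: str) -> str:
    return f"[{instruction}] {text[:40]}"

def process_inquiry(message: str) -> dict:
    """The designer fixes the sequence; the LLM only executes each step."""
    sentiment = call_llm("Assess sentiment and urgency", message)
    summary = call_llm("Summarize the request in one sentence", message)
    draft = call_llm("Draft a polite response", message)
    return {"sentiment": sentiment, "summary": summary, "draft": draft}

result = process_inquiry("My invoice is wrong and I need this fixed today.")
print(sorted(result.keys()))
```

Whatever the input, the same three operations run in the same order; the model never chooses the sequence.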

Role 2: The Planner

The second role is fundamentally different. Instead of executing a predefined operation, the LLM decides which operations to perform and in what order.

Consider a research task: "Find information about recent regulatory changes affecting our industry and summarize the implications." A human researcher would approach this by making a series of decisions. What sources should I check? What search queries should I use? Which results are relevant enough to read in full? How should I organize the findings? What level of detail does the summary require?

In the worker role, a system designer would need to predefine this entire workflow: search these three databases with these queries, filter results by these criteria, retrieve the top five, summarize each, then compile. This works if the task is predictable. But research tasks vary. The right approach depends on what you find along the way.

In the planner role, the LLM itself makes these decisions. You give it a goal and a set of capabilities — search tools, retrieval tools, analysis tools — and it determines the sequence. It might search, evaluate results, decide to search again with a refined query, retrieve promising documents, extract relevant sections, and synthesize findings. The specific path emerges from the model's judgment about what is needed at each step.

This is agentic behavior. The LLM is not just executing operations; it is planning and coordinating them. It operates at a higher level of abstraction, reasoning about goals and strategies rather than just processing inputs.
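A planner loop can be sketched as follows. Here `plan_next_step` stands in for the LLM's decision about what to do next and is scripted so the example runs; all tool names are illustrative, not a prescribed API:

```python
# Sketch of the planner role: the model chooses which tool to call next.
def search(query: str) -> str:
    return f"results for '{query}'"

def retrieve(doc: str) -> str:
    return f"contents of {doc}"

def synthesize(notes: list[str]) -> str:
    return "summary of " + "; ".join(notes)

TOOLS = {"search": search, "retrieve": retrieve}

def plan_next_step(goal: str, history: list) -> tuple[str, str]:
    # A real system would ask the LLM; this stub scripts one plausible path.
    if not history:
        return ("search", goal)
    if len(history) == 1:
        return ("retrieve", "top result")
    return ("finish", "")

def run_agent(goal: str) -> str:
    history = []
    while True:
        action, arg = plan_next_step(goal, history)
        if action == "finish":
            return synthesize(history)
        history.append(TOOLS[action](arg))

print(run_agent("recent regulatory changes"))
```

The loop itself is trivial; the substance lives in the decision function, which in a real system is a model call that sees the goal and everything observed so far.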

The Spectrum

These two roles create a design spectrum that every AI system must navigate.

Fully Deterministic. The system designer specifies the complete workflow in advance. LLMs execute each step but make no decisions about sequencing. This approach maximizes predictability and auditability: you know exactly what will happen for any given input, and when something goes wrong, you can identify exactly where. It works well when the process is well understood and consistent across inputs.

Fully Agentic. The system provides the LLM with capabilities and goals but no prescribed workflow. The LLM interprets requests, plans approaches, selects tools, and determines sequencing. This approach maximizes flexibility and handles novel situations well. But the path through the system varies with each execution, making it harder to predict, audit, and optimize. When something goes wrong, understanding why requires reconstructing the model's reasoning chain.

Most production systems live somewhere between these extremes. More importantly, the two approaches can be combined at different layers of the same system.

Hybrid Architectures: The Recursive Structure

Here is where the architecture becomes genuinely interesting. A deterministic workflow can invoke an agentic step, and an agentic system can invoke deterministic tools. This creates a recursive structure with significant design implications.

Deterministic Calling Agentic

Consider a claims processing workflow. The overall structure is deterministic: receive claim, validate format, assess coverage, calculate payout, generate decision letter. This sequence is fixed for compliance and consistency reasons — the business requires that every claim passes through the same stages in the same order.

But within the "assess coverage" step, the task might be genuinely complex. The claim might require research into ambiguous policy language, judgment about edge cases, or synthesis of multiple policy provisions that interact in non-obvious ways. This step resists predefinition because the right approach depends on the specific claim.

The solution: that step invokes an agent. The agent has access to policy documents, historical decisions, and research tools. It determines how to evaluate this specific claim's coverage. The outer workflow knows what it needs from this step — a coverage determination with supporting rationale — but delegates the how to an LLM that can adapt its approach to the situation.
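This hybrid can be sketched as a fixed stage sequence in which only one stage delegates to an agent. Every function name here is illustrative, and the agentic step is stubbed as a single judgment so the example runs:

```python
# Sketch of a deterministic workflow with one agentic step.
def validate_format(claim: dict) -> dict:
    assert "policy_id" in claim and "amount" in claim
    return claim

def assess_coverage(claim: dict) -> dict:
    # Agentic step: a real agent would research policy documents and
    # choose its own path. Stubbed here as a single rule.
    covered = claim["amount"] <= 10_000
    rationale = "within policy limit" if covered else "exceeds limit"
    return {**claim, "covered": covered, "rationale": rationale}

def calculate_payout(claim: dict) -> dict:
    return {**claim, "payout": claim["amount"] if claim["covered"] else 0}

def decision_letter(claim: dict) -> str:
    return f"Claim {claim['policy_id']}: payout {claim['payout']} ({claim['rationale']})"

def process_claim(claim: dict) -> str:
    # The outer sequence is fixed for every claim, for compliance reasons.
    for stage in (validate_format, assess_coverage, calculate_payout):
        claim = stage(claim)
    return decision_letter(claim)

print(process_claim({"policy_id": "P-1", "amount": 2_500}))
```

The outer loop guarantees every claim passes through the same stages; only the internals of `assess_coverage` vary per claim.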

Agentic Calling Deterministic

Now consider the reverse. A customer service agent receives an open-ended request: "I need help with my account." The agent must interpret this, ask clarifying questions, and determine what the customer actually needs. This outer layer is necessarily agentic — you cannot predict what customers will ask, and the appropriate response depends entirely on the conversation.

But when the agent determines that the customer needs a policy summary, it does not improvise one from scratch. It invokes a "generate policy summary" tool that internally runs a tight, optimized, deterministic pipeline: extract policy details, structure according to template, validate completeness, format output. The agent operates at the level of "what does this customer need" while proven workflows handle "how do we produce that artifact."
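The reverse pattern can be sketched the same way, with illustrative names and the agent's interpretation step stubbed:

```python
# Sketch of an agent invoking deterministic compound tools. The choice
# of tool is the agent's; each tool's internals run a fixed pipeline.
def generate_policy_summary(policy: dict) -> str:
    # Deterministic internals: extract, structure, validate, format.
    details = {k: policy[k] for k in ("holder", "coverage")}
    summary = f"{details['holder']} is covered up to ${details['coverage']}"
    assert "covered" in summary  # completeness check before returning
    return summary

def send_password_reset(policy: dict) -> str:
    return f"reset link sent to {policy['holder']}"

TOOLS = {"policy_summary": generate_policy_summary,
         "password_reset": send_password_reset}

def interpret_request(message: str) -> str:
    # Stub for the agent's judgment; a real system would ask the LLM.
    return "policy_summary" if "policy" in message.lower() else "password_reset"

def handle(message: str, policy: dict) -> str:
    return TOOLS[interpret_request(message)](policy)

policy = {"holder": "A. Smith", "coverage": 50_000}
print(handle("Can you summarize my policy?", policy))
```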

Arbitrary Depth

These patterns nest to arbitrary depth. A deterministic workflow might call an agent that selects among deterministic tools, each of which might have agentic substeps. The architecture becomes a tree where each node is either deterministic or agentic, and the leaves are atomic operations that an LLM executes.

This is not theoretical elegance — it is how effective production systems actually work. The architecture mirrors the structure of the work itself: some parts are predictable and should be locked down, other parts require judgment and should be flexible.
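One way to sketch this nesting is a small tree whose nodes are deterministic sequences, agentic choices, or leaf operations. The structure and names are illustrative, with the agentic choice stubbed as a simple rule:

```python
# Sketch of the recursive structure: each node is either a deterministic
# sequence of children, an agentic choice among them, or a leaf operation.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Node:
    kind: str                       # "det", "agentic", or "leaf"
    op: Callable = None             # leaves: the atomic operation
    children: list = field(default_factory=list)
    choose: Callable = None         # agentic nodes: picks one child

def run(node: Node, data: str) -> str:
    if node.kind == "leaf":
        return node.op(data)
    if node.kind == "det":          # fixed order: every child runs
        for child in node.children:
            data = run(child, data)
        return data
    child = node.choose(node.children, data)  # agent picks one path
    return run(child, data)

tree = Node("det", children=[
    Node("leaf", op=lambda d: d + " -> validated"),
    Node("agentic",
         children=[Node("leaf", op=lambda d: d + " -> deep-review"),
                   Node("leaf", op=lambda d: d + " -> fast-path")],
         choose=lambda kids, d: kids[0] if "complex" in d else kids[1]),
    Node("leaf", op=lambda d: d + " -> formatted"),
])

print(run(tree, "simple claim"))
```

Because `run` recurses, deterministic and agentic nodes can nest to any depth without changing the interpreter.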

Design Principles

This framework suggests several principles for building effective AI systems.

Choose the Top Level Based on the Domain

What sits at the top of your system sets its overall character. Regulated processes with compliance requirements often want deterministic tops — predictable, auditable, consistent. Every claim goes through the same stages. Every report follows the same structure. Customer-facing interfaces handling diverse requests often need agentic tops — flexible, adaptive, able to handle the long tail of unexpected inputs. The top level is a strategic choice about the nature of your domain.

Encapsulate Complexity in Well-Orchestrated Tools

The most effective agentic systems do not give agents raw primitives. They give them high-quality compound tools that encapsulate proven workflows. Instead of "here is a search function, a read function, and a summarize function," the agent gets "here is a research-and-summarize tool that reliably produces quality output."

This matters because every decision point for an agent is a potential failure point. An agent reasoning through ten micro-steps has more chances to go wrong than an agent selecting among three well-designed macro-capabilities. The internal complexity of those capabilities is handled by deterministic orchestration that has been tested and optimized. The agent's job becomes selection and sequencing, not reinventing processes from primitives.
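The principle can be sketched by wrapping hypothetical primitives into a single compound capability that the agent sees as one tool:

```python
# Sketch of a compound tool: primitives are orchestrated deterministically
# inside one function, so the agent selects a macro-capability instead of
# sequencing micro-steps. All names are illustrative stubs.
def search(query: str) -> list[str]:
    return [f"doc about {query}"]

def read(doc: str) -> str:
    return f"text of {doc}"

def summarize(texts: list[str]) -> str:
    return "summary: " + " | ".join(texts)

def research_and_summarize(topic: str) -> str:
    # Deterministic orchestration of the primitives, tested and
    # optimized once, exposed to the agent as a single capability.
    docs = search(topic)
    texts = [read(d) for d in docs]
    return summarize(texts)

# The agent's tool registry holds one compound tool, not three
# primitives plus the burden of sequencing them correctly.
AGENT_TOOLS = {"research_and_summarize": research_and_summarize}
print(AGENT_TOOLS["research_and_summarize"]("regulatory changes"))
```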

Place Determinism Where You Need Auditability

Deterministic segments produce clear logs: step one received this input and produced this output, step two received that and produced the next thing. The path through the system is explicit and reproducible. Agentic segments are harder to audit because the path varies based on the model's reasoning.

For regulated industries, this often means deterministic wrappers around agentic cores. Fixed entry points, fixed output validation, flexible internals. You can demonstrate that every claim went through the required evaluation steps even if the evaluation itself involved dynamic reasoning.
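A deterministic wrapper around an agentic core can be sketched like this, with the agent call stubbed and all names illustrative:

```python
# Sketch of a deterministic wrapper: fixed entry validation and fixed
# output validation, with flexible agentic reasoning in between.
def agent_evaluate(claim: dict) -> dict:
    # Agentic core: in a real system the path here varies per claim.
    return {"decision": "approve", "rationale": "meets criteria"}

def evaluate_claim(claim: dict) -> dict:
    # Fixed entry point: reject malformed input before the agent runs.
    if "claim_id" not in claim:
        raise ValueError("missing claim_id")
    result = agent_evaluate(claim)
    # Fixed output validation: whatever path the agent took, the result
    # must carry a decision and a rationale before leaving the wrapper.
    for key in ("decision", "rationale"):
        if key not in result:
            raise ValueError(f"agent output missing {key}")
    return {"claim_id": claim["claim_id"], **result}

print(evaluate_claim({"claim_id": "C-7"}))
```

The audit log can then record that every claim entered through the same validation and exited through the same checks, even though the reasoning in between was dynamic.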

Place Agentic Behavior Where You Need Adaptability

Some tasks are predictable — same inputs, same process, every time. Lock those down as deterministic workflows and optimize them relentlessly. Other tasks vary significantly based on context. Customer requests come in infinite varieties. Research questions require different approaches depending on what is available. Novel situations do not fit existing templates.

These are candidates for agentic approaches. The art is identifying which tasks are which. Over-constraining adaptive tasks makes systems brittle — they break on anything outside the predefined path. Under-constraining predictable tasks wastes agent reasoning on solved problems and introduces unnecessary variability.

Conclusion

LLMs contribute to AI systems in two distinct ways: as workers executing atomic knowledge operations, and as planners deciding which operations to invoke and in what order. Effective system design requires understanding both roles and choosing deliberately where each applies.

The most sophisticated production systems use both, layered recursively. Deterministic pipelines provide predictability and efficiency for known processes. Agentic components provide flexibility for genuinely dynamic tasks. Well-orchestrated tools encapsulate complexity so that agents operate at appropriate abstraction levels rather than drowning in micro-decisions.

This is not just an architectural nicety. It is the difference between AI systems that work reliably in production and those that remain impressive demos. The technology is powerful. The design choices determine whether that power translates to value.