Context Engineering for Multi-Agent Systems
Context engineering is the art and science of controlling and directing the informational world that a Large Language Model (LLM) has learned. It will transform your role from a questioner into a confident director. It is the difference between handing an actor a single line and giving them a full script, complete with character motivations and stage directions. You are no longer asking for a performance; you are designing it and telling the LLM what to do.
The best way to understand context engineering is to experience it. We will skip the theory and begin with a hands-on demonstration. This journey is designed to let you feel the AI's output transform and to witness its raw, statistical guesses evolve into nuanced, reliable, and perfectly aligned responses. Together, we will progress through five levels of complexity, as shown in Figure 1.1. We will go from a simple prompt to a fully architected semantic blueprint using three major copilots (Google, Microsoft, and OpenAI):

Figure 1.1: Evolving from prompts to contexts
The responses you obtain may vary as each LLM copilot is updated, or between two requests, due to the stochastic nature of generative AI. However, directed, well-designed contexts will always provide better outputs than improvised, undirected prompts.
Let's begin with a basic prompt.
Level 1 is the basic prompt: a simple, direct instruction with no background information. The AI relies entirely on the most common statistical patterns from its training data, as we will see in this example. Our input is as follows:
Complete this sentence: "The cat played with the...
Gemini 2.5's response is this:
... ball of yarn.
Here are a few other ways to complete the sentence:
... toy mouse.
... string.
... cardboard box.
... laser pointer dot.
... other kitten.
As you can see, the output is a cliché. Without context, the AI defaults to the most predictable and uninspired completion.
Microsoft Copilot's response is this:
The cat played with the ball of yarn until it unraveled across the floor."
...crumpled receipt it found under the couch, batting it around like it was the most precious treasure in the kingdom of dust bunnies.
The output, in this case, is more creative, but if we wanted a precise response, we would consider the invented details a hallucination. Without context, the AI can only fall back on completions it has learned statistically.
OpenAI GPT-5's response is this:
The cat played with the ball of yarn until it unraveled across the floor.
The output is a cliché again. Without context, the AI defaults to the most predictable and uninspired completion.
All three models could have stated:
I'm sorry, but without more context, I can't complete this sentence correctly.
We can see that basic prompts will lead to basic, imprecise responses or even hallucinations. The goal here is not to try to find workarounds with model hyperparameters, such as temperature, but to engineer better contexts. Let's move on to the next level.
At Level 2, we add a simple preceding sentence as context. It provides a linear thread of information that improves factual accuracy but doesn't guide the AI's style or purpose. Our input will be as follows:
Complete this sentence: The ball fell on the floor and it rolled for a few feet. The cat then played with the...
Gemini 2.5's response is this:
The ball fell on the floor and it rolled for a few feet. The cat then played with the ball.
Notice that the response is factual and correct, but lacks creativity, which is normal without more context.
Microsoft Copilot's response is this:
...ball, pouncing on it with swift little swats and chasing it across the room like it was the most thrilling prey in the world.
The response, in this case, is factually correct within our micro-story. It's also creative, but the model didn't really know what we wanted it to generate.
OpenAI GPT-5's response is this:
The ball fell on the floor and it rolled for a few feet. The cat then played with the ball, batting it back and forth with its paws.
Here, too, the response is factually correct and also creative, but again, the model didn't know what we wanted and continued without direction.
The responses do little more than follow basic probabilities, but they are relatively accurate. This might work for simple tasks, but not when we need the generative AI model to do specific tasks. Let's move on, then, to engineering a context and setting a goal.
Level 3 is the first true step in context engineering. We will provide a clear goal and richer descriptive details, which enable the LLM to co-create a scene that aligns with our intended purpose. Our input is as follows:
Goal: Write a short, suspenseful scene for a children's story.
---
Complete this sentence: "The ancient grandfather clock chimed midnight, its shadow stretching like a claw. A red ball, seemingly moving on its own, rolled out from under the dusty armchair. The black cat, Onyx, crouched low, his eyes gleaming. He then played with the..."
Here’s Gemini 2.5's response:
The ancient grandfather clock chimed midnight, its shadow stretching like a claw. A red ball, seemingly moving on its own, rolled out from under the dusty armchair. The black cat, Onyx, crouched low, his eyes gleaming. He then played with the... tiny, shimmering thread tied to the ball.
He pounced, but the thread danced away, pulling the red ball with it. The ball didn't just roll; it skittered across the floorboards as if it were alive, a little red mouse being tugged towards the heavy velvet curtains by the window. Onyx let out a low growl, a sound he never made when playing. The ball slipped silently behind the thick, dark fabric. The thread hung in the air for a second, taut and humming, before it too was yanked into the darkness. A tiny whisper, no louder than the rustle of a leaf, echoed from behind the curtain. "Mine."
Observe that the response is more directed. We're beginning to get what we asked for. Remember, LLMs, like humans, need context to provide more effective responses. Given longer, well-designed contexts, both produce better results.
Microsoft Copilot's response is this:
...ball for only a moment before freezing. The ball had stopped—dead still—but Onyx's fur bristled. From behind the armchair came a soft creak, like a whispering floorboard. Onyx backed away, tail puffed, eyes locked on the darkness.
Then, a tiny giggle echoed through the room.
Not a human giggle.
The curtains fluttered though the windows were shut. Onyx hissed, and the ball rolled again—this time toward the hallway, leaving a faint trail of glowing red dust. The clock ticked louder now, each second like a drumbeat.
And just before the hallway swallowed the ball, a pale hand reached out from the shadows and snatched it.
Onyx bolted.
But the giggle remained.
Notice that in this case, Microsoft Copilot starts exactly at the end of our input. Additionally, we can observe that we are transitioning from a prompt (unstructured context) to a directed request. We are the directors.
This is OpenAI GPT-5's response:
...ball, batting it across the wooden floor. But instead of stopping, the ball bounced back toward him—as if pushed by an invisible hand. Onyx froze, his tail puffing up, while the clock ticked louder than ever, each tick echoing through the empty room.
In this case, too, the response is quite nice and starts exactly where we asked it to. Overall, these responses are more creative and intentional than those in Level 2. Yet, they remain loosely guided. To achieve even greater precision, we need to introduce structure into the context itself.
At Level 4, rather than leaving the AI to infer the dramatic structure, we define the roles of each participant. By providing the LLM with characters, objects, and relationships, we give it the scaffolding of conflict and motivation. This makes the response narratively intelligent. Our input will be as follows:
Goal: Write one suspenseful sentence to continue a children's story.
---
BACKGROUND:
- Protagonist: Onyx, a brave black cat.
- Antagonist: An ancient, haunted grandfather clock.
- Catalyst Object: A mysterious red ball.
- Setting: A dark, dusty room at midnight.
- Preceding Action: The clock chimed, and the ball rolled out.
---
Continue the story from this point: "The black cat, Onyx, crouched low... he then played with the...
Here is Gemini 2.5's response:
...mysterious red ball, batting it gently until a low groan echoed from the clock and its tall, glass door swung open into the darkness within.
This time, we didn't simply ask. We told the LLM exactly what we wanted, and the result is acceptable and more directed.
Microsoft Copilot's response is this:
...ball until it pulsed with a strange glow, and the clock's hands spun wildly backward, unleashing a chilling whisper that slithered across the floor toward him.
The LLM, in this case, is attentive to the director we have become. The story follows our structured instructions.
Here is OpenAI GPT-5's response:
The black cat, Onyx, crouched low… he then played with the red ball—only to freeze as the grandfather clock's pendulum creaked on its own, swinging faster and faster in the midnight silence.
The response, in this case, follows the structure of our context. With that, we have gone from undirected linear prompts to structured contexts. Instead of asking, we are telling, and the model responds with narrative discipline. The next step is to formalize this process into a semantic blueprint.
Level 5, the semantic blueprint, represents the full realization of context architecture. Here, we provide the model with a precise and unambiguous plan using a structured format. The creative act becomes a reliable engineering process, guided by semantic roles: the scene goal, the participants, their descriptions, the specific action to complete, the agent (who performs the action), and the patient (who is most affected by the action).
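To make the role-based structure concrete, the Level 4 input shown earlier can be assembled from its parts rather than typed by hand. The following is a minimal sketch, not a prescribed implementation: the function name build_role_context and its parameters are hypothetical, and the only thing it does is reproduce the Goal / BACKGROUND / continuation layout as a string that could be pasted into any copilot.

```python
def build_role_context(goal: str, background: dict[str, str], continuation: str) -> str:
    """Assemble a role-based context: a goal, labeled background entries,
    and the text the model should continue (hypothetical helper)."""
    lines = [f"Goal: {goal}", "---", "BACKGROUND:"]
    # One bullet per role, in the order the caller supplied them.
    lines += [f"- {label}: {value}" for label, value in background.items()]
    lines += ["---", f"Continue the story from this point: {continuation}"]
    return "\n".join(lines)


# The same roles used in the Level 4 example.
background = {
    "Protagonist": "Onyx, a brave black cat.",
    "Antagonist": "An ancient, haunted grandfather clock.",
    "Catalyst Object": "A mysterious red ball.",
    "Setting": "A dark, dusty room at midnight.",
    "Preceding Action": "The clock chimed, and the ball rolled out.",
}

prompt = build_role_context(
    goal="Write one suspenseful sentence to continue a children's story.",
    background=background,
    continuation='"The black cat, Onyx, crouched low... he then played with the..."',
)
print(prompt)
```

Keeping the roles in a dictionary means a single entry, such as the antagonist, can be swapped without rewriting the rest of the context.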
Our input will be as follows:
TASK: Generate a single, suspenseful sentence.
---
SEMANTIC BLUEPRINT:
{
  "scene_goal": "Increase tension by showing defiance",
  "participants": [
    { "name": "Onyx", "role": "Agent", "description": "black cat" },
    { "name": "Red Ball", "role": "Patient", "description": "mysterious" },
    { "name": "Grandfather Clock", "role": "Source_of_Threat", "description": "ancient, looming" }
  ],
  "action_to_complete": {
    "predicate": "play with",
    "agent": "Onyx",
    "patient": "Red Ball"
  }
}
---
SENTENCE TO COMPLETE: "He then played with the..."
Gemini 2.5's response is this:
He then played with the red ball, batting it with deliberate slowness directly under the shadow of the looming grandfather clock as its heavy pendulum began the first inexorable swing toward midnight.
The output follows the blueprint exactly. We are no longer improvising; the model is executing a defined plan.
Here is Microsoft Copilot's response:
He then played with the red ball, his shadow flickering defiantly beneath the looming tick of the grandfather clock, as if daring time itself to strike.
The story now has a suspenseful tone that effectively captures our instructions.
OpenAI GPT-5's response is this:
He then played with the Red Ball, its echoing bounce defying the relentless tick of the Grandfather Clock that seemed to grow louder with every daring swipe of Onyx's paw.
The suspenseful tone mirrors the goal. The structure we provided carries through into the narrative. We're obtaining what we told the LLM to do.
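Because the blueprint is plain JSON, it can be built and checked in code before it is sent to a model. The sketch below is one possible approach under stated assumptions: build_blueprint_prompt is a hypothetical helper, and the consistency check (the agent and patient of the action must be declared participants) is an illustrative safeguard we add here, not a requirement of any copilot.

```python
import json

# The Level 5 blueprint expressed as a Python dictionary.
BLUEPRINT = {
    "scene_goal": "Increase tension by showing defiance",
    "participants": [
        {"name": "Onyx", "role": "Agent", "description": "black cat"},
        {"name": "Red Ball", "role": "Patient", "description": "mysterious"},
        {"name": "Grandfather Clock", "role": "Source_of_Threat",
         "description": "ancient, looming"},
    ],
    "action_to_complete": {
        "predicate": "play with",
        "agent": "Onyx",
        "patient": "Red Ball",
    },
}


def build_blueprint_prompt(task: str, blueprint: dict, sentence: str) -> str:
    """Render a task, a semantic blueprint, and a sentence stub as one prompt
    (hypothetical helper; the validation rule is an assumption of this sketch)."""
    names = {p["name"] for p in blueprint["participants"]}
    action = blueprint["action_to_complete"]
    # Refuse to build a prompt whose action refers to an undeclared participant.
    for role in ("agent", "patient"):
        if action[role] not in names:
            raise ValueError(f"{role} {action[role]!r} is not a declared participant")
    return "\n".join([
        f"TASK: {task}",
        "---",
        "SEMANTIC BLUEPRINT:",
        json.dumps(blueprint, indent=2),
        "---",
        f'SENTENCE TO COMPLETE: "{sentence}"',
    ])


prompt = build_blueprint_prompt(
    task="Generate a single, suspenseful sentence.",
    blueprint=BLUEPRINT,
    sentence="He then played with the...",
)
print(prompt)
```

Validating the blueprint before use catches a mismatch, such as an agent name that no participant carries, before the model has a chance to improvise around it.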
At this stage, we are no longer spectators of the model’s improvisation. We are directors, and the LLM is the actor working from our script. But how does a semantic blueprint such as the one in Level 5 work? To answer that, we turn to Semantic Role Labelling (SRL), a method that will take us on our first journey from linear sequences of language to multidimensional semantic structures.