Context Engineering Theory
How the LLM Works
Everything Copilot does starts with text. You send a message, and behind the scenes, that message — along with a lot of other content — gets assembled into a single string and sent to the language model. The model reads it, predicts what should come next, and that prediction is Copilot's response.
That string is called the context. It's literally what gets fed into the model. And because the model can only work from what is in its input, context is everything. Better context → better output. The wrong context → hallucinations, mistakes, missed requirements.
Copilot's job is to assemble the right context string for every request. It pulls together:
System-level content: the system prompt, descriptions of available tools, your instructions file, skill descriptions
User-level content: your current message, the conversation history, the results of any tool calls Copilot has already made
When you write an instructions file or a skill, you're contributing to that context string. You're shaping what the model sees, and therefore what it does.
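To make the idea concrete, here is a minimal sketch of how those pieces could be joined into one string. It is not Copilot's actual implementation: the `ContextParts` class, the `assemble_context` function, the section titles, and the ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ContextParts:
    """Hypothetical container for the system-level and user-level pieces."""
    system_prompt: str
    tool_descriptions: list[str]
    instructions_file: str
    skill_descriptions: list[str]
    history: list[str] = field(default_factory=list)
    tool_results: list[str] = field(default_factory=list)


def assemble_context(parts: ContextParts, user_message: str) -> str:
    """Join system-level content, then user-level content, into one string."""
    sections = [
        ("SYSTEM PROMPT", [parts.system_prompt]),
        ("TOOLS", parts.tool_descriptions),
        ("INSTRUCTIONS", [parts.instructions_file]),
        ("SKILLS", parts.skill_descriptions),
        ("HISTORY", parts.history),
        ("TOOL RESULTS", parts.tool_results),
        ("USER", [user_message]),
    ]
    blocks = []
    for title, items in sections:
        if items:  # skip sections that have nothing in them yet
            blocks.append(f"## {title}\n" + "\n".join(items))
    return "\n\n".join(blocks)


parts = ContextParts(
    system_prompt="You are a coding assistant.",
    tool_descriptions=["read_file(path): return the contents of a file"],
    instructions_file="Use 4-space indentation.",
    skill_descriptions=["release-notes: how to draft release notes"],
)
context = assemble_context(parts, "Add a docstring to utils.py")
print(context)
```

Notice that the instructions file is just one more block of text in the result. Anything you write there sits in the same string as your message, which is why it can change what the model does.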
The Agent Loop
Here's what actually happens when you send Copilot a message:
1. Copilot assembles the context string and sends it to the LLM
2. The LLM decides: is the task done? If not, it picks a tool: read a file, write a file, run a terminal command, load a skill
3. The tool runs, and its output gets added back to the context string
4. The updated context string goes back to the LLM
5. Repeat until the LLM decides the task is complete
This is why Copilot can do complex multi-step work. It's not solving the problem in one pass — it's going around this loop many times, accumulating context as it goes, until it's confident the work is done.
Skills slot into step 2. When the LLM decides a skill is relevant, it loads the skill content into the context string. That's how a skill influences behavior: it adds more text to the context, and that text shapes what the LLM does next.
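The loop is easier to see in code. The sketch below is a toy version under stated assumptions: `fake_llm` is a stand-in that picks actions with simple string checks instead of calling a real model, and the two tools in `TOOLS` are hypothetical. None of these names come from Copilot itself.

```python
from typing import Callable

# Hypothetical tools: each takes one string argument and returns text.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"(contents of {path})",
    "load_skill": lambda name: f"(full text of skill '{name}')",
}


def fake_llm(context: str) -> dict:
    """Stand-in for the model: chooses the next action from what the context holds."""
    if "(full text of skill" not in context:
        return {"tool": "load_skill", "arg": "release-notes"}
    if "(contents of" not in context:
        return {"tool": "read_file", "arg": "CHANGELOG.md"}
    return {"done": True, "answer": "Release notes drafted."}


def agent_loop(context: str, max_turns: int = 10) -> tuple[str, str]:
    """Run the loop until the stand-in model reports that the task is done."""
    for _ in range(max_turns):
        decision = fake_llm(context)        # send the current context to the LLM
        if decision.get("done"):            # the LLM decided the task is complete
            return decision["answer"], context
        tool = TOOLS[decision["tool"]]      # otherwise the LLM picked a tool
        output = tool(decision["arg"])      # the tool runs ...
        context += f"\n[{decision['tool']}] {output}"  # ... and its output joins the context
    raise RuntimeError("gave up: max_turns reached")


answer, final_context = agent_loop("USER: draft release notes")
print(answer)
print(final_context)
```

Here the skill is loaded the same way as any other tool result: its text is appended to the context, and the next decision is made with that text in view.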
The Context Window
The context window is the maximum amount of text the LLM can process at once. Think of it as a fixed-size container. Everything has to fit: the system prompt, your instructions file, the conversation history, tool outputs, skill content, and the space reserved for Copilot's response.
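A rough budget calculation shows how quickly the container fills. Everything numeric here is an assumption for illustration: the window size, the response reserve, and the "about 4 characters per token" rule of thumb, which is only a heuristic and not a real tokenizer.

```python
CONTEXT_WINDOW_TOKENS = 128_000  # assumed window size, for illustration only
RESPONSE_RESERVE_TOKENS = 4_000  # assumed space kept free for the reply


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about 4 characters per token. Not a real tokenizer."""
    return max(1, len(text) // 4)


def remaining_budget(pieces: dict[str, str]) -> int:
    """Tokens left after every piece and the response reserve are counted."""
    used = sum(estimate_tokens(text) for text in pieces.values())
    return CONTEXT_WINDOW_TOKENS - RESPONSE_RESERVE_TOKENS - used


# Placeholder text of various lengths stands in for real content.
pieces = {
    "system_prompt": "x" * 8_000,
    "instructions_file": "x" * 2_000,
    "history": "x" * 400_000,
    "tool_outputs": "x" * 60_000,
}
left = remaining_budget(pieces)
print(f"{left} tokens left")
```

In this made-up example the conversation history is by far the largest piece, which matches what tends to happen in a long session: history and tool outputs grow while the window size stays fixed.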
When the context window fills up, Copilot starts to lose track of things. It might forget earlier instructions, repeat itself, or make decisions that seem inconsistent with what you asked. This is called context rot.
When the window gets close to full, Copilot compacts the context: it summarizes older parts of the conversation to free up space. This helps, but it's not perfect, because the summary loses some detail.
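Here is a small sketch of what compaction might look like. It is an illustration under assumptions, not Copilot's real behavior: the `summarize` function is a crude placeholder that truncates each message (a real system would presumably ask the model to write the summary), and the `max_chars` and `keep_recent` parameters are hypothetical.

```python
def summarize(messages: list[str]) -> str:
    """Placeholder summary: keeps only the first 20 characters of each message."""
    return "SUMMARY: " + " | ".join(m[:20] for m in messages)


def compact(history: list[str], max_chars: int, keep_recent: int = 2) -> list[str]:
    """If the history is over budget, replace all but the newest messages with a summary."""
    total = sum(len(m) for m in history)
    if total <= max_chars or len(history) <= keep_recent:
        return history
    cut = len(history) - keep_recent
    older, recent = history[:cut], history[cut:]
    return [summarize(older)] + recent


history = [
    "User: always use tabs for indentation in this repository, never spaces",
    "Assistant: understood, I will use tabs",
    "User: now refactor parser.py",
    "Assistant: done, see the diff",
]
compacted = compact(history, max_chars=100)
print(compacted)
```

The two newest messages survive word for word, but the early instruction is cut down to a fragment. That is the kind of lost detail that makes a fresh conversation the safer option for long tasks.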
What this means practically:
Keep your instructions file concise — it's loaded with every message
Start a new conversation if you've been working in one for a long time
For very long tasks, break them into smaller conversations
✏️ Exercise: Check the Context Window
To see where you stand, look at the context window indicator at the bottom of the Copilot chat panel. Click on it to see how full the context window currently is.
If the window is getting full, it's a good time to start a fresh conversation before things start going sideways.
Key Takeaways
Skills: markdown files in .github/skills/ that define how to do something. Copilot invokes them automatically when relevant, or you can call them with /skill-name. The description is what triggers auto-invocation.
Instructions files: CLAUDE.md or .github/copilot-instructions.md at the project root. Loaded with every request. They define the universe Copilot operates in.
Context: the full string of text sent to the LLM. Everything you do in context engineering is shaping this string.
Context window: finite capacity. Keep instructions lean and start fresh conversations when things get long.
Context rot: what happens when the window fills up. Copilot loses focus and consistency.