Generative AI & Agentic Solutions
Exam weight: 30โ35% โ the highest weighted domain
Overviewโ
This is the core of AI-103. You need to be able to build generative AI applications using Foundry SDKs, implement RAG pipelines, create agents with tools, and orchestrate multi-agent workflows โ all with responsible AI controls baked in.
Key Conceptsโ
Generative AI Applicationsโ
| Concept | Description |
|---|---|
| RAG (Retrieval-Augmented Generation) | Pattern: retrieve relevant context โ inject into prompt โ generate grounded response |
| Prompt Flow | Visual/code pipeline tool in Foundry for orchestrating LLM + tools + logic |
| Grounding | Connecting a model to your own data to reduce hallucinations |
| Evaluation | Measuring output quality โ relevance, groundedness, fluency, safety |
| Fine-tuning | Additional training of a base model on your domain data |
| Tool-augmented flow | A flow that calls external tools (search, functions, APIs) mid-generation |
RAG Pipeline (end-to-end)โ
1. Ingest documents โ chunk into segments
โ
2. Embed chunks โ vector representations
โ
3. Store in Azure AI Search (vector index)
โ
4. At query time: embed user question
โ
5. Retrieve top-k matching chunks
โ
6. Inject chunks into system prompt as context
โ
7. LLM generates a grounded response
Agentsโ
| Concept | Description |
|---|---|
| Tool | A function or service the agent can call โ search, code execution, APIs |
| Instructions | System-level prompt defining the agent's goal, persona, and constraints |
| Memory | Conversation history and/or external state the agent can access |
| Thread | A conversation session โ multiple messages, one agent |
| Run | One execution cycle of the agent against a thread |
| Multi-agent | Multiple specialized agents orchestrated together to complete a complex task |
| Safeguards | Constraints on what tools an agent can call and when it needs human approval |
Agent Code Patternโ
from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import BingGroundingTool
from azure.identity import DefaultAzureCredential
client = AIProjectClient(
endpoint="https://<your-project>.services.ai.azure.com",
credential=DefaultAzureCredential()
)
# Create an agent with tools
agent = client.agents.create_agent(
model="gpt-4o",
name="research-agent",
instructions="You are a research assistant. Search the web to answer questions accurately.",
tools=BingGroundingTool(connection_id="<bing-connection-id>").definitions,
)
# Create a thread and run it
thread = client.agents.create_thread()
client.agents.create_message(thread.id, role="user", content="What are the latest Azure AI announcements?")
run = client.agents.create_and_process_run(thread.id, agent.id)
messages = client.agents.list_messages(thread.id)
print(messages.data[0].content[0].text.value)
# Clean up
client.agents.delete_agent(agent.id)
Optimization & Observabilityโ
| Technique | What it does |
|---|---|
| Prompt engineering | Craft better system/user prompts to improve output quality |
| Temperature / top-p tuning | Control creativity vs. determinism |
| Chain-of-thought | Ask the model to reason step-by-step before answering |
| Model reflection | Have the model critique and improve its own output |
| Tracing | Record full input/output/tool-call chains for debugging |
| Token analytics | Monitor token usage, latency, and cost |
Azure Services & Foundry Featuresโ
| Feature | Location |
|---|---|
| Model catalog + deployments | Foundry portal |
| Playground | Foundry portal โ test prompts interactively |
| Prompt Flow | Foundry portal โ build and evaluate LLM pipelines |
| Agents | Foundry portal + azure-ai-projects SDK |
| Evaluations | Foundry portal + SDK |
| Azure AI Search | Connected via Foundry project connection |