Generative AI & Agentic Solutions

Exam weight: 30–35% — the highest weighted domain

Overview

This is the core of AI-103. You need to be able to build generative AI applications using Foundry SDKs, implement RAG pipelines, create agents with tools, and orchestrate multi-agent workflows — all with responsible AI controls baked in.

Key Concepts

Generative AI Applications

Concept	Description
RAG (Retrieval-Augmented Generation)	Pattern: retrieve relevant context → inject into prompt → generate grounded response
Prompt Flow	Visual/code pipeline tool in Foundry for orchestrating LLM + tools + logic
Grounding	Connecting a model to your own data to reduce hallucinations
Evaluation	Measuring output quality — relevance, groundedness, fluency, safety
Fine-tuning	Additional training of a base model on your domain data
Tool-augmented flow	A flow that calls external tools (search, functions, APIs) mid-generation

RAG Pipeline (end-to-end)

1. Ingest documents → chunk into segments
       ↓
2. Embed chunks → vector representations
       ↓
3. Store in Azure AI Search (vector index)
       ↓
4. At query time: embed user question
       ↓
5. Retrieve top-k matching chunks
       ↓
6. Inject chunks into system prompt as context
       ↓
7. LLM generates a grounded response

Agents

Concept	Description
Tool	A function or service the agent can call — search, code execution, APIs
Instructions	System-level prompt defining the agent's goal, persona, and constraints
Memory	Conversation history and/or external state the agent can access
Thread	A conversation session — multiple messages, one agent
Run	One execution cycle of the agent against a thread
Multi-agent	Multiple specialized agents orchestrated together to complete a complex task
Safeguards	Constraints on what tools an agent can call and when it needs human approval

Agent Code Pattern

from azure.ai.projects import AIProjectClient
from azure.ai.projects.models import BingGroundingTool
from azure.identity import DefaultAzureCredential

client = AIProjectClient(
    endpoint="https://<your-project>.services.ai.azure.com",
    credential=DefaultAzureCredential()
)

# Create an agent with tools
agent = client.agents.create_agent(
    model="gpt-4o",
    name="research-agent",
    instructions="You are a research assistant. Search the web to answer questions accurately.",
    tools=BingGroundingTool(connection_id="<bing-connection-id>").definitions,
)

# Create a thread and run it
thread = client.agents.create_thread()
client.agents.create_message(thread.id, role="user", content="What are the latest Azure AI announcements?")
run = client.agents.create_and_process_run(thread.id, agent.id)

messages = client.agents.list_messages(thread.id)
print(messages.data[0].content[0].text.value)

# Clean up
client.agents.delete_agent(agent.id)

Optimization & Observability

Technique	What it does
Prompt engineering	Craft better system/user prompts to improve output quality
Temperature / top-p tuning	Control creativity vs. determinism
Chain-of-thought	Ask the model to reason step-by-step before answering
Model reflection	Have the model critique and improve its own output
Tracing	Record full input/output/tool-call chains for debugging
Token analytics	Monitor token usage, latency, and cost

Azure Services & Foundry Features

Feature	Location
Model catalog + deployments	Foundry portal
Playground	Foundry portal — test prompts interactively
Prompt Flow	Foundry portal — build and evaluate LLM pipelines
Agents	Foundry portal + `azure-ai-projects` SDK
Evaluations	Foundry portal + SDK
Azure AI Search	Connected via Foundry project connection

Overview​

Key Concepts​

Generative AI Applications​

RAG Pipeline (end-to-end)​

Agents​

Agent Code Pattern​

Optimization & Observability​

Azure Services & Foundry Features​

Study Resources​