AI Workloads & Capabilities

Exam objective: Identify AI workloads and their features

Overview

AI-901 tests your ability to match a real-world scenario to the appropriate AI workload type and Azure capability. The workload types covered are text analysis, speech, computer vision, information extraction, generative AI, and agentic AI.

Key Concepts

Common AI Workload Types

Workload	What it does	Key Azure capability
Text analysis	Extract meaning from text	Azure AI Language
Speech	Convert between audio and text	Azure AI Speech (via Foundry Tools)
Computer vision	Interpret visual content	Multimodal models, Azure AI Vision
Information extraction	Pull structured data from documents, images, audio, video	Azure Content Understanding
Generative AI	Create new text, images, or code from prompts	Azure OpenAI (GPT, DALL-E)
Agentic AI	AI that takes multi-step actions using tools to complete a goal	Azure AI Agents (Foundry)
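
To practice matching a scenario to a workload, the table above can be turned into a small lookup. This is a study aid only, not how any Azure service classifies requests. The cue words in WORKLOAD_CUES and the function name match_workload are assumptions made up for this sketch.

```python
# Hypothetical cue words for each workload type; chosen for illustration only.
WORKLOAD_CUES = {
    "Text analysis": ["sentiment", "review", "key phrases", "summarize"],
    "Speech": ["transcribe", "spoken", "voice", "read aloud"],
    "Computer vision": ["photo", "image", "camera"],
    "Information extraction": ["invoice", "form", "receipt", "fields"],
    "Generative AI": ["draft", "generate", "write a"],
    "Agentic AI": ["book", "multi-step", "on my behalf"],
}


def match_workload(scenario):
    """Return the workload whose cue words appear most often in the scenario."""
    text = scenario.lower()
    # Count how many cue words of each workload occur as substrings.
    scores = {
        workload: sum(cue in text for cue in cues)
        for workload, cues in WORKLOAD_CUES.items()
    }
    best = max(scores, key=scores.get)
    # If nothing matched, do not guess.
    return best if scores[best] > 0 else "Unknown"


if __name__ == "__main__":
    print(match_workload("Transcribe spoken customer calls"))
    print(match_workload("Pull the total and vendor fields from a scanned invoice"))
```

Real exam scenarios often mix cues (for example, a scanned invoice is both an image and a document), so treat the output as a prompt to reason about the primary goal of the scenario rather than as an answer key.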

Text Analysis Techniques

Technique	Description	Example
Key phrase (keyword) extraction	Identify the most important words and phrases	"Azure, AI, certification" from an article
Entity detection (NER)	Recognize named entities — people, places, dates, organizations	"Microsoft" → Organization, "Seattle" → Location
Sentiment analysis	Determine the emotional tone — positive, negative, neutral, mixed	Product review → "positive"
Summarization	Condense long text into a shorter version	Summarize a 5-page report into 3 sentences
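
To make the input and output of two of these techniques concrete, here is a toy sketch in plain Python. It is not Azure AI Language and does not reflect how that service works internally: the word lists, the frequency-based keyword ranking, and the function names top_keywords and sentiment are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Tiny hand-made word lists; a real service uses trained models instead.
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "for", "in", "it", "was", "but"}
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "slow"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_keywords(text, n=3):
    """Return the n most frequent tokens that are not stopwords."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]


def sentiment(text):
    """Label text as positive, negative, neutral, or mixed using the word lists."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    review = "Great screen but slow shipping"
    print(top_keywords(review), sentiment(review))
```

The point to retain for the exam is the shape of each task: keyword extraction returns a list of terms, while sentiment analysis returns one label from a fixed set.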

Speech Capabilities

Capability	Description
Speech recognition (STT)	Convert spoken audio to text
Speech synthesis (TTS)	Convert text to natural-sounding speech
Speaker recognition	Identify or verify a speaker from their voice
Real-time translation	Translate spoken language in near real time

Computer Vision Capabilities

Capability	Description
Image analysis	Describe image content, detect objects, read text (OCR)
Image captioning	Generate a natural language description of an image
Image generation	Create a new image from a text prompt (DALL-E)
Visual Q&A	Answer questions about an image using a multimodal model

Information Extraction (Azure Content Understanding)

Source	What can be extracted
Documents & forms	Fields, tables, key-value pairs, signatures
Images	Text, objects, structured data in visual layout
Audio	Transcription, speaker turns, topics
Video	Transcription, scenes, faces, on-screen text
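
The idea of turning unstructured content into fields can be illustrated with a minimal sketch. It assumes the document has already been converted to plain text with one "Label: value" pair per line, which is a simplification: Azure Content Understanding works on the original files and this code does not call it. The function name extract_fields and the sample invoice are hypothetical.

```python
import re


def extract_fields(document):
    """Collect 'Label: value' lines into a dictionary of key-value pairs."""
    fields = {}
    for line in document.splitlines():
        # The label is everything before the first colon; the value is the rest.
        match = re.match(r"\s*([^:]+?)\s*:\s*(.+?)\s*$", line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


if __name__ == "__main__":
    sample = "Invoice Number: INV-001\nTotal: 42.50 USD\nThank you"
    print(extract_fields(sample))
```

Lines without a colon, such as the closing "Thank you", are ignored, which mirrors the distinction between extracted fields and the surrounding free text.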

Agentic AI

An agent is an AI system that can plan and take multi-step actions to complete a goal. Unlike a simple chatbot, an agent can:

Use tools (e.g., search the web, run code, call an API)
Maintain state across multiple steps
Plan sequences of actions to achieve an objective

In Foundry, you can create a single-agent solution that combines a deployed model with tools.
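
The three agent traits listed above (tool use, state, and planning) can be sketched in a few lines of plain Python. This is not the Foundry agent API: the tools, the fixed plan, and every name here (search_catalog, apply_discount, plan, run_agent) are hypothetical, and in a real agent a deployed model would choose the steps rather than a hard-coded list.

```python
# Hypothetical tools: ordinary functions the agent is allowed to call.
def search_catalog(item):
    """Look up a price in a made-up catalog."""
    prices = {"laptop": 1200.0, "mouse": 25.0}
    return prices[item]


def apply_discount(price):
    """Apply a fixed 10% discount."""
    return round(price * 0.9, 2)


TOOLS = {"search_catalog": search_catalog, "apply_discount": apply_discount}


def plan(goal):
    """Stand-in for the model's planning step: returns a fixed tool sequence."""
    return ["search_catalog", "apply_discount"]


def run_agent(goal, item):
    """Run each planned tool in order, carrying state from step to step."""
    state = {"goal": goal, "value": item, "history": []}
    for tool_name in plan(goal):
        # Each tool receives the previous step's result.
        state["value"] = TOOLS[tool_name](state["value"])
        state["history"].append((tool_name, state["value"]))
    return state


if __name__ == "__main__":
    print(run_agent("Find the discounted price", "laptop"))
```

A simple chatbot would stop after producing one reply. The loop above differs in that each step's result feeds the next, and the history records what the agent did to reach the goal.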

Study Resources

📖 Identify common AI workloads — Microsoft Learn
📖 Azure AI Language documentation
📖 Introduction to Azure AI Agents
📖 Azure Content Understanding overview
