AI Workloads & Capabilities

Exam objective: Identify AI workloads and their features

Overview

AI-901 tests your ability to match a real-world scenario to the appropriate AI workload type and Azure capability. The workloads covered are text analysis, speech, computer vision, information extraction, generative AI, and agentic AI.

Key Concepts

Common AI Workload Types

| Workload | What it does | Key Azure capability |
| --- | --- | --- |
| Text analysis | Extract meaning from text | Azure AI Language |
| Speech | Convert between audio and text | Azure AI Speech (via Foundry Tools) |
| Computer vision | Interpret visual content | Multimodal models, Azure AI Vision |
| Information extraction | Pull structured data from documents, images, audio, video | Azure Content Understanding |
| Generative AI | Create new text, images, or code from prompts | Azure OpenAI (GPT, DALL-E) |
| Agentic AI | AI that takes multi-step actions using tools to complete a goal | Azure AI Agents (Foundry) |
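As a study aid, the scenario-to-workload matching described in the Overview can be sketched as a simple cue-phrase lookup. This is only an illustration: the cue phrases and the `match_workload` function are hypothetical, chosen for this example, and real exam scenarios need judgment rather than keyword matching.

```python
# Hypothetical cue phrases for each workload; illustrative only, not an official list.
WORKLOAD_CUES = {
    "Text analysis": ["sentiment", "key phrases", "named entities", "summarize"],
    "Speech": ["spoken", "transcribe", "read aloud", "audio"],
    "Computer vision": ["photo", "image", "caption", "detect objects"],
    "Information extraction": ["invoice", "receipt", "scanned", "fields"],
    "Generative AI": ["draft", "generate", "from a prompt"],
    "Agentic AI": ["multi-step", "use tools", "on the user's behalf"],
}


def match_workload(scenario: str) -> str:
    """Return the workload whose cue phrases appear most often in the scenario."""
    text = scenario.lower()
    scores = {
        workload: sum(cue in text for cue in cues)
        for workload, cues in WORKLOAD_CUES.items()
    }
    best = max(scores, key=scores.get)
    # If no cue phrase matched at all, do not guess.
    return best if scores[best] > 0 else "Unknown"


if __name__ == "__main__":
    print(match_workload("Transcribe spoken customer calls into text"))
    print(match_workload("Pull the total and vendor fields from a scanned invoice"))
```

When practicing, try writing your own cue phrases per workload; the act of choosing them is a useful way to check that you can tell the workloads apart.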

Text Analysis Techniques

| Technique | Description | Example |
| --- | --- | --- |
| Keyword extraction | Identify the most important words and phrases | "Azure, AI, certification" from an article |
| Entity detection (NER) | Recognize named entities: people, places, dates, organizations | "Microsoft" → Organization, "Seattle" → Location |
| Sentiment analysis | Determine the emotional tone: positive, negative, neutral, or mixed | Product review → "positive" |
| Summarization | Condense long text into a shorter version | Summarize a 5-page report into 3 sentences |
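To make the inputs and outputs of two of these techniques concrete, here is a deliberately naive sketch of keyword extraction (by word frequency) and sentiment analysis (by word lists). It does not show how Azure AI Language works internally; the word lists and function names are assumptions made for this example.

```python
import re
from collections import Counter

# Tiny hand-picked word lists; assumptions for illustration only.
STOPWORDS = {"the", "a", "an", "is", "and", "of", "to", "in", "for", "it", "this", "but", "i"}
POSITIVE = {"great", "good", "love", "excellent", "fast"}
NEGATIVE = {"bad", "poor", "slow", "hate", "broken"}


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_keywords(text: str, top_n: int = 3) -> list:
    """Return the top_n most frequent non-stopword tokens."""
    counts = Counter(t for t in tokenize(text) if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def classify_sentiment(text: str) -> str:
    """Label text positive, negative, mixed, or neutral using the word lists."""
    tokens = tokenize(text)
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    if pos and neg:
        return "mixed"
    if pos:
        return "positive"
    if neg:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    print(extract_keywords("Azure AI exam covers Azure AI workloads and Azure services", top_n=2))
    print(classify_sentiment("Great screen but the battery is poor"))
```

The point to retain for the exam is the shape of each result: keyword extraction returns a list of terms, while sentiment analysis returns one of four labels.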

Speech Capabilities

| Capability | Description |
| --- | --- |
| Speech recognition (STT) | Convert spoken audio to text |
| Speech synthesis (TTS) | Convert text to natural-sounding speech |
| Speaker recognition | Identify or verify a speaker from their voice |
| Real-time translation | Translate spoken language in near real-time |

Computer Vision Capabilities

| Capability | Description |
| --- | --- |
| Image analysis | Describe image content, detect objects, read text (OCR) |
| Image captioning | Generate a natural language description of an image |
| Image generation | Create a new image from a text prompt (DALL-E) |
| Visual Q&A | Answer questions about an image using a multimodal model |

Information Extraction (Azure Content Understanding)

| Source | What can be extracted |
| --- | --- |
| Documents & forms | Fields, tables, key-value pairs, signatures |
| Images | Text, objects, structured data in visual layout |
| Audio | Transcription, speaker turns, topics |
| Video | Transcription, scenes, faces, on-screen text |

Agentic AI

An agent is an AI system that can plan and take multi-step actions to complete a goal. Unlike a simple chatbot, an agent can:

  • Use tools (e.g., search the web, run code, call an API)
  • Maintain state across multiple steps
  • Plan sequences of actions to achieve an objective

In Foundry, you can create a single-agent solution that combines a deployed model with tools.
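The three agent abilities listed above (using tools, keeping state, and planning) can be sketched in plain Python. This is a minimal sketch, not the Foundry API: the tools, the `AgentState` class, and the fixed `plan` function are all hypothetical stand-ins, and in a real agent the deployed model would produce the plan.

```python
from dataclasses import dataclass, field


# Hypothetical tools: plain functions standing in for search, code execution, or API calls.
def lookup_price(item: str) -> float:
    prices = {"laptop": 1200.0, "mouse": 25.0}
    return prices[item]


def add_numbers(values: list) -> float:
    return sum(values)


TOOLS = {"lookup_price": lookup_price, "add_numbers": add_numbers}


@dataclass
class AgentState:
    """State carried across steps: the goal, results so far, and a log of tool calls."""
    goal: str
    observations: list = field(default_factory=list)
    log: list = field(default_factory=list)


def plan(items: list) -> list:
    """Fixed plan standing in for a model's planning step (an assumption of this sketch)."""
    steps = [("lookup_price", item) for item in items]
    steps.append(("add_numbers", None))
    return steps


def run_agent(items: list) -> AgentState:
    state = AgentState(goal=f"Total cost of {', '.join(items)}")
    for tool_name, arg in plan(items):
        tool = TOOLS[tool_name]
        # The final step has no argument of its own; it consumes earlier results from state.
        result = tool(state.observations) if arg is None else tool(arg)
        state.log.append((tool_name, result))
        if arg is not None:
            state.observations.append(result)
    return state


if __name__ == "__main__":
    final = run_agent(["laptop", "mouse"])
    for step in final.log:
        print(step)
```

A simple chatbot would answer in one step from the prompt alone; here the last step depends on results that earlier tool calls stored in the state, which is the distinction the exam objective is getting at.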

Study Resources