Information Extraction with Foundry
Exam objectives:
- Extract information from documents and forms using Azure Content Understanding in Foundry Tools
- Extract information from images using Content Understanding
- Extract information from audio and video using Content Understanding
- Build a lightweight application with information extraction capabilities using Content Understanding
Overview
Azure Content Understanding is a newer Azure AI service (available through Foundry Tools) that extracts structured information from a wide variety of content types: documents, images, audio, and video. It combines capabilities previously spread across multiple services into a unified API.
This is a high-value topic for AI-901: it's newer, Foundry-native, and likely to be tested.
Key Concepts
What is Azure Content Understanding?
Content Understanding uses AI models to extract structured, queryable data from unstructured content. You define an analyzer (or use a pre-built one) that specifies what fields to extract, and then submit content for analysis.
| Content type | What can be extracted |
|---|---|
| Documents & forms | Fields, tables, key-value pairs, checkboxes, signatures |
| Images | Text (OCR), objects, layout, structured visual data |
| Audio | Transcription, speaker diarization (who said what), topics, sentiment |
| Video | Transcription, scene detection, on-screen text, faces, objects |
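Because the extractable information depends on the content type, an application usually has to decide which of the four categories a file belongs to before submitting it. The following is a minimal sketch, assuming the file extension is a reliable hint. The `content_category` helper and its category names are illustrative only and are not part of any Azure SDK.

```python
import mimetypes


def content_category(filename: str) -> str:
    """Map a file name to one of the four content types in the table above."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        # No recognizable extension: let the caller decide what to do.
        return "unknown"
    top_level = mime.split("/", 1)[0]
    if top_level in ("image", "audio", "video"):
        return top_level
    # In this sketch every other MIME type (PDF, Office files, plain text)
    # is treated as a document.
    return "document"


for name in ("invoice.pdf", "scan.png", "call.mp3", "demo.mp4"):
    print(name, "->", content_category(name))
```

A real application would probably also inspect the file contents, since extensions can be wrong or missing.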
Core Concepts
| Concept | Description |
|---|---|
| Analyzer | A configuration that defines what to extract from content |
| Pre-built analyzer | A ready-made analyzer for common document types (invoices, receipts, IDs) |
| Custom analyzer | Trained on your own documents for domain-specific extraction |
| Field | A named piece of data extracted from content (e.g., InvoiceTotal, CustomerName) |
| Confidence score | A number (0–1) indicating how confident the model is in an extracted value |
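Confidence scores are typically used to decide which extracted values can be accepted automatically and which should go to a human reviewer. The sketch below illustrates that idea under stated assumptions: the `{"value": ..., "confidence": ...}` shape is a simplified stand-in rather than the service's exact response schema, the `split_by_confidence` helper and the 0.8 threshold are hypothetical, and the sample values are made up.

```python
from typing import Any


def split_by_confidence(fields: dict[str, dict[str, Any]], threshold: float = 0.8):
    """Separate extracted fields into accepted values and ones needing review."""
    accepted, needs_review = {}, {}
    for name, field in fields.items():
        # A missing confidence is treated as 0.0 so the field gets reviewed.
        confidence = field.get("confidence", 0.0)
        target = accepted if confidence >= threshold else needs_review
        target[name] = field.get("value")
    return accepted, needs_review


# Made-up sample data using the field names from the table above.
sample = {
    "InvoiceTotal": {"value": 1250.00, "confidence": 0.97},
    "CustomerName": {"value": "Contoso Ltd", "confidence": 0.62},
}
accepted, needs_review = split_by_confidence(sample)
print(accepted)      # {'InvoiceTotal': 1250.0}
print(needs_review)  # {'CustomerName': 'Contoso Ltd'}
```

The right threshold depends on how costly a wrong value is; a payment amount would usually warrant a stricter cutoff than a free-text note.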
Building an Information Extraction App
# Note: exact SDK module names and REST details may vary; check the latest Azure Content Understanding docs
# The general pattern for Content Understanding:
# 1. Create a client with your endpoint and key
# 2. Submit content (document URL, image URL, audio file, video URL)
# 3. Poll for results
# 4. Parse the extracted fields from the response
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com/"
key = "<your-key>"
headers = {
    "Ocp-Apim-Subscription-Key": key,
    "Content-Type": "application/json",
}

# Submit a document for analysis
payload = {
    "url": "https://example.com/invoice.pdf"
}
response = requests.post(
    f"{endpoint}contentunderstanding/analyzers/prebuilt-invoice:analyze?api-version=2024-12-01-preview",
    headers=headers,
    json=payload,
)
response.raise_for_status()  # Fail early on authentication or request errors

# Analysis runs asynchronously. The URL to poll is assumed here to arrive in the
# Operation-Location response header; confirm the header name in the current docs.
operation_url = response.headers.get("Operation-Location")
# Poll operation_url until the status is "Succeeded", then read the extracted fields from the result
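The polling step can be sketched as a small helper that is independent of any particular HTTP library. This is an illustration under assumptions, not the service's official client: `poll_until_done` is a hypothetical name, and the status strings are compared case-insensitively because the exact casing returned by the service should be confirmed in the docs.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    timeout: float = 120.0,
) -> dict:
    """Call fetch_status repeatedly until the operation succeeds or fails."""
    deadline = time.monotonic() + timeout
    while True:
        body = fetch_status()
        status = str(body.get("status", "")).lower()
        if status == "succeeded":
            return body
        if status == "failed":
            raise RuntimeError(f"Analysis failed: {body}")
        if time.monotonic() >= deadline:
            raise TimeoutError("Analysis did not finish before the timeout")
        # Wait before asking again to avoid hammering the service.
        time.sleep(interval)


# Simulated responses stand in for real HTTP calls in this example.
fake_responses = iter([
    {"status": "Running"},
    {"status": "Succeeded", "result": {}},
])
final = poll_until_done(lambda: next(fake_responses), interval=0.0)
print(final["status"])  # Succeeded
```

With `requests`, the callable could be something like `lambda: requests.get(status_url, headers=headers).json()`, where `status_url` is whatever polling URL the service returned for the operation.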
Pre-built Analyzers
| Analyzer | What it extracts |
|---|---|
| prebuilt-invoice | Vendor, items, totals, dates, PO numbers |
| prebuilt-receipt | Merchant, items, totals, payment method |
| prebuilt-idDocument | Name, address, ID number, date of birth |
| prebuilt-businessCard | Name, company, phone, email, address |
| prebuilt-read | All text (OCR) from any document |
| prebuilt-layout | Text + tables + selection marks with position |
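In an application, the analyzer ID is usually chosen from the kind of document being processed. The sketch below maps a document kind to the analyzer IDs listed in the table and builds the analyze URL in the same shape as the request shown earlier. The `PREBUILT_ANALYZERS` keys and the `analyze_url` helper are hypothetical, and the URL path and API version simply mirror this guide's example, so they should be checked against the current documentation.

```python
# Hypothetical short names mapped to the analyzer IDs from the table above.
PREBUILT_ANALYZERS = {
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id": "prebuilt-idDocument",
    "business_card": "prebuilt-businessCard",
    "text": "prebuilt-read",
    "layout": "prebuilt-layout",
}


def analyze_url(endpoint: str, doc_kind: str,
                api_version: str = "2024-12-01-preview") -> str:
    """Build the analyze URL for a document kind (path assumed from this guide)."""
    try:
        analyzer_id = PREBUILT_ANALYZERS[doc_kind]
    except KeyError:
        raise ValueError(f"No pre-built analyzer known for {doc_kind!r}") from None
    # Strip a trailing slash so the path is joined with exactly one "/".
    base = endpoint.rstrip("/")
    return (f"{base}/contentunderstanding/analyzers/"
            f"{analyzer_id}:analyze?api-version={api_version}")


print(analyze_url("https://<your-resource>.cognitiveservices.azure.com/", "receipt"))
```

Raising an error for unknown kinds, rather than silently falling back to a general analyzer, makes a misrouted document visible early.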
Azure Services & Foundry Features
| Service | Notes |
|---|---|
| Azure Content Understanding | The primary service for this exam topic |
| Azure AI Document Intelligence | Earlier service for document-focused extraction; its pre-built models are still relevant |
| Foundry Tools | Access Content Understanding analyzers through the Foundry portal |