Skip to main content

AI Model Components & Configuration

Exam objective: Identify AI model components and configurations

Overviewโ€‹

Understanding how generative AI models work and how to select and configure them is a core skill for AI-901. The exam tests your ability to choose the right model for a task and understand the key configuration parameters that affect model behavior.

Key Conceptsโ€‹

How Generative AI Models Workโ€‹

ConceptDescription
Transformer architectureThe underlying architecture of most modern LLMs โ€” processes tokens using attention mechanisms
TokenThe unit of text a model processes (roughly ยพ of a word). Models have a context window (max tokens)
Pre-trainingTraining on massive text datasets to learn general language patterns
Fine-tuningAdditional training on a smaller, task-specific dataset to specialize the model
PromptThe input text you provide to a model to guide its output
Completion / ResponseThe text the model generates in response to a prompt
EmbeddingA numeric vector representation of text, used for semantic search and similarity

Choosing the Right Modelโ€‹

ScenarioAppropriate model type
Generate or summarize textGPT-4o, GPT-4, GPT-3.5-turbo
Analyze images + generate textMultimodal model (e.g., GPT-4o)
Convert text to imagesDALL-E
Convert speech to textWhisper
Semantic search, RAGEmbedding model (e.g., text-embedding-ada-002)

Key Configuration Parametersโ€‹

ParameterWhat it controls
TemperatureRandomness of output. 0 = deterministic/predictable, 1 = creative/varied
Max tokensMaximum length of the generated response
Top-p (nucleus sampling)Controls diversity โ€” the model picks from the top tokens whose probability sums to p
System promptSets the model's behavior, persona, and constraints for the entire conversation
Deployment nameThe name you give when deploying a model in Foundry โ€” used when calling the API

Model Deployment Options in Foundryโ€‹

OptionDescription
Standard deploymentPay-per-token, globally load-balanced
Provisioned throughputReserved capacity for predictable workloads
Serverless APIPay-per-use for models from the Foundry model catalog

Study Resourcesโ€‹