AIF-C01 Cheatsheet — AI & Generative AI Fundamentals, AWS Service Map, RAG & Prompt Patterns

High-signal AIF-C01 reference: AI/ML terminology, generative AI concepts (tokens, embeddings, RAG), AWS services (Bedrock, SageMaker and core AI services), prompt engineering patterns, evaluation rubrics, responsible AI, and security/governance essentials.

Keep this page open while drilling questions. AIF‑C01 rewards clean definitions, best-fit service selection, and risk-aware design (hallucinations, privacy, prompt injection, responsible use).


Quick facts (AIF-C01)

ItemValue
Questions65 (multiple-choice + multiple-response)
Time90 minutes
Passing score700 (scaled 100–1000)
Cost100 USD
DomainsD1 20% • D2 24% • D3 28% • D4 14% • D5 14%

How AIF-C01 questions work (fast strategy)

  • If the prompt says least operational effort, prefer managed services (and native integrations).
  • If the question is about “improving factual accuracy,” the best answer is often grounding (RAG + citations) rather than “make the model bigger.”
  • If the question is about “sensitive data,” the best answer usually includes least privilege, encryption, and minimizing what you send to the model.
  • If the scenario includes “untrusted user input,” think prompt injection defenses and safe tool use (allowlists, scoped permissions).
  • Read the last sentence first to capture the constraint (cost, latency, safety, compliance).

0) Core mental model: a GenAI app (with RAG)

    flowchart LR
	  U[User] --> A[App / API]
	  A -->|Prompt + context| FM[Foundation Model]
	  A -->|Embed query| E[Embeddings]
	  E --> VS[(Vector Store)]
	  VS -->|Top-k chunks| A
	  A -->|Policy filters| G[Guardrails / Moderation]
	  A -->|Logs/metrics| O[Observability]
	  A -->|AuthN/AuthZ| IAM[IAM / Identity]

RAG in one sentence: retrieve relevant private content, then ask the model to answer using only that content (ideally with citations).


1) AI/ML fundamentals (Domain 1)

Core terminology (must know)

TermExam-friendly meaning
AIBroad goal: machines doing tasks that appear intelligent (perception, language, planning).
MLSubset of AI: models learn patterns from data to make predictions/decisions.
Deep learningML with neural networks (often needs more data/compute; strong for vision/language).
Supervised learningLearn from labeled examples (classification/regression).
Unsupervised learningFind structure without labels (clustering, dimensionality reduction).
Reinforcement learningLearn actions via rewards/penalties (policies).
Feature / labelInput signal vs correct output.
Training vs inferenceFit the model vs use the model to predict/generate.
OverfittingGreat on training data, poor on new data (memorization).
Data leakageTraining sees information it shouldn’t (inflates metrics).
DriftData or reality changes → performance decays over time.

Metrics (common, conceptual)

Use caseUseful metricsWhat to watch for
ClassificationPrecision/recall/F1, ROC-AUCClass imbalance; false positives vs false negatives
RegressionMAE/MSE/RMSEOutliers; error tolerance
Ranking/retrievalPrecision@k / Recall@k“Did we retrieve the right things?”

ML lifecycle (high level)

    flowchart LR
	  P[Define problem + metric] --> D[Collect/prepare data]
	  D --> T[Train + tune]
	  T --> E[Evaluate]
	  E --> DEP[Deploy]
	  DEP --> M[Monitor + feedback]
	  M --> D

Common best answer patterns:

  • If you can’t define a metric or get data, ML is usually the wrong first move.
  • Production ML needs monitoring (quality/latency/cost) and retraining plans.

2) Generative AI fundamentals (Domain 2)

Key GenAI terms (must know)

TermExam-friendly meaning
LLMLanguage model that generates text from prompts.
TokensModel “chunks” of text; drives cost/limits.
Context windowMax tokens model can consider in one request.
EmbeddingsNumeric vectors that capture semantic meaning for similarity search.
Vector storeDatabase/index optimized for similarity search over embeddings.
RAGRetrieve relevant data and include it in the prompt to ground answers.
Temperature / top-pControls randomness vs determinism.
HallucinationOutput that sounds plausible but isn’t supported by facts.
Prompt injectionUntrusted text attempts to override instructions (“ignore previous”).

Prompting vs RAG vs fine-tuning (decision table)

NeedBest starting pointWhy
Better instructions/formatPrompt engineeringFast, cheap, reversible
Fresh/private knowledgeRAGGrounds answers in your content without retraining
Consistent style/behaviorFine-tuningTeach patterns; reduces prompt complexity
A completely new capabilityUsually not AIF-C01 scopeConsider specialist ML work

GenAI limitations to recognize

  • Factuality isn’t guaranteed → use grounding/citations and “unknown” responses.
  • Context is limited → don’t paste entire corpora; retrieve and summarize.
  • Outputs can be unsafe/biased → add guardrails, evaluation, and human review paths.
  • Costs scale with tokens → control prompt size, choose smaller models when acceptable, cache repeated work.

3) AWS service map (what to pick when)

Foundation models and ML platforms

You need…Typical AWS answer
Managed foundation model access for GenAI appsAmazon Bedrock
Build/train/tune/deploy custom ML modelsAmazon SageMaker
A GenAI assistant for work/dev tasksAmazon Q

Pre-built AI services (use-case driven)

Use caseTypical AWS service
Extract text/forms from documentsAmazon Textract
NLP (entities, sentiment, classification)Amazon Comprehend
Image/video analysisAmazon Rekognition
Speech-to-textAmazon Transcribe
TranslationAmazon Translate
Text-to-speechAmazon Polly
Chatbot interfacesAmazon Lex
Enterprise searchAmazon Kendra

Common building blocks for GenAI apps (glue)

NeedTypical AWS building blocks
Store docs and artifactsAmazon S3
Orchestrate workflowsAWS Step Functions
Serverless computeAWS Lambda
Containerized APIsAmazon ECS/Fargate or Amazon EKS
Vector searchAmazon OpenSearch Service, Aurora PostgreSQL with pgvector
Secrets and keysAWS Secrets Manager, AWS KMS
Audit + monitoringAWS CloudTrail, Amazon CloudWatch

4) RAG: design notes that show up in exam scenarios (Domain 3)

RAG architecture (end-to-end)

    flowchart TB
	  subgraph Ingestion
	    S3[(Docs in S3)] --> C[Chunk + clean]
	    C --> EMB1[Create embeddings]
	    EMB1 --> VS[(Vector store)]
	  end
	
	  subgraph Answering
	    Q[User question] --> EMB2[Embed query]
	    EMB2 --> VS
	    VS --> K[Top-k chunks]
	    K --> P[Prompt template: instructions + context]
	    P --> FM[Foundation model]
	    FM --> A[Answer + citations]
	  end

High-yield design choices

  • Chunking: smaller chunks improve precision; larger chunks improve context. The exam often wants “tune chunking for relevance.”
  • Citations: if the requirement says “trust” or “audit,” add citations/source links.
  • Freshness: if content changes often, prefer RAG over fine-tuning.
  • Privacy: don’t send more data than needed; redact PII; restrict who can retrieve what (multi-tenant boundaries).

5) Prompt engineering patterns (Domain 3)

Techniques you should recognize

TechniqueWhat it doesWhen to use
Clear instructions + constraintsReduces ambiguityMost questions
Few-shot examplesImproves formatting/edge casesStructured outputs
DelimitersSeparates instructions vs dataUntrusted input scenarios
Output schemaProduces predictable JSONApp integrations
Grounding instructionsReduces hallucinationsRAG and knowledge tasks
Refusal/escalationSafer behaviorPolicy/safety constraints

Prompt template (practical)

 1Goal: Answer the user question using ONLY the provided context.
 2Context:
 3<<<
 4{retrieved_chunks}
 5>>>
 6Rules:
 7- If the answer is not in the context, say "Insufficient context".
 8- Provide 2-3 bullet citations (source titles/ids).
 9Output format (JSON):
10{"answer":"...", "citations":[{"source":"...","quote":"..."}]}
11User question: {question}

Anti-prompt-injection rule of thumb

Treat user-provided text as data, not instructions. If the model is allowed to call tools/actions, use allowlists and scoped permissions.


6) Evaluation and monitoring (Domain 3)

What to evaluate

DimensionHow to test it (high level)
CorrectnessGold questions, expert review, spot checks
GroundednessRequire citations; verify claims against sources
SafetyToxicity/harm prompts; policy violations; refusal behavior
BiasCompare outcomes across groups; document disparities
ReliabilityRegression tests for prompt/model changes
Latency/costMeasure P50/P95 and token usage; set budgets

Common “best answers”:

  • Use a representative test set (not just a few demos).
  • Do A/B testing when changing prompts/models.
  • Monitor production for quality regressions and abuse.

7) Responsible AI (Domain 4)

Responsible AI checklist (high signal)

  • Define intended use + out-of-scope use (avoid “silent expansion”).
  • Add human oversight for high-impact decisions.
  • Evaluate for bias and document limitations.
  • Implement safety policies (harmful content, privacy leakage).
  • Be transparent with users (what it is, what it isn’t, how to verify).

Common risks and mitigations

RiskTypical mitigation
HallucinationsRAG + citations; “unknown” responses
Unsafe contentGuardrails/moderation + refusal behavior
Privacy leakageData minimization; redaction; access controls
Bias/unfairnessDiverse evaluation sets; monitoring and remediation
Over-trustUser messaging + explainability + source links

8) Security, compliance, and governance (Domain 5)

Security “gotchas” the exam expects you to notice

  • Over-permissive IAM roles (“*” actions/resources)
  • Secrets embedded in prompts, logs, or code
  • Sending unnecessary sensitive data to the model
  • No audit trail for access and changes
  • Tool use without constraints (model can “do anything”)

AWS controls to name in answers (by theme)

ThemeCommon AWS controls
IdentityIAM roles/policies, least privilege
EncryptionAWS KMS, TLS
SecretsAWS Secrets Manager
NetworkVPC endpoints/PrivateLink, security groups
AuditAWS CloudTrail
MonitoringAmazon CloudWatch, AWS Security Hub, Amazon GuardDuty
GovernanceAWS Organizations (accounts, SCPs), tagging
Compliance evidenceAWS Artifact

Next steps

  • Use the Syllabus as your checklist (objective-by-objective).
  • Use Practice to drill weak tasks fast.
  • Use the Study Plan if you want a 30/60/90-day schedule.