1Z0-1127-25 Cheatsheet — RAG, Evaluation, Safety, Deployment & Cost Control

Last-mile 1Z0-1127-25 review: RAG pipeline, chunking and embeddings pickers, evaluation signals, prompt injection defenses, deployment patterns, and cost/latency controls.

On this page

Use this for last‑mile review. Pair it with the Syllabus.

1) The canonical RAG pipeline

    flowchart LR
	  DOC["Docs"] --> CH["Chunk + clean"]
	  CH --> EMB["Embeddings"]
	  EMB --> IDX["Vector index"]
	  Q["Query"] --> QEMB["Query embedding"]
	  QEMB --> RET["Retrieve top-k"]
	  RET --> PROMPT["Prompt with context"]
	  PROMPT --> LLM["LLM"]
	  LLM --> OUT["Answer + citations"]

Rule: “Better prompts” rarely fix a broken retrieval layer.

2) Chunking pickers (why retrieval looks wrong)

Decision	Too small	Too big
Chunk size	low context	low precision
Overlap	wasted cost	continuity breaks
Metadata	missing filters	wrong tenant/version

3) Evaluation signals (exam-friendly)

Layer	What to measure
Retrieval	hit rate/top‑k relevance, filter correctness
Generation	groundedness, correctness, citation quality
Safety	leakage, injection resilience, policy violations

4) Prompt injection defenses (practical)

Treat retrieved content as untrusted input.
Use explicit system instructions to ignore instructions from documents.
Strip/limit tool permissions; enforce allowlists.
Log and monitor suspicious queries and “jailbreak” patterns.

5) Cost/latency controls

Reduce candidate set with metadata filters.
Keep top‑k intentional; cap context length.
Cache embeddings and reuse indexes.
Monitor token usage and tail latency.

Syllabus

Practice

Browse Exams — Mock Exams & Practice Tests