High-signal DVA-C02 reference: Lambda invocation and retry behavior, API Gateway choices, DynamoDB modeling, S3/CloudFront patterns, SQS/SNS/EventBridge/Kinesis/Step Functions, IAM/Cognito/KMS security, CI/CD + IaC, and troubleshooting with CloudWatch/X-Ray.
Keep this page open while drilling questions. Prioritize defaults, failure modes, and the “least-privilege + event-driven” mental model.
| Item | Value |
|---|---|
| Questions | 65 (multiple-choice + multiple-response) |
| Time | 130 minutes |
| Passing score | 720 (scaled 100–1000) |
| Cost | 150 USD |
| Domains | D1 32% • D2 26% • D3 24% • D4 18% |
flowchart LR
C[Client] -->|HTTPS| APIGW[API Gateway]
APIGW -->|IAM/Cognito/Lambda authorizer| L[Lambda]
L --> DDB[(DynamoDB)]
L --> S3[(S3)]
L --> EB[EventBridge]
EB --> SQS[(SQS)]
SQS --> L2[Lambda worker]
L2 --> CW[(CloudWatch Logs/Metrics)]
L2 --> XR[(X-Ray Traces)]
High-yield takeaway: professional “developer” answers usually combine authn/authz, decoupling, and observability (logs/metrics/traces).
| Invocation model | Common triggers | Retry behavior (what the exam expects) | Failure handling knobs |
|---|---|---|---|
| Synchronous (request/response) | API Gateway → Lambda | No automatic retries by Lambda | Caller retry, app-level retry, redesign to async |
| Asynchronous (event) | EventBridge, SNS, S3 → Lambda | Automatic retries (two retries by default, three attempts total) | Async destinations / DLQ, max retry attempts, max event age |
| Poll-based event source mapping | SQS, DynamoDB Streams, Kinesis → Lambda | SQS: message reappears after the visibility timeout until it succeeds or is redriven. Streams: batch is retried until success or record expiry, blocking the shard | DLQ/on-failure destination, partial batch response, bisect batch, max retry attempts |
Without partial batch response, one failure can cause the whole batch to be retried. Prefer partial batch response when supported.
Lambda batch response format (concept)

    {
      "batchItemFailures": [
        { "itemIdentifier": "message-or-record-id" }
      ]
    }

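A minimal sketch of an SQS-triggered handler that returns the `batchItemFailures` shape above. It assumes `ReportBatchItemFailures` is enabled on the event source mapping. The `process` function and the `order_id` field are hypothetical placeholders for real business logic.

```python
import json


def process(body: dict) -> None:
    """Hypothetical business logic: rejects messages without an order_id."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event, context=None):
    """Process each SQS record and report only the ones that failed."""
    failures = []
    for record in event.get("Records", []):
        try:
            process(json.loads(record["body"]))
        except Exception:
            # Only this message returns to the queue; the rest are deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

An empty `batchItemFailures` list tells Lambda the whole batch succeeded. Raising an exception instead makes the entire batch visible again.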
Common approach: store a request key (for example, requestId) in DynamoDB with TTL.
1) Put idempotency key with condition attribute_not_exists(PK)
2) If conditional check fails -> return cached result / treat as already processed
3) Process work -> write result
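The three steps above can be sketched without AWS access by swapping DynamoDB for an in-memory stand-in. `FakeTable`, `ConditionalCheckFailed`, and `handle_once` are illustrative names, not AWS APIs. Against a real table, the claim step would be a `PutItem` with `ConditionExpression: attribute_not_exists(PK)`.

```python
import time


class ConditionalCheckFailed(Exception):
    """Stand-in for DynamoDB's ConditionalCheckFailedException."""


class FakeTable:
    """In-memory stand-in for a DynamoDB table (illustration only)."""

    def __init__(self):
        self.items = {}

    def put_if_absent(self, pk, item):
        # Mimics PutItem with attribute_not_exists(PK).
        if pk in self.items:
            raise ConditionalCheckFailed(pk)
        self.items[pk] = item

    def update(self, pk, **attrs):
        self.items[pk].update(attrs)


def handle_once(table, request_id, work, ttl_seconds=3600):
    """Run work() at most once per request_id; replay the stored result otherwise."""
    try:
        # Step 1: claim the key; expires_at plays the role of the TTL attribute.
        table.put_if_absent(
            request_id,
            {"status": "IN_PROGRESS", "expires_at": int(time.time()) + ttl_seconds},
        )
    except ConditionalCheckFailed:
        # Step 2: already seen, so return the cached result (None if still in progress).
        return table.items[request_id].get("result")
    # Step 3: do the work and record the result.
    result = work()
    table.update(request_id, status="DONE", result=result)
    return result
```

A production version would also decide what to do when the first attempt crashed after claiming the key, for example by letting the TTL expire the claim.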
| Need | Best-fit | Why it wins (DVA framing) |
|---|---|---|
| Usage plans, API keys, advanced REST features | REST API | “Full-featured API Gateway” |
| Lower cost + simpler HTTP routing | HTTP API | “Cheaper and faster” (fewer features) |
| Real-time bidirectional messaging | WebSocket API | Connection-oriented messaging |
| Need | Best-fit | Notes |
|---|---|---|
| AWS-to-AWS or signed clients | IAM auth (SigV4) | Great for service-to-service, not human logins |
| User authentication with tokens | Cognito authorizer | Validate JWTs from a Cognito user pool |
| Custom logic / external JWTs | Lambda authorizer | You own token validation and caching behavior |
API keys: not “security” by themselves; they’re primarily for usage plans and throttling/quota.
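If you see throttling, check the whole chain: API Gateway limits (account, stage, usage plan), Lambda concurrency, and downstream capacity such as DynamoDB.

IAM policy allowing a caller to invoke one method (concept)

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": "execute-api:Invoke",
          "Resource": "arn:aws:execute-api:REGION:ACCOUNT:apiId/stage/GET/items"
        }
      ]
    }
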
| Need | Operation |
|---|---|
| Fetch one item by full key | GetItem |
| Fetch a set by partition key (+ sort key conditions) | Query |
| Bulk read without keys (rarely best in prod) | Scan |
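
To make the Query row concrete, here is a hedged sketch that builds the request parameters for the low-level DynamoDB Query API. The attribute names `PK` and `SK` and the helper name `build_query` are assumptions for illustration. Values use the low-level typed format (`{"S": ...}`).

```python
def build_query(table_name, pk_value, sk_prefix=None, pk_name="PK", sk_name="SK"):
    """Build keyword arguments for a DynamoDB Query on a partition key,
    optionally narrowed by a sort-key prefix."""
    names = {"#pk": pk_name}
    values = {":pk": {"S": pk_value}}
    condition = "#pk = :pk"
    if sk_prefix is not None:
        names["#sk"] = sk_name
        values[":sk"] = {"S": sk_prefix}
        condition += " AND begins_with(#sk, :sk)"
    return {
        "TableName": table_name,
        "KeyConditionExpression": condition,
        "ExpressionAttributeNames": names,
        "ExpressionAttributeValues": values,
    }
```

With boto3 installed, the result could be passed as `client.query(**build_query("MyTable", "USER#1", "ORDER#"))`. A Scan has no key condition, which is why it reads the whole table.
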
| Feature | GSI | LSI |
|---|---|---|
| When created | Any time | At table creation |
| Partition key | Can be different | Same as table |
| Consistent reads | Eventually consistent only | Can support consistent reads |
| Capacity | Separate (provisioned) / managed (on-demand) | Shares table capacity model |
Rule: if you need a new access pattern later, it’s usually a GSI.

    PutItem with ConditionExpression: attribute_not_exists(PK)

This shows up in questions about idempotency and race prevention.
Use DynamoDB Streams when you need to react to table changes:
Gotcha: stream processing is ordered per shard; failures can block progress until handled (configure retries/handling).
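One way to keep a bad record from blocking a shard indefinitely is to report the first failing sequence number, sketched below. This assumes `ReportBatchItemFailures` is enabled on the stream's event source mapping. `apply_change` is a hypothetical projection function that rejects deletes only to simulate a failure.

```python
def apply_change(event_name, keys):
    """Hypothetical projection logic; rejects deletes to simulate a failure."""
    if event_name == "REMOVE":
        raise RuntimeError("cannot project delete")


def stream_handler(event, context=None):
    """Process DynamoDB Streams records in order, stopping at the first failure."""
    for record in event.get("Records", []):
        try:
            apply_change(record["eventName"], record["dynamodb"].get("Keys"))
        except Exception:
            # Stop here to preserve per-shard ordering; Lambda retries
            # the batch starting from this sequence number.
            seq = record["dynamodb"]["SequenceNumber"]
            return {"batchItemFailures": [{"itemIdentifier": seq}]}
    return {"batchItemFailures": []}
```

Pairing this with a maximum retry count and an on-failure destination lets the shard move on once the record is judged unprocessable.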
| Need | Best-fit |
|---|---|
| Object storage | S3 |
| Block storage for EC2 | EBS |
| Shared POSIX file system | EFS |
Best for direct-to-S3 upload/download without proxying files through your app.
sequenceDiagram
participant C as Client
participant API as API (Lambda)
participant S3 as S3
C->>API: Request upload URL
API-->>C: Pre-signed URL
C->>S3: PUT object (direct)
CLI example (download URL only; aws s3 presign signs GET requests, so generate upload URLs with an SDK):

    aws s3 presign s3://my-bucket/path/file.bin --expires-in 3600

| Option | What it means | When it’s best |
|---|---|---|
| SSE-S3 | S3-managed keys | Simple default encryption |
| SSE-KMS | KMS-managed keys | Auditability/control requirements |
| Client-side | App encrypts before upload | Strict compliance and key custody needs |
Common “AccessDenied” cause: KMS key policy doesn’t allow the caller/service to use the key.
| Need | Best-fit |
|---|---|
| Decouple workloads with a durable buffer | SQS |
| Fan-out notifications to many subscribers | SNS |
| Route events by pattern across services/accounts | EventBridge |
| Ordered, high-throughput streaming ingestion | Kinesis |
| Workflow with steps, retries, wait states | Step Functions |
flowchart LR
Pub[Publisher] --> T[SNS Topic]
T --> Q1[SQS Queue A]
T --> Q2[SQS Queue B]
Q1 --> L1[Lambda Worker A]
Q2 --> L2[Lambda Worker B]
Q1 --> DLQ1[DLQ]
Q2 --> DLQ2[DLQ]
Why it wins: SNS handles fanout; SQS gives durability and independent retry per consumer.
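A common stumbling block in this pattern is that the SQS message body is an SNS envelope rather than the publisher's payload, unless raw message delivery is enabled on the subscription. The sketch below unwraps both cases. It assumes the publisher sends JSON, and `unwrap_sns` is an illustrative helper name.

```python
import json


def unwrap_sns(sqs_body: str):
    """Return the publisher's payload from an SQS body delivered via SNS."""
    envelope = json.loads(sqs_body)
    if isinstance(envelope, dict) and envelope.get("Type") == "Notification":
        # Default delivery: the payload is the string field "Message".
        return json.loads(envelope["Message"])
    # Raw message delivery: the body already is the payload.
    return envelope
```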
| Queue type | When to choose | Key behavior |
|---|---|---|
| Standard | Highest throughput, best-effort ordering | At-least-once; duplicates possible |
| FIFO | Ordered processing per group | Ordering + deduplication support |
Pick Kinesis when you need ordered, high-throughput streaming ingestion.
If you just need a durable buffer for background jobs, SQS is often simpler.
| Type | Best for | Key point |
|---|---|---|
| Standard | Long-running workflows (up to 1 year) that need exactly-once execution | Durable and auditable |
| Express | High-volume short workflows (up to 5 minutes) | Optimized for throughput; design steps to be idempotent |
Retry + Catch snippet (Amazon States Language; the task's Parameters, such as FunctionName, are omitted for brevity)

    {
      "StartAt": "Work",
      "States": {
        "Work": {
          "Type": "Task",
          "Resource": "arn:aws:states:::lambda:invoke",
          "Retry": [
            {
              "ErrorEquals": ["States.ALL"],
              "IntervalSeconds": 2,
              "MaxAttempts": 3,
              "BackoffRate": 2.0
            }
          ],
          "Catch": [
            {
              "ErrorEquals": ["States.ALL"],
              "Next": "Failed"
            }
          ],
          "End": true
        },
        "Failed": { "Type": "Fail" }
      }
    }

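To see what the Retry block above does in practice, this small sketch computes the wait before each retry from `IntervalSeconds`, `MaxAttempts`, and `BackoffRate`. It ignores the optional `MaxDelaySeconds` and `JitterStrategy` fields. `retry_delays` is an illustrative helper, not part of any AWS SDK.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Seconds waited before each retry under an ASL Retry rule."""
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


print(retry_delays(2, 3, 2.0))  # [2.0, 4.0, 8.0]
```

With these values the task runs up to four times (one initial attempt plus three retries) before the Catch clause routes to the Failed state.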
Role trust policy allowing CallerRole to assume the role (concept)

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Principal": { "AWS": "arn:aws:iam::111122223333:role/CallerRole" },
          "Action": "sts:AssumeRole"
        }
      ]
    }

| Need | Use | Why |
|---|---|---|
| Authenticate users and get JWTs | User Pool | User directory + tokens |
| Get temporary AWS creds for a user | Identity Pool | Federated identities → STS credentials |
Common pattern: API Gateway uses a Cognito User Pool authorizer to validate JWTs.
| Need | Best-fit |
|---|---|
| Rotating secrets (DB creds, API keys) | Secrets Manager |
| Config parameters (often cheaper/simpler) | SSM Parameter Store |
Exam cue: if you see “automatic rotation,” choose Secrets Manager.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Effect": "Allow",
          "Action": [
            "dynamodb:GetItem",
            "dynamodb:PutItem",
            "dynamodb:Query",
            "dynamodb:UpdateItem"
          ],
          "Resource": "arn:aws:dynamodb:REGION:ACCOUNT:table/MyTable"
        }
      ]
    }

flowchart LR
Repo[Source Repo] --> Pipe[CodePipeline]
Pipe --> Build[CodeBuild]
Build --> Artifacts[(S3 Artifacts)]
Artifacts --> Deploy[CloudFormation/SAM/CodeDeploy]
Deploy --> Alias[Lambda Alias/Version]
High-yield terms:
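
    # Required for SAM resource types to be expanded by CloudFormation.
    Transform: AWS::Serverless-2016-10-31
    Resources:
      Api:
        Type: AWS::Serverless::Api
        Properties:
          StageName: prod  # StageName is required; "prod" is a placeholder
      Fn:
        Type: AWS::Serverless::Function
        Properties:
          Handler: app.handler
          Runtime: nodejs20.x
          # Publishes a new version on each deploy and points the alias at it.
          AutoPublishAlias: live
          # CodeDeploy shifts 10% of traffic first, then the rest after 5 minutes.
          DeploymentPreference:
            Type: Canary10Percent5Minutes
          Events:
            Get:
              Type: Api
              Properties:
                RestApiId: !Ref Api
                Path: /items
                Method: get
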
Exam cue: if you see “minimize risk” and “automatic rollback,” think CodeDeploy traffic shifting + alarms.
| Symptom | Likely cause | What to check first |
|---|---|---|
| API Gateway 502/504 | Lambda error/timeout | Lambda logs, Lambda timeout vs API timeout |
| 429 / throttling | API Gateway/Lambda/DynamoDB throttles | Metrics, concurrency, capacity, backoff |
| AccessDenied | IAM/resource/KMS key policy mismatch | Evaluate each policy layer separately |
| SQS redrive to DLQ | Poison message or too-short visibility timeout | Visibility timeout, retries, handler idempotency |
| Lambda can’t reach internet | Lambda in VPC without NAT | VPC routing, NAT, VPC endpoints |

    fields @timestamp, @message
    | filter @message like /ERROR|Exception|Task timed out/
    | sort @timestamp desc
    | limit 50

Pseudo (exponential backoff with full jitter):

    sleep(random(0, base * 2^attempt)); retry

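The pseudocode above can be turned into a small runnable helper, sketched here under a few assumptions: a delay cap (`cap`) is added so waits do not grow without bound, and `sleep` and `rng` are injectable only to make the function testable. `retry_with_backoff` is an illustrative name, not an SDK function.

```python
import random
import time


def retry_with_backoff(fn, max_attempts=5, base=0.1, cap=5.0,
                       sleep=time.sleep, rng=random.uniform):
    """Call fn() until it succeeds, sleeping a random backoff between attempts."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Real code should catch only retryable errors (for example throttling).
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: random delay in [0, min(cap, base * 2^attempt)].
            sleep(rng(0, min(cap, base * 2 ** attempt)))
```

The randomness spreads retries from many clients over time, which avoids synchronized retry bursts against a throttled service.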