
1Z0-1195-25 Syllabus — Learning Objectives by Topic

Learning objectives for Oracle Data Platform 2025 Foundations Associate (1Z0-1195-25), organized by topic with quick links to targeted practice.

Use this syllabus as your checklist for 1Z0‑1195‑25.

What’s covered

Topic 1: Data Platform Fundamentals

Practice this topic →

1.1 Core data platform concepts (lake, warehouse, lakehouse)

  • Differentiate a data lake, data warehouse, and lakehouse in terms of storage, governance, and consumption.
  • Explain batch vs streaming processing and when each is appropriate.
  • Recognize common data pipeline stages: ingest, validate, transform, serve, monitor.
  • Given a scenario, choose ETL vs ELT based on where transformations should run.

1.2 Data engineering workflow and pipeline patterns

  • Describe common pipeline patterns: scheduled batch, event-driven, micro-batch, and CDC-based replication.
  • Explain why idempotency and retries are required for reliable ingestion jobs (sketched after this list).
  • Identify why metadata (source, lineage, timestamps) is essential for trustworthy analytics.
  • Given a scenario, choose a pipeline pattern that meets freshness and resiliency requirements.
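
A minimal Python sketch of the idempotency-and-retries objective above. It is illustrative only: `load_batch` is a placeholder for the real load step, and the in-memory `PROCESSED` set stands in for a durable manifest (a control table or object tags).

```python
import time

PROCESSED = set()  # in practice: a manifest table or object metadata, not process memory

def already_loaded(batch_id: str) -> bool:
    """Idempotency check: skip batches that a previous run already loaded."""
    return batch_id in PROCESSED

def load_batch(batch_id: str) -> None:
    """Placeholder for the real load (e.g. copy a file into the curated zone)."""
    ...

def ingest(batch_id: str, max_retries: int = 3) -> None:
    if already_loaded(batch_id):
        return                           # safe to re-run: duplicates are skipped
    for attempt in range(1, max_retries + 1):
        try:
            load_batch(batch_id)
            PROCESSED.add(batch_id)      # record success only after the load completes
            return
        except Exception:
            if attempt == max_retries:
                raise                    # surface the failure to the orchestrator
            time.sleep(2 ** attempt)     # exponential backoff before retrying
```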

1.3 Data quality and analytics readiness

  • Define core data quality dimensions: completeness, validity, timeliness, and consistency.
  • Recognize common analytics failure modes caused by poor data quality (duplicate events, missing keys).
  • Explain why slowly changing dimensions and late-arriving data complicate downstream reporting.
  • Given a scenario, choose validation checks to run before publishing a dataset.
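
A hedged sketch of pre-publish validation checks using pandas; the dataset path and column names (`order_id`, `amount`, `event_ts`) are hypothetical, and `event_ts` is assumed to be stored as a timezone-aware UTC timestamp.

```python
import pandas as pd

def readiness_checks(df: pd.DataFrame) -> dict:
    """Illustrative checks covering completeness, validity, duplicates, and timeliness."""
    return {
        "no_missing_keys": df["order_id"].notna().all(),
        "no_duplicate_keys": not df["order_id"].duplicated().any(),
        "amounts_valid": (df["amount"] >= 0).all(),
        # Timeliness: newest event within the last day (event_ts assumed tz-aware UTC).
        "fresh_enough": df["event_ts"].max() > pd.Timestamp.now(tz="UTC") - pd.Timedelta("1D"),
    }

checks = readiness_checks(pd.read_parquet("curated/orders.parquet"))  # placeholder path
if not all(checks.values()):
    raise ValueError(f"Dataset is not ready to publish: {checks}")
```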

Topic 2: Storage and Data Lake on OCI

Practice this topic →

2.1 Object Storage fundamentals for data platforms

  • Explain how Object Storage is used for raw, curated, and published data zones.
  • Given a scenario, choose Standard vs Archive tiers based on access patterns and cost.
  • Recognize when multipart uploads and parallelism are required for large ingests (see the sketch after this list).
  • Identify common security controls for lake storage (encryption, IAM policies, private access).
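
For the multipart-upload objective above, a sketch using the OCI Python SDK's `UploadManager`, which splits large files into parts and uploads them in parallel. Bucket, object, and file names are placeholders, and the exact keyword arguments should be confirmed against the current SDK documentation.

```python
import oci

# Authenticate with the default config file/profile (~/.oci/config).
config = oci.config.from_file()
client = oci.object_storage.ObjectStorageClient(config)
namespace = client.get_namespace().data

# UploadManager switches to multipart upload for large files and can upload parts in parallel.
upload_manager = oci.object_storage.UploadManager(
    client,
    allow_parallel_uploads=True,
    parallel_process_count=4,
)

upload_manager.upload_file(
    namespace,
    "raw-zone-bucket",                      # placeholder bucket name
    "sales/orders/2025-01-15/orders.csv",   # placeholder object name
    "/data/exports/orders.csv",             # placeholder local file
    part_size=128 * 1024 * 1024,            # 128 MiB parts (illustrative)
)
```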

2.2 Data formats, partitioning, and schema strategy

  • Differentiate common analytical formats (Parquet, ORC, Avro) at a conceptual level.
  • Given a scenario, choose partition keys that reduce scan cost and improve query performance (illustrated after this list).
  • Explain schema-on-read vs schema-on-write and how each affects governance and flexibility.
  • Recognize how compression and file sizing impact distributed processing performance.
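
A short PySpark sketch of partitioning on write, assuming a hypothetical orders dataset and `oci://bucket@namespace/...` paths provided by the OCI HDFS connector; adjust paths and column names for your environment.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("partitioned-write").getOrCreate()

# event_date is a low-cardinality, commonly filtered column, which makes it a reasonable partition key.
orders = (
    spark.read.parquet("oci://raw-zone-bucket@namespace/orders/")
    .withColumn("event_date", F.to_date("event_ts"))
)

(
    orders
    .repartition("event_date")        # avoid producing many small files per partition
    .write.mode("overwrite")
    .partitionBy("event_date")        # queries filtering on event_date scan far less data
    .parquet("oci://curated-zone-bucket@namespace/orders/")
)
```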

2.3 Lifecycle management and organization

  • Design a folder/object naming strategy that supports discovery and automation (example after this list).
  • Explain why retention policies and lifecycle rules reduce storage sprawl and cost.
  • Recognize how tagging and compartments support access control and chargeback/showback.
  • Given a scenario, choose governance controls for sensitive datasets (restricted compartments, audit).
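
One possible naming convention (zone/domain/dataset/partition), shown as a small Python helper; the specific segments are an assumption, not a prescribed OCI layout.

```python
from datetime import date

def object_key(zone: str, domain: str, dataset: str, ingest_date: date, filename: str) -> str:
    """Build object names like raw/sales/orders/ingest_date=2025-01-15/part-0001.parquet.
    Zone/domain/dataset prefixes keep data discoverable and automatable; the
    ingest_date=... segment doubles as a partition hint for query engines and lifecycle rules."""
    return f"{zone}/{domain}/{dataset}/ingest_date={ingest_date.isoformat()}/{filename}"

print(object_key("raw", "sales", "orders", date(2025, 1, 15), "part-0001.parquet"))
```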

Topic 3: Ingestion and Integration

Practice this topic →

3.1 Ingestion sources and connectivity

  • Identify common data sources: operational databases, SaaS apps, files, logs, and streams.
  • Given a scenario, choose online vs offline transfer methods based on volume and time constraints.
  • Explain why network connectivity and identity controls are prerequisites for reliable ingestion.
  • Recognize when to stage data before transformation to support replay and backfills.

3.2 ETL/ELT tools and orchestration concepts

  • Differentiate data integration tools from processing engines and where each fits in the pipeline.
  • Given a scenario, choose the orchestration features needed (scheduling, dependencies, retries, alerts); see the sketch after this list.
  • Explain how parameterization enables environment promotion and reusable pipelines.
  • Recognize the need for secrets management when connecting to protected data sources.
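
The syllabus stays tool-agnostic, but the orchestration features listed above map onto any workflow engine. Below is a minimal sketch using Apache Airflow purely as an illustration (Airflow is not named in the exam objectives); the task bodies, schedule, and `env` parameter are hypothetical, and argument names follow recent Airflow releases.

```python
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract(**context): ...
def transform(**context): ...
def load(**context): ...

default_args = {
    "retries": 3,                              # automatic retries on failure
    "retry_delay": timedelta(minutes=5),
    "email_on_failure": True,                  # alerting hook
}

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2025, 1, 1),
    schedule="0 2 * * *",                      # scheduling: run daily at 02:00
    catchup=False,
    default_args=default_args,
    params={"env": "dev"},                     # parameterization for environment promotion
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load         # explicit dependencies
```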

3.3 Validation, reconciliation, and backfills

  • Design record-count and checksum-based reconciliation checks for ingestion validation (sketched after this list).
  • Given a scenario, choose a backfill strategy that avoids double-processing and preserves ordering.
  • Explain why late-arriving data requires watermarking or windowing strategies.
  • Recognize when to quarantine bad records vs fail the pipeline.
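
A minimal reconciliation sketch in pandas: compare a row count and an order-independent checksum over a key column between what the source reported and what actually landed. The key column and hashing scheme are illustrative choices.

```python
import hashlib
import pandas as pd

def reconcile(source: pd.DataFrame, landed: pd.DataFrame, key: str = "order_id") -> dict:
    """Row-count and checksum comparison between source extract and landed data."""
    def checksum(df: pd.DataFrame) -> str:
        digest = hashlib.sha256()
        for value in sorted(df[key].astype(str)):   # sort so row order does not matter
            digest.update(value.encode())
        return digest.hexdigest()

    return {
        "count_match": len(source) == len(landed),
        "checksum_match": checksum(source) == checksum(landed),
    }

# Toy example: same rows in a different order should reconcile cleanly.
source = pd.DataFrame({"order_id": [1, 2, 3]})
landed = pd.DataFrame({"order_id": [3, 1, 2]})
result = reconcile(source, landed)
if not all(result.values()):
    raise RuntimeError(f"Reconciliation failed: {result}")
```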

Topic 4: Processing and Transformation (Batch)

Practice this topic →

4.1 Distributed processing concepts (Spark/Data Flow)

  • Explain how distributed processing works at a high level (parallelism, partitions, shuffles).
  • Given a scenario, choose a batch processing engine vs pushing transforms into the warehouse.
  • Recognize common performance bottlenecks (wide shuffles, skew, small files).
  • Identify best practices for writing efficient transformations (filter early, avoid unnecessary joins).
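
A short PySpark sketch of the filter-early guidance: predicates and column pruning are applied before the join so less data is shuffled across the cluster. Paths and column names are placeholders.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("filter-early").getOrCreate()

orders = spark.read.parquet("oci://curated-zone-bucket@namespace/orders/")
customers = spark.read.parquet("oci://curated-zone-bucket@namespace/customers/")

# Filter and project BEFORE the join: partition pruning and column pruning shrink the shuffle.
recent_orders = (
    orders
    .where(F.col("event_date") >= "2025-01-01")
    .select("order_id", "customer_id", "amount")
)

report = (
    recent_orders
    .join(customers.select("customer_id", "region"), "customer_id")
    .groupBy("region")
    .agg(F.sum("amount").alias("revenue"))
)
report.write.mode("overwrite").parquet("oci://published-zone-bucket@namespace/revenue_by_region/")
```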

4.2 Job execution, scheduling, and environments

  • Describe how jobs are packaged and executed (artifacts, dependencies, parameters).
  • Given a scenario, choose concurrency limits and retries to protect downstream systems.
  • Explain why dev/test/prod separation is important for data platforms.
  • Recognize the role of configuration management and infrastructure-as-code for repeatable pipelines.

4.3 Monitoring and troubleshooting batch pipelines

  • Identify the key metrics for batch pipelines: throughput, latency, failure rate, and cost per run.
  • Given a scenario, diagnose common failures: permissions, schema drift, network timeouts, and out-of-memory.
  • Explain why data lineage and run metadata accelerate root-cause analysis (see the sketch after this list).
  • Recognize when to optimize compute vs optimize data layout (partitioning, compaction).
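
A sketch of capturing run metadata around a batch job so failures can be traced quickly. The fields shown (run id, source, target, duration, status) are a reasonable minimum, and printing JSON stands in for writing to a real run-history store.

```python
import json
import time
import uuid
from datetime import datetime, timezone

def run_with_metadata(job_name: str, transform, source_path: str, target_path: str) -> dict:
    """Wrap a batch job so every run records who ran what, against which inputs,
    how long it took, and whether it failed."""
    run = {
        "run_id": str(uuid.uuid4()),
        "job_name": job_name,
        "source": source_path,                  # lineage: where the data came from
        "target": target_path,                  # lineage: where it was written
        "started_at": datetime.now(timezone.utc).isoformat(),
    }
    start = time.monotonic()
    try:
        run["rows_written"] = transform(source_path, target_path)
        run["status"] = "SUCCEEDED"
    except Exception as exc:
        run["status"] = "FAILED"
        run["error"] = repr(exc)
        raise
    finally:
        run["duration_s"] = round(time.monotonic() - start, 2)
        print(json.dumps(run))                  # in practice: write to a run-history table or log store
    return run

run_with_metadata("orders_daily_load", lambda src, dst: 42, "raw/orders/", "curated/orders/")  # toy transform
```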

Topic 5: Streaming and Change Data Capture (CDC)

Practice this topic →

5.1 Streaming fundamentals (topics, partitions, consumers)

  • Define core streaming concepts: topics, partitions, offsets, and consumer groups (illustrated after this list).
  • Given a scenario, choose ordering guarantees and partitioning keys that support downstream analytics.
  • Explain the difference between at-least-once and exactly-once processing at a conceptual level.
  • Recognize common reliability patterns: dead-letter queues, replay, and idempotent sinks.
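
A hedged consumer sketch using a Kafka-compatible client (OCI Streaming exposes a Kafka-compatible endpoint); broker address, topic, and group names are placeholders, and the SASL authentication settings OCI Streaming requires are omitted for brevity.

```python
from confluent_kafka import Consumer

def process(key: bytes, value: bytes) -> None:
    """Hypothetical idempotent sink: upsert keyed rows so replays do not create duplicates."""
    ...

consumer = Consumer({
    "bootstrap.servers": "cell-1.streaming.<region>.oci.oraclecloud.com:9092",  # placeholder endpoint
    "group.id": "orders-enrichment",     # consumers in one group share the topic's partitions
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,         # commit offsets only after successful processing
})
consumer.subscribe(["orders-events"])

try:
    while True:
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            continue                                    # real code would route to a dead-letter topic
        process(msg.key(), msg.value())
        consumer.commit(message=msg, asynchronous=False)  # at-least-once: duplicates must be tolerated
finally:
    consumer.close()
```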

5.2 CDC and replication concepts (GoldenGate)

  • Explain what CDC is and why it is used for near-real-time replication.
  • Given a scenario, choose CDC over batch extracts to meet freshness requirements.
  • Recognize common CDC considerations: schema changes, large transactions, and failover.
  • Identify how to validate replicated data (lag metrics, reconciliation checks).

5.3 Event-driven architectures for data products

  • Design an event-driven ingestion flow that triggers processing when new data arrives (sketched after this list).
  • Given a scenario, choose micro-batching vs true streaming based on downstream constraints.
  • Explain why backpressure and rate limiting are necessary to keep consumers stable.
  • Recognize when to store raw events to enable replay and new derived datasets.
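
An illustrative handler for an "object created" event, for example delivered by OCI Events to a function or queue. The payload shape and event-type string are assumptions to be verified against the Events documentation.

```python
import json

def handle_event(event_json: str) -> None:
    """React to a new raw file landing in the lake and kick off processing."""
    event = json.loads(event_json)
    # Event-type string assumed for Object Storage create-object events; verify for your tenancy.
    if event.get("eventType") != "com.oraclecloud.objectstorage.createobject":
        return
    bucket = event["data"]["additionalDetails"]["bucketName"]
    object_name = event["data"]["resourceName"]

    # Only react to files in the raw landing prefix; everything else is ignored.
    if not object_name.startswith("raw/"):
        return
    start_processing(bucket, object_name)

def start_processing(bucket: str, object_name: str) -> None:
    """Hypothetical trigger: submit a processing run, enqueue a job, etc."""
    print(f"processing oci://{bucket}/{object_name}")
```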

Topic 6: Analytics and Consumption

Practice this topic →

6.1 Warehousing and analytical query fundamentals

  • Differentiate operational workloads (OLTP) from analytical workloads (OLAP).
  • Given a scenario, choose a warehouse vs lake query approach based on performance and governance needs.
  • Recognize why dimensional models support reporting and self-service analytics.
  • Explain why data freshness and SLAs should be explicit for published datasets.

6.2 BI dashboards and KPIs (Analytics Cloud concepts)

  • Describe how dashboards and reports consume curated datasets and semantic definitions.
  • Given a scenario, choose appropriate visualization types for different metrics and distributions.
  • Explain why consistent metric definitions prevent conflicting numbers across executive dashboards.
  • Recognize common BI risks: stale data, unclear filters, and missing drill-down paths.

6.3 Sharing, access, and self-service analytics

  • Define role-based access needs for analysts, data engineers, and business users.
  • Given a scenario, choose dataset sharing controls that prevent overexposure of sensitive fields.
  • Explain why data catalogs and dataset documentation improve adoption and reuse.
  • Recognize when to publish certified datasets vs exploratory sandbox datasets.

Topic 7: Security, Governance, and Operations

Practice this topic →

7.1 IAM and access control for data platforms

  • Apply least-privilege IAM to ingestion, processing, storage, and analytics components (example after this list).
  • Given a scenario, choose compartment and policy structures that align with teams and environments.
  • Recognize why separation of duties reduces risk in production data systems.
  • Explain how tagging supports governance and cost allocation.
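
A hedged example of least-privilege policy statements created through the OCI Python SDK. Group, compartment, and bucket names are placeholders, and the statement grammar and resource-type names should be checked against the IAM policy reference.

```python
import oci

# Scoped statements for hypothetical groups: engineers manage the raw zone and processing,
# analysts only read published data.
statements = [
    "Allow group data-engineers to read buckets in compartment analytics-prod",
    "Allow group data-engineers to manage objects in compartment analytics-prod where target.bucket.name = 'raw-zone-bucket'",
    "Allow group data-engineers to manage dataflow-family in compartment analytics-prod",
    "Allow group bi-analysts to read objects in compartment analytics-prod where target.bucket.name = 'published-zone-bucket'",
]

config = oci.config.from_file()
identity = oci.identity.IdentityClient(config)
identity.create_policy(
    oci.identity.models.CreatePolicyDetails(
        compartment_id=config["tenancy"],     # policies attach to a compartment (here: the tenancy root)
        name="data-platform-least-privilege",
        description="Scoped access for data platform teams (illustrative)",
        statements=statements,
    )
)
```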

7.2 Encryption, auditing, and compliance fundamentals

  • Explain encryption at rest and in transit and where to enforce each in a data pipeline.
  • Given a scenario, choose key management and secret management controls for regulated data.
  • Recognize the role of audit trails and logging for compliance investigations.
  • Identify data classification and retention considerations for enterprise data platforms.

7.3 Observability and cost management

  • Define platform health signals: pipeline success rate, lag, data freshness, and cost per dataset.
  • Given a scenario, implement alerting that detects broken ingestion or stale dashboards early (sketched after this list).
  • Explain why capacity planning is required for peak loads and backfills.
  • Recognize cost drivers (storage tiering, compute time, data scans) and how to control them.
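
A small freshness-alerting sketch tying the health signals above together; dataset names, SLAs, and the alerting action are all hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical freshness targets per published dataset (the SLA each owner agreed to).
FRESHNESS_SLA = {
    "published/orders": timedelta(hours=6),
    "published/revenue_by_region": timedelta(hours=24),
}

def stale_datasets(last_updated: dict[str, datetime]) -> list[str]:
    """Return datasets whose latest successful update is older than their SLA.
    last_updated would come from run metadata or object metadata in a real platform."""
    now = datetime.now(timezone.utc)
    return [
        name
        for name, sla in FRESHNESS_SLA.items()
        if now - last_updated.get(name, datetime.min.replace(tzinfo=timezone.utc)) > sla
    ]

stale = stale_datasets({
    "published/orders": datetime.now(timezone.utc) - timedelta(hours=2),
    "published/revenue_by_region": datetime.now(timezone.utc) - timedelta(days=3),
})
if stale:
    print(f"ALERT: stale datasets -> {stale}")   # in practice: notify via your alerting channel
```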