DA-ASSOC Cheatsheet — Databricks SQL: Joins, Windows, Dashboards & Quick Rules

Last-mile DA-ASSOC review: high-yield SQL patterns (joins, windows, CTEs), common pitfalls, and Databricks SQL dashboard/alert best practices.

Use this for last‑mile review. Pair it with the Syllabus for coverage and Practice to validate speed.


1) Join correctness (high-yield)

“Row explosion” checklist

  • Is the join key unique on the dimension side?
  • Did you accidentally join on the wrong granularity?
  • Did you filter after a LEFT join in a way that turns it into an INNER join?

Common join types

JoinKeeps rows fromTypical use
INNERboth sides matchenrich when match is required
LEFTleft side alwayskeep all facts and add dims when present
ANTIleft rows without matchfind “missing” records

2) Window functions (the “analytics power tool”)

 1SELECT
 2  customer_id,
 3  order_ts,
 4  amount,
 5  SUM(amount) OVER (
 6    PARTITION BY customer_id
 7    ORDER BY order_ts
 8    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
 9  ) AS running_total
10FROM orders;

Rule: partition = “grouping”; order = “sequence”; frame = “range of rows.”


3) CTEs (clarity + correctness)

1WITH base AS (
2  SELECT * FROM events WHERE event_date >= current_date() - 7
3)
4SELECT event_type, COUNT(*) AS cnt
5FROM base
6GROUP BY event_type;

4) Dashboard hygiene (trustworthy analytics)

PracticeWhy it matters
Define metrics clearlyreduces misinterpretation
Use consistent filterscomparable slices
Show freshnessavoids “stale data” surprises
Prefer parameterized queriesreusable and safer

5) Fast troubleshooting pickers

  • Wrong counts after join: non-unique dim key or many-to-many join.
  • Window gives unexpected results: missing ORDER BY or wrong frame.
  • Slow query: missing filters, scanning too much data; reduce early and avoid unnecessary DISTINCT.