1. Ignite

12 Weeks • Up to 62h

Starting Point

Business users with little or no prior knowledge of databases, SQL, Python, or ETL; working primarily with Excel/CSV.

Hours Breakdown

Self Study: up to 48h
Instructor-Led: up to 8h
Community: up to 6 sessions

Skills Unlocked

  • Move from Excel/CSV workflows to Databricks.
  • Import, clean, transform, and visualize data.
  • Build simple tables and dashboards.
  • Apply SQL and Python for analysis.
  • Understand Lakehouse architecture & Delta Lake.
  • Learn basics of data pipelines & AI-powered analytics.

Core Course Structure

Block 1: Fundamentals & Quick Start

  • Data Basics: Tables, columns, keys, typical Excel pitfalls.
  • From Excel/CSV to Databricks: Importing files and overview of the Lakehouse.
  • Architecture & Governance: Data organization, access guardrails, and compliance.
  • Visualization: Preparing initial analyses and visualizations.
Instructor-Led Training (B2B) • 2 Quizzes • 1 Community Session • 1 Skill Lab

Block 2: Working with Data & AI in the Lakehouse

  • Unity Catalog: Permissions, lineage, discoverability, and ownership.
  • Genie for Analytics: Ask, Refine, Validate, and Share.
  • Power BI: Connecting Databricks and building dashboards.
  • Agentic Workflows: Kasal low-code UI, multi-step workflows.
Instructor-Led Training (B2B) • 2 Quizzes • 1 Community Session • 1 Skill Lab

Block 3: SQL Analytics on the Lakehouse (Analyst Track)

  • Databricks SQL Essentials: Warehouses vs. clusters, SQL editor.
  • SQL Analytics: SELECT, Joins, Aggregations, CTEs.
  • Advanced SQL: Window functions, ranking, time series, grouping sets.
  • Delta Lake Concepts: ACID, Time Travel, table vs. view.
2 Quizzes • 1 Community Session • 1 Skill Lab
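The SQL constructs covered in this block (aggregations, CTEs, window functions) can be previewed with any SQL engine. A minimal sketch using Python's built-in sqlite3 — not Databricks SQL itself, and with made-up table and column names:

```python
import sqlite3

# Toy table standing in for a Lakehouse table; data is illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'EMEA', 100.0), (2, 'EMEA', 250.0),
        (3, 'AMER', 80.0),  (4, 'AMER', 300.0);
""")

# CTE + aggregation: total revenue per region.
totals = conn.execute("""
    WITH region_totals AS (
        SELECT region, SUM(amount) AS total
        FROM orders
        GROUP BY region
    )
    SELECT region, total FROM region_totals ORDER BY region
""").fetchall()

# Window function: rank orders by amount within each region.
ranked = conn.execute("""
    SELECT region, amount,
           RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rnk
    FROM orders
""").fetchall()
```

The same SELECT/CTE/window syntax carries over to the Databricks SQL editor; only connection setup and scale differ.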

Block 4: Python for Analytics on Databricks (Builder Track)

  • Notebook Productivity: Variables, parameters, modular notebooks.
  • Python Crash Course: Variables, lists/dicts, functions.
  • Pandas & Spark Basics: Load, clean, transform; moving to Spark.
  • Operationalizing Basics: Write to Unity Catalog, data quality alerts, scheduling.
2 Quizzes • 1 Community Session • 1 Skill Lab
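The crash-course topics (variables, lists/dicts, functions) come together in a typical first task: cleaning a CSV export before analysis. A small stdlib-only sketch with invented data and hypothetical column names:

```python
import csv
import io

# A small CSV with typical Excel-export issues: stray spaces and a
# blank row. Contents are made up for illustration.
raw = """name,revenue
 Alice ,1200
Bob,950

 Carol ,1430
"""

def clean_row(row):
    """Strip whitespace and cast the revenue column to a number."""
    return {"name": row["name"].strip(), "revenue": float(row["revenue"])}

reader = csv.DictReader(io.StringIO(raw), skipinitialspace=True)
rows = [clean_row(r) for r in reader if r.get("revenue")]

total = sum(r["revenue"] for r in rows)
```

In Databricks the same logic would typically move into pandas or Spark, but the core pattern (parse, clean per row, aggregate) is identical.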

2. Learn

12 Weeks • Up to 84h

Starting Point

New to Databricks/cloud-native platforms, but experienced in Excel/CSV transformations, basic SQL/Python, and Lakehouse fundamentals. Curious about real-world scenarios.

Hours Breakdown

Self Study: up to 60h
Instructor-Led: up to 16h
Community: up to 8 sessions

Skills Unlocked

  • Navigate workspace and use notebooks efficiently.
  • Perform repeatable data ingestion and transformation.
  • Build optimized, testable ETL pipelines (Lakeflow Pipelines).
  • Pick the right compute/runtime settings.
  • Understand Unity Catalog and MLflow at a high level.
  • Prepare for the Databricks Associate certification.

Core Course Structure

Block 1: Databricks Intelligence Platform

  • Platform Fundamentals: Lakehouse vs. data warehouse, Medallion Architecture.
  • Workspace & Compute: Navigating workspace, cluster lifecycle, Liquid Clustering.
Instructor-Led Training • 1 Quiz • 1 Community Session

Block 2: Development & Ingestion

  • Databricks Connect: Local/remote dev workflows.
  • Notebooks & Debugging: Rapid prototyping, Spark UI, and logs.
  • Auto Loader: Sources, syntax & ingestion best practices.
2 Quizzes • 1 Community Session • 1 Skill Lab

Block 3: Data Processing & Transformations

  • Medallion Architecture: Layer application & cluster sizing.
  • Lakeflow Pipelines: Declarative pipelines, expectations.
  • Transformations: DDL/DML & PySpark DataFrame aggregations/UDFs.
3 Quizzes • 1 Skill Lab • 1 Community Session
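The "expectations" idea from Lakeflow Pipelines — a named data-quality rule attached to a dataset, with failing rows dropped and counted — can be sketched in plain Python. This is a conceptual stand-in, not the Lakeflow API; all names below are illustrative:

```python
# Conceptual sketch of a pipeline expectation: a named predicate that
# splits records into kept and dropped sets and reports a metric.
def expect(name, predicate, records):
    valid, dropped = [], []
    for rec in records:
        (valid if predicate(rec) else dropped).append(rec)
    return valid, {"expectation": name, "dropped": len(dropped)}

# Bronze-layer records, one of which violates the quality rule.
bronze = [
    {"id": 1, "amount": 120.0},
    {"id": 2, "amount": -5.0},   # fails: negative amount
    {"id": 3, "amount": 80.0},
]

# Silver layer: keep only rows with a non-negative amount.
silver, metrics = expect("non_negative_amount",
                         lambda r: r["amount"] >= 0, bronze)
```

In Lakeflow the same rule is declared on the dataset definition and the drop counts surface in pipeline monitoring rather than a return value.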

Block 4: Productionization, Governance & Quality

  • Productionization: Databricks Asset Bundles (DAB), workflow repair, serverless compute tuning.
  • Governance: Unity Catalog roles, grants, lineage & audit logs.
  • Sharing: Delta Sharing and Lakehouse Federation use cases.
2 Quizzes • 1 Community Session • 1 Milestone Recap • Optional Exam Prep

3. Apply

12 Weeks • Up to 82h

Starting Point

Foundational Databricks knowledge acquired. Ready to move from sandbox learning to real-world use cases, collaborative projects, and CI/CD basics.

Hours Breakdown

Self Study: up to 50h
Instructor-Led: up to 8h
Community: up to 8 sessions
Coaching: up to 8h

Skills Unlocked

  • Build reliable, maintainable ETL pipelines using Lakeflow.
  • Apply performance tuning and data modeling best practices.
  • Orchestrate workflows with Databricks Workflows.
  • Operationalize ML workflows in production (MLflow).
  • Collaborate effectively using notebooks and Unity Catalog.

Core Course Structure

Block 1: Data Processing & Automation

  • ETL Pipelines: Develop declarative pipelines using Lakeflow Jobs.
  • Incremental Ingestion: Auto Loader for cloud storage.
  • Optimization: Liquid Clustering, caching, partitioning, autoscaling.
  • Transformations: Spark SQL & PySpark.
1 Quiz • Instructor-Led Training • 1 Community Session

Block 2: Machine Learning & Operationalization

  • MLflow: Track experiments and deploy models.
  • Feature Engineering: PySpark and Delta Lake.
  • Training: Spark MLlib, scikit-learn, XGBoost.
  • Model Lifecycle: Databricks Model Registry and Unity Catalog access.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 3: Data Governance & Security

  • Unity Catalog: Manage access, lineage, and governance.
  • Security: Implement RBAC and security policies.
  • Collaboration: Databricks Repos for Git-based dev.
  • Automation: Design and automate CI/CD pipelines.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 4: Capstone Project & Certification Prep

  • End-to-End Build: Production-grade ETL and ML pipelines.
  • Operations: Orchestrate and monitor jobs, tasks, and costs.
  • Best Practices: Reproducibility, scalability, maintainability.
  • Prep: Databricks Professional Certification preparation.
1 Quiz • 1 Milestone Recap • 1 Community Session • Optional Exam Prep

4. Grow

12 Weeks • Up to 80h

Starting Point

Actively delivering in Databricks projects. Looking to scale impact, mentor others, reflect on delivery patterns, and deepen architectural knowledge.

Hours Breakdown

Self Study: up to 40h
Instructor-Led: up to 8h
Community: up to 8 sessions
Coaching: up to 16h

Skills Unlocked

  • Architect scalable, secure data and ML pipelines.
  • Build production-grade Lakeflow Pipelines and structured streaming.
  • Deploy workflows using MLflow and Feature Store.
  • Apply advanced governance, lineage, and FinOps.
  • Develop reusable frameworks for cross-team adoption.
  • Mentor colleagues and influence delivery standards.

Core Course Structure

Block 1: Platform Architecture & Scaling

  • Architectural Scaling: Apply Medallion Architecture at scale.
  • Performance Optimization: Clusters, autoscaling, Z-Ordering, caching.
  • FinOps: Platform governance and cost control best practices.
1 Quiz • Instructor-Led Training • 1 Community Session

Block 2: Advanced Data Processing & Automation

  • Advanced ETL: Lakeflow pipelines with CDC and schema evolution.
  • Streaming: Structured streaming with checkpointing for real-time data.
  • Reusability: Parameterized notebooks and shared libraries.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 3: Machine Learning & Operationalization

  • Advanced MLflow: Model registry, feature engineering using Delta Lake.
  • Deployment: Deploy models to batch or real-time endpoints.
  • Collaboration: Secure data access through Unity Catalog.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 4: Governance, Mentoring & Capstone

  • Governance: Unity Catalog audit APIs and secure Delta Sharing.
  • Leadership: Mentor junior colleagues and lead retrospectives.
  • Capstone: Build E2E ETL and ML pipelines for cross-team projects.
1 Quiz • 1 Milestone Recap • 1 Community Session • Role Dev & Mentoring

5. Lead

12 Weeks • Up to 78h

Starting Point

Recognized experts with a track record of successful Databricks delivery. Ready to formalize internal coaching and scale enablement organization-wide.

Hours Breakdown

Self Study: up to 30h
Instructor-Led: up to 16h
Community: up to 8 sessions
Coaching: up to 24h

Skills Unlocked

  • Lead internal enablement as a trainer/coach.
  • Architect, review, and optimize large-scale solutions.
  • Implement governance, CI/CD, and observability.
  • Build reusable components and frameworks.
  • Drive standards in complex, multi-project environments.

Core Course Structure

Block 1: Architectural Excellence

  • Reference Architectures: Define and maintain scalable solutions.
  • Review & Optimization: Audit Medallion Architecture, Lakeflow, and ML workflows.
  • Best Practices: Ensure security and cost-efficiency across pipelines.
1 Quiz • Instructor-Led Training • 1 Community Session

Block 2: Governance, Observability & Technical Debt

  • Observability: Manage audit logs, lineage, and cost dashboards.
  • Compliance: Enforce CI/CD and Unity Catalog standards.
  • Technical Debt: Identify, prioritize, and remediate across projects.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 3: Reusable Components & Frameworks

  • Standardization: Modular notebooks and orchestration templates.
  • Shared Libraries: Establish reusable patterns for ETL, ML, and analytics.
  • Consistency: Promote maintainability across organization-wide projects.
2 Quizzes • 1 Skill Lab • 1 Community Session

Block 4: Leadership, Coaching & Multiplication

  • Architecture Reviews: Lead reviews and design validations.
  • Mentorship: Mentor and train teams to build internal capabilities.
  • Multiplication: Establish training structures and knowledge-sharing practices.
1 Quiz • 1 Milestone Recap • 1 Community Session • Execution Mentoring