Databricks Platform Architecture
The Databricks platform architecture consists of two main components: the Control Plane and the Data Plane (also known as the Compute Plane). Here's a breakdown of each component and what resides in the customer's cloud account:
Control Plane:
Purpose: The control plane hosts Databricks' backend services, including the web application, REST APIs, and account management.
Location: The control plane is managed by Databricks and runs within Databricks' cloud account.
Components: It includes services for workspace management, job scheduling, cluster management, and other administrative functions.
Data Plane (Compute Plane):
Purpose: The data plane is where data is processed. It runs the compute resources that execute notebook commands, jobs, and queries.
Location: The data plane can be deployed in two ways:
Serverless Compute Plane: Databricks compute resources run in a serverless compute layer within Databricks' cloud account.
Classic Compute Plane: Databricks compute resources run in the customer's cloud account (e.g., AWS, Azure, GCP). This setup provides natural isolation as it runs within the customer's own virtual network.
Components: It includes clusters and other compute resources that run notebook commands and jobs for data processing and analytics.
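The placement rules above can be summarized in a small lookup. The sketch below is illustrative only: the names `ComputeType`, `Account`, and `hosting_account` are hypothetical and are not part of any Databricks SDK or API. It only encodes which cloud account hosts each compute option as described in this post.

```python
from enum import Enum


class ComputeType(Enum):
    """The two compute plane deployment options described above."""
    SERVERLESS = "serverless"
    CLASSIC = "classic"


class Account(Enum):
    """Which cloud account a component runs in (hypothetical labels)."""
    DATABRICKS = "Databricks cloud account"
    CUSTOMER = "customer cloud account"


# The control plane always runs in Databricks' own cloud account.
CONTROL_PLANE_ACCOUNT = Account.DATABRICKS


def hosting_account(compute_type: ComputeType) -> Account:
    """Return the account that hosts compute of the given type."""
    if compute_type is ComputeType.SERVERLESS:
        # Serverless compute runs in Databricks' cloud account.
        return Account.DATABRICKS
    # Classic compute runs inside the customer's own cloud account.
    return Account.CUSTOMER


if __name__ == "__main__":
    for ct in ComputeType:
        print(f"{ct.value}: {hosting_account(ct).value}")
```

A lookup like this can be handy in documentation tooling or review checklists where the question is simply "whose account does this run in?".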
Customer's Cloud Account:
Workspace Storage: Each Databricks workspace has an associated storage bucket or account in the customer's cloud account. This storage contains:
Workspace System Data: Includes notebook revisions, job run details, command results, and Spark logs.
DBFS (Databricks File System): A distributed file system accessible within Databricks environments, used for storing and accessing data.
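As a small illustration of how DBFS paths are addressed, the sketch below converts a `dbfs:/` URI into the local-style path used by ordinary file APIs. It assumes the cluster exposes DBFS under the `/dbfs` mount, which may not be available on every compute type. The helper name `dbfs_uri_to_local_path` is hypothetical and not a Databricks function.

```python
from pathlib import PurePosixPath

DBFS_SCHEME = "dbfs:/"
# Assumption: DBFS is exposed to local file APIs under /dbfs on the cluster.
FUSE_MOUNT = PurePosixPath("/dbfs")


def dbfs_uri_to_local_path(uri: str) -> str:
    """Map a dbfs:/ URI to the corresponding path under the /dbfs mount."""
    if not uri.startswith(DBFS_SCHEME):
        raise ValueError(f"not a DBFS URI: {uri!r}")
    # Drop the scheme and any extra leading slashes (e.g. dbfs:///tmp).
    relative = uri[len(DBFS_SCHEME):].lstrip("/")
    return str(FUSE_MOUNT / relative)


if __name__ == "__main__":
    # "dbfs:/tmp/example.csv" is a made-up path used only for illustration.
    print(dbfs_uri_to_local_path("dbfs:/tmp/example.csv"))
```

The function only rewrites strings, so it runs anywhere; whether the resulting path is actually readable depends on the compute it is used on.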