Databricks-to-Databricks Delta Sharing: Secure sharing across Azure regions (2024)

Delta Sharing is an open protocol developed by Databricks for secure data sharing within an organization and externally, regardless of the computing platforms used. Depending on who you are sharing the data with, Delta Sharing can be used in two ways.

  • Databricks-to-Open (D2O) sharing, or simply open sharing, lets you share data with any user, regardless of whether they have access to Databricks.
  • Databricks-to-Databricks (D2D) sharing lets you share data with Databricks users who are using Unity Catalog (UC) with a metastore that is different from yours.

As there is usually a single UC metastore per cloud region for a given Databricks account, all workspaces in that particular cloud region will link to the same UC metastore and can access data seamlessly using the built-in governance capabilities.

In a multi-region or multi-cloud setup, data needs to be shared between regions or clouds, and there are three options for doing so. In this article, we focus on one of them: D2D Delta Sharing on Azure Databricks.

Delta sharing across cloud regions works as follows:

  1. A recipient using Azure Databricks in region A requests access to a dataset/table shared with them by the provider also using Azure Databricks in region B.
  2. UC verifies the request and returns pre-signed URLs to the recipient.
  3. The recipient then fetches the data “directly” from the Storage Account using these pre-signed URLs.
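The idea behind a pre-signed URL in steps 2 and 3 can be illustrated with a toy model. This is only a sketch: the signing key, host name, and query parameters below are hypothetical, and the HMAC scheme is a stand-in for illustration, not the token format Azure Storage or Unity Catalog actually uses.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret known only to the provider side (sharing server and storage).
SIGNING_KEY = b"provider-side-secret"


def sign_url(path: str, expires_at: int) -> str:
    """Sharing server: issue a short-lived URL for one data file."""
    message = f"{path}|{expires_at}".encode()
    signature = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "sig": signature})
    return f"https://provider.example{path}?{query}"


def storage_accepts(url: str, now: int) -> bool:
    """Storage side: accept only if the signature matches and has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    expires_at = int(params["expires"][0])
    message = f"{parsed.path}|{expires_at}".encode()
    expected = hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()
    return now <= expires_at and hmac.compare_digest(expected, params["sig"][0])


issued_at = int(time.time())
url = sign_url("/sales/part-0000.parquet", expires_at=issued_at + 900)
print(storage_accepts(url, now=issued_at))         # inside the validity window
print(storage_accepts(url, now=issued_at + 3600))  # after the URL has expired
```

The signature only authorizes the request. The recipient still has to reach the storage endpoint over the network, which is why the storage firewall matters for the rest of this article.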

As you might have already guessed, this works only if one of the following holds:

  1. The Azure storage account is publicly accessible. In other words, ADLS Gen2 does not have any firewall restrictions in place.
  2. The IP address or the CIDR range of the recipient(s) has been allowlisted on the provider’s storage account firewall.
  3. The communication between the recipient and the provider’s ADLS Gen2 is private, and the provider-side ADLS Gen2 firewall allows it.

We will delve deeper into case 3 above. In addition, note that for the scenario that we are interested in, D2D sharing, we do not really have any public IP addresses associated with Databricks workspaces to whitelist them on ADLS Gen2 firewalls.
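The reason case 2 does not help in the D2D scenario can be sketched with the standard library. Azure Storage IP network rules accept only public internet address ranges, so an address from a VNET's private range cannot be placed on the allowlist. The addresses and CIDR range below are hypothetical examples.

```python
import ipaddress


def can_use_ip_rule(address: str) -> bool:
    """True if the address is publicly routable and could appear in an IP allowlist."""
    return ipaddress.ip_address(address).is_global


def is_allowlisted(address: str, allowed_cidrs: list[str]) -> bool:
    """True if the address falls inside any allowlisted CIDR range."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical values: an allowlisted public range and two candidate clients.
allowed = ["20.50.1.0/24"]
public_client = "20.50.1.10"   # e.g. a NAT gateway or on-premises egress address
cluster_node = "10.139.64.4"   # private address inside a workspace VNET

print(can_use_ip_rule(public_client), is_allowlisted(public_client, allowed))
print(can_use_ip_rule(cluster_node), is_allowlisted(cluster_node, allowed))
```

A cluster node with a private address fails both checks, which leaves service endpoints or private endpoints as the ways to get through the firewall.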

Securely Accessing Storage Accounts

Because the Databricks data plane runs in VNETs, there are two ways to access storage accounts securely from a Databricks workspace.

Service Endpoints

The recipient-side Databricks data plane VNET subnets (public/host subnet) should be added to the provider-side ADLS Gen2 network configuration. This can be done using a global (cross-region) service endpoint. Until April 2023, service endpoints allowed secure storage account access only from VNETs in the same cloud region; cross-region service endpoints for Azure Storage became generally available in April 2023. See the Azure Storage documentation for details.
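As a rough sketch of what this configuration amounts to, the snippet below builds the `networkAcls` fragment of a storage account definition with one virtual network rule per recipient subnet. The subscription, resource group, VNET, and subnet names are hypothetical. The recipient subnet also needs the storage service endpoint enabled; Azure documents the cross-region variant under the name `Microsoft.Storage.Global`. Verify both the name and the payload shape against the current ARM reference before use.

```python
import json


def subnet_resource_id(subscription: str, resource_group: str, vnet: str, subnet: str) -> str:
    """Build the ARM resource ID of a subnet."""
    return (
        f"/subscriptions/{subscription}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Network/virtualNetworks/{vnet}/subnets/{subnet}"
    )


def storage_network_acls(subnet_ids: list[str]) -> dict:
    """Deny by default and allow only the listed subnets."""
    return {
        "defaultAction": "Deny",
        "virtualNetworkRules": [{"id": sid, "action": "Allow"} for sid in subnet_ids],
    }


# Hypothetical recipient-side names, for illustration only.
host_subnet = subnet_resource_id(
    "00000000-0000-0000-0000-000000000000", "rg-recipient", "vnet-databricks", "public-subnet"
)
acls = storage_network_acls([host_subnet])
print(json.dumps(acls, indent=2))
```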


Private Endpoints

Azure Private Link is the most secure way to access Azure data services from Azure Databricks. Both Service Endpoints and Private Endpoints route the traffic between your virtual network and the storage account over the Microsoft network backbone. With a Service Endpoint, however, the storage account is still reached at a publicly routable IP address, whereas a Private Endpoint is a private IP in the address space of the virtual network where the Private Endpoint is configured.

Cross-Region Secure Data Access using a private endpoint

Note: This applies to a Databricks workspace created with secure cluster connectivity (SCC) and VNet injection.

The private endpoint setup for allowing access to a firewall-enabled storage account across cloud regions is as follows:


  1. Create a sample Catalog, Schema, and Table that has data stored in a Firewall enabled ADLS Gen2 storage account.
  2. Create a UC Delta Sharing "SHARE" in one Databricks workspace (provider) and share it with the metastore (recipient) in the other region.

https://learn.microsoft.com/en-us/azure/databricks/data-sharing/share-data-databricks

Note: Since the storage account firewall is on, the recipient fails to access the share (the Databricks workspace tries to fetch the files directly from the storage account).
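Steps 1 and 2 can be sketched as the SQL statements a notebook would run on each side. The share, table, recipient, provider, and catalog names are hypothetical, and the sharing identifier is a placeholder that must be obtained from the recipient metastore. The helper interpolates names without escaping, so it suits trusted input only.

```python
def provider_statements(share: str, table: str, recipient: str, sharing_identifier: str) -> list[str]:
    """SQL run in the provider workspace to create and grant a share."""
    return [
        f"CREATE SHARE IF NOT EXISTS {share}",
        f"ALTER SHARE {share} ADD TABLE {table}",
        f"CREATE RECIPIENT IF NOT EXISTS {recipient} USING ID '{sharing_identifier}'",
        f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}",
    ]


def recipient_statements(catalog: str, provider: str, share: str) -> list[str]:
    """SQL run in the recipient workspace to mount the share as a catalog."""
    return [f"CREATE CATALOG IF NOT EXISTS {catalog} USING SHARE {provider}.{share}"]


# Hypothetical names; the identifier placeholder is left unfilled on purpose.
for statement in provider_statements(
    "sales_share", "demo_catalog.demo_schema.sales", "region_a_recipient",
    "azure:<region>:<metastore-uuid>",
):
    print(statement)
for statement in recipient_statements("shared_sales", "region_b_provider", "sales_share"):
    print(statement)
```

In a Databricks notebook, each string would typically be passed to `spark.sql(...)`, where `spark` is the session object the notebook provides.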

  3. Create a private endpoint for the (provider) storage account in the (recipient) Databricks workspace VNET in the other region.

The following configuration should be used for the setup:

  • Region: The region of the recipient Databricks workspace
  • Virtual Network: The VNET where the recipient Databricks workspace data plane is deployed
  • Subnet: One of the subnets in the recipient Databricks workspace VNET
  • Target sub-resource: dfs
  • Integrate with private DNS zone: Yes
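The settings above map roughly onto a private endpoint resource definition. The sketch below builds such a payload as a plain dictionary. The endpoint name, region, and resource IDs are hypothetical, and the shape follows the ARM `Microsoft.Network/privateEndpoints` schema as we understand it, so check it against the current reference before deploying.

```python
import json

# Private DNS zone that Azure uses for the dfs sub-resource of a storage account.
PRIVATE_DNS_ZONE = "privatelink.dfs.core.windows.net"


def private_endpoint_definition(name: str, location: str, subnet_id: str, storage_account_id: str) -> dict:
    """Describe a private endpoint in the recipient VNET that targets the provider storage account."""
    return {
        "name": name,
        "location": location,  # region of the recipient workspace
        "properties": {
            "subnet": {"id": subnet_id},  # a subnet in the recipient VNET
            "privateLinkServiceConnections": [
                {
                    "name": f"{name}-dfs",
                    "properties": {
                        "privateLinkServiceId": storage_account_id,
                        "groupIds": ["dfs"],  # target sub-resource
                    },
                }
            ],
        },
    }


# Hypothetical identifiers, for illustration only.
definition = private_endpoint_definition(
    name="pe-provider-adls",
    location="westeurope",
    subnet_id="/subscriptions/<sub>/resourceGroups/rg-recipient/providers/Microsoft.Network/virtualNetworks/vnet-databricks/subnets/pe-subnet",
    storage_account_id="/subscriptions/<sub>/resourceGroups/rg-provider/providers/Microsoft.Storage/storageAccounts/provideradls",
)
print(json.dumps(definition, indent=2))
```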

  4. From the recipient workspace, test access to the data shared by the provider.
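One quick check before querying the shared table is to confirm that the provider storage host name resolves to a private address from inside the recipient VNET. The sketch below assumes a hypothetical storage account name. The resolver is passed in as a parameter so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable


def storage_host(account: str) -> str:
    """Host name of the ADLS Gen2 (dfs) endpoint of a storage account."""
    return f"{account}.dfs.core.windows.net"


def resolves_privately(host: str, resolve: Callable[[str], str] = socket.gethostbyname) -> bool:
    """True if the host resolves to a private IP, as expected behind a private endpoint."""
    return ipaddress.ip_address(resolve(host)).is_private


# Offline demonstration with stub resolvers standing in for real DNS answers.
host = storage_host("provideradls")  # hypothetical account name
print(resolves_privately(host, resolve=lambda _: "10.139.70.5"))  # private endpoint in place
print(resolves_privately(host, resolve=lambda _: "20.50.1.10"))   # still the public endpoint
```

On a cluster in the recipient workspace, calling `resolves_privately(storage_host("<account>"))` with the default resolver performs the real lookup. If it returns a private address, a `SELECT` against the shared table should then succeed.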


Cross-Region Data Access from more than one Databricks workspace

If more than one Databricks workspace in isolated VNETs in the recipient region needs to access the same storage account, we can either create a separate private endpoint for each VNET or peer the VNETs and use a single private endpoint. With peering, we also need to add a virtual network link for each VNET to the private DNS zone that was created during the setup of the private endpoint.
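The single-endpoint option can be sketched as a small planning helper. It assumes a hub-style layout in which every other VNET peers directly with the VNET that hosts the private endpoint, since VNET peering is not transitive. The VNET names are hypothetical.

```python
def shared_endpoint_plan(endpoint_vnet: str, other_vnets: list[str]) -> dict:
    """List the resources needed when several VNETs share one private endpoint."""
    return {
        # One private endpoint, placed in the chosen VNET.
        "private_endpoints": [endpoint_vnet],
        # Each remaining VNET peers directly with the VNET hosting the endpoint.
        "peerings": [(endpoint_vnet, vnet) for vnet in other_vnets],
        # Every VNET, including the host, gets a link to the private DNS zone.
        "dns_zone_links": [endpoint_vnet, *other_vnets],
    }


plan = shared_endpoint_plan("vnet-workspace-1", ["vnet-workspace-2", "vnet-workspace-3"])
for resource, items in plan.items():
    print(resource, items)
```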


Getting started with Delta Sharing across cloud regions

We discussed the network configuration options available to access the data stored in the ADLS Gen2 storage account using Delta Sharing. Depending on your security requirements and budget constraints, you can either use Service Endpoints, which have no additional charges and are easier to set up, or use Private Endpoints which incur additional costs but are more secure.

In the next blog in this series, we will dive into cross-region D2D Delta Sharing on AWS as well as cross-cloud data sharing between Databricks workspaces across multiple cloud providers.

Article information

Author: Stevie Stamm
