How to Share Data Products from Snowflake to Databricks


Sharing data products from Snowflake to Databricks has become one of the most common and powerful data team workflows today. As organizations scale, teams inevitably find themselves needing to surface governed, production-grade data inside Databricks for analytics, AI, and machine learning. Whether it's refining ETL pipelines, building feature-rich models, or enabling real-time reporting, that seamless bridge between Snowflake’s warehouse and Databricks’ lakehouse is essential.

But let’s be honest: setting up that bridge can feel like assembling IKEA furniture without the manual. Between configuring JDBC or ODBC drivers, handling connector versions, juggling secret managers and IAM roles, and chasing dependencies across multiple clouds, those integrations can quickly spiral into a major DevOps headache.

⚡ Skip the Complexity: Use Amplify Data Instead

What if you could bypass all that glue code and cloud plumbing? That’s where Amplify Data’s platform truly shines. Built specifically to alleviate the friction of modern data product distribution, Amplify empowers teams to publish curated datasets directly from Snowflake and let downstream systems (like Databricks) self-connect, without manual credential hand-offs or SDKs.

With Amplify, you can:

  • Publish secure, governed datasets from Snowflake
  • Let recipients choose their destination (Snowflake share, API, SFTP, Databricks, etc.)
  • Empower recipients to onboard autonomously, no technical admin needed
  • Monitor usage, manage retention, and support multi-cloud destinations

All of this happens without writing custom connectors or worrying about IAM and secrets flow, making it a powerful alternative to hand-rolled JDBC integrations.

If you’re interested in the nitty-gritty of how it works behind the scenes, keep reading: the detailed steps are below.

But if you just want to save time and focus on your data, check out Amplify Data and skip the complexity entirely.

🧭 Technical Guide to Sharing Data Products from Snowflake to Databricks

Option 1: Use Snowflake External Tables or Unload to Cloud Storage

The simplest and often the most robust way to share data:

  • Snowflake writes the data into cloud storage (S3, ADLS, or GCS).
  • Databricks reads it from there.

✅ Works well for batch and large datasets.

✅ Decouples compute.

🚫 Not real-time.

Steps:

  1. In Snowflake, unload the table to cloud storage as Parquet:

    -- Assumes a stage or storage integration with write access to the bucket
    COPY INTO 's3://your-bucket/path/'
    FROM your_snowflake_table
    FILE_FORMAT = (TYPE = PARQUET);

  2. Alternatively, define an external table in Snowflake that already points to S3.

  3. In Databricks, read the unloaded files with Spark:

    # Assumes the cluster can authenticate to the bucket (e.g. via an instance profile)
    df = spark.read.parquet("s3://your-bucket/path/")
    df.display()

You can also use the Delta Lake format if desired:

  • Write from Snowflake as Parquet, then convert to Delta in Databricks, as sketched below.
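
As a minimal sketch of that conversion (the paths are illustrative and assume the cluster can read and write both locations):

    # Read the Parquet files unloaded by Snowflake and rewrite them as a Delta table.
    parquet_df = spark.read.parquet("s3://your-bucket/path/")

    (
       parquet_df.write.format("delta")
       .mode("overwrite")
       .save("s3://your-bucket/delta/your_table/")
    )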

Option 2: Databricks Snowflake Connector

Databricks offers a Snowflake Spark Connector, which allows Databricks to read/write directly to Snowflake via JDBC with pushdown.

✅ Good for interactive and real-time queries.

✅ No intermediate files needed.

🚫 Can incur Snowflake compute costs for each query.

Steps:

  1. Install the Snowflake Spark Connector in your Databricks cluster.
    • Include the Maven coordinates (e.g., net.snowflake:spark-snowflake_2.12:2.11.0-spark_3.1 — version depends on your Spark version).
    • Include the Snowflake JDBC driver as well.
  2. In Databricks (PySpark example):

    # Connection options for the Snowflake Spark connector.
    # For production, pull the credentials from a secret scope rather than hard-coding them.
    sfOptions = {
       "sfURL": "<account>.snowflakecomputing.com",
       "sfDatabase": "<database>",
       "sfSchema": "<schema>",
       "sfWarehouse": "<warehouse>",
       "sfRole": "<role>",
       "sfUser": "<user>",
       "sfPassword": "<password>"
    }

    # Read a Snowflake table into a Spark DataFrame via the connector.
    snowflake_df = (
       spark.read.format("snowflake")
       .options(**sfOptions)
       .option("dbtable", "your_table")
       .load()
    )

    snowflake_df.display()

You can also write data back to Snowflake the same way.
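
As a rough sketch of the write path (reusing the sfOptions dictionary above; the target table name is illustrative):

    # Write a DataFrame back to Snowflake; mode("overwrite") replaces the target table.
    (
       snowflake_df.write.format("snowflake")
       .options(**sfOptions)
       .option("dbtable", "your_target_table")
       .mode("overwrite")
       .save()
    )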

Option 3: Use Snowflake’s Secure Data Sharing

If you want to share a managed data product in Snowflake Marketplace or to another account:

  • Publish the data in Snowflake using Secure Data Sharing.
  • The recipient (even on a different account) can query it inside Snowflake.

However, if you still want that data to end up in Databricks:

  • The recipient connects Databricks to their Snowflake account as described in Option 2 and queries the shared dataset (see the sketch below).
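
For illustration, assuming the recipient has already created a database from the inbound share (the SHARED_DB and shared_table names below are placeholders), the Databricks side looks just like Option 2:

    # Query a dataset received through Secure Data Sharing, using the same connector setup.
    shared_df = (
       spark.read.format("snowflake")
       .options(**sfOptions)                  # connection options from Option 2
       .option("sfDatabase", "SHARED_DB")     # database the recipient created from the share
       .option("dbtable", "shared_table")
       .load()
    )

    shared_df.display()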

Notes

  • When comparing these three options, always consider cost: Snowflake bills for compute (queries, unloads) and storage, while Databricks bills for cluster compute.
  • Set up appropriate IAM roles / service accounts for access to S3/ADLS/GCS when using cloud storage.
  • For production, secure the connection (SSL, key rotation) and keep credentials out of notebook code, for example by using a secret manager (see the sketch below).
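
One hedged way to do that on Databricks is a secret scope; the scope and key names below are placeholders you would create yourself:

    # Pull Snowflake credentials from a Databricks secret scope instead of hard-coding them.
    # The "snowflake" scope and its keys are placeholders; create them via the Databricks CLI or API.
    sfOptions = {
       "sfURL": "<account>.snowflakecomputing.com",
       "sfDatabase": "<database>",
       "sfSchema": "<schema>",
       "sfWarehouse": "<warehouse>",
       "sfUser": dbutils.secrets.get(scope="snowflake", key="user"),
       "sfPassword": dbutils.secrets.get(scope="snowflake", key="password"),
    }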

To recap:

🛠️ Manual Snowflake → Databricks Integration

Snowflake and Databricks both offer solid connectivity via Spark connectors, JDBC/ODBC, external tables, Delta Sharing, etc., making this one of the most common patterns in modern data architectures. In the traditional manual integration:

  • Connector setup & configuration: You need to install and configure the Snowflake Spark connector (JDBC driver, secret management, version matching) in your Databricks cluster.
  • Pipeline engineering overhead: Requires careful orchestration when managing credentials, IAM roles, storage access (S3, ADLS, GCS), and possibly exporting intermediate files like Parquet or Delta.
  • Governance & cost complexity: Balancing Snowflake compute costs, Databricks Spark cluster tuning, and ensuring trusted data delivery involves cross-platform monitoring, scalability configurations, and access control layers.

🎯 In short: It's powerful, flexible, and production‑grade, but requires significant DevOps, engineering coordination, and ongoing maintenance.

⚡ Streamlined Sharing with Amplify Data

Amp up your workflow with Amplify Data (now part of Monda), a platform built for sharing governed data products across systems without all the plumbing:

  • No-code, secure publishing: Define, filter, and publish datasets directly from Snowflake (e.g. table shares) with automated provisioning and no manual connector installs or IAM juggling.
  • Self-service onboarding: Downstream teams pick their integration (in this case, Databricks) without needing Snowflake credentials or IT‑heavy setup; Amplify handles delivery via Snowflake share, API, SFTP, or specific Databricks pipelines.
  • Built-in visibility & governance: Amplify tracks usage analytics, monitors deliveries, manages retention, and reduces data egress costs—all within a unified enterprise-grade platform.

Amplify turns data sharing into a governed, tracked data marketplace offered to internal and external consumers.

🚀 Frees up engineering time for more valuable work, while your team focuses on data innovation, not connector maintenance.

Skip the technical steps and share data products from Snowflake to Databricks easily using Amplify (powered by Monda). Find out more or get a demo.
