If you’re building modern data products, chances are you’ve already come across this scenario:
“I have all my data stored neatly in Google Cloud Storage, but my analytics and downstream consumers live in Snowflake. How do I get it all in there?”
This is one of the most common use cases for Snowflake - and for good reason. GCS is a natural place to store raw or processed data products because it’s cheap, scalable, and tightly integrated with Google Cloud. Meanwhile, Snowflake is the go-to platform for analytics, BI, and sharing data with stakeholders.
So moving data from GCS to Snowflake is a critical part of many data pipelines.
But here’s the catch: it’s also a bit techy.
Snowflake’s native support for GCS is powerful, but it requires several steps: creating a storage integration, granting the right IAM permissions in Google Cloud, defining an external stage, and finally running COPY INTO commands.
For teams without a dedicated data engineer, or teams that just want to move fast, this process can feel unnecessarily complicated. Even experienced teams can find the setup error-prone, especially when working across cloud platforms.
If you’d rather skip all those manual steps and focus on delivering insights, you can use Amplify Data to load data products from GCS into Snowflake easily, without the headache.
Amplify abstracts away the messy details and schema mapping for you, letting you move data from GCS into Snowflake with just a few clicks (or API calls).
You still get all the flexibility and power of Snowflake - but none of the boilerplate setup.
If you’re ready to move your data products from GCS into Snowflake without wrestling with IAM policies, integrations, and SQL scripts, check out Amplify Data and get started in minutes.
But let’s look at both methods. Whether you want to understand the nuts and bolts of how loading data products from GCS into Snowflake works, or you just want to get your data flowing today using Amplify, this guide will walk you through both options.
Suppose your data products live in a GCS path like gs://my-bucket/data/products/.
You’ll need:
- A GCS bucket containing your data files (for example CSV or Parquet)
- A Snowflake account with a role that can create storage integrations and stages
- Permission in Google Cloud to grant bucket access to a service account
a. Create a storage integration (recommended)
This securely manages the GCS credentials:
```sql
CREATE STORAGE INTEGRATION gcs_integration
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = GCS
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://my-bucket/data/');
```
b. Get the STORAGE_GCP_SERVICE_ACCOUNT for your integration:
```sql
DESC STORAGE INTEGRATION gcs_integration;
```
This gives you the service account email that Snowflake uses to access GCS.
c. Grant this service account read access to the GCS bucket. In Google Cloud, assign it the Storage Object Viewer role on the bucket.
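The grant can be done in the Cloud Console or from the command line. A minimal sketch using the gcloud CLI, where the service account email is an assumed placeholder for the STORAGE_GCP_SERVICE_ACCOUNT value returned by the DESC command above:

```shell
# Grant the Snowflake service account read access to the bucket.
# Replace the member email with your integration's STORAGE_GCP_SERVICE_ACCOUNT.
gcloud storage buckets add-iam-policy-binding gs://my-bucket \
  --member="serviceAccount:abc123@gcp-sa-1.iam.gserviceaccount.com" \
  --role="roles/storage.objectViewer"
```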
Once the integration and permissions are set, define a stage:
```sql
CREATE STAGE gcs_stage
  URL = 'gcs://my-bucket/data/products/'
  STORAGE_INTEGRATION = gcs_integration;
```
You can now reference @gcs_stage in your COPY INTO commands.
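Before loading anything, it can be worth a quick sanity check that Snowflake can actually see your files through the stage:

```sql
-- Lists the files Snowflake can access at the stage location
LIST @gcs_stage;
```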
Prepare a Snowflake table that matches your data schema:
```sql
CREATE TABLE products (
  product_id STRING,
  name       STRING,
  price      FLOAT,
  created_at TIMESTAMP
);
```
Run the COPY INTO command to load the data:
```sql
COPY INTO products
FROM @gcs_stage
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"' SKIP_HEADER = 1);
```
For Parquet, add MATCH_BY_COLUMN_NAME so the Parquet columns map onto the table’s columns by name:
```sql
COPY INTO products
FROM @gcs_stage
FILE_FORMAT = (TYPE = 'PARQUET')
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```
You can validate the files without actually loading them using:
```sql
COPY INTO products
FROM @gcs_stage
FILE_FORMAT = (TYPE = 'CSV')
VALIDATION_MODE = RETURN_ALL_ERRORS;
```
After you’re done loading, you can clean up the stage and integration:
```sql
DROP STAGE gcs_stage;
DROP STORAGE INTEGRATION gcs_integration;
```
Alternatively, keep them (and define a named file format with CREATE FILE FORMAT) for reuse in future loads.
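A minimal sketch of a reusable named file format capturing the CSV settings used above (the name csv_format is an assumption, not a Snowflake default):

```sql
-- Define the CSV options once...
CREATE FILE FORMAT csv_format
  TYPE = 'CSV'
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

-- ...then reference it by name in future loads.
COPY INTO products
FROM @gcs_stage
FILE_FORMAT = (FORMAT_NAME = 'csv_format');
```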
Method 1: Manual setup
What you do: create a storage integration, grant the Snowflake service account access in Google Cloud, define a stage, create a target table, and run a COPY INTO statement to load your data into your Snowflake table.
Challenges: multiple steps spread across two platforms, IAM and SQL knowledge required, and plenty of room for configuration errors.
Method 2: Amplify Data
What you do: connect your GCS bucket and your Snowflake account, pick the data products to load, and let Amplify handle the rest.
Why it’s better: no storage integrations, IAM policies, or SQL scripts to maintain - schema mapping and loading are handled for you in a few clicks (or API calls).
Skip these technical steps and load data products from GCS into Snowflake easily using Amplify (powered by Monda). Find out more or get a demo.