Snowflake Connector Guide for DataXPipe
Configure Snowflake connections in the DataXPipe Catalog, register credentials securely, and run transform checks against Snowflake warehouses.
- snowflake
- connectors
- integrations
Snowflake is a common source and target in DataXPipe pipelines. The Catalog’s Snowflake connector executes check SQL and supports transform metadata registration against Snowflake accounts. This guide covers connection setup, authentication patterns, and production tuning.
Prerequisites
Install the optional Snowflake client in your API environment:
pip install snowflake-connector-python
The connector raises a helpful error if the library is missing at runtime. Generated check scripts inherit the same dependency when executed on Airflow workers.
Registering a Snowflake connection
POST connection metadata to the Catalog:
curl -X POST https://api.dataxpipe.com/api/v1/connections/ `
-H "X-API-KEY: dxp_your_key" `
-H "Content-Type: application/json" `
-d '{
"id": "sf-prod",
"type": "snowflake",
"config": {
"account": "xyz123.us-east-1",
"user": "DATAXPIPE_SVC",
"password_secret": "vault/snowflake/dataxpipe-password",
"warehouse": "COMPUTE_WH",
"database": "ANALYTICS",
"schema": "PUBLIC",
"role": "DATAXPIPE_ROLE"
}
}'
Use _secret suffix keys to reference Vault or env-backed secrets instead of plaintext passwords.
Authentication options
| Method | Config | Notes |
|---|---|---|
| Password | user, password or password_secret | Simple; rotate regularly |
| Key pair | private_key_secret | Preferred for service accounts |
| OAuth | External token provider | Enterprise SSO integrations |
For service accounts, create a dedicated Snowflake role with minimal grants:
CREATE ROLE DATAXPIPE_ROLE;
GRANT USAGE ON WAREHOUSE COMPUTE_WH TO ROLE DATAXPIPE_ROLE;
GRANT USAGE ON DATABASE ANALYSIS TO ROLE DATAXPIPE_ROLE;
GRANT SELECT ON ALL TABLES IN SCHEMA ANALYSIS.PUBLIC TO ROLE DATAXPIPE_ROLE;
Grant only what checks and transforms require. Avoid ACCOUNTADMIN for pipeline service accounts.
Referencing Snowflake in pipeline specs
Sources and targets use connection_ref to point at registered connections:
sources:
- id: raw_orders
type: snowflake
connection_ref: sf-prod
object: RAW.ECOMMERCE.ORDERS
load_mode: incremental
watermark_column: UPDATED_AT
targets:
- id: clean_orders
type: snowflake
connection_ref: sf-prod
object: ANALYTICS.ORDERS.CLEAN_ORDERS
write_mode: merge
primary_key: [ORDER_ID]
The object field uses Snowflake’s three-part naming: DATABASE.SCHEMA.TABLE.
Executing checks against Snowflake
Use the execute endpoint to run check SQL:
curl -X POST https://api.dataxpipe.com/api/v1/checks/execute `
-H "X-API-KEY: dxp_your_key" `
-H "Content-Type: application/json" `
-d '{
"connection_id": "sf-prod",
"sql": "SELECT COUNT(*) AS violations FROM ANALYTICS.ORDERS.CLEAN_ORDERS WHERE ORDER_ID IS NULL",
"check_id": "chk_not_null_order_id",
"run_id": "run-abc123",
"failure_sample_sql": "SELECT * FROM ANALYTICS.ORDERS.CLEAN_ORDERS WHERE ORDER_ID IS NULL LIMIT 5"
}'
The failure_sample_sql parameter captures example violating rows for debugging—critical for Snowflake tables too large to scan manually.
Performance tuning
Snowflake bills by compute time. Keep checks efficient:
- Use appropriate warehouse size — X-Small for metadata checks; scale up only for heavy aggregations
- Filter to recent partitions —
WHERE order_date >= CURRENT_DATE - 7on large fact tables - Set statement timeout —
DATAXPIPE_TIMEOUT_SECONDS=60prevents runaway queries - Auto-suspend warehouse — Configure Snowflake warehouse auto-suspend to 60 seconds
Configure retry behavior for transient errors via DATAXPIPE_RETRY_ATTEMPTS and DATAXPIPE_BACKOFF_BASE.
Common errors
| Error | Cause | Resolution |
|---|---|---|
250001 auth failure | Wrong account identifier | Use full account locator with region |
002043 object not found | Schema/database mismatch | Verify object three-part name |
| Warehouse suspended | Cold start delay | Pre-warm or increase auto-suspend threshold |
| Timeout | Full table scan in check | Add partition filter |
See Airflow integration guide for deploying Snowflake transform DAGs.