Teradata has been the enterprise data warehousing standard for decades, powering mission-critical analytics at the world's largest organizations. But its proprietary hardware model, per-node licensing, and specialized SQL extensions create a total cost of ownership that increasingly conflicts with the economics of cloud-native data platforms. Snowflake's consumption-based pricing, elastic compute, automatic optimization, and standard SQL make it the most common migration target for Teradata workloads — but the migration itself requires deep understanding of Teradata-specific constructs and their Snowflake equivalents.
This guide provides a thorough technical mapping from Teradata BTEQ scripts, SQL extensions, stored procedures (SPL), macros, bulk loading utilities (FastLoad, MultiLoad, TPT), primary index strategies, statistics collection, and workload management to their Snowflake-native counterparts. Whether you are migrating hundreds of BTEQ scripts or thousands of stored procedures, this article gives your team the comprehensive reference framework needed to plan and execute the conversion.
Teradata Architecture vs. Snowflake Architecture
Teradata uses a shared-nothing MPP (Massively Parallel Processing) architecture where data is distributed across AMPs (Access Module Processors) based on the Primary Index (PI) hash. The PI selection is the single most critical performance decision in Teradata — a poor PI choice leads to data skew, spool space exhaustion, and degraded query performance. Every table must have a PI, and changing it requires a full table rebuild. The system runs on dedicated hardware nodes, and scaling requires adding physical nodes to the cluster.
Snowflake uses a multi-cluster shared data architecture with automatic micro-partitioning. There is no concept of a primary index or manual data distribution. Snowflake's query optimizer automatically prunes micro-partitions, selects join strategies, and manages data layout. Optional clustering keys can be defined for very large tables, but they are advisory hints rather than mandatory physical distribution schemes. Compute is provided by virtual warehouses that can be created, resized, suspended, and resumed in seconds.
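A minimal sketch of what that elasticity looks like in day-to-day operation (the warehouse name and sizes below are illustrative, not tied to any particular migration):
-- Create, resize, and suspend compute independently of the stored data
CREATE WAREHOUSE IF NOT EXISTS adhoc_wh
WAREHOUSE_SIZE = 'SMALL'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE;
ALTER WAREHOUSE adhoc_wh SET WAREHOUSE_SIZE = 'LARGE';  -- scale up for a heavy batch window
ALTER WAREHOUSE adhoc_wh SUSPEND;                        -- release compute (and cost) immediately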
| Teradata Concept | Snowflake Equivalent | Notes |
|---|---|---|
| BTEQ Script | SQL Worksheet / SnowSQL / Stored Procedure | Interactive and batch SQL execution |
| BTEQ .IF / .THEN / .GOTO | SQL Scripting IF/THEN/ELSE in procedures | Conditional logic in stored procedures |
| BTEQ .LOGON / .LOGOFF | SnowSQL connection / session | Managed via SnowSQL config or key-pair auth |
| BTEQ .EXPORT | COPY INTO @stage (unload) | Export query results to files in stages |
| Stored Procedure (SPL) | SQL Scripting / Snowpark procedure | Snowflake SQL scripting or Python/Java/Scala |
| Macro | SQL Stored Procedure | Parameterized SQL blocks |
| QUALIFY | QUALIFY | Snowflake supports QUALIFY natively — identical syntax |
| SAMPLE | SAMPLE / TABLESAMPLE | Snowflake supports both SAMPLE and TABLESAMPLE |
| SET Table (no duplicates) | SELECT DISTINCT / QUALIFY dedup | Snowflake tables allow duplicates by default; dedup explicitly |
| MULTISET Table | Default table behavior | Snowflake tables are MULTISET by default |
| NORMALIZE | SQL with window functions | Requires manual implementation with LAG/LEAD |
| FastLoad | COPY INTO (bulk load) | High-speed parallel loading from files |
| MultiLoad | MERGE / Snowpipe | Upsert and continuous loading |
| TPT (Teradata Parallel Transporter) | COPY INTO + Snowpipe + Tasks | Orchestrated parallel data movement |
| Primary Index (PI) | Automatic micro-partitioning | No manual distribution; optional clustering keys |
| Partitioned Primary Index (PPI) | Clustering key | Advisory optimization, not mandatory physical layout |
| COLLECT STATISTICS | Automatic (no action needed) | Snowflake maintains statistics automatically |
| Workload Management (TASM/TIWM) | Resource Monitors + Warehouse Sizing | Per-warehouse budgets and auto-scaling |
| EXPLAIN plan | EXPLAIN / Query Profile | Visual query profile in Snowsight |
| Volatile Table | Temporary Table | Session-scoped, automatically dropped (see example below the table) |
| Global Temporary Table | Transient Table | Teradata GTT contents are session-scoped; a transient table persists with reduced Time Travel and no Fail-safe |
| Teradata UDF | Snowflake UDF / Snowpark UDF | SQL, Python, Java, or Scala UDFs |
| Teradata Scheduling (DBCAL) | Snowflake Tasks | CRON-based scheduling with DAG dependencies |
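The temporary-table mappings above are worth a quick illustration. A minimal sketch with illustrative names showing how a Teradata volatile table typically converts; a global temporary table converts the same way, or to a transient table when the data must outlive the session:
-- TERADATA: session-scoped volatile table
CREATE VOLATILE TABLE session_work (
account_id BIGINT,
balance DECIMAL(18,2)
) ON COMMIT PRESERVE ROWS;
-- SNOWFLAKE: temporary table, dropped automatically at session end
CREATE TEMPORARY TABLE session_work (
account_id BIGINT,
balance DECIMAL(18,2)
);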
Teradata to Snowflake migration — automated end-to-end by MigryX
BTEQ Scripts to Snowflake SQL and Stored Procedures
BTEQ (Basic Teradata Query) is the command-line SQL interface for Teradata, used for both interactive queries and batch script execution. BTEQ scripts combine SQL statements with dot-commands (.IF, .THEN, .GOTO, .LOGON, .EXPORT, .SET) that provide control flow and session management. These dot-commands are not SQL — they are BTEQ interpreter directives that control execution flow based on return codes and activity counts.
Snowflake's equivalent for batch execution is SnowSQL (the CLI client) combined with SQL scripting in stored procedures for conditional logic. The key architectural difference is that BTEQ control flow (.IF/.GOTO) becomes structured programming (IF/THEN/ELSE) in Snowflake stored procedures, which is easier to maintain and debug.
BTEQ Script with .IF/.THEN to Snowflake Stored Procedure
-- TERADATA BTEQ SCRIPT:
.LOGON tdserver/etl_user,password123
.SET WIDTH 200
.SET ERROROUT STDOUT
DATABASE analytics_db;
-- Step 1: Verify source data availability
SELECT COUNT(*) FROM staging_db.daily_transactions
WHERE txn_date = DATE;
.IF ACTIVITYCOUNT = 0 THEN .GOTO NO_DATA
.IF ERRORCODE <> 0 THEN .GOTO ERROR_HANDLER
-- Step 2: Delete existing records for today
DELETE FROM analytics_db.fact_transactions
WHERE txn_date = DATE;
.IF ERRORCODE <> 0 THEN .GOTO ERROR_HANDLER
-- Step 3: Load new data with transformations
INSERT INTO analytics_db.fact_transactions
SELECT
t.txn_id,
t.txn_date,
t.account_id,
t.txn_type,
t.amount,
CASE WHEN t.amount > 10000 THEN 'HIGH'
WHEN t.amount > 1000 THEN 'MEDIUM'
ELSE 'LOW'
END AS risk_level,
a.branch_id,
a.customer_segment,
CURRENT_TIMESTAMP AS load_ts
FROM staging_db.daily_transactions t
JOIN analytics_db.dim_account a ON t.account_id = a.account_id
WHERE t.txn_date = DATE;
.IF ERRORCODE <> 0 THEN .GOTO ERROR_HANDLER
-- Step 4: Log success
.SET ACTCNT = ACTIVITYCOUNT
INSERT INTO etl_db.run_log VALUES
('fact_transactions', DATE, &ACTCNT, 'SUCCESS', CURRENT_TIMESTAMP);
.GOTO END_SCRIPT
.LABEL NO_DATA
INSERT INTO etl_db.run_log VALUES
('fact_transactions', DATE, 0, 'NO_DATA', CURRENT_TIMESTAMP);
.QUIT 4
.LABEL ERROR_HANDLER
INSERT INTO etl_db.run_log VALUES
('fact_transactions', DATE, 0, 'ERROR', CURRENT_TIMESTAMP);
.QUIT 12
.LABEL END_SCRIPT
.LOGOFF
.QUIT 0
-- SNOWFLAKE EQUIVALENT: Stored procedure replaces BTEQ control flow
CREATE OR REPLACE PROCEDURE analytics.etl.load_fact_transactions(
p_txn_date DATE DEFAULT CURRENT_DATE()
)
RETURNS VARIANT
LANGUAGE SQL
AS
$$
DECLARE
v_source_count INTEGER;
v_loaded_count INTEGER;
v_yesterday_count INTEGER;
v_variance_pct FLOAT;
BEGIN
-- Step 1: Verify source data availability (replaces .IF ACTIVITYCOUNT)
SELECT COUNT(*) INTO :v_source_count
FROM staging.daily_transactions
WHERE txn_date = :p_txn_date;
IF (v_source_count = 0) THEN
INSERT INTO etl.run_log (table_name, load_date, row_count, status, log_time)
VALUES ('fact_transactions', :p_txn_date, 0, 'NO_DATA', CURRENT_TIMESTAMP());
RETURN OBJECT_CONSTRUCT('status', 'NO_DATA', 'rows', 0);
END IF;
-- Step 2: Get yesterday count for variance checking
SELECT COUNT(*) INTO :v_yesterday_count
FROM analytics.fact_transactions
WHERE txn_date = DATEADD('day', -1, :p_txn_date);
-- Step 3: Delete and reload (replaces BTEQ DELETE + INSERT)
DELETE FROM analytics.fact_transactions WHERE txn_date = :p_txn_date;
INSERT INTO analytics.fact_transactions
SELECT
t.txn_id,
t.txn_date,
t.account_id,
t.txn_type,
t.amount,
CASE WHEN t.amount > 10000 THEN 'HIGH'
WHEN t.amount > 1000 THEN 'MEDIUM'
ELSE 'LOW'
END AS risk_level,
a.branch_id,
a.customer_segment,
CURRENT_TIMESTAMP() AS load_ts
FROM staging.daily_transactions t
JOIN analytics.dim_account a ON t.account_id = a.account_id
WHERE t.txn_date = :p_txn_date;
v_loaded_count := SQLROWCOUNT;
-- Step 4: Variance check (improvement over BTEQ)
IF (v_yesterday_count > 0) THEN
v_variance_pct := ABS(v_loaded_count - v_yesterday_count) * 100.0 / v_yesterday_count;
IF (v_variance_pct > 50) THEN
INSERT INTO etl.run_log (table_name, load_date, row_count, status, log_time)
VALUES ('fact_transactions', :p_txn_date, :v_loaded_count,
'WARNING: ' || :v_variance_pct::VARCHAR || '% variance',
CURRENT_TIMESTAMP());
END IF;
END IF;
-- Step 5: Log success
INSERT INTO etl.run_log (table_name, load_date, row_count, status, log_time)
VALUES ('fact_transactions', :p_txn_date, :v_loaded_count, 'SUCCESS',
CURRENT_TIMESTAMP());
RETURN OBJECT_CONSTRUCT(
'status', 'SUCCESS',
'rows_loaded', v_loaded_count,
'source_count', v_source_count,
'variance_pct', v_variance_pct
);
EXCEPTION
WHEN OTHER THEN
INSERT INTO etl.run_log (table_name, load_date, row_count, status, log_time)
VALUES ('fact_transactions', :p_txn_date, 0,
'ERROR: ' || :SQLERRM, CURRENT_TIMESTAMP());
RAISE;
END;
$$;
-- Schedule via Task (replaces cron-scheduled BTEQ script)
CREATE OR REPLACE TASK analytics.etl.daily_transactions_task
WAREHOUSE = etl_wh
SCHEDULE = 'USING CRON 30 5 * * * America/New_York'
AS
CALL analytics.etl.load_fact_transactions(CURRENT_DATE());
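Tasks are created in a suspended state, so the task above must be resumed before its schedule takes effect; it can also be run on demand, much as the BTEQ script could be invoked manually:
ALTER TASK analytics.etl.daily_transactions_task RESUME;
EXECUTE TASK analytics.etl.daily_transactions_task;  -- ad-hoc run for testing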
BTEQ .EXPORT to COPY INTO (Unload)
-- TERADATA BTEQ: Export query results to file
.EXPORT REPORT FILE=/data/exports/customer_report.csv
SELECT customer_id, customer_name, total_revenue
FROM gold.customer_summary
WHERE region = 'NORTH_AMERICA'
ORDER BY total_revenue DESC;
.EXPORT RESET
-- SNOWFLAKE: COPY INTO stage (unload)
COPY INTO @export_stage/customer_report/
FROM (
SELECT customer_id, customer_name, total_revenue
FROM gold.customer_summary
WHERE region = 'NORTH_AMERICA'
ORDER BY total_revenue DESC
)
FILE_FORMAT = (TYPE = 'CSV' COMPRESSION = 'GZIP')
HEADER = TRUE
OVERWRITE = TRUE
SINGLE = TRUE;
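Because COPY INTO unloads to a stage rather than to a local path, a GET command (run from SnowSQL) retrieves the files when a local copy is still required; the target directory below is illustrative:
GET @export_stage/customer_report/ file:///data/exports/;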
MigryX: Purpose-Built Parsers for Every Legacy Technology
MigryX does not rely on generic text matching or regex-based parsing. For every supported legacy technology, MigryX has built a dedicated Abstract Syntax Tree (AST) parser that understands the full grammar and semantics of that platform. This means MigryX captures not just what the code does, but why — understanding implicit behaviors, default settings, and platform-specific quirks that generic tools miss entirely.
Teradata SQL Extensions to Snowflake SQL
Teradata extends ANSI SQL with several proprietary constructs. Some have direct Snowflake equivalents; others require rewriting with standard SQL patterns. Understanding these differences is essential for automated transpilation.
QUALIFY: Identical Syntax in Snowflake
Teradata's QUALIFY clause filters the result of window functions without requiring a subquery. Snowflake supports QUALIFY natively with identical syntax — this is one of the smoothest migration paths for any Teradata SQL construct.
-- TERADATA: Get the latest order per customer using QUALIFY
SELECT customer_id, order_id, order_date, amount
FROM orders
QUALIFY ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY order_date DESC
) = 1;
-- SNOWFLAKE: Identical syntax — QUALIFY is supported natively
SELECT customer_id, order_id, order_date, amount
FROM orders
QUALIFY ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY order_date DESC
) = 1;
SET vs. MULTISET Tables: Default Dedup Behavior
Teradata distinguishes between SET tables (no duplicate rows allowed, enforced at insert time) and MULTISET tables (duplicates allowed). This distinction is a fundamental part of Teradata's data model. Snowflake tables are MULTISET by default — duplicates are always allowed. SET table behavior must be implemented explicitly using deduplication logic.
-- TERADATA: SET table rejects duplicate rows at insert time
CREATE SET TABLE analytics_db.dim_customer (
customer_id INTEGER,
customer_name VARCHAR(200),
segment VARCHAR(50),
region VARCHAR(50)
)
PRIMARY INDEX (customer_id);
-- SNOWFLAKE: Deduplication must be explicit
-- Option 1: Use QUALIFY to dedup on insert
INSERT INTO analytics.dim_customer
SELECT customer_id, customer_name, segment, region
FROM staging.raw_customers
QUALIFY ROW_NUMBER() OVER (
PARTITION BY customer_id, customer_name, segment, region
ORDER BY customer_id
) = 1;
-- Option 2: Use DISTINCT
INSERT INTO analytics.dim_customer
SELECT DISTINCT customer_id, customer_name, segment, region
FROM staging.raw_customers;
Teradata MERGE with QUALIFY
A common Teradata pattern combines MERGE with QUALIFY to deduplicate source data before merging into the target. This is a powerful pattern for slowly changing dimensions and incremental loads where the source may contain multiple versions of the same record.
-- TERADATA: MERGE with QUALIFY to deduplicate before upsert
MERGE INTO analytics_db.dim_customer AS t
USING (
SELECT
customer_id,
customer_name,
email,
phone,
segment,
region,
updated_timestamp
FROM staging_db.customer_feed
QUALIFY ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY updated_timestamp DESC
) = 1
) AS s
ON t.customer_id = s.customer_id
WHEN MATCHED AND (
t.customer_name <> s.customer_name
OR t.email <> s.email
OR t.segment <> s.segment
) THEN UPDATE SET
customer_name = s.customer_name,
email = s.email,
phone = s.phone,
segment = s.segment,
region = s.region,
updated_at = CURRENT_TIMESTAMP
WHEN NOT MATCHED THEN INSERT (
customer_id, customer_name, email, phone,
segment, region, created_at, updated_at
) VALUES (
s.customer_id, s.customer_name, s.email, s.phone,
s.segment, s.region, CURRENT_TIMESTAMP, CURRENT_TIMESTAMP
);
-- SNOWFLAKE: Identical MERGE with QUALIFY (supported natively)
MERGE INTO analytics.dim_customer AS t
USING (
SELECT
customer_id,
customer_name,
email,
phone,
segment,
region,
updated_timestamp
FROM staging.customer_feed
QUALIFY ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY updated_timestamp DESC
) = 1
) AS s
ON t.customer_id = s.customer_id
WHEN MATCHED AND (
t.customer_name != s.customer_name
OR t.email != s.email
OR t.segment != s.segment
) THEN UPDATE SET
customer_name = s.customer_name,
email = s.email,
phone = s.phone,
segment = s.segment,
region = s.region,
updated_at = CURRENT_TIMESTAMP()
WHEN NOT MATCHED THEN INSERT (
customer_id, customer_name, email, phone,
segment, region, created_at, updated_at
) VALUES (
s.customer_id, s.customer_name, s.email, s.phone,
s.segment, s.region, CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP()
);
The MERGE with QUALIFY pattern migrates almost identically between Teradata and Snowflake. The only cosmetic differences here are the inequality operator (<> vs. !=, both of which work on both platforms) and the CURRENT_TIMESTAMP call (parentheses are conventional in Snowflake but optional). This is one of the cleanest migration paths of any Teradata SQL pattern.
Teradata Date Arithmetic and FORMAT
Teradata uses integer date representations and proprietary date arithmetic syntax, including the FORMAT phrase for inline date formatting. Snowflake uses standard date functions.
-- TERADATA: Date arithmetic and formatting
SELECT
current_date - INTERVAL '30' DAY AS thirty_days_ago,
sale_date - DATE '2025-01-01' AS days_since_jan1,
ADD_MONTHS(hire_date, 6) AS review_date,
EXTRACT(MONTH FROM sale_date) AS sale_month,
sale_date (FORMAT 'YYYY-MM-DD') AS formatted_date,
CAST(sale_date AS CHAR(10)) AS date_string
FROM employees;
-- SNOWFLAKE: Equivalent date operations
SELECT
DATEADD('day', -30, CURRENT_DATE()) AS thirty_days_ago,
DATEDIFF('day', '2025-01-01'::DATE, sale_date) AS days_since_jan1,
DATEADD('month', 6, hire_date) AS review_date,
EXTRACT(MONTH FROM sale_date) AS sale_month,
TO_CHAR(sale_date, 'YYYY-MM-DD') AS formatted_date,
sale_date::VARCHAR AS date_string
FROM employees;
NORMALIZE: Merging Overlapping Intervals
Teradata's NORMALIZE statement merges overlapping or adjacent intervals (date ranges, timestamps). Snowflake does not have a NORMALIZE keyword, but the logic can be implemented with window functions. This is one of the more complex conversions in a Teradata migration.
-- TERADATA: Merge overlapping date ranges per employee
SELECT employee_id,
BEGIN(valid_period) AS range_start,
END(valid_period) AS range_end
FROM (
SELECT NORMALIZE employee_id, valid_period
FROM employee_assignments
) AS merged;
-- SNOWFLAKE: Equivalent using window functions
WITH ordered AS (
SELECT
employee_id,
range_start,
range_end,
MAX(range_end) OVER (
PARTITION BY employee_id
ORDER BY range_start
ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
) AS prev_max_end
FROM employee_assignments
),
grouped AS (
SELECT *,
SUM(CASE
WHEN prev_max_end IS NULL OR range_start > prev_max_end THEN 1
ELSE 0
END) OVER (
PARTITION BY employee_id
ORDER BY range_start
) AS grp
FROM ordered
)
SELECT
employee_id,
MIN(range_start) AS range_start,
MAX(range_end) AS range_end
FROM grouped
GROUP BY employee_id, grp
ORDER BY employee_id, range_start;
Teradata Stored Procedures (SPL) to Snowflake SQL Scripting
Teradata stored procedures use SPL (Stored Procedure Language), a procedural extension with DECLARE, SET, IF/THEN/ELSE, WHILE, FOR, CURSOR, and exception handling via DECLARE EXIT HANDLER. Snowflake SQL Scripting provides equivalent constructs with slightly different syntax. The most significant differences: Teradata's ACTIVITY_COUNT becomes SQLROWCOUNT, DECLARE EXIT HANDLER becomes an EXCEPTION block, and because Snowflake procedures do not support OUT/INOUT parameters, results are typically returned as a single VARIANT object instead.
-- TERADATA SPL: Incremental load with error handling
REPLACE PROCEDURE etl_db.load_fact_sales (
IN p_load_date DATE,
OUT p_rows_loaded INTEGER,
OUT p_status VARCHAR(100)
)
BEGIN
DECLARE v_count INTEGER;
DECLARE EXIT HANDLER FOR SQLEXCEPTION
BEGIN
SET p_status = 'FAILED: ' || SQLSTATE;
SET p_rows_loaded = 0;
INSERT INTO etl_db.error_log (proc_name, error_msg, log_time)
VALUES ('load_fact_sales', p_status, CURRENT_TIMESTAMP);
END;
-- Check source availability
SELECT COUNT(*) INTO v_count
FROM staging_db.raw_sales
WHERE sale_date = p_load_date;
IF v_count = 0 THEN
SET p_status = 'NO DATA for ' || CAST(p_load_date AS VARCHAR(10));
SET p_rows_loaded = 0;
ELSE
DELETE FROM analytics_db.fact_sales
WHERE sale_date = p_load_date;
INSERT INTO analytics_db.fact_sales
SELECT
s.sale_id, s.sale_date, s.customer_id,
s.product_id, s.store_id, s.quantity,
s.unit_price,
s.quantity * s.unit_price AS total_amount,
s.discount_pct,
s.quantity * s.unit_price * (1 - s.discount_pct / 100) AS net_amount,
CURRENT_TIMESTAMP AS load_timestamp
FROM staging_db.raw_sales s
WHERE s.sale_date = p_load_date;
SET p_rows_loaded = ACTIVITY_COUNT;
SET p_status = 'SUCCESS';
END IF;
INSERT INTO etl_db.audit_log
(proc_name, load_date, rows_loaded, status, log_time)
VALUES ('load_fact_sales', p_load_date, p_rows_loaded,
p_status, CURRENT_TIMESTAMP);
END;
-- SNOWFLAKE: Equivalent stored procedure
CREATE OR REPLACE PROCEDURE etl.load_fact_sales(p_load_date DATE)
RETURNS VARIANT
LANGUAGE SQL
AS
$$
DECLARE
v_count INTEGER;
v_rows_loaded INTEGER DEFAULT 0;
v_status VARCHAR DEFAULT 'INIT';
result VARIANT;
BEGIN
SELECT COUNT(*) INTO :v_count
FROM staging.raw_sales
WHERE sale_date = :p_load_date;
IF (v_count = 0) THEN
v_status := 'NO DATA for ' || :p_load_date::VARCHAR;
ELSE
DELETE FROM analytics.fact_sales
WHERE sale_date = :p_load_date;
INSERT INTO analytics.fact_sales
SELECT
s.sale_id, s.sale_date, s.customer_id,
s.product_id, s.store_id, s.quantity,
s.unit_price,
s.quantity * s.unit_price AS total_amount,
s.discount_pct,
s.quantity * s.unit_price * (1 - s.discount_pct / 100) AS net_amount,
CURRENT_TIMESTAMP() AS load_timestamp
FROM staging.raw_sales s
WHERE s.sale_date = :p_load_date;
v_rows_loaded := SQLROWCOUNT;
v_status := 'SUCCESS';
END IF;
INSERT INTO etl.audit_log
(proc_name, load_date, rows_loaded, status, log_time)
VALUES ('load_fact_sales', :p_load_date, :v_rows_loaded,
:v_status, CURRENT_TIMESTAMP());
result := OBJECT_CONSTRUCT(
'status', v_status,
'rows_loaded', v_rows_loaded
);
RETURN result;
EXCEPTION
WHEN OTHER THEN
INSERT INTO etl.error_log (proc_name, error_message, error_time)
VALUES ('load_fact_sales', :SQLERRM, CURRENT_TIMESTAMP());
RAISE;
END;
$$;
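Because Snowflake procedures have no OUT parameters, callers read the returned VARIANT where Teradata callers would read p_rows_loaded and p_status. One way to do that, with an illustrative load date:
CALL etl.load_fact_sales('2025-06-01'::DATE);
SELECT load_fact_sales:status::VARCHAR AS status,
load_fact_sales:rows_loaded::INTEGER AS rows_loaded
FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));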
Teradata Macros to Snowflake Stored Procedures
Teradata macros are parameterized SQL blocks that execute as a single transaction. They are simpler than stored procedures but widely used for routine operations. In Snowflake, macros convert directly to stored procedures.
-- TERADATA MACRO: Parameterized refresh
REPLACE MACRO etl_db.refresh_customer_segment (
p_segment VARCHAR(50)
) AS (
DELETE FROM analytics_db.segment_summary
WHERE segment = :p_segment;
INSERT INTO analytics_db.segment_summary
SELECT
:p_segment AS segment,
COUNT(*) AS customer_count,
SUM(lifetime_value) AS total_ltv,
AVG(lifetime_value) AS avg_ltv,
CURRENT_TIMESTAMP AS refreshed_at
FROM analytics_db.dim_customer
WHERE segment = :p_segment;
);
-- Execute: EXEC etl_db.refresh_customer_segment('PREMIUM');
-- SNOWFLAKE: Stored procedure equivalent
CREATE OR REPLACE PROCEDURE etl.refresh_customer_segment(
p_segment VARCHAR
)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
BEGIN
DELETE FROM analytics.segment_summary
WHERE segment = :p_segment;
INSERT INTO analytics.segment_summary
SELECT
:p_segment AS segment,
COUNT(*) AS customer_count,
SUM(lifetime_value) AS total_ltv,
AVG(lifetime_value) AS avg_ltv,
CURRENT_TIMESTAMP() AS refreshed_at
FROM analytics.dim_customer
WHERE segment = :p_segment;
RETURN 'Refreshed segment: ' || :p_segment;
END;
$$;
-- Execute: CALL etl.refresh_customer_segment('PREMIUM');
FastLoad, MultiLoad, and TPT to Snowflake Bulk Loading
Teradata provides specialized bulk loading utilities — FastLoad for high-speed empty-table loads, MultiLoad for upserts and deletes against populated tables, and TPT (Teradata Parallel Transporter) as the unified framework. Each has its own scripting syntax, connection model, and error handling. Snowflake replaces all of these with COPY INTO for batch loading, MERGE for upserts, and Snowpipe for continuous ingestion.
FastLoad to COPY INTO
-- TERADATA FASTLOAD SCRIPT:
.LOGTABLE etl_db.fl_log_sales;
.LOGON tdserver/loader_user,password;
.SET RECORD VARTEXT "|";
DEFINE
sale_id (VARCHAR(20)),
sale_date (VARCHAR(10)),
customer_id (VARCHAR(20)),
amount (VARCHAR(20))
FILE=/data/feeds/daily_sales.csv;
BEGIN LOADING analytics_db.fact_sales_staging
ERRORFILES etl_db.fl_err1, etl_db.fl_err2;
INSERT INTO analytics_db.fact_sales_staging (
sale_id, sale_date, customer_id, amount
) VALUES (
:sale_id, :sale_date (DATE, FORMAT 'YYYY-MM-DD'),
:customer_id, :amount (DECIMAL(18,2))
);
END LOADING;
.LOGOFF;
-- SNOWFLAKE: COPY INTO replaces entire FastLoad script
CREATE OR REPLACE FILE FORMAT etl.pipe_delimited_csv
TYPE = 'CSV'
FIELD_DELIMITER = '|'
SKIP_HEADER = 1
FIELD_OPTIONALLY_ENCLOSED_BY = '"'
NULL_IF = ('NULL', 'null', '\\N', '')
DATE_FORMAT = 'YYYY-MM-DD'
TRIM_SPACE = TRUE;
CREATE OR REPLACE STAGE etl.sales_feed_stage
URL = 's3://data-feeds/daily_sales/'
STORAGE_INTEGRATION = s3_feed_integration
FILE_FORMAT = etl.pipe_delimited_csv;
COPY INTO analytics.fact_sales_staging (
sale_id, sale_date, customer_id, amount
)
FROM @etl.sales_feed_stage
FILE_FORMAT = etl.pipe_delimited_csv
ON_ERROR = 'CONTINUE'
PURGE = TRUE;
-- Check for rejected rows (replaces FastLoad error tables)
SELECT * FROM TABLE(VALIDATE(analytics.fact_sales_staging, JOB_ID => '_last'));
MultiLoad to MERGE and Snowpipe
-- TERADATA MULTILOAD: Upsert + delete in single utility pass
-- Applies DML operations based on change flags
-- SNOWFLAKE: MERGE replaces MultiLoad upsert
MERGE INTO analytics.dim_customer t
USING staging.customer_updates s
ON t.customer_id = s.customer_id
WHEN MATCHED AND s.operation = 'U' THEN UPDATE SET
t.customer_name = s.customer_name,
t.email = s.email,
t.segment = s.segment,
t.updated_at = CURRENT_TIMESTAMP()
WHEN MATCHED AND s.operation = 'D' THEN DELETE
WHEN NOT MATCHED AND s.operation IN ('I', 'U') THEN INSERT
(customer_id, customer_name, email, segment, created_at, updated_at)
VALUES
(s.customer_id, s.customer_name, s.email, s.segment,
CURRENT_TIMESTAMP(), CURRENT_TIMESTAMP());
-- Snowpipe for continuous loading (replaces scheduled MultiLoad jobs)
CREATE OR REPLACE PIPE staging.customer_pipe
AUTO_INGEST = TRUE
AS
COPY INTO staging.customer_updates
FROM @etl.customer_feed_stage
FILE_FORMAT = etl.pipe_delimited_csv;
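Where MultiLoad jobs were monitored through their log and error tables, pipe activity is observable through built-in functions; a minimal monitoring sketch against the objects defined above:
-- Pipe health (replaces inspecting MultiLoad log tables)
SELECT SYSTEM$PIPE_STATUS('staging.customer_pipe');
-- Per-file load results over the last 24 hours
SELECT file_name, status, row_count, first_error_message
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
TABLE_NAME => 'staging.customer_updates',
START_TIME => DATEADD('hour', -24, CURRENT_TIMESTAMP())));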
Primary Index and PPI to Snowflake Automatic Micro-Partitioning
In Teradata, the Primary Index (PI) determines data distribution across AMPs. A Unique Primary Index (UPI) guarantees even distribution but requires uniqueness. A Non-Unique Primary Index (NUPI) allows duplicates but risks data skew. Partitioned Primary Index (PPI) adds row-level partitioning within the PI distribution. These are mandatory physical design decisions that fundamentally affect query performance and often require DBA involvement for every new table.
In Snowflake, data is automatically distributed across micro-partitions. Optional clustering keys provide ordering hints but never restrict data access patterns. There is no UPI/NUPI decision, no skew analysis, and no spool space exhaustion from bad PI choices.
-- TERADATA: Table with PI and PPI
CREATE TABLE analytics_db.fact_sales (
sale_id BIGINT NOT NULL,
sale_date DATE NOT NULL,
customer_id BIGINT,
store_id INTEGER,
product_id INTEGER,
quantity INTEGER,
amount DECIMAL(18,2)
)
PRIMARY INDEX (customer_id)
PARTITION BY RANGE_N(sale_date BETWEEN DATE '2020-01-01'
AND DATE '2030-12-31' EACH INTERVAL '1' MONTH);
-- Statistics collection required for optimizer
COLLECT STATISTICS ON analytics_db.fact_sales
COLUMN (customer_id),
COLUMN (sale_date),
COLUMN (customer_id, sale_date);
-- SNOWFLAKE: No PI; optional clustering key
CREATE OR REPLACE TABLE analytics.fact_sales (
sale_id BIGINT NOT NULL,
sale_date DATE NOT NULL,
customer_id BIGINT,
store_id INTEGER,
product_id INTEGER,
quantity INTEGER,
amount DECIMAL(18,2)
)
CLUSTER BY (sale_date, customer_id);
-- No COLLECT STATISTICS needed — fully automatic
-- Monitor clustering effectiveness
SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.fact_sales');
The elimination of Primary Index design is one of the most impactful simplifications in Teradata-to-Snowflake migration. In Teradata, PI selection mistakes cause production outages from data skew and spool exhaustion. In Snowflake, there is no PI to get wrong — the optimizer manages data layout automatically. This removes an entire class of performance engineering work and DBA review processes.
COLLECT STATISTICS to Automatic Optimization
Teradata requires explicit COLLECT STATISTICS commands to maintain optimizer metadata. Stale statistics cause poor query plans, and many Teradata shops run daily or weekly statistics collection jobs that consume significant CPU and I/O. Snowflake maintains all metadata automatically — there is no statistics collection process, no stale statistics problem, and no maintenance window needed.
-- TERADATA: Statistics collection (common daily maintenance)
COLLECT STATISTICS ON analytics_db.fact_sales
COLUMN (sale_date),
COLUMN (customer_id),
COLUMN (product_id),
COLUMN (store_id),
COLUMN (sale_date, customer_id),
COLUMN (sale_date, store_id);
COLLECT STATISTICS ON analytics_db.dim_customer
COLUMN (customer_id),
COLUMN (segment),
COLUMN (region);
-- Often hundreds of COLLECT STATISTICS across maintenance scripts
-- SNOWFLAKE: No equivalent needed
-- Snowflake automatically tracks:
-- - Row counts per micro-partition
-- - Min/max values per column per micro-partition
-- - Distinct value estimates
-- - NULL counts
-- All maintained in real-time as data changes. Zero maintenance.
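That metadata is directly queryable, which is often how teams confirm the statistics jobs can be retired; a small sketch against the tables used in earlier examples:
-- Row counts and sizes come straight from metadata (no table scan, no stats job)
SELECT table_name, row_count, bytes
FROM INFORMATION_SCHEMA.TABLES
WHERE table_name IN ('FACT_SALES', 'DIM_CUSTOMER');
-- Micro-partition pruning quality for a clustered column
SELECT SYSTEM$CLUSTERING_INFORMATION('analytics.fact_sales', '(sale_date)');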
Teradata UDFs to Snowflake UDFs and Snowpark
Teradata supports SQL UDFs and C/C++ external UDFs for custom logic. Snowflake supports UDFs in SQL, Python, Java, JavaScript, and Scala, with Snowpark providing the most flexible option for complex logic that previously required C/C++ UDFs in Teradata.
-- TERADATA SQL UDF
REPLACE FUNCTION analytics_db.fiscal_quarter(p_date DATE)
RETURNS VARCHAR(7)
LANGUAGE SQL
DETERMINISTIC
RETURN
CASE
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 1 AND 3 THEN 'Q3-' || TRIM(EXTRACT(YEAR FROM p_date))
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 4 AND 6 THEN 'Q4-' || TRIM(EXTRACT(YEAR FROM p_date))
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 7 AND 9 THEN 'Q1-' || TRIM(EXTRACT(YEAR FROM p_date) + 1)
ELSE 'Q2-' || TRIM(EXTRACT(YEAR FROM p_date) + 1)
END;
-- SNOWFLAKE SQL UDF
CREATE OR REPLACE FUNCTION analytics.fiscal_quarter(p_date DATE)
RETURNS VARCHAR
LANGUAGE SQL
AS
$$
CASE
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 1 AND 3 THEN 'Q3-' || EXTRACT(YEAR FROM p_date)::VARCHAR
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 4 AND 6 THEN 'Q4-' || EXTRACT(YEAR FROM p_date)::VARCHAR
WHEN EXTRACT(MONTH FROM p_date) BETWEEN 7 AND 9 THEN 'Q1-' || (EXTRACT(YEAR FROM p_date) + 1)::VARCHAR
ELSE 'Q2-' || (EXTRACT(YEAR FROM p_date) + 1)::VARCHAR
END
$$;
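Logic that lived in Teradata C/C++ external UDFs usually lands in a Python UDF. The example below is purely illustrative (the function name and Luhn-check logic are assumptions, not taken from a real Teradata estate) and shows the general shape:
CREATE OR REPLACE FUNCTION analytics.luhn_check(card_number VARCHAR)
RETURNS BOOLEAN
LANGUAGE PYTHON
RUNTIME_VERSION = '3.11'
HANDLER = 'luhn_check'
AS
$$
def luhn_check(card_number: str) -> bool:
    # Standard Luhn checksum: double every second digit from the right
    if card_number is None or not card_number.isdigit():
        return False
    total = 0
    for i, ch in enumerate(reversed(card_number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0
$$;
-- SELECT analytics.luhn_check('79927398713');  -- returns TRUE for a valid Luhn number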
From parsed legacy code to production-ready modern equivalents — MigryX automates the entire conversion pipeline
From Legacy Complexity to Modern Clarity with MigryX
Legacy ETL platforms encode business logic in visual workflows, proprietary XML formats, and platform-specific constructs that are opaque to standard analysis tools. MigryX’s deep parsers crack open these proprietary formats and extract the underlying data transformations, business rules, and data flows. The result is complete transparency into what your legacy code actually does — often revealing undocumented logic that even the original developers had forgotten.
Teradata Workload Management to Snowflake Virtual Warehouses
Teradata's TASM (Teradata Active System Management) and TIWM (Teradata Intelligent Workload Manager) control resource allocation across workload classes, priority levels, and throttling rules on a shared compute system. Snowflake replaces this with per-warehouse sizing, auto-scaling multi-cluster warehouses, and resource monitors for budget control. The fundamental shift is from shared compute with priority management to isolated compute pools that cannot interfere with each other.
-- TERADATA TASM: Complex priority configuration on shared system
-- All users and workloads compete for the same AMPs
-- SNOWFLAKE: Separate warehouses per workload (isolated compute)
CREATE OR REPLACE WAREHOUSE etl_wh
WAREHOUSE_SIZE = 'X-LARGE'
AUTO_SUSPEND = 120
AUTO_RESUME = TRUE
MAX_CLUSTER_COUNT = 1;
CREATE OR REPLACE WAREHOUSE reporting_wh
WAREHOUSE_SIZE = 'MEDIUM'
AUTO_SUSPEND = 300
AUTO_RESUME = TRUE
MIN_CLUSTER_COUNT = 1
MAX_CLUSTER_COUNT = 4
SCALING_POLICY = 'STANDARD';
CREATE OR REPLACE WAREHOUSE datascience_wh
WAREHOUSE_SIZE = 'LARGE'
AUTO_SUSPEND = 60
AUTO_RESUME = TRUE;
-- Resource monitors for budget control
CREATE OR REPLACE RESOURCE MONITOR etl_budget
WITH CREDIT_QUOTA = 500
FREQUENCY = MONTHLY
START_TIMESTAMP = IMMEDIATELY
TRIGGERS
ON 75 PERCENT DO NOTIFY
ON 90 PERCENT DO NOTIFY
ON 100 PERCENT DO SUSPEND;
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = etl_budget;
Teradata Scheduling (DBCAL) to Snowflake Tasks
Teradata scheduling relies on external tools like DBCAL, AutoSys, or Control-M to trigger BTEQ scripts and stored procedures. Snowflake Tasks provide built-in scheduling with CRON expressions, predecessor-based DAGs, conditional execution, and stream-based triggering.
-- External scheduler (AutoSys/Control-M) calling BTEQ scripts:
-- Job 1: 02:00 AM - run extract.bteq
-- Job 2: after Job 1 - run transform.bteq
-- Job 3: after Job 2 - run load.bteq
-- Job 4: after Job 3 - run validate.bteq
-- SNOWFLAKE: Native Task DAG with stream-based triggering
CREATE OR REPLACE TASK etl.extract_task
WAREHOUSE = etl_wh
SCHEDULE = 'USING CRON 0 2 * * * America/New_York'
WHEN SYSTEM$STREAM_HAS_DATA('staging.source_stream')
AS
CALL etl.extract_from_sources();
CREATE OR REPLACE TASK etl.transform_task
WAREHOUSE = etl_wh
AFTER etl.extract_task
AS
CALL etl.transform_and_enrich();
CREATE OR REPLACE TASK etl.load_task
WAREHOUSE = etl_wh
AFTER etl.transform_task
AS
CALL etl.load_to_warehouse();
CREATE OR REPLACE TASK etl.validate_task
WAREHOUSE = etl_wh
AFTER etl.load_task
AS
CALL etl.run_quality_checks();
-- Enable bottom-up
ALTER TASK etl.validate_task RESUME;
ALTER TASK etl.load_task RESUME;
ALTER TASK etl.transform_task RESUME;
ALTER TASK etl.extract_task RESUME;
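Run history, previously tracked in the external scheduler, is available through a built-in table function; a minimal sketch for reviewing the last day of executions of the DAG above:
SELECT name, state, scheduled_time, completed_time, error_message
FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(
SCHEDULED_TIME_RANGE_START => DATEADD('day', -1, CURRENT_TIMESTAMP()),
RESULT_LIMIT => 100))
ORDER BY scheduled_time DESC;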
Dynamic Tables: Declarative Replacement for Scheduled BTEQ Jobs
Many Teradata BTEQ scripts exist solely to run a scheduled query and populate a summary or aggregate table — a DELETE followed by an INSERT with GROUP BY. Snowflake Dynamic Tables replace this pattern entirely with a declarative definition that Snowflake automatically keeps up to date, including incremental refresh.
-- TERADATA: BTEQ script run hourly via scheduler
-- DELETE FROM analytics_db.regional_summary;
-- INSERT INTO analytics_db.regional_summary
-- SELECT region, product_category, SUM(amount), COUNT(*)
-- FROM analytics_db.fact_sales
-- JOIN analytics_db.dim_store ON ...
-- GROUP BY 1, 2;
-- SNOWFLAKE: Dynamic Table — no scheduling, no procedure, no task
CREATE OR REPLACE DYNAMIC TABLE analytics.regional_summary
TARGET_LAG = '1 hour'
WAREHOUSE = etl_wh
AS
SELECT
st.region,
p.category AS product_category,
SUM(s.amount) AS total_sales,
COUNT(*) AS transaction_count,
AVG(s.amount) AS avg_transaction,
COUNT(DISTINCT s.customer_id) AS unique_customers
FROM analytics.fact_sales s
JOIN analytics.dim_store st ON s.store_id = st.store_id
JOIN analytics.dim_product p ON s.product_id = p.product_id
GROUP BY st.region, p.category;
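Refresh activity can be inspected directly rather than inferred from scheduler logs; a small sketch against the dynamic table defined above:
SELECT name, state, refresh_start_time, refresh_end_time
FROM TABLE(INFORMATION_SCHEMA.DYNAMIC_TABLE_REFRESH_HISTORY(
NAME => 'analytics.regional_summary'))
ORDER BY refresh_start_time DESC
LIMIT 10;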
Time Travel: A Capability Teradata Cannot Match
Snowflake Time Travel allows querying historical data states and recovering from accidental changes without backup-and-restore operations. Teradata has temporal tables (validtime/transactiontime), but they require explicit schema design and are not the same as Snowflake's automatic Time Travel on every table.
-- Query data as it was 2 hours ago
SELECT * FROM analytics.fact_sales AT (OFFSET => -7200);
-- Recover from accidental DELETE
CREATE OR REPLACE TABLE analytics.fact_sales
CLONE analytics.fact_sales BEFORE (STATEMENT => LAST_QUERY_ID(-1));
-- Zero-copy clone for testing (Teradata has no equivalent)
CREATE DATABASE dev_analytics CLONE prod_analytics;
Migration Strategy with MigryX
MigryX uses AST-based deterministic parsers to analyze Teradata SQL and BTEQ scripts at the syntax-tree level, understanding not just the SQL statements but also BTEQ dot-commands, SPL control flow, data type declarations, PI/PPI definitions, and macro structures. This deep structural understanding enables automated conversion with over 95% accuracy across the full spectrum of Teradata SQL extensions and procedural constructs.
The MigryX platform generates column-level data lineage from Teradata source tables through every transformation step to final targets, producing STTM (Source-to-Target Mapping) documentation automatically. For organizations with thousands of BTEQ scripts and stored procedures, this lineage analysis provides the impact assessment needed to plan migration waves and validate completeness.
- AST-based deterministic parsing — MigryX parses Teradata SQL, BTEQ, and SPL at the abstract syntax tree level, not with regex patterns. This captures nested expressions, complex CASE logic, QUALIFY clauses, and procedural control flow that regex tools miss.
- 95%+ parser accuracy — handles the full range of Teradata SQL extensions including NORMALIZE, TEMPORAL, FORMAT phrases, and proprietary functions across Teradata 14 through 17.
- Column-level lineage — traces data from source columns through every join, aggregation, and expression to target columns, producing STTM documentation that validates migration correctness.
- Multi-target output — generates Snowflake SQL, Snowpark Python, and Task DAG definitions from a single Teradata analysis.
- On-premise and air-gapped deployment — runs entirely within your network. No Teradata DDL, DML, or data leaves your environment. Critical for financial services, healthcare, and government.
- STTM documentation — automatically produces comprehensive Source-to-Target Mapping documents covering every column, transformation rule, and lineage path.
- Merlin AI — for complex Teradata constructs (NORMALIZE, temporal queries, C/C++ UDFs, TASM workload rules), Merlin AI generates optimized Snowflake equivalents with explanations.
Migration Comparison: Teradata vs. Snowflake Operational Model
| Dimension | Teradata | Snowflake |
|---|---|---|
| Infrastructure | Dedicated hardware nodes, rack deployment | Fully managed cloud, zero hardware |
| Scaling | Add physical nodes (weeks, procurement) | Resize warehouse in seconds |
| Data Distribution | Primary Index hash across AMPs | Automatic micro-partitioning |
| Statistics | Manual COLLECT STATISTICS jobs | Automatic, real-time, zero maintenance |
| Workload Management | TASM/TIWM on shared compute | Isolated virtual warehouses per workload |
| Bulk Loading | FastLoad, MultiLoad, TPT utilities | COPY INTO, Snowpipe, MERGE |
| Scheduling | External (AutoSys, Control-M, cron) | Native Snowflake Tasks with CRON + DAGs |
| CDC | Teradata CDC or custom triggers | Native Snowflake Streams |
| Cost Model | Per-node licensing + hardware + maintenance | Per-second compute consumption |
| Semi-Structured Data | JSON/XML support (limited) | Native VARIANT, FLATTEN, PARSE_JSON |
| Point-in-Time Recovery | Journal tables, archive/restore | Time Travel (up to 90 days) |
| Environment Cloning | Full data copy (days/weeks) | Zero-copy clone (seconds) |
Key Takeaways
- BTEQ scripts with .IF/.THEN/.GOTO control flow convert to Snowflake SQL scripting stored procedures with IF/THEN/ELSE, loops, and structured exception handling — replacing unstructured GOTO logic with maintainable code.
- Teradata QUALIFY is supported natively in Snowflake with identical syntax — including MERGE with QUALIFY patterns that migrate with minimal changes.
- SET table deduplication behavior must be implemented explicitly in Snowflake using DISTINCT or QUALIFY ROW_NUMBER() since Snowflake tables are MULTISET by default.
- Teradata SPL stored procedures convert to Snowflake SQL scripting with equivalent variables, cursors, loops, and exception handling. ACTIVITY_COUNT becomes SQLROWCOUNT, EXIT HANDLER becomes EXCEPTION blocks.
- Teradata macros convert directly to Snowflake stored procedures with parameterized SQL.
- FastLoad, MultiLoad, and TPT are replaced by COPY INTO (batch), MERGE (upsert), and Snowpipe (continuous) — with simpler configuration and no client-side utilities.
- Primary Index and PPI design are eliminated entirely — Snowflake's automatic micro-partitioning removes the most error-prone aspect of Teradata physical design.
- COLLECT STATISTICS maintenance jobs are eliminated — Snowflake maintains all optimizer metadata automatically.
- MigryX automates Teradata SQL transpilation, BTEQ conversion, and SPL procedure migration with AST-based parsing and column-level lineage generation at over 95% accuracy.
Migrating from Teradata to Snowflake is fundamentally a move from a proprietary, hardware-bound data warehousing model to an elastic, consumption-based cloud platform. The technical conversion of SQL, BTEQ scripts, stored procedures, macros, and loading utilities is substantial but well-defined — every Teradata construct has a Snowflake equivalent. The operational benefits are immediate: no hardware procurement, no PI design reviews, no statistics collection jobs, no TASM configuration, and a pricing model that scales with actual usage rather than peak capacity. For organizations running Teradata, the migration to Snowflake eliminates the infrastructure complexity and licensing costs that have constrained data warehousing budgets for years.
Why MigryX Is the Only Platform That Handles This Migration
The challenges described throughout this article are exactly what MigryX was built to solve. Here is how MigryX transforms this process:
- Deep AST parsing: MigryX’s custom-built parsers achieve 95% accuracy on every supported legacy technology — not through approximation, but through true semantic understanding.
- Merlin AI augmentation: Where deterministic parsing reaches its limit, Merlin AI resolves ambiguities and implicit behaviors, pushing accuracy to 99%.
- Complete coverage: MigryX supports 25+ source technologies including SAS, Informatica, DataStage, SSIS, Alteryx, Talend, ODI, Teradata, and Oracle PL/SQL.
- End-to-end automation: From parsing to conversion to validation — MigryX automates the entire pipeline, not just one step.
MigryX combines precision AST parsing with Merlin AI to deliver 99% accurate, production-ready migration — turning what used to be a multi-year manual effort into a streamlined, validated process. See it in action.
Ready to migrate from Teradata to Snowflake?
See how MigryX converts Teradata BTEQ scripts, SQL, stored procedures, and macros to production-ready Snowflake SQL, Snowpark pipelines, and Task DAGs — with full column-level lineage.
Explore Snowflake Migration Schedule a Demo