Holy Macros...
- Whitney Wilger
- Mar 26
- 3 min read
Updated: Mar 31
Holy macros is right. dbt macros, that is. The point of macros is reusable code, a bit like functions in a class. When I heard that, the first thing I thought of was our multi-tenant, de-identified data lake that we store in Snowflake. Right now we load the data into Snowflake, then stored procedures grab it, de-identify it, and combine it into a single database for holistic analysis. Pretty much something that dbt could be doing, right? Right.

The logic I approached the problem with is nested macros. One macro grabs all the tenant, or account, metadata; another grabs all available transformations (we aggregate activities, lab orders, and user data at the moment); and a third grabs the SQL for the particular transformation we are interested in at that moment. All of the account and SQL metadata is stored in Snowflake tables for ease of retrieval. Finally, all of these are orchestrated by an outer macro that brings everything together and outputs the data to a table; that last bit is still up in the air at the moment. More to come, but for now, here is where I am at this evening:
Orchestration Macro
{%- macro gather_account_data(gather_accounts, gather_aggs, get_agg_query) -%}
{%- if execute -%}
{%- set accounts = gather_accounts() -%}
{%- set aggs = gather_aggs().columns[0].values() -%}
{%- for agg in aggs -%}
{%- for account in accounts -%}
{%- set act = account['ACCOUNT_ID'] -%}
{{ log(act, True) }}
{%- set schema = target.schema | replace('DBT_','') -%}
{%- set db = 'HRM_' ~ schema ~ '_' ~ act | replace('-','') | upper() -%}
{%- set query_prep = get_agg_query(agg).columns[0].values()[0] -%}
{%- set query = query_prep | replace(':database',db ) -%}
{%- if loop.last -%}
{{ query }} ORDER BY 2,3
{% else %}
{{ query }} union
{% endif -%}
{%- endfor -%}
{%- if loop.last -%}
{%- break -%}
{%- endif -%}
{%- endfor -%}
{%- endif -%}
{%- endmacro -%}
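For context, here is roughly how I picture invoking the orchestrator from a model (the model name is made up, and passing the three inner macros in as arguments is just how I've parameterized it above):

```sql
-- models/tenant_aggregates.sql (hypothetical model name)
-- The three helper macros are passed to the orchestrator as arguments,
-- so it can call them itself while building the union query.
{{ gather_account_data(gather_accounts, gather_aggs, get_agg_query) }}
```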
Accounts Macro
{% macro gather_accounts() -%}
{% set schema = target.schema | replace('DBT_','') %}
{% set results = run_query("SELECT DISTINCT M.ACCOUNT_ID, M.DATABASE_NAME FROM HRM_" ~ schema ~ "_USAGE.OHO_" ~ schema ~ "_META.ACCOUNT_META M
INNER JOIN SNOWFLAKE.ACCOUNT_USAGE.DATABASES D ON CONCAT('HRM_" ~ schema ~ "_',UPPER(REPLACE(M.ACCOUNT_ID,'-',''))) = D.DATABASE_NAME AND D.DELETED IS NULL INNER JOIN SNOWFLAKE.ACCOUNT_USAGE.TABLES T ON D.DATABASE_NAME = T.TABLE_CATALOG AND T.TABLE_NAME = 'HOURS_DIMENSION' AND D.DELETED IS NULL") %}
{{ return(results) }}
{%- endmacro %}
Transformation Macro
{% macro gather_aggs() -%}
{% set schema = target.schema | replace('DBT_','') %}
{% set results = run_query("SELECT AGG FROM OHO_" ~ schema ~ "_META.AGG_QUERY_META M WHERE AGG NOT LIKE 'looker%'") %}
{{ return(results) }}
{%- endmacro %}
Transformation SQL Macro
{% macro get_agg_query(agg) -%}
{% set schema = target.schema | replace('DBT_','') %}
{% set results = run_query("SELECT QUERY FROM OHO_" ~ schema ~ "_META.AGG_QUERY_META WHERE AGG = '" ~ agg ~ "'") %}
{{ return(results) }}
{%- endmacro %}
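To make the orchestrator's string assembly concrete: for each account it replaces the `:database` placeholder in the stored query with that tenant's database name, joins the per-account queries with `union`, and tacks an `ORDER BY` onto the last one. The account IDs and query below are made up for illustration, but the rendered output would look something like:

```sql
-- Hypothetical rendered output for two accounts and one aggregation,
-- assuming a stored agg query of the form
-- "SELECT ACCOUNT_ID, DAY, HOURS FROM :database.CORE.HOURS_DIMENSION"
-- and a target schema of DBT_PROD:
SELECT ACCOUNT_ID, DAY, HOURS FROM HRM_PROD_ABC123.CORE.HOURS_DIMENSION union
SELECT ACCOUNT_ID, DAY, HOURS FROM HRM_PROD_DEF456.CORE.HOURS_DIMENSION ORDER BY 2,3
```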
Please excuse my gross, run-on SQL--it's been a long day, and sleep for tomorrow is important. Aside from that, most of the macros are fairly simple; the orchestrator is clearly the most complicated, simply because it handles a lot of logic and passing of data.
If you have ideas of how to better this setup or questions about how any of it works, please reach out. I'd love to chat with other dbt'ers!