DBT Research

Here is my technical analysis of dbt-on-Snowflake security and a practical, non-exploitative penetration-test plan (objectives, likely attack vectors, and what to look for). I'll keep this actionable for defenders and pentesters but avoid step-by-step exploit instructions. Always get explicit written authorization before testing.

Threat model — how dbt + Snowflake typically fits together

- dbt (the developer tool) runs transformations against Snowflake, connecting with a service account or credentials taken from profiles.yml or CI secrets.
- dbt creates and updates models (tables/views), may run seeds, snapshots, and tests, and produces artifacts (compiled SQL, logs).
- CI/CD systems (GitHub Actions, CircleCI, GitLab CI, Jenkins) typically store dbt credentials as secrets and execute dbt run on behalf of a pipeline user.
- Developers may store profiles locally or commit insecure artifacts by mistake (compiled artifacts, credentials, or connection strings).
- Snowflake hosts the data. The relevant surface is roles/privileges, network policies, stages (external/internal), external functions, user-defined functions (UDFs), and tasks, with access history and query history as the audit trail.

Assets to protect: production data, service account keys, CI secrets, compiled artifacts, stages (S3/GCS/Azure blobs), external functions/UDFs, and dbt Cloud/web UI access tokens.

High-level security concerns

- Over-privileged dbt service principal: dbt often requires CREATE/INSERT/USAGE on schemas/tables. If the service role is granted broader rights (ACCOUNTADMIN, wide GRANTs, ability to create new roles/users, or access to other databases), a compromised pipeline can escalate access.
- Secrets exposure in repo / CI / artifacts: profiles.yml, .env files, or CI logs can leak credentials. dbt compiled/target artifacts may contain SQL with literals or metadata revealing object names.
- Poor segregation between dev/test/prod: using the same role/credentials across environments or giving CI access to prod leads to lateral movement risk.
- Inadequate monitoring & retention: lack of enabled Access History, incomplete query auditing, or short retention hampers detection and forensics.
- Insecure stages / external storage: storage behind Snowflake stages can be readable/writable by both Snowflake and third parties; misconfigured policies on external buckets can enable data exfiltration.
- Runtime code execution vectors: UDFs, external functions, and Snowpark allow code execution. A malicious transformation or code injection could execute arbitrary logic if permissions allow external network calls.
- SSO/OAuth/misconfigured auth flows: weak session policies, lack of MFA, or incorrect OAuth client scopes increase token theft risk.
- dbt Cloud or web UI token abuse: dbt Cloud tokens and integrations may be overly permissive or stored insecurely.
- Improper grants (future grants / role inheritance): future grants or broad grants to PUBLIC can create perpetual over-privilege.
- Supply-chain & dependency risk: malicious macros in dbt packages can inject SQL or change behavior during compilation.

Penetration test — scope & preconditions (must be authorized)

Before testing: get a signed rules-of-engagement that defines scope (accounts, roles, environments), test hours, communications, and rollback/backup plans. Ensure read-only options for sensitive tests where possible.

Pre-engagement checklist

- Written authorization with scope, time window, and escalation contacts.
- Backups or a safe snapshot plan if destructive actions might be attempted.
- PoC/data collection limits and agreed-upon disclosure policies.
- Test accounts provided (if possible): attacker account vs. production service accounts.

Test plan (objectives, test cases, evidence to gather)

1 — Identity & Access Management (IAM) / Roles

Objective: Find excessive privileges & potential for privilege escalation.

- Enumerate roles and role inheritance (conceptually): look for service roles with ACCOUNTADMIN-level grants, or roles granted to many users.
- Check whether the dbt service principal has:
  - CREATE DATABASE / DROP / CREATE ROLE / MANAGE GRANTS
  - the ability to access other sensitive schemas or the SNOWFLAKE.ACCOUNT_USAGE schema
- Evidence to collect: list of grants assigned to the dbt role, any future grants, role hierarchy diagrams, and change history.

Why it matters: Over-privileged service accounts are the most common root cause for major breaches.
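
As a rough sketch of the grant review above, the following Python filters an exported grant listing for entries that look excessive. It assumes the rows are dicts carrying the privilege, granted_on, and name columns of a SHOW GRANTS TO ROLE export; the function name and the two risk lists are my own illustrative choices, not a complete policy.

```python
# Illustrative lists only (assumptions); tune them to the engagement's scope.
RISKY_ACCOUNT_PRIVILEGES = {"MANAGE GRANTS", "CREATE ROLE", "CREATE USER", "CREATE DATABASE"}
ADMIN_ROLES = {"ACCOUNTADMIN", "SECURITYADMIN", "SYSADMIN"}


def flag_risky_grants(grants):
    """Return (reason, row) tuples for grants that look excessive.

    `grants` is a list of dicts shaped like exported SHOW GRANTS TO ROLE rows.
    """
    findings = []
    for row in grants:
        privilege = row.get("privilege", "").upper()
        granted_on = row.get("granted_on", "").upper()
        name = row.get("name", "").upper()
        # A role granted to the dbt role shows up as USAGE on an object of type ROLE.
        if granted_on == "ROLE" and privilege == "USAGE" and name in ADMIN_ROLES:
            findings.append(("inherits administrative role", row))
        elif granted_on == "ACCOUNT" and privilege in RISKY_ACCOUNT_PRIVILEGES:
            findings.append(("account-level privilege", row))
    return findings
```

Working from an export rather than a live connection keeps the check read-only, which fits the rules of engagement described earlier.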

2 — Secrets handling (CI, repos, artifacts)

Objective: Detect credential leakage and insecure secret storage.

- Review the repo for committed profiles.yml, .env, or other credential files.
- Inspect CI job logs for accidental printing of secrets or connection strings.
- Check artifact storage (dbt target dir, compiled SQL) for embedded credentials or sensitive object names.
- Evidence: repo scan results, CI logs excerpt (redacted), list of artifacts containing sensitive substrings.

Why it matters: Stolen CI secrets = immediate access to Snowflake with whatever rights were granted.
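
A minimal sketch of the repo/artifact check, assuming profile-style text where credentials appear as key: value lines. The patterns and the scan_text name are illustrative assumptions; a dedicated secret scanner will cover far more cases.

```python
import re

# Illustrative patterns only (assumptions), not a complete rule set.
SECRET_PATTERNS = {
    "credential field": re.compile(
        r"^\s*(password|private_key_passphrase|token)\s*:\s*\S+", re.IGNORECASE
    ),
    "private key block": re.compile(r"-----BEGIN (?:ENCRYPTED )?PRIVATE KEY-----"),
}


def scan_text(text):
    """Return (line_number, label) pairs for lines that look like literal secrets."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if not pattern.search(line):
                continue
            # Values rendered through env_var() are references, not literal secrets.
            if label == "credential field" and "env_var(" in line:
                continue
            hits.append((lineno, label))
    return hits
```

Only line numbers and labels are returned, so the scan results can go into a report without copying the secret itself.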

3 — Environment segregation & least privilege

Objective: Verify dev/test/prod separation and that service accounts are scoped appropriately.

- Confirm separate Snowflake accounts/roles for dev vs prod, or policies that prevent CI from writing to prod.
- Check whether dbt jobs can target arbitrary databases/schemas via variables or compiled artifacts.
- Evidence: dbt profiles and CI configs (redacted), cross-environment GRANTs.

Why it matters: Same creds across envs enable lateral movement from less sensitive to sensitive data.
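
One way to sketch the separation check: take the outputs mapping from a (redacted) profiles.yml, already parsed into a dict, and group targets that share the same identity. The function name is hypothetical, and comparing on account, user, and role is my assumption about what counts as "the same credentials".

```python
def shared_credentials(outputs):
    """Group dbt targets that share the same (account, user, role) triple.

    `outputs` mirrors the `outputs:` mapping of a dbt profile, already parsed
    into a dict of target name -> connection settings.
    """
    seen = {}
    for target, cfg in outputs.items():
        key = (cfg.get("account"), cfg.get("user"), cfg.get("role"))
        seen.setdefault(key, []).append(target)
    # Only triples used by more than one target are interesting.
    return {key: sorted(targets) for key, targets in seen.items() if len(targets) > 1}
```

An empty result does not prove separation (a role can still hold grants in both environments), so this complements the cross-environment grant review rather than replacing it.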

4 — Data exfiltration vectors (stages & external storage)

Objective: Identify ways data can be exported from Snowflake.

- Inventory external/internal stages accessible by the dbt role.
- Assess external stage permissions (S3/GCS/Azure): public buckets, overly broad IAM/ACLs.
- Check for COPY INTO statements or usage of GET/PUT with public URLs in dbt macros.
- Evidence: list of stages and their cloud permissions, examples of SQL referencing stages.

Why it matters: Misconfigured stages are low-effort, high-impact exfil targets.
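
A small heuristic for the COPY INTO review, assuming plain SQL text such as compiled dbt output. It relies on the fact that an unload names a stage (@...) or a quoted cloud URL right after COPY INTO, whereas a load names a table there. The regex and function name are my own, and SQL assembled through Jinja at runtime will not be caught until it is compiled.

```python
import re

# COPY INTO followed by a stage reference or a quoted cloud URL indicates an unload.
UNLOAD_RE = re.compile(
    r"copy\s+into\s+(@[^\s;]+|'(?:s3|gcs|azure)://[^']+')", re.IGNORECASE
)


def find_unloads(sql):
    """Return the unload destinations referenced in a block of SQL text."""
    return [match.group(1) for match in UNLOAD_RE.finditer(sql)]
```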

5 — SQL injection & malicious compiled artifacts / supply chain

Objective: Look for places where user-controlled input can become SQL executed by Snowflake.

- Review dbt macros and packages for unsanitized string concatenation of identifiers or values used in run_query or similar.
- Inspect the pull-request process to see if compiled artifacts are validated before deployment.
- Evidence: macro examples, PR workflow notes, use of package manager (dbt deps) controls.

Why it matters: Malicious macros or unchecked input may execute arbitrary SQL in the target DB.
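
To illustrate what "sanitized" means for identifiers, here is a sketch of a strict allowlist check applied before any name is interpolated into SQL. It is written in Python rather than Jinja, and the function name is hypothetical; the pattern follows the rule for unquoted Snowflake identifiers (a letter or underscore first, then letters, digits, underscores, or dollar signs).

```python
import re

# Unquoted identifier: letter or underscore first, then letters, digits, _ or $.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_$]*")


def safe_relation(database, schema, table):
    """Build database.schema.table, rejecting anything that is not a plain identifier."""
    parts = (database, schema, table)
    for part in parts:
        if not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"unsafe identifier: {part!r}")
    return ".".join(parts)
```

Rejecting outright is a deliberate choice here: escaping attempts tend to miss cases, while an allowlist fails closed. During review, macros that concatenate variables into SQL without any comparable check are the ones worth recording as evidence.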

6 — Code execution (UDFs, external functions, Snowpark)

Objective: Assess whether dbt or the service principal can create/modify functions or external integrations that enable code execution.

- Inventory privileges to create/replace functions, external functions, or stages with external code references.
- Check if external functions are allowed to call external endpoints (network egress).
- Evidence: grants allowing CREATE FUNCTION, presence of existing external functions, UDF creation history.

Why it matters: Functions can act as persistence mechanisms or run code outside Snowflake (e.g., via external services).

7 — Session & authentication controls

Objective: Validate MFA, session policies, SSO and OAuth security posture.

- Check whether service accounts use passwords, keypair auth, or OAuth, and whether keys are rotated.
- Review session length policies, network policies restricting allowed IPs, and use of PrivateLink or IP whitelisting.
- Evidence: session policy configs, auth method inventory, rotation schedules.

Why it matters: Weak session/credential policies simplify token theft and reuse.
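
The auth method inventory can be checked mechanically once it exists. The sketch below assumes a hand-built inventory of dicts with name, auth, and last_rotated keys (all hypothetical field names) and a 90-day rotation window, which is an assumption rather than a Snowflake default.

```python
from datetime import date


def stale_or_weak(users, today, max_age_days=90):
    """Flag service users on password auth or with credentials past the rotation window."""
    findings = []
    for user in users:
        if user["auth"] == "password":
            findings.append((user["name"], "password authentication"))
        elif (today - user["last_rotated"]).days > max_age_days:
            findings.append((user["name"], "credential older than rotation window"))
    return findings
```

Passing today in explicitly keeps the check reproducible, so the same inventory produces the same result at retest.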

8 — Monitoring, logging, and alerting

Objective: Ensure detection capability and forensics exist.

- Confirm Access History and Query History retention and whether CloudTrail / cloud logs (S3 access logs) are integrated.
- Check if CI runs, dbt artifacts, and service principal activity generate alerts for anomalous behavior.
- Evidence: retention configs, sample alert rules, available audit trails.

Why it matters: Lack of detection extends dwell time for attackers.
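
As a sketch of the kind of alert rule worth asking for, this filters exported query-history rows for a given role and flags queries that moved more data out than a threshold. The row keys (query_id, role_name, outbound_data_transfer_bytes) are assumed to match the export being reviewed, and the threshold is something the defender has to choose.

```python
def large_egress(rows, role, threshold_bytes):
    """Return query IDs run under `role` whose outbound transfer exceeds the threshold."""
    flagged = []
    for row in rows:
        # Missing or NULL transfer values are treated as zero bytes.
        transferred = row.get("outbound_data_transfer_bytes") or 0
        if row.get("role_name") == role and transferred > threshold_bytes:
            flagged.append(row["query_id"])
    return flagged
```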

9 — Grants & future grants review

Objective: Find misapplied future grants or PUBLIC grants enabling privilege creep.

- Audit schema/table-level grants; look for GRANT … TO ROLE PUBLIC or future grants of the form GRANT … ON FUTURE TABLES IN SCHEMA … TO ROLE ….
- Evidence: grant listings, examples of unintended permissions.

Why it matters: Future grants often create persistent access that is hard to revoke.

Prioritized test activities (triage order)

1. Secrets & CI (highest impact, easy to verify): scan repos & CI logs.
2. Service role privilege audit: confirm least privilege.
3. Stage/external storage audit: verify no public buckets or weak ACLs.
4. Monitoring & access history check: ensure detection.
5. UDF/external functions privileges: look for code-exec vectors.
6. dbt macro/package review: supply-chain risks.
7. Session/auth policies & rotation: prevent token reuse.
8. Future grants / PUBLIC grants: privilege creep.

Remediations & defensive controls

- Principle of least privilege: create a dedicated dbt role that has only the required privileges (USAGE on warehouse, SELECT/INSERT/CREATE on target schema, no ACCOUNTADMIN-level grants).
- Use ephemeral credentials or key-pair/OAuth with automated rotation rather than long-lived static passwords stored in repos.
- Store secrets in a vault (HashiCorp Vault, cloud secret manager) integrated into CI with short TTLs and scoped policies.
- Separate environments: enforce separate Snowflake roles/accounts for dev/test/prod; require approvals for PRs that change prod models.
- Harden external stages: minimize publicly accessible buckets, restrict write/read to only necessary principals, enable server-side encryption and object logging.
- Audit & alerting: enable Access History and Query History, and integrate with a SIEM (CloudTrail + Snowflake event tables); alert on anomalous queries, large data egress, or role escalation.
- Restrict function creation: limit who can create external functions/UDFs; review and approve all function deployments.
- CI safety checks: lint and scan dbt macros for unsafe patterns; block changes that modify privileged macros without review.
- Use row-level security & masking policies: apply masking/row access policies to limit exposure if a user gains read privilege.
- Inventory & tag sensitive objects: use object tagging and automated policies to prevent accidental exposure.
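
For the least-privilege item, a minimal sketch of what a dedicated dbt role's grants might look like, generated from Python so the same shape can be stamped out per environment. The function name and all object names are placeholders; grants for reading source schemas are deliberately left out and would need to be added per project. Objects the role creates in the target schema are owned by it, so no separate SELECT/INSERT grants are emitted for them.

```python
def dbt_role_grants(role, warehouse, database, schema):
    """Return the GRANT statements for a narrowly scoped dbt role.

    Callers should pass validated identifiers; no quoting or escaping is done here.
    """
    target = f"{database}.{schema}"
    return [
        f"GRANT USAGE ON WAREHOUSE {warehouse} TO ROLE {role};",
        f"GRANT USAGE ON DATABASE {database} TO ROLE {role};",
        f"GRANT USAGE, CREATE TABLE, CREATE VIEW ON SCHEMA {target} TO ROLE {role};",
    ]


if __name__ == "__main__":
    # Placeholder names, for illustration only.
    for statement in dbt_role_grants("DBT_PROD", "TRANSFORM_WH", "ANALYTICS", "MARTS"):
        print(statement)
```

Generating the statements as text, instead of executing them, leaves room for review before anything touches the account.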

Evidence & reporting format (what to deliver)

For each finding, include:

- Title & severity (Critical/High/Medium/Low)
- Description: what was observed and why it matters
- Affected assets: roles, schemas, stages, CI pipelines, etc.
- Impact: worst-case scenario (data exfil, privilege escalation)
- Evidence: logs, config snippets (redacted), screenshots, query IDs, timestamps
- Remediation steps: prioritized, clear defensive actions
- Retest criteria: how defenders can validate the fix
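
If findings are tracked in code rather than a document, the fields above map naturally onto a small record type. This is only a sketch: the class, field names, and sort helper are my own, and the severity order simply follows the Critical/High/Medium/Low scale listed above.

```python
from dataclasses import dataclass, field

# Lower number sorts first, so Critical findings lead the report.
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Finding:
    title: str
    severity: str
    description: str
    affected_assets: list = field(default_factory=list)
    impact: str = ""
    evidence: list = field(default_factory=list)
    remediation: list = field(default_factory=list)
    retest_criteria: str = ""


def by_severity(findings):
    """Sort findings from most to least severe, keeping input order within a level."""
    return sorted(findings, key=lambda finding: SEVERITY_ORDER[finding.severity])
```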

Tools & techniques (conceptual)

- Repo scanning & secrets detection: use SAST/secret scanners to find accidental commits.
- Configuration audit: review Snowflake grants, roles, stages, and session policies via console or metadata exports.
- Artifact analysis: inspect compiled dbt artifacts and CI logs for exposure.
- Policy & monitoring validation: verify Access History, Query History, alert rules, and retention.
- Manual code review: inspect dbt macros, packages, and external function definitions for unsafe constructs.
