Common errors
- Node failures
- Memory pressure
- Timeouts
- Schema mismatch
A single node turns red while upstream nodes stay green. Open node details for the exception class, SQL state, or HTTP status. Typical causes: a syntax error after variable substitution, a missing file, a forbidden API call, or type coercion on null-heavy columns. Fix the node config, or guard with null-safe expressions and default values.
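As a rough illustration of a null-safe guard with a default value, the sketch below coerces a null-heavy text column to integers. The helper name `to_int_or_default` and the sample rows are hypothetical and not part of the product; the equivalent expression in your node's own language will look different.

```python
from typing import Optional


def to_int_or_default(value: Optional[str], default: int = 0) -> int:
    """Coerce a possibly-null, possibly-blank string to int, else return default."""
    if value is None:
        return default
    text = value.strip()
    if not text:
        return default
    try:
        return int(text)
    except ValueError:
        # Non-numeric text such as "n/a" falls back to the default.
        return default


# Made-up sample rows with the kinds of values that break naive coercion.
rows = [{"qty": "3"}, {"qty": None}, {"qty": ""}, {"qty": "n/a"}]
cleaned = [to_int_or_default(row["qty"]) for row in rows]
```

Silently defaulting can hide bad data, so it may be worth counting how often the fallback path is taken.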
Debugging with preview, node logs, and run details
Open the failed run
From the pipeline canvas or Runs list, select the attempt with the error badge. Note start time, environment, and parameter overrides.
Read node logs
Expand the failed node and load logs filtered to Error and Warn. Follow stack traces to the first "Caused by" line; later messages are often cascading failures.
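If you export node logs as text, a small script can locate that line for you. This is a minimal sketch that assumes Java-style traces where the cause is introduced by a line starting with `Caused by:`; the function name and the sample log are invented for illustration.

```python
from typing import Iterable, Optional


def first_caused_by(log_lines: Iterable[str]) -> Optional[str]:
    """Return the first line that starts with 'Caused by:', or None if absent."""
    for line in log_lines:
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            return stripped
    return None


# Invented log excerpt; real node logs will differ in format.
sample_log = [
    "ERROR node=load_orders Task failed",
    "java.lang.RuntimeException: write aborted",
    "    at example.Writer.write(Writer.java:42)",
    "Caused by: java.sql.SQLException: column \"qty\" is of type integer",
    "WARN node=load_orders downstream step skipped",
]
cause = first_caused_by(sample_log)
```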
Use preview where safe
Preview samples rows through the subgraph. Use limited row counts and masked columns for PII. Preview hits the same connections as production, so respect rate limits on external APIs.
Some failures are transient (network blips). Use retry policies on idempotent branches instead of manual reruns for every flake.
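The product's retry policy settings are not shown here, so the following is only a sketch of the underlying idea: retry an idempotent operation a bounded number of times with exponential backoff, and only for error types that look transient. The `retry` helper, its parameters, and the choice of `ConnectionError` and `TimeoutError` are assumptions for illustration.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    operation: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run an idempotent operation, retrying transient errors with backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last error.
            # Delay doubles each time: base_delay, 2 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Errors outside `retry_on` propagate immediately, which keeps genuine bugs such as schema mismatches from being retried.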
Performance optimization tips
Push down before pull up
Filter and project in source queries or warehouse SQL before you move large datasets through the orchestration tier.
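To make the difference concrete, the sketch below uses an in-memory SQLite table as a stand-in for a source system. The `events` table and its columns are made up; the point is only that the pushed-down query moves a quarter of the rows and a single column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, region TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(i, "eu" if i % 4 == 0 else "us", "x" * 100) for i in range(1000)],
)

# Pull up: every row and column crosses the wire, then gets filtered locally.
all_rows = conn.execute("SELECT * FROM events").fetchall()
pulled = [row[0] for row in all_rows if row[1] == "eu"]

# Push down: the source filters and projects, so only the needed ids move.
pushed = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM events WHERE region = ? ORDER BY id", ("eu",)
    )
]
```

Both approaches yield the same ids, but the first transfers all 1000 rows with their payloads and the second transfers 250 integers.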
Right-size partitions
Too many tiny files slow down listing; too few huge files limit parallelism. Aim for 128–512 MB compressed objects where the format allows, subject to source constraints.
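A quick way to pick a file count is to divide the expected compressed output size by a target within that range. The sketch below assumes a 256 MB target and treats 1 MB as 1024 × 1024 bytes; both are illustrative choices, and `target_file_count` is a hypothetical helper.

```python
MB = 1024 * 1024


def target_file_count(total_bytes: int, target_file_bytes: int = 256 * MB) -> int:
    """Number of output files so each is at most roughly target_file_bytes."""
    if total_bytes < 0 or target_file_bytes <= 0:
        raise ValueError("sizes must be non-negative and target must be positive")
    # Integer ceiling division; always write at least one file.
    return max(1, (total_bytes + target_file_bytes - 1) // target_file_bytes)


# 100 GB of compressed output at a 256 MB target.
files_for_100_gb = target_file_count(100 * 1024 * MB)
```

With these numbers the result is 400 files; a 128 MB target would give 800 and a 512 MB target 200.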
Cache stable dimensions
Reuse broadcast or cached small lookups instead of repeating joins on every micro-batch in streaming paths.
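The same idea in miniature: load a small, stable lookup once and reuse it for every micro-batch. This sketch uses `functools.lru_cache` from the Python standard library as a stand-in for whatever caching or broadcast mechanism your engine provides; the country lookup and the `enrich` function are hypothetical.

```python
from functools import lru_cache
from typing import Dict, List

load_calls = 0  # Counts how often the expensive lookup is actually loaded.


@lru_cache(maxsize=1)
def load_country_names() -> Dict[str, str]:
    """Stand-in for an expensive dimension query; runs once per process."""
    global load_calls
    load_calls += 1
    return {"DE": "Germany", "FR": "France"}


def enrich(batch: List[dict]) -> List[dict]:
    """Attach a country name to each row using the cached lookup."""
    names = load_country_names()  # Cache hit after the first call.
    return [
        {**row, "country_name": names.get(row["country"], "unknown")}
        for row in batch
    ]


batches = [[{"country": "DE"}], [{"country": "FR"}, {"country": "XX"}]]
enriched = [enrich(batch) for batch in batches]
```

This only suits dimensions that change rarely; a cache that is never refreshed will serve stale values until the process restarts or `load_country_names.cache_clear()` is called.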
Watch queue depth
Backpressure in streaming or orchestrated jobs shows up as growing lag before hard failures. Alert on lag early.
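One simple early-warning rule is to flag lag that has risen for several consecutive samples, before it crosses any hard limit. The sketch below assumes you already collect periodic lag readings; the function name, the window of 3, and the sample values are illustrative only.

```python
from typing import Sequence


def lag_is_growing(samples: Sequence[float], window: int = 3) -> bool:
    """True when lag strictly increased across each of the last `window` intervals."""
    if window < 1:
        raise ValueError("window must be at least 1")
    if len(samples) < window + 1:
        return False  # Not enough history to judge a trend.
    recent = samples[-(window + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


steady_growth = lag_is_growing([10, 12, 15, 21])
with_recovery = lag_is_growing([10, 12, 11, 21])
```

Here `steady_growth` is true and `with_recovery` is false, because one dip inside the window resets the trend. A production alert would likely combine this with an absolute threshold.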
Related topics
Diagnostics
Automated anomaly hints across runs.
Dead letter queue
Inspect rejected rows and poison messages.