Joins and unions combine multiple datasets into one rowset. Use join nodes when rows relate on keys (exact or fuzzy). Use union when schemas align and you want to stack similar tables.Documentation Index
Fetch the complete documentation index at: https://docs.planasonix.com/llms.txt
Use this file to discover all available pages before exploring further.
Join
Join merges two inputs on key columns using standard relational semantics. Join types:| Type | Result |
|---|---|
| Inner | Only keys present in both inputs. |
| Left | All rows from the left input; matched right columns or nulls. |
| Right | All rows from the right input; matched left columns or nulls. |
| Full | All keys from either side; unmatched sides null-padded. |
| Cross | Cartesian product—use sparingly and filter immediately. |
- Left / right inputs: Wire two upstream edges.
- Key mapping:
customer_id↔cust_idwhen names differ. - Null-safe equality (when offered): Treat
NULL = NULLif your rule requires it—otherwise unexpected drops occur on inner joins. - Select list: Choose columns to carry forward; disambiguate names with aliases.
orders (left) with customers (right) on orders.customer_id = customers.id using a left join so unpaid carts still appear with nullable customer attributes.
Fuzzy Join
Fuzzy Join matches rows when keys are approximately equal—names, addresses, or human-entered text with typos. Configuration:- Similarity metric: Edit distance, token overlap, or phonetic match—per product options.
- Threshold: Minimum score to accept a match; tune on labeled samples.
- Blocking keys (when available): Restrict comparisons to same
postal_codeprefix or first letter buckets to keep runtime feasible.
vendor_name from invoices to a reference suppliers table where strings differ ("Acme LLC" vs "Acme Incorporated"). Start with high thresholds; inspect unmatched rows in preview.
Union
Union stacks two or more inputs vertically when columns correspond. Configuration:- Column alignment: By name or position—prefer by name to reduce breakage when column order changes.
- Type coercion: Harmonize types (
INTvsBIGINT) before union when strict mode is enabled. - Distinct: Optional
UNION DISTINCTbehavior to drop identical rows across branches.
sales_us, sales_eu, and sales_apac tables that share schema into sales_all for downstream aggregation.
Design checklist
Normalize keys first
Trim strings, standardize casing, and parse dates before joins on those fields.
Profile cardinalities
Preview row counts after inner joins; investigate explosion after cross joins or many-to-many fuzzy matches.
Related nodes
Aggregation
Summarize after you establish the correct grain.
Column transforms
Prepare join keys and handle type mismatches.