Waterfall Enrichment

The waterfall pattern runs multiple enrichment providers in sequence and stops as soon as one returns a valid result. This maximizes coverage while minimizing cost — you only pay for lookups that actually run.

Key concepts

```
--with-waterfall <NAME>
```
— start a waterfall block named
```
<NAME>
```

--type

— what you're looking for:

email

phone

linkedin

first_name

last_name

full_name

```
--result-getters
```
— JSON path(s) where to find the value in provider output
```
--end-waterfall
```
— close the block; the waterfall name becomes a column with the resolved value
After
```
--end-waterfall
```
, use
```
{{<waterfall_name>}}
```
to reference the resolved scalar

Always pilot first

bash

deepline enrich --input leads.csv --in-place --rows 0:1 \
  --with-waterfall "email" \
  --type email \
  --result-getters '["data.email","email","data.0.email"]' \
  --with 'provider_a=tool_name:{"param":"{{Column}}"}' \
  --with 'provider_b=tool_name:{"param":"{{Column}}"}' \
  --end-waterfall

Review output, then scale to

--rows 1:

for remaining rows.

Phone waterfall

bash

deepline enrich --input leads.csv --in-place --rows 0:1 \
  --with-waterfall "phone" \
  --type phone \
  --result-getters '["data.phone","phone","mobile","data.mobile"]' \
  --with 'mobile_finder=leadmagic_mobile_finder:{"email":"{{Email}}","first_name":"{{First Name}}","last_name":"{{Last Name}}","company":"{{Company}}"}' \
  --end-waterfall

Email waterfall (name + company)

bash

deepline enrich --input leads.csv --in-place --rows 0:1 \
  --with-waterfall "email" \
  --type email \
  --result-getters '["data.email","email","data.0.email"]' \
  --with 'apollo_match=apollo_people_match:{"first_name":"{{First Name}}","last_name":"{{Last Name}}","organization_name":"{{Company}}"}' \
  --with 'crust_profile=crustdata_person_enrichment:{"linkedinProfileUrl":"{{LinkedIn}}","fields":["email","current_employers"],"enrichRealtime":true}' \
  --with 'pdl_enrich=peopledatalabs_enrich_contact:{"first_name":"{{First Name}}","last_name":"{{Last Name}}","domain":"{{Company Domain}}"}' \
  --end-waterfall \
  --with 'email_validation=leadmagic_email_validation:{"email":"{{email}}"}'

LinkedIn URL waterfall

bash

deepline enrich --input leads.csv --in-place --rows 0:1 \
  --with-waterfall "linkedin" \
  --type linkedin \
  --result-getters '["linkedin_url","data.linkedin_url","data.0.linkedin_url"]' \
  --with 'apollo_match=apollo_people_match:{"first_name":"{{First Name}}","last_name":"{{Last Name}}","organization_name":"{{Company}}"}' \
  --with 'pdl_identify=peopledatalabs_person_identify:{"first_name":"{{First Name}}","last_name":"{{Last Name}}","company":"{{Company}}"}' \
  --end-waterfall

Pre-flight validation

Before running any waterfall, validate your input data:

bash

# Check for required columns and empty values
python3 -c "
import csv, sys
with open('leads.csv') as f:
    rows = list(csv.DictReader(f))
cols = rows[0].keys() if rows else []
print(f'Columns: {list(cols)}')
print(f'Total rows: {len(rows)}')
# Check for empty required fields
for col in ['First Name', 'Last Name', 'Company']:
    empty = sum(1 for r in rows if not r.get(col, '').strip())
    if empty: print(f'WARNING: {empty} rows missing {col}')
# Check for duplicates (same person appears multiple times)
keys = [(r.get('First Name','').strip().lower(), r.get('Last Name','').strip().lower(), r.get('Company','').strip().lower()) for r in rows]
from collections import Counter
dupes = {k: v for k, v in Counter(keys).items() if v > 1}
if dupes: print(f'WARNING: {len(dupes)} duplicate contacts — deduplicate before enrichment to avoid paying for the same lookup twice')
"

Skip rows that can't match: If a row is missing both email AND name+company, no provider will find a match. Remove these before running to save credits.

Realistic coverage expectations

Don't assume 100% fill rates. Actual coverage per provider:

Data type	Single provider	2-provider waterfall	3-provider waterfall
Email	~50%	~65%	~75%
Phone	~30%	~40%	~45%
LinkedIn URL	~65%	~75%	~85%

Set expectations with stakeholders before running. Diminishing returns after 2-3 providers.

Hard rules

Always end with
```
leadmagic_email_validation
```
when the waterfall resolves email
Use
```
--rows 0:1
```
pilot before any full run
Do not reuse an existing output CSV path
Chain one waterfall at a time; close
```
--end-waterfall
```
before starting another

Use canonical

--type

values:

email | phone | linkedin | first_name | last_name | full_name

Never call enrichment without minimum data: email waterfalls need name + company OR LinkedIn URL. Phone waterfalls need a verified email. Don't waste credits on rows missing required fields.

After a waterfall run

The Playground auto-opens for inspection:

bash

deepline playground start --csv leads.csv --open

Use

--rows 0:1

in the playground to re-run a single block for debugging.

Related skills

This skill teaches the waterfall pattern. For specific enrichment tasks, use:

Finding emails →
```
contact-to-email
```
(pre-built email waterfalls)
Finding LinkedIn URLs →
```
linkedin-url-lookup
```
(with nickname expansion + validation)
Finding contacts at companies →
```
get-leads-at-company
```
Building prospect lists →
```
build-tam
```

Get started

bash

npm install -g @deepline/cli
deepline auth login

waterfall-enrichment

NPX Install

Tags

SKILL.md Content

Waterfall Enrichment

Key concepts

Always pilot first

Phone waterfall

Email waterfall (name + company)

LinkedIn URL waterfall

Pre-flight validation

Realistic coverage expectations

Hard rules

After a waterfall run

Related skills

Get started