Research

A Research record is LeadHunter’s audit trail for how an account got into your database. Every Google Maps sweep, CSV import, URL lookup, AI enrichment pass, and operator-logged manual research session has its own row — so months later you can answer “why do we have this account, who put it there, what method was used, and what did it cost?”

Research is the audit-trail equivalent of status_history on the account itself — but for discovery, not lifecycle.

The dashboard’s Research page (sidebar → Research) is the home for the log. It shows:

  • Stats — total / in progress / completed / failed.
  • A filterable list of every research record in the active Company, newest first, with status, method, result counts, and the user who triggered it.
  • A New research action for logging research you did yourself — useful when discovery happened outside LeadHunter (a directory you bought, a trade-show booth scan, a vendor list) and you want the record for accounting and team visibility.
Method | Trigger | UI surface
------ | ------- | ----------
Google Maps search | Text query like “bike shops in Berlin”. | Accounts → Discover
Google Maps URL | Paste a maps.google.com/... URL. | Accounts → Lookup
Website / domain | Paste acme.com or a full URL — LeadHunter scrapes it. | Accounts → Lookup
CSV / XLSX import | Upload a spreadsheet through the import wizard. One research record per file. | Accounts → Import
AI deep research | Optional follow-up pass that fills missing fields and writes salesperson notes. | Accounts → Lookup, Deep mode
Manual research log | You did the discovery yourself; you’re recording it. | Research → New research

The method field on the record captures which path produced it: manual, google_maps_api, tavily_research, apollo_enrichment, web_scraping, csv_import, api_integration, other.
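As a sketch, those method values could be modeled as a string-backed enum (the values come from the list above; the class name itself is hypothetical):

```python
from enum import Enum

class ResearchMethod(str, Enum):
    """Discovery path that produced a research record (values from the docs)."""
    MANUAL = "manual"
    GOOGLE_MAPS_API = "google_maps_api"
    TAVILY_RESEARCH = "tavily_research"
    APOLLO_ENRICHMENT = "apollo_enrichment"
    WEB_SCRAPING = "web_scraping"
    CSV_IMPORT = "csv_import"
    API_INTEGRATION = "api_integration"
    OTHER = "other"

# Round-trip from the stored string value back to the enum member.
print(ResearchMethod("csv_import").name)  # CSV_IMPORT
```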

Bucket | Fields
------ | ------
Identity | name, description, method, status
Execution | started_at, completed_at, duration_seconds, executed_by
Tooling (automated runs only) | script_name, script_version, api_key_used (the env var name, not the key value)
Search parameters | countries, cities, search_terms, search_radius_meters, instructions, configuration
Results | total_items_found, new_items_added, items_updated, duplicates_found, errors_count
Cost | api_calls_made, estimated_cost (USD)
Provenance | source_files, notes, error_log, created_by, created_at, updated_at
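Put together, a research record could be sketched as a dataclass. The field names follow the buckets above; the types and defaults are assumptions, not the actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ResearchRecord:
    """Sketch of a research record; field names from the docs, types are guesses."""
    # Identity
    name: str
    method: str                        # e.g. "google_maps_api"
    status: str = "planned"
    description: str = ""
    # Execution
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    duration_seconds: Optional[int] = None
    executed_by: Optional[str] = None
    # Search parameters
    countries: list[str] = field(default_factory=list)
    cities: list[str] = field(default_factory=list)
    search_terms: list[str] = field(default_factory=list)
    search_radius_meters: Optional[int] = None
    # Results
    total_items_found: int = 0
    new_items_added: int = 0
    items_updated: int = 0
    duplicates_found: int = 0
    errors_count: int = 0
    # Cost
    api_calls_made: int = 0
    estimated_cost: float = 0.0        # USD
```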

Search-parameter fields (countries, cities, search_terms) make it easy to re-run a discovery later: a Google Maps sweep done six months ago records exactly what was searched, so a fresh sweep with the same terms is one click away to catch new businesses.
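A re-run could be as simple as copying those parameters off the old record. The helper below is a hypothetical sketch (field names from the docs, the function itself is not part of LeadHunter):

```python
RERUN_FIELDS = ("countries", "cities", "search_terms", "search_radius_meters")

def rerun_params(record: dict) -> dict:
    """Extract the stored search parameters from an old research record so
    the same sweep can be launched again against today's data."""
    return {k: record[k] for k in RERUN_FIELDS if k in record}

old = {
    "countries": ["DE"],
    "cities": ["Berlin"],
    "search_terms": ["bike shops"],
    "status": "completed",          # lifecycle fields are not re-run inputs
}
print(rerun_params(old))
```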

A research record moves through five states:

  1. planned — created but not started yet. Multi-step wizards (CSV import column-mapping, complex Maps queries) hold the research here while the operator configures it.
  2. in_progress — the discovery / scrape / enrichment is actively running. Long imports stay here for minutes.
  3. completed — finished successfully. Result counts and cost are populated.
  4. partial — finished with errors but produced some results. The error_log records what went wrong; the results that came through are on accounts.
  5. failed — couldn’t produce useful results (Google Maps quota exhausted before the first hit, the CSV was malformed, the website refused to scrape). The error is recorded so you can read it and decide whether to retry.

Forward-only: a research record doesn’t go back to planned once it’s started running.
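A plausible transition table under that forward-only rule (the states come from the list above; the exact set of allowed transitions is an assumption):

```python
# Forward-only lifecycle: once running, a record never returns to "planned",
# and the three finishing states are terminal.
ALLOWED = {
    "planned":     {"in_progress"},
    "in_progress": {"completed", "partial", "failed"},
    "completed":   set(),
    "partial":     set(),
    "failed":      set(),
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED.get(current, set())

assert can_transition("planned", "in_progress")
assert not can_transition("in_progress", "planned")  # forward-only
```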

When research completes, every account it produced has gone through the same five-level dedupe stack. Each result is either:

  • Created fresh — the dedupe stack didn’t find a match.
  • Merged into an existing account — the dedupe stack matched at confidence 85+ and absorbed the new data into the survivor.
  • Surfaced as a fuzzy candidate — match at the 5th level (fuzzy name within the same city, ≥85% similarity); awaiting human review under Accounts → Duplicates.

The research record’s counts reflect that split:

Field | What it counts
----- | --------------
total_items_found | Everything the discovery produced (pre-dedupe).
new_items_added | Rows where the dedupe stack didn’t match — new accounts created.
items_updated | Rows that were merged into existing accounts at levels 1–4 (auto-merge).
duplicates_found | Rows that surfaced as fuzzy candidates awaiting review.
errors_count | Items that hit a per-row error (malformed input, scrape failed, …) — see error_log.

So a quick scan of “discovered 142 places, 87 new, 51 merged into existing, 4 awaiting duplicate review, 0 errors” tells you what to look at next.
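In the example above the buckets add back up to the total (87 + 51 + 4 + 0 = 142). A small sketch of that summary line, assuming the four count fields partition total_items_found:

```python
def summarize(total: int, new: int, updated: int, dupes: int, errors: int) -> str:
    """Render the at-a-glance line from a record's result counts
    (format mirrors the example in the docs)."""
    return (f"discovered {total} places, {new} new, {updated} merged into existing, "
            f"{dupes} awaiting duplicate review, {errors} errors")

# Assumption: new + merged + awaiting-review + errors account for the total.
assert 87 + 51 + 4 + 0 == 142
print(summarize(142, 87, 51, 4, 0))
```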

External calls cost money — Google Maps charges per place lookup, Tavily charges per search, every LLM call burns tokens. Each cost is auto-attributed to the research record that triggered it; the record carries api_calls_made and estimated_cost for a quick at-a-glance number.
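The attribution is a simple roll-up: each external call is tagged with the research record that triggered it, then counted and priced. The sketch below illustrates the idea; the per-call prices are made-up placeholders, not real Google Maps, Tavily, or LLM pricing:

```python
from collections import defaultdict

# Hypothetical per-call prices in USD (illustrative only).
PRICE = {"google_maps": 0.017, "tavily": 0.010, "llm": 0.002}

def attribute_costs(calls: list[tuple[int, str]]) -> dict:
    """Roll (research_id, provider) call events up into per-record
    api_calls_made and estimated_cost totals."""
    totals: dict = defaultdict(lambda: {"api_calls_made": 0, "estimated_cost": 0.0})
    for research_id, provider in calls:
        t = totals[research_id]
        t["api_calls_made"] += 1
        t["estimated_cost"] += PRICE[provider]
    return dict(totals)

calls = [(1, "google_maps"), (1, "google_maps"), (1, "llm"), (2, "tavily")]
print(attribute_costs(calls))
```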

For the cross-research breakdown (provider totals, daily trends, per-record comparison views), see Usage and costs — the polished Usage page that surfaces all of this in one screen is on the roadmap.

Manually entered campaign expenses (AdWords, agency fees, labor hours) don’t flow into research records — those attach to campaigns instead, since they pay for a specific outbound effort, not for one batch of discovery. See Track campaign costs and CAC for that side.

Typical ways the log gets used:

  • A weird account showed up in a campaign — open the account, check the source field, and trace back to the research record. Useful when you don’t remember which import dropped it in.
  • You ran a Google Maps sweep months ago and want to re-run the same query to catch new businesses. The search_terms, countries, and cities fields on the old research record carry exactly what you searched.
  • You’re auditing AI cost spend and want per-discovery attribution — sort by estimated_cost desc.
  • An import partially failed (status='partial'). Open the record, read error_log, decide whether the errors are retryable.
Related pages:

  • Import accounts — the three discovery / enrichment paths in detail.
  • Account — the row that research produces.
  • Merge duplicates — what happens to fuzzy candidates a research record surfaces.
  • Usage and costs — the cost surface research records contribute to.