Research

A Research record is LeadHunter’s audit trail for how an account got into your database. Every Google Maps sweep, CSV import, URL lookup, AI enrichment pass, and operator-logged manual research session has its own row — so months later you can answer “why do we have this account, who put it there, what method was used, and what did it cost?”

Research is the audit-trail equivalent of status_history on the account itself — but for discovery, not lifecycle.

The dashboard’s Research page (sidebar → Research) is the home for the log. It shows:

  • Stats — total / in progress / completed / failed.
  • A filterable list of every research record in the active Company, newest first, with status, method, result counts, and the user who triggered it.
  • A New research action for logging research you did yourself — useful when discovery happened outside LeadHunter (a directory you bought, a trade-show booth scan, a vendor list) and you want the record for accounting and team visibility.
Method | Trigger | UI surface
------ | ------- | ----------
Google Maps search | Text query like “bike shops in Berlin”. | Accounts → Discover
Google Maps URL | Paste a maps.google.com/... URL. | Accounts → Lookup
Website / domain | Paste acme.com or a full URL — LeadHunter scrapes it. | Accounts → Lookup
CSV / XLSX import | Upload a spreadsheet through the import wizard. One research record per file. | Accounts → Import
AI deep research | Optional follow-up pass that fills missing fields and writes salesperson notes. | Accounts → Lookup, Deep mode
Manual research log | You did the discovery yourself; you’re recording it. | Research → New research

The method field on the record captures which path produced it: manual, google_maps_api, tavily_research, apollo_enrichment, web_scraping, csv_import, api_integration, other.
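As a sketch, those method values could be modeled as a string-backed enum (the values come from the list above; the class name itself is hypothetical):

```python
from enum import Enum

class ResearchMethod(str, Enum):
    """Discovery path that produced a research record (values from the docs)."""
    MANUAL = "manual"
    GOOGLE_MAPS_API = "google_maps_api"
    TAVILY_RESEARCH = "tavily_research"
    APOLLO_ENRICHMENT = "apollo_enrichment"
    WEB_SCRAPING = "web_scraping"
    CSV_IMPORT = "csv_import"
    API_INTEGRATION = "api_integration"
    OTHER = "other"

# Round-trip from the stored string value back to the enum member.
print(ResearchMethod("csv_import").name)  # CSV_IMPORT
```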

Bucket | Fields
------ | ------
Identity | name, description, method, status
Execution | started_at, completed_at, duration_seconds, executed_by
Tooling (automated runs only) | script_name, script_version, api_key_used (the env var name, not the key value)
Search parameters | countries, cities, search_terms, search_radius_meters, instructions, configuration
Results | total_items_found, new_items_added, items_updated, duplicates_found, errors_count
Cost | api_calls_made, estimated_cost (USD)
Provenance | source_files, notes, error_log, created_by, created_at, updated_at
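Put together, a research record could be sketched as a dataclass. The field names follow the buckets above; the types and defaults are assumptions, not the actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

@dataclass
class ResearchRecord:
    """Sketch of a research record; field names from the docs, types are guesses."""
    # Identity
    name: str
    method: str                        # e.g. "google_maps_api"
    status: str = "planned"
    description: str = ""
    # Execution
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None
    duration_seconds: Optional[int] = None
    executed_by: Optional[str] = None
    # Search parameters
    countries: list[str] = field(default_factory=list)
    cities: list[str] = field(default_factory=list)
    search_terms: list[str] = field(default_factory=list)
    search_radius_meters: Optional[int] = None
    # Results
    total_items_found: int = 0
    new_items_added: int = 0
    items_updated: int = 0
    duplicates_found: int = 0
    errors_count: int = 0
    # Cost
    api_calls_made: int = 0
    estimated_cost: float = 0.0        # USD
```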

Search-parameter fields (countries, cities, search_terms) make it easy to re-run a discovery later: a Google Maps sweep done six months ago records exactly what was searched, so a fresh sweep with the same terms is one click away to catch new businesses.
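A re-run could be as simple as copying those parameters off the old record. The helper below is a hypothetical sketch (field names from the docs, the function itself is not part of LeadHunter):

```python
RERUN_FIELDS = ("countries", "cities", "search_terms", "search_radius_meters")

def rerun_params(record: dict) -> dict:
    """Extract the stored search parameters from an old research record so
    the same sweep can be launched again against today's data."""
    return {k: record[k] for k in RERUN_FIELDS if k in record}

old = {
    "countries": ["DE"],
    "cities": ["Berlin"],
    "search_terms": ["bike shops"],
    "status": "completed",          # lifecycle fields are not re-run inputs
}
print(rerun_params(old))
```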

A research record moves through five states:

  1. planned — created but not started yet. Multi-step wizards (CSV import column-mapping, complex Maps queries) hold the research here while the operator configures it.
  2. in_progress — the discovery / scrape / enrichment is actively running. Long imports stay here for minutes.
  3. completed — finished successfully. Result counts and cost are populated.
  4. partial — finished with errors but produced some results. The error_log records what went wrong; the results that came through are on accounts.
  5. failed — couldn’t produce useful results (Google Maps quota exhausted before the first hit, the CSV was malformed, the website refused to scrape). The error is recorded so you can read it and decide whether to retry.

Forward-only: a research record doesn’t go back to planned once it’s started running.
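A plausible transition table under that forward-only rule (the states come from the list above; the exact set of allowed transitions is an assumption):

```python
# Forward-only lifecycle: once running, a record never returns to "planned",
# and the three finishing states are terminal.
ALLOWED = {
    "planned":     {"in_progress"},
    "in_progress": {"completed", "partial", "failed"},
    "completed":   set(),
    "partial":     set(),
    "failed":      set(),
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED.get(current, set())

assert can_transition("planned", "in_progress")
assert not can_transition("in_progress", "planned")  # forward-only
```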

When research completes, every account it produced has gone through the same five-level dedupe stack. Each result is either:

  • Created fresh — the dedupe stack didn’t find a match.
  • Merged into an existing account — the dedupe stack matched at confidence 85+ and absorbed the new data into the survivor.
  • Surfaced as a fuzzy candidate — match at the 5th level (fuzzy name within the same city, ≥85% similarity); awaiting human review under Accounts → Duplicates.

The research record’s counts reflect that split:

Field | What it counts
----- | --------------
total_items_found | Everything the discovery produced (pre-dedupe).
new_items_added | Rows where the dedupe stack didn’t match — new accounts created.
items_updated | Rows that were merged into existing accounts at levels 1–4 (auto-merge).
duplicates_found | Rows that surfaced as fuzzy candidates awaiting review.
errors_count | Items that hit a per-row error (malformed input, scrape failed, …) — see error_log.

So a quick scan of “discovered 142 places, 87 new, 51 merged into existing, 4 awaiting duplicate review, 0 errors” tells you what to look at next.
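In the example above the buckets add back up to the total (87 + 51 + 4 + 0 = 142). A small sketch of that summary line, assuming the four count fields partition total_items_found:

```python
def summarize(total: int, new: int, updated: int, dupes: int, errors: int) -> str:
    """Render the at-a-glance line from a record's result counts
    (format mirrors the example in the docs)."""
    return (f"discovered {total} places, {new} new, {updated} merged into existing, "
            f"{dupes} awaiting duplicate review, {errors} errors")

# Assumption: new + merged + awaiting-review + errors account for the total.
assert 87 + 51 + 4 + 0 == 142
print(summarize(142, 87, 51, 4, 0))
```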

External calls cost money — Google Maps charges per place lookup, Tavily charges per search, every LLM call burns tokens. Each cost is auto-attributed to the research record that triggered it; the record carries api_calls_made and estimated_cost for a quick at-a-glance number.
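The attribution is a simple roll-up: each external call is tagged with the research record that triggered it, then counted and priced. The sketch below illustrates the idea; the per-call prices are made-up placeholders, not real Google Maps, Tavily, or LLM pricing:

```python
from collections import defaultdict

# Hypothetical per-call prices in USD (illustrative only).
PRICE = {"google_maps": 0.017, "tavily": 0.010, "llm": 0.002}

def attribute_costs(calls: list[tuple[int, str]]) -> dict:
    """Roll (research_id, provider) call events up into per-record
    api_calls_made and estimated_cost totals."""
    totals: dict = defaultdict(lambda: {"api_calls_made": 0, "estimated_cost": 0.0})
    for research_id, provider in calls:
        t = totals[research_id]
        t["api_calls_made"] += 1
        t["estimated_cost"] += PRICE[provider]
    return dict(totals)

calls = [(1, "google_maps"), (1, "google_maps"), (1, "llm"), (2, "tavily")]
print(attribute_costs(calls))
```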

For the cross-research breakdown (provider totals, daily trends, per-record comparison views), see Usage and costs — the polished Usage page that surfaces all of this in one screen is on the roadmap.

Manually entered campaign expenses (AdWords, agency fees, labor hours) don’t flow into research records — those attach to campaigns instead, since they pay for a specific outbound effort, not for one batch of discovery. See Track campaign costs and CAC for that side.

Typical ways the log gets used:

  • A weird account showed up in a campaign — open the account, check the source field, and trace back to the research record. Useful when you don’t remember which import dropped it in.
  • You ran a Google Maps sweep months ago and want to re-run the same query to catch new businesses. The search_terms, countries, and cities fields on the old research record carry exactly what you searched.
  • You’re auditing AI cost spend and want per-discovery attribution — sort by estimated_cost desc.
  • An import partially failed (status='partial'). Open the record, read error_log, decide whether the errors are retryable.
Related pages:

  • Import accounts — the three discovery / enrichment paths in detail.
  • Account — the row that research produces.
  • Merge duplicates — what happens to fuzzy candidates a research record surfaces.
  • Usage and costs — the cost surface research records contribute to.