Import accounts

There are three ways to get accounts into LeadHunter. All three run the same dedupe checks against your existing database, so you can’t accidentally create a second row for an account you already have — and the merge keeps every unique field from every duplicate rather than dropping data on the floor.

Pick the path that matches the source:

| Source | Use | Where |
| --- | --- | --- |
| One-off entry | Quick add by name, URL, or search query | Accounts → Lookup |
| Existing list (CSV / XLSX) | Bring over a sheet you already maintain | Accounts → Import |
| New market discovery | Sweep a geo/category in one pass | Accounts → Discover (Google Maps) |

Best for ad-hoc additions — “there’s this one bike shop I want to track, can you find it for me?”

From Accounts → Lookup, paste any of:

  • A website URL (https://acme.com) — scraped + enriched.
  • A bare domain (acme.com) — LeadHunter prepends https://.
  • A Google Maps URL — pulls address, phone, hours, rating, place id.
  • A plain-text query (“bike shops in Berlin”) — Google Maps text search; the top match is used. Requires Google Maps to be configured.
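
These four input shapes are told apart before anything is fetched. A minimal sketch of how that classification could work (the function name and return labels are illustrative, not LeadHunter’s internals):

```python
import re

def classify_lookup_input(raw: str) -> tuple[str, str]:
    """Classify what was pasted and normalise it (illustrative sketch)."""
    text = raw.strip()
    if "google." in text and "/maps" in text:
        return "maps_url", text                    # pull place details from Maps
    if re.match(r"^https?://", text):
        return "website_url", text                 # scrape + enrich as-is
    if re.match(r"^[\w.-]+\.[a-z]{2,}$", text, re.I):
        return "website_url", f"https://{text}"    # bare domain: prepend https://
    return "text_query", text                      # fall back to Maps text search

# classify_lookup_input("acme.com") -> ("website_url", "https://acme.com")
```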

Each lookup runs in one of three modes — the dropdown next to the URL field:

| Mode | What runs | When to pick |
| --- | --- | --- |
| None | Plain scrape; no LLM. | You’re importing a vetted URL and don’t want to spend Gemini quota. |
| Quick (default) | AI extracts business name, sector, language, key fields you didn’t provide. | Most of the time. |
| Deep | Quick + multi-page crawl (about / team / staff / contact pages) + decision-maker contact extraction. | When you want named contacts pre-filled, not just the company. |

Deep mode is heavier (more page fetches, more tokens) but produces an account that’s already populated with one or two contacts with names, titles, and (when discoverable) emails. Use it for high-value accounts where you’d otherwise hand-research the buying group.

Best for bringing over a list you already maintain — your CRM export, a trade-show booth scan, an industry directory you bought, etc. Accounts → Import runs a three-step wizard.

Drop a .csv, .xlsx, or .xls file. LeadHunter parses the headers and shows you a sample of the first few rows so you can confirm parsing worked (Excel files with ragged rows — rows that have fewer cells than the header — are handled correctly).

This is the step where you tell LeadHunter what each column means. The AI proposes a mapping based on the column names and the sample rows — e.g. org_name → name, tel → phone, country_iso → country. Review and adjust.

Standard target fields:

name, address, city, state_province_region, postal_code, country, phone, email, website, business_type, specialization, rating, language, status, notes, acquisition_channel, latitude, longitude, google_place_id, imported_id.

Aliases — state, province, and region all resolve to state_province_region, so you don’t have to rename your column.

Skip — set the target to skip for columns you don’t want imported. Skipped columns can still be preserved as JSON (see Save extras below).

imported_id — map your source’s stable row id here. If your file has a column like Station UUID, CRM Record ID, Customer ID, or any other identifier that uniquely identifies a row in the source dataset, map it to imported_id. Re-importing the same file (or a new export from the same source) will then resolve every match via that id at the highest confidence — even when name + city don’t agree, or are missing entirely. Without an imported_id, dedupe falls back to name + city + phone + website + fuzzy name, which can miss matches when the file is sparse (radio-station directories, lead-magnet form dumps, etc.). Common header names that the AI auto-suggests: ID, External ID, OriginalID, Imported ID.

name is mandatory — the import refuses to start without at least one column mapped to it.
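
A minimal sketch of the alias resolution and the mandatory-name check described above (the alias table is partial and illustrative; in practice the AI proposes the mapping from your headers and sample rows, and you adjust it):

```python
TARGET_FIELDS = {
    "name", "address", "city", "state_province_region", "postal_code", "country",
    "phone", "email", "website", "business_type", "specialization", "rating",
    "language", "status", "notes", "acquisition_channel", "latitude", "longitude",
    "google_place_id", "imported_id",
}

# Partial, illustrative alias table: several common source headers resolve to one target field.
ALIASES = {
    "state": "state_province_region",
    "province": "state_province_region",
    "region": "state_province_region",
    "org_name": "name",
    "tel": "phone",
    "country_iso": "country",
    "external id": "imported_id",
    "originalid": "imported_id",
}

def resolve_mapping(columns: list[str]) -> dict[str, str]:
    """Map each source column to a target field; unknown columns default to 'skip'."""
    mapping = {}
    for col in columns:
        key = col.strip().lower()
        mapping[col] = key if key in TARGET_FIELDS else ALIASES.get(key, "skip")
    if "name" not in mapping.values():
        raise ValueError("At least one column must be mapped to 'name'")
    return mapping

# resolve_mapping(["org_name", "tel", "Fax"]) -> {"org_name": "name", "tel": "phone", "Fax": "skip"}
```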

Already imported this file? You’ll see a warning.

If the bytes of the file you just uploaded exactly match a previous import in this company, the mapping step shows a yellow banner at the top: “You’ve already imported this exact file” — with the prior import date, who ran it, how many rows landed, and how many of those accounts still exist. Cancel if you uploaded the wrong file; proceed if the warning is just informational (you re-uploaded on purpose). The dedupe stack will skip the duplicates regardless.
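
The check is a byte-level comparison: a fingerprint of the upload is compared with the fingerprints of earlier imports in the same company. Roughly, assuming a prior-import record shaped like the dict below (the field names are hypothetical):

```python
import hashlib

def file_fingerprint(file_bytes: bytes) -> str:
    """MD5 over the exact bytes of the upload, so only byte-identical files match."""
    return hashlib.md5(file_bytes).hexdigest()

def previous_import_of(file_bytes: bytes, past_imports: list[dict]) -> dict | None:
    """Return the earlier import whose fingerprint matches, if any.
    Each record is assumed to look like {"md5": ..., "date": ..., "user": ..., "rows": ...}."""
    digest = file_fingerprint(file_bytes)
    return next((imp for imp in past_imports if imp.get("md5") == digest), None)
```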

Step 3: Run (and the options worth knowing)

A few options control how the import behaves:

  • Static values — constants applied to every row. Useful when the file doesn’t carry a column for it but the value is the same for the whole batch. Examples: country=Spain for a Spanish trade-show export, language=Catalan for a regional directory, acquisition_channel=event for a booth scan. Anything you set here overrides values from the file.
  • Save unmapped columns as extra_data — when on, every column you mapped to skip (or didn’t map at all) is preserved as a JSON dict on each account. Use this when the source has fields LeadHunter doesn’t model but you don’t want to lose them.
  • Enrich accounts after import — when on, every newly-created account is queued for website discovery + scrape (logo, email, phone, social links, site language) once the rows land. The enrich runs as a separate background job that shows on the Tasks page. Off by default — it costs Gemini calls and time. Tick it for a small trusted list; leave it off for a 50k-row directory unless you actually want every row enriched.
  • Source label — a free-text tag stamped on each imported row’s source field. Defaults to the filename (lower-case, with separators tidied — radio_stations_radiobrowser.xlsx becomes radio stations radiobrowser; the tidying is sketched after this list). Edit it to whatever provenance marker reads best to you and your teammates: “LinkedIn export 2026-Q2”, “Trade show 2026”, “Stripe customers”, etc. Beyond the label, every imported row also carries the file’s MD5 fingerprint invisibly, so you can later answer “show me every account that came from this specific upload” exactly even if multiple imports share the same label.
  • Dry-run — runs every dedupe check and shows you what would happen (created / merged / fuzzy candidates) without writing. Recommended for any import larger than a handful of rows.
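
Two of those options are easy to sketch: the default-label tidying, and the rule that static values win over file values. Both functions are illustrative, not LeadHunter’s code:

```python
import re
from pathlib import Path

def default_source_label(filename: str) -> str:
    """Default label: drop the extension, lower-case, turn separators into spaces."""
    stem = Path(filename).stem.lower()
    return re.sub(r"[_\-.]+", " ", stem).strip()

def apply_static_values(row: dict, static: dict) -> dict:
    """Static values override whatever the file carried for the same field."""
    return {**row, **static}

# default_source_label("radio_stations_radiobrowser.xlsx") -> "radio stations radiobrowser"
# apply_static_values({"name": "Bikes SL", "country": ""}, {"country": "Spain"})
#   -> {"name": "Bikes SL", "country": "Spain"}
```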

Files above about 2,000 rows are handed to a background worker so the browser doesn’t sit on the request. You’ll see a live progress bar with rows-processed / total, animating as the import advances:

  • You can close the dialog at any time — the import keeps running on the server, and the accounts show up on Accounts as soon as it finishes.
  • If the file is rejected (e.g. an invalid column type Postgres won’t accept), the worker rolls back the whole batch atomically — you’ll see the failure in the result panel and nothing partial lands in the database.
  • The job also shows on the Tasks page as a row with method CSV Import. Click it for full details (timing, totals, error log).

Smaller files (and any dry-run) finish synchronously — same modal, no progress step.
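
The routing rule boils down to row count plus the dry-run flag. A tiny sketch, with the approximate threshold written as a constant (the names and exact cut-over are illustrative):

```python
ASYNC_ROW_THRESHOLD = 2_000  # approximate cut-over described above

def runs_in_background(row_count: int, dry_run: bool) -> bool:
    """Dry-runs and small files finish synchronously in the modal;
    larger real imports are handed to the background worker."""
    return (not dry_run) and row_count > ASYNC_ROW_THRESHOLD
```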

When the real import finishes, the summary tells you: rows created, rows merged into existing accounts, fuzzy-match candidates that need your review, and rows skipped (with reasons). Fuzzy candidates land in Accounts → Duplicates for confirmation.

If LeadHunter had to truncate any over-long values to fit the column limits (Name is capped at 255 characters, Website at 500, Language at 50, etc.), an amber summary box lists the count per field — e.g. language: 126 · name: 17 · website: 2. The rows still imported; you just lost the tail of those long strings. Fix the source file if the truncated values matter.
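
The amber box is a per-field tally of clipped values. A minimal sketch of how such a summary might be produced, using the column limits mentioned above (the limit table here is deliberately partial):

```python
# Column limits mentioned above; the real table covers every capped field.
FIELD_LIMITS = {"name": 255, "website": 500, "language": 50}

def truncate_row(row: dict, counts: dict) -> dict:
    """Clip over-long string values to their column limit, counting one truncation per field."""
    out = {}
    for field, value in row.items():
        limit = FIELD_LIMITS.get(field)
        if limit and isinstance(value, str) and len(value) > limit:
            counts[field] = counts.get(field, 0) + 1
            value = value[:limit]
        out[field] = value
    return out

# After the run, counts like {"language": 126, "name": 17, "website": 2}
# render as "language: 126 · name: 17 · website: 2".
```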

If the database rejected the batch outright (a constraint violation that truncation can’t fix), the panel turns red, says “Import failed — nothing saved”, and lists the underlying database error. No partial data lands; fix the source and retry.

Best for entering a new market — “who are all the bike shops in Berlin?” Run a query from Accounts → Discover and LeadHunter pages through Google Maps results, fetches the place details for each, and adds them as accounts.

Each result runs through the same dedupe stack, so re-running the same query a month later doesn’t double-count — it merges fresh place details (new phone, updated hours, current rating) into the row you already have. The google_place_id constraint makes the match exact.

Requires the Google Maps API to be configured in your environment; ad-hoc lookups by Maps URL or domain don’t depend on it.

LeadHunter can run an enrichment pass on every freshly-imported account. For each account, the pass:

  1. Tries to discover the website via Gemini grounding (and a domain-guessing fallback when the model has nothing).
  2. Scrapes the discovered URL and caches the text content on the account.
  3. Detects the language of the site so scoring writes in the right one.

Two ways this gets triggered (and one path that doesn’t need it):

  • Single-account lookup (Quick / Deep modes) runs the enrich inline — it’s part of the lookup itself.
  • Bulk import (CSV / XLSX) — tick Enrich accounts after import during step 3. The enrich runs as a separate background job once the rows land. Off by default so you can decide; a 50k-row directory imported without the box ticked stays cheap and fast, and you can enrich a subset later from Accounts → Enrich selected.
  • Google Maps discovery rows arrive already-populated from the Maps result (website + phone + address), so the enrich is usually a no-op and is skipped.

When you do enrich, rows that arrive already-populated skip the work — the discovery / scrape only fires when there’s a gap to fill. Merge survivors also skip; they’re already populated from the source rows. You’ll see freshly-imported accounts grow website content and a language tag over the following minutes. The activity is tracked under API costs and visible on the Tasks page.
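
Put together, the per-account decision looks roughly like this. The helpers are hypothetical placeholders for the Gemini-grounded discovery, the domain-guessing fallback, the scraper, and the language detector:

```python
# Hypothetical placeholders for the real steps: Gemini-grounded discovery,
# a domain-guessing fallback, the scraper, and the language detector.
def discover_website(name: str) -> str | None: ...
def guess_domain(name: str) -> str | None: ...
def scrape_site(url: str) -> str | None: ...
def detect_language(text: str) -> str | None: ...

def enrich_account(account: dict) -> dict:
    """Fill gaps only: a field that already has a value skips that step entirely."""
    if not account.get("website"):
        account["website"] = discover_website(account["name"]) or guess_domain(account["name"])
    if account.get("website") and not account.get("website_content"):
        account["website_content"] = scrape_site(account["website"])
    if account.get("website_content") and not account.get("language"):
        account["language"] = detect_language(account["website_content"])
    return account
```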

Every path runs the same dedupe stack. The first level that matches wins:

  1. Google Place ID — exact match → auto-merge (100% confidence). The hardest constraint; a Google place is unambiguous.
  2. Imported ID — exact match on imported_id within the same company → auto-merge (100% confidence). When you mapped the source’s stable row id (Station UUID, CRM Record ID, etc.) during the import, a re-import resolves here first — even when name + city don’t agree or are blank. This is the level that makes monthly CRM re-exports clean: every row that already exists is found here.
  3. Name + city — exact match after normalisation (legal suffixes, punctuation, filler words stripped). Auto-merge at 95%.
  4. Phone — normalised exact match (strips formatting, country codes, spaces). Auto-merge at 90%.
  5. Website domain — normalised exact match (www., scheme, trailing slash stripped). Auto-merge at 85%.
  6. Fuzzy name within the same city — ≥85% similarity. Suggested merge — surfaces in Accounts → Duplicates for you to confirm.

When a candidate is auto-merged (levels 1–5), every unique field from every duplicate is preserved on the survivor (the “golden record”), and the audit trail records what was merged. When a candidate is suggested (level 6), it shows up in Accounts → Duplicates for you to confirm or reject — LeadHunter never auto-merges fuzzy candidates.
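
A condensed sketch of that cascade, returning the first level that matches together with its confidence. The normalisation helpers and thresholds mirror the description above but are illustrative, not LeadHunter’s actual matching code:

```python
import re
from difflib import SequenceMatcher

def norm_name(s: str) -> str:
    """Strip legal suffixes, punctuation and filler words before comparing names."""
    s = re.sub(r"\b(gmbh|ltd|inc|llc|s\.?a\.?|the)\b", "", s.lower())
    return re.sub(r"[^\w\s]", "", s).strip()

def norm_phone(s: str) -> str:
    """Drop formatting; keeping the trailing digits crudely approximates dropping country codes."""
    return re.sub(r"\D", "", s or "")[-9:]

def norm_domain(url: str) -> str:
    """Strip scheme, www. and trailing slash."""
    return re.sub(r"^(https?://)?(www\.)?", "", (url or "").lower()).rstrip("/")

def match_level(new: dict, old: dict) -> tuple[str, int] | None:
    """Return (action, confidence) for the first level that matches, or None."""
    if new.get("google_place_id") and new["google_place_id"] == old.get("google_place_id"):
        return ("auto_merge", 100)                       # level 1
    if new.get("imported_id") and new["imported_id"] == old.get("imported_id"):
        return ("auto_merge", 100)                       # level 2
    if norm_name(new["name"]) == norm_name(old["name"]) and new.get("city") == old.get("city"):
        return ("auto_merge", 95)                        # level 3
    if new.get("phone") and norm_phone(new["phone"]) == norm_phone(old.get("phone", "")):
        return ("auto_merge", 90)                        # level 4
    if new.get("website") and norm_domain(new["website"]) == norm_domain(old.get("website", "")):
        return ("auto_merge", 85)                        # level 5
    similarity = SequenceMatcher(None, norm_name(new["name"]), norm_name(old["name"])).ratio()
    if new.get("city") == old.get("city") and similarity >= 0.85:
        return ("suggest_merge", round(similarity * 100))  # level 6: needs your confirmation
    return None
```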

For the full merge workflow + field-winner rules, see Merge duplicates.

  • Importing without dry-run. For anything over ~50 rows, run dry-run first. It costs nothing and tells you exactly what the dedupe stack will do — including which rows will merge into existing accounts.
  • Skipping the column mapping review. The AI suggestion is usually right, but it’s worth a scan — a notes column that lands on description and vice versa is annoying to clean up later.
  • Forgetting to set acquisition_channel for inbound imports. If you’re loading a Stripe-customer export or an Adwords-leads CSV, set acquisition_channel as a static value (adwords, cold_inbound, etc.) so the rows land at status='contacted' and slot into the attribution dashboard instead of looking like outbound prospects.
  • Re-importing a file you’ve already imported. LeadHunter detects this — the mapping step shows a yellow warning with the prior import’s date, who ran it, and how many rows are still around. The dedupe stack then skips every match, so re-imports of unchanged files are safe no-ops. The trap is when the file changed (new export from the same CRM) but you didn’t map a stable id: without imported_id, dedupe falls back to name + city + phone + website, which can miss matches on sparse files and create duplicates. Map the source’s stable id to imported_id for any dataset you’ll re-import regularly.