Introduction and context
1.1 The investigative need
The subject of every published investigation will, eventually, look up the byline. So will the subject's lawyers, the subject's PR firm, and anyone the subject pays to retaliate. What they find on the first page of a search for your name shapes the legal threat letter you receive, the harassment campaign that follows publication, and in some cases the level of physical risk attached to the work. You do not get to decide whether they look. You only get to decide what they find.
This tutorial inverts the standard OSINT workflow and applies it to the practitioner. You are the target. The exercise is a structured pass through the same tools, sources and reasoning patterns you use against subjects, run against your own name, handles, addresses, family, employers and assets. The output is three artefacts: an adversary summary, a remediation list, and a follow-up cadence. The adversary summary tells you what someone with twenty minutes and a search bar will find first. The remediation list tells you what to remove, what to suppress and what is unrecoverable. The cadence keeps the audit current as new exposures accumulate.
The exercise is foundational because the controls in CZ1 (browser isolation, VPN, persona hygiene) protect only the work you do from this point onward. They cannot retract data already in circulation. The dry run quantifies what is already out and produces a defensible plan to address it. Done once, it is a baseline. Done every six months, it is a discipline.
1.2 Learning outcomes
- Run a structured self-OSINT pass across name, handles, email addresses, phone numbers, addresses and asset records, using the same tools an adversary would.
- Identify which exposures sit on UK, US and EU sources and route remediation to the correct opt-out or removal procedure for each.
- Distinguish recoverable exposure (live broker listings, current social profiles) from unrecoverable exposure (cached pages, archived public records, leaked breach data) and triage accordingly.
- Produce three documented artefacts: an adversary summary, a remediation list with priorities, and a six-month follow-up checklist.
- Establish a six-month cadence to re-run the audit and capture new exposures as they accumulate.
1.3 Threat model
- Pre-publication legal threats and intimidation that rely on the subject identifying your home address, family or financial profile from public sources.
- Targeted harassment, swatting and doxxing campaigns aimed at your physical address, family members or workplace.
- Account-takeover attempts that exploit credentials exposed in past data breaches, especially against email, banking and social accounts used for source contact.
Foundational theory and ethical-legal framework
Digital footprint: The aggregate of personally-identifying information about an individual that is retrievable from public sources without authentication. Includes name and address records, social media profiles, breached credentials, leaked datasets, professional registrations, property records, court filings, news mentions, and content the individual has published themselves.
Self-OSINT (or self-doxxing dry run): The practice of running a structured open-source investigation against your own identity using the same tools and reasoning patterns an adversary would use, with the goal of producing a remediation plan. The term "self-doxxing" appears in some practitioner literature; the operational meaning is identical.
Data broker: A commercial entity that aggregates personal data from public records, marketing partners, electoral registers and breach disclosures, then resells access. Examples: 192.com (UK), Spokeo, Whitepages, BeenVerified (US), and dozens of EU-specific equivalents. Most operate an opt-out or suppression procedure mandated by GDPR or local equivalents.
Open Register (UK): The version of the UK electoral register available for sale to any organisation for any purpose. Approximately 40 percent of UK registered electors remain on it. Opting out is free at gov.uk/register-to-vote but is not retrospective: brokers who purchased prior editions retain that data legally.
Recoverable versus unrecoverable exposure: Recoverable exposure is data on a live source with a working removal procedure: a broker listing you can opt out of, an old social profile you can delete, a cached page that will refresh out. Unrecoverable exposure is data that is permanently in circulation: archived breach dumps, court records, news articles, archived versions of opted-out pages. The distinction governs whether you remediate, suppress or accept.
2.2 Ethical and legal boundaries
The audit operates entirely on data about yourself, so consent and lawful basis questions do not apply to the lookups.
Three specific legal points apply to the remediation phase.
First, UK GDPR Article 17 (right to erasure) and equivalent provisions in EU GDPR and California CCPA give you a legal right to demand removal from most commercial data brokers, with statutory response timelines.
Second, the Open Register opt-out at gov.uk/register-to-vote is the only operational way to remove your name from the upstream feed that supplies most UK people-search sites; the Representation of the People Bill 2024-26, currently before Parliament, may switch the register from opt-out to opt-in, which would alter the procedure.
Third, since January 2025 the Economic Crime and Corporate Transparency Act 2023 has allowed suppression of director and officer information at Companies House, at a fee per document. Court records, news archives and most public registers do not fall under any erasure right.
Consult legal counsel before any remediation step that touches on your professional registrations, court records or published bylines.
2.3 Device compartmentalisation
The audit generates significant traffic to broker, breach-check and social-media services from your IP and your accounts. Run the audit from the investigative browser profile and VPN endpoint established in CZ1, not from your personal browser. This prevents the audit traffic from polluting your personal recommendation algorithms, search histories and login records, and also prevents the rare case where a broker site stores your IP and serves it back to subjects who pay to look you up.
Applied methodology
3.1 Required tools and setup
Have I Been Pwned (HIBP): the standard public reference for whether an email address or phone number appears in a known data breach. Free, no account required to search. Subscribe to the notification service for ongoing alerts. HIBP currently lists more than 970 breached sites and processes new dumps continuously. Available at haveibeenpwned.com. Note that Google's Dark Web Report tool was discontinued in February 2026; HIBP is the recommended replacement.
Firefox Monitor: uses the HIBP backend with a browser-integrated interface for users on Firefox or with a Mozilla account. Useful for monitoring multiple addresses simultaneously. Available at monitor.firefox.com.
Google Takeout: exports the data Google holds about you across Search history, location history, YouTube, Maps, Drive metadata and account activity. The export reveals what the dominant search engine knows about you, which is also a partial picture of what an adversary cross-referencing leaked Google data might know. Available at takeout.google.com.
192.com (UK): aggregates the UK Open Register, BT phone directories, Companies House filings and Land Registry property prices. The default first stop for any UK lookup. Free results require registration; full results are credit-based. Use the free tier only. Available at 192.com.
Tracesmart and PeopleTraceUK (UK): UK-specific people-search services that retain Open Register data going back to 2002 and 2008 respectively. Routinely missed by US-focused removal services. Each has a manual opt-out procedure on its site.
Spokeo, Whitepages, BeenVerified, Radaris (US): the four most visible US people-search aggregators. Each has a removal procedure, each takes between one and four weeks to action, and each must be re-checked periodically because removed records sometimes reappear. Optery maintains a current list of US broker opt-out pages organised by aggregator at optery.com.
EU broker landscape: highly fragmented by member state. Germany has Das Telefonbuch and 11880; France has 118712 and Les Pages Blanches; Netherlands has Bellingland and KvK (Chamber of Commerce). National data protection authorities maintain lists of registered brokers; check your DPA's broker registry. GDPR Article 17 gives you a right of erasure that applies to all of them.
Wayback Machine and archive.today: use to check what archived versions exist of pages you have controlled (old personal sites, old social profiles). Removed live pages often persist indefinitely in archives. The Wayback Machine has a removal request procedure; archive.today does not. Available at web.archive.org and archive.today.
Google, Bing, DuckDuckGo, Yandex: run the same name and handle searches across all four. Each indexes differently and surfaces different exposures. Yandex in particular often surfaces image and Russian-language sources that the others miss.
PimEyes (caveat): a reverse-image search service that has indexed billions of public face images. The free tier reveals where your face appears online. We list it for completeness because adversaries use it; we do not recommend using it casually because the lookup itself can be logged. If you use it, do so once, from your investigative browser, and read its privacy policy first. Available at pimeyes.com.
3.2 OPSEC and target awareness
OPSEC for self-OSINT also extends to who observes your remediation. Some opt-out procedures generate confirmation emails to the address being searched, which is fine. Some generate postal mail to the address being suppressed, which can alert household members or co-residents to what you are doing. Some require a photograph of government ID to action removal, which is itself an exposure. Read each broker's removal procedure before submitting it.
3.3 Practical execution
The audit runs in seven steps, working from broadest exposure to most targeted. Each step produces an entry on the adversary summary and a candidate entry on the remediation list. Total time, first pass, is approximately two hours.
| Tool or operator | Investigative query | Investigative value |
|---|---|---|
| Have I Been Pwned | Email address; phone number | Lists known breach exposures by date and source; subscribes to ongoing notifications |
| Firefox Monitor | Multiple email addresses | Same backend as HIBP, multi-address monitoring |
| Google site search | | Surfaces profiles indexable but not findable through native platform search |
| Google image search | Drag profile photo to image-search box | Reverse-finds where your face has been used elsewhere |
| Yandex image search | Same input | Surfaces non-Western image sources others miss |
| 192.com (UK) | Full name plus town | Lists Open Register, BT directory, Companies House and Land Registry hits |
| Optery directory | Lookup of US broker landscape | Maintains current list of US opt-out URLs by broker |
| Wayback Machine | | Lists archived versions of any site you have controlled |
| archive.today | Same input | Independent archive often holding what Wayback does not |
| Companies House (UK) | Name and date of birth | Lists current and former directorships |
| Strava heat-map | Account activity log | Reveals home and work locations through repeated routes |
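Running the same query across several engines is easier to keep consistent if the URLs are generated rather than typed. The sketch below is illustrative only: the result-page URL patterns are assumptions based on each engine's public search page and should be checked before use, and the `site_query` helper, the platform domain and the handle are hypothetical names introduced for the example. It uses the standard `site:` search operator to restrict a query to one platform.

```python
from urllib.parse import quote_plus

# Assumed public result-page URL patterns; verify each before relying on it.
ENGINES = {
    "google": "https://www.google.com/search?q={q}",
    "bing": "https://www.bing.com/search?q={q}",
    "duckduckgo": "https://duckduckgo.com/?q={q}",
    "yandex": "https://yandex.com/search/?text={q}",
}


def site_query(platform_domain: str, handle: str) -> str:
    """Build a site-restricted query for an exact handle (hypothetical helper)."""
    return f'site:{platform_domain} "{handle}"'


def search_urls(query: str) -> dict[str, str]:
    """Return one search URL per engine for the same query string."""
    encoded = quote_plus(query)
    return {name: pattern.format(q=encoded) for name, pattern in ENGINES.items()}
```

For example, `search_urls(site_query("example.com", "jsmith"))` returns four URLs that can be opened one by one in the investigative browser profile, so each engine sees exactly the same query.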
3.4 Visual documentation standards
Capture audit-grade evidence at each step. The screenshots become the baseline for the six-month re-run.
Full-window screenshots, not crops, for every search result page. Include the URL bar, the search query, the timestamp.
Per-broker, capture the listing in full plus the URL of the broker's own opt-out or removal page.
For each social profile, capture the privacy-settings page as well as the public profile page. The two together establish what is exposed and what could be exposed if a setting is changed.
For HIBP results, capture the per-breach detail pages, not only the summary. Each breach detail records what data classes were exposed (passwords, addresses, dates of birth), which determines the remediation priority.
File-name every capture with the pattern YYYY-MM-DD_source_query.png, in a single audit folder per pass. Audits dated more than a week apart should never co-mingle.
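The naming pattern can be applied mechanically so that every capture sorts by date and source. This is a minimal sketch, assuming that source and query text can be reduced to lowercase ASCII slugs; `capture_name` is a hypothetical helper, and names containing non-ASCII characters would need a different slug rule.

```python
import re
from datetime import date


def capture_name(source: str, query: str, day: date) -> str:
    """Return a file name following the YYYY-MM-DD_source_query.png pattern."""

    def slug(text: str) -> str:
        # Lowercase, then collapse every run of non-alphanumeric characters
        # into a single hyphen so underscores stay reserved as field separators.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{day.isoformat()}_{slug(source)}_{slug(query)}.png"
```

Calling `capture_name("192.com", "Jane Smith London", date(2025, 3, 1))` gives `2025-03-01_192-com_jane-smith-london.png`.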
3.5 Data preservation and chain of custody
The audit folder is itself sensitive: it is a complete map of your exposure. Treat it as you would treat investigative material. Store it on the encrypted partition of the investigative laptop, not in cloud sync. Hash the folder at the close of each audit pass and record the hash, so that a future re-run can verify the previous baseline has not been altered. Generate the hash with a single command per platform.
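As a portable alternative to a per-platform command, the folder hash can be sketched in Python. The approach below is one reasonable convention, not a standard: it hashes each file with SHA-256, builds a sorted manifest of `digest  relative/path` lines, and hashes that manifest. The function names are hypothetical. A re-run must use the same convention for the baseline comparison to be meaningful.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_folder(folder: Path) -> str:
    """Return one SHA-256 digest covering every file's relative path and contents."""
    # Sort by POSIX-style relative path so the order is the same on every platform.
    entries = sorted(
        (p.relative_to(folder).as_posix(), p)
        for p in folder.rglob("*")
        if p.is_file()
    )
    manifest = hashlib.sha256()
    for rel, path in entries:
        manifest.update(f"{hash_file(path)}  {rel}\n".encode("utf-8"))
    return manifest.hexdigest()
```

Record the value returned by `hash_folder(Path("path/to/audit-folder"))` somewhere outside the folder itself, since writing it inside would change the hash on the next run.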
Verification and analysis for reporting
4.1 Corroboration strategy
Verification in self-OSINT is about ensuring no exposure category has been missed and that the recoverability classification is correct.
Every identifier from Step 1 has been searched on at least four search engines and on every relevant people-search platform for your jurisdiction.
Every breach result is corroborated by checking the same email on a second source (Firefox Monitor, the breach's own notification email if held, or the source's own announcement).
Every people-search hit has been verified by visiting the listing directly, not relying on the search snippet, since some brokers serve different content to logged-in versus public visitors.
Every "no result" claim has been re-tested by searching a known-exposed identifier as a control. If your control returns nothing, the search itself is broken.
Every classification of "unrecoverable" has been tested by attempting one removal request, since some sources are more responsive than their reputation suggests.
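The control check in the list above reduces to a small decision rule: a nil result only counts if the same search, run the same way, finds an identifier you already know is exposed. A minimal sketch, with a hypothetical function name and hit counts entered by hand from what you observed:

```python
def interpret_result(target_hits: int, control_hits: int) -> str:
    """Classify a search outcome using a known-exposed identifier as the control."""
    if target_hits > 0:
        return "exposed"
    if control_hits > 0:
        # The search demonstrably works, so the nil result can be recorded.
        return "no result (verified by control)"
    # Neither the target nor the control surfaced: the search itself is suspect.
    return "inconclusive: re-test the search"
```

Recording the returned label alongside each screenshot makes it clear at the six-month re-run which nil results were actually verified.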
4.2 Technical caveats and false positives
A nil result on a broker site is not the same as not being listed: Brokers segment by region and by query exactness. A search for "Jane Smith" in central London may return zero hits while "J Smith" in greater London returns a hit on you. Run multiple variants per identifier before concluding the broker has no record.
HIBP only includes breaches it has been given access to: Many breaches circulate privately and never reach HIBP. A clean HIBP record is reassuring but not a guarantee. Treat it as the floor of your exposure, not the ceiling.
Removed broker listings often reappear: Several US brokers have a known pattern of restoring removed records on their next data refresh, typically every three to twelve months. The six-month re-run cadence (5.3) is calibrated to catch this.
Cached and archived versions outlive the live page: Removing your name from a current source does not remove it from the Wayback Machine, archive.today, search-engine cache or third-party scrapers. Plan remediation on the assumption that anything that was once public is permanently public.
PimEyes and reverse-image services produce false positives at scale: Their face-matching is not biometric-grade. A face that returns a hit may be someone who looks similar, not you. Treat every hit as a candidate to verify, not a confirmed exposure.
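The first caveat above, that "Jane Smith" and "J Smith" can return different results, suggests generating name variants up front rather than improvising them per broker. The sketch below is an assumption about which variants are worth trying, not an exhaustive list; `name_variants` is a hypothetical helper and expects non-empty first and last names.

```python
def name_variants(first: str, last: str, middle: str = "") -> list[str]:
    """Return de-duplicated query variants for a personal name, in a stable order."""
    candidates = [
        f"{first} {last}",
        f"{first[0]} {last}",
        f"{last}, {first}",
    ]
    if middle:
        candidates += [
            f"{first} {middle} {last}",
            f"{first} {middle[0]} {last}",
            f"{first[0]} {middle[0]} {last}",
        ]
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(candidates))
```

Each variant should then be run against each locality you have lived in (for example both a borough and the wider city), and a broker marked as clean only when every combination returns nothing.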
4.3 Linking data to narrative
| Technical finding | Journalistic interpretation |
|---|---|
| Email address X appears in eight breaches between 2014 and 2024 | Account credentials tied to that address should be considered compromised; password rotation alone is insufficient if the address is reused for sensitive accounts |
| 192.com lists current home address tied to electoral roll | Open Register opt-out is required; existing broker copies are not retrospectively recallable |
| Strava heat-map shows daily route between two specific points | Home and work locations are inferable from public activity; profile must be set to private and historical activity scrubbed |
| Wayback Machine holds archived copy of a deleted personal site | The deletion is partial; the page persists in archive and in any third-party scrape that occurred during indexing |
| PimEyes returns three matches across non-personal sites | Face image is in commercial reverse-image indices; removing original sources will not retract the indexed copies |
4.4 AI assistance in the research browser
Practice and resources
5.1 Practice exercise
- Inventory your identifiers (Step 1) and run all seven steps end-to-end. Capture screenshots throughout.
- Produce the adversary summary: a single-page document titled "What an adversary finds in 20 minutes". List the five highest-impact exposures only, in priority order, with a one-line description of each.
- Produce the remediation list: a table of every recoverable exposure, with the source, the removal procedure URL, the estimated time to remove, the priority (high, medium, low), and a status column for tracking completion.
- Produce the six-month follow-up checklist (section 5.3 below) as your audit baseline document.
- Action the top three items on the remediation list this week. Schedule the next audit pass in your calendar for six months from today.
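The remediation list described above maps naturally onto a small table with one row per recoverable exposure. The sketch below is one possible layout, not a prescribed format: the field names, the `RemediationItem` class and the sort rule (priority first, then shortest estimated removal time) are assumptions made for illustration.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

# Lower number sorts first; labels match the high/medium/low scale in the exercise.
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class RemediationItem:
    source: str
    removal_url: str
    estimated_days: int
    priority: str
    status: str = "open"


def to_csv(items: list[RemediationItem]) -> str:
    """Render the list as CSV text, highest priority and quickest wins first."""
    ordered = sorted(
        items, key=lambda i: (PRIORITY_ORDER[i.priority], i.estimated_days)
    )
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=[f.name for f in fields(RemediationItem)],
        lineterminator="\n",
    )
    writer.writeheader()
    for item in ordered:
        writer.writerow(asdict(item))
    return buffer.getvalue()
```

Write the resulting text to a file inside the encrypted audit folder, and update the `status` column as each removal request is submitted and confirmed.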
5.2 Advanced resources
- Have I Been Pwned: Troy Hunt. The standard public reference for breach exposure. haveibeenpwned.com
- Optery broker directory: Optery, Inc. Maintained list of US data broker opt-out URLs organised by aggregator. optery.com/data-brokers
- Open Register opt-out (UK): UK Government. The single upstream opt-out that governs most UK people-search exposure. gov.uk/electoral-register/opt-out-of-the-open-register
- Surveillance Self-Defense: Electronic Frontier Foundation. Practitioner-grade guidance on threat modelling, account hygiene and account compartmentalisation. ssd.eff.org
- Online Harassment Field Manual: PEN America. Field-tested guidance on responding to doxxing and online harassment, including pre-emptive footprint reduction. onlineharassmentfieldmanual.pen.org
5.3 Six-month follow-up checklist
- Run the audit before publication, not after the legal threat letter arrives. The remediation procedures take weeks; the threat letter takes days.
- Distinguish recoverable from unrecoverable exposure. Spend remediation time only where there is a working procedure; for everything else, assume permanence and plan accordingly.
- Treat your self-OSINT dossier as you would treat investigative material. Encrypt at rest, hash for baseline, never paste into a hosted LLM.
- UK, US and EU broker landscapes do not overlap. A US-focused removal service does not address 192.com or Tracesmart, and vice versa. Run the audit against the right jurisdiction or pay an investigator who will.
- The audit decays. New breaches accumulate, new brokers appear, removed listings reappear. The six-month cadence is the operational discipline that converts a one-off pass into protection that holds.
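The six-month cadence is easy to let slip, so it helps to compute the next audit date when the current pass closes. Python's standard library has no month arithmetic, so the sketch below adds months by hand and clamps to the last day of the target month; `add_months` is a hypothetical helper.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` later, clamped to the end of a shorter month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def next_audit(last_audit: date) -> date:
    """Next scheduled pass under the six-month cadence."""
    return add_months(last_audit, 6)
```

An audit closed on 31 August 2025 gives `next_audit(date(2025, 8, 31)) == date(2026, 2, 28)`, because February has no 31st.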


