◆ Methods · Course Zero
Self-OSINT HIBP · 192.com · Google Takeout Foundational
Prerequisites
This tutorial assumes you have completed:

Introduction and context

1.1 The investigative need

The subject of every published investigation will, eventually, look up the byline. So will the subject's lawyers, the subject's PR firm, and anyone the subject pays to retaliate. What they find on the first page of a search for your name shapes the legal threat letter you receive, the harassment campaign that follows publication, and in some cases the level of physical risk attached to the work. You do not get to decide whether they look. You only get to decide what they find.

This tutorial inverts the standard OSINT workflow and applies it to the practitioner. You are the target. The exercise is a structured pass through the same tools, sources and reasoning patterns you use against subjects, run against your own name, handles, addresses, family, employers and assets. The output is three artefacts: an adversary summary, a remediation list, and a follow-up cadence. The adversary summary tells you what someone with twenty minutes and a search bar will find first. The remediation list tells you what to remove, what to suppress and what is unrecoverable. The cadence keeps the audit current as new exposures accumulate.

The exercise is foundational because the controls in CZ1 (browser isolation, VPN, persona hygiene) protect only the work you do from this point onward. They cannot retract data already in circulation. The dry run quantifies what is already out and produces a defensible plan to address it. Done once, it is a baseline. Done every six months, it is a discipline.

1.2 Learning outcomes

After completing this tutorial, you will be able to:
  • Run a structured self-OSINT pass across name, handles, email addresses, phone numbers, addresses and asset records, using the same tools an adversary would.
  • Identify which exposures sit on UK, US and EU sources and route remediation to the correct opt-out or removal procedure for each.
  • Distinguish recoverable exposure (live broker listings, current social profiles) from unrecoverable exposure (archived pages, permanent public records, leaked breach data) and triage accordingly.
  • Produce three documented artefacts: an adversary summary, a remediation list with priorities, and a six-month follow-up checklist.
  • Establish a six-month cadence to re-run the audit and capture new exposures as they accumulate.

1.3 Threat model

This tutorial defends against three specific threats:
  • Pre-publication legal threats and intimidation that rely on the subject identifying your home address, family or financial profile from public sources.
  • Targeted harassment, swatting and doxxing campaigns aimed at your physical address, family members or workplace.
  • Account-takeover attempts that exploit credentials exposed in past data breaches, especially against email, banking and social accounts used for source contact.
It partially defends against: state-actor surveillance with access to closed datasets (intelligence holdings, telecoms intercepts, paid commercial datasets) where the public-source picture is only one input.
It does not defend against: physical surveillance, supply-chain compromise of devices, insider leaks from employers or family members, or any threat that does not begin with a public-source lookup. If your threat model includes any of those, this technique is necessary but not sufficient.
Theory and framework

Foundational theory and ethical-legal framework

Digital footprint: The aggregate of personally-identifying information about an individual that is retrievable from public sources without authentication. Includes name and address records, social media profiles, breached credentials, leaked datasets, professional registrations, property records, court filings, news mentions, and content the individual has published themselves.

Self-OSINT (or self-doxxing dry run): The practice of running a structured open-source investigation against your own identity using the same tools and reasoning patterns an adversary would use, with the goal of producing a remediation plan. The term "self-doxxing" appears in some practitioner literature; the operational meaning is identical.

Data broker: A commercial entity that aggregates personal data from public records, marketing partners, electoral registers and breach disclosures, then resells access. Examples: 192.com (UK), Spokeo, Whitepages, BeenVerified (US), and dozens of EU-specific equivalents. Most operate an opt-out or suppression procedure mandated by GDPR or local equivalents.

Open Register (UK): The version of the UK electoral register available for sale to any organisation for any purpose. Approximately 40 percent of UK registered electors remain on it. Opting out is free at gov.uk/register-to-vote but is not retrospective: brokers who purchased prior editions retain that data legally.

Recoverable versus unrecoverable exposure: Recoverable exposure is data on a live source with a working removal procedure: a broker listing you can opt out of, an old social profile you can delete, a cached page that will refresh out. Unrecoverable exposure is data that is permanently in circulation: archived breach dumps, court records, news articles, archived versions of opted-out pages. The distinction governs whether you remediate, suppress or accept.
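The remediate, suppress or accept decision can be written down as a small rule. The sketch below is one reading of the definition above, not a fixed standard: the `Action` enum, the `triage` function and its two boolean inputs are hypothetical names introduced for illustration.

```python
from enum import Enum


class Action(Enum):
    REMEDIATE = "remediate"  # a working removal procedure exists
    SUPPRESS = "suppress"    # no removal, but visibility can be reduced
    ACCEPT = "accept"        # permanently in circulation; plan around it


def triage(has_removal_procedure: bool, suppression_possible: bool = False) -> Action:
    """Classify one exposure. Both inputs are judgments you make per source."""
    if has_removal_procedure:
        return Action.REMEDIATE
    if suppression_possible:
        return Action.SUPPRESS
    return Action.ACCEPT


# Illustrative calls: a live broker listing, a registry entry, an archived breach dump.
print(triage(has_removal_procedure=True).value)
print(triage(has_removal_procedure=False, suppression_possible=True).value)
print(triage(has_removal_procedure=False).value)
```

Keeping the two judgments as explicit inputs makes it easy to revisit a classification when a source turns out to be more responsive than expected.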

Ethical boundary · stop at the login
Self-OSINT looks the same as adversary-OSINT until the point where you are tempted to use credentials you would not otherwise have. The boundary is identical to investigations: stop at the login. If a paid breach-dump aggregator offers to show you what is in your old credentials behind a paywall, do not pay. If a "people search" service requires a credit card to reveal results, do not pay. The free ecosystem (HIBP, gov.uk, official broker opt-out pages) is sufficient for the audit. Paying broker subscriptions to surface your own data also feeds the broker economy that exposes you in the first place.
Legal considerations

The audit operates entirely on data about yourself, so consent and lawful basis questions do not apply to the lookups.

Three specific legal points apply to the remediation phase.

First, UK GDPR Article 17 (right to erasure) and equivalent provisions in EU GDPR and California CCPA give you a legal right to demand removal from most commercial data brokers, with statutory response timelines.

Second, the Open Register opt-out at gov.uk/register-to-vote is the only operational way to remove your name from the upstream feed that supplies most UK people-search sites; the Representation of the People Bill 2024-26, currently before Parliament, may switch the register from opt-out to opt-in, which would alter the procedure.

Third, suppression of director and officer information at Companies House is now possible under the Economic Crime and Corporate Transparency Act 2023 from January 2025, at a fee per document. Court records, news archives and most public registers do not fall under any erasure right.

Consult legal counsel before any remediation step that touches on your professional registrations, court records or published bylines.

2.3 Device compartmentalisation

The audit generates significant traffic to broker, breach-check and social-media services from your IP and your accounts. Run the audit from the investigative browser profile and VPN endpoint established in CZ1, not from your personal browser. This prevents the audit traffic from polluting your personal recommendation algorithms, search histories and login records, and also prevents the rare case where a broker site stores your IP and serves it back to subjects who pay to look you up.

Compartmentalisation · know the layer you are working at
This tutorial works at the same browser-container and VPN-endpoint layer as CZ1. It does not cover account-level compartmentalisation (separating personal, professional and investigative email addresses), device-level compartmentalisation (a dedicated audit laptop or VM), or jurisdictional compartmentalisation (where your audit is run from). For audits that touch sensitive professional registrations or family members, run the audit from the strictest compartmentalisation layer your work demands.
Applied methodology


3.1 Required tools and setup

Have I Been Pwned (HIBP): the standard public reference for whether an email address or phone number appears in a known data breach. Free, no account required to search. Subscribe to the notification service for ongoing alerts. HIBP currently lists more than 970 breached sites and processes new dumps continuously. Available at haveibeenpwned.com. Note that Google's Dark Web Report tool was discontinued in February 2026; HIBP is the recommended replacement.

Firefox Monitor (now branded Mozilla Monitor): uses the HIBP backend with a browser-integrated interface for users on Firefox or with a Mozilla account. Useful for monitoring multiple addresses simultaneously. Available at monitor.firefox.com.

Google Takeout: exports the data Google holds about you across Search history, location history, YouTube, Maps, Drive metadata and account activity. The export reveals what the dominant search engine knows about you, which is also a partial picture of what an adversary cross-referencing leaked Google data might know. Available at takeout.google.com.

192.com (UK): aggregates the UK Open Register, BT phone directories, Companies House filings and Land Registry property prices. The default first stop for any UK lookup. Free results require registration; full results are credit-based. Use the free tier only. Available at 192.com.

Tracesmart and PeopleTraceUK (UK): UK-specific people-search services that retain Open Register data going back to 2002 and 2008 respectively. Routinely missed by US-focused removal services. Each has a manual opt-out procedure on its site.

Spokeo, Whitepages, BeenVerified, Radaris (US): four of the most prominent US people-search aggregators. Each has a removal procedure, each takes between one and four weeks to action, and each must be re-checked periodically because removed records sometimes reappear. Optery maintains a current list of US broker opt-out pages organised by aggregator at optery.com.

EU broker landscape: highly fragmented by member state. Germany has Das Telefonbuch and 11880; France has 118712 and Les Pages Blanches; Netherlands has Bellingland and KvK (Chamber of Commerce). National data protection authorities maintain lists of registered brokers; check your DPA's broker registry. GDPR Article 17 gives you a right of erasure that applies to all of them.

Wayback Machine and archive.today: use to check what archived versions exist of pages you have controlled (old personal sites, old social profiles). Removed live pages often persist indefinitely in archives. The Wayback Machine has a removal request procedure; archive.today does not. Available at web.archive.org and archive.today.

Google, Bing, DuckDuckGo, Yandex: run the same name and handle searches across all four. Each indexes differently and surfaces different exposures. Yandex in particular often surfaces image and Russian-language sources that the others miss.

PimEyes (caveat): a reverse-image search service that has indexed billions of public face images. The free tier reveals where your face appears online. We list it for completeness because adversaries use it; we do not recommend using it casually because the lookup itself can be logged. If you use it, do so once, from your investigative browser, and read its privacy policy first. Available at pimeyes.com.

3.2 OPSEC and target awareness

OPSEC · target awareness
There is no adversary observing this audit; you are the subject. The relevant target awareness is that broker sites, social platforms and breach-check services log every lookup. If you run the audit signed in to your personal accounts, those lookups become part of your account history and may surface in future cross-referencing by the same platforms. Run the audit signed out of all personal accounts, on the investigative browser profile, on the VPN endpoint set up in CZ1. Do not enter your real credentials anywhere during the audit. Do not pay any broker. Do not click "claim this profile" links on people-search sites; doing so confirms the listing is accurate and may convert a partial record into a complete one.

OPSEC for self-OSINT also extends to who observes your remediation. Some opt-out procedures generate confirmation emails to the address being searched, which is fine. Some generate postal mail to the address being suppressed, which can alert household members or co-residents to what you are doing. Some require a photograph of government ID to action removal, which is itself an exposure. Read each broker's removal procedure before submitting it.

3.3 Practical execution

The audit runs in seven steps, working from broadest exposure to most targeted. Each step produces an entry on the adversary summary and a candidate entry on the remediation list. Total time, first pass, is approximately two hours.

STEP 01
Inventory your identifiers
Goal: enumerate every searchable handle, address and number associated with you
List your full legal name, current and previous addresses going back ten years, all current and former email addresses, all current and former phone numbers, all social media handles you have ever used, your byline name if it differs from your legal name, your spouse's and dependents' names, and your employer's name. This is your search seed. Save it locally in your investigative environment, not in cloud sync.
STEP 02
Run the breach check
Goal: enumerate which credentials of yours are in known breach corpora
Search every email address from Step 1 on Have I Been Pwned. Note each breach by name and date. For any account in a breach, the password used at the time of breach should be considered compromised forever. Subscribe each address to HIBP notifications for ongoing alerts.
STEP 03
Run the search-engine pass
Goal: capture what an adversary sees in the first ten minutes
Search your full name across Google, Bing, DuckDuckGo and Yandex. Then your byline, then each handle, then phone number, then current address. Capture screenshots of the first two result pages from each engine. The differences across engines are themselves diagnostic; an exposure that appears only on Yandex is no less real than one that appears across all four.
STEP 04
Run the people-search pass
Goal: enumerate live broker listings against your identifiers
For UK residents, search 192.com, Tracesmart and PeopleTraceUK using the free tiers only. For US residents, search Spokeo, Whitepages, BeenVerified and Radaris; use the Optery directory for additional brokers. For EU residents, check your national broker registry maintained by your data protection authority. Capture screenshots of every listing that returns a hit. Each hit is a remediation candidate.
STEP 05
Run the social-platform pass
Goal: enumerate active and dormant social profiles tied to your identifiers
Search each handle from Step 1 on Facebook, LinkedIn, Instagram, X, Bluesky, Mastodon, TikTok, Reddit, Strava, Pinterest, GitHub and any niche platform you have used. Note dormant accounts, especially those tied to old email addresses, since these often retain personal data the platform never deleted. Strava and similar fitness apps are routinely overlooked but expose home and work addresses through activity heat-maps.
STEP 06
Run the registry pass
Goal: enumerate exposure in public registries you may not have considered
Check Companies House (UK), SEC EDGAR (US) or your national equivalent for any directorships or officer roles in your name. Check Land Registry (UK), county assessor sites (US) or national land registries for property records. Check court records: BAILII (UK), CourtListener and PACER (US). Check professional registers (NUJ, IFJ, bar councils) you are publicly listed on. Each public registry exposure is typically permanent; remediation here is suppression where legally possible, not removal.
STEP 07
Pull what your platforms hold
Goal: see what the dominant platforms have on you that does not appear in public search
Run Google Takeout for every Google account you have. Run Facebook's "Download Your Information". Run X's data export. These are not adversary-visible by default but become so under subpoena, breach or platform policy change. They also reveal how much resolution your daily activity has accumulated, which is itself a calibration exercise.
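Steps 1 and 3 amount to crossing a list of identifiers with a list of engines. A minimal sketch of that worklist, assuming a hypothetical `Inventory` structure and placeholder identifiers (the class, field names and sample values are illustrative and do not come from any real tool):

```python
from dataclasses import dataclass, field
from itertools import product

# The four engines named in Step 3.
ENGINES = ["Google", "Bing", "DuckDuckGo", "Yandex"]


@dataclass
class Inventory:
    """Hypothetical container for the Step 1 search seed."""
    names: list[str] = field(default_factory=list)      # legal name, byline
    handles: list[str] = field(default_factory=list)
    emails: list[str] = field(default_factory=list)     # used in the Step 2 breach check
    phones: list[str] = field(default_factory=list)
    addresses: list[str] = field(default_factory=list)

    def search_identifiers(self) -> list[str]:
        # Order follows Step 3: names, handles, phone numbers, addresses.
        return self.names + self.handles + self.phones + self.addresses


def search_worklist(inv: Inventory) -> list[tuple[str, str]]:
    """Return one (engine, exact-phrase query) pair per identifier and engine."""
    return [(engine, f'"{ident}"')
            for ident, engine in product(inv.search_identifiers(), ENGINES)]


# Placeholder data only; keep the real inventory in local, unsynced storage.
inv = Inventory(names=["Jane Smith"], handles=["jsmith_reports"],
                phones=["+44 20 7946 0000"])
worklist = search_worklist(inv)
for engine, query in worklist:
    print(f"{engine:<11} {query}")
```

Ticking pairs off a generated list, instead of searching from memory, makes it harder to skip an engine for a less obvious identifier such as an old handle.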

| Tool or operator | Investigative query | Investigative value |
| --- | --- | --- |
| Have I Been Pwned | Email address; phone number | Lists known breach exposures by date and source; subscribes to ongoing notifications |
| Firefox Monitor | Multiple email addresses | Same backend as HIBP, multi-address monitoring |
| Google site search | site:linkedin.com "Your Name" and variants per platform | Surfaces profiles indexable but not findable through native platform search |
| Google image search | Drag profile photo to image-search box | Reverse-finds where your face has been used elsewhere |
| Yandex image search | Same input | Surfaces non-Western image sources others miss |
| 192.com (UK) | Full name plus town | Lists Open Register, BT directory, Companies House and Land Registry hits |
| Optery directory | Lookup of US broker landscape | Maintains current list of US opt-out URLs by broker |
| Wayback Machine | URL of a site or profile you have controlled | Lists archived versions of any site you have controlled |
| archive.today | Same input | Independent archive often holding what Wayback does not |
| Companies House (UK) | Name and date of birth | Lists current and former directorships |
| Strava heat-map | Account activity log | Reveals home and work locations through repeated routes |
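The site-search pattern shown for LinkedIn extends to the other platforms from Step 5. A minimal sketch, assuming an illustrative domain list (the `PLATFORM_DOMAINS` selection and the `site_queries` helper are hypothetical; extend the list with the niche platforms you have used):

```python
# Illustrative platform domains only, not an exhaustive list.
PLATFORM_DOMAINS = [
    "linkedin.com", "facebook.com", "instagram.com",
    "x.com", "reddit.com", "github.com",
]


def site_queries(identifier: str, domains: list[str] = PLATFORM_DOMAINS) -> list[str]:
    """Build one site-restricted, exact-phrase query per platform domain."""
    return [f'site:{domain} "{identifier}"' for domain in domains]


queries = site_queries("Your Name")
print("\n".join(queries))
```

Run the same function once per name variant and handle, then paste each query into the engine by hand from the investigative browser profile.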

3.4 Visual documentation standards

Capture audit-grade evidence at each step. The screenshots become the baseline for the six-month re-run.

  • Full-window screenshots, not crops, for every search result page. Include the URL bar, the search query, the timestamp.

  • Per-broker, capture the listing in full plus the URL of the broker's own opt-out or removal page.

  • For each social profile, capture the privacy-settings page as well as the public profile page. The two together establish what is exposed and what could be exposed if a setting is changed.

  • For HIBP results, capture the per-breach detail pages, not only the summary. Each breach detail records what data classes were exposed (passwords, addresses, dates of birth), which determines the remediation priority.

  • File-name every capture with the pattern YYYY-MM-DD_source_query.png, in a single audit folder per pass. Audits dated more than a week apart should never co-mingle.
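The naming pattern is easier to keep consistent when it is generated, not typed. A minimal sketch, assuming a hypothetical `capture_filename` helper; the slug rule (lowercase, runs of other characters collapsed to a hyphen) is a choice made here for illustration, and it drops non-ASCII letters, so adjust it if your identifiers need them:

```python
import re
from datetime import date
from typing import Optional


def capture_filename(source: str, query: str, day: Optional[date] = None) -> str:
    """Build a YYYY-MM-DD_source_query.png name that is safe on common filesystems."""
    def slug(text: str) -> str:
        # Lowercase, then replace every run of non-alphanumerics with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    day = day or date.today()
    return f"{day.isoformat()}_{slug(source)}_{slug(query)}.png"


name = capture_filename("192.com", "Jane Smith London", date(2026, 4, 14))
print(name)
```

Because the ISO date leads the name, a plain alphabetical sort of the audit folder is also a chronological sort.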

3.5 Data preservation and chain of custody

The audit folder is itself sensitive: it is a complete map of your exposure. Treat it as you would treat investigative material. Store it on the encrypted partition of the investigative laptop, not in cloud sync. Hash the folder at the close of each audit pass and record the hash, so that a future re-run can verify the previous baseline has not been altered. Generate the hash with a single command per platform.

Command reference · folder hash for audit baseline
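macOS / Linux
# Sort the per-file hashes so the combined hash does not depend on the order find returns files in (on Linux, sha256sum can replace shasum -a 256)
find ./audit-2026-04 -type f -exec shasum -a 256 {} \; | LC_ALL=C sort | shasum -a 256
Windows PowerShell
# -File skips directories; this prints one hash per file, not a single combined hash, so record the whole table
Get-ChildItem -Recurse -File .\audit-2026-04 | Sort-Object FullName | Get-FileHash -Algorithm SHA256 | Format-Table Hash, Path -AutoSize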
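Where the same baseline has to be checked on more than one operating system, a single script avoids differences between shells. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the combined hash it produces will not match the output of the shell commands above because the manifest format differs.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Hash one file in chunks so large exports do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def folder_manifest(folder: Path) -> dict[str, str]:
    """Map each file's relative POSIX path to its SHA-256, sorted by that path."""
    files = [p for p in folder.rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.relative_to(folder).as_posix())
    return {p.relative_to(folder).as_posix(): file_sha256(p) for p in files}


def folder_hash(folder: Path) -> str:
    """Combine the manifest into a single baseline hash for the audit pass."""
    digest = hashlib.sha256()
    for rel_path, file_hash in folder_manifest(folder).items():
        digest.update(f"{file_hash}  {rel_path}\n".encode("utf-8"))
    return digest.hexdigest()


if __name__ == "__main__":
    audit = Path("./audit-2026-04")  # folder name taken from the command reference
    if audit.is_dir():
        print(folder_hash(audit))
```

Keeping the per-file manifest alongside the single hash is useful on a re-run: if the combined hash no longer matches, the manifest shows which capture changed.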
Verification and analysis

Verification and analysis for reporting

4.1 Corroboration strategy

Verification in self-OSINT is about ensuring no exposure category has been missed and that the recoverability classification is correct.

  • Every identifier from Step 1 has been searched on at least four search engines and on every relevant people-search platform for your jurisdiction.

  • Every breach result is corroborated by checking the same email on a second source (Firefox Monitor, the breach's own notification email if held, or the source's own announcement).

  • Every people-search hit has been verified by visiting the listing directly, not relying on the search snippet, since some brokers serve different content to logged-in versus public visitors.

  • Every "no result" claim has been re-tested by searching a known-exposed identifier as a control. If your control returns nothing, the search itself is broken.

  • Every classification of "unrecoverable" has been tested by attempting one removal request, since some sources are more responsive than their reputation suggests.
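The first check in the list above is a coverage question: which identifier and source pairs have no recorded search. A minimal sketch, assuming you keep a simple log of completed searches as (identifier, source) tuples; the `coverage_gaps` helper and the sample data are hypothetical:

```python
from itertools import product


def coverage_gaps(identifiers, sources, completed):
    """Return every (identifier, source) pair that has no recorded search."""
    done = set(completed)
    return [pair for pair in product(identifiers, sources) if pair not in done]


# Placeholder log: each tuple is (identifier, source) for a search already captured.
identifiers = ["Jane Smith", "jsmith_reports"]
sources = ["Google", "Bing", "DuckDuckGo", "Yandex", "192.com"]
completed = [
    ("Jane Smith", "Google"), ("Jane Smith", "Bing"), ("Jane Smith", "DuckDuckGo"),
    ("Jane Smith", "Yandex"), ("Jane Smith", "192.com"),
    ("jsmith_reports", "Google"), ("jsmith_reports", "Bing"),
]
gaps = coverage_gaps(identifiers, sources, completed)
for identifier, source in gaps:
    print(f"missing: {identifier!r} on {source}")
```

An empty result only shows that every pair was searched; the control test and the direct-visit check in the list still have to be done by hand.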

4.2 Technical caveats and false positives

A nil result on a broker site is not the same as not being listed: Brokers segment by region and by query exactness. A search for "Jane Smith" in central London may return zero hits while "J Smith" in greater London returns a hit on you. Run multiple variants per identifier before concluding the broker has no record.

HIBP only includes breaches it has been given access to: Many breaches circulate privately and never reach HIBP. A clean HIBP record is reassuring but not a guarantee. Treat it as the floor of your exposure, not the ceiling.

Removed broker listings often reappear: Several US brokers have a known pattern of restoring removed records on their next data refresh, typically every three to twelve months. The six-month re-run cadence (5.3) is calibrated to catch this.

Cached and archived versions outlive the live page: Removing your name from a current source does not remove it from the Wayback Machine, archive.today, search-engine cache or third-party scrapers. Plan remediation on the assumption that anything that was once public is permanently public.

PimEyes and reverse-image services produce false positives at scale: Their face-matching is not biometric-grade. A face that returns a hit may be someone who looks similar, not you. Treat every hit as a candidate to verify, not a confirmed exposure.
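The first caveat above ("Jane Smith" versus "J Smith") suggests generating name variants up front so none is forgotten at a broker site. A minimal sketch, assuming a hypothetical `name_variants` helper; the set of variants is illustrative and should be extended with maiden names, transliterations and known misspellings:

```python
def name_variants(first: str, last: str, middle: str = "") -> list[str]:
    """Return common query variants of a name, without duplicates, in a stable order."""
    candidates = [
        f"{first} {last}",
        f"{first[0]} {last}",
        f"{last}, {first}",
    ]
    if middle:
        candidates += [f"{first} {middle} {last}", f"{first} {middle[0]} {last}"]
    # dict.fromkeys removes duplicates while keeping insertion order.
    return list(dict.fromkeys(candidates))


variants = name_variants("Jane", "Smith", "Alice")
print(variants)
```

Pair each variant with more than one locality (for example a borough and the wider city) before concluding that a broker holds no record.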

4.3 Linking data to narrative

| Technical finding | Journalistic interpretation |
| --- | --- |
| Email address X appears in eight breaches between 2014 and 2024 | Account credentials tied to that address should be considered compromised; password rotation alone is insufficient if the address is reused for sensitive accounts |
| 192.com lists current home address tied to electoral roll | Open Register opt-out is required; existing broker copies are not retrospectively recallable |
| Strava heat-map shows daily route between two specific points | Home and work locations are inferable from public activity; profile must be set to private and historical activity scrubbed |
| Wayback Machine holds archived copy of a deleted personal site | The deletion is partial; the page persists in archive and in any third-party scrape that occurred during indexing |
| PimEyes returns three matches across non-personal sites | Face image is in commercial reverse-image indices; removing original sources will not retract the indexed copies |

4.4 AI assistance in the research browser

AI assistance · self-data uploaded to a hosted model
The technique-specific AI threat in self-OSINT is that the audit produces a single, complete dossier of your exposure. Pasting that dossier into a hosted LLM to "summarise my biggest risks" hands a verified personal-data corpus to a third-party processor and adds another copy of the dossier to a vendor's training, retention or telemetry pipeline. The convenience is high; the exposure is also high. Self-data is the most sensitive data you handle.
Privacy warning
If you must use an LLM to assist with audit summarisation, use a local model (Ollama, LM Studio, llama.cpp) running on your investigative machine, or a hosted provider with a verified zero-retention configuration that you have read and confirmed. Do not paste the dossier into a free consumer chatbot. Do not paste into any model whose privacy policy you have not read.
Verification warning
An LLM cannot verify whether a broker listing is recoverable, whether a breach is current, or whether a removal procedure exists. It can only summarise what you give it. Treat any AI-generated remediation suggestion as a hypothesis to test against the source itself.
Practice and resources


5.1 Practice exercise

Run the seven-step audit against yourself today and produce three artefacts.
  1. Inventory your identifiers (Step 1) and run all seven steps end-to-end. Capture screenshots throughout.
  2. Produce the adversary summary: a single-page document titled "What an adversary finds in 20 minutes". List the five highest-impact exposures only, in priority order, with a one-line description of each.
  3. Produce the remediation list: a table of every recoverable exposure, with the source, the removal procedure URL, the estimated time to remove, the priority (high, medium, low), and a status column for tracking completion.
  4. Produce the six-month follow-up checklist (section 5.3 below) as your audit baseline document.
  5. Action the top three items on the remediation list this week. Schedule the next audit pass in your calendar for six months from today.
Estimated time: 120 minutes for the audit, plus remediation time per item
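The remediation list from item 3 can be kept as a small local CSV file. A minimal sketch, assuming a hypothetical `RemediationItem` record whose columns mirror those named in the exercise; the rows and the example.org URLs are placeholders, not real removal pages:

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


@dataclass
class RemediationItem:
    source: str
    removal_url: str      # the broker's or platform's own removal page
    est_days: int         # estimated time to remove
    priority: str         # "high", "medium" or "low"
    status: str = "open"  # tracking column


def to_csv(items: list[RemediationItem]) -> str:
    """Sort by priority, then by shortest removal time, and render as CSV."""
    ordered = sorted(items, key=lambda i: (PRIORITY_ORDER[i.priority], i.est_days))
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(RemediationItem)])
    writer.writeheader()
    writer.writerows(asdict(item) for item in ordered)
    return buffer.getvalue()


# Placeholder rows; the URLs are deliberately fake.
items = [
    RemediationItem("Old forum profile", "https://example.org/forum-delete", 7, "low"),
    RemediationItem("People-search listing", "https://example.org/broker-opt-out", 28, "high"),
    RemediationItem("Fitness app history", "https://example.org/privacy-settings", 1, "high"),
]
table = to_csv(items)
print(table)
```

Sorting by priority and then by removal time puts the quick, high-impact items at the top, which matches the instruction to action the top three this week. Write the file to the encrypted audit folder, not to cloud sync.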

5.2 Advanced resources

  • Have I Been Pwned: Troy Hunt. The standard public reference for breach exposure. haveibeenpwned.com
  • Optery broker directory: Optery, Inc. Maintained list of US data broker opt-out URLs organised by aggregator. optery.com/data-brokers
  • Open Register opt-out (UK): UK Government. The single upstream opt-out that governs most UK people-search exposure. gov.uk/electoral-register/opt-out-of-the-open-register
  • Surveillance Self-Defense: Electronic Frontier Foundation. Practitioner-grade guidance on threat modelling, account hygiene and account compartmentalisation. ssd.eff.org
  • Online Harassment Field Manual: PEN America. Field-tested guidance on responding to doxxing and online harassment, including pre-emptive footprint reduction. onlineharassmentfieldmanual.pen.org

5.3 Six-month follow-up checklist

Six-month follow-up checklist · re-run twice yearly
Re-search every identifier on Have I Been Pwned. Note any new breaches.
Re-search your name and current address on the people-search platforms relevant to your jurisdiction. Confirm previously-removed listings have not reappeared.
Re-run the search-engine pass on Google, Bing, DuckDuckGo and Yandex. Note any new mentions or surfaced exposures.
Pull a fresh Google Takeout and platform export. Compare against the previous export to spot accumulated activity.
Verify that the previous audit folder still matches its recorded hash, then hash the new audit folder, record the new baseline, and schedule the next pass in the calendar before closing the session.
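Two comparisons in the checklist above are set operations: which removed listings have come back, and which hits are new since the last pass. A minimal sketch, assuming each listing is tracked under a "source:identifier" key; that key format and both helper names are hypothetical:

```python
def reappeared(previously_removed: set[str], current_hits: set[str]) -> list[str]:
    """Listings marked removed in the last pass that return a hit again."""
    return sorted(previously_removed & current_hits)


def new_exposures(previous_hits: set[str], current_hits: set[str]) -> list[str]:
    """Listings seen in this pass that were absent from the previous one."""
    return sorted(current_hits - previous_hits)


# Placeholder keys; real keys would come from the previous remediation list.
previous_hits = {"spokeo:jane-smith", "192.com:jane-smith", "radaris:jane-smith"}
previously_removed = {"spokeo:jane-smith", "radaris:jane-smith"}
current_hits = {"192.com:jane-smith", "radaris:jane-smith", "whitepages:jane-smith"}

print("reappeared:", reappeared(previously_removed, current_hits))
print("new:", new_exposures(previous_hits, current_hits))
```

Reappeared listings go back onto the remediation list at their original priority; new exposures are triaged from scratch.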
Key takeaways
  1. Run the audit before publication, not after the legal threat letter arrives. The remediation procedures take weeks; the threat letter takes days.
  2. Distinguish recoverable from unrecoverable exposure. Spend remediation time only where there is a working procedure; for everything else, assume permanence and plan accordingly.
  3. Treat your self-OSINT dossier as you would treat investigative material. Encrypt at rest, hash for baseline, and never paste it into a consumer chatbot or any hosted LLM without a verified zero-retention configuration.
  4. UK, US and EU broker landscapes do not overlap. A US-focused removal service does not address 192.com or Tracesmart, and vice versa. Run the audit against the right jurisdiction or pay an investigator who will.
  5. The audit decays. New breaches accumulate, new brokers appear, removed listings reappear. The six-month cadence is the operational discipline that converts a one-off pass into protection that holds.
Next in this series
CZ3 Legal and ethical boundaries of OSINT: GDPR, data minimisation and defensible collection
The third and final Course Zero tutorial. Article 6 lawful basis logic, the journalism exemption and its limits, data minimisation in practice, and the line between public-data collection and doxxing.
Evidentiary Standard
This tutorial was produced using the Signal & Shadow methodology framework. All techniques described apply only to publicly accessible data. No method described here involves or endorses unauthorised access to systems or data. Verify all findings independently before publication.
About Signal & Shadow
Signal & Shadow is an independent forensic investigation and methodology practice. Methods is the structured tutorial series, published for investigators and journalists working in OSINT and digital forensics.
