Underground Forum-to-Pastebin OSINT Pivot: Alias Correlation and Leaked Document Discovery

osint_soc_enrichment Difficulty 1–5 30 min certifiable

Theory

Why This Matters

Investigative journalists, corporate fraud analysts, and threat intelligence teams regularly confront a core problem: a known forum alias must be connected to a real identity or an active credential set. In 2021, researchers at threat intelligence firm Group-IB traced a ransomware affiliate network by pivoting from a darknet forum handle to a Pastebin dump containing the same username alongside an email address and a partial password hash. The correlation took fewer than four hours of structured OSINT work and ultimately led to the de-anonymisation of three operators. Law enforcement agencies have used the same methodology — forum post breadcrumb to Pastebin credential dump — in prosecutions of cybercriminals across multiple jurisdictions. Understanding this pivot chain is essential for any intelligence analyst tasked with attributing malicious online activity to a real-world actor.

Core Concept

Forum posts create permanent, indexed breadcrumbs: a username, writing style, post timestamp, email used during registration (sometimes visible in raw HTML before forum hardening), and the content of posts themselves. When a subject reuses a forum alias across platforms — a common operational security failure — each forum post effectively becomes a node in an attribution graph.

Pastebin and similar paste services (Ghostbin, Hastebin, PrivateBin) are frequently abused by threat actors to distribute credential dumps. These dumps, often formatted as username:password or email:hash pairs, can be picked up by search engines soon after posting unless the paste is set to unlisted or private. The psbdmp.ws archive copies public Pastebin pastes as they appear, so a previously posted credential file can often be recovered after the original paste is removed.
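
The pair formats described above can be triaged line by line. The following is a minimal sketch, assuming one identifier:secret pair per line; the helper names (parse_dump_line, classify_secret) are hypothetical, and the length-based hash guess is a rough heuristic rather than a reliable hash identifier. The secret itself is deliberately not stored in the result.

```python
import re

# Hypothetical helpers for triaging paste lines; heuristics only.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
HEX_HASH_LENGTHS = {32: "md5-length hex", 40: "sha1-length hex", 64: "sha256-length hex"}


def classify_secret(secret: str) -> str:
    """Guess whether the right-hand side of a pair looks like a hash."""
    if secret.startswith(("$2a$", "$2b$", "$2y$")):
        return "bcrypt"
    if re.fullmatch(r"[0-9a-fA-F]+", secret) and len(secret) in HEX_HASH_LENGTHS:
        return HEX_HASH_LENGTHS[len(secret)]
    return "plaintext-or-unknown"


def parse_dump_line(line: str):
    """Split one 'identifier:secret' line; return None if it does not fit."""
    line = line.strip()
    if ":" not in line:
        return None
    identifier, secret = line.split(":", 1)  # split once: secrets may contain ':'
    if not identifier or not secret:
        return None
    id_type = "email" if EMAIL_RE.match(identifier) else "username"
    return {
        "identifier": identifier,
        "id_type": id_type,
        "secret_type": classify_secret(secret),
    }


# Made-up sample lines for illustration
for raw in ["shadow_wolf_99:hunter2", "not a credential line"]:
    print(parse_dump_line(raw))
```

A hex string of hash length is only a hint: the same 32 characters could be MD5, NTLM, or a random token, so treat the label as a starting point for manual review.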

Cross-platform username correlation exploits the psychological tendency to reuse the same handle everywhere. A subject who registers as shadow_wolf_99 on a hacking forum, uses the same alias on Reddit, and posts a Python script to Pastebin under the same name has created a trivially traversable identity graph. Sherlock automates this search across 300+ platforms, reducing hours of manual checking to seconds.

HaveIBeenPwned (HIBP) provides an API (the breached-account endpoint requires an API key) that checks a given email address against its database of several billion compromised records. Combined with a Pastebin search that surfaces the email, the HIBP API response reveals which specific breaches the address appeared in and which data classes each breach exposed, informing the analyst which password formats or hash types to expect.
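
To illustrate how a breach response can be read, the sketch below summarises an already-parsed, non-truncated response. It assumes each breach object carries the Name, BreachDate, and DataClasses fields; the function name summarise_breaches and the sample records are invented for illustration and do not describe real breaches.

```python
from collections import Counter


def summarise_breaches(breaches):
    """Return (breach names ordered by breach date, Counter of data classes)."""
    ordered = sorted(breaches, key=lambda b: b.get("BreachDate", ""))
    names = [b["Name"] for b in ordered]
    classes = Counter(dc for b in breaches for dc in b.get("DataClasses", []))
    return names, classes


# Made-up sample shaped like a parsed breachedaccount response
sample = [
    {"Name": "ExampleForum", "BreachDate": "2019-03-01",
     "DataClasses": ["Email addresses", "Passwords", "Usernames"]},
    {"Name": "ExampleShop", "BreachDate": "2016-07-15",
     "DataClasses": ["Email addresses", "Passwords"]},
]

names, classes = summarise_breaches(sample)
print(names)
print(dict(classes))
```

Ordering by breach date gives a rough timeline of exposure, and the data-class counts show at a glance whether password material is likely to be in circulation for the address.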

Google dorks are the primary free-tier discovery method: site:pastebin.com "shadow_wolf_99" restricts results to the pastebin.com domain and returns any indexed paste containing the target username. Adding "password" or "email" to the dork narrows results to credential-relevant pastes.
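
The dork patterns above are simple string templates, so a list of queries can be generated for each alias. This is a small sketch; build_dorks and its parameters are hypothetical names, and it only builds the query strings without submitting them to any search engine.

```python
def build_dorks(alias, sites=("pastebin.com",), keywords=("password", "email")):
    """Return dork strings: one bare alias query per site, plus one per keyword."""
    quoted = f'"{alias}"'  # exact-phrase match on the alias
    dorks = []
    for site in sites:
        base = f"site:{site} {quoted}"
        dorks.append(base)
        dorks.extend(f'{base} "{kw}"' for kw in keywords)
    return dorks


for dork in build_dorks("shadow_wolf_99"):
    print(dork)
```

Keeping the generated strings in a list also makes it easy to record the exact query text alongside each result.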

Technical Deep-Dive

# Step 1: Google dork — find pastes mentioning the forum alias
# Query: site:pastebin.com "shadow_wolf_99"
# Query: site:pastebin.com "shadow_wolf_99" "password"
# Run via browser or CLI with googler:
googler --count 20 'site:pastebin.com "shadow_wolf_99"'

# Step 2: psbdmp.ws search — archived Pastebin content (pastes deleted from Pastebin persist here)
curl -s "https://psbdmp.ws/api/search/shadow_wolf_99" | python3 -m json.tool
# Response contains paste IDs, timestamps, and snippet previews

# Step 3: Retrieve a specific paste by ID
curl -s "https://pastebin.com/raw/PASTE_ID_HERE"

# Step 4: HaveIBeenPwned API — check discovered email against known breaches
EMAIL="user@example.com"  # placeholder: substitute the address recovered from the paste
curl -s -H "hibp-api-key: YOUR_API_KEY" \
  "https://haveibeenpwned.com/api/v3/breachedaccount/${EMAIL}?truncateResponse=false" \
  | python3 -m json.tool
# Returns: breach names, dates, data classes (passwords, emails, usernames)

# Step 5: Sherlock — enumerate the forum alias across 300+ platforms
python3 sherlock/sherlock.py shadow_wolf_99 --output results.txt --timeout 10
# Sample output:
# [+] Twitter: https://twitter.com/shadow_wolf_99
# [+] GitHub: https://github.com/shadow_wolf_99
# [+] Reddit: https://www.reddit.com/user/shadow_wolf_99
# [-] Instagram: Not Found

# Step 6: Pastebin API (requires API key — free tier available)
# api_option=list returns the pastes created by the account tied to USER_KEY; it is not a keyword search
curl -s "https://pastebin.com/api/api_post.php" \
  -d "api_dev_key=YOUR_KEY&api_option=list&api_user_key=USER_KEY&api_results_limit=100"

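# Automated HIBP batch check for a list of candidate emails (Python)
import time
from urllib.parse import quote

import requests

API_KEY = "YOUR_HIBP_API_KEY"
HEADERS = {"hibp-api-key": API_KEY, "User-Agent": "OSINT-Research/1.0"}
PARAMS = {"truncateResponse": "false"}

# Placeholder addresses: substitute the candidates extracted from recovered pastes
emails = [
    "user1@example.com",
    "user2@example.com",
]

for email in emails:
    # URL-encode the address so characters such as "+" survive in the path
    url = f"https://haveibeenpwned.com/api/v3/breachedaccount/{quote(email)}"
    r = requests.get(url, headers=HEADERS, params=PARAMS, timeout=30)
    if r.status_code == 429:
        # Throttled: wait for the interval the server asks for, then retry once
        time.sleep(int(r.headers.get("Retry-After", "2")))
        r = requests.get(url, headers=HEADERS, params=PARAMS, timeout=30)
    if r.status_code == 200:
        breaches = [b["Name"] for b in r.json()]
        print(f"[FOUND] {email}: {breaches}")
    elif r.status_code == 404:
        print(f"[CLEAN] {email}: no breaches found")
    else:
        print(f"[ERROR] {email}: HTTP {r.status_code}")
    time.sleep(1.6)  # pause between requests; adjust to the rate limit of your API key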

Intelligence Collection Methodology

Extract the forum alias from the target post, profile page, or prior intelligence. Note any associated metadata: post dates, writing patterns, avatar images, email fragments visible in HTML source, user ID numbers in profile URLs.
Run Google dorks targeting paste sites: site:pastebin.com "alias", site:pastebin.com "alias" "password", site:ghostbin.co "alias". Save all matching URLs.
Query psbdmp.ws using the API endpoint (/api/search/alias) to recover deleted pastes. Download and archive all returned paste content locally before conducting further analysis.
Extract candidate email addresses from any recovered pastes. Expect full addresses as well as partial obfuscations in which part of the local part is masked with asterisks (for example, a local part beginning sh***).
Run Sherlock (sherlock alias --timeout 10) to enumerate the alias across social, code, and professional platforms. Flag any confirmed profiles for further investigation.
Submit discovered emails to HaveIBeenPwned API. For each breach returned, note the data classes (passwords, hashes, plaintext) to understand what credential material may be circulating.
Cross-reference results in Maltego: create an entity for the alias, add discovered email nodes, add platform profile nodes, and draw links to any confirmed Pastebin dumps. Use the Sherlock transform if installed.
Document chain of custody: record the search query, timestamp, and exact URL or API response for each discovered piece of intelligence. OSINT findings may be used in legal proceedings and require a reproducible audit trail.
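
The chain-of-custody step can be supported by a small logging helper. The sketch below is one possible shape, not a forensic standard: the function names, field names, and the JSON Lines log file are all assumptions. It records the query, source URL, UTC access time, and a SHA-256 digest of the retrieved content so that a locally archived copy can later be shown to be unchanged.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_audit_record(query, source_url, content: bytes, accessed_at=None):
    """Build one audit entry for a retrieved artefact (hypothetical schema)."""
    accessed_at = accessed_at or datetime.now(timezone.utc)
    return {
        "query": query,
        "source_url": source_url,
        "accessed_at_utc": accessed_at.isoformat(),
        "sha256": hashlib.sha256(content).hexdigest(),
        "size_bytes": len(content),
    }


def append_record(path, record):
    """Append the record as one JSON line so the log stays append-only."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")


record = make_audit_record(
    'site:pastebin.com "shadow_wolf_99"',
    "https://pastebin.com/raw/PASTE_ID_HERE",
    b"example paste body",  # in practice: the raw bytes exactly as downloaded
)
print(json.dumps(record, indent=2))
```

Hashing the raw bytes as downloaded, before any decoding or cleanup, is the design choice that matters here: any later normalisation would change the digest.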

Common Intelligence Collection Errors

Treating a Pastebin hit as confirmed attribution without corroboration: A paste containing a username does not prove the named individual posted it. Credential dumps often aggregate data from multiple sources and may include fabricated or recycled entries. Always seek at least two independent corroborating data points before treating a Pastebin find as confirmed attribution.
Ignoring psbdmp.ws when the Pastebin link is dead: Many analysts stop when a Pastebin URL returns 404. psbdmp.ws archives a significant fraction of public pastes before deletion. Always query the archive before concluding that no paste content is recoverable.
Searching only the primary alias without variant enumeration: Subjects frequently register slight variants of their primary alias (underscores vs. dots, trailing numbers, prefixes). If shadow_wolf_99 yields limited results, try shadowwolf99, shadow_wolf, s_wolf_99, and common substitutions.
Violating HaveIBeenPwned rate limits and receiving 429 errors: The HIBP API enforces a per-key rate limit that depends on the subscription tier (older keys were limited to roughly one request per 1.5 seconds). Automated scripts that ignore the Retry-After header on a 429 response will keep being throttled and may have the API key suspended. Always implement sleep intervals and handle 429 responses gracefully.
Failing to document the exact query and timestamp: OSINT findings are ephemeral — pastes are deleted, profiles are deactivated, and search engine caches expire. Failing to record the exact dork string, the URL, and the datetime of access makes it impossible to reproduce or verify findings later, which is a critical failure in any professional intelligence or investigative context.
Confusing username correlation with identity confirmation: Sherlock confirming that shadow_wolf_99 exists on GitHub, Twitter, and Reddit proves account existence, not that all three accounts belong to the same person. Confirmation requires corroborating evidence such as identical writing patterns, shared images, cross-posts, or linked email addresses.
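
The variant enumeration described in the list above can be partly automated. This is a minimal sketch with a hypothetical helper, alias_variants, that covers only separator swaps and dropping a trailing numeric segment; abbreviations such as s_wolf_99 and character substitutions are not generated and still need manual additions.

```python
import re


def alias_variants(alias):
    """Return sorted simple variants of an alias, excluding the alias itself."""
    # Split on the separators commonly swapped between platforms
    parts = [p for p in re.split(r"[_.\-]+", alias) if p]
    separators = ("", "_", ".", "-")
    variants = {sep.join(parts) for sep in separators}
    # Also try the alias without a trailing all-digit segment (e.g. "99")
    if len(parts) > 1 and parts[-1].isdigit():
        variants.update(sep.join(parts[:-1]) for sep in separators)
    variants.discard(alias)
    return sorted(variants)


for variant in alias_variants("shadow_wolf_99"):
    print(variant)
```

Each generated variant is only a search candidate. A hit on a variant needs the same corroboration as a hit on the primary alias before it is linked to the subject.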

NICE Framework Alignment

Code	Knowledge/Skill/Task Statement	How This Card Develops It
K0058	Knowledge of network protocols	Understanding how paste sites use HTTP APIs and how search engine indexing of public HTTP resources enables OSINT discovery
K0145	Knowledge of security assessment approaches	Applying structured pivot methodology — forum alias to paste search to breach database — as a repeatable OSINT assessment workflow
K0272	Knowledge of network security architecture	Recognising how public indexing of paste services creates unintended intelligence exposure even when subjects believe content is ephemeral
K0427	Knowledge of encryption algorithms	Interpreting credential dump formats including plaintext, MD5, bcrypt, and SHA-1 hashes recovered from Pastebin to assess risk severity
S0040	Skill in identifying and extracting data of interest	Extracting usernames, email addresses, and credential pairs from unstructured Pastebin paste content using pattern matching and manual review
T0569	Apply and utilize authorized cyber capabilities to achieve objectives	Executing the forum-to-Pastebin pivot chain using Sherlock, HIBP API, psbdmp.ws, and Google dorks within an authorised intelligence collection mandate