Google Dorks

Advanced search operators for security research, vulnerability assessment, and open-source intelligence gathering.

Google dorking, also known as Google hacking, uses advanced search operators to uncover information that's publicly indexed but not easily discoverable through standard searches. Security professionals leverage these techniques for reconnaissance, vulnerability assessment, and open-source intelligence gathering. This reference guide compiles the most practical and effective dorks used in real-world security research, penetration testing, and bug bounty programs.

Basic Search Operators

These foundational operators form the building blocks of all Google dorks. Understanding how to combine them effectively is essential for advanced reconnaissance.

site:

Syntax: site:domain.com

Example: site:example.com filetype:pdf confidential

Use: Restricts search results to a specific domain or subdomain. This is the most fundamental operator for targeted reconnaissance, allowing you to focus searches on specific organizations. Use wildcards for subdomain enumeration: site:*.example.com surfaces the subdomains Google has indexed, which is rarely the complete set.

filetype: (or ext:)

Syntax: filetype:extension or ext:extension

Example: filetype:xlsx site:gov budget 2024

Use: Searches for specific file types that often contain sensitive information. Documents, spreadsheets, and configuration files frequently expose data not intended for public access. Both operators work identically, though ext: is slightly shorter.

intitle:

Syntax: intitle:"search term" or allintitle:word1 word2

Example: intitle:"Index of /admin"

Use: Searches for keywords in page titles. Particularly effective for finding admin panels, login pages, and directory listings. Use allintitle: when all words must appear in the title (no need for multiple intitle: operators).

inurl:

Syntax: inurl:keyword or allinurl:word1 word2

Example: inurl:wp-config.php

Use: Searches for keywords in the URL path. Essential for discovering specific files, vulnerable scripts, or standard paths used by web applications. URL structure often reveals technology stack and potential entry points.

intext:

Syntax: intext:"search term" or allintext:word1 word2

Example: intext:"DB_PASSWORD" filetype:env

Use: Searches for keywords in page body content. Useful for finding specific configuration strings, error messages, or exposed credentials in the text Google has indexed for a page. Strings that appear only in HTML markup or scripts are generally not matched.

cache:

Syntax: cache:https://example.com/page

Example: cache:https://example.com/deleted-page.html

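Use: Retrieves Google's cached version of a page, which can show content that has been removed, modified, or is temporarily unavailable. Google retired its public cache in 2024, so this operator no longer returns results reliably; the Internet Archive's Wayback Machine (web.archive.org) is the usual substitute for historical snapshots of a target.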

related:

Syntax: related:domain.com

Example: related:example.com

Use: Finds websites similar to the target domain. Useful for discovering competitors, partner organizations, or related infrastructure that might share similar vulnerabilities or configurations.

Logical Operators

  • OR (or |) - Matches either term: site:target.com (filetype:doc OR filetype:pdf)
  • AND - Matches both terms (implicit): confidential password
  • - (minus) - Excludes terms: site:example.com -site:www.example.com
  • " " (quotes) - Exact phrase match: "database error"
  • * (asterisk) - Wildcard for unknown terms: "password is *"
  • ( ) - Groups operators: site:example.com (inurl:admin OR inurl:login)
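
The operators above compose mechanically, so a query can be assembled from its parts. The following is a minimal Python sketch, not a standard tool: the helper names (or_group, build_dork, search_url) and their parameters are hypothetical and chosen for illustration.

```python
from urllib.parse import quote_plus


def or_group(operator, values):
    """Join values under one operator into a parenthesised OR group."""
    return "(" + " OR ".join(f"{operator}:{v}" for v in values) + ")"


def build_dork(site=None, any_of=None, phrases=(), exclude_sites=()):
    """Compose a query string from the operators described above.

    any_of is an optional (operator, values) pair,
    e.g. ("inurl", ["admin", "login"]).
    """
    parts = []
    if site:
        parts.append(f"site:{site}")
    if any_of:
        operator, values = any_of
        parts.append(or_group(operator, values))
    parts.extend(f'"{p}"' for p in phrases)            # exact-phrase matches
    parts.extend(f"-site:{s}" for s in exclude_sites)  # exclusions
    return " ".join(parts)  # a space between terms is the implicit AND


def search_url(query):
    """URL-encode the query so it can be pasted into a browser."""
    return "https://www.google.com/search?q=" + quote_plus(query)


query = build_dork(
    site="example.com",
    any_of=("inurl", ["admin", "login"]),
    exclude_sites=["www.example.com"],
)
print(query)  # site:example.com (inurl:admin OR inurl:login) -site:www.example.com
print(search_url(query))
```

Keeping the OR group in parentheses prevents the alternation from swallowing the neighbouring site: and exclusion terms.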

File Discovery and Document Intelligence

Document discovery reveals organizational structure, internal processes, and often unintentionally exposed sensitive information. PDFs, spreadsheets, and office documents frequently contain metadata, internal references, and confidential content.

Sensitive Documents

site:target.com filetype:pdf (confidential OR internal OR proprietary)

Discovers documents marked as confidential that were accidentally published. Organizations often upload sensitive reports, financial statements, or internal communications without proper access controls.

Office Documents Intelligence

site:target.com (filetype:xlsx | filetype:docx | filetype:pptx)

Microsoft Office documents often contain rich metadata including author names, revision history, and internal network paths. Spreadsheets may expose salary data, customer lists, or financial projections.

Configuration Files

filetype:env intext:"DB_PASSWORD"

Configuration files are goldmines for credentials and system information. Environment files (.env) used by modern frameworks contain database passwords, API keys, and service credentials.

Backup Files

intitle:"Index of" (backup | old | ~)

Backup files often have fewer access restrictions than production systems. Common extensions include .bak, .old, .backup, .sql, .zip, and .tar.gz.
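
When auditing your own servers, the same extension list can be applied to a directory listing or file inventory. This is a small Python sketch under assumptions: find_backup_files and the sample listing are hypothetical, and the trailing "~" check mirrors the editor-backup pattern in the dork above.

```python
# Extensions taken from the list above; matching is case-insensitive.
BACKUP_SUFFIXES = (".bak", ".old", ".backup", ".sql", ".zip", ".tar.gz")


def find_backup_files(filenames):
    """Return the names that end with a backup-style extension or a trailing ~."""
    return [
        name for name in filenames
        if name.lower().endswith(BACKUP_SUFFIXES) or name.endswith("~")
    ]


# Hypothetical file inventory for illustration.
listing = ["index.html", "site.tar.gz", "db_dump.SQL", "config.php~", "logo.png"]
print(find_backup_files(listing))  # ['site.tar.gz', 'db_dump.SQL', 'config.php~']
```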

Resume and Personnel Mining

site:linkedin.com "software engineer" "target company" "@target.com"

Supports OSINT for social engineering assessments and for mapping organizational structure. Profiles and resumes reveal technologies used, project details, and employee relationships.

Vulnerable Systems and Exposed Services

These dorks identify misconfigured systems, exposed services, and security weaknesses that attackers actively exploit.

Directory Indexing

intitle:"Index of /"

Directory listings occur when web servers lack proper index files and have directory browsing enabled. This exposes the entire directory structure, allowing direct access to files, source code, backups, and configuration files.
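
The title match that the dork relies on can also be checked locally against a page you have already fetched from your own server. A minimal sketch using Python's standard html.parser follows; the class and function names are hypothetical, and the "Index of /" prefix is the convention assumed by the dork above rather than a guarantee for every web server.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def looks_like_directory_listing(html):
    """Heuristic: auto-generated listings are usually titled 'Index of /...'."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.title.strip().lower().startswith("index of /")


# Hypothetical page source for illustration.
sample = "<html><head><title>Index of /backup</title></head><body></body></html>"
print(looks_like_directory_listing(sample))  # True
```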

Misconfigured Web Servers

intitle:"Apache2 Ubuntu Default Page"

Default installation pages indicate freshly deployed, likely unconfigured servers. Apache Status pages (server-status) expose real-time connection information and virtual hosts.

Exposed API Keys and Tokens

site:github.com "AWS_SECRET_ACCESS_KEY"

Developers accidentally commit sensitive credentials to public repositories. AWS keys grant cloud infrastructure access, Firebase keys expose database backends, and OAuth tokens allow account takeover.
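
Before pushing code, a repository can be screened for the same kind of leak. The sketch below is a heuristic, not a complete secret scanner: it looks only for AWS access key IDs, which for long-term credentials start with "AKIA" followed by 16 uppercase letters or digits. The function name is hypothetical, and the sample key is the placeholder used in AWS documentation.

```python
import re

# Long-term AWS access key IDs: "AKIA" plus 16 uppercase letters or digits.
ACCESS_KEY_ID = re.compile(r"\b(AKIA[0-9A-Z]{16})\b")


def find_access_key_ids(text):
    """Return every string in text that is shaped like an AWS access key ID."""
    return ACCESS_KEY_ID.findall(text)


# Placeholder key from AWS documentation, not a real credential.
sample = 'aws_access_key_id = "AKIAIOSFODNN7EXAMPLE"\nregion = "us-east-1"\n'
print(find_access_key_ids(sample))  # ['AKIAIOSFODNN7EXAMPLE']
```

A match is only a lead: the key still has to be confirmed as live, and the paired secret key has no fixed prefix to search for.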

Open FTP Servers

intitle:"index of" inurl:ftp

FTP servers with directory listing enabled expose uploaded files, backups, and website content. Anonymous FTP access remains surprisingly common on older infrastructure.

Login Pages and Authentication Portals

Discovering login pages is the first step in authentication testing. These dorks identify admin panels, database interfaces, and remote access systems.

Generic Admin Panels

inurl:admin intitle:login

Admin panels are high-value targets for credential stuffing, brute force attacks, and vulnerability exploitation. Common paths include /admin/, /administrator/, /cpanel.

CMS-Specific Login Pages

inurl:wp-login.php

WordPress (wp-login.php), Joomla (/administrator/), and Drupal (/user/login) each have standard login URLs. Identifying the CMS allows targeted vulnerability scanning.

Database Management Interfaces

intitle:"phpMyAdmin" "Welcome to phpMyAdmin"

phpMyAdmin provides web-based MySQL database administration. Exposed instances with weak credentials grant direct database access, allowing data exfiltration and system compromise.

Remote Access Portals

intitle:"Remote Desktop Web Connection"

Remote management interfaces like RDP Web Access, VNC viewers, Tomcat Manager, and Jenkins provide administrative access. Default credentials are common.

Directory Listings and Exposed Files

WordPress Exposure

inurl:"/wp-content/uploads/"

WordPress stores uploaded media in wp-content/uploads/, plugins in wp-content/plugins/, and themes in wp-content/themes/. Plugin directories may contain vulnerable code.

Git Repositories

intitle:"Index of /.git"

Exposed .git directories allow complete source code reconstruction. This reveals application logic, credentials in commit history, and developer information.

Cloud Storage Buckets

site:s3.amazonaws.com "target"

AWS S3 buckets, Azure Blob Storage, and Google Cloud Storage with misconfigured permissions expose data to the internet. Bucket names often follow predictable patterns.

Sensitive Configuration Directories

intitle:"Index of" site:.edu (/admin | /backup | /db)

Common sensitive directories include /backup/, /db/, /config/, /temp/, and /old/. Educational institutions often have weaker security postures.

Database Files and Credentials

SQL Database Dumps

filetype:sql intext:"INSERT INTO" (password | passwd | pwd)

SQL dump files contain complete database contents including user tables, password hashes, email addresses, and sensitive business data.

Environment Configuration Files

filetype:env "DB_PASSWORD"

Modern web frameworks use .env files for environment-specific configuration. These files contain database credentials, API keys, encryption keys, and third-party service tokens.
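
To triage a .env file found during an authorized assessment or a self-audit, it helps to list which variables look sensitive without printing their values. This is a rough Python sketch: the marker list, function names, and sample contents are assumptions for illustration, and the parser handles only simple KEY=VALUE lines.

```python
# Substrings that suggest a variable holds a secret (an assumed, non-exhaustive list).
SENSITIVE_MARKERS = ("PASSWORD", "SECRET", "KEY", "TOKEN")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        pairs[key.strip()] = value.strip().strip('"').strip("'")
    return pairs


def sensitive_keys(pairs):
    """Return names of non-empty variables whose name contains a sensitive marker."""
    return sorted(
        key for key, value in pairs.items()
        if value and any(marker in key.upper() for marker in SENSITIVE_MARKERS)
    )


# Hypothetical file contents for illustration.
sample = """
# hypothetical values
APP_ENV=production
DB_PASSWORD="hunter2"
API_KEY=abc123
MAIL_TOKEN=
"""
print(sensitive_keys(parse_env(sample)))  # ['API_KEY', 'DB_PASSWORD']
```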

Database Connection Strings

inurl:wp-config.php intext:"DB_PASSWORD"

Application configuration files contain hard-coded database credentials. WordPress uses wp-config.php, Joomla uses configuration.php.

MongoDB Exposure

intitle:"index of" mongodb

MongoDB instances often run without authentication enabled. Directory listings of MongoDB data files or configuration exposures indicate vulnerable databases.

Network Infrastructure Discovery

IP Cameras and Surveillance Systems

intitle:"Live View / - AXIS"

Network cameras from AXIS, D-Link, Hikvision, and others expose live video feeds through web interfaces. Default credentials remain common.

Router and Network Device Interfaces

intitle:"DD-WRT" inurl:Status_Router.asp

Consumer and enterprise routers expose web management interfaces. DD-WRT, pfSense, MikroTik RouterOS, TP-Link, and Linksys devices have identifiable login pages.

IoT and Smart Devices

"Server: Boa" "200 OK"

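The Boa web server is embedded in many IoT devices with minimal security. Smart home devices and embedded systems often lack authentication. Because Google indexes page content rather than HTTP response headers, this banner-style query only matches pages that print the banner in their body; Shodan searches banners directly.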

Industrial Control Systems (ICS/SCADA)

intitle:"SCADA" inurl:login

Critical infrastructure systems including SCADA, PLCs, and industrial control systems occasionally have internet-exposed interfaces. These represent catastrophic security risks.

Social Media and People Search

Email Address Discovery

site:linkedin.com "person name" "@target.com"

Finding email addresses enables phishing campaigns and social engineering. Corporate email patterns can be inferred and validated through Google searches.

LinkedIn X-Ray Searches

site:linkedin.com "Chief Security Officer" "New York"

Google's index of public LinkedIn profiles is often easier to query than LinkedIn's own search. Identify key personnel, organizational hierarchy, and technology stacks from job descriptions.

Username Enumeration

"username" site:*.com -site:target.com

Finding the same username across multiple platforms builds comprehensive profiles for social engineering. People often reuse usernames.

Contact Lists and Email Databases

filetype:csv intext:"email,firstname,lastname"

Exposed customer databases, marketing lists, and employee directories provide email addresses. CRM exports and newsletter subscriber lists are frequently exposed.

Government and Organization Targeting

Government Data

site:gov filetype:xlsx budget

Government sites contain policy documents, budget data, organizational charts, and RFPs. Classification markings like "For Official Use Only" (FOUO) indicate information sensitivity.

Educational Institution Research

site:.edu intitle:"Index of /admin"

Universities often have weaker security postures due to open academic culture and limited IT resources. Educational domains frequently expose research data and administrative systems.

Targeted Organization Reconnaissance

site:example.com (ext:env | ext:log | ext:conf | ext:sql)

Comprehensive site-specific reconnaissance combines multiple file type searches, admin panel discovery, and subdomain enumeration. Look for development servers (dev., staging., test.).

Error Messages and Debug Information

SQL Error Messages

intext:"MySQL server" intext:"on line" filetype:php

SQL errors reveal database types, table/column names, file system paths, and query structure. This enables targeted SQL injection attacks.
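
The database type can often be guessed from the wording of the error alone. The sketch below matches a handful of well-known message fragments; the signature table and function name are illustrative assumptions and far from exhaustive.

```python
# A few well-known error fragments per database (illustrative, not exhaustive).
ERROR_SIGNATURES = {
    "MySQL": ("You have an error in your SQL syntax", "mysql_fetch", "MySQL server"),
    "PostgreSQL": ("PostgreSQL", "pg_query("),
    "Microsoft SQL Server": ("Unclosed quotation mark", "Microsoft SQL Server"),
    "Oracle": ("ORA-",),
}


def classify_sql_error(text):
    """Return the databases whose signature fragments appear in the error text."""
    lowered = text.lower()
    return [
        database
        for database, fragments in ERROR_SIGNATURES.items()
        if any(fragment.lower() in lowered for fragment in fragments)
    ]


print(classify_sql_error("You have an error in your SQL syntax; check the manual"))
print(classify_sql_error("ORA-00933: SQL command not properly ended"))
```

Substring matching like this produces false positives on pages that merely discuss such errors, so treat a hit as a prompt for manual review.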

PHP Error Disclosure

"Fatal error: Call to" ext:php

PHP warnings and errors expose file paths, function names, code structure, and include/require statements. Local File Inclusion (LFI) vulnerabilities often accompany these errors.

Debug and Information Pages

inurl:phpinfo.php

phpinfo() pages display complete PHP configuration including installed modules, file paths, environment variables, and server software versions.

Stack Traces and Exception Details

(intitle:"exception" | intitle:"failure") intext:"stack trace"

Application stack traces reveal code structure, library versions, function call chains, and internal variable names.

Platform-Specific Dorks

WordPress Comprehensive

  • Configuration: inurl:wp-config.php intext:DB_PASSWORD
  • Plugins: inurl:/wp-content/plugins/
  • Uploads: inurl:"/wp-content/uploads/" intitle:"Index of"
  • Backups: inurl:"/wp-content/backup-" filetype:sql
  • Login: inurl:wp-login.php

WordPress powers roughly 43% of websites, making it a primary target. Plugin vulnerabilities are common.

phpMyAdmin Detection

intitle:"Welcome to phpMyAdmin"

Direct database access through web interface. Default credentials or weak passwords remain common.

Jenkins CI/CD

intitle:"Dashboard [Jenkins]"

Jenkins servers often lack authentication or use default credentials. Access enables code execution and credential harvesting.

Git Repository Exposure

intitle:"Index of /.git"

.git directory exposure allows complete source code reconstruction. Repository history contains credentials and API keys.

Elasticsearch and Kibana

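intitle:"Elasticsearch" inurl:9200

Elasticsearch without authentication exposes complete indexed data. Kibana dashboards reveal log aggregation and metrics. Google has no port: operator (that filter belongs to Shodan), so the port number is matched in the URL with inurl: instead.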

Docker and Container Platforms

intitle:"Docker" inurl:container

Exposed Docker APIs allow container manipulation, image pulling, and command execution.

Advanced Combined Queries

Complete Site Audit

site:target.com (ext:env | ext:log | ext:bak | ext:sql | ext:zip | ext:conf | ext:ini) (intext:"password" | intext:"api_key" | intext:"private_key")

Comprehensive search for sensitive files across multiple extensions and credential patterns.

Credential Hunting Multi-Vector

site:target.com intext:"username" intext:"password" (filetype:txt | filetype:log | filetype:cfg | filetype:ini)

Targets multiple file types likely to contain plaintext credentials.

Admin Panel Discovery Matrix

site:target.com (inurl:admin | inurl:administrator | inurl:login | inurl:portal | inurl:dashboard) (intitle:login | intitle:admin | intitle:dashboard)

Combines URL patterns and page titles to discover administrative interfaces.

API Endpoint and Token Discovery

site:target.com (inurl:api | inurl:v1 | inurl:v2 | inurl:graphql) (intext:"token" | intext:"key" | intext:"secret" | intext:"authorization")

Modern applications use REST APIs and GraphQL endpoints. API documentation and tokens enable API abuse.

Cloud Infrastructure Enumeration

(site:s3.amazonaws.com | site:blob.core.windows.net | site:storage.googleapis.com) "target"

Discovers cloud storage across AWS S3, Azure Blob Storage, and Google Cloud Storage simultaneously.

Error Message Aggregation

site:target.com (inurl:"error" | intitle:"exception" | intitle:"failure" | "database error" | "SQL syntax" | "undefined index" | "stack trace" | "fatal error")

Comprehensive error discovery reveals multiple technology layers and potential vulnerabilities.

Subdomain Intelligence Gathering

site:*.target.com (intitle:"index of" | inurl:admin | inurl:portal | inurl:dev | inurl:staging)

Wildcard subdomain searches combined with vulnerability indicators. Development subdomains often have weaker security.
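
Once result URLs have been collected, the distinct subdomains can be pulled out with the standard library. This is a minimal sketch: extract_subdomains and the sample URLs are hypothetical, and the input is assumed to be a list of URLs copied from search results.

```python
from urllib.parse import urlparse


def extract_subdomains(urls, root):
    """Return the sorted, de-duplicated hostnames that are subdomains of root."""
    root = root.lower()
    found = set()
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        # Require a leading dot so "nottarget.com" does not match "target.com".
        if host.endswith("." + root):
            found.add(host)
    return sorted(found)


# Hypothetical result URLs for illustration.
results = [
    "https://dev.target.com/login",
    "https://staging.target.com/admin/",
    "https://dev.target.com/index.html",
    "https://www.target.com/",
    "https://nottarget.com/",
]
print(extract_subdomains(results, "target.com"))
# ['dev.target.com', 'staging.target.com', 'www.target.com']
```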

Private Key and Certificate Discovery

(ext:pem | ext:key | ext:p12 | ext:pfx) ("PRIVATE KEY" | "BEGIN RSA" | "BEGIN OPENSSH")

SSL/TLS private keys and SSH keys occasionally get committed to repositories. Private key exposure enables impersonation and server compromise.
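
The quoted strings in this dork are fragments of PEM header lines, which makes private keys easy to detect in your own files before they are published. A short Python sketch follows; the function name and the "PKCS#8" label for unlabelled headers are choices made here for illustration.

```python
import re

# PEM private-key header lines, e.g. "-----BEGIN RSA PRIVATE KEY-----".
PEM_PRIVATE_KEY = re.compile(
    r"-----BEGIN (?:(RSA|EC|DSA|OPENSSH|ENCRYPTED) )?PRIVATE KEY-----"
)


def private_key_types(text):
    """Return the type named in each private-key header found in text.

    A bare "BEGIN PRIVATE KEY" header is reported as "PKCS#8".
    """
    return [m.group(1) or "PKCS#8" for m in PEM_PRIVATE_KEY.finditer(text)]


# Key bodies are omitted; only the header lines matter for detection.
sample = (
    "-----BEGIN RSA PRIVATE KEY-----\n(base64 omitted)\n-----END RSA PRIVATE KEY-----\n"
    "-----BEGIN PUBLIC KEY-----\n(base64 omitted)\n-----END PUBLIC KEY-----\n"
)
print(private_key_types(sample))  # ['RSA']
```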

Best Practices and Operational Guidelines

Ethical and Legal Framework

Always obtain explicit written authorization before conducting security testing. Bug bounty programs provide legal safe harbors, but respect scope restrictions. Unauthorized access to computer systems is a crime in most jurisdictions (in the United States, under the Computer Fraud and Abuse Act), with potential felony charges and civil liability. View exposed data minimally: don't download entire databases or access sensitive personal information unnecessarily.

Effective Dorking Strategy

Start with broad reconnaissance (site:target.com) to understand scope, then narrow to specific vulnerabilities. Use exclusions to reduce noise. Chain operators logically: file type first to limit result set, then content searches. Space queries temporally to avoid rate limiting.
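
The broad-to-narrow ordering and the spacing between queries can be planned up front. The sketch below only produces a schedule; it sends nothing. The function name, the 30-second base delay, and the 15-second jitter are arbitrary assumptions, not values recommended by Google.

```python
import random


def plan_queries(target, narrowing_terms, base_delay=30.0, jitter=15.0, seed=None):
    """Yield (delay_seconds, query) pairs, broad site: query first.

    Every query after the first waits base_delay plus a random jitter.
    """
    rng = random.Random(seed)
    queries = [f"site:{target}"] + [f"site:{target} {term}" for term in narrowing_terms]
    for index, query in enumerate(queries):
        delay = 0.0 if index == 0 else base_delay + rng.uniform(0, jitter)
        yield delay, query


# File type first to limit the result set, then a content term.
steps = ["filetype:pdf", 'filetype:pdf "confidential"']
for delay, query in plan_queries("target.com", steps):
    print(f"wait {delay:.0f}s, then run: {query}")
```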

Rate Limiting and Detection Avoidance

Google implements rate limiting on suspicious query patterns. Distribute queries over time, use VPN rotation if necessary, and consider alternative search engines. Bing, DuckDuckGo, and Yandex have different indexes and may reveal different results.

Beyond Google: Specialized Tools

Shodan (shodan.io) indexes internet-connected devices, IoT, and industrial control systems more comprehensively than Google. Censys (censys.io) scans internet hosts and catalogs the TLS certificates they present. These platforms are essential for comprehensive infrastructure reconnaissance.

Automation and Tool Integration

Automated tools produce false positives, so manual verification is essential. Tools like Pagodo, go-dork, and Dorks Eye speed up collection, but verify each finding yourself. DorkSearch.com provides a web interface for query building.

Defensive Countermeasures

Organizations should "self-dork" regularly: run these techniques against your own infrastructure to identify exposures before adversaries do. Disable directory listings, remove development files from production, and enable authentication on all administrative interfaces. Treat robots.txt as a crawler hint rather than an access control, since the file is publicly readable and listing sensitive paths in it advertises them.
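
A recurring self-audit can start from a fixed set of query templates filled in with your own domain. The following Python sketch reuses dorks from earlier sections; the template selection, labels, and function name are illustrative assumptions.

```python
# Query templates drawn from dorks earlier in this guide.
SELF_AUDIT_TEMPLATES = {
    "directory listings": 'site:{domain} intitle:"Index of /"',
    "environment files": 'site:{domain} filetype:env "DB_PASSWORD"',
    "login pages": "site:{domain} inurl:admin intitle:login",
    "phpinfo pages": "site:{domain} inurl:phpinfo.php",
}


def self_audit_queries(domain):
    """Return a label -> query mapping for the given domain."""
    return {
        label: template.format(domain=domain)
        for label, template in SELF_AUDIT_TEMPLATES.items()
    }


for label, query in self_audit_queries("example.com").items():
    print(f"{label}: {query}")
```

Running the same list on a schedule and comparing results between runs shows when a new exposure has been indexed.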

Bug Bounty Applications

Bug bounty hunters use dorking for initial reconnaissance and scope mapping. Create custom search engines combining multiple target domains. Automate discovery of common vulnerability patterns: open redirects, potential XSS, and exposed endpoints.

Current Trends and Future Considerations

Google continually tightens security filtering. Cloud misconfigurations (S3 buckets, Azure blobs) remain among the most common high-severity findings. Environment variable leaks persist despite awareness. API exposure and GraphQL endpoints represent emerging attack surfaces.