Google Dorks for OSINT Investigations: A Guide

Author: Rollins Duke
Published: February 25th, 2022 • 7 Min Read

Many people use the phrase “Google dorks” to refer to search queries that combine Google’s search operators with tailored parameters to run specialised searches. Let’s look at how Google dorks help in OSINT investigations.

We have classified the dorks according to the type of target information they are used for, beginning with people:

People & Accounts

A well-known cyber specialist was the first to tweet a dork after we raised the topic. He likes to search for email addresses by username and lets Google do the hard work. Instead of searching each email provider separately, he replaces the domain name with an asterisk:

“username*com”
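Because the asterisk matches any provider, the results for this dork mix many domains. Below is a minimal sketch of how matching addresses could be filtered out of result snippets afterwards. The function name `emails_for_username` and the sample snippets are hypothetical, and the pattern only covers `.com` addresses that begin with the username, mirroring the dork.

```python
import re


def emails_for_username(username: str, snippets: list[str]) -> list[str]:
    """Collect .com email addresses whose local part starts with `username`."""
    # Mirrors the dork "username*com": the username, anything, then a .com domain.
    pattern = re.compile(
        rf"\b{re.escape(username)}[\w.+-]*@[\w-]+(?:\.[\w-]+)*\.com\b",
        re.IGNORECASE,
    )
    found: list[str] = []
    for snippet in snippets:
        for match in pattern.findall(snippet):
            address = match.lower()
            if address not in found:  # keep first-seen order, drop duplicates
                found.append(address)
    return found


# Hypothetical snippets copied from a results page.
sample = [
    "Contact: jdoe42@gmail.com for details",
    "Reach JDoe42@Yahoo.com or admin@example.org",
]
print(emails_for_username("jdoe42", sample))
```

Addresses on other top-level domains are ignored on purpose, since the dork itself ends in “com”.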

A cyber expert mentioned his favourite dork, which searches for people’s online resumes. You can search within a website’s URL or within its text:

inurl:resume “ken wales”
intext:resume “ken wales”

A cybersecurity expert shared a job-related dork. By restricting the search to LinkedIn, he looks for people with a certain job title in a certain location. He also told us about another trick: you can search for icons or Unicode characters, such as telephone symbols:

site:linkedin.com/in “<job title>” (☎ OR ☏ OR ✆ OR 📱) +“<location>”

The same symbols can be combined with a specific name:

“<name>” (☎ OR ☏ OR ✆ OR 📱)
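The two symbol-based dorks above differ only in the values filled in, so they can be generated from a template. This is a small sketch under that assumption; the function names and the example values are made up for illustration.

```python
PHONE_SYMBOLS = ("☎", "☏", "✆", "📱")  # the four symbols used in the dorks above


def phone_symbol_group() -> str:
    """Return the symbols as one parenthesised OR group."""
    return "(" + " OR ".join(PHONE_SYMBOLS) + ")"


def linkedin_contact_dork(job_title: str, location: str) -> str:
    """LinkedIn profiles with a given job title, a phone symbol and a location."""
    return f'site:linkedin.com/in "{job_title}" {phone_symbol_group()} +"{location}"'


def name_contact_dork(name: str) -> str:
    """Any page that shows a given name next to a phone symbol."""
    return f'"{name}" {phone_symbol_group()}'


# Hypothetical example values.
print(linkedin_contact_dork("security analyst", "Delhi"))
print(name_contact_dork("ken wales"))
```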

A cyber expert sent us the most recent people-related dork via Twitter. It is a clever way to find the members of organisations on GitHub:

site:github.com/orgs/*/people

In addition, if you are looking for lists of attendees or finalists, you can use the dork below:

intitle:final.attendee.list OR inurl:final.attendee.list

Another tip looks for login information on Trello boards. Many Trello boards have been exposed and indexed by Google because people forget to tighten the security settings on their boards.

site:trello.com password + admin OR username

Documents

A basic Google dork lets you search for a specific document type within a website or domain:

site:<domain> filetype:PDF

Note: You can also use the abbreviation for the extension, which is: ‘ext:’ in place of the word ‘filetype:’
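A dork is ultimately just the `q` parameter of a Google search URL, so it can be assembled and encoded in a few lines. The sketch below assumes the standard `https://www.google.com/search?q=` endpoint; the function names and the `example.com` domain are placeholders.

```python
from urllib.parse import quote_plus


def document_dork(domain: str, extension: str, use_ext_alias: bool = False) -> str:
    """Build site:<domain> filetype:<ext>, or the ext: variant from the note."""
    operator = "ext" if use_ext_alias else "filetype"
    return f"site:{domain} {operator}:{extension}"


def google_search_url(dork: str) -> str:
    """URL-encode a dork so it can be opened directly in a browser."""
    return "https://www.google.com/search?q=" + quote_plus(dork)


dork = document_dork("example.com", "pdf")  # placeholder domain
print(dork)
print(google_search_url(dork))
```

Opening the printed URL by hand is the intended use here; automated querying of Google is a separate matter that this sketch does not cover.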

As part of his presentation, a cyber expert showed us a search that also targets PDFs, but only those that may contain email information. Replace “<domain>” with your company’s domain name and take a look at what is available:

filetype:pdf <domain> “email”

Here’s an example of a dork searching for XLS files on government websites:

filetype:xls site:.gov

You can add further extensions depending on what you are trying to find. To do so, place multiple file extensions between double quotes, separated by a pipe (the vertical line “|”). Do not forget to leave a space on either side of each pipe:

filetype:“xls | xlsx | doc | docx | txt | pdf” site:.gov

For more extensions that may be of interest, a cyber expert shared the following tip on Twitter:

filetype:“doc | pdf | xls | txt | ps | rtf | odt | sxw | psw | ppt | pps | xml”
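Long extension lists are easy to mistype, so it can help to generate them. The sketch below builds the quoted, pipe-separated form described above. It also builds an OR-chained form, which is not from this article but is a commonly used alternative worth trying if the piped form returns unexpected results. Both function names are hypothetical.

```python
def piped_filetype_dork(extensions: list[str], site: str = "") -> str:
    """Quoted, pipe-separated form, with a space on each side of every pipe."""
    joined = " | ".join(extensions)
    dork = f'filetype:"{joined}"'
    return f"{dork} site:{site}" if site else dork


def or_filetype_dork(extensions: list[str], site: str = "") -> str:
    """Alternative (assumption): one filetype: operator per extension, joined by OR."""
    joined = " OR ".join(f"filetype:{ext}" for ext in extensions)
    dork = f"({joined})" if len(extensions) > 1 else joined
    return f"{dork} site:{site}" if site else dork


office_docs = ["xls", "xlsx", "doc", "docx", "txt", "pdf"]
print(piped_filetype_dork(office_docs, ".gov"))
print(or_filetype_dork(office_docs, ".gov"))
```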

Lastly, another tip from a cyber expert searches HubSpot for any type of document that contains the word “trends” and has the year 2021 in its title or URL:

site:hubspot.net intitle:2021 OR inurl:2021 “* trends”

Another cyber expert shared a method that targets emails delivered through Google: it searches for txt or pdf files that carry the text “Email delivery powered by Google” and contain words such as FBI, CIA, or NYPD (these can be replaced with words of your choice):

“Email delivery powered by Google” ext:pdf OR ext:txt nypd OR fbi OR cia

Cloud, Buckets and Databases

A cyber expert shared one of his favourite dorks with the audience. It searches for indexed documents that contain the word “confidential” or the phrase “top secret” and are stored in open Amazon S3 buckets:

site:s3.amazonaws.com confidential OR “top secret”

Another search involving Amazon buckets came from a cybersecurity professional. It looks for XLS files that may reveal sensitive login information:

s3 site:amazonaws.com filetype:xls password

Since Excel files are not the only document format that may contain the information you are looking for, consider adding other interesting extensions to your query.

And, of course, you can use Google to look for copies of databases. To locate some of them, simply search for:

ext:sql intext:“-- phpMyAdmin SQL Dump”
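The dork works because a phpMyAdmin export starts with an SQL comment (two ASCII hyphens) followed by the words “phpMyAdmin SQL Dump”. The same marker can be used to triage files you have already collected. This is a minimal sketch; the function name is hypothetical and the sample text, including its version line, is made up.

```python
def looks_like_phpmyadmin_dump(text: str, max_lines: int = 5) -> bool:
    """Return True if one of the first lines is the phpMyAdmin dump marker."""
    marker = "-- phpMyAdmin SQL Dump"
    for line in text.splitlines()[:max_lines]:
        if line.strip().startswith(marker):
            return True
    return False


# Made-up sample resembling the top of an export file.
sample = "-- phpMyAdmin SQL Dump\n-- version 5.2.1\n-- https://www.phpmyadmin.net/\n"
print(looks_like_phpmyadmin_dump(sample))
print(looks_like_phpmyadmin_dump("CREATE TABLE t (id INT);"))
```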

Social Media

A cyber expert shared a tip on how to determine whether a particular tweet has been shared elsewhere, such as on a news website. Search for the exact text of the tweet and tell Google to ignore anything posted on twitter.com by putting a minus sign in front of the site: operator:

“text of a tweet” -site:twitter.com

Almost the same method can be used to find messages or links that mention a specific username but do not originate from that account. For example, the following dork searches for pages that contain “@delhipolice” but excludes the Delhi Police timeline on Twitter:

@delhipolice -site:twitter.com/Delhi_Police
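Both social media dorks follow the same pattern: a search term followed by a negated site: operator. A small sketch of that pattern is shown below. The function name is hypothetical, and quoting any term that contains a space is a choice made here so that it is searched as an exact phrase.

```python
def mentions_outside(term: str, excluded: str) -> str:
    """Search for `term` while excluding results from `excluded` (site or path)."""
    needs_quotes = " " in term and not term.startswith('"')
    quoted = f'"{term}"' if needs_quotes else term
    return f"{quoted} -site:{excluded}"


print(mentions_outside("text of a tweet", "twitter.com"))
print(mentions_outside("@delhipolice", "twitter.com/Delhi_Police"))
```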

Would you like to know more about Google Dorking?

Using Google search operators is an art form in itself, and mastering them is a skill. The goal in OSINT is (for the most part) to build a haystack that contains your target information. With the right keywords and the appropriate search operators, you keep the noise in that haystack to a minimum and increase the likelihood of finding the needle.

Note that Google limits a search to a maximum of 32 words. Any search terms beyond that limit (word 33 onwards) are ignored. There is also a character limit: a single keyword cannot be longer than 2048 characters.
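These limits are easy to exceed once a dork grows, so a quick check before pasting a query can save a confusing result. The sketch below uses the two figures quoted above and assumes that a “word” is any whitespace-separated token. Google may count operators and quoted phrases differently, so treat the result as a rough warning. The function name is hypothetical.

```python
MAX_WORDS = 32    # word limit quoted in the article
MAX_CHARS = 2048  # per-keyword character limit quoted in the article


def check_query_limits(query: str) -> dict:
    """Report how a query compares with the limits (whitespace tokens assumed)."""
    terms = query.split()
    return {
        "word_count": len(terms),
        "ignored_terms": terms[MAX_WORDS:],  # terms 33 and beyond
        "oversized_terms": [t for t in terms if len(t) > MAX_CHARS],
    }


report = check_query_limits("site:.gov " + " OR ".join(f"filetype:{e}" for e in ["xls", "pdf"]))
print(report["word_count"], report["ignored_terms"])
```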

All of the Google dorks listed above were created in response to a specific intelligence requirement. It is much harder to craft a targeted search query if you do not have a good question to begin with. So my advice is to ask yourself: what exactly are you looking for? What question are you trying to answer? Once you know that, it becomes easier to determine which search operators you need.

If you need additional inspiration, look into sources such as the Google Hacking Database, a website that collects Google dorks submitted by users with a specific need or interest. GoogleGuide is another useful website for learning about the available Google search operators and how to use them for research. Even though it is a little dated (the information is from 2007), GoogleGuide still has a lot to teach those who are interested in open source intelligence. Another option is the Ahrefs blog, which provides a visual demonstration of how to use 42 different search operators.

Google Hacking for Penetration Testers, a book written by a penetration tester, is another option. It has an entire chapter dedicated to Google dorking that is a must-read for anyone who wants to learn how to conduct research using Google searches. There is a heated debate within the OSINT community about which edition is the most appropriate for OSINT practitioners.