Google search engine operators. Using Little-Known Google Functions to Find the Hidden

Obtaining private data does not always mean hacking: sometimes it is published in the public domain. Knowledge of Google's settings and a little ingenuity will allow you to find a lot of interesting things, from credit card numbers to FBI documents.

WARNING

All information is provided for informational purposes only. Neither the editors nor the author are responsible for any possible harm caused by the materials of this article.

Everything is connected to the Internet today, with little care for restricting access. As a result, a lot of private data becomes the prey of search engines. Spider robots are no longer limited to web pages: they index all content available on the Web and constantly add confidential information to their databases. Learning these secrets is easy, you just need to know how to ask about them.

Looking for files

In capable hands, Google will quickly find everything that is poorly protected on the Web, such as personal information and files for official use. They are often hidden like a key under a rug: there are no real access restrictions, the data just lies in the back of the site, where no links lead. Google's standard web interface provides only basic advanced search settings, but even those will suffice.

Google can limit a search to files of a certain kind with two operators: filetype and ext. The first specifies the format that the search engine determined from the file header, the second the file extension, regardless of the file's internal content. In both cases you only need to specify the extension. Originally, the ext operator was convenient when a file had no specific format characteristics (for example, when searching for .ini and .cfg configuration files, which can contain anything). Google's algorithms have since changed, and there is no visible difference between the operators: the results are the same in most cases.


Filtering the output

By default, Google searches for the words, and in general any characters you enter, in all files on indexed pages. You can limit the search scope to a top-level domain, a specific site, or the location of the desired sequence in the files themselves. For the first two options, the site operator is used, followed by the name of the domain or the selected site. In the third case, a whole set of operators lets you search in service fields and metadata. For example, allinurl finds the specified text in the body of the links themselves, allinanchor in the anchor text of links (the text inside the <a> tag), allintitle in page titles, and allintext in the body of pages.

For each operator there is a lighter version with a shorter name (without the all prefix). The difference is that allinurl finds links containing all the words, while inurl applies only to the first of them; the second and subsequent words of the query can appear anywhere on the page. The inurl operator also differs from site, which is similar in meaning: inurl can match any sequence of characters in the link to the desired document (for example, /cgi-bin/), which is widely used to find components with known vulnerabilities.

Let's try it in practice. We take the allintext filter and make the query return a list of credit card numbers and verification codes, which will expire only after two years (or when their owners get tired of feeding everyone in a row).

Allintext: card number expiration date /2017 cvv

When you read in the news that a young hacker "hacked into the servers" of the Pentagon or NASA and stole classified information, in most cases it is precisely this elementary technique of using Google. Suppose we are interested in a list of NASA employees and their contact details. Surely such a list exists in electronic form. For convenience or due to an oversight, it may also lie on the organization's website itself. Logically, there will be no links to it in this case, since it is intended for internal use. What words can such a file contain? At the very least, the field "address". It is easy to test all these assumptions.


inurl:nasa.gov filetype:xlsx "address"


We use bureaucracy

Such finds are a pleasant trifle. The really solid catch comes from a more detailed knowledge of Google's operators for webmasters, the Web itself, and the structure of what you are looking for. Knowing the details, you can easily filter the output and refine the properties of the files you need, so that what remains is really valuable data. It's funny that bureaucracy comes to the rescue here. It produces standard formulations that make it convenient to search for secret information that has accidentally leaked onto the Web.

For example, the Distribution statement, which is mandatory in US Department of Defense paperwork, denotes standardized restrictions on the distribution of a document. The letter A marks public releases in which there is nothing secret; B restricts distribution to US government agencies, C to US government agencies and their contractors, and so on up to F. Separately, there is the letter X, which in older versions of the instruction marked export-controlled technical data. Let those whose job it is look for such documents, and we will limit ourselves to files with the letter C. According to DoDI 5230.24, such marking is assigned to documents containing a description of critical technologies that fall under export control. You can find such carefully guarded information on sites in the .mil top-level domain allocated to the US military.

"DISTRIBUTION STATEMENT C" inurl:navy.mil

It is very convenient that the .mil domain contains only sites of the US Department of Defense and its contract organizations. Domain-limited search results are exceptionally clean, and the titles speak for themselves. It is practically useless to search for Russian secrets in this way: chaos reigns in the .ru and .rf domains, and the names of many weapons systems sound botanical (PP "Kiparis", self-propelled guns "Acacia") or even like fairy tales (TOS "Pinocchio").


By carefully examining any document from a site in the .mil domain, you can spot other markers to refine your search. One example is a reference to the export restrictions "Sec 2751", which is also convenient for finding interesting technical information. From time to time such material is removed from the official sites where it once appeared, so if you can't follow an interesting link in the search results, use the Google cache (cache operator) or the Internet Archive website.

We climb into the clouds

In addition to accidentally declassified documents from government departments, links to personal files from Dropbox and other data storage services that create "private" links to publicly published data occasionally pop up in the Google cache. It's even worse with alternative and self-made services. For example, the following query finds the data of all Verizon clients that have an FTP server installed and actively used on their router.

Allinurl:ftp://verizon.net

There are now more than forty thousand such smart people, and in the spring of 2015 there were an order of magnitude more. Instead of Verizon.net, you can substitute the name of any well-known provider, and the more famous it is, the larger the catch can be. Through the built-in FTP server, you can see files on an external drive connected to the router. Usually this is a NAS for remote work, a personal cloud, or some kind of peer-to-peer file download. All the content of such media is indexed by Google and other search engines, so you can access files stored on external drives via a direct link.

Peeping configs

Before the wholesale migration to the clouds, simple FTP servers, which also had their share of vulnerabilities, ruled as remote storage. Many of them are still relevant today. For example, the popular WS_FTP Professional program stores configuration data, user accounts, and passwords in the ws_ftp.ini file. It is easy to find and read, because all entries are stored in plain text and passwords are encrypted with the Triple DES algorithm after minimal obfuscation. In most versions, simply discarding the first byte is sufficient.

Decrypting such passwords is easy using the WS_FTP Password Decryptor utility or a free web service.

When people talk about hacking an arbitrary site, they usually mean getting a password from logs and backups of CMS or e-commerce application configuration files. If you know their typical structure, you can easily pick the keywords. Lines like those found in ws_ftp.ini are extremely common. For example, Drupal and PrestaShop always have a user ID (UID) and a corresponding password (pwd), and all information is stored in files with the .inc extension. You can search for them like this:

"pwd=" "UID=" ext:inc

We reveal passwords from the DBMS

In the configuration files of SQL servers, user names and email addresses are stored in clear text, and MD5 hashes are written instead of passwords. Decrypting them is, strictly speaking, impossible, but you can find a match among known hash-password pairs.

There are still DBMSs that do not even use password hashing. The configuration files of any of them can simply be viewed in the browser.

Intext:DB_PASSWORD filetype:env

With the advent of Windows servers, the place of configuration files was partly taken by the registry. You can search through its branches in exactly the same way, using reg as the file type. For example, like this:

Filetype:reg HKEY_CURRENT_USER "Password"=

Don't Forget the Obvious

Sometimes you can get to classified information through data that was accidentally left open and came into Google's field of view. The ideal option is to find a list of passwords in some common format. Only desperate people store account information in a text file, Word document, or Excel spreadsheet, but there are always enough of them.

Filetype:xls inurl:password

On the one hand, there are many ways to prevent such incidents. You need to specify adequate access rights in .htaccess, patch the CMS, avoid scripts of dubious origin, and close other holes. There is also the robots.txt exclusion list, which prohibits search engines from indexing the files and directories specified in it. On the other hand, if the robots.txt structure on some server differs from the standard one, it immediately becomes clear what they are trying to hide on it.

The list of directories and files on any site is preceded by the standard heading "index of". Since it must appear in the title for service purposes, it makes sense to limit the search for it with the intitle operator. Interesting material can be found in the /admin/, /personal/, /etc/ and even /secret/ directories.
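
From the defensive side, a site owner can check what a given robots.txt actually tells crawlers. The following is a minimal sketch using Python's standard urllib.robotparser module; the rules and the example.com URLs are made up for illustration. Keep in mind that robots.txt only asks crawlers to skip a path, it does not restrict access to it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /backup/
"""

def is_crawlable(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if the given rules allow `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# A path under /admin/ is excluded, an ordinary page is not.
print(is_crawlable(ROBOTS_TXT, "https://example.com/admin/users.xls"))
print(is_crawlable(ROBOTS_TXT, "https://example.com/blog/post.html"))
```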

Follow the updates

Relevance is extremely important here: old vulnerabilities are closed very slowly, but Google and its search results are constantly changing. There is even a difference between the "last second" filter (&tbs=qdr:s at the end of the request url) and the "real time" filter (&tbs=qdr:1).

The time span of a file's last update is also specified implicitly in Google. Through the graphical web interface, you can select one of the typical periods (hour, day, week, and so on) or set a date range, but this method is not suitable for automation.

From the appearance of the address bar, one can only guess at a way to limit the output of results using the &tbs=qdr: construct. The letter y after it specifies a limit of one year (&tbs=qdr:y), m shows the results for the last month, w for the week, d for the past day, h for the last hour, n for the minute, and s for the second. The most recent results just made known to Google are found using the &tbs=qdr:1 filter.
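
For a script, the same construct can be appended to the search URL programmatically. A minimal sketch, assuming the letters described above; the function name, the period names, and the example query are hypothetical, and the code does not check whether Google still honours every value.

```python
from urllib.parse import urlencode

# Period letters as described in the text; treating them as still
# supported by Google is an assumption.
QDR_PERIODS = {"year": "y", "month": "m", "week": "w", "day": "d", "hour": "h"}

def recent_search_url(query: str, period: str) -> str:
    """Build a search URL limited to results from the given period."""
    params = {"q": query, "tbs": "qdr:" + QDR_PERIODS[period]}
    return "https://www.google.com/search?" + urlencode(params)

print(recent_search_url("site:example.com filetype:pdf", "week"))
```

Note that urlencode percent-encodes the colon, so the parameter appears in the URL as tbs=qdr%3Aw.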

If you need to write a tricky script, it is useful to know that Google accepts a date range in Julian format through the daterange operator. For example, this is how you can find a list of PDF documents with the word confidential uploaded from January 1 to July 1, 2015.

Confidential filetype:pdf daterange:2457024-2457205

The range is specified in Julian date format without decimals. It is inconvenient to translate them manually from the Gregorian calendar. It's easier to use a date converter.
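
The conversion is also easy to do in code. A minimal sketch in Python: the Julian Day Number is the proleptic Gregorian ordinal of the date plus a constant offset. The function names here are illustrative only.

```python
from datetime import date

# Offset between Python's proleptic Gregorian ordinal (date.toordinal())
# and the Julian Day Number of the same calendar date.
JDN_OFFSET = 1721425

def to_julian_day(d: date) -> int:
    """Return the Julian Day Number (no decimals) for a calendar date."""
    return d.toordinal() + JDN_OFFSET

def daterange(start: date, end: date) -> str:
    """Format a daterange: operator for the given interval."""
    return f"daterange:{to_julian_day(start)}-{to_julian_day(end)}"

print(daterange(date(2015, 1, 1), date(2015, 7, 1)))
# prints: daterange:2457024-2457205
```

The output matches the range used in the query above.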

Targeting and filtering again

Besides being specified as operators in the search query, refinements can be sent directly in the body of the link. For example, the filetype:pdf refinement corresponds to the as_filetype=pdf construct. This makes it convenient to set any clarifications. Say, results only from the Republic of Honduras are requested by adding the construction cr=countryHN to the search URL, and results only from the city of Bobruisk by adding gcs=Bobruisk. See the developer section for a complete list.

Google's automation tools are designed to make life easier, but often add to the hassle. For example, a user's city is determined from the user's IP through WHOIS. Based on this information, Google not only balances the load between servers, but also changes the search results. Depending on the region, different results for the same query will reach the first page, and some of them may turn out to be completely hidden. To feel like a cosmopolitan and search for information from any country, add its two-letter code after the gl=country directive. For example, the code for the Netherlands is NL, while the Vatican and North Korea do not have their own code in Google.
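
The URL parameters mentioned in this section can be combined in one helper. A minimal sketch, assuming the parameter names as_filetype, cr and gl behave as described above; the function name and the example values are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def search_url_with_filters(query, filetype=None, country=None, region=None):
    """Build a search URL with optional file type, country and region filters."""
    params = {"q": query}
    if filetype:
        params["as_filetype"] = filetype   # same meaning as filetype: in the query
    if country:
        params["cr"] = "country" + country  # e.g. countryHN for Honduras
    if region:
        params["gl"] = region               # two-letter code, e.g. NL
    return "https://www.google.com/search?" + urlencode(params)

url = search_url_with_filters("annual report", filetype="pdf", country="HN", region="NL")
print(url)
# Parse the URL back to see which parameters were set.
print(parse_qs(urlsplit(url).query))
```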

Search results are often cluttered even after a few advanced filters are applied. In this case, it is easy to refine the query by adding a few exclusion words to it (each preceded by a minus sign). For example, banking, names, and tutorial are often used with the word Personal. Therefore, cleaner search results will come not from a textbook example of a query, but from a refined one:

Intitle:"Index of /Personal/" -names -tutorial -banking
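
Composing such queries by hand gets error-prone once several operators and exclusions are involved. A minimal sketch of a helper that assembles a query string; the function name, its arguments, and the example values are hypothetical, and it only builds text without sending anything to Google.

```python
def build_query(terms, exclude=(), **operators):
    """Join operator:value pairs, plain terms and -excluded words into one query."""
    parts = []
    for name, value in operators.items():
        if " " in value:
            value = f'"{value}"'         # quote multi-word values
        parts.append(f"{name}:{value}")  # no space after the colon
    parts.extend(terms)
    parts.extend(f"-{word}" for word in exclude)
    return " ".join(parts)

print(build_query(["handbook"], exclude=["draft", "template"],
                  site="example.com", filetype="pdf"))
# prints: site:example.com filetype:pdf handbook -draft -template
```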

Last Example

A sophisticated hacker is distinguished by the fact that he provides himself with everything he needs on his own. For example, a VPN is a convenient thing, but either expensive or temporary and with restrictions. Signing up for yourself alone is too expensive. It's good that there are group subscriptions, and with the help of Google it's easy to become part of a group. To do this, just find the Cisco VPN configuration file, which has a rather non-standard PCF extension and a recognizable path: Program Files\Cisco Systems\VPN Client\Profiles . One request, and you join, for example, the friendly staff of the University of Bonn.

Filetype:pcf vpn OR Group

INFO

Google finds configuration files with passwords, but many of them are encrypted or replaced by hashes. If you see strings of a fixed length, then immediately look for a decryption service.

The passwords are stored in encrypted form, but Maurice Massard has already written a program to decrypt them and is providing it for free via thecampusgeeks.com.

Google can help carry out hundreds of different types of attacks and penetration tests. There are many options for popular programs, major database formats, numerous vulnerabilities in PHP, clouds, and so on. If you know exactly what you are looking for, getting the necessary information becomes much simpler (especially information that was never meant to be made public). Not only Shodan feeds interesting ideas, but any database of indexed network resources!

Dear friends, today I will share with you one of my latest developments in website promotion. I will tell you how to remove the publication date from the SERPs and what advantages this provides.

As you know, search results show the publication date for many site pages. Dates help users navigate the search results and select pages with more up-to-date information.

In most cases, I myself prefer to go to pages that were published not so long ago, and I visit materials 3-5 years old or more much less often, since information in many topics often quickly becomes outdated and loses its relevance.

Do you think this article about Firefox plugins will receive the maximum number of clicks from search if it is dated 2008?

Or my post about WordPress plugins from 2007?

I think not, because the information in these topics becomes outdated over the years.

I thought about how I can use this moment to increase traffic to the sites I promote. There are many "evergreen" topics in which information practically does not become outdated, and materials published several years ago will also be useful and interesting for visitors.

For example, take the topic of dog training. The basic principles there have not changed for many years. Yet the owner of such a site will be sad 😉 when, in a few years, fewer visitors come to his articles from the search results, because they see the publication date and choose newer articles on other sites simply because they are more recent, although those may not be nearly as interesting or useful.

But in topics such as smartphones, gadgets, fashion, or women's clothing, information becomes outdated very quickly and loses its relevance. For them it makes no sense to remove the date from the search results.


I wish you high traffic on your sites!


This article will be useful primarily for novice optimizers, because more advanced ones should already know everything about these operators. To use this article with maximum efficiency, it is desirable to know exactly which words need to be promoted to the right positions. If you are not yet sure about the word list, use a keyword suggestion service; it is a bit confusing, but you can figure it out.

Important! Rest assured, Google is well aware that ordinary users will not use these operators and that only promotion specialists resort to them. Therefore, Google may slightly distort the information provided.

Intitle operator:

Usage: intitle:word
Example: intitle:website promotion
Description: When using this operator, you will receive a list of pages whose title contains the word you are interested in, in our case the phrase "website promotion" in its entirety. Note that there should not be a space after the colon. The page title matters for ranking, so take your titles seriously. Using this operator, you can estimate the approximate number of competitors who also want to be in the top positions for this word.

Inurl operator:

Usage: inurl:phrase
Example: inurl:search engine optimization cost calculation
Description: This command shows sites or pages that have the specified keyword in their URL. Note that there should not be a space after the colon.

Inanchor operator:

Usage: inanchor:phrase
Example: inanchor:seo books
Description: This operator helps you see the pages that are linked to with the given keyword as anchor text. This is a very important command, but unfortunately search engines are reluctant to share this information with SEOs for obvious reasons. There are services, such as Linkscape and Majestic SEO, that are willing to provide this information for a fee, but rest assured, the information is worth it.

Also, it is worth remembering that now Google is paying more and more attention to the “trust” of the site and less and less to the link mass. Of course, links are still one of the most important factors, but “trust” is playing an increasingly significant role.

A combination of two operators gives good results, for example intitle:promotion inanchor:"website promotion". The search engine will then show us the main competitors whose page title contains the word “promotion” and which have incoming links with the anchor “website promotion”.

Unfortunately, this combination does not allow you to find out the "trust" of the domain, which, as we have already said, is a very important factor. For example, a lot of older corporate sites don't have as many links as their younger competitors, but they do have a lot of old links that pull those sites to the top of the search results.

Site operator:

Usage: site:site address
Example: site:www.aweb.com.ua
Description: With this command, you can see a list of pages that are indexed by the search engine and that it knows about. It is mainly used to learn about the pages of competitors and analyze them.

Cache operator:

Usage: cache:page address
Example: cache:www.aweb.com.ua
Description: This command shows a “snapshot” of the page as of the robot's last visit to the site, and in general how the robot sees the page content. By checking the page cache date, you can determine how often robots visit the site. The more authoritative the site, the more often robots visit it; accordingly, the less authoritative the site (in Google's opinion), the less often robots take snapshots of the page.

The cache is very important when buying links. The closer the page caching date is to the link purchase date, the faster your link will be indexed by Google. Sometimes we found pages with a cache age of 3 months. By buying a link on such a site, you will only waste your money, because it is quite possible that the link will never be indexed.

Link operator:

Usage: link:url
Example: link:www.aweb.com.ua
Description: The link: operator finds and displays pages that link to the specified URL. This can be either a site's main page or an internal one.

Related operator:

Usage: related:url
Example: related:www.aweb.com.ua
Description: The related: operator displays pages that, in the search engine's opinion, are similar to the specified page. To a human, the resulting pages may seem to have nothing in common, but to a search engine they do.

Info operator:

Usage: info:url
Example: info:www.aweb.com.ua
Description: When using this operator, we will be able to get information about the page that is known to the search engine. This can be the author, publication date, and more. Additionally, on the search page, Google offers several actions at once that it can do with this page. Or, more simply, it will suggest using some of the operators that we described above.

Allintitle operator:

Usage: allintitle:phrase
Example: allintitle:aweb promotion
Description: If we start a search query with this operator, we get a list of pages whose title contains all the words of the phrase. For example, if we search for allintitle:aweb promotion, we get a list of pages that have both of these words in their titles. The words do not have to follow one another; they can be located in different places in the title.

Allintext operator:

Usage: allintext:word
Example: allintext:optimization
Description: This operator searches for all pages that contain the specified word in the text body. If we use allintext:aweb optimization, we will see a list of pages whose text contains these words. That is, not the entire phrase “aweb optimization”, but both of the words “optimization” and “aweb”.

How to search using google.com

Everyone probably knows how to use a search engine like Google =) But not everyone knows that if you compose a search query correctly using special constructs, you can find what you are looking for much more efficiently and quickly =) In this article I will try to show what you need to do, and how, to search correctly.

Google supports several advanced search operators that have special meaning when searching on google.com. Typically, these operators change the search, or even tell Google to perform an entirely different type of search. For example, the construction link: is a special operator, and the query link:www.google.com will not give you a normal search, but will instead find all web pages that have links to google.com.

Alternative request types

cache: If you include other words in the query, Google will highlight those included words within the cached document.
For example, cache:www.web site will show cached content with the word "web" highlighted.

link: This query shows web pages that contain links to the specified page.
For example: link:www.website will display all pages that have a link to http://www.site

related: Displays web pages that are "related" to the specified web page.
For example, related:www.google.com will list web pages that are similar to Google's home page.

info: The info: query will provide some information that Google has about the requested web page.
For example, info:website will show information about our forum =) (Armada - Forum of adult webmasters).

Other information requests

define: The define: query will provide a definition of the words you type after this, compiled from various online sources. The definition will be for the entire phrase entered (that is, it will include all words in the exact query).

stocks: If you start a query with stocks:, Google will treat the rest of the query terms as stock ticker symbols and link to a page showing the prepared information for those symbols.
For example, stocks: intc yhoo will show information about Intel and Yahoo. (Note that you must type the ticker symbols, not the company name.)

Request Modifiers

site: If you include site: in your query, Google will limit the results to the websites it finds in that domain.
You can also search individual zones, such as ru, org, com, etc. (site:com, site:ru)

allintitle: If you run a query with allintitle:, Google will limit the results with all the query words in the title.
For example, allintitle: google search will return all Google search pages like images, Blog, etc

intitle: If you include intitle: in your query, Google will restrict results to documents containing that word in the title.
For example, intitle:Business

allinurl: If you run a query with allinurl: Google will limit the results with all the query words in the URL.
For example, allinurl: google search will return documents with google and search in the URL. As an option, you can also separate words with a slash (/); the words on both sides of the slash will then be searched for within the same page. Example: allinurl: foo/bar

inurl: If you include inurl: in your query, Google will restrict the results to documents containing that word in the URL.
For example, Animation inurl:website

intext: Searches for the specified word only in the text of the page, ignoring the title, link texts, and other unrelated elements. There is also a derivative of this modifier, allintext:, with which all the words that follow in the query are searched for only in the text, which also matters because it ignores words frequently used in links.
For example, intext:forum

daterange: Searches within a time frame (daterange:2452389-2452389); the dates are specified in Julian format.

Well, and all sorts of interesting examples of requests

Examples of compiling queries for Google. For spammers

inurl:control.guest?a=sign

Site:books.dreambook.com “Homepage URL” “Sign my” inurl:sign

Site:www.freegb.net Homepage

Inurl:sign.asp "Character Count"

"Message:" inurl:sign.cfm "Sender:"

inurl:register.php “User Registration” “Website”

Inurl:edu/guestbook “Sign the Guestbook”

Inurl:post "Post Comment" "URL"

Inurl:/archives/ “Comments:” “Remember info?”

“Script and Guestbook Created by:” “URL:” “Comments:”

inurl:?action=add “phpBook” “URL”

Intitle:"Submit New Story"

Journals

inurl:www.livejournal.com/users/mode=reply

inurl greatestjournal.com/mode=reply

Inurl:fastbb.ru/re.pl?

inurl:fastbb.ru /re.pl? "Guest book"

Blogs

Inurl:blogger.com/comment.g?”postID”"anonymous"

Inurl:typepad.com/ “Post a comment” “Remember personal info?”

Inurl:greatestjournal.com/community/ “Post comment” “addresses of anonymous posters”

“Post comment” “addresses of anonymous posters” -

Intitle:"Post comment"

Inurl:pirillo.com “Post comment”

Forums

Inurl:gate.html?”name=Forums” “mode=reply”

inurl:”forum/posting.php?mode=reply”

inurl:”mes.php?”

inurl:”members.html”

inurl:forum/memberlist.php?”
