Search: Beyond Basic Googling

According to Wikipedia, "to google" became a transitive verb that commonly refers to searching for information on the World Wide Web, regardless of which search engine is used. The American Dialect Society chose it as the "most useful word of 2002," and in 2006 it was added to the Oxford English Dictionary and to the Merriam-Webster Collegiate Dictionary.

Yet despite Google's apparent dominance, it is by no means the only game in town, nor is it the be-all and end-all of Internet research. In many cases, obtaining the most complete set of results on a person or an issue of interest requires several search engines and methods, each used for its comparative advantage over the others.

In general, the Internet can be divided into two main (though entirely unequal) parts: the Dark Web, and everything else (i.e., the Surface Web and the Deep Web). The average Internet user will spend the vast majority of their virtual life on the Surface Web and Deep Web; this is where search engines, social media and content-sharing platforms, and commercial and e-commerce websites reign supreme, and where most of us go every day to keep up with news and developments, shop and pay bills, study and research, find entertainment and kill time, and stay in touch with colleagues, friends, and family. In other words, when most of us speak of the Internet, we have these two levels in mind.

The Dark Web, on the other hand, is for most a mysterious and foreboding place, rife with illicit and illegal activity and content, accessible only through cryptic addresses made up of seemingly random letters and numbers (such as .onion addresses) instead of the much more user-friendly domain names of the Surface Web. Yet here too there are search engines that can help users find what they seek, albeit with a little more effort and a taste for adventure.

The first and arguably the most important thing to understand about searching the Internet is that no single search engine or tool can cover all of its layers. Common search engines such as Google, Bing, Yahoo, Yandex, and Baidu all operate on the surface layer, with only limited access to the Deep Web, which often requires user interactions (scrolling down pages to load results, clicking buttons, filling in forms, providing user credentials, and accessing closed directories and files) that traditional search engines do not perform. In that sense, the Deep Web is simply an extension of the Surface Web, and it is probably easiest to think of it as the online content that has not been indexed by Google and others.

Second, in terms of size, the Surface Web is estimated to constitute only about 5% of the web (roughly equivalent to 6 billion websites), whereas the Deep Web is, according to most estimates, over 600 times larger. This means that for all their power, search engines quite literally merely scratch the surface of the web, and the view they can offer is therefore limited. Moreover, many websites impose severe restrictions on the robots (also known as crawlers) that search engines use to read and retrieve online content, so in truth search engines see only a fraction of even the Surface Web.
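To make the point about crawler restrictions concrete, the following is a minimal sketch of how a well-behaved crawler might consult a site's robots.txt file before fetching a page. It uses the robots.txt parser from Python's standard library. The robots.txt content, the crawler names ("GenericCrawler", "ExampleBot"), and the example.com URLs are all invented for illustration and do not describe any real site or search engine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /search

User-agent: ExampleBot
Disallow: /
"""


def check_access(robots_txt, user_agent, urls):
    """Return a dict mapping each URL to True if the crawler may fetch it."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return {url: parser.can_fetch(user_agent, url) for url in urls}


urls = [
    "https://example.com/",
    "https://example.com/private/report.html",
    "https://example.com/search?q=test",
]

# A crawler with no dedicated rules falls under the "*" section.
generic = check_access(ROBOTS_TXT, "GenericCrawler", urls)
# A crawler named in its own section is barred from the whole site.
blocked = check_access(ROBOTS_TXT, "ExampleBot", urls)

for url in urls:
    print(url, "| generic:", generic[url], "| ExampleBot:", blocked[url])
```

In this made-up example, the generic crawler can read the home page but not the private directory or the site's internal search results, while the crawler singled out by name cannot read anything at all. Content excluded this way will normally not appear in that engine's index, even though it sits on the Surface Web.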

Third, each search engine is built by a different team of people using a different approach and unique algorithms. This means that the way results are found, ranked, and presented to the user can differ significantly from one engine to the next. It is therefore possible, and even quite common, for two engines running identical search queries to present entirely different results.
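One simple way to gauge how much two engines actually differ on a given query is to measure the overlap between their result lists. The sketch below uses the Jaccard index (shared results divided by all distinct results), which is just one possible measure and ignores ranking. The engine names and URLs are placeholders, not real search output.

```python
def result_overlap(results_a, results_b):
    """Jaccard index of two result lists: 0.0 = nothing shared, 1.0 = identical sets."""
    set_a, set_b = set(results_a), set(results_b)
    union = set_a | set_b
    if not union:
        return 0.0  # two empty lists have nothing to compare
    return len(set_a & set_b) / len(union)


# Placeholder result lists for the same query on two hypothetical engines.
engine_a = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
]
engine_b = [
    "https://example.com/b",
    "https://example.com/d",
    "https://example.com/a",
]

overlap = result_overlap(engine_a, engine_b)
print(f"Overlap: {overlap:.2f}")
```

Here the two lists share two of four distinct URLs, giving an overlap of 0.50. A low score on a real query is a practical signal that a second engine is contributing results the first one missed.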

So how can we effectively and efficiently search the Internet to find valuable information?

We start by identifying the different categories of search tools that are available. As the graph above shows, three primary categories of search tool operate on the Surface Web and Deep Web: general, meta, and vertical search.

General search engines such as Google, Bing, Yandex, and Baidu can be extremely powerful when used to their full potential and in conjunction with each other, each for its comparative advantage. One of their hidden powers is that they enable users to go far beyond basic keyword queries and use complex, multi-factored queries with Boolean and other search operators, which are likely to return significantly more targeted results.
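As an illustration of what such a multi-factored query can look like, here is a minimal sketch that assembles a query string from a few widely used operators: quotation marks for an exact phrase, OR for alternatives, a leading minus sign for exclusion, and the site: and filetype: restrictions. Operator support and exact syntax vary from one engine to another, so treat this as an assumption to verify against each engine's documentation. The function name build_query, the search terms, and the search.example address are all hypothetical.

```python
from urllib.parse import urlencode


def build_query(exact_phrases=(), any_terms=(), exclude_terms=(),
                site=None, filetype=None):
    """Assemble a search string from common operators (support varies by engine)."""
    parts = [f'"{phrase}"' for phrase in exact_phrases]      # exact phrase match
    if any_terms:
        parts.append("(" + " OR ".join(any_terms) + ")")     # any one of these terms
    parts += [f"-{term}" for term in exclude_terms]          # exclude these terms
    if site:
        parts.append(f"site:{site}")                         # restrict to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")                 # restrict to one file type
    return " ".join(parts)


query = build_query(
    exact_phrases=["Jane Doe"],
    any_terms=["director", "shareholder"],
    exclude_terms=["obituary"],
    site="example.com",
    filetype="pdf",
)
print(query)

# URL-encode the query for a hypothetical engine at search.example.
search_url = "https://search.example/search?" + urlencode({"q": query})
print(search_url)
```

The first line printed is "Jane Doe" (director OR shareholder) -obituary site:example.com filetype:pdf, which can be pasted directly into a search box. The same string can then be reused across several engines, adjusting only the operators a given engine does not recognize.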

In addition, most of these search engines offer an "advanced search" form that allows users to construct complex searches even without any prior knowledge of search operators.

Meta search engines attempt to capture the collective power of general search engines by running a search query on multiple engines simultaneously and then returning an aggregated or side-by-side list of results. These can be good time savers when doing an initial sweep of an issue; however, they usually will not allow complex search strings.
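The aggregation step can be sketched in a few lines. The example below merges ranked result lists from several engines using reciprocal rank fusion, a common general-purpose merging technique; it is offered only as an illustration and is not a description of how any particular meta search engine works. The engine names, URLs, and the constant k=60 (a conventional default for this technique) are assumptions, and no real engine is queried.

```python
from collections import defaultdict


def fuse_results(results_by_engine, k=60):
    """Merge ranked lists with reciprocal rank fusion.

    Each URL scores 1 / (k + rank) for every engine that returned it,
    so URLs ranked highly by several engines rise to the top.
    Returns a list of (url, engines_that_returned_it), best first.
    """
    scores = defaultdict(float)
    sources = defaultdict(list)
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / (k + rank)
            sources[url].append(engine)
    ranked = sorted(scores, key=lambda url: scores[url], reverse=True)
    return [(url, sources[url]) for url in ranked]


# Placeholder results for one query on two hypothetical engines.
results_by_engine = {
    "engine_a": ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
    "engine_b": ["https://example.com/b", "https://example.com/d", "https://example.com/a"],
}

fused = fuse_results(results_by_engine)
for url, engines in fused:
    print(url, "from", ", ".join(engines))
```

With these placeholder lists, the page ending in /b comes first because both engines rank it near the top, followed by /a, /d, and /c. Keeping track of which engines returned each result also preserves the side-by-side view mentioned above.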

Finally, vertical search tools are usually highly specialized websites that focus on one particular subject area, such as people search engines, reverse image search engines, flight and ship tracking sites, hobbyist sites and forums, and various social media search tools, to name but a few. These can be very useful resources: beyond their search capabilities, many such sites are operated and frequented by subject matter experts, who often participate in group discussions on issues of interest to their online community and are usually willing to answer questions and offer guidance.