Invisible Web: How Is The Deep Web Invisible To Search Engines?


Invisible Web

Search engines are, in a sense, the heartbeat of the internet; googling has become a part of
everyday speech and is even recognized by Merriam-Webster as a verb.
It's a common misconception, however, that googling a search term will reveal every site out
there that addresses your search. In fact, typical search engines like Google, Yahoo, or Bing
actually access only a tiny fraction (estimated at 0.03%) of the internet. The sites that
traditional searches yield are part of what's known as the Surface Web, which is composed of
indexed pages that a search engine's web crawlers are programmed to retrieve.
So where's the rest? The vast majority of the internet lies in the Deep Web, sometimes referred to
as the Invisible Web. The actual size of the Deep Web is impossible to measure, but many
experts estimate it is about 500 times the size of the web as we know it.
Deep Web pages operate just like any other site online, but they are constructed so that their
existence is invisible to Web crawlers.

How is the Deep Web Invisible to Search Engines?


Search engines like Google are extremely powerful and effective at distilling up-to-the-moment
Web content. What they lack, however, is the ability to index the vast amount of data that isn't
hyperlinked and is therefore not immediately accessible to a Web crawler. This may or may not be
intentional; for example, content behind a paywall and a blog post that's written but not yet
published both technically exist in the Deep Web.
Some examples of other Deep Web content include:
Data that needs to be accessed by a search interface
Results of database queries
Subscription-only information and other password-protected data
Pages that are not linked to by any other page
Technically limited content, such as that requiring CAPTCHA technology
Text content that exists outside of conventional http:// or https:// protocols
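
To make the crawler limitation concrete, here is a minimal sketch of a link-following crawl over a toy site. The page names and link structure are entirely hypothetical, and real crawlers are far more elaborate; the point is only that pages nobody links to, or that exist only as the result of a query, are never reached.

```python
from collections import deque

# Hypothetical toy site: each key is a page, each value is the list of
# hyperlinks found on that page. All names are made up for illustration.
LINKS = {
    "/home": ["/about", "/articles"],
    "/about": ["/home"],
    "/articles": ["/articles/1", "/search"],
    "/articles/1": [],
    "/search": [],               # a search form; results appear only after a query
    "/search?q=louvre": [],      # database query result, no page links to it
    "/drafts/unpublished": [],   # written but not linked from anywhere
    "/members/report": [],       # subscription-only, not linked publicly
}


def crawl(start, links):
    """Breadth-first crawl: return every page reachable by following hyperlinks."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


surface = crawl("/home", LINKS)   # what a link-following crawler can index
deep = set(LINKS) - surface       # pages that exist but are never discovered

print("Surface (reachable by links):", sorted(surface))
print("Deep (never reached):", sorted(deep))
```

In this toy model the search form itself is reachable, but the query result, the unpublished draft, and the members-only report all stay outside the crawl, mirroring several of the categories listed above.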

How to Access and Search for Invisible Content


If a site is inaccessible by conventional means, there are still ways to access the content, if not
the actual pages. Aside from software like Tor, a number of entities, such as universities and
research facilities, make it possible to view Deep Web content. For invisible content
that cannot or should not be publicly visible, there are still a number of ways to get access:
Join a professional or research association that provides access to records, research, and peer-reviewed journals.
Access a virtual private network via an employer.
Request access; this could be as simple as a free registration.
Pay for a subscription.
Use a suitable resource, such as an invisible Web directory, portal, or specialized search engine (for example, Google Book Search, the Librarians' Internet Index, or BrightPlanet's Complete Planet).
1.4.3 Invisible Web Search Tools
Here is a small sampling of invisible web search tools (directories, portals, engines) to help you
find invisible content. To see more like these, please see the article Research Beyond Google.
A List of Deep Web Search Engines: Purdue OWL's Resources to Search the Invisible Web
Art: Musée du Louvre
Books Online: The Online Books Page
Economic and Job Data: FreeLunch.com
Finance and Investing: Bankrate.com
General Research: GPO's Catalog of US Government Publications
Government Data: Copyright Records (LOCIS)
International: International Data Base (IDB)
Law and Politics: THOMAS (Library of Congress)
Library of Congress: Library of Congress
Medical and Health: PubMed
Transportation: FAA Flight Delay Information
