This work is licensed under a CC Attribution 3.0 Unported License [http://creativecommons.org/licenses/by/3.0/]
In his early vision of the Web, Tim Berners-Lee expected that most people would discover information by following hyperlinks, rather than by using keyword searches. Thus there is no search functionality built into the Web. Web search engines came later and had a profound effect on how we use and experience the Web. Now it is hard to imagine using the Web without search, a fact that has both technological and political implications.
The specific details of how web search engines work are closely guarded trade secrets. However, the basic ingredients of web search are shared by all search engines. These are:
via Drunk Men Work Here [http://drunkmenworkhere.org/]
robots.txt is the answer to this problem
robots.txt is a text file placed at the root of a given domain
robots.txt
robots.txt: they tell crawlers what is important
[robots.txt (1)]
rel="nofollow" attribute on an anchor (link) tag does this
nofollow attribute to links posted in comment threads
([nofollow (1)])
Search engines are like a TV camera crew let loose in the middle of a crowd of rowdy fans after a game. Seeing the camera, everyone acts boorishly and jostles to get in front. The act of observing something changes it.
Lee Gomes, "Our Columnist Creates Web 'Original Content' But Is in for a Surprise" [http://online.wsj.com/public/article/SB114116587424585798.html], Wall Street Journal, 2006
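The robots.txt and nofollow conventions mentioned above can be illustrated with a small sketch using only Python's standard library. It is not how any particular search engine works (those details are trade secrets, as noted earlier). The robots.txt rules, the page markup, the crawler name ExampleBot, and the helper names load_rules and LinkCollector are all assumptions made up for this example.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for an imaginary site (an assumption).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical page: one ordinary link and one link marked rel="nofollow".
PAGE_HTML = """
<p><a href="/about.html">About</a></p>
<p><a href="http://spam.example/" rel="nofollow">Cheap pills</a></p>
"""


def load_rules(robots_text):
    """Parse robots.txt text into a rule set that a crawler can query."""
    rules = RobotFileParser()
    rules.parse(robots_text.splitlines())
    return rules


class LinkCollector(HTMLParser):
    """Collect href values, keeping rel="nofollow" links in a separate list."""

    def __init__(self):
        super().__init__()
        self.followed = []
        self.nofollowed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, so split before testing.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollowed.append(href)
        else:
            self.followed.append(href)


# Ask whether a crawler calling itself "ExampleBot" may fetch two URLs.
rules = load_rules(ROBOTS_TXT)
print(rules.can_fetch("ExampleBot", "http://example.com/index.html"))
print(rules.can_fetch("ExampleBot", "http://example.com/private/notes.html"))

# Separate the links a crawler would treat normally from the nofollow ones.
collector = LinkCollector()
collector.feed(PAGE_HTML)
print(collector.followed)
print(collector.nofollowed)
```

With these assumed inputs, the first check permits the fetch and the second refuses it, and only /about.html ends up in the followed list. A real crawler would download robots.txt from the root of each domain rather than use an inline string. What it then does with nofollow links is a policy choice that this sketch leaves open.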