Wednesday, August 8, 2007

Why Aren't Alt Search Engines Crawling Websites?

Based on log file evidence from a friend who runs a personal website, Rich Skrenta claims that only 11 search startups are actually crawling the web. He wonders where all the alt search engines are. For some reason, Rich doesn't link to Charles Knight's Top 100 Alt Search Engine List in asking that question, but to Don Dodge's post linking to us. Nevertheless, this brings up some interesting questions: why are only a few of the hundreds of alternative search engines crawling? Are many of them using a licensed index? Are many of them using alternative ways to get their data?

AltSearchEngines editor Charles Knight has asked his many contacts for more information on this, so we will report back soon on the results. Meanwhile, Yakov from alt search engine Quintura (a sponsor of AltSearchEngines.com) says in a comment on Skrenta's post that "having its own index is a necessity for search startup". In another comment, Tailrank's Kevin Burton points out that some alt search engines have a limited scope: "Well with Spinn3r we only crawl blog content so we shouldn't show up on a historical site. I wonder if other crawlers/startups have similar limitations." Also, Rafael Cosentino says that his service Congoo uses feeds to gather content, so it doesn't need to crawl websites. FAROO, for its part, uses a special kind of distributed crawler that crawls "below the radar".

Rich Skrenta clarifies in a comment that he's talking about "web scale" search engines, not niche ones. Even so, it is indeed strange that only 11 crawlers showed up in his friend's website logs.
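
For readers who want to run the same check on their own logs, here is a rough sketch of how one might count distinct crawlers in a web server access log. It is not the method Skrenta's friend used, which the post does not describe. It assumes the log is in the common "combined" format with the user agent as the last quoted field, and the `BOT_HINTS` list and the sample log lines are hypothetical placeholders, not real crawler data.

```python
import re
from collections import Counter

# Assumed "combined" log format: request, status, size, referer, user agent.
LOG_LINE = re.compile(
    r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$'
)

# Hypothetical substrings that suggest a crawler; far from exhaustive.
BOT_HINTS = ("bot", "crawler", "spider", "slurp")


def count_crawlers(lines):
    """Count requests per crawler-like user agent string."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        agent = match.group("agent")
        if any(hint in agent.lower() for hint in BOT_HINTS):
            counts[agent] += 1
    return counts


# Made-up sample lines for illustration only.
sample = [
    '1.2.3.4 - - [08/Aug/2007:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '1.2.3.5 - - [08/Aug/2007:10:00:05 +0000] "GET /a HTTP/1.1" 200 128 "-" "Mozilla/5.0 (Windows; U)"',
    '1.2.3.4 - - [08/Aug/2007:10:01:00 +0000] "GET /b HTTP/1.1" 200 256 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '1.2.3.6 - - [08/Aug/2007:10:02:00 +0000] "GET / HTTP/1.1" 200 512 "-" "SampleSpider/0.3"',
]

crawlers = count_crawlers(sample)
print(len(crawlers), "distinct crawler user agents")
```

A count like this only catches crawlers that identify themselves in the user agent string, which may be one reason a crawler working "below the radar" would never appear in such a tally.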

Do R/WW readers have any more information about this?

Image: changturtle
