Crawler-Based Search Engines
Google, Yahoo and MSN are all crawler-based search engines. They are called crawler-based because they create their listings automatically by "crawling" or "spidering" the web.
Crawler-based search engines have three major elements.
- First is the spider or crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. The spider returns to the site on a regular basis, such as every month or two, to look for changes.
- Everything the spider finds goes into the second part of the search engine, the index. Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is added to the index it is not available to those searching with the search engine.
- Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search, then ranks those matches in the order it judges most relevant.
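To make the three parts concrete, here is a minimal sketch in Python. It is an illustration only, not how any real search engine is built: the `PAGES` dictionary is a hypothetical in-memory stand-in for a website (a real spider would fetch pages over the network), the function names `crawl`, `build_index` and `search` are made up for this example, and ranking by simple word counts is an assumed stand-in for a real relevance formula.

```python
from collections import deque

# Hypothetical in-memory "site": URL -> (page text, outgoing links).
# A real spider would download pages over HTTP instead.
PAGES = {
    "/home": ("welcome to our garden shop", ["/roses", "/tools"]),
    "/roses": ("red roses and white roses for every garden", ["/home"]),
    "/tools": ("garden tools and gloves", ["/home", "/roses"]),
}

def crawl(start):
    """Spider: visit a page, read it, then follow its links to other pages."""
    seen, queue, fetched = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        text, links = PAGES[url]
        fetched[url] = text
        for link in links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched

def build_index(fetched):
    """Index: record, for each word, which pages contain it and how often."""
    index = {}
    for url, text in fetched.items():
        for word in text.split():
            counts = index.setdefault(word, {})
            counts[url] = counts.get(url, 0) + 1
    return index

def search(index, query):
    """Search software: find matching pages and rank them by a simple score."""
    scores = {}
    for word in query.lower().split():
        for url, count in index.get(word, {}).items():
            scores[url] = scores.get(url, 0) + count
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda u: (-scores[u], u))

index = build_index(crawl("/home"))
print(search(index, "garden roses"))  # ['/roses', '/home', '/tools']
```

Note that `search` only sees what is in `index`. A page that `crawl` has fetched but that has not yet been passed through `build_index` cannot appear in results, which mirrors the "spidered" but not yet "indexed" situation described above.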