Sunday, May 27, 2007

How Do Search Engines Work?

It is impossible to understand SEO (Search Engine Optimization) without some knowledge of search engines themselves. The major search engines are Google, Yahoo, MSN, AOL, Altavista, ASK and Hotbot. Google holds a dominant position in the industry: more than 80% of internet users use Google for their searches. Google favors websites that have rich content and a clear structure.

What a “Spider” Does

The first thing you need to understand is what a search engine “spider” is and how it works. A spider (also called a “robot” or “crawler”) is software that a search engine uses to find out what is on the web. There are various types of spiders, each helping to keep the search engine’s index as up-to-date and accurate as possible.

The first type of spider is the one that actually “crawls” the web looking for websites and pages. This program starts at a website, loads its pages, and follows the hyperlinks on each page. In this way, everything on the web that can be reached through links will eventually be found, as the spider crawls from one page to another. Search engines run anywhere from dozens to hundreds of copies of their web-crawling spider programs simultaneously, on multiple servers.
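The load-and-follow loop described above can be sketched in a few lines of Python. This is only an illustration, not how any particular search engine does it: the names crawl, LinkExtractor and FAKE_WEB are made up for this example, and the "web" is a small in-memory dictionary so the code runs without a network connection.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch):
    """Load a page, queue the links found on it, and repeat.

    `fetch` is a hypothetical callable that returns a page's HTML as a
    string, or None when the page cannot be loaded.
    """
    seen = {start_url}
    queue = deque([start_url])
    pages = {}  # stands in for the search engine's database
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            link = urljoin(url, href)  # resolve relative links
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages


# A tiny in-memory "web" (assumed data, for illustration only).
FAKE_WEB = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/news.html": '<a href="/story.html">Story</a>',
    "http://example.com/story.html": "No links here.",
}

found = crawl("http://example.com/", FAKE_WEB.get)
print(sorted(found))
```

Starting from the home page, the sketch reaches all four pages, including story.html, which is only linked from news.html. The seen set keeps the spider from loading the same page twice when pages link back to each other.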

When a “crawler” visits your home page, it loads the page’s contents into a database. Broadly, search engine crawling can be divided into two types.

1. Shallow Crawling
2. Deep Crawling

1. Shallow Crawling: With this technique, the search engine indexes only the home page and a few pages that are linked from the home page of the website.

2. Deep Crawling: Today, most search engines follow this technique. Here the crawler follows each link from your home page, loads the linked page into the database, and continues in the same way, getting successively deeper into your website.
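One way to picture the difference between the two techniques is as a limit on how many links the crawler follows away from the home page. The Python sketch below is an assumption made for illustration: the function crawl_to_depth, its max_depth parameter and the SITE_LINKS data are invented here, and real search engines decide crawl depth in ways this post does not cover.

```python
from collections import deque


def crawl_to_depth(home, links, max_depth=None):
    """Return each page reached from `home` with its distance in links.

    max_depth=1 approximates a shallow crawl (the home page plus pages it
    links to); max_depth=None approximates a deep crawl (keep following
    links until no new pages turn up).
    """
    depth_of = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        if max_depth is not None and depth_of[page] >= max_depth:
            continue  # do not follow links from pages at the depth limit
        for target in links.get(page, []):
            if target not in depth_of:
                depth_of[target] = depth_of[page] + 1
                queue.append(target)
    return depth_of


# Assumed link structure of a small website: page -> pages it links to.
SITE_LINKS = {
    "/": ["/products", "/contact"],
    "/products": ["/products/widget"],
    "/products/widget": ["/products/widget/specs"],
}

shallow = crawl_to_depth("/", SITE_LINKS, max_depth=1)
deep = crawl_to_depth("/", SITE_LINKS)
print(sorted(shallow))
print(sorted(deep))
```

With max_depth=1 only the home page, /products and /contact are collected. Without a limit, the crawl also reaches /products/widget and /products/widget/specs, which sit two and three links away from the home page.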

The “404 spotter” is another type of spider. Search engines use it to find websites that no longer exist online. It checks the search engine index page by page and tries to load each page. If a page is not found, the web server returns a “404 error”, which means the page does not exist or is not online. When the spider cannot find a web page, it deletes that page from the search engine index, so keep this in mind when selecting a web server to host your website. Your website must be online 24X7. If your website is down at the wrong time, it may get deleted from the search engine index, and who knows how long it will take to be indexed again.
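The page-by-page check described above might look roughly like the Python sketch below. It is a simplified illustration under stated assumptions: remove_dead_pages and get_status are hypothetical names, the status codes come from a hard-coded dictionary instead of real web requests, and an actual search engine would probably re-check a page several times before dropping it.

```python
def remove_dead_pages(index, get_status):
    """Delete every indexed URL whose server answers with 404.

    `index` maps URL -> stored page content. `get_status` is a hypothetical
    callable that returns the HTTP status code for a URL.
    """
    removed = []
    for url in list(index):  # copy the keys, since we delete while looping
        if get_status(url) == 404:
            del index[url]
            removed.append(url)
    return removed


# Assumed status codes, standing in for real requests to a web server.
FAKE_STATUS = {
    "http://example.com/": 200,
    "http://example.com/old-page.html": 404,
    "http://example.com/news.html": 200,
}

index = {url: "stored page text" for url in FAKE_STATUS}
removed = remove_dead_pages(index, FAKE_STATUS.get)
print(removed)
print(sorted(index))
```

Here only old-page.html is removed from the index, while the two pages that answer with status 200 stay in it.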
