How Search Engines Work

75-C

Internet search engines are designed to help people find information stored on web pages. They perform four basic tasks:

* They search the Internet based on important words.
* They keep an index of the words they find.
* They allow people to search for words that they want additional information about.
* They present results with the most relevant pages ranked highest in the Search Engine Results Pages (SERPs).

 

 

Finding Content

Before a search engine can tell you where a file or document is, it must find it. To find information on the billions of Web pages that exist, a search engine employs software robots, called spiders or crawlers, to build lists of the words found on Web sites. The action of these software tools building lists is called Web crawling.

Spiders take a Web page's content and create key search words that enable online users to find pages they're looking for.

For example, when the Google spider looked at an HTML page, it took note of two things:

* The words within the page
* Where the words were found

Words occurring in the title, subtitles, meta tags and other positions of relative importance were noted for special consideration during a subsequent user search. The Google spider was built to index every significant word on a page, leaving out the articles "a," "an" and "the."
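The idea of noting both the words and where they were found can be sketched in a few lines of Python. This is only an illustration, not Google's actual crawler: the names `WordLocator` and `locate_words`, the location labels, and the choice of tags treated as important are all assumptions made for the example. It also ignores details a real spider must handle, such as script and style blocks.

```python
from html.parser import HTMLParser
import re

ARTICLES = {"a", "an", "the"}  # the words the text says were left out
# Assumed mapping from HTML tags to "where the word was found" labels.
LOCATIONS = {"title": "title", "h1": "heading", "h2": "heading", "h3": "heading"}


class WordLocator(HTMLParser):
    """Collects (word, location) pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # important tags currently open
        self.found = []      # (word, location) pairs in page order

    def handle_starttag(self, tag, attrs):
        if tag in LOCATIONS:
            self.open_tags.append(tag)
        elif tag == "meta":
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            if name in ("keywords", "description"):
                self._record(attributes.get("content") or "", "meta")

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()

    def handle_data(self, data):
        # Text outside any important tag counts as ordinary body text.
        location = LOCATIONS[self.open_tags[-1]] if self.open_tags else "body"
        self._record(data, location)

    def _record(self, text, location):
        for word in re.findall(r"[a-z0-9']+", text.lower()):
            if word not in ARTICLES:
                self.found.append((word, location))


def locate_words(html):
    """Return every significant word on the page with its location label."""
    parser = WordLocator()
    parser.feed(html)
    parser.close()
    return parser.found
```

Calling `locate_words` on a page returns a list such as `("spider", "title")` or `("spider", "body")`, which is the raw material a search engine could later weight and index.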

Other spiders take different approaches. These different approaches usually attempt to make the spider operate faster, allow users to search more efficiently, or both. For example, some spiders will keep track of the words in the title, sub-headings and links, along with the 100 most frequently used words on the page and each word in the first 20 lines of text. Lycos is said to be an example of this approach.
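A selective approach like this one is easy to sketch. The Python below is a rough illustration only: the function name `selective_terms` and its parameters are made up for the example, it works on plain text, and it covers just the "most frequent words" and "first lines" parts of the description (title, sub-heading and link words are left out).

```python
from collections import Counter
import re


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def selective_terms(text, top_n=100, first_lines=20):
    """Return the set of words a selective spider might keep for a page."""
    # The top_n most frequently used words anywhere on the page.
    frequent = {word for word, _ in Counter(tokenize(text)).most_common(top_n)}
    # Every word that appears in the first few lines of text.
    leading = set(tokenize("\n".join(text.splitlines()[:first_lines])))
    return frequent | leading
```

With the defaults, a short page keeps nearly all of its words, while a long page is trimmed to its most common words plus its opening lines. That trade-off is what lets this kind of spider run faster than one that indexes everything.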

Other systems, such as AltaVista, go in the other direction, indexing every single word on a page, including "a," "an," "the" and other "insignificant" words. The push to completeness in this approach is matched by other systems in the attention given to the unseen portion of the Web page, the meta tags.

Meta tags are portions of the web page that the casual visitor cannot see but that provide information on page structure and content to web browsers and search engines. They allow the owner of a page to specify key words and concepts under which the page may be indexed. The importance of some meta tags has been devalued because of META spamming (stuffing the tags with misleading or irrelevant keywords); however, they still provide some value when used properly.

 

 


 

Building the Index
Once the spiders have completed the task of finding information on Web pages (the continuous change inherent in the Web means that the spiders are always crawling), the search engine must store the information in a useful manner. There are two key components involved in doing this:

* The information stored with the crawled words
* The method by which the information is indexed

Most search engines store more than just the word and URL. An engine might store the number of times that the word appears on a page. The engine might assign a weight to each entry, with increasing values assigned to words as they appear near the top of the document, in sub-headings, in links, in the meta tags or in the title of the page. Each commercial search engine has a different formula for assigning weight to the words in its index. This is one of the reasons that a search for the same word on different search engines will produce different lists, with the pages presented in different order.
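To make the idea of weighting concrete, here is a minimal Python sketch. The weight values and the names `LOCATION_WEIGHTS` and `build_entries` are invented for illustration; every real search engine uses its own formula, and none of them publish it.

```python
from collections import defaultdict

# Made-up weights: words in prominent places count for more.
LOCATION_WEIGHTS = {
    "title": 5.0,
    "heading": 3.0,
    "link": 2.0,
    "meta": 2.0,
    "body": 1.0,
}


def build_entries(url, occurrences):
    """Summarize (word, location) pairs for one page.

    Returns {word: {"url": ..., "count": ..., "weight": ...}}.
    """
    entries = defaultdict(lambda: {"url": url, "count": 0, "weight": 0.0})
    for word, location in occurrences:
        entry = entries[word]
        entry["count"] += 1  # how many times the word appears on the page
        entry["weight"] += LOCATION_WEIGHTS.get(location, 1.0)
    return dict(entries)
```

A word that appears once in the title and once in the body ends up with a count of 2 and a weight of 6.0 under these made-up numbers. Change the numbers and the ordering of results changes too, which is why two engines can rank the same pages differently.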

Regardless of the precise combination of additional pieces of information stored by a search engine, the data will be encoded to save storage space. For example, the original Google paper describes using 2 bytes (16 bits) to store weighting information for each occurrence of a word: whether the word was capitalized, its font size, its position, and other information to help in ranking the hit. In the simplest case described there, capitalization takes 1 bit, font size 3 bits, and position 12 bits. As a result, a great deal of information can be stored in a very compact form. After the information is compacted, it's ready for indexing.
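Packing several small facts into 2 bytes is done with bit shifting. The Python sketch below assumes a layout of 1 bit for capitalization, 3 bits for font size and 12 bits for position, which matches the simplest hit format in the original Google paper as far as field widths go; the order of the fields and the function names `pack_hit` and `unpack_hit` are assumptions for the example.

```python
def pack_hit(capitalized, font_size, position):
    """Pack one word occurrence into a 16-bit integer.

    Assumed layout: [1 bit capitalized][3 bits font size][12 bits position]
    """
    if not 0 <= font_size < 8:
        raise ValueError("font_size must fit in 3 bits (0-7)")
    if not 0 <= position < 4096:
        raise ValueError("position must fit in 12 bits (0-4095)")
    return (int(bool(capitalized)) << 15) | (font_size << 12) | position


def unpack_hit(hit):
    """Recover (capitalized, font_size, position) from a packed hit."""
    capitalized = bool((hit >> 15) & 0b1)
    font_size = (hit >> 12) & 0b111
    position = hit & 0xFFF
    return capitalized, font_size, position
```

For example, `pack_hit(True, 3, 100)` gives a single number that fits in 2 bytes, and `unpack_hit` turns it back into `(True, 3, 100)`. Storing three separate values the ordinary way would take several times the space.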

An index has a single purpose: It allows information to be found as quickly as possible. There are quite a few ways for an index to be built, but one of the most effective ways is to build a hash table. In hashing, a formula is applied to attach a numerical value to each word. The combination of efficient indexing and effective storage makes it possible to get results quickly, even when the user creates a complicated search.
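Here is a small Python sketch of a hash-table index. It is a teaching example under stated assumptions, not how any particular search engine does it: the class name `HashIndex`, the use of `zlib.crc32` as the hashing formula, and the bucket count are all choices made for illustration.

```python
import zlib


class HashIndex:
    """A tiny word -> pages index built on a hand-rolled hash table."""

    def __init__(self, num_buckets=1024):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, word):
        # The "formula": turn the word into a number, then into a bucket.
        number = zlib.crc32(word.encode("utf-8"))
        return self.buckets[number % len(self.buckets)]

    def add(self, word, url):
        bucket = self._bucket(word)
        for entry_word, urls in bucket:
            if entry_word == word:
                urls.add(url)
                return
        bucket.append((word, {url}))  # first time this word is seen

    def lookup(self, word):
        """Return the set of pages containing the word."""
        for entry_word, urls in self._bucket(word):
            if entry_word == word:
                return set(urls)
        return set()

    def search(self, *words):
        """Return the pages that contain every one of the given words."""
        results = [self.lookup(word) for word in words]
        return set.intersection(*results) if results else set()
```

Looking up a word means computing one number and checking one small bucket instead of scanning the whole list of words, which is why results can come back quickly even for a multi-word search.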

Building a Search
This is the process of opening a search engine page and entering a keyword or keyphrase into the search bar, e.g. using Google to find relevant pages for the term "Mississauga SEO". We've all done this, so there is no point in dwelling on it here.

Presenting the Information
Everything described above leads to this point: the SERP (Search Engine Results Page). As you already know, since you are reading this, being at the top of page 1 of the results on any search engine is MUCH better than being on page 2.
How do the search engines determine which pages go on page 1 and which get consigned to ranking hell on page 2 and below? No one who DOESN'T work for a search engine company can give you an exact answer, BUT we can make some pretty shrewd guesses based on experience:

  1. Content, content, content
    Relevant, high-value, well-written, unique content is the MAJOR determining factor in achieving high rankings for competitive search terms.
  2. Links, links, links
    Once you've put up some content worth linking to, incoming links from relevant, authoritative sites are the other major factor in determining your page's position in the SERPs. Links count most heavily in the Google SERPs.

An SE that serves up crap pages in response to search queries won't be around for long. Hence their focus on well written, keyword relevant content when returning search results.

Mississauga Search Engine Marketing | Search engine ranking improvement © 75-c.com 2008 All Rights Reserved