16 September 2015

How Search Engine Works ? Case Study


How Search Engines Work: How They Crawl And Index Data

Search engines are goldmines from which we can collect almost any information we want. But do you know how these search engines work?

Most probably the answer is no. Tons of search queries are made every hour. Searching the internet looks simple from the outside, but it is not so simple on the inside. I have written this tutorial from my own experience and experimental study, and it should give you a deeper understanding of the topic.


What Are Search Engines?



Lol, this is a really funny question and most of us already know the answer. Still, a short introduction is useful, especially for newbies to the Internet world.

According to Wikipedia, "A web search engine is a software system that is designed to search for information on the World Wide Web. The information may be a mix of web pages, images, and other types of files." Learn more on Wikipedia.

There are a number of search engines available, of which four are the most popular: Google, Baidu, Bing and Yahoo. Google has long held the #1 rank and accounts for around 70 percent of market share. Coming back to the point, I recommend using Google as your primary choice.

Search engines are two-sided. On one side, they interact with the users who search on them. On the other side, they have to collect, index and arrange data from the different websites available on the internet.


Search Engine Strategy: How Does It Work?


You may find it interesting to see how a search engine goes about its work. In this section I will explain how it collects data and presents it for a particular query.

When a new website is created, the site owner usually submits it to search engines through their webmaster tools. After submission, the search engine sends a web crawler (or spider) to crawl the website's data. Google's well-known web crawler is "Googlebot" (for desktop) and "Googlebot-Mobile" (for smartphones). These spiders are sent each time a new update is found in a website's sitemap, and they are smart enough to detect the type of data and its quality.

A sitemap is simply an XML page on a website that tells search engines about new updates. You can usually check a website's sitemap by adding "/sitemap.xml" to the end of its web address. Note: this may not work for some websites, because they keep their sitemap at a different location on the server.

One more important term here is the robots.txt file. It is a simple text file that instructs web crawlers or robots on how to crawl and index the site. Using robots.txt, the site owner can block links that they do not want to appear in results. On most websites this file is publicly available, and you can check it by adding "/robots.txt" to the end of the web address.
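To make the robots.txt idea concrete, here is a minimal Python sketch. The site address and the robots.txt contents are made up for illustration, and the helper name is_allowed is my own. It builds the "/robots.txt" and "/sitemap.xml" addresses described above and uses the standard library's urllib.robotparser to check which paths a crawler may fetch. Nothing is downloaded; the rules are parsed from a sample string.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

# Hypothetical site used only for illustration.
SITE = "https://www.example.com/"

# The conventional locations mentioned in the text.
robots_url = urljoin(SITE, "/robots.txt")
sitemap_url = urljoin(SITE, "/sitemap.xml")

# Made-up robots.txt contents: block everything under /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())


def is_allowed(path, user_agent="Googlebot"):
    """Return True if the sample rules let user_agent fetch the given path."""
    return parser.can_fetch(user_agent, urljoin(SITE, path))


print(robots_url)
print(sitemap_url)
print(is_allowed("/blog/post.html"))
print(is_allowed("/private/page.html"))
```

A well-behaved crawler runs a check like this before requesting a page. Keep in mind that robots.txt is a request to crawlers, not a security barrier.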

What Are Web Crawlers Or Bot Robots?


Web crawlers, or bots, are automated programs that search engines use to crawl website content. Think about it this way: you are the class monitor, and your teacher tells you to collect the project work from all students and bring it back. In this example, the teacher is the search engine. You, as class monitor, are the bot (or web crawler or spider), and the project work to be submitted is the content that has to be crawled and indexed.

Crawling and indexing are two different terms. Crawling refers to the searching: whenever a web spider looks through a website's content, that is crawling. Indexing is what happens afterwards, when the spider saves data to its database based on what it found while crawling.
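The difference between crawling and indexing can be shown with a toy Python sketch. Everything in it is invented for illustration: the three-page "web" is a dictionary in memory, and the function names crawl and build_index are my own. A real crawler fetches pages over the network and is far more complex.

```python
from collections import deque

# A made-up, in-memory "web": address -> (page text, links found on the page).
TOY_WEB = {
    "/home": ("buy shoes online", ["/nike", "/about"]),
    "/nike": ("nike shoes online store", ["/home"]),
    "/about": ("about our company", []),
}


def crawl(start):
    """Crawling: follow links breadth-first and return pages in visit order."""
    seen = {start}
    queue = deque([start])
    visited = []
    while queue:
        url = queue.popleft()
        visited.append(url)
        _, links = TOY_WEB[url]
        for link in links:
            # Skip unknown pages and pages that are already queued or visited.
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


def build_index(urls):
    """Indexing: store, for every word, the set of pages that contain it."""
    index = {}
    for url in urls:
        text, _ = TOY_WEB[url]
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


pages = crawl("/home")
index = build_index(pages)
print(pages)
print(sorted(index["shoes"]))
```

Here crawl plays the class monitor collecting the work, and build_index is the step where the collected work is filed away so it can be looked up later.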

How Do Search Engines Crawl And Index Content?



So far, you know how a web crawler is sent to a website when a new update is found. The next question is how this web crawler crawls and indexes the content. The answer lies in keywords.

What are keywords?



Keywords are the words or phrases that convey the main meaning of the content. When web crawlers reach a website, they look for these keywords, which are generally found in the title tag, meta description, anchor texts, alt tags and the body content. On the basis of these keywords, the data is indexed and later shown in the SERP (Search Engine Result Page).

Still don't get what keywords are? OK, let's take an example. If you search for "online shoes", then "online" and "shoes" are the 2 keywords that the search engine looks up in its database, and it presents the posts that best match both "online" and "shoes". Now if you search for "online shoes nike", there will be 3 keywords. This time the search engine narrows its search and presents only the posts that best match "online", "shoes" and "nike". However, you are not guaranteed a perfect result. It may be a mix of images, posts, videos and so on, whichever is relevant to the query.

Example Of Keywords

In the picture above, you can see the Amazon post drop in rank when I changed the keywords from "online shoes" to "online shoes nike".
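The narrowing effect of an extra keyword can be imitated with a few lines of Python. This is only a rough sketch under my own assumptions: the three documents are invented, the function name search is my own, and requiring every keyword to appear is a simplification. Real search engines rank results by relevance instead of filtering them this strictly.

```python
# Made-up documents: name -> page text.
DOCS = {
    "amazon-shoes": "buy shoes online at great prices",
    "nike-store": "nike shoes online official store",
    "shoe-care": "how to clean your shoes",
}


def search(query):
    """Return the names of documents that contain every keyword in the query."""
    keywords = query.lower().split()
    matches = []
    for name, text in DOCS.items():
        words = set(text.lower().split())
        if all(keyword in words for keyword in keywords):
            matches.append(name)
    return sorted(matches)


print(search("online shoes"))
print(search("online shoes nike"))
```

With two keywords, both shop pages match. Adding "nike" leaves only the page that mentions all three words, which mirrors how a more specific query changes the result list.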

How Data Is Represented In Search Results


Now you may be wondering how a search engine arranges the data in its results. The basic idea is simple: posts with high-quality, unique content rank better in the SERP (search engine result page). However, ranking also depends on site design (how the website uses heading tags, navigation and so on), keyword density, meta description, title tag and many other factors.

Keyword density may be a new term for you. It is the number of occurrences of a single keyword in a post, expressed as a percentage. The math is simple. Just as you would work out the percentage of your marks, first count the number of times the keyword occurs, then divide it by the total number of words in the post and multiply by 100. A keyword density of around 2% is often considered best for SEO ranking. In the example above, posts with a keyword density of around 2% for "online" and "shoes" have a better chance of ranking well. Using a keyword over and over in a web page is known as keyword stuffing, and search engines hate such pages.

Title Tag is the title of the web page. For example, the title of this page is "How Search Engine Works ? Case Study". The title is one of the biggest factors with a direct effect on page ranking, and it should contain the major keywords that describe the post.

Meta Description is an HTML meta tag that appears as a short snippet below the title in search results. It should contain an eye-catching description that highlights the uniqueness of the content.
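The keyword density calculation described above can be written as a short Python function. This is a minimal sketch: the function name keyword_density and the sample sentence are my own, and splitting the text into words with a simple regular expression is an assumption, since different SEO tools count words in slightly different ways.

```python
import re


def keyword_density(text, keyword):
    """Return how often keyword occurs in text, as a percentage of all words."""
    # Lowercase everything and keep runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    occurrences = words.count(keyword.lower())
    return occurrences * 100 / len(words)


# Made-up sample text with 10 words, 2 of which are "shoes".
sample = "Our online store sells running shoes and walking shoes daily"
print(keyword_density(sample, "shoes"))
```

In this tiny sample "shoes" makes up 2 of 10 words, far above the roughly 2% mentioned above, which is the kind of page a search engine could treat as keyword stuffing.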
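To show where a crawler finds the title tag and meta description, here is a small Python sketch built on the standard library's html.parser module. The class name HeadExtractor is my own, and the sample page, including its description text, is invented for illustration. It is not how any particular search engine reads pages.

```python
from html.parser import HTMLParser


class HeadExtractor(HTMLParser):
    """Collect the <title> text and the meta description of an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text that sits between <title> and </title> is kept.
        if self._in_title:
            self.title += data


# Made-up page; the description is a placeholder, not this post's real one.
SAMPLE_HTML = """<html><head>
<title>How Search Engine Works ? Case Study</title>
<meta name="description" content="A beginner's look at crawling, indexing and ranking.">
</head><body><h1>Hello</h1></body></html>"""

extractor = HeadExtractor()
extractor.feed(SAMPLE_HTML)
print(extractor.title)
print(extractor.description)
```

These two values are what usually show up as the clickable headline and the short snippet of a search result.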


Search Result Preview

The picture above shows how a web page preview in the results is made up of the title tag and the meta description. I also found a video that may be helpful to you, in which Matt Cutts, head of Google's webspam team, explains how Google search works. Have a watch!


Final Words


So this is all you need to know about how search engines work. To wrap up, we can say that search engines connect users directly with the website content on the internet that is relevant to their queries. I have covered all the basic terms here. However, if you have any problem or want to suggest something on this topic, please let me know.
