Search Engine Industry

As the Internet grew and became an integral part of day-to-day work, it became almost impossible for a user to find exact or relevant information on such a huge web. Hence 'search engines' were developed. Search engines have become so popular that more than 80% of website visitors now arrive through them. But what exactly is a search engine? According to Webopedia, "Search Engine is a program that searches documents for specified keywords and returns a list of the documents where the keywords were found".

For example, if you want to know about the automobile market in India, you might type keywords such as automotive market, automobiles in India, or automobile manufacturers in India. Once you click the search button, you get the most relevant data related to those keywords.

The search engine industry is dominated by three major players, Google, Yahoo! and MSN, with market shares of 35%, 32% and 16% respectively (according to a searchenginewatch.com survey, 2004). A report released in March 2005 indicated that search engines are used between 2 and 3.5 billion times per day to find information online. People using search engines are hunting for specific information, so the audience is highly targeted. Many other search engines, such as AskJeeves, AOL, and Excite, are also well known among net users. What is most striking, however, is the statistics on search engine usage.

Using search engines is a top online activity, and netizens increasingly feel they get the information they want when they run search queries.

On the probable eve of Google's initial public offering, new surveys and traffic data confirm that search engines have become an essential and popular way for people to find information online. A nationwide phone survey of 1,399 Internet users between May 14 and June 17 by the Pew Internet & American Life Project shows:

  • 84% of internet users have used search engines. On any given day online, more than half those using the Internet use search engines. And more than two-thirds of Internet users say they use search engines at least a couple of times per week.
  • The use of search engines usually ranks second only to email as the most popular activity online. During periods when major news stories are breaking, getting news online usually surpasses the use of search engines.
  • There is a substantial payoff as search engines improve and people become more adept at using them. Some 87% of search engine users say they find the information they want most of the time when they use search engines.
  • The convenience and effectiveness of the search experience solidifies its appeal. Some 44% say that most times they search they are looking for vital information they absolutely need.

comScore Networks' tracking of Internet use shows that among the top 25 search engines:

  • Americans conducted 3.9 billion total searches in June.
  • 44% of those searches were done from home computers, 49% were done from work computers, and 7% were done at university-based computers.
  • The average Internet user performed 33 searches in June.
  • The average visit to a search engine resulted in 4.4 searches.
  • The average visitor scrolled through 1.8 result pages during a typical search.
  • In June, the average user spent 41 minutes at search engine sites.
  • comScore estimates that 40-45 percent of searches include sponsored results.
  • Approximately 7 percent of searches in March included a local modifier, such as city and state names, phone numbers or the words "map" or "directions."
  • The percentage of searches that occurred through browser toolbars in June was 7%.

A sponsored result is a listing on a search engine results page (SERP) that is achieved through outbidding competitors (as in PPC). The term is sometimes also used to refer to keyword-targeted advertisements, where the advertiser pays the search engine a fixed amount to have its ad shown on the SERP for a specific keyword.

 

Search Engines Market Share:

Voted Most Outstanding Search Engine four times, Google is the undisputed market leader of the search engine industry. Google is a crawler-based search engine, known for providing both comprehensive coverage of web pages and highly relevant results. It attracts the largest number of searches, up to 250 million every day.

Yahoo! is the second largest player in the industry, with a 32% market share. Yahoo! started as a human-compiled directory but turned into a crawler-based search engine in 2002. Until early 2004 it was powered by Google; after that, it began using its own technology.

Overture stands next to Google in terms of number of searches per day. It is owned by Yahoo! and attracts more than 167 million searches per day. Overture was the first search engine to introduce a PPC program.

AskJeeves initially gained fame in 1998 and 1999 as the "natural language" search engine that let you search by asking questions and responded with what seemed to be the right answer to everything. When launched, it was run by around 100 editors who monitored search logs. Today, however, AskJeeves depends on crawler-based technology to provide results to its users.

 
Search Engine History:

Though Google is responsible for where search engines stand today, the search engine itself was invented long before Google was incorporated.

Alan Emtage, a student at McGill University, created the first search engine in 1990 and named it 'Archie'. Back then there was no World Wide Web; FTP was the means of sharing data. It was effective in smaller groups, but the data became as fragmented as it was collected. Archie helped solve this data scatter problem by combining a script-based data gatherer with a regular expression matcher for retrieving file names matching a user query.
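
The matching step described above can be illustrated with a short Python sketch. It is not Archie's actual code: the file listing, host names, and the find_files helper are all hypothetical, and the gatherer is assumed to have already produced a list of (host, path) pairs.

```python
import re

# Hypothetical listing produced by a gatherer script (illustrative data only).
GATHERED_FILES = [
    ("ftp.example.edu", "/pub/tools/gzip-1.2.tar"),
    ("ftp.example.org", "/pub/docs/readme.txt"),
    ("ftp.example.edu", "/pub/tools/zip-2.0.tar"),
]


def find_files(pattern, listing=GATHERED_FILES):
    """Return (host, path) pairs whose file name matches the regular expression."""
    regex = re.compile(pattern)
    matches = []
    for host, path in listing:
        filename = path.rsplit("/", 1)[-1]  # match on the file name only
        if regex.search(filename):
            matches.append((host, path))
    return matches


# Find every gathered file whose name ends in ".tar".
print(find_files(r"\.tar$"))
```

Matching only on the file name, rather than on file contents, mirrors the limitation noted in the text: the index knows what files are called, not what they contain.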

Directory. A categorized collection of links to the web, usually compiled manually. Directories can either be general (covering the entire web), like ODP (Open Directory Project), or topical, like the Dotcom Directory. Although they cannot rival search engines for index size, they generally do offer higher quality search results, arrived at through some editorial selection process.

Pay-Per-Click. An advertising payment model where the advertiser pays only when the advertisement is actually clicked. In other words, the advertiser literally pays only for visitors rather than per advertisement impression.

 

Essentially, Archie became a database of filenames, which it would match against users' queries.

Archie became so popular that in 1993 the University of Nevada System Computing Services group developed 'Veronica'. Veronica served the same purpose as Archie, but it worked on plain text files. Soon another user interface named 'Jughead' appeared with the same purpose as Veronica; both of these were used for files sent via Gopher, which was created as an Archie alternative by Mark McCahill at the University of Minnesota in 1991.

The next challenge was to automate the process, and so the first internet robot was introduced. Computer robots are simply programs that automate repetitive tasks at speeds impossible for humans to reproduce. In June 1993 Matthew Gray introduced the World Wide Web Wanderer. He initially wanted to measure the growth of the web and created this bot to count active web servers. He soon upgraded the bot to capture actual URLs.

His database became known as the Wandex. The Wanderer was as much of a problem as it was a solution because it caused system lag by accessing the same page hundreds of times a day.

By December 1993, three full-fledged bot-fed search engines had surfaced on the web: JumpStation, the World Wide Web Worm, and the Repository-Based Software Engineering (RBSE) spider. JumpStation gathered information about the title and header from web pages and retrieved these using a simple linear search. As the web grew, JumpStation slowed to a stop. The WWW Worm indexed titles and URLs. The problem with JumpStation and the World Wide Web Worm was that they listed results in the order in which they found them, with no discrimination. The RBSE spider did implement a ranking system.

Brian Pinkerton of the University of Washington released WebCrawler on April 20, 1994. It was the first crawler that indexed entire pages. It soon became so popular that it could not be used during daytime hours. AOL eventually purchased WebCrawler and ran it on its network. Then in 1997, Excite bought out WebCrawler, and AOL began using Excite to power its NetFind. WebCrawler opened the door for many other services to follow suit. Within a year of its debut came Lycos, Infoseek, and OpenText.

In 1998 Google, the last of the current search superpowers and the most powerful to date, was launched. It ranked pages using the concept of implied value from inbound links. This makes the web somewhat democratic, as each outgoing link is a vote. Google has become so popular that major portals such as AOL and Yahoo! have used it, allowing its search technology to take the lion's share of web searches. MSN Search, the Open Directory, and Direct Hit were also launched in 1998.
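
The "each link is a vote" idea can be sketched in a few lines of Python. This is a deliberately simplified tally, not Google's actual ranking algorithm, which the text does not describe in detail; the link graph, page names, and function names below are hypothetical.

```python
from collections import Counter

# Hypothetical link graph: each page maps to the pages it links out to.
LINKS = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}


def count_inbound_links(links):
    """Count one vote for a page per distinct page that links to it."""
    votes = Counter()
    for source, targets in links.items():
        for target in set(targets):  # a page votes at most once per target
            if target != source:     # self-links do not count as votes
                votes[target] += 1
    return votes


def rank_by_votes(links):
    """Order pages by inbound votes, most votes first (ties broken by name)."""
    votes = count_inbound_links(links)
    pages = set(links) | set(votes)
    return sorted(pages, key=lambda page: (-votes[page], page))


print(rank_by_votes(LINKS))
```

In this toy graph, c.example ranks first because three different pages link to it, while d.example, which nobody links to, ranks last.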

 

Search Engines and Directories

A web directory is a web search tool compiled manually by human editors. Once websites are submitted with information such as a title and description, they are assessed by an editor and, if deemed suitable for addition, listed under one or more subject categories. Users can search across a directory using keywords or phrases, or browse through the subject hierarchy. The best examples of directories are Yahoo! and the Open Directory Project.

The major difference between a search engine and a directory is the human factor. A directory indexes a web site based on an independent description of the site. While directories perform many of the same functions as a search engine, their indexing format is different. The main difference is that directories do not spider your site to gather information about it. Instead they rely on a few text entries, typically a site title, domain name, and description, to determine which keywords describe your site. While sites in search engines are scanned and listed by a program (a crawler), listings in directories are edited manually.

Directories group websites by theme or industry; for example, automobile-related sites are placed in one subdirectory, sports sites in another, and so on. In this way directories help organize thousands of web sites. A directory contained inside another directory is called a subdirectory of that directory. Together, the directories form a hierarchy, or tree structure.
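
The tree structure described above can be modeled as nested categories. The sketch below is illustrative only: the category names, site names, and helper functions are hypothetical and do not come from any real directory.

```python
# Hypothetical category tree: dicts are (sub)directories, lists hold site listings.
DIRECTORY = {
    "Automobiles": {
        "Manufacturers": ["carmaker.example"],
        "Dealers": ["dealer.example"],
    },
    "Sports": {
        "Cricket": ["cricketclub.example"],
    },
}


def list_sites(node, path=()):
    """Walk the tree and return (category path, site) pairs."""
    results = []
    if isinstance(node, list):
        for site in node:
            results.append(("/".join(path), site))
    else:
        for name in sorted(node):
            results.extend(list_sites(node[name], path + (name,)))
    return results


def browse(directory, category_path):
    """Follow a path such as 'Automobiles/Dealers' down the hierarchy."""
    node = directory
    for name in category_path.split("/"):
        node = node[name]
    return node


print(browse(DIRECTORY, "Automobiles/Dealers"))
print(list_sites(DIRECTORY))
```

Browsing follows the hierarchy one level at a time, which is how a user navigates a directory by subject rather than by keyword.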

There are directories on the web for almost any category you could name. Some search engines are adding general directories to their web pages. While helping researchers by suggesting general topics to search under,

Open Directory Project or Dmoz.org is a massive directory continually expanded by volunteers. What sets this directory apart is that it makes its database of indexed documents available to other directories & search engines. A listing here results in the page automatically being listed in many other directories and search engines.

There are five types of directories: Human Edited, User Categorized, User Classified, Independently Classified and Pay Per Click (PPC).

  1. Human Edited (Categories):
    This is the 'traditional' directory. It is the most prestigious, as each listed site is 'hand picked' and reviewed by a human editor. The assumption is that the editor is an 'expert' in his/her field and will select for inclusion only appropriate sites. Such directories usually have very clear and stringent acceptance rules, which ensure the quality of the search results. Invariably, the Directory is comprised of categories to which sites are 'assigned'. This type of Directory is relatively hard to maintain, as it is labor intensive and hence expensive. That also explains why many such directories are using volunteers to do the work. Notable examples of Human Edited Directories are Yahoo, Dmoz, Joeant and Gimpsy, but there are many more. There is no doubt that this is the most important type to submit your site to. Only the scrutiny of an independent human reviewer can ensure the quality and suitability of a web site to a given category.
  2. User Categorized:
    The Directory is structured in a very similar way as the Edited Directory, but it is the user's decision as to the best category to place the site in. While this is quite attractive for the Directory Owner (the users do the 'hard work') as well as the Site Owner (freedom to place the site in any category), the search results may be far from satisfactory. One such Directory is Websquash. You may get benefits from registering in such a directory, but make sure you consider all the relevant aspects.
  3. User Classified:
    Sites are classified by keywords, entered by the Site Owner in the Meta Tags of the home page. The attraction here is that the site is classified (potentially) by many keywords and the process is fully automatic (low maintenance). While registration is easy, the sorting algorithm has very little to go by, hence the position of the site in the search results doesn't mean much. Moreover, should you choose popular keywords, you have little chance of being found due to the number of sites competing with you. On the other hand, selecting a rare combination of keywords suffers from the obvious problem of the minuscule number of searchers using that combination. One of the better known examples is ExactSeek, which enjoys significant popularity. Its attraction may be related to the use of the Alexa ranking, which measures the site's popularity, as a primary sorting criterion of the search results.
  4. Independently Classified:
    Instead of letting the Site Owner decide which keywords to use for finding his site, this type of directory allows every user to determine the relevancy of keywords. This latest addition to the Directory family harnesses the public vote to examine and determine the relevancy of keywords to sites. Each user may choose to rate a (random) site and voice his/her opinion of the suitability of specific keywords to that site. The best example of such a site is Netnose. Due to the democratic process, it is highly likely that relevancy will be good. However, for such a site to achieve prominence, it requires a large number of users willing to donate their time and effort to that rating activity.
  5. Pay Per Click:
    While technically PPC Directories are of the User Classified type, their business model implies some significant characteristics that Site Owners should be aware of:
  • A link from a PPC directory is never a direct or simple link. Hence being listed in a PPC directory will never help to increase Link Popularity with search engines.
  • A link from PPC directory remains in place only as long as the user's account is cash positive.
  • PPC Directories try to maximize their revenues by encouraging Site Owners to bid for as many keywords as they can, even those that are only remotely related to their site's business.

At the beginning of the web era, users would go to directories to find sites relevant to their interests. In fact, Yahoo!, the web's number one destination, started as a directory. Nowadays, most users rely on search engines, not directories, to find what they're looking for.

When search engines started to become popular, they relied on web pages' 'keyword meta tags' to determine the topic and relevance of the page (the keyword meta tag is a section within a web page's HTML code where webmasters can insert words that are relevant to the page's content). Webmasters discovered that by stuffing their meta tags with popular search terms repeated hundreds of times, they could propel their pages to the top of the search results.
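
To make the keyword meta tag concrete, here is a minimal Python sketch that reads the tag from a page and counts how often each term is repeated. The sample page and the keyword_counts helper are hypothetical, and the sketch uses only the standard library's html.parser module; it is not how any particular search engine processed the tag.

```python
from collections import Counter
from html.parser import HTMLParser


class MetaKeywordParser(HTMLParser):
    """Collect the comma-separated terms from <meta name="keywords"> tags."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords.extend(
                word.strip().lower() for word in content.split(",") if word.strip()
            )


def keyword_counts(html):
    """Return how many times each keyword appears in the page's meta tag."""
    parser = MetaKeywordParser()
    parser.feed(html)
    return Counter(parser.keywords)


# Hypothetical page whose meta tag repeats one term, as in keyword stuffing.
PAGE = (
    '<html><head>'
    '<meta name="keywords" content="cars, cheap cars, cars, cars">'
    '</head><body>Welcome</body></html>'
)

print(keyword_counts(PAGE))
```

A ranking scheme that trusted these counts would reward the repeated term, which is exactly the weakness that keyword stuffing exploited.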

Search engines caught on to the abuse and decided to ignore the meta tags and rely instead on web page copy. Webmasters then started to overstuff their page copy with popular search terms, often writing them in the same color as the web page's background, so that they could be detected by search engines while being invisible to users.

Again, search engines discovered the trick and decided that the best way to rank a web page's content and its topical relevance was to rely on inbound links from other pages. The rationale behind this is that it is much more difficult to influence other people to link to you than it is to manipulate your own web page elements.


There are several ways to get inbound links, among them writing articles that carry your byline with a link to your page, exchanging links, and listing your site in directories.

Listing your sites in good directories is probably the best way to get quality links that are highly valued by the search engines. Since directories rely on human editors who enforce strict criteria to list a site, and since directories organize the information in highly focused categories, they are an invaluable resource for search engines to measure the quality and the relevance of a web page.

In summary, directories are important not because they generate significant traffic, but because they are given great importance by the search engines to qualify and rank web pages, and to determine their topical relevance.

 

Major Search Engines

Among the thousands of search engines, very few are famous, and those owe their fame to algorithms that help users find the most relevant information. As observed earlier, Google, Yahoo! and MSN are the top three search engines in the world. Teoma, Excite, Ask Jeeves, AOL, HotBot, Alta Vista, and Lycos also handle a large number of searches.

A listing in these search engines can attract huge traffic to a website. Hence, it is very important for a search engine optimizer to know which search engines are the best and most widely used. This matters to searchers as well: for them, well known, commercially backed search engines mean dependable results. These search engines are more likely to be well maintained and upgraded when necessary, to keep pace with the growing web.

There are 9 main features on which search engines can be evaluated. They are as below.

Boolean: Boolean searching refers to how multiple terms are combined in a search.

and - requires that both terms be found
or - lets either term be found
not - means any record containing the second term will be excluded
( ) - means the Boolean operators can be nested using parentheses
+ - is equivalent to AND, requiring the term; the + should be placed directly in front of the search term
- - is equivalent to NOT and means to exclude the term; the - should be placed directly in front of the search term

Operators can be entered in the case shown by the example.

Examples:
(salad and (lime or kiwi)) not nuts
+salad -nuts lime kiwi
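
The two example queries can be worked through in Python using set operations over a tiny word index. This is a sketch under stated assumptions: the four sample documents, the term and plus_minus helpers, and the choice to treat unprefixed terms as optional when a + term is present are all illustrative, and real search engines differ in how they handle optional terms.

```python
# Hypothetical documents to search (illustrative data only).
DOCS = {
    1: "salad with lime dressing",
    2: "salad with kiwi and nuts",
    3: "kiwi smoothie",
    4: "plain salad",
}


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index


INDEX = build_index(DOCS)


def term(word):
    """Return the ids of documents containing the word (empty set if none)."""
    return INDEX.get(word.lower(), set())


# (salad and (lime or kiwi)) not nuts
# and -> set intersection, or -> set union, not -> set difference
nested_result = (term("salad") & (term("lime") | term("kiwi"))) - term("nuts")


def plus_minus(query):
    """Evaluate a +term / -term query; unprefixed terms are optional (assumption)."""
    tokens = query.split()
    required = [t[1:] for t in tokens if t.startswith("+") and len(t) > 1]
    excluded = [t[1:] for t in tokens if t.startswith("-") and len(t) > 1]
    if required:
        hits = set(DOCS)
        for word in required:
            hits &= term(word)       # every + term must be present
    else:
        hits = set()
        for t in tokens:
            if not t.startswith("-"):
                hits |= term(t)      # with no + terms, any plain term may match
    for word in excluded:
        hits -= term(word)           # every - term must be absent
    return hits


print(nested_result)
print(plus_minus("+salad -nuts lime kiwi"))
```

Under these assumptions the two queries are not equivalent: the nested form requires lime or kiwi to be present, while the +/- form only requires salad and only excludes nuts.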
