The reading level for this article is Novice

15th November 2003 will probably go down in search engine history as an important milestone. It is the day Google implemented sweeping updates to its algorithm (nicknamed the Google ‘Florida’ algo update), which threw several thousand high-ranking sites off their ranks. The SEO community is exhibiting unprecedented nervousness, even rage. While feelings are mixed in various camps, one thing is certain: Google has ruffled a great many feathers. Although Google usually updates its algorithm every other month or so (with minor updates even monthly), the magnitude of the changes this time was greater than webmasters and the SEO community expected. The changes have pushed several thousand high-profile commercial sites, which had enjoyed sustained rankings, off the listings. The change is sending shivers through several SEO businesses, which are on the brink of closure if they cannot understand how the changes take effect and how to get their clients ranked again. Considering that Google commands almost 80% of the search market, moving focus to other engines does not seem to be an option.

The magnitude of the algo changes has been rather severe this time. While the dust is still in the air, speculation is rife about what the new algo is. As a matter of company policy, Google does not comment on what its updates do. There are no new guidelines on its site for webmasters or the SEO community other than the same old narration of how ‘very good content’ will be rewarded and ‘unethical techniques’ will be penalized. Algos are always closely guarded secrets of search engines, since any leak would invite abuse of the system and contaminate their search results. Silence pays. Any comment, confirmation or denial about the new algo’s behavior would let out part of the secret.
In the absence of any official guidelines or comments from Google, our analysis is based on testing various speculations, actual research, experience, and knowledge of search engine behavior, trends and history. Some of our analysis and findings are covered in this article. Since the wide-ranging implications cannot be covered in one article, I intend to cover all the important aspects in a series of articles focusing on changes, speculations, myths and facts. This article gives an overview of some important aspects; we will go into the depth of each in the next few articles.

Here are some noteworthy initial findings –

Many SEO analysts believe that Google is filtering sites using a secret filter list. I do not believe this to be true. I’ll explain a little below –

Is Google filtering sites if the search term contains ‘money keywords’?

Since the hardest-hit sites are from commercial categories, many SEO analysts believe that Google is filtering sites that were ranking for search terms containing ‘Money Keywords’ (also called ‘commercial keywords’ or ‘hit list words’). Initial tests conducted by analysts do show symptoms that seem to substantiate this theory.

What would be Google’s motive?

If true, why would Google want to do this? This speculation seems to be getting substantial backing because the timing of the algo change coincided with the shopping season as well as the forthcoming Google IPO. Analysts believe that Google wants to force commercial sites to pump money into its AdWords paid listing program if they wish to cash in on the December 2003 Christmas season. The free ride seems to be over. Others believe that Google wants to paint its bottom line a little rosier to impress its future investors.


So who replaces the commercial sites in the ranking then?

It seems that the top-20 results have a high population of government sites (.gov), educational sites (.edu), non-profit sites (.org), directories and non-US sites. Since these kinds of sites have not been voracious advertisers on Google, analysts believe that ranking them high would not affect Google’s revenues and would therefore put pressure on commercial sites to go the Google AdWords PPC route.

From where would Google get the ‘Money Words’ list?

Google has access to a large database of ‘money words’ from their AdWords program. Interestingly, the advertisers and commercial site owners have themselves educated Google on which words are ‘good’ money words. Since bids on each keyword phrase vary, Google also knows how ‘valuable’ each keyword is.

So what’s the real story? Is Google actually filtering commercial sites using a ‘money words’ filter list?

Personally, I believe that nothing could be further from the truth. I strongly believe that Google is not ‘filtering’ sites, as the analysts believe. There is no ‘money words’ list. The popular ‘filter list’ theory is derived from the symptoms analysts are seeing, which in fact occur for other reasons, as you will see in my next few articles.

It is true that one can notice a filtering-like effect in practice. We have reasons to believe that this effect is a by-product of the new algorithm rather than something intended. The new algo tends to affect commercial sites more than non-profit ones. I will discuss this in detail in the subsequent articles and explain why it is actually happening.

Some time ago, a ‘Google Hate Group’ set up a site offering a tool to check your ‘unfiltered’ old ranking results at Google by extracting data from Google in a crafty manner. This site (www.scroogle.org) can no longer offer the facility since Google updated its algo to prevent such searches. However, the site still shows a so-called ‘Filter Hit List’ collected from various searches on its site. I studied the list and, if anything, it indicates that Google is not using any such list. How else could one explain that the term ‘California Divorce Attorney’, which appears at the top, would be about 20 times more ‘valuable’ to Google than the terms ‘Books’ or ‘Adult’, which appear at the bottom of the list?

Any attempt by Google to filter commercial sites from the organic ranking would severely damage Google’s brand and its credibility as a source of unbiased search results. Such an attempt, if true, would be extremely near-sighted and not worth risking the fantastic brand and service Google has created. The IPO as well as the bottom line would be devastated. Besides, Google says it has different groups working on ‘Search’ and ‘AdWords’ and that one group cannot influence the other. I believe them.

Google now uses stemming

Google has indeed deployed stemming of keywords within its search results. Earlier, searching for singular terms (like ‘home garden’) did not show plurals (‘home gardens’) and vice versa. Google now considers several variations of the keywords (garden, gardens, gardening etc.) when it searches its database for sites to display. Seems minor, doesn’t it? Actually, this has far-reaching implications for keyword competition and ranking. Suddenly, the canvas for keyword competition has become larger. For Google users, this feature may be helpful, as they now get results from a larger pool of pages. From the SEO perspective, however, one now needs to compete with a much larger set of pages for ranking. Competition for keywords has suddenly intensified.
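To make the effect concrete, here is a minimal sketch in Python. It is only an illustration: the suffix rules in `naive_stem` and the three-page `pages` index are my own assumptions, and Google has not published how its stemming actually works. The sketch simply shows how reducing words to a common stem enlarges the set of pages competing for the same query.

```python
def naive_stem(word: str) -> str:
    """Toy stemmer: strip one common English suffix (hypothetical rules)."""
    word = word.lower()
    for suffix in ("ing", "ers", "er", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query: str, page_text: str, stem: bool) -> bool:
    """True if every query term occurs in the page, optionally after stemming."""
    normalize = naive_stem if stem else str.lower
    page_terms = {normalize(w) for w in page_text.split()}
    return all(normalize(t) in page_terms for t in query.split())


# Hypothetical mini-index: page id -> page text.
pages = {
    "page-a": "home garden ideas",
    "page-b": "home gardens for beginners",
    "page-c": "home gardening tips",
    "page-d": "kitchen recipes",
}


def competing(query: str, stem: bool) -> list:
    """Ids of the pages that compete for the query, in sorted order."""
    return sorted(pid for pid, text in pages.items() if matches(query, text, stem))


print(competing("home garden", stem=False))
print(competing("home garden", stem=True))
```

Without stemming only the exact-match page competes for ‘home garden’; with stemming the ‘gardens’ and ‘gardening’ pages join the same contest.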

Google is using its spelling correction tool in search results

For several kinds of searches, Google is now applying its spelling correction tool to the search results. I discovered this when I searched for the term ‘Search Engine Optimisation’, the British spelling widely used in Asia. Earlier, I used to get only results with ‘s’ in ‘optimisation’. Now I also see results with ‘z’ in ‘optimization’. Similarly, ‘e-mail solution’ will now fetch ‘email solution’ and ‘e-commerce’ will get ‘ecommerce’. Here too, the number of competing pages has increased.
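The behavior described above can be pictured as mapping spelling variants to one canonical form before matching. The sketch below is a toy model only: the `VARIANTS` table is a hypothetical stand-in built from the examples in this article, not anything Google has disclosed.

```python
# Hypothetical variant table, built only from the examples in this article.
VARIANTS = {
    "optimisation": "optimization",
    "e-mail": "email",
    "e-commerce": "ecommerce",
}


def canonical(query: str) -> str:
    """Lower-case the query and replace known spelling variants."""
    return " ".join(VARIANTS.get(word, word) for word in query.lower().split())


print(canonical("Search Engine Optimisation"))
print(canonical("e-mail solution") == canonical("email solution"))
```

Once two spellings collapse to the same canonical query, pages optimized for either spelling compete for the same result slots.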


Google has deployed the Hilltop Algo to fine-tune the effect of PR

Most Internet buffs know that Sergey Brin and Larry Page pioneered the PageRank algorithm (named after Larry Page) to refine the ranking of sites in search results. This has been one of the important factors in Google’s success story since 1998. See details about PageRank here – http://www.google.com/technology

However, there is a basic flaw in the PageRank system. The PageRank (PR) system allocates an absolute ‘value of importance’ to a web page based on the number of sites that link to it. The PR of the linking page also counts: the higher the PR of the page carrying the incoming link, the higher the PR value passed to the linked page. However, the ‘PR value’ is not specific to search terms, and therefore a high-PR web page that contained even a passing reference to an off-topic keyword phrase often got a high ranking for that phrase. Krishna Bharat from California realized the flaw in this PR-based ranking system and came up with an algorithm he called ‘Hilltop’ in the year 2000. He filed for the Hilltop patent in Jan 2001 with Google as an assignee. Needless to say, Google realized the advantage this new algo would offer its ranking system if combined with its own PR system.
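For readers who want to see the idea in code, here is a simplified sketch of the PR calculation in Python. It follows the commonly published iterative formulation; the damping factor of 0.85, the iteration count and the four-page link graph are my assumptions for illustration, and Google’s production version is not public. Note that the function takes no search term at all, which is exactly the flaw described above.

```python
def pagerank(links: dict, damping: float = 0.85, iterations: int = 50) -> dict:
    """Simplified PageRank by repeated redistribution of rank along links.

    links maps each page to the list of pages it links to.
    damping and iterations are illustrative defaults, not Google's settings.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base value.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank equally among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: 'shop' is linked from 'portal' and 'blog'.
graph = {
    "portal": ["shop"],
    "blog": ["shop", "portal"],
    "forum": ["portal"],
    "shop": [],
}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph ‘shop’ ends up with the highest value because it is linked from ‘portal’, which itself receives links. The same value would be used whatever the user searched for.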

I believe that Google has indeed deployed the Hilltop algo in its last algo update, in combination with Google’s own algo of PR and relevance. However, the Hilltop algo may have undergone refinements from its original form before this deployment.

What is the Hilltop algo?

For the geeks who wish to go into great depths, there is detailed info available here –

Hilltop Paper & Hilltop Patent: http://www.cs.toronto.edu/~georgem/hilltop/

For the rest of us, here is a simple explanation –

Bharat proposed that instead of using just the ‘PR value’ to find ‘authoritative’ web pages, it would be more useful if the ‘value’ had topical relevance. That is, counting links to a web page from ‘topic relevant’ documents would be more useful. He called these ‘topic relevant’ documents ‘expert documents’, and the links from these expert documents to the target documents determine the targets’ ‘authority score’.

The Hilltop algo calculates a ‘score of authority’ of web pages (over-simplified) as follows:

Run a normal search on the keyphrase to locate a ‘corpus’ of expert documents. The qualifying rules of ‘expert documents’ are stringent so the ‘corpus’ is a manageable number of web pages. Filter affiliate* sites and duplicate sites from the experts list.
Pages are assigned a LocalScore of ‘authority’ based on number and quality of votes they get from these expert documents. Pages are then ranked based on their LocalScore.

*Affiliate sites are pages that originate from the same domain, from the same domain with a different suffix (ibm.com, ibm.co.uk, ibm.co.jp etc.) or from neighboring IPs (the first three octets of the IP address in common, like 64.129.220.xxx).
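The over-simplified steps above, together with the affiliate rule, can be sketched in Python as follows. Everything here is illustrative: the `Expert` record, its `quality` field (a stand-in for how well an expert matches the keyphrase), the `GENERIC` token list and the sample hosts are assumptions of mine, and the real Hilltop scoring described in the paper is considerably more involved.

```python
from dataclasses import dataclass

# Hypothetical list of host-name tokens that do not identify the site owner.
GENERIC = {"www", "com", "org", "net", "edu", "gov", "co", "uk", "jp"}


@dataclass(frozen=True)
class Expert:
    host: str        # host name of the expert document
    ip: str          # IPv4 address of that host
    quality: float   # assumed expert score for the keyphrase
    links_to: tuple  # target sites the expert document links to


def site_name(host: str) -> str:
    """Rightmost non-generic token: 'ibm' for www.ibm.com and ibm.co.uk."""
    tokens = [t for t in host.lower().split(".") if t not in GENERIC]
    return tokens[-1] if tokens else host.lower()


def are_affiliates(a: Expert, b: Expert) -> bool:
    """Same domain name (any suffix) or same first three IP octets."""
    same_name = site_name(a.host) == site_name(b.host)
    same_block = a.ip.split(".")[:3] == b.ip.split(".")[:3]
    return same_name or same_block


def filter_affiliates(experts: list) -> list:
    """Keep only the best expert from each group of affiliated experts."""
    kept = []
    for expert in sorted(experts, key=lambda e: e.quality, reverse=True):
        if not any(are_affiliates(expert, k) for k in kept):
            kept.append(expert)
    return kept


def local_scores(experts: list) -> dict:
    """Sum the quality of the independent experts voting for each target."""
    scores = {}
    for expert in filter_affiliates(experts):
        for target in expert.links_to:
            scores[target] = scores.get(target, 0.0) + expert.quality
    return scores


experts = [
    Expert("www.ibm.com", "64.129.220.10", 0.9, ("target-a.com",)),
    Expert("ibm.co.uk", "198.51.100.5", 0.8, ("target-a.com",)),
    Expert("reviews.example.org", "64.129.220.77", 0.7, ("target-a.com", "target-b.com")),
    Expert("guide.gardening.net", "192.0.2.4", 0.5, ("target-b.com",)),
]
scores = local_scores(experts)
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

In this sample, three of the four experts link to target-a.com, but two of them are affiliates of the first (one by domain name, one by IP block), so only one of those votes counts. That is the point of the affiliate filter: a cluster of related sites cannot vote many times for the same page.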

The combination of the Hilltop algo, Google PR and on-page relevance factors seems to be highly potent and very difficult to beat. Not impossible, but very difficult. This new combination has far-reaching implications for how link popularity and PageRank will affect your site ranking. While there are several upsides to this new system, it also has several flaws. I will address these in a separate article.

With the major algo updates at Google, one thing is clear. There is no easy way to get to the top. Sites that deployed simple techniques of Meta optimization or even on-page optimization will find it difficult to rank. There are close to 100 algo variables that need to be evaluated and addressed, not just Meta tags. SEO is now going to be highly specialized. The old tricks are out, or have become ineffective. Comprehensive SEO strategies will need to be planned and implemented. SEO experts will now need to put in greater intellect, talent, research, analysis, planning and man-hours to get good results. SEO costs will therefore escalate.

PPC costs will rise. PPC campaigns will need to form an integral part of your online marketing strategy and will need to be balanced with organic search engine traffic. Link building and PR now carry greater significance. They have a greater impact on ranking and therefore cannot be ignored or viewed in isolation. Link building and SEO will need to be combined in the overall strategy, as their combined impact has greater synergy. Algo updates will now be more frequent and more intense. Monthly maintenance of SEO is now extremely important. Long-term partnerships with SEO providers are a must and will be productive.

Several thousand high-profile sites have lost their ranks. SEO providers are now in greater demand. Several thousand SEO experts cannot yet deal with the new algo. They are fumbling for answers. Fly-by-night SEO providers will fly by night. The list of ‘real’ SEO providers has just shrunk. The SEO and ranking business has just become more serious, more difficult and more important.

As always, I welcome your comments, suggestions and questions.

Last Updated: 13th December 2003


This Web Marketing article was written by Atul Gupta on 4/7/2005

Atul Gupta is the founder & CEO of www.SEOrank.com, a leading Search Engine Optimization services company, and www.PugmarksDesign.com, a leading eCommerce and Web Development services company. He has about 20 years of experience in the fields of Graphic Design, Visual Communication, Web Development and Search Engine Marketing Services. He has devoted the last 9 years of his career solely to Search Engine Marketing and Web Development activities.