SEO-News: September 2, 2004 Feature Article


  Article printed from SEO-News: http://www.seo-news.com
  HTML version available at: http://www.seo-news.com/archives.html

Google Holes, Yahoo Gaps & Black Holes
By Richard Zwicky, CEO, Metamend

Google might like to be thought of as a 'black hole of Internet 
search engines,' consuming all the information that falls within 
its gravitational reach. The difference is that the information 
does escape, and the web is not really ripped apart at the seams. 
Oh well, so much for the analogy. 

But there really are holes in Google, Yahoo! and all other search 
engines that have nothing to do with the forces of nature. These 
holes have serious implications for the quality of search engine 
results, and therefore require the attention of your optimization 
efforts. 

Let's begin the analysis with Google - the current technology 
leader in the search engine field. When users visit the Google 
search engine and run a search, they often enter in complete 
phrases. This tendency is likely to become more common as speech 
to text becomes a reality. How Google treats these phrases 
demonstrates a fault within its algorithms and a hole in the 
accuracy of its search results. When you include a common word 
in a phrase within the Google search box, it gives you the 
following message above the search results: 

     "for" is a very common word and was not included in your 
     search. [details] 

If you click for details, you get Google's explanation of how and 
why common words are excluded.

But, here's where Google falls down. Visit Google right now. 
Open up 4 windows and in each window's search box type the 
following queries: 

     Hotels New York 
     Hotels in New York 
     Hotels for New York 
     Hotels about New York 

The words 'in', 'for' and 'about' all get the standard message, 
"This is a very common word and was not included in your search." 
Yet all four searches display entirely different results. 

What is Google doing? I considered the possibility that I was 
pulling results from different data centers, so I ensured this 
was not the case. I then tried a variation on this search query, 
using the term "search engine optimization X hotels", with the 'X' 
representing a blank space or one of the words 'in', 'for' or 
'about'. In this test, only where the 'X' represented a blank space 
did I get varying results. Still, by rights they ought to have 
all been identical. 
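
To make the expectation concrete, here is a minimal Python sketch 
of a filter that really does throw common words away. The 
STOP_WORDS list and the strip_stop_words function are assumptions 
made up for this illustration; they are not Google's code. If an 
engine worked this way, all four hotel queries would reach the 
index as the same three terms. 

```python
# Assumed stop-word list: just the three words tested in the article.
STOP_WORDS = {"in", "for", "about"}


def strip_stop_words(query: str) -> tuple:
    """Lower-case the query and drop every word found in STOP_WORDS."""
    return tuple(w for w in query.lower().split() if w not in STOP_WORDS)


queries = [
    "Hotels New York",
    "Hotels in New York",
    "Hotels for New York",
    "Hotels about New York",
]

# Collect the distinct term lists that remain after filtering.
reduced = {strip_stop_words(q) for q in queries}
print(reduced)  # one entry only: every query reduces to the same terms
```

Under that assumption there is nothing left to tell the queries 
apart, which is why differing results suggest the common words 
are not simply being discarded. 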

It occurred to me that perhaps Google was using different 
algorithms when it identified a place name in the search query 
by trying to understand the context of the query. That would be 
a logical move. I'm very familiar with software that comprehends 
the context of textual content. Could it be that Google is trying 
to apply some contextual filtering to its results? I then 
tried a garbage search, a phrase made up of common words that 
have no direct relevance to one another and would therefore 
never logically appear together: 

"room hotel tapestry highway lagoon" 

Interestingly, Google had 1720 entries which matched this 
query, and the results varied depending on which of the X terms 
I inserted between any two of the words. Search results also 
varied if I moved the placement of the ignored word within the 
query. But is this context? A further test would be required. 
I put together 3 queries using the same terms, but with a common 
or ignored word inserted as follows: 

     Filing tax return(s) 
     Filing a tax return(s) 
     Filing of tax return(s) 

In this case, I tried singular and pluralized searches, to 
ensure that poor grammar was not affecting the results. Results 
varied for each search. That's not to say they were all entirely 
different, just that they varied. I tried a few other searches 
and received similar results. Most importantly, the results I 
received were all equally contextually correct, which was a 
relief. 
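
"Varied" can be put on a scale. The Python sketch below is one 
possible way to do it, not the method used for the tests above: 
it scores two result lists by the share of URLs they have in 
common. The overlap function and the example.com URLs are 
placeholders invented for illustration, not real search results. 

```python
def overlap(results_a, results_b):
    """Share of URLs common to both lists (Jaccard index, 0.0 to 1.0)."""
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty lists count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Placeholder result lists for two variants of one query.
results_plain = ["example.com/a", "example.com/b", "example.com/c"]
results_with_a = ["example.com/b", "example.com/a", "example.com/d"]

print(overlap(results_plain, results_with_a))  # 0.5
print(results_plain == results_with_a)         # False: order differs too
```

A score of 0.0 means entirely different results, 1.0 means the 
same sites, and comparing the lists directly shows whether the 
same sites merely appear in a different order. 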

Some people have written on newsgroups and discussion boards 
that when Google comes across an 'ignore' word, it substitutes 
a wild card. However, if that were true, the various ignore 
words would all return the same results, and this is not the 
case. Therefore, it can be surmised that Google does not in 
fact ignore words at all! It is more likely that Google is 
using some form of context algorithm. This is logical. The 
technology exists and Google is known to have bought a UK firm 
last year which was developing such a technology. Our own firm's 
software applies contextual analysis in its algorithms. 
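
The wild card theory is easy to state in code. In this Python 
sketch, the STOP_WORDS list and the to_wildcard_pattern function 
are hypothetical, written only to show what the theory predicts; 
they do not describe any engine's real behavior. 

```python
# Assumed list of ignored words, taken from the tests in this article.
STOP_WORDS = {"a", "of", "in", "for", "about"}


def to_wildcard_pattern(query: str) -> str:
    """Replace each ignored word with '*', as the wild card theory claims."""
    words = query.lower().split()
    return " ".join("*" if w in STOP_WORDS else w for w in words)


print(to_wildcard_pattern("Filing a tax return"))   # filing * tax return
print(to_wildcard_pattern("Filing of tax return"))  # filing * tax return
```

Both queries collapse to one pattern, so under this theory they 
would have to return the same results. They did not. 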

Taking the analysis a step further, which other engines seem to 
have a grasp on context? Obviously, the places to look first 
were Google's competitors: Yahoo!, Microsoft, and AskJeeves. 

AskJeeves sprang immediately to mind, as it had originated the 
concept of "phrase a question" type searching, thus it should 
logically have some context filtering in place. In fact, when I 
ran the 'tax return' query through the engine, I did receive 
varying results. Very different results from Google's, I might 
add. When multiple 'ignore' words were added to a query, results 
did not vary, which may indicate very limited filtering. 

I then tried an alternate pair of queries: "diapers for (a) baby" 
and "diapers on (a) baby". These should logically return 
different results: one recommending diapers, and one about how 
to put them on, keep them on, how they should look, etc. 
Surprisingly, I received identical results for both queries. 
Context was not 
being properly filtered by the very search engine which first 
introduced the concept! I tried the same search on Google. While 
results were jumbled a bit, the top web sites were the same for 
both queries, just in varying order. With over 550,000 results 
to choose from, this would indicate that Google, too, has a long 
way to go toward fulfilling the promise of contextually correct 
responses. 

Next, I turned my attention to Yahoo! I was somewhat surprised 
to discover that Yahoo! does not seem to have -any- filtering in 
place. Results did not vary at all for the test searches when 
the 'ignore' words were inserted or removed. Yahoo! also did not 
identify these terms as ignore terms in its results. Still, the 
fact that results were unchanged when the terms were added or 
deleted would indicate that they were omitted, and that Yahoo! 
does not have the algorithms needed to comprehend the context of 
a search query. 

Is context an area where Yahoo! seriously lags behind Google and 
others? If true, this points to a widening gap between the search 
engines in the future. Google is already positioning itself for 
speech to text devices. Can intonation be far behind? Yahoo! has not 
demonstrated any evidence of making strides in either of these 
areas. 

Lastly, I looked at the new Microsoft engine. It has no contextual 
filtering in place. Since this search engine is still in beta, 
I cannot in all fairness comment on it being behind in a race 
where we have not yet seen the final product. Still, it's 
something to keep in mind for the future. 

Implications for SEO 

The implication of contextual search on how your web site 
performs in the search engines is immense. It means that the 
nuances of how people search have to be better taken into 
account by all SEO firms. 

In our firm we recognized that as the world moved to speech to 
text and as the web grew in size, context would be the next big 
differentiator in search results. This means that context is 
already recognized and taken into account both by our technicians 
and our technology when analyzing a web site, and optimizing it 
for search engines. 

Working to improve your web site's performance in the search 
engines now requires understanding how people actually phrase 
search queries, and using that knowledge to position the content 
on your site so that it accounts for the idioms used by your 
target audience. 

Ensure that you are using phrases in the way you hear people 
asking questions. Ensure you cover all the bases and get all 
possible variations. Get outside help if you need it, but don't 
miss out on your opportunity to take advantage of the Black 
Holes out there.
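
One way to start covering the variations is to list them 
mechanically and then check your copy against the list. The 
Python sketch below is only a starting point; the phrase_variants 
function and the connector words are assumptions chosen to match 
the hotel example earlier in this article. 

```python
def phrase_variants(head, tail, connectors):
    """Return the bare phrase plus one variant per connector word."""
    variants = [f"{head} {tail}"]
    variants += [f"{head} {c} {tail}" for c in connectors]
    return variants


# Treat the place name as a single unit so it is never split.
for phrase in phrase_variants("hotels", "new york", ["in", "for", "about"]):
    print(phrase)
```

The list it prints is only as good as the connector words you 
feed it, so take those from how your own audience actually asks 
questions. 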
 
================================================================
Richard Zwicky is a founder and the CEO of Metamend Software, a 
Victoria, B.C.-based firm whose cutting-edge Search Engine 
Optimization software has been recognized around the world as a 
leader in its field. Employing a staff of 10, the firm draws its 
business from around the world, with clients on every continent. 
Most recently the company was recognized for its geo-locational, 
or LBS, technology, which correlates online businesses with their 
physical locations. 
================================================================




Copyright © 2004 Jayde Online, Inc.  All Rights Reserved.

SEO-News is a registered service mark of Jayde Online, Inc.