SEO-News: September 2, 2004 Feature Article
Article printed from SEO-News: http://www.seo-news.com HTML version available at: http://www.seo-news.com/archives.html
Google Holes, Yahoo Gaps & Black Holes
By Richard Zwicky, CEO, Metamend
Google might like to be thought of as a 'black hole of Internet
search engines,' consuming all the information that falls within
its gravitational reach. The difference being, the information
does escape, and the web is not really ripped apart at the seams.
Oh well, so much for analogy.
But there really are holes in Google, Yahoo! and all other search
engines that have nothing to do with the forces of nature. These
holes have serious implications for the quality of search engine
results, and therefore require the attention of your optimization
efforts.
Let's begin the analysis with Google - the current technology
leader in the search engine field. When users visit the Google
search engine and run a search, they often enter in complete
phrases. This tendency is likely to become more common as speech
to text becomes a reality. How Google treats these phrases
demonstrates a fault within its algorithms and a hole in the
accuracy of its search results. When you include a common word
in a phrase within the Google search box, it gives you the
following message above the search results:
"for" is a very common word and was not included in your
search. [details]
If you click for details, you get Google's explanation of how
and why common words are excluded.
But, here's where Google falls down. Visit Google right now.
Open up 4 windows and in each window's search box type the
following queries:
Hotels New York
Hotels in New York
Hotels for New York
Hotels about New York
The words 'in', 'for' and 'about' all get the standard "This is
a very common word and was not included in your search" message.
Yet all four searches display entirely different results.
What is Google doing? I considered the possibility that I was
pulling results from different data centers, so I ensured this
was not the case. I then tried a variation on this search query,
using the term "search engine optimization X hotels", with the 'X'
representing a blank space or one of the words 'in', 'for' or
'about'. In this test, only where the 'X' represented a blank space
did I get varying results. Still, by rights they ought to have
all been identical.
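A query set like this can be generated rather than typed by hand.
The following is a minimal Python sketch and was not part of the
original test; the function name expand_template and the use of an
empty string to stand for the blank space are assumptions made for
illustration.

```python
def expand_template(template, fillers, placeholder="X"):
    """Build one query per filler by substituting it for the placeholder token.

    An empty filler stands for the 'blank space' case, so the placeholder
    is simply dropped from the query.
    """
    queries = []
    for filler in fillers:
        # Swap the placeholder token for the filler, leaving other words alone.
        tokens = [filler if tok == placeholder else tok for tok in template.split()]
        # Drop empty tokens so the blank case does not leave a double space.
        queries.append(" ".join(tok for tok in tokens if tok))
    return queries


for query in expand_template("search engine optimization X hotels",
                             ["", "in", "for", "about"]):
    print(query)
```

Each printed line is one query to paste into the search box, so the
four searches differ only in the word under test.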
It occurred to me that perhaps Google was using different
algorithms when it identified a place name in the search query
by trying to understand the context of the query. That would be
a logical move. I'm very familiar with software that comprehends
the context of textual content. Could it be that Google is trying
to apply some contextual filtering to its results? I then
proceeded to try a garbage search: a search phrase made up of
common words that have no direct relevance to one another and
would therefore never logically appear together:
"room hotel tapestry highway lagoon"
Interestingly, Google had 1,720 entries which matched this
query, and the results varied depending on which of the X terms
I inserted between any two of the words. Search results also
varied if I moved the placement of the ignored word within the
query. But is this context? A further test would be required.
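Since both the choice of ignored word and its position mattered, a
systematic test would try every word in every gap. The sketch below
is one hypothetical way to list those variants in Python; the
function name insert_at_each_gap is an assumption, and the code only
builds the query strings, it does not contact any search engine.

```python
def insert_at_each_gap(query, ignored_word):
    """Return one variant per gap between adjacent words of the query."""
    words = query.split()
    variants = []
    # Gap 1 is between the first and second word, and so on.
    for gap in range(1, len(words)):
        variants.append(" ".join(words[:gap] + [ignored_word] + words[gap:]))
    return variants


garbage = "room hotel tapestry highway lagoon"
for word in ("in", "for", "about"):
    for variant in insert_at_each_gap(garbage, word):
        print(variant)
```

A five-word query has four gaps, so three ignored words give twelve
variants to compare against the plain query.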
I put together 3 queries using the same terms, but with a common
or ignored word inserted as follows:
Filing tax return(s)
Filing a tax return(s)
Filing of tax return(s)
In this case, I tried singular and pluralized searches, to
ensure that poor grammar was not affecting the results. Results
varied for each search. That's not to say they were all entirely
different, just that they varied. I tried a few other searches
and received similar results. Most importantly, the results I
received were all equally contextually correct, which was a
relief.
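"Varied, but not entirely different" can be put into numbers. The
following Python sketch is an illustration only: it compares the top
entries of two result lists by how many pages they share and whether
the order matches. The function name compare_results and the sample
lists are assumptions; the site names are placeholders, not real
search results.

```python
def compare_results(first, second, top_n=10):
    """Compare the top_n entries of two ranked result lists.

    Returns the Jaccard overlap of the two sets (1.0 means the same pages,
    0.0 means no page in common) and whether the order is identical.
    """
    a, b = first[:top_n], second[:top_n]
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    # Two empty lists are treated as fully overlapping.
    overlap = len(set_a & set_b) / len(union) if union else 1.0
    return {"overlap": overlap, "same_order": a == b}


# Placeholder result lists; these are made-up names, not real results.
results_plain = ["site-a.example", "site-b.example", "site-c.example"]
results_with_word = ["site-b.example", "site-a.example", "site-d.example"]
print(compare_results(results_plain, results_with_word))
```

In this made-up case the two lists share two of four distinct pages,
so the overlap is 0.5 and the order differs.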
Some people have written to news groups and discussion boards
that when Google comes across an 'ignore' word, it substitutes
a wild card. However, if that were true, the various ignore
words would all return the same results, and this is not the
case. Therefore, it can be surmised that Google does not in
fact ignore words at all! It is more likely that Google is
using some measure of context algorithm. This is logical. The
technology exists and Google is known to have bought a UK firm
last year which was developing such a technology. Our own firm
uses software which uses contextual analysis in its algorithms.
Taking the analysis a step further, which other engines seem to
have a grasp on context? Obviously, the places to look first
were Google's competitors: Yahoo!, Microsoft, and AskJeeves.
AskJeeves sprang immediately to mind, as it had originated the
concept of "phrase a question" type searching and thus should
logically have some context filtering in place. In fact, when I
ran the 'tax return' query through the engine, I did receive
varying results. Very different results than Google, I might
add. When multiple 'ignore' words were added to a query, results
did not vary, which may indicate very limited filtering.
I then tried an alternate pair of queries: "diapers for (a) baby"
and "diapers on (a) baby". These should logically return different
results: one recommending diapers, and one about how to put them
on, keep them on, how they should look, etc. Surprisingly,
I received identical results for both queries. Context was not
being properly filtered by the very search engine which first
introduced the concept! I tried the same search on Google. While
results were jumbled a bit, the top web sites were the same for
both queries, just in varying order. With over 550,000 results
to choose from, this would indicate that Google, too, has a long
way to go toward fulfilling the promise of contextually correct
responses.
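The tests so far produce three kinds of outcome: identical results,
the same sites in a different order, and different sites. A minimal
Python sketch of that classification follows. The function name
classify_variation and the three labels are assumptions chosen for
illustration, and the sample lists are placeholders rather than
recorded results.

```python
def classify_variation(baseline, variant):
    """Label how a variant's result list differs from the baseline's."""
    if baseline == variant:
        # The changed word appears to have had no effect at all.
        return "identical"
    if sorted(baseline) == sorted(variant):
        # Same sites, shuffled: the word affected ranking only.
        return "reordered"
    # At least one site differs between the two lists.
    return "different"


# Placeholder lists standing in for the top results of two queries.
for_baby = ["site-a.example", "site-b.example", "site-c.example"]
on_baby = ["site-b.example", "site-a.example", "site-c.example"]
print(classify_variation(for_baby, on_baby))
```

An engine that truly understood context would be expected to return
"different" for a pair such as the two diaper queries.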
Next, I turned my attention to Yahoo! I was somewhat surprised
to discover that Yahoo! does not seem to have -any- filtering in
place. Results did not vary at all for the test searches run when
the "ignore" words were inserted or removed. Yahoo! also did not
identify these terms as being ignore terms in their results, but
the fact that results were unchanged when the terms were added
or deleted would indicate that they were omitted and Yahoo! does
not have the necessary algorithms to allow it to comprehend the
context of a search query.
Is context an area where Yahoo! seriously lags behind Google and
others? If true, this points to a widening gap between the search
engines in the future. Google is already positioning for speech
to text devices; can intonation be far behind? Yahoo! has not
demonstrated any evidence of making strides in either of these
areas.
Lastly, I looked at the new Microsoft engine. It has no contextual
filtering in place. Since this search engine is still in beta,
I cannot in all fairness comment on it being behind in a race
where we have not yet seen the final product. Still, it's
something to keep in mind for the future.
Implications for SEO
The implication of contextual search on how your web site
performs in the search engines is immense. It means that the
nuances of how people search have to be better taken into
account by all SEO firms.
In our firm we recognized that as the world moved to speech to
text and as the web grew in size, context would be the next big
differentiator in search results. This means that context is
already recognized and taken into account both by our technicians
and our technology when analyzing a web site, and optimizing it
for search engines.
Working to improve your web site's performance in the search
engines now requires a comprehension of how people are actually
phrasing search queries and using that knowledge to properly
position the content on your site, to account for the idioms
used by your target audience.
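One simple way to act on this is to check which of your audience's
phrasings actually appear in your copy. The Python sketch below is a
hypothetical helper, not a description of any firm's software: the
names normalize and missing_phrases, and the sample page text, are
assumptions made for illustration.

```python
import re


def normalize(text):
    """Lowercase the text and reduce it to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def missing_phrases(page_text, phrases):
    """Return the phrases that do not appear word for word in the page text."""
    # Pad with spaces so a phrase only matches on whole-word boundaries.
    page = f" {normalize(page_text)} "
    return [p for p in phrases if f" {normalize(p)} " not in page]


page_text = "Tips on filing a tax return before the deadline."
phrases = ["filing tax return", "filing a tax return", "filing of tax returns"]
print(missing_phrases(page_text, phrases))
```

Here the page covers "filing a tax return" but not the other two
phrasings, which are the ones the function reports.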
Ensure that you are using phrases in the way you hear people
asking questions. Ensure you cover all the bases and get all
possible variations. Get outside help if you need it, but don't
miss out on your opportunity to take advantage of the Black
Holes out there.
================================================================
Richard Zwicky is a founder and the CEO of Metamend Software, a
Victoria, B.C.-based firm whose cutting-edge Search Engine
Optimization software has been recognized around the world as a
leader in its field. Employing a staff of 10, the firm's business
comes from around the world, with clients from every continent.
Most recently the company was recognized for its geo-locational,
or LBS, technology, which correlates online businesses with their
physical locations.
================================================================
Copyright © 2004 Jayde Online, Inc. All Rights Reserved.
SEO-News is a registered service mark of Jayde Online, Inc.