Web analytics is an inexact science. That is good news for sellers in the traffic industry, where a 10-15%
discrepancy between the publisher's and the advertiser's campaign statistics is considered normal; it is not such
good news for traffic buyers. This error margin is fertile soil for a traffic inflation cottage industry. You know there is a problem
when Google in its pre-IPO filings cites click fraud as one of the potential risks investors should worry about.
Click inflation usually comes from three principal sources.
1. Rotten apples among the traffic and content partners of pay-per-click search engines (PPCs) and
directories. Their incentive is direct financial gain in the form of traffic partner commissions.
2. Unscrupulous competitors who may try to devise a scheme to click on your paid links in an attempt to
make your PPC advertising cost-prohibitive.
3. Bots, spiders, and crawlers. Some of them may be bona fide search engine agents. Others could be
malicious automated scripts designed to simulate visitor behavior and methodically deplete your PPC advertising
account.
Most PPCs must have a working mechanism for detecting fraudulent clicks. Otherwise, we suspect that they wouldn't
be able to stay in business. Today's PPCs are likely to be able to weed out non-malicious bots and amateur
perpetrators. But do these systems have the capacity to stop the professionals? We're not so certain. If the
history of spam-fighting is any indicator, the click inflation problem is here to stay.
Define them. Score them. Own them.
In order to remain undetected, professional inflators need to closely simulate real visitor behavior and visit
parameters. They know the number of pageviews their clicks generate is among the first things to be evaluated.
The good news is that if you use statistical methods, you can beat the perpetrators at their own game.
Whether it's for your internal use or for negotiating a refund from a PPC provider, what's needed is a system for
statistically defining and documenting fraudulent click activity.
Enter the Click Inflation Index system. This system performs a variety of tests to detect fraudulent user session
signatures, assigning penalty points to each offense. If the cumulative score - we call it the Click Inflation Index -
exceeds a threshold you choose, the user's session is tagged as fraudulent.
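To make the idea concrete, here is a minimal sketch of the scoring step in Python. The offense names, penalty weights, and threshold are our own placeholders for illustration; your team will have to choose its own.

```python
from dataclasses import dataclass, field

# Hypothetical penalty weights and threshold (assumptions, not recommended values).
PENALTIES = {
    "single_pageview": 1,
    "no_cookie": 2,
    "rapid_pageviews": 3,
    "anonymous_proxy": 4,
}
THRESHOLD = 5


@dataclass
class Session:
    session_id: str
    offenses: list = field(default_factory=list)  # names of the tests this session failed


def click_inflation_index(session: Session) -> int:
    """Sum the penalty points for every offense recorded on the session."""
    return sum(PENALTIES[name] for name in session.offenses)


def is_fraudulent(session: Session) -> bool:
    """Tag the session only when the cumulative score exceeds the threshold."""
    return click_inflation_index(session) > THRESHOLD
```

With these placeholder numbers, a cookieless session that arrives through an anonymous proxy scores 6 and is tagged, while a single-pageview session without a cookie scores 3 and is not.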
This article explains the basic principles and tests you can use when developing your own Click Inflation Index
algorithm. You will need a competent technical team armed with an adequate web analytics solution. The process is
fun and the results are well worth the effort.
A word of caution before you implement a wide-scale click fraud fighting campaign: make sure your keyword
bidding strategy is up to date. The most expensive keywords remain a high-profile target for con artists. Unless your
marketing strategy calls for you to engage in a bidding war -- and provides the budget for it -- it's a good idea
to diversify and bid on the largest possible number of well-researched, lower-cost keywords.
Let the wild goose chase begin!
The click-fraud detecting tests you can use include:
Test 1. Visit depth. How many pageviews did this particular user session generate? If it's just one, it's
a good reason to raise the red flag a notch or two - but not more. Keep in mind that there could be a variety of
reasons behind single-page visits. Perhaps your ad copy isn't clear and misleads visitors, or maybe the
network connection was too slow and the user decided not to wait for the other pages to load.
Test 2. Visitors per IP. Because of proxy servers and networks of users sharing one Internet connection,
there will always be unique visitors with the same IP address. It's normal. You just need to calculate the "normal"
for your website's unique mix of traffic sources. IP addresses whose visitor counts exceed the control group's figure
by a certain percentage are added to the blacklist and trigger a penalty.
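A rough sketch of the visitors-per-IP calculation follows. It assumes your analytics solution can export (IP address, visitor ID) pairs, uses the mean of the control group as the "normal", and treats the 50% tolerance as a placeholder. All function and parameter names are ours.

```python
from collections import defaultdict
from statistics import mean


def visitors_per_ip(sessions):
    """Map each IP address to the number of distinct visitor IDs seen from it."""
    seen = defaultdict(set)
    for ip, visitor_id in sessions:
        seen[ip].add(visitor_id)
    return {ip: len(ids) for ip, ids in seen.items()}


def blacklist_ips(sessions, control_sessions, tolerance=0.5):
    """Return IPs whose visitor count exceeds the control-group mean by more
    than `tolerance` (0.5 means 50%). The control group must not be empty."""
    baseline = mean(visitors_per_ip(control_sessions).values())
    limit = baseline * (1 + tolerance)
    return {ip for ip, count in visitors_per_ip(sessions).items() if count > limit}
```

The mean is only one possible definition of "normal". A median or a high percentile may suit a site whose traffic includes a few large corporate proxies.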
Test 2a. Paid clicks per IP. This works the same way as Test 2, except that it counts only user sessions that resulted
from clicking on one of your paid links. Typically, you will track these by the unique destination URLs used in
pay-per-click listings, such as yourwebsite.com/?source=google.
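As an illustration, paid sessions can be picked out by parsing the landing URL. The sketch below assumes the tracking parameter is literally named "source", as in the example URL above, and that your logs give you (IP address, landing URL) pairs.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def paid_source(landing_url):
    """Return the value of the source parameter, or None for untagged traffic."""
    values = parse_qs(urlparse(landing_url).query).get("source")
    return values[0] if values else None


def paid_clicks_per_ip(hits):
    """Count, per IP address, the sessions whose landing URL carries a source tag."""
    counts = Counter()
    for ip, landing_url in hits:
        if paid_source(landing_url) is not None:
            counts[ip] += 1
    return counts
```

The resulting counts can then be compared against a control group in the same way as the visitor counts in Test 2.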
Test 3. No cookie - no play? Many marketers will tell you that because most bots and scripts are not capable
of supporting the cookie mechanism, a user session without a cookie is a good cause for alarm. Others will say that
it can't be an accurate indicator because some privacy devotees do not accept cookies and thus look indistinguishable
from bots. So, penalize or not? We think you should.
Test 3a. Pageview frequency. Most bots travel through your site and request pages from the server much faster
than humans. If a particular user session has generated a few pageviews in a matter of seconds, it's a good enough
reason to penalize it. On the other hand, you have to be careful not to go overboard when defining your threshold.
Humans can surf through your site pretty fast too!
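One way to express the pageview frequency check is a sliding window over the session's pageview timestamps. In this sketch the timestamps are in seconds, and the defaults (three pageviews within five seconds) are placeholders you should tune against your own human traffic.

```python
def is_too_fast(timestamps, min_pageviews=3, window_seconds=5.0):
    """Return True if any `min_pageviews` consecutive pageviews fall
    within `window_seconds` of each other."""
    ordered = sorted(timestamps)
    for i in range(len(ordered) - min_pageviews + 1):
        # Time elapsed between the first and last pageview in this window.
        if ordered[i + min_pageviews - 1] - ordered[i] <= window_seconds:
            return True
    return False
```

A session with fewer pageviews than `min_pageviews` is never flagged by this check; the visit depth test already covers it.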
Test 4. Anonymous proxy servers. Click thieves know that the IP address is the primary means of identifying
the user session. Therefore they need to launch their attacks from many different IP addresses. The more, the
merrier. Fortunately, IP address spoofing is not a trivial task. For this reason, click inflators often channel
their activity through anonymous proxy servers. Your solution is to develop and maintain an up-to-date list of
anonymous proxy servers and penalize user sessions originating from them. Most legitimate visitors have no reason
to use anonymous proxies.
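Checking a session against such a list is straightforward with Python's standard ipaddress module. The entries below are reserved documentation ranges standing in for a real list, which you would have to compile and keep current yourself.

```python
import ipaddress

# Hypothetical proxy list: reserved documentation ranges, not real proxies.
KNOWN_PROXY_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # a whole range
    ipaddress.ip_network("198.51.100.17/32"),  # a single host
]


def is_known_proxy(ip_string, networks=KNOWN_PROXY_NETWORKS):
    """Return True if the address falls inside any listed network."""
    address = ipaddress.ip_address(ip_string)
    return any(address in network for network in networks)
```

Storing entries as networks rather than plain strings lets one list hold both single hosts and whole ranges.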
Test 5. Geographic origin. Now on to the politically incorrect part. You get to blacklist any country in the
world you'd like! Just think of the countries from which you never have and likely never will receive a viable lead.
Remember, you're not about to ban visitors in these countries from accessing your website. You're just going about your
regular business of assigning points.
Test 6 and beyond. Finesse and customize. You can devise your own triggers and assign points to them. For
example, if 98% of your business activity occurs during normal business hours, you may want to penalize visitor
sessions that originate at all other times. Or you may track visits from a set of suspicious IP addresses for a period
of time, and plot their activity vs. time of day. Does it follow your site's average activity patterns? It had better!
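If you prefer a number to a plot, the time-of-day comparison can be reduced to a single distance between two hourly profiles. The sketch below uses total variation distance, which is our choice of measure and only one of several reasonable options. It takes the hour of day (0-23) of each visit and assumes both lists are non-empty.

```python
from collections import Counter


def hourly_profile(hours):
    """Share of visits that fall in each hour of the day (0-23)."""
    counts = Counter(hours)
    total = len(hours)
    return [counts[h] / total for h in range(24)]


def profile_distance(suspect_hours, site_hours):
    """Total variation distance between two hourly profiles:
    0 means identical patterns, 1 means no overlap at all."""
    suspect = hourly_profile(suspect_hours)
    site = hourly_profile(site_hours)
    return 0.5 * sum(abs(a - b) for a, b in zip(suspect, site))
```

How large a distance deserves a penalty is something to settle by looking at how much your known-good traffic sources differ from one another.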
Now you need to sit down with your technical, design, sales, and marketing teams. The agenda for the meeting is to:
1) decide on which tests to use,
2) come up with the scoring system for the selected tests, and
3) pick the right threshold.
To test and adjust your selections, run through the possible actions of a dozen or so hypothetical real user
personas, and calculate their scores. They shouldn't trip the alarm.
Now do the same exercise using personas of click-inflating robots and humans. Visits made for the sole purpose of
depleting your PPC account should trip the wire every time.
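The persona exercise lends itself to a small script you can rerun every time the scoring changes. Everything in this sketch is hypothetical: the flags, the weights, the threshold, and the personas themselves.

```python
# Hypothetical weights and threshold, to be replaced by your team's choices.
WEIGHTS = {"single_page": 1, "no_cookie": 2, "fast_pageviews": 3, "anonymous_proxy": 4}
THRESHOLD = 5


def score(persona):
    """Sum the weights of every test the persona trips."""
    return sum(points for flag, points in WEIGHTS.items() if persona.get(flag))


def misclassified(personas, expect_fraud):
    """Names of personas whose verdict differs from the expected one."""
    return [name for name, flags in personas.items()
            if (score(flags) > THRESHOLD) != expect_fraud]


REAL_USERS = {
    "privacy-minded shopper": {"no_cookie": True},
    "bounced visitor": {"single_page": True},
    "fast skimmer": {"fast_pageviews": True},
}
INFLATORS = {
    "simple bot": {"single_page": True, "no_cookie": True, "fast_pageviews": True},
    "proxy clicker": {"single_page": True, "no_cookie": True, "anonymous_proxy": True},
}
```

Both `misclassified(REAL_USERS, expect_fraud=False)` and `misclassified(INFLATORS, expect_fraud=True)` should come back empty. Any name that appears in either list tells you which weight or threshold to revisit.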
Remember, to make sure your scoring system works precisely as intended, always compare your results against a
control group of unbiased traffic sources, such as Google's and other major engines' organic search results.
Click fraud is a contact sport with no rules. The Click Inflation Index is a defense system you can use to protect
yourself and fight back.
About The Author
Dmitri Eroshenko is CEO of Clicklab, the first customizable web analytics
service with built-in ROI tracking, score-based click fraud protection, and usability testing tools.