In this Journalism.co.uk series Brian Clifton, senior strategist with Omega Digital and former Google EMEA head of web analytics, looks at how publishers can make sense of online analytics.
Following 'Improving the web with analytics (part one)', which explains why online publishers need web analytics, part two explores the differences between off-site and on-site analytics.
How off-site web analytics works
Off-site web analytics tools measure your potential website audience. They are the macro tools that allow you to see the bigger picture of how your website compares to others.
Two techniques achieve this: using panel data or using internet service provider (ISP) data.
Companies such as comScore and Nielsen Netratings use the panel method, recruiting participants through their websites and by telephoning prospective panellists.
Their technique is to have monitoring software installed on users' computers to measure their web activity. Panel sizes vary, but range from tens to hundreds of thousands of participants, with the majority of these based in the US. For example, comScore reports 2 million participants worldwide with over 50 per cent of these based in the US.
Most panel participants are home users, as they are not restricted by IT policies when it comes to installing tracking software (public access is screened out of comScore data). As with election polling, panel data is extrapolated (multiplied up) to provide an estimate of the behaviour of the total web population.
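As a rough illustration of the "multiplying up" step, the sketch below scales a panel count to a population estimate. The function name and every figure are hypothetical and are not taken from any vendor's methodology; real panels also weight participants by demographics, which this ignores.

```python
def extrapolate_visitors(panel_visitors: int, panel_size: int, population: int) -> float:
    """Scale the share of panellists who visited a site up to the whole population."""
    if panel_size <= 0:
        raise ValueError("panel_size must be positive")
    share = panel_visitors / panel_size
    return share * population


# Hypothetical figures: 150 of 100,000 panellists visited the site,
# and the online population is assumed to be 40 million.
estimate = extrapolate_visitors(150, 100_000, 40_000_000)
print(f"Estimated visitors: {round(estimate):,}")
```

With these made-up inputs, each panellist stands in for 400 people, so any miscount within the panel is magnified by the same factor.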
An important advantage of panel data is that the analytics vendor knows who its panellists are. Demographic information such as age, gender and income bracket is available, though it is inferred in the extrapolated data set.
The caveat to this method is that the websites you wish to measure must have sufficient visitors to show up above the 'noise' threshold and mitigate sampling errors. Think of this in terms of having a high signal-to-noise ratio. The threshold will vary depending on where most of your visitors connect from, as the sample size of panellists varies from country to country.
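To give a feel for the noise threshold, here is a minimal sketch that estimates how uncertain a panel-based figure is, relative to the figure itself. It assumes a simple random sample and the normal approximation for a proportion, which real panels only approximate; the function name and all numbers are illustrative.

```python
import math


def relative_margin_of_error(panel_visitors: int, panel_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error, as a fraction of the estimate itself."""
    if panel_visitors <= 0 or panel_size <= 0:
        raise ValueError("panel_visitors and panel_size must be positive")
    p = panel_visitors / panel_size                       # share of the panel that visited
    standard_error = math.sqrt(p * (1 - p) / panel_size)  # binomial standard error
    return z * standard_error / p


# Hypothetical panel of 100,000 people and three sites of different sizes.
for visitors in (10, 100, 10_000):
    error = relative_margin_of_error(visitors, 100_000)
    print(f"{visitors:>6} panel visitors: roughly +/-{error:.0%} of the estimate")
```

Under these assumptions, a site seen by only a handful of panellists carries an uncertainty that is a large fraction of the estimate, while a site seen by thousands is measured far more tightly.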
Alternatively, companies such as Hitwise (now part of Experian) collect off-site visitor information by aggregating anonymous data provided by ISPs.
This has the potential to offer much larger sample sizes than panels (Hitwise reports 25 million people worldwide, 40 per cent based in the US) and therefore a lesser degree of extrapolation is required, potentially resulting in greater accuracy.
Because this type of off-site tracking happens at the ISP/network level, all visitor types are represented, including home, work, mobile, educational and public access. The trade-off is that this data is anonymous, so demographic data is not available.
How on-site web analytics works
On-site web analytics tools measure the actual visitor traffic arriving on your website. They can track the engagements and interactions your visitors have: for example, whether they convert to a customer or lead, how they got to that point, or where they dropped out of the process altogether.
Although there are several techniques to measure visitors on your site, the method used by the vast majority of vendors is the so-called 'page tagging' technique.
This requires placing a small snippet (aka 'tag') of JavaScript code on your webpages that acts as a beacon – capturing visitor information in the browser, storing it in cookies, then broadcasting it to a data collection server in real time.
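The beacon idea can be sketched as follows. Real page tags are JavaScript running in the visitor's browser; this Python version only mimics the flow for illustration. The collector URL, the cookie name and the parameter names (`vid`, `page`, `ref`) are all hypothetical and do not correspond to any particular vendor's tag.

```python
import uuid
from urllib.parse import parse_qs, urlencode, urlsplit

COLLECTOR_URL = "https://collect.example.com/hit"  # hypothetical endpoint


def get_or_create_visitor_id(cookies: dict) -> str:
    """Reuse the visitor ID stored in the cookie, or create one on a first visit."""
    if "visitor_id" not in cookies:
        cookies["visitor_id"] = uuid.uuid4().hex
    return cookies["visitor_id"]


def build_beacon_url(cookies: dict, page: str, referrer: str) -> str:
    """Encode one pageview as query parameters on a request to the collector."""
    params = {
        "vid": get_or_create_visitor_id(cookies),
        "page": page,
        "ref": referrer,
    }
    return f"{COLLECTOR_URL}?{urlencode(params)}"


def parse_beacon(url: str) -> dict:
    """What a collection server could recover from the request."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


browser_cookies = {}  # stands in for the browser's cookie store
first = build_beacon_url(browser_cookies, "/news/story-1", "https://www.google.com/")
second = build_beacon_url(browser_cookies, "/news/story-2", "/news/story-1")

# Both hits carry the same visitor ID because the cookie persisted between pages.
print(parse_beacon(first))
print(parse_beacon(second))
```

If the visitor blocks or deletes the cookie, `browser_cookies` starts empty again and the next hit is counted as a new visitor, which is one reason on-site visitor counts are not exact.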
A key difference between on-site and off-site web analytics tools is that on-site visitor data is only available to the website owner and the people he/she grants access to, such as a third-party marketing agency.
Conversely, off-site web analytics data can be obtained for any website – including your competitors and partners – provided there is sufficient visit data.
Discrepancies – what's accurate?
As you can see, the methodologies of these techniques differ significantly, and this leads to very different results. Even basic website numbers for the same metric, such as the number of visitors a website receives or the total number of pageviews, can vary dramatically. This is a constant and exasperating problem for site owners, media buyers and marketers alike.
So which method produces the more accurate data?
The truth is that all web analytics solutions have their limitations, as shown below:
Off-site analytics
Advantages:
Demographic information available
Can track competitors and related sites, e.g. visitors first went to competitor A, then your site, then on to site B
No website required - can track trends irrespective of a web presence
Disadvantages:
Inferred data - not real visitors
Small sample sizes limit accuracy - requires a significant level of traffic to your website to be viable
Extrapolation errors – analogous to polling
US-centric data
Expensive
On-site analytics
Advantages:
Real visitors measured
Tracks engagements and conversions
Available for any website, regardless of size
Inexpensive tools available
Disadvantages:
No demographics available (unless you ask for it)
Cannot track competitors/related sites
Visitors can block*, lose and delete cookies
* The blocking of first-party cookies by visitors is considered to be very low, of the order of 3-5 per cent
As long as your interest is in trends (for example, 'our site experienced a 10 per cent increase in visits week-on-week') and you use the same measurement tool throughout, those trends will be accurate.
However, problems arise when you start to compare different tools in order to understand the underlying absolute numbers, as these will vary widely. The key is to use the right tool for the job.
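A small worked example shows why trends survive a tool's blind spots while absolute numbers do not. It assumes, purely for illustration, that a tool misses a constant share of visits every week; the capture rate and visit counts are invented.

```python
def week_on_week_change(last_week: float, this_week: float) -> float:
    """Relative change in visits from one week to the next."""
    return (this_week - last_week) / last_week


true_visits = (20_000, 22_000)  # hypothetical actual visits in two consecutive weeks
capture_rate = 0.85             # hypothetical: the tool records 85% of visits each week
measured_visits = tuple(v * capture_rate for v in true_visits)

true_trend = week_on_week_change(*true_visits)
measured_trend = week_on_week_change(*measured_visits)

print(f"Actual:   {true_visits[1]:,.0f} visits this week, trend {true_trend:+.1%}")
print(f"Measured: {measured_visits[1]:,.0f} visits this week, trend {measured_trend:+.1%}")
```

The constant undercount cancels out of the percentage change, so both lines report the same trend even though the visit totals differ. The cancellation only holds while the same tool, with the same bias, is used for both weeks.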