Google Analytics is a fantastic web analytics tool, and it is improving constantly. One result of this evolution, however, is that the way data is collected and stored changes, and that can change your metrics.
Usually these step changes are subtle, but in certain cases, depending on your traffic and situation, they could be earth shifting. While these changes can be frustrating, there is no simple solution, and they are often the price to be paid for continuous improvement and better accuracy.
The next time you see a step change in your data, it doesn’t hurt to consider the possibility that your traffic hasn’t changed, but the way that Google Analytics is reporting it has.
Here are some recent examples:
History is the old definition, the future is the new
Example: Session definition
Before Aug 11 2011
A session ended when:
- More than 30 minutes have elapsed between pageviews for a single visitor.
- At the end of a day.
- When a visitor closes their browser.
After Aug 11 2011
A session ends when:
- More than 30 minutes have elapsed between pageviews for a single visitor.
- At the end of a day.
- When any traffic source value for the user changes. Traffic source information includes: utm_source, utm_medium, utm_term, utm_content, utm_id, utm_campaign, and gclid.
What it means to you and your analysis
You are going to see more visits in cases where visitors leave and return to your site within a short period of time through a different traffic source. Because there are more visits, many other metrics are affected too: bounce rate, pageviews per visit, and so on.
The session change and its impact are explained in great detail by Avinash Kaushik in a long but very descriptive video that is well worth watching.
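To make the difference concrete, here is a minimal sketch that counts sessions for a single visitor under both definitions. It is an illustration only, not Google's implementation: the function name, the hit format, and the source labels are made up for this example, the "closes their browser" rule of the old definition is ignored (we assume the browser stays open), and a change in traffic source is reduced to a simple string comparison.

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)


def count_sessions(hits, split_on_source_change):
    """Count sessions for one visitor.

    hits: list of (timestamp, traffic_source) tuples, sorted by time.
    split_on_source_change: False approximates the old definition,
    True approximates the definition in effect after Aug 11 2011.
    """
    sessions = 0
    prev_time = None
    prev_source = None
    for when, source in hits:
        new_session = (
            prev_time is None                      # first hit ever
            or when - prev_time > TIMEOUT          # 30 minute timeout
            or when.date() != prev_time.date()     # end of day
            or (split_on_source_change and source != prev_source)
        )
        if new_session:
            sessions += 1
        prev_time, prev_source = when, source
    return sessions


# Hypothetical visitor: arrives from search, then returns a few
# minutes later by clicking a newsletter link.
hits = [
    (datetime(2011, 9, 1, 10, 0), "google / organic"),
    (datetime(2011, 9, 1, 10, 5), "google / organic"),
    (datetime(2011, 9, 1, 10, 10), "newsletter / email"),
    (datetime(2011, 9, 1, 10, 12), "newsletter / email"),
]

old_count = count_sessions(hits, split_on_source_change=False)
new_count = count_sessions(hits, split_on_source_change=True)
print(old_count, new_count)
```

The same four pageviews count as one visit under the old rules and two under the new ones, which is why per-visit metrics shift even when visitor behaviour has not changed.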
Example: Image search
Before July 22 2011
Image search from images.google.com was reported as referral traffic with a referral path of “imgres”.
After July 22 2011
Image search is included in search traffic (which makes sense).
What it means to you and your analysis
If you have significant search from images.google.com, and if that traffic is different than your existing search traffic, all your metrics could have a step change for the search category.
These changes were covered here by searchenginewatch.com
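If you need search numbers that are comparable across the change date, one option is to re-label the old image search referrals as search before you total things up. The sketch below assumes you have already exported rows with a date, a traffic type, a source, a referral path, and a visit count; the field names, the "search" and "referral" labels, and the sample numbers are all hypothetical and will differ from what your own export contains.

```python
from collections import Counter
from datetime import date

CHANGE_DATE = date(2011, 7, 22)


def comparable_type(row):
    """Return the traffic type as the post-change definition would report it."""
    was_image_referral = (
        row["date"] < CHANGE_DATE
        and row["traffic_type"] == "referral"
        and row["source"] == "images.google.com"
        and "imgres" in row["referral_path"]
    )
    return "search" if was_image_referral else row["traffic_type"]


# Made-up sample rows, for illustration only.
rows = [
    {"date": date(2011, 7, 1), "traffic_type": "referral",
     "source": "images.google.com", "referral_path": "/imgres", "visits": 120},
    {"date": date(2011, 7, 1), "traffic_type": "search",
     "source": "google", "referral_path": "", "visits": 900},
    {"date": date(2011, 7, 1), "traffic_type": "referral",
     "source": "example.com", "referral_path": "/links", "visits": 40},
    {"date": date(2011, 8, 1), "traffic_type": "search",
     "source": "google", "referral_path": "", "visits": 1030},
]

# Total visits per (date, traffic type) using the post-change definition.
totals = Counter()
for row in rows:
    totals[(row["date"], comparable_type(row))] += row["visits"]

for key in sorted(totals):
    print(key, totals[key])
```

Rows dated on or after the change are left alone, so only the historical side of the comparison is adjusted.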
Changing categories within a dimension
Example: Changes to how city is recorded
Before Feb 24 2011
All traffic within a certain rectangle was defined as “London” in Google Analytics.
After Feb 24 2011
A number of smaller, more detailed areas within the original rectangle were given their own city name.
What it means to you and your analysis
You’d suddenly get far fewer visits to London. SEOOptimise.com noticed this and wrote a good blog post on it.
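For a like-for-like comparison with the older data, you could roll the newer, smaller areas back up into London. This is only a sketch: the area names below are placeholders, not the actual list of cities Google Analytics introduced, and you would need to build the real mapping from your own reports.

```python
from collections import Counter


def roll_up_cities(visits_by_city, sub_areas, parent="London"):
    """Fold visits from the listed sub-areas back into the parent city."""
    totals = Counter()
    for city, visits in visits_by_city.items():
        totals[parent if city in sub_areas else city] += visits
    return dict(totals)


# Placeholder names and made-up numbers, for illustration only.
after_change = {
    "London": 400,
    "Example Area A": 250,
    "Example Area B": 150,
    "Manchester": 90,
}
sub_areas = {"Example Area A", "Example Area B"}

comparable = roll_up_cities(after_change, sub_areas)
print(comparable)
```

With the sub-areas folded back in, the London total can be compared against periods before Feb 24 2011.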
Discontinuing metrics, including removal of history
Example: Connection Speed
Before Feb 24 2011
Connection speed was recorded for each visit.
After Feb 24 2011
Connection speed is being eliminated. New functionality around page load time has been added; it requires additional code but provides much more meaningful information.
What it means to you and your analysis
After the deprecation date, no data is collected (the interface puts all visits under “unknown”). This metric does not exist in the new GA interface, and any attempt to query it through the API returns an error, even for historical periods. If you need this data for any reason, export it from the old GA web interface now, or it will soon be unavailable.
Conclusion? Be aware of step changes.
The bottom line is that in many cases the changes are for the better. Historical analysis is important, but the future is the key, and looking too much in the rearview mirror isn’t good analysis. The benefits of better, more accurate metrics are worth the temporary challenges in comparing recent historical data.
If you do need to compare apples to apples, for annual or quarterly reporting for example, you can use the Google Analytics API to pull the more detailed data out and adjust for any troublesome changes based on the date of the change.
And in terms of eliminated data, you need to be aware of the upcoming changes, and if needed pull the data out of Google Analytics while it is still available, and store it in your own database.
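As a rough idea of what storing the data yourself can look like, the sketch below loads a CSV export into a local SQLite database using only the Python standard library. The column names, table name, and sample values are assumptions made for this example and will not match a real Google Analytics export without adjustment.

```python
import csv
import io
import sqlite3


def archive_rows(conn, csv_text):
    """Store exported rows in SQLite; re-running replaces matching rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS connection_speed ("
        "day TEXT, speed TEXT, visits INTEGER, "
        "PRIMARY KEY (day, speed))"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["day"], r["speed"], int(r["visits"])) for r in reader]
    conn.executemany(
        "INSERT OR REPLACE INTO connection_speed VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Made-up export contents, for illustration only.
export = """day,speed,visits
2011-02-01,Cable,320
2011-02-01,DSL,210
2011-02-02,Cable,305
"""

conn = sqlite3.connect(":memory:")  # use a file path to keep the data
stored = archive_rows(conn, export)
total_visits = conn.execute(
    "SELECT SUM(visits) FROM connection_speed"
).fetchone()[0]
print(stored, total_visits)
```

The primary key on day and speed means the same export can be loaded twice without duplicating rows.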
While we’re obviously biased, we think Analytics Canvas is the best tool for this kind of work. It connects easily and quickly to the Google Analytics API and lets you do the analysis you need, as well as export the data directly into databases and files as needed.
So next time you see a sudden step change, maybe something has changed in the real world, and you are seeing visitor behaviour… or maybe the definition of what you are analyzing has changed.
For notifications from Google about the Google Analytics API, this group has been set up:
http://groups.google.com/group/google-analytics-api-notify?pli=1
And of course another source of information is the Google Analytics blog.