Blending data between UA, GA4 API, and GA4 BigQuery

Ameet Wadhwani · GA4

As the transition to GA4 ramps up, many are wondering if they can blend UA data together with GA4 data, or if they can backfill GA4 BigQuery using the GA4 Data API. This article reviews the ways that you can blend Google Analytics data together and how to create both continuous and comparison reports between Google Analytics data sources.

Blending data between UA and GA4 using the Analytics Reporting API and the GA4 Data API

There's a good reason why Google didn't backfill GA4 with data from UA: the two data models are different at their core. UA is session-based, while GA4 is event-based.

Because the models are fundamentally different, you've had to create an entirely new property with new and separate tags. If you're like most sites, you ran the two in parallel for quite a while and will want to compare results between them as you migrate from UA to GA4.

While the two models are not the same, there are still some high-level metrics that can be compared or blended for continuous reporting.

a canvas blending data from UA and GA4

There are some metrics that are comparable, as outlined in the Analytics support docs here. Those metrics are: 

  • Users 
  • Pageviews
  • Purchases
  • Sessions
  • Session/Traffic-based Acquisition metrics
  • Conversions
  • Bounce rate 
  • Event Count

Still, there are caveats when blending them together. For example, the "Users" metric in GA4 generally refers to Active Users, while in UA it refers to Total Users.

🛑 Your data will not line up perfectly! Expect to see fewer users and sessions in GA4.   

 Other things to be aware of: 

  • Your UA reports may be excluding data based on view filters.
  • GA4 tends to handle bot and spam traffic better than UA (at least in our experience). This can explain some spikes seen in UA that are absent in GA4.
  • In the GA4 API, Users and Sessions are estimates, whereas in the BigQuery export you will typically count distinct session IDs (ga_session_id) or user pseudo IDs (user_pseudo_id).
  • GA4 will be missing data for users who decline consent in your consent banner.
  • There are many other reasons why the data within GA4 will not match. 

Preparing the data for Comparison Reports

With all the caveats above taken into account, we can now create queries that help us determine whether tracking is consistent between UA and GA4, and that show a continuous view of the transition from UA to GA4.

The canvas shown above is a visual representation of the steps involved. It creates a continuous view where both queries have the same (or very similar) dimensions. The UA query is filtered to include data only up to the end of 2022. The GA4 query starts in 2023.

Both queries are connected to calculation blocks, where the field names are standardized so that they can be combined into a single table. They are then brought together (blended) in a Union block. Finally, the blended table is published to Looker Studio for visualization.
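Outside of Canvas, the same standardize-then-union idea can be sketched in a few lines of Python. This is only an illustration: the field maps, the cutover date, and the function names are assumptions made for the example, and the rows are taken to be already-fetched API results with string values. It maps GA4's totalUsers (rather than activeUsers) to "users" so that the definition sits closer to UA's.

```python
from datetime import date

# Source field name -> standardized name (assumed maps for this sketch).
UA_FIELDS = {"ga:date": "date", "ga:users": "users",
             "ga:sessions": "sessions", "ga:pageviews": "pageviews"}
GA4_FIELDS = {"date": "date", "totalUsers": "users",
              "sessions": "sessions", "screenPageViews": "pageviews"}

# Assumed cutover: UA up to the end of 2022, GA4 from 2023 onwards.
CUTOVER = date(2023, 1, 1)


def parse_yyyymmdd(value):
    """Both APIs return dates as YYYYMMDD strings."""
    return date(int(value[:4]), int(value[4:6]), int(value[6:8]))


def standardize(rows, field_map, source):
    """Rename fields, convert types, and tag each row with its source."""
    out = []
    for row in rows:
        clean = {new: row[old] for old, new in field_map.items()}
        clean["date"] = parse_yyyymmdd(clean["date"])
        for metric in ("users", "sessions", "pageviews"):
            clean[metric] = int(clean[metric])
        clean["source"] = source
        out.append(clean)
    return out


def union_continuous(ua_rows, ga4_rows, cutover=CUTOVER):
    """Keep UA before the cutover and GA4 from the cutover, then union."""
    ua = [r for r in standardize(ua_rows, UA_FIELDS, "UA")
          if r["date"] < cutover]
    ga4 = [r for r in standardize(ga4_rows, GA4_FIELDS, "GA4")
           if r["date"] >= cutover]
    return sorted(ua + ga4, key=lambda r: r["date"])
```

Keeping a "source" column on every row makes it easy to mark the cutover point in the final chart.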

🔗  To prepare the data using Analytics Canvas, follow this article on our knowledge-base.

Creating Comparison Reports

By making similar queries against both APIs, we can create views that compare the two data sources to each other to ensure they're similar in direction and scale.

We can also create a continuous report, where we're reporting with UA up until a given date, then switch to GA4 from that date onwards. 
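As a rough illustration of a comparison view, the sketch below lines up one metric from each source by date and reports the relative difference. The function name, the input shape (a dict of date to value per source), and the 15% tolerance are all assumptions chosen for the example, not recommended thresholds.

```python
def compare_sources(ua_by_date, ga4_by_date, tolerance=0.15):
    """Compare one metric across UA and GA4 for the dates both sources cover.

    Each argument maps a date string to a metric value. The tolerance is an
    arbitrary example threshold for flagging days that drift too far apart.
    """
    report = []
    for day in sorted(set(ua_by_date) & set(ga4_by_date)):
        ua_value = ua_by_date[day]
        ga4_value = ga4_by_date[day]
        # Relative difference of GA4 against UA; undefined when UA is zero.
        pct_diff = None if ua_value == 0 else (ga4_value - ua_value) / ua_value
        report.append({
            "date": day,
            "ua": ua_value,
            "ga4": ga4_value,
            "pct_diff": pct_diff,
            "within_tolerance": pct_diff is not None and abs(pct_diff) <= tolerance,
        })
    return report
```

Days that fall outside the tolerance are the ones worth checking against the caveats listed earlier, such as view filters, bot traffic, or consent.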

Image

Blending data between the GA4 Data API and GA4 BigQuery Export

BigQuery linking is available to all accounts, not just Analytics 360.  The costs are very reasonable, especially when using a tool like Analytics Canvas that implements a number of cost controls.

Analysts often find that in order to answer specific questions, the only option is to use the rich detail that is in the raw BigQuery export for GA4. Others will need to switch to the BigQuery export because of API limits that make the Data API unusable, such as the high occurrence of "(Other)" rows in the data.

Unlike UA, GA4 does not offer any history when activating the BigQuery export and there is no way to 'backfill' it directly from the API. 

Furthermore, the Data API does not export data from the raw BigQuery tables.  It is a different underlying data source with a different schema.  

With the Analytics Data API, you do not use SQL. Instead, you run reports that can request a maximum of 9 dimensions and 10 metrics per query and are subject to a quota system. The result is a JSON response that is not directly compatible with the GA4 BigQuery export.
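To make the shape of a Data API report concrete, here is a minimal sketch that builds a runReport request body, enforces the dimension and metric limits mentioned above, and flattens the JSON response into plain rows. The helper names are invented for this example, and sending the request (authentication, HTTP, quota handling) is deliberately left out.

```python
MAX_DIMENSIONS = 9   # per-request limits described in the text
MAX_METRICS = 10


def build_run_report_body(dimensions, metrics, start_date, end_date):
    """Build the JSON body for a runReport call, checking the limits."""
    if len(dimensions) > MAX_DIMENSIONS:
        raise ValueError(f"at most {MAX_DIMENSIONS} dimensions per request")
    if not metrics or len(metrics) > MAX_METRICS:
        raise ValueError(f"between 1 and {MAX_METRICS} metrics per request")
    return {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": name} for name in dimensions],
        "metrics": [{"name": name} for name in metrics],
    }


def rows_from_response(response):
    """Flatten a runReport JSON response into a list of dicts."""
    dim_names = [h["name"] for h in response.get("dimensionHeaders", [])]
    metric_headers = response.get("metricHeaders", [])
    rows = []
    for row in response.get("rows", []):
        flat = {}
        for name, cell in zip(dim_names, row.get("dimensionValues", [])):
            flat[name] = cell["value"]
        for header, cell in zip(metric_headers, row.get("metricValues", [])):
            value = cell["value"]  # values arrive as strings
            if header.get("type") == "TYPE_INTEGER":
                value = int(value)
            flat[header["name"]] = value
        rows.append(flat)
    return rows
```

The flattened rows are already aggregated by the requested dimensions, which is why they cannot be loaded straight into the event-level BigQuery tables.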

With the BigQuery export, the data is structured as 1 row per event, with a series of nested columns containing dozens of dimensions and metrics. The two results don't align, so unless you activated the BigQuery link at the time you set up the GA4 property, the transition won't be seamless.
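For contrast, the sketch below takes event-level rows shaped like the BigQuery export (one dict per event, with event_params as a nested list of key/value records) and rolls them up into a daily summary. It assumes the rows have already been fetched into Python dicts, and the function names are made up for the example. In practice this aggregation would more likely be done in SQL inside BigQuery.

```python
from collections import defaultdict


def get_int_param(event, key):
    """Return the int_value of a nested event parameter, or None if absent."""
    for param in event.get("event_params", []):
        if param["key"] == key:
            return param["value"].get("int_value")
    return None


def summarize_events(events):
    """Aggregate one-row-per-event data into one row per day."""
    users = defaultdict(set)
    sessions = defaultdict(set)
    pageviews = defaultdict(int)
    for event in events:
        day = event["event_date"]  # YYYYMMDD string in the export
        user = event["user_pseudo_id"]
        users[day].add(user)
        session_id = get_int_param(event, "ga_session_id")
        if session_id is not None:
            # A session is identified by user plus session ID.
            sessions[day].add((user, session_id))
        if event["event_name"] == "page_view":
            pageviews[day] += 1
    return [
        {"date": day, "users": len(users[day]),
         "sessions": len(sessions[day]), "pageviews": pageviews[day]}
        for day in sorted(users)
    ]
```

Note that this counts every distinct user_pseudo_id, which is an exact count and closer to Total Users than to the Active Users estimate reported elsewhere in GA4, so small differences against the API are expected.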

You will need to create summary tables using the same dimensions and metrics in both data sources, then blend those together into a report table.  As such, you will still need the API until you get sufficient history in BigQuery. 

Image
But don't throw away your API access just yet; the BigQuery export does not contain all of the data that's available in the API.  Check out this article for more details on the differences between the GA4 data sources. 

Blending between UA, GA4 Data API, and GA4 BigQuery Export

In some cases, you'll be going from UA to the GA4 API to the BigQuery export in order to complete your transitional reports. This often happens when the property quickly hits the limits of what the GA4 API can offer.

If the BigQuery export wasn't already enabled, you'll have a gap between the start date of your GA4 property, for which there will be data in the API, and the day your BigQuery link was established.

Image
To create the view above, we need three separate queries: one to UA, one to the GA4 API, and one to the GA4 BigQuery export. The workflow looks something like this:
Image
If instead we wanted to have a continuous dataset, we'd modify the workflow slightly to insert filters on the first two data sources. Something like this: 
Image
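The filters in that workflow amount to giving each source its own date window so that every day comes from exactly one place. A minimal sketch of the idea follows, assuming the three inputs are already summarized to the same columns with ISO date strings (YYYY-MM-DD). The function name and the two boundary parameters are hypothetical.

```python
def stitch_sources(ua_rows, api_rows, bq_rows, ga4_start, bq_start):
    """Build one continuous dataset from three pre-summarized sources.

    ga4_start: first day to report from GA4 instead of UA.
    bq_start:  first day with data in the BigQuery export.
    ISO date strings compare correctly as plain strings.
    """
    if ga4_start > bq_start:
        raise ValueError("ga4_start must not be later than bq_start")
    stitched = []
    # UA covers everything before the GA4 cutover.
    stitched += [dict(r, source="UA") for r in ua_rows
                 if r["date"] < ga4_start]
    # The GA4 API fills the gap until the BigQuery link was established.
    stitched += [dict(r, source="GA4 API") for r in api_rows
                 if ga4_start <= r["date"] < bq_start]
    # The BigQuery export takes over from the link date onwards.
    stitched += [dict(r, source="GA4 BigQuery") for r in bq_rows
                 if r["date"] >= bq_start]
    return sorted(stitched, key=lambda r: r["date"])
```

Because the windows do not overlap, no date can appear twice in the result, even when the sources themselves overlap in time.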

Blending between Universal Analytics, the GA4 Data API, and the GA4 BigQuery Export with Analytics Canvas

As Google Analytics properties transition from Universal Analytics to Google Analytics 4, analysts will want to blend the two data sources together for comparison or continuous reporting.

🔗  Step-by-step instructions on blending UA and GA4 data are in this article on our knowledge-base.

There's no application better suited to blend historic data from Universal Analytics with new data from Google Analytics 4, whether it's from the API or the BigQuery export.

If you're looking to blend these sources together, all of the features described above are available in the free trial. Sign up today to start creating your reports!

Next Steps

Whenever you’re ready… here are 3 ways Canvas can help you with your GA4 reporting challenges:

  1. Extract data from all your properties using the API or BigQuery without writing code
  2. Profile, analyse, and prepare data for reporting  
  3. Maintain your GA4 data warehouse within Analytics Canvas Online or your own DB

Ready for the next step?

  • Start an instant 30-day risk-free trial. No credit card or sales call required.
  • Schedule a demo for you and your team.
  • Contact us to discuss plans and pricing or activate your subscription.

Wondering if Canvas is right for you? Check out the related articles to learn more about our GA4 connectors.