
How to calculate the storage cost of your Universal Analytics backup in BigQuery

This article discusses how to calculate the storage cost of your UA Backup in BigQuery, and provides some estimates for those who haven't run a backup yet.  

While this article relates to the Universal Analytics Backup Utility offered by Analytics Canvas, the calculations provided below apply to any BigQuery dataset.

Estimating the storage size of your backup

There is no 'one-click' backup of Universal Analytics data, and Google has not published a standard set of tables from which to build an estimate. Analytics Canvas, however, has!

Try the UA Backup utility to generate an estimate based on the included tables or your own custom backup requirements. Canvas will estimate the total number of rows, which you can use to estimate the total cost.

>> How to estimate the cost of your UA backup.
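For a back-of-the-envelope conversion from a row estimate to a cost, you can use the ratio from the sample backup below (9M rows ≈ 1.28 GB, roughly 150 bytes per row) together with BigQuery's active-storage rate of about $0.02/GB/month and its 10 GB monthly free tier. Both the ratio and the rates are assumptions; verify them against your own data and current pricing:

-- Ballpark sketch only: assumes ~150 bytes/row (the ratio observed in the
-- sample backup below) and the $0.02/GB/month active-storage rate with a
-- 10 GB/month free tier. Replace 9000000 with your own row estimate.
SELECT
  9000000 * 150 / POW(1024, 3) AS estimated_gb,   -- ≈ 1.26 GB
  GREATEST(9000000 * 150 / POW(1024, 3) - 10, 0) * 0.02 AS est_usd_per_month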

Calculating the storage cost of your UA Backup

Once you have run your backup, or a FREE PREVIEW of your backup, you can begin to estimate the long-term storage costs.

Run the following SQL either in a BigQuery SQL block using Analytics Canvas, or in the BigQuery Console:


SELECT 
  COUNT(*) AS TotalNumberOfTables,
  SUM(row_count) AS TotalNumberOfRows,
  SUM(size_bytes)/POW(1024, 3) AS TotalSizeInGB
FROM 
  `YOUR_PROJECT.YOUR_DATASET.__TABLES__`

Running this in BigQuery, we see that a 9-million-row backup is approximately 1.28 GB for the sample website. Results from your website may differ based on the data you are collecting.


Using the BigQuery cost estimator, the cost was $0. We had to increase the size tenfold to get the cost to register: at 90M rows and 12.8 GB of data, the cost is a mere $0.02/month.
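If you prefer to do that arithmetic in SQL rather than in the estimator, a variation of the query above produces the same ballpark figure, again assuming the ~$0.02/GB/month active-storage rate and the 10 GB monthly free tier:

SELECT
  SUM(size_bytes) / POW(1024, 3) AS total_gb,
  -- assumes ~$0.02/GB/month active storage and a 10 GB free tier; verify current pricing
  GREATEST(SUM(size_bytes) / POW(1024, 3) - 10, 0) * 0.02 AS est_storage_usd_per_month
FROM
  `YOUR_PROJECT.YOUR_DATASET.__TABLES__`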


Storage costs are negligible. Usage costs can add up depending on how heavily the report is queried; however, with a backup of this size, they should remain small.

Additional queries for row counts and table size

Note that to run these queries, the account running them needs the "BigQuery Metadata Viewer" role on the project in Google Cloud IAM.

These queries show additional details about your tables. They assume that there is only one backup in the dataset, or that you want the total for all backups within the dataset (they filter by dataset, not by individual backup).

Total rows by table within a dataset:

SELECT table_name, total_rows
FROM `project.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
WHERE table_schema = 'dataset';

Total rows in a UA backup:

SELECT SUM(total_rows) AS total_rows
FROM `project.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
WHERE table_schema = 'dataset';

Total bytes in a UA backup dataset:

SELECT SUM(total_logical_bytes) / 1e9 AS GB
FROM `project.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
WHERE table_schema = 'dataset';
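If you want row counts and size per table in a single result, the queries above can be combined (same placeholders for project, region, and dataset):

-- Per-table breakdown of rows and size, largest tables first
SELECT
  table_name,
  total_rows,
  total_logical_bytes / 1e9 AS GB
FROM `project.region-us.INFORMATION_SCHEMA.TABLE_STORAGE`
WHERE table_schema = 'dataset'
ORDER BY total_logical_bytes DESC;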

Calculating the usage cost of your UA Backup

To calculate the usage costs of your report, you must first refresh all pages of the report, or go to File > Download As > PDF and download the report. This will perform all queries required to update the full report once. You can then run the following SQL to get a sense of the query cost.


SELECT
  job_id,
  job_type,
  user_email,
  ARRAY_TO_STRING(REGEXP_EXTRACT_ALL(FORMAT("%t", labels), r'\"(.+?)\"'), ", ") AS job_labels,
  query,
  state,
  creation_time,
  end_time,
  total_bytes_processed,
  total_bytes_processed / POW(1024, 3) AS total_gb_processed
FROM
  `region-US.INFORMATION_SCHEMA.JOBS_BY_PROJECT`
WHERE
  project_id = 'YOUR_PROJECT'
  AND REGEXP_CONTAINS(query, r'YOUR_DATASET_NAME')
ORDER BY
  creation_time DESC
⚠️ This is a best estimate based on a single refresh. To get an accurate sense of costs, interact with the report and run the query again.

The following query will give you a summary of your usage for a specific dataset in a given time period.  You can use this to estimate the cost of using your UA Backup Report in Looker Studio.


SELECT
  SUM(total_bytes_processed) AS total_bytes_processed,
  SUM(total_bytes_processed) / POW(1024, 3) AS total_gb_processed
FROM
  `region-US.INFORMATION_SCHEMA.JOBS_BY_PROJECT`
WHERE
  project_id = 'your_project_id'
  AND REGEXP_CONTAINS(query, r'your_dataset_name[.]')
  AND creation_time BETWEEN TIMESTAMP('start_date_time') AND TIMESTAMP('end_date_time')

Enter the total bytes processed into the BigQuery cost calculator to determine the usage costs. On-demand queries cost approximately $6.25/TB of data processed, and the first TB each month is free. It would take a LOT of usage on these reports to reach $6.25.
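As a rough sketch, you can fold that arithmetic into the summary query itself, assuming the ~$6.25/TB on-demand rate and the 1 TB monthly free tier (check current pricing for your region):

-- Rough sketch only: assumes on-demand pricing of ~$6.25/TB processed
-- and that the first 1 TB each month is free. Placeholders as above.
SELECT
  SUM(total_bytes_processed) / POW(1024, 4) AS total_tb_processed,
  GREATEST(SUM(total_bytes_processed) / POW(1024, 4) - 1, 0) * 6.25 AS est_query_cost_usd
FROM
  `region-US.INFORMATION_SCHEMA.JOBS_BY_PROJECT`
WHERE
  project_id = 'your_project_id'
  AND REGEXP_CONTAINS(query, r'your_dataset_name[.]')
  AND creation_time BETWEEN TIMESTAMP('start_date_time') AND TIMESTAMP('end_date_time')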

If you have any questions about these cost calculations, please contact support@analyticscanvas.com
