How To Export To BigQuery
⚠️ You must first provide Write access to the Google Cloud Service Account for your Analytics Canvas subscription. Review and follow this article from our knowledge base.
1. Drag and drop the BigQuery block from the Export Data section. You will see that it only has one node to accept inputs. Link the previous block, the one where you’re sending the information from, to the “BigQuery” block.
2. Click on the output node of the previous block to have a look at the incoming data. You can see the same data by clicking on the input node of the “BigQuery” block once the link is made.
3. Select the BigQuery block by clicking on the centre of it. Click 'Select Table' to select an existing table, or to create a new table within an existing data set.
After clicking 'Select Table', use the tree view on the left to navigate through the datasets from your BigQuery account and select the one where you want to add the new data. Note that only the projects that have authorized the Analytics Canvas Service Account for your subscription will be shown here.
After you make the selection, either create a new table by entering a name and clicking 'Create', or overwrite an existing table by clicking on “Select” next to its name.
When defining a table for the first time, you will be provided with an option to validate the table but not write to it. or to create the table. Make a selection or click Cancel to select or define a different table.
4. After configuring the table and creating it if it is new, you can change the output options by clicking "Edit Write Options"
Data can be overwritten on each run (Overwrite), appended in full to an existing table (append), or incrementally loaded based on a date field (delete data in date range before append).
The incremental load option is the most efficient. It first removes data from the table based on a selection, generally the last 5 days when loading web analytics data, then filters the incoming data to fill the gap and add the most recent day.
This accounts for data that is reprocessed over a given period of time. For example, Google Analytics reprocesses data over a 3 day window. There is no need to ever reload data beyond that point as it will never change.
By overwriting the data each time, you are incurring the maximum number of bytes processed for the operation. By incrementally loading it and giving a day or two of buffer, you are minimizing the number of bytes processed for the operation.
First select the date column, then select "delete date range then append" to perform an incremental load.
5. In the same control, you will notice that by default, the table will be created as a Date Partitioned table in BigQuery, with Day as the level of granularity. You can change this to monthly or yearly, but note that the maximum number of partitions is 4,000 - keep that in mind when choosing the granularity as this cannot be changed later.
6. To write the data out to BigQuery after you have made a change, you can either click the "Write to Table" button to process only the selected table, you can click the Run button to update all datasets and run all exports on the canvas, or you can go to the Automation menu to setup an automation workflow and have the canvas run on a schedule or from an API call.