Data Studio makes great looking reports and dashboards, and at a price point of free for everyone, Google is obviously making it easy to afford. But many organisations find a number of key challenges and "connection does not exist" errors when connecting Google Data Studio to a database...
3 serious problems when connecting Google Data Studio to a database
Even if you are able to whitelist the bank of Google IPs and reach the MySQL or Postgres databases behind your firewall, you're likely to encounter connection, performance, or security issues while using the native Data Studio connectors.
Slow Response Time in Data Studio from Database Connections
Data Studio dashboard speed can be an issue when you are connecting with databases. Users don’t want to wait for their reports to refresh. Depending on the size of your data and the power of your database/server users can spend a lot of time waiting for dashboards to refresh.
In some cases, the issue arises from differences in the extreme level of detail in the database and the higher level results actually needed in a dashboard or report- if a user wants the total by product category for the month, but your database has to sum up every single individual order for every click in Data Studio, it’s not going to be fast.
Data Studio can Overload Your Database
If you are connecting Google Data Studio directly to your database using native Google Data Studio connectors for MySQL or PostGres, you might have noticed it generates a heavy load with a very large number of queries.
Depending on the size and complexity of your Data Studio dashboards, and the performance of your database server, this can be deadly for your DB. To understand why there are so many queries, you need to look at how Data Studio works- it generates one or more queries for every element on a dashboard when its refreshed.
If your dashboard has a filter (like a date range, or one or more filter selectors) every time a user changes the selection, all those queries hit your database all at once. If you have a dashboard with 15 things on it, that might be two dozen queries! If you have multiple users online, slicing and dicing their data, clicking on filters, changing date ranges, you will see a steady stream of queries from Data Studio pounding your database.
Pretty soon Data Studio might be consuming all your database capacity, resulting in errors connecting to MySQL or PostGres. You may receive the "Connection does not exist" error or have other troubles connecting through. There have been reports that Data Studio can open a large number of database connections and not release them, potentially even crashing your database completely. Since that database is likely being used by other applications/users in your environment, this is obviously a disaster.
By allowing Data Studio to query your database directly, you give every report viewer and author the same power. Imagine if there are hundreds of users- and they all decide to do some analysis at the same time as that critical database job you need to run...
Connectivity and Security Concerns - Don’t Expose your Database!
Chances are you don’t need every table and every row of data in your database available in Data Studio. In fact, its likely the data you do want to report on is in databases that also have highly sensitive data.
But to connect to it you have to provide Data Studio report developers with your database login credentials, and you have to open access to your database through your firewall to Data Studio.
It is likely you will need to create new credentials, and manage their access- adding more administrative work.
Very often, your users will hire third party consultants to build dashboards and reports- by connecting directly from Data Studio to your database, you are providing database credentials to potentially critical systems to these temporary resources.
How to speed up dashboards and protect your database but still give Data Studio the data it needs
How can you speed up your dashboards, ensure you have complete control over database loading and manage security concerns?
Analytics Canvas is a tool that provides a buffer between Google Data Studio and your internal databases.
You load your data into Analytics Canvas at a frequency and schedule that best suits your database operations. Then, when Data Studio requires data, it queries it from Analytics Canvas directly- meaning no impact on your database, and unlimited access to hundreds of users in Data Studio if needed, potentially thousands and thousands of complex queries.
Analytics Canvas uses the Google Cloud Platform to cache your datasets. What’s more:
- we are Data Studio Partners with a connector in the Data Studio Connector Gallery
- we’ve been featured by Google
- we were chosen by Google to be one of the handful of companies that built the first Data Studio Community Connectors.
Manage your network security exposure and keep control of data flows
The Analytics Canvas Server can be installed on premise, which means you don’t have to grant access to your databases outside your firewall. You provide specific connectivity between the Analytics Canvas server installation and the Analytics Canvas cloud. This is a specific, limited connection- and it’s a process that pushes only the data you want Data Studio to see on the schedule that you want.
This means that rather than wondering when Data Studio report viewers or authors are going to bombard your database, you can schedule regular data refreshes that refresh just once- and you can pick the schedule, so do it in period when your internal databases are most available.
If you need to get data from your database into Data Studio, try Analytics Canvas. Your dashboards will load quickly, you will not put unnecessary pressure on your database, and you can eliminate the hassles of setting up separate credentials or maintaining a separate DB to house your reporting tables.
You can enjoy all the benefits of Data Studio and have access to a best-in-class Data Studio Community Connector from as little as $299/mo.