How we integrate datasets using Pi Sync, Google BigQuery, and Tableau
25 Oct 2019
There has been growing questions from our customers about integrating data and moving away from siloed or manual reporting.
We also found that many of our customers are now building automated marketing reporting stacks. They’ve been achieving this using our API and importing data into their data warehouse. So we asked our team: if we were to build a marketing reporting stack ourselves using Pi data, how would we do it?
So the challenge was set out: Integrate a Pi account with over 100,000 search terms with other marketing data sources and then create automated visualisations.
Data importing
Arguably the most difficult aspect of building automated marketing reports is the data importing aspect. How does the team automatically get Pi data into a format that Business Intelligence tools could read? Pi has a robust API that could satisfy this, but this becomes an extra API integration to maintain. Also, we realised that many customers who had built data integrations to communicate with the API essentially built the same software independently of each other.
We decided to build a data integration that we could make available to all our customers and remove the need for everyone to maintain their own API integrations. This is what we now call Pi Sync.
Pi Sync is a fairly simple tool. The user inputs their SQL Database credentials into Pi and then sets up the source data they require to be pushed across there. Within moments, Pi Sync pushes data to the selected database and will do this automatically every time data has been made available.
Because our next step involved Google BigQuery we decided to use Google’s CloudSQL hosted in Europe to connect to Pi Sync.
We then had a database full of Pi data that was automatically being updated.
Data warehousing
The next challenge was integrating data from different marketing locations. For a small amount of data, this could simply be achieved by plugging a Business Intelligence platform straight into the SQL database and using the inbuilt features to combine the data sets. However, this does hit scalability and performance issues quite quickly for large datasets. This is why we opted to go with Google BigQuery to satisfy the scalability issues in a cost-effective manner. As BigQuery puts it ‘Focus on the analytics, not your infrastructure’.
Because we didn’t have a data warehouse our first step was centralising all the data into BigQuery. As Pi data was already in CloudSQL we only needed to create a connection within BigQuery. Next, to bring in other data sources, not surprisingly, BigQuery has default connections to Google Ads. We used Google Ads, but also, for a bit of fun, we brought in some of the public data sources BigQuery has available. Why not see if there’s a correlation between cloud-to-ground lightning strikes and search data? (There isn’t by the way).
Visualisation
With BigQuery being used as our selected data warehouse we had a lot of options for data visualisation, however, as we already have the licence and infrastructure set up we still opted with one of the more advanced business intelligence tools in Tableau Desktop and Server.
Tableau, by default, hooks straight into BigQuery and does have the ability to connect to the entire data source. Meaning at this point you can query directly from Tableau, dragging and dropping any combination of dimensions and metrics. We were highly impressed with BigQueries ability to handle complex SQL queries from Tableau, queries that would traditionally require hours for us to compute were taking seconds.
Depending on your use case, you’ll likely want to go back into BigQuery and optimise your queries in there to save money. Tableau also has some good best practices. https://www.tableau.com/learn/whitepapers/google-bigquery-tableau-best-practices
Pi Sync → BigQuery → Tableau
There you go, an automated marketing reporting stack. If you already have the business intelligence infrastructure, there is no reason why you can’t have the barebones setup within an afternoon. What can you do? Well anything, but here are some ideas.
- High-level reporting - Performance overview of all your marketing data. SEO, PPC, Display, Affiliates, Email, Social etc.
- Key opportunity area - Blend data sources to see efficiency opportunities.
- Sales Pipeline - Combine sales pipeline with content data to determine the true lifetime value of content.
Never miss a post
Join our mailing list and have our SEO news delivered straight to your inbox.
Never miss a post
Join our mailing list and have our SEO news delivered straight to your inbox.