I am thrilled to announce that RGoogleAnalytics was recently released on CRAN. R is already a Swiss Army knife for data analysis, largely due to its 6,000 packages. This means that digital analysts can now use the full analytical capabilities of R to explore their Google Analytics data. In this post, we will go through the basics of RGoogleAnalytics. Let’s begin.
Fire up your favorite R IDE and install RGoogleAnalytics. Installation is straightforward. If you are new to RGoogleAnalytics, refer to this page to learn how to install it.
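As a minimal sketch of the installation step, the helper below installs a package only when it is missing. Note that `ensure_package` is a hypothetical name introduced here for illustration; it is not part of RGoogleAnalytics, and it assumes the package is available from a reachable CRAN mirror.

```r
# ensure_package() is a hypothetical helper, not part of RGoogleAnalytics.
# It installs a package only when it is not already available.
ensure_package <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already installed, nothing to do
  }
  installer(pkg)              # assumes a CRAN mirror is reachable
  invisible(TRUE)
}
```

Calling `ensure_package("RGoogleAnalytics")` once, followed by `require(RGoogleAnalytics)`, should be enough to get started in a fresh R session.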
Since RGoogleAnalytics uses the Google Analytics Core Reporting API under the hood, every request to the API has to be authorized under the OAuth 2.0 protocol. This requires a one-time setup: you register an app with the Google Analytics API so that you get a unique set of project credentials (Client ID and Client Secret). Here’s how to do this –
- Navigate to Google Developers Console.
- Create a new project and open the Credentials page.
- Create credentials by selecting OAuth client ID.
- Before you can proceed, you will be asked to configure the consent screen.
- After configuring the consent screen, select Application Type – Other in the next step and click Create.
- This step generates the OAuth client, with its Client ID and Client Secret.
- Once your Client ID and Client Secret are created, copy them to your R Script.
- Enable the Google Analytics API from the API Manager.
Once the project is configured and the credentials are ready, you need to authorize your app against your Google Analytics account. This ensures that your app (the R script) can access your Google Analytics data, list your Google Analytics profiles and so on. Once authenticated, you get a pair of tokens (an Access Token and a Refresh Token). The Access Token is sent with each API request so that Google’s servers know that the requests came from your app and are authentic. Access Tokens expire after 60 minutes, so they need to be regenerated using the Refresh Token. I will show you how to do that, but first let’s continue with the data extraction flow.
    require(RGoogleAnalytics)

    # Authorize the Google Analytics account
    # This need not be executed in every session once the token object is created
    # and saved
    client.id <- "xxxxxxxxxxxxxxxxxxxxxxxxx.apps.googleusercontent.com"
    client.secret <- "xxxxxxxxxxxxxxxd_TknUI"
    token <- Auth(client.id,client.secret)

    # Save the token object for future sessions
    save(token,file="./token_file")
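The snippet above places the Client ID and Client Secret directly in the script. If the script is shared or kept in version control, one option is to read them from environment variables instead. The following is only a sketch: `read_credentials` is a hypothetical helper, and the variable names `GA_CLIENT_ID` and `GA_CLIENT_SECRET` are assumptions made for this example, not conventions of the package.

```r
# read_credentials() is a hypothetical helper; the variable names
# GA_CLIENT_ID and GA_CLIENT_SECRET are assumptions, not package conventions.
read_credentials <- function(id_var = "GA_CLIENT_ID",
                             secret_var = "GA_CLIENT_SECRET") {
  client.id <- Sys.getenv(id_var, unset = "")
  client.secret <- Sys.getenv(secret_var, unset = "")
  # Fail early with a readable message instead of sending empty credentials
  if (!nzchar(client.id) || !nzchar(client.secret)) {
    stop("Set ", id_var, " and ", secret_var, " before authorizing.")
  }
  list(client.id = client.id, client.secret = client.secret)
}
```

With `creds <- read_credentials()`, the two values can then be passed along as `Auth(creds$client.id, creds$client.secret)`.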
The next step is to get the Profile ID/View ID of the Google Analytics profile for which the data extraction is to be carried out. It can be found within the Admin Panel of the Google Analytics UI. This profile ID maps to the table.id argument below.
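The table.id value is the profile ID with a `ga:` prefix, as in `ga:123456`. A small sketch of building it from the number shown in the Admin Panel follows; `as_table_id` is a hypothetical helper written for this post, not a function of RGoogleAnalytics.

```r
# as_table_id() is a hypothetical helper for illustration only.
as_table_id <- function(profile.id) {
  # Avoid scientific notation when a numeric ID is passed in
  if (is.numeric(profile.id)) {
    profile.id <- format(profile.id, scientific = FALSE, trim = TRUE)
  }
  # Accept either "123456" or an already prefixed "ga:123456"
  profile.id <- sub("^ga:", "", as.character(profile.id))
  if (!grepl("^[0-9]+$", profile.id)) {
    stop("Profile ID should contain digits only: ", profile.id)
  }
  paste0("ga:", profile.id)
}
```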
The code below builds a query from the standard query parameters (start date, end date, dimensions, metrics and so on) and sends it to the Google Analytics API. The API response is converted into an R data frame.
    # Get the Sessions & Transactions for each Source/Medium sorted in
    # descending order by the Transactions
    query.list <- Init(start.date = "2014-08-01",
                       end.date = "2014-09-01",
                       dimensions = "ga:sourceMedium",
                       metrics = "ga:sessions,ga:transactions",
                       max.results = 10000,
                       sort = "-ga:transactions",
                       table.id = "ga:123456")

    # Create the Query Builder object so that the query parameters are validated
    ga.query <- QueryBuilder(query.list)

    # Extract the data and store it in a data-frame
    ga.data <- GetReportData(ga.query, token)

    # Sanity Check for column names
    dimnames(ga.data)

    # Check the size of the API Response
    dim(ga.data)
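Once the response is in a data frame, the usual R tools apply. As a rough illustration, the sketch below computes a conversion rate per source/medium. It uses made-up rows rather than real API output, and it assumes the columns are named `sourceMedium`, `sessions` and `transactions` and that the metrics are numeric; check `dimnames(ga.data)` and `str(ga.data)` against your own response before relying on that.

```r
# Made-up rows standing in for an API response; the column names are an assumption.
sample.data <- data.frame(
  sourceMedium = c("google / organic", "(direct) / (none)", "newsletter / email"),
  sessions     = c(1200, 800, 150),
  transactions = c(36, 8, 9),
  stringsAsFactors = FALSE
)

# Transactions per session, guarding against zero-session rows
sample.data$conversion.rate <- ifelse(sample.data$sessions > 0,
                                      sample.data$transactions / sample.data$sessions,
                                      NA_real_)

# Rank source/medium pairs by conversion rate, best first
ranked <- sample.data[order(-sample.data$conversion.rate), ]
print(ranked)
```

In this toy data the low-volume newsletter traffic ranks first, which a sort by raw transactions alone would not show.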
In future sessions, you need not generate the Access Token every time. Assuming that you have saved it to a file, it can be loaded via the following snippet –
    load("./token_file")

    # Validate and refresh the token
    ValidateToken(token)
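ValidateToken takes care of the refresh for you. To make the 60-minute lifetime concrete, here is a sketch of the kind of expiry check involved. It is not the package's actual logic: `token_needs_refresh`, its arguments and the one-minute safety margin are all assumptions made for this illustration.

```r
# token_needs_refresh() is a hypothetical helper, not the package's logic.
# It reports whether a token issued at `issued.at` is expired or about to expire.
token_needs_refresh <- function(issued.at, now = Sys.time(),
                                lifetime.secs = 60 * 60, margin.secs = 60) {
  age.secs <- as.numeric(difftime(now, issued.at, units = "secs"))
  # Refresh slightly early so a request is never sent with a stale token
  age.secs >= (lifetime.secs - margin.secs)
}
```

For example, a token issued 30 minutes ago would not need a refresh under these assumptions, while one issued 61 minutes ago would.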
Here are a few practices that you might find useful –
- Before querying for a set of dimensions and metrics, you might want to check whether they are compatible. This can be done using the Dimensions and Metrics Explorer.
- The Query Feed Explorer lets you try out different queries in the browser, and you can then copy the query parameters to your R script. It can be found here. I have found this to be a huge time-saver when debugging failed queries.
- If the API returns an error, here’s a guide to understanding the cryptic error responses.
Did you find RGoogleAnalytics useful? Please leave your comments below. If you have a feature request or want to file a bug, please use this link.
Editor’s Note: This blog has been updated on 03/01/2018 for increased accuracy, taking into consideration all the official tech updates.