Product revenue prediction with Google Prediction API

In this post, I explain how to build a model for transactional product revenue prediction with the Google Prediction API, as we already did in R (Product revenue prediction with R). With the Prediction API, we can build a prediction model without any programming, so we only have to focus on our dataset to make more accurate predictions. Google provides this service through Google Cloud Storage and the Google Prediction API.

We store our dataset in Google Cloud Storage and send queries to the Google Prediction API to generate predictions from it. The Google Prediction API is built on a set of optimized machine learning algorithms for model-based prediction. The description here is based on Prediction API v1.5.

How to train data with Google Prediction API:

First, we need at least one project in the Google API console, with the Google Prediction API and Google Cloud Storage services enabled for that project. Our dataset has to meet certain format requirements:

  • There must be no header row for the columns
  • The predicted variable must be in the first column
  • There must be no NA values in the dataset, as they confuse the Google Prediction server
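
The three format rules above can be applied to a raw export before uploading it. The following is a minimal sketch, assuming the raw data is a CSV with a header row; the function name `prepare_training_csv` and the sample rows are made up for illustration and are not part of the original dataset.

```python
import csv
import io


def prepare_training_csv(source_text, target_column):
    """Return CSV text with the target first, no header row, and no rows with missing values."""
    reader = csv.DictReader(io.StringIO(source_text))
    # Put the predicted variable first and keep the other columns in their original order.
    ordered = [target_column] + [name for name in reader.fieldnames if name != target_column]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        values = [row[name] for name in ordered]
        # Skip rows that contain an empty, missing, or NA cell.
        if any(v is None or v.strip() in ("", "NA") for v in values):
            continue
        writer.writerow(values)  # the header row is never written
    return out.getvalue()


# Hypothetical sample with only two explanatory columns, for illustration.
sample = (
    "xproductviews,xuniqprodview,yitemrevenue\n"
    "47,38,110.06\n"
    "12,NA,20.00\n"
    "484,392,803.81\n"
)
cleaned = prepare_training_csv(sample, "yitemrevenue")
print(cleaned)
```

The row containing NA is dropped, and yitemrevenue moves to the first column of each remaining row.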

Then we can upload our dataset to Google Cloud Storage by creating a bucket. After uploading the dataset, we query the Google Prediction API through the Google API explorer. The Prediction API offers a number of methods for working with the data:

  1. prediction.trainedmodels.insert – Inserts your dataset for training.
  2. prediction.trainedmodels.get – Gets the training status of a previously inserted model.
  3. prediction.trainedmodels.analyze – Provides an analysis of the trained model.
  4. prediction.trainedmodels.predict – Makes a prediction.

Here, we predict product revenue (yitemrevenue) on the basis of xcartadd, xcartuniqadd, xcartaddtotalrs, xcartremove, xcardtremovetotal, xcardtremovetotalrs, xproductviews, xuniqprodview and xprodviewinrs. Therefore, yitemrevenue is the predicted variable and the others are explanatory variables. In the previous blog post, we developed a model in R for the same dataset. To do the same with the Google Prediction API, we start training on the dataset with prediction.trainedmodels.insert, passing a unique model id, the data storage location, and the type of model as parameters.
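
As a rough illustration, the three parameters could be assembled into a request body as follows. This is a sketch under assumptions: the field names `id`, `storageDataLocation` and `modelType` and the endpoint URL reflect my understanding of the v1.5 request format and should be checked against the API reference, while the model id, bucket and file name are hypothetical placeholders.

```python
import json

# Assumed v1.5 endpoint for prediction.trainedmodels.insert (verify before use).
INSERT_URL = "https://www.googleapis.com/prediction/v1.5/trainedmodels"


def build_insert_body(model_id, bucket, object_name, model_type="regression"):
    """Build the JSON body for a training request (field names are assumptions)."""
    return {
        "id": model_id,  # unique model id
        "storageDataLocation": f"{bucket}/{object_name}",  # dataset location in Cloud Storage
        "modelType": model_type,  # revenue is numeric, so a regression model
    }


# Hypothetical names, for illustration only.
insert_body = build_insert_body("product-revenue-model", "my-bucket", "revenue.csv")
print(json.dumps(insert_body, indent=2))
```

The sketch only builds the body and sends nothing; in the API explorer the same fields are filled in through the form.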

[You can use this dataset]

This gives a response like:

It means the request was accepted by the Google Prediction server and training has started. We can check the training status of the model with prediction.trainedmodels.get.

Here is the response of the get method of the Prediction API:

It reports:

Total number of instances = 4061
Type of model = Regression
Mean squared error (MSE) = 1606123.17
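
Because MSE is expressed in squared revenue units, its square root (RMSE) is easier to compare with individual yitemrevenue values. A minimal sketch of that arithmetic, using only the MSE figure reported above:

```python
import math

mse = 1606123.17  # mean squared error reported by prediction.trainedmodels.get
rmse = math.sqrt(mse)  # root mean squared error, in the same units as yitemrevenue
print(f"RMSE = {rmse:.2f}")
```

This works out to an RMSE of roughly 1267.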

Unlike R, this model summary does not provide an intercept or variable coefficients. To make a prediction with the above model, we can query it with the following data through the predict method (prediction.trainedmodels.predict).

Test 1:

  1. xcartadd = 0
  2. xcartuniqadd = 0
  3. xcartaddtotalrs = 0
  4. xcartremove = 0
  5. xcardtremovetotal = 0
  6. xcardtremovetotalrs = 0
  7. xproductviews = 47
  8. xuniqprodview = 38
  9. xprodviewinrs = 5828

Actual (yitemrevenue) = 110.06

Predicted (yitemrevenue) = 155.15717487938028
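
The Test 1 query could be expressed as a predict request body along these lines. This is a sketch, assuming the v1.5 body shape `{"input": {"csvInstance": [...]}}`; the values must follow the column order of the training file with the predicted variable left out, and the helper name `build_predict_body` is made up for illustration.

```python
import json

# Explanatory variables in the order listed in this post (assumed to match the training file).
FEATURE_ORDER = [
    "xcartadd", "xcartuniqadd", "xcartaddtotalrs",
    "xcartremove", "xcardtremovetotal", "xcardtremovetotalrs",
    "xproductviews", "xuniqprodview", "xprodviewinrs",
]

test_1 = {
    "xcartadd": 0, "xcartuniqadd": 0, "xcartaddtotalrs": 0,
    "xcartremove": 0, "xcardtremovetotal": 0, "xcardtremovetotalrs": 0,
    "xproductviews": 47, "xuniqprodview": 38, "xprodviewinrs": 5828,
}


def build_predict_body(features):
    """Arrange the feature values into one CSV-style instance (body shape is an assumption)."""
    return {"input": {"csvInstance": [features[name] for name in FEATURE_ORDER]}}


predict_body = build_predict_body(test_1)
print(json.dumps(predict_body))
```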

Test 2:

  1. xcartadd = 0
  2. xcartuniqadd = 0
  3. xcartaddtotalrs = 0
  4. xcartremove = 0
  5. xcardtremovetotal = 0
  6. xcardtremovetotalrs = 0
  7. xproductviews = 484
  8. xuniqprodview = 392
  9. xprodviewinrs = 445026

Actual (yitemrevenue) = 803.81

Predicted (yitemrevenue) = 934

With R (after removing outliers and using a subset of the independent variables), we predicted 115.8346013 for test 1 and 955.153476 for test 2 on the same dataset. The predictions from R and the Google Prediction API are therefore fairly close, especially for test 2. With the Prediction API, we can improve predictions only by improving dataset quality, whereas in R we can improve prediction accuracy by improving both the dataset quality and the prediction model, as we discussed in Product revenue prediction with R – part 2.
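
To put numbers on how close the two approaches are, the errors on the two test cases can be computed directly. A minimal sketch using only the actual and predicted values quoted in this post:

```python
actual = {"test 1": 110.06, "test 2": 803.81}
predicted = {
    "Prediction API": {"test 1": 155.15717487938028, "test 2": 934.0},
    "R": {"test 1": 115.8346013, "test 2": 955.153476},
}


def absolute_errors(actual, predicted):
    """Return |predicted - actual| for every tool and test case."""
    return {
        tool: {name: abs(values[name] - actual[name]) for name in actual}
        for tool, values in predicted.items()
    }


errors = absolute_errors(actual, predicted)
for tool, per_test in errors.items():
    for name, err in per_test.items():
        pct = 100 * err / actual[name]
        print(f"{tool:15s} {name}: abs error {err:8.2f} ({pct:.1f}% of actual)")
```

Two test cases are too few to rank the approaches; the same comparison is more informative when run over a held-out portion of the dataset.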

Vignesh Prajapati

Vignesh is a Data Engineer at Tatvic. He loves to play in the open-source playground, building predictive solutions on big data with R, Hadoop and the Google Prediction API. Google Plus profile: Vignesh Prajapati