Author Archives: Kushan Shah


Kushan Shah

About Kushan Shah

Kushan is a Web Analyst at Tatvic. His interests lie in getting the maximum insights out of raw data using R and Python. Google+

Infographic: The Anatomy of an Ideal Product Recommendation

Does your product recommendation attract fewer clicks or less traffic than you would like? Product recommendation is a great way to increase the average order value of your ecommerce store. According to Monetate, a product recommendation can increase your revenue by up to 300% & boost the average order value by 50%!

The following infographic is aimed at helping you increase the engagement on your product recommendation. You will learn 4 important elements of a product recommendation which are often overlooked & 6 best practices to follow to increase the clicks on your product recommendation.

[Infographic: The Anatomy of an Ideal Product Recommendation]



How to save your marketing dollars on your Customer Loyalty Program?

I am intrigued by the customer loyalty programs I see across eCommerce sites. They are commonplace and have become a tried and tested tool to drive loyalty. What makes a marketing campaign stand out is the ROI it fetches, and this led me to wonder whether some form of analysis could improve the returns. Before I pull the rabbit out of the hat, let me walk you through the approach I followed.

Cohort analysis gives us great insight into consumer action. So I pulled up some Google Analytics eCommerce data and started experimenting with cohorts. I wanted to understand repeat purchase behavior and hence focused on two cohorts in particular:

1. Customers who made only one purchase on the website

2. Customers who made multiple purchases on the website

Let us first examine the proportion of consumers belonging to both the above mentioned cohorts.

Proportion of Transactions for One Time vs Repeat Consumers

 

There is indeed a stark difference between the two cohorts. A huge majority (85%) of customers are one-time customers. They do not come back and purchase again. This is a challenge faced by many sites and is not limited to a single instance. If we compare the revenue contribution of the two cohorts, we see that while the one-time consumers contributed 60% of the total revenue, the other cohort contributed a ‘whopping’ 39.5%. I have a reason for using the word ‘whopping’ and I will state it now.

“15% of the consumers contributed towards 39.5% of the total site revenue.”

Sweet, let me tell that to my boss!

Contribution to Total Transactions and Revenue by Cohort
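
The cohort split described above can be sketched in a few lines of base R. This is only an illustration: the `orders` data frame, its column names and its values are hypothetical stand-ins for the Google Analytics eCommerce export, not the data behind the charts in this post.

```r
# Hypothetical order-level data: one row per transaction (illustrative values)
orders <- data.frame(
  customer_id = c("a", "b", "b", "c", "d", "e", "e", "e"),
  revenue     = c(40, 25, 35, 60, 20, 30, 30, 40),
  stringsAsFactors = FALSE
)

# Purchases and revenue per customer (both tapply results share the same order)
purchases <- tapply(orders$revenue, orders$customer_id, length)
revenue   <- tapply(orders$revenue, orders$customer_id, sum)
per_customer <- data.frame(
  customer_id = names(purchases),
  purchases   = as.vector(purchases),
  revenue     = as.vector(revenue)
)

# Cohort 1: exactly one purchase; cohort 2: more than one purchase
per_customer$cohort <- ifelse(per_customer$purchases > 1, "repeat", "one-time")

# Share of customers and share of revenue contributed by each cohort
customer_share <- prop.table(table(per_customer$cohort))
revenue_share  <- prop.table(tapply(per_customer$revenue, per_customer$cohort, sum))

print(round(customer_share, 3))
print(round(revenue_share, 3))
```

Comparing `customer_share` with `revenue_share` is what surfaces a statement like the one quoted above: a small share of customers can account for a much larger share of revenue.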

 

Yay! We have identified the problem: consumers are not sticking. So let's target them with an aggressive retention campaign. Let's send out discount coupons to all the one-time consumers to encourage future purchases. But does it make sense to target the entire one-time consumer base to drive sales? Some consumers would make a second purchase without the incentive. The point I’m trying to drive home is that sending discount coupons to all the one-time consumers would also erode your profit margin to a large extent.

Moreover, assume that the discount coupon initiates a future purchase worth $50 in 10% of the cohort. If a voucher is sent to a consumer who would have re-purchased anyway, that results in a loss of $5 to the store owner. Note that these figures are ballpark estimates, but I encourage you to think about the implications.
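
To make the margin erosion concrete, here is a back-of-the-envelope sketch in R. Only the $50 order value and the $5 loss come from the text. The 10% coupon value (which is what turns $50 into a $5 loss), the cohort size and the share of customers who would have re-purchased anyway are assumptions chosen purely for illustration.

```r
# Figure from the text
order_value   <- 50     # value of the repeat purchase, in dollars

# Assumptions, not from the text
discount_rate <- 0.10   # coupon worth 10% of the order, i.e. $5 on a $50 order
cohort_size   <- 1000   # hypothetical number of one-time customers
organic_rate  <- 0.05   # hypothetical share who would re-purchase without a coupon

# Margin given away each time a coupon reaches someone who would have bought anyway
loss_per_wasted_coupon <- order_value * discount_rate

# Number of such customers if the whole cohort receives a coupon
wasted_coupons <- cohort_size * organic_rate

# Total margin lost to blanket targeting under these assumptions
total_margin_lost <- wasted_coupons * loss_per_wasted_coupon
print(total_margin_lost)
```

Changing `organic_rate` shows how quickly the loss grows when more of the cohort would have returned on its own, which is the case for targeting coupons selectively.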

Marketers have a limited budget and need to prove the efficacy of retention campaigns by measuring results. How would you do this? Predictive Analytics can be of help here. Where cohort analysis is descriptive in nature, predictive analysis helps you use your data to build models to predict consumer behavior. Think of it as a machine that converts data into actionable insights that give you better ROI.

Specifically in our case, predictive analytics can provide an answer to this important question:

“Which customers should I target for discount coupons in order to maximize revenue?”

Does this question seem interesting? View a recorded version of our recent webinar where we show you how to build a Predictive Model for Discount Targeting using R


An Introduction to Collaborative Filtering

A typical consumer today uses multiple devices to surf the web and interact in many ways with your eCommerce business. For most stores, maximizing conversion and increasing order size in this environment is not only an enormous challenge, but also an incredible opportunity.

eCommerce stores also have a variety of marketing channels, be it e-mail marketing, social or mobile apps, which help increase consumer interaction. A reasonably large ecommerce store has a large number of products to display to a large number of customers across a variety of channels (or devices). Consumers experience information overload and seek help in selecting from an overwhelming array of products, while merchandisers have lost their relationships with consumers and seek to rebuild and deepen those relationships by helping consumers find products of interest. Hence, one of the questions that needs to be addressed is: which product should be displayed so that the customer is most likely to buy it?

More recently, ecommerce stores have started adopting a wide range of mass customization techniques for customizing the consumer experience. The consumer experience includes the physical products, which can be customized in function or in appearance, and the presentation of those products, which can be customized automatically or with help from the consumer. Sites invest in learning about their customers, use recommender systems to operationalize that learning, and present custom interfaces that match consumer needs. Consumers repay these sites by returning to the ones that best match their needs.

Collaborative filtering is a way of making automatic predictions (filtering) about the interests of a user by collecting preferences from many other users (collaborating). The principle is like this: if several members of my community owned and liked the latest Apple gadget, then it is highly likely that I will too. Here is an example of how this technique is effectively tapped by Amazon:

Association rules have been used for many years in merchandising, both to analyze patterns of preference across products, and to recommend products to consumers based on other products they have selected. An association rule expresses the relationship that one product is often purchased along with other products. By contrast, collaborative filtering techniques based on user-similarity are more effective and personalized in a domain where consumer choices are very dynamic, such as online retail.
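
The user-similarity idea can be sketched in base R as follows. This is a minimal illustration and not a production recommender: the purchase matrix, the user and product names and the `recommend()` helper are all hypothetical, and cosine similarity is just one common choice of similarity measure.

```r
# Hypothetical purchase matrix: rows are users, columns are products (1 = bought)
purchases <- matrix(
  c(1, 1, 0, 0,
    1, 1, 1, 0,
    0, 0, 1, 1,
    1, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("u1", "u2", "u3", "u4"),
                  c("phone", "case", "laptop", "mouse"))
)

# Cosine similarity between every pair of users
cosine_sim <- function(m) {
  norms <- sqrt(rowSums(m^2))
  (m %*% t(m)) / (norms %o% norms)
}
user_sim <- cosine_sim(purchases)

# Recommend the k products that similar users bought and this user has not
recommend <- function(user, k = 1) {
  sims <- user_sim[user, ]
  sims[user] <- 0                           # ignore self-similarity
  scores <- as.vector(sims %*% purchases)   # similarity-weighted votes per product
  names(scores) <- colnames(purchases)
  scores[purchases[user, ] == 1] <- -Inf    # drop products already bought
  names(sort(scores, decreasing = TRUE))[seq_len(k)]
}

recommend("u4")   # "case": the users most similar to u4 bought a case
```

Here `u4` has only bought a phone, and the two users who also bought a phone both bought a case, so the case gets the highest similarity-weighted score.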

To sum up, by implementing collaborative filtering you can:

  • Convert browsers to buyers
  • Increase cross-sell via product bundling
  • Build loyalty and increase customer lifetime value

Want to learn more about how you can leverage collaborative filtering for your business using R? Join us in our next webinar. More details here


Google+ Hangout: The Future of Working with Data with Michael Koploy and Thomas Davenport

Hi folks,

Some of the guys from our team have watched the replay of this Google+ Hangout and we have found it quite interesting. Assuming that the majority of you guys that come here to check our blog posts out are also part of the web analytics world, we thought we should share this with you here.

We really hope you enjoy this cross-post from the Plotting Success blog from Software Advice!

Author and professor Thomas Davenport’s new book, Keeping Up with the Quants, serves as a “quantitative literacy” guide for managers as they wade through the world of data today and tomorrow. Keeping Up with the Quants, co-authored by Davenport and Jinho Kim, covers the basics of quantitative analytics, the essential habits of effective analysts and insight on how business users and top-ranking quants can best collaborate.

Michael Koploy, Managing Editor at business intelligence resource website Software Advice, recently hosted a Google+ Hangout with Davenport to discuss key points from the book and Davenport’s thoughts on the future of business analytics.

Among other topics, Davenport and Koploy cover:

  • The importance of balancing creativity with a regular, thorough analytical process
  • Why great companies hire great analysts–and why that isn’t likely to change anytime soon
  • Why visualization tools are effective at analyzing “Big Data”
  • How “Ph.Ds with personality” drive analytical innovation in business
  • Why everyone should dabble in coding

Check out a full recording of the hangout below:

For a full analysis and takeaways from the hangout, check out Koploy’s post on the Plotting Success blog: Hangout with Thomas Davenport: The Future of Working with Data. Be sure to check out Keeping Up with the Quants: Your Guide to Understanding Analytics, on sale now. And connect with Koploy on Twitter (@PlottingSuccess) and Google+.


Highlights of the Amazon Web Services Summit, Mumbai 2013

As a data analytics firm, we understand the value of the cloud for data storage and processing. When we talk about cloud computing, we are very passionate about Amazon Web Services since it is a very rich platform that enables us to build our applications for various use cases. We were curious to understand AWS and gain an insight into best practices, and this led us to the annual AWS Summit in Mumbai. The AWS Summit is a one-day cloud community event organized in 12 major cities around the world. The focus of the current event was on cost effectiveness, high availability, big data and security.

The keynote was delivered by Dr. Werner Vogels, CTO of Amazon.com, who talked about how AWS helps developers focus on the application rather than the infrastructure.

Team Tatvic attended a range of breakout sessions.

In the Startup and Developer Track, one of the speakers talked about the four stages of a startup development lifecycle, namely: Idea, MVP (Minimum Viable Product), Scale and Profitability. These ideas were inspired by Steve Blank’s book: The Four Steps to the Epiphany. He also mentioned how a startup can effectively use AWS as it passes through each of these stages.

One of the other sessions that we found very useful was the Big Data Analytics session. Abhishek Sinha, Head of Big Data and Compute, spoke about how running analytics on the cloud could be extremely effective since it is elastic and scalable. Data objects can reside either in S3 or in a database layer like DynamoDB. The data processing can be carried out on an EC2 instance or on a distributed Hadoop cluster with the use of Elastic MapReduce. Finally, the processed results can be pushed back to S3, DynamoDB or Redshift. What's more, this entire process of data collection, computation and storage can be automated with the help of Data Pipeline. Interesting!

The keynote and breakout talks left us with an even better sense of what AWS is about, how it has grown, what kinds of customers are using it, and what it can do.

Update – The videos from the AWS Summits held around the world can be found in this Youtube playlist


Web Analytics Visualization through ggplot2

During our last webinar, we covered some of the basic ideas behind ggplot2, the R Visualization package by Dr. Hadley Wickham. In this blog post I will walk through the example that I covered during the webinar.

In order to carry out the examples yourself, you may download the sample datasets from this link

Creating visualizations is an iterative process. You start with a data set, generate some quick graphs that best depict the insights and keep on adding components/data to the graph to finally produce a viz that you can show in your reports. The idea behind ggplot2 is to make this process simpler and more effective at the same time.

Diving into our example, we want to explore how Transactions for a hypothetical ecommerce store have fared for a particular calendar year. The function which we used for plotting is ggplot() and we need to mention the data frame as one of the arguments. Now for a bit of ggplot2 terminology which I am quoting verbatim from the documentation.

  • aes stands for aesthetics and this controls how variables are mapped to the axis. In our example we map month to the x-axis and transactions to the y-axis.
  • geoms, short for geometric objects, describe the type of plot you will produce. In our case, we are plotting the data as a bar graph
  • stat stands for statistics which help us transform the data prior to plotting. In our case, it's an identity transform so the data remains unchanged.

Here’s the R code :

require(ggplot2)
# Load the dataframe
mydata <- read.csv("./datasets/dataset1.csv")
head(mydata)
# Append a new column that maps month numbers to month names
mydata$monthf <- factor(mydata$month,levels=as.character(1:12),
labels=c("Jan","Feb","Mar","Apr","May","Jun",
"Jul","Aug","Sep","Oct","Nov","Dec"),
ordered=TRUE)
# Plot Transactions vs Month
ggplot(mydata,aes(monthf,transactions)) + geom_bar(stat="identity")
# Which month shows the highest transactions ?

You can now follow the rest of the R code, which keeps improving on our first visualization:

# Load data frame that includes Medium as a dimension
mydata_1 <- read.csv("./data/dataset2.csv")
mydata_1$monthf <- factor(mydata_1$month,levels=as.character(1:12),
labels=c("Jan","Feb","Mar","Apr","May","Jun",
"Jul","Aug","Sep","Oct","Nov","Dec"),
ordered=TRUE)
# Facet the Transactions by medium
ggplot(mydata_1,aes(monthf,transactions)) + geom_bar(stat="identity") + facet_wrap(~medium)
# What is the problem with this plot ?
# Exclude the mediums having zero transactions
fresh_data <- subset(mydata_1,medium %in% c("cpc","organic","referral","(none)"))
# Re-plot
ggplot(fresh_data,aes(monthf,transactions)) + geom_bar(stat="identity") + facet_wrap(~medium)
# Stack the plots vertically for easier comparison
ggplot(fresh_data,aes(monthf,transactions)) + geom_bar(stat="identity") + facet_wrap(~medium,ncol=1)
# Which medium performed best w.r.t transactions ?
# Load the data frame including an additional dimension Visitor Type
mydata_2 <- read.csv("./data/dataset3.csv")
mydata_2$monthf <- factor(mydata_2$month,levels=as.character(1:12),
labels=c("Jan","Feb","Mar","Apr","May","Jun",
"Jul","Aug","Sep","Oct","Nov","Dec"),
ordered=TRUE)
# Map a color to Visitor Type Variable
ggplot(mydata_2,aes(monthf,transactions,fill=visitorType)) + geom_bar(stat="identity") + facet_wrap(~medium,ncol=1)
# Stack the bar graphs side by side for easier comparison
ggplot(mydata_2,aes(monthf,transactions,fill=visitorType)) + geom_bar(stat="identity",position="dodge") + facet_wrap(~medium,ncol=1)
# Strip the grey background and add a plot title
ggplot(mydata_2,aes(monthf,transactions,fill=visitorType)) + geom_bar(stat="identity",position="dodge") + facet_wrap(~medium,ncol=1) + theme_bw() + ggtitle("MoM transactions split by Visitor Type")

If you followed the code correctly, you might end up with something like this:

There is a lot that can be still improved with the viz but let us stop here and quickly sum up what we just learnt. We understood the basic idea behind ggplot2, gained some knowledge about its terminology and saw how we could generate interesting visualizations in a matter of minutes. Of course, there is a fair bit of programming overhead involved but once you get the hang of ggplot2, it is time well spent in learning to code. If you’re in for something advanced you may want to have a look at our other blog posts on ggplot2 here


Installing the RGoogleAnalytics package

In this blog post, I will walk you through the steps of downloading and installing the RGoogleAnalytics package on your machine.

The RGoogleAnalytics package currently resides at https://code.google.com/p/r-google-analytics/ and this page lists the latest developments around the package. The zip and tarball archives for the package can be obtained from the Downloads Section.

Once you download the archive, fire up the RStudio interface and click on the Install Packages button in the Packages tab and select the Package Archive File option as shown in this screenshot.

 


Browse to the location containing the downloaded package archive and install it. Your RStudio console should then display the following messages :

As an aside, RGoogleAnalytics requires a few additional packages in order to work. These are: rjson, RCurl and bitops. They can be installed by clicking on Install Packages again, selecting Repository (CRAN, CRANExtra) and typing the package name.
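
If you prefer the console to the RStudio dialogs, the same steps can be sketched with `install.packages()`. The `missing_pkgs()` and `install_rga()` helpers and the archive path are hypothetical names used only for illustration. `type = "source"` assumes you downloaded the tarball; a Windows zip archive would need a different `type`.

```r
# Dependencies named in this post
deps <- c("rjson", "RCurl", "bitops")

# Return the subset of packages that are not yet installed
missing_pkgs <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Install any missing dependencies from CRAN, then the downloaded archive
install_rga <- function(archive_path) {
  to_install <- missing_pkgs(deps)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  # repos = NULL tells R to install from a local file instead of a repository
  install.packages(archive_path, repos = NULL, type = "source")
}

# Example call (hypothetical path; point it at the archive you downloaded):
# install_rga("path/to/downloaded-archive.tar.gz")
```

Nothing is installed until `install_rga()` is called, so the block is safe to paste into a script first and run later.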

Do let us know if this worked out for you. As a next step, you might be interested in learning how to extract Google Analytics data in R. Here’s the link: http://www.tatvic.com/blog/ga-data-extraction-in-r/

Update: You might get a warning saying that the package is not available for R 3.0.0 or R 3.0.1. This warning is specific to RStudio. When installing the package, RStudio calls getDependencies() to check the package's dependencies and whether the package exists on CRAN, and it throws the warning when it does not. Since RGoogleAnalytics is not on CRAN, the warning appears and can be safely ignored. I hope this will be fixed once RGoogleAnalytics is available on CRAN. Installing from the default R console does not trigger this warning.

Would you like to understand the value of predictive analysis when applied to web analytics data, and how it can improve your understanding of the relationship between different variables? You may like to watch our Webinar – How to perform predictive analysis on your web analytics tool data. Watch the Replay now!


Resources for getting started with R

As you may know, we are holding a webinar tomorrow (June 19th, 2013) on Predictive Analytics. During this webinar, you will be introduced to R, learn how to build a predictive model and learn how to carry out insightful analysis through visualization.

As learning a new language can be a difficult and painful process, we thought it would be valuable to share some useful links to R resources with you. If you can spare some time to read a few of them, we believe this first briefing will help you come to our webinar with a better background.

So, what do you say? Are you in for some reading to shorten your learning curve?

Downloads

R : http://www.r-project.org/
Choose your nearest download location and click on the appropriate link
RStudio : http://www.rstudio.com/

Packages

RGoogleAnalytics : https://code.google.com/p/r-google-analytics/
Guide to getting started with RGoogleAnalytics : http://bit.ly/11kUgzI
Guide to getting started with ggplot2 : http://www.cookbook-r.com/Graphs/
ggplot2 chart chooser : http://www.yaksis.com/posts/r-chart-chooser.html
Finding additional R packages for your domain : http://cran.r-project.org/web/views/
Additional Ideas for Predictive modelling : http://bit.ly/13XyCCK

Courses on R

Codeschool : http://tryr.codeschool.com/
2 minute short videos on R: http://www.twotorials.com/

Community

A Prezi tour of the R ecosystem : http://prezi.com/s1qrgfm9ko4i/the-r-ecosystem/
R news and tutorials from prominent R blogs : http://www.r-bloggers.com/
A search engine for R: http://www.rseek.org/

If you come across more resources, please ensure that you drop a comment below.


Understanding the value of Predictive Analytics on Web Data

In this blogpost, I will be talking briefly about Predictive Analytics and why it holds value from a web analytics perspective. Broadly speaking, Predictive Analytics is a set of methodologies that assist us in anticipating customer behavior. The customer behavior of interest could be anything ranging from spend, buying habits, page views, response to a certain trigger or something else. From a business perspective, there could be a variety of reasons why you would opt for Predictive Analytics strategies. Some of them are:

  • Traditional Web Analytics tools generate tons of clickstream data. Predictive Analytics helps filter out the noise and go beyond aggregate level metrics.
  • It helps you understand the complex patterns between metrics and these patterns now form the basis of your decision making process.
  • It helps you allocate investments wisely since your decisions are now based on your data rather than gut.

I should point out here that Predictive Analytics does not mean that you need to predict customer behavior with a very high probability and be very accurate in your findings. Let me illustrate this with an example: Every time a new customer lands on your website, you know that he has a 50% probability of converting. Now, if you build a predictive model which indicates that he has 52% chance of converting, it holds great value for your business since it suggests that your customer is more likely to convert. You can now segment all your customers who have a higher than 50% chance of converting and channel your marketing efforts towards these customers.
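
The idea of segmenting customers whose predicted chance of converting is above the baseline can be sketched in R with a logistic regression. Everything in this block is an assumption made for illustration: the data is simulated, the predictors `pages_viewed` and `returning` are hypothetical, and logistic regression is simply one common modelling choice.

```r
set.seed(42)

# Simulated visitor data (synthetic, for illustration only)
n <- 1000
visitors <- data.frame(
  pages_viewed = rpois(n, lambda = 4),
  returning    = rbinom(n, size = 1, prob = 0.3)
)

# Simulate conversions that depend weakly on both predictors
linear_part <- -0.5 + 0.1 * visitors$pages_viewed + 0.4 * visitors$returning
visitors$converted <- rbinom(n, size = 1, prob = plogis(linear_part))

# Fit a logistic regression and score every visitor
model <- glm(converted ~ pages_viewed + returning,
             data = visitors, family = binomial)
visitors$p_convert <- predict(model, type = "response")

# Baseline: the overall conversion rate, known before any modelling
baseline <- mean(visitors$converted)

# Segment: visitors the model rates as more likely than baseline to convert
targets <- subset(visitors, p_convert > baseline)
cat("Visitors above baseline:", nrow(targets), "of", n, "\n")
```

The model does not need to be highly accurate to be useful here. It only needs to rank visitors well enough that the `targets` segment converts at a better rate than the site as a whole.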

Now, in order to perform Predictive Analytics, you will require the following :

  1. Clear Objective: The business problem that you want to model
  2. Data: Having the right data is absolutely imperative. If you have a user-centric business model where you can get rich data about your customers' behavior, that’s a big plus.
  3. Methodology: Once you have the data and a clear objective, you can start thinking about the statistical method you will use to build the prediction model
  4. Tool: There are a variety of predictive analytics tools available. Selecting the right tool for your business depends on your in house analytics talent pool and allocated budget.

If you are interested in knowing more about deploying Predictive Analytics techniques at your organization, join us in our webinar where we will show you how to leverage Predictive Analytics on your clickstream data using the R language. R is the lingua franca for data analysis. Savvy Web companies, like Facebook, have successfully used R in predictive analytics to answer questions like “Which data points predict whether a user will stay? And if they stay, which data points predict how active they’ll be after three months?” We will also be covering data visualization, since the use of good visualization leads to better understanding of the nuances between your variables.

See you at the webinar!

PS: You might want to warm up and read some additional posts on Predictive Analytics. Find them here.


Visualizing your website’s ecommerce performance with R

In this blogpost, I want to dive deeper into the explanation of the relationship between Frequency and Recency of Visits with the Conversion Rate and Average Order Value. I have used the RGA package for data extraction and Dr. Hadley Wickham’s ggplot2 package to achieve the visualizations.

Here’s the data aggregation script :

# transactions dataframe contains the input data extracted via RGoogleAnalytics

head(transactions, n = 3)

#  visitsToTransaction daysToTransaction transactions transactionRevenue
#1                   1                 0         1639           11505429
#2                  10                 1            1               3700
#3                  10                10            1               6050

transactions$visitsToTransaction=(3*i-2)&transactions$visitsToTransaction=(3*j-2)&transactions$daysToTransaction

We now convert our data into a visualization using the ggplot2 package. Here’s the command:

require(ggplot2)
aov
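
As a rough end-to-end sketch of the two steps above, the binning and the Average Order Value plot could look like the following. This is an approximation under stated assumptions and not the original script: the `txn` data frame and its values are made up, the three-wide bins (1-3, 4-6, 7-9) are inferred from the categories discussed in this post, and the tile plot is just one plausible way to draw the result.

```r
# Toy stand-in for the extracted data (values are made up for illustration)
txn <- data.frame(
  visitsToTransaction = c(1, 3, 2, 5, 8, 3, 9),
  daysToTransaction   = c(1, 2, 4, 2, 1, 7, 9),
  transactions        = c(10, 10, 4, 3, 2, 5, 1),
  transactionRevenue  = c(50000, 30000, 32000, 15000, 18000, 20000, 4000)
)

# Assign each row to a three-wide bin; values outside 1-9 become NA
bin_labels <- c("1-3", "4-6", "7-9")
to_bin <- function(x) cut(x, breaks = c(0, 3, 6, 9), labels = bin_labels)
txn$visitBin <- to_bin(txn$visitsToTransaction)
txn$dayBin   <- to_bin(txn$daysToTransaction)

# Sum transactions and revenue within each (visitBin, dayBin) cell;
# rows with an NA bin are dropped by aggregate()
aov <- aggregate(cbind(transactions, transactionRevenue) ~ visitBin + dayBin,
                 data = txn, FUN = sum)

# Average Order Value = revenue per transaction in each cell
aov$avgOrderValue <- aov$transactionRevenue / aov$transactions
print(aov)

# Optional plot, drawn only when ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(aov, aes(visitBin, dayBin, fill = avgOrderValue)) +
    geom_tile() +
    labs(x = "Visits to transaction", y = "Days to transaction", fill = "AOV")
  print(p)
}
```

Using `cut()` keeps the bin boundaries in one place, so widening the bins or adding a fourth category only means changing `breaks` and `bin_labels`.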

 

Average Order Value

Let us make some quick inferences:

  • When the consumers visit 1-3 times across a period of 4-6 days they tend to buy the most expensive products
  • When they visit the site 7-9 times across a very short period of 1-3 days, they too have a higher Average Order Value. These might be consumers who visit repeatedly to keep tabs on offers and prices for a product of their choice.
  • Spontaneous buying decisions are made when the Visits and Days to Transaction are in the 1-3 categories. As expected, the average order value for these transactions is on the lower end.

Let us do a similar exercise for the Conversion Rate. We extract data corresponding to the dimensions Visit Count and Days since Last Visit and the metric Visits, then repeat the same steps as before. We already had the Transactions binned and categorized earlier. We now divide the total number of Transactions in each category by the Total Visits in that category to get the Conversion Rate, and plot it.
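
The division described above can be sketched as a merge of two per-bin tables. The data frames and their numbers below are hypothetical and only show the shape of the calculation.

```r
# Hypothetical per-bin totals (illustrative numbers only)
txn_by_bin <- data.frame(
  visitBin     = c("1-3", "1-3", "4-6"),
  dayBin       = c("1-3", "4-6", "1-3"),
  transactions = c(20, 4, 3)
)
visits_by_bin <- data.frame(
  visitBin = c("1-3", "1-3", "4-6"),
  dayBin   = c("1-3", "4-6", "1-3"),
  visits   = c(1000, 400, 150)
)

# Line up transactions and visits for the same (visitBin, dayBin) category
conv <- merge(txn_by_bin, visits_by_bin, by = c("visitBin", "dayBin"))

# Conversion rate as a percentage of visits in each category
conv$conversionRate <- 100 * conv$transactions / conv$visits
print(conv)
```

Merging on both bin columns guards against dividing transactions from one category by visits from another when the two tables are not in the same row order.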

Conversion Rate

Both plots taken together help us understand the relationship between the Average Order Value and the Conversion Rate, which in this case seems to be an inverse relationship, i.e. AOV tends to be higher when the Conversion Rate is lowest. This correlation may not imply an underlying causation, so we need to drill down further to verify our hypothesis.

Consumers do tend to visit the website multiple times before making a purchase. Some might buy right away, but most of them will research a bit and come back. With this realization, we could focus on giving them more information to help with their research and getting them to convert at their own pace. On the other hand, if the time period is short (e.g. 1-3 Visits in 1-3 Days) and the purchase is more spontaneous, we have some room for improvement (Average Order Value: 6497, Conversion Rate: 1.37%). We could play with the pricing strategy to increase the Average Order Value, or provide a referral discount to get more consumers to convert. Of course, this has to be done keeping the site’s business objective in mind.
