“Other” Problem in Content Report
Google Analytics can process up to 50,000 unique URLs in a single day. This creates a problem for relatively large accounts where the number of daily unique URLs exceeds the limit: every URL after the first 50,000 is grouped into “Other” in the content reports. The problem becomes severe on sites where dynamic URL query parameters are not excluded, because the same page is then recorded multiple times as a unique URL, once for each combination of query parameters. On such websites the 50k limit is reached very quickly.
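To make the mechanism concrete, here is a rough Python sketch of the roll-up described above. It is only a simplified model and not how Google Analytics is implemented internally: the function names `strip_query` and `roll_up`, the sample URLs, and the tiny limit of 2 used in the demo are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

DAILY_LIMIT = 50_000  # daily unique-URL limit quoted in the text


def strip_query(url: str) -> str:
    """Drop the query string and fragment so URL variants collapse into one."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def roll_up(pageviews, limit=DAILY_LIMIT):
    """Count pageviews per URL; once `limit` unique URLs have been seen,
    every new URL is counted under "(other)" (simplified model)."""
    counts = Counter()
    seen = set()
    for url in pageviews:
        if url in seen or len(seen) < limit:
            seen.add(url)
            counts[url] += 1
        else:
            counts["(other)"] += 1
    return counts


# Hypothetical hits: one page reached with different tracking parameters.
hits = ["/shoes?sid=1", "/shoes?sid=2", "/shoes?sid=3", "/bags?sid=4"]

raw = roll_up(hits, limit=2)                              # query strings kept
clean = roll_up([strip_query(u) for u in hits], limit=2)  # query strings removed

print(raw)    # two hits end up under "(other)"
print(clean)  # only "/shoes" and "/bags" remain, nothing under "(other)"
```

With the query strings kept, the third and fourth hits fall into “(other)”. With them stripped, the same traffic fits within the limit, which is why unexcluded query parameters make the problem so much worse.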
Working with data every day to help clients unlock hidden business potential through data insights is our core activity. While working with one of the largest e-commerce companies, this peculiar problem hit us very hard: all the important tracking elements were captured in query parameters in the Google Analytics content reports, and because of the many unique URLs and the heavy traffic, the 50k threshold was always reached within a few hours. The result was a bulging “Other” row in the content reports (refer to the content report below). We realized that it was difficult to analyze the URLs in the “Other” segment, and that we were losing important data for our analysis.
The problem did not end there. The URL contamination behind the “Other” content problem also affected site search data, and many internal site search terms were grouped under the “Other” category as well, as shown in the diagram below.
Solution to the “Other” Problem in Content Report
Google Analytics collects website data round the clock but processes it intermittently in chunks, typically every 3 or 4 hours, although it can sometimes take up to 24 hours. We therefore decided to analyze the time pattern of the problem to find out during which hours the number of unique URLs exceeds 50,000.
We created an advanced segment on the hour dimension, starting with hour 00 and adding incremental values (i.e. 01, 02, 03 etc.), and checked the content reports after each addition to see after how many hours the “Other” problem arises, i.e. when the number of unique URLs exceeds the 50k threshold.
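If you prefer a single regular-expression condition on the hour dimension over adding the hours one by one, the incremental values above can be generated with a small Python helper. This is only a sketch: the helper name `hour_segment_regex` is hypothetical, and it assumes the segment condition accepts a regex match on two-digit hour values (00 to 23).

```python
import re


def hour_segment_regex(last_hour: int) -> str:
    """Build a regex matching the two-digit hours 00..last_hour (inclusive)."""
    if not 0 <= last_hour <= 23:
        raise ValueError("last_hour must be between 0 and 23")
    hours = "|".join(f"{h:02d}" for h in range(last_hour + 1))
    return f"^({hours})$"


# Grow the segment hour by hour, as in the experiment described above.
print(hour_segment_regex(3))   # ^(00|01|02|03)$

# The full-day pattern (00 to 23) matches every hour value.
full_day = hour_segment_regex(23)
matches_all = all(re.match(full_day, f"{h:02d}") for h in range(24))
print(matches_all)             # True
```

Paste the generated pattern into the segment condition and re-check the content report after each step.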
We ended up adding every hour up to 23, and “Other” still did not show up in the content reports. Voila: with the segment applied, the URL limit no longer kicks in, and the content reports now show more than 49,999 rows of data, as shown in the report below.
If you enter row number 100,000 in the “Go to” box at the bottom of the page, you can even see GA show rows starting from row 100,000.
Applying this segment also solves the problem with site search terms: instead of 41k terms, the search terms report now shows 49k terms. The search terms that were grouped under the “Other” category are now available for better data analysis, as shown in the diagram below.
You can also take advantage of this nifty trick to unlock the “Other” problem in content and internal site search reports. Create an advanced segment as shown in the diagram below to get this quick workaround for the content report problem.
The caveat is that this trick does not protect against sampling: if a longer date range is applied to the data set, sampling will still be applied.
If you are a data geek who faces this “Other” content issue and needs to pull more data for better analysis of your clients’ web properties, you can create this hourly advanced segment and pull larger data sets via the Tatvic Google Analytics Excel plugin. If you haven’t used the Tatvic tool before, you can register and start using it.
That’s all from my side. Please share your comments, inputs, and feedback.