Search term correlations for business intelligence

In the previous post I showed how to use website traffic data (a time series) to find correlations with organic search queries from Google's database over a given time span. Here I want to show two more features of Google Trends (formerly Correlate): (1) finding search terms whose pattern of activity over time is similar to that of a custom query and (2) finding query terms whose popularity over time matches any shape you draw. These features provide some insight into search traffic optimization and can serve as a supporting tool alongside Google Webmaster Tools.

Search Term Correlation Analysis

Whenever we build a business blog, we need a distinctive semantic core that makes the blog stand out from the heap of web content. The most fitting tool both for measuring a website's semantic core and for seeing the number of search queries is Google Webmaster Tools. However, the change over time in a particular organic search term might be closely related to factors that we do not want to miss. These correlations might not be immediately evident, but they can prompt bloggers or the site's SEO specialist to search further and analyze in more detail. That’s where Google Trends can play its part.

So, first we open Google Trends, enter in the top box the query for which we want to find correlations over time, and click “Search correlations”.

The Google datasets contain search query data starting from 2005. That time span should be long enough for exploring semantics. The results returned are the queries with the highest Pearson correlation coefficient relative to the entered query, plus a time graph of the frequency of the given and correlated terms.
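To make the ranking idea concrete, here is a minimal sketch of scoring candidate queries against a target query by Pearson correlation. It is only an illustration of the coefficient itself, not Google's implementation; the function names and the weekly values below are made up for the example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_by_correlation(target, candidates, top=10):
    """Return up to `top` (coefficient, name) pairs, highest coefficient first."""
    scored = [(pearson(target, series), name) for name, series in candidates.items()]
    scored.sort(reverse=True)
    return scored[:top]


# Hypothetical weekly search volumes, invented for this example.
target = [10, 12, 15, 11, 18, 20]
candidates = {
    "query a": [20, 24, 30, 22, 36, 40],  # target doubled: moves in lockstep
    "query b": [20, 18, 15, 19, 12, 10],  # 30 minus target: mirror image
    "query c": [5, 9, 4, 8, 6, 7],        # no obvious relation
}
ranking = rank_by_correlation(target, candidates, top=3)
for coefficient, name in ranking:
    print(f"{coefficient:.4f} {name}")
```

Note that the coefficient ignores scale: a query with twice the volume but the same shape over time still scores at the top, which is why terms of very different popularity can appear together in the result list.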

Let’s look at my simple results from the Correlate lab. The query “data mining” surfaced the following highly correlated queries (r > 0.97); here are the top 10:

Coefficient Query
0.9791 distributed
0.976 c++
0.976 java code
0.9747 c code
0.9742 c++ code
0.9728 algorithms
0.9725 modulation
0.9725 e-commerce
0.9715 implementation
0.9711 statistical

That gives me some more insight for developing the semantic core. The terms c++, modulation, e-commerce, statistical (already in the semantic core) and algorithms directed me toward further study in Google Webmaster Tools. I now have some guidance on how to raise the website in search results through these terms, if properly applied.

Another result I found useful is the statistical curve for the query “scraping” (blue line):

The graph shows seasonal low points in people’s interest in this search term, and therefore in visitor activity on related web pages. The Christmas season has a major influence. Interest in this term also rose in 2011 and 2012. Interestingly, I did not find this tendency for the same term in Google Trends.

Most of the terms for which I searched correlations returned unrelated, mostly weird, results. This tool, I think, is one for elementary data mining, suitable for quick, non-professional searching.

Draw a curve – get statistics

Another capability of the correlation engine is finding query terms whose popularity over time matches a shape you draw. Just go to Search by Drawing to try it. The main idea is to have the search engine pick out the stored queries whose activity follows a custom curve. In my opinion this is a less practical tool. It can seldom be applied in business intelligence, and particularly not in web traffic analysis.
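The idea of matching a drawn shape can be sketched as follows: stretch the drawn curve to the length of the stored series, then pick the stored query that correlates best with it. This is an assumption about how such matching could work in principle, not a description of Google's engine; the function names, query labels and values are hypothetical.

```python
from math import sqrt


def resample(points, n):
    """Linearly interpolate a drawn curve (a list of y values) to n samples."""
    if n < 2 or len(points) < 2:
        raise ValueError("need at least two points and two samples")
    last = len(points) - 1
    out = []
    for i in range(n):
        pos = i * last / (n - 1)   # position on the drawn curve
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(points[lo] * (1 - frac) + points[hi] * frac)
    return out


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_match(drawn, stored):
    """Name of the stored series that correlates best with the drawn curve."""
    return max(
        stored,
        key=lambda name: pearson(resample(drawn, len(stored[name])), stored[name]),
    )


# A hand-drawn "peak in the middle" shape and three invented stored series.
drawn = [0, 5, 10, 5, 0]
stored = {
    "peaks mid-period": [1, 2, 4, 7, 9, 7, 4, 2, 1],
    "steady growth": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "steady decline": [9, 8, 7, 6, 5, 4, 3, 2, 1],
}
best = best_match(drawn, stored)
print(best)
```

Because only the shape matters, the units of the drawing are irrelevant, which also helps explain why the matches can look arbitrary: any stored query with a similar outline qualifies, whatever its topic.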

Conclusion

Computing correlations with Google search queries stored over the years is an interesting means for researchers to study online organic behavior. I have attempted to apply it to web traffic analysis, covering both time series (previous post) and the semantic core correlation outlook.
