Having touched on some basics of clusters in data mining, we now want to consider the computation techniques applied to clusters. These techniques tie in with data mining for web traffic analysis.
Month: February 2013
How to scrape CSV data files
This short post is a guide to scraping CSV data files.
Clustering in Data Mining
Clustering is a data mining process where data are viewed as points in a multidimensional space. Points that are “close” in this space are assigned to the same cluster.
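To make the idea of "close" points concrete, here is a minimal sketch that groups 2-D points with k-means, using Euclidean distance as the closeness measure. Both choices are assumptions made for illustration, as are the function names and the sample points; the post itself does not name a specific algorithm or distance.

```python
import math

def assign_to_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean distance)."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels

def update_centroids(points, labels, k):
    """Recompute each centroid as the mean of the points assigned to it."""
    dims = len(points[0])
    centroids = []
    for cluster in range(k):
        members = [p for p, label in zip(points, labels) if label == cluster]
        if not members:
            raise ValueError(f"cluster {cluster} has no points")
        centroids.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return centroids

def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the labels stop changing."""
    centroids = list(initial_centroids)
    labels = assign_to_clusters(points, centroids)
    for _ in range(max_iterations):
        centroids = update_centroids(points, labels, len(centroids))
        new_labels = assign_to_clusters(points, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

# Hypothetical sample data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centroids = kmeans(points, initial_centroids=[points[0], points[3]])
print(labels)
```

The initial centroids are picked by hand here to keep the run deterministic; real data would usually call for random or smarter initialization and a library implementation.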
In Business Intelligence (and in data mining in general) a common need is to find the items that frequently go together in a consumer basket.
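As a rough illustration of what "items that frequently go together" means, the sketch below counts how many baskets contain each pair of items and keeps the pairs that reach a minimum count. The baskets, the `frequent_pairs` name and the `min_support` threshold are all made up for this example; it is a brute-force count, not an optimized algorithm such as Apriori.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(baskets, min_support):
    """Count item pairs and keep those that appear in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) drops duplicate items and gives every pair one canonical order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

# Hypothetical consumer baskets.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "bread"],
    ["butter", "milk"],
]
pairs = frequent_pairs(baskets, min_support=2)
print(pairs)
```

Counting every pair works for small baskets; with many distinct items the number of pairs grows quickly, which is the problem dedicated frequent-itemset algorithms are designed to address.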
Employee monitoring software has become commonplace. Many apps take screenshots of the monitor, capture keystrokes and mouse movements, track active applications and visited sites and, in extreme cases, can even take pictures using the webcam. It seems fair to track what your employees do when they are being paid for their time. After all, if they exchange their time for money, the employer should be able to know what they are paying for. So why does it still feel morally inappropriate in some cases? The question is far from being just theoretical. If a wrong decision is made, a company may face lawsuits, experience a backlash and an overall productivity drop from its employees (the opposite of what was intended) or suffer damage to its image. Let’s review in more detail which employee monitoring practices can be considered valid and which should be avoided.
In the previous post I showed how to use website traffic data (a time series) to find correlations with organic search queries from the Google database over a given time span. Here I want to show two more features of Google Trends (formerly Correlate): (1) finding search terms whose pattern of activity over time is similar to that of a custom query and (2) finding query terms whose popularity over time matches any shape you draw. These features provide some insight into search traffic optimization and might serve as a support tool alongside Google Webmaster Tools.
Both my partner and I were asking: what factors influence website traffic? How does one find correlations in business intelligence related to organic searches? This post was born out of my attempt to bring together traffic data from the business blog (the data source being Google Analytics) and real organic queries made in Google, in order to get some insight into which items my traffic correlates with and, specifically, how these items (i.e. those that people are searching for) might have influenced website traffic.
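One simple way to quantify how closely two time series move together is the Pearson correlation coefficient. The sketch below applies it to two made-up weekly series; the numbers and variable names are hypothetical, and this is not necessarily the measure Google uses internally.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)

# Hypothetical data: weekly site visits and relative search interest for one query.
weekly_visits = [120, 135, 150, 160, 155, 170]
query_interest = [40, 44, 52, 55, 53, 60]
r = pearson(weekly_visits, query_interest)
print(round(r, 3))
```

A value near 1 means the two series rise and fall together, a value near -1 means they move in opposite directions, and neither case shows on its own that the queries caused the traffic.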
This table is what I’ve scraped in 1 min from pinterest.com to access all the categories and the links to corresponding Pinterest pages.