Categories
Data Mining

Implementing frequent itemsets algorithm thru MapReduce

The problem of finding frequent itemsets in data analysis is described in this post; here I lay out the practical steps for finding frequent itemsets through MapReduce.
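
As a rough illustration of the idea, here is a minimal single-process sketch of counting frequent item pairs in map, shuffle and reduce phases. It is not the code from the post: the function names (`map_basket`, `shuffle`, `reduce_pair`, `run_job`) and the `min_support` threshold are assumptions made for this example, and a real job would run on a framework such as Hadoop rather than in plain Python.

```python
from collections import defaultdict
from itertools import combinations


def map_basket(basket):
    """Map phase: emit (pair, 1) for every unordered pair of distinct items."""
    for pair in combinations(sorted(set(basket)), 2):
        yield pair, 1


def shuffle(mapped):
    """Shuffle phase: group all emitted values by key (the item pair)."""
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return groups


def reduce_pair(pair, values, min_support):
    """Reduce phase: sum the counts and keep the pair only if it is frequent."""
    total = sum(values)
    return (pair, total) if total >= min_support else None


def run_job(baskets, min_support):
    """Run the three phases in order over an in-memory list of baskets."""
    mapped = (kv for basket in baskets for kv in map_basket(basket))
    reduced = (reduce_pair(k, v, min_support) for k, v in shuffle(mapped).items())
    return dict(r for r in reduced if r is not None)


if __name__ == "__main__":
    baskets = [["a", "b", "c"], ["a", "b"], ["b", "c"]]
    print(run_job(baskets, min_support=2))
```

Sorting each basket before pairing means the same two items always produce the same key, so the reducer sees all of their counts together.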

Categories
Data Mining

Data Mining: The AdWords Problem Review

This post is a continuation of the previous post on Advertising on the Web and Data mining. Here we conclude by reviewing some basic algorithms for placing ads on the web.

Categories
Data Mining

Advertising on the Web and Data mining

The challenge of effective web advertising is primarily one of placing relevant ads on the pages users request. Those ads must be relevant to the person viewing the page: relevant to the page context, to the user directly, or both. What algorithms are used for this? What are the current trends in business intelligence and data mining for digital advertising solutions?

Categories
Uncategorized

Clustering in a Parallel Environment and MapReduce

Having touched on some basics of clustering in data mining, we now consider the computation techniques applied to clusters. These techniques are in line with those used in data mining for web traffic analysis.

Categories
Development

How to scrape CSV data files

This short post is a guide to scraping CSV data files.

Categories
Data Mining

Clustering in Data Mining

Clustering is a data mining process where data are viewed as points in a multidimensional space. Points that are “close” in this space are assigned to the same cluster.
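
To make "close points end up in the same cluster" concrete, here is a minimal sketch using k-means with Euclidean distance. Both choices are assumptions made for illustration (the post may cover other algorithms and distance measures), and the function names and the starting centroids are hypothetical.

```python
import math


def assign_clusters(points, centroids):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def update_centroids(points, labels, old_centroids):
    """Move each centroid to the mean of the points assigned to it."""
    updated = []
    for j, old in enumerate(old_centroids):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            updated.append(old)  # keep an empty cluster's centroid in place
            continue
        dims = len(members[0])
        updated.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(dims))
        )
    return updated


def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = assign_clusters(points, centroids)
    for _ in range(max_iter):
        updated = update_centroids(points, labels, centroids)
        if updated == centroids:
            break
        centroids = updated
        labels = assign_clusters(points, centroids)
    return labels, centroids


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(pts, centroids=[(0, 0), (10, 10)]))
```

The starting centroids are passed in explicitly so the run is repeatable; picking them at random is the more common choice in practice.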

Categories
Data Mining

Frequent Itemset Challenge in Data Mining

In business intelligence (and in data mining in general), a common need is to find the items that frequently appear together in a consumer's basket.

Categories
Miscellaneous

Ethical issues of using employee monitoring software

Employee monitoring software has become commonplace. Many apps take screenshots, capture keystrokes and mouse movements, monitor active applications and visited sites and, in extreme cases, can even take pictures using a webcam. It seems fair to track what your employees do when they are being paid for their time. After all, if they exchange their time for money, the employer should be able to know what it is paying for. So why does it still feel morally inappropriate in some cases? The question is far from theoretical. If the wrong decision is made, a company may face lawsuits, a backlash from its employees, an overall drop in productivity (the opposite of what was intended), or damage to its image. Let's review in more detail which employee monitoring practices can be considered valid and which should be avoided.

Categories
SEO and Growth Hacking

Search term correlations for business intelligence

In the previous post I showed how to use website traffic data (a time series) to find correlations with organic search queries from the Google database over a given time span. Here I want to show two more features of Google Trends (formerly Correlate): (1) finding search terms whose pattern of activity over time is similar to that of a custom query and (2) finding query terms whose popularity over time matches any shape you draw. These features provide some insight into search traffic optimization and can complement Google Webmaster Tools.
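
To illustrate what "a similar pattern of activity over time" can mean, here is a minimal sketch that ranks candidate search terms by how strongly their time series correlate with a target series. It assumes the Pearson correlation coefficient as the similarity measure, which this excerpt does not specify; the function names and the sample numbers are made up for the example and are not real search data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length (at least 2)")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


def rank_terms(target, candidates):
    """Sort candidate terms by correlation with the target, strongest first."""
    scores = {term: pearson(target, series) for term, series in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    weekly_visits = [1, 2, 3, 4]  # made-up traffic series
    query_volumes = {  # made-up search-volume series
        "rising term": [2, 4, 6, 8],
        "falling term": [8, 6, 4, 2],
    }
    for term, r in rank_terms(weekly_visits, query_volumes):
        print(f"{term}: {r:+.2f}")
```

A coefficient near +1 means the term rises and falls together with the target, near -1 means it moves in the opposite direction, and near 0 means there is no linear relationship.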

Categories
SEO and Growth Hacking

Google Correlate for web traffic analysis

Both my partner and I were asking: what factors influence website traffic? How does one find correlations in business intelligence related to organic searches? This post was born out of my attempt to combine traffic data from the business blog (sourced from Google Analytics) with real organic queries made in Google, in order to gain some insight into which items my traffic correlates with and, specifically, how these items (i.e. those that people are searching for) might have influenced website traffic.