Tag: data mining

Data Science, Development

Linear models, Sklearn.linear_model, Classification

Post author By admin
Post date February 17, 2021

In this post we’ll show how to build linear classification models using the sklearn.linear_model module.
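As a rough illustration of the idea (not the notebook's own code), the sketch below fits two linear classifiers from sklearn.linear_model on synthetic data. The dataset, the train/test split and the choice of LogisticRegression and RidgeClassifier are assumptions made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import train_test_split

# Synthetic two-class data (an assumption; the notebook may use other data).
X, y = make_classification(
    n_samples=200, n_features=4, n_informative=2, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Fit each linear classifier and record its accuracy on the held-out part.
scores = {}
for model in (LogisticRegression(), RidgeClassifier()):
    model.fit(X_train, y_train)
    scores[type(model).__name__] = model.score(X_test, y_test)

print(scores)
```

Both estimators share the same fit/predict/score interface, so swapping one linear model for another is a one-line change.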

The code as an IPython notebook

sklearn.linear_model_part1 Download

Tags classification, data mining, Linear Regression, Python

Data Science

Adding regularization into Linear Regression model

Post author By admin
Post date February 17, 2021

Regularization applies a penalty on the magnitude of parameter values in order to reduce overfitting. When you train a model such as a logistic regression model, you choose the parameters that give the best fit to the data. This means minimizing the error between what the model predicts for your dependent variable, given your data, and what your dependent variable actually is.

See the practical example of how to deal with overfitting using regularization.
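A minimal sketch of the effect, assuming an L2 (ridge) penalty and synthetic data in which only the first of ten features matters. The value alpha=10.0 is arbitrary and chosen only for illustration; the linked post may use a different model or penalty.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

rng = np.random.default_rng(0)

# Few samples, many features: a setting where plain least squares overfits.
X = rng.normal(size=(30, 10))
y = X[:, 0] + 0.5 * rng.normal(size=30)  # only the first feature is informative

ols = LinearRegression().fit(X, y)   # no penalty
ridge = Ridge(alpha=10.0).fit(X, y)  # L2 penalty on the coefficients

ols_norm = np.linalg.norm(ols.coef_)
ridge_norm = np.linalg.norm(ridge.coef_)
print("OLS coefficient norm:  ", ols_norm)
print("Ridge coefficient norm:", ridge_norm)
```

The ridge coefficients come out with a smaller norm than the unpenalized ones, which is the shrinkage that the penalty is meant to produce.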

Tags data mining, Linear Regression

Data Science

Cross-validation strategies and their application

Post author By admin
Post date February 16, 2021

In this post we’ll get to know the cross-validation strategies available in Sklearn and show how to perform k-fold cross-validation. All the IPython notebook code is valid for Python 3.6.
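For orientation, here is a small sketch of k-fold splitting. It uses sklearn.model_selection, which replaced the older sklearn.cross_validation module in current scikit-learn releases; the toy array and the choice of 5 folds are assumptions for this example.

```python
import numpy as np
from sklearn.model_selection import KFold

# Ten toy samples with two features each (illustrative data only).
X = np.arange(20).reshape(10, 2)

# Five folds: each sample lands in the test part exactly once.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
folds = [(train_idx, test_idx) for train_idx, test_idx in kf.split(X)]

for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)
```

Each iteration yields index arrays rather than data, so the same splitter can be reused for features and targets alike.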

The IPython notebook code

sklearn.cross_validation-ENG Download

Tags data mining, Linear Regression, Python

Data Science, Development

Work with inbuilt datasets of Sklearn and Seaborn libraries

Post author By admin
Post date February 12, 2021

In this post we will show how to generate synthetic model data and load standard datasets using the sklearn.datasets module. The code targets Python 3.
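The two kinds of helpers can be sketched as follows: a loader for a bundled dataset and a generator for synthetic data. The specific functions picked here (load_iris and make_blobs) and their parameters are an illustrative choice, not necessarily the ones used in the notebook.

```python
from sklearn.datasets import load_iris, make_blobs

# Load a standard dataset bundled with scikit-learn (no download needed).
iris = load_iris()
X_iris, y_iris = iris.data, iris.target
print(X_iris.shape, iris.target_names)

# Generate synthetic data: 100 two-dimensional points around 3 centers.
X_blobs, y_blobs = make_blobs(
    n_samples=100, centers=3, n_features=2, random_state=0
)
print(X_blobs.shape, set(y_blobs))
```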

The code as an IPython notebook

sklearn-datasets-ENG Download

Tags data mining, Linear Regression, Python

Data Science

Linear regression and Stochastic Gradient Descent

Post author By admin
Post date February 8, 2021

In this post we’ll show how to build a linear regression model for a data set and use stochastic gradient descent to optimize the model parameters. As in a previous post, we’ll calculate the MSE (mean squared error) and minimize it.
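A minimal sketch of the procedure, assuming a one-feature model y ≈ w0 + w1·x, synthetic data, and a fixed learning rate; all of these are assumptions for illustration and may differ from the post's setup. Each step picks one random sample and moves the parameters against the gradient of that sample's squared error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data around the line y = 1 + 2x (illustrative only).
x = rng.uniform(0, 1, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=200)

def mse(w0, w1):
    """Mean squared error of the line w0 + w1*x over the whole data set."""
    return np.mean((w0 + w1 * x - y) ** 2)

w0, w1 = 0.0, 0.0
learning_rate = 0.05  # assumed value, not tuned
initial_mse = mse(w0, w1)

for step in range(5000):
    i = rng.integers(len(x))       # pick one sample at random
    err = w0 + w1 * x[i] - y[i]    # prediction error on that sample
    w0 -= learning_rate * 2 * err  # d(err^2)/d(w0) = 2*err
    w1 -= learning_rate * 2 * err * x[i]  # d(err^2)/d(w1) = 2*err*x

final_mse = mse(w0, w1)
print(w0, w1, initial_mse, final_mse)
```

Because every update uses a single sample, the parameters keep jittering around the optimum; a decaying learning rate is a common way to reduce that noise.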

Tags data mining, Linear Regression, Python, statistics

Data Science

Linear Regression application for data analysis and scientific computing

Post author By admin
Post date January 27, 2021

In this post we’ll share a vivid yet simple application of linear regression. We’ll use the example of predicting a person’s height from their weight, and you’ll see what kind of math is behind it. We will also introduce the basic Python libraries needed for data analysis.
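To give a feel for the math, the sketch below fits a straight line height ≈ intercept + slope · weight by ordinary least squares with NumPy. The data are synthetic and the numbers (units, means, noise level) are made up for illustration; they are not the dataset used in the notebook.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic sample (assumed units: kg and cm), not real measurements.
weight = rng.normal(70, 10, size=100)
height = 100 + 1.0 * weight + rng.normal(0, 5, size=100)

# Least-squares fit of a degree-1 polynomial: returns [slope, intercept].
slope, intercept = np.polyfit(weight, height, deg=1)

# Predict the height for a hypothetical person weighing 80 kg.
predicted = intercept + slope * 80.0
print(slope, intercept, predicted)
```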

The IPython notebook code

Linreg_height_weight-ENG Download

Tags data mining, Linear Regression, machine learning, Python

Data Science

Classification vs Clustering in Machine Learning

Post author By admin
Post date January 20, 2021

In the post we share some basics of classification and clustering in Machine learning. We also review some of the cluster analysis methods and algorithms.
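The difference can be sketched in a few lines: a classifier learns from known labels, while a clustering algorithm groups the points without seeing any labels. The synthetic blobs and the choice of LogisticRegression and KMeans are assumptions for this example.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic points in three groups; y holds the true group of each point.
X, y = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)

# Classification (supervised): the model is trained on X together with y.
clf = LogisticRegression(max_iter=1000).fit(X, y)
predicted_labels = clf.predict(X)

# Clustering (unsupervised): the model sees only X and invents its own groups.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
cluster_ids = km.labels_

print(predicted_labels[:10])
print(cluster_ids[:10])
```

Note that the cluster ids are arbitrary: cluster 0 need not correspond to class 0, even when the grouping itself matches the true classes.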

Tags clustering, data mining, machine learning

Data Science

Weibull distribution & sample averages approximation using Python and scipy

Post author By admin
Post date January 12, 2021

In this post we show how to plot a histogram for the Weibull distribution and the distribution of its sample averages, as approximated by the Normal (Gaussian) distribution. We’ll show how the approximation accuracy changes as the sample size increases.
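A numerical sketch of the same idea, without the plots: by the central limit theorem, averages of n Weibull draws are approximately Normal with mean μ and standard deviation σ/√n. The shape parameter 1.5, the sample sizes and the use of a Kolmogorov–Smirnov statistic as the accuracy measure are assumptions made for this example.

```python
import numpy as np
from scipy import stats

shape_k = 1.5                        # Weibull shape parameter (assumed value)
dist = stats.weibull_min(c=shape_k)  # scale defaults to 1
mu, sigma = dist.mean(), dist.std()

rng = np.random.default_rng(0)
results = {}
for n in (5, 50, 500):
    # 2000 independent samples of size n, each reduced to its average.
    means = dist.rvs(size=(2000, n), random_state=rng).mean(axis=1)
    # Normal approximation suggested by the central limit theorem.
    clt = stats.norm(loc=mu, scale=sigma / np.sqrt(n))
    # KS statistic: largest gap between empirical and approximating CDFs.
    ks = stats.kstest(means, clt.cdf).statistic
    results[n] = (means.mean(), means.std(), ks)
    print(n, results[n])
```

A smaller KS statistic means the Normal curve tracks the distribution of averages more closely.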

One may get the full .ipynb file here.

Tags data mining, Python

Development

Invalid data, what it is?

Post author By admin
Post date January 8, 2021

We often see the terms “invalid data”, “clean data” and “normalize data”. What do they mean in practical data extraction, and how does one deal with them? One shot is better than 1000 words though:

Tags data mining

Data Science

Simple text analysis with Python

Post author By admin
Post date January 1, 2021

Finding the sentence(s) in a text most similar to a given sentence, in fewer than 40 lines of code 🙂
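One possible way to do this, sketched with the standard library only: represent each sentence as a bag of word counts and pick the sentence with the highest cosine similarity to the query. The tokenizer, the similarity measure and the sample sentences are assumptions for illustration; the post itself may use a different approach.

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    """Lowercase the sentence and split it into words."""
    return re.findall(r"[a-z']+", sentence.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def most_similar(query, sentences):
    """Return the sentence whose word counts are closest to the query's."""
    q = Counter(tokenize(query))
    return max(sentences, key=lambda s: cosine(q, Counter(tokenize(s))))

sentences = [
    "Cats sleep most of the day.",
    "Linear regression fits a line to data.",
    "Dogs bark at strangers.",
]
best = most_similar("How does regression fit a line?", sentences)
print(best)
```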

Tags data mining, machine learning, Python