{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Bike Sharing Demand" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The *Bike sharing* task uses historical data on weather conditions and bicycle rentals to predict the number of bicycles rented during a given hour of a given day.\n", "\n", "The original problem statement provides 11 features, which include real-valued, categorical, and binary data. This demonstration uses the training sample `bike_sharing_demand.csv` taken from the original data." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Library imports" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "from sklearn import model_selection, linear_model, metrics, pipeline, preprocessing\n", "\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Populating the interactive namespace from numpy and matplotlib\n" ] } ], "source": [ "%pylab inline" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Data loading" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "raw_data = pd.read_csv('bike_sharing_demand.csv', header=0, sep=',')" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n",
"|   | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |\n",
"|---|---|---|---|---|---|---|---|---|---|---|---|---|\n",
"| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 |\n",
"| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 |\n",
"| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 |\n",
"| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 |\n",
"| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 |\n",
"