Simple linear regression is a statistical method for summarising and studying the relationship between two continuous (quantitative) variables. Linear regression is a linear model: it assumes a linear relationship between the input variables (x) and the single output variable (y), so that y can be calculated as a linear combination of the inputs. When there is a single input variable (x), the method is called simple linear regression. When there are multiple input variables, the procedure is referred to as multiple linear regression.

Applications: salary forecasting, real estate price prediction, etc.

y = mx + b → Linear Equation

The goal of the linear regression algorithm is to find the best values for m and b. Before moving on to the algorithm, let's look at two important concepts you need in order to understand linear regression: the cost function and gradient descent.

Cost Function: The cost function lets us measure how good any particular choice of m and b is, so we can find the values that give the best-fit line for the data points. Since we want the best values for m and b, we turn this search into a minimization problem: minimize the error between the predicted values and the actual values.

Math

Given our simple linear equation y = mx + b, we can calculate MSE (the mean squared error) over N data points as:

MSE = (1/N) * Σ (y_i – (mx_i + b))^2
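As a quick sketch of this calculation in NumPy (the helper name mse and the toy data below are illustrative, not from the article):

```python
import numpy as np

def mse(x, y, m, b):
    """Mean squared error of the line y_hat = m*x + b over all data points."""
    y_pred = m * x + b
    return np.mean((y - y_pred) ** 2)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 4.0, 6.0])

mse(x, y, 2.0, 0.0)  # the line y = 2x fits these points exactly: 0.0
mse(x, y, 1.0, 0.0)  # a worse line gives a larger error
```

A perfect fit drives the MSE to zero; any deviation from the data increases it, which is exactly what minimization will exploit.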

Gradient Descent

To minimize the MSE we use gradient descent, an iterative optimisation algorithm: we calculate the gradient of our cost function, then repeatedly update m and b by a small step in the direction opposite to the gradient (scaled by a learning rate), which moves us downhill on the cost surface until the error stops decreasing.

Math

There are two parameters (coefficients) in our cost function that we can control: the weight m and the bias b. Since we need to consider the impact each one has on the final prediction, we use partial derivatives. To find the partial derivatives, we use the chain rule. We need the chain rule because (y – (mx + b))^2 is really two nested functions: the inner function u = y – (mx + b) and the outer function u^2.
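The chain-rule result can be sanity-checked symbolically. A minimal sketch using SymPy (not part of the original article):

```python
import sympy as sp

m, b, x, y = sp.symbols('m b x y')
loss = (y - (m*x + b))**2  # squared error for a single data point

# SymPy applies the chain rule for us
d_m = sp.diff(loss, m)
d_b = sp.diff(loss, b)

# Compare against the hand-derived forms:
#   d(loss)/dm = -2x * (y - (mx + b))
#   d(loss)/db = -2  * (y - (mx + b))
assert sp.simplify(d_m - (-2*x*(y - (m*x + b)))) == 0
assert sp.simplify(d_b - (-2*(y - (m*x + b)))) == 0
```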

Returning to our cost function:

f(m, b) = (1/N) * Σ (y_i – (mx_i + b))^2

We can calculate the gradient of this cost function as:

∂f/∂m = (1/N) * Σ –2x_i * (y_i – (mx_i + b))
∂f/∂b = (1/N) * Σ –2 * (y_i – (mx_i + b))
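These partial derivatives plug directly into the gradient descent update rule. A minimal from-scratch sketch (the learning rate, epoch count, and toy data are illustrative choices, not values from the article):

```python
import numpy as np

def gradient_descent(x, y, lr=0.05, epochs=2000):
    """Fit y ~ m*x + b by minimizing MSE with gradient descent."""
    m, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        error = y - (m * x + b)
        # Partial derivatives of the MSE cost with respect to m and b
        dm = (-2.0 / n) * np.sum(x * error)
        db = (-2.0 / n) * np.sum(error)
        # Step opposite to the gradient, scaled by the learning rate
        m -= lr * dm
        b -= lr * db
    return m, b

# Toy data lying exactly on y = 3x + 2
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3 * x + 2
m, b = gradient_descent(x, y)  # m ≈ 3, b ≈ 2
```

Because the toy data is noise-free, the loop recovers the true slope and intercept; on real data it converges to the least-squares fit instead.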

Python Implementation

# Simple Linear Regression

# Importing the libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Salary_Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, 1].values

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=0)

# Fitting Simple Linear Regression to the Training set
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)

# Predicting the Test set results
y_pred = regressor.predict(X_test)

# Visualising the Training set results
plt.scatter(X_train, y_train, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.title('Salary vs Experience (Training set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()

# Visualising the Test set results
plt.scatter(X_test, y_test, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.title('Salary vs Experience (Test set)')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
