Linear regression is a supervised machine learning algorithm. It is a statistical model used to estimate the linear relationship between two or more variables. Datasets with quantitative (numeric) outcomes are typically modeled with regression. Linear regression models fit a straight line, while logistic and nonlinear regression models fit a curved line. There are two types of linear regression: simple linear regression and multiple linear regression.
Independent variables do not change based on the effects of other variables. They are also known as explanatory variables or predictors.
The value of the dependent variable changes when the independent variable changes. It is also known as the outcome or response variable.
Example of response and predictors:
In a dataset containing the quality and price of a product, price is the dependent variable or outcome, which changes according to the quality (predictor) of the product.
The general equation of a straight line is used as the regression equation:
y = a + b x
y – dependent variable
x – independent variable
b – the slope of the line (the regression coefficient)
a – the intercept, where the line crosses the y-axis
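The slope and intercept can be computed directly with the ordinary least squares formulas. Below is a minimal sketch in plain Python; the data values are invented purely for illustration.

```python
# Fit y = a + b*x by ordinary least squares using the closed-form formulas.
# The data points here are made-up illustrative values.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / \
    sum((xi - mean_x) ** 2 for xi in x)

# intercept a = mean_y - b * mean_x
a = mean_y - b * mean_x

print("slope:", b, "intercept:", a)
```

For this toy data the fitted line is roughly y = 0.05 + 1.99x, i.e. y increases by about 2 for each unit increase in x.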
Residual Error and Mean Square Error:
The residual error is the vertical distance between a data point and the fitted regression line. Mean Square Error (MSE) is the average of the squared residuals, and Root Mean Square Error (RMSE), its square root, is the standard deviation of the residuals. The smaller the mean square error, the better the regression model fits the data.
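These quantities are simple to compute once a line has been fitted. The sketch below assumes an already-fitted line y = 1 + 2x and a few illustrative data points.

```python
# Compute residuals, MSE, and RMSE for an assumed fitted line y = 1 + 2*x.
# Both the line and the data points are illustrative, not from a real fit.
import math

x = [1, 2, 3, 4]
y = [3.2, 4.8, 7.1, 8.9]
a, b = 1.0, 2.0  # assumed intercept and slope

predictions = [a + b * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predictions)]

mse = sum(r ** 2 for r in residuals) / len(residuals)  # mean of squared residuals
rmse = math.sqrt(mse)                                  # square root of MSE

print("MSE:", mse, "RMSE:", rmse)
```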
The main idea of the linear regression algorithm is to obtain a line that best fits the data. The line of best fit is a line through a plot of data points that best expresses the relationship between those points. For the best-fit line, the total prediction error across all data points is as small as possible.
SIMPLE LINEAR REGRESSION :
The simple linear regression model establishes a relationship between one dependent variable and one independent variable.
Example of simple linear regression: predicting the salary (dependent variable) of an employee from the years of experience (independent variable).
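The salary example can be sketched with numpy's least-squares polynomial fit; the salary figures below are invented for illustration.

```python
# Simple linear regression: salary as a function of years of experience.
# Salary figures are made-up illustrative values.
import numpy as np

years = np.array([1, 2, 3, 4, 5], dtype=float)
salary = np.array([40_000, 45_000, 52_000, 57_000, 63_000], dtype=float)

# Fit a degree-1 polynomial (a straight line) by least squares.
slope, intercept = np.polyfit(years, salary, deg=1)

# Use the fitted line to predict the salary at 6 years of experience.
predicted = intercept + slope * 6
print("predicted salary at 6 years:", round(predicted))
```

Here the fitted line suggests salary grows by about 5,800 per year of experience for this toy data.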
MULTIPLE LINEAR REGRESSION :
Multiple linear regression finds a relationship between one dependent variable and several independent variables. It can determine the relative effect of each predictor on the outcome.
Example of multiple linear regression: predicting the salary of an employee from multiple variables such as years of experience, age, and so on.
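With several predictors, the same least-squares idea applies; the sketch below uses two predictors (years of experience and age) with invented data, solved via numpy's least-squares routine.

```python
# Multiple linear regression with two predictors, solved by least squares.
# The employee data here is made up purely for illustration.
import numpy as np

# Each row: [years_experience, age]
X = np.array([[1, 23], [3, 27], [5, 31], [7, 36], [9, 40]], dtype=float)
y = np.array([40_000, 50_000, 61_000, 72_000, 83_000], dtype=float)

# Prepend a column of ones so the model includes an intercept term.
X_design = np.column_stack([np.ones(len(X)), X])

# Solve for [intercept, b_experience, b_age] minimizing squared error.
coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print("coefficients:", coef)
```

Each coefficient estimates how much the outcome changes per unit change in that predictor, holding the others fixed.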
Advantages of Linear Regression:
The main advantage of linear regression models is linearity: it makes the estimation procedure simple, and the linear equations are easy to interpret.