A Basic Overview of Linear Regression

Andromeda AI
May 11, 2021
3 min read

Linear regression is a machine learning algorithm that is commonly used for simple problems that can be solved with a straight-line graph. To understand this idea, we have to explore the definition of linear regression in basic math terms. In math, "linear" refers to a line, which is something completely straight.

The formula for a line is often written as y = mx + b, where (x, y) are the changing coordinates, m is the slope of the line (how much y increases when x increases by 1), and b is the y-intercept (the line crosses the y-axis at the point (0, b)).
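To make the formula concrete, here is a minimal sketch in Python. The function name `line` and the values chosen for m and b are made up for illustration; they are not part of any library.

```python
def line(x, m, b):
    """Return the y value on the line y = m*x + b."""
    return m * x + b


# Hypothetical slope and intercept, chosen only for this example.
m, b = 2.0, 1.0

# Moving x by +1 raises y by m, and x = 0 gives the y-intercept b.
for x in range(4):
    print(x, line(x, m, b))
```

Each step of +1 in x changes y by exactly m, which is what the slope describes.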

This concept of lines comes into linear regression when we train our model. Keep in mind that this machine learning algorithm is given both inputs and outputs during training, and we have to create a model that can produce a reasonable output for any input. For the sake of understanding, we will pretend x is the house size and y is the price of the house. So with linear regression, we would try to create a model that takes the house size and estimates the price of the house.

Linear regression works by trying to find a linear relationship between the input and output (x and y), using a conventional y = mx + b line to do so. We have already given it the values of x (house size) and y (price), so the model has to train and learn the best possible values for m and b. To understand how a machine does this, think about what you would do if you wanted to learn the linear relationship between house size and price yourself. You would start by choosing random values and assigning them to m and b. You would then plot the line on a graph and put dots at the coordinates (input, output). If the dots fall on or close to the line, your m and b values are pretty good, and you can experiment by slightly changing these values to see how that affects the distance between the dots and the line. After repeating this process many times, you will reach better and better values for m and b. In math, the result of this process is called the line of best fit, and through a series of calculations you can find the line of best fit for a given set of data points. Once your model has been trained, you can give it an x (input) value, and using the line we created, it can produce a y (output) value.
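The training loop described above can be sketched in plain Python. The article does not name a specific method, so this example assumes gradient descent on the mean squared error as one common way to automate the "change m and b slightly and check again" idea, and it also computes the closed-form least-squares line for comparison. The data, function names, learning rate, and step count are all made up for illustration.

```python
import random

# Made-up training data (an assumption for illustration only):
# house sizes in thousands of square feet, prices in thousands of dollars.
sizes = [1.0, 1.5, 2.0, 2.5, 3.0]
prices = [200.0, 270.0, 310.0, 380.0, 440.0]


def mean_squared_error(m, b, xs, ys):
    """Average squared vertical distance between the points and the line."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_by_gradient_descent(xs, ys, learning_rate=0.05, steps=5000, seed=0):
    """Start from random m and b, then repeatedly nudge them to reduce the error."""
    rng = random.Random(seed)
    m, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(xs)
    for _ in range(steps):
        errors = [m * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to m and b.
        grad_m = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
    return m, b


def fit_by_least_squares(xs, ys):
    """Closed-form line of best fit for a single input variable."""
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    m = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    b = y_mean - m * x_mean
    return m, b


m, b = fit_by_gradient_descent(sizes, prices)
m_exact, b_exact = fit_by_least_squares(sizes, prices)
print(f"gradient descent: m={m:.2f}, b={b:.2f}")
print(f"least squares:    m={m_exact:.2f}, b={b_exact:.2f}")

# Use the trained line to estimate the price of a new house.
new_size = 2.2
print(f"estimated price: {m * new_size + b:.1f}")
```

On this small dataset both approaches should agree closely. The learning rate here was picked to suit these particular numbers; inputs on a much larger scale would likely need a smaller rate or rescaled data.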

Keep in mind that the model described here is for regression, a supervised learning task where the output is a number, but not much changes when doing binary classification problems. The main change in how the model is trained is that instead of looking for a line that fits the data points best, you search for a line that separates the data points best. What this means is that once that line is drawn, you classify one side as 0 or false, and the other as 1 or true. In a binary classification problem, the output for each data point will be 0 or 1 (false or true).
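As a rough sketch of the classification idea, the snippet below assumes each data point is a pair of coordinates (x, y) and that a separating line y = mx + b has already been found. The function name `classify`, the line, and the points are hypothetical, and treating points exactly on the line as 0 is an arbitrary choice made for this example.

```python
def classify(point, m, b):
    """Return 1 if the point lies above the line y = m*x + b, otherwise 0."""
    x, y = point
    return 1 if y > m * x + b else 0


# Hypothetical separating line and data points, for illustration only.
m, b = 1.0, 0.0
points = [(0.0, 2.0), (1.0, 3.0), (2.0, 0.0), (3.0, 1.0)]

labels = [classify(p, m, b) for p in points]
print(labels)
```

The first two points sit above the line and get the label 1, while the last two sit below it and get 0.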

Keep checking the website for updates on the linear regression topic!
