# Why is a best fit line used to find slope instead of using the data points?

Because the data points are generally **not** all in line with each other. If you connect the dots, from one data point to the next and then to the next, you usually get a zig-zag line of many segments, where the slopes of the segments are all different and cover a wide range. It would be impossible to decide what the "real" slope of the data is. The "best fit" line is a line that finds the pattern buried in the zig-zag data, giving each data point its best share of determining the best single equation to represent the whole batch of points. That's why it's called "best".
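To make the zig-zag concrete, here is a minimal sketch that computes the slope of each connect-the-dots segment. The five data points are made up for illustration, and `segment_slopes` is a hypothetical helper name, not something from the original answer.

```python
# Hypothetical sample data, invented for this illustration.
points = [(1, 2), (2, 5), (3, 5), (4, 9), (5, 10)]


def segment_slopes(pts):
    """Return the slope (rise over run) of each segment joining consecutive points."""
    return [(y2 - y1) / (x2 - x1) for (x1, y1), (x2, y2) in zip(pts, pts[1:])]


slopes = segment_slopes(points)
print(slopes)  # [3.0, 0.0, 4.0, 1.0]
```

The four segments give four different slopes, ranging from 0 to 4, so no single segment can be called "the" slope of the data.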

### How do you find the equation for the line of best fit?

There are many ways. If you are in an algebra class rather than a statistics class, the usual method is to fit the line by eye:

1. Plot all the data points on a coordinate plane (x-y graph).
2. Sketch a straight line that passes as close as possible to the points.
3. Pick 2 points ON THE LINE (these do not need to be data points).
4. Find the slope of the line using the two points from step 3.
5. Use the point-slope formula to write the equation.
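Steps 3 to 5 can be sketched in a few lines of code. The two points below are assumed to have been read off a hand-drawn line; they are invented for illustration, and `line_through` is a hypothetical helper name.

```python
def line_through(p, q):
    """Return (slope, intercept) of the straight line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        raise ValueError("a vertical line has no defined slope")
    m = (y2 - y1) / (x2 - x1)  # step 4: slope = rise over run
    b = y1 - m * x1            # step 5: point-slope form y - y1 = m(x - x1), solved for y
    return m, b


# Step 3: two points read off the estimated line (assumed values, not data points).
m, b = line_through((0, 0.5), (4, 8.5))
print(f"y = {m}x + {b}")  # y = 2.0x + 0.5
```

Because the line is drawn by eye, two people working from the same data will usually pick slightly different points and get slightly different equations.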

### When you compare two variables how does that turn into slope?

Suppose you have a sample of n points for two variables: (x1, y1), (x2, y2), ... (xn, yn). Without going into various statistical considerations (which are nonetheless important), you can estimate the slope of the 'best' line for predicting the values of y from the values of x by using the formula given for beta-hat in the Wikipedia article on simple linear regression.
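For reference, the standard least-squares estimate of the slope in simple linear regression, with $\bar{x}$ and $\bar{y}$ denoting the sample means, is

$$\hat{\beta} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}.$$

A minimal sketch of that calculation follows. The sample data is made up for illustration, and `least_squares_line` is a hypothetical helper name.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line for paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: sum of squared deviations of x from its mean.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical, so the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample data, invented for this illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 5, 5, 9, 10]
slope, intercept = least_squares_line(xs, ys)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")  # slope = 2.000, intercept = 0.200
```

Unlike a line drawn by eye, this calculation gives the same answer to everyone who uses the same data.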

