Why is a best fit line used to find slope instead of using the data points?

Because the data points are generally not all in line with each other. If you connect the dots, from one data point to the next and then to the next, you usually get a zig-zag line of many segments, where the slopes of the segments are all different and cover a wide range. It would be impossible to decide what the "real" slope of the data is. The "best fit" line finds the pattern buried in the zig-zag data: every data point gets a say in determining the single straight-line equation that represents the whole batch of points. In the usual least-squares version, it is the line that makes the sum of the squared vertical distances from the points to the line as small as possible. That's why it's called "best".
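To make the contrast concrete, here is a minimal sketch in Python. The data values are made up purely for illustration, the function names `segment_slopes` and `best_fit_line` are hypothetical, and it assumes "best fit" means the ordinary least-squares line (other definitions of "best" exist). It computes the slope of each connect-the-dots segment and then the single slope of the fitted line.

```python
# Hypothetical measurements, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def segment_slopes(xs, ys):
    """Slope of each connect-the-dots segment between neighboring points."""
    slopes = []
    for i in range(len(xs) - 1):
        rise = ys[i + 1] - ys[i]
        run = xs[i + 1] - xs[i]
        slopes.append(rise / run)
    return slopes


def best_fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Every point contributes to both sums, so every point influences the slope.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


zigzag = segment_slopes(xs, ys)
slope, intercept = best_fit_line(xs, ys)

print("segment slopes:", zigzag)
print("best-fit slope:", slope)
print("best-fit intercept:", intercept)
```

With these made-up numbers the segment slopes wander between roughly 1.6 and 2.3, while the fit gives one slope of about 1.99 that takes all five points into account.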