In an investigation of two variables, where one variable is dependent on the other, a regression line is a ‘best fit’ line through a series of points on a graph, showing the form of the relationship between the two sets of data. This line can be drawn by eye, but individuals' perceptions of where the best-fit line lies can vary widely, so the line is usually calculated using the least squares method. The aim of this method is to make the sum of the squares of the deviations of all the points from the line, measured in the y direction, as small as possible.
By convention, y is the dependent variable; that is, it is the variable whose values are being predicted from the independent variable x. The appropriate regression line involves the regression of y on x. This is particularly important, because a regression of x on y would give a different line. The regression line is described by:
y_c = a + bx
where:
y_c is the computed value of the dependent variable;
a is the y intercept (the value of y_c when x = 0), calculated from a = ȳ − b x̄, where ȳ and x̄ are the means of the y and x values;
b is the slope of the regression line, calculated from b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²); and
n is the number of paired observations.
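The calculation of a and b can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual computing formula for the least squares slope, b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²), and the intercept a = ȳ − b x̄. The function name fit_regression_line and the sample data are made up for the example.

```python
def fit_regression_line(xs, ys):
    """Return (a, b) for the least squares line y_c = a + b*x (y on x)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")

    # The four sums that appear in the computing formula for the slope.
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    denom = n * sum_x2 - sum_x ** 2
    if denom == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    b = (n * sum_xy - sum_x * sum_y) / denom  # slope
    a = sum_y / n - b * (sum_x / n)           # intercept: mean(y) - b * mean(x)
    return a, b


# Illustrative data only (not taken from any real study).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_regression_line(xs, ys)
print(f"y_c = {a:.2f} + {b:.2f}x")
```

For these five points the sums are Σx = 15, Σy = 20, Σxy = 66 and Σx² = 55, giving a slope of 0.6 and an intercept of 2.2.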
The regression line may be used to estimate values of y from given values of x. Estimating values of x from values of y calls for the regression of x on y instead.
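The short Python sketch below illustrates both points: using a fitted line to estimate y for a given x, and the fact that regressing x on y yields a different line. It assumes the deviation-from-mean form of the slope, which is algebraically equivalent to the sums-based computing formula. The helper names least_squares and predict and the data are hypothetical.

```python
def least_squares(xs, ys):
    """Return (intercept, slope) for the regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, value):
    """Evaluate the fitted line at a given value of the predictor."""
    return intercept + slope * value


# Illustrative data only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Regression of y on x: y_c = a + b*x
a, b = least_squares(xs, ys)
y_hat = predict(a, b, 3.5)  # estimate y at x = 3.5, inside the observed range

# Regression of x on y: x_c = c + d*y (the roles of the variables are swapped)
c, d = least_squares(ys, xs)

# Drawn on the same axes, the x-on-y line has slope 1/d, which differs from b.
print(f"y on x: y = {a:.2f} + {b:.2f}x, estimate at x = 3.5: {y_hat:.2f}")
print(f"x on y: x = {c:.2f} + {d:.2f}y, i.e. slope {1 / d:.2f} on the same axes")
```

With this data the y-on-x line has slope 0.6, while the x-on-y line, replotted on the same axes, has slope 1. Both lines pass through the point of means (x̄, ȳ) = (3, 4).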