Linear regression is a way to determine how close two number series of data: x (independent) and y (potentially dependent), fit a linear function of the form: y = a*x + b.This is the first of a series of planned posts that will cover how to set up linear regression a variety of different languages. This week, we will discuss the easiest method of performing Linear regression analysis; and that is with Excel 2010.First, we will need to enable the Analysis ToolPak for Excel:1) Click File and select options2) Select Add-ins on the left hand menu3) Select Excel Add-ins in the drop down list named Manage at the bottom of the pop up4) Click the ‘Go’ button5) Tick the checkbox for Analysis ToolPak if it is empty6) Click the ‘Ok’ buttonOnce the Add-in is installed, create a table of data similar to the following:
x
y
7
216
12
302
14
431
6
151
22
677
38
958
14
431
9
227
10
308
11
277
To perform the Regression analysis, select a cell and then click the ‘Data’ ribbon tab. Then double-click the Data Analysis section of the ribbon. This will present a popup similar to the following:
Select the Regression option and click the ‘Ok’ button. This will prompt another popup to enter the cell ranges for the data which will be analyzed:
For the Input Y Range, select all cells in the Y column; including the header. Do the same for the Input X Range using the X column. Be sure to select the Labels checkbox and then click the ‘Ok’ button. Doing so will create the regression analysis within your spreadsheet:
Next week, we will talk a bit about what all of this means, but for right now, we will focus on the R Square value, and the Coefficients of the rows labeled ‘Intercept’ and ‘x’. The best way to demonstrate the significance of these values is to present them in a scatter chart with the linear trend line being rendered. This can be done in the following steps:1) Click an empty cell2) Select the ‘Insert’ ribbon tab3) Select the Scatter Plot and choose ‘Scatter with only Markers’4) Right Click in the center of the Chart and choose ‘Select Data’5) Highlight your data table including the column header lines6) Click the ‘Ok’ buttonYour chart should be displayed similar to the following:
The following steps can be used to display the linear trend line in the chart:1) Left Click any of the plot markers2) Right Click and select ‘Add Trendline’3) Select Linear in the Trendline Options pop up4) Enable the checkbox for ‘Display Equation on chart’5) Enable the checkbox for ‘Display R-Squared value on chart’6) Click the close button
And there you have it; Linear Regression done simply in Excel 2010. You should notice that the formula given uses the values obtained in the Summary Output of the Data Analysis step. Next week, we will discuss how to perform Regression analysis programmatically in C#.