I'm not a big fan of Microsoft Excel. Oh sure, it does the job, and it does it quite well (usually). I really just hate starting up that program. Let me say that historically, I think Excel has had a HUGE impact on spreadsheets (even though it wasn't the first). But these days, there are some usable alternatives to Excel. Open Office is quite fine (but again, you have to start up that program). If you haven't used Open Office, it is a great alternative to Excel. It essentially is Excel (except that it is free). In some cases, it does things the way I expect more than my current version of Excel does. It does them the way Excel used to do them.
Ok, enough of that. I want to talk about Google Docs. I like Google Docs because it runs in a webpage. However, it does lack one thing that both Open Office and Excel have - the 'trendline'. Trendline is a dumb word for a linear regression fit. What is a linear regression fit? Basically, you have a bunch of data points. What linear equation would fit this data the best? Finding that equation is linear regression.
Let me do two things. First, I will show you how to do linear regression in Google Docs (the trendline). Second, I will show you how to do it the long way in Google Docs. It seems I did this before, but I just can't find it. Oh well.
I need to start with some data. How about I use my data from the price of a piece of LEGO post? Here is a plot of price vs. the number of pieces in some LEGO sets:
But what if I want a function that fits this data such that:
y = mx + b
Where m is the slope and b is the intercept. In Google Docs, this is pretty easy to find using the SLOPE and INTERCEPT functions. For this case, I can just select a cell to enter:
The SLOPE function does a linear regression on the data you put in and returns the slope. Simple. Also, there is the INTERCEPT function:
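Both spreadsheet functions take the y data first and the x data second, in the form =SLOPE(data_y, data_x) and =INTERCEPT(data_y, data_x). If you are curious what those two calls are computing, here is a minimal Python sketch of the usual least-squares formulas. The function name linear_fit and the sample numbers are made up for illustration (they are not the LEGO data), and this is only a sketch of the math, not a claim about how the spreadsheet implements it.

```python
def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points.

    Hypothetical helper for illustration; it mirrors what SLOPE and
    INTERCEPT report, under the usual least-squares definition.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up numbers (NOT the LEGO data), chosen to lie exactly on y = 2x + 1.
pieces = [1, 2, 3, 4]
price = [3, 5, 7, 9]

m, b = linear_fit(pieces, price)
print(m, b)  # prints: 2.0 1.0
```

Real data will not sit exactly on a line, so the slope and intercept you get are the ones that make the sum of the squared vertical misses as small as possible.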
There you go. Now, what if you want to include a plot of that function in your graph? There may be a better way, but I just generated another set of data from the best fit function. Make a new column of data (I called it fit) and use the FORECAST function. Here is an example of that:
So, column A is the price data. Column B is number of pieces data. I just created some more 'price' values and fed them into the FORECAST function. I made these "forecast" data points in a different column so that they would show up as a different color. Here is the finished graph.
Not perfect, but it works. And here is the Google Docs sheet in case you want the data or just want to look at it.
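For anyone who wants to see the "fit" column idea outside of a spreadsheet, here is a rough Python sketch. The spreadsheet call has the form =FORECAST(x, data_y, data_x), and the function below copies that argument order. The name forecast and all of the numbers are assumptions for illustration (again, not the LEGO data); it just evaluates the least-squares line at each new x value.

```python
def forecast(x, ys, xs):
    """Predict y at x from the least-squares line through (xs, ys).

    Hypothetical helper; the argument order copies the spreadsheet's
    FORECAST(x, data_y, data_x).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * x


# Made-up numbers (NOT the LEGO data), lying exactly on y = 2x + 1.
known_x = [1, 2, 3, 4]
known_y = [3, 5, 7, 9]

# A few new x values, like the extra rows added for the "fit" column.
new_x = [0, 5, 10]
fit = [forecast(x, known_y, known_x) for x in new_x]
print(fit)  # prints: [1.0, 11.0, 21.0]
```

Plotting new_x against fit as a second series is the same trick as the extra spreadsheet column: the points all land on the best fit line, so they draw it for you.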
One more thing. I lied. I said I was going to show you how linear regression works. This will be for a future post.
UPDATE: As requested, here is a link to the Google Docs sheet with the LEGO data. Sorry I forgot to add that in earlier.