Recently, I was contacted about a quick calculation I did back in July of 2007. I simply calculated the correlation between gas prices and hybrid car sales up to that point. Not surprisingly, the correlation was very high (0.855) and very significant.
The student who contacted me used the data I had collected, then updated the data to show that the relationship was still very strong.
For those of you who are already bored with the statistics, skip down to the conclusions.
A More Complicated Model
I also wanted to add some other things to the mix. So, I decided to run some more complicated regression models using hybrid car sales as the outcome. I tried modeling gas price, month, time, and the number of models available. (Time is just a counter for each month starting in January of 2004, when I started tracking hybrid car sales.) I quickly came to the conclusion that hybrid car sales were related to all of these things, but unfortunately, the predictors were also highly correlated with each other.
Gas prices were highly correlated with time (R = 0.55, p-value < 0.001) and number of models sold (R = 0.43, p-value = 0.0004). And the number of hybrid car models sold is highly correlated with time (R = 0.96, p-value < 0.0001).
So I decided that I could (and should) drop something and quickly tossed out the number of models sold. As long as I have the other variables, the number of models would just muddy the waters, since it is almost perfectly collinear with time.
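To put a rough number on that overlap, here is a small Python sketch (the models themselves were run in SAS, so this is only an illustration). It uses the standard variance inflation factor, which for a single pairwise correlation r is 1 / (1 - r^2). The function name `pairwise_vif` is made up for this example, and the values should be read as lower bounds, since a full model with more predictors can only push the VIF higher.

```python
def pairwise_vif(r: float) -> float:
    """Variance inflation factor implied by one pairwise correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Pairwise correlations quoted in the text above.
correlations = {
    "gas price vs. time": 0.55,
    "gas price vs. number of models": 0.43,
    "number of models vs. time": 0.96,
}

# Lower bound on the VIF for each pair of predictors.
vifs = {pair: pairwise_vif(r) for pair, r in correlations.items()}

for pair, vif in vifs.items():
    print(f"{pair}: VIF of at least {vif:.2f}")
```

A common rule of thumb treats a VIF above 10 as a sign of trouble. Only the pair of number of models and time crosses that line, which fits with dropping one of the two.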
I ran a regression model using dummy variables for the months, time, and gas prices. I got a good fit (R-squared = 0.72, adjusted R-squared = 0.65). But the residual plots looked off. The residual plot showed that the correlation had dropped since I ran this back in July 2007, but it didn't explain why.
It almost looked like there was a curve to the residual plot, so I tried modeling with different time variables, but that didn't seem to fit the bill. I tried a few other interactions, but nothing seemed to make the problem go away.
I was almost resigned to just leaving it at that, but then I decided to try one more thing. I decided to model the recession, seeing as how that seems to have had a big effect on car sales. So I created a dummy variable indicating a break point in May of 2008. It seemed to do a good job, so I threw it into the whole model and voilà, it worked pretty well (see the table of Parameter Estimates and the second figure).
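For anyone who wants to rebuild these variables, here is a rough Python sketch of how the time counter, the recession dummy, and the month dummies could be coded for a single month. It rests on a few assumptions of mine: the counter equals 1 in January 2004, the recession dummy is 1 from May 2008 onward, and December is the baseline month with no dummy of its own. The function name `design_row` is hypothetical.

```python
# Eleven month dummies; December is assumed to be the baseline month.
MONTH_DUMMIES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November",
]


def design_row(year: int, month: int) -> dict:
    """Build the predictor values (other than gas price) for one month."""
    if not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")
    # Month counter: assumed to equal 1 for January 2004.
    time = (year - 2004) * 12 + month
    # Break-point dummy: assumed to switch on in May 2008 and stay on.
    recession = 1 if (year, month) >= (2008, 5) else 0
    row = {"time": time, "Recession": recession}
    for number, name in enumerate(MONTH_DUMMIES, start=1):
        row[name] = 1 if month == number else 0
    return row


print(design_row(2008, 5))
```

Leaving one month out avoids perfect collinearity with the intercept, so each month coefficient is read as a difference from the omitted month.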
Parameter Estimates

Variable | DF | Parameter Estimate | Standard Error | t Value | Pr > \|t\| |
---|---|---|---|---|---|
Intercept | 1 | -4469.28975 | 2416.66010 | -1.85 | 0.0703 |
Gas_Price | 1 | 56.88336 | 10.86820 | 5.23 | <.0001 |
Recession | 1 | -13179 | 1663.59432 | -7.92 | <.0001 |
time | 1 | 398.61143 | 43.60902 | 9.14 | <.0001 |
January | 1 | -3876.86879 | 2181.03947 | -1.78 | 0.0816 |
February | 1 | -3206.59171 | 2179.80633 | -1.47 | 0.1475 |
March | 1 | 2635.57235 | 2187.10717 | 1.21 | 0.2339 |
April | 1 | 1694.27245 | 2209.67133 | 0.77 | 0.4468 |
May | 1 | 4994.91892 | 2281.48836 | 2.19 | 0.0333 |
June | 1 | 476.40455 | 2430.98600 | 0.20 | 0.8454 |
July | 1 | -16.45925 | 2425.43771 | -0.01 | 0.9946 |
August | 1 | -160.91587 | 2391.17318 | -0.07 | 0.9466 |
September | 1 | -3606.45695 | 2378.10359 | -1.52 | 0.1357 |
October | 1 | -2834.32326 | 2316.73403 | -1.22 | 0.2269 |
November | 1 | -1697.19686 | 2279.80620 | -0.74 | 0.4601 |
The overall model fits pretty well (R-squared = 0.87, adjusted R-squared = 0.84).
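As a quick sanity check on the table, each t Value should be the parameter estimate divided by its standard error. The Python sketch below redoes that division for a handful of rows copied from the table; the variable names are mine.

```python
# (estimate, standard error, reported t value) copied from the table above.
rows = {
    "Intercept": (-4469.28975, 2416.66010, -1.85),
    "Gas_Price": (56.88336, 10.86820, 5.23),
    "Recession": (-13179.0, 1663.59432, -7.92),
    "time": (398.61143, 43.60902, 9.14),
    "May": (4994.91892, 2281.48836, 2.19),
}

# Recompute t = estimate / standard error for each row.
t_values = {name: est / se for name, (est, se, _) in rows.items()}

for name, (_, _, reported) in rows.items():
    print(f"{name}: computed t = {t_values[name]:.2f}, reported t = {reported}")
```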
Conclusions
If you don't mind, I'm going to take some liberties with my conclusions. No good statistician would accept these conclusions (and they shouldn't!), but I'm going to write it this way to spark some thought.
If we take the model on its own, it seems the recession is responsible for sales of hybrid cars dropping over 13,000 units a month. Also, every time the gas price goes up a penny, 57 more hybrids are sold that month. And, for every month that passes, about 400 more hybrid cars are sold than in the previous month (allowing for seasonal differences). Most of the seasonal differences aren't significant on their own (only May, with a p-value of 0.033, clears the usual 0.05 threshold). This is probably due to gas prices being affected by the seasons, so adding the months doesn't really add much to the ability of the model to explain hybrid car sales. But you can see that March through May are better for sales, while the winter months can be tough.
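To make those readings concrete, here is a Python sketch that plugs the coefficients from the Parameter Estimates table into the fitted equation and compares predictions. The assumptions are mine: gas price is measured in cents (which is what the "penny" reading implies), December is the baseline month, and the starting scenario (300 cents, month 50, December, no recession) is a made-up input rather than a data point. The function name `predict_sales` is hypothetical.

```python
COEFFICIENTS = {
    "Intercept": -4469.28975,
    "Gas_Price": 56.88336,
    "Recession": -13179.0,
    "time": 398.61143,
}

# Month effects from the table; December is assumed to be the baseline (0).
MONTH_EFFECTS = {
    "January": -3876.86879, "February": -3206.59171, "March": 2635.57235,
    "April": 1694.27245, "May": 4994.91892, "June": 476.40455,
    "July": -16.45925, "August": -160.91587, "September": -3606.45695,
    "October": -2834.32326, "November": -1697.19686, "December": 0.0,
}


def predict_sales(gas_price: float, time: int, month: str, recession: bool) -> float:
    """Predicted monthly hybrid sales from the fitted linear model."""
    return (
        COEFFICIENTS["Intercept"]
        + COEFFICIENTS["Gas_Price"] * gas_price
        + COEFFICIENTS["Recession"] * (1 if recession else 0)
        + COEFFICIENTS["time"] * time
        + MONTH_EFFECTS[month]
    )


# Hypothetical starting scenario; only the differences below matter.
base = {"gas_price": 300.0, "time": 50, "month": "December", "recession": False}
baseline = predict_sales(**base)

penny_effect = predict_sales(**{**base, "gas_price": 301.0}) - baseline
month_effect = predict_sales(**{**base, "time": 51}) - baseline
recession_effect = predict_sales(**{**base, "recession": True}) - baseline
may_effect = predict_sales(**{**base, "month": "May"}) - baseline

print(f"One more cent of gas price: {penny_effect:+.0f} units")
print(f"One more month of time:     {month_effect:+.0f} units")
print(f"Recession switched on:      {recession_effect:+.0f} units")
print(f"May instead of December:    {may_effect:+.0f} units")
```

Because the model is linear with no interaction terms, these differences come out the same no matter which starting scenario is used.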
It's a little surprising to me how correlated gas prices are with hybrid car sales. Yes, I expect there to be a relationship between them, but not as strong a linear one as we see here. I would have expected other factors to muddy the relationship (lack of inventory early on, number of models being sold, month of sale, federal tax credits, etc.), but it seems that gas prices trump those things. But the recession has had a huge effect, trumping even gas prices.
Where Do We Go From Here
That leads us to some interesting questions. Will we see a bounce back when the recession ends? If gas prices dropped too much, would car companies turn away from hybrid technology once again? (Look at the history of hybrid and electric cars at the turn of the century to understand what I'm saying.) Should the government mandate higher gas prices in order to force consumers to consider hybrids on a larger scale?
Plug-in hybrids and electric cars are probably going to suffer even worse at the hands of gas prices than hybrids do now. Given their larger price tags, most consumers are likely to shy away unless there's some other factor pushing them towards one. It's hard to know for sure, but these are some of the questions we need to consider going forward.
By the way, I ran the models using SAS v9.1.3.