1.  Submit a draft of your literature review.  You must include at least 5 sources.  Be sure to provide sufficient background about your topic so that someone with no experience will be able to read your paper without consulting additional sources.  You should also provide information about how your topic fits into the current literature and how others have assisted in the specification of your model.  You can assume the reader of your review understands econometrics.

2.  Provide a printout of your first regression.

3.  Using the consumption function data, determine the marginal effects of income and wealth on consumption.  Are there any problems with the regression?  If you believe there may be a problem, test your hypothesis.

4.  Building a "Demand-Side" Model for Pork

This is an attempt to bridge the gap between the textbook, the classroom, and the computer. It is a complete exercise that helps you make independent specification decisions and attempts to simulate some of the procedures that you will encounter in your term paper. As in your term paper, I strongly encourage you to:
  1. Look over a portion of the reading list before deciding on your specification.
  2. Try to run as few regressions as possible. There's nothing wrong with looking at only one regression result.
The dependent variable for the interactive example is the quantity of pork consumed (pounds per person per year) in the United States. The data are quarterly for 1975 through 1984, so there are 40 observations. For each quarter, the dependent variable, CONPK, measures the annual per capita consumption (CON) of pork (PK).

You should have discussed demand-side equations for consumption goods in Intermediate Microeconomics, so the underlying theory will be fairly familiar. In most demand equations we include at least the price of the product, in this case, PRIPK, the price (PRI) of pork (PK). In addition, I have added a measure of consumer buying power; in this example we will use YDUSP, which is disposable income (YD) in the United States (US) per capita (P). The price of a substitute, PRIBF, the price (PRI) of beef (BF), is also available as an explanatory variable.

You will choose whether the functional form of the demand/income relationship should be linear (using YDUSP) or semi-log (using the log of that variable, LYDUSP). In addition, you will have to figure out whether to adjust the intercept of the quarterly model for seasonal variation with the inclusion of quarterly seasonal dummies (D1, D2, and D3). You also will have to decide on the extent to which simultaneity can be dealt with by including a production variable, PROPK, the production (PRO) of pork, in the equation. Finally, serial correlation or heteroskedasticity may be a problem.

As even a little reading will show you, attention paid in the literature to the demand for pork is impressive. One question is whether the typical consumer decides to buy a product based on real prices and incomes or on nominal prices and incomes. The more sophisticated the consumer behavior is assumed to be, the more logical it is to use real prices and incomes since the dependent variable in this example (pounds of pork) is in real terms. The variables in this exercise are nominal; however, they could be converted to real prices and incomes by dividing through by the consumer price index (CPI) or GNP deflator of the quarter in question and multiplying by 100:

Real X_i = Nominal X_i × (100 / CPI_i)
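
As a minimal sketch of this conversion: the function name `to_real` and the sample numbers below are illustrative assumptions, not values from the pork data set.

```python
def to_real(nominal, cpi):
    """Convert nominal values to real terms: real = nominal * 100 / CPI."""
    if len(nominal) != len(cpi):
        raise ValueError("nominal and cpi must have the same length")
    return [x * 100.0 / p for x, p in zip(nominal, cpi)]

# Made-up prices and index values, for illustration only.
nominal_price = [50.0, 60.0]
cpi = [100.0, 120.0]
real_price = to_real(nominal_price, cpi)
```

Here the second nominal price is higher, but once the higher price level is taken into account the real price is unchanged.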

The variables available for your model are:

CONPKt Per capita pounds of pork consumed in the U.S. in quarter t
PRIPKt The price of pork (in dollars per 100 pounds) in quarter t
PRIBFt The price of beef (in dollars per 100 pounds) in quarter t
YDUSPt Per capita disposable income in the U.S. in quarter t (current dollars)
LYDUSPt The log of per capita disposable income
PROPKt Pounds of pork produced (in billions) in the U.S. in quarter t
D1t Dummy equal to 1 in the first quarter of the year and 0 otherwise
D2t Dummy equal to 1 in the second quarter of the year and 0 otherwise
D3t Dummy equal to 1 in the third quarter of the year and 0 otherwise

This data set was obtained from Prof. William G. Tomek of Cornell University.
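
The derived variables in the list above (LYDUSP and D1 through D3) are already supplied, but it can help to see how they relate to the raw series. The sketch below assumes the log is the natural log and that the quarters are coded 1 to 4; the helper names are hypothetical.

```python
import math

def seasonal_dummies(quarters):
    """Return (D1, D2, D3) for quarter numbers 1-4; quarter 4 is the omitted base."""
    d1 = [1 if q == 1 else 0 for q in quarters]
    d2 = [1 if q == 2 else 0 for q in quarters]
    d3 = [1 if q == 3 else 0 for q in quarters]
    return d1, d2, d3

def log_income(ydusp):
    """Natural log of per capita disposable income (assumed definition of LYDUSP)."""
    return [math.log(v) for v in ydusp]

# 40 quarterly observations, 1975 through 1984.
quarters = [1, 2, 3, 4] * 10
d1, d2, d3 = seasonal_dummies(quarters)
```

Only three dummies are used for four quarters: including a fourth alongside the intercept would make the regressors perfectly collinear.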

Now:

  1. Hypothesize the expected signs of the coefficients for all these variables in an equation for the consumption of pork. Consider each variable carefully; what is the economic content of each hypothesis?

  2. Choose the best combination of explanatory variables for this model. Assume that every model has at least the price of pork variable and one of the two income variables. Do not take the attitude that you will see what a particular specification looks like before making up your mind. Taking such an attitude leads to possible bias and ruins the applicability of the t- and F-tests. In addition, this example has been set up in the hope that you will be capable of getting the "perfect" equation after looking at only one regression estimate. Take it as a challenge; try to make your first equation your final equation.

Your equations should include CONPK as the dependent variable and PRIPK as one of the explanatory variables. Because I have questions for each different specification, you should include either YDUSP or LYDUSP.
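
If you want to see what your econometrics program is doing when it reports coefficients and t-statistics, the following is a minimal OLS sketch. It uses a tiny made-up data set, not the pork data, and the function name `ols_with_t` is an assumption for illustration.

```python
import numpy as np

def ols_with_t(X, y):
    """OLS coefficients, conventional standard errors, and t-statistics.

    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (n - k)          # estimated error variance
    se = np.sqrt(np.diag(s2 * xtx_inv))   # standard errors of the coefficients
    return beta, se, beta / se

# Made-up data: one regressor plus an intercept.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])
beta, se, t_stats = ols_with_t(X, y)
```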

 

If you chose YDUSP as one of your variables, go to the first set of links; if you instead chose LYDUSP, jump to the second set of links below.

  1. If you chose YDUSP: find below the combination of explanatory variables (from PROPK, PRIBF, and the seasonal dummies) that you wish to include in your regression and then go to the indicated estimated regression equation:
    none of them, go to question set 1
    the seasonal dummies only, go to question set 2
    PROPK only, go to question set 3
    the seasonal dummies and PROPK, go to question set 4
    PRIBF only, go to question set 5
    the seasonal dummies and PRIBF, go to question set 6
    PROPK and PRIBF, go to question set 7
    all three, go to question set 8

  2. If you chose LYDUSP: find below the combination of explanatory variables (from PROPK, PRIBF, and the seasonal dummies) that you wish to include in your regression and then go to the indicated estimated regression equation:
    none of them, go to question set 9
    the seasonal dummies only, go to question set 10
    PROPK only, go to question set 11
    the seasonal dummies and PROPK, go to question set 12
    PRIBF only, go to question set 13
    the seasonal dummies and PRIBF, go to question set 14
    PROPK and PRIBF, go to question set 15
    all three, go to question set 16
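
The two sets of links above amount to a lookup from question set number to specification. A small sketch of that lookup is given here; the names `EXTRAS` and `regressors_for` are hypothetical, and the variable order in each list is just a convention.

```python
SEASONAL = ["D1", "D2", "D3"]

# Optional variables for sets 1-8; sets 9-16 repeat the pattern with LYDUSP.
EXTRAS = {
    1: [],
    2: SEASONAL,
    3: ["PROPK"],
    4: SEASONAL + ["PROPK"],
    5: ["PRIBF"],
    6: SEASONAL + ["PRIBF"],
    7: ["PROPK", "PRIBF"],
    8: SEASONAL + ["PROPK", "PRIBF"],
}

def regressors_for(question_set):
    """Explanatory variables for a given question set (1-16)."""
    if not 1 <= question_set <= 16:
        raise ValueError("question set must be between 1 and 16")
    income = "YDUSP" if question_set <= 8 else "LYDUSP"
    offset = question_set if question_set <= 8 else question_set - 8
    return ["PRIPK", income] + list(EXTRAS[offset])

spec_6 = regressors_for(6)
spec_14 = regressors_for(14)
```

Both `spec_6` and `spec_14` contain six explanatory variables, which matters later when you look up Durbin-Watson critical values.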


Multicollinearity, Heteroskedasticity and Serial Correlation

It does not appear that multicollinearity is a problem.  If you believe that it is, regress the independent variables on each other to check for problems.  If it is a problem, drop a variable and backtrack to the previous section.
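
One way to carry out this check is to regress each explanatory variable on all the others and look at the resulting R-squared (or the variance inflation factor, 1 / (1 - R-squared)). The sketch below uses made-up series rather than the pork data, and the function names are assumptions.

```python
import numpy as np

def auxiliary_r2(X):
    """R-squared from regressing each column of X on a constant and the other columns.

    X holds the explanatory variables only (no column of ones).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    r2 = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        tss = np.sum((target - target.mean()) ** 2)
        r2.append(1.0 - (resid @ resid) / tss)
    return np.array(r2)

def vif(X):
    """Variance inflation factors, 1 / (1 - auxiliary R-squared)."""
    return 1.0 / (1.0 - auxiliary_r2(X))

# Made-up regressors: a trend and an alternating series that is nearly unrelated to it.
x1 = np.arange(1.0, 11.0)
x3 = np.array([1.0, -1.0] * 5)
r2_low = auxiliary_r2(np.column_stack([x1, x3]))
vif_low = vif(np.column_stack([x1, x3]))
```

An auxiliary R-squared close to 1 (a very large VIF) signals that a variable is nearly a linear combination of the others.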

____________________________________________________________________________________

Heteroskedasticity. This dependent variable is already per capita, is time series, and does not change substantially during the sample period. As a result, the possibility of pure heteroskedasticity is so low that most econometricians would not even bother testing for it. (If a Park test were to be run, however, a logical proportionality factor Z might be per capita disposable income.)

Run a Park test where the proportionality factor is per capita disposable income.  Print out the regression and report your results.
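
The Park test regresses the log of the squared OLS residuals on the log of the proportionality factor Z and then checks whether the slope is significantly different from zero. The following is a minimal sketch with made-up residuals; the function name `park_test` is an assumption, and in the assignment Z would be YDUSP and the residuals would come from your own regression.

```python
import numpy as np

def park_test(residuals, z):
    """Regress ln(e^2) on a constant and ln(Z); return (slope, t_statistic).

    Residuals must be nonzero and Z must be positive, since logs are taken.
    """
    e = np.asarray(residuals, dtype=float)
    z = np.asarray(z, dtype=float)
    lhs = np.log(e ** 2)
    X = np.column_stack([np.ones(len(z)), np.log(z)])
    xtx_inv = np.linalg.inv(X.T @ X)
    coef = xtx_inv @ X.T @ lhs
    u = lhs - X @ coef
    s2 = (u @ u) / (len(z) - 2)
    se_slope = np.sqrt(s2 * xtx_inv[1, 1])
    return coef[1], coef[1] / se_slope

# Made-up example in which the spread of the residuals grows with Z.
z = np.arange(1.0, 41.0)
resid = z * np.array([1.0, 2.0] * 20)
slope, t_stat = park_test(resid, z)
```

Compare the t-statistic on the slope with the usual two-sided critical value; a significant slope is evidence of heteroskedasticity related to Z.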

Most econometrics programs have a built-in test for heteroskedasticity.  Run a White test, a Breusch-Pagan test, a Cook-Weisberg test, or whatever test is built into your econometrics program.  Print out the test and report the results.

The current recommendation is to correct for heteroskedasticity routinely, even when a test does not flag it.  If we had cross-sectional data and the Park test was significant, I would suggest dividing the dependent variable and all of the independent variables by the proportionality factor.  If the Park test was not significant, or if you are using time-series data as in this assignment, I would suggest using the econometrics program's built-in correction.  This may be listed as robust errors or White errors.

Run a regression with robust errors.  Print out the regression. 
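
For reference, robust (White) standard errors leave the OLS coefficients unchanged and replace only the covariance matrix. The sketch below computes the basic version without any small-sample adjustment; many programs apply an adjustment by default, so your printed standard errors may differ slightly. The function name and the data are illustrative assumptions.

```python
import numpy as np

def ols_white_se(X, y):
    """OLS coefficients with White heteroskedasticity-robust standard errors.

    Uses the basic sandwich form (X'X)^-1 X' diag(e^2) X (X'X)^-1.
    X must already contain a column of ones for the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    e = y - X @ beta
    meat = X.T @ (X * (e ** 2)[:, None])   # X' diag(e^2) X
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.sqrt(np.diag(cov))

# Made-up intercept-only example so the result can be checked by hand.
X = np.ones((4, 1))
y = np.array([1.0, 2.0, 3.0, 4.0])
beta, robust_se = ols_white_se(X, y)
```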

____________________________________________________________________________________

Serial Correlation. Serial correlation is quite another matter, however, since the data are time series.

Run the regression (with the robust errors) and determine the Durbin-Watson statistic.  Interpret the statistic.
Most econometrics programs make you define the time variable and specify whether it is annual, quarterly, or some other frequency.
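
The Durbin-Watson statistic itself is simple to compute from the residuals in time order: d is the sum of squared changes in the residuals divided by the sum of squared residuals. A minimal sketch with made-up residuals (the function name is an assumption):

```python
def durbin_watson(residuals):
    """Durbin-Watson d for residuals listed in time order."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Residuals that flip sign every period (negative serial correlation).
d_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
# Residuals that never change (extreme positive serial correlation).
d_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
```

A value near 2 suggests no first-order serial correlation, values toward 0 suggest positive serial correlation, and values toward 4 suggest negative serial correlation.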

 

In addition, it seems possible that short-run fads in the consumption of pork, or alternatively, supply shocks, might cause swings in consumption from year to year that would not be completely captured by the price variables or by coefficients estimated over the entire sample.

In particular, if your final equation was the regression from question set 6 or the one from question set 14, then there is a possibility of serial correlation. Their Durbin-Watson d statistics of 1.09 and 1.19, respectively, are right at the edge of the critical dL for positive serial correlation with 40 observations and 6 explanatory variables.
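
The decision rule for a one-sided test of positive serial correlation can be written out as below. The bounds used in the example (1.18 and 1.85) are my assumed approximate 5 percent values for 40 observations and 6 explanatory variables; confirm them against the table in your own textbook before relying on them.

```python
def dw_positive_test(d, d_lower, d_upper):
    """One-sided Durbin-Watson test for positive first-order serial correlation."""
    if d < d_lower:
        return "positive serial correlation"
    if d > d_upper:
        return "no positive serial correlation"
    return "inconclusive"

# Assumed approximate 5 percent bounds for N = 40, K = 6; verify in your table.
D_LOWER, D_UPPER = 1.18, 1.85

result_set_6 = dw_positive_test(1.09, D_LOWER, D_UPPER)
result_set_14 = dw_positive_test(1.19, D_LOWER, D_UPPER)
```

With these assumed bounds, one statistic falls just below dL and the other just above it, which is why both cases deserve a closer look.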

Run either regression 6 or regression 14 (go back to the first section to determine the variables) and correct for serial correlation (autocorrelation).  Most programs have a built-in correction procedure.  Two popular generalized least squares methods are the Cochrane-Orcutt method and the Prais-Winsten method.  Compute and test the Durbin-Watson statistic on the transformed regression.
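
To show what the built-in correction is doing, here is a rough sketch of the iterative Cochrane-Orcutt procedure on simulated data (not the pork data). It estimates rho from the residuals, quasi-differences the variables, re-estimates, and repeats until rho settles. The function name, tolerance, and simulated series are all assumptions; your program's implementation may differ in details such as the convergence rule.

```python
import numpy as np

def cochrane_orcutt(X, y, tol=1e-6, max_iter=50):
    """Iterative Cochrane-Orcutt estimates (beta, rho) for AR(1) errors.

    X must contain a column of ones; the first observation is dropped
    by the quasi-differencing.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rho = 0.0
    for _ in range(max_iter):
        e = y - X @ beta                          # residuals in original units
        rho_new = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
        y_star = y[1:] - rho_new * y[:-1]         # quasi-differenced data
        X_star = X[1:] - rho_new * X[:-1]
        beta, *_ = np.linalg.lstsq(X_star, y_star, rcond=None)
        converged = abs(rho_new - rho) < tol
        rho = rho_new
        if converged:
            break
    return beta, rho

# Simulated data with AR(1) errors (true slope 2.0, true rho 0.7).
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
eps = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + eps[t]
y = 1.0 + 2.0 * x + u
beta_hat, rho_hat = cochrane_orcutt(np.column_stack([np.ones(n), x]), y)
```

Prais-Winsten differs mainly in that it keeps the first observation (with a special transformation) instead of dropping it, which can matter in a sample as small as 40 quarters.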

 

 

 

As can be seen by comparing the GLS result with the OLS result, the correction should have decreased the significance of the price and income explanatory variables, as would be expected. The increase in significance of the seasonal dummies with GLS estimation raises the possibility that the seasonal pattern is not as simple as that implied by the three intercept dummies.

Note that most of the lowest Durbin-Watson statistics came with the seasonal dummy group included, supporting the hypothesis that this set of dummies did not properly capture the actual seasonality of the demand for pork; the introduction of the seasonal dummies might have deleted some but not all of the seasonal variation, leaving a serially correlated pattern in the residuals. While the equation was probably better off with the seasonal dummies than without, a better knowledge of the meat industry before estimation might have allowed a more sophisticated seasonal pattern to be chosen. Even using only intercept dummies, for example, a better overall fit might have been obtained by using only one seasonal dummy (D4, equal to one in the fourth quarter and zero otherwise). To make such a switch (or to drop one of the seasonal dummies while keeping the others) on the sole basis of the attached estimations would be a mistake, however, because the hypothesis would be tested on the same data set from which it was developed.

Finally, note that some of the Durbin-Watson statistics were greater than two when PROPK was included in the regression. This result almost surely occurred because the resulting equation included both demand-side and supply-side variables, and the residuals were no longer the residuals of just the demand-side equation. This result is one of many reasons that most econometricians go to great lengths to avoid including such a production variable in a demand-side equation. In a sense, production acts like a dominant variable; its relationship to the dependent variable is strong but is definitional with little economic content.