The Chow test is an F-ratio test for structural change in regression analysis with large samples: it asks whether one set of regression coefficients fits the whole sample or whether two subsets need different coefficients. It is used mostly in time-series models. Here we show an example using hsb2.sas7bdat.
Our data set hsb2 consists of high school students' scores on various tests, together with their demographic information. Suppose our model is a regression of writing scores on math and reading scores, and we want to know whether this model differs for male and female students. In other words, we want to test whether the same regression coefficients apply to both male and female students in the data set, or whether there are two subsets with different intercepts and slopes. We will use the Chow test for this purpose.
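For reference, the usual textbook form of the Chow statistic is sketched below. The notation is ours rather than SAS's: SSE_p is the error sum of squares from the pooled regression, SSE_1 and SSE_2 are those from the two subset regressions, n_1 and n_2 are the subset sizes, and k is the number of coefficients per regression, including the intercept.

```latex
\[
F \;=\;
\frac{\bigl(SSE_{p} - (SSE_{1} + SSE_{2})\bigr) / k}
     {(SSE_{1} + SSE_{2}) / (n_{1} + n_{2} - 2k)}
\;\sim\; F\bigl(k,\; n_{1} + n_{2} - 2k\bigr)
\quad \text{under } H_{0}\text{: identical coefficients in both subsets.}
\]
```

In our model k = 3 (intercept, math, read) and hsb2 has 200 students, so the statistic is compared against an F distribution with 3 and 200 - 2(3) = 194 degrees of freedom.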
Since the Chow test is mostly used with time series, SAS includes it in proc autoreg. The two subsets are defined by a breakpoint, given as an observation position in the chow= option of the model statement, so each subset must occupy a contiguous block of observations. In this example, we use proc freq to find the position of the breakpoint and then sort the data by female so that the two groups are separated at that position.
proc freq data = hsb2;
  tables female;
run;

The FREQ Procedure

                                  Cumulative    Cumulative
FEMALE    Frequency     Percent    Frequency       Percent
-----------------------------------------------------------
     0           91       45.50           91         45.50
     1          109       54.50          200        100.00

proc sort data = hsb2;
  by female;
run;

proc autoreg data = hsb2;
  model write = math read /chow = 91;
run;

Dependent Variable    WRITE

Ordinary Least Squares Estimates

SSE                 9938.81034    DFE                      197
MSE                   50.45081    Root MSE             7.10287
SBC                 1364.64741    AIC               1354.75246
Regress R-Square        0.4441    Total R-Square        0.4441
Durbin-Watson           1.6662

Structural Change Test

          Break
Test      Point    Num DF    Den DF    F Value    Pr > F
Chow         91         3       194      11.84    <.0001

                               Standard                Approx
Variable     DF    Estimate       Error    t Value    Pr > |t|
Intercept     1     15.5339      3.0180       5.15      <.0001
MATH          1      0.4005      0.0717       5.58      <.0001
READ          1      0.3094      0.0655       4.72      <.0001
The middle section of the output above gives the Chow test; the rest is the regression model for the entire sample, including both male and female students. The Chow test (F = 11.84 with 3 and 194 degrees of freedom, p < .0001) indicates that there is a structural difference between male and female students. Now let’s run the regression models separately.
proc reg data = hsb2;
  by female;
  model write = math read ;
run;
quit;

FEMALE=0

Parameter Estimates

                   Parameter     Standard
Variable     DF     Estimate        Error    t Value    Pr > |t|
Intercept     1      7.33165      4.60342       1.59      0.1148
MATH          1      0.39321      0.10066       3.91      0.0002
READ          1      0.41592      0.09259       4.49      <.0001

FEMALE=1

Parameter Estimates

                   Parameter     Standard
Variable     DF     Estimate        Error    t Value    Pr > |t|
Intercept     1     21.07310      3.37071       6.25      <.0001
MATH          1      0.41966      0.08719       4.81      <.0001
READ          1      0.23061      0.07933       2.91      0.0044
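As a cross-check, the same hypothesis can also be sketched as a single pooled regression in which female and its interactions with math and read are added to the model, and the three added terms are tested jointly. This is a minimal sketch, not part of the analysis above: it assumes female is a numeric 0/1 variable (as the proc freq output suggests), and the names hsb2_int, fem_math, fem_read and the label chow_equiv are hypothetical names introduced here for illustration.

```sas
/* Build interaction terms; hsb2_int, fem_math and fem_read are
   illustrative names, not variables that exist in hsb2. */
data hsb2_int;
  set hsb2;
  fem_math = female * math;   /* lets the math slope differ for females */
  fem_read = female * read;   /* lets the read slope differ for females */
run;

/* Fully interacted model: math, read and the intercept describe
   males; female, fem_math and fem_read are female-minus-male shifts. */
proc reg data = hsb2_int;
  model write = math read female fem_math fem_read;
  /* Joint F test that the intercept shift and both slope shifts are zero */
  chow_equiv: test female = 0, fem_math = 0, fem_read = 0;
run;
quit;
```

The joint test has 3 numerator and 194 denominator degrees of freedom, and it should agree with the Chow test whenever the breakpoint separates the two groups exactly. Because the model is fully interacted, its coefficients reproduce the separate regressions: the estimate for female should equal the difference between the two intercepts, 21.07310 - 7.33165, or about 13.74.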