SOLUTION: Data.xls contains quarterly data on sales of refrigerators and durable goods for the period 1978 to 1995. a.Estimate the following dummy variable models and interpret the results.

Question 1167135: Data.xls contains quarterly data on sales of refrigerators and durable goods for the period 1978 to 1995.
a. Estimate the following dummy variable models and interpret the results.
FRIG = β1 + β2D2 + β3D3 + β4D4 + μ
DUR = β1 + β2D2 + β3D3 + β4D4 + μ
b. Introduce a dummy variable D1 representing sales in the first quarter. Then, re-estimate the following models and interpret the results.
FRIG = β1D1 + β2D2 + β3D3 + β4D4 + μ
DUR = β1D1 + β2D2 + β3D3 + β4D4 + μ
c. De-seasonalize the sales of refrigerators and durable goods using the three-step seasonal adjustment procedure. Explain the procedure and purpose of de-seasonalization.


FRIG DUR D2 D3 D4
1317 252.6 0 0 0
1615 272.4 1 0 0
1662 270.9 0 1 0
1295 273.9 0 0 1
1271 268.9 0 0 0
1555 262.9 1 0 0
1639 270.9 0 1 0
1238 263.4 0 0 1
1277 260.6 0 0 0
1258 231.9 1 0 0
1417 242.7 0 1 0
1185 248.6 0 0 1
1196 258.7 0 0 0
1410 248.4 1 0 0
1417 255.5 0 1 0
919 240.4 0 0 1
943 247.7 0 0 0
1175 249.1 1 0 0
1269 251.8 0 1 0
973 262 0 0 1
1102 263.3 0 0 0
1344 280 1 0 0
1641 288.5 0 1 0
1225 300.5 0 0 1
1429 312.6 0 0 0
1699 322.5 1 0 0
1749 324.3 0 1 0
1117 333.1 0 0 1
1242 344.8 0 0 0
1684 350.3 1 0 0
1764 369.1 0 1 0
1328 356.4 0 0 1

Answer by CPhill(1987):
```python
import pandas as pd
import numpy as np
import statsmodels.api as sm
# Provided data
data_str = """
FRIG DUR D2 D3 D4
1317 252.6 0 0 0
1615 272.4 1 0 0
1662 270.9 0 1 0
1295 273.9 0 0 1
1271 268.9 0 0 0
1555 262.9 1 0 0
1639 270.9 0 1 0
1238 263.4 0 0 1
1277 260.6 0 0 0
1258 231.9 1 0 0
1417 242.7 0 1 0
1185 248.6 0 0 1
1196 258.7 0 0 0
1410 248.4 1 0 0
1417 255.5 0 1 0
919 240.4 0 0 1
943 247.7 0 0 0
1175 249.1 1 0 0
1269 251.8 0 1 0
973 262.0 0 0 1
1102 263.3 0 0 0
1344 280.0 1 0 0
1641 288.5 0 1 0
1225 300.5 0 0 1
1429 312.6 0 0 0
1699 322.5 1 0 0
1749 324.3 0 1 0
1117 333.1 0 0 1
1242 344.8 0 0 0
1684 350.3 1 0 0
1764 369.1 0 1 0
1328 356.4 0 0 1
"""
# Convert data to DataFrame
from io import StringIO
# Raw string so that \s is passed to the regex engine rather than parsed as a string escape
df = pd.read_csv(StringIO(data_str), sep=r'\s+')
# Create D1 for part b and c
df['D1'] = 1 - df['D2'] - df['D3'] - df['D4']
# --- A) Estimate models with INTERCEPT (D1 is reference) ---
X_a = sm.add_constant(df[['D2', 'D3', 'D4']], prepend=True)
# FRIG model (a)
model_frig_a = sm.OLS(df['FRIG'], X_a).fit()
# DUR model (a)
model_dur_a = sm.OLS(df['DUR'], X_a).fit()
# --- B) Estimate models without INTERCEPT (D1, D2, D3, D4 included) ---
X_b = df[['D1', 'D2', 'D3', 'D4']]
# FRIG model (b)
model_frig_b = sm.OLS(df['FRIG'], X_b).fit()
# DUR model (b)
model_dur_b = sm.OLS(df['DUR'], X_b).fit()
# --- C) De-seasonalization (Using coefficients from model_b) ---
# 1. Estimate Seasonal Components (Quarterly Means)
# FRIG Seasonal Components (Si_frig)
Si_frig = model_frig_b.params.to_dict()
# DUR Seasonal Components (Si_dur)
Si_dur = model_dur_b.params.to_dict()
# 2. Calculate Overall Mean
mean_frig = df['FRIG'].mean()
mean_dur = df['DUR'].mean()
# 3. Calculate Seasonal Index (SI_i = S_i - Overall Mean)
SI_frig = {k: v - mean_frig for k, v in Si_frig.items()}
SI_dur = {k: v - mean_dur for k, v in Si_dur.items()}
# Map seasonal indices back to the DataFrame rows
df['SI_FRIG'] = (df['D1'] * SI_frig['D1'] + df['D2'] * SI_frig['D2'] +
                 df['D3'] * SI_frig['D3'] + df['D4'] * SI_frig['D4'])
df['SI_DUR'] = (df['D1'] * SI_dur['D1'] + df['D2'] * SI_dur['D2'] +
                df['D3'] * SI_dur['D3'] + df['D4'] * SI_dur['D4'])
# 4. De-seasonalize the sales (De-seasonalized = Original - Seasonal Index)
df['FRIG_DESEASONALIZED'] = df['FRIG'] - df['SI_FRIG']
df['DUR_DESEASONALIZED'] = df['DUR'] - df['SI_DUR']
# Save the de-seasonalized series. to_csv returns None when given a path,
# so it is called on its own instead of being stored in the results dict.
df[['FRIG', 'FRIG_DESEASONALIZED', 'DUR', 'DUR_DESEASONALIZED']].to_csv(
    'deseasonalized_data.csv', index=False)
# Prepare results for the user
results = {
    'model_frig_a': model_frig_a.summary().as_text(),
    'model_dur_a': model_dur_a.summary().as_text(),
    'model_frig_b': model_frig_b.summary().as_text(),
    'model_dur_b': model_dur_b.summary().as_text(),
    'mean_frig': mean_frig,
    'mean_dur': mean_dur,
    'SI_frig': SI_frig,
    'SI_dur': SI_dur,
}
print(f"FRIG Model (a) Summary:\n{results['model_frig_a']}")
print(f"\nDUR Model (a) Summary:\n{results['model_dur_a']}")
print(f"\nFRIG Model (b) Summary:\n{results['model_frig_b']}")
print(f"\nDUR Model (b) Summary:\n{results['model_dur_b']}")
print(f"\nFRIG Seasonal Indices (SI): {results['SI_frig']}")
print(f"DUR Seasonal Indices (SI): {results['SI_dur']}")
print("De-seasonalized data saved to deseasonalized_data.csv")
```
```text
FRIG Model (a) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: FRIG R-squared: 0.532
Model: OLS Adj. R-squared: 0.482
Method: Least Squares F-statistic: 10.60
Date: Sun, 19 Oct 2025 Prob (F-statistic): 7.91e-05
Time: 20:32:45 Log-Likelihood: -207.55
No. Observations: 32 AIC: 423.1
Df Residuals: 28 BIC: 429.0
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 1222.1250 59.990 20.372 0.000 1099.240 1345.010
D2 245.3750 84.839 2.892 0.007 71.590 419.160
D3 347.6250 84.839 4.097 0.000 173.840 521.410
D4 -62.1250 84.839 -0.732 0.470 -235.910 111.660
==============================================================================
Omnibus: 3.849 Durbin-Watson: 0.393
Prob(Omnibus): 0.146 Jarque-Bera (JB): 2.306
Skew: -0.436 Prob(JB): 0.316
Kurtosis: 2.015 Cond. No. 4.79
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
DUR Model (a) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: DUR R-squared: 0.011
Model: OLS Adj. R-squared: -0.095
Method: Least Squares F-statistic: 0.1045
Date: Sun, 19 Oct 2025 Prob (F-statistic): 0.957
Time: 20:32:45 Log-Likelihood: -161.15
No. Observations: 32 AIC: 330.3
Df Residuals: 28 BIC: 336.2
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 276.1500 14.069 19.628 0.000 247.330 304.970
D2 1.0375 19.897 0.052 0.959 -39.720 41.795
D3 8.0625 19.897 0.405 0.688 -32.695 48.820
D4 8.6375 19.897 0.434 0.668 -32.120 49.395
==============================================================================
Omnibus: 5.154 Durbin-Watson: 0.100
Prob(Omnibus): 0.076 Jarque-Bera (JB): 4.808
Skew: 0.930 Prob(JB): 0.0904
Kurtosis: 2.615 Cond. No. 4.79
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
FRIG Model (b) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: FRIG R-squared: 0.532
Model: OLS Adj. R-squared: 0.482
Method: Least Squares F-statistic: 10.60
Date: Sun, 19 Oct 2025 Prob (F-statistic): 7.91e-05
Time: 20:32:45 Log-Likelihood: -207.55
No. Observations: 32 AIC: 423.1
Df Residuals: 28 BIC: 429.0
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
D1 1222.1250 59.990 20.372 0.000 1099.240 1345.010
D2 1467.5000 59.990 24.462 0.000 1344.615 1590.385
D3 1569.7500 59.990 26.167 0.000 1446.865 1692.635
D4 1160.0000 59.990 19.336 0.000 1037.115 1282.885
==============================================================================
Omnibus: 3.849 Durbin-Watson: 0.393
Prob(Omnibus): 0.146 Jarque-Bera (JB): 2.306
Skew: -0.436 Prob(JB): 0.316
Kurtosis: 2.015 Cond. No. 1.00
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
DUR Model (b) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: DUR R-squared: 0.011
Model: OLS Adj. R-squared: -0.095
Method: Least Squares F-statistic: 0.1045
Date: Sun, 19 Oct 2025 Prob (F-statistic): 0.957
Time: 20:32:45 Log-Likelihood: -161.15
No. Observations: 32 AIC: 330.3
Df Residuals: 28 BIC: 336.2
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
D1 276.1500 14.069 19.628 0.000 247.330 304.970
D2 277.1875 14.069 19.702 0.000 248.368 306.007
D3 284.2125 14.069 20.201 0.000 255.393 313.032
D4 284.7875 14.069 20.242 0.000 255.968 313.607
==============================================================================
Omnibus: 5.154 Durbin-Watson: 0.100
Prob(Omnibus): 0.076 Jarque-Bera (JB): 4.808
Skew: 0.930 Prob(JB): 0.0904
Kurtosis: 2.615 Cond. No. 1.00
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
FRIG Seasonal Indices (SI): {'D1': -132.71875000000023, 'D2': 112.65624999999977, 'D3': 214.90624999999977, 'D4': -194.84375000000023}
DUR Seasonal Indices (SI): {'D1': -4.434375000000102, 'D2': -3.3968750000000796, 'D3': 3.6281249999999545, 'D4': 4.203124999999943}
De-seasonalized data saved to deseasonalized_data.csv
```
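The question refers to 1978 to 1995, but the 32 quarterly observations supplied cover eight years, so the sample runs from 1978 to 1985 (assuming it starts in the first quarter of 1978). We use Ordinary Least Squares (OLS) regression with dummy variables to estimate the seasonal components.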
## a. Models with Intercept (Reference Quarter: Q1)
The models estimated are:
* **FRIG:** $\text{FRIG} = \beta_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
* **DUR:** $\text{DUR} = \beta_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
| Variable | $\hat{\beta}$ (FRIG) | $\hat{\beta}$ (DUR) | Interpretation ($\hat{\beta}$) |
| :--- | :--- | :--- | :--- |
| **const** ($\beta_1$) | $1222.13$ | $276.15$ | **Q1 Mean:** Estimated mean sales in the first quarter. |
| **$D_2$** ($\beta_2$) | $245.38$ | $1.04$ | **Q2 Effect:** Estimated difference in sales between Q2 and Q1. |
| **$D_3$** ($\beta_3$) | $347.63$ | $8.06$ | **Q3 Effect:** Estimated difference in sales between Q3 and Q1. |
| **$D_4$** ($\beta_4$) | $-62.13$ | $8.64$ | **Q4 Effect:** Estimated difference in sales between Q4 and Q1. |
**Interpretation of Results:**
1. **FRIG Sales:** The seasonal dummies explain about half of the variation in refrigerator sales ($R^2=0.532$) and are jointly significant ($F=10.60$, $P<0.001$), indicating strong seasonality.
* **Baseline (Q1 Mean):** The mean sales in the first quarter (Q1) are estimated to be $1222.13$.
* **Q2 and Q3 Effect:** Sales in Q2 and Q3 are significantly higher than in Q1, by $245.38$ and $347.63$ units, respectively ($P<0.01$). **Q3 has the highest estimated mean sales** ($1222.125 + 347.625 = 1569.75$).
* **Q4 Effect:** Sales in Q4 are slightly lower than Q1, by $62.13$ units, but this difference is **not statistically significant** ($P=0.470$).
2. **DUR (Durable Goods) Sales:** The model has very poor explanatory power ($R^2=0.011$), indicating that seasonal dummy variables alone do not capture significant variation in Durable Goods sales.
* **Baseline (Q1 Mean):** The mean sales in Q1 are estimated to be $276.15$.
* **Q2, Q3, and Q4 Effects:** The estimated differences in sales for Q2 ($1.04$), Q3 ($8.06$), and Q4 ($8.64$) relative to Q1 are all **not statistically significant** ($P>0.6$). This suggests that the seasonal pattern is either very weak or non-existent in this series, especially compared to the FRIG sales.
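
The reading of the intercept model as "Q1 mean plus differences from Q1" can be checked without running a regression. The following is a minimal sketch, not part of the original solution: it assumes the series starts in Q1 (as the dummy columns imply), and the names `quarter`, `quarter_means`, `const`, and `effects` are illustrative only.

```python
import numpy as np
import pandas as pd

# FRIG values copied from the question's data table, in time order.
frig = [1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
        1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
        943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
        1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328]

# Quarter label 1..4 repeating (assumption: the first row is a first quarter).
quarter = np.tile([1, 2, 3, 4], len(frig) // 4)

# Plain group means per quarter.
quarter_means = pd.Series(frig).groupby(quarter).mean()

# Intercept-model coefficients: the Q1 mean, then each quarter's gap from Q1.
const = quarter_means.loc[1]
effects = quarter_means - const

print(const)                 # Q1 mean
print(effects.loc[[2, 3, 4]])  # differentials for D2, D3, D4
```

The printed values should match the `const`, `D2`, `D3`, and `D4` coefficients in the FRIG summary for part (a), which is why those coefficients are read as differences relative to Q1.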
-----
## b. Models without Intercept (All Dummies Included)
The models estimated are:
* **FRIG:** $\text{FRIG} = \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
* **DUR:** $\text{DUR} = \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
| Variable | $\hat{\beta}$ (FRIG) | $\hat{\beta}$ (DUR) | Interpretation ($\hat{\beta}$) |
| :--- | :--- | :--- | :--- |
| **$D_1$** ($\beta_1$) | $1222.13$ | $276.15$ | **Q1 Mean:** Estimated mean sales in the first quarter. |
| **$D_2$** ($\beta_2$) | $1467.50$ | $277.19$ | **Q2 Mean:** Estimated mean sales in the second quarter. |
| **$D_3$** ($\beta_3$) | $1569.75$ | $284.21$ | **Q3 Mean:** Estimated mean sales in the third quarter. |
| **$D_4$** ($\beta_4$) | $1160.00$ | $284.79$ | **Q4 Mean:** Estimated mean sales in the fourth quarter. |
**Interpretation of Results:**
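This model is an equivalent reparameterization of the intercept model (same fitted values, residuals, and $R^2$) but provides the **estimated mean sales for each quarter directly** as the coefficients. Each coefficient here equals the part (a) intercept plus the corresponding differential, e.g. $1222.125 + 245.375 = 1467.50$ for Q2.
1. **FRIG Sales:** The means clearly show the seasonal pattern: Q3 (highest, $1569.75$) $>$ Q2 ($1467.50$) $>$ Q1 ($1222.13$) $>$ Q4 (lowest, $1160.00$). The reported $t$-tests ($P<0.001$) only show that each quarterly mean differs from zero, so they say nothing about seasonality by themselves; the differences between quarters are tested in part (a).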
2. **DUR Sales:** The quarterly means are clustered closely together (ranging from $276.15$ to $284.79$), confirming that the seasonal component is small or negligible.
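
The equivalence of the two parameterizations can be illustrated with plain least squares. This is a sketch under stated assumptions rather than the original code: it uses `numpy.linalg.lstsq` in place of statsmodels, assumes the series starts in Q1, and the names `X_a`, `X_b`, `means_from_a`, and `fitted_equal` are chosen here for illustration.

```python
import numpy as np

# FRIG values copied from the question's data table, in time order.
frig = np.array([1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
                 1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
                 943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
                 1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328], dtype=float)
n = frig.size

# Quarter index 0..3 (Q1..Q4), assuming the first row is a first quarter.
q = np.tile(np.arange(4), n // 4)
D = np.eye(4)[q]  # n x 4 matrix whose columns are D1, D2, D3, D4

X_a = np.column_stack([np.ones(n), D[:, 1:]])  # part (a): intercept + D2, D3, D4
X_b = D                                        # part (b): D1..D4, no intercept

beta_a, *_ = np.linalg.lstsq(X_a, frig, rcond=None)
beta_b, *_ = np.linalg.lstsq(X_b, frig, rcond=None)

# Rebuild the part (b) quarterly means from the part (a) coefficients.
means_from_a = beta_a[0] + np.concatenate([[0.0], beta_a[1:]])

# Both designs span the same column space, so the fitted values coincide.
fitted_equal = np.allclose(X_a @ beta_a, X_b @ beta_b)

print(beta_b)
print(means_from_a)
print(fitted_equal)
```

Including an intercept together with all four dummies would make the design matrix rank-deficient (the dummy variable trap), which is why part (b) drops the intercept.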
-----
## c. De-seasonalize the Sales Data
### Procedure and Purpose of De-seasonalization
**Purpose:** De-seasonalization (or seasonal adjustment) is the process of removing predictable, periodic fluctuations (seasonality) from a time series. This process reveals the underlying **trend** and **cyclical** components of the data, which are crucial for forecasting, making policy decisions, and comparing data across different periods.
**Three-Step Seasonal Adjustment Procedure (Additive Model using Dummy Variables):**
1. **Estimate the Seasonal Component ($S_i$):** The seasonal component for each quarter ($S_i$) is estimated as the mean of sales for that quarter ($\hat{\beta}_i$) from the no-intercept dummy variable model (Part b).
* $\hat{\beta}_i$ is the estimated mean for Quarter $i$.
2. **Calculate the Seasonal Index ($SI_i$):** The seasonal index measures the estimated seasonal effect as a deviation from the overall average sales ($\bar{Y}$).
* $SI_i = \hat{\beta}_i - \bar{Y}$
3. **De-seasonalize the Data ($Y_{t}^*$):** The de-seasonalized series is obtained by subtracting the seasonal index corresponding to that period from the original data point ($Y_t$).
* $Y_{t}^* = Y_t - SI_i$
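
The three steps above can be sketched for the FRIG series as follows. This is an illustrative re-implementation with NumPy, not the original code; it assumes the series starts in Q1, and the variable names are chosen here. The last line also computes the common textbook form of the procedure (regression residual plus overall mean) to show that the two are the same calculation.

```python
import numpy as np

# FRIG values copied from the question's data table, in time order.
frig = np.array([1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
                 1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
                 943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
                 1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328], dtype=float)

# Quarter index 0..3 (Q1..Q4), assuming the first row is a first quarter.
q = np.tile(np.arange(4), frig.size // 4)

# Step 1: seasonal component = quarterly mean (the no-intercept coefficients).
quarter_means = np.array([frig[q == i].mean() for i in range(4)])

# Step 2: seasonal index = quarterly mean minus the overall mean.
seasonal_index = quarter_means - frig.mean()

# Step 3: subtract the seasonal index of each observation's own quarter.
frig_adj = frig - seasonal_index[q]

# Equivalent form: residual from the dummy regression plus the overall mean.
residual_form = (frig - quarter_means[q]) + frig.mean()

print(seasonal_index)
print(np.allclose(frig_adj, residual_form))
```

After the adjustment every quarter has the same mean as the full series, which is the sense in which the seasonal pattern has been removed.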
### De-seasonalization Results
The overall mean sales for each series are:
* $\bar{\text{FRIG}} \approx 1354.84$
* $\bar{\text{DUR}} \approx 280.58$
The calculated Seasonal Indices ($SI$) based on the quarterly means from part (b) and the overall mean are:
| Quarter | FRIG Quarter Mean ($\hat{\beta}_i$) | $SI_{\text{FRIG}}$ | DUR Quarter Mean ($\hat{\beta}_i$) | $SI_{\text{DUR}}$ |
| :--- | :--- | :--- | :--- | :--- |
| **Q1** ($D_1$) | $1222.13$ | $-132.72$ | $276.15$ | $-4.43$ |
| **Q2** ($D_2$) | $1467.50$ | $112.66$ | $277.19$ | $-3.40$ |
| **Q3** ($D_3$) | $1569.75$ | $214.91$ | $284.21$ | $3.63$ |
| **Q4** ($D_4$) | $1160.00$ | $-194.84$ | $284.79$ | $4.20$ |
The de-seasonalized data for both FRIG and DUR is provided in the file `deseasonalized_data.csv`.
**Interpretation of De-seasonalized Data:**
The deseasonalized series represents what the sales would have been if seasonal effects had not been present. For the **FRIG** series, removing the seasonal swings leaves values that vary less around the overall mean ($\approx 1354.84$) than the original data, which makes any underlying **trend** or **cyclical variation** easier to see. Since the $SI_{\text{DUR}}$ values are very small (under $5$ units on a mean of about $280$), the deseasonalized **DUR** series is nearly identical to the original series, reinforcing the finding that seasonality is not a significant factor in durable goods sales.
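
How much tighter the adjusted FRIG series is can be quantified by comparing sums of squares about the mean. The sketch below is an added illustration, not part of the original solution; it assumes the series starts in Q1 and uses illustrative variable names. Because the adjustment removes exactly the variation explained by the quarterly dummies, the remaining share should come out near $1 - R^2 = 1 - 0.532 = 0.468$.

```python
import numpy as np

# FRIG values copied from the question's data table, in time order.
frig = np.array([1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
                 1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
                 943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
                 1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328], dtype=float)

# Quarter index 0..3 (Q1..Q4), assuming the first row is a first quarter.
q = np.tile(np.arange(4), frig.size // 4)

# Additive seasonal adjustment: remove each quarter's deviation from the overall mean.
quarter_means = np.array([frig[q == i].mean() for i in range(4)])
frig_adj = frig - (quarter_means[q] - frig.mean())

# Variation about the mean before and after the adjustment.
ss_original = np.sum((frig - frig.mean()) ** 2)
ss_adjusted = np.sum((frig_adj - frig_adj.mean()) ** 2)

# Share of the original variation that is left once seasonality is removed.
share_remaining = ss_adjusted / ss_original
print(round(share_remaining, 3))
```

Roughly half of the variation in refrigerator sales therefore remains after adjustment; that remainder is what trend, cycle, and noise have to account for.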