Question 1167135: Data.xls contains quarterly data on sales of refrigerators and durable goods for the period 1978 to 1995.
a. Estimate the following dummy variable models and interpret the results.
FRIG = β1 + β2D2 + β3D3 + β4D4 + μ
DUR = β1 + β2D2 + β3D3 + β4D4 + μ
b. Introduce a dummy variable D1 representing sales in the first quarter. Then, re-estimate the following models and interpret the results.
FRIG = β1D1 + β2D2 + β3D3 + β4D4 + μ
DUR = β1D1 + β2D2 + β3D3 + β4D4 + μ
c. De-seasonalize the sales of refrigerators and durable goods using the three-step seasonal adjustment procedure. Explain the procedure and purpose of de-seasonalization.
FRIG DUR D2 D3 D4
1317 252.6 0 0 0
1615 272.4 1 0 0
1662 270.9 0 1 0
1295 273.9 0 0 1
1271 268.9 0 0 0
1555 262.9 1 0 0
1639 270.9 0 1 0
1238 263.4 0 0 1
1277 260.6 0 0 0
1258 231.9 1 0 0
1417 242.7 0 1 0
1185 248.6 0 0 1
1196 258.7 0 0 0
1410 248.4 1 0 0
1417 255.5 0 1 0
919 240.4 0 0 1
943 247.7 0 0 0
1175 249.1 1 0 0
1269 251.8 0 1 0
973 262 0 0 1
1102 263.3 0 0 0
1344 280 1 0 0
1641 288.5 0 1 0
1225 300.5 0 0 1
1429 312.6 0 0 0
1699 322.5 1 0 0
1749 324.3 0 1 0
1117 333.1 0 0 1
1242 344.8 0 0 0
1684 350.3 1 0 0
1764 369.1 0 1 0
1328 356.4 0 0 1
Answer by CPhill(1987):
```python
import pandas as pd
import numpy as np
import statsmodels.api as sm
# Provided data
data_str = """
FRIG DUR D2 D3 D4
1317 252.6 0 0 0
1615 272.4 1 0 0
1662 270.9 0 1 0
1295 273.9 0 0 1
1271 268.9 0 0 0
1555 262.9 1 0 0
1639 270.9 0 1 0
1238 263.4 0 0 1
1277 260.6 0 0 0
1258 231.9 1 0 0
1417 242.7 0 1 0
1185 248.6 0 0 1
1196 258.7 0 0 0
1410 248.4 1 0 0
1417 255.5 0 1 0
919 240.4 0 0 1
943 247.7 0 0 0
1175 249.1 1 0 0
1269 251.8 0 1 0
973 262.0 0 0 1
1102 263.3 0 0 0
1344 280.0 1 0 0
1641 288.5 0 1 0
1225 300.5 0 0 1
1429 312.6 0 0 0
1699 322.5 1 0 0
1749 324.3 0 1 0
1117 333.1 0 0 1
1242 344.8 0 0 0
1684 350.3 1 0 0
1764 369.1 0 1 0
1328 356.4 0 0 1
"""
# Convert data to DataFrame
from io import StringIO
df = pd.read_csv(StringIO(data_str), sep=r'\s+')  # raw string avoids an invalid-escape warning
# Create D1 for part b and c
df['D1'] = 1 - df['D2'] - df['D3'] - df['D4']
# --- A) Estimate models with INTERCEPT (D1 is reference) ---
X_a = sm.add_constant(df[['D2', 'D3', 'D4']], prepend=True)
# FRIG model (a)
model_frig_a = sm.OLS(df['FRIG'], X_a).fit()
# DUR model (a)
model_dur_a = sm.OLS(df['DUR'], X_a).fit()
# --- B) Estimate models without INTERCEPT (D1, D2, D3, D4 included) ---
X_b = df[['D1', 'D2', 'D3', 'D4']]
# FRIG model (b)
model_frig_b = sm.OLS(df['FRIG'], X_b).fit()
# DUR model (b)
model_dur_b = sm.OLS(df['DUR'], X_b).fit()
# --- C) De-seasonalization (Using coefficients from model_b) ---
# 1. Estimate Seasonal Components (Quarterly Means)
# FRIG Seasonal Components (Si_frig)
Si_frig = model_frig_b.params.to_dict()
# DUR Seasonal Components (Si_dur)
Si_dur = model_dur_b.params.to_dict()
# 2. Calculate Overall Mean
mean_frig = df['FRIG'].mean()
mean_dur = df['DUR'].mean()
# 3. Calculate Seasonal Index (SI_i = S_i - Overall Mean)
SI_frig = {k: v - mean_frig for k, v in Si_frig.items()}
SI_dur = {k: v - mean_dur for k, v in Si_dur.items()}
# Map seasonal indices back to the DataFrame rows
df['SI_FRIG'] = (df['D1'] * SI_frig['D1'] + df['D2'] * SI_frig['D2'] +
                 df['D3'] * SI_frig['D3'] + df['D4'] * SI_frig['D4'])
df['SI_DUR'] = (df['D1'] * SI_dur['D1'] + df['D2'] * SI_dur['D2'] +
                df['D3'] * SI_dur['D3'] + df['D4'] * SI_dur['D4'])
# 4. De-seasonalize the sales (De-seasonalized = Original - Seasonal Index)
df['FRIG_DESEASONALIZED'] = df['FRIG'] - df['SI_FRIG']
df['DUR_DESEASONALIZED'] = df['DUR'] - df['SI_DUR']
# Save the de-seasonalized series. to_csv returns None when given a path,
# so it is called on its own instead of being stored in the results dict.
df[['FRIG', 'FRIG_DESEASONALIZED', 'DUR', 'DUR_DESEASONALIZED']].to_csv(
    'deseasonalized_data.csv', index=False)
# Prepare results for the user
results = {
    'model_frig_a': model_frig_a.summary().as_text(),
    'model_dur_a': model_dur_a.summary().as_text(),
    'model_frig_b': model_frig_b.summary().as_text(),
    'model_dur_b': model_dur_b.summary().as_text(),
    'mean_frig': mean_frig,
    'mean_dur': mean_dur,
    'SI_frig': SI_frig,
    'SI_dur': SI_dur,
}
print(f"FRIG Model (a) Summary:\n{results['model_frig_a']}")
print(f"\nDUR Model (a) Summary:\n{results['model_dur_a']}")
print(f"\nFRIG Model (b) Summary:\n{results['model_frig_b']}")
print(f"\nDUR Model (b) Summary:\n{results['model_dur_b']}")
print(f"\nFRIG Seasonal Indices (SI): {results['SI_frig']}")
print(f"DUR Seasonal Indices (SI): {results['SI_dur']}")
print("De-seasonalized data saved to deseasonalized_data.csv")
```
```text
FRIG Model (a) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: FRIG R-squared: 0.532
Model: OLS Adj. R-squared: 0.482
Method: Least Squares F-statistic: 10.60
Date: Sun, 19 Oct 2025 Prob (F-statistic): 7.91e-05
Time: 20:32:45 Log-Likelihood: -207.55
No. Observations: 32 AIC: 423.1
Df Residuals: 28 BIC: 429.0
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 1222.1250 59.990 20.372 0.000 1099.240 1345.010
D2 245.3750 84.839 2.892 0.007 71.590 419.160
D3 347.6250 84.839 4.097 0.000 173.840 521.410
D4 -62.1250 84.839 -0.732 0.470 -235.910 111.660
==============================================================================
Omnibus: 3.849 Durbin-Watson: 0.393
Prob(Omnibus): 0.146 Jarque-Bera (JB): 2.306
Skew: -0.436 Prob(JB): 0.316
Kurtosis: 2.015 Cond. No. 4.79
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
DUR Model (a) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: DUR R-squared: 0.011
Model: OLS Adj. R-squared: -0.095
Method: Least Squares F-statistic: 0.1045
Date: Sun, 19 Oct 2025 Prob (F-statistic): 0.957
Time: 20:32:45 Log-Likelihood: -161.15
No. Observations: 32 AIC: 330.3
Df Residuals: 28 BIC: 336.2
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 276.1500 14.069 19.628 0.000 247.330 304.970
D2 1.0375 19.897 0.052 0.959 -39.720 41.795
D3 8.0625 19.897 0.405 0.688 -32.695 48.820
D4 8.6375 19.897 0.434 0.668 -32.120 49.395
==============================================================================
Omnibus: 5.154 Durbin-Watson: 0.100
Prob(Omnibus): 0.076 Jarque-Bera (JB): 4.808
Skew: 0.930 Prob(JB): 0.0904
Kurtosis: 2.615 Cond. No. 4.79
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
FRIG Model (b) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: FRIG R-squared: 0.532
Model: OLS Adj. R-squared: 0.482
Method: Least Squares F-statistic: 10.60
Date: Sun, 19 Oct 2025 Prob (F-statistic): 7.91e-05
Time: 20:32:45 Log-Likelihood: -207.55
No. Observations: 32 AIC: 423.1
Df Residuals: 28 BIC: 429.0
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
D1 1222.1250 59.990 20.372 0.000 1099.240 1345.010
D2 1467.5000 59.990 24.462 0.000 1344.615 1590.385
D3 1569.7500 59.990 26.167 0.000 1446.865 1692.635
D4 1160.0000 59.990 19.336 0.000 1037.115 1282.885
==============================================================================
Omnibus: 3.849 Durbin-Watson: 0.393
Prob(Omnibus): 0.146 Jarque-Bera (JB): 2.306
Skew: -0.436 Prob(JB): 0.316
Kurtosis: 2.015 Cond. No. 1.00
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
DUR Model (b) Summary:
OLS Regression Results
==============================================================================
Dep. Variable: DUR R-squared: 0.011
Model: OLS Adj. R-squared: -0.095
Method: Least Squares F-statistic: 0.1045
Date: Sun, 19 Oct 2025 Prob (F-statistic): 0.957
Time: 20:32:45 Log-Likelihood: -161.15
No. Observations: 32 AIC: 330.3
Df Residuals: 28 BIC: 336.2
Df Model: 3
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
D1 276.1500 14.069 19.628 0.000 247.330 304.970
D2 277.1875 14.069 19.702 0.000 248.368 306.007
D3 284.2125 14.069 20.201 0.000 255.393 313.032
D4 284.7875 14.069 20.242 0.000 255.968 313.607
==============================================================================
Omnibus: 5.154 Durbin-Watson: 0.100
Prob(Omnibus): 0.076 Jarque-Bera (JB): 4.808
Skew: 0.930 Prob(JB): 0.0904
Kurtosis: 2.615 Cond. No. 1.00
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
FRIG Seasonal Indices (SI): {'D1': -132.71875000000023, 'D2': 112.65624999999977, 'D3': 214.90624999999977, 'D4': -194.84375000000023}
DUR Seasonal Indices (SI): {'D1': -4.434375000000102, 'D2': -3.3968750000000796, 'D3': 3.6281249999999545, 'D4': 4.203124999999943}
De-seasonalized data saved to deseasonalized_data.csv
```
The data contain 32 quarterly observations, i.e. eight years, so they span 1978 to 1985 (the "1995" in the question appears to be a typo). We use Ordinary Least Squares (OLS) regression with dummy variables to estimate the seasonal components.
## a. Models with Intercept (Reference Quarter: Q1)
The models estimated are:
* **FRIG:** $\text{FRIG} = \beta_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
* **DUR:** $\text{DUR} = \beta_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
| Variable | $\hat{\beta}$ (FRIG) | $\hat{\beta}$ (DUR) | Interpretation ($\hat{\beta}$) |
| :--- | :--- | :--- | :--- |
| **const** ($\beta_1$) | $1222.13$ | $276.15$ | **Q1 Mean:** Estimated mean sales in the first quarter. |
| **$D_2$** ($\beta_2$) | $245.38$ | $1.04$ | **Q2 Effect:** Estimated difference in sales between Q2 and Q1. |
| **$D_3$** ($\beta_3$) | $347.63$ | $8.06$ | **Q3 Effect:** Estimated difference in sales between Q3 and Q1. |
| **$D_4$** ($\beta_4$) | $-62.13$ | $8.64$ | **Q4 Effect:** Estimated difference in sales between Q4 and Q1. |
**Interpretation of Results:**
1. **FRIG Sales:** The seasonal dummies jointly explain about half of the variation in sales ($R^2=0.532$; $F=10.60$, $P<0.001$), indicating strong seasonality.
* **Baseline (Q1 Mean):** The mean sales in the first quarter (Q1) are estimated to be $1222.13$.
* **Q2 and Q3 Effect:** Sales in Q2 and Q3 are significantly higher than in Q1, by $245.38$ and $347.63$ units, respectively ($P<0.01$). **Q3 has the highest estimated mean sales** ($1222.125 + 347.625 = 1569.75$).
* **Q4 Effect:** Sales in Q4 are slightly lower than Q1, by $62.13$ units, but this difference is **not statistically significant** ($P=0.470$).
2. **DUR (Durable Goods) Sales:** The model has very poor explanatory power ($R^2=0.011$; $F=0.1045$, $P=0.957$), indicating that the seasonal dummy variables capture almost none of the variation in durable goods sales.
* **Baseline (Q1 Mean):** The mean sales in Q1 are estimated to be $276.15$.
* **Q2, Q3, and Q4 Effects:** The estimated differences in sales for Q2 ($1.04$), Q3 ($8.06$), and Q4 ($8.64$) relative to Q1 are all **not statistically significant** ($P>0.6$). This suggests that the seasonal pattern in this series is very weak or absent, especially compared with the FRIG series.
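
The coefficient interpretation above can be checked without a regression package. The following is a minimal sketch using only the Python standard library; it assumes the rows are in time order starting with a first quarter (as the dummy columns in the question indicate), and the variable names are illustrative rather than taken from the original code.

```python
from statistics import mean

# FRIG column from the question, in time order (first row is a Q1).
frig = [1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
        1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
        943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
        1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328]

# Every fourth observation belongs to the same quarter.
quarter_means = [mean(frig[q::4]) for q in range(4)]

# In the intercept model, the constant is the Q1 mean and each dummy
# coefficient is that quarter's mean minus the Q1 mean.
intercept = quarter_means[0]
effects = [m - intercept for m in quarter_means[1:]]

print("const:", intercept)
for name, value in zip(["D2", "D3", "D4"], effects):
    print(name, value)
```

The printed values should agree with the `const`, `D2`, `D3`, and `D4` rows of the FRIG regression output for part (a).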
-----
## b. Models without Intercept (All Dummies Included)
The models estimated are:
* **FRIG:** $\text{FRIG} = \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
* **DUR:** $\text{DUR} = \beta_1 D_1 + \beta_2 D_2 + \beta_3 D_3 + \beta_4 D_4 + \mu$
| Variable | $\hat{\beta}$ (FRIG) | $\hat{\beta}$ (DUR) | Interpretation ($\hat{\beta}$) |
| :--- | :--- | :--- | :--- |
| **$D_1$** ($\beta_1$) | $1222.13$ | $276.15$ | **Q1 Mean:** Estimated mean sales in the first quarter. |
| **$D_2$** ($\beta_2$) | $1467.50$ | $277.19$ | **Q2 Mean:** Estimated mean sales in the second quarter. |
| **$D_3$** ($\beta_3$) | $1569.75$ | $284.21$ | **Q3 Mean:** Estimated mean sales in the third quarter. |
| **$D_4$** ($\beta_4$) | $1160.00$ | $284.79$ | **Q4 Mean:** Estimated mean sales in the fourth quarter. |
**Interpretation of Results:**
This model is mathematically equivalent to the intercept model but provides the **estimated mean sales for each quarter directly** as the coefficients.
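1. **FRIG Sales:** The means show the seasonal pattern directly: Q3 (highest, $1569.75$) $>$ Q2 ($1467.50$) $>$ Q1 ($1222.13$) $>$ Q4 (lowest, $1160.00$). Each coefficient is significantly different from zero ($P<0.001$), but that only says mean sales are positive in every quarter; whether the quarters differ from one another is what the $D_2$, $D_3$, and $D_4$ coefficients in part (a) test.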
2. **DUR Sales:** The quarterly means are clustered closely together (ranging from $276.15$ to $284.79$), confirming that the seasonal component is small or negligible.
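
The equivalence of the two parameterizations can be illustrated by building the fitted values both ways and computing the fit statistics by hand. This is a sketch under the assumption that OLS on a full set of quarterly dummies reproduces the quarterly means; it uses only the standard library, and the names (`fitted_a`, `fitted_b`, `f_stat`) are illustrative.

```python
from statistics import mean

frig = [1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
        1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
        943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
        1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328]
n, k = len(frig), 4  # observations, estimated parameters

q_means = [mean(frig[q::4]) for q in range(4)]

# Part (a): intercept (Q1 mean) plus an offset for Q2, Q3, Q4.
const = q_means[0]
offsets = [0.0] + [m - const for m in q_means[1:]]
fitted_a = [const + offsets[t % 4] for t in range(n)]

# Part (b): one coefficient per quarter, no intercept.
fitted_b = [q_means[t % 4] for t in range(n)]

# Identical fitted values imply identical residuals, R-squared and F.
grand = mean(frig)
rss = sum((y - f) ** 2 for y, f in zip(frig, fitted_b))
tss = sum((y - grand) ** 2 for y in frig)
r2 = 1 - rss / tss
f_stat = ((tss - rss) / (k - 1)) / (rss / (n - k))

print(round(r2, 3), round(f_stat, 2))
```

The resulting R-squared and F-statistic should match the values reported for both FRIG summaries, which is why the two outputs differ only in their coefficient tables.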
-----
## c. De-seasonalize the Sales Data
### Procedure and Purpose of De-seasonalization
**Purpose:** De-seasonalization (or seasonal adjustment) is the process of removing predictable, periodic fluctuations (seasonality) from a time series. This process reveals the underlying **trend** and **cyclical** components of the data, which are crucial for forecasting, making policy decisions, and comparing data across different periods.
**Three-Step Seasonal Adjustment Procedure (Additive Model using Dummy Variables):**
1. **Estimate the Seasonal Component ($S_i$):** The seasonal component for each quarter ($S_i$) is estimated as the mean of sales for that quarter ($\hat{\beta}_i$) from the no-intercept dummy variable model (Part b).
* $\hat{\beta}_i$ is the estimated mean for Quarter $i$.
2. **Calculate the Seasonal Index ($SI_i$):** The seasonal index measures the estimated seasonal effect as a deviation from the overall average sales ($\bar{Y}$).
* $SI_i = \hat{\beta}_i - \bar{Y}$
3. **De-seasonalize the Data ($Y_{t}^*$):** The de-seasonalized series is obtained by subtracting the seasonal index corresponding to that period from the original data point ($Y_t$).
* $Y_{t}^* = Y_t - SI_i$
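
The three steps can be sketched in a few lines of standard-library Python. The sketch also compares the result with a common textbook variant (regression residuals plus the overall mean), which is algebraically the same adjustment; the function names are hypothetical and not part of the original code.

```python
from statistics import mean

frig = [1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
        1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
        943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
        1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328]
dur = [252.6, 272.4, 270.9, 273.9, 268.9, 262.9, 270.9, 263.4,
       260.6, 231.9, 242.7, 248.6, 258.7, 248.4, 255.5, 240.4,
       247.7, 249.1, 251.8, 262.0, 263.3, 280.0, 288.5, 300.5,
       312.6, 322.5, 324.3, 333.1, 344.8, 350.3, 369.1, 356.4]

def seasonal_indices(y):
    """Steps 1-2: quarterly means minus the overall mean."""
    grand = mean(y)
    return [mean(y[q::4]) - grand for q in range(4)]

def deseasonalize(y):
    """Step 3: subtract each observation's seasonal index."""
    si = seasonal_indices(y)
    return [v - si[t % 4] for t, v in enumerate(y)]

def deseasonalize_via_residuals(y):
    """Textbook variant: dummy-regression residual plus overall mean."""
    q_means = [mean(y[q::4]) for q in range(4)]
    grand = mean(y)
    return [(v - q_means[t % 4]) + grand for t, v in enumerate(y)]

si_frig = seasonal_indices(frig)
si_dur = seasonal_indices(dur)
frig_adj = deseasonalize(frig)
print(si_frig)
print([round(s, 4) for s in si_dur])
```

After adjustment, every quarter of the adjusted series has the same mean (the overall mean), which is a quick sanity check that the seasonal component has been removed.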
### De-seasonalization Results
The overall mean sales for each series are:
* $\bar{\text{FRIG}} \approx 1354.84$
* $\bar{\text{DUR}} \approx 280.58$
The calculated Seasonal Indices ($SI$) based on the quarterly means from part (b) and the overall mean are:
| Quarter | FRIG quarter mean ($\hat{\beta}$) | $SI_{\text{FRIG}}$ | DUR quarter mean ($\hat{\beta}$) | $SI_{\text{DUR}}$ |
| :--- | :--- | :--- | :--- | :--- |
| **Q1** ($D_1$) | $1222.13$ | $-132.72$ | $276.15$ | $-4.43$ |
| **Q2** ($D_2$) | $1467.50$ | $112.66$ | $277.19$ | $-3.40$ |
| **Q3** ($D_3$) | $1569.75$ | $214.91$ | $284.21$ | $3.63$ |
| **Q4** ($D_4$) | $1160.00$ | $-194.84$ | $284.79$ | $4.20$ |
The de-seasonalized data for both FRIG and DUR is provided in the file `deseasonalized_data.csv`.
**Interpretation of De-seasonalized Data:**
The de-seasonalized series estimates what sales would have been without the average seasonal effect. For the **FRIG** series, removing the quarterly swings (which account for roughly half of the total variation, $R^2=0.532$) makes the year-to-year movement easier to see: sales drift down from 1978 to a low in 1982 and then recover through 1985, which looks more like a **cyclical** pattern than a steady upward trend. The low Durbin-Watson statistic ($0.393$) in the regression output is consistent with this remaining non-seasonal structure. Since the $SI_{\text{DUR}}$ values are small relative to the level of the series (at most about $4.4$ against a mean of about $280$), the de-seasonalized **DUR** series is nearly identical to the original, reinforcing the finding that seasonality is not a significant factor in durable goods sales.
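
One way to look at the non-seasonal movement that the adjustment is meant to expose is to compare annual averages before and after. The sketch below assumes the first row is 1978 Q1 and uses only the standard library; because the four seasonal indices sum to zero, the adjustment should leave each year's average unchanged while flattening the within-year swings.

```python
from statistics import mean

frig = [1317, 1615, 1662, 1295, 1271, 1555, 1639, 1238,
        1277, 1258, 1417, 1185, 1196, 1410, 1417, 919,
        943, 1175, 1269, 973, 1102, 1344, 1641, 1225,
        1429, 1699, 1749, 1117, 1242, 1684, 1764, 1328]

# Seasonal index per quarter: quarterly mean minus overall mean.
grand = mean(frig)
si = [mean(frig[q::4]) - grand for q in range(4)]
adjusted = [v - si[t % 4] for t, v in enumerate(frig)]

# Average the four quarters of each year (assumes 1978 Q1 is row 0).
years = list(range(1978, 1986))
annual_raw = [mean(frig[4 * i:4 * i + 4]) for i in range(len(years))]
annual_adj = [mean(adjusted[4 * i:4 * i + 4]) for i in range(len(years))]

for year, raw, adj in zip(years, annual_raw, annual_adj):
    print(year, raw, round(adj, 2))
```

The annual averages give a compact view of the level of the series over time, independent of the quarterly pattern.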