Introduction:
In this blog post, we will explore a dataset of student credit card charges over a two-year period. Our goal is to understand the patterns and trends in the data and create a predictive model using exponential smoothing. We will use R to perform the analysis and discuss the results.
Data:
The dataset consists of monthly credit card charges (in dollars) for one student from January 2012 to December 2013, 24 observations in total. Here is the full dataset:
Month,2012,2013
Jan,31.9,39.4
Feb,27.0,36.2
Mar,31.3,40.5
Apr,31.0,44.6
May,39.4,46.8
Jun,40.7,44.7
Jul,42.3,52.2
Aug,49.5,54.0
Sep,45.0,48.8
Oct,50.0,55.8
Nov,50.9,58.7
Dec,58.5,63.4
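If the table lives in a CSV file rather than being typed in by hand, it can be read and stacked into one chronological vector. This is only a sketch: the names csv_text, wide, and charges_from_table are made up for illustration, and the CSV text is simply the table above pasted as a string.

```r
# The table from the post, pasted as CSV text (values copied as-is).
csv_text <- "Month,2012,2013
Jan,31.9,39.4
Feb,27.0,36.2
Mar,31.3,40.5
Apr,31.0,44.6
May,39.4,46.8
Jun,40.7,44.7
Jul,42.3,52.2
Aug,49.5,54.0
Sep,45.0,48.8
Oct,50.0,55.8
Nov,50.9,58.7
Dec,58.5,63.4"

# read.csv() prefixes column names that start with a digit with "X"
# (default check.names = TRUE), so the year columns become X2012 and X2013.
wide <- read.csv(text = csv_text)

# Stack the two year columns so the values run Jan 2012 ... Dec 2013.
charges_from_table <- c(wide$X2012, wide$X2013)
length(charges_from_table)
```

The resulting vector can then be passed to ts() with frequency=12 in the same way as a hand-typed one.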
Time Series Plot:
First, let’s create a time series plot to visualize the data:
charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,
39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)
chargesSeries <- ts(charges, start=c(2012,1), frequency=12)
plot.ts(chargesSeries, main="Student Credit Card Charges", xlab="Year", ylab="Charges ($)")
The plot reveals an increasing trend in credit card charges over the two years. Charges also peak in December in both years, which hints at seasonality, possibly due to holiday shopping. With only two years of data, though, the seasonal pattern should be treated as tentative.
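The visual impression can be checked with a few lines of arithmetic. The sketch below lays the same 24 values out as a 12 x 2 matrix and compares the two years month by month. The names by_year, yearly_mean, yoy_change, and peak_month are illustrative choices, not part of the original analysis.

```r
# Same 24 monthly values as in the post, Jan 2012 ... Dec 2013.
charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)

# matrix() fills column by column, so column 1 is 2012 and column 2 is 2013.
by_year <- matrix(charges, nrow = 12,
                  dimnames = list(month.abb, c("2012", "2013")))

# Average monthly charge in each year.
yearly_mean <- colMeans(by_year)

# Change from 2012 to 2013 for each calendar month.
yoy_change <- by_year[, "2013"] - by_year[, "2012"]

# Calendar month with the largest charge in each year.
peak_month <- month.abb[apply(by_year, 2, which.max)]

round(yearly_mean, 2)
yoy_change
peak_month
```

Every calendar month is higher in 2013 than in 2012, and the largest value in each year falls in December, which is what the plot suggests.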
Exponential Smoothing Model:
Next, we’ll fit an exponential smoothing model to the data using the HoltWinters function:
chargesSeriesForecasts <- HoltWinters(chargesSeries)
chargesSeriesForecasts
Output:
Holt-Winters exponential smoothing with trend and additive seasonal component.
Smoothing parameters:
alpha: 0.412921
beta : 0
gamma: 1
Coefficients:
[,1]
a 30.129963
b 0.239933
s1 0.408154
s2 -5.178694
s3 -1.588864
s4 -1.908911
s5 5.251975
s6 7.110975
s7 9.350922
s8 16.801922
s9 12.301922
s10 15.301922
s11 16.201922
s12 23.801922
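The fitted model has a linear trend (slope b=0.24 per month) and an additive seasonal component. The alpha of 0.41 means the level estimate gives moderate weight to the most recent observations. The beta of 0 means the slope is never updated and stays at its initial value. The gamma of 1 means each seasonal estimate is based entirely on the most recent observation for that month, with no weight on earlier years.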
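To make the roles of alpha, beta, and gamma concrete, here is a minimal sketch of a single update step of the additive Holt-Winters recursions, following the formulas given in the R help page for HoltWinters. The function hw_update and its argument names are hypothetical helpers written for this post, and the numbers fed to it are toy values, not output from the fitted model.

```r
# One step of the additive Holt-Winters recursions.
# y      : new observation
# level  : previous level a[t-1]
# trend  : previous slope b[t-1]
# season : seasonal term for the same month one cycle ago, s[t-p]
hw_update <- function(y, level, trend, season, alpha, beta, gamma) {
  new_level  <- alpha * (y - season) + (1 - alpha) * (level + trend)
  new_trend  <- beta * (new_level - level) + (1 - beta) * trend
  new_season <- gamma * (y - new_level) + (1 - gamma) * season
  list(level = new_level, trend = new_trend, season = new_season)
}

# Toy inputs (assumed for illustration), with beta = 0 and gamma = 1
# as in the fitted model's smoothing parameters.
step <- hw_update(y = 50, level = 40, trend = 1, season = 5,
                  alpha = 0.5, beta = 0, gamma = 1)

step$level   # blend of the deseasonalised observation and the previous projection
step$trend   # unchanged from the input, because beta = 0
step$season  # equals y - step$level, because gamma = 1
```

With beta = 0 the slope term drops out of the update entirely, and with gamma = 1 the old seasonal value is discarded, which matches the interpretation above.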
Model Diagnostics:
To assess the appropriateness of the exponential smoothing model, we’ll analyze the residuals:
# residuals() extracts the one-step-ahead in-sample errors from the HoltWinters fit
# (a HoltWinters object has no $residuals element of its own)
plot.ts(residuals(chargesSeriesForecasts))
acf(residuals(chargesSeriesForecasts), lag.max=20)
Box.test(residuals(chargesSeriesForecasts), lag=20, type="Ljung-Box")
hist(residuals(chargesSeriesForecasts))
The residuals appear randomly scattered around zero, and the ACF plot shows no significant autocorrelations. The Ljung-Box test agrees: its p-value is well above 0.05, so there is no evidence of autocorrelation in the residuals:
Box-Ljung test
data: residuals(chargesSeriesForecasts)
X-squared = 11.43, df = 20, p-value = 0.9341
The histogram suggests the residuals are approximately normally distributed. These diagnostics indicate the exponential smoothing model is a reasonable fit for the data.
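One practical caveat with short series: a seasonal Holt-Winters fit uses the first seasonal cycle for start-up, so I would expect far fewer residuals than observations here (worth verifying with length(residuals(chargesSeriesForecasts))). The Ljung-Box lag cannot meaningfully exceed the number of residuals minus one. The sketch below is a hypothetical helper, check_residuals, that caps the lag accordingly. It is demonstrated on simulated residuals, not on the residuals from the model above.

```r
# Summarise a residual vector and run a Ljung-Box test with a lag that
# never exceeds length(res) - 1. Helper name and return fields are illustrative.
check_residuals <- function(res, lag = 20) {
  res <- as.numeric(na.omit(res))
  lag_used <- min(lag, length(res) - 1)
  lb <- Box.test(res, lag = lag_used, type = "Ljung-Box")
  list(n = length(res),
       lag_used = lag_used,
       mean = mean(res),
       p_value = unname(lb$p.value))
}

# Simulated stand-in for 12 model residuals (assumption: white noise).
set.seed(1)
example_res <- rnorm(12)
diag_out <- check_residuals(example_res, lag = 20)
diag_out$lag_used
diag_out$p_value
```

On the real fit, the call would be check_residuals(residuals(chargesSeriesForecasts)).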
Forecasting:
Given the satisfactory model diagnostics, we can confidently make forecasts:
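library(forecast)
# forecast() dispatches on the HoltWinters object; calling forecast.HoltWinters()
# directly fails in recent versions of the forecast package, which no longer export it
chargesSeriesForecasts2 <- forecast(chargesSeriesForecasts, h=12)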
chargesSeriesForecasts2
Output:
         Point Forecast     Lo 80    Hi 80     Lo 95     Hi 95
Jan 2014       64.33915 59.041895 69.63640 56.245174  72.43312
Feb 2014       59.16055 53.569618 64.75147 50.629790  67.69130
Mar 2014       63.57031 57.713222 69.42740 54.649648  72.49097
Apr 2014       63.26016 57.154877 69.36544 53.981338  72.53898
May 2014       71.50113 65.162075 77.84019 61.889333  81.11293
Jun 2014       73.36013 66.800851 79.91941 63.438163  83.28210
Jul 2014       75.60008 68.830362 82.36981 65.385425  85.81474
Aug 2014       83.05108 76.079180 90.02298 72.558823  93.54334
Sep 2014       78.55108 71.384120 85.71804 67.794289  89.30787
Oct 2014       81.55108 74.193851 88.90831 70.540518  92.56165
Nov 2014       82.45108 74.908369 89.99379 71.196577  93.70559
Dec 2014       90.05108 82.327685 97.77448 78.561540 101.54063
The forecasts continue the upward trend through 2014, with the highest value in December. The prediction intervals widen as the forecast horizon grows, reflecting greater uncertainty further into the future.
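The widening of the intervals can be read directly off the forecast table. As a small sketch, the code below copies the Lo 95 and Hi 95 columns from the output above and computes the width of each 95% interval. The names lo95, hi95, and width95 are just illustrative.

```r
# 95% bounds copied from the forecast table, Jan 2014 ... Dec 2014.
lo95 <- c(56.245174, 50.629790, 54.649648, 53.981338, 61.889333, 63.438163,
          65.385425, 72.558823, 67.794289, 70.540518, 71.196577, 78.561540)
hi95 <- c(72.43312, 67.69130, 72.49097, 72.53898, 81.11293, 83.28210,
          85.81474, 93.54334, 89.30787, 92.56165, 93.70559, 101.54063)

# Width of each 95% interval, labelled by month.
width95 <- setNames(hi95 - lo95, month.abb)
round(width95, 2)

# TRUE if every interval is wider than the one before it.
all(diff(width95) > 0)
```

The width grows from about 16.2 in January to about 23.0 in December. With the forecast object in hand, the same widths should be obtainable from its upper and lower components, assuming the second column of each holds the 95% level.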
Conclusion:
In this analysis, we used time series techniques to model and forecast student credit card charges. The exponential smoothing model captured the increasing trend and seasonality in the data, providing a good fit as evidenced by the model diagnostics. The forecasts suggest that credit card charges will continue to rise, with seasonal spikes around December.
However, it’s important to note that the model should be regularly updated with new data to adapt to any changes in the underlying patterns. Additionally, external factors not captured in the historical data, such as changes in economic conditions or personal circumstances, could impact future credit card usage.
Understanding these patterns can help students better manage their finances and make informed decisions about credit card usage. For example, setting aside money throughout the year for holiday expenses could help avoid excessive charges in December.
I hope this analysis has provided valuable insights into student credit card usage patterns. Feel free to reach out with any questions or suggestions for further analysis!