Analyzing Student Credit Card Charges with Time Series Analysis

Introduction:

In this blog post, we will explore a dataset of student credit card charges over a two-year period. Our goal is to understand the patterns and trends in the data and create a predictive model using exponential smoothing. We will use R to perform the analysis and discuss the results.

Data:

The dataset consists of monthly credit card charges for a student from January 2012 to December 2013. The full dataset is shown below (charges in dollars):

Month   2012   2013
Jan     31.9   39.4
Feb     27.0   36.2
Mar     31.3   40.5
Apr     31.0   44.6
May     39.4   46.8
Jun     40.7   44.7
Jul     42.3   52.2
Aug     49.5   54.0
Sep     45.0   48.8
Oct     50.0   55.8
Nov     50.9   58.7
Dec     58.5   63.4

Time Series Plot:

First, let’s create a time series plot to visualize the data:

charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5, 
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)
chargesSeries <- ts(charges, start=c(2012,1), frequency=12)
plot.ts(chargesSeries, main="Student Credit Card Charges", xlab="Year", ylab="Charges ($)")

The plot shows credit card charges rising over the two years. In both years the highest charge falls in December, which may reflect holiday shopping, although with only two years of data this pattern is hard to separate from the upward trend itself.
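A quick numeric check can back up what the plot suggests. The sketch below is only an illustration: the names annualMeans, yearOverYear, and peakMonth are ours and not part of the original analysis. It compares the average charge in each year, the change from each month of 2012 to the same month of 2013, and the position of the largest charge within each year.

```r
charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)
chargesSeries <- ts(charges, start = c(2012, 1), frequency = 12)

# Mean monthly charge per calendar year (hypothetical helper name)
annualMeans <- aggregate(chargesSeries, FUN = mean)

# Change from each month of 2012 to the same month of 2013
yearOverYear <- diff(chargesSeries, lag = 12)

# Position of the largest charge within each year (1 = Jan, ..., 12 = Dec)
peakMonth <- tapply(charges, rep(2012:2013, each = 12), which.max)

print(annualMeans)
print(yearOverYear)
print(peakMonth)
```

If every year-over-year change is positive, that supports the upward trend. A peak at position 12 in both years is consistent with a December effect, but it is also what a steadily rising series would produce on its own.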

Exponential Smoothing Model:

Next, we’ll fit an exponential smoothing model to the data using the HoltWinters function:

chargesSeriesForecasts <- HoltWinters(chargesSeries)
chargesSeriesForecasts

Output:

Holt-Winters exponential smoothing with trend and additive seasonal component.

Smoothing parameters:
 alpha: 0.412921 
 beta : 0 
 gamma: 1 

Coefficients:
         [,1]
a   30.129963
b    0.239933
s1   0.408154
s2  -5.178694
s3  -1.588864
s4  -1.908911
s5   5.251975
s6   7.110975
s7   9.350922
s8  16.801922
s9  12.301922
s10 15.301922
s11 16.201922
s12 23.801922

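The model combines a level (a), a linear trend (b ≈ 0.24 per month), and additive seasonal terms (s1 to s12). An alpha of 0.41 means the level gives moderate weight to the most recent observation. A beta of 0 means the slope is never updated and stays at its initial value, while a gamma of 1 means each seasonal term is taken entirely from the most recent observation for that month. With only two years of data, the seasonal terms rest on very few observations.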
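To make the role of the coefficients concrete, here is a minimal sketch of how an additive Holt-Winters fit turns them into forecasts: for a horizon h of up to 12 months, the forecast is a + h*b + s_h. The name manualForecast is ours, and the comparison against predict() is only there as a sanity check on that formula.

```r
charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)
chargesSeries <- ts(charges, start = c(2012, 1), frequency = 12)
chargesSeriesForecasts <- HoltWinters(chargesSeries)

# Named coefficient vector: a (level), b (slope), s1..s12 (seasonal terms)
co <- coef(chargesSeriesForecasts)

# Forecast for horizons 1..12 built by hand: level + h * slope + seasonal term
h <- 1:12
manualForecast <- as.numeric(co["a"] + h * co["b"] + co[paste0("s", h)])

# The same forecasts from the built-in predict method
builtIn <- as.numeric(predict(chargesSeriesForecasts, n.ahead = 12))

print(cbind(h = h, manual = manualForecast, predict = builtIn))
```

The two columns should agree, which confirms that the forecast is nothing more than the last level, a straight-line trend, and the matching seasonal offset.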

Model Diagnostics:

To assess the appropriateness of the exponential smoothing model, we’ll analyze the residuals:

# A HoltWinters fit has no $residuals component, so extract them with residuals()
chargesResiduals <- residuals(chargesSeriesForecasts)
plot.ts(chargesResiduals)
acf(chargesResiduals, lag.max=20)
Box.test(chargesResiduals, lag=20, type="Ljung-Box")
hist(chargesResiduals)

The residuals show no obvious pattern around zero, and the ACF plot shows no significant autocorrelations. The Ljung-Box test agrees: it does not reject the null hypothesis of no autocorrelation (p-value > 0.05):

Box-Ljung test

data:  chargesResiduals
X-squared = 11.43, df = 20, p-value = 0.9341

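The histogram suggests the residuals are roughly symmetric around zero, though with so few observations it says little about normality. Taken together, these diagnostics give no evidence against the exponential smoothing model, but with only two years of data they have limited power to detect problems.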
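It is worth checking how many residuals these diagnostics are actually based on. As far as we understand, a seasonal HoltWinters fit uses the first year to initialize its components, so fitted values (and therefore residuals) only exist from the second year onward. The sketch below counts them and reruns the Ljung-Box test with a shorter lag. The lag of 6 and the name shortLagTest are our assumptions for illustration, not values from the original analysis.

```r
charges <- c(31.9, 27.0, 31.3, 31.0, 39.4, 40.7, 42.3, 49.5, 45.0, 50.0, 50.9, 58.5,
             39.4, 36.2, 40.5, 44.6, 46.8, 44.7, 52.2, 54.0, 48.8, 55.8, 58.7, 63.4)
chargesSeries <- ts(charges, start = c(2012, 1), frequency = 12)
chargesSeriesForecasts <- HoltWinters(chargesSeries)

# In-sample one-step-ahead errors (observed minus fitted)
chargesResiduals <- residuals(chargesSeriesForecasts)

# How many residuals are there, and when do they start?
print(length(chargesResiduals))
print(start(chargesResiduals))

# Ljung-Box test with a lag well below the number of residuals
# (lag = 6 is an assumed, illustrative choice)
shortLagTest <- Box.test(chargesResiduals, lag = 6, type = "Ljung-Box")
print(shortLagTest)
```

With so few residuals, any autocorrelation test should be read as a rough sanity check rather than strong evidence of a good fit.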

Forecasting:

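Since the diagnostics show no clear problems, we can go on to produce forecasts for the next 12 months:

library(forecast)
# forecast() dispatches on the HoltWinters object; forecast.HoltWinters() is not exported in current versions of the package
chargesSeriesForecasts2 <- forecast(chargesSeriesForecasts, h=12)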
chargesSeriesForecasts2

Output:

         Point Forecast      Lo 80     Hi 80      Lo 95     Hi 95
Jan 2014       64.33915  59.041895  69.63640  56.245174  72.43312
Feb 2014       59.16055  53.569618  64.75147  50.629790  67.69130
Mar 2014       63.57031  57.713222  69.42740  54.649648  72.49097
Apr 2014       63.26016  57.154877  69.36544  53.981338  72.53898
May 2014       71.50113  65.162075  77.84019  61.889333  81.11293
Jun 2014       73.36013  66.800851  79.91941  63.438163  83.28210
Jul 2014       75.60008  68.830362  82.36981  65.385425  85.81474
Aug 2014       83.05108  76.079180  90.02298  72.558823  93.54334
Sep 2014       78.55108  71.384120  85.71804  67.794289  89.30787
Oct 2014       81.55108  74.193851  88.90831  70.540518  92.56165
Nov 2014       82.45108  74.908369  89.99379  71.196577  93.70559
Dec 2014       90.05108  82.327685  97.77448  78.561540 101.54063

The forecasts continue the upward trend, with the highest forecast in December 2014. The prediction intervals widen over time, reflecting increased uncertainty further into the future: the 95% interval spans about $16 in January and about $23 in December.
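The widening of the intervals can be read straight off the forecast table. The sketch below simply copies the Lo 95 and Hi 95 columns from the output above and computes the width for each month. The names lo95, hi95, and width95 are ours.

```r
# 95% prediction bounds copied from the forecast output, Jan 2014 to Dec 2014
lo95 <- c(56.245174, 50.629790, 54.649648, 53.981338, 61.889333, 63.438163,
          65.385425, 72.558823, 67.794289, 70.540518, 71.196577, 78.561540)
hi95 <- c(72.43312, 67.69130, 72.49097, 72.53898, 81.11293, 83.28210,
          85.81474, 93.54334, 89.30787, 92.56165, 93.70559, 101.54063)

# Width of the 95% interval for each forecast month
width95 <- hi95 - lo95
names(width95) <- month.abb

print(round(width95, 2))

# TRUE if every interval is wider than the one before it
print(all(diff(width95) > 0))
```

A wider interval does not mean the point forecast is wrong. It means the model is less sure about months that are further away from the last observation.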

Conclusion:

In this analysis, we used time series techniques to model and forecast student credit card charges. The exponential smoothing model captured the increasing trend and the within-year pattern in the data, and the residual diagnostics showed no clear problems, although two years of data is a small sample for a seasonal model. The forecasts suggest that credit card charges will continue to rise, with the highest charges around December.

However, it’s important to note that the model should be regularly updated with new data to adapt to any changes in the underlying patterns. Additionally, external factors not captured in the historical data, such as changes in economic conditions or personal circumstances, could impact future credit card usage.

Understanding these patterns can help students better manage their finances and make informed decisions about credit card usage. For example, setting aside money throughout the year for holiday expenses could help avoid excessive charges in December.

I hope this analysis has provided valuable insights into student credit card usage patterns. Feel free to reach out with any questions or suggestions for further analysis!

