A time series is a collection of observations on at least one variable ordered along a single dimension, time. Time series data tend to be large, rich in attributes and continuous. They are particularly useful for trend analysis and forecasting in macroeconomics. In finance, time series data help in forecasting volatility and average prices.
In general, examples of time series data include broad macroeconomic aggregates such as price levels, money supply, exchange rates, gross domestic product, investments, relative income levels and productivity indicators. Stock prices and cryptocurrency prices are also time series. For instance, 1-minute historical Bitcoin price data from 2012 to 2021 is available on Kaggle.
Time Series Econometrics Analysis
Time series econometrics focuses on the dependence among observations at different points in time. What distinguishes it from general econometric analysis is precisely the temporal order of the observations. In addition to the contemporaneous relationships among variables, the relationships between their current and past values are critical.
A recommended reference on the subject is the book by R. Tsay, “Analysis of Financial Time Series”.
The key feature of time series data is time dependence. Data points tend to be strongly related to their recent history, that is, to their lags. This creates a critical problem when time series data are used in a standard econometric model, because such models typically assume independent observations. Researchers therefore need to take additional steps in specifying econometric models for time series data before applying standard econometric methods.
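To make the dependence on lags concrete, here is a minimal sketch on simulated data (not the data set used later in the post). The AR(1) coefficient of 0.8, the random seed and the series length are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Simulate a hypothetical AR(1) series: x_t = 0.8 * x_{t-1} + noise.
# The coefficient, seed and length are illustrative assumptions.
rng = np.random.default_rng(0)
n = 2000
noise = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.8 * x[t - 1] + noise[t]

series = pd.Series(x)

# Correlation between the series and its own lagged copies.
lag1_corr = series.autocorr(lag=1)
lag5_corr = series.autocorr(lag=5)

# For comparison: independent noise shows no such dependence.
noise_lag1_corr = pd.Series(noise).autocorr(lag=1)

print(f"AR(1) lag-1 autocorrelation: {lag1_corr:.2f}")
print(f"AR(1) lag-5 autocorrelation: {lag5_corr:.2f}")
print(f"white-noise lag-1 autocorrelation: {noise_lag1_corr:.2f}")
```

The simulated series is strongly correlated with its first lag and less so with its fifth, while the independent noise is not correlated with its own past.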
Time Series Forecasting
Furthermore, an appropriate forecasting model can be constructed to capture the dynamics of the underlying time series, which can provide a basis for investors’ decision-making. For example, an accurate forecast of the stock index allows investors to grasp the overall trend of the market, capture trading opportunities and make reasonable asset allocations.
The classical forecasting models for time series are the auto-regressive (AR) model, the moving-average (MA) model, the auto-regressive moving-average (ARMA) model and the auto-regressive integrated moving-average (ARIMA) model. Among these, ARIMA has become one of the more widely used methods in the study of forecasting models for time series.
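As a minimal sketch of the simplest of these models, the snippet below simulates an AR(1) process, estimates its parameters by ordinary least squares and produces a one-step-ahead forecast. The parameter values, seed and sample size are illustrative assumptions, and the least-squares fit stands in for the estimation routines a dedicated library would provide.

```python
import numpy as np

# Simulate a hypothetical AR(1) process x_t = c + phi * x_{t-1} + e_t.
# c, phi, the seed and the length are illustrative assumptions.
rng = np.random.default_rng(42)
c_true, phi_true, n = 1.0, 0.6, 5000
x = np.empty(n)
x[0] = c_true / (1 - phi_true)  # start at the unconditional mean
for t in range(1, n):
    x[t] = c_true + phi_true * x[t - 1] + rng.normal()

# Regress x_t on a constant and x_{t-1} (ordinary least squares).
X = np.column_stack([np.ones(n - 1), x[:-1]])
y = x[1:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
c_hat, phi_hat = coef

# One-step-ahead forecast from the last observation.
forecast = c_hat + phi_hat * x[-1]

print(f"estimated c = {c_hat:.3f}, estimated phi = {phi_hat:.3f}")
print(f"one-step-ahead forecast = {forecast:.3f}")
```

MA and ARIMA models add moving-average terms and differencing to this idea. In practice they are usually fitted with a library such as statsmodels rather than by hand.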
Stock index time series in Python
To illustrate basic operations with time series data in Python, we use a data set from Kaggle with about six years of 5-minute data on the NIFTY 50 Indian stock index. The data spans from 9 January 2015 to 25 March 2021.
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
We load the data set using the pandas library. For our analysis we only need the date and close columns:
df = pd.read_csv(
    "/kaggle/input/6-year-nifty50-historical-data-of-5-30-min-candle/5min_N50_10yr.csv",
    usecols=['date', 'close'],
)
print(df.head(5))
Next, we convert the date column to a datetime data type and set it as the index:
def parser(s):
    # Parse a 'YYYY-MM-DD HH:MM:SS' timestamp string
    return datetime.strptime(s, '%Y-%m-%d %H:%M:%S')

# Keep only the first 19 characters of each timestamp before parsing
df.date = df.date.apply(lambda x: parser(x[:19]))
df = df.set_index('date')
print(df.head(5))
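The same conversion can also be sketched in vectorized form with pd.to_datetime, which avoids calling a Python function on every row. The two rows below are made-up values, and the trailing UTC offset is only an assumption about why the code above keeps the first 19 characters.

```python
import pandas as pd

# Hypothetical rows mimicking the assumed timestamp layout:
# 'YYYY-MM-DD HH:MM:SS' followed by a UTC offset.
sample = pd.DataFrame({
    "date": ["2015-01-09 09:15:00+05:30", "2015-01-09 09:20:00+05:30"],
    "close": [8285.45, 8292.10],
})

# Keep the first 19 characters, then parse the whole column at once.
sample["date"] = pd.to_datetime(
    sample["date"].str[:19], format="%Y-%m-%d %H:%M:%S"
)
sample = sample.set_index("date")
print(sample.index)
```

On a column with hundreds of thousands of 5-minute candles, the vectorized call is typically much faster than applying strptime row by row.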
Now we can plot the data set and analyse it with auto-correlation and partial auto-correlation plots, which we will cover in detail in a future post:
df.plot(figsize=(20, 12))
plt.savefig('plot.png')  # to save the plot
plt.show()
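For a daily price analysis we keep only the closing price of each day, which is the 15:25 candle:
# .copy() so that columns can be added later without a pandas warning
data = df[(df.index.minute == 25) & (df.index.hour == 15)].copy()
data.head(2)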
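The selection above relies on every trading day having a 15:25 candle. The sketch below compares it with taking the last available candle of each calendar day, using a small made-up intraday frame (two days of 5-minute candles with placeholder prices, both assumptions for illustration).

```python
import numpy as np
import pandas as pd

# Hypothetical 5-minute candles for two trading days, 09:15 to 15:25.
idx = pd.date_range("2021-03-24 09:15", "2021-03-24 15:25", freq="5min").append(
    pd.date_range("2021-03-25 09:15", "2021-03-25 15:25", freq="5min")
)
# Placeholder prices: simply 0, 1, 2, ... so rows are easy to identify.
intraday = pd.DataFrame({"close": np.arange(len(idx), dtype=float)}, index=idx)

# Variant 1: keep the 15:25 candle, as in the post.
by_time = intraday[(intraday.index.hour == 15) & (intraday.index.minute == 25)]

# Variant 2: take the last available candle of each calendar day.
by_last = intraday.groupby(intraday.index.date)["close"].last()

print(by_time)
print(by_last)
```

Both variants pick the same rows here. The second one would still return a value on a day whose final candle falls at a different time, which may or may not be what the analysis needs.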
For plotting auto-correlations we use the statsmodels library:
from statsmodels.graphics.tsaplots import plot_pacf, plot_acf
plot_acf(data['close'])
plt.savefig('acf.png')
plt.show()

plot_pacf(data['close'])
plt.savefig('pacf.png')
plt.show()
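To show what an auto-correlation plot summarizes, here is a minimal sketch that computes the sample auto-correlation function by hand for a simulated random walk. The helper sample_acf is a hypothetical name introduced for illustration, and the simulated series is an assumption standing in for price levels.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.dot(centered, centered)
    return np.array([
        np.dot(centered[: len(x) - k], centered[k:]) / denom
        for k in range(max_lag + 1)
    ])

# Hypothetical price-like series: a random walk around 10,000.
rng = np.random.default_rng(7)
walk = 10_000 + np.cumsum(rng.normal(size=1500))

acf = sample_acf(walk, max_lag=10)
band = 1.96 / np.sqrt(len(walk))  # approximate 95 % band under white noise

print(np.round(acf, 3))
print(f"white-noise band: +/- {band:.3f}")
```

For a series that behaves like a random walk, the auto-correlations stay far above the white-noise band and decay only slowly with the lag, which is the pattern that signals serial correlation in price levels.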
The plots show clear evidence of time dependence and serial correlation in the price levels. Hence, we take the logarithm of the prices, so that returns are symmetric around zero, and difference the log-prices to obtain a stationary series of log-returns.
# Log-return: first difference of the log-price
data['log_return'] = np.log(data['close']).diff()
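The symmetry argument can be checked on a tiny worked example. The three prices below are made up for illustration: the price rises by 10 % and then falls back to its starting level.

```python
import numpy as np
import pandas as pd

# Hypothetical prices: up 10 %, then back to the starting level.
prices = pd.Series([100.0, 110.0, 100.0])

simple_returns = prices.pct_change().dropna()
log_returns = np.log(prices).diff().dropna()

# Simple returns are asymmetric: +10 % followed by about -9.09 %.
# Log-returns are symmetric: they cancel over the round trip.
print(simple_returns.round(4).tolist())
print(log_returns.round(4).tolist())
print(round(log_returns.sum(), 12))
```

The simple returns do not sum to zero even though the price ends where it started, whereas the log-returns do.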
plot_acf(data['log_return'].dropna())
plt.savefig('dacf.png')
plt.show()

plot_pacf(data['log_return'].dropna())
plt.savefig('dpacf.png')  # separate file so the ACF plot is not overwritten
plt.show()
The log-returns appear to have a constant mean, a constant variance and an auto-correlation function that does not change over time, which is what we expect of a stationary series.
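One rough way to probe the constant mean and variance is to compare summary statistics across sub-samples. The sketch below does this for a simulated random-walk log-price and its first difference. The simulated data, the seed and the helper half_stats are assumptions for illustration, and the comparison is an informal check rather than a formal stationarity test.

```python
import numpy as np
import pandas as pd

# Hypothetical log-price: a random walk with i.i.d. Gaussian steps.
rng = np.random.default_rng(1)
steps = rng.normal(loc=0.0, scale=0.01, size=4000)
log_price = pd.Series(np.cumsum(steps))
log_return = log_price.diff().dropna()

def half_stats(s):
    """Mean and standard deviation of the first and second half of s."""
    mid = len(s) // 2
    first, second = s.iloc[:mid], s.iloc[mid:]
    return (first.mean(), first.std()), (second.mean(), second.std())

(level_m1, _), (level_m2, _) = half_stats(log_price)
(ret_m1, ret_s1), (ret_m2, ret_s2) = half_stats(log_return)

print(f"log-price mean by half:  {level_m1:.4f} vs {level_m2:.4f}")
print(f"log-return mean by half: {ret_m1:.5f} vs {ret_m2:.5f}")
print(f"log-return std by half:  {ret_s1:.5f} vs {ret_s2:.5f}")
```

For the differenced series the two halves give nearly identical means and standard deviations, while the level of a random walk typically wanders, so its sub-sample means need not agree.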
The Kaggle notebook accompanying this post.