Financial Econometrics and Empirical Market Microstructure
Time Series Data Mining vs. Risk Management
The major tasks considered by the time series data mining community (Ratanamahatana et al. 2010) are as follows:
- Indexing (Query by Content): Given a query time series Q, and some similarity/dissimilarity measure D(Q;C), find the most similar time series in database DB.
- Clustering: Find natural groupings of the time series in database DB under some similarity/dissimilarity measure D(Q;C).
- Classification: Given an unlabeled time series Q, assign it to one of two or more predefined classes.
- Prediction (Forecasting): Given a time series Q containing n data points, predict the value at time n + 1.
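As a rough illustration of the indexing and classification tasks listed above, the following minimal sketch uses Euclidean distance as the measure D(Q;C). The function names (`euclidean`, `most_similar`, `classify_1nn`) and the toy data are hypothetical and are not taken from the cited literature; nearest-neighbour classification is only one possible way to assign a class.

```python
import math


def euclidean(q, c):
    """Euclidean distance between two equal-length series (assumed measure D)."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, c)))


def most_similar(query, database, dist=euclidean):
    """Indexing: return (key, distance) of the series in `database` closest to `query`."""
    scored = ((key, dist(query, series)) for key, series in database.items())
    return min(scored, key=lambda pair: pair[1])


def classify_1nn(query, labelled, dist=euclidean):
    """Classification: return the label of the nearest labelled series (1-NN)."""
    scored = ((label, dist(query, series)) for series, label in labelled)
    return min(scored, key=lambda pair: pair[1])[0]


# Toy data, purely illustrative.
db = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [4.0, 3.0, 2.0, 1.0],
    "C": [1.0, 1.0, 1.0, 1.0],
}
labelled = [([1.0, 2.0, 3.0, 4.0], "rising"), ([4.0, 3.0, 2.0, 1.0], "falling")]
query = [1.1, 2.1, 2.9, 4.2]

nearest_key, nearest_dist = most_similar(query, db)
predicted_label = classify_1nn(query, labelled)
```

With this toy data the query is closest to series "A" and is labelled "rising". Euclidean distance requires series of equal length, which is one reason elastic measures are of interest.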
These tasks can be used to solve problems in risk management (see Table 1).
Let us discuss some tools for time series data mining.
Dynamic time warping (DTW) (Keogh and Ratanamahatana 2005) is an algorithm for measuring the similarity between two sequences that may vary in time or speed. For instance, it would detect similarities in walking patterns even if the person walks slowly in one video and more quickly in another, or accelerates and decelerates during a single observation.
Table 1. Time series data mining tasks and the corresponding risk management tools

| Time series data mining task | Risk management tool |
| --- | --- |
| Indexing | Benchmarking |
| Clustering | Risk analysis |
| Classification | Risk classification |
| Prediction | Scenario generation |
| Summarization | Risk map |
| Anomaly detection | Hidden risk identification |
| Segmentation | Risk mapping and aggregation into portfolio |
Given two time sequences Q of length n and C of length m, DTW fills an n-by-m matrix of cumulative distances, in which each entry holds the cost of the best partial warping path ending at that cell, using the recursive formula

$$D(i, j) = d(i, j) + \min\{D(i, j-1),\ D(i-1, j),\ D(i-1, j-1)\}, \qquad 1 < i \le n,\ 1 < j \le m \tag{2}$$

where $d(i, j)$ is the distance between $Q_i$ and $C_j$, and $D(1, 1)$ is initialized to $d(1, 1)$. The alignment that results in the minimum distance between the two sequences has value $D(n, m)$.
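The recursion can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation of the cited authors: the function name `dtw_distance` is hypothetical, the absolute difference is assumed as the local distance d, rows index Q and columns index C, and the matrix is padded with an extra row and column of infinities so that cells on the border only see their in-range predecessors.

```python
def dtw_distance(q, c, d=lambda a, b: abs(a - b)):
    """Cumulative DTW distance between series q (length n) and c (length m).

    `d` is the local distance between two points; absolute difference is an
    assumption made for this sketch.
    """
    n, m = len(q), len(c)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")

    inf = float("inf")
    # D[i][j] is the cumulative distance for the first i points of q and the
    # first j points of c; row 0 and column 0 are padding.
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # makes D[1][1] equal to d(1, 1)

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = d(q[i - 1], c[j - 1])
            D[i][j] = cost + min(D[i][j - 1],      # advance along c only
                                 D[i - 1][j],      # advance along q only
                                 D[i - 1][j - 1])  # advance along both
    return D[n][m]


# Toy data: `stretched` is a time-stretched copy of `base`.
base = [0, 1, 2, 1, 0]
stretched = [0, 0, 1, 2, 2, 1, 0]
warped_cost = dtw_distance(base, stretched)
other_cost = dtw_distance([1, 2, 3], [2, 2, 4])
```

Because `stretched` repeats some points of `base` without changing their values, the two series align at zero cumulative cost even though their lengths differ. The double loop takes time proportional to n times m.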