Time Series Indexing

By: Mihalis Tsoukalos

Overview of this book

Time series are everywhere, from financial data and system metrics to weather stations and medical records. Being able to access, search, and compare time series data quickly is essential, and this guide helps you do that by exploring the SAX representation and the iSAX time series index. The book begins by teaching you how to implement the SAX representation and the iSAX index in Python, along with the required theory sourced from academic research papers. The chapters are filled with figures and plots to help you follow the presented topics and understand key concepts. The book balances theory and practice so that you can work with time series and develop time series indexes successfully. Additionally, the presented code can be ported to other modern programming languages, such as Swift, Java, C, C++, Ruby, Kotlin, Go, Rust, and JavaScript. By the end of this book, you'll have learned how to use iSAX and the SAX representation to index and analyze time series data efficiently, and you will be equipped to develop your own time series indexes.
Table of Contents (11 chapters)

How the sliding window size affects the iSAX construction speed

In this section, we are going to continue working with the accessSplit.py utility we developed in the previous chapter to find out whether the sliding window size affects the construction speed of an iSAX index, provided that the remaining iSAX parameters stay the same.

Put simply, we will look at the quality of the resulting iSAX indexes and at whether the sliding window size affects the construction speed. We are going to perform our experiments using the following sliding window sizes: 16, 256, 1024, 4096, and 16384. Every experiment uses the 500k.gz time series from Chapter 4 with 8 segments, a maximum cardinality value of 32, and a threshold value of 500.
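The experiment plan above can be sketched as a small helper script. This is only an illustration and not part of the book's code: the function names `build_command`, `subsequence_count`, and `time_run` are hypothetical, and the only command-line options used are the ones that appear in the accessSplit.py invocation in this section (`-s`, `-c`, `-t`, and `-w`). The sketch also assumes that a sliding window of size w over a series of n values yields n - w + 1 subsequences, and that 500k.gz holds 500,000 values, which its name suggests but this section does not state.

```python
import subprocess
import time

# Sliding window sizes used in the experiments of this section.
WINDOW_SIZES = [16, 256, 1024, 4096, 16384]


def build_command(window, dataset="500k.gz", segments=8,
                  cardinality=32, threshold=500):
    """Return the accessSplit.py command line for one window size.

    Only the sliding window changes between runs; the other iSAX
    parameters keep the values stated in the text.
    """
    return ["./accessSplit.py",
            "-s", str(segments),
            "-c", str(cardinality),
            "-t", str(threshold),
            "-w", str(window),
            dataset]


def subsequence_count(series_length, window):
    """Number of subsequences a sliding window produces (n - w + 1)."""
    return max(series_length - window + 1, 0)


def time_run(command):
    """Run one command and return its wall-clock duration in seconds.

    Assumption: this measures the whole process, so it includes
    interpreter startup and reading the data file, not only the
    construction of the index.
    """
    start = time.perf_counter()
    subprocess.run(command, check=True, capture_output=True)
    return time.perf_counter() - start
```

To run the whole series of experiments, you could loop over `WINDOW_SIZES` and call `time_run(build_command(w))` for each value from the directory that contains accessSplit.py and 500k.gz. Because the number of subsequences barely changes across these window sizes under the stated assumption, any difference in run time would come mostly from the cost of processing longer subsequences rather than from having fewer of them.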

For the window size of 16, the results are the following:

$ ./accessSplit.py -s 8 -c 32 -t 500 -w 16 500k.gz
Max Cardinality: 32 Segments: 8 Sliding Window: 16 Threshold: 500 Default Promotion: False
Number of splits: 1376
Number...