Conference article

Visual-Interactive Preprocessing of Time Series Data

Jürgen Bernard
Fraunhofer IGD Darmstadt, Germany

Tobias Ruppert
Fraunhofer IGD Darmstadt, Germany

Oliver Goroll
Technische Universität Darmstadt, Germany

Thorsten May
Fraunhofer IGD Darmstadt, Germany

Jörn Kohlhammer
Fraunhofer IGD Darmstadt, Germany


Published in: Proceedings of SIGRAD 2012, Interactive Visual Analysis of Data, November 29-30, 2012, Växjö, Sweden

Linköping Electronic Conference Proceedings 81:6, p. 39-48


Published: 2012-11-20

ISBN: 978-91-7519-723-4

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

Time series data is an important data type in many different application scenarios. Consequently, there is a great variety of approaches for analyzing time series data. Within these approaches, different strategies for cleaning, segmenting, representing, normalizing, comparing, and aggregating time series data can be found. When these operations are combined, the time series preprocessing workflow has many degrees of freedom. To define an appropriate preprocessing pipeline, the knowledge of experts from the application domain has to be incorporated into the design process. Unfortunately, these experts often cannot estimate the effects of the chosen preprocessing algorithms and their parameterizations on the time series. We introduce a system for the visual-interactive exploitation of the preprocessing parameter space. In contrast to 'black box'-driven approaches designed by computer scientists based on the requirements of domain experts, our system allows these experts to visual-interactively compose time series preprocessing pipelines by themselves. Visual support is provided to choose the right order and parameterization of the preprocessing steps. We demonstrate the usability of our approach with a case study from the digital library domain, in which time-oriented scientific research data has to be preprocessed to realize a visual search and analysis application.
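To make the point about order and parameterization concrete, the following is a minimal sketch of a composable preprocessing pipeline. It is not the system described in the paper. The step names (`clip_above`, `moving_average`, `min_max_normalize`), the `run_pipeline` helper, and the sample values are all hypothetical and chosen only for illustration. The sketch shows that the same two steps, applied in a different order, can produce different series.

```python
from typing import Callable, List, Sequence

Series = List[float]
Step = Callable[[Series], Series]


def clip_above(limit: float) -> Step:
    """Hypothetical cleaning step: cap every value at `limit`."""
    def step(values: Series) -> Series:
        return [min(v, limit) for v in values]
    return step


def moving_average(window: int) -> Step:
    """Hypothetical smoothing step: trailing mean over `window` samples."""
    def step(values: Series) -> Series:
        out = []
        for i in range(len(values)):
            chunk = values[max(0, i - window + 1): i + 1]
            out.append(sum(chunk) / len(chunk))
        return out
    return step


def min_max_normalize(values: Series) -> Series:
    """Hypothetical normalization step: rescale linearly to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no range; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def run_pipeline(values: Sequence[float], steps: Sequence[Step]) -> Series:
    """Apply the steps one after another, in the given order."""
    result = list(values)
    for step in steps:
        result = step(result)
    return result


# Illustrative series with one large spike (made-up values).
raw = [1.0, 2.0, 100.0, 3.0, 4.0]

# Same steps, same parameters, different order.
clean_then_scale = run_pipeline(raw, [clip_above(10.0), min_max_normalize])
scale_then_clean = run_pipeline(raw, [min_max_normalize, clip_above(10.0)])

# The two orders do not agree on the result.
print(clean_then_scale)
print(scale_then_clean)
```

In this sketch, clipping before normalization keeps the ordinary values spread across the unit range, whereas normalizing first lets the spike compress them toward zero and leaves the clip with nothing to do. Effects of this kind are what a domain expert would need to see in order to choose a step order and parameter values.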

Keywords

I.3.3 [Computer Graphics]: Picture/Image Generation—Line and curve generation
