﻿ 基于多模型的不等长序列数据关联算法<sup>*</sup>
 文章快速检索 高级检索

1. 海军航空工程学院 电子信息工程系, 烟台 264001;
2. 91934 部队, 义乌 322000

Data association algorithm for unequal length sequence based on multiple model
SUN Guidong1, GUAN Xin1, YI Xiao1, ZHAO Jun2
1. Department of Electronics and Information Engineering, Naval Aeronautical and Astronautical University, Yantai 264001, China;
2. 91934 Army, Yiwu 322000, China
Received: 2016-08-10; Accepted: 2016-10-28; Published online: 2016-11-17 17:12
Foundation item: National Natural Science Foundation of China (61032001); Program for New Century Excellent Talents in University (NCET-11-0872)
Corresponding author. GUAN Xin, E-mail:gxtongwin@163.com
Abstract: When dealing with data association for unequal length sequence, single model cannot balance computational accuracy, complexity and disturbance rejection. So a data association algorithm for unequal length sequence based on multiple model (MM) was proposed. The two unequal length sequence similarity measurement model based on sliding window and dynamic time warping (DTW) were selected as the input model of MM, which uses the rate of change between time and similarity of two model as the index to realize the transformation of the two models. It combines both advantages of two models and gets the models' application condition.The unequal length sequence similarity is output after MM as the index to judge the association of the sequence data. Simulation results show the effectiveness of the proposed algorithm for unequal length sequence and analyze the effect of sequence length and fluctuant rate on association result.
Key words: data association     unequal length     sequence similarity     multiple model (MM)     rate of change between time and similarity

1 基本概念 1.1 不等长序列的表示

 (1)

m条序列数据组成的不等长序列数组表示为

 (2)
1.2 序列相似度

 (3)

 (4)

 (5)

 (6)

 (7)
1.3 突变率

 图 1 突变点示意图 Fig. 1 Schematic diagram of fluctuant point

 (8)

2 不等长序列相似度度量模型 2.1 滑动窗口的不等长序列

 (9)

 (10)

 (11)

 (12)
2.2 DTW的不等长突变序列

 (13)

 (14)

 (15)

 (16)

 (17)

 (18)

 (19)

 (20)

3 模型的转换及关联判定 3.1 模型的转换

 (21)

 图 2 模型转换流程图 Fig. 2 Model transformation flowchart

go model 1

output sequence length

else

go model 2

End

 (22)

u(12)=1

u(21)=1

End

3.2 基于相似度度量指标的关联判定

 (23)

 (24)

4 仿真分析 4.1 仿真环境

 (25)

4.2 仿真实验

4.2.1 模型时间复杂度对比

 图 3 2种模型计算时间对比 Fig. 3 Comparison of calculation time between two models

4.2.2 模型相似度计算对比

 图 4 模型相似度随序列长度变化 Fig. 4 Variation of model similarity with sequence length

 图 5 模型相似度随突变率变化 Fig. 5 Variation of model similarity with fluctuant rate

4.2.3 MM算法模型转换判断

 图 6 时似变化比随序列长度的变化 Fig. 6 Variation of rate of change between time and similarity with sequence length

 图 7 不同突变点条件下的模型转换临界点变换 Fig. 7 Change of model transformation critical point under different fluctuant points

 序列 相似度 Q1 Q2 Q3 Q4 S1 0.212 5 0.696 6 0.105 6 0.010 2 S2 0.126 4 0.075 9 0.730 6 0.078 7

5 结论

1) 相对于基于滑动窗口的不等长序列度量模型，基于DTW的不等长序列度量模型对复杂度计算更为敏感。

2) 相对于基于DTW的不等长序列度量模型，基于滑动窗口的不等长序列度量模型对突变率的变化更为敏感。

3) 在序列长度较短阶段MM中基于DTW的不等长序列度量模型起主导作用，在序列长度较长阶段MM中基于滑动窗口的不等长序列度量模型起主导作用，为不等长序列数据的度量和关联提供了一种可行的方法。

 [1] AGRAWAL R, FALOUTSOS C, SWAMI A.Efficient similarity search in sequence databases[C]//Proceedings of 4th International Conference on Foundations of Data Organization and Algorithms.Berlin:Springer, 1993:69-84. [2] RAFIEI D, MENDELZON A O. Querying time series data based on similarity[J]. IEEE Transactions on Knowledge and Data Engineering, 2000, 12 (5): 675–693. DOI:10.1109/69.877502 [3] WANG C Z, WANG X Y.Multilevel filtering for high dimensional nearest neighbor search[C]//Proceedings of ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery.New York:ACM Press, 2000:37-43. [4] KORN F, JAGADISH H V, FALOUTSOS C.Efficiently supporting ad hoc queries in large datasets of time sequences[C]//Proceedings of ACM SIGMOD International Conference on Management of Data.New York:ACM Press, 1997:289-300. [5] HUHTALLA Y, KRKKINEN J, TOIVENEN H.Mining for similarities in aligned time series using wavelets[C]//Proceedings of Data Mining and Knowledge Discovery:Theory, Tools, and Technology.Orlando:[s.n.], 1999:150-160. [6] 张海勤, 蔡庆生. 基于小波变换的时间序列相似模式匹配[J]. 计算机学报, 2003, 26 (3): 373–377. ZHANG H Q, CAI Q S. Time series similarity querying based on wavelets[J]. Computer Journal, 2003, 26 (3): 373–377. (in Chinese) [7] KEOGH E.Data mining and machine learning in time series database[C]//Proceedings of the 5th Industrial Conference on Data Mining (ICDM).Leipzig:[s.n.], 2005. [8] KEOGH E, CHAKRABARTI K, PAZZANI M, et al. Dimensionality reduction for fast similarity search in large time series databases[J]. Journal of Knowledge and Information Systems, 2001, 3 (3): 263–286. DOI:10.1007/PL00011669 [9] SANG W K, SANGH Y P, WEALEY W C.An Index-based approach for similarity search supporting time warping in large sequence databases[C]//Proceedings 17th International Conference on Data Engineering.Washington, D.C.:IEEE Computer Society, 2001:607-614. [10] THANNWIN R, BILSON C, KEOGH E.Searching and mining trillions of time series subsequences under dynamic time warping[C]//Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM Press, 2012:262-270. [11] KEOGH E.Exact indexing of dynamic time warping[C]//Proceedings of the 28th International Conference on Very Large Databases.San Francisco:Morgan Kaufmann, 2002:406-417. [12] RATHT M, MANMATHA R.Lower bounding of dynamic time warping distances for multivariate time series:Technical ReportMM-40[R].Amherst:Center for Intelligent Information Retrieval Technical Report, University of Massachusetts, 2003. [13] KEOGT E, PAZZANI M.Derivative dynamic time warping[C]//Proceedings of the 1st SIAM International Conference on Data Mining.Chicago:SIAM Press, 2001:209-211. [14] KEOGH E, CHAKRABARTI K, PAZZANI M. Locally adaptive dimensionality reduction for indexing large time series databases[J]. ACM Transactions on Database Systems, 2002, 27 (2): 188–228. DOI:10.1145/568518.568520 [15] BERNDT D, CLIFFORD J.Using dynamic time warping to find patterns in time series[C]//AIAA 94 Workshop on Knowledge Discovery in Databases.Reston:AIAA, 1994:359-370. [16] BAR-SHALOM Y, FORTMANN T E. Tracking and data association[M]. San Diego: Academic Press, 1988: 125-127.

文章信息

SUN Guidong, GUAN Xin, YI Xiao, ZHAO Jun

Data association algorithm for unequal length sequence based on multiple model

Journal of Beijing University of Aeronautics and Astronsutics, 2017, 43(8): 1640-1646
http://dx.doi.org/10.13700/j.bh.1001-5965.2016.0658