P6-09: FlexDTW: Dynamic Time Warping With Flexible Boundary Conditions

Irmak Bukey (Pomona College), Jason Zhang (University of Michigan), Timothy Tsai (Harvey Mudd College)*

Subjects (starting with primary): MIR tasks -> alignment, synchronization, and score following; MIR fundamentals and methodology -> music signal processing

Presented In Person: 4-minute short-format presentation

Abstract:

Alignment algorithms like DTW and subsequence DTW assume specific boundary conditions on where an alignment path can begin and end in the cost matrix. In practice, the boundary conditions may not be known a priori or may not satisfy such strict assumptions. This paper introduces an alignment algorithm called FlexDTW that is designed to handle a wide range of boundary conditions. FlexDTW allows alignment paths to start anywhere on the bottom or left edge of the cost matrix (the two edges adjacent to the origin) and to end anywhere on the top or right edge. To compare paths of very different lengths fairly, we use a goodness measure that normalizes the cumulative path cost by the path length. The key insight of FlexDTW is that the Manhattan length of a path can be computed from the path's starting point alone, and that the starting point can be tracked recursively during dynamic programming. We artificially generate a suite of 16 benchmarks based on the Chopin Mazurka dataset to characterize audio alignment performance under a variety of boundary conditions. We show that FlexDTW performs consistently well, comparable to or better than commonly used alignment algorithms, and that under some boundary conditions it is the only system that performs strongly.
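
To make the key insight concrete, here is a minimal Python sketch of length-normalized dynamic programming with flexible boundaries. It is an illustration under stated assumptions, not the authors' implementation: the function name flex_align, the step set {(1,0), (0,1), (1,1)} with weights 1, 1, 2, the "+1" in the length term, and the min_len guard against trivially short paths are all hypothetical choices made for this example. The point it shows is that each cell only needs to remember the start cell of its best path; the Manhattan length then follows directly from the start and end coordinates.

```python
import numpy as np

def flex_align(C, min_len=1):
    """Simplified flexible-boundary alignment sketch (hypothetical, not the paper's code).

    C is a pairwise cost matrix. Paths may start on row 0 or column 0 and
    end on the last row or last column. Returns (start, end, score), where
    score is cumulative cost divided by Manhattan path length.
    """
    n, m = C.shape
    D = np.full((n, m), np.inf)          # cumulative weighted cost of best path into (i, j)
    S = np.zeros((n, m, 2), dtype=int)   # start cell of that best path
    steps = [(1, 0, 1), (0, 1, 1), (1, 1, 2)]  # (di, dj, weight): assumed step set

    for i in range(n):
        for j in range(m):
            if i == 0 or j == 0:
                # Assumption: every cell on the two origin-adjacent edges is a start cell.
                D[i, j] = C[i, j]
                S[i, j] = (i, j)
                continue
            best = np.inf
            for di, dj, w in steps:
                pi, pj = i - di, j - dj
                si, sj = S[pi, pj]
                # Manhattan length is known from the start cell alone.
                length = (i - si) + (j - sj) + 1
                cand = D[pi, pj] + w * C[i, j]
                if cand / length < best:
                    best = cand / length
                    D[i, j] = cand
                    S[i, j] = (si, sj)

    # Pick the best end cell on the top or right edge by normalized cost.
    ends = [(n - 1, j) for j in range(m)] + [(i, m - 1) for i in range(n - 1)]
    best_score, best_start, best_end = np.inf, None, None
    for i, j in ends:
        si, sj = S[i, j]
        length = (i - si) + (j - sj) + 1
        if length < min_len:
            continue  # skip degenerate paths near the corners
        score = D[i, j] / length
        if score < best_score:
            best_score = score
            best_start, best_end = (int(si), int(sj)), (i, j)
    return best_start, best_end, float(best_score)


# Usage: a zero-cost diagonal that starts two columns away from the origin.
C = np.ones((6, 8))
for i in range(6):
    C[i, i + 2] = 0.0
start, end, score = flex_align(C, min_len=4)
print(start, end, score)
```

On this toy matrix the sketch recovers the offset diagonal, starting at (0, 2) and ending at (5, 7), which a standard DTW constrained to begin at the origin could not follow at zero cost. Choosing each transition by normalized rather than raw cumulative cost is a greedy simplification here; the paper should be consulted for the exact recursion and step weights.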
