I was pondering the usefulness of noise removal based on several copies of a recording, possibly from different physical copies. A Google search reveals that such a technique is already patented:
The present invention provides a method for reducing noise using a plurality of recording copies. The present invention produces a master file with lower noise than the available recording copies, and avoids the problems of losing musical content caused by prior art pop and click removers. The system comprises a recording playback unit, a computer system with a sound input capability, and a high capacity storage system such as a CD recorder. In operation, a plurality of recording copies of a single recording are played on the playback unit. These recordings are digitized by the computer and a separate recording file is formed for each copy of the recording. The recording files are then synchronized. The samples from each of the recording files are then averaged to reduce the noise components. A variety of threshold comparison techniques can be employed to eliminate samples and/or recording files that are outside of a computed range for that sample.
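The core of the patented idea (synchronize the copies, average per-sample, and drop samples outside a computed range) can be sketched in a few lines of NumPy. This is only my own illustration of the averaging-with-threshold step, not the patent's actual implementation; the function name and the k-sigma rejection rule are my assumptions:

```python
import numpy as np

def fuse_copies(copies, k=2.0):
    """Average several pre-aligned digitized copies of a recording,
    discarding per-sample outliers (clicks and pops) before averaging.

    copies: 2-D array, shape (n_copies, n_samples), already synchronized.
    k: a sample more than k standard deviations from the per-sample
       median is treated as impulsive noise and excluded (my choice of
       threshold rule; the patent only says "threshold comparison").
    """
    copies = np.asarray(copies, dtype=float)
    med = np.median(copies, axis=0)         # robust centre per sample
    sigma = copies.std(axis=0) + 1e-12      # avoid division by zero
    mask = np.abs(copies - med) <= k * sigma  # keep inliers only
    counts = mask.sum(axis=0)
    # Average the surviving samples; fall back to the median where
    # every copy was rejected.
    fused = np.where(counts > 0,
                     (copies * mask).sum(axis=0) / np.maximum(counts, 1),
                     med)
    return fused
```

With three copies of a track where one copy carries a loud click at some sample, the click falls outside the k-sigma band and the fused output at that sample is the average of the two clean copies.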
After doing a little more research I found the MATCH plugin for Sonic Visualiser, which sort of does what I had in mind, though I haven’t figured out how to process the aligned audio to get two aligned files.
This method is described in a 2005 paper by Simon Dixon:
Dynamic time warping finds the optimal alignment of two time series, but it is not suitable for on-line applications because it requires complete knowledge of both series before the alignment can be computed. Further, the quadratic time and space requirements are limiting factors even for off-line systems. We present a novel on-line time warping algorithm which has linear time and space costs, and performs incremental alignment of two series as one is received in real time. This algorithm is applied to the alignment of audio signals in order to follow musical performances of arbitrary length. Each frame of audio is represented by a positive spectral difference vector, emphasising note onsets. The system was tested on various test sets, including recordings of 22 pianists playing music by Chopin, where the average alignment error was 59ms (median 20ms). We demonstrate one application of the system: the analysis and visualisation of musical expression in real time.
I also found another paper by Pablo Sprechmann, Alex Bronstein, Jean-Michel Morel and Guillermo Sapiro who seem to do exactly what I had in mind. This is fresh stuff, and I found no reference to any available software implementation:
A method for removing impulse noise from audio signals by fusing multiple copies of the same recording is introduced in this paper. The proposed algorithm exploits the fact that while in general multiple copies of a given recording are available, all sharing the same master, most degradations in audio signals are record-dependent. Our method first seeks for the optimal non-rigid alignment of the signals that is robust to the presence of sparse outliers with arbitrary magnitude. Unlike previous approaches, we simultaneously find the optimal alignment of the signals and impulsive degradation. This is obtained via continuous dynamic time warping computed solving an Eikonal equation. We propose to use our approach in the derivative domain, reconstructing the signal by solving an inverse problem that resembles the Poisson image editing technique. The proposed framework is here illustrated and tested in the restoration of old gramophone recordings showing promising results; however, it can be used in other applications where different copies of the signal of interest are available and the degradations are copy-dependent.
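To get a feel for why working in the derivative domain helps, here is a toy 1-D sketch. To be clear: this is not the paper's algorithm (it does no joint alignment and no Eikonal equation, and it swaps their inverse-problem reconstruction for a plain cumulative sum). It only illustrates the idea of fusing aligned copies via their first differences, where an impulse is sparse, and then reintegrating:

```python
import numpy as np

def fuse_in_derivative_domain(copies):
    """Toy 1-D sketch, assuming the copies are already aligned:
    fuse the first differences with a per-sample median (robust to
    sparse impulses), then reintegrate -- a crude 1-D analogue of
    the Poisson-style reconstruction mentioned in the abstract.
    """
    copies = np.asarray(copies, dtype=float)
    grads = np.diff(copies, axis=1)          # derivative domain
    fused_grad = np.median(grads, axis=0)    # robust fusion of gradients
    # Reintegrate; anchor the constant of integration with the
    # median of the copies' first samples.
    start = np.median(copies[:, 0])
    return np.concatenate([[start], start + np.cumsum(fused_grad)])
```

A click confined to one copy produces two large, isolated spikes in that copy's difference signal, which the per-sample median across copies simply votes away before the signal is rebuilt.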
If you happen to be in Vancouver at the end of May, you can attend their presentation at ICASSP 2013.
I’m curious how much difference this would make in practice compared with traditional statistical approaches. In theory it would be very interesting for getting the most out of tango recordings whose masters are long gone, and it would cut down the man-hours needed for manual restoration. However, it would require extra effort, since each track would have to be digitized from multiple physical copies; not to mention that you would need access to more than one physical copy of each recording in the first place. Something to think about for a project like TangoVia: multiple copies of the same material might actually be useful.