Speech watermarking has attracted attention as a means of realizing these countermeasures against speech tampering and spoofing [3,4]. This technique aims to protect digital speech content by embedding an inaudible security code into a speech signal and then detecting that code from the watermarked speech signal. In general, speech watermarking methods must meet four requirements to provide a useful and reliable form of audio/speech watermarking [2,5]: (1) inaudibility (the embedded data are inaudible to humans and cause no perceptible sound distortion), (2) robustness (the embedded data are not affected by processing such as data compression or by malicious attacks), (3) blind-detectability (the embedded data can be detected with high probability without using the original or reference signal), and (4) confidentiality (the embedded data are concealed securely and undetectably).
Other state-of-the-art audio watermarking methods include singular value decomposition (SVD) with dither modulation quantization, which is a type of quantization index modulation (QIM) [8,9], and various phase modulation techniques [5,10]. Although these methods are strong with respect to one or two requirements (e.g., inaudibility and robustness or blind-detectability), they cannot satisfy all four requirements simultaneously because they are fragile against speech codecs and sensitive to frame desynchronization attacks. This suggests that a speech watermarking method built on typical audio watermarking techniques must be reconsidered in order to achieve robustness against speech codecs and to ensure blind detection with frame synchronization.
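
To illustrate the QIM principle mentioned above, the following is a minimal sketch of scalar QIM applied directly to sample values. It is not the SVD-based scheme of [8,9]; the function names `qim_embed` and `qim_detect` and the quantization step `step` are hypothetical and chosen only for illustration. Each bit selects one of two interleaved quantization lattices, and detection is blind because it needs only the received samples and the step size.

```python
import random


def qim_embed(samples, bits, step=0.1):
    """Embed one bit per sample by quantizing onto one of two lattices.

    Bit 0 uses the lattice {k * step}; bit 1 uses {k * step + step / 2}.
    """
    marked = []
    for x, b in zip(samples, bits):
        offset = b * step / 2.0
        marked.append(round((x - offset) / step) * step + offset)
    return marked


def qim_detect(samples, step=0.1):
    """Blind detection: choose the lattice whose nearest point is closest."""
    bits = []
    for y in samples:
        d0 = abs(y - round(y / step) * step)
        d1 = abs(y - (round((y - step / 2.0) / step) * step + step / 2.0))
        bits.append(0 if d0 <= d1 else 1)
    return bits


if __name__ == "__main__":
    rng = random.Random(0)
    host = [rng.uniform(-1.0, 1.0) for _ in range(16)]
    payload = [rng.randint(0, 1) for _ in range(16)]
    marked = qim_embed(host, payload)
    # Perturbation kept below step / 4, the decision margin of this scheme.
    noisy = [m + rng.uniform(-0.02, 0.02) for m in marked]
    print(qim_detect(noisy) == payload)
```

In this sketch the embedding distortion is bounded by `step / 2` per sample, and detection tolerates perturbations smaller than `step / 4`. A larger step therefore trades inaudibility for robustness. The sketch assumes that the detector sees samples aligned with the embedder, so it does not address the speech-codec and frame-desynchronization issues discussed above.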