United States Patent Application 
20170345441

Kind Code

A1

Belhomme; Arthur
; et al.

November 30, 2017

Method And Device For Estimating A Dereverberated Signal
Abstract
A method for estimating an instantaneous phase of a dereverberated acoustic
signal, the method comprising the following steps: measurement of an
acoustic signal reverberated by propagation in a medium, estimation of at
least one short-term Fourier transform of the reverberated acoustic
signal with at least one window function, calculation of at least one
instantaneous frequency of the dereverberated signal from said short-term
Fourier transform and from an influencing factor of the medium, said
influencing factor being a function of a reverberation time of said
medium, determination of at least one instantaneous phase of the
dereverberated signal by integrating the instantaneous frequency of the
dereverberated signal over time.
Inventors: 
Belhomme; Arthur; (Paris, FR)
; Badeau; Roland; (Paris, FR)
; Grenier; Yves; (Magny Les Hameaux, FR)
; Humbert; Eric; (Boulogne Billancourt, FR)

Applicant: INVOXIA (Issy Les Moulineaux, FR)
 
Assignee: 
INVOXIA
Issy Les Moulineaux
FR

Family ID:

1000002669761

Appl. No.:

15/604997

Filed:

May 25, 2017 
Current U.S. Class: 
1/1 
Current CPC Class: 
G10L 21/0264 20130101; G10L 2021/02082 20130101; H04R 3/04 20130101; G10L 21/0232 20130101 
International Class: 
G10L 21/0264 20130101 G10L021/0264; G10L 21/0232 20130101 G10L021/0232; H04R 3/04 20060101 H04R003/04 
Foreign Application Data
Date  Code  Application Number 
May 25, 2016  FR  16 54713 
Feb 9, 2017  FR  17 51073 
Claims
1. A method for estimating an instantaneous phase of dereverberated
acoustic signal, the method comprising the following steps: (a)
measurement of an acoustic signal reverberated by propagation in a
medium, (b) estimation of at least one short-term Fourier transform of
the reverberated acoustic signal with at least one window function, (c)
calculation of at least one instantaneous frequency of dereverberated
signal from said short-term Fourier transform and from an influencing
factor of the medium, said influencing factor being a function of a
reverberation time of said medium, (d) determination of at least one
instantaneous phase of dereverberated signal by integrating the
instantaneous frequency of dereverberated signal over time.
2. The method according to claim 1, wherein, for calculating at least one
instantaneous frequency of dereverberated signal from said short-term
Fourier transform: for each frequency band k among a plurality of N
frequency bands, a smoothed instantaneous frequency of the reverberated
signal in said frequency band k and a rate of change over time of said
smoothed instantaneous frequency of the reverberated signal are
estimated, an instantaneous frequency of dereverberated signal in said
frequency band k is calculated from said smoothed instantaneous frequency
of the reverberated acoustic signal, the rate of change over time of said
smoothed instantaneous frequency of the reverberated signal, and the
influencing factor of the medium, and wherein an instantaneous phase of
dereverberated signal is determined in said frequency band k by
integrating the instantaneous frequency of dereverberated signal in
frequency band k over time.
3. The method according to claim 2, wherein the influencing factor of the
medium is given by:
$$R(t) = \frac{1}{2\delta} + \frac{\min(t, T_h)}{1 - e^{2\delta \min(t, T_h)}}$$
where $\delta$ and $T_h$ are respectively a damping factor and a duration
of an exponential decay $p(t) = e^{-\delta t}\,\mathbf{1}_{[0,T_h]}(t)$ of
the impulse response of the medium, and wherein the damping factor
$\delta$ is calculated from a reverberation time measured in the medium,
in particular an $RT_{60}$ reverberation time, for example such that
$\delta = 3\log(10)/RT_{60}$.
4. The method according to claim 2, wherein, for estimating a smoothed
instantaneous frequency of the reverberated signal for each frequency
band k among the plurality of N frequency bands, a reassigned vocoder
algorithm is applied.
5. The method according to claim 2, wherein, for calculating
said at least one instantaneous frequency of dereverberated signal, a
correction factor is determined by multiplying the rate of change over
time of the smoothed instantaneous frequency of the reverberated signal
by the influencing factor of the medium, in particular wherein said
correction factor is added to said smoothed instantaneous frequency of
the reverberated acoustic signal.
6. The method according to claim 1, wherein, for calculating at least one
instantaneous frequency of dereverberated signal from said short-term
Fourier transform: a plurality of quadratic terms of said at least one
short-term Fourier transform is calculated for each frequency band k
among a plurality of N frequency bands and for each time period m among a
plurality of time periods, and for each frequency band k and each moment
of time m, an instantaneous frequency of the dereverberated signal and a
rate of change over time of said instantaneous frequency of the
dereverberated signal are determined, by calculating a first derivative
and a second derivative of a dual parameter solution of a linear system
whose coefficients are based on said plurality of quadratic terms and the
influencing factor of the medium, said instantaneous frequency of the
dereverberated signal being an imaginary part of the first derivative of
the dual parameter and said rate of change over time being an imaginary
part of the second derivative of the dual parameter, in particular a
matrix constructed from said plurality of quadratic terms and from the
influencing factor of the medium is inverted in order to solve said
linear system.
7. The method according to claim 6, wherein at least five short-term
Fourier transforms of the reverberated acoustic signal are respectively
estimated with a first window function, a second window function which is
a first derivative of the first window function, a third window function
which is a second derivative of the first window function, a fourth
window function which is a product of the first window function and a
function linearly increasing over time, and a fifth window function which
is a first derivative of the fourth window function, and wherein said
plurality of quadratic terms are calculated from said at least five
short-term Fourier transforms.
8. The method according to claim 6, wherein for each frequency
band k and each moment of time m, an instantaneous amplitude of the
dereverberated signal is determined from said plurality of quadratic
terms, as are first and second derivatives of the dual parameter for each
frequency band k and each moment of time m.
9. The method according to claim 6, wherein, for determining
at least one instantaneous phase of dereverberated signal for a frequency
band k, a preceding frequency band k' is determined so as to minimize a
difference between the central frequencies of the window functions
$g_i(t)$ and an estimated frequency in frequency band k, and an
instantaneous frequency of dereverberated signal and a rate of change of
said instantaneous frequency of dereverberated signal are integrated for
said preceding frequency band k'.
10. A device for estimating an instantaneous phase of dereverberated
acoustic signal, comprising: measurement means for capturing at least one
acoustic signal reverberated by propagation in a medium, means for
estimating at least one short-term Fourier transform of the reverberated
acoustic signal with at least one window function, means for calculating
at least one instantaneous frequency of dereverberated signal from said
short-term Fourier transform and from an influencing factor of the
medium, said influencing factor being a function of a reverberation time
of said medium, means for determining at least one instantaneous phase of
dereverberated signal by integrating the instantaneous frequency of
dereverberated signal over time.
Description
FIELD OF THE INVENTION
[0001] The present invention relates to methods and devices for estimating
a dereverberated signal.
BACKGROUND OF THE INVENTION
[0002] When an original acoustic signal is emitted in a reverberant medium
then picked up by a microphone, the microphone picks up a reverberated
signal that is dependent on the reverberant medium.
[0003] In the following, the term "anechoic acoustic signal" is understood to
mean the original acoustic signal that is not reverberated by a medium.
An anechoic acoustic signal can sometimes be directly recorded by a
microphone, for example when the original acoustic signal is emitted in
an anechoic chamber.
[0004] However, under common recording conditions, a microphone records a
reverberated acoustic signal which is a signal consisting of the original
acoustic signal received directly, but also reflections of the original
acoustic signal on the reverberant elements of the medium, for example
the walls of a room.
[0005] Strong acoustic reverberation of the medium can be particularly
bothersome since it degrades the quality of the recorded sound and
reduces speech intelligibility and speech recognition by machines.
[0006] To solve this problem, methods and devices are known for
reconstructing the amplitude of a dereverberated signal from an acoustic
signal reverberated by a medium.
[0007] In the present application, "dereverberated signal" means an
estimate of the original acoustic signal, or anechoic signal, obtained by
analog or digital processing of a reverberated acoustic signal recorded
by a microphone.
[0008] By way of example, patent US201603667 describes a dereverberation
method which reconstructs a dereverberated signal from an acoustic signal
reverberated by a medium, by calculating the amplitude of the
dereverberated signal in several frequency bands.
[0009] There is a need to further improve the performance of such methods
by more accurately estimating the characteristics of the dereverberated
signal from a reverberated acoustic signal recorded by a microphone.
[0010] Another method is described in the paper "Restoration of
instantaneous amplitude and phase of speech signal in noisy reverberant
environments" by Yang Liu et al., published in the proceedings of the 23rd
European Signal Processing Conference. This paper describes a supervised
method for training a Kalman filter to reconstruct the phase and
amplitude of a dereverberated signal using a training database consisting
of a pair of reverberant and anechoic signals. Such a database, however,
is complicated to collect and the results obtained are highly dependent
on the quality of the training database and on the fit between the types
of reverberations present in the signals of the training database and the
reverberations appearing in the actual applications. In addition, the
Kalman filter dereverberation method described in that document only
allows for linear amplitude and phase modulations, meaning those in which
the temporal derivatives of the amplitude and of the phase,
dereverberated, are constant over time.
[0011] The present invention improves this situation.
OBJECTS AND SUMMARY OF THE INVENTION
[0012] To this end, a first object of the invention is a method for
estimating an instantaneous phase of dereverberated acoustic signal. The
method comprises the following steps:
[0013] (a) measurement of an acoustic signal reverberated by propagation
in a medium,
[0014] (b) estimation of at least one short-term Fourier transform of the
reverberated acoustic signal with at least one window function,
[0015] (c) calculation of at least one instantaneous frequency of
dereverberated signal from said short-term Fourier transform and from an
influencing factor of the medium, said influencing factor being a
function of a reverberation time of said medium, and
[0016] (d) determination of at least one instantaneous phase of
dereverberated signal by integrating the instantaneous frequency of
dereverberated signal over time.
[0017] In preferred embodiments of the invention, one or more of the
following arrangements may be used:
[0018] For calculating at least one instantaneous frequency of
dereverberated signal from said short-term Fourier transform:
[0019] for each frequency band k among a plurality of N frequency bands, a
smoothed instantaneous frequency of the reverberated signal in said
frequency band k and a rate of change over time of said smoothed
instantaneous frequency of the reverberated signal are estimated,
[0020] an instantaneous frequency of dereverberated signal in said
frequency band k is calculated from said smoothed instantaneous frequency
of the reverberated acoustic signal, the rate of change over time of said
smoothed instantaneous frequency of the reverberated signal, and the
influencing factor of the medium,
[0021] and an instantaneous phase of dereverberated signal is determined
in said frequency band k by integrating the instantaneous frequency of
dereverberated signal in frequency band k over time;
[0022] The influencing factor of the medium is given by:
$$R(t) = \frac{1}{2\delta} + \frac{\min(t, T_h)}{1 - e^{2\delta \min(t, T_h)}}$$
where $\delta$ and $T_h$ are respectively a damping factor and a
duration of an exponential decay
$p(t) = e^{-\delta t}\,\mathbf{1}_{[0,T_h]}(t)$ of
the impulse response of the medium, and the damping factor $\delta$ is
calculated from a reverberation time measured in the medium, in
particular an $RT_{60}$ reverberation time, for example such that
$\delta = 3\log(10)/RT_{60}$;
[0023] For estimating a smoothed instantaneous frequency of the
reverberated signal for each frequency band k among the plurality of N
frequency bands, a reassigned vocoder algorithm is applied;
[0024] For calculating said at least one instantaneous frequency of
dereverberated signal, a correction factor is determined by multiplying
the rate of change over time of the smoothed instantaneous frequency of
the reverberated signal by the influencing factor of the medium,
[0025] in particular said correction factor is added to said smoothed
instantaneous frequency of the reverberated acoustic signal;
[0026] For calculating at least one instantaneous frequency of
dereverberated signal from said short-term Fourier transform:
[0027] a plurality of quadratic terms of said at least one short-term
Fourier transform is calculated for each frequency band k among a
plurality of N frequency bands and for each time period m among a
plurality of time periods, and
[0028] for each frequency band k and each moment of time m, an
instantaneous frequency of the dereverberated signal and a rate of change
over time of said instantaneous frequency of the dereverberated signal
are determined, by calculating a first derivative and a second derivative
of a dual parameter solution of a linear system whose coefficients are
based on said plurality of quadratic terms and the influencing factor of
the medium, said instantaneous frequency of the dereverberated signal
being an imaginary part of the first derivative of the dual parameter and
said rate of change over time being an imaginary part of the second
derivative of the dual parameter,
[0029] in particular a matrix constructed from said plurality of quadratic
terms and from the influencing factor of the medium is inverted in order
to solve said linear system;
[0030] At least five short-term Fourier transforms of the reverberated
acoustic signal are respectively estimated with a first window function,
a second window function which is a first derivative of the first window
function, a third window function which is a second derivative of the
first window function, a fourth window function which is a product of the
first window function and a function linearly increasing over time, and a
fifth window function which is a first derivative of the fourth window
function,
[0031] and said plurality of quadratic terms are calculated from said at
least five short-term Fourier transforms;
[0032] For each frequency band k and each moment of time m, an
instantaneous amplitude of the dereverberated signal is determined from
said plurality of quadratic terms, as are first and second derivatives of
the dual parameter for each frequency band k and each moment of time m;
[0033] For determining at least one instantaneous phase of dereverberated
signal for a frequency band k, a preceding frequency band k' is
determined so as to minimize a difference between the central frequencies
$f_i$ of the window functions $g_i(t)$ and an estimated frequency in
frequency band k, and an instantaneous frequency of dereverberated signal
and a rate of change of said instantaneous frequency of dereverberated
signal are integrated for said preceding frequency band k'.
[0034] The invention also relates to a device for estimating an
instantaneous phase of dereverberated acoustic signal, comprising:
[0035] measurement means for capturing at least one acoustic signal
reverberated by propagation in a medium,
[0036] means for estimating at least one short-term Fourier transform of
the reverberated acoustic signal with at least one window function,
[0037] means for calculating at least one instantaneous frequency of
dereverberated signal from said short-term Fourier transforms and from an
influencing factor of the medium, said influencing factor being a
function of a reverberation time of said medium,
[0038] means for determining at least one instantaneous phase of
dereverberated signal by integrating the instantaneous frequency of
dereverberated signal over time.
BRIEF DESCRIPTION OF THE DRAWINGS
[0039] Other features and advantages of the invention will become apparent
from the following description of one of its embodiments, given by way of
non-limiting example, with reference to the accompanying drawings.
[0040] In the drawings:
[0041] FIG. 1 is a schematic view illustrating the reverberation of sound
in a room when a subject is speaking such that his speech is picked up by
a device according to an embodiment of the invention,
[0042] FIG. 2 is a schematic diagram of the device of FIG. 1, and
[0043] FIG. 3 is a flowchart of a method for reconstructing a
dereverberated signal according to an embodiment of the invention, in
particular making use of a method for estimating an instantaneous phase
of dereverberated signal according to one embodiment of the invention.
DETAILED DESCRIPTION
[0044] In the various figures, the same references designate identical or
similar elements.
[0045] The aim of the invention is to estimate an instantaneous phase of
dereverberated acoustic signal from a measurement of an acoustic signal
reverberated by propagation in a medium 7, for example a room of a
building as shown schematically in FIG. 1.
[0046] The invention thus makes it possible to process the acoustic
signals picked up by an electronic device 1 which has a microphone 2. The
electronic device 1 may for example be a telephone in the example shown,
or a computer or some other device.
[0047] When a sound is emitted in the medium 7, for example by a person,
this sound propagates to the microphone 2 along various paths, either
directly or after reflection on one or more walls 5, 6 of the medium 7.
[0048] As shown in FIG. 2, the electronic device 1 may comprise for
example a central processing unit 8 such as a processor or other,
connected to the microphone 2 and to various other elements, including
for example a speaker 9, a keyboard 10, and a screen 11. The central
processing unit 8 can communicate with an external network 12, for
example a telephone network.
[0049] The invention enables the electronic device 1 to estimate an
instantaneous phase of dereverberated acoustic signal.
[0050] In a first application which is of primary interest, the
instantaneous phase of dereverberated signal can be used to reconstruct a
dereverberated signal from a reverberated acoustic signal.
[0051] For this purpose, an acoustic signal that is reverberated by
propagation in the medium is first measured.
[0052] Then, a dereverberated signal amplitude spectrum is determined for
a plurality of N frequency bands, from the reverberated acoustic signal.
[0053] Numerous methods for determining a dereverberated signal amplitude
spectrum from a reverberated acoustic signal are known from the prior
art.
[0054] These methods consist, for example, of estimating a reverberation
spectrum from the reverberated acoustic signal and then subtracting said
reverberation spectrum from the reverberated acoustic signal.
[0055] Methods are therefore known for determining a dereverberated signal
amplitude spectrum using:
[0056] long-term prediction as described in the paper "Suppression of late
reverberation effect on speech signal using long-term multiple-step
linear prediction" by K. Kinoshita, M. Delcroix, T. Nakatani, and M.
Miyoshi, published in IEEE Transactions on Audio, Speech, and Language
Processing, vol. 17, no. 4, pp. 534-545, May 2009,
[0057] stochastic modeling of the impulse response of the medium as
described in "A new method based on spectral subtraction for speech
dereverberation" by K. Lebart and J. M. Boucher, published in ACUSTICA,
vol. 87, no. 3, pp. 359-366, 2001, or
[0058] deep neural networks as described in "Speech dereverberation for
enhancement and recognition using dynamic features constrained deep
neural networks and feature adaptation" by X. Xiao, S. Zhao, D. H. Ha
Nguyen, X. Zhong, D. L. Jones, E. S. Chang, and H. Li, published in
EURASIP Journal on Advances in Signal Processing, vol. 2016, no. 1, pp.
1-18, 2016.
[0059] In these prior art methods, a dereverberated signal is then
reconstructed from the obtained dereverberated signal amplitude spectrum
and the phase of the reverberated signal.
[0060] There is, however, a need to further improve the quality and
intelligibility of the dereverberated signal obtained by this method.
[0061] For this purpose, according to the invention, an instantaneous
phase of dereverberated signal for each frequency band k among the
plurality of N frequency bands is determined from the reverberated
acoustic signal by means of a method as described hereinafter.
[0062] Then, a dereverberated signal is reconstructed from the
dereverberated signal amplitude spectrum and from the estimated phase
using the method according to the invention.
[0063] In this manner, a reconstructed dereverberated signal that is
clearly of higher quality is obtained.
[0064] The instantaneous phase of dereverberated signal determined by the
method according to the invention can also have uses other than
reconstruction of the dereverberated signal, and can be used for example
to improve the quality and precision of a sound source location algorithm
as known in the literature.
[0065] It is known that the reverberant medium can be modeled by a
stochastic model by defining an impulse response h(t) of the form:
$$h(t) = b(t)\,p(t) \tag{1}$$
where $b(t) \sim \mathcal{N}(0, \sigma^2)$ is white noise with a centered
Gaussian distribution of variance $\sigma^2$, and
$p(t) = e^{-\delta t}\,\mathbf{1}_{[0,T_h]}(t)$ is an exponential decay of
the impulse response of the medium, where $\delta$ and $T_h$ are
respectively a damping factor and a duration of the impulse response of
the medium.
[0066] Such a stochastic model is described, for example, in the thesis of
J. D. Polack, "Transmission of sound energy in concert halls", which was
defended at the Universite du Maine in 1988.
[0067] The damping factor $\delta$ and the duration of the impulse response
$T_h$ can be determined from a reverberation time measured in the
medium.
[0068] A commonly used reverberation time is the 60 dB reverberation time,
denoted $RT_{60}$. The 60 dB reverberation time is the time required for
the energy decay curve (EDC) to decrease by 60 dB.
[0069] For example, the 60 dB reverberation time can be defined by the
inverse integration method of Manfred R. Schroeder (New Method of
Measuring Reverberation Time, The Journal of the Acoustical Society of
America, 37(3): 409, 1965) by the energy decay curve
$EDC(n) = \sum_{k=n}^{N_h} h(k)^2$, where h is the impulse
response of a medium of length $N_h$ and n is a time index, for example
a number of samples obtained by sampling at constant time intervals, n
being between 1 and $N_h$. $RT_{60}$ is then the time, at time index n,
required for EDC(n) to decrease by 60 dB.
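By way of illustration only, the inverse integration described above can be sketched in Python as follows. The function names, the sampling frequency and the noiseless synthetic impulse response are assumptions made for this example and are not part of the described method. The sketch simply returns the first time index at which the EDC has dropped by 60 dB.

```python
import math

def energy_decay_curve(h):
    """Schroeder backward integration: EDC(n) = sum of h(k)^2 for k >= n."""
    edc = [0.0] * len(h)
    tail = 0.0
    for n in range(len(h) - 1, -1, -1):
        tail += h[n] * h[n]
        edc[n] = tail
    return edc

def rt60_from_impulse_response(h, fs, drop_db=60.0):
    """Time (in seconds) at which the EDC has decreased by drop_db."""
    edc = energy_decay_curve(h)
    for n, energy in enumerate(edc):
        if energy <= 0.0 or 10.0 * math.log10(energy / edc[0]) <= -drop_db:
            return n / fs
    return None  # the response is too short to show the requested drop

# Hypothetical test signal: a pure exponential envelope with RT60 = 0.5 s.
fs = 1000.0
rt60_true = 0.5
delta = 3.0 * math.log(10.0) / rt60_true
n_h = int(1.3 * rt60_true * fs)
h = [math.exp(-delta * n / fs) for n in range(n_h)]
rt60_est = rt60_from_impulse_response(h, fs)
```

On a measured impulse response the noise floor usually prevents a direct reading of the 60 dB point, so a line fit on the early part of the decay is a common alternative; that variant is not shown here.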
[0070] Typical values of the $RT_{60}$ reverberation time are, for
example, values between 0.4 s and 2 s.
[0071] Although the $RT_{60}$ reverberation time is most commonly used, it
is also possible to use another reverberation time characteristic of the
medium 7.
[0072] It is then possible to calculate the damping factor of the medium
$\delta$ from the $RT_{60}$ reverberation time by the formula
$\delta = 3\log(10)/RT_{60}$.
[0073] The duration of the impulse response $T_h$ can also be defined
from the reverberation time, for example as $T_h = \alpha \cdot RT_{60}$,
where $\alpha$ can be greater than 1, for example equal to 1.3.
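As a minimal sketch, the two parameters of the decay model can be computed from a reverberation time as follows. The function names are hypothetical. The logarithm is taken to be the natural logarithm, which is the reading under which the envelope $e^{-\delta t}$ has decayed by exactly 60 dB at $t = RT_{60}$.

```python
import math

def damping_factor(rt60):
    """delta = 3 * ln(10) / RT60, in 1/s."""
    return 3.0 * math.log(10.0) / rt60

def impulse_response_duration(rt60, alpha=1.3):
    """T_h = alpha * RT60; alpha = 1.3 is the example value given in the text."""
    return alpha * rt60

def envelope(t, delta, t_h):
    """p(t) = exp(-delta * t) on [0, T_h], zero elsewhere."""
    return math.exp(-delta * t) if 0.0 <= t <= t_h else 0.0

rt60 = 0.6  # seconds, an arbitrary value inside the typical 0.4 s to 2 s range
delta = damping_factor(rt60)
t_h = impulse_response_duration(rt60)
# Amplitude decay of the envelope at t = RT60, expressed in dB.
decay_db = 20.0 * math.log10(envelope(rt60, delta, t_h))
```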
[0074] However, the damping factor of the medium $\delta$ and the duration
of the impulse response $T_h$ can also be calculated by other methods
known from the prior art.
[0075] From the statistical model given by equation (1), the reverberated
acoustic signal can be linked to the anechoic acoustic signal by the
convolution equation:
y(t)=(h*s)(t) (2)
[0076] where y(t) is the reverberated acoustic signal and s(t) is the
anechoic acoustic signal.
[0077] The instantaneous phase of the reverberated signal can also be
expressed as a function of the Hilbert transform of the reverberated
signal, as:
$$\phi_{\mathrm{rev}}(t) = \arctan\left(\frac{\hat{y}(t)}{y(t)}\right) \tag{3}$$
[0078] where $\phi_{\mathrm{rev}}(t)$ is the instantaneous phase of the
reverberated signal and $\hat{y}(t)$ is the Hilbert transform of the
reverberated signal.
[0079] It is also possible to link the instantaneous frequency of the
reverberated signal to the instantaneous phase of the reverberated signal
by the expression:
$$f_{\mathrm{rev}}(t) = \frac{1}{2\pi}\,\frac{d\phi_{\mathrm{rev}}}{dt}(t) \tag{4}$$
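Equations (3) and (4) can be illustrated by the following sketch, which relies only on the Python standard library. The naive discrete Fourier transform, the function names and the test tone are assumptions made for the example. The Hilbert transform is obtained as the imaginary part of the analytic signal, and a four-quadrant arctangent replaces the plain arctangent of equation (3) so that the phase is defined over a full turn.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform, adequate for short examples."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for idx in range(n):
            acc += x[idx] * cmath.exp(sign * 2j * math.pi * k * idx / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(y):
    """y + i * Hilbert(y), built by zeroing negative frequencies (even length)."""
    n = len(y)
    spectrum = dft(y)
    for k in range(1, n):
        if k < n // 2:
            spectrum[k] *= 2.0
        elif k > n // 2:
            spectrum[k] = 0j
    return dft(spectrum, inverse=True)

def instantaneous_phase(y):
    """Equation (3), using atan2(Hilbert(y), y)."""
    return [math.atan2(z.imag, z.real) for z in analytic_signal(y)]

def instantaneous_frequency(phase, fs):
    """Equation (4) as a finite difference of the unwrapped phase, in Hz."""
    freqs = []
    for a, b in zip(phase, phase[1:]):
        step = (b - a + math.pi) % (2.0 * math.pi) - math.pi
        freqs.append(step * fs / (2.0 * math.pi))
    return freqs

# Hypothetical test tone: 16 Hz cosine sampled at 256 Hz for one second.
fs = 256.0
y = [math.cos(2.0 * math.pi * 16.0 * n / fs) for n in range(256)]
freqs = instantaneous_frequency(instantaneous_phase(y), fs)
```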
[0080] In a first embodiment of the invention, one can first estimate the
rate of change over time of the smoothed instantaneous frequency of the
reverberated signal. One can then determine the instantaneous frequency
of the anechoic signal as a function of the expected value of the
instantaneous frequency of the reverberated signal based on equations (1)
to (4), as:
$$f(t) = E[f_{\mathrm{rev}}(t)] + \dot{f}\left(\frac{1}{2\delta} + \frac{\min(t, T_h)}{1 - e^{2\delta \min(t, T_h)}}\right) \tag{5}$$
[0081] where $f(t)$ is the instantaneous frequency of the anechoic signal
estimated at time t, $E[f_{\mathrm{rev}}(t)]$ is the expected value of the
instantaneous frequency of the reverberated signal at time t, and
$\dot{f}$ is the rate of change over time of the instantaneous frequency
of the reverberated signal.
[0082] The expected value of the instantaneous frequency of the
reverberated signal at time t cannot be measured but can be approximated
by temporal smoothing of the instantaneous frequency of the measured
reverberated signal.
[0083] It is thus possible to estimate an instantaneous frequency of a
dereverberated signal as a function of an instantaneous frequency of the
reverberated signal based on equations (1) to (5), as:
$$\tilde{f}(t) = \overline{f_{\mathrm{rev}}(t)} + \dot{f}\left(\frac{1}{2\delta} + \frac{\min(t, T_h)}{1 - e^{2\delta \min(t, T_h)}}\right) \tag{6}$$
[0084] where $\tilde{f}(t)$ is the instantaneous frequency of the
estimated dereverberated signal at time t, $\overline{f_{\mathrm{rev}}(t)}$
is a smoothed instantaneous frequency of the reverberated signal at time t
(now the SIFT is smoothed directly), and $\dot{f}$ is the rate of change
over time of the smoothed instantaneous frequency of the reverberated
signal. Equation (6) makes it possible to estimate an instantaneous
frequency of the dereverberated signal as a function of the smoothed
instantaneous frequency of the reverberated signal, the rate of change
over time of the instantaneous frequency, and an influencing factor of
the medium R, given by
$$R(t) = \frac{1}{2\delta} + \frac{\min(t, T_h)}{1 - e^{2\delta \min(t, T_h)}} \tag{7}$$
[0085] We can thus rewrite equation (6) as:
$$\tilde{f}(t) = \overline{f_{\mathrm{rev}}(t)} + \dot{f}\,R(t) \tag{8}$$
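A minimal sketch of equations (7) and (8) is given below. It assumes the reading $R(t) = 1/(2\delta) + \min(t, T_h)/(1 - e^{2\delta\min(t, T_h)})$ of equation (7), which tends to t/2 for small t and to $1/(2\delta)$ when the decay is long. The function names and the numerical values are hypothetical.

```python
import math

def influencing_factor(t, delta, t_h):
    """Equation (7) under the reading stated above, in seconds."""
    tau = min(t, t_h)
    if tau <= 0.0:
        return 0.0  # limit of the expression when t tends to 0
    x = 2.0 * delta * tau
    # tau / (1 - e^x) rewritten as tau * e^(-x) / (e^(-x) - 1) to avoid overflow.
    return 1.0 / (2.0 * delta) + tau * math.exp(-x) / math.expm1(-x)

def dereverberated_frequency(f_rev_smoothed, f_dot, t, delta, t_h):
    """Equation (8): smoothed reverberated frequency plus the correction f_dot * R(t)."""
    return f_rev_smoothed + f_dot * influencing_factor(t, delta, t_h)

# Hypothetical values: RT60 = 0.6 s, T_h = 1.3 * RT60.
delta = 3.0 * math.log(10.0) / 0.6
t_h = 1.3 * 0.6
r_late = influencing_factor(10.0, delta, t_h)
# A tone measured at 440 Hz and rising at 100 Hz/s is corrected upwards.
f_tilde = dereverberated_frequency(440.0, 100.0, 10.0, delta, t_h)
```

With these values the correction amounts to about 4.3 Hz, which is consistent with viewing the reverberated frequency as a delayed version of the anechoic frequency.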
[0086] An instantaneous phase of the dereverberated signal
$\tilde{\phi}(t)$ can subsequently be determined by temporal integration,
as:
$$\tilde{\phi}(t) = 2\pi \int_0^t \tilde{f}(\tau)\,d\tau + \tilde{\phi}(0) \tag{9}$$
where $\tilde{\phi}(0)$ is an initial phase of the dereverberated
signal.
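The temporal integration of equation (9) can be sketched for sampled frequencies as follows. The trapezoidal rule, the function name and the test chirp are choices made for this example and are not prescribed by the description.

```python
import math

def integrate_phase(freqs_hz, dt, phase0=0.0):
    """Cumulative trapezoidal integration of 2*pi*f(t), starting from phase0."""
    phase = [phase0]
    for f_prev, f_next in zip(freqs_hz, freqs_hz[1:]):
        phase.append(phase[-1] + 2.0 * math.pi * 0.5 * (f_prev + f_next) * dt)
    return phase

# Hypothetical linear chirp f(t) = 100 + 50 t (Hz) over one second.
dt = 0.001
freqs = [100.0 + 50.0 * n * dt for n in range(1001)]
phase = integrate_phase(freqs, dt)
```

For a frequency that varies linearly with time the trapezoidal rule is exact, so the final phase equals $2\pi(100 + 25)$ radians up to rounding.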
[0087] The frequency and phase of the dereverberated signal which are
estimated by means of equations (6) to (9) are therefore estimates of the
frequency and phase of the original acoustic signal or anechoic signal.
[0088] The tests carried out by the inventors indicate that these
estimates are particularly good because they lead to a dereverberated
signal of a quality clearly superior to the prior art.
[0089] Such a method can be further improved by directly determining both
the instantaneous frequency of the dereverberated signal and the rate of
change of the instantaneous frequency of the dereverberated signal.
[0090] This makes it possible to estimate more precisely both the phase
and amplitude of the dereverberated signal.
[0091] For this purpose, several discrete short-term Fourier transforms
of the reverberated signal y(t) are calculated for several associated
window functions.
[0092] More precisely, a first window function $g_k(t)$ is defined for
each frequency band k among a plurality of N frequency bands,
$k \in [0, N-1]$, and for any time t, $t \in \mathbb{R}$. The window
function $g_k(t)$ is a complex response function of an analog bandpass
filter centered on a frequency $f_k$. Then a second, third, fourth, and
fifth window function are further defined from the first window function
as follows:
[0093] The second window function $\dot{g}_k(t)$ is a first derivative of
the first window function,
[0094] The third window function $\ddot{g}_k(t)$ is a second
derivative of the first window function,
[0095] The fourth window function $g'_k(t) = t \cdot g_k(t)$ is a product
of the first window function and the time function, and
[0096] The fifth window function $\dot{g}'_k(t)$ is a first derivative of
the fourth window function.
[0097] Five short-term Fourier transforms of the reverberated acoustic
signal are respectively calculated for each of said five window
functions:
$$Y_g[m,k] = (g_k * y)(t_m) \tag{10}$$
$$Y_{\dot{g}}[m,k] = (\dot{g}_k * y)(t_m) \tag{11}$$
$$Y_{\ddot{g}}[m,k] = (\ddot{g}_k * y)(t_m) \tag{12}$$
$$Y_{g'}[m,k] = (g'_k * y)(t_m) \tag{13}$$
$$Y_{\dot{g}'}[m,k] = (\dot{g}'_k * y)(t_m) \tag{14}$$
for each frequency band k among the plurality of frequency bands and each
time period m (equivalently $t_m$) among a plurality of time periods,
where
$$t_m = \frac{m R}{f_s}$$
and R is a sampling factor or number of samples per time period and
$f_s$ is a sampling frequency.
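The five window functions and the transforms (10) to (14) can be sketched as follows. The description does not specify the shape of the bandpass response, so a Gaussian envelope modulated at the center frequency $f_k$ is assumed here purely for illustration. The function names, the envelope width `sigma` and the truncation of the window at four standard deviations are likewise hypothetical.

```python
import cmath
import math

def five_windows(t, f_k, sigma):
    """Return (g, g_dot, g_ddot, g_prime, g_prime_dot) at time t.

    Assumed window: g(t) = exp(-t^2 / (2 sigma^2)) * exp(i 2 pi f_k t).
    """
    omega = 2.0 * math.pi * f_k
    g = math.exp(-t * t / (2.0 * sigma ** 2)) * cmath.exp(1j * omega * t)
    slope = -t / sigma ** 2 + 1j * omega           # equals g_dot / g
    g_dot = slope * g                               # first derivative of g
    g_ddot = (slope * slope - 1.0 / sigma ** 2) * g  # second derivative of g
    g_prime = t * g                                 # g multiplied by time
    g_prime_dot = g + t * g_dot                     # first derivative of t * g
    return g, g_dot, g_ddot, g_prime, g_prime_dot

def frame_time(m, hop, fs):
    """t_m = m * R / f_s, with hop playing the role of R."""
    return m * hop / fs

def stft_frame(y, m, hop, fs, f_k, sigma, which=0, half_width=4.0):
    """Riemann-sum approximation of (window * y)(t_m) for one of the five windows."""
    center = m * hop
    reach = int(half_width * sigma * fs)
    acc = 0j
    for n in range(-reach, reach + 1):
        idx = center - n
        if 0 <= idx < len(y):
            acc += five_windows(n / fs, f_k, sigma)[which] * y[idx]
    return acc / fs

# Hypothetical input: a 200 Hz cosine analysed in bands centered on 200 Hz and 800 Hz.
fs = 8000.0
y = [math.cos(2.0 * math.pi * 200.0 * n / fs) for n in range(2000)]
y200 = stft_frame(y, 10, 100, fs, 200.0, 0.005)
y800 = stft_frame(y, 10, 100, fs, 800.0, 0.005)
```

Because the second and fifth windows are time derivatives of the first and fourth, the corresponding transforms are the time derivatives of $Y_g$ and $Y_{g'}$, which is what allows derivatives of quadratic terms to be expressed without differentiating the signal numerically.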
[0098] From the form of the impulse response given in (1) and the relation
between the reverberated acoustic signal and the anechoic acoustic signal
given by equation (2), we can deduce relations between the quadratic
terms of the discrete short-term Fourier transforms of the anechoic
acoustic signal and the reverberated acoustic signal, as:
$$|S_g|^2 = \frac{1}{\sigma^2}\,E\left[2\delta\,|Y_g|^2 + 2\,\mathrm{Re}\left(Y_g^* Y_{\dot{g}}\right)\right]$$
$$S_g^* S_{\dot{g}} = \frac{1}{\sigma^2}\,E\left[2\delta\,Y_g^* Y_{\dot{g}} + Y_g^* Y_{\ddot{g}} + |Y_{\dot{g}}|^2\right]$$
$$S_g^* S_{g'} = \frac{1}{\sigma^2}\,E\left[2\delta\,Y_g^* Y_{g'} + Y_{\dot{g}}^* Y_{g'} + Y_g^* Y_{\dot{g}'}\right]$$
$$|S_{g'}|^2 = \frac{1}{\sigma^2}\,E\left[2\delta\,|Y_{g'}|^2 + 2\,\mathrm{Re}\left(Y_{g'}^* Y_{\dot{g}'}\right)\right]$$
$$S_{g'}^* S_{\dot{g}} = \frac{1}{\sigma^2}\,E\left[2\delta\,Y_{g'}^* Y_{\dot{g}} + Y_{\dot{g}'}^* Y_{\dot{g}} + Y_{g'}^* Y_{\ddot{g}}\right]$$
where each term is defined for each frequency band k among the plurality
of frequency bands and each time period m among a plurality of time
periods, but where the dependencies in k and m have been hidden to
simplify the notation (for example $|S_g|^2$ in the above equations
is actually $|S_g[m,k]|^2$).
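[0099] Here, too, the expected value of the terms can be approximated by
temporal smoothing, denoted by an overbar, and we can obtain the estimates:
$$\widehat{|S_g|^2}=\frac{1}{\sigma^2}\left(2\delta\,\overline{|Y_g|^2}+2\,\Re\!\left(\overline{Y_g^{*}Y_{\dot g}}\right)\right)\qquad(15)$$
$$\widehat{S_g^{*}S_{\dot g}}=\frac{1}{\sigma^2}\left(2\delta\,\overline{Y_g^{*}Y_{\dot g}}+\overline{Y_g^{*}Y_{\ddot g}}+\overline{|Y_{\dot g}|^2}\right)\qquad(16)$$
$$\widehat{S_g^{*}S_{g'}}=\frac{1}{\sigma^2}\left(2\delta\,\overline{Y_g^{*}Y_{g'}}+\overline{Y_{\dot g}^{*}Y_{g'}}+\overline{Y_g^{*}Y_{\dot g'}}\right)\qquad(17)$$
$$\widehat{|S_{g'}|^2}=\frac{1}{\sigma^2}\left(2\delta\,\overline{|Y_{g'}|^2}+2\,\Re\!\left(\overline{Y_{g'}^{*}Y_{\dot g'}}\right)\right)\qquad(18)$$
$$\widehat{S_{g'}^{*}S_{\dot g}}=\frac{1}{\sigma^2}\left(2\delta\,\overline{Y_{g'}^{*}Y_{\dot g}}+\overline{Y_{\dot g'}^{*}Y_{\dot g}}+\overline{Y_{g'}^{*}Y_{\ddot g}}\right)\qquad(19)$$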
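As a non-limiting sketch, the first of these estimates (the squared modulus of the anechoic transform in one frequency band) may be computed as below. The centered moving average used as the temporal smoother, its half-width, and the function names are assumptions of this sketch; any other temporal smoothing could be substituted.

```python
def smooth(values, half_width=2):
    """Centered moving average, truncated at the edges (assumed smoother)."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def estimate_power(Y_g, Y_gdot, delta, sigma2, half_width=2):
    """Per-frame estimate of |S_g|^2 in one band, following the form of eq. (15).

    Y_g, Y_gdot: complex transforms obtained with g and its derivative.
    delta: damping factor; sigma2: variance term sigma^2.
    """
    power = smooth([abs(y) ** 2 for y in Y_g], half_width)
    cross = smooth([y.conjugate() * yd for y, yd in zip(Y_g, Y_gdot)], half_width)
    return [(2.0 * delta * p + 2.0 * c.real) / sigma2
            for p, c in zip(power, cross)]
```

A useful sanity check is a pure reverberant tail, Y.sub.g proportional to exp(-.delta.t) with time derivative -.delta.Y.sub.g: the two terms cancel and the estimated anechoic power is zero, as expected when the source is silent.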
[0100] Here, too, we can define an influencing factor of the medium R
given by
$$R=\frac{1}{2\delta}$$
[0101] From these quadratic terms and by performing a second-order Taylor
expansion of the anechoic signal s(t), we can then establish a linear
system verified by the first and second derivatives of a dual parameter
.theta.(t) representing the dereverberated signal in exponential notation:
s(t)=exp(.theta.(t))=exp(Re(.theta.(t)))exp(i.phi.(t))
[0102] where Re(.theta.(t)) is the logarithm of the instantaneous amplitude
and .phi.(t)=Im(.theta.(t)) is the instantaneous phase. We then have:
$$\hat{A}_{m,k}\begin{bmatrix}\hat{\dot{\theta}}_{m,k}\\ \hat{\ddot{\theta}}_{m,k}\end{bmatrix}=\hat{b}_{m,k}\qquad(20)$$
where
$$\hat{A}_{m,k}=\sum w_{m,k}\,[\,\cdots\,]\qquad(21)$$
and
$$\hat{b}_{m,k}=\sum w_{m,k}\,[\,\cdots\,]\qquad(22)$$
where S.sub.m[m',k']=(t.sub.m'-t.sub.m)S.sub.g[m',k']-S.sub.g'[m',k'],
the terms w.sub.m,k[m',k'] are spatiotemporal masks indicating whether a
sinusoid q dominant at time period m and in frequency band k is also
dominant at time period m' and in frequency band k', and where the sums
are defined on the dependencies of the quadratic terms and
spatiotemporal masks as a function of the time periods m' and frequency
bands k' of the quadratic terms and spatiotemporal masks (here again the
dependencies in m' and k' have been hidden to simplify the notation).
[0103] It is then possible to determine the first derivative of the dual
parameter {dot over ({circumflex over (.theta.)})}.sub.m,k and the second
derivative of the dual parameter {umlaut over ({circumflex over
(.theta.)})}.sub.m,k by inverting matrix {circumflex over (A)}.sub.m,k to obtain:
$$\begin{bmatrix}\hat{\dot{\theta}}_{m,k}\\ \hat{\ddot{\theta}}_{m,k}\end{bmatrix}=\hat{A}_{m,k}^{-1}\,\hat{b}_{m,k}\qquad(23)$$
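For illustration, the inversion of equation (23) reduces to solving a 2x2 complex linear system per time-frequency point. The sketch below uses Cramer's rule; the function name and the singularity threshold are assumptions, and the entries of the matrix and vector passed in are placeholders rather than the actual terms of equations (21) and (22).

```python
def solve_2x2(A, b, eps=1e-12):
    """Solve A @ [theta_dot, theta_ddot] = b for a 2x2 complex system.

    A: ((a11, a12), (a21, a22)); b: (b1, b2).
    eps is an assumed absolute threshold below which the determinant is
    treated as zero; a real implementation might regularize instead.
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a12 * a21
    if abs(det) < eps:
        raise ValueError("matrix is singular or ill-conditioned")
    theta_dot = (b[0] * a22 - a12 * b[1]) / det
    theta_ddot = (a11 * b[1] - a21 * b[0]) / det
    return theta_dot, theta_ddot
```

The imaginary parts of the two returned values then give the instantaneous angular frequency and its rate of change, while the real parts describe the evolution of the log-amplitude.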
[0104] It is also possible to deduce, from a second-order Taylor expansion
of the anechoic signal s(t), an estimate of the instantaneous amplitude of
the dereverberated acoustic signal {circumflex over
(.alpha.)}.sub.m,k=exp(Re(.theta.(t))), as:
m , k = w m , k w m , k ( 24 )
##EQU00013##
where the term G.sub.m,k[m',k'] is determined from the first derivative
of the dual parameter {dot over ({circumflex over (.theta.)})}.sub.m,k
and from the second derivative of the dual parameter {umlaut over
({circumflex over (.theta.)})}.sub.m,k, as:
$$G_{m,k}[m',k']=\exp\!\left(\dot{\theta}_{m,k}\,(t_{m'}-t_m)+\tfrac{1}{2}\,\ddot{\theta}_{m,k}\,(t_{m'}-t_m)^2\right)\sum_{n} g_{k'}[n]\,\exp\!\left(-\frac{n}{f_s}\left(\dot{\theta}_{m,k}+\ddot{\theta}_{m,k}\left(t_{m'}-t_m-\frac{n}{2f_s}\right)\right)\right)$$
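The expression for G.sub.m,k[m',k'] can be evaluated directly, as in the following sketch. The function and argument names are illustrative assumptions; `g` stands for the samples g.sub.k'[n] of the window and `dt` for the difference t.sub.m'-t.sub.m.

```python
import cmath

def G_term(theta_dot, theta_ddot, dt, g, fs):
    """Evaluate G_{m,k}[m',k'] from the derivatives of the dual parameter.

    theta_dot, theta_ddot: complex first and second derivatives.
    dt: t_{m'} - t_m in seconds; g: window samples; fs: sampling frequency.
    """
    lead = cmath.exp(theta_dot * dt + 0.5 * theta_ddot * dt ** 2)
    total = 0j
    for n, gn in enumerate(g):
        tau = n / fs  # delay of sample n in seconds
        total += gn * cmath.exp(-tau * (theta_dot + theta_ddot * (dt - tau / 2.0)))
    return lead * total
```

Expanding the exponents shows that each summand equals g.sub.k'[n] times the exponential of the second-order Taylor polynomial of .theta. evaluated at t.sub.m'-n/f.sub.s, which is why the leading factor can be taken out of the sum.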
[0105] A method for estimating an instantaneous phase of a dereverberated
acoustic signal according to the invention thus comprises the following
steps:
[0106] (a) a measurement step, during which an acoustic signal
reverberated by propagation in a medium is measured,
[0107] (b) an estimation step, during which at least one smoothed
short-term Fourier transform of the reverberated acoustic signal is
estimated with at least one window function,
[0108] (c) a calculation step, during which at least one instantaneous
frequency of dereverberated signal is calculated from said smoothed
short-term Fourier transform and from an influencing factor of the
medium, said influencing factor being a function of a reverberation time
of said medium,
[0109] (d) a determination step, during which at least one instantaneous
phase of dereverberated signal is determined by integrating the
instantaneous frequency of the dereverberated signal over time.
[0110] (a) Measurement Step:
[0111] During this step, the microphone 2 picks up an acoustic signal
reverberated by propagation in the medium 7, for example when the person
3 is talking. This signal is sampled and stored in the processor 8 or in
auxiliary memory (not shown).
[0112] As indicated above, the captured signal y(t) is a convolution of the
emitted anechoic signal s(t) (speech) with the impulse response h(t) of
the medium between the person speaking 3 and the microphone 2.
[0113] (b) Estimation Step:
[0114] During this step, at least one short-term Fourier transform of the
reverberated acoustic signal is estimated with at least one window
function.
[0115] In particular, at least one discrete local Fourier transform of the
reverberated acoustic signal is calculated using window functions w(n)
where n is between 0 and N-1.
[0116] Such a discrete local Fourier transform of the reverberated
acoustic signal can be implemented with window functions w(n) of size N
and time frames separated by jumps of R signal samples.
[0117] The reverberated acoustic signal being sampled with frequency
f.sub.s, for example 16 kHz, we thus obtain N discrete frequencies
$$f_k=\frac{k\,f_s}{N},\quad k\in[0,N-1]$$
and N.sub.f time frames. N is equal for example to 256, 512, or 1024. R
is equal for example to half or a fourth of N.
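The analysis grid implied by these parameters can be sketched as follows. The function name and the convention that only complete frames are counted are assumptions of this sketch.

```python
def stft_grid(fs, N, R, num_samples):
    """Return (bin frequencies f_k in Hz, frame start times in seconds).

    fs: sampling frequency; N: window size; R: hop in samples.
    Assumption: only frames that fit entirely in the signal are kept.
    """
    freqs = [k * fs / N for k in range(N)]
    num_frames = 1 + (num_samples - N) // R if num_samples >= N else 0
    times = [m * R / fs for m in range(num_frames)]
    return freqs, times
```

For example, with f.sub.s = 16 kHz, N = 512 and R = N/2 = 256, each frame lasts 512/16000 = 32 ms, successive frames overlap by 50%, and the frequency resolution is 31.25 Hz.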
[0118] In the second embodiment of the invention, at least five short-term
Fourier transforms of the reverberated acoustic signal can be estimated,
for example as given by equations (10) to (14) above with respectively a
first, second, third, fourth, and fifth window function g.sub.k(t),
{dot over (g)}.sub.k(t), {umlaut over (g)}.sub.k(t), g'.sub.k(t) and {dot over (g)}'.sub.k(t) as
defined above.
[0119] (c) Calculation Step:
[0120] Next a calculation step can be implemented during which at least
one instantaneous frequency of dereverberated signal is calculated from
said short-term Fourier transforms and from an influencing factor of the
medium, said influencing factor being a function of a reverberation time
of said medium.
[0121] Estimation of the instantaneous frequency or frequencies of the
reverberated signal may typically be done on a number N.sub.f of frames,
for example one hundred frames, corresponding to at least a few seconds
of signal depending on the analysis parameters selected. The frames may
have an individual duration of 10 to 100 ms, in particular about 32 ms.
The frames may overlap each other, for example with an overlap of about
50% between successive frames.
[0122] In the first embodiment of the invention described above in
equations (5) to (9), one can first determine a smoothed instantaneous
frequency of the reverberated signal and a rate of change over time of
said smoothed instantaneous frequency of the reverberated signal, from
the short-term Fourier transform of the reverberated acoustic signal
estimated in step (b).
[0123] To do so, one may begin by determining the smoothed instantaneous
frequency of the reverberated signal by first measuring the instantaneous
frequency of the reverberated signal and then smoothing said
instantaneous frequency, for example by temporal smoothing using a
Savitzky-Golay filter.
[0124] The instantaneous frequency of the reverberated signal can be
determined in general by a Fourier transform of the signal.
[0125] In a variant embodiment, for each frequency band k among a
plurality of N frequency bands, an instantaneous frequency of the
reverberated signal in said frequency band k can be estimated as well as
a rate of change over time of said instantaneous frequency of the
reverberated signal.
[0126] For this purpose, it is possible for example to apply a reassigned
vocoder algorithm using a discrete local Fourier transform of the
reverberated acoustic signal (or short-term Fourier transform) or vice
versa.
[0127] Such a reassigned vocoder algorithm is described for example in the
paper "Estimation of frequency for AM/FM models using the phase vocoder
framework" by M. Betser, P. Collen, G. Richard, and B. David, published
in IEEE Transactions on Signal Processing, vol. 56, no. 2, pp. 505-517,
February 2008.
[0128] Once the instantaneous frequencies of the reverberated signal are
estimated, they can then be smoothed by a temporal smoothing algorithm as
indicated above in order to obtain the smoothed instantaneous frequencies
of the reverberated signal.
[0129] In this step, the above equation (8) {tilde over
(f)}(t)=f.sub.rev(t)+{dot over (f)}R(t) is calculated in order to
estimate an instantaneous frequency of the dereverberated signal.
[0130] In the variant embodiment in which a smoothed instantaneous
frequency of the reverberated signal is estimated for each frequency band
k among a plurality of N frequency bands, it is then possible to
calculate more precisely an instantaneous frequency of dereverberated
signal {tilde over (F)}(m,k) in each frequency band k and for each time
frame m.
[0131] More precisely, the instantaneous frequency of dereverberated
signal {tilde over (F)}(m,k) is calculated from the smoothed
instantaneous frequency of the reverberated acoustic signal of said
frequency band k, the rate of change over time of said smoothed
instantaneous frequency of the reverberated signal, and the influencing
factor of the medium R(t).
[0132] This calculation also uses equation (8) which is applied
independently to each frequency band k, in other words replacing {tilde
over (f)}(t) with {tilde over (F)}(m,k).
[0133] To estimate the instantaneous frequency of the dereverberated
signal {tilde over (f)}(t) or {tilde over (F)}(m,k), a correction factor {dot over (f)}R(t) is
first determined by multiplying the rate of change over time {dot over
(f)} of the smoothed instantaneous frequency of the reverberated signal
by the influencing factor of the medium R(t)=1/(2.delta.)+min(t,
T.sub.h)/(1-exp(2.delta.min(t, T.sub.h))).
[0134] Then, the correction factor {dot over (f)}R(t) is added to the
smoothed instantaneous frequency of the reverberated acoustic signal
according to equation (8).
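A minimal sketch of this correction, under stated assumptions, is given below. The function names are illustrative; the value returned at t = 0 is the limit of R(t) as t tends to zero, and the overflow guard is a numerical convenience that is not part of the description.

```python
import math

def influencing_factor(t, delta, T_h):
    """R(t) = 1/(2*delta) + T / (1 - exp(2*delta*T)) with T = min(t, T_h)."""
    T = min(t, T_h)
    if T <= 0.0:
        return 0.0  # limit of R(t) as t -> 0 (assumed handling of t = 0)
    x = 2.0 * delta * T
    if x > 700.0:
        # exp(x) would overflow; the second term is negligible here
        return 1.0 / (2.0 * delta)
    # 1 - exp(x) == -expm1(x); expm1 stays accurate for small x
    return 1.0 / (2.0 * delta) - T / math.expm1(x)

def dereverberated_frequency(f_rev, f_rev_dot, t, delta, T_h):
    """Equation (8): smoothed reverberated frequency plus f_dot * R(t)."""
    return f_rev + f_rev_dot * influencing_factor(t, delta, T_h)
```

For instance, with .delta. = 10 s.sup.-1 and a frequency rising at 100 Hz/s, the correction well after onset approaches 100/(2 x 10) = 5 Hz, whereas shortly after onset R(t) is close to t/2 and the correction is correspondingly small.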
[0135] In the second embodiment of the invention, which is the subject of
equations (10) to (24) above, it is possible to directly determine both
the instantaneous frequency of the dereverberated signal and the rate of
change of the instantaneous frequency of the dereverberated signal.
[0136] To do this, we seek to solve the system given by equation (20), in
particular by inverting matrix {circumflex over (A)}.sub.m,k as indicated in equation (23).
[0137] Having estimated the five short-term Fourier transforms of
equations (10) to (14), Y.sub.g, Y.sub.{dot over (g)}, Y.sub.{umlaut over (g)}, Y.sub.{dot over (g)}',
and Y.sub.g', we can begin by temporally smoothing said Fourier
transforms by any temporal smoothing algorithm, in particular the filters
detailed above.
[0138] Then, the plurality of quadratic terms of equations (15) to (19)
are calculated according to the influencing factor of the
medium R=1/(2.delta.) and the terms Y.sub.g, Y.sub.{dot over (g)}, Y.sub.{umlaut over (g)},
Y.sub.{dot over (g)}', and Y.sub.g' of the short-term Fourier transforms for each
frequency band k and each time period m among a plurality of time
periods.
[0139] From these quadratic terms, it is then possible to construct matrix
{circumflex over (A)}.sub.m,k given in equation (21), as well as vector {circumflex over
(b)}.sub.m,k of equation (22).
[0140] Finally, it is possible to determine, for each frequency band k and
each moment of time m, an instantaneous frequency of dereverberated
acoustic signal {dot over ({circumflex over (.phi.)})}(t)=Im({dot over ({circumflex over (.theta.)})}.sub.m,k)
and a rate of change of said instantaneous frequency of dereverberated
acoustic signal {umlaut over ({circumflex over (.phi.)})}(t)=Im({umlaut
over ({circumflex over (.theta.)})}.sub.m,k), by solving the linear
system of equation (20).
[0141] For this, one can invert matrix {circumflex over (A)}.sub.m,k as indicated in equation
(23).
[0142] Furthermore, it is possible to determine, from the first derivative
of the dual parameter {dot over ({circumflex over (.theta.)})}.sub.m,k
and from the second derivative of the dual parameter {umlaut over
({circumflex over (.theta.)})}.sub.m,k, an instantaneous amplitude of the
dereverberated signal for each frequency band k and each moment of time
m.
[0143] For this purpose, the equation (24) detailed above is applied.
[0144] In the two embodiments described, the influencing factor of the
medium R can be previously determined in a preliminary calibration step.
[0145] During this preliminary calibration step, a reference acoustic
signal is measured that is reverberated by propagation in the medium, and
the influencing factor of the medium is determined from said reference
acoustic signal.
[0146] For this purpose it is possible, for example, to determine a
reverberation time of said medium by methods otherwise known, for example
the RT.sub.60 reverberation time as described above, and to deduce
therefrom the damping factor .delta. and the duration of the impulse
response T.sub.h.
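As an illustrative sketch, the damping factor can be obtained from RT.sub.60 if one assumes an impulse response whose amplitude envelope decays as exp(-.delta.t), so that its energy decays as exp(-2.delta.t) and has fallen by 60 dB at t = RT.sub.60. This decay model and the function names are assumptions of the sketch.

```python
import math

def damping_from_rt60(rt60):
    """Damping factor delta (1/s) such that exp(-2*delta*rt60) = 10**-6.

    Assumes an exponentially decaying energy envelope exp(-2*delta*t).
    """
    return 3.0 * math.log(10.0) / rt60

def asymptotic_influencing_factor(delta):
    """Long-time value R = 1/(2*delta) of the influencing factor."""
    return 1.0 / (2.0 * delta)
```

Under this assumption, a reverberation time of 0.5 s gives .delta. of about 13.8 s.sup.-1 and R of about 36 ms.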
[0147] The reference acoustic signal may be an acoustic signal
reverberated by the medium from an original signal known to the device.
[0148] However, determination of the influencing factor of the medium may
also be carried out "blind", meaning from a reverberated signal recorded
following an arbitrary original signal.
[0149] Advantageously, it is possible to use a plurality of reference
acoustic signals which correspond to a respective plurality of different
cases (different people speaking, different positions, different media
7). The number of reference acoustic signals may be several hundred, or
even several thousand.
[0150] In one particular embodiment of the invention, the reference
acoustic signal may consist of the reverberated acoustic signal used by
the method according to the invention, so that determination of the
influencing factor of the medium is then carried out directly during
implementation of the method for estimating the instantaneous phase and
without requiring a preliminary calibration step.
[0151] The determination of the influencing factor of the medium may also
be carried out in a repetitive manner, so that the device 1 adapts, for
example, to a change of the person speaking 3, to movements of the person
speaking 3, or to movements of the device 1 or of other objects in the
environment 7.
[0152] (d) Determination Step:
[0153] During this last step, the instantaneous phase of the
dereverberated signal {tilde over (.phi.)}(t) is determined by temporal
integration of the dereverberated instantaneous frequency as indicated in
equation (9).
[0154] This temporal integration may be performed using an initial phase
of the dereverberated signal {tilde over (.phi.)}(0).
[0155] In most cases, the dereverberated signal can be assumed to have an
initial phase equal to the initial phase of the reverberated signal, so that,
for example, we have {tilde over (.phi.)}(0)=.phi..sub.rev(0). This
applies in particular to the case where the recorded signal is preceded
by silence, so that the reverberation is initially zero.
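The temporal integration can be sketched as a cumulative sum, as shown below. The trapezoidal rule, the uniform time step, and the function name are assumptions of this sketch, which takes the phase to be the initial phase plus 2.pi. times the integral of the dereverberated instantaneous frequency.

```python
import math

def integrate_phase(freqs_hz, dt, phase0=0.0):
    """Integrate an instantaneous-frequency track (Hz) into a phase (rad).

    freqs_hz: samples of the dereverberated frequency, spaced dt seconds.
    phase0: initial phase, e.g. the initial phase of the reverberated signal.
    Uses the trapezoidal rule (an assumed discretization).
    """
    phases = [phase0]
    for f_prev, f_next in zip(freqs_hz, freqs_hz[1:]):
        phases.append(phases[-1] + 2.0 * math.pi * 0.5 * (f_prev + f_next) * dt)
    return phases
```

A constant 100 Hz track integrated over one second accumulates 100 cycles, that is, a phase of 200.pi. radians above the initial phase.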
[0156] Alternatively, here again an instantaneous phase of dereverberated
signal {tilde over (.PHI.)}(m,k) can be determined in each frequency
band k among the plurality of N frequency bands and for each time frame
m, by integrating the instantaneous frequency of dereverberated signal of
said frequency band k over time, in other words by summing it over the
time frames m.
[0157] When, in order to estimate a smoothed instantaneous frequency of
the reverberated signal for each frequency band k among the plurality of
N frequency bands, a discrete local Fourier transform of the reverberated
acoustic signal is calculated using window functions w(n) with n between
0 and N-1, it is necessary to take into account said window functions
w(n) for the calculation of the instantaneous phase of the anechoic
signal .phi.(t).
[0158] We thus have:
$$\Phi(m,k)=\phi\!\left(\frac{mR}{f_s}\right)+\arg\!\left(\Gamma\!\left(k,\,f\!\left(\frac{mR}{f_s}\right)\right)\right)$$
where
$$\phi\!\left(\frac{mR}{f_s}\right)$$
is the Hilbert phase as defined by equation (3) for the time frame of
index m, .PHI.(m,k) is the phase of the anechoic signal, and .GAMMA.(k,f)
is a correction factor linked to the window functions w(n) which can for
example be written:
$$\Gamma(k,f)=\sum_{n=0}^{N-1}w(n)\,\exp\!\left(i\left[2\pi\,(f-f_k)\,\frac{n}{f_s}+\pi\,\dot{f}\left(\frac{n}{f_s}\right)^{2}\right]\right)$$
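The correction factor .GAMMA.(k,f) can be evaluated as in the sketch below. Passing the rate of change of the frequency as an explicit argument `f_dot`, and taking f.sub.k = k f.sub.s/N with N the window length, are assumptions of this sketch.

```python
import cmath
import math

def gamma_factor(k, f, f_dot, w, fs):
    """Window correction factor Gamma(k, f) for window samples w.

    k: band index; f: instantaneous frequency (Hz); f_dot: its rate of
    change (Hz/s); fs: sampling frequency. The band centre is k*fs/len(w).
    """
    N = len(w)
    f_k = k * fs / N
    total = 0j
    for n, wn in enumerate(w):
        tau = n / fs
        phase = 2.0 * math.pi * (f - f_k) * tau + math.pi * f_dot * tau ** 2
        total += wn * cmath.exp(1j * phase)
    return total
```

When the frequency sits exactly on the band centre and does not vary, every exponential equals one, so .GAMMA. reduces to the sum of the window samples and contributes no phase correction.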
[0159] The temporal integration of the instantaneous frequencies
determined for the dereverberated signal can then be written as a sum
over the time frames:
$$\tilde{\Phi}(m,k)=\tilde{\Phi}(m-1,k)+2\pi\,\tilde{F}(m,k)\,\frac{R}{f_s}+\arg\!\left(\Gamma\!\left(k,\,\tilde{f}\!\left(\frac{mR}{f_s}\right)\right)\Gamma^{*}\!\left(k,\,\tilde{f}\!\left(\frac{(m-1)R}{f_s}\right)\right)\right)$$
[0160] where {tilde over (F)}(m,k) is the instantaneous frequency of
dereverberated signal for frequency band k and for time frame m and
.GAMMA.* denotes the complex conjugate of the correction factor .GAMMA.
linked to the window functions w(n).
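The frame-by-frame recursion can be sketched as follows for a single frequency band. The function name is an assumption, and the correction factors are assumed to have been computed beforehand for each frame and supplied as a list of complex numbers.

```python
import cmath
import math

def integrate_band_phase(F, gammas, R, fs, phase0):
    """Accumulate the dereverberated phase of one band over time frames.

    F[m]: dereverberated instantaneous frequency (Hz) at frame m.
    gammas[m]: precomputed window correction factor for frame m.
    R: hop in samples; fs: sampling frequency; phase0: phase at frame 0.
    """
    phases = [phase0]
    for m in range(1, len(F)):
        # arg(Gamma_m * conj(Gamma_{m-1})): change of window-induced phase
        correction = cmath.phase(gammas[m] * gammas[m - 1].conjugate())
        phases.append(phases[-1] + 2.0 * math.pi * F[m] * R / fs + correction)
    return phases
```

Using the argument of the product with the conjugate, rather than the difference of two arguments, keeps the correction in the principal interval and avoids spurious jumps of 2.pi.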
[0161] In a manner analogous to the above case in which a single smoothed
instantaneous frequency is determined, it is possible for example to
initialize {tilde over (.PHI.)}(0,k) for each frequency band k with the
value .PHI..sub.rev(0,k), in other words to consider zero reverberation
initially.
[0162] In the second embodiment of the invention, the terms of the
short-term Fourier transform of the dereverberated signal, which can be
inverted to reconstruct a dereverberated signal, are similarly estimated.
[0163] In this latter embodiment, it is advantageously possible to carry
out a sequence for integrating the phase in the following manner. Since
the instantaneous frequency varies over time, it may be advantageous to
sweep the frequency bands to identify the best preceding frequency band
k' for integration between time t.sub.m-1 and time t.sub.m. For this
purpose, for each given frequency band k, it is possible to determine a
preceding frequency band k' that minimizes the difference between
the central frequencies f.sub.i of the window functions g.sub.i(t) and an
estimated frequency in frequency band k, for example as
$$k'=\underset{i\in[0,N-1]}{\arg\min}\;\left|\frac{1}{2\pi}\left(\hat{\dot{\phi}}_{m,k}-\hat{\ddot{\phi}}_{m,k}\,\frac{R}{f_s}\right)-f_i\right|$$
[0164] The phase can then be integrated between time m-1 (equivalently
t.sub.m-1) and time m (equivalently t.sub.m) from the
instantaneous frequency of dereverberated acoustic signal {dot over ({circumflex over (.phi.)})}(t) and from
the rate of change of said instantaneous frequency of dereverberated
acoustic signal {umlaut over ({circumflex over (.phi.)})}(t) as follows:
$$\hat{\phi}_{m,k}=\hat{\phi}_{m-1,k'}+\hat{\dot{\phi}}_{m-1,k'}\,\frac{R}{f_s}+\frac{1}{2}\,\hat{\ddot{\phi}}_{m-1,k'}\left(\frac{R}{f_s}\right)^{2}$$
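The band selection and the second-order phase update can be sketched as below. The function names are illustrative, and the central frequencies are assumed to be f.sub.i = i f.sub.s/N, which the description does not state explicitly for the windows g.sub.i(t).

```python
import math

def best_previous_band(phi_dot, phi_ddot, R, fs, N):
    """Index k' whose centre frequency is closest to the frequency
    extrapolated one hop back from frame m in band k.

    phi_dot: angular frequency (rad/s); phi_ddot: its rate of change.
    Assumes centre frequencies f_i = i * fs / N.
    """
    hop = R / fs
    f_est = (phi_dot - phi_ddot * hop) / (2.0 * math.pi)
    return min(range(N), key=lambda i: abs(f_est - i * fs / N))

def advance_phase(phi_prev, phi_dot_prev, phi_ddot_prev, R, fs):
    """Second-order update of the phase across one hop of R samples."""
    hop = R / fs
    return phi_prev + phi_dot_prev * hop + 0.5 * phi_ddot_prev * hop ** 2
```

For example, with f.sub.s = 16 kHz, N = 512 and R = 256, a component at 1000 Hz rising at 3125 Hz/s is extrapolated back to 950 Hz, and the closest band centre is that of index 30 (937.5 Hz).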
[0165] Tests show that use of the phase and/or estimated amplitude of the
dereverberated signal in algorithms for reverberated signal
reconstruction and source location, instead of the conventional use of
the phase of the reverberated signal, significantly improves the quality
and intelligibility of the dereverberated signal, and provides better
sound source location.
[0166] For example, tests have shown a 10 dB increase in the
signal-to-reverberation ratio (SRR) and a 5 dB decrease in the cepstral
distance (CD), which respectively correspond to a significant gain in
dereverberation and a significant reduction in distortion.
* * * * *