Specialist Task Force 575:
Methods for Objective assessment of Listening Effort based on subjective test data bases
Who we are:
What we do
The following work has been performed:
- the creation, at acoustical interfaces in the presence of background noise, of a set of:
  - high quality reference speech samples in three languages (German, Chinese and English) from ETSI TS 103 281. All three languages are used, but not for every test condition: a combination of languages and conditions has been agreed to limit the number of subjective tests. The background noise simulation in these experiments is compliant with ETSI TS 103 224,
  - test conditions which impair the reference items,
  - high quality recordings of the impaired reference items (test sequences), and
- the conduct of a statistically significant number of auditory tests (subjective tests) for the training of an objective model (the training itself is not included in the STF work plan).
Some tasks remain to be performed before the end of the STF:
- to perform the validation of the objective model. To do this, the team will use the test materials, speech sequences and subjective test results of the VALIDATION part. Note that the model will be trained (a task not included in the STF scope) with the test materials, speech sequences and subjective test results of the TRAINING part.
- to produce the final text for the annex to TS 103 558 describing the whole work done by the STF.
Creation of a set of test sequences at the acoustic interface
The objective of the first task was to create recordings of speech samples with various speakers in the presence of different types of acoustical background noise.
Three languages were selected: German, English and Chinese. The speech material is based on content available in ETSI TS 103 281. The types of background noise cover a wide variety of scenarios, and the recordings are conducted with terminals providing super-wideband mode. The recordings are provided in a quality suitable for use in auditory tests. The recording procedures comply with the corresponding clauses of TS 103 558, and the background noise simulation in these experiments is compliant with ETSI TS 103 224.
The recordings use a HATS (Head and Torso Simulator) which replaces the human talker or the human listener. HATS is defined in ITU-T Recommendation P.58.
Application 1: ANC headset
The ANC headset is a headset providing active noise cancellation.
Figure 1 illustrates the three use case scenarios that were used to generate the acoustic HATS-based recordings with an ANC headset mounted:
- The first scenario (left of Figure 1) records speech played back via the downlink of the headset, while noise is played back via the noise field generation.
- The second scenario (middle of Figure 1) uses a second HATS, simulating a second talker at a distance of 50 cm from the left ear.
- The third scenario (right of Figure 1) uses an external loudspeaker at 50 cm above the listening artificial head (45° azimuth and elevation). This scenario simulates, for example, a public address or announcement system and is denoted as ExtLs in the following.
Five devices under test (two “in-ear” and three “over-ear” ANC headsets) were evaluated for the acoustic recordings of the databases.
A scenario without headset is also included in the test plan.
The background noise samples used for the subjective tests are “Crossroadnoise”, “Inside_Bus”, “Railway Platform” and “Inside Aircraft”.
These environmental noises are representative of real usage of the ANC headsets.
Application 2: In-car communication (ICC)
This application simulates the communication between two persons inside a car.
The background noise corresponds to a stationary car and to different driving speeds (50, 100 and 120 km/h).
Different implementations of noise reduction systems are tested.
Application 3: Mobile Devices (e.g. smartphones or mobile phones)
This application simulates the situation of a user listening to distant speech. The listener is replaced by a HATS, and the ambient noise is produced by a set of calibrated loudspeakers.
The positioning and mounting instructions comply with those described in ETSI TS 103 737 and TS 103 739 for handset terminals and in TS 103 738 and TS 103 740 for hands-free terminals. The figure below illustrates the two operational modes for a mobile device. The mobile device used in this experiment is a board implementing all the functions of a mobile phone.
The transmitted speech is provided in two bandwidth modes: “narrow-band” (typically 300 Hz to 3,4 kHz) and “super-wideband” (typically 150 Hz to 15 kHz).
The selected background noises are “Roadnoise”, “cafeteria” and “Pubnoise”. Some recordings have also been made without background noise (“silence” mode). The volume control of the terminal may be set to “nominal” or “maximum”.
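As a rough illustration of how the factors listed above span a condition space, the following Python sketch enumerates every combination of operational mode, bandwidth, background noise and volume setting. The full cross product is an assumption made for illustration only: the agreed test plan limits the number of conditions, and the function and variable names are hypothetical.

```python
from itertools import product

# Factor levels quoted in the text above. Crossing all of them is an
# assumption for illustration; it is not the agreed STF test plan.
OPERATIONAL_MODES = ["handset", "hands-free"]
BANDWIDTHS = ["narrow-band", "super-wideband"]
NOISES = ["silence", "Roadnoise", "cafeteria", "Pubnoise"]
VOLUMES = ["nominal", "maximum"]


def enumerate_conditions():
    """Return one dict per combination of the four factors."""
    return [
        {"mode": mode, "bandwidth": bandwidth, "noise": noise, "volume": volume}
        for mode, bandwidth, noise, volume in product(
            OPERATIONAL_MODES, BANDWIDTHS, NOISES, VOLUMES
        )
    ]


if __name__ == "__main__":
    conditions = enumerate_conditions()
    print(len(conditions), "candidate conditions")
    print(conditions[0])
```

A listing of this kind gives an upper bound from which a reduced set of conditions can be chosen for the subjective tests.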
Designing, conducting and analysing subjective tests at the acoustic interface
The aim of this task was to conduct subjective tests for the relevant conditions as defined above. The results of the subjective tests are intended to be used
- to train the objective model (the “training” task is not included in the workplan of the STF) and
- to validate the objective model.
The subjects have to assess two criteria:
- the Listening Quality (LQ), and
- the Listening Effort (LE).
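Votes of this kind are commonly summarised per test condition as a mean opinion score with a confidence interval. The sketch below shows one way to do this in Python. It assumes a 5-point rating scale and a normal approximation for the 95 % interval, neither of which is stated here; the votes are hypothetical and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, stdev


def mos_with_ci(ratings, z=1.96):
    """Return (mean score, half-width of an approximate 95 % interval).

    The normal approximation (z = 1.96) is an assumption of this sketch.
    """
    if len(ratings) < 2:
        raise ValueError("at least two ratings are needed")
    score = mean(ratings)
    half_width = z * stdev(ratings) / sqrt(len(ratings))
    return score, half_width


if __name__ == "__main__":
    # Hypothetical LE votes of eight subjects for one condition (1..5 scale assumed).
    le_votes = [4, 3, 4, 5, 3, 4, 4, 2]
    score, half_width = mos_with_ci(le_votes)
    print(f"LE score = {score:.2f} +/- {half_width:.2f}")
```

The same function can be applied to the LQ votes of the same condition.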
The following picture shows one subject participating in the subjective test.
- One set of subjective tests is used to train the objective model in order to achieve a maximum correlation between the subjective and predicted scores.
- The second set of subjective tests is used for a validation of the model.
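The agreement between subjective and predicted scores can be quantified, for example, with the Pearson correlation coefficient and the root mean square error. The Python sketch below is a minimal illustration under that assumption: the STF's actual validation metrics are not specified here, and the scores used are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean square error between two equally long sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


if __name__ == "__main__":
    # Hypothetical per-condition scores from the validation set.
    subjective = [1.8, 2.5, 3.1, 3.9, 4.4]
    predicted = [2.0, 2.4, 3.3, 3.7, 4.5]
    print(f"r = {pearson(subjective, predicted):.3f}")
    print(f"RMSE = {rmse(subjective, predicted):.3f}")
```

Computing the figures on the validation set only, and not on the training set, keeps the assessment independent of the data the model was fitted to.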
For more details, see our Terms of Reference
Why we do it
The rolling standardization plan focuses on the need to improve speech intelligibility, for people with normal hearing as well as for hearing-impaired people. To evaluate improvements in speech intelligibility, the long-term plan is to develop an even more generic objective model that takes both groups of listeners into account. This is a long-term objective, as many parameters have to be taken into account. We therefore initially investigate two parameters: the impact of background noise and listening effort (LE), which is linked to intelligibility.
The full plan for such a project should be:
a) Design experiments to measure LE, using methods that include all relevant aspects of accessibility for persons without hearing impairments.
b) Collect a corpus of results over a wide range of conditions relevant to contemporary telecommunications systems, with initial focus on persons without hearing impairment.
c) Develop an objective model for the prediction of LE for persons without hearing impairment.
d) Explore relations between listening quality, listening effort and intelligibility for users with normal hearing using the new LE model. This step may require additional listening tests.
e) Collect a corpus of results over the same conditions as noted in b), but for persons with hearing impairment. Due to the complexity of hearing impairments, this may require extensive work.
f) Use the LE predictive model developed in step c) with the additional data collected in step e) to extend the LE predictive model for use with persons with hearing impairment.
The background noise to be taken into account depends on the interface considered. The present STF focuses on the acoustical interface, as described in steps a) and b). Additional work (outside the STF workplan) will consider the electrical interface.
The LE predictive model (step c) will be developed and trained outside the STF workplan. However, the STF is intended to use a set of subjective test results to validate the model.
How we do it
The STF has produced:
- the recorded test sequences (transmitted speech sequences mixed with background noise),
- the test plans for subjective tests,
- the results of the subjective tests (two sets of results). The first set will be provided to the laboratory in charge of the model training. A second set will be used for validation of the model.
The final deliverable will provide all the results obtained by the STF.
The main deliverable will be a new annex to TS 103 558 (RTS/STQ-new WI as approved by STQ#63) for the acoustical interface. TS 103 558 is intended to develop methods for the objective prediction of listening effort at the near-end side.
Progress reports will be produced according to the milestones defined in the time plan and submitted to STQ meetings for discussion and approval.
• Design of test plans for subjective tests
• Test sequences for subjective testing
• Progress Report to be approved at TC STQ#62
• Early draft available

• Conducting and analysing subjective tests completed
• Progress Report to be approved at TC STQ#63
• Stable draft available

• Model for the validation available

• Consolidation of test results and validation of the model
• Final Report and final draft to be approved at TC STQ#64

• Deliverable published, STF closed
If additional information is needed, please mail to firstname.lastname@example.org
STF 575 has reported on its progress at all TC STQ meetings:
STQ#61 Nuremberg (DE) 8-12 July 2019 - Presentation of the work plan
STQ#62 Boston (US) 23-27 September 2019 - Progress Report Milestone A
STQ#63 Bern (CH) 17-21 February 2020 - Progress Report Milestone B
STQ#64 Zagreb (HR) 29 June-3 July 2020 - Final Report