
Hearing Aid Research Data Set for Acoustic Environment Recognition (HEAR-DS)

The HEAR-DS provides binaural audio material recorded in acoustic environments that are typical for hearing aid users. Its goal is to help researchers train and test algorithms, e.g. deep neural networks, in environments relevant for hearing aids.
 
During the revision process of the paper "Hearing Aid Research Data Set for Acoustic Environment Recognition" (Andreas Hüwel, Dr. Kamil Adiloğlu and Dr. Jörg-Hendrik Bach) for ICASSP2020 (unpublished), we provide exemplary data to give reviewers a general idea of the content of the data set. After publication, the whole data set and its associated code will be freely accessible.
 

Acoustic Environments Overview

Background environment      Speech in background environment
Cocktail party              -
Interfering speakers        -
In traffic                  Speech in traffic
In vehicle                  Speech in vehicle
Music                       Speech in music
Quiet indoors               Speech in quiet indoors
Reverberant environment     Speech in reverberant environment
Wind turbulence             Speech in wind turbulence

Example of Speech in Background SNR Variations

Acoustic Environment: Speech in vehicle
Audio examples: SNR -10, SNR -5, SNR 0, SNR 5, SNR 10

As described in the paper, some audio material comes from third parties and thus cannot be provided here. However, all of the required data is accessible online. With our scripts (to be made public here as soon as the paper is published), anyone can regenerate the whole data set themselves.

Audio for interfering speakers comes from CHiME5, and the speech material mixed into the speech in background environments comes from CHiME2. For CHiME2 (2013) and CHiME5 (2018), please contact the organizers to get access to the data sets. Audio for music comes from GTZAN.

Data and Format

An acoustic environment holds audio from different recording situations. Each recording situation has a unique id (rec_id) and contains one or more recording sessions. From the raw audio of each recording session we manually cut suitable audio pieces (the cuts) to fill its recording situation with audio material; each cut has a locally unique cut_id. To generate the actual data set for training machine learning systems, we performed a further processing step that produces the 10 s audio samples for every acoustic environment, as described in the subsection HEAR-DS Audio Samples.

HEAR-DS Raw Audio Cuts

For each recording situation, one folder holds all of its cut wav files.

Folder structure:

├── CocktailParty
│   ├── rec_001_HDH_1
│   ├── rec_002_HDH_2_bistro
│   ├── rec_003_ParishHall_1
│   └── rec_004_UniCafete_1
├── InTraffic
│   ├── rec_id_551_rush_hour_outskirts_mainroad_1
│   ├── rec_id_552_busstop_city_mainroad_2
│   ├── rec_id_553_city_secondaryroad_3
│   └── rec_id_554_rush_hour_city_mainhub_1
├── InVehicle
│   ├── rec_id_501_berlingo_II_diesel_1
│   ├── rec_id_502_skoda_fabia_ottoengine_1
│   └── rec_id_503_vw_t5_diesel_caravelle_1
├── QuietIndoors
│   ├── rec_id_401_quiet_rural_home_1
│   ├── rec_id_402_quiet_smalltown_home_1
│   └── rec_id_403_quiet_city_home_1
├── ReverberantEnvironment
│   ├── rec_005_Oldenburg_Church_1
│   ├── rec_006_Rheine_Railstation_Hall_1
│   └── rec_007_Staircase_1
└── WindTurbulence
     ├── rec_id_202_wind_suburban_garden_02
     ├── rec_id_203_wind_before_rural_house_01
     └── rec_id_204_marie_curie_parkplace
 
Because the audio was cut manually, the length of the cuts varies. The naming scheme is:
 
rec_id_<REC_ID>_cut_<CUT_ID>_<DESCRIPTION>_<TRACKNAME>_<EXPORTFORMAT>.wav
 

<REC_ID> is a 3 digit number and <CUT_ID> a 2 digit number. The <DESCRIPTION> could be, e.g., "startengine_driveoff" for InVehicle or "bell" for ReverberantEnvironment. <TRACKNAME> stands for one of the hearing aid microphones used [Mic_BTE_L_front, Mic_BTE_L_rear, Mic_BTE_R_front, Mic_BTE_R_rear, Mic_ITC_L, Mic_ITC_R]. <EXPORTFORMAT> is the name of the audio exporter used, currently "raw_48kHz32bit".
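The naming scheme for cuts can be split into its fields with a regular expression. The following is a minimal sketch, not part of the HEAR-DS scripts: the function name `parse_cut_filename` and the example filename are hypothetical, and the export format is assumed to be the single value named above.

```python
import re

# Track names as listed in the text above.
TRACKNAMES = [
    "Mic_BTE_L_front", "Mic_BTE_L_rear",
    "Mic_BTE_R_front", "Mic_BTE_R_rear",
    "Mic_ITC_L", "Mic_ITC_R",
]

CUT_PATTERN = re.compile(
    r"^rec_id_(?P<rec_id>\d{3})"      # 3 digit recording situation id
    r"_cut_(?P<cut_id>\d{2})"         # 2 digit cut id
    r"_(?P<description>.+)"           # free text, may contain underscores
    r"_(?P<trackname>" + "|".join(map(re.escape, TRACKNAMES)) + r")"
    r"_(?P<exportformat>raw_48kHz32bit)"  # assumption: only this exporter
    r"\.wav$"
)


def parse_cut_filename(filename):
    """Return the fields of a cut filename as a dict, or None if it does not match."""
    match = CUT_PATTERN.match(filename)
    return match.groupdict() if match else None


# Hypothetical filename assembled from the scheme and the examples in the text.
fields = parse_cut_filename(
    "rec_id_501_cut_03_startengine_driveoff_Mic_BTE_L_front_raw_48kHz32bit.wav"
)
```

Anchoring the track name to the known microphone list is what lets the description contain underscores without ambiguity.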

HEAR-DS Audio Samples

In this processing step the raw audio cuts were further sliced into 10 s snippets. These snippets are either used directly as background samples or are mixed with random speech at multiple SNRs (-10, -5, 0, 5, 10) to create audio samples for the speech in background environments. The binaural speech source material comes from five different directions, from which we choose at random. The start and end times of the source speech and the start time of the background snippet are also randomized. These 10 s samples form the HEAR-DS audio material for training machine learning systems, e.g. as input for the feature extraction step of deep neural networks.
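To illustrate what mixing at a given SNR means, here is a minimal single-channel sketch. It is not the HEAR-DS processing code. It assumes that the SNR values are in dB, that power is the mean squared amplitude over the whole snippet, and that the speech is scaled while the background stays unchanged; the text above does not specify any of these. All function names are hypothetical.

```python
import math
import random

SNRS_DB = (-10, -5, 0, 5, 10)  # SNR values listed in the text; dB is assumed


def mean_power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def speech_gain_for_snr(speech, background, snr_db):
    """Gain for `speech` so that speech power / background power matches snr_db."""
    speech_power = mean_power(speech)
    if speech_power == 0:
        raise ValueError("speech signal is silent; SNR is undefined")
    target_power = mean_power(background) * 10 ** (snr_db / 10)
    return math.sqrt(target_power / speech_power)


def mix_at_snr(speech, background, snr_db):
    """Add gain-adjusted speech to the unchanged background, sample by sample."""
    if len(speech) != len(background):
        raise ValueError("speech and background must have the same length")
    gain = speech_gain_for_snr(speech, background, snr_db)
    return [gain * s + b for s, b in zip(speech, background)]


# Synthetic stand-ins for one channel of real audio (0.1 s at 16 kHz).
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 200 * n / 16000) for n in range(1600)]
background = [rng.uniform(-0.1, 0.1) for _ in range(1600)]
mixtures = {snr: mix_at_snr(speech, background, snr) for snr in SNRS_DB}
```

For binaural material, one plausible choice would be to compute a single gain from a reference channel and apply it to all channels so that interaural level differences are preserved; whether HEAR-DS does this is not stated here.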

Audio Sample Snippet File Format

The naming scheme for snippets is:

<ENV_ID>_<REC_ID>_<CUT_ID>_<SNIP_ID>_<TRACKNAME>_<SAMPLERATE>.wav

  • <ENV_ID>: 2 digit id of the acoustic environment. Each speech in background environment has its own id, separate from that of the pure background environment.
  • <REC_ID>: 3 digit id of the recording situation.
  • <CUT_ID>: 2 digit id of the cut within the recording situation (unique across all sessions of that situation).
  • <SNIP_ID>: 3 digit id of the snippet within this cut.
  • <TRACKNAME>: as described above.
  • <SAMPLERATE>: one of [48kHz, 16kHz].

For example, for ReverberantEnvironment, recording situation "Oldenburg Church", first cut, first snippet, in the 16kHz version, the snippet filename is 06_005_00_000_BTE_L_front_16kHz.wav
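The snippet naming scheme can be parsed in the same spirit. This is a minimal sketch with a hypothetical function name. Because the example filename above uses the track name "BTE_L_front" without the "Mic_" prefix, the pattern accepts any track name rather than a fixed list.

```python
import re

SNIPPET_PATTERN = re.compile(
    r"^(?P<env_id>\d{2})"             # 2 digit acoustic environment id
    r"_(?P<rec_id>\d{3})"             # 3 digit recording situation id
    r"_(?P<cut_id>\d{2})"             # 2 digit cut id
    r"_(?P<snip_id>\d{3})"            # 3 digit snippet id
    r"_(?P<trackname>.+)"             # microphone track, may contain underscores
    r"_(?P<samplerate>48kHz|16kHz)"   # the two sample rates named in the text
    r"\.wav$"
)


def parse_snippet_filename(filename):
    """Return the fields of a snippet filename as a dict, or None if it does not match."""
    match = SNIPPET_PATTERN.match(filename)
    return match.groupdict() if match else None


# The example filename given in the text.
snippet = parse_snippet_filename("06_005_00_000_BTE_L_front_16kHz.wav")
```

The ids are kept as strings so that leading zeros survive a round trip back into a filename.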


Folder Structure of HEAR-DS samples

In the Speech folders, the snippet files in the different SNR subfolders share the same filenames, but the content of those audio snippets differs: for each SNR subfolder, a different (random) speech piece was mixed at that SNR with the same background audio snippet.

├── CocktailParty
│   └── Background
├── InterferingSpeakers
│   └── Background
├── InTraffic
│   ├── Background
│   └── Speech
│          ├── 0
│          ├── 10
│          ├── -10
│          ├── 5
│          └── -5
├── InVehicle
│   ├── Background
│   └── Speech
│          ├── 0
│          ├── 10
│          ├── -10
│          ├── 5
│          └── -5
├── Music
│   ├── Background
│   └── Speech
│          ├── 0
│          ├── 10
│          ├── -10
│          ├── 5
│          └── -5
├── QuietIndoors
│   ├── Background
│   └── Speech
│          ├── 0
│          ├── 10
│          ├── -10
│          ├── 5
│          └── -5
├── ReverberantEnvironment
│   ├── Background
│   └── Speech
│          ├── 0
│          ├── 10
│          ├── -10
│          ├── 5
│          └── -5
└── WindTurbulence
     ├── Background
     └── Speech
            ├── 0
            ├── 10
            ├── -10
            ├── 5
            └── -5
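Since the SNR subfolders share filenames, all SNR variants of one speech snippet can be located from a single filename. The sketch below only builds paths and does not touch the file system; the root folder name "HEAR-DS", the example filename and the function name are hypothetical. `PurePosixPath` is used so that the result is the same on every platform; for real files `pathlib.Path` would be the usual choice.

```python
from pathlib import PurePosixPath

SNRS = (-10, -5, 0, 5, 10)

# Environments that have a Speech folder in the tree above.
ENVIRONMENTS_WITH_SPEECH = (
    "InTraffic", "InVehicle", "Music",
    "QuietIndoors", "ReverberantEnvironment", "WindTurbulence",
)


def speech_variants(root, environment, filename):
    """Map each SNR to the path of the same-named snippet in that SNR subfolder."""
    if environment not in ENVIRONMENTS_WITH_SPEECH:
        raise ValueError(f"{environment} has no Speech folder")
    speech_dir = PurePosixPath(root) / environment / "Speech"
    return {snr: speech_dir / str(snr) / filename for snr in SNRS}


# Hypothetical root folder and filename, for illustration only.
variants = speech_variants("HEAR-DS", "InVehicle", "example_snippet.wav")
```

Grouping the variants this way makes it easy to keep all five SNR versions of a background snippet in the same train or test split.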

Acknowledgements

This work was supported by the German Ministry of Education and Science (BMBF), FZK 02K16C202 AUDIO-PSS.

The authors would like to thank Marei Typlt and the partners in the AUDIO-PSS project for support in designing the acoustic environments and Audifon GmbH for providing the hearing aid dummies.

 

Contact for HEAR-DS: Phone: 0441 2172-218, Fax: 0441 2172-250
