1) Resource name

Telephone Speech Corpus for Recognition (电话语音识别语料库)

2) Resource description

With support from the national 863 high-tech program, we designed and built a speech database for telephone speech recognition research. The corpus is recorded at an 8 kHz sampling rate in μ-law format and contains telephone speech from 664 speakers of different accents, ages, and educational backgrounds. Recordings were collected over different channels, including long-distance trunk calls from 38 different provinces, cities, and regions of China. The recording scripts reflect the most common situations in telephone speech applications and cover all Chinese syllables and the junctures between them, so that acoustic models trained on this database perform well. Because the database meets the requirements of many acoustic model training and recognition algorithms, it is well suited as a basic resource for telephone speech recognition research.

3) Organization

Network and Human-Machine Speech Communication Research Institute

Department of Electronic Engineering, Tsinghua University

(清华大学 电子工程系 网络与人机语音通信研究所)

4) Development period

April 2003 to April 2004

5) Scale

Number of speakers: 640; total duration: 37 hours; data size: 1.04 GB (8 kHz sampling rate)
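As a rough consistency check on the figures above (37 hours, 1.04 G, 8 kHz), the arithmetic can be sketched as follows. The sketch assumes mono audio stored at one byte per sample, as in 8-bit companded telephone formats; that storage format is an assumption here, and the function name is illustrative only.

```python
# Sanity-check sketch: how many hours of 8 kHz audio fit in about 1.04 G?
# Assumption (not stated in the source): mono audio, 1 byte per sample.

SAMPLE_RATE_HZ = 8000
BYTES_PER_SAMPLE = 1  # assumed: 8-bit companded samples


def hours_of_audio(size_bytes: float) -> float:
    """Return the recording time in hours that fits in size_bytes."""
    seconds = size_bytes / (SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
    return seconds / 3600


# The "G" in the listed size may be decimal (10^9) or binary (2^30),
# so both readings are computed.
hours_if_decimal = hours_of_audio(1.04 * 10**9)
hours_if_binary = hours_of_audio(1.04 * 2**30)

print(f"{hours_if_decimal:.1f} h (10^9 bytes), {hours_if_binary:.1f} h (2^30 bytes)")
```

Under these assumptions the two readings give about 36.1 and 38.8 hours, which bracket the stated total of 37 hours.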

The specifications of the telephone speech corpus are listed in the table below:

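Item                      Specification

Number of speakers        664

Speaker age range         14 to 68

Accent distribution       Speakers drawn from 28 provinces, municipalities, and autonomous regions (excluding Tibet and Taiwan)

Data collection           1. Local (Beijing) telephone data
                          2. Additional long-distance telephone data from Shanghai, Xiamen, Xi'an, Nanning, Guilin, Nanchang, Guangzhou, Jiangmen, Shijiazhuang, Shenyang, Fuzhou, Rizhao, and other places, 38 locations in total
                          For details, see Appendix 1

Recording content         10 isolated digits and 5 digit strings
                          10 most commonly used words
                          4 decimal number strings, read as currency amounts for practical relevance
                          6 time expressions (3 dates and 3 common times of day)
                          28 sentences per speaker, designed to cover all syllables and the junctures between syllables, with syllable balance taken into account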
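The design goal for the 28 sentences (covering all syllables and the junctures between adjacent syllables, with attention to balance) can be checked mechanically. The following is a minimal sketch of such a check, not the procedure used to build the corpus: the function name, the modeling of a juncture as an ordered pair of adjacent syllables, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def coverage_report(sentences, syllable_inventory):
    """Summarize syllable and juncture coverage of a prompt set.

    sentences: list of sentences, each a list of syllable strings.
    syllable_inventory: iterable of all syllables that should be covered.
    """
    # Per-syllable counts give a simple view of balance.
    syllable_counts = Counter(s for sent in sentences for s in sent)
    # A juncture is modeled here as an ordered pair of adjacent syllables.
    junctures = {(a, b) for sent in sentences for a, b in zip(sent, sent[1:])}
    # Syllables in the target inventory that no sentence contains.
    missing = set(syllable_inventory) - set(syllable_counts)
    return syllable_counts, junctures, missing


# Toy data, purely illustrative (not taken from the corpus).
inventory = {"ni", "hao", "ma", "shi"}
prompts = [["ni", "hao", "ma"], ["ni", "hao"]]

counts, junctures, missing = coverage_report(prompts, inventory)
print(sorted(missing))
print(counts.most_common(2))
```

A real check would use the full Mandarin syllable inventory and the actual prompt transcriptions; a large spread in the per-syllable counts would indicate an unbalanced prompt set.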