The resources to be formally submitted to CLDC should be organized according to the following directory framework:
The top directory should contain:
- Document directory
- Resource directory
- Other related directory
The details are as follows:
1. Document directory
The document directory should contain at least three files: the basic resource information, the labeling specification, and the technical document and user’s manual.
(1) The basic resource information should include:
1) Resource’s name
2) Publishing date (including version number)
3) Author (the main persons involved in designing and constructing the data, including name and email)*
4) Corporation (or individual)
5) Type (text, speech or video)
6) Source(s) (broadcast news, newspaper, telephone or spontaneous speech), and the year and the process in which the data was collected*
7) Language(s) (Mandarin, Cantonese, English, or Chinese and English)
8) Supporting project (e.g. 863, 973 or the Natural Science Fund). Please supply the project number, and give a brief introduction to the research aim and the relationship between the resource and the project.
9) Application(s) (e.g. multilingual information indexing, automatic abstracting, machine translation, speech analysis, speech recognition, speaker verification, speech synthesis and spoken dialog systems)
10) Description of the resource’s content
11) Description of the resource’s attributes
   1. For text: file format, encoding method, volume of available data (e.g. the number of chapters, sentences, words, syllables, phonemes and so on), size in bytes (KB, MB or GB)
   2. For speech (and video): file format, number of channels, sampling rate, sample format, duration, storage size (MB)
   - Detailed description of the directory structure
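The speech attributes requested in item 11 can be collected automatically rather than by hand. The following is a minimal sketch, assuming the speech files are uncompressed PCM WAV files readable by Python's standard `wave` module; the function name `describe_wav` and the keys of the returned dictionary are illustrative and are not prescribed by CLDC.

```python
import os
import wave


def describe_wav(path):
    """Collect the speech attributes of one WAV file (illustrative field names)."""
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()      # channel number
        rate = wav.getframerate()          # sampling rate in Hz
        sample_width = wav.getsampwidth()  # bytes per sample
        frames = wav.getnframes()          # number of audio frames
    return {
        "file_format": "WAV",
        "channels": channels,
        "sampling_rate_hz": rate,
        # Assumption: the wave module reads PCM data, so the format is labeled PCM.
        "sample_format": f"{8 * sample_width}-bit PCM",
        "duration_s": frames / rate,
        "size_mb": os.path.getsize(path) / (1024 * 1024),
    }
```

Calling `describe_wav` on each file and summing `duration_s` and `size_mb` gives the total time length and storage size for the whole resource.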
(2) The labeling specification should include:
1) Labeling method
2) Labeling format
3) Labeling examples
(3) The technical documents should include:
1) Design background, principles and algorithms
2) Factors considered
3) Related technical principles and application methods
2. Resource directory - must accord with the directory structure introduced in the basic resource information.
3. Other related directory - related tools, documents, special character sets, etc.
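Before submission, the layout described above can be checked with a short script. The following is only a sketch: the guideline names the role of each directory and file but not its spelling, so the names `doc`, `data`, `resource_info.txt`, `labeling_spec.txt` and `technical_doc.txt` are hypothetical placeholders. The sketch also assumes that the other related directory may be absent when a resource ships no tools or extra documents, so it is not checked.

```python
from pathlib import Path

# Hypothetical names: CLDC specifies what each item is, not what it is called.
REQUIRED_DIRS = ["doc", "data"]  # document directory, resource directory
REQUIRED_DOCS = [
    "resource_info.txt",   # basic resource information
    "labeling_spec.txt",   # labeling specification
    "technical_doc.txt",   # technical document and user's manual
]


def check_submission(root):
    """Return a list of problems found in the submission layout (empty if none)."""
    root = Path(root)
    problems = []
    for name in REQUIRED_DIRS:
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    for name in REQUIRED_DOCS:
        if not (root / "doc" / name).is_file():
            problems.append(f"missing document: doc/{name}")
    return problems
```

An empty list means that the two required directories and the three required documents are present; it does not check that the resource directory matches the structure described in the basic resource information.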