Author(s): Asunción Moreno, David Conejero, Gonzalo Bustamante
Institute: Universidad Politécnica de Cataluña
Address: Jordi Girona 1-3, Edificio D5, 08034 Barcelona, Spain
email: asuncion@gps.tsc.upc.edu
Date: December 23rd, 2006
Version: V1.4
CONTENTS
1. Introduction
1.1 Speech file formats
1.2 Directory structure
1.3 File nomenclature
1.4 Label files
2. Database design and collection
2.1 Recording platform
2.2 Speaker recruitment
2.3 Design of prompting and prompt-sheet
3. Database contents definition
3.1 Application words
3.1.1 Common application words 00-81
3.1.2 Language-dependent application words P1-2
3.2 Voice activation keywords A1-2
3.3 Isolated digits
3.3.1 Single digits I1-4
3.3.2 Digit string B1
3.4 Connected digits
3.4.1 Sheet number C1
3.4.2 Telephone number C2, C5-C7
3.4.3 Credit card number C3
3.4.4 PIN code C4
3.5 Dates D1-3
3.5.1 Spontaneous date
3.5.2 Prompted date
3.5.3 Relative and general date expression
3.6 Embedded application word phrases E1-2
3.7 Spelled names/words L1-7
3.7.1 Spontaneous name
3.7.2 Prompted name linked to city
3.7.3 Real names/words
3.7.4 Artificial name
3.8 Money amount M1
3.9 Natural number N1
3.10 Directory assistance names O1-7
3.10.1 Spontaneous forename
3.10.2 Spontaneous city name
3.10.3 City name (set of 150)
3.10.4 Company/agency name/street name (set of 150)
3.10.5 Forename & surname (set of 150)
3.11 Phonetically rich sentences S1-9
3.12 Times T1-2
3.12.1 Spontaneous time
3.12.2 Read time phrase
3.13 Phonetically rich words W1-4
3.14 Spontaneous sentences Z0-9
3.15 Any other additional material
3.16 Links to other databases
4. Transcription
5. The lexicon
6. Speaker demographic information
6.1 Accent/Regions
6.2 Speaker characteristics
7. Recording conditions
8. Deviations from SpeechDat Car specifications
9. Sample Prompt sheets
9.1 Sample instruction sheets and prompt sheet
10. Bibliography
1. Introduction
The Catalan Database for In-Car Applications was recorded within the scope of the "Generació de recursos lingüístics per les tecnologies de la parla" project, which was sponsored by the Catalan and Spanish Governments.
Collection was performed at the Department of Signal Theory and Communications of the Universitat Politècnica de Catalunya (UPC) (Spain) and annotation was performed at Verbio Technologies. The owner of the database is the Catalan Government.
This database comprises in-car recordings from 300 speakers recorded in 600 different sessions. The database follows the SpeechDat Car specifications (corpus content, speakers, transcription, lexicon, formats) and the Speecon specifications for the recording platform (speech signal formats and doc files). The database is distributed on 12 ISO 9660 DVD volumes and one CD-ROM. The CD-ROM holds the text files and documentation; the DVDs contain the in-car recordings. The content of each volume is described below. The table shows the disk identification name, the first and last codes of the sessions included on each disk, and the effective number of sessions.
Disk  | DISK_ID     | From            | To              | Ses | Contents
CD01  | VEHIC2CAD00 |                 |                 |     | Text and documentation
DVD00 | VEHIC2CA000 | BLOCK00/SES0000 | BLOCK00/SES0049 | 50  | Signals
DVD01 | VEHIC2CA001 | BLOCK00/SES0050 | BLOCK00/SES0099 | 50  | Signals
DVD02 | VEHIC2CA002 | BLOCK01/SES0100 | BLOCK01/SES0149 | 50  | Signals
DVD03 | VEHIC2CA003 | BLOCK01/SES0150 | BLOCK01/SES0199 | 50  | Signals
DVD04 | VEHIC2CA004 | BLOCK02/SES0200 | BLOCK02/SES0249 | 50  | Signals
DVD05 | VEHIC2CA005 | BLOCK02/SES0250 | BLOCK02/SES0299 | 50  | Signals
DVD06 | VEHIC2CA006 | BLOCK03/SES0300 | BLOCK03/SES0349 | 50  | Signals
DVD07 | VEHIC2CA007 | BLOCK03/SES0350 | BLOCK03/SES0399 | 50  | Signals
DVD08 | VEHIC2CA008 | BLOCK04/SES0400 | BLOCK04/SES0449 | 50  | Signals
DVD09 | VEHIC2CA009 | BLOCK04/SES0450 | BLOCK04/SES0499 | 50  | Signals
DVD10 | VEHIC2CA010 | BLOCK05/SES0500 | BLOCK05/SES0549 | 50  | Signals
DVD11 | VEHIC2CA011 | BLOCK05/SES0550 | BLOCK05/SES0599 | 50  | Signals
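The regular layout of the disk table can be expressed as a small lookup. The following Python sketch is illustrative only and is not part of the database distribution: the names `SessionLocation` and `locate_session` are hypothetical, and the code assumes that session codes run contiguously from 0000 to 0599, with 50 sessions per DVD and 100 sessions per block directory, as the table implies.

```python
from typing import NamedTuple


class SessionLocation(NamedTuple):
    """Where a session is stored (hypothetical helper type)."""
    disk: str      # disk label from the table, e.g. "DVD03"
    disk_id: str   # DISK_ID from the table, e.g. "VEHIC2CA003"
    path: str      # block/session path, e.g. "BLOCK01/SES0150"


SESSIONS_PER_DISK = 50    # "Ses" column of the table
SESSIONS_PER_BLOCK = 100  # two consecutive DVDs share one BLOCKnn directory
TOTAL_SESSIONS = 600      # session codes SES0000 to SES0599


def locate_session(session: int) -> SessionLocation:
    """Return the DVD and directory that hold a given session code."""
    if not 0 <= session < TOTAL_SESSIONS:
        raise ValueError(f"session code out of range: {session}")
    disk_index = session // SESSIONS_PER_DISK
    block_index = session // SESSIONS_PER_BLOCK
    return SessionLocation(
        disk=f"DVD{disk_index:02d}",
        disk_id=f"VEHIC2CA{disk_index:03d}",
        path=f"BLOCK{block_index:02d}/SES{session:04d}",
    )


print(locate_session(150))
```

For session 150 the sketch reports DVD03, which matches the row covering BLOCK01/SES0150 to BLOCK01/SES0199.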
The list of distribution disks and directories is contained in the README.TXT file. Further details regarding the database contents, files and directories are provided in the documentation files in the DOC directory and in the files in the TABLE and INDEX directories.
File types are identified with the following extensions:
*.DOC - Microsoft Word V6.0 document
*.LST - DOS text index file with ISO Latin 1 symbols
*.TBL - DOS text file with ISO Latin 1 symbols
*.SES - DOS text file
*.TXT - DOS text file
*.CAC - SAM label file, text file with ISO Latin 1 symbols for car recordings
*.CA1 - Speech signal channel 1
*.CA2 - Speech signal channel 2
*.CA3 - Speech signal channel 3
*.CA4 - Speech signal channel 4
*.PS - PostScript file
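For scripts that need to tell these file types apart, the extension list above can be kept as a simple lookup table. This is a minimal Python sketch; the names `FILE_TYPES` and `describe` are hypothetical, and the descriptions are copied from the list. Extensions that are not in the list return `None`.

```python
from pathlib import PurePath
from typing import Optional

# Extension -> description, transcribed from the list above.
FILE_TYPES = {
    ".DOC": "Microsoft Word V6.0 document",
    ".LST": "DOS text index file with ISO Latin 1 symbols",
    ".TBL": "DOS text file with ISO Latin 1 symbols",
    ".SES": "DOS text file",
    ".TXT": "DOS text file",
    ".CAC": "SAM label file, text file with ISO Latin 1 symbols for car recordings",
    ".CA1": "Speech signal channel 1",
    ".CA2": "Speech signal channel 2",
    ".CA3": "Speech signal channel 3",
    ".CA4": "Speech signal channel 4",
    ".PS": "PostScript file",
}


def describe(filename: str) -> Optional[str]:
    """Return the description for a file name, or None if the extension is unlisted."""
    # The lookup is case-insensitive: the suffix is upper-cased before matching.
    return FILE_TYPES.get(PurePath(filename).suffix.upper())


print(describe("LEXICON.TBL"))
```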
Each disk has the following directory structure:
\:
COPYRIGH.TXT - copyright notice
DISK.ID - UNIX volume ID file
README.TXT - readme file
VEHIC2CA\ - data directory
VEHIC2CA\DOC:
DESIGN.DOC - Catalan database documentation file
SUMMAR0.TXT - database contents summary file
SAMPALEX.PS - SAMPA table
ISO88591.PS - ISO 8859-1 table
VEHIC2CA\INDEX:
CONTENT0.LST - file/utterance/speaker index table
VEHIC2CA\TABLE:
LEXICON.TBL - full lexicon table
REC_COND.TBL - Recording condition table
SESSION.TBL - session table
SPEAKER.TBL - speaker table
VEHIC2CA\: - contains the data block directories
BLOCK00\ - sessions are grouped in blocks
BLOCK01\
...
VEHIC2CA\BLOCK00:
SES0002\ - session directories for each session
...
VEHIC2CA\BLOCK00\SES0002:
V2000206.CAC - SAM label file for car recordings
V2000206.CA0 - speech signal file in car
V2000206.CA1 - speech signal file in car
V2000206.CA2 - speech signal file in car
V2000206.CA3 - speech signal file in car
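The session directory layout above pairs one SAM label file with several speech signal files that share the same base name. As a rough illustration, the Python sketch below groups the files of one session directory by base name. The function name `group_session_files` is hypothetical, and the sketch assumes that signal files carry an extension of the form `.CA` followed by one digit, which covers the extensions that appear in this section.

```python
import re
from collections import defaultdict
from pathlib import Path

LABEL_EXT = ".CAC"                     # SAM label file
SIGNAL_EXT = re.compile(r"\.CA\d")     # assumed pattern for signal channels


def group_session_files(session_dir):
    """Group label and signal files in one session directory by base name."""
    groups = defaultdict(lambda: {"label": None, "signals": []})
    for path in sorted(Path(session_dir).iterdir()):
        if not path.is_file():
            continue
        suffix = path.suffix.upper()
        if suffix == LABEL_EXT:
            groups[path.stem]["label"] = path
        elif SIGNAL_EXT.fullmatch(suffix):
            groups[path.stem]["signals"].append(path)
        # Any other file type is ignored.
    return dict(groups)
```

Called on a mounted copy of a session directory such as `VEHIC2CA/BLOCK00/SES0002` (the mount point is up to the user), the function returns one entry per utterance, each holding the label file path and the list of signal file paths in sorted order.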