





Author(s): Asunción Moreno, David Conejero, Gonzalo Bustamante

Institute: Universidad Politécnica de Cataluña

Address: Jordi Girona 1-3, Edificio D5, 08034 Barcelona, Spain

Email: asuncion@gps.tsc.upc.edu

Date: December 23rd, 2006

Version: V1.4


CONTENTS


1. Introduction
1.1 Speech file formats
1.2 Directory structure
1.3 File nomenclature
1.4 Label files
2. Database design and collection
2.1 Recording platform
2.2 Speaker recruitment
2.3 Design of prompting and prompt-sheet
3. Database contents definition
3.1 Application words
3.1.1 Common application words 00-81
3.1.2 Language-dependent application words P1-2
3.2 Voice activation keywords A1-2
3.3 Isolated digits
3.3.1 Single digits I1-4
3.3.2 Digit string B1
3.4 Connected digits
3.4.1 Sheet number C1
3.4.2 Telephone number C2, C5-C7
3.4.3 Credit card number C3
3.4.4 PIN code C4
3.5 Dates D1-3
3.5.1 Spontaneous date
3.5.2 Prompted date
3.5.3 Relative and general date expression
3.6 Embedded application word phrases E1-2
3.7 Spelled names/words L1-7
3.7.1 Spontaneous name
3.7.2 Prompted name linked to city
3.7.3 Real names/words
3.7.4 Artificial name
3.8 Money amount M1
3.9 Natural number N1
3.10 Directory assistance names O1-7
3.10.1 Spontaneous forename
3.10.2 Spontaneous city name
3.10.3 City name (set of 150)
3.10.4 Company/agency name/street name (set of 150)
3.10.5 Forename & surname (set of 150)
3.11 Phonetically rich sentences S1-9
3.12 Times T1-2
3.12.1 Spontaneous time
3.12.2 Read time phrase
3.13 Phonetically rich words W1-4
3.14 Spontaneous sentences Z0-9
3.15 Any other additional material
3.16 Links to other databases
4. Transcription
5. The lexicon
6. Speaker demographic information
6.1 Accent/Regions
6.2 Speaker characteristics
7. Recording conditions
8. Deviations from SpeechDat Car specifications
9. Sample Prompt sheets
9.1 Sample instruction sheets and prompt sheet
10. Bibliography



1. Introduction

The Catalan Database for In-Car Applications was recorded within the scope of the "Generació de recursos lingüístics per les technologies de la parla" project, which was sponsored by the Catalan and Spanish Governments.


Collection was performed at the Department of Signal Theory and Communications of the Universitat Politècnica de Catalunya (UPC), Spain, and annotation was performed at Verbio Technologies. The owner of the database is the Catalan Government.

This database comprises in-car recordings from 300 speakers recorded in 600 different sessions. It follows the SpeechDat Car specifications (corpus content, speakers, transcription, lexicon, formats) and the Speecon specifications for the recording platform (speech signal formats and doc files).

The database is distributed on 12 ISO 9660 DVD volumes and one CD-ROM. The CD holds the text files and documentation; the DVDs contain the in-car recordings. The content of each volume is described below. The table shows the disk identification name, the first and last session codes included on each disk, and the effective number of sessions.



Disk    DISK_ID       From              To                Ses   Contents
CD01    VEHIC2CAD00                                             Text and documentation
DVD00   VEHIC2CA000   BLOCK00/SES0000   BLOCK00/SES0049   50    Signals
DVD01   VEHIC2CA001   BLOCK00/SES0050   BLOCK00/SES0099   50    Signals
DVD02   VEHIC2CA002   BLOCK01/SES0100   BLOCK01/SES0149   50    Signals
DVD03   VEHIC2CA003   BLOCK01/SES0150   BLOCK01/SES0199   50    Signals
DVD04   VEHIC2CA004   BLOCK02/SES0200   BLOCK02/SES0249   50    Signals
DVD05   VEHIC2CA005   BLOCK02/SES0250   BLOCK02/SES0299   50    Signals
DVD06   VEHIC2CA006   BLOCK03/SES0300   BLOCK03/SES0349   50    Signals
DVD07   VEHIC2CA007   BLOCK03/SES0350   BLOCK03/SES0399   50    Signals
DVD08   VEHIC2CA008   BLOCK04/SES0400   BLOCK04/SES0449   50    Signals
DVD09   VEHIC2CA009   BLOCK04/SES0450   BLOCK04/SES0499   50    Signals
DVD10   VEHIC2CA010   BLOCK05/SES0500   BLOCK05/SES0549   50    Signals
DVD11   VEHIC2CA011   BLOCK05/SES0550   BLOCK05/SES0599   50    Signals
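
The disk layout in the table is regular: each DVD holds 50 consecutive sessions and each BLOCK directory spans 100. As a minimal sketch, assuming that regularity holds for every session, the disk and directory of a session can be computed from its number. The function name `locate_session` and the constants are hypothetical and are not part of the database distribution.

```python
# Sketch only: derives disk and directory names from the pattern in the
# table above. locate_session and the constants are illustrative names.
SESSIONS_PER_DVD = 50     # each DVD row in the table covers 50 sessions
SESSIONS_PER_BLOCK = 100  # each BLOCKnn directory spans two DVDs
TOTAL_SESSIONS = 600      # SES0000 .. SES0599


def locate_session(session: int) -> dict:
    """Return the disk name, DISK_ID and relative path for a session number."""
    if not 0 <= session < TOTAL_SESSIONS:
        raise ValueError(f"session must be in 0..{TOTAL_SESSIONS - 1}")
    dvd = session // SESSIONS_PER_DVD
    block = session // SESSIONS_PER_BLOCK
    return {
        "disk": f"DVD{dvd:02d}",
        "disk_id": f"VEHIC2CA{dvd:03d}",
        "path": f"BLOCK{block:02d}/SES{session:04d}",
    }


if __name__ == "__main__":
    print(locate_session(250))
```

For session 250 this yields disk DVD05, DISK_ID VEHIC2CA005 and path BLOCK02/SES0250, which agrees with the corresponding row of the table.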

The list of distribution disks and directories is contained in the README.TXT file. Further details regarding the database contents, files and directories are provided in the documentation files in the DOC directory and in the files in the TABLE and INDEX directories.


File types are identified with the following extensions:

*.DOC - Microsoft Word V6.0 document
*.LST - DOS text index file with ISO Latin 1 symbols
*.TBL - DOS text file with ISO Latin 1 symbols
*.SES - DOS text file
*.TXT - DOS text file
*.CAC - SAM label file, text file with ISO Latin 1 symbols for car recordings
*.CA1 - Speech signal channel 1
*.CA2 - Speech signal channel 2
*.CA3 - Speech signal channel 3
*.CA4 - Speech signal channel 4
*.PS - PostScript file


Each CD-ROM has the following directory structure:

\:
    COPYRIGH.TXT - copyright notice
    DISK.ID - UNIX volume ID file
    README.TXT - readme file
    VEHIC2CA\ - data directory

VEHIC2CA\DOC:
    DESIGN.DOC - Catalan database documentation file
    SUMMAR0.TXT - database contents summary file
    SAMPALEX.PS - SAMPA table
    ISO88591.PS - ISO 8859_1 table

VEHIC2CA\INDEX:
    CONTENT0.LST - file/utterance/speaker index table

VEHIC2CA\TABLE:
    LEXICON.TBL - full lexicon table
    REC_COND.TBL - recording condition table
    SESSION.TBL - session table
    SPEAKER.TBL - speaker table

VEHIC2CA\: - contains the data block directories
    BLOCK00\ - sessions are grouped in blocks
    BLOCK01\
    ...

VEHIC2CA\BLOCK00:
    SES0002\ - session directories for each session
    ...

VEHIC2CA\BLOCK00\SES0002:
    V2000206.CAC - SAM label file for car recordings
    V2000206.CA0 - speech signal file in car
    V2000206.CA1 - speech signal file in car
    V2000206.CA2 - speech signal file in car
    V2000206.CA3 - speech signal file in car
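
The directory structure above lends itself to simple scripted access. The following is a minimal sketch, assuming a disk is mounted at some local path: it lists the label and signal files of one session. The function name `list_session_files` and the `root` argument are hypothetical. Because the extension list names .CA1 to .CA4 while the sample listing shows .CA0 to .CA3, the sketch treats any ".CA" plus one digit as a signal file rather than assuming either numbering.

```python
import re
from pathlib import Path

# Assumption: a signal file has the extension ".CA" followed by one digit.
SIGNAL_EXT = re.compile(r"^\.CA\d$")


def list_session_files(root, block: int, session: int) -> dict:
    """Group the files of one session directory into labels and signals.

    root is the (hypothetical) mount point that contains VEHIC2CA.
    """
    ses_dir = Path(root) / "VEHIC2CA" / f"BLOCK{block:02d}" / f"SES{session:04d}"
    labels, signals = [], []
    for path in sorted(ses_dir.iterdir()):
        # Upper-case the suffix because some ISO 9660 mounts show lower case.
        ext = path.suffix.upper()
        if ext == ".CAC":
            labels.append(path.name)   # SAM label file
        elif SIGNAL_EXT.match(ext):
            signals.append(path.name)  # speech signal channel
    return {"labels": labels, "signals": signals}
```

A call such as `list_session_files("/mnt/dvd", 0, 2)` would inspect VEHIC2CA/BLOCK00/SES0002 under that mount point; the mount path is only an example.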


