The C-ORAL-BRASIL corpus: methodological basis for the treatment of spontaneous speech

Maryualê M. Mittmann, Tommaso Raso, Heliana R. Mello

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

This paper highlights the primary methods employed in the C-ORAL-BRASIL compiling process, i.e., recording, transcribing and segmenting oral texts. The C-ORAL-BRASIL is a Brazilian Portuguese corpus of spontaneous speech, designed for the study of informational structure. It is representative of diaphasic variation, seeking to cover as many different communicative situations as possible. This paper presents and exemplifies the processes of transcription and segmentation of speech into prosodic units as employed in our ongoing research. It concludes with illustrations of some questions that the corpus will enable us to answer.
Original language: English
Title of host publication: STIL 2009 - 2009 7th Brazilian Symposium in Information and Human Language Technology
Publisher: IEEE Computer Society
Pages: 179-182
Number of pages: 4
ISBN (Print): 9780769539454
Publication status: Published - 2009
Externally published: Yes
Event: 7th Brazilian Symposium in Information and Human Language Technology, STIL 2009 - Sao Carlos, Sao Paulo, Brazil
Duration: 8 Sept 2009 to 11 Sept 2009

Publication series

Name: STIL 2009 - 2009 7th Brazilian Symposium in Information and Human Language Technology

Conference

Conference: 7th Brazilian Symposium in Information and Human Language Technology, STIL 2009
Country/Territory: Brazil
City: Sao Carlos, Sao Paulo
Period: 8/09/09 to 11/09/09
