Collecting and Sharing corpora for language and speech disorders
DELAD/CLARIN workshop at the ICL conference Poznan
11-12 September 2024
DELAD is an initiative that facilitates the sharing of corpora of speech of individuals with communication disorders (CSD) among researchers. We do this in a GDPR-compliant way, using secure repositories in the CLARIN infrastructure. See our website.
DELAD regularly organises workshops around the themes:
- Guidelines for collecting and sharing CSD
- Ethics and legal aspects
- Levels of anonymisation
- Layered access of data
- Integration of CSD in the CLARIN infrastructure
- Formats
- Relevant metadata
For themes and reports of our previous workshops, visit our website https://delad.ruhosting.nl/wordpress/delad-workshops-2017-2020/
We are organising our next workshop in conjunction with the ICL Conference in Poznan in 2024: https://icl2024poznan.pl/. This will be a hybrid workshop held on 11 and 12 September 2024 as a lunch-to-lunch meeting.
We invite researchers working with CSD to present their work, and address their data sharing methods including any obstacles encountered.
The programme includes presentations from DELAD representatives about sharing CSD via DELAD and recent updates, including a new CLARIN Resource Family page for corpora of communication disorders (see https://www.clarin.eu/resource-families for other examples of resource families). Further topics include the metadata considered relevant for making such datasets findable, and a panel discussion about the role that Large Language Models (such as ChatGPT) can play in our research.
The workshop is sponsored by CLARIN ERIC.
Provisional programme for 11 and 12 September 2024:
| Date | Time (CEST) | Topic | Agenda |
|---|---|---|---|
| 11 Sep | 14:30-14:45 | Welcome & Introduction | Introduction (Henk van den Heuvel) |
| | 14:45-15:30 | Recent developments at ACE / DELAD | A CLARIN Resource Family for Corpora of Communication Disorders & Questionnaire about data sharing (Henk van den Heuvel & Satu Saalasti) |
| | | COFFEE BREAK | |
| | 15:45-16:55 | Presentations by researchers about current status of their CSD & potential of ACE for CSD sharing | 20-minute presentations & 10-minute discussion: – “Challenges in data sharing from a clinical perspective: a use case of voice data from patients with COPD” (Loes van Bemmel) – “Using a portable system for multi-channel audio data acquisition and processing” (Anita Lorenc et al.) – “Corpus-based research into intra- and interpersonal language variation in people with aphasia” (Marina Ruiter et al.) |
| | | BREAK | |
| | 17:10-18:20 | Presentations by researchers about current status of their CSD & potential of ACE for CSD sharing | 20-minute presentations & 10-minute discussion: – “Dysarthric speech database in Dutch and English for personalized dysarthric speech recognition” (Zhengjun Yue et al.) – “The Icelandic Language Biobank: Data Collection through a Clinical Analysis Platform” (Iris Nowenstein et al.) – “STAR – A Speech Therapy Animation and imaging Resource” (Eleanor Lawson et al.) |
| 12 Sep | 10:00-10:15 | COFFEE BREAK at ICL | |
| | 10:15-10:30 | Welcome | Welcome & Wrap-up of Day 1 |
| | 10:30-11:00 | Presentations by researchers about current status of their CSD & potential of ACE for CSD sharing | 20-minute presentation & 10-minute discussion: – “Sensitive Data in HPC – How secure can it be?” (Matthiesen) |
| | 11:00-12:00 | The impact of AI on research and treatment of language & speech impairments | 30-minute Introduction (Speaker) & 30-minute Panel discussion |
| | 12:00-12:15 | Conclusion of Workshop | Wrap-up |
| | 12:15- | LUNCH | |
We expect both on-site ICL conference participants and online attendees.
Participation in the DELAD lunch-to-lunch workshop alone is free of charge for all registered participants. To take part in other ICL events, you must register for ICL and pay the respective fees. Read more about the ICL registration and fees: https://icl2024poznan.pl/?id=12