sharing corpora of disordered speech among researchers

The DELAD initiative

About us


DELAD stands for Database Enterprise for Language And speech Disorders, and "delad" is also the Swedish word for "shared". DELAD is an initiative to share corpora of speech of individuals with communication disorders (CSD) among researchers. We do this in a GDPR-compliant way, using secure repositories in the CLARIN infrastructure.

DELAD was initiated by Martin Ball from Bangor University. In October 2015 and June 2016, two workshops were held in which selected researchers and data curation specialists discussed the issues involved in setting up such an archive, how to link the initiative to existing resource infrastructures such as CLARIN, and how to find funding to implement the archive.

On April 29, 2022, Franciska de Jong, executive director of CLARIN ERIC, presented DELAD as one of three impactful CLARIN case studies at the ERIC Forum policy seminar on the socio-economic impact of ERICs, see https://

DELAD organises workshops addressing relevant themes such as:

  • Guidelines for collecting and sharing CSD
  • Ethics and legal aspects
  • Levels of anonymisation
  • Layered access of data
  • Integration of CSD in the CLARIN infrastructure
  • Formats
  • Relevant metadata

The DELAD community consists of researchers involved in collecting and analysing CSD, research data and infrastructure specialists, and legal experts. DELAD has chosen the CLARIN infrastructure as the primary space for storing and sharing CSD. More specifically, DELAD has linked up with CLARIN’s Knowledge Centre for Atypical Communication Expertise (ACE) to make CSD available through The Language Archive (TLA) at the Max Planck Institute in Nijmegen (a CLARIN Data Centre) and CMU’s TalkBank (Clinical Banks).

This website has a number of objectives:

  • to raise awareness of the DELAD enterprise
  • to bring interested researchers together to promote the initiative and to organise workshops
  • to make an inventory of relevant data and make these data findable through a dedicated web portal
  • to stimulate and facilitate the exchange of data


DELAD Workshops

Funded by CLARIN ERIC, the DELAD community organised a workshop about these issues in Cork, 15-17 Nov. 2017. A full report with video lectures can be found here.

DELAD Partners

DELAD partners inform each other about their activities in regular workshops and make their data accessible to others whenever possible.

The archive will consist of digital sound and video files representing samples of disordered speech in a variety of languages. These files will be accompanied by high-fidelity transcripts and acoustic analysis files and, where appropriate, by imaging files (such as ultrasound imaging). At all times, the importance of developing ethical guidelines and obtaining appropriate permissions in the collection of data will be stressed.

For researchers, the attraction of such data repositories is the ability to refine analysis methods and to formulate and test hypotheses about disordered speech without having to ‘reinvent the wheel’ in terms of primary data collection. For educators and students, an archive of high-quality speech data allows them to learn and practise the analysis of disordered speech and the application of diagnostic tools on a variety of cases. The ultimate goal of this research is the improvement of evidence-based therapy for developmental and acquired speech disorders, as well as the improvement of research opportunities.