TEXTCROWD

SD ID: TEXTCROWD - Collaborative semantic enrichment of text-based datasets

ORGANISATION: PIN srl

CONTACT: Franco Niccolucci

Email: franco.niccolucci(at)gmail.com

 

OVERVIEW: The Social Sciences and Humanities research communities face a fragmented research landscape, which EOSC can help to overcome. It would do so by building on structuring and integrating initiatives such as the CLARIN, DARIAH and E-RIHS ERICs and Digital Humanities organisations (e.g. their association, ADHO) to offer advanced text-based services that address common research needs (see the recent survey by PARTHENOS). One example is enabling the semantic enrichment of text sources through cooperative, supervised crowdsourcing based on shared semantics, and then making the results available to others via EOSC. This would benefit many scientists in the long tail, even though delivering such a service presents real challenges around interoperability and multilingualism.

CONTEXT: Cultural heritage and humanities datasets are largely based on texts:

  • Reports
      • Archaeology: excavations, surveys
      • Conservation: diagnosis, restoration – often mixed with numeric results
  • Grey literature
  • Literary/historical sources
  • Research articles
  • Monographs
     

Download TEXTCROWD by Kathrin Beck (MPCDF)