ProLingKNOWER
2nd International Workshop on PROfiling LINGuistic KNOWledgE gRaphs
co-located with LDK 2023
ABOUT
In the last decades, the number of Knowledge Graphs (KGs) published on the Web has increased substantially. The focus of this workshop is to reveal novel approaches, methodologies and frameworks for profiling Linguistic Linked Data (LLD) (corpora, lexicons, ontologies, etc.), as well as to highlight tools and user interfaces that can effectively assist different use cases for profiling such data. In addition, the workshop seeks methodologies that support effective profiling when building real-world Linked Data applications that leverage linguistic data, as well as use cases that reveal success stories or aspects that have been neglected so far. Addressing Linguistic Linked Data profiling issues will not only help in understanding and exploring such data, but also provide the means to increase Linguistic Linked Data consumption and to keep track of the evolution of the relevant datasets.
Despite the high number of datasets published as LLD, they remain underused because they lack comprehensive metadata. Data consumers need concise information about a dataset to decide whether it is useful for their use case. Data profiling techniques offer an efficient solution to this problem: they generate a semantic profile that contains metadata and statistics describing the content of the dataset. Semantic profiles are important for different use cases, such as: (1) provision of a general overview of the data, (2) ontology / dataset integration, (3) identification of quality issues, (4) query optimization, (5) data visualization, (6) data analytics tasks, (7) schema discovery, and (8) entity summarization.
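As a rough illustration of what such a semantic profile might contain, the following minimal sketch counts class and property usage over a handful of triples. It is not tied to any particular profiling tool: the function name build_profile, the ex: resources and the sample data are assumptions made for this example, and the triples are plain prefixed strings rather than parsed RDF.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # prefixed name used as a plain string in this sketch


def build_profile(triples):
    """Return simple statistics for an iterable of (subject, predicate, object) triples.

    Hypothetical helper for illustration only, not an existing library API.
    """
    triples = list(triples)
    # How many instances are declared for each class.
    class_counts = Counter(o for _, p, o in triples if p == RDF_TYPE)
    # How often each property other than rdf:type is used.
    property_counts = Counter(p for _, p, _ in triples if p != RDF_TYPE)
    return {
        "triples": len(triples),
        "distinct_subjects": len({s for s, _, _ in triples}),
        "classes": dict(class_counts),
        "properties": dict(property_counts),
    }


# Invented sample data loosely modelled on a lexicon.
sample = [
    ("ex:cat_n", "rdf:type", "ontolex:LexicalEntry"),
    ("ex:cat_n", "ontolex:canonicalForm", "ex:cat_form"),
    ("ex:cat_form", "rdf:type", "ontolex:Form"),
    ("ex:cat_form", "ontolex:writtenRep", '"cat"@en'),
    ("ex:dog_n", "rdf:type", "ontolex:LexicalEntry"),
]

profile = build_profile(sample)
print(profile)
```

A profile of this kind gives a data consumer a quick overview (use case 1) and can hint at quality issues (use case 3), for example a lexical entry that has no canonical form.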
Besides academia, the workshop targets developers and other knowledge workers. We envision the workshop as a forum for researchers and practitioners to come together and discuss common challenges and identify synergies for joint initiatives. We welcome contributions describing technical approaches, as well as those related to real use cases in using semantic profiles.
To ensure the quality of accepted papers, all submissions undergo peer review. Each submission will be reviewed by at least 2 members of the PC. Papers will be evaluated according to their significance, originality, technical content, style, clarity, and relevance to the workshop.
The workshop seeks application-oriented papers, as well as more theoretical papers and position papers. It proposes a multidisciplinary discussion on the following themes, with a focus on RDF data. Topics include, but are not limited to:
Linguistic dataset profiling tools and algorithms
Linguistic data summarisation
Ontology and data quality evaluation for linguistic data
Fusing and refining linguistic profiling results
Scalable approaches for linguistic profile generation
SHACL shapes as a means for profiling
Topic profiling for linguistic data
Semantic profiles representation of linguistic data
PROGRAM
09:00 - 09:10 Opening
09:10 - 10:00 Keynote by Marieke van Erp on Contextual Profiling of Linguistic Datasets
10:00 - 10:30 RDF Shapes ecosystem: tooling and uses: Daniel Fernandez
10:30 - 11:00 Profiling Linguistic Knowledge Graphs: Blerina Spahiu, Renzo Alva Principe and Andrea Maurino
11:00 - 11:30 Coffee Break
11:30 - 12:00 Pruning and re-ranking the frequent patterns in knowledge graph profiling using machine learning: Gollam Rabby, Farhana Keya, Vojtěch Svátek and Blerina Spahiu
12:00 - 13:00 Discussion & Closing
13:00 - 14:00 Lunch Break
We welcome the following types of contributions:
Short (up to 5 pages) and full (up to 10 pages) research papers
Industry and use case presentations (up to 5 pages)
Tool and system demonstrations (up to 4 pages)
Position papers (up to 4 pages)
All submission lengths are given including references. Accepted submissions will be published in an open-access conference proceedings volume, free of charge for authors. The ACL templates should therefore be used for all conference submissions.
Papers must be submitted through EasyChair: https://easychair.org/conferences/?conf=prolingknower2023
DATES
NEWS
The keynote speaker for ProLingKNOWER 2023 is Marieke van Erp!
We're thrilled to announce that Marieke van Erp will be our keynote speaker for the ProLingKNOWER workshop! As an esteemed expert in her field, she'll bring invaluable insights into cutting-edge research and industry trends regarding language and Semantic Web technologies.
Title of Marieke's talk: Contextual Profiling of Linguistic Datasets
Improving the metadata of datasets has received more attention in recent years with initiatives such as Datasheets for Datasets and DCAT. These initiatives mostly focus on the form and creation process of the data and, to a certain extent, the topics and themes. In this talk, I will make a case for contextual profiling of datasets, as the context in which a dataset was conceived and/or used can have far-reaching implications for its interpretation. Through examples from the humanities domain, I will show how the meaning of terms is affected by situational factors and how we can describe such contexts to prevent misinterpretations when the dataset is used outside its original frame of reference.
The guest speaker for ProLingKNOWER 2023 is Daniel Fernandez!
We're thrilled to announce that Daniel Fernandez will be our guest speaker for the ProLingKNOWER workshop! He has a PhD in Computer Science and is an Associate Professor at the University of Oviedo, Spain. He specializes in RDF shapes and, specifically, the automatic extraction of RDF shapes from knowledge graphs/natural language.
Title of Daniel's talk: RDF Shapes ecosystem: tooling and uses
In this talk, we will describe the purpose and potential uses of RDF shapes (SHACL and ShEx). We will start by briefly introducing the concept of a shape and discussing some differences between shapes and other technologies used to validate or describe RDF data. Then, we will give an overview of tools that allow users to perform common tasks with RDF shapes, such as editing and validation. Finally, as hand-crafting shapes is costly, we will describe techniques and tools for automatically extracting or inferring shapes from existing RDF content.
Acknowledgements for the abstract:
The research work presented in this talk was partially funded by the Spanish Ministry of Economy and Industry, project ID MCI-21-PID2020-117912RB-C21.
COMMITTEES
ORGANISING COMMITTEE
PROGRAM COMMITTEE (TBC):
Albin Ahmeti - Semantic Web Company, Austria
Alfonso Guarino - Università degli Studi di Foggia, Italy
Anisa Rula - University of Brescia, Italy
Andrea Maurino - Università degli Studi di Milano - Bicocca, Italy
Beyza Yaman - ADAPT Centre, Dublin City University, Ireland
Daniele Schicchi - Università di Palermo, Italy
Daniele Spaladori - Istituto di Sistemi e Tecnologie Industriali Intelligenti per il Manifatturiero Avanzato (STIIMA) – CNR, Italy
Dimitrios Skoutas - Information Management Systems Institute/Athena RC, Greece
Gabriella Casalino - University of Bari "A.Moro", Italy
Jakub Klímek - Charles University, Prague, Czech Republic
Jeremy Debattista - TopQuadrant, Malta
Jose Emilio Labra Gayo - University of Oviedo, Spain
Luigi Asprino - University of Bologna, Italy
Manuel Vimercati - Università degli Studi di Milano - Bicocca, Italy
Marco Cremaschi - Università degli Studi di Milano - Bicocca, Italy
Riccardo Albertoni - Institute for Applied Mathematics and Information Technologies, Consiglio Nazionale delle Ricerche, Genoa, Italy
Theodore Dalamagas - Information Management Systems Institute/Athena RC, Greece
Thierry Declerck - DFKI GmbH, Saarbrücken, Germany
Włodzimierz Lewoniewski - Poznań University of Economics and Business, Poland
HISTORY
The first edition of the ProLingKNOWER workshop was held on 23rd May 2022 in Jerusalem, Israel.