Search results

Showing results 1 to 2 of 2.

Mining Social Science Publications for Survey Variables

Author: Zielinski, Andrea; Mutschke, Peter

Published: 2018

Publisher: MISC

Full text:	https://www.ssoar.info/ssoar/handle/document/57722 http://www.aclweb.org/anthology/W17-2907
Citable link:	http://nbn-resolving.org/urn:nbn:de:0168-ssoar-57722-7

Research in Social Science is usually based on survey data where individual research questions relate to observable concepts (variables). However, due to a lack of standards for data citations a reliable identification of the variables used is often difficult. In this paper, we present a work-in-progress study that seeks to provide a solution to the variable detection task based on supervised machine learning algorithms, using a linguistic analysis pipeline to extract a rich feature set, including terminological concepts and similarity metric scores. Further, we present preliminary results on a small dataset that has been specifically designed for this task, yielding modest improvements over the baseline.

Source:	BASE Fachausschnitt AVL
Language:	Undetermined
Media type:	Conference publication
Format:	Online
Parent title:	Proceedings of the Second Workshop on NLP and Computational Social Science ; 47-52
DDC classification:	Literature and rhetoric (800); News media, journalism, publishing (070)
Keywords:	Literature; rhetoric and criticism; News media; journalism; publishing; OpenMinTed; Information Science; Science of Literature; Linguistics; publication; technical literature; artificial intelligence; computational linguistics; survey; social science; concept; algorithm; periodical; construction of indicators; data capture
License:	Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 ; info:eu-repo/semantics/openAccess

Towards a Gold Standard Corpus for Variable Detection and Linking in Social Science Publications

Author: Zielinski, Andrea; Mutschke, Peter

Published: 2018

Publisher: DEU

Full text:	https://www.ssoar.info/ssoar/handle/document/57723
Citable link:	http://nbn-resolving.org/urn:nbn:de:0168-ssoar-57723-2

In this paper, we describe our effort to create a new corpus for the evaluation of detecting and linking so-called survey variables in social science publications (e.g., "Do you believe in Heaven?"). The task is to recognize survey variable mentions in a given text, disambiguate them, and link them to the corresponding variable within a knowledge base. Since there are generally hundreds of candidates to link to and due to the wide variety of forms they can take, this is a challenging task within NLP. The contribution of our work is the first gold standard corpus for the variable detection and linking task. We describe the annotation guidelines and the annotation process. The produced corpus is multilingual - German and English - and includes manually curated word and phrase alignments. Moreover, it includes text samples that could not be assigned to any variables, denoted as negative examples. Based on the new dataset, we conduct an evaluation of several state-of-the-art text classification and textual similarity methods. The annotated corpus is made available along with an open-source baseline system for variable mention identification and linking.

Source:	BASE Fachausschnitt AVL
Language:	Undetermined
Media type:	Conference publication
Format:	Online
Parent title:	Proceedings of the 11th International Conference on Language Resources and Evaluation (LREC) ; International Conference on Language Resources and Evaluation (LREC) ; 11
DDC classification:	Literature and rhetoric (800); News media, journalism, publishing (070)
Keywords:	News media; journalism; publishing; Literature; rhetoric and criticism; text mining; semantic textual similarity; paraphrase detection; linking; Information Science; Science of Literature; Linguistics; social science; publication; data; algorithm; computational linguistics
License:	Creative Commons - Attribution-Noncommercial-No Derivative Works 4.0 ; info:eu-repo/semantics/openAccess
