
Results for *

Displaying results 1 to 2 of 2.

  1. Humanities Data in R
    Exploring Networks, Geospatial Data, Images, and Text
    Published: 2024
    Publisher:  Springer International Publishing, Imprint: Springer, Cham

    Summary: This book teaches readers to integrate data analysis techniques into humanities research practices using the R programming language. Methods for general-purpose visualization and analysis are introduced first, followed by domain-specific techniques for working with networks, text, geospatial data, temporal data, and images. The book is designed to be a bridge between quantitative and qualitative methods, individual and collaborative work, and the humanities and social sciences. The second edition of the text is a significant revision, with almost every aspect of the text rewritten in some way. The most notable difference is the incorporation of new R packages such as ggplot2 and dplyr that center broad data-science concepts. This 2nd edition of Humanities Data in R does not presuppose background programming experience. Early chapters take readers from R set-up to exploratory data analysis, with one chapter dedicated to each stage of the data-science pipeline (data collection, visualization, manipulation, and relational joins). Following this, text analysis, networks, temporal data, geospatial data, and image analysis each have a dedicated chapter. These are grounded in examples to move readers beyond the intimidation of adding new tools to their research. The final section of the book extends the core material with additional computer science techniques for processing large datasets. Everything is hands-on: image analysis is explained using digitized photographs from the 1930s, and networks are applied to page links on Wikipedia. After working through these examples with the provided data, code and book website, readers are prepared to apply new methods to their own work. The open source R programming language, with its myriad packages and popularity within the sciences and social sciences, is particularly well-suited to working with humanities data. R packages are also highlighted in an appendix.
The methodology will have wide application in classrooms and self-study for the humanities, but also for use in linguistics, anthropology, and political science. Outside the classroom, this intersection of humanities and computing is particularly relevant for research and new modes of dissemination across archives, museums and libraries.

     

    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9783031625664
    Edition: 2nd ed. 2024
    Series: Quantitative Methods in the Humanities and Social Sciences
    Other subjects: (lcsh)Mathematical statistics--Data processing.; (lcsh)Digital humanities.; (lcsh)Sociology--Methodology.; (lcsh)Computational linguistics.; (lcsh)Anthropology.; Statistics and Computing.; Digital Humanities.; Sociological Methods.; Computational Linguistics.; Anthropology.
    Scope: Online resource, XIV, 284 p., 80 illus., 50 illus. in color.
    Notes:

    - Part I Core -- Working with Data in R -- EDA I: Grammar of Graphics -- EDA II: Organizing Data -- EDA III: Restructuring Data -- Collecting Data -- Part II Data Types -- Textual Data -- Network Data -- Temporal Data -- Spatial Data -- Image Data -- Part III Additional Methods -- Programming in R -- Data Formats

  2. Linguistic Resources for Natural Language Processing
    On the Necessity of Using Linguistic Methods to Develop NLP Software
    Contributor: Silberztein, Max (editor)
    Published: 2024
    Publisher:  Springer Nature Switzerland, Imprint: Springer, Cham

    Summary: Empirical methods (data-driven, neural network-based, probabilistic, and statistical) seem to be the modern trend. Recently, OpenAI’s ChatGPT, Google’s Bard and Microsoft’s Sydney chatbots have been garnering a lot of attention for their detailed answers across many knowledge domains. In consequence, most AI researchers are no longer interested in trying to understand what common intelligence is or how intelligent agents construct scenarios to solve various problems. Instead, they now develop systems that extract solutions from massive databases used as cheat sheets. In the same manner, Natural Language Processing (NLP) software that uses training corpora associated with empirical methods is trendy, as most researchers in NLP today use large training corpora, always to the detriment of the development of formalized dictionaries and grammars. Not questioning the intrinsic value of many software applications based on empirical methods, this volume aims at rehabilitating the linguistic approach to NLP. In an introduction, the editor uncovers several limitations and flaws of using training corpora to develop NLP applications, even the simplest ones, such as automatic taggers. The first part of the volume is dedicated to showing how carefully handcrafted linguistic resources could be successfully used to enhance current NLP software applications. The second part presents two representative cases where data-driven approaches cannot be implemented simply because there is not enough data available for low-resource languages. The third part addresses the problem of how to treat multiword units in NLP software, which is arguably the weakest point of NLP applications today but has a simple and elegant linguistic solution.
It is the editor's belief that readers interested in Natural Language Processing will appreciate the importance of this volume, both for its questioning of the training corpus-based approaches and for the intrinsic value of the linguistic formalization and the underlying methodology presented.

     

    Source: Union catalogues
    Language: English
    Media type: Ebook
    Format: Online
    ISBN: 9783031438110
    Edition: 1st ed. 2024
    Other subjects: (lcsh)Natural language processing (Computer science).; (lcsh)Computational linguistics.; (lcsh)Artificial intelligence.; (lcsh)Digital humanities.; Natural Language Processing (NLP).; Computational Linguistics.; Artificial Intelligence.; Digital Humanities.
    Scope: Online resource, XXII, 217 p., 118 illus., 101 illus. in color.
    Notes:

    In honor of Peter -- Foreword -- Preface -- About this book -- Part 1. Introduction -- 1. The Limitations of Corpus-based Methods in NLP -- Part 2 -- 2. Developing Linguistic-based NLP Software -- 3. Linguistic Resources for the Automatic Generation of Texts in Natural Language -- 4. Towards a More Efficient Arabic-French Translation -- 5. Linguistic Resources and Methods and Algorithms for Belarusian Natural Language Processing -- Part 3. Linguistic Resources for Low-resource Languages -- 6. A New Set of Linguistic Resources for Ukrainian -- 7. Formalization of the Quechua Morphology -- 8. The Challenging Task of Translating the Language of Tango -- 9. A Polylectal Linguistic Resource for Rromani -- Part 4. Processing Multiword Units: The Linguistic Approach -- 10. Using Linguistic Criteria to Define Multiword Units -- 11. A Linguistic Approach to English Phrasal Verbs -- 12. Analysis of Indonesian Multiword Expressions: Linguistic vs Data-driven Approach