Workshop Data Scopes for Developers - Transparent Data Research for the Humanities
Data Scopes is a concept for dealing with complex data in a humanities research context. With this concept we want to contribute to methodological reflection on the processes of preparing and analyzing humanities research data, and on how these processes shape and contribute to interpretation.
In this workshop we focus on the activities involved in transforming humanities materials so that they can be used to address research questions. These activities require knowledge and interpretation of the data, and they in turn affect further interpretation during analysis and synthesis.
The workshop takes one and a half days and consists of three half-day sessions. During the workshop participants work in small groups on assignments based on sample datasets and realistic research questions. The following topics will be discussed:
- Combining data from different sources: e.g. book reviews from different book review websites, to study the reception of literature.
- Normalizing variation: conflating variations of names and keywords for linking across records and datasets.
- Reducing complexity: classification and dealing with skewed distributions.
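As an illustration of the normalization topic above, the following is a minimal sketch of conflating name variants so that records can be linked. The function names `normalize_name` and `group_variants` and the normalization rules themselves (dropping accents, punctuation and case) are assumptions made for this example, not the method used in the workshop assignments.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name: str) -> str:
    """Reduce a name to a crude comparison key (hypothetical rules):
    strip accents, replace punctuation by spaces, lower-case,
    and collapse runs of whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    kept = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in without_accents
    )
    return " ".join(kept.lower().split())


def group_variants(names):
    """Group the original spellings under their normalized key,
    so the effect of the normalization stays inspectable."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize_name(name)].append(name)
    return dict(groups)
```

Calling `group_variants` on a list of author names returns a dictionary from each key to the original spellings it conflates. Note that this sketch does not handle word order, so "Surname, Firstname" and "Firstname Surname" remain separate groups. Whether to merge them is exactly the kind of interpretive decision that should be documented.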
In the process of scoping data to fit a research question, we discern five distinct activities: modelling, selecting, normalizing, linking and classifying. The goal is to arrive at a good way of documenting the data transformation and research process in terms of these activities, so that others can understand what happened to the data.
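One possible way to document transformations in terms of these five activities is to keep a simple log alongside the code. The sketch below is an assumption about what such documentation could look like; the class `ScopeLog` and its fields are hypothetical and not part of any existing Data Scopes tooling.

```python
from dataclasses import dataclass, field

# The five activities named in the text.
ACTIVITIES = {"modelling", "selecting", "normalizing", "linking", "classifying"}


@dataclass
class ScopeLog:
    """Hypothetical record of the transformation steps applied to a dataset."""
    steps: list = field(default_factory=list)

    def record(self, activity: str, description: str,
               records_in: int, records_out: int) -> None:
        """Add one step; reject activities outside the five categories."""
        if activity not in ACTIVITIES:
            raise ValueError(f"unknown activity: {activity!r}")
        self.steps.append({
            "activity": activity,
            "description": description,
            "records_in": records_in,
            "records_out": records_out,
        })

    def report(self) -> str:
        """Return a numbered, human-readable summary of all steps."""
        return "\n".join(
            f"{i}. [{s['activity']}] {s['description']} "
            f"({s['records_in']} -> {s['records_out']} records)"
            for i, s in enumerate(self.steps, start=1)
        )
```

A developer would call `record` after each transformation step and share the output of `report` with the researcher. Recording the number of records before and after each step makes it visible where data was dropped or merged.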
Goals
The purpose is to make developers aware of the different aspects of using large-scale data in research practice. In the workshop we discuss the relation between research questions and the selection and transformation of data in order to answer these questions. While this may seem a research issue at first, it is typically shaped in the cooperation between researchers, developers and data curators. While the researchers are primarily responsible for their data, they often do not have enough insight into the chain of data transformations in which developers play a part. But are developers aware of the effects of their efforts on research data?
The goal of the workshop is to raise this awareness and explore the relation between the development process and research methodology.
Programme
Wednesday 5 September:
- 10:00-10:30 Workshop introduction
- 10:30-11:00 Data Scopes concept (slides)
- 11:00-13:00 Hands-on session 1: Creating Data Axes for Online Book Review Data
- 13:00-14:00 Lunch
- 14:00-17:00 Hands-on session 2: Combining Heterogeneous Review Datasets
Thursday 6 September:
- 10:00-12:00 Hands-on session 3: Sharing Data Scopes
- 12:00-13:00 Methodology discussion and wrap-up
Instructors
- Rik Hoekstra - KNAW Humanities Cluster - Research and Development
- Marijn Koolen - KNAW Humanities Cluster - Research and Development