Step 1: Create a Data Scope
- Finish the mapping of the crawled dataset to the ODBR data model,
- Export the crawled dataset to CSV,
- Export the Open Refine steps to a JSON file,
- Create a file that documents the most important choices, decisions and interpretations.
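The exported JSON file can serve as a starting point for the documentation file. The following is a minimal sketch, assuming the export is a JSON list in which each operation has an "op" and a "description" field. The function name summarize_operations and the example history are hypothetical and only illustrate the idea.

```python
import json


def summarize_operations(history_json):
    """Return one readable line per operation in an exported history.

    Assumes the export is a JSON list of objects that each carry an
    "op" and a "description" field; missing fields get a placeholder.
    """
    operations = json.loads(history_json)
    lines = []
    for number, operation in enumerate(operations, start=1):
        op = operation.get("op", "unknown operation")
        description = operation.get("description", "(no description)")
        lines.append(f"{number}. {description} [{op}]")
    return lines


# Hypothetical example history, not taken from a real project.
example_history = json.dumps([
    {"op": "core/column-rename", "description": "Rename column naam to name"},
    {"op": "core/row-removal", "description": "Remove rows"},
])

for line in summarize_operations(example_history):
    print(line)
```

A list like this only records what was done. The reasons behind each choice still have to be written down by hand.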
Step 2: Share your Data Scope
Go to the SurfDrive folder and add the three documents (CSV, JSON and documentation) to the folder we made for your group.
Once all groups have done this, get the documents from the group next to you and try to merge their edited version of the crawled data into your version of the ODBR dataset by applying the same steps.
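One way to approach the merge outside the spreadsheet interface is sketched below. It assumes both groups exported CSV files with identical column names in the same order. The column names (title, year), the source_group column and the function names are hypothetical and do not come from the ODBR data model.

```python
import csv
import io


def read_rows(csv_text, source_label):
    """Parse CSV text and tag every row with the group it came from."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])
    rows = []
    for row in reader:
        row["source_group"] = source_label  # hypothetical provenance column
        rows.append(row)
    return columns, rows


def merge_exports(own_csv, other_csv):
    """Concatenate two exports, refusing to merge if the columns differ."""
    own_columns, own_rows = read_rows(own_csv, "own group")
    other_columns, other_rows = read_rows(other_csv, "other group")
    if own_columns != other_columns:
        raise ValueError(f"Column mismatch: {own_columns} vs {other_columns}")
    return own_rows + other_rows


# Hypothetical example data.
own_csv = "title,year\nReview A,1850\n"
other_csv = "title,year\nReview B,1862\n"
merged = merge_exports(own_csv, other_csv)
for row in merged:
    print(row)
```

A column mismatch raised here is itself informative: it points to a place where the two groups interpreted the mapping differently.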
Look at their documentation and consider the following questions:
- What is most useful in the documentation?
- Is anything not useful or unclear in the documentation?
- Is anything missing in the documentation?
- Are there differences between what you documented for your data scope and what the other group did?
Back to the research question:
- What is the relation between the data scope and the research question? Was the research question clear?
- What did you focus on in doing the data processing? Is anything pushed to the background?
- How have interpretation, choices and decisions changed the data?
- Do these transformations create the impression that there’s more structure than there actually is/was? Is that a problem? Whose interpretations are these?
Methodology discussion
A general question:
- Whose task is this?
Reflection on the process:
- Open Refine as interface for communicating
- Adding columns with source of information
- Creating a separate dataset of records that were removed
- Right level of communication between scholars and developers
- Spreadsheet interface gives quick visual feedback
- Spreadsheet operations are conceptually close to what some scholars do
- Spreadsheet operations are conceptually close to what programmers do
- Useful for which parts of research? E.g. exploration, analysis, presentation?
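The practice of keeping removed records in a separate dataset, mentioned in the list above, can be sketched as follows. This is a minimal illustration, not a prescribed method: the function names, the removal_reason column and the example records are all hypothetical.

```python
def split_records(records, removal_reason):
    """Split records into kept and removed lists.

    removal_reason is a function that returns None for records to keep,
    or a short text explaining why the record is removed.
    """
    kept, removed = [], []
    for record in records:
        reason = removal_reason(record)
        if reason is None:
            kept.append(record)
        else:
            # Store the reason alongside the record so the decision stays visible.
            removed.append({**record, "removal_reason": reason})
    return kept, removed


def missing_year(record):
    """Hypothetical rule: remove records without a year."""
    return "missing year" if not record.get("year") else None


# Hypothetical example records.
records = [
    {"title": "Review A", "year": "1850"},
    {"title": "Review B", "year": ""},
]
kept, removed = split_records(records, missing_year)
print(kept)
print(removed)
```

Writing the removed list to its own file keeps the selection reversible and lets others question the removal rule.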