what are the principles of open data sharing Flashcards
what and where to share, keeping data anonymous, sharing practices
1
Q
what are the reasons to make data open
A
- must be open if publicly funded
- facilitate data aggregation
- permit creative re-analysis
- assist with later scientific developments
- signals commitment to transparency
- public trust in scientists is heightened when data are openly available
2
Q
what and where should data be shared
A
- data underlying the reported statistics = personal, lab and project-specific websites
- data with personal identifiers = institutional repositories
- materials such as behavioural tasks, test instruments, code books etc. = open science repositories
- data repositories
- supplemental materials attached to an article and stored on the publisher's website
3
Q
what are the FAIR principles for data sharing
A
- findable
- accessible
- interoperable
- reusable
4
Q
what is databrary
A
- find clips
- share data
- archive data
- use data
5
Q
what is a DOI
A
- digital object identifier: a persistent digital ID for identifying and trading intellectual property
- Digital data preservation (manage and retain data)
- digital curation (management and preservation of digital data over the long-term)
6
Q
what is GDPR
A
- the EU General Data Protection Regulation
- safeguards the privacy of participants
- special category data
- personal data can be shared on the basis of public or legitimate interest, or on the basis of written consent from the participant
7
Q
what is anonymised data
A
- used when it is not feasible to provide all raw data for ethical, legal or other reasons
- some data is likely to be better than none at all
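A minimal sketch of one first step towards a shareable file: dropping direct identifiers and re-coding participant IDs. The column names, the `strip_identifiers` helper and the example records are all hypothetical, and removing direct identifiers alone does not guarantee that data count as anonymous (indirect identifiers can still single people out).

```python
import secrets

# Hypothetical names of columns that directly identify a person.
DIRECT_IDENTIFIERS = {"name", "email", "date_of_birth"}

def strip_identifiers(records, id_field="participant_id"):
    """Return copies of the records without direct identifiers and with re-coded IDs."""
    codes = {}      # original ID -> random code, so repeated IDs stay linked
    cleaned = []
    for rec in records:
        original_id = rec[id_field]
        if original_id not in codes:
            codes[original_id] = "P" + secrets.token_hex(4)
        row = {k: v for k, v in rec.items() if k not in DIRECT_IDENTIFIERS}
        row[id_field] = codes[original_id]
        cleaned.append(row)
    return cleaned

# Made-up example records (not real participants).
raw = [
    {"participant_id": "anna.k", "name": "Anna K", "email": "anna@example.org", "score": 17},
    {"participant_id": "ben.t", "name": "Ben T", "email": "ben@example.org", "score": 22},
    {"participant_id": "anna.k", "name": "Anna K", "email": "anna@example.org", "score": 19},
]
shareable = strip_identifiers(raw)
```

The mapping in `codes` is discarded when the function returns, so the shared file cannot be linked back to the original IDs through it.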
8
Q
what are synthetic datasets
A
- mimic organic datasets by retaining statistical properties and relationships between variables
- no record in synthetic database represents a real individual
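As a rough illustration of the idea, the sketch below fits a bivariate normal distribution to two variables and samples new rows from it, so the means, spreads and correlation are approximately retained. The data, variable names and the `synthesize` helper are hypothetical, and the normal model is an assumption made for simplicity; dedicated synthetic-data tools use richer models.

```python
import math
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation, computed by hand from sample statistics."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))

def synthesize(xs, ys, n, seed=0):
    """Draw n synthetic (x, y) pairs from a bivariate normal fitted to xs and ys."""
    rng = random.Random(seed)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    r = pearson(xs, ys)
    resid = math.sqrt(max(0.0, 1.0 - r * r))  # guard against rounding above 1
    rows = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (r * z1 + resid * z2)   # induces correlation r with x
        rows.append((x, y))
    return rows

# Made-up "organic" data: age and a test score for eight participants.
real_age = [21, 25, 30, 34, 41, 47, 52, 58]
real_score = [12, 15, 14, 19, 22, 21, 27, 30]

synthetic = synthesize(real_age, real_score, n=5000)
syn_age = [a for a, _ in synthetic]
syn_score = [s for _, s in synthetic]
```

Every synthetic row is a fresh random draw rather than a copy of a participant, while `pearson(syn_age, syn_score)` stays close to `pearson(real_age, real_score)`.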
9
Q
what is participant consent
A
- level of data sharing (kinds of consent for certain information shared)
- data are treated as confidential, but the consent form doesn’t promise to destroy the data or to keep it within the research team
10
Q
what is the data sharing workflow
A
- curate data for sharing (sensible file names and labels)
- create metadata (README text file for the data set)
- update/finalise data
- update metadata
- submit data to repository
10
Q
what is a README
A
- details of how study was run
- how analysis scripts are to be run
- list of folders & files
- study descriptions/link to pre-registration
- link to pre-print
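The items above can be turned into a README skeleton with a few lines of code. This is only a sketch: the `build_readme` helper, the section headings, the demo files and the example.org links are all hypothetical placeholders, and the two "fill in" sections still have to be written by hand.

```python
import tempfile
from pathlib import Path

def build_readme(root, study_description, prereg_url, preprint_url):
    """Return README text with the sections listed on the card and a file listing."""
    root = Path(root)
    files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    lines = [
        "STUDY DESCRIPTION", study_description, "",
        "PRE-REGISTRATION", prereg_url, "",
        "PRE-PRINT", preprint_url, "",
        "HOW THE STUDY WAS RUN", "(fill in)", "",
        "HOW TO RUN THE ANALYSIS SCRIPTS", "(fill in)", "",
        "FOLDERS AND FILES",
    ]
    lines += [f"- {name}" for name in files]
    return "\n".join(lines) + "\n"

# Demo on a throwaway project folder with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data").mkdir()
    (Path(tmp) / "data" / "raw.csv").write_text("id,score\n")
    (Path(tmp) / "analysis.py").write_text("# analysis script\n")
    readme_text = build_readme(
        tmp,
        "Example study (placeholder description)",
        "https://example.org/preregistration",  # placeholder link
        "https://example.org/preprint",         # placeholder link
    )
```

In a real project, `readme_text` would be written to a README file in the root of the data set before it is submitted to a repository.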
11
Q
what is good about data sharing
A
- completeness (the data and data descriptors supporting a study's findings are all present)
- reusability (how readily the available data can be accessed and understood by third parties)
12
Q
what are the data sharing practices like in psychology
A
- 95% of papers didn’t contain identifiable open data
- data were held in image or PDF format
- data consisted of summary tables rather than raw data
- data were stored on journal websites or behind a hyperlink without a DOI
- the README and data descriptions/codebook were often missing
- the raw variables from which calculations were made were excluded
13
Q
what is the computational reproducibility in registered reports
A
- data from 118 registered reports were examined to independently computationally reproduce the main results
- 62 articles met the inclusion criteria; 41 had data available and 37 had analysis scripts
- scripts could be run for 31 analyses
- the main result could be reproduced for only 21 articles
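To put the counts above in proportion, the short calculation below expresses each one as a percentage of the 62 articles that met the inclusion criteria. The choice of 62 as the denominator and the variable names are assumptions made for this illustration.

```python
eligible = 62  # articles that met the inclusion criteria

# Article counts taken from the card above.
counts = {
    "data available": 41,
    "analysis scripts available": 37,
    "main result reproduced": 21,
}

# Percentage of eligible articles, rounded to one decimal place.
rates = {label: round(100 * n / eligible, 1) for label, n in counts.items()}

for label, pct in rates.items():
    print(f"{label}: {counts[label]}/{eligible} = {pct}%")
```

On this reading, about two thirds of eligible articles shared data, but only about one third had a main result that could be reproduced.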
14
Q
what are the solutions
A
- develop standards
- educate about best practices that guarantee reproducibility
- overcome obstacles to sharing data
- fully describe the dataset
- provide clear and practical open data guidelines
- ensure a long-term accessible version of the data
- use third party repositories