what are the principles of open data sharing Flashcards

what and where to share, keeping data anonymous, sharing practices

1
Q

what are the reasons to open data

A
  • must be open if publicly funded
  • facilitate data aggregation
  • permit creative re-analysis
  • assist with later scientific developments
  • signals commitment to transparency
  • public trust in scientists is heightened when data are openly available
2
Q

what should be shared, and where

A
  • data underlying stats reported = personal, lab and project-specific websites - data with personal identifiers - institutional repositories
  • materials like behavioural tasks, test instruments, code books etc. = open science repositories
  • data repositories - supplemental materials attached to an article and stored on the publisher's website
3
Q

what are the FAIR principles for data sharing

A
  • findable
  • accessible
  • interoperable
  • reusable
4
Q

what is databrary

A
  • find clips
  • share data
  • archive data
  • use data
5
Q

what is a DOI

A
  • Digital Object Identifier: a persistent digital ID for a piece of intellectual property (e.g. an article or dataset)
  • digital data preservation (manage and retain data)
  • digital curation (management and preservation of digital data over the long term)
6
Q

what is GDPR

A
  • General Data Protection Regulation (European Union)
  • safeguarding privacy of participants
  • special category data
  • personal data can be shared on the basis of public interest, legitimate interest, or written consent from the participant
7
Q

what is anonymised data

A
  • used when it is not feasible to provide all raw data due to ethical, legal or other reasons - some data is likely to be better than none at all
8
Q

what are synthetic datasets

A
  • mimic organic datasets by retaining statistical properties and relationships between variables
  • no record in synthetic database represents a real individual
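
The idea of retaining statistical properties while dropping real records can be illustrated with a minimal sketch. It assumes only two numeric variables and a bivariate normal model, which is a simplification chosen for illustration; the function names `pearson` and `synthesize` and the toy values are hypothetical, not taken from any particular tool.

```python
import random
import statistics


def pearson(rows):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return cov / (statistics.stdev(xs) * statistics.stdev(ys))


def synthesize(real_rows, n, seed=0):
    """Draw n synthetic (x, y) pairs from a bivariate normal fitted to real_rows.

    Assumption: means, standard deviations and one correlation are the only
    properties worth keeping. Real datasets usually need a richer model.
    """
    rng = random.Random(seed)
    xs = [x for x, _ in real_rows]
    ys = [y for _, y in real_rows]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    r = pearson(real_rows)
    # Guard against tiny negative values caused by floating-point rounding.
    residual = max(0.0, 1.0 - r * r) ** 0.5
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (r * z1 + residual * z2)
        synthetic.append((x, y))
    return synthetic


# Toy "real" data: (age, test score) for six hypothetical participants.
real = [(23, 51.0), (31, 58.5), (38, 60.0), (45, 67.5), (52, 66.0), (60, 74.0)]
fake = synthesize(real, n=6, seed=42)
```

Each synthetic pair is sampled from the fitted distribution rather than copied, so no row corresponds to a participant, while the means, spreads and correlation stay close to the originals in large samples.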
9
Q

what is participant consent

A
  • level of data sharing (kinds of consent for the kinds of information shared)
  • data are confidential to the research team, but the consent form doesn't promise to destroy data or restrict it to the research team
10
Q

what is the data sharing workflow

A
  • curate data for sharing (sensible files/labels)
  • create metadata (README text file for data set)
  • update/finalise data
    - update metadata
  • submit data to repository
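
One way to support the "update/finalise data, update metadata" step is to keep a checksum manifest of the curated folder and compare it before submission. This is a minimal sketch using only the Python standard library; the function names `build_manifest` and `changed_files` are hypothetical, and the choice of SHA-256 is an assumption rather than a repository requirement.

```python
import hashlib
from pathlib import Path


def build_manifest(data_dir):
    """Return {relative_path: sha256_hex} for every file under data_dir."""
    root = Path(data_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def changed_files(old, new):
    """List files that were added, removed or modified between two manifests."""
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))
```

Building the manifest once when the metadata is written and again just before submission shows which files changed, and therefore which README entries need updating.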
10
Q

what is a README

A
  • details of how study was run
  • how analysis scripts are to be run
  • list of folders & files
  • study descriptions/link to pre-registration
  • link to pre-print
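
The items above can be turned into a plain-text README skeleton. The sketch below is one possible layout, not a required standard; the function name `render_readme`, the section headings and all example values are hypothetical.

```python
def render_readme(title, study_details, run_instructions, files,
                  preregistration="", preprint=""):
    """Build README text covering the items listed on this card.

    files maps each folder or file name to a one-line description.
    """
    lines = [
        title,
        "",
        "How the study was run",
        study_details,
        "",
        "How to run the analysis scripts",
        run_instructions,
        "",
        "Folders and files",
    ]
    lines += [f"  {name}: {description}" for name, description in files.items()]
    lines += [
        "",
        f"Pre-registration: {preregistration or 'not available'}",
        f"Pre-print: {preprint or 'not available'}",
    ]
    return "\n".join(lines) + "\n"


# Example with made-up study details.
readme_text = render_readme(
    title="Example study data set",
    study_details="Online questionnaire, one session per participant.",
    run_instructions="Run scripts/01_clean.py, then scripts/02_analyse.py.",
    files={"data/": "anonymised raw data", "scripts/": "analysis code"},
)
```

Saving the returned text as README.txt at the top level of the shared folder keeps the description next to the data it documents.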
11
Q

what is good about data sharing

A
  • completeness (data and data descriptors supporting a study's findings)
  • reusability - how readily available data can be accessed and understood by third parties
12
Q

what are the data sharing practices like in Psych

A
  • 95% of papers didn't contain identifiable open data
  • data was held in image or pdf format
  • comprised tables but not raw data
  • data stored in journals or hyperlink without DOI
  • often missing README, data descriptions/codebook
  • exclusions of raw variables from which calculations were made
13
Q

what is the computational reproducibility in registered reports

A
  • data from 118 registered reports, used to independently computationally reproduce the main results
  • 62 articles met inclusion criteria, 41 had data available and 37 had analysis scripts
  • 31 analyses could be run
  • only able to reproduce the main result for 21 articles
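
To put these counts in proportion, the figures for articles can be expressed as shares of the 62 that met the inclusion criteria. This is a small arithmetic sketch using only the numbers on this card; the variable names are illustrative.

```python
included = 62  # articles that met the inclusion criteria

counts = {
    "data available": 41,
    "analysis scripts available": 37,
    "main result reproduced": 21,
}

# Percentage of included articles, rounded to one decimal place.
shares = {label: round(100 * n / included, 1) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% of included articles")
# data available: 66.1% of included articles
# analysis scripts available: 59.7% of included articles
# main result reproduced: 33.9% of included articles
```

In other words, the main result could be reproduced for roughly one in three of the included articles.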
14
Q

what are the solutions

A
  • develop standards
  • educate about best practices that guarantee reproducibility
  • overcome obstacles to sharing data
  • fully describe the dataset
  • provide clear and practical open data guidelines
  • ensure a long-term accessible version of the data
  • use third party repositories