Sourcing Data Flashcards
Define ‘data sourcing’.
The process of going out and gathering data.
What are the two ways of sourcing data?
Active and Passive
What is active data sourcing?
Ways of acquiring records (or data) actively, through fieldwork, surveys, interviews, etc.
What is passive data sourcing?
Ways of acquiring records (or data) that are made available by “official” (or “recognised”) sources (e.g., NGOs, agencies, companies, government, educational institutions, etc.).
What is Open Data?
Data that official or recognised sources make freely available for anyone to use, republish and repurpose as they wish, without any patent, legal or copyright restrictions.
What type of data is Open Data?
Secondary Data
What type of data does active data sourcing produce?
Primary Data
Define primary data.
First-hand data gathered by the user, researcher or enumerator through fieldwork, interviews, questionnaire surveys, etc.
Define secondary data.
Data that has already been collected through a primary source and made readily available for other researcher(s) to use for their study or investigation.
What is internal secondary data?
Data that an organisation has already collected and makes available to users working within it or to its collaborators, for example when you are collaborating with an organisation that holds data relevant to your study.
Give an example of internal secondary data.
Data scientists working within the UK Metropolitan Police Service who seek to quantify the burden of various crime outcomes, e.g. burglary, sexual assault, etc.
What is external secondary data?
Data obtained from sources outside your own organisation, such as:
- Open-source websites (freely accessible)
- Paid data (requires a licence)
- Online data services that are free but require users to register
List the advantages of primary data sources.
- Data collected is always up to date
- Relevant and specific to the user’s research aims and objectives
- High level of understanding of the nature and content of the dataset
- High level of accuracy, as long as you do whatever is in your power to minimise all kinds of systematic errors in the data collection process
List the disadvantages of primary data sources.
- Depending on the study design, data collection can be a very time-consuming and expensive process
- If you are collecting personal and sensitive data, you must DEFINITELY apply for ethical approval before going out to get your data
- You will have to clean, manage and maintain your own data
- Possibility of falsifying the data, since the collector has autonomy over their own data