All of it Flashcards
Data collection methods
Autonomous devices
Passive and active data collection
Manual data collection
Usage data
Size of a sound file
File size = sample rate × sample resolution × length of sound
Or
File size = bit rate × length of sound
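The two formulas above can be checked with a short worked example. This is a minimal sketch that assumes a single-channel (mono) recording and uncompressed audio; the function names and the sample values (44,100 Hz, 16 bits, 60 seconds) are illustrative and not part of the original card.

```python
def sound_file_size_bits(sample_rate_hz, sample_resolution_bits, length_seconds):
    """File size in bits = sample rate x sample resolution x length of sound."""
    return sample_rate_hz * sample_resolution_bits * length_seconds


def sound_file_size_from_bit_rate(bit_rate_bps, length_seconds):
    """File size in bits = bit rate x length of sound."""
    return bit_rate_bps * length_seconds


# Illustrative values (assumed): CD-style sampling, mono, one minute long.
sample_rate = 44_100      # samples per second
resolution = 16           # bits per sample
length = 60               # seconds

size_bits = sound_file_size_bits(sample_rate, resolution, length)

# The bit rate is sample rate x sample resolution, so both formulas agree.
bit_rate = sample_rate * resolution
same_size_bits = sound_file_size_from_bit_rate(bit_rate, length)

size_bytes = size_bits // 8   # 8 bits in a byte
print(size_bits, same_size_bits, size_bytes)
```

Both formulas give the same result because bit rate is simply sample rate multiplied by sample resolution.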
Advanced storage techniques
Redundant Array of Inexpensive Disks (RAID)
Network Attached Storage (NAS)
High availability storage
Storage Area Network (SAN)
Cloud storage
Hosted storage
Size of an image file
Size of an image = rows × columns × bits per pixel (bpp)
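A short worked example of the image formula. This is a sketch only: it assumes an uncompressed bitmap and ignores any file header or metadata, and the function name and the 1920 by 1080, 24 bpp values are illustrative.

```python
def image_file_size_bits(rows, columns, bits_per_pixel):
    """Size of an image in bits = rows x columns x bits per pixel."""
    return rows * columns * bits_per_pixel


# Illustrative values (assumed): a 1920 x 1080 image with 24-bit colour.
rows = 1080
columns = 1920
bpp = 24

image_bits = image_file_size_bits(rows, columns, bpp)
image_bytes = image_bits // 8   # 8 bits in a byte
print(image_bits, image_bytes)
```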
Data
Data can be defined as a set of recorded facts, numbers or events that has no initial meaning or structure. The main purpose of data collection is to gather information in a measured and systematic manner to ensure accuracy and facilitate data analysis. Data only becomes valuable once it has been analysed, because analysis gives it context and meaning in relation to why it was gathered.
Methods to help store data
Virtualisation: The process of turning hardware into a software equivalent without sacrificing functionality.
Hosted instance: Instances are the virtual machines that run operating system images, such as Linux.
Hosted solution: When you rent a virtual server from a company that takes over the responsibility for maintaining and keeping your server running.
Clustering: A group of two or more computer systems that run in parallel together to achieve a goal.
Blockchain storage: A way of saving data in a decentralised network, which utilises the unused hard disk space of users across the world to store files.
Descriptive analytics
It can involve breaking down data and summarising its main features and characteristics. It presents what has happened in the past without exploring why or how.
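As a small illustration of summarising the main features of a data set, the sketch below uses Python's standard `statistics` module. The `daily_sales` figures are made-up sample data, not taken from the original card.

```python
import statistics

# Hypothetical sample data: units sold on five past days.
daily_sales = [120, 135, 150, 110, 135]

# Descriptive analytics: summarise what happened, without explaining why.
summary = {
    "count": len(daily_sales),
    "mean": statistics.mean(daily_sales),
    "median": statistics.median(daily_sales),
    "mode": statistics.mode(daily_sales),
    "range": max(daily_sales) - min(daily_sales),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```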
Artificial Intelligence (AI)
Artificial intelligence (AI) is the simulation of human intelligence processes by machines.
Units of data
Unit Symbol Value
Byte B 8 bits
Kilobyte KB 1024 bytes
Megabyte MB 1024 KB
Gigabyte GB 1024 MB
Terabyte TB 1024 GB
Petabyte PB 1024 TB
Exabyte EB 1024 PB
Zettabyte ZB 1024 EB
Yottabyte YB 1024 ZB
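The table above can be turned into a small conversion helper. This is a minimal sketch that assumes the 1024-based steps used in the table; the function name `format_bytes` is illustrative.

```python
# Unit symbols in ascending order, each 1024 times larger than the previous one.
UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]


def format_bytes(num_bytes):
    """Express a byte count in the largest unit that keeps the value below 1024."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit that fits, or at the largest unit available.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


print(format_bytes(1024))
print(format_bytes(5_292_000))
print(format_bytes(1024 ** 3))
```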
General storage methods
Digitally sampled sound
Bitmapped graphics
Compressed audio
Compressed video
Cloud computing services
Data storage
Virtualised software
Remotely hosted applications
Data visualisation
Involves presenting the data visually or graphically to detect patterns, trends and correlations that are not usually apparent from raw data.
Management Information System
A management information system (MIS) is a collection of systems and procedures that gather data from multiple sources and compile it in a readable format.
Project Management Software (PMS)
Project management software (PMS) is a software tool that helps organise, manage and track projects.
Data Warehouse
A data warehouse (DW or DWH) is a system used for reporting and data analysis.
Data mining
Data mining is considered an interdisciplinary field that joins the techniques of computer science and statistics together.
Social and ethical implications of AI
Is it acceptable if AI becomes more knowledgeable than humans?
How many jobs will be lost to AI?
How much data does AI gather?
Does AI take away people’s privacy?
How can we safeguard AI from discrimination and bias?
Who is accountable if a wrong decision is made?
How do we know what information AI is generating?
How do we know the information generated by AI is accurate?
How do we know if AI has been manipulated?
Is AI gathering too much information?
Large data sets
Large data sets refer to data sets that are too large or complex to be dealt with by traditional data-processing application software.
Neural network modelling
A neural network is a series of algorithms that tries to recognise underlying relationships in a set of data through a process that mimics the way the human brain operates.
Natural Language Processing (NLP)
Natural language processing (NLP) is a subset of artificial intelligence. The aim of this subset is to develop computer systems which can understand text or voice data in the same way as human beings.
Data Flow Diagrams (DFD)
Data flow diagrams (DFD) are used to show the flow of data in a business information system. Specific rules and symbols must be used when creating these diagrams.
Cyber security
Cyber security is how individuals and organisations reduce the risk of cyber-attacks and prevent unauthorised access to the personal information we store on our devices and online.
Risks associated with online marketing communications:
Spam and unwanted e-mail
Phishing and scam attempts
Privacy concerns
Ad fraud
Brand safety
Misinformation
The importance of large data sets to the operation and competitiveness of organisations
Health sector: Electronic health records (EHRs), patient data and clinical trial data are used to improve patient care, support medical research and streamline operations.
Finance sector: Transaction data, credit history and market data are used to make informed investment decisions, identify fraud and improve risk management strategies.
Retail sector: Customer data, sales data and supply chain data are used to improve marketing and sales campaigns, optimise supply chain operations and provide personalised customer experiences.
MAC addresses and MAC address spoofing
The Media Access Control (MAC) address is a unique identifier assigned to a Network Interface Controller (NIC) for use as a network address in communications within a network segment. The use of unique MAC addresses can create security risks:
MAC spoofing
Privacy concerns
Network security
Network performance
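To make the MAC address format concrete, the sketch below parses an address and reads the "locally administered" bit in its first byte. Manufacturer-assigned addresses normally leave this bit clear, while randomised or manually set addresses often set it. This is an illustration only: a clear bit does not prove an address is genuine, because a spoofer can copy a real manufacturer-assigned address. The function names and example addresses are hypothetical.

```python
import re

# Six pairs of hexadecimal digits separated by ":" or "-".
MAC_PATTERN = re.compile(r"(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}")


def parse_mac(mac):
    """Return the six bytes of a MAC address, or raise ValueError if malformed."""
    if not MAC_PATTERN.fullmatch(mac):
        raise ValueError(f"not a valid MAC address: {mac!r}")
    return bytes.fromhex(mac.replace(":", "").replace("-", ""))


def is_locally_administered(mac):
    """True if the second-lowest bit of the first byte is set."""
    return bool(parse_mac(mac)[0] & 0b00000010)


# Example addresses (made up for illustration).
print(is_locally_administered("00:1A:2B:3C:4D:5E"))
print(is_locally_administered("02:00:00:00:00:01"))
```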
Cryptocurrencies and why they can sometimes be associated with cyber security
Blockchain is a decentralised, digital ledger that records transactions across a network of computers. It uses cryptography to secure and validate transactions, which makes the ledger very difficult to tamper with undetected. The most well-known application of blockchain technology is cryptocurrency. Blockchain technology is used in cybersecurity in the following ways:
Decentralised identity management
Cyber threat intelligence sharing
Secure record keeping
Data privacy
Supply chain security
Cyber insurance
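The idea of a ledger secured by cryptography can be illustrated with a simple hash chain, where each block stores the hash of the block before it. This is a teaching sketch only, not a real blockchain: it has no network, no consensus mechanism and no digital signatures. All names and the sample transactions are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, previous_hash):
    """SHA-256 hash over a block's contents, including the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link each record to the one before it through its hash."""
    chain = []
    previous_hash = GENESIS_HASH
    for index, data in enumerate(records):
        current_hash = block_hash(index, data, previous_hash)
        chain.append({"index": index, "data": data,
                      "previous_hash": previous_hash, "hash": current_hash})
        previous_hash = current_hash
    return chain


def is_valid(chain):
    """Recompute every hash and check that each block points at the previous one."""
    previous_hash = GENESIS_HASH
    for block in chain:
        if block["previous_hash"] != previous_hash:
            return False
        if block["hash"] != block_hash(block["index"], block["data"],
                                       block["previous_hash"]):
            return False
        previous_hash = block["hash"]
    return True


chain = build_chain(["Alice pays Bob 5", "Bob pays Carol 2"])
valid_before = is_valid(chain)

chain[0]["data"] = "Alice pays Bob 500"   # tamper with an earlier record
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Changing an earlier record changes its hash, so the check fails unless every later block is also rewritten.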
Security and integrity problems during online file updates:
Unauthorised access
Incomplete updates
Man-in-the-middle attacks
Denial of service
Malicious software
Rollback attacks
To mitigate these risks, organisations should use secure methods for transmitting and verifying the integrity of update files.
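One common way of verifying the integrity of an update file is to compare its hash with a value published by the supplier. The sketch below shows the idea with SHA-256; it assumes the expected hash was obtained over a trusted channel, and the function names and sample bytes are illustrative. A hash check on its own detects incomplete or altered files but does not cover every risk listed above. Real update systems typically add digital signatures and version checks, for example to guard against rollback attacks.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Return the SHA-256 digest of some bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def update_is_intact(update_bytes, expected_sha256):
    """Compare the file's digest with the published one."""
    actual = sha256_hex(update_bytes)
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(actual, expected_sha256.lower())


# Illustrative data (assumed): the supplier's file and its published hash.
original_update = b"example update payload"
published_hash = sha256_hex(original_update)

intact = update_is_intact(original_update, published_hash)
truncated = update_is_intact(original_update[:10], published_hash)   # incomplete update
altered = update_is_intact(original_update + b"!", published_hash)   # modified in transit
print(intact, truncated, altered)
```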