ICS2 Flashcards

IS and Data Mgmt

1
Q

SOC 2

A

System and Organization Controls (SOC) engagement that examines a service organization's system of internal controls as it relates to the AICPA's five Trust Services Criteria.

2
Q

Trust Service Criteria

A

Security, availability, processing integrity, confidentiality, and privacy

3
Q

Network infrastructure

A

Refers to the hardware, software, layout, and topology of network resources that enable connectivity and communication between devices.

4
Q

Modem

A

Connects a network to the internet service provider's network through a cable connection - receives analog signals and translates them into digital signals (and vice versa for outgoing traffic). Each modem has a public IP address.

5
Q

Router

A

Manages network traffic by connecting devices to form a network.

Reads source and destination fields in information packet headers to determine the best path for the packet to travel.

Acts as a link between the modem and switches or, if there are no switches, a user's device.
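
A minimal sketch of how a router might pick a path from a packet's destination address. It assumes longest-prefix matching, one common approach that this card does not itself name; the route table and next-hop names are hypothetical.

```python
import ipaddress

# Hypothetical routing table: destination network -> next hop (names are illustrative).
ROUTES = {
    "0.0.0.0/0": "modem",            # default route: everything else goes to the ISP
    "192.168.1.0/24": "switch-1",
    "192.168.1.128/25": "switch-2",
}

def next_hop(dst_ip: str) -> str:
    """Return the next hop for the most specific route that contains dst_ip."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [ipaddress.ip_network(p) for p in ROUTES
               if dst in ipaddress.ip_network(p)]
    best = max(matches, key=lambda net: net.prefixlen)  # longest prefix wins
    return ROUTES[str(best)]
```

For example, `next_hop("192.168.1.200")` matches all three routes and picks the /25 entry because it is the most specific.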

6
Q

Switches

A

Similar to routers in that they can connect and divide devices within a network - turns one network jack into many so multiple devices can share one network connection.

Does not assign IP addresses.

7
Q

Gateways

A

A computer/device that acts as an intermediary between different networks.

Transforms data from one protocol into another so info can flow between networks.

Gateways interpret protocols and convert them into the right format to facilitate network movement, usually between a company network and the internet.

8
Q

Protocol

A

Rule, or set of rules, that governs the way in which information is transmitted

9
Q

TCP/IP

A

Type of protocol used by the internet - Transmission Control Protocol/Internet Protocol.

10
Q

Edge-enabled devices

A

allow computing, storage and networking functions closer to devices where data/system request originates - makes for faster response time.

11
Q

Servers

A

physical/virtual machines that coordinate computers, programs, and data that are part of a network.

Client/server model - client sends a request to the server, and the server provides a response or executes an action.

12
Q

Firewall

A

Software or hardware that protects a person or network by filtering traffic through security protocols w/ rules.

Designed to prevent unauthorized access, downloading of malicious programs, and access to restricted sites

can be set up to only allow trusted sources
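
A minimal sketch of rule-based filtering that only allows trusted sources. The rule format, networks, and ports are hypothetical and chosen for illustration; real firewalls evaluate many more fields.

```python
import ipaddress

# Hypothetical rule set: (action, source network, destination port or None for any port).
RULES = [
    ("allow", ipaddress.ip_network("10.0.0.0/8"), 443),
    ("allow", ipaddress.ip_network("192.168.1.0/24"), None),
    ("deny", ipaddress.ip_network("0.0.0.0/0"), None),  # default: block everything else
]

def filter_packet(src_ip: str, dst_port: int) -> str:
    """Return the action of the first rule that matches the packet."""
    src = ipaddress.ip_address(src_ip)
    for action, network, port in RULES:
        if src in network and (port is None or port == dst_port):
            return action
    return "deny"  # nothing matched: fail closed
```

Rules are checked top to bottom, so the trusted sources are listed first and the final rule blocks all other traffic.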

13
Q

Circuit level gateway firewall

A

Verifies the source of a packet and that it meets the rules/policies set by the security team

14
Q

application level gateway

A

inspects packet itself - resource intensive and may slow performance

15
Q

network address translation firewall

A

Assigns internal network addresses to specific, approved external sources so that only those sources are allowed through

16
Q

stateful multilayer inspection firewall

A

combines packet filtering and network address translation

17
Q

next-gen firewall

A

Assigns different firewall rules to different applications as well as users.

18
Q

Bus Topology

A
  • linear or tree form with each node connected to a single line or cable.

-Any node can send data at same time and cause interference so cables must be terminated at each end.

-Downside if central line is compromised- entire network offline

19
Q

Mesh Topology

A
  • There are numerous connections between nodes, with all nodes being connected in a full mesh and some in a partial mesh.
    -Common for wireless networks
  • Allows for high levels of traffic and promotes network stability if node is damaged.
    -Costly

DIAMOND shaped

20
Q

Ring Topology

A
  • nodes connected in circular path, data must go through all devices between source and destination.
  • Can be uni or multi directional

-Advantage is data transmission collision is minimized or eliminated, but can result in slow performance.

21
Q

Star Topology

A
  • Data passes through a central hub that acts as a switch or server, then is transmitted to peripheral devices that act as clients
  • Can have multiple hubs, so if one fails only some nodes are affected

-Easier to ID damaged cables.

22
Q

Network Infrastructure Protocols

A

governs the way data is transmitted based on method used (cable, port).

23
Q

Open System Interconnection model

A
  • Developed by ISO and explains how protocols work and how devices communicate w/ each other.

-Segregates network functions into 7 layers, each responsible for specific data exchange function

24
Q

Open System Interconnection model layers

Encapsulation

Decapsulation

A

Data flows through each layer through encapsulation, which adds a header/footer to the data received from the previous layer. Starts at the application layer with a message and moves down to the physical layer. There decapsulation begins, moving back up to the application layer.

  7. Application
  6. Presentation
  5. Session
  4. Transport
  3. Network
  2. Data Link
  1. Physical - actual network device used to transmit message
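
A toy sketch of encapsulation and decapsulation, assuming each layer simply prepends a bracketed header. The header format is hypothetical; real protocols add binary headers and sometimes footers.

```python
# Layer names from the card, listed in the order a message travels down the stack.
LAYERS = ["Application", "Presentation", "Session", "Transport",
          "Network", "Data Link", "Physical"]

def encapsulate(message: str) -> str:
    """Wrap the message in one header per layer, from Application down to Physical."""
    data = message
    for layer in LAYERS:
        data = f"[{layer}]{data}"
    return data

def decapsulate(frame: str) -> str:
    """Strip the headers in reverse order, from Physical back up to Application."""
    data = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not data.startswith(header):
            raise ValueError(f"missing {layer} header")
        data = data[len(header):]
    return data

frame = encapsulate("hello")
```

The outermost header belongs to the last layer that touched the data on the way down, which is why the receiver removes the Physical header first.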
25
OSI Application Layer
Serves as an interface between the applications a person uses and the network protocol used to transmit a message. HTTP, FTP, SMTP, and EDI.
26
OSI Presentation Layer
Transforms data from application layer into a format that other devices can interpret, such as videos, images, and webpages. Encryption also occurs. ASCII, JPG, MPEG.
27
OSI Session Layer
Allows sessions between communicating devices to be established and maintained, which allows devices to have a dialog with each other. Remote Procedure Call, Structured Query Language (SQL), Network File System.
28
OSI Transport Layer
Supports and controls the communication connections between devices - involves setting rules for how devices are referenced, the amount of data transmitted, validating data integrity, and determining if data was lost. TCP, UDP, SSL, and TLS.
29
OSI Network Layer
Adds routing and address headers/footers to data (source and destination IP) so messages reach the correct device. Detects errors. IP, IPSec, NAT, and IGMP.
30
OSI Data Link Layer
Data packets are formatted for transmission as determined by the hardware and network technology (Ethernet). Adds Media Access Control (MAC) addresses, which are device identifiers that act as source and destination reference numbers to route the message to the right device. ISDN, PPTP, L2TP, and ARP.
31
OSI Physical Layer
Converts the message sent from the data link layer into bits so it can be transmitted to other physical devices. Receives messages from other physical devices and converts them back from bits into a format that can be interpreted by the data link layer. HSSI, SONET, V.35, X.21
32
Network Infrastructure Architecture
Way in which an organization structures its network from a holistic design standpoint, considering geographical, physical, and logical layouts and the network protocols used.
33
Local Area Network
network access to limited geographic area - home/single office
34
Wide Area Network
network access to larger geographic area - cities, regions - connects LANs together to provide broad coverage. Example - internet
35
Software Defined Wide Area network
Monitors performance of WAN connections and manages traffic to optimize connectivity. Control and management are separated from the hardware and handled in software (whereas in a traditional WAN they reside in the hardware).
36
Virtual Private Network
Virtual connection through a secure channel or tunnel that provides remote and secure access to an existing network. Example - RDS (remote desktops)
37
Firmware
Software locally embedded in hardware that instructs the hardware how to operate. It is not updated frequently.
38
Internet of Things
Devices that are an extension of mobile technology and typically require Bluetooth or an internet connection to access a larger network (smart things)
39
Cloud computing advantages
Lower costs related to maintenance/support, only pay for what is needed, gain efficiencies as data is all in one location, reduces likelihood of loss in an attack/disaster due to redundancies in cloud computing
40
Cloud Computing Model: Infrastructure as a Service
CSP provides an entire virtual data center of resources (servers, storage, hardware, networking) billed on per use basis. Company responsible for keeping environment up and running and virtually managing the performance of the physical infrastructure. CSP responsible for physical mgmt of infrastructure
41
Cloud Computing Model: Platform as a Service
CSP provides tools/solutions remotely that are used to fulfill a specific business purpose (online platform for selling merchandise). CSP responsible for keeping application uptime at an acceptable level
42
Cloud Computing Model Software as a Service
CSP provides a business application/software used to perform specific functions or processes. Customers purchase through licensing. CSP offers application via the internet and is responsible for updates, security enhancements, etc.
43
Business Processes as a Service model
use SaaS to deliver specific business functions (payroll)
44
Cloud Computing Deployment Model: Public
Cloud owned and managed by provider
45
Cloud Computing Deployment Model: Private
Cloud created for a single org and managed by org or CSP. Can exist on/off premises
46
Cloud Computing Deployment Model: Hybrid
Two or more clouds, with at least 1 being private, that remain unique cloud entities but with technology in place that facilitates the portability of data and applications between each entity.
47
Cloud Computing Deployment Model: Community
Shared by multiple organizations to support common interest
48
COSO Enterprise Risk Management - Integrating with Strategy and Performance
Categorizes methods for addressing an organization's risk into five components with 20 supporting principles.
49
COSO ERM - 5 components (list)
1. Governance and Culture 2. Strategy and Objective Setting 3. Performance 4. Review and Revision 5. Information, Communication, and Reporting
50
COSO ERM - Governance and Culture
Sets tone and reinforces importance of oversight. Target behaviors and values and understanding risk
51
COSO ERM - Strategy and Objective Setting
Strategic planning process - risk appetite should be aligned with strategy, with business objectives put in place to achieve the level of appetite by identifying, assessing, and responding to risk.
52
COSO ERM - Performance
Prioritize risks based on risk appetite so objectives are assessed, met, and reported to key stakeholders
53
COSO ERM - Review and Revision
review performance over time and make revisions
54
COSO ERM - Information, Communication, and Reporting
continual process to support sharing internal/external info through organization
55
COSO ERM For Cloud Computing
- Guidance when applying the COSO framework to cloud computing. - Integrate governance of cloud computing into the overall RM strategy. Ownership of risk still remains w/ the org. Proper governance may include: - CC Steering Committee - Understanding CSP values and culture and how they affect the risk profile, how CSP risks can impact performance, responsibilities of the CSP, and how the CSP's IC addresses risk - Continuously update and reassess ERM when there are changes in cloud needs or CSPs
56
Applying COSO TO Configure Cloud options
Apply 8 components to tailor to risk appetite. 1. Internal Environment - foundation for risk appetite to determine level of outsourcing 2. Objective Setting - understand how outsourcing helps or hinders objectives 3. Event ID - how CSP could make event ID harder/easier 4. RA - risk of cloud strategy, impacts to risk profile, inherent/residual risks, likelihood and impact 5. Risk Response - determine if risk response will be to avoid, reduce, share, or accept 6. Control Activities - how IC are modified in cloud 7. Info/Communication - how cloud may impact timeliness, availability, and dissemination of info. 8. Monitoring - modify monitoring mechanisms to accommodate new complexities.
57
Cloud Risks
- Rate of competitor adoption - being in same risk ecosystem as CSP and other tenants - transparency - reliability/performance - lack of application portability (vendor lock-in) - security/compliance - cyberattacks or data leakage - IT organization changes - CSP long-term viability
58
ERP
Enterprise Resource Planning systems are cross-functional systems that support different business functions and facilitate integration of information across departments. - Centralized database and user interface - Modules function independently or as an integrated system that allows data to be shared
59
Accounting Information System
Collects, records, and stores accounting information and, using rules, reports on financial and nonfinancial info. Made up of 3 subsystems: Transaction Processing System, Financial Reporting System, and Management Reporting System
60
AIS-Subsystem: Transaction Processing System
converts economic events into financial transactions (JE) and distributes info to support daily operations. Covers - Sales cycle, conversion cycle, expenditure cycle.
61
AIS-Subsystem: Financial Reporting System
aggregates daily financial info from the TPS and other sources of infrequent events (mergers, lawsuits, disasters) to enable timely regulatory and financial reporting
62
AIS Subsystem: Management Reporting System
Provides internal financial information to solve business problems (budgeting, variance, cost/volume/profit analysis)
63
What are the 5 objectives of AIS - collectively of all subsystems
1. Record valid transactions 2. Properly classify transactions 3. Record at correct value 4. Record in correct accounting period 5. Present in FS
64
What is the sequence of events in AIS
1. Transaction data entered into AIS by end user/customer 2. Source docs filed 3. Recorded in journal 4. Posted to general/sub ledgers 5. Trial balance prepped 6. Adj., accruals, and corrections are entered 7. Financial reports are generated
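
A rough sketch of steps 3 to 5 (journal, ledger, trial balance). The account names and amounts are hypothetical, and a real AIS would also track dates, source docs, and subledgers.

```python
from collections import defaultdict

# Hypothetical journal entries: (account, debit, credit).
journal = [
    ("Cash", 500, 0),
    ("Sales Revenue", 0, 500),
    ("Supplies Expense", 120, 0),
    ("Cash", 0, 120),
]

def post_to_ledger(entries):
    """Post each journal line to its account; a positive balance is a net debit."""
    ledger = defaultdict(int)
    for account, debit, credit in entries:
        ledger[account] += debit - credit
    return dict(ledger)

def trial_balance(ledger):
    """Total the debit-balance and credit-balance accounts; the totals should agree."""
    debits = sum(b for b in ledger.values() if b > 0)
    credits = -sum(b for b in ledger.values() if b < 0)
    return debits, credits

ledger = post_to_ledger(journal)
debits, credits = trial_balance(ledger)
```

If the two totals differ, something was posted incorrectly and needs correcting before the financial reports in step 7 are generated.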
65
Revenue and Cash Collections Cycle
- Real-time access to inventory subledger to check availability - auto approval/denial of credit - Records sales invoice and transmits inventory release and packing slip - Input shipping notices - triggers updating cust. credit record, inventory, GL, and mgmt reports - Cash receipts clerk records remittance - closes sales invoice, posts GL, updates payment record, mgmt reports.
66
Purchasing and Disbursement Cycle
- Reads requested purchase to verify it is on approval list and shows approved vendors - Preps PO and delivers to vendor - Rec. department inputs qty received - updates rec. report, reconciles qty against open PO, closes PO, updates Inv. SL and GL - AP enters invoice - links invoice to PO and receiving report - creates voucher - approves payments and sets payment date - prints/distributes signed checks to mailroom - recorded in check register file, closes invoice, updates GL, transaction report.
67
HR/Payroll Cycles
- AIS integrated with HRMS for real-time EE data changes - EEs enter time to produce time/attendance files - allocates labor costs to job costs, accumulates direct/indirect exp. at end of work period on a batch basis, calcs payroll, updates EE records, produces payroll register and AP - creates JE and updates GL.
68
Production Cycle
- Rec. work order for production run - put in as Work in Process SL. Labor/materials added - documents sent to AIS to automatically update WIP - tracks costs for labor, materials, OH - variances - Closes WIP account after final ticket showing goods in inventory - JE and GL updates
69
Fixed Asset Cycle
- Create record of asset in SL - useful life, salvage, depreciation, location - update GL, JE, depreciation schedule - Calcs depreciation, AD, book value at end of period - JE and GL - Disposal recorded - calcs gain/losses, JE, adj. entries to GL
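
A small sketch of the period-end calculation (depreciation, AD, book value). It assumes the straight-line method, which the card does not specify; the cost, salvage, and life figures are made up.

```python
def straight_line_schedule(cost, salvage, useful_life_years):
    """Return (year, depreciation, accumulated depreciation, book value) per year."""
    annual = (cost - salvage) / useful_life_years  # same expense every year
    schedule = []
    accumulated = 0.0
    for year in range(1, useful_life_years + 1):
        accumulated += annual
        schedule.append((year, annual, accumulated, cost - accumulated))
    return schedule

# Hypothetical asset: cost 10,000, salvage 1,000, 3-year useful life.
schedule = straight_line_schedule(cost=10_000, salvage=1_000, useful_life_years=3)
```

At the end of the useful life the book value equals the salvage value, which is the figure compared with proceeds when a disposal gain or loss is calculated.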
70
Treasury Cycle
- Integrated with other cycles - includes source docs (deposit slips, checks, stocks, interest) to post JE - Bank recs - JE for change in cash - reports
71
GL and Reporting Cycle
- Update GL (in other cycles) - Auto produces trial balance - posts adj. entries - produces FS and report of variances - Closes temp accounts, carries forward to BS
72
Automation
Designed to perform repetitive tasks. Must first examine the process to describe each step taken, exchange of info, governance of policies for each transaction, and knowledge needed to perform each task.
73
Risks to outsourcing
Quality - product defective or substandard; Productivity; Staff turnover; Language skills; Security; Qualifications; Labor insecurity
74
Common offshore operations
IT provided by managed services provider; Business processes; Software development; Knowledge processing (reading x-rays)
75
Robotic Process Automation
Use of program to perform repetitive tasks that don’t need skilled human labor. Uses simple rule based processes. Web scraping tool.
76
lidar
light detection and ranging -type of robotic process automation - used in self driving vehicles
77
Natural Language Processing Software
NLP - technology used to encode, decode, and interpret human languages to perform tasks, interact with humans, and carry out commands on other devices, i.e. virtual assistants
78
Neural Networks
Modeled after neurons that facilitate the function of human or animal memories. Involves an input layer, hidden layer, and output layer. Fuzzy logic
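
A toy sketch of the input, hidden, and output layers as a single forward pass. The weights are arbitrary and the sigmoid activation is an assumption; no training is shown.

```python
import math

def sigmoid(x: float) -> float:
    """Squash any number into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))

def forward(inputs, hidden_weights, output_weights):
    """Input layer -> one hidden layer -> a single output node."""
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

# Hypothetical network: 2 inputs, 2 hidden nodes, 1 output.
output = forward([1.0, 0.0], [[0.5, -0.5], [0.3, 0.8]], [1.0, -1.0])
```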
79
Process integrity
System's ability to initiate and complete transactions so they are valid, accurate, completed timely, and authorized to meet a company's objective.
80
AICPA definition of deficiencies in design of a control in SOC 2
necessary controls are missing, or not designed properly
81
how to ID if deficiency in design exists
Obtain understanding of mgmt RA process, evaluate the link between controls and trust service criteria, and determine if controls are appropriate.
82
Trust Service Criteria: Security
ID transaction processing methods that compromise confidentiality, privacy, availability, and can be circumvented to allow unauthorized access.
83
Trust Service Criteria: Availability
Bottlenecks in flow of data and processes that prevent data from being available
84
Trust Service Criteria: Processing Integrity
Processing methods and transactions that do not complete timely or at all, yield faulty results, or do not meet objectives
85
Trust Service Criteria: Confidentiality
EEs and processes that handle transactions w/ confidential data to ID leaks, mishandling.
86
Trust Service Criteria: Privacy
Analyze methods used to collect, store, and dispose of data - data breaches/leaks
87
AICPA Description Criteria for a Description of a Service Org's System
Used to ID deficiencies by comparing w/ org. system design documentation. Two items recommended for review are the principal service commitments and principal system requirements.
88
Test of controls
Evidence of how controls were applied, consistency of the application, and personnel responsible for applying them. If deficiencies are already ID'd - not required to test
89
When is deficiency material
if cannot obtain reasonable assurance that system requirements or service commitments are being met.
90
If fraud?
Assess the risk that the system description does not accurately reflect the system design, and that controls are not operating effectively.
91
What should an auditor do if fraud?
Consider noncompliance with regs, hold discussions, request they consult with legal counsel, consider implications to engagement, communicate w/ regulators, withdraw.
92
COSO - Control Activity - Principle 11
General controls over tech. in order to achieve organizational objectives. Company must understand dependency between general controls over tech and use of tech, and establish controls over tech infrastructure, security mgmt, and tech acquisition and maintenance.
93
COSO - Information and Communication - Principle 13
Info and communication - acquire, create, use quality info to support IC: ID company info needs, capture data, process data into useful info, maintain quality when processing
94
COSO - Information and Communication - Principle 14
effective communication of info is necessary to support IC.
95
blockchain
Control system originally designed to govern the creation and distribution of bitcoin. - resistance to alteration, multiparty transaction validation, and decentralized nature
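
A minimal sketch of why a chain of hashed blocks resists alteration. The block layout is hypothetical and leaves out signatures, consensus, and the decentralized network.

```python
import hashlib

GENESIS = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(prev_hash: str, transactions: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    return hashlib.sha256((prev_hash + transactions).encode()).hexdigest()

def build_chain(batches):
    chain, prev = [], GENESIS
    for tx in batches:
        h = block_hash(prev, tx)
        chain.append({"prev": prev, "tx": tx, "hash": h})
        prev = h
    return chain

def is_valid(chain) -> bool:
    """Recompute every hash; any edited block breaks the links after it."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev or block["hash"] != block_hash(block["prev"], block["tx"]):
            return False
        prev = block["hash"]
    return True

chain = build_chain(["A pays B 5", "B pays C 2"])
tampered = [dict(b) for b in chain]
tampered[0]["tx"] = "A pays B 500"  # alter a recorded transaction
```

Here `is_valid(chain)` holds, while the tampered copy fails because the stored hash no longer matches the edited contents.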
96
mining cryptocurrency
person/group performing cryptography - solving complex math equations
97
Cryptography
Solving of complex math equations where blocks of a fixed number of transactions are confirmed at a time, with a reward of bitcoin and validation of a new block of transactions.
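
A simplified sketch of the "complex math" involved, assuming a proof-of-work style puzzle: find a number (nonce) that makes the block's hash start with a set number of zeros. The difficulty and block contents are hypothetical and far easier than on a real network.

```python
import hashlib

def mine(block_data: str, difficulty: int = 2):
    """Try nonces until the SHA-256 hash starts with `difficulty` zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce, digest
        nonce += 1

nonce, digest = mine("block of transactions")
```

Finding the nonce takes many attempts, but anyone can confirm the result with a single hash, which is what lets other parties validate a new block quickly.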
98
How can auditors use blockchain
Verify transactions by validating cryptographic signatures, time stamps, and tracing wallet addresses.
99
Business resiliency
Continuous operation or quick return - integration of system availability controls, disaster recovery plans, business continuity plans, and crisis mgmt plans into a central set of procedures.
100
business continuity
Continue to deliver; operations focus - key business processes and risks, acceptable downtime, implement mitigation and contingency plans to address risk and downtimes
101
SOC 2 business continuity
auditors verify testing of plan and plan is based on relevant scenarios, focused on things to impair, consider scenarios where key personnel are lacking, revised based on testing
102
business impact analysis
IDs business units, departments, and processes that are essential and the impact if they fail, how fast they can return, and resources required
103
BIA Steps
1. BIA approach 2. ID critical resources 3. define disruption impacts 4. estimate losses 5. establish recovery priorities 6. create BIA report 7. implement BIA recommendations
104
system availability controls
ability to prevent disruptions - and procedures to recover (CM, DR, BCP all fall under this)
105
crisis mgmt
Overall response - goals to lessen impact, protect people and reputation, and return to normal. Policies should address: RA of potential crises, implementation process, crisis response center, R&R, communication lines, EE training
106
disaster recovery
Strategic recovery - steps: 1. assess risk, 2. ID mission critical applications/data, 3. develop plan for handling mission critical items, 4. assign responsibilities, and 5. test
107
cold site
All connections but no equipment - 1-3 days to be operational
108
warm site
Hardware installed but falls short of processing capabilities
109
hot site
Equipped to take over - prewired, with hardware/office equipment, backup copies
110
Recovery Point Objective
Max for data lost, dollars lost, or inoperability as measured by a metric
111
Incremental backup
Only data changed since the last backup of any type - slowest to restore
112
Differential
all changes made since last full backup
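
A small sketch contrasting the two backup types. It assumes a full backup on day 0 and one backup per day afterward; the file names and change log are made up.

```python
# Hypothetical change log: day -> files modified that day (full backup ran on day 0).
changes = {1: {"a.txt"}, 2: {"b.txt"}, 3: {"a.txt", "c.txt"}}

def incremental(day: int) -> set:
    """Files changed since the previous day's backup, whatever type it was."""
    return changes.get(day, set())

def differential(day: int) -> set:
    """Files changed since the last full backup (day 0)."""
    result = set()
    for d in range(1, day + 1):
        result |= changes.get(d, set())
    return result
```

On day 3 the incremental backup holds two files while the differential holds three, so a restore needs either the full backup plus every incremental or the full backup plus only the latest differential.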
113
Infrastructure Capacity and Monitoring
Observe monitoring tool showing capacity metrics being monitored
114
Change Mgmt steps
1. ID and define need for system change 2. design high level plan with goals 3. mgmt approval 4. budget/timeline 5. assign personnel 6. ID and address potential risks during or post change 7. implementation road map 8. procure resources and train 9. test change 10. execute plan 11. review/monitor change and test to verify effective implementation
115
Development Environment
Application prototype - source code editing tool, automation tools w/ preconfigured code, debugging to fix errors.
116
Testing Env.
Test and debug code to ID errors - can be same as development
117
Staging Env
Test in final phases in a production-like env. Test functionality, compatibility, security, and performance
118
Production Env.
deployed and made available
119
Disaster Recovery
so can be restored quickly, save critical data/systems, notify mgmt and recover
120
Selection and Acquisition Risks
Lack of: expertise, formal selection/acquisition process; software/hardware vulnerability and incompatibility
121
Mitigate Selection and Acquisition Risks
AICPA's trust criterion CC5.1 - select and develop controls to mitigate risk. SOC 2 - perform annual RA to see if risk/controls are adequate; auditor can obtain this RA and select a sample of system change requests to review change mgmt process for appropriateness.
122
Integration risks
user resistance, lack of mgmt/stakeholder support, resource concerns, business disruption, lack of system integration
123
Outsource change management risk
Lack of organizational knowledge, uncertainty of 3rd party knowledge/mgmt, lack of security
124
Emergency change policies
Allow for an expedited process that maintains audit trail and controls - for crisis or time-sensitive changes
125
annual risk assessment
Evaluate: the economic, regulatory, and physical environment - business env., industry, competition, consumer dynamics - effect of new lines of business, modified lines, expanding through acquisition, downsizing - mgmt attitude toward IC - changes in tech env. - partnerships w/ vendors/businesses
126
baseline configuration
Can be used to benchmark/compare current progress or performance of system - a checklist can be used or a baseline image that is a graphical depiction of a system, and metrics may include system uptime, resource utilization, and failover time
127
system component inventory
Hardware, software, peripherals, and other IT assets - purpose of each, location, status, if functioning properly
128
Acceptance criteria
to help evaluate change control policies, measurable and specific, help enhance likelihood that changes are clear and concise, properly tested/implemented, documented, approved, evaluated reviewed and monitored.
129
Application logs
Application data or when an error occurs
130
Change logs
Changes that were requested, approved, and implemented - allow for system restoration
131
event logs
records various events that occur on a system - directory logs, DNS server logs, endpoint, security basic
132
firewall logs
All traffic that flows through firewall - packet info w/ ports used, IP addresses, protocol, action taken, time/date
133
network logs
aka perimeter logs - from devices that guard network
134
proxy logs
Details on site visited, time, how long
135
Continuous Integration
Merge code changes into a central repository where building and testing of code are automated. Helps ID bugs faster, enhances quality, and shortens time needed to release software updates.
136
Continuous Deployment
software is automatically created, tested, and then deployed to a production env.
137
Waterfall model
Different teams of employees performing separate tasks in sequence, with each team beginning work from a pre-written agreement from the preceding team and ending when they meet their business requirement and then passing to the next team.
138
challenges with waterfall model
lots of time, benefits of new system not realized until completed, no customer input and change hard to manage, some employees idle
139
Agile model
Cross-functional teams, each dedicated to functions/improvements drawn from a prioritized list of the customer's remaining needs for the system - work at the same time, shorter deadlines, allows changes of direction during development to account for stakeholder feedback
140
what is patch management
Systematic process of ID'ing specific vulnerabilities or software bugs and addressing them with patches or fixes between releases.
141
patch mgmt process
Evaluate new patches and devise plan to implement, use vulnerability tools to ID need for patches, test patches in test env., approve and deploy patches in down time, verify that patches were deployed.
142
patch mgmt and SOC 2
maintain a documented patch mgmt process - inspected by auditors
143
direct conversion method
cease to use old system and start new immediately - risk for business impact
144
parallel conversion method
New implemented while old still in use - lots of effort from personnel
145
pilot conversion method
Conversion on small scale w/ test env. while using old system - allows validation testing and adjustment
146
phased conversion method
aka gradual or modular - gradually adds volume - good for multi-site
147
hybrid conversion method
tailored mix
148
software testing process
1. testing plan w/ roles, responsibilities, and timeline 2. ID/prioritize key areas of software 3. what type of tests to run and objectives of test 4. execute tests 5. log results and ID defects 6. report findings and fix defects
149
integration testing
aka thread/string testing - cohesiveness of all units and plan for future system maintenance/updates to ID future patch needs
150
data life cycle
Sequential steps all data must go through: 1. Definition, 2. Capture, 3. Preparation, 4. Synthesis, 5. Analytics/Use, 6. Publication, 7. Archival, 8. Purge
151
purpose of Preparation step in Data life cycle
determine if data is complete, current, clean, encrypted, and user friendly
152
Steps to validate captured data
1. compare # of records, 2. compare stats for numeric fields, 3. validate field formats, 4. compare character limits
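
The four checks above can be sketched roughly as follows. The field names, sample records, and 3-character ID limit are hypothetical.

```python
# Hypothetical source extract and the copy captured in the analysis tool.
source = [{"id": "001", "amount": 10.0}, {"id": "002", "amount": 25.5}]
loaded = [{"id": "001", "amount": 10.0}, {"id": "002", "amount": 25.5}]

def validate(source, loaded, id_length=3):
    """Run the four checks and report which ones pass."""
    return {
        # 1. same number of records
        "record_count": len(source) == len(loaded),
        # 2. a simple statistic (total) on a numeric field
        "amount_total": sum(r["amount"] for r in source) == sum(r["amount"] for r in loaded),
        # 3. field format: IDs should be all digits
        "id_format": all(r["id"].isdigit() for r in loaded),
        # 4. character limit on the ID field
        "id_length": all(len(r["id"]) <= id_length for r in loaded),
    }

checks = validate(source, loaded)
```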
153
Steps to clean data
1. remove headings/subtotals, 2. clean leading zeros and nonprintable characters for consistency, 3. format negative numbers, 4. ID and correct inconsistencies (CA vs CAL), 5. address inconsistent data types
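
A rough sketch of several of these cleaning steps on a made-up extract. The column names, state mapping, and the parenthesized-negative format are assumptions.

```python
def clean_amount(text: str) -> float:
    """Convert accounting-style text such as '(1,200.50)' into a signed number."""
    text = text.strip().replace(",", "")
    if text.startswith("(") and text.endswith(")"):
        return -float(text[1:-1])
    return float(text)

STATE_FIXES = {"CAL": "CA", "CALIF": "CA"}  # hypothetical mapping of inconsistent codes

def clean_row(row: dict) -> dict:
    state = row["state"].strip().upper()
    return {
        "id": row["id"].strip().lstrip("0") or "0",   # drop leading zeros
        "state": STATE_FIXES.get(state, state),       # correct inconsistencies
        "amount": clean_amount(row["amount"]),        # text -> number
    }

raw = [
    {"id": "00042", "state": "CAL", "amount": "(1,200.50)"},
    {"id": "Subtotal", "state": "", "amount": "1,200.50"},  # subtotal row to remove
]
# Keep only rows whose ID is numeric, which drops heading/subtotal rows.
cleaned = [clean_row(r) for r in raw if r["id"].strip().isdigit()]
```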
154
purpose of Synthesis step in Data life cycle
Bridge between prep and use - calculated fields that can be useful
155
Extract, Transform, and Load
Data collection - extracted from org. source, transformed into useful info, and loaded in tool for use/analysis.
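
A compact sketch of the three ETL stages using only the Python standard library. The CSV content, table name, and the conversion to cents are made up for illustration.

```python
import csv
import io
import sqlite3

raw_csv = "order_id,amount\n1,10.50\n2,4.25\n"  # hypothetical source extract

# Extract: read the rows out of the source.
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# Transform: cast text to numbers and store amounts as whole cents.
records = [(int(r["order_id"]), round(float(r["amount"]) * 100)) for r in rows]

# Load: write the transformed rows into an in-memory database used for analysis.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, amount_cents INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", records)
total_cents = conn.execute("SELECT SUM(amount_cents) FROM orders").fetchone()[0]
```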
156
Operational Data Store
Repository of transactional data from multiple sources - often an interim between a data source and a data warehouse. Can be operational or system-related data. Data sets are smaller and frequently overwritten as transactions are modified, processed, and reported.
157
Data Warehouse
Very large data repositories - centralized and used for reporting and analysis. Pulls data from enterprise systems or ODS, combined into single repository used for reporting, to create data marts, or other.
158
Data Mart
Like data warehouse but for a specific purpose (e.g. marketing), subset of data warehouse - chooses highly relevant data points from data warehouse
159
Data Lake
Similar to data warehouse but contains both structured and unstructured data, data in natural raw form. Not indexed or prepped
160
relational database
Helps assure that data are complete, not redundant, and that business rules and IC are enforced; aids in communication and integration across processes. Stores data across a series of related tables; each table has columns (attributes) and rows (records), and each column is unique and relevant.
161
Three types of columns in relational database
Primary keys Foreign keys Descriptive attributes
162
Tables
Part of relational database - at least two that are related - org structures that establish columns and rows to store a specific type of data. Aka "entities". Each represents an "object" in a database
163
Attributes
Columns in tables - describe the characteristics or properties desired to be known about each table/entity, e.g. Last Name
164
Primary Key
Attribute required in every table to help ensure that each row is unique - rarely descriptive and usually numbers/letters (order #, IDs, etc.)
165
Composite Primary Key
When more than one attribute is combined to make a unique identifier
166
Foreign Key
Attributes in one table that are primary keys in another (e.g., customer ID). The link from a foreign key to the primary key in another table is what makes a database relational.
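A hedged sketch of primary, composite primary, and foreign keys using Python's built-in `sqlite3`. The tables `customers`, `orders`, and `order_lines` and their values are hypothetical. SQLite only enforces foreign keys when the `foreign_keys` pragma is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,      -- primary key
    last_name   TEXT                      -- descriptive attribute
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customers(customer_id)  -- foreign key
);
CREATE TABLE order_lines (
    order_id INTEGER REFERENCES orders(order_id),
    line_no  INTEGER,
    quantity INTEGER,
    PRIMARY KEY (order_id, line_no)       -- composite primary key
);
""")
conn.execute("INSERT INTO customers VALUES (1, 'Lopez')")
conn.execute("INSERT INTO orders VALUES (100, 1)")


def try_insert(sql):
    """Run an INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"


duplicate_pk = try_insert("INSERT INTO customers VALUES (1, 'Chen')")   # repeats a primary key
orphan_fk = try_insert("INSERT INTO orders VALUES (101, 999)")          # no such customer
first_line = try_insert("INSERT INTO order_lines VALUES (100, 1, 5)")
second_line = try_insert("INSERT INTO order_lines VALUES (100, 2, 3)")  # same order, new line
repeat_line = try_insert("INSERT INTO order_lines VALUES (100, 1, 7)")  # repeats the composite key
```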
167
Normalization
Technique to reduce redundancies and eliminate undesirable characteristics
168
First Normal Form
Each cell/field must contain only one piece of information; each record must be uniquely identified (primary key)
169
Second Normal Form
All non-key attributes depend on the entire primary key
170
Third Normal Form
Each column describes only the primary key - non-key attributes must not depend on other non-key attributes
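A small illustration of the first and third normal forms in plain Python. The customer rows are made up: the `phones` cell holds several values (violates 1NF), and `city` depends on `zip_code`, a non-key attribute (violates 3NF).

```python
# Hypothetical unnormalized rows keyed by customer_id.
flat = [
    {"customer_id": 1, "phones": "555-0100;555-0101",
     "zip_code": "94105", "city": "San Francisco"},
    {"customer_id": 2, "phones": "555-0199",
     "zip_code": "94105", "city": "San Francisco"},
]

# 1NF: one value per cell, so each phone number becomes its own row.
customer_phones = [
    (row["customer_id"], phone)
    for row in flat
    for phone in row["phones"].split(";")
]

# 3NF: city moves to its own table keyed by zip_code, so it is stored once.
zip_codes = {row["zip_code"]: row["city"] for row in flat}
customers = [(row["customer_id"], row["zip_code"]) for row in flat]
```

After the split, the city name is stored once per zip code instead of once per customer, which is the redundancy that normalization is meant to remove.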
171
Data models
High-level design of data structures in an information system; not restricted to relational databases
172
data models - conceptual
Least complex - high-level big picture. Defines main entities and relationships of data. Overall structure and meaning of data. Useful for communicating design to stakeholders
173
data models - logical
More detailed representation of data structures at the level of the data itself. Defines entities and relationships for the data and the attributes of each entity. Useful for data-oriented projects (e.g., designing a data warehouse). Contains more information than the conceptual model: the logical model includes primary/foreign keys and adjusts for 1NF and 2NF.
174
data models - physical
Most detailed - how data will be stored; entities are referred to as tables and attributes as columns. Model is complete enough that the database can be built based on it; includes data dictionary and the data type of each attribute, character limits, required fields, default values
175
database schema
Implementation of a data model - set of instructions that tells the database engine how to organize data to comply with the data models. Defines the structure of the database - how data is stored/accessed
176
Fact tables
Contain measures/metrics
177
Dimension tables
Descriptive or contextual data for measures, such as dates, product names, customer names.
178
Star Schema
Most common and simplest schema for dimensional modeling. Organized into a central fact table with associated dimension tables surrounding it
179
Snowflake Schema
Dimension tables are broken down into multiple related tables; more complex than a star schema because it has more tables and foreign keys, but more flexible because more detailed information is stored about dimensions.
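A minimal sketch of a fact table with dimension tables, using an in-memory SQLite database. All table names and values are hypothetical. `fact_sales` joined to `dim_product` is the star-style layout; splitting category detail into `dim_category` is the snowflake-style step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Snowflaked dimension: category detail split out of the product dimension.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
);
-- Central fact table: a foreign key to the dimension plus a measure.
CREATE TABLE fact_sales (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Hardware');
INSERT INTO dim_product VALUES (10, 'Router', 1), (11, 'Switch', 1);
INSERT INTO fact_sales VALUES (1, 10, 120.0), (2, 11, 80.0), (3, 10, 30.0);
""")

# Star-style query: one join from the fact table to a dimension.
sales_by_product = conn.execute("""
    SELECT p.product_name, SUM(f.amount)
    FROM fact_sales AS f
    INNER JOIN dim_product AS p ON f.product_id = p.product_id
    GROUP BY p.product_name
    ORDER BY p.product_name
""").fetchall()

# Snowflake-style query: an extra join (and foreign key) to reach category detail.
sales_by_category = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales AS f
    INNER JOIN dim_product AS p ON f.product_id = p.product_id
    INNER JOIN dim_category AS c ON p.category_id = c.category_id
    GROUP BY c.category_name
""").fetchall()
```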
180
SQL Commands
Language-specific words (e.g., SELECT). Case does not matter; however, uppercase is common to differentiate them from database elements.
181
Database elements
References to table names, attribute names, or criteria. Must be spelled the same as the table names or attribute names; case does not matter. Common to use proper case
182
SQL Clauses
Phrases that begin with commands and include database elements
183
SELECT
Attributes you wish to view - the order you list them in is the order in which they are displayed
184
FROM
Which table contains the attributes the user is selecting
185
WHERE
Like an Excel filter - when filtering for text, the text needs to be surrounded in quotes
186
Aggregates
Used to create grand totals or subtotals - SUM, COUNT, AVG, MIN, MAX
187
GROUP BY
Takes two modifications to the SELECT query that returns a grand total (add the grouping attribute to SELECT and add the GROUP BY clause) - allows for subtotals.
188
HAVING
Filters the groups on an aggregate value using comparisons - equal, greater than, less than
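The SELECT, FROM, WHERE, aggregate, GROUP BY, and HAVING cards above can be tried against a small in-memory SQLite table. The `Sales` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Region TEXT, Rep TEXT, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [("West", "Ana", 100.0), ("West", "Bo", 50.0),
     ("East", "Cy", 40.0), ("East", "Di", 20.0)],
)

# SELECT / FROM / WHERE: the text criterion 'West' is surrounded in quotes.
west_rows = conn.execute(
    "SELECT Rep, Amount FROM Sales WHERE Region = 'West' ORDER BY Rep"
).fetchall()

# Aggregate with no grouping: a single grand total.
grand_total = conn.execute("SELECT SUM(Amount) FROM Sales").fetchone()[0]

# GROUP BY: add Region to SELECT and add the GROUP BY clause to get subtotals.
subtotals = conn.execute(
    "SELECT Region, SUM(Amount) FROM Sales GROUP BY Region ORDER BY Region"
).fetchall()

# HAVING: keep only the groups whose subtotal is greater than 100.
large_regions = conn.execute(
    "SELECT Region, SUM(Amount) FROM Sales GROUP BY Region HAVING SUM(Amount) > 100"
).fetchall()
```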
189
How to relate two tables
Identify matching primary and foreign keys
190
INNER JOIN
Order of the tables does not matter for the clause; however, the matching keys are indicated with the ON clause. Retrieves data from two tables where there is a match: FROM table1 INNER JOIN table2 ON table1.matching_key = table2.matching_key
191
LEFT JOIN
Returns all records from the left (first) table, even when there is no match in the second table
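A short comparison of INNER JOIN and LEFT JOIN using in-memory SQLite. The `Customers` and `Orders` tables are hypothetical; the customer Chen has no orders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER PRIMARY KEY, LastName TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID INTEGER);
INSERT INTO Customers VALUES (1, 'Lopez'), (2, 'Chen');
INSERT INTO Orders VALUES (100, 1);
""")

# INNER JOIN: only rows where the primary key and foreign key match.
inner_rows = conn.execute("""
    SELECT Customers.LastName, Orders.OrderID
    FROM Customers
    INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerID
""").fetchall()

# LEFT JOIN: every row from the left table; unmatched columns come back as NULL (None).
left_rows = conn.execute("""
    SELECT Customers.LastName, Orders.OrderID
    FROM Customers
    LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID
    ORDER BY Customers.CustomerID
""").fetchall()
```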
192
Business Process Modeling Notation (BPMN)
Creates flowcharts referred to as activity models; uses a common set of symbols/rules to look for efficiency and effectiveness in human tasks, and also for what could be automated.
193
Swim Pools
Used to quickly showcase how many organizations are involved
194
Swim Lane
More granular than pools - indicate segregation of duties
195
Events
Circles that indicate when a process kicks off/completes. Start events - smooth circles; end events - bold circles; intermediate events - something changes in course (time delay/error) - double-lined circles.
196
Tasks
Actions in the process - rectangles with rounded edges
197
Sequence Flows
Solid-lined arrows
198
Message flows
Used if there is more than one pool - communication between internal/external organizations - dashed arrows between two pools
199
Gateway
Diamond shape - represents a question (e.g., sufficient inventory?)
200
Data flow diagrams
Used to describe the flow of data through a process
201
DFD - Process
Any action that results in data changing and producing a new output - either a rectangle or a circle
202
DFD - Data flow
Direction data flows through the process, labeled to indicate the data flowing - either a curved or straight arrow
203
DFD - Data store or warehouse
Where data is stored for later use - open-ended rectangle
204
DFD - External Entity or Terminator
Entity that receives the data at the end of a set process - square