BI Flash Cards - BI Flash Cards (2)

1
Q

SSDT

A

Sql Server Data Tools

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

Can you build packages without connecting to an SQL instance?

A

Yes

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

SSIS architecture

A

Five main components
SSIS Service (for legacy ssis packages)
SSIS Runtime engine and runtime executables
SSIS Catalog
SSIS Data Flow engin and the Data Flow Components
SSIS Clients

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What handles the operational aspects of SSIS

A

SSIS catalog and SSIS service (for legacy mode packages)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

Do you need the service to run packages?

A

no

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What happens if you stop the service?

A

any packages running will stop

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

Package Catalog

A

Has new features/ like running packages with T-SQL

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What runs the packages?

A

SSIS Runtime engine and complementary programs

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

Containers

A

Add structure and scope to packages

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

Tasks

A

functionality

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

Event Handlers

A

Respond to raised events in your package

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

Precedence Constraints

A

Provide ordinal relationship between various items in your package/ the order of execution (green arrow)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

Package

A

Collection of tasks that execute in an orderly fashion

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

Package is saved where?

A

SQL Server (MSDB) or package catalog database. It can also be saved as a .dtsx file (xml structured)

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

Data flow tab is enabled when?

A

When a data flow task is created

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

Control Flow tab does..

A

handles the main workflow of the package

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

Data flow tab does

A

handles data transformation

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
18
Q

How many control flows does a package have

A

1

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
19
Q

How many data flows?

A

many

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
20
Q

Excel Destination

A

outputs data from data flow to excel which must already exist

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
21
Q

SQL Server Destination

A

Package/ must be executed at the destination

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
22
Q

Transformations run in-memory or on disk?

A

In memory. Incurring I/O on the page file on disk will slow things way down

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
23
Q

Aggregate transform

A

Similar to group by

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
24
Q

Fuzzy Grouping

A

Finding rows that are likely duplicates

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
25
Fuzzy Lookup
Matches and standardizes data
26
Ole DB Command
Executes an OLE DB command for each row in the data flow/ can be used to run an update or delete statement inside the data flow
27
Sort
Sorts by a column and removes exact duplicates
28
Project
Container of one or more packages and other SSIS components
29
Project Deployment model
New model that enables many of the new SSIS 2012 features. Deploys the entire project of packages/ and packages can be configured by the DBA
30
Package Deployment model
Old model where packages are deployed one at a time
31
Solution
Container of one or more projects. Can house a C# project/ an SSRS project etc all together
32
Can a solution house more than one type of project?
Yes
33
Shared parameters and shared connections are only available in the
Project deployment model
34
.dtsx
SSIS Package
35
.ds
Shared Data Source file
36
.sln
solution file containing one or more projects
37
.dtproj
SSIS project file
38
.params
shared parameter file
39
.conmgr
shared connection manager
40
Miscellaneous folder
holds documentation/ etc. can be checked into version control software
41
How to convert back and forth between package and project deployment
Right click on the project and select "convert to (product or package)"
42
Two tabs in SSIS are..
Control flow and data flow
43
Data flow tab consists of what tasks?
sources/ transforms and destinations
44
Enter debug mode by doing what?
Execute the package by right clicking on it in the solution explorer and hit execute package. Opens progress tab
45
Progress tab
Where you go when you have a problem you need to debug
46
Server Explorer is accessed by
Tools - connect to database
47
Server explorer lets you
run queries/ create stored procedures/ redesign tables
48
Connection manager
connection that can be leveraged and consumed once or many times in a package
49
Can a connection manager connect to the internet? If so/ what ways?
Yes/ SMTP/ FTP/ and HTTP
50
Are connection managers externally available to the DBA
Yes
51
Externally available (connection manager) means
that the connection manager can be repointed
52
How to create a project connection manager
right click on connection managers in the solution explorer
53
convert a connection manager to a project cm
right click on it in the connection manager pane
54
configure tasks in control flow by
double clicking
55
Naming tasks
Example SCR - Encrypt file. Helps you self document your package and gives clarity to the logs
56
How is a package encrypted by default
With your windows user key
57
EncryptSensitiveWithUserKey
Encrypts credentials with your user key/ also Encryptallwithuserkey does the whole package
58
EncryptSensitiveWithPassword
Adds a password so that anyone who runs the package has to have the password
59
What helps with packages running in production that fail due to permissions
select encryptallwithpassword and pass the password in when scheduling or running
60
Arrow color meanings in control flow
Green - success red - failure blue completion
61
When upgrading SSIS
Connection strings are updated/ but if your package is dynamically changing the strings it won't upgrade that old packages aren't converted to the new deployment model
62
Arrow color meanings in data flow
red - bad data / green - good data
63
Predecessor tasks
any task with a precedence constraint that is connected to another task
64
How many evaluation operations are there in precedence constraints
4
65
What are the different evaluation options in a precedence constraint
Constraint, Expression, Expression and Constraint, Expression or constraint
66
ForceExecutionResult
A property in a task that overrides the execution result to a desired result
67
Operation Property in File System Task
Decides what the file system task will do
68
File System Task: IsSourcePathVariable
Is the source of the file system task a variable or a location
69
File System Task: OverwriteDestination caveats
If you are copying a directory and set this to yes, you may overwrite a needed file, set to no and if the directory exists then your package will fail
70
Will the rename file operation property move a file?
If the source is different from the destination
71
UseDirectoryIfExists
File system task option under create directory. Either uses the directory or fails
72
Script Task is broken down into
Script / General / Expressions node
73
Dts.Variables() method
used to change package level variables
74
Common Uses for Script tasks
Changing connection strings, whether a file exists, whether it is in use, if-then logic
75
TypeConversionMode (Execute SQL Task)
Allows for datatypes to be converted for SSIS tasks
76
Resultset property (exec sql task)
Can be XML, single row, none, which can be stored in an object variable
77
Exec SQL task takes what as the SQL Source Type
Direct input (type it in), File connection (command saved in a file), and variable
78
Parameter mapping in Sql Task starts at
zero
79
SQL Task - isquerystoredprocedure
Only available on the ADO and ADO.net option
80
BypassPrepare Query property
"compiles" the query so it doesn't need to be analyzed every time it is used, won't parse unless it is set to false
81
Execute Process Task
Allow for windows and console applications to be used from within the package. Encryption is an example
82
Expression Task
Manipulates variables in the control flow without having to use VB.NET, good for incrementing variables in a loop, like a set of files
83
Send Mail Task
Information about the package passed through system variables or user defined variables
84
Where to place the send mail task
At the end of the package, or in the event handler area
85
Send Mail OnPreExecute
Will send an email saying when the package starts
86
Send Mail OnPostExecute
Send mail indicating the start and end time of the package
87
OnError or OnWarning
Send Mail task will send mail on these conditions
88
Can the sendmail task send files
Yes
89
SendMail connects how?
Through SMTP connection manager
90
SendMail Credentials
Either sent through SSL or windows authentication
91
SendMail message sourcetype Direct Input
Message is typed in the send mail task
92
sendmail MessageSourceType File connection property setting
Message is saved in an external file
93
sendmail message sourcetype variable property
Message is stored in a package variable
94
FTP Task
Sends and receives files via FTP
95
FTP Port
Usually 21
96
FTP Chunk Size
The chunk of file (similar to packet) to be sent, helps avoid exceeding network limitations
97
FTP Stop on Failure
Do you want the transfer to continue if part of it fails?
98
FTP Operation options
Send, receive, create local directory, create remote directory, remove local/remote directory, delete local/remote files
99
Data Flow Task
transfers data from source to destination and can transform data as needed
100
A Dataflow task can write to multiple locations T/F
TRUE
101
OLE DB Provider Is being deprecated because of...
Cloud computing, but you should still use it over ODBC because of higher functionality
102
Most common source used
Ole db
103
OLE DB has 4 data access modes
Table or view, table or view indicatied in a variable, sql results, or sql statement initiated in a variable
104
Can SSIS pass a variable into an SQL query in the oledb source?
Yes, by using the ?
105
three nodes in an oledb source
connection manager, columns (only check the ones you want), Error output (redirects data, ignores, or fails the component)
106
Can you run a query against an excel source?
Yes, it treats excel like a database
107
Native Excel Driver is
32 bit
108
If you use the native excel driver
your packages will have to run in 32 bit mode, even on a 64 bit workstation
109
to change to 32 bit mode
right click on the project file and select properties, select the debugging tab change run64bitruntime to false
110
32 bit runtime on Excel Source only applies
in SSDT because it is a 32 bit application. On a server you can download the microsoft access database engine which includes 64 bit driver
111
How do you correct datatypes in a flat file source
in the advanced tab of the flat file connection manager
112
Destinations cannot be configured until
you drag the blue/red line to it
113
in the destination flow if you don't see the destination you want
simply uncheck "show only installed destinations"
114
Destination Flow - Table lock
prevents anyone from accessing the table targeted, speeds up the load
115
What does the sheet creation look like in an excel destination?
It looks like a create table command in SQL
116
Data conversion transform does what
changes datatypes
117
output alias in Data Conversion Transform
defaults to copy of...
118
length precision and scale
used in a data conversion transform
119
length
number of bytes to store the number, or the number of characters in a string
120
precision
total number of digits
121
scale
number of digits to the right of the decimal
122
Fast parse is available in which transforms?
Flat file source and data conversion transform.
123
Fast parse does what?
Disables verification of a particular column
124
When would you use fast parse?
Only if you are sure of the quality of the data
125
synchronous transform
all the rows don't need to loaded into memory before being manipulated. Output is synchronous with input
126
asynchronous transform
all the rows need to be loaded into memory before being manipulated, sort and aggregate are examples
127
The function of a derived column transform is?
To either replace or create a column in the data stream
128
Common uses for derived column transform
Auditing rows and editing incoming data using the SSIS expressions
129
Variables are preceded with what sign?
@
130
what function parses the middle of a string
substring
131
Number of keys text box in the aggregate transform
put the number of groups you estimate will be created, found in the advanced tab
132
Aggregate transform is what kind of transform?
Asynchronous and fully blocking
133
Sort Transform
sorts by multiple columns, May choose which columns will pass through, can remove rows with duplicate sort values
134
How do you tell the package that the data is presorted?
Right click the source, click advanced editor, go to the input and output properties and select oledb source output. In the properties pane, change issorted property to true, then you go to output columns and click on each sort column and put in the order
135
Lookup transform can join how many data sets?
two
136
no cache mode in lookup transform
High performance overhead on the system
137
Full cached mode in lookup transform
One table loaded entirely from memory
138
Is a lookup transform synchronous?
Yes
139
Partial cache
Middle ground between no cache and full cache in a lookup transform
140
Match output and no match output
Chooses which rows go to the next destination, the matching or the non matching
141
Cache Connection Manager and Transform
Allows you to persist lookup information in the cache (same or different package) so you don't have to reload the cache every time the lookup information is used
142
Where does the cache transform store data
in either the pagefile (virtual memory) or in the permanent file storage
143
Cache Connection Manager can...
reference the in-memory cache and read and write the cache to a disk based file
144
How does a CCM optimize itself?
by allowing you to decide which columns are indexed and which are referenced
145
how to use cached connection
open the cache transform and create a new cached connection manager, choose which columns are index columns and what position they are in the index order
146
Script task in a data flow can perform these functions
Transform, source, or destination
147
data flow script task use as a source
May need to add some business rules as it comes out of the source, sometimes necessary with COBOL files
148
data flow script task as a destination
use the script to write to non OLEDB destinations, like XML or Sharepoint
149
selecting readWrite for a variable does what
Locks during the duration of the execution
150
Input Columns tab of Data Flow Script Task caveat
make sure to only mark columns you need access to writing as such, as this slow things down
151
Can a script task in a data flow support multiple outputs?
Yes
152
debugging script task in a data flow
Doesn't allow for breakpoints, so you will need to use more arcane methods like msgboxes
153
Conditional Split uses what language?
SSIS expression language
154
What can you use as a temporary destination
a union all transform
155
when should you use .NET in a package?
Only when necessary, as the person who inherits the package might not know it
156
OLE DB Transform
executes an SQL statement for every row in the data stream, ie an update for every row!
157
fuzzy lookup
minimizes unmatched rows with fuzzy logic also indicates the quality of the match
158
when should you use a fuzzy lookup transform
only after you have used a regular lookup transform
159
fuzzy lookup requires at least one field to be
a string to be mapped to string in the reference table
160
fuzzy lookup passthrough
column is not analyzed but appears in the output stream
161
fuzzy grouping
eliminates possible duplicates
162
Is fuzzy lookup/grouping expensive?
Yes
163
similarity threshold
likeliness of match in percentage
164
what are the two outputs of a lookup
match output and no match output
165
How do you use an expression for the variable?
Press f4 to bring up properties after selecting the variable
166
How do you change the scope of a variable?
Move Variable icon in the variables dialog
167
Fuzzy grouping key in and key out does what?
Key in is the value in, and the key out is the best match of the possible matches
168
How do you use a conditional split to clean up fuzzy grouping?
Only save rows where the key in matches the key out
169
Connection managers function only as sources
false, they are used in both sources and destinations
170
How do you get a variable to evaluate as an expression?
select it and press f4, this brings up properties. Change evaluate as expression to true, then enter the expression in the expression property
171
Are variable names case sensitive?
Yes
172
parameter value in ssdt
design or default value, may be replaced prior to execution
173
paramter sensitive property
marked as sensitive, its value is encrypted
174
parameter required
you must provide the parameter prior to execution, rather than using the design parameter
175
parameter scope
package scope or project scope
176
To create a project scoped parameter
select project.params in the solution explorer
177
Uses for parameters
Can be used to provide encrypted information, like passwords. Also, it is much easier to set runtime values for parameters over variables
178
you can change any available property...
using an expression
179
When using expressions for properties remember to
convert to the appropriate datatype
180
Common use of expressions in a connection manager
is to import a collection of flat files using the same package (foreach loop container)
181
how do you use an expression for multiple files
change the connection string between iterations
182
To use a backslash in SSIS expressions
write a double backslash, even in quotes
183
Can you use expressions in tasks?
Yes
184
How do you use expressions in tasks?
select the task and hit f4, find the expression property, click the ellipsis
185
What should you do before you write an expression?
Determine the datatype of the property
186
Two ways to track changes to the source table
Keep a shadow copy of the table, or use change data capture
187
how to handle deletes in ETL?
either scan the table or keep a list of keys.
188
Establishing the start and end dates
Search the destination table for the max modified date, set the start date to now. Use andExecuteSQL task with resultset set to one row
189
Logical vs Physical Deletes
Logical uses a delete flag, physical is just a delete
190
Passing values from executeSQL task to data flow OLEDB source
uses parameters set by executesqltask
191
CDC Control Task
Responsible for INITIAL loading of data and maintaining incremental load status by keeping with the log serial number associated with changes, does error handling
192
CDC Source Task
Reads data that has been inserted, updated, or deleted within a range and provides those rows to downstream consumers
193
CDC Splitter
Accepts dataflow from the CDC source task and splits it into three data flows, insert, update, delete
194
How does CDC work?
By reading the log files
195
Uses for CDC
data marts and data warehouses, also replication and auditing
196
CDC uses how many packages?
two, one for initial, and one for incremental
197
Data Quality Services
Helps improve data in the organization
198
Master Package does what?
Runs child packages, in certain order
199
Master Package variables
are controlled globally and shared with the child packages
200
Master packages allow you to control parallelism t f
True
201
sequence containers
group tasks to disable a part of the package that isn't needed Narrow the scope of a variable to just the container collapsing and expanding the container to hide the tasks within managing the properties of multiple tasks in one step by setting the container using one method to make sure multiple tasks execute successfully creating a transaction across data related tasks, not the entire package event handlers on a single container
202
for loop container
loops through control flow tasks
203
Types of iterating containers
for and foreach
204
for each has what types of objects?
File, Item (set manually), ADO (list of tables or rows), SSIS variable, XML nodelist, SMO (SQL Management Objects)
205
Slowly changing dimension transform
Inserts and updates to dimension tables, only works with sql server
206
Bulk Insert Task
Loads directly into SQL Server without transformation
207
Execute Process Task
Opens application or batch file
208
Expression task
builds and evaluates expressions that set variables at runtime
209
Parameters can be scoped to
Package or project level
210
How do you create a package level parameter
parameter tab
211
How do you create a project level parameter?
Project.params under solution explorer
212
Data Quality Services Checks for
Completeness, Consistency, Conformity, Validity, Accuracy, Duplication
213
Data Quality Services keeps decisions in
DQKB
214
DQKB
Data Quality Knowledge Base
215
Data Quality Services has what components?
Data Quality Server and Client
216
=
Set a variable to a value
217
==
checks to see if there is equality between two values
218
Configuration Tables are used for
Storing values to be used for production deployment, for example connection strings, etc
219
Configuration tables are only available in the
package deployment model
220
Best practice for configuration manager for configuration table
It should have its own connection manager, even if it is in the source database
221
Many packages may use the same configuration table?
True
222
How does each package filter out its configurations?
With the filter name on the configuration table
223
How do you save configuration information in an SSIS package?
Configuration tables, or XML configuration files
224
advantages of using XML to store configuration info
compatible with source control systems, easier than adding rows to a database
225
using xml files for configuration requires which deployment model?
Package
226
Ways to troubleshoot
Logging, event handlers, data viewers, breakpoints
227
how to enable loggoing
Right click in the control flow and select logging
228
places logs can be stored
Sql Server, SQl Server Profiler, Windows Event Log, Text File, XML file
229
two common uses for an event handler
notification and logging
230
how to create custom logging
Use Execute SQL Tasks in the Event Handlers tab
231
When should data viewers be removed?
When sending a package to production
232
How do you create a data viewer
Double click on the red or blue line connecting two tasks, select the dataviewer node
233
Once in the dataviewer node of the data flow path editor, how do you proceed
enable the dataviewer with the checkbox, and then select the columns you want
234
What does the detach button do in the data viewer
It allows the package to continue running without triggering the dataviewer
235
How do breakpoints work
allow you to inspect the state of tasks, variables, and data at different points in the package
236
To use configuration tables, your package must
use the package deployment model
237
Configuration tables can be used for
any property for the package, connection, container, variable, or task
238
Project deployment model uses what for configuration information
parameters and environments
239
How to create configuration table for a package
Right click on the canvas in the control flow and choose package configurations