IICS CONCEPTS Flashcards

1
Q

What is the difference between Connected and Unconnected Lookup ?

What are some common uses for them ?

A

Connected Lookup– Is a part of the mapping data flow. With connected lookups you can have multiple return values. That is you can pass multiple values from the same row in the lookup table out of the lookup transformation.
Common uses for them are
* Finding a name based on a number
* Finding a value based on a range of dates
* Finding a value based on multiple conditions

Unconnected Lookup: Transformation that exists separate from the data flow in the mapping.
Common uses for them are:

  • Testing the results of a lookup in a expression
  • Filtering records based on the lookup results
  • Marking records for update based on the result of a lookup ( for example updating slowly changing dimensions table)
  • Calling the same lookup multiple times in one mapping.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
2
Q

What is a Normalizer ?

A

It is an active transformation that transforms one incoming row to multiple output rows

When the Normalizer transformation receives a row that contains multiple-occurring data, it returns a row for each instance of the multiple-occurring data.

For example, a relational source includes four fields with quarterly sales data. You can configure a Normalizer transformation to generate a separate output row for each quarter.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
3
Q

What are the normalizer properties ?

Desribe them as well ?

A
  1. Normalized Fields tab: It define the multiple-occurring fields and specify additional fields that you want to use in the mapping.
  2. Field Mapping tab: It connect the incoming fields to the normalized fields.
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
4
Q

What is GeneratedColumnID (GCID) and GeneratedKey (GK) ?

A

GeneratedColumnID (GCID): Generates an incremental value starting with 1 for each occurrence of a multiple-occurring data within the row. Resets back to 1 for next row.

GeneratedKey (GK): Generate a key for each incoming row. The key starts with one and is incremented by one for each processed row. It is an optional field.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
5
Q

What is SCD ?

A

In data management and data warehousing, a slowly changing dimension (SCD) is a dimension that consists of relatively static data that can change slowly but unexpectedly, rather than on a regular schedule. Some examples of specific slowly changing dimensions are entities in the form of names of geographic locations, customers or products.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
6
Q

What are the 6 types of SCD and how are they used ?

A

Type 0 – Fixed Dimension

-No changes allowed, dimension never changes

Type 1 – No History

-Update record directly, there is no record of historical values, only current state

Type 2 – Row Versioning

-Track changes as version records with current flag & active dates and other metadata

** Type 3 – Previous Value column**

-Track change to a specific attribute, add a column to show the previous value, which is updated as further changes occur

Type 4 – History Table

-Show current value in dimension table but track all changes in separate table

Type 6 – Hybrid SCD

-Utilise techniques from SCD Types 1, 2 and 3 to track change.

SCD 0,1,2 are widely used. There is no SCD 5

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
7
Q

WHAT IS

WHAT IS SCD2 ?

What is a Dimension Table ?

A

SCD type 2 will store the entire history in the dimension table.
In SCD type 2 effective date, the dimension table will have Start_Date (Begin_Date) and End_Date as the fields.
If the End_Date is Null, then it indicates the current row.

SCD type 2 retains the full history of values.
When the value of a chosen attribute changes, the current record is closed.
A new record is created with the changed data values and this new record becomes the current record.

Dimension table having the detailed information about the product.

Dimension tables are typically small, ranging from a few to several thousand rows.
It provides the context /descriptive information for a fact table measurements and it’s structure is smaller than Fact table.
Surrogate Key is used to prevent the primary key (pk) violation (store historical data).

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
8
Q

What is a Fact table ?

A

Fact table having the aggregate information about the product.
Fact tables can grow very large, with millions or even billions of rows.

Fact table is a table that contains summarized numerical and historical data (facts) and a multipart index composed of foreign keys from the primary keys of related dimension tables.
It size is larger than Dimension table.

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
9
Q

What are the transformations that is active and which transformation is passive ?

A

Active transformations: Filter, Aggregator, joiner, rank, router and source qualifier

Passive transformations: Expression, Sorter, Sequence Generator, Lookup

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
10
Q

How can we make joiner as a passive transformation ?

A

By selecting the Full join

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
11
Q

How can we make sorter transformation as active transformation ?

A

By using Distinct option

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
12
Q

How many methods we have to achieve SCD2 and which method is mostly used ?

A

There are 3 methods

Method 1 ——- FLAG
Method 2 ——– Version
Method 3———- Effective date

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
13
Q

what are the cloud services ?

Iaas?
Paas?
Saas?

A
  1. (infrastructure as a Service) third type of computing across the cloud, Iaas provides a virtualized platform.
  2. (Platform as a Service) Provides computers platforms for users of cloud computing
  3. (Software as a service) applications are hosted on cloud servers and users access them over the internet
How well did you know this?
1
Not at all
2
3
4
5
Perfectly
14
Q

What is data intergration used for in the cloud ? What are the different tasks that are used for batch jobs ? Define them as well

What are the rare tasks that get used as well ?

A

Developing Batch jobs
* Mapping task - This is similar to the PowerCenter mapping designer. This is used to define the data flow logic to process the data.
* Replication task - this can be used to replicate more than one object from a source to a target. You can either use full load, incremental load options for replicating the data.
* **Synchronization task **- This is used to synchronize the data between the source and target. It provides DML operations like insert, update, upsert and delete. Using this task, we can sync only one object data at a time.
* Taskflows - You can combine other tasks and run them together in a flow. You can run these tasks in sequence, parallel or based on a decision.
The rare tasks are

Powercenter task
Integration
Cleansing
Warehousing

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
15
Q

What does the filter transformation do ? What happens when the filter condition is true or false ?

What exactly is a filter ? And how does it work best in mapping effiency

A
  • The Filter transformation filters data out of the data flow based on a specified filter condition.
  • To improve job performance, place the Filter transformation close to mapping sources to remove unnecessary data from the data flow.

Filter condition is an expression that returns TRUE or FALSE.
- When the filter condition returns TRUE for a row, the Filter transformation passes the row to the rest of the data flow.
- When the filter condition returns FALSE, the Filter transformation drops the row.

You can use all the operators in filter especially this one too != which

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
16
Q

What is an expression transformation ? What else do you use the transformation for ?

A

The Expression transformation calculates values within a single row. Use the Expression transformation to perform non-aggregate calculations.

You might use an Expression transformation to
- concatenate first and last names,
- change FIRST_NAME to upper case or lower case,
- change LAST_NAME to lower case or upper case,
- add number of months in date column etc.

you are manipulating rows

How well did you know this?
1
Not at all
2
3
4
5
Perfectly
17
Q

What is a sorter transformation ? Is it active or passive ?

Tell us more about sorter transformations like the specifics ?

A

Sorter transformation in IICS is used to sort the data in an ascending or descending order based on single or multiple keys.

This is the type an active transformation which sorts the data either in ascending order or descending order.

  • You can sort data that passes through a Lookup or an Aggregator transformation configured to use sorted incoming fields.
  • When you create a Sorter transformation, specify fields as sort conditions and configure each sort field to sort in ascending or descending order.
  • You can use a parameter for the sort condition and define the value of the parameter when you configure the mapping task.

It can also supress the duplicate records in the source

18
Q

What is a sorter transformation(Distinct) ?

A

Sorter Transformation is used to sort the source data in either Ascending or Descending order, similar to SQL command ORDER BY Statement.

**If you want to Integration Service to select unique values from a source, use the Select Distinct option. **

**You might use this feature to extract unique customer/employees IDs from a table listing total customer/employees. **

**Using Select Distinct filters out unnecessary data earlier in the data flow, which might improve performance. **

19
Q

What is a task ? What are the Data integration tasks ?

A

A task is a process that you configure to analyze, extract, transform, and load data. You can run individual tasks manually or set tasks to run on a schedule.

  1. Mapping: Use to process data based on the data flow logic defined in a mapping or Visio template.
  2. Synchronization : Use to load data and integrate applications, databases, and files. Includes add-on functionality such as mapplets.
  3. Replication: Use to replicate data from Salesforce or database sources to database or file targets. You might replicate data to archive the data, perform offline reporting, or consolidate and manage data.
  4. Mass Ingestion : Use to transfer a large number of files of any file type between on-premises and cloud repositories and to track and monitor file transfers.
  5. Masking : Use to replace source data in sensitive columns with realistic test data for non-production environments. Masking rules define the logic to replace the sensitive data. Assign masking rules to the columns you need to mask.
  6. Powercenter: Use to import a PowerCenter workflow and run it as a Data Integration PowerCenter task.
20
Q

What is a Mapping task? What are the three templates ?

A

Mapping task is use to process data based on the data flow logic defined in a mapping or Visio template.

When you create a mapping task, you select the mapping or Visio template for the task to use.

The mapping or Visio template must already exist before you can create a mapping task for it. Alternatively, you can create a mapping task by using a template.

  1. Integration
  2. Cleansing
  3. Data Warehousing
21
Q

Can we specify ranking on more than one port?

A

No you can’t.

22
Q

Advanced properties define how to Rank transformation processes _________.

A

DATA

23
Q

What is joiner transformation ?

Whats another aspect of joiner transformation ? Also the details of how

A

The Joiner transformation can join data from two related heterogeneous sources.

	NOTE:You can use the Joiner transformation to join two or more than two tables but in this journey you will learn how to join two tables with some conditions.

A join condition matches fields between the two sources. You can create

24
Q

What are the different Join types ?

A
  1. Normal Join:- Includes rows with matching join conditions. Discards rows that do not match the join conditions.
  2. Master Outer:- Includes all rows from the detail pipeline and the matching rows from the master pipeline. It discards the unmatched rows from the master pipeline.
  3. Detail Outer:- Includes all rows from the master pipeline and the matching rows from the detail pipeline. It discards the unmatched rows from the detail pipeline.
  4. Full Outer:- Includes rows with matching join conditions and all incoming data from the master pipeline and detail pipeline.
25
Q

What is advanced joiner transformations ?

A

In Advance Joiner transformation can join data from three or more than related heterogeneous sources.

NOTE: To join more than two sources in a mapping, you can use multiple Joiner transformations.
You can join the output from the Joiner transformation with another source pipeline.
You can add Joiner transformations to the mapping until you join all source pipelines.
A join condition matches fields between the two sources. You can create multiple join conditions.

26
Q

What is aggregator transformation ?

Give me an example of a aggregator transformation ?

A
  • Aggregator transformation is used to perform aggregate calculations, such as averages and sums, on groups of data.
  • When the mapping task performs aggregate calculations, the task stores data in groups in an aggregate cache.

when you group by the ITEM field for the following expression, the task

27
Q

How does aggregate work in mapping ?

A
  • When you configure an Aggregator transformation, create an aggregate field for the output of each calculation that you want to use in the data flow. You can use aggregate functions in aggregate fields. You can also use conditional clauses and nonaggregate functions.
  • Configure aggregate fields on the Aggregate tab of the Properties panel. When you configure an aggregate field, you define the field name, data type, precision and scale. You also define the calculations that you want to perform.
28
Q

What are conditional clauses in a Aggregate transformation ?

A

Use conditional clauses in the aggregate expression to reduce the number of rows used in the aggregation.
The conditional clause can be any clause that evaluates to TRUE or FALSE.
For example,
use the following expression to calculate the total commissions of employees who exceeded their quarterly quota:

SUM( COMMISSION, COMMISSION > QUOTA )

29
Q

What is the rank transformation ?

Give me an example ?

Show me how rank works in IICS ?

A

Rank transformation selects the top or bottom range of data.
Rank transformation use to return the largest or smallest numeric values in a group.
Rank transformation is an active transformation because it can change the number of rows that pass through it.

Rank transformation use to select the top 10 customers by region.

When you define the rank properties for a Rank transformation, you select the field to
1. rank by,
2. specify the rank order
3. specify number of rows to rank by.

30
Q

What is Router transformation ?

Why dont we just use multiple filter transformations ?

A

Router transformation is an active transformation that you can use to apply a condition to incoming data.

  • It is ideal for splitting upstream pipeline to multiple downstream pipelines.
  • It uses output group conditions to route a data row.

Note: If you need to test the same input data based on multiple conditions, use a Router transformation in a mapping instead of creating multiple Filter transformations to perform the same task.

It is more effective to use router transformation once rather than to us

31
Q

How does router transformation work ? What are the type of groups ?

A

We use groups in a Router transformation to filter the incoming data.

A Router transformation has the following types of groups:
1. Input: Data Integration copies properties from the input group fields to create the fields for each output group.

  1. Output: There are two types of output groups:
    I. User-defined groups
    II. Default group
32
Q

In a sense of true or false how does router transformation work ?

A

A group filter condition returns TRUE or FALSE for each row that passes through the transformation, depending on whether a row satisfies the specified condition. Zero (0) is the equivalent of FALSE, and any non-zero value is the equivalent of TRUE.

When the task runs, Data Integration handles the data in the following ways:
Passes the rows of data that evaluate to TRUE to each transformation or target that is associated with each user-defined group.

Passes the row to the default group if all of the conditions evaluate to FALSE.

33
Q

What is a sequence generator transformation ?

A

The Sequence Generator transformation is a passive and connected transformation that generates numeric values. Sequence Generator use to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers.

34
Q

What is a source qualifier transformation ?

A

Source qualifier transformation is an active and connected transformation.

Source qualifer is used to represent the rows that the integrations service read. Whenever we add a relational source or a flat file to a mapping, a source qualifier transformation is required.

35
Q

What is Union Transformation ?

A

Union transformation is an active transformation that you use to merge data from multiple pipelines into a single pipeline.

For data integration patterns, it is common to combine two or more data sources into a single stream that includes the union of all rows. The data sources often do not have the same structure, so you cannot freely join the data streams.

36
Q

Tell me more about the input groups in Union Transformations ?

A

The input groups have the following characteristics:
1. The Union transformation initializes its output fields based on fields in the first source that you connect to an input group.

  1. Each input group can use a different field mapping mode or parameter.
  2. You can parameterize the field mappings or define the field mapping for each input group.
37
Q

What are the two types of flat files ?

A
  1. Delimited Flat files:
    Each column is separated by some specifically characters such as comma, tab, space, semicolon.
  2. Fixed width Flat files:
    A record of fixed length needs to be specified into multiple columns using break points.Imp: A delimiter flat file can also be known as variable length files.
38
Q

What is a Flat file ?

A

A flat file database is a type of database that stores data in a single table. This is unlike a relational database, which makes use of multiple tables and relations. Flat file databases are generally in plain-text form, where each line holds only one record. The fields in the record are separated using delimiters such as tabs and commas.

39
Q

What is a lookup transformation ? Give me an example ?

A

Lookup transformation in Informatica Cloud(IICS) is used to look up a database table or a flat-file based on a condition and retrieve value from the object used as a lookup. The Lookup transformation can be used as both connected and unconnected transformation.

For example, you can use a Lookup transformation to retrieve values from a database table for codes used in source data.

40
Q

What are the 5 types of lookup ?

A

There are Five types of Lookup.
1. Connected lookup: Connected transformation is connected to other transformations or directly to target table in the mapping.
2. Unconnected lookup: Unconnected transformation is not connected to other transformations in the mapping.
3. Uncached lookup: Uncached lookup is when the Integration Service does not cache the lookup source.
4. Static lookup: Static lookup cache does not change during the task.
5. Dynamic lookup: In dynamic lookup cache, task inserts or updates rows in the cache as it passes rows to the target.

41
Q

What is a unconnected lookup ? What are some common uses ?

A

An unconnected Lookup transformation exists separate from the data flow in the mapping.
You write an expression using the :LKP reference qualifier to call the lookup within another transformation.

Some common uses for unconnected lookups include:-

=> Testing the results of a lookup in an expression.
=> Filtering records based on the lookup results.
=> Marking records for update based on the result of a lookup (for example, updating slowly changing dimension tables).
=> Calling the same lookup multiple times in one mapping.

Unconnected is more of a specific value lookup from (expression) and ret