Thursday, October 7, 2010

INFORMATICA QUESTION AND ANSWERS PART4


118)What are two modes of data movement in Informatica Server?
The data movement mode depends on whether Informatica Server should process single byte or multi-byte character data. This mode selection can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.
a) Unicode - the Informatica Server (IS) allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).
b) ASCII - the IS holds all data in a single byte.
The IS data movement mode can be changed in the Informatica Server configuration parameters. The change takes effect once you restart the Informatica Server.
119)Identifying bottlenecks in various components of Informatica and resolving them.
The best way to find bottlenecks is to write the data to a flat file and see where the bottleneck is.
120)What are the basic needs to join two sources in a source qualifier?
Both tables should have a common field with the same datatype.
It is not necessary that they follow a primary key and foreign key relationship. If such a relationship
exists, it helps from a performance point of view.
122)What is the aggregate cache in the aggregator transformation?
The aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an aggregator transformation, the Informatica Server creates index and data caches in memory to process the transformation. If the Informatica Server requires more space, it stores overflow values in cache files.
123)Can you explain how to handle SCDs and their types? Where do we mostly use them?
The "Slowly Changing Dimension" problem is a common one particular to data warehousing. In a nutshell, this applies to cases where the attribute for a record varies over time. We give an example below:
Christina is a customer with ABC Inc. She first lived in Chicago, Illinois. So, the original entry in the customer lookup table has the following record:
Customer Key    Name         State
1001            Christina    Illinois
At a later date, in January 2003, she moved to Los Angeles, California. How should ABC Inc. now modify its customer table to reflect this change? This is the "Slowly Changing Dimension" problem. There are in general three ways to solve this type of problem, and they are categorized as follows:
Type 1:
In Type 1 Slowly Changing Dimension, the new information simply overwrites the original information. In other words, no history is kept. After Christina moved from Illinois to California, the new information replaces the original record, and we have the following table:
Customer Key    Name         State
1001            Christina    California
Advantages: - This is the easiest way to handle the Slowly Changing Dimension problem, since there is no need to keep track of the old information.
Disadvantages: - All history is lost. By applying this methodology, it is not possible to trace back in history. For example, in this case, the company would not be able to know that Christina lived in Illinois before.
Usage: About 50% of the time.
When to use Type 1: Type 1 slowly changing dimension should be used when it is not necessary for the data warehouse to keep track of historical changes.
Type 2:
In Type 2 Slowly Changing Dimension, a new record is added to the table to represent the new information. Therefore, both the original and the new record will be present. The new record gets its own primary key. After Christina moved from Illinois to California, we add the new information as a new row into the table:
Customer Key    Name         State
1001            Christina    Illinois
1005            Christina    California
Advantages: - This allows us to accurately keep all historical information.
Disadvantages: - This will cause the size of the table to grow fast. In cases where the number of rows for the table is very high to start with, storage and performance can become a concern. - This necessarily complicates the ETL process.
Usage: About 50% of the time.
When to use Type 2: Type 2 slowly changing dimension should be used when it is necessary for the data warehouse to track historical changes.
Type 3:
In Type 3 Slowly Changing Dimension, there will be two columns to indicate the particular attribute of interest, one indicating the original value, and one indicating the current value. There will also be a column that indicates when the current value becomes active. To accommodate Type 3 Slowly Changing Dimension, we will now have the following columns: Customer Key, Name, Original State, Current State, Effective Date. After Christina moved from Illinois to California, the original information gets updated, and we have the following table (assuming the effective date of change is January 15, 2003):
Customer Key    Name         Original State    Current State    Effective Date
1001            Christina    Illinois          California       15-JAN-2003
Advantages: - This does not increase the size of the table, since new information is updated. - This allows us to keep some part of history.
Disadvantages: - Type 3 will not be able to keep all history where an attribute is changed more than once. For example, if Christina later moves to Texas on December 15, 2003, the California information will be lost.
Usage: Type 3 is rarely used in actual practice.
When to use Type 3: Type 3 slowly changing dimension should only be used when it is necessary for the data warehouse to track historical changes, and when such changes will only occur a finite number of times.
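The three SCD types can be illustrated with a small Python sketch. This is only an illustration of the idea, not Informatica code: the function names and the dictionary layout are assumptions made for the example, and the values come from the Christina example above.

```python
from datetime import date

def scd_type1(rows, key, new_state):
    """Type 1: overwrite the attribute in place; no history is kept."""
    return [dict(r, state=new_state) if r["customer_key"] == key else dict(r)
            for r in rows]

def scd_type2(rows, name, new_state, new_key):
    """Type 2: keep the old row and append a new row with its own key."""
    return [dict(r) for r in rows] + [
        {"customer_key": new_key, "name": name, "state": new_state}
    ]

def scd_type3(row, new_state, effective):
    """Type 3: one row holding the original and the current value."""
    return {
        "customer_key": row["customer_key"],
        "name": row["name"],
        # Keep the first known value; later changes only replace current_state.
        "original_state": row.get("original_state", row.get("state")),
        "current_state": new_state,
        "effective_date": effective,
    }

customers = [{"customer_key": 1001, "name": "Christina", "state": "Illinois"}]
print(scd_type1(customers, 1001, "California"))
print(scd_type2(customers, "Christina", "California", 1005))
print(scd_type3(customers[0], "California", date(2003, 1, 15)))
```

Calling scd_type3 a second time with "Texas" keeps Illinois as the original state and drops California, which is the Type 3 limitation described above.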
124)What are Target Types on the Server?
Target Types are File, Relational and ERP.
125)What are Target Options on the Servers?
Target Options for the File target type are FTP File, Loader, and MQ. There are no target options for the ERP target type. Target Options for the Relational target type are Insert, Update (as Update), Update (as Insert), Update (else Insert), Delete, and Truncate Table.
126)What is the difference between summary filter and detail filter?
A summary filter can be applied to a group of rows that contain a common value, whereas
a detail filter can be applied to each and every record in the database.
127)Difference between summary filter and detail filter?
Summary Filter --- applied to groups of records that contain common values.
Detail Filter --- applied to each and every record in a database.
128)What is the difference between STOP and ABORT at the Informatica session level?
Stop: We can restart the session.
Abort: We cannot restart the session. We should truncate all the pipeline, and after that start the session.
129)What is the difference between stop and abort?
Stop: If the session you want to stop is part of a batch, you must stop the batch.
If the batch is part of a nested batch, stop the outermost batch.
Abort: You can issue the abort command. It is similar to the stop command except that it has a 60-second
timeout. If the server cannot finish processing and committing data within 60 seconds, it kills the DTM process and terminates the session.
130)What is the status code?
The status code provides error handling for the Informatica Server during the session. The stored procedure issues a status code that indicates whether or not the stored procedure completed successfully. This value cannot be seen by the user. It is only used by the Informatica Server to determine whether to continue running the session or stop.
131)Difference between static cache and dynamic cache
Static cache: You cannot insert or update the cache. The Informatica Server returns a value from the lookup table or cache when the condition is true. When the condition is not true, the Informatica Server returns the default value for connected transformations and null for unconnected transformations.
Dynamic cache: You can insert rows into the cache as you pass them to the target. The Informatica Server inserts rows into the cache when the condition is false. This indicates that the row is not in the cache or target table. You can pass these rows to the target table.
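The difference between the two cache types can be sketched in a few lines of Python. This is a simplified model, not the server's implementation: the cache is a plain dictionary, and the function names are made up for the illustration.

```python
def static_lookup(cache, key, default=None):
    # Static cache: read-only. A miss returns the default value
    # (None stands in for null) and the cache is left unchanged.
    return cache.get(key, default)

def dynamic_lookup(cache, key, row):
    # Dynamic cache: a miss inserts the row into the cache and flags it
    # as new, so the caller can pass it on to the target.
    if key in cache:
        return cache[key], False
    cache[key] = row
    return row, True

cache = {1001: "Christina"}
print(static_lookup(cache, 1002))            # miss, cache unchanged
print(dynamic_lookup(cache, 1002, "David"))  # miss, row inserted into cache
print(dynamic_lookup(cache, 1002, "Dave"))   # hit, cached row returned
```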
132)What is power center repository?
Standalone repository. A repository that functions individually, unrelated and unconnected to other repositories.
Global repository. (PowerCenter only.) The centralized repository in a domain, a group of connected repositories. Each domain can contain one global repository. The global repository can contain common objects to be shared throughout the domain through global shortcuts.
Local repository. (PowerCenter only.) A repository within a domain that is not the global repository. Each local repository in the domain can connect to the global repository and use objects in its shared folders.
133)What are the joiner caches?
Specifies the directory used to cache master records and the index to these records. By default, the cached files are created in a directory specified by the server variable $PMCacheDir. If you override the directory, make sure the directory exists and contains enough disk space for the cache files. The directory can be a mapped or mounted drive.
134)If the source has duplicate records and we have 2 targets, T1 for
unique values and T2 only for duplicate values, how do we pass the unique values
to T1 and the duplicate values to T2 in a
single mapping?
source--->sq--->exp--->sorter(with the Select Distinct check box enabled)--->t1
                   --->aggregator(with group by enabled and a count function)--->t2
If you want only duplicates in t2, you can follow this sequence:
--->agg(with group by enabled and the expression decode(count(col),1,1,0))--->filter(condition: the result is 0)--->t2.
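The same routing logic can be sketched in Python to make the two branches concrete. This is only an analogy for the mapping above: Counter plays the role of the aggregator with group by and count, and the function name is an assumption for the example.

```python
from collections import Counter

def split_unique_and_duplicates(values):
    counts = Counter(values)   # aggregator: group by value, count rows
    t1 = sorted(counts)        # sorter with distinct: one row per value
    # filter: keep only the values whose count is greater than 1
    t2 = sorted(v for v, n in counts.items() if n > 1)
    return t1, t2

t1, t2 = split_unique_and_duplicates([3, 1, 3, 2, 1])
print(t1)  # [1, 2, 3]
print(t2)  # [1, 3]
```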
135)What is the difference between the joiner transformation and the source qualifier
transformation?
Source qualifier – Homogeneous source
Joiner – Heterogeneous source
136)While importing the relational source definition from a database, what
metadata of the source do you import?
Source name
Database location
Column names
Datatypes
Key constraints.
137)What are the unsupported repository objects for a mapplet?
The exact list depends on the product version, but a mapplet typically cannot include the following objects:
Normalizer transformations.
COBOL sources.
XML sources and XML Source Qualifier transformations.
Target definitions.
Other mapplets.
Pre- and post-session stored procedures.
Non-reusable Sequence Generator transformations.
138)What are the types of metadata stored in the repository?
Source definitions. Definitions of database objects (tables, views, synonyms) or files that
provide source data.
Target definitions. Definitions of database objects or files that contain the target data.
Multi-dimensional metadata. Target definitions that are configured as cubes and
dimensions.
Mappings. A set of source and target definitions along with transformations containing
business logic that you build into the transformation. These are the instructions that the
Informatica Server uses to transform and move data.
Reusable transformations. Transformations that you can use in multiple mappings.
Mapplets. A set of transformations that you can use in multiple mappings.
Sessions and workflows. Sessions and workflows store information about how and when
the Informatica Server moves data. A workflow is a set of instructions that describes how
and when to run tasks related to extracting, transforming, and loading data. A session is a
type of task that you can put in a workflow. Each session corresponds to a single mapping.
139)Suppose a session is configured with a commit interval of 10,000 rows and the source has
50,000 rows. Explain the commit points for source-based commit and target-based
commit. Assume appropriate values wherever required.
Source-based commit commits data to the target based on the number of source rows, so it commits after every 10,000 source rows. Target-based commit also uses the commit interval, but the commit is tied to the writer buffer: the server keeps filling the buffer after the interval is reached and commits when the buffer block is full. Let us assume that the
buffer holds 6,000 rows. The buffer fills at 6,000 rows (below the interval, so no commit) and again at 12,000 rows, where the first commit is issued. Commit points therefore usually fall somewhat after the configured interval.
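The commit points can be worked out with a short Python sketch. It is a simplified model under stated assumptions, not the server's actual algorithm: the writer buffer is assumed to hold exactly 6,000 rows, a target-based commit is assumed to fire at the first buffer fill on or after each multiple of the interval, and both modes are assumed to commit once more at end of data. The function names are made up for the illustration.

```python
def source_based_commits(total_rows, interval):
    # Commit after every `interval` source rows, plus a final commit at the end.
    points = list(range(interval, total_rows + 1, interval))
    if not points or points[-1] != total_rows:
        points.append(total_rows)
    return points

def target_based_commits(total_rows, interval, buffer_rows):
    # Commit at the first buffer fill on or after each interval multiple
    # (assumed model), plus a final commit at the end of the data.
    points = []
    next_threshold = interval
    for filled in range(buffer_rows, total_rows + 1, buffer_rows):
        if filled >= next_threshold:
            points.append(filled)
            while next_threshold <= filled:
                next_threshold += interval
    if not points or points[-1] != total_rows:
        points.append(total_rows)
    return points

print(source_based_commits(50_000, 10_000))
# [10000, 20000, 30000, 40000, 50000]
print(target_based_commits(50_000, 10_000, 6_000))
# [12000, 24000, 30000, 42000, 50000]
```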
140)What are reusable transformations?
Reusable transformations can be used in multiple mappings. When you need to incorporate such a transformation into a mapping, you add an instance of it to the mapping. Since the instance of a reusable transformation is a pointer to that transformation, you can change the transformation in the Transformation Developer and all of its instances automatically inherit the changes. This feature can save you a great deal of work.
141)What are the types of mapping in the Getting Started Wizard?
Simple Pass Through mapping:
Loads a static fact or dimension table by inserting all rows. Use this mapping when you want to drop all existing data from your table before loading new data.
Slowly Growing Target:
Loads a slowly growing fact or dimension table by inserting new rows. Use this mapping
to load new data when existing data does not require updates.
142)What are the types of mapping wizards provided in Informatica?
Simple Pass Through
Slowly Growing Target
Slowly Changing Dimensions
  Type 1: most recent values
  Type 2: full history, tracked by version, flag, or date
  Type 3: current and one previous value
143)What are Dimensions and various types of Dimensions?
A dimension is a set of level properties that describe a specific aspect of a business, used for analyzing the factual measures of one or more cubes that use that dimension. Examples: geography, time, customer, and product.
144)What are the session parameters?
Session parameters, like mapping parameters, represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters:
Database connections
Source file name: Use this parameter when you want to change the name or location of the session source file between session runs.
Target file name: Use this parameter when you want to change the name or location of the session target file between session runs.
Reject file name: Use this parameter when you want to change the name or location of the session reject files between session runs.
145)What are Sessions and Batches?
Session - A session is a set of instructions that tells the Informatica Server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.
Batches - A batch provides a way to group sessions for either serial or parallel execution by the Informatica Server. There are two types of batches:
Sequential - Runs sessions one after the other.
Concurrent - Runs sessions at the same time.
146)If a session fails after loading 10,000 records into the target, how can you load the
records from the 10,001st record when you run the session next time in Informatica 6.1?
Running the session in recovery mode will work, but the target load type should be normal. If it is bulk, recovery won't work as expected.
147)What is the difference between the Informatica PowerCenter Server, the Repository Server, and the
repository?
The repository is a database in which all Informatica components are stored in the form of tables. The Repository Server controls the repository and maintains data integrity and consistency across the repository when multiple users use Informatica. The PowerCenter Server (Informatica Server) is responsible for executing the components (sessions) stored in the repository.
148)How can you access the remote source into your session?
Relational source: To access a relational source located at a remote site, you need
to configure a database connection to the data source.
File source: To access a remote source file, you must configure the FTP connection to
the host machine before you create the session.
Heterogeneous: When your mapping contains more than one source type, the Server Manager
creates a heterogeneous session that displays source options for all types.
149)Difference between Rank and Dense Rank?
Rank:
1
2<--2nd position
2<--3rd position
4
5
The same rank is assigned to the same totals/numbers. The next rank follows the position, so a rank is skipped after a tie. Golf
usually ranks this way.
Dense Rank:
1
2<--2nd position
2<--3rd position
3
4
The same rank is assigned to the same totals/numbers/names. The next rank follows the serial
number, so no rank is skipped after a tie.
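The two ranking schemes can be reproduced with a short Python sketch. The function names and sample scores are assumptions chosen to match the lists above.

```python
def rank(scores):
    # Rank follows the position: ties share a rank, the next rank is skipped.
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in ordered]

def dense_rank(scores):
    # Dense rank follows the serial number: ties share a rank, none skipped.
    ordered = sorted(scores, reverse=True)
    distinct = sorted(set(scores), reverse=True)
    return [distinct.index(s) + 1 for s in ordered]

scores = [100, 90, 90, 80, 70]
print(rank(scores))        # [1, 2, 2, 4, 5]
print(dense_rank(scores))  # [1, 2, 2, 3, 4]
```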
150)What is the rank transformation? Where can we use this transformation?
The rank transformation is used to find the top or bottom ranked rows. For example, if we have a sales table in which
many employees sell the same product and we need to find the first 5 or
10 employees who sell the most products, we can use the rank transformation.
151)In an update strategy, which gives better performance, a target table or a flat file? Why?
Pros of a flat file: Loading, sorting, and merging operations will be faster, as there is no index concept and
the data will be in ASCII mode.
Cons: There is no concept of updating existing records in a flat file.
As there are no indexes, lookups will be slower.
152)What command is used to run a batch?
pmcmd is used to start a batch.
153)What are mapping parameters and mapping variables?
Please refer to the documentation for a fuller understanding.
Mapping variables have two identities:
Start value and Current value
Start value = Current value (when the session starts the execution of the underlying mapping)
Start value <> Current value (while the session is in progress and the variable value changes on one or more occasions)
The current value at the end of the session becomes the start value for the subsequent run of the same session.
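The start value and current value behaviour can be modelled with a small Python class. This is a toy model only: the class name, the saved attribute (standing in for the value kept between runs), and the run_session method are assumptions for the illustration.

```python
class MappingVariable:
    def __init__(self, initial):
        # Value kept between session runs (stand-in for the saved value).
        self.saved = initial

    def run_session(self, updates):
        # At session start, the start value equals the current value.
        start = current = self.saved
        for value in updates:
            current = value      # the current value changes during the run
        self.saved = current     # becomes the start value of the next run
        return start, current

var = MappingVariable(0)
print(var.run_session([5, 7]))  # (0, 7)
print(var.run_session([]))      # (7, 7)
```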
154)How do we estimate the depth of the session scheduling queue? Where do we set the maximum number of concurrent sessions that Informatica can run at a given time?
You set the maximum number of concurrent sessions in the Informatica Server configuration. By default it is 10. You can set it to any number.
155)Where should you place the flat file to import the flat file definition into the Designer?
Place it in local folder.
156)Why do we partition the session in Informatica?
Performance can be improved by processing data in parallel in a single session by creating multiple partitions of the pipeline. The Informatica Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel.
157)Why do we partition the session in Informatica?
Partitioning improves session performance by reducing the time needed to read the source and load the data into the target.
158)What is the difference between partitioning of relational targets and partitioning of file
targets?
Partitioning can be done on both relational and flat file targets.
Informatica supports the following partition types:
1.Database partitioning
2.Round-robin
3.Pass-through
4.Hash-key partitioning
5.Key range partitioning
All of these are applicable to relational targets. For flat files, only database partitioning is not
applicable.
Informatica supports N-way partitioning. You can just specify the name of the target file and
create the partitions; the rest is taken care of by the Informatica session.
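To show how three of these partition types distribute rows, here is a rough Python sketch. It only models the row-distribution idea, not how the Informatica session implements it. The function names, the use of zlib.crc32 as the hash, and the bounds parameter are all assumptions for the illustration.

```python
import zlib

def round_robin(rows, n):
    # Distribute rows evenly across n partitions in turn.
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_key(rows, n, key):
    # Rows with the same key value always land in the same partition.
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[zlib.crc32(str(row[key]).encode()) % n].append(row)
    return parts

def key_range(rows, bounds, key):
    # bounds are the exclusive upper limits of every partition but the last.
    parts = [[] for _ in range(len(bounds) + 1)]
    for row in rows:
        parts[sum(row[key] >= b for b in bounds)].append(row)
    return parts

rows = [{"id": i} for i in range(6)]
print(round_robin(rows, 3))
print(key_range(rows, [2, 4], "id"))
```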
159)What are partition points?
Partition points mark the thread boundaries in a source pipeline and divide the pipeline into stages.
160)What is a parameter file?
A parameter file defines the values for parameters and variables used in a session. A parameter
file is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file:
Mapping parameters
Mapping variables
Session parameters
161)Difference between the Normalizer transformation and normalization.
Normalizer: It is a transformation mainly used for COBOL sources. It changes columns into rows (repeating columns in one record become multiple rows).
Normalization: Removes redundancy and inconsistency.
162)Which transformation should we use to normalize COBOL and relational
sources?
The Normalizer transformation.
When you drag a COBOL source into the Mapping Designer workspace, the Normalizer
transformation automatically appears, creating input and output ports for every column in the source.
163)Which transformation do you need while using COBOL sources as source
definitions?
The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.
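What "normalizing" a denormalized record means can be shown with a minimal Python sketch. It mimics the effect of turning repeating columns into rows and is not the transformation's actual implementation. The function name, field names, and sample record are assumptions for the illustration.

```python
def normalize(record, key_field, repeating_fields):
    # One input record with repeating columns becomes one output row
    # per occurrence, each carrying the key and an occurrence index.
    return [
        {key_field: record[key_field], "occurrence": i, "value": record[field]}
        for i, field in enumerate(repeating_fields, start=1)
    ]

record = {"store": "S1", "sales_q1": 10, "sales_q2": 20, "sales_q3": 30}
for row in normalize(record, "store", ["sales_q1", "sales_q2", "sales_q3"]):
    print(row)
```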
164)What is the difference between Normal load and Bulk load?
Normal load: Normal load writes information to the database log file, which is helpful if any recovery is needed. When the source file is a text file and the data is loaded into a table, we should use normal load only, otherwise the session will fail.
Bulk load: Bulk load does not write information to the database log file, so nothing can be done if recovery is needed. Comparatively, bulk load is much faster than normal load.
165)What are the join types in joiner transformation?
Normal (Default)
Master outer
Detail outer
Full outer.
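The four join types can be illustrated with a small Python sketch. It assumes the usual Joiner semantics: a master outer join keeps all detail rows, and a detail outer join keeps all master rows. The function name and sample rows are made up for the example.

```python
def join(master, detail, key, join_type="normal"):
    m_keys = {m[key] for m in master}
    d_keys = {d[key] for d in detail}
    # Normal join: only rows whose key exists in both sources.
    out = [{**m, **d} for d in detail for m in master if m[key] == d[key]]
    if join_type in ("master outer", "full outer"):
        # Keep detail rows that have no matching master row.
        out += [dict(d) for d in detail if d[key] not in m_keys]
    if join_type in ("detail outer", "full outer"):
        # Keep master rows that have no matching detail row.
        out += [dict(m) for m in master if m[key] not in d_keys]
    return out

master = [{"id": 1, "m": "a"}, {"id": 2, "m": "b"}]
detail = [{"id": 2, "d": "x"}, {"id": 3, "d": "y"}]
for jt in ("normal", "master outer", "detail outer", "full outer"):
    print(jt, join(master, detail, "id", jt))
```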
166)What logic will you implement to load the data into one fact table from 'n'
number of dimension tables?
Normally everyone uses:
1)slowly changing dimensions
2)slowly growing dimensions
167)After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single source qualifier, can you map these three ports directly to the target?
No. Unless you join those three ports in the source qualifier, you cannot map them directly.
168)Can you use the mapping parameters or variables created in one mapping in another
mapping?
No. You might want to use a workflow parameter/variable if you want it to be visible to other mappings/sessions.

