Module Objectives
After
completing this unit , you should be able to understand about:
–Query
transformer
–Various
options of Query transformer
–Joining and
aggregation through query transformer
–Built-in
functions used in Query transformer
–Case
Transformer
–Merge
Transformer
QUERY TRANSFORM
The Query
transform is by far the most commonly used transform. It retrieves a data set
that satisfies conditions that you specify
The Query
transform can perform the following operations:
To choose
(filter) the data to extract from sources
Join data from
multiple sources
Map columns from
input to output schemas
Perform
transformations and functions on the data
Perform data
nesting and un-nesting
Add new columns,
nested schemas, and function results to the output schema
Assign primary keys to output columns
• Query
transform can also be used for following.
–Performing
calculations
–Joining
sources
–Aggregation
Built-in Functions used in Query Transformer
FUNCTIONS
•Define built
in functions.
•Differentiate
between functions and transforms.
•List types of
operations available for functions
•Using
functions in expressions.
•Use date and
time functions and date_generation
transform to
build a dimension table.
•Use lookup
functions to look up status in a table.
•Understand
other functions.
BUILT IN FUNCTIONS
•Functions take
input values and produce a return value.
Functions also
operate on individual values passed to them
•Input values
can be parameters passed into data low, values
from column of
data or variables defined inside a script.
You can use
functions in expressions that include script and
Conditional
statements.
DIFFERENTIATING BETWEEN FUNCTIONS AND TRANSFORMS
•Functions
operate on single values, such as values in
specific
columns in data set.
•Transforms
operate on data set , creating, updating &
Deleting rows
of data.
TYPES OF OPERATIONS FOR FUNCTIONS
Operative Type Description
Aggregate Generates à a single value from a set of values. Aggregate functions
, such as max & min & count.
Iterative
Maintains state information from one invocation to another. The life of
iterative
function’s state information is the execution life of the query in which they are included. E.g. Lookup.
Stateless State
information is not maintained from one invocation to the next.
Stateless
functions such as to_char or month can be used anywhere expressions
are allowed.
OTHER TYPES OF FUNCTIONS
•Database &
application functions –
Functions
specific to you DBMS.
•Custom
functions-Functions you define yourself.
USING FUNCTIONS IN EXPRESSIONS
•Functions are
typically used to add:
–Columns based
on some other value (lookup function)
–Generated key
fields
•You can use
functions in:
–Transforms :
for e.g Query , Case and SQL transforms.
–Scripts: these
are single-use objects used to call functions and
assign values
to variables to work flow.
–Conditionals:
these are single-use objects used to implement
if/then/else
logic in work flow. Conditionals and their
components-if
expressions , then and else diagrams-are
included in the
scope of parent control flow’s variables and
parameters.
–Other custom
functions
CUSTOM FUNCTIONS
–Written by the
user in Data Integrator scripting language
–Reusable
objects
–Managed
through the function wizard
Consider these
guidelines when you create your own functions:
• Functions can
call other functions.
• Functions
cannot call themselves.
• Functions
cannot participate in a cycle of recursive calls. For example, function A
cannot call function B, which calls function A.
• Functions
return a value.
• Functions can
have parameters for input, output, or both. However, data
flows cannot
pass parameters of type output or input/output.
Before creating
a custom function, you must know the input, output, and
return values
and their data types. The return value is predefined to be return.
USING BUILT IN FUNCTIONS
• The built in
functions for date and time and built-in
date_generation
transform are useful when building a
time dimension
table.
• to_char
–To convert
date to string.
• to_date
–To convert a
string to a date.
• Month
–To determine
the month in which the given date fails.
• Quarter
–To determine
the quarter in which given date fails.
USE LOOKUP FUNCTIONS TO LOOK UP STATUS IN ATABLE
•A specialized
type of join, similar to an SQL outer join
–A SQL outer
join may return multiple matches for a single record in the
outer table.
–Lookup
functions always return exactly the same no. of records that are
in the source
(outer) table.
–Sophisticated
caching options
–A default
value when no match is found.
LOOKUP_EXT()
• While all
lookup functions return one row for each row in the source they
differ by how
they choose which of several matching rows to return:
–Lookup_ext()
•Allows
specification of an Order by column and Return policy(Min,Max) to
return the
record with highest/lowest value in a given field. For e.g. a
surrogate key.
•This function
also extends functionality by allowing you to :
–Return
multiple columns from a single lookup.
–Choose from
more operators to specify a lookup condition.
–Specify a
return policy for lookup.
–Call
lookup_ext, using Function Wizard, in the query output.
USE DATABASE TYPE FUNCTIONS TORETURN INFORMATION ON
DATA SOURCES
• db_type
–function
returns the database type of the data store configuration in
use at runtime.
This function is useful if your datastore has multiple
configurations.
Syntax: db_type(ds_name)
• db_version
–Function
returns the database version of datastore configuration in use at
runtime.
Syntax:
db_version(ds_name)
•
db_database_name
–Function
returns the database name of the datastore configuration in use
at runtime.
Syntax:db_database_name(ds_name)
DB FUNCTIONS
• db_owner :
–Function
returns the real owner name for the datastore configuration that is
in use at
runtime.This function is useful if your datastore has multiple
configurations
because with multiple configurations, you can use alias owner
instead of
database owner names.
Syntax :
db_owner(ds_name,alias_name)
• decode :
–Function to
return an expression based on the first condition in the specified
list of
conditions and expressions that evaluates to TRUE. It provides an
alternate way
to write nested if then else functions.
Syntax :
decode(condition_and expression_list,default expression)
Print this post
No comments:
Post a Comment