Module
Objectives
After
completing this unit , you should be able to understand about:
- –Projects
- –Jobs
- –Data Flows
- –Work flows
- –Scripts
- –Transforms
- –Data Stores
- –Importing from through Data Stores and File Format
DATA
SERVICES OBJECTS
•
Projects
•
Jobs
•
Data Flows
•
Work flows
•
Scripts
• Transforms
PROJECTS & JOBS
•
Projects
–A
projectis an object that allows you to group jobs. A project is the
highest level of organization offered by Data Integrator.
–Opening
a project makes one group of objects easily accessible in the user interface.
•
Jobs
–A
job is the only object you can execute. You can manually execute and test
jobs in development.
–In
production, you can schedule batch jobs.
–A
job is made up of steps you want executed
together.
DATA
FLOW
•Data
flows extract, transform, and load data.
•
Everything having to do with data, including reading sources, transforming
data, and loading targets, occurs inside a data flow.
• The lines connecting objects in a data
flow represent the flow of data through data transformation steps.
WORK
FLOW
•A work flow defines the
decision-making process for executing data flows.
•For
example, elements in a work flow can determine the path of execution based on a
value set by a previous job or can indicate an alternative path if something
goes wrong in the primary path.
TRANSFORMS
•
Data Integrator includes objects called transforms.
•
Transforms operate on data sets.
•
Transforms manipulate input sets and produce one or more output sets.
•
E.g.
–Query
–Case
–Date
Generator
–Etc.
SCRIPTS
•Scripts are
single-use objects used to call functions and assign values to variables in a
work flow.
•
For example, you can use the SQL function in a script to determine the most
recent update time for a table and then assign that value to a variable.
• You can then assign the variable to a
parameter that passes into a data flow and identifies the rows to extract from
a source.
OBJECT
USE
•
Single Use Objects
–Some
objects are defined only within the context of a single job or data flow, for
example scripts and specific transform definitions.
•
Re-usable Objects
–A reusable object has a single
definition; all calls to the object refer to that definition. If you change the
definition of the object in one place, you are changing the object in all other
places in which it appears.
RELATIONSHIP
BETWEEN THE OBJECTS
•
Jobs are composed of work flows and/or data flows
•
Work flow is an incorporation of several data flows
•
Data flow is the process by which source data is transformed into
target
data
•
Work flow:
• Data flow:
OBJECT
HIERARCHY
DEVELOPMENT
PROCESS
•
Design
•
Test
• Production
DESIGNER
INTERFACE
•
Key Areas of Designer windows
•
Toolbar
•
Local Object Library
•
Project Area
•
Tool palette
• Workspace
KEY AREAS OF
DESIGNER WINDOW
DEFINING
SOURCE AND TARGET METADATA
•Using
Data store
•
Importing Metadata
• Defining a File Format
DATASTORES
•
Connection or multiple connections to data sources
•
Import metadata from the data source
•
Metadata consists of:
–Table
name
–Column
names
–Column
data types
–Primary
key columns
–Table
attributes
–RDBMS
functions
–Application
specific data structure
•
Data Integrator uses datastores to read data from source table or load data to target table
TYPES
OF DATASTORES
•
Database Datastores:
–Provides
a simple way to import way to import metadata
•
Application Datastores:
–Easily
import metadata from most ERP systems
•
Adapter Datastores:
–Provides access to an application’s data
and metadata or just metadata
v Datastores
represent connection configurations between Data Services and data bases or
applications.
v These
configurations can be direct or through adapters.
v Datastore configurations allow Data Services to access metadata from a database or application and read from or write to that database or application while Data Services executes a job.
v Datastore configurations allow Data Services to access metadata from a database or application and read from or write to that database or application while Data Services executes a job.
Datastores:
vProvide a
logical channel (connection) to a database
vMust be
specified for each source and target database
v Are used to import
metadata for source and target databases into the repository.
vAre used by Data Integrator to read data
from source tables and load data to target tables
Different
Database Connection through DS
v The databases to which Data Integrator
datastores can connect:
•
OracleàMicrosoft
SQL Server
•
IBM DB2àSybase
ASE
•
InformixàODBC
•
Sybase IQ
v Metadata consists of:
•
Database tablesàPrimary
key columns
•
Table nameàTable
attributes
•
Column namesàColumn
data types
v RDBMS functions
Connection metadata is defined in the
object library as datastores (for tables) and file formats (for flat files).
While we are designating the datastores as sources or targets datastores only
function as connections. The actual source and target objects are defined when
we define data flows
To define a Datastore
- Explain Datastore
- Create a database datastore
- Change a datastore definition
Advanced
Importing
Properties
Import Metadata Through Datastage
With Data Services, we can import meta data for individual tables using a data store. Data Services stores meta data information for all imported objects in a data store. Importing meta data through a data base data store
To view imported objects in the object library
To view imported objects in the object library
If we have already imported an object such as a datastore, function, or table, we
can reimport it, which updates the object’s metadata from our database
(reimporting overwrites any changes we might have made to the object in Data
Services).
File Format Properties
- A file format is a set of properties describing the structure of a flat file. File formats describe the metadata structure.
- A file format defines a connection to a file. Therefore, we use a file format to connect Data Integrator to source or target data when the data is stored in a file rather than a database table.File format objects can describe files in:
- Delimited format —Characters such as commas or tabs separate each Field
- Fixed width format —The column width is specified by the user
- SAP ERP and R/3 format
To create a file format from an existing file format
We must enter a new name for the replicated file. Data Services does not allow to save the replicated file with the same name as the original (or any other existing File Format object).
Editing file formats
- We can modify existing file format templates to match changes in the format or structure of a file.
- We cannot change the name of a file format template. Any changes effect every source or target file that is based on this file format template.
File format features
HANDLING ERRORS IN FILE FORMAT
• When you enable error handling in File Format
Editor, Data Integrator:
–Checks for two types of Flat-file source error:
• Data-type conversion error
• Row-format error
–Stop processing the source file after reaching a
specified number of invalid rows
–Logs data-type conversion or row-format warning to
Print this post
No comments:
Post a Comment