Wednesday, January 5, 2011

FAQs

If a DataStage job has aborted after, say, 1000 records, how can the load be continued from the 1000th record after fixing the error?
Assume the following table and job setup:
Table: EMP_TAB
Job 1: Hash file extraction from EMP_TAB (keyed appropriately, e.g. on month date if it is a monthly load)
Job 2: Cleanse/transform (apply all business logic)
Job 3: Load job into EMP_TAB, with a lookup against the hash file from Job 1
Description:
The extract job should be executed first in the sequence. Check the option "Do not checkpoint run" on the first hash-creation job only, so that it always reruns, and leave checkpointing enabled for the other jobs. Now suppose your source had 1000 rows and the job failed on the 200th row. On simply restarting the sequence, the rebuilt hash file contains all the rows inserted by the earlier run, and checkpointed jobs such as Job 2 do not run unnecessarily. The load job's lookup ensures that where a match is found, no action is performed, so only the not-yet-loaded rows are inserted (see the sketch below).
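As a sketch of how the load job skips already-inserted rows (the link name lkpEmpHash below is invented for illustration), the constraint on the insert link in Job 3's Transformer can test the reference link's NOTFOUND variable:

    * Insert-link constraint in Job 3's Transformer: pass a row only when
    * the hash-file lookup found no match, i.e. the row was not already
    * loaded by the run that aborted.
    lkpEmpHash.NOTFOUND

With this constraint, the rows already inserted by the aborted run match the refreshed hash file and are dropped, and only the remaining rows reach the target table.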
***************************************************************
Alternatively, it is possible to restart the job after the aborted rows:
select the "checkpoint" property in the sequence
and supply the final record number at which the run aborted.
This applies only to DataStage PX (parallel) jobs.
****************************************************************
What is the use of parameters in a DataStage job?
The main purpose of job parameters in DataStage is to supply values at job run time, e.g. file path, file name, database server name, user ID, and password.

**************************************************
2nd Answer:

There are two types of parameters you can create:
1. Job-level parameters - usable only in that particular job; you create them in the Designer.
2. Project-level parameters - usable across multiple jobs in that project; you create them through the Administrator.
Use - dynamic values can be passed at run time (while the job is running), such as a file name or path. At run time you can pass whatever value you want, so the job stays generic, picks up the parameters, and performs its work, as in the command-line example below.
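For example (the project, job, and parameter names here are made up for illustration), values can be supplied at run time from the command line with the dsjob client:

    dsjob -run \
          -param SrcFilePath=/data/incoming/emp.csv \
          -param DBServer=proddb01 \
          -param UserID=etl_user \
          -param Password=secret \
          MyProject LoadEmpJob

The same job design can then be pointed at a different file or database on the next run without any change in the Designer.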
***************************************************************************************
Explain error handling in DataStage and best practices.
Answer: For row-level errors you can use the reject link option available in the Transformer stage.
At the sequence level, DataStage provides an "Exception Handler" activity. When you call your jobs from a DataStage sequence, do the following:
Step 1: Go to the properties of the master sequence and check the checkbox "Automatically handle activities that fail". Also check "Restart job from the failure point".
Step 2: In your sequence, use an Exception Handler activity. After the Exception Handler activity you may include an email Notification activity. If a job fails, control passes to the Exception Handler activity and an email is sent notifying the user that the sequence has failed.
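As a complementary practice (the project and sequence names here are illustrative), a support script can pull a summary of the failed run's log entries with the dsjob client:

    # Summarise the most recent warning entries for the master sequence
    dsjob -logsum -type WARNING -max 20 MyProject MasterSeq

This is a convenient way to capture the failure reason alongside the email notification.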
*********************************************************
Problems with DataStage - date format & NULL handling of a date field in a schema file
I have a .csv file as my source and read the data using a Sequential File stage.
I then validate the format of the data against a schema file.
The date format of the incoming file is '%dd/%mm/%yyyy', so the date
column has a fixed length of 10.
According to the incoming data set, the date column should be nullable.

Read the date field from the sequential file as a varchar data type with nullability set to yes. Use a Modify stage or Transformer to convert the data type, and also check the incoming value for NULL; if NULL is found, assign a default sentinel value such as '12-Dec-9999' and continue processing.
At load time, convert the sentinel date '12-Dec-9999' back to NULL. A sketch of both steps follows.
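As a sketch of the first step (the field names and properties below are illustrative, not from the original post), the schema file can declare the date column as a nullable fixed-width string so that an empty field becomes NULL instead of rejecting the row:

    // Schema file fragment: read the 10-character date as a nullable
    // string; an empty field is treated as NULL rather than a bad date.
    record
    {final_delim=end, delim=',', quote=double}
    (
      EMP_ID:    int32;
      HIRE_DATE: nullable string[10] {null_field=''};
    )

A Transformer derivation along these lines (again with illustrative names) then substitutes the sentinel and converts the string to a date:

    If IsNull(lnk_in.HIRE_DATE)
    Then StringToDate('12/12/9999', '%dd/%mm/%yyyy')
    Else StringToDate(lnk_in.HIRE_DATE, '%dd/%mm/%yyyy')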

**********************************************************************
