Tuesday, September 20, 2011

IBM - OPTIM - II

  • InfoSphere Data Architect: Simplifies data modeling and integration design, enabling architects to discover, model, relate, and standardize diverse and distributed data assets.
  • Optim Development Studio: Provides a complete development and testing environment for building database objects, queries, database logic, and Java™ pureQuery applications.
  • Optim pureQuery Runtime: Delivers a high-performance data access platform that provides the capability to visualize, optimize, and lock down SQL without modifying the application. Complements Optim Development Studio by enabling pureQuery support in production systems, including support for static execution of .NET and Java applications.
  • Optim Database Administrator: Manages complex structural changes while preserving data, privileges, dependent objects, and application bindings.
  • Optim High Performance Unload: Formerly known as DB2 High Performance Unload, provides a high-speed unload utility as an alternative to the export feature. Whether used alone or with Optim Database Administrator, this product significantly reduces the time required to migrate databases.
  • DB2 Merge Backup: Shortens recovery time on production servers, ensures full backups are available as needed, and enables organizations to reduce resources and outages for full backups.
  • DB2 Recovery Expert: Localizes recovery processes to impacted objects in order to reduce outages without having to resort to full database recovery.
  • Optim Performance Manager offerings: Provide a comprehensive and proactive performance-management solution to help you identify, diagnose, solve, and prevent database performance issues for applications that access DB2 databases on Linux®, UNIX®, or Windows®.
  • Optim Query Tuner offerings: Include single-query tuning for both DB2® for z/OS® and DB2 for Linux, UNIX, and Windows. For DB2 for z/OS, you can extend the capability to entire SQL workloads using Optim Query Workload Tuner for DB2 for z/OS (formerly DB2 Optimization Expert for z/OS). These products provide a set of advisors and tools to make SQL tuning easier.
While there are more products that are part of the lifecycle management portfolio, including InfoSphere Optim Test Data Management solutions and InfoSphere Optim Data Growth solutions, this article focuses only on the listed products.


To learn more about IBM Integrated Data Management solutions by Optim, visit these links:
IBM documentation --  Tutorials for Optim Development
http://www.ibm.com/software/data/optim/
http://www.ibm.com/software/data/optim/development-studio/
http://www.ibm.com/developerworks/spaces/optim/
http://www.ibm.com/developerworks/data/products/devstudio/
http://www.ibm.com/developerworks/data/library/techarticle/dm-1105optimquerytuner2/index.html
http://publib.boulder.ibm.com/infocenter/idm/v2r2/index.jsp?topic=%2Fcom.ibm.datatools.dwb.tutorial.doc%2Ftopics%2Fdwb_introduction.html
http://www.ibm.com/developerworks/data/library/techarticle/dm-0909optimpackaging/index.html
http://www.ibm.com/developerworks/data/library/techarticle/dm-1107extendedinsight/index.html

Ref: www.ibm.com

Note: Some of the articles are taken from various websites/blogs.

IBM - OPTIM - I

IBM Optim Database Management Professional Edition
InfoSphere Optim is the family name of a portfolio of products that delivers on the IBM vision of managing data across your data’s lifecycle. This portfolio is designed to increase productivity for all roles throughout the data life cycle and to help reduce the complexity and cost of managing database applications. InfoSphere Optim offerings help you manage and optimize tasks in each phase of the application life cycle from a data management viewpoint. This article focuses on an overview of some of these offerings and how the products are packaged.
                   
Table 1. Information governance core disciplines: lifecycle management

Discover and define: InfoSphere Discovery, IBM Business Glossary, InfoSphere Data Architect
Develop and test: Optim Development Studio, Optim Database Administrator, Optim Test Data Manager and Data Privacy
Monitor and optimize: Optim Performance Manager, Optim Query Tuner, Optim pureQuery Runtime, DB2 Merge Backup, DB2 Recovery Expert
Consolidate and retire: Optim Data Growth, Optim Application Consolidation solutions

The following list gives a high-level description of the products highlighted in Table 1.

Quicklinks for z/OS developers and database administrators

For many years, IBM has had a great set of products to manage DB2 for z/OS data and applications. Over time, more capabilities for z/OS will be delivered in these products, as well as in products in the Optim portfolio. Of the products described in this article, the following are currently enabled for z/OS databases:
Although not described in this article, the IBM Optim set of solutions also has great support for z/OS to manage data growth, to protect data privacy, and to streamline test data.

Ref: http://www.ibm.com/
Note: Some of the articles are taken from various websites/blogs.

Thursday, September 15, 2011

DS Related Useful Links - I

1. http://publib.boulder.ibm.com/infocenter/iisinfsv/v8r5/index.jsp?topic=/com.ibm.swg.im.iis.bpdir.usage.doc/topics/bpdir_con_gui_overview.html


2. For Orc-DWH :: http://oracle-datawarehousing.blogspot.com/2011/02/data-warehousing-objects.html

3. For Unix Related :: http://www.dartmouth.edu/~rc/classes/ksh/

4. For Datastage :: http://datastage-tutorials.blogspot.com/2009/09/1.html

5. For DSxchange Site   :: http://www.dsxchange.com/portal.php

6. For WS_Quality Stage :: http://it.toolbox.com/wiki/index.php/WebSphere_QualityStage

7. http://www.dstagebox.com/cat/real-time/

8. http://www.duke-consulting.com/DwNav.htm

9. http://spulagam.blogspot.com

10. http://www.robvonk.com/datastage/standard-functions/

11. http://evergreendatastage.blogspot.com/2011/02/datastage-faqs.html


DS - Unix Related - II

ps -ef | grep <pattern>

ps -ef | grep dsrpc      (DSRPC daemon)

ps -ef | grep phantom    (job processes)

ps -ef | grep osh        (parallel engine processes)

ps -ef | grep dscs       (client connections)

ps -ef | grep dsapi      (client connections)

kill -9 <pid>

Check the permissions:
1. Log on to the server as dsadm or root
2. cd $DSHOME/bin
3. ls -l | grep rws

The output should show six files owned by root with the setuid bit set, especially the uv file.
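As a sketch, this is what the setuid check looks like; a temporary directory and a dummy uv file stand in for $DSHOME/bin so the commands can be tried anywhere:

```shell
# Sketch of the setuid check. The real check runs in $DSHOME/bin; the
# directory and file here are stand-ins so this can run on any machine.
dir=$(mktemp -d)
touch "$dir/uv"
chmod 4755 "$dir/uv"      # set the setuid bit, as on the real uv binary
ls -l "$dir" | grep rws   # setuid files show 's' in the owner-execute slot
rm -rf "$dir"
```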


Check to ensure that there was a clean restart of the DataStage Engine.
Here are the instructions for a clean restart of DataStage Engine:


1. Log on to server as dsadm or root
2. cd $DSHOME
3. source the environment:


. ./dsenv        (that's dot, space, dot-slash, dsenv)

4. Check for DataStage client connections
ps -ef | grep dsapi_slave
ps -ef | grep dscs

If there are any connections found, use kill -9 to remove them.

5. Check for job processes:
ps -ef | grep phantom

If there are any connections found, use kill -9 to remove them.

6. Check DataStage shared memory segments:
ipcs -mop | grep ade

If any shared memory segments are returned, remove them with:
ipcrm -m ID, where ID is the number in the second column of the ipcs -mop output.
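A hedged sketch of that cleanup step, run against made-up ipcs-style sample text so it can be tried without touching real shared memory (the column layout is illustrative; check your platform's ipcs output):

```shell
# Sketch: extract the segment IDs (second column) for DataStage ("ade")
# segments from `ipcs -mop`-style output. The sample text is invented.
sample='m 123456 0x0000 --rw-rw-rw- ade dstage
m 123457 0x0000 --rw-rw-rw- ade dstage'
echo "$sample" | grep ade | awk '{print $2}'
# In a real cleanup you would pipe the IDs on:  ... | xargs -n1 ipcrm -m
```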

7. Check if there is any port hung using:
netstat -a | grep dsrpc

If you get any results and dsrpc is in WAIT state, you need to wait until it disappears (you can also reboot the machine).

8. At this point you can stop or start the DataStage Engine.
bin/uv -admin -stop
bin/uv -admin -start

DS - Unix Related - I

In UNIX, the DataStage home directory location is always specified in the ".dshome" file, which is present in the root directory. Before you can run your DataStage commands, you have to run the following commands:
cd `cat /.dshome`
This will change the location to the home directory. By default this will be /opt/IBM/InformationServer/Server/DSEngine
. ./dsenv > /dev/null 2>&1
This will run the dsenv file, which contains all the environment variables. Without doing this, the DataStage commands won't run from the command prompt.
After you have done this, you can use any DataStage command to interact with the server. The main command is 'dsjob', which is used not only to run jobs but for a wide variety of tasks. Let's look at the various ways you can use the dsjob command.
To run a job
Using the dsjob command you can start, stop, reset or run the job in validation mode.
dsjob -run -mode VALIDATE project_name job_name
This command runs the job in validation mode. Similarly, you can use RESET or RESTART instead of VALIDATE, depending on the type of run you want. For a normal run, you do not need to specify the -mode keyword, as shown below:
dsjob -run project_name job_name|job_name.invocationid
Running with the invocation id means the job is run with that specific invocation id.
If you have parameters or parameter set values to set, this can also be done as shown below:
dsjob -run -param variable_name="VALUE" -param psParameterSet="vsValueSet" project_name job_name
To stop a job
Stopping a job is fairly simple. You might not actually need it, but it's still worth a look. It acts the same way as stopping a running job from the DataStage Director.
dsjob -stop project_name job_name|job_name.invocationid
To list projects, jobs, stages in jobs, links in jobs, parameters in jobs and invocations of jobs
dsjob can very easily give you all of the above based on different keywords. This is useful if you want a report of what's being used in which project, and the like.
The various commands are shown below:
'dsjob -lprojects' gives you a list of all the projects on the server.
'dsjob -ljobs project_name' gives you a list of the jobs in a particular project.
'dsjob -lstages project_name job_name' gives you a list of all the stages used in your job. Replacing -lstages with -llinks gives you a list of all the links in your job. Using -lparams gives you a list of all the parameters used in your job. Using -linvocations gives you a list of all the invocations of your multiple-instance job.
To generate reports of a job
You can get the basic information of a job by using the 'jobinfo' option as shown below:
dsjob -jobinfo project_name job_name
Running this command gives you a short report on your job, including the current status of the job, the name of any controlling job, the date and time the job started, the wave number of the last or current run (an internal InfoSphere DataStage reference number), and the user status.
You can get a more detailed report using the below command
dsjob -report project_name job_name BASIC|DETAIL|XML
BASIC means your report will contain very basic information, such as the start/end time of the job, the time elapsed, and the current status. DETAIL, as the name indicates, gives you a very detailed report on the job, down to the stage and link level. XML gives you the same detailed report in XML format.
To access logs
You can use the command below to get the latest 5 fatal errors from the log of the job that was just run:
dsjob -logsum -type FATAL -max 5 project_name job_name
You can get different types of information based on the keyword you specify for -type. The full list of allowable types is available in the help guide for reference.
There are a number of other options available to get different log information; you can explore these in more detail in the developer guide. With the DataStage commands you can administer jobs, run jobs, maintain jobs, handle errors, prepare meaningful job logs, and even prepare reports. The possibilities are endless. If you like to code, you won't mind spending your time exploring the command-line options available.
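The dsjob invocations above can be collected into a small wrapper. This is only a sketch: it prints the command line it would run (a dry run) rather than executing it, since it assumes no DataStage install is present, and the project and job names are placeholders.

```shell
# Dry-run wrapper around the dsjob syntax shown above: it only echoes the
# command line it would execute. Project/job names are placeholders.
dsjob_cmd() {
    action=$1; project=$2; job=$3
    case "$action" in
        validate) echo "dsjob -run -mode VALIDATE $project $job" ;;
        run)      echo "dsjob -run $project $job" ;;
        stop)     echo "dsjob -stop $project $job" ;;
        info)     echo "dsjob -jobinfo $project $job" ;;
        report)   echo "dsjob -report $project $job DETAIL" ;;
        logsum)   echo "dsjob -logsum -type FATAL -max 5 $project $job" ;;
        *)        echo "usage: dsjob_cmd action project job" >&2; return 1 ;;
    esac
}

dsjob_cmd validate dstage_proj LoadCustomers
```

Swapping the echoed "dsjob" for the real $DSHOME/bin/dsjob would turn the dry run into actual job control.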


Note: Some of the articles are taken from various websites/blogs.

Monday, September 12, 2011

Subject: Read it carefully.... A Troubled User...


Dear Tech Support Team,

Last year I upgraded from Girlfriend 5.0 to Wife 1.0.

I soon noticed that the new program began unexpected child-processes that took up a lot of space and valuable resources. In addition, Wife 1.0 installed itself into all other programs and now monitors all other system activities.

Applications such as
BachelorNights 10.3, Cricket 5.0, BeerWithBuddies 7.5, and Outings 3.6 no longer run, crashing the system whenever selected. I can't seem to keep Wife 1.0 in the background while attempting to run my favorite applications.

I'm thinking about going back to Girlfriend 5.0 , but the 'uninstall ' doesn't work on Wife 1.0.
Please help!
Thanks, "A Troubled User "

REPLY:
Dear Troubled User:
This is a very common problem that people complain about.
Many people upgrade from Girlfriend 5.0 to Wife 1.0, thinking that it is just a Utilities and Entertainment program. Wife 1.0 is an OPERATING SYSTEM and is designed by its Creator to run EVERYTHING!!!

It is also impossible to delete Wife 1.0 and to return to Girlfriend 5.0.
It is impossible to uninstall, or purge the program files from the system once installed.
You cannot go back to Girlfriend 5.0 because Wife 1.0 is designed not to allow this. (Look in your Wife 1.0 Manual under Warnings-Alimony- Child Support) ..

I recommend that you keep Wife1.0 and work on improving the environment.
I suggest installing the background application " Yes Dear" to alleviate software augmentation.

The best course of action is to enter the command
C:\APOLOGIZE because ultimately you will have to give the APOLOGIZE command before the system will return to normal anyway.

Wife 1.0 is a great program, but it tends to be very high maintenance. Wife 1.0 comes with several support programs, such as
Clean 2.5, Sweep 3.0, Cook 1.5 and DoLaundry 4.2. However, be very careful how you use these programs. Improper use will cause the system to launch the program NagNag 9.5 .

Once this happens, the only way to improve the performance of Wife 1.0 is to purchase additional software. I recommend Sarees 2.1 and Jewellery 5.0.

STATUTORY WARNING: DO NOT, under any circumstances, install SecretaryWithShortSkirt 3.3. This application is not supported by Wife 1.0 and will cause irreversible damage to the operating system.

Best of luck,
Tech Support



Please consider the environment before printing this page.

Friday, September 9, 2011

DS - DWH Queries - I

1. There are 10 years of records and I want only the last month's records. What should I do?

Ans1: Take SYSDATE, TRUNC it to the month, and subtract one month, then compare the date field against that boundary. This assumes the source is a database table with a date field in it.

Ans2: Use date functions with IF, THEN, ELSE in a Transformer after the Sequential File stage
(Or)
Use an External Filter stage with a grep command
(Or)
Use the filter command in the Sequential File stage
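As a sketch of the External Filter / grep-style approach above (the file layout, field position, and target month are made up for illustration):

```shell
# Keep only the rows whose date field (second comma-separated field) falls
# in the wanted month. The data file and the month are illustrative.
printf 'cust1,2011-08-15\ncust2,2011-09-01\ncust3,2011-08-30\n' > records.txt
awk -F, '$2 ~ /^2011-08/' records.txt
rm -f records.txt
```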
****************************

2. On insert into a table we are getting: sh: dbx: not found. Operator terminated abnormally: received signal SIGSEGV

Ans: SIGSEGV is the UNIX signal meaning "Segmentation Violation": an attempt to access a memory address not owned by the process, for example when it has run out of memory.

The following solutions may help:

1) Change the jobs to use a Join stage or Merge stage instead of a Lookup, and try to keep the file sizes manageable.

2) If the job sometimes runs successfully and sometimes not, check the NULL handling and data type conversions.

****************************

3. When running a DataStage job in UNIX, I get the error below, even though I run dsenv before running the job:
$dsjob -run PROJECT_NAME DSJOB_NAME
Error running job. Status code = -99 DSJE_REPERROR ??

1Ans: It's a repository error.
Check the project and job names for any typo:
$DSHOME/bin/dsjob -run -jobstatus PROJECT_NAME DSJOB_NAME
Also pass the parameters, if any are mandatory.

**********************************************
4. I have a requirement to fetch "n" random records from a text file, where "n" is held in another table. For example, if my original input file is input.txt, then the number of random records to be fetched would be present in number.txt.

1Ans: You could count your input file, divide the count by "n" to give you x, then take every xth record by input rownum in a Transformer stage.

2Ans: Or assign a Random partition and choose only one partition until you reach "n" records.
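Outside DataStage, GNU coreutils offers a one-liner for the same requirement. This sketch assumes shuf is available and hard-codes n, whereas the original requirement reads it from number.txt:

```shell
# Pick n random records from input.txt (n is hard-coded here; the original
# requirement would read it from number.txt). Requires GNU coreutils shuf.
printf 'r1\nr2\nr3\nr4\nr5\n' > input.txt
n=2
shuf -n "$n" input.txt   # which records come out is random
rm -f input.txt
```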

****************************************************

5Q: How do you lock/unlock jobs as Admin?

1Ans: Open DataStage Director,
click Job,
click Cleanup Resources,
and a window appears.
Select "Show all".

You can see your user id and can then log out from there.

2Ans: If the Cleanup Resources option is disabled, ask your administrator to enable administration over the Director to get that done.
The other option should be executed from the "Command" option in the Administrator.

************************************

6Q: How can we use a shell script in the middle of a job? I can get the record count from a DataStage job, but how can I then use a parameter to separate the records?
To make the question clearer: if I have 3 records, then count=3, so the job should create 3 separate text files, each containing only one record.

1Ans: First calculate the total number of records. Then create a comma-separated file of the values of some key column.

get: key1,key2,key3

Write it into a file and read it in a Start Loop activity, or pass it as a parameter to the Start Loop activity.
Call the actual job and pass these keys as parameters.
In the Transformer, if the key matches the job parameter, pass that particular record to the output.

Pass the parameter into the file name, like File#parameter#.txt.
That way you get each record in its own file, based on the keys.
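If the goal is simply one record per file, the standard split utility does this directly; a sketch (the file names are illustrative):

```shell
# Split source.txt into one-record-per-file outputs named File_aa, File_ab, ...
printf 'rec1\nrec2\nrec3\n' > source.txt
split -l 1 source.txt File_
ls File_*
rm -f source.txt File_*
```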

Source: Deepak – ITToolbox.com

Tuesday, September 6, 2011

Index and Types of INDEX

INDEX AND TYPES OF INDEX
Indexes are optional structures associated with tables and clusters that allow SQL statements to execute more quickly against a table. Just as the index in this manual helps you locate information faster than if there were no index, an Oracle Database index provides a faster access path to table data. You can use indexes without rewriting any queries. Your results are the same, but you see them more quickly.
Oracle Database provides several indexing schemes that provide complementary performance functionality. These are:
B-tree indexes:
In computer science, a B-tree is a tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic amortized time. The B-tree is a generalization of a binary search tree in that more than two paths diverge from a single node. Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. It is commonly used in databases and file systems.
B-tree cluster indexes: The rows in the data pages of a clustered index are ordered on the values of the clustered index keys. The leaf nodes contain the data pages of the table; the root and branch nodes contain the actual index pages and index rows.
Each index row contains a key value and a pointer to a branch-level page or to a data row in the leaf node of the index. New rows are inserted to fit the ordering sequence of index keys among existing rows. When you create a primary key in SQL Server, it creates a clustered index by default. The structure of clustered indexes makes them faster than non-clustered indexes.
Bitmap indexes: Bitmap indexes have traditionally been considered to work well for data such as gender, which has a small number of distinct values, for example male and female, but many occurrences of those values. This would happen if, for example, you had gender data for each resident in a city. Bitmap indexes have a significant space and performance advantage over other structures for such data. However, some researchers argue that bitmap indexes are also useful for uniquely valued data that is not updated frequently. Bitmap indexes use bit arrays (commonly called bitmaps) and answer queries by performing bitwise logical operations on these bitmaps.
Bitmap indexes are also useful in the data warehousing applications for joining a large fact table to smaller dimension tables such as those arranged in a star schema. In other scenarios, a B-tree index would be more appropriate.
Hash cluster indexes: defined specifically for a hash cluster
Global and local indexes: relate to partitioned tables and indexes
Reverse key indexes: most useful for Oracle Real Application Clusters applications
Function-based indexes: contain the precomputed value of a function/expression
Domain indexes: specific to an application or cartridge.
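The bitwise idea behind bitmap indexes above can be shown with a toy sketch in shell arithmetic (the bitmaps are invented): one bit per table row, and ANDing two predicate bitmaps yields exactly the rows matching both predicates.

```shell
# Toy bitmap-index illustration: one bit per table row (bit i = row i).
# ANDing a "gender = male" bitmap with a "region = west" bitmap gives the
# rows satisfying both predicates. All values here are invented.
male=45   # binary 101101: rows 0, 2, 3, 5 are male
west=38   # binary 100110: rows 1, 2, 5 are in the west
both=$(( male & west ))
echo "$both"   # 36 = binary 100100, i.e. rows 2 and 5 match both
```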
Indexes are logically and physically independent of the data in the associated table. Being independent structures, they require storage space. You can create or drop an index without affecting the base tables, database applications, or other indexes. The database automatically maintains indexes when you insert, update, and delete rows of the associated table. If you drop an index, all applications continue to work. However, access to previously indexed data might be slower.