Wednesday, February 23, 2011

DS - Putty

Putty File Transfer Commands in SSH Protocol

Using SSH (secure shell) protocol during file transfer can decrease the chances that your web server's security will be compromised. There are a variety of SSH clients available. Putty, an open source SSH client, offers a number of advantages. This article will explain how to use it.
One of the most important aspects of transferring files (uploading or downloading) to a web server is security. The basic method of transferring is FTP, which is not secure; it sends transmitted information in clear text form. This increases the risk of eavesdropping, and in the long run can increase the chances of having your web server access compromised.

However, the solution to this problem is to use SSH (another name: “secure shell”) protocol during file transfer. SSH effectively encrypts the communication channel between the client (your computer) and the SSH server. In this way, any hacker or eavesdropper has no means of identifying your passwords, login information, etc as your packets travel on the Internet. This drastically improves the security of your file transferring session.

Putty SSH client is an open source SSH client for Windows-based operating systems (Windows 7, XP, etc). Other aspects include how to establish SSH sessions and use it to browse web server files. Simple SSH commands will be discussed; there will also be a detailed discussion of transferring and moving files from the client computer to an SSH server.

how to execute command line statements in Putty, Let’s get started.

Enable SSH in your web hosting account
If you have a paid hosting account, find out from your web hosting support whether they have SSH included as one of their hosting features. If your web hosting package includes SSH, you need to enable it before you can use it to connect and transfer files. Consult your web hosting support for details.

If your web hosting package does not include SSH, you may try to switch to other hosting companies, but this is rare -- almost all Linux/Unix/Apache-based hosting includes SSH functionality in a paid hosting account. In my experience, depending on your web hosting company, adding SSH features or enabling them can cause your site to experience some down time, so you had better prepare for it. You can always consult your web hosting company for any additional SSH support.

To test if you have SSH enabled in the server:

Step 1: Go to Start -> All programs -> Accessories -> Command Prompt
Step 2: On the Windows DOS prompt, ping your website by executing your command (replace the domain with yours): ping php-developer.org
The command above will ping the website http://www.php-developer.org/ . Your objective is to learn the IP address of the website server, for example: 82.197.130.134
Step 3: Once you know your website server's IP address, execute the following command, and then press enter: ssh YOURFTPUSERNAME@YOURIPADDRESS
If the reply is something like this:
ssh: connect to host 82.197.130.134 port 22: connection refused
You have NOT yet enabled your SSH access. Replace the above examples with your own FTP username and website IP address. See sample screen shot below:
***********************

***********************
The PuTTY Event Log

If you choose "Event Log" from the system menu, a small window will pop up in which PuTTY logs significant events during the connection. Most of the events in the log will probably take place during session startup, but a few can occur at any point in the session, and one or two occur right at the end.

You can use the mouse to select one or more lines of the Event Log, and hit the Copy button to copy them to the clipboard. If you are reporting a bug, it's often useful to paste the contents of the Event Log into your bug report.

Starting new sessions
PuTTY's system menu provides some shortcut ways to start new sessions:
Selecting "New Session" will start a completely new instance of PuTTY, and bring up the configuration box as normal.
Selecting "Duplicate Session" will start a session with precisely the same options as your current one - connecting to the same host using the same protocol, with all the same terminal settings and everything.
The "Saved Sessions" submenu gives you quick access to any sets of stored session details you have previously saved. See section 4.1.2 for details of how to create saved sessions.

Changing your session settings
If you select "Change Settings" from the system menu, PuTTY will display a cut-down version of its initial configuration box. This allows you to adjust most properties of your current session. You can change the terminal size, the font, the actions of various keypresses, the colours, and so on.
Some of the options that are available in the main configuration box are not shown in the cut-down Change Settings box. These are usually options which don't make sense to change in the middle of a session (for example, you can't switch from SSH to Telnet in mid-session).

Copy All to ClipboardThis system menu option provides a convenient way to copy the whole contents of the terminal screen and scrollback to the clipboard in one go.

Clearing and resetting the terminalThe "Clear Scrollback" option on the system menu tells PuTTY to discard all the lines of text that have been kept after they scrolled off the top of the screen. This might be useful, for example, if you displayed sensitive information and wanted to make sure nobody could look over your shoulder and see it. (Note that this only prevents a casual user from using the scrollbar to view the information; the text is not guaranteed not to still be in PuTTY's memory.)
The "Reset Terminal" option causes a full reset of the terminal emulation. A VT-series terminal is a complex piece of software and can easily get into a state where all the text printed becomes unreadable. (This can happen, for example, if you accidentally output a binary file to your terminal.) If this happens, selecting Reset Terminal should sort it out.

************************
Putty only displays command prompt of a remote Linux computer in Windows.
The commands that you type in are simply Linux commands. They are not putty commands. Here are few examples:
ls - to list files in a directory:
Code:
ls
handbook-draft.pdf  iso
ls -lh
total 3.4M
-rw-r--r-- 1 pavlo pavlo 3.4M 2007-05-15 05:53 handbook-draft.pdf
drwxr-xr-x 2 pavlo pavlo 4.0K 2007-04-10 00:25 iso
ls iso/
rhel-5-client-x86_64-disc6.iso
cd - change directory (navigate to some directory):
Code:
cd iso
cp - copy a file:
Code:
cp ../handbook-draft.pdf .
mv - move a file (also used to rename files):
Code:
mv handbook-draft.pdf howto.pdf
rm - remove a file:
Code:
rm handbook-draft.pdf
mkdir - make directory:
Code:
mkdir new
pwd - show your current location:
Code:
pwd
/home/pavlo/example/iso
whoami - find out which user you are:
Code:
whoami
pavlo
date - display date and time
Code:
date
Tue May 15 06:07:45 UTC 2007
some commands are distribution specific, such as apt in Debian. Example:
Code:
apt-get install expect
which downloads and installs "expect" package.
man - most important of all commands (opens manual pages for other commands):
Code:
man expect
q - quit or exit an application (for example when done reading man page, type q to exit).
Here I found a list of some Linux commands: Linux bash commands - MAN Pages

DS - Autosys

Autosys – Through UNIX Command Prompt
The most commonly used commands are listed below. Autosys can be installed in Unix and in NT. When installed in NT, the commands listed below are typically available through the AutoConsole graphical user interface (GUI) as buttons or menu items.

Autosys Description
% wildcard for autorep commands

Status Commands  autorep –j pfiw9999% shows a brief job information for jobs starting with pfiw9999
autorep –j pfiw9999 -d show detailed status history for job pfiw9999
autorep –j pfiw% shows status of all jobs starting with pfiw

Job Scheduling and other Information 
autorep –j pfiw9999% -q show all job information for jobs starting with pfiw9999

Job Status for previous runs 
autorep –j pfiw9999% -r -1 show brief job status for jobs starting with pfiw9999 one run backwards in time

Job Dependency Information 
job_depends –j pfiw9999 -c show job dependencies (starting conditions) and successors (dependent jobs) for pfiw9999
job_depends show “scheduled” starting job times

Setting Status of Autosys Jobs 
To set a job to inactive sendevent –E CHANGE_STATUS –J <JOB_NAME> -i INACTIVE
To set a job to success sendevent –E CHANGE_STATUS –J <JOB_NAME> -s SUCCESS

Other Events 
To force start a job sendevent –E FORCE_STARTJOB –J <JOB_NAME>
To put a job on hold sendevent –E JOB_ON_HOLD –J <JOB_NAME>
To put a job on ice sendevent –E JOB_ON_ICE –J <JOB_NAME>
To take a job off hold sendevent –E JOB_OFF_HOLD –J <JOB_NAME>
To take a job off ice sendevent –E JOB_OFF_ICE –J <JOB_NAME>
To kill a job sendevent –E KILLJOB –J <JOB_NAME>
When you set a job to SUCCESS or ON_ICE, the job depending on it may run because it is still set to run.
If a job is ON_HOLD or ON_ICE, it will not run until it is force started or set to inactive for a normal run.
When you want to stop a job, use the KILL command. Once the job has been KILLED, it can be marked ON-HOLD, or INACTIVE, or whatever supported status you wish to assign to it.
**********************

Tuesday, February 8, 2011

DS - General

Which stages use HASH partitioning?
Want to increase 15% of the sal for the dept 10?
A:: try wid UPDATE employee SET salary = (salary * 15)/100 + sal wehre deptno = 10;
In Unix ::
1) How can u goto home dir., if ur in current file ::::: $cd !, $cd . cd.. cd...
2) Display Hidden Files :::::: $ls -al
3) How can u identify hidden files :::::: the file with rwx permissions
4) pids ::: $ps -ef | more
5) kill   ::: kill pid
6) chmod Filename r + x

SQL Basic QA's

## What is SQL?
A: SQL stands for ‘Structured Query Language’.

## What is SELECT statement?
A: The SELECT statement lets you select a set of values from a table in a database. The values selected from the database table would depend on the various conditions that are specified in the SQL query.

## How can you compare a part of the name rather than the entire name?
A: SELECT * FROM people WHERE empname LIKE ‘%ab%’
Would return a recordset with records consisting empname the sequence ‘ab’ in empname

## What is the INSERT statement?
 A: The INSERT statement lets you insert information into a database.

## How do you delete a record from a database?
 A: Use the DELETE statement to remove records or any particular column values from a database.

## How could I get distinct entries from a table?
 A: The SELECT statement in conjunction with DISTINCT lets you select a set of distinct values from a table in a database. The values selected from the database table would of course depend on the various conditions that are specified in the SQL query. Example SELECT DISTINCT empname FROM emptable

## How to get the results of a Query sorted in any order?
A: You can sort the results and return the sorted results to your program by using ORDER BY keyword thus saving you the pain of carrying out the sorting yourself. The ORDER BY keyword is used for sorting.

SELECT empname, age, city FROM emptable ORDER BY empname

## How can I find the total number of records in a table?
 A: You could use the COUNT keyword , example
SELECT COUNT(*) FROM emp WHERE age>40

## What is GROUP BY?
 A: The GROUP BY keywords have been added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time they are called. Without the GROUP BY functionality, finding the sum for each individual group of column values was not possible.

## What is the difference among “dropping a table”, “truncating a table” and “deleting all records” from a table.
A: Dropping : (Table structure + Data are deleted), Invalidates the dependent objects ,Drops the indexes
Truncating: (Data alone deleted), Performs an automatic commit, Faster than delete

Delete : (Data alone deleted), Doesn’t perform automatic commit

## What are the Large object types suported by Oracle?
 A: Blob and Clob.

## Difference between a “where” clause and a “having” clause.
 A: Having clause is used only with group functions whereas Where is not used with.

## What’s the difference between a primary key and a unique key?
 A: Both primary key and unique enforce uniqueness of the column on which they are defined. But by default primary key creates a clustered index on the column, where are unique creates a nonclustered index by default. Another major difference is that, primary key doesn’t allow NULLs, but unique key allows one NULL only.

 ## What are cursors? Explain different types of cursors. What are the disadvantages of cursors? How can you avoid cursors?

A: Cursors allow row-by-row prcessing of the resultsets.
Types of cursors: Static, Dynamic, Forward-only, Keyset-driven. See books online for more information.

Disadvantages of cursors: Each time you fetch a row from the cursor, it results in a network roundtrip, where as a normal SELECT query makes only one rowundtrip, however large the resultset is. Cursors are also costly because they require more resources and temporary storage (results in more IO operations). Furthere, there are restrictions on the SELECT statements that can be used with some types of cursors.

Most of the times, set based operations can be used instead of cursors.


## What are triggers? How to invoke a trigger on demand?
A: Triggers are special kind of stored procedures that get executed automatically when an INSERT, UPDATE or DELETE operation takes place on a table.

Triggers can’t be invoked on demand. They get triggered only when an associated action (INSERT, UPDATE, DELETE) happens on the table on which they are defined.

Triggers are generally used to implement business rules, auditing. Triggers can also be used to extend the referential integrity checks, but wherever possible, use constraints for this purpose, instead of triggers, as constraints are much faster.

## What is a join and explain different types of joins.
A: Joins are used in queries to explain how different tables are related. Joins also let you select data from a table depending upon data from another table.

Types of joins: INNER JOINs, OUTER JOINs, CROSS JOINs. OUTER JOINs are further classified as LEFT OUTER JOINS, RIGHT OUTER JOINS and FULL OUTER JOINS.

## What is a self join?
A: Self join is just like any other join, except that two instances of the same table will be joined in the query.

Friday, February 4, 2011

DS - Deadlock daemon to clear unused locks

Q: How to use the deadlock daemon to clear unused locks in InfoSphere DataStage
Answer:Locks are used by DataStage to guarantee the integrity of objects while they are being used. Locks prevent multiple processes to update the same object at the same time (for example when 2 users try to edit the same job concurrently). Under normal circumstances DataStage cleans the locks when they are no longer necessary but there might be scenarios when locks are not removed properly (for example when a network connection is abruptly closed). When that happens DataStage might keep unneeded locks, which in turn might cause problems.

DataStage has a tool that helps you to automatically clear these locks and is called the deadlock daemon (dsdlockd). This service can be configured to run with certain frequency and removed the locks that are no longer needed.

The deadlock daemon checks all DataStage locks and makes sure that all of them are associated to existing processes running. If a lock exists but the process that was using the lock no longer exists then the lock is removed.

There are different ways of using the deadlock daemon:
1. If you want to start this daemon automatically when the DataStage Engine
starts then edit the file $DSHOME/dsdlockd.config. For security reasons, don't forget to make a copy of this file before you change it.

Open the file and set the parameter "start "to 1 and to define how often the deadlock daemon will check for unused locks use the parameter "timer". This is value is set in seconds.

For example:
start=1
timer=3600
will start the daemon with DataStage and then it will check for unused locks every hour.

2. If the deadlock daemon is not running and you want to manually start it then go to $DSHOME source the dsenv file and then run bin/dsdlockd -config The parameter -config forces the program to read the $DSHOME/dsdlockd.config and use the parameter timer to define how often to check for unused locks.
If the parameter -config is omitted it will check every 60 seconds.

To stop the daemon you can run
bin/dsdlockd -stop

3. If the deadlock daemon is not running and you want to run only once you can use the parameter -p, for example:
bin/dsdlockd -p

This way the daemon will only check for unused locks once and then will stop automatically.

Note: Do not modify the other values of the dslockd.config file unless suggested by IBM.

Thursday, February 3, 2011

DS - Useful Links - II

Here you can find some  useful Links


http://www.oracle.com/pls/db102/portal.portal_db?selected=6
http://www.ibm.com/developerworks/data/community/ 
http://www-01.ibm.com/support/docview.wss?uid=swg21413260
http://www.ibm.com/developerworks/data/library/techarticle/dm-0609xu/
http://www.channeldb2.com/profiles/blogs/datawarehouse-project-working
ORACLE : http://www.orafaq.com/wiki/SQL_FAQ#How_does_one_eliminate_duplicates_rows_from_a_table.3F
UNIX :  http://www.panix.com/~elflord/unix/grep.html

http://datastage.org/
http://datastage-tutorials.blogspot.com/2010/04/datastage-overview_24.html
http://datastagecareer.blogspot.com/
http://en.wordpress.com/tag/datastage/
http://www.dsxchange.com/viewtopic.php?p=234616&sid=91f929650228a308564d102d0ca4bedf
http://idpt0000.wordpress.com/2006/09/11/datastage-blog-and-wiki/
http://www.blogtoplist.com/rss/data-warehousing.html
http://datawarehousingsites.blogspot.com/
http://blogs.oracle.com/datawarehousing/
http://blogs.oracle.com/datawarehousing/
http://etl-tools.info/infosphere-datastage-ee/certification.htm
http://www.ibm.com/developerworks/data/library/techarticle/dm-0609xu/
http://www-148.ibm.com/tela/webmail/NlDynamicPage/8205/28501/cosmetic?web_page_id=508702&pre=0
http://www.inmoncif.com/home/
http://www.toadworld.com/BLOGS/tabid/67/Default.aspx
http://obiee101.blogspot.com/search?updated-min=2010-01-01T00%3A00%3A00%2B01%3A00&updated-max=2011-01-01T00%3A00%3A00%2B01%3A00&max-results=50
http://spulagam.blogspot.com/2009/12/logical-versus-physical-database.html
http://www.softpedia.com/get/Others/Home-Education/000-415-IBM-WebSphere-IIS-DataStage-Enterprise-Edition-Practice-Test-Questions.shtml
http://www.globalguideline.com/interview_questions/Questions.php?sc=Extract_transform_load_Oracle_Interview_Questions_Answers

http://etlguru.com/blog/category/etl-testing/


How to control Angryness :::::


http://www.ehow.com/how_2116060_overcome-angriness.html

http://www.soundfeelings.com/free/anger.htm

DS - Lookup Q

When performing a sparse or normal lookup in a parallel job how do you return multiple rows?  
For a sparse lookup all matching rows are returned by default. There is no way to change this and any rows matching the key values will be returned.

For a normal lookup you will need to specify this option:
Right click the lookup stage and click Properties then click the Constraints icon

Select the drop-down list called "return multiple rows from link"

This list will give you all reference links but only one can be used to return multiple rows. 

******************************
Datastage server7.5 installed on Unix machine. need to have a DSN to connect to the database which is on some other unix machine.
let me know it is enough if we create a DSN at the client machine or do we have to change the config file in UNIX?
In UNIX, three files need to be configured.
a) uvodbc.config
b) odbc.ini
c) dsenv
a)uvodbc.config : Here you need to specify the name of data spurce(DSN). Existing configurations in the file can be referred to create new DSN.
For example :
<DSN>
DBMSTYPE = ODBC

b)odbc.ini : Here you need to provide the configuration details of the DSN created in uvodbc.config. The name of data source should be same in uvodbc.config and odbc.ini here the details like like driver location, database name, ip, user id, etc are specified.
Refer the exsting configurations in this file to create new ones.

c)dsenv : It contains the Environment variables needed by the ODBC drivers to connect to the ODBC data source.You need to setup certain env variables to connect to the ODBC data source.
For example :
# Oracle 8i
    ORACLE_HOME=/space/oracle8i
    ORAHOME=/space/oracle8i
    LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$ORACLE_HOME/lib:$ORACLE_HOME/rdbms/lib;export LD_LIBRARY_PATH
    ORACLE_SID=WSMK5
    ORASID=WSMK5
    export ORACLE_HOME ORAHOME ORACLE_SID ORASID


write a querry to count the goods sold on hourly basis

Select Count(*) DATEPART ( hh sellgoodsdate) From goodssellGroup By DATEPART ( hh sellgoodsdate)

********************

How can I convert a varchar data type into a numeric data type then insert it into a varchar field in a datatabase.

For example, I need to clean up the leading zeros of this string "00000000000009700477" then insert it into a field with charvar data type.

Note that I cannot change the charvar data type to integer or numeric data type in the target table.
Example:
Source (flat file) leading zeros in numeric Data ---> Taget table: dont want leading zeros in varchar field.
00000000000009700477 want to see only 9700477
00000000000000000890 want to see only 890
Is there an interger convert function that can be used in transformer to accomplist this task?
---------
IF LEN(Trim(DSLink9.Field001[159,4],"0",&qu ot;L")) = 0 THEN 0 ELSE Trim(DSLink9.Field001[159,4],"0","L ")

Tuesday, February 1, 2011

Working on a job vs_new...and somehow my connectivity broke..
after that started new session and tried to access the job but getting error that "job is being accessed by other user". How can we resolve? 

So opened Director to cleanup resource...but there is the error as below-
ERROR: Cannot find any process numbers for stages in job vs_new

ANS : The deadlock daemon can be started to release locks and abandoned shared memory segments for processes that are terminated via the UNIX kill command or abnormally due to an unforeseen event.
To start the dslockd process edit $DSHOME/dsdlockd.config and set the start flag to 1.
The daemon will start automatically the next time the DataStage server is started
Enter the time interval, in seconds, that the daemon should wait before rechecking for deadlock conditions. The default is 900 seconds, but you may want to select a smaller interval like 300 seconds.
start=0
timer=900
res=0
log=
         OR   
    
Under the "Locks" you select the "Show All" option, you should still be able to see the job name.
You can either do the "Release All" or "Logout" the corresponding PID.

**********************

99% of the time you will not see those types of locks nor do you see them when you use LIST LOCKS.
The best way to do this and never have to worry about it is to set the deamon value as I described before so it will run and automatically take care of this as well as other important things like lost shared memory segments.

**********************
Data Extraction Rule : The data should be extracted from the previous quarter of the year for the input date passed.
For e.g
If the input date is 14-Apr-2010 the job should extract Q1 data since the input date falls in Q2.
If the input date is 03-Sep-2010 the job should extract Q2 data since the input date fall in Q3.

ANS : Suppose stage is
seqfile --> transformer --> dataset
in transformer stage variables write logic in if then else
if ( juliandayfromdate(datecolumn) < juliandayfromdate(04-01-2011)) then display or extract data columns of first quarter else
if ( juliandayfromdate(datecolumn) < juliandayfromdate(07-01-2011)) then display or extract data columns of second quarter else
if ( juliandayfromdate(datecolumn) < juliandayfromdate(10-01-2011)) display or extract data columns of third quarter
else display or extract data columns of fourth quarter
or simply compare dd mm yy separately in transformer by extracting them and get the data or columns u need.