Saturday 25 October 2014

Reading an XML File in DataStage


1.     Introduction:

DataStage is an ETL (Extract, Transform, Load) tool and a powerful data integration tool. Many DataStage users know how to load sequential files, CSV files, and tables using DataStage. Many organizations now also store data in XML files. XML is used to structure, store, and transport information.

2.     Purpose

The purpose of this document is to give simple steps for loading an XML file using DataStage. The document covers each step with screenshots for better understanding.

3.     Introduction to XML file:

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. XML is used in many aspects of web development, often to simplify data storage and sharing. A tag is a markup construct that begins with < and ends with >. Tags come in flavors such as the following:
·         start-tags; for example: <section>
·         end-tags; for example: </section>
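
As a small illustration of start-tags and end-tags, the Python sketch below parses a one-element document with the standard library's xml.etree.ElementTree module. The sample strings and the helper name is_well_formed are made up for this example; the point is only that a start-tag without its matching end-tag fails to parse.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as XML, False otherwise (hypothetical helper)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


good = "<section>Hello</section>"  # start-tag, content, end-tag
bad = "<section>Hello"             # the end-tag is missing

root = ET.fromstring(good)
print(root.tag, root.text)
print(is_well_formed(good), is_well_formed(bad))
```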










4.     Steps to Load XML files

4.1 Sample XML file-Creation:

·         Paste the XML text into Notepad
·         Save the file with the .xml extension
·         Open the XML file in Internet Explorer. If it does not show a parser error, the file is well-formed.
·         Example: a sample Employee XML file, which holds information about employees.
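
The original sample file is not reproduced here, so the sketch below uses a hypothetical employee layout (the element names employees, employee, id, name, and dept, and the file name employee_sample.xml, are assumptions, not the author's file). It writes the sample to disk and parses it back with Python's standard library, which is a scripted stand-in for opening the file in a browser to look for parser errors.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real Employee file may use different elements.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<employees>
  <employee>
    <id>101</id>
    <name>Asha</name>
    <dept>Finance</dept>
  </employee>
  <employee>
    <id>102</id>
    <name>Ravi</name>
    <dept>Sales</dept>
  </employee>
</employees>
"""

# Save the text with an .xml extension, as in the steps above.
path = os.path.join(tempfile.gettempdir(), "employee_sample.xml")
with open(path, "w", encoding="utf-8") as f:
    f.write(SAMPLE_XML)

# Parsing the file raises ParseError if it is not well-formed.
try:
    tree = ET.parse(path)
    employee_count = len(tree.getroot().findall("employee"))
    print("well-formed;", employee_count, "employee records")
except ET.ParseError as err:
    employee_count = None
    print("parser error:", err)
```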

4.2 Data Stage Designer-Login:

·         Go to Start -> Programs -> IBM Information Server -> IBM WebSphere DataStage and QualityStage Designer
·         Or double-click the shortcut on the Desktop
·         Enter the Domain, Username, Password, and Project details

4.3 Parallel Job –Creation:

·         Log in to the DataStage Designer window
·         File -> New -> Parallel Job

4.4 XML File Metadata –Import

·         In the DataStage Designer window, go to Import -> Table Definition -> XML Table Definition
·         A new window, XML Meta Data Importer, opens.
·         In the XML Meta Data Importer, go to File -> Open -> File
·         Enter the XML file name in the File name box.
·         The XML file metadata appears on the screen.
·         Select all the check boxes on the screen. (Leave the check box cleared for any field that is not required.)
·         Mark one of the columns as the key column in the Table Definition.

·         Then go to File -> Save
·         Enter the DataStage table definition name and folder path, then click OK

·         File -> Exit (to close the XML Meta Data Importer window)
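
To get a feel for what a metadata import is working from, the Python sketch below lists the distinct element paths found in a small XML document. This is only an approximation for illustration: it does not reproduce the XML Meta Data Importer's actual output, and the sample data and the helper name element_paths are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; substitute the content of your own XML file.
SAMPLE_XML = """<employees>
  <employee><id>101</id><name>Asha</name><dept>Finance</dept></employee>
  <employee><id>102</id><name>Ravi</name><dept>Sales</dept></employee>
</employees>"""


def element_paths(elem, prefix=""):
    """Collect each distinct element path once, in document order."""
    path = prefix + "/" + elem.tag
    paths = [path]
    for child in elem:
        for p in element_paths(child, path):
            if p not in paths:
                paths.append(p)
    return paths


paths = element_paths(ET.fromstring(SAMPLE_XML))
for p in paths:
    print(p)
```

Each printed path corresponds to one candidate column; the repeating element (here /employees/employee) is the one that defines a record.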












4.5 XML Job Design

·         Drag two Sequential File stages from Palette -> File
·         The first stage supplies the XML file name and the second receives the loaded XML data
·         Drag the XML Input stage from Palette -> Real Time
·         The XML Input stage uses the XML file's metadata to process the incoming records.
·         Then save the job: in the Designer window, go to File -> Save As
·         Enter the item name and folder path


·         Double-click the first Sequential File stage
·         Stage tab -> enter the XML file name in the File path
·         In the Output tab, enter any value for the column name, SQL type, and length


·         In the Format tab, set the record type to ‘implicit’ and the delimiter to ‘none’, then click OK
·         Next, open the XML Input stage and, on the General tab of the Stage tab, check the values below

·         Stage tab -> Transformation settings: check the fields below and load the XML parsing code for the file.
·         Click the Load button, select the XML file, then click OK
·         After the parsing code is loaded, the screen should look like this:


·         On the Input tab, in the XML Source tab, select the options below.

·         On the Output tab, in Transformation settings, select the options below
·         On the Output tab, under Columns, load the metadata of the XML file

·         After all steps are done, click OK in the XML Input stage





·         Open the target Sequential File stage, enter the target file name, then click OK
·         Finally, compile the job, then run it.
·         After a successful run, the data is loaded into the target file.
·         Target file output:
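
For comparison outside DataStage, the overall data flow of the job (XML records in, flat delimited rows out) can be sketched in a few lines of Python. This is not what DataStage runs internally; the sample data, the column names, and the helper xml_to_rows are all assumptions made for the illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data and columns; adapt both to your own XML file.
SAMPLE_XML = """<employees>
  <employee><id>101</id><name>Asha</name><dept>Finance</dept></employee>
  <employee><id>102</id><name>Ravi</name><dept>Sales</dept></employee>
</employees>"""
COLUMNS = ["id", "name", "dept"]


def xml_to_rows(xml_text, record_tag, columns):
    """Turn each repeating record element into one flat row of column values."""
    root = ET.fromstring(xml_text)
    return [
        [record.findtext(col, default="") for col in columns]
        for record in root.iter(record_tag)
    ]


rows = xml_to_rows(SAMPLE_XML, "employee", COLUMNS)

# Write a header plus the rows as comma-delimited text (in memory here).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(COLUMNS)
writer.writerows(rows)
output = buffer.getvalue()
print(output)
```

Comparing a small file converted this way against the job's target file is one quick way to check that the column mapping in the XML Input stage is picking up the intended elements.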



2 comments:

  1. Hi. The information is very helpful, but the process figures are not visible.

  2. The screenshots are not visible; please make them visible.
