XML Processing

By Exforsys | on July 16, 2007 |

XML document processing is described by a large and still growing set of specifications. Many applications depend on these specifications to work with XML (Extensible Markup Language). Between them, the specifications list the requirements for an XML processing model as well as for the XML language itself. They are written mostly at a conceptual level and describe interactions in terms of the language rather than in terms of any particular program.


XML documents are treated as sets of information items, usually called information sets or infosets. The specifications describe processes that construct new infosets, inspect them, modify them, or extract information from pre-existing ones. The processing model has to be described in terms of the infoset, but the concrete object models that applications work with are not the infoset itself. Applications use a DOM object model, a SAX event stream, or some other representation of the infoset.

Requirements of the XML processing model

The language should address the concerns related to interoperability. The language should be simple and easy to operate within the XML processing model. The language should be able to specify the inputs, the outputs, and all the required parameters of the document. For the sake of interoperability, the language should define mandatory processing options for input and mandatory error reporting options. The language should be capable of specifying the documents and the set of components separately. The language should be easy to implement, yet rich enough to express operations that an implementation can optimize.

The XML processing model should be extensible, so that applications can define new functions and place them in the pipeline. The model should have a plan for error handling and fallback scenarios. The model should be able to select different components at run time and should allow conditional processing. Information exchange between the components should take place in a standardized way. The language should be able to use XML tools for manipulating the data, so the data itself should essentially be XML.
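Several of these requirements (pluggable steps, conditional processing, and an error fallback) can be illustrated with a small sketch. This is not the pipeline language the requirements describe, only a hypothetical Python stand-in built on the standard xml.etree.ElementTree module. The names run_pipeline, drop_empty, uppercase_names, and the sample document are all invented for illustration.

```python
import xml.etree.ElementTree as ET


def drop_empty(root):
    """Step: remove direct children that have no text and no children."""
    for child in list(root):
        if len(child) == 0 and not (child.text or "").strip():
            root.remove(child)
    return root


def uppercase_names(root):
    """Step: upper-case the text of every <name> element."""
    for element in root.iter("name"):
        element.text = (element.text or "").upper()
    return root


def run_pipeline(xml_text, steps, fallback=None):
    """Parse the input, apply each step in order, serialize the result.

    Each entry in steps is a (condition, function) pair; the function
    runs only when condition(root) is true. If the input cannot be
    parsed, the fallback string is returned when one is given.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        if fallback is not None:
            return fallback
        raise
    for condition, step in steps:
        if condition(root):
            root = step(root)
    return ET.tostring(root, encoding="unicode")


def always(root):
    return True


def has_names(root):
    return root.find(".//name") is not None


# Data stays XML between steps; new steps are added by extending this list.
STEPS = [(always, drop_empty), (has_names, uppercase_names)]

if __name__ == "__main__":
    print(run_pipeline("<people><name>ada</name><note/></people>", STEPS))
    print(run_pipeline("<broken", STEPS, fallback="<error/>"))
```

Every step takes and returns an element tree, which is one simple way to keep the exchange between components uniform.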

Processing XML with Java

An XML document can be represented as a tree of objects, using the standard APIs defined by the World Wide Web Consortium's Document Object Model (DOM) specifications, or as a series of events, using SAX. The standard API for Java XML parsers is JAXP, and JAXP 1.1 was expanded to include an API for XSLT engines as well. That API is called TrAX, the Transformation API for XML. It is a powerful API once you understand how it is used and what its top-level interfaces do.

Uses of TrAX

XML transformation is covered by the TrAX API, which extends the original work of JAXP with a vendor-neutral, standard Java API for specifying and carrying out XML transformations. TrAX is more than an API for XSLT engines: its main use is as a general-purpose interface for transforming XML documents. TrAX does not compete with the Document Object Model, Java-specific document object models, or SAX. It is an API that represents XML transformation methods and bridges these different representations. It works with SAX events and with templates from XSLT, and it relies to a great extent on SAX2 and DOM and their parsers. TrAX provides essentially the same functionality as the XSLT engines behind it, but the engine and parser can be swapped by changing properties.

In naive code, the XSLT stylesheet is reprocessed for every successive transformation. A common scenario is that the same transformation is applied repeatedly to different sources, possibly in different threads. A better approach is to process the stylesheet only once and keep the processed form for the later transformation cycles, which saves a lot of time. The TrAX Templates interface makes this possible.

A Templates instance is the run-time representation of the processed transformation instructions, and a Transformer created from it carries out the actual transformation. To improve throughput and performance, Templates instances can be saved and reused, and they are thread safe. The plural name of the interface follows from the structure of XSLT: a stylesheet is a collection of one or more template elements, and each transformation rule is defined by a template element within that stylesheet. The simplest available name, Templates, was therefore chosen to represent the collection.

XML Processing in Python

SAX (the Simple API for XML) and DOM (the Document Object Model) are the two basic and most popular ways to work with XML. SAX reads the XML a little at a time and calls a handler method for each element it finds. This is similar to the way event-driven HTML parsers work, calling out elements as they are found in the document. DOM reads the entire document first, creates Python objects for its contents, and links them into a tree structure. The drawback is that for a huge XML document, DOM spends a lot of time scanning the whole document and creating objects, and it needs a lot of memory to hold the resulting tree. Python has its own standard modules for parsing XML documents.
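The difference between the two styles can be seen in a short sketch that counts elements both ways with Python's standard xml.sax and xml.dom.minidom modules. The sample document and the names BookCounter, count_books_sax, and count_books_dom are made up for this illustration.

```python
import xml.sax
from xml.dom import minidom

XML_TEXT = "<library><book>Dune</book><book>Emma</book></library>"


class BookCounter(xml.sax.ContentHandler):
    """SAX style: the parser calls startElement once per start tag."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "book":
            self.count += 1


def count_books_sax(text):
    # The document is streamed through the handler; no tree is kept.
    handler = BookCounter()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.count


def count_books_dom(text):
    # The whole document is parsed into a tree before any lookup.
    document = minidom.parseString(text)
    return len(document.getElementsByTagName("book"))


if __name__ == "__main__":
    print(count_books_sax(XML_TEXT), count_books_dom(XML_TEXT))
```

Both functions return the same count. The SAX version keeps only a counter in memory, while the DOM version holds the full tree, which is where the memory cost for large documents comes from.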

Parsing XML using DOM level 2

The Document Object Model represents all the data in an XML document as a tree structure. Java programs can manipulate this tree easily, because DOM is designed to be simple for programs to use. You can use the tree to modify data and to extract data when needed. However, DOM parses the whole document, not parts of it as SAX does. If you do not need the entire document, parsing all of it wastes both time and memory. When you have large XML documents and need only a small portion of them, it makes sense to use SAX. Parsing XML data with DOM involves two major tasks: converting the XML data into a DOM tree, and then looking through the tree for the data that is useful to you. In Java, the parser to use can be specified explicitly. If no parser is specified, a default is used, typically the Apache Xerces parser or one derived from it.
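The two tasks can be sketched as follows. To keep one language across the examples in this article, the sketch uses Python's standard xml.dom.minidom module rather than a Java parser; the DOM method names are the same ones the DOM specification defines. The order document and the helper names item_names and add_item are invented for illustration.

```python
from xml.dom import minidom

XML_TEXT = """<order id="42">
  <item sku="A1">Widget</item>
  <item sku="B2">Gadget</item>
</order>"""


def item_names(document):
    """Task 2: walk the tree and pull out the data of interest."""
    names = []
    for item in document.getElementsByTagName("item"):
        # The text of an element is held in its child text node.
        names.append(item.firstChild.data)
    return names


def add_item(document, sku, name):
    """Modify the tree in place using the DOM factory methods."""
    item = document.createElement("item")
    item.setAttribute("sku", sku)
    item.appendChild(document.createTextNode(name))
    document.documentElement.appendChild(item)


# Task 1: convert the XML text into a DOM tree.
document = minidom.parseString(XML_TEXT)

if __name__ == "__main__":
    print(item_names(document))
    add_item(document, "C3", "Gizmo")
    print(item_names(document))
```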

Parsing in SAX

Like DOM parsing, SAX parsing involves two major tasks. One is to create a content handler, and the other is to invoke the parser and direct it to that content handler. Some steps have to be followed along the way. First, tell the system which parser to use and create an instance of it. Then create a content handler that will respond to the parser. The handler should define what happens at the start of the document, the end of the document, the start of an element, and the end of an element, and it should also deal with character data and with whitespace that can be ignored. Finally, invoke the parser with the designated content handler. If this last step is skipped, no SAX processing happens at all.


The start element event is reported for each start tag in the document. If a start tag is missing or malformed, no start element event is reported for it and the document will not be recognized, so this is the first place to check when parsing fails. The end element event is reported for each end tag. In a handler that prints an indented outline, it subtracts two from the indentation and then prints a message. The characters event delivers the tag body. In the same handler it is used to print the first word of the tag body, and it does not change the indentation.
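A handler along these lines can be sketched with Python's standard xml.sax module, used here in place of a Java parser to keep one language across the examples. The class name PrintHandler, the function outline, and the message texts are assumptions made for this sketch. The handler collects its lines in a list instead of printing them directly, so the result is easy to inspect.

```python
import io
import xml.sax


class PrintHandler(xml.sax.ContentHandler):
    """Builds an indented outline of a document from SAX events."""

    def __init__(self):
        super().__init__()
        self.indentation = 0
        self.lines = []

    def startDocument(self):
        self.lines.append("Start of document")

    def endDocument(self):
        self.lines.append("End of document")

    def startElement(self, name, attrs):
        self.lines.append(" " * self.indentation + "Start tag: " + name)
        self.indentation += 2

    def endElement(self, name):
        # Subtract two from the indentation, then report the end tag.
        self.indentation -= 2
        self.lines.append(" " * self.indentation + "End tag: " + name)

    def characters(self, content):
        # Report the first word of this chunk of the tag body; the
        # indentation is unchanged. A parser may deliver one body in
        # several chunks, so long bodies can trigger this more than once.
        words = content.split()
        if words:
            self.lines.append(" " * self.indentation + words[0])

    def ignorableWhitespace(self, whitespace):
        pass  # whitespace the parser marks as ignorable is skipped


def outline(xml_text):
    handler = PrintHandler()            # create the content handler
    parser = xml.sax.make_parser()      # create a parser instance
    parser.setContentHandler(handler)   # designate the handler
    # Invoke the parser; without this call nothing is processed.
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.lines


if __name__ == "__main__":
    print("\n".join(outline("<a><b>hello world</b></a>")))
```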

Author Description

The Editorial Team at Exforsys is an IT consulting and training team led by Chandra Vennapoosa.
