XML Parsing

By Exforsys | on July 14, 2007 |

Because XML is so widely adopted, parsing XML documents efficiently is critical. This matters most in web programming, especially for applications that must handle large volumes of data. Improper parsing increases memory usage and processing time, which directly reduces scalability.

Many XML parsers are available, and choosing the right one for your situation can be challenging. This article describes three popular XML parsing techniques for Java and guides you in choosing the right method based on your application and its requirements.

An Extensible Markup Language (XML) parser takes a raw serialized string as input and performs a series of operations on it. First, the XML data is checked for syntax errors and well-formedness: every start tag must have a matching end tag, and elements must not overlap. Many parsers can also validate the document against a Document Type Definition (DTD) or an XML Schema to verify that its structure and content are as you specified. Finally, the parser provides access to the XML document's content through its programming APIs.
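
The well-formedness check described above can be illustrated with a minimal sketch. It assumes the JAXP DOM parser that ships with the JDK; the class name WellFormedCheck and the sample strings are hypothetical and only serve the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class WellFormedCheck {
    // Returns true if the parser accepts the string as well-formed XML.
    static boolean isWellFormed(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors instead of printing them to stderr.
        builder.setErrorHandler(new DefaultHandler());
        try {
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Thrown for mismatched or overlapping tags and other syntax errors.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(isWellFormed("<a><b>text</b></a>")); // prints true
        System.out.println(isWellFormed("<a><b>text</a></b>")); // prints false (overlapping elements)
    }
}
```

Note that this only checks well-formedness. Validation against a DTD or XML Schema is a separate step that has to be enabled on the parser factory.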

The three popular XML parsing techniques for Java are the Document Object Model (DOM), a mature standard from the W3C; the Simple API for XML (SAX), one of the first widely adopted XML APIs for Java, which has become a de facto standard; and the Streaming API for XML (StAX), a newer parsing model that is very efficient and has a promising future. Each technique has its advantages and disadvantages.

Parsing with DOM

The Document Object Model (DOM) technique is based on tree-structure parsing: it builds an entire parse tree in memory. This gives the application complete, dynamic access to the whole XML document.

The DOM is a tree-like structure. The Document node is the root of every DOM tree. This root has at least one child node, the root element, which is the catalog element in the sample code. Another node is the DocumentType node, which represents the Document Type Definition (DTD) declaration. The catalog element has child nodes of its own, and these child nodes are elements.

The DOM program takes an XML filename and creates the DOM tree. It calls getElementsByTagName() to find all the DOM element nodes that are title elements, and then prints the text associated with each one. To do this, it walks the list of title elements and examines the first child of each, obtained with the getFirstChild() method. That first child is the text located between the element's start and end tags.
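
The original sample program is not reproduced here, so the following is only a sketch of the steps just described. It assumes the JDK's JAXP DOM parser and, to stay self-contained, parses an inline string instead of a file; the class name DomTitles and the catalog content are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name; not the article's original sample.
class DomTitles {
    // Builds the DOM tree, then collects the text of every <title> element.
    static List<String> titles(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList titleNodes = doc.getElementsByTagName("title");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < titleNodes.getLength(); i++) {
            // The first child of <title> is the text node between its tags.
            Node text = titleNodes.item(i).getFirstChild();
            if (text != null) {
                result.add(text.getNodeValue());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample document with a catalog root element.
        String xml = "<catalog>"
                + "<book><title>XML Basics</title></book>"
                + "<book><title>Parsing in Java</title></book>"
                + "</catalog>";
        System.out.println(titles(xml)); // prints [XML Basics, Parsing in Java]
    }
}
```

To read from a file as the text describes, builder.parse(new java.io.File(filename)) could be used in place of the InputSource.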

The DOM is a direct and straightforward model. Because the entire tree is held in memory, the XML document can be accessed randomly at any time. The DOM APIs can also modify the tree, for example by appending a child, or by updating or removing a node. The DOM offers a lot of support for navigating the in-memory tree, but there are parsing issues to consider. The entire document has to be parsed in one go; it cannot be parsed partially or in intervals. If the XML document is huge, building the whole tree in memory becomes an expensive process, and the DOM tree can consume a lot of memory. Interoperability is the DOM's biggest strength, but it is not well suited to object binding, which is its main drawback in that area.
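
As a rough illustration of modifying the in-memory tree, the sketch below appends one node and removes another using standard org.w3c.dom calls. The class name DomEdit, the book elements and the id attribute are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical class name and document layout, for illustration only.
class DomEdit {
    // Appends a new <book id="new"/> to the root, removes the root's first
    // child, and returns the ids of the <book> elements that remain.
    static List<String> appendAndRemove(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();

        // Append a child: new nodes are created through the owning Document.
        Element added = doc.createElement("book");
        added.setAttribute("id", "new");
        root.appendChild(added);

        // Remove a node. The input here has no whitespace between tags;
        // otherwise the first child could be a whitespace text node.
        root.removeChild(root.getFirstChild());

        NodeList books = root.getElementsByTagName("book");
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < books.getLength(); i++) {
            ids.add(((Element) books.item(i)).getAttribute("id"));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><book id=\"a\"/><book id=\"b\"/></catalog>";
        System.out.println(appendAndRemove(xml)); // prints [b, new]
    }
}
```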

Many applications are well suited to DOM parsing. If an application needs immediate, random access to the XML document, DOM parsing is appropriate. For example, an Extensible Stylesheet Language (XSL) processor has to navigate through an entire file repeatedly while it processes templates. Because the DOM can be updated dynamically, it is also convenient for applications that frequently modify data, such as XML editors.

Parsing with SAX

The SAX processing model is event-driven: it is based entirely on a stream of events. Although it is not a W3C standard, it is a very well-known API, and many parsers implement it in a compliant way. Unlike DOM, which builds an entire tree to represent the data, a SAX parser emits a series of events as it reads the document. These events are forwarded to event handlers, which provide access to the document's data. There are three basic types of event handler: the DTDHandler, for accessing the contents of XML DTDs; the ErrorHandler, for low-level access to parsing errors; and, last but not least, the ContentHandler, for accessing the content of the document.
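
A minimal sketch of the event stream, assuming the JDK's JAXP SAX parser: DefaultHandler is a convenience base class that implements ContentHandler, ErrorHandler and DTDHandler with empty methods, so only the callbacks of interest are overridden. The class name SaxEvents and the sample input are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, for illustration only.
class SaxEvents {
    // Records the content events the parser pushes while reading the document.
    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // A parser may deliver text in several chunks; this sketch logs each one.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        // Prints one line per event, in document order.
        for (String event : events("<catalog><title>XML Basics</title></catalog>")) {
            System.out.println(event);
        }
    }
}
```

No tree is built at any point: once a callback returns, the parser keeps nothing about that part of the document.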

Compared with DOM, SAX offers a great benefit in terms of performance: it provides efficient, low-level access to the XML document's contents. Its major advantage is extremely low memory consumption, mainly because the document never has to be loaded into memory in its entirety. This enables a SAX parser to parse a document that is much larger than the system's memory. In addition, you do not need to create an object for each and every node, as you do with DOM. Finally, the SAX "push" model can be used in a broadcast fashion, with multiple content handlers registered to receive events in parallel, instead of receiving them one by one in a pipeline.

One disadvantage of SAX is that you have to implement the event handlers that deal with all incoming events, and you must maintain the state of those events in your application code. Because a SAX parser does not convey structural information such as the parent/child relationships that DOM provides, you also have to keep track of the parser's position in the document hierarchy. The application logic gets harder as the document grows larger and more complex. And although the whole document need not be loaded into memory, a SAX parser still has to parse the whole document, just as DOM does.
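
One common way to keep track of the parser's position is a stack of open element names maintained by the handler itself. The sketch below is one possible approach under that assumption, not a prescribed pattern; the class SaxPathTracker and its findPaths method are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler that tracks its own position in the hierarchy.
class SaxPathTracker extends DefaultHandler {
    private final Deque<String> openElements = new ArrayDeque<>();
    private final List<String> matches = new ArrayList<>();
    private final String target;

    SaxPathTracker(String target) {
        this.target = target;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // SAX gives no parent information, so the handler records it.
        openElements.addLast(qName);
        if (target.equals(qName)) {
            matches.add("/" + String.join("/", openElements));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        openElements.removeLast();
    }

    // Returns the path from the root to every element named target.
    static List<String> findPaths(String xml, String target) throws Exception {
        SaxPathTracker tracker = new SaxPathTracker(target);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), tracker);
        return tracker.matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><book><title>A</title></book><title>B</title></catalog>";
        System.out.println(findPaths(xml, "title")); // prints [/catalog/book/title, /catalog/title]
    }
}
```

This bookkeeping is what DOM provides for free through its parent and child links, which is why the application logic grows with the complexity of the document.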

One of the biggest problems with SAX is that it lacks built-in support for document navigation like that provided by XPath. Combined with this, its one-pass parsing limits support for random access. These limitations can also complicate namespace handling. Such shortcomings make SAX a poor choice for manipulating or modifying an XML document.

Applications that can read a document's content in a single pass benefit greatly from SAX parsing. Many business-to-business (B2B) portals and applications use XML to encapsulate data in a format that is simple to send and receive. This is the scenario where SAX can win hands down over DOM, purely because its efficiency results in high throughput. SAX 2.0 also has a built-in filtering mechanism that makes it easy to output a subset of the document. SAX parsing is also considered very useful for validating documents against DTDs and XML Schemas.

Parsing with StAX

StAX is a newer parsing technique that is very similar to SAX and improves on it. Like SAX, StAX uses an event-driven model. The difference is that SAX uses a push model for event processing, while StAX uses a pull model. Instead of using callbacks, a StAX parser returns events as the application requests them.
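
The pull model can be sketched with the javax.xml.stream cursor API that ships with the JDK. Here the application drives the loop and asks for the next event itself; the class name StaxTitles and the sample document are assumptions made for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name, for illustration only.
class StaxTitles {
    // Pulls events one at a time and collects the text of <title> elements.
    static List<String> titles(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> result = new ArrayList<>();
        try {
            // The application, not the parser, decides when to advance.
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the text content and moves to the matching end tag.
                    result.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog><title>XML Basics</title><title>Parsing in Java</title></catalog>";
        System.out.println(titles(xml)); // prints [XML Basics, Parsing in Java]
    }
}
```

Because the loop belongs to the application, it can stop early, skip ahead, or read from several documents in turn, none of which a callback-based handler can do as directly.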

Author Description

The Editorial Team at Exforsys is an IT consulting and training team led by Chandra Vennapoosa.
