Title: Using the XPath notation to locate data in an XML document.
Summary: XPath is like the fully qualified path to a file in a directory, except that it points to a piece of data in an XML document. It is very useful for extracting data with predefined descriptors in the document.
Contributor: John McTainsh
Published: 8-Jun-2001
Last updated: 8-Jun-2001
Description.
As XML has become more popular, so has the need to extract information from it
in a simple way. One simple way to extract data from an XML document is to use
the XPath notation. With XPath, an element is referenced much as you would
reference a file in a subdirectory. Consider the following XML.
<People>
  <SillyPeople>
    <MyPerson Name="Mr Wal Mart">This is a very rich man.</MyPerson>
    <MyPerson Name="Tod">Not so rich</MyPerson>
  </SillyPeople>
  <BrilliantPeople>
    <MyPerson Name="John McTainsh">This is a very cool person.</MyPerson>
  </BrilliantPeople>
</People>
In the above XML, Tod's data is located by the XPath //People/SillyPeople/MyPerson[@Name='Tod'].
It looks a bit cryptic to start with, but with time it will grow on you. The code
presented here shows how to extract data using this method and points out a few
traps for young players.
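To make the directory analogy concrete, here is a toy sketch in plain C++ that walks a hand-built tree one path step at a time. It is not how MSXML evaluates XPath. Node, Step, MakeStep, Select and SamplePeople are hypothetical names invented for illustration, only child steps with an optional attribute test are handled, and element text is left out.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy element: a tag name, its attributes, and its child elements.
struct Node {
    std::string tag;
    std::map<std::string, std::string> attrs;
    std::vector<Node> children;
};

// One path step: a tag name plus an optional [@attr='value'] test.
struct Step {
    std::string tag;
    std::string attr;   // empty means "no attribute test"
    std::string value;
};

Step MakeStep(const std::string& tag, const std::string& attr = "",
              const std::string& value = "") {
    Step s;
    s.tag = tag;
    s.attr = attr;
    s.value = value;
    return s;
}

// True when the node satisfies the step's tag and attribute test.
static bool Matches(const Node& n, const Step& s) {
    if (n.tag != s.tag) return false;
    if (s.attr.empty()) return true;
    std::map<std::string, std::string>::const_iterator it = n.attrs.find(s.attr);
    return it != n.attrs.end() && it->second == s.value;
}

// Descend one level per step, like opening one folder per path component.
static void Walk(const Node& n, const std::vector<Step>& steps,
                 std::size_t depth, std::vector<const Node*>& out) {
    if (!Matches(n, steps[depth])) return;
    if (depth + 1 == steps.size()) { out.push_back(&n); return; }
    for (std::size_t i = 0; i < n.children.size(); ++i)
        Walk(n.children[i], steps, depth + 1, out);
}

// The first step is matched against the root element itself.
std::vector<const Node*> Select(const Node& root, const std::vector<Step>& steps) {
    std::vector<const Node*> out;
    if (!steps.empty()) Walk(root, steps, 0, out);
    return out;
}

static Node Person(const std::string& name) {
    Node n;
    n.tag = "MyPerson";
    n.attrs["Name"] = name;
    return n;
}

// Builds the People document from the text above, minus element text.
Node SamplePeople() {
    Node silly;
    silly.tag = "SillyPeople";
    silly.children.push_back(Person("Mr Wal Mart"));
    silly.children.push_back(Person("Tod"));
    Node brilliant;
    brilliant.tag = "BrilliantPeople";
    brilliant.children.push_back(Person("John McTainsh"));
    Node people;
    people.tag = "People";
    people.children.push_back(silly);
    people.children.push_back(brilliant);
    return people;
}
```

With the steps People, SillyPeople and MyPerson[@Name='Tod'], Select returns the single matching element. Dropping the attribute test from the last step returns both people under SillyPeople, which is why a query can yield more than one node.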
Where to find out about XPath.
XPath stands for XML Path Language. It is defined by the W3C at http://www.w3.org/TR/xpath.
What we need to get started.
Microsoft provides a COM component, the Microsoft XML Parser (MSXML), for creating
and parsing XML. To use it, add the following code to stdafx.h. It is very
important to note that we are using MSXML Parser 3 and the MSXML2 namespace.
#import "msxml3.dll"
using namespace MSXML2;
MSXML Parser 3.0 can be downloaded from Microsoft for free.
Because we are working with COM, we will also need to call CoInitialize(NULL);
at start-up and CoUninitialize(); at shutdown.
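One way to keep the CoInitialize/CoUninitialize pair balanced is a small RAII guard. The sketch below is illustrative only: ComInit is a hypothetical helper name, not part of MSXML or MFC, and the non-Windows branch holds stand-in stubs whose only purpose is to let the snippet compile elsewhere.

```cpp
#include <cstddef>

#ifdef _WIN32
#include <objbase.h>   // CoInitialize, CoUninitialize, SUCCEEDED
#else
// Stand-ins so the sketch compiles off Windows. These are NOT the real COM calls.
typedef long HRESULT;
inline HRESULT CoInitialize(void*) { return 0; }
inline void CoUninitialize() {}
#define SUCCEEDED(hr) ((hr) >= 0)
#endif

// Hypothetical helper: initializes COM on construction and
// uninitializes it on destruction, but only if initialization succeeded.
class ComInit {
public:
    ComInit() : m_hr(CoInitialize(NULL)) {}
    ~ComInit() { if (SUCCEEDED(m_hr)) CoUninitialize(); }

    // True when COM is usable on this thread.
    bool ok() const { return SUCCEEDED(m_hr); }

private:
    ComInit(const ComInit&);             // not copyable
    ComInit& operator=(const ComInit&);  // not assignable
    HRESULT m_hr;
};
```

If the guard is declared before any MSXML smart pointers in the same scope, it is destroyed after them, so the pointers release their interfaces before COM shuts down.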
Extracting the data.
The following code segment extracts the data at //People/SillyPeople/MyPerson[@Name='Tod']
with the steps below. As shipped, it runs the broader //SillyPeople/MyPerson query;
the Tod-specific expression is left commented out beside it.
- Create an XML Document 2 smart pointer.
- Load it with an XML document. Note: Async is false.
- Set the selection language to XPath.
- Request a list of nodes that match the XPath search criteria. This
may be more than one item.
- Iterate through the items, displaying the data.
try
{
    // Create the XML document through an IXMLDOMDocument2 smart pointer
    IXMLDOMDocument2Ptr pXMLDoc(__uuidof(MSXML2::DOMDocument));

    // Load the XML from a string. With async off, loadXML returns only
    // once parsing has finished.
    pXMLDoc->put_async(VARIANT_FALSE);

    // Trap: use VERIFY, not ASSERT. ASSERT is compiled out of release
    // builds, which would skip the loadXML call altogether.
    // Each literal gets its own _T() so the string also builds as Unicode.
    VERIFY( pXMLDoc->loadXML(
        _T("<People>")
        _T(" <SillyPeople>")
        _T(" <MyPerson Name=\"Mr Wal Mart\">Rich man.</MyPerson>")
        _T(" <MyPerson Name=\"Tod\">Is a clown.</MyPerson>")
        _T(" </SillyPeople>")
        _T(" <BrilliantPeople>")
        _T(" <MyPerson Name=\"John McTainsh\">Is cool.</MyPerson> ")
        _T(" </BrilliantPeople>")
        _T("</People>") ) );

    // Very important to set the language: MSXML 3 defaults to the
    // older XSLPattern syntax rather than XPath.
    pXMLDoc->setProperty( _T("SelectionLanguage"), _T("XPath") );

    // Get the list of items we are looking for.
    // Swap in the commented-out expression to select Tod's element only.
    //bstr_t bsLookFor( _T("//People/SillyPeople/MyPerson[@Name='Tod']") );
    bstr_t bsLookFor( _T("//SillyPeople/MyPerson") );
    IXMLDOMNodeListPtr pNodeList = pXMLDoc->documentElement->selectNodes( bsLookFor );
    int nList = pNodeList->length;
    TRACE( _T("Looking for = %s\n"), (LPTSTR)bsLookFor );
    TRACE( _T("Found %d item(s)\n"), nList );

    // Iterate through each item found
    for( int n = 0; n < nList; n++ )
    {
        IXMLDOMNodePtr pNode = pNodeList->item[n];
        bstr_t bsNodeText = pNode->text;
        TRACE( _T("Text = %s\n"), (LPTSTR)bsNodeText );
    }
}
catch(_com_error &e)
{
    // Display any COM error in a MessageBox.
    bstr_t bstrSource(e.Source());
    bstr_t bstrDescription(e.Description());
    CString sErr, sOutMessage;
    sErr.Format( _T("Code = 0x%08lx\n"), e.Error());
    sOutMessage += sErr;
    sErr.Format( _T("Code meaning = %s\n"), e.ErrorMessage());
    sOutMessage += sErr;
    sErr.Format( _T("IErrorInfo.Source = %s\n"), (LPTSTR)bstrSource );
    sOutMessage += sErr;
    sErr.Format( _T("IErrorInfo.Description = %s"), (LPTSTR)bstrDescription );
    sOutMessage += sErr;
    AfxMessageBox( sOutMessage );
}
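The commented-out expression in the code above hard-codes the name Tod. If the name comes from a variable, the expression has to be assembled at run time, and one more trap appears: XPath 1.0 string literals have no escape character, so a name containing an apostrophe must be wrapped in double quotes instead. The following is a minimal sketch of that idea. QuoteXPathLiteral and PersonXPath are hypothetical helper names, and values containing both kinds of quote are simply rejected here rather than handled with concat().

```cpp
#include <string>

// Quote a value as an XPath 1.0 string literal by picking whichever
// quote character the value does not contain.
// Returns an empty string if the value contains both kinds of quote.
std::string QuoteXPathLiteral(const std::string& value) {
    if (value.find('\'') == std::string::npos) return "'" + value + "'";
    if (value.find('"') == std::string::npos) return "\"" + value + "\"";
    return std::string();
}

// Hypothetical helper: build the path to one MyPerson under SillyPeople.
// Returns an empty string if the name cannot be quoted by this sketch.
std::string PersonXPath(const std::string& name) {
    const std::string literal = QuoteXPathLiteral(name);
    if (literal.empty()) return std::string();
    return "//People/SillyPeople/MyPerson[@Name=" + literal + "]";
}
```

The result can be handed to selectNodes through a bstr_t, for example bstr_t bsLookFor( PersonXPath("Tod").c_str() );.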