Excel Function: FILTERXML
Purpose
The FILTERXML function extracts data from an XML document. The data to be extracted are specified by the xpath argument.
Invocations of the FILTERXML function take the form:
FILTERXML(xml,xpath)
Limitations
- FILTERXML does not work on either Apple Macs, or Excel for the Web.
- FILTERXML supports XPath 1.0 expressions only.
Arguments
Argument | Argument type | Description |
---|
xml | Mandatory | An XML document from which data will be extracted. |
xpath | Mandatory | An XPath expression used to locate the data to be extracted. |
Related Functions
Function | Example | Purpose |
---|
WEBSERVICE | FILTERXML(WEBSERVICE(url),xpath) | Call a web service whose endpoint is at some URL, then extract data from the response. |
Error Conditions
The FILTERXML function will respond with an error value if any of the following conditions are met.
Error value | Condition type | Condition |
---|
#VALUE! | Bad argument | The document supplied to the xml argument was not valid XML. |
#VALUE! | Bad argument | The document supplied to the xml argument contained a namespace with an invalid prefix. |
#VALUE! | Bad argument | No matches to the expression supplied to the xpath argument were found. |
#VALUE! | Bad argument | An element was found and an attempt was made to extract its value, but it had none. |
Notes
- If FILTERXML extracts two or more values then they will be spilled into the cells below the one containing the formula.
- The 2nd and 3rd values spilled into cells D3 & D4 respectively.
Examples: Traversing & Searching XML
- The “<xml>” symbol represents a cell containing the contents of the XML column.
- Examples 3, 6, 7, 9 & 11 extract more than one value. Each value is on a different row in the ‘result‘ column.
- Examples 1 – 5 demonstrate how to extract data from XML by declaring an absolute path to the elements containing them.
- Examples 6 – 10 demonstrate how to extract data from XML by searching for the elements containing them.
- Example 11 shows how to extract the values of element’s attributes.
Example | Description | XML | Formula | Result |
---|
1 | Extract the root element’s value. | <a>a-node-value <b>b-node-value</b> </a> | =FILTERXML(<xml>,"/a") | a-node-value |
2 | Extract a child element’s value.- “/a/b” isolates <b> elements which are children of <a> elements.
- One <b> element matches the search criteria.
- Its value is returned.
| <a>a-node-value <b>b-node-value</b> </a> | =FILTERXML(<xml>,"/a/b") | b-node-value |
3 | Extract the values from 2 child elements.- “/a/b” isolates <b> elements which are children of <a> elements.
- Two <b> elements match the search criteria.
- Their values are returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> </a> | =FILTERXML(<xml>,"/a/b") | b1-node-value b2-node-value |
4 | Extract the values of elements based on an attribute’s value.- “/a/b[@id=’2′]” isolates <b> elements
- which are children of <a> elements.
- whose id attribute has a value of “2”.
- One <b> element matches the search criteria.
- Its value is returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> </a> | =FILTERXML(<xml>,"/a/b[@id='2']") | b2-node-value |
5 | Select an element and extract its parent’s value.- “/a/b/..”
- isolates <b> elements which are children of <a> elements.
- gets the values of those element’s parents.
- One <a> element matches the search criteria.
- Its value is returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> </a> | =FILTERXML(<xml>,"/a/b/..") | a-node-value |
6 | Select elements and extract their parent’s values.- “//b/..”
- isolates all <b> elements in the XML.
- gets the values of those element’s parents.
- Two elements match the search criteria.
- Their values are returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c>c-node-value <b id="3">b3-node-value</b> <b id="4">b4-node-value</b> </c> </a> | =FILTERXML(<xml>,"//b/..") | a-node-value c-node-value |
7 | Extract the values from ‘b’ elements.- “//b” isolates all <b> elements in the XML.
- Three elements match the search criteria.
- Their values are returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c> <b id="3">b3-node-value</b> </c> </a> | =FILTERXML(<xml>,"//b") | b1-node-value b2-node-value b3-node-value |
8 | Filter the values extracted in example 7 based on attribute values.- “//b[@id=’3′]”
- isolates all <b> elements in the XML.
- retains only those whose id attribute has a value of “3”.
- One element matches the search criteria.
- Its value is returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c> <b id="3">b3-node-value</b> </c> </a> | =FILTERXML(<xml>,"//b[@id='3']") | b3-node-value
|
9 | Extract all values from ‘b’ elements, where their id ≥ 2.- “//b[@id>=2]”
- isolates all <b> elements in the XML.
- retains only those whose id attribute has a value ≥ “2”.
- Two elements match the search criteria.
- Their values are returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c> <b id="3">b3-node-value</b> </c> </a> | =FILTERXML(<xml>,"//b[@id>=2]") | b2-node-value b3-node-value |
10 | Extract values from ‘b’ elements that are children of ‘c’ elements.- “//c/child::b”
- isolates all elements that are children of <c> elements.
- retains only <b> elements.
- One element matches the search criteria.
- Its value is returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c> <b id="3">b3-node-value</b> </c> </a> | =FILTERXML(<xml>,"//c/child::b") | b3-node-value |
11 | Extract values of attributes of type ‘id’.- “//@id” isolates the values of all “id” attributes.
- Three elements contain attributes which match the search criteria.
- Their values are returned.
| <a>a-node-value <b id="1">b1-node-value</b> <b id="2">b2-node-value</b> <c> <b id="3">b3-node-value</b> </c> </a> | =FILTERXML(<xml>,"//@id") | 1 2 3 |