This quick tutorial shows how to use PHP’s DOMDocument class to parse XML, so you do not have to use the XML Parser extension. You’ll see how to loop through an XML file and how to extract specific data. For this, we will use a sample XML file that is available on w3schools.com.
Load Document
The first thing we have to do is create an instance of the DOMDocument class.
$dom = new DOMDocument();
Now that we have an instance, we can load the document. To do that we’ll use the load method, passing the path to our XML file as its argument.
$dom->load('http://www.w3schools.com/XML/simple.xml');
Now our document is loaded and we can work with it however we want.
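Loading can fail, for example when the file is unreachable or the XML is not well-formed, so it can be worth checking the return value. Below is a minimal sketch of one way to do that. The helper name parseXmlOrNull and the inline sample are assumptions made for illustration (they are not part of the original tutorial), and the sketch uses loadXML on a string so that it runs without network access; load behaves the same way for a path or URL.

```php
<?php
// Hypothetical inline sample, so the sketch runs without network access.
$sampleXml = <<<'XML'
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
  </food>
</breakfast_menu>
XML;

/**
 * Hypothetical helper: parse an XML string and return the document,
 * or null if the input is empty or not well-formed.
 */
function parseXmlOrNull(string $xml): ?DOMDocument
{
    if ($xml === '') {
        return null; // loadXML() rejects an empty string
    }

    // Collect parse errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // loadXML() parses a string; load() takes a path or URL instead.
    // Both return false when parsing fails.
    $ok = $dom->loadXML($xml);

    foreach (libxml_get_errors() as $error) {
        error_log(trim($error->message) . ' on line ' . $error->line);
    }
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}

$dom = parseXmlOrNull($sampleXml);
echo $dom === null
    ? "Could not parse XML\n"
    : "Loaded <{$dom->documentElement->nodeName}>\n";
```

Returning null keeps the calling code simple; throwing an exception would be an equally reasonable choice.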
Loop Through XML
Now we will loop through our XML file. Let’s say we want to print the data from the food elements. To do that we have to select them first. We’ll do that by calling the getElementsByTagName method and passing the name of our element (food).
$food = $dom->getElementsByTagName('food');
If you call var_dump on the $food variable, you’ll see that it is an instance of the DOMNodeList class. It has an item method and a length property, so you can loop over it by index, or use foreach to do the job. Each element returned by the query is a DOMElement, on which you can run new queries or make modifications. We’ll just loop.
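For comparison, here is a small sketch of the index-based variant, using length and item instead of foreach. The inline XML is a hypothetical two-item sample standing in for the w3schools file.

```php
<?php
// Hypothetical inline sample standing in for the remote file.
$xml = <<<'XML'
<breakfast_menu>
  <food><name>Belgian Waffles</name></food>
  <food><name>French Toast</name></food>
</breakfast_menu>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$food = $dom->getElementsByTagName('food');

$names = [];
// length holds the number of nodes; item() is zero-based.
for ($i = 0; $i < $food->length; $i++) {
    $elem = $food->item($i);
    $names[] = $elem->getElementsByTagName('name')->item(0)->nodeValue;
}

print_r($names);

// item() returns null for an index that is out of range.
var_dump($food->item(5));
```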
<ul>
<?php
// Loop through each item
foreach ($food as $elem) {
?>
<li>
<table>
<tbody>
<tr>
<td><b>Name:</b></td>
<td><?php echo $elem->getElementsByTagName('name')
->item(0)
->nodeValue; ?></td>
</tr>
<tr>
<td><b>Description:</b></td>
<td><?php echo $elem->getElementsByTagName('description')
->item(0)
->nodeValue; ?></td>
</tr>
<tr>
<td><b>Price:</b></td>
<td><?php echo $elem->getElementsByTagName('price')
->item(0)
->nodeValue; ?></td>
</tr>
<tr>
<td><b>Calories:</b></td>
<td><?php echo $elem->getElementsByTagName('calories')
->item(0)
->nodeValue; ?></td>
</tr>
</tbody>
</table>
</li>
<?php
}
?>
</ul>
As you can see, we run new queries on each element retrieved in the loop, asking for the child elements with the tag names name, description, price and calories. Each query returns a list, so we take its first item and read its value. This is how we loop through our XML. The result should look something like this.
| Name: | Belgian Waffles |
| Description: | two of our famous Belgian Waffles with plenty of real maple syrup |
| Price: | $5.95 |
| Calories: | 650 |
| Name: | Strawberry Belgian Waffles |
| Description: | light Belgian waffles covered with strawberries and whipped cream |
| Price: | $7.95 |
| Calories: | 900 |
| Name: | Berry-Berry Belgian Waffles |
| Description: | light Belgian waffles covered with an assortment of fresh berries and whipped cream |
| Price: | $8.95 |
| Calories: | 900 |
| Name: | French Toast |
| Description: | thick slices made from our homemade sourdough bread |
| Price: | $4.50 |
| Calories: | 600 |
| Name: | Homestyle Breakfast |
| Description: | two eggs, bacon or sausage, toast, and our ever-popular hash browns |
| Price: | $6.95 |
| Calories: | 950 |
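The loop above repeats the same getElementsByTagName, item(0), nodeValue chain for every field, and that chain fails if a child element is missing, because item(0) then returns null. One possible way to tidy this up is a small helper, sketched below. The function name childText, the default value and the inline sample (including the "Ham & Eggs" entry without calories) are all assumptions made for illustration. The sketch also escapes the values with htmlspecialchars before placing them in HTML.

```php
<?php
// Hypothetical inline sample; the second <food> deliberately lacks <calories>.
$xml = <<<'XML'
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <calories>650</calories>
  </food>
  <food>
    <name>Ham &amp; Eggs</name>
  </food>
</breakfast_menu>
XML;

/**
 * Hypothetical helper: text of the first descendant with the given tag
 * name, or $default when no such element exists.
 */
function childText(DOMElement $parent, string $tag, string $default = ''): string
{
    $node = $parent->getElementsByTagName($tag)->item(0);
    return $node === null ? $default : trim($node->textContent);
}

$dom = new DOMDocument();
$dom->loadXML($xml);

$rows = [];
foreach ($dom->getElementsByTagName('food') as $elem) {
    // Escape the values so characters such as & and < stay valid in HTML.
    $rows[] = sprintf(
        '<li>%s (%s calories)</li>',
        htmlspecialchars(childText($elem, 'name')),
        htmlspecialchars(childText($elem, 'calories', 'n/a'))
    );
}

echo "<ul>\n", implode("\n", $rows), "\n</ul>\n";
```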
Retrieve Specific Element
Let’s say we want to get the value of the name element in the third food element and print it out. Because item is zero-based, the third element is item(2). We can use something like this.
$third = $dom->getElementsByTagName('food')
->item(2);
echo sprintf(
'Name of third element is: <b>%s</b>',
$third->getElementsByTagName('name')
->item(0)
->nodeValue
);
You should get a result like this.
Name of third element is: Berry-Berry Belgian Waffles
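The same lookup can also be written with DOMXPath, which ships with PHP’s DOM extension. This is an alternative to the approach above, not a replacement for it. The sketch below uses a hypothetical inline sample with three food elements; note that XPath positions are 1-based, unlike item.

```php
<?php
// Hypothetical inline sample with three <food> elements.
$xml = <<<'XML'
<breakfast_menu>
  <food><name>Belgian Waffles</name></food>
  <food><name>Strawberry Belgian Waffles</name></food>
  <food><name>Berry-Berry Belgian Waffles</name></food>
</breakfast_menu>
XML;

$dom = new DOMDocument();
$dom->loadXML($xml);

$xpath = new DOMXPath($dom);

// (//food)[3] is the third <food> in document order, i.e. the same
// node as getElementsByTagName('food')->item(2).
// Wrapping the path in string() makes evaluate() return a PHP string.
$name = $xpath->evaluate('string((//food)[3]/name)');

printf("Name of third element is: %s\n", $name);
```

If no third food element exists, string() yields an empty string rather than an error, which may or may not be what you want.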
Conclusion
The DOM classes in PHP are very powerful, and I like to use them for parsing XML much more than XML Parser because they are built in an object-oriented way and can be extended very easily. Thank you for reading.
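As a rough illustration of what extending the DOM classes can look like, the sketch below subclasses DOMDocument and adds one convenience method. The class name MenuDocument, the method foodNames and the inline sample are hypothetical and chosen only for this example.

```php
<?php
/**
 * Hypothetical subclass that adds a menu-specific convenience method
 * on top of the standard DOMDocument API.
 */
class MenuDocument extends DOMDocument
{
    /**
     * @return string[] text of every <name> element, in document order
     */
    public function foodNames(): array
    {
        $names = [];
        foreach ($this->getElementsByTagName('name') as $node) {
            $names[] = $node->textContent;
        }
        return $names;
    }
}

// Hypothetical inline sample standing in for the remote file.
$xml = <<<'XML'
<breakfast_menu>
  <food><name>Belgian Waffles</name></food>
  <food><name>French Toast</name></food>
</breakfast_menu>
XML;

$menu = new MenuDocument();
$menu->loadXML($xml);

print_r($menu->foodNames());
```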
You need the parse_str and parse_url functions.
I solved it after all.
I just needed to use $xml->PRICES->PRICE
I don’t know why it didn’t work right away, because I had tried that.
While we’re at it, do you happen to know how to cut off the top block (ASBIS logo + links) on the PRODUCT_CARD link?
The link in the PRODUCT_CARD tag is actually the product description that I am supposed to put into the webshop, but I don’t want the wholesaler’s links…
Can DOM pull just a part of that card into a variable, which I would then store in the database instead of the link they provide in the XML file?
Thanks
Hi Marijan
Thanks for the tutorial.
I have one small problem, please help.
I’m trying to parse the XML file http://c-bit.hr/1/ASBIS/PriceAvail6.xml
in the following way:
$url = "http://c-bit.hr/1/ASBIS/PriceAvail6.xml";
$xml = simplexml_load_file($url);
// loop begins
foreach ($xml->PRICE as $PRICE)
{ /* block of statements */ }
This loop works fine, but only if I first cut the CONTENT tag out of the XML file in an editor,
i.e. so that only “PRICES” and “PRICE” remain in it.
I don’t understand why I can’t get the $PRICE variable when the file is the original one, as at the link.
Thanks in advance
Awesome tutorial. Thanks! How would one go about parsing many pages though (for example a paginated URL to pull page 1, page 2, etc. until it returns an empty dataset)?
Very helpful, Marijan. You should follow-up this tutorial with others. This one covers the “Read” aspect of XML as a “database.” The other aspects are: Creating nodes, Updating (changing a node’s value), and Deleting a node. Some useful variations would be deleting all nodes with a given value other than the first one with that value. I am sure you can think of others as well.
Hi,
Just a quick note to say your tutorial helped me. Thanks for posting.
Jas
Working well, thanks for posting.
Thank you
How do you mean “Delete HTML page”? You mean to delete elements from HTML?