LINQ to XML and reading large XML files

LINQ to XML makes it relatively easy to read and query XML files. For example consider the following XML file:

Suppose you would like to find all records with groupid > 2. You could be tempted to issue the following query:

There’s a flaw in this method. XElement.Load method will load the whole XML file in memory and if this file is quite big, not only the query might take long time to execute but it might fail running out of memory. If we had some really large XML files we need to buffer through it instead of reading the whole contents into memory. XmlReader is a nice alternative to allowing us to have only the current record into memory which could hugely improve performance. We start by defining a User class which will be used to represent a single record:

Next we extend the XmlReader class with the User method:

And finally we can execute the query:

Conclusion: the second approach runs faster and uses much less memory than the first. The difference is noticeable on large XML files. So if you have to deal with large XML files be cautious when using LINQ to XML.

Leave a comment

Your email address will not be published. Required fields are marked *