This Linux tool, xmllint, is my new best friend. We get thousands of XML files from our clients for loading user, class, and enrollment information. Some of these clients customize our software or write their own software for generating the XML.
This means we frequently get oddities in the files which cause problems. Thankfully I am not the person who has to verify these files are good. I just get to answer the questions that person has about why a particular file failed to load.
The CE/Vista import process will stop if its validator finds invalid XML. Unfortunately, the error “An exception occurred while obtaining error messages. See webct.log” doesn’t sound like invalid XML.
Usage is pretty simple:
xmllint --valid /path/to/file.xml | head
- If the file is valid, then the whole file is in the output.
- If there are warnings, then they precede the whole file.
- If there are errors, then only the errors are displayed.
I use head here because our files can be up to 15MB, so this prevents the whole file from going on the screen for the first two situations.
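If only the messages matter, a possible alternative is xmllint's --noout option, which suppresses the copy of the document entirely. The sketch below wraps that in a small function. The name check_xml is made up for illustration, and it checks well-formedness only; adding --valid assumes the files declare a DTD to validate against.

```shell
# check_xml is a hypothetical helper name, not part of xmllint.
# Usage: check_xml /path/to/file.xml
check_xml() {
    # --noout suppresses the serialized document, so only messages remain.
    # xmllint writes its messages to stderr; 2>&1 lets head trim them too.
    # Add --valid after --noout if the files declare a DTD (an assumption).
    xmllint --noout "$1" 2>&1 | head
}
```

With this, a clean file should print nothing at all, and a broken one prints just the first few lines of errors.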
I discovered this while researching how to handle the first error below, and it came up again today. It has been useful for catching errors in client-supplied files that failed to load.
1: parser error : XML declaration allowed only at the start of the document
 <?xml version="1.0" encoding="UTF-8"?>
162: parser error : EntityRef: expecting ';'
<long>College of Engineering &; CIS</long>
The number before the colon is the line number. The caret it uses to indicate where on the line an error occurred isn’t accurate, so I ignore it.
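Since the line number is the reliable part, it can help to pull that one line out of a large file rather than opening all 15MB of it. A minimal sketch, where show_line is a hypothetical helper name:

```shell
# show_line is a hypothetical helper: print line N of FILE, prefixed
# with its line number.
# Usage: show_line /path/to/file.xml 162
show_line() {
    # NR is awk's current line number; print only the requested line.
    awk -v n="$2" 'NR == n { print NR ": " $0 }' "$1"
}
```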
My hope is to get this integrated into our processes to validate these files before they are loaded and save ourselves headaches the next morning.
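One way that pre-load check might look is a loop over a drop directory that flags any file xmllint rejects. This is only a sketch under assumptions: validate_dir is a made-up name, the files are assumed to end in .xml and sit in a single directory, and it tests well-formedness only (add --valid if the files declare a DTD).

```shell
# validate_dir is a hypothetical helper: check every .xml file in a
# directory and list the ones xmllint rejects.
# Returns nonzero if any file fails.
# Usage: validate_dir /path/to/incoming
validate_dir() {
    failed=0
    for f in "$1"/*.xml; do
        [ -e "$f" ] || continue   # the directory had no .xml files
        # --noout: no document output; exit status is nonzero on errors.
        if ! xmllint --noout "$f" 2>/dev/null; then
            echo "FAILED: $f"
            failed=1
        fi
    done
    return "$failed"
}
```

The nonzero return status means a scheduled job could skip the load or send a notice when something fails, instead of finding out the next morning.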