So, I have some code that queries a data source, and that data source sends me back an XML message. I have to parse the XML message so I can store information from it into a relational database. So, let’s say my XML response looks like this:
    <xml>
      <response>
        <results count="2">
          <result>
            <fname>Brian</fname>
            <lname>Jones</lname>
            <gender>M</gender>
            <office_phone_ext>777</office_phone_ext>
            <mobile_phone>201-555-1212</mobile_phone>
          </result>
          <result>
            <fname>Molly</fname>
            <lname>Jones</lname>
            <home_phone>201-555-1234</home_phone>
          </result>
        </results>
      </response>
    </xml>
So, as you can see, the attributes for each result returned for a query can differ, and if a result doesn’t have a value for some attribute, the corresponding xml element isn’t included at all for that result. If it were just 2 or 3 attributes, I could easily enough get around it by doing something like this:
    def __init__(self, xmlresult):
        self.xmlresult = xmlresult
        # xpath() returns a list (empty when nothing matches), never None,
        # so test for a non-empty result before reading it.
        fname = self.xmlresult.xpath('fname')
        if fname:
            self.fname = fname[0].text
        lname = self.xmlresult.xpath('lname')
        if lname:
            self.lname = lname[0].text
Like I said, if it were just a few things I needed to check for, I’d do it this way and be done with it. It’s not just a few though — it’s like 50 attributes. Now what?
I decided lxml.objectify would be a great way to go. It would allow me to access these things as object attributes, which should mean I can do something like this:
    self.fname = getattr(self.xmlresult, 'fname', None)
    self.lname = getattr(self.xmlresult, 'lname', None)
    ...
So, you *can* do this, technically speaking. Trouble is, you’re asking for an attribute of an ObjectifiedElement object, and when you do that, it returns an object that is not a native Python datatype, which I did not realize when I first started using lxml.objectify. So, in the above, ‘self.fname’ will not be a Python string — it’ll be an lxml.objectify.StringElement object. Of course, my database driver, my ‘join()’ operations, and everything else in my code that relies on native Python datatypes is now broken.
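To see the type problem concretely, here is a minimal sketch. It assumes lxml is installed and uses a made-up one-record document rather than the real response:

```python
from lxml import objectify

# Hypothetical single <result>, trimmed down from the sample response.
xml = b"<result><fname>Brian</fname><office_phone_ext>777</office_phone_ext></result>"
result = objectify.fromstring(xml)

fname = getattr(result, "fname", None)
print(type(fname).__name__)        # StringElement, not str
print(isinstance(fname, str))      # False
print(type(fname.pyval).__name__)  # str

# objectify also infers numeric types, so this .pyval is an int.
ext = result.office_phone_ext.pyval
print(type(ext).__name__)          # int

# A missing child raises AttributeError, so getattr() falls back to the default.
lname = getattr(result, "lname", None)
print(lname)                       # None
```

Note the type inference: anything that looks like a number comes back from '.pyval' as a number, so a field like a phone extension may need '.text' instead if the database column is a string.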
What I actually need is the '.pyval' attribute of self.xmlresult.fname, if that element exists at all. In other words, I want something that does what I mean when I write "self.fname = getattr(self.xmlresult, 'fname.pyval', None)". That doesn't work, because getattr() treats 'fname.pyval' as a single attribute name and so always falls back to None. And, of course, "getattr(self.xmlresult, 'fname', None).pyval" doesn't work either, because None has no attribute 'pyval'. I've tried a couple of other hacks too, but I've learned enough Python to know that if it feels like a hack, there's probably a better way. But I can't find that better way. Ideas?
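One possible direction, offered as a sketch rather than a definitive answer: do the lookup in two steps inside a small helper, and drive the 50-odd assignments from a list of field names. The helper name pyval_or_default, the FIELDS tuple, the Person class, and the simplified document below are all hypothetical, made up for illustration.

```python
from lxml import objectify

def pyval_or_default(element, name, default=None):
    """Return element.<name>.pyval, or default when that child is absent."""
    child = getattr(element, name, None)
    return default if child is None else child.pyval

# Hypothetical field list; the real one would name all ~50 elements.
FIELDS = ("fname", "lname", "gender", "office_phone_ext",
          "mobile_phone", "home_phone")

class Person:
    def __init__(self, xmlresult):
        # One loop replaces a hand-written check per field.
        for field in FIELDS:
            setattr(self, field, pyval_or_default(xmlresult, field))

# Simplified stand-in for the response (assumed shape, not the real feed).
xml = b"""
<response>
  <results>
    <result>
      <fname>Brian</fname><lname>Jones</lname><gender>M</gender>
      <office_phone_ext>777</office_phone_ext>
      <mobile_phone>201-555-1212</mobile_phone>
    </result>
    <result>
      <fname>Molly</fname><lname>Jones</lname>
      <home_phone>201-555-1234</home_phone>
    </result>
  </results>
</response>
"""
root = objectify.fromstring(xml)

# Iterating an objectified element walks its same-named siblings.
people = [Person(r) for r in root.results.result]

for p in people:
    print(p.fname, p.gender, p.home_phone)
```

Every field ends up as a native Python value or None, so the database driver and join() calls see ordinary types. Fields that were absent from a given result simply come back as None.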