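The XML specification mandates that all conforming XML parsers employ "draconian error handling." That is, they must halt as soon as they detect any wellformedness error in the XML document. Wellformedness errors include mismatched start and end tags, undefined entities, illegal Unicode characters, and a number of other esoteric rules. This is in stark contrast to other common formats like HTML: your browser does not stop rendering a web page if you forget to close an HTML tag. (It is a common misconception that HTML has no defined error handling. HTML error handling is actually quite well defined, but it is significantly more complicated than halting on the first error.)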
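To make "draconian" concrete, here is a minimal sketch using the standard library's xml.etree.ElementTree, which is also a strict parser. The three snippets and the check_wellformed helper are hypothetical illustrations, not part of the chapter's examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical snippets, one per kind of wellformedness error.
BROKEN_SNIPPETS = {
    "mismatched tags": "<feed><title>dive into mark</feed>",
    "undefined entity": "<title>dive into &hellip;</title>",
    "unescaped ampersand": "<title>this & that</title>",
}


def check_wellformed(xml_text):
    """Return None if xml_text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # A strict parser refuses the whole document at the first error.
        return str(exc)
    return None


results = {name: check_wellformed(text) for name, text in BROKEN_SNIPPETS.items()}
for name, error in results.items():
    print(f"{name}: {error}")
```

Each snippet is rejected outright; the parser never hands back a partial tree.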
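Some people (myself included) believe it was a mistake for the inventors of XML to mandate draconian error handling. Don't get me wrong; I can certainly see the appeal of simple error handling rules. But in practice, the concept of wellformedness is trickier than it sounds, especially for XML documents that are published on the web and served over HTTP (for example, Atom feeds). Despite the maturity of XML, which standardized on draconian error handling in 1997, surveys continue to show that a significant fraction of Atom feeds on the web contain wellformedness errors.
So I have both theoretical and practical reasons to parse XML documents at any cost, that is, without halting at the first wellformedness error. If you find yourself wanting to do the same, lxml can help.
Here is a fragment of a broken XML document.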
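<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
  <title>dive into &hellip;</title>
...
</feed>

That is an error, because the &hellip; entity is not defined in XML. (It is defined in HTML.) If you try to parse this broken feed with the default settings, lxml will choke on the undefined hellip entity.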
>>> import lxml.etree
>>> tree = lxml.etree.parse('examples/feed-broken.xml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "lxml.etree.pyx", line 2693, in lxml.etree.parse (src/lxml/lxml.etree.c:52591)
  File "parser.pxi", line 1478, in lxml.etree._parseDocument (src/lxml/lxml.etree.c:75665)
  File "parser.pxi", line 1507, in lxml.etree._parseDocumentFromURL (src/lxml/lxml.etree.c:75993)
  File "parser.pxi", line 1407, in lxml.etree._parseDocFromFile (src/lxml/lxml.etree.c:75002)
  File "parser.pxi", line 965, in lxml.etree._BaseParser._parseDocFromFile (src/lxml/lxml.etree.c:72023)
  File "parser.pxi", line 539, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/lxml.etree.c:67830)
  File "parser.pxi", line 625, in lxml.etree._handleParseResult (src/lxml/lxml.etree.c:68877)
  File "parser.pxi", line 565, in lxml.etree._raiseParseError (src/lxml/lxml.etree.c:68125)
lxml.etree.XMLSyntaxError: Entity 'hellip' not defined, line 3, column 28

To parse this broken XML document despite its wellformedness error, you need to create a custom XML parser.
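① To create a custom parser, instantiate the lxml.etree.XMLParser class. It can take a number of named arguments; the one of interest here is recover. When it is set to True, lxml will do its best to recover from wellformedness errors.
② To parse an XML document with your custom parser, pass the parser object as the second argument to parse(). Note that lxml does not raise an exception about the undefined &hellip; entity.
③ The parser keeps a log of the wellformedness errors it has encountered. (It does this whether or not recover is set.)
④ Since the parser did not know what to do with the undefined &hellip; entity, it silently dropped it. The text content of the title element becomes 'dive into '.
⑤ Serializing the tree confirms this: the &hellip; entity was not moved anywhere; lxml simply dropped it.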
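The five callouts above can be sketched as one short script. This is a minimal illustration rather than the chapter's own listing: it assumes lxml is installed, and it parses a hypothetical inline copy of the feed (BROKEN_FEED) with lxml.etree.fromstring() instead of reading examples/feed-broken.xml with parse(). The numbered comments correspond to the callouts.

```python
import lxml.etree

# Hypothetical inline copy of the broken feed, so the sketch is self-contained.
BROKEN_FEED = b"""<?xml version='1.0' encoding='utf-8'?>
<feed xmlns='http://www.w3.org/2005/Atom' xml:lang='en'>
  <title>dive into &hellip;</title>
</feed>"""

ATOM_TITLE = '{http://www.w3.org/2005/Atom}title'

# (1) A custom parser that tries to recover from wellformedness errors.
parser = lxml.etree.XMLParser(recover=True)

# (2) Pass the parser as the second argument; no exception is raised here.
root = lxml.etree.fromstring(BROKEN_FEED, parser)

# (3) The parser still recorded the error in its log.
for entry in parser.error_log:
    print(entry.line, entry.column, entry.message)

# (4) Text of the title element; the chapter reports 'dive into '.
title = root.find(ATOM_TITLE)
print(repr(title.text))

# (5) Serialize the recovered tree to see what the parser kept.
print(lxml.etree.tostring(root, encoding='unicode'))
```

Checking parser.error_log after a recovering parse is a cheap way to find out whether the tree you got back is a faithful reading of the document or a best guess.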
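It is important to reiterate that there is no guarantee of interoperability between "recovering" XML parsers. A different parser might decide that it recognizes the &hellip; entity from HTML and substitute something else for it. Is that better? Maybe. Is it more correct? No: as far as XML is concerned, both results are equally incorrect. The correct behavior (according to the XML specification) is to halt. If you have decided not to do that, you are on your own.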