> For this task, the suggestion of "use a parser" is indeed sound advice. Perhap...

inopinatus · on May 9, 2021

> "use a parser" amounts to "write a entire parser yourself, then use it"

"Use a parser" is a common answer, besides being the accepted one, and with good reason: it'll work. The world is not short of HTML parsers (although, who knows, PHP may have been short of very good ones back in 2009). Whether they use regular expressions for tokenizing is an internal detail.

Serializing XML from the resulting in-memory structure, DOM or otherwise, closes the loop. This remains a conventional and commonplace way to normalize incoming HTML-like mush into something that can be spliced or interpolated into XML, and that a strict receiver will probably accept.
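To make the parse-then-serialize round trip concrete, here's a rough sketch using only the Python standard library. The names `LenientTreeBuilder` and `html_to_xml`, and the wrapping `div` root, are made up for illustration. It's nowhere near an HTML5-conformant tree builder (no implied end tags, no checking that tag or attribute names are legal XML names), so for real input you'd reach for an existing parser such as html5lib or lxml.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# HTML void elements: they never take a closing tag.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class LenientTreeBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from sloppy HTML."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        # Assumption: wrap everything in a <div> so there is a single root.
        self.root = ET.Element("div")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        # Valueless attributes (e.g. "disabled") arrive with value None;
        # XML needs a value, so reuse the attribute name.
        attrib = {k: (v if v is not None else k) for k, v in attrs}
        el = ET.SubElement(self.stack[-1], tag, attrib)
        if tag not in VOID:
            self.stack.append(el)

    def handle_endtag(self, tag):
        # Close the nearest matching open element, implicitly closing
        # anything opened inside it. The root at index 0 is never popped.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                return
        # No match: a stray closing tag, silently dropped.

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):
            # Text after a child element belongs to that child's tail.
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_xml(fragment):
    """Parse an HTML-ish fragment and serialize it as well-formed XML."""
    builder = LenientTreeBuilder()
    builder.feed(fragment)
    builder.close()  # flush any buffered trailing text
    return ET.tostring(builder.root, encoding="unicode")


if __name__ == "__main__":
    print(html_to_xml("<p>Hello <b>world<br>Tom & Jerry"))
```

Unclosed tags get closed, the bare ampersand gets escaped, and the void `br` is emitted as an empty element, so the result can be re-read by a strict XML parser. Whether the tokenizer underneath uses regular expressions is, as noted, not the caller's problem.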

> Pot, kettle.

Oh look, personal abuse! Good-day to you, too.