This reduces chances of errors where consumers of strings mutate inputs
and reduces memory usage of the app.
Test suite passes now, but there may be some stuff left, so we will run
a few sites on a branch prior to merging
* FIX: Don't diplay character reference in HTML diffs
Before this change, HTML escaping was done before splitting text into
tokens, so token splitter saw literals like "'", and split them as
it was normal text into parts into ["&", "#", "39", ";"]. This caused
diff to display character references, as those tokens used separate
HTML tags to display their insertion/deletion status.
* Avoid making one element arrays while generating diffs