After fiddling around with zero-allocation solutions, I concluded that
all non-allocating approaches are too annoying to work with in realistic
code. Using closures leads to yak-shaving with lifetimes, and because
Iron needs to take ownership of the response body, we often end up
cloning the input data anyway.
Dropping this constraint has let me simplify the entire system, cutting
a net 300 lines from the library. The `html!` macro no longer takes a
writer, and instead returns a `PreEscaped<String>`. This means that the
result of one `html!` call can be spliced directly into another `html!`,
removing the need for the `impl Template` rigmarole.
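
To illustrate why returning an owned, pre-escaped string makes splicing
easy, here is a minimal standalone sketch. It does not use the `html!`
macro and is not the library's actual implementation: the `Render`
trait and the `item` and `list` helpers are hypothetical names invented
for this example, and `PreEscaped` is redefined locally as a plain
wrapper. The idea is only that ordinary strings get escaped when
appended, while pre-escaped markup is copied through untouched.

```rust
/// Local stand-in for a pre-escaped wrapper (an assumption for this
/// sketch): the contents are trusted to be valid markup already.
pub struct PreEscaped<T>(pub T);

/// Hypothetical trait: anything that can be appended to an HTML buffer.
pub trait Render {
    fn render_to(&self, out: &mut String);
}

// Plain strings are escaped on the way in.
impl Render for str {
    fn render_to(&self, out: &mut String) {
        for c in self.chars() {
            match c {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                _ => out.push(c),
            }
        }
    }
}

// Pre-escaped markup is copied verbatim, so nested fragments are not
// escaped a second time.
impl<T: AsRef<str>> Render for PreEscaped<T> {
    fn render_to(&self, out: &mut String) {
        out.push_str(self.0.as_ref());
    }
}

/// Hypothetical helper: builds one fragment and returns it by value.
fn item(label: &str) -> PreEscaped<String> {
    let mut out = String::from("<li>");
    label.render_to(&mut out); // untrusted text: escaped
    out.push_str("</li>");
    PreEscaped(out)
}

/// Hypothetical helper: splices fragments into a larger fragment.
fn list(labels: &[&str]) -> PreEscaped<String> {
    let mut out = String::from("<ul>");
    for label in labels {
        item(label).render_to(&mut out); // already markup: copied as-is
    }
    out.push_str("</ul>");
    PreEscaped(out)
}
```

Because each helper returns an owned value, there are no writer
parameters or lifetimes to thread through, at the cost of one
allocation per fragment.
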
To rub it in, benchmarks show the new code is in fact *faster* than it
was before. How lovely.