[CONF] Apache Camel: TidyMarkup (page edited)

Dhiraj Bokde (Confluence)

TidyMarkup has been edited by Claus Ibsen (Jan 20, 2009).

Content:

TidyMarkup

TidyMarkup is a Data Format that uses TagSoup to tidy up HTML. It can be used to parse ugly HTML and return it as pretty, well-formed HTML.

Camel eats our own dog food soap

We had some issues in our PDF manual, where strange symbols appeared. Jonathan used this data format to tidy up the wiki HTML pages that serve as the base for rendering the PDF manuals, and the mysterious symbols vanished.

TidyMarkup only supports the unmarshal operation, as we really don't want to turn well-formed HTML into ugly HTML.

Example

An example where the consumer provides some HTML:

from("file://site/inbox").unmarshal().tidyMarkup().to("file://site/blogs");

Requirements

TidyMarkup is provided in camel-tagsoup.jar, so if you are using Maven you can just add a dependency on the artifactId camel-tagsoup.
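
For Maven users, the dependency might look like the sketch below. The groupId org.apache.camel is the one normally used for Camel components. The ${camel.version} property is an assumption: a placeholder for whichever Camel version your project already uses, not a value taken from this page.

```xml
<!-- Sketch of a pom.xml dependency entry for the TidyMarkup data format. -->
<dependency>
  <groupId>org.apache.camel</groupId>
  <artifactId>camel-tagsoup</artifactId>
  <!-- Assumed placeholder: define this property to match your camel-core version. -->
  <version>${camel.version}</version>
</dependency>
```

Keeping the version the same as camel-core is the safest choice, since Camel components are released together with the core.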