Parsing XML Performance

Parsing XML Performance

BobbySixKiller
Hi,

I have a simple question about parsing XML and performance. Currently my route looks like this:

from("endpointIn")
.convertBodyTo(String.class)//
.unmarshal().jaxb("com.groupemb.entite.search.compario.in")//
.process(...)

where com.groupemb.entite.search.compario.in is the package containing a jaxb.index file, which lists a Java class named ResponseCompario.

It works fine, but I was wondering: would this route perform better?
from("endpointIn")
.convertBodyTo(ResponseCompario.class)//
.process(...)

NB: The XML files are pretty big.

Best regards,

Re: Parsing XML Performance

hekonsek
Hi,

> It works fine but i was wondering, is this route more performant ?:
> NB: The xml files are pretty big.

If XML (de)serialization is the bottleneck, consider using the JiBX data format [1].

from("direct:start").unmarshal().jibx(MyMappedBean.class).to(...);

JiBX is a speed demon compared to JAXB.

Best regards.

[1] http://camel.apache.org/jibx.html

--
Henryk Konsek
http://henryk-konsek.blogspot.com

Re: Parsing XML Performance

Christian Mueller
Administrator
In reply to this post by BobbySixKiller
Why not try it out? ;-)

Best,
Christian

On Wed, Feb 20, 2013 at 2:52 PM, BobbySixKiller <[hidden email]> wrote:

> It works fine but i was wondering, is this route more performant ?:
> from("endpointIn")
> .convertBodyTo(ResponseCompario.class)//
> .process(...)
>
> NB: The xml files are pretty big.

Re: Parsing XML Performance

Łukasz Dywicki
In reply to this post by BobbySixKiller
Avoid using DOM or big object trees. With any XML-to-Java binding marshaller, performance is going to be more or less the same. If you want this to be fast, try to limit the number of conversions. The best you can do is to use a stream from the beginning. Consider splitting the file as described on Claus's blog [1]. You can also use partial unmarshalling with JAXB (smaller documents), as described in the JAXB data format docs [2].

[1] http://www.davsclaus.com/2011/11/splitting-big-xml-files-with-apache.html
[2] http://camel.apache.org/jaxb#JAXB-Partialmarshalling%2Funmarshalling
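
To illustrate what "use a stream from the beginning" can look like outside of Camel, here is a minimal sketch using the JDK's StAX API. The class name, the <item> element and the sample document are made up for illustration and are not taken from the original route. Each item is handed to a callback as soon as it is read, so neither a DOM nor a full object tree is held in memory. JAXB's Unmarshaller.unmarshal(XMLStreamReader, Class) could be called at the same point to bind one fragment at a time.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingItems {

    // Walks the document event by event and passes the text of every
    // <item> element (a hypothetical element name) to the handler.
    // Returns the number of items seen.
    public static int forEachItem(Reader xml, Consumer<String> handler)
            throws XMLStreamException {
        XMLStreamReader reader = XMLInputFactory.newFactory().createXMLStreamReader(xml);
        int count = 0;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    // getElementText() consumes the text-only content and
                    // leaves the reader on the matching end tag.
                    handler.accept(reader.getElementText());
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Small stand-in for a large file; a FileReader or InputStream
        // would be used for real input.
        String xml = "<response><item>a</item><item>b</item><item>c</item></response>";
        int n = forEachItem(new StringReader(xml), System.out::println);
        System.out.println("items: " + n);
    }
}
```

Only one item is alive at a time, so memory use should stay roughly flat regardless of file size. Whether that helps depends on whether the processor really needs the whole ResponseCompario object at once.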

Best regards,
Łukasz Dywicki
--
[hidden email]
Twitter: ldywicki
Blog: http://dywicki.pl
Code-House - http://code-house.org

Message written by BobbySixKiller <[hidden email]> on 20 Feb 2013, at 14:52:

> It works fine but i was wondering, is this route more performant ?:
> from("endpointIn")
> .convertBodyTo(ResponseCompario.class)//
> .process(...)
>
> NB: The xml files are pretty big.

Re: Parsing XML Performance

hekonsek
> Avoid usage of DOM or big object trees.

Good point. I assumed that the entire XML needs to be parsed (this is
often the case). But yeah, definitely favor streaming over
deserializing the entire XML document.

--
Henryk Konsek
http://henryk-konsek.blogspot.com

Re: Parsing XML Performance

Raul Kripalani
In reply to this post by Łukasz Dywicki
Aalto XML [1] is a non-blocking async StAX XML parser which many claim is
ultra-fast and/or the fastest around. I haven't taken it for a spin yet, but
it's worth a try.

[1] http://wiki.fasterxml.com/AaltoHome
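
Since Aalto implements the standard StAX API, blocking parsing code written against javax.xml.stream should not need to change in order to try it. The sketch below runs with the JDK's built-in parser and prints which implementation the factory lookup picked. The class name and sample document are made up for illustration. Getting Aalto selected is an assumption here: it would need its jar on the classpath, and possibly the javax.xml.stream.XMLInputFactory system property set to com.fasterxml.aalto.stax.InputFactoryImpl.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxFactoryCheck {

    // Counts start elements using whichever StAX implementation the
    // given factory belongs to; the parsing code itself is implementation-neutral.
    public static int countElements(XMLInputFactory factory, String xml)
            throws XMLStreamException {
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int count = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }

    public static void main(String[] args) throws XMLStreamException {
        // newFactory() resolves the implementation through the standard
        // lookup (system property, service provider, then JDK default).
        XMLInputFactory factory = XMLInputFactory.newFactory();
        System.out.println("StAX implementation: " + factory.getClass().getName());
        System.out.println("elements: " + countElements(factory, "<a><b/><c><d/></c></a>"));
    }
}
```

Timing the same loop with each implementation over one of the real files would show whether the swap is worth it for this workload.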

*Raúl Kripalani*
Apache Camel Committer
Enterprise Architect, Program Manager, Open Source Integration specialist
http://about.me/raulkripalani | http://www.linkedin.com/in/raulkripalani
http://blog.raulkr.net | twitter: @raulvk <http://twitter.com/raulvk>

On Thu, Feb 21, 2013 at 10:36 AM, Łukasz Dywicki <[hidden email]> wrote:

> Avoid usage of DOM or big object trees. With all XML-Java binding
> marshallers it's gonna be more or less same. If you want do this fast, try
> to limit number of conversions. The best you can do is to use stream from
> beginning. Consider splitting file as described on Claus blog [1]. Also you
> may use partial unmarshall with JAXB (smaller documents), description is in
> JAXB dataformat docs [2].

Re: Parsing XML Performance

hekonsek
> I haven't taken it for spin yet, but
> it's worth a try.

Also, James added a VTD-XML component to Camel Extra [1], if extra-fast
XPath querying is enough for you.

[1] http://camel.apache.org/vtd-xml

--
Henryk Konsek
http://henryk-konsek.blogspot.com