High Volume ETL using camel

Hello Geeks

We are a huge retail organization with over 1500 stores as well as online shopping websites. Message volume at the middleware layer probably exceeds a billion per day. Currently a commercial product from another vendor handles this.

We plan to move our systems to a completely open source stack soon. That will include products like Camel, Fuse, Jenkins, Gerrit, etc.

We have two scenarios where we are trying to fit in Camel:

1) Real-time traffic. This is the typical busy Fuse ESB layer, which takes care of messaging, transformation, reliable delivery, etc. between enterprise applications. We are clear on that and are going ahead with Camel and OSGi containers.

2) Scheduled batch/ETL jobs. These will be really heavy jobs and may contain payloads running to a few GB in size. We want to use Camel for these; the alternative, in case we hit a roadblock, is Talend ETL.

My queries are below

1) Is Camel capable of running heavy batch ETL jobs reliably? If so, what are the best practices and strategies for handling very large CSV files? I want to avoid OOM (out-of-memory) issues.

2) What would be the most suitable container for running batch ETL jobs? Should I use ServiceMix / JBoss Fuse, or consider other Java-based containers?
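
To make the memory concern in query 1 concrete, here is a minimal sketch in plain Java (not Camel) of the row-at-a-time behaviour we would want from any solution: the file is read one line at a time, so heap usage depends on the longest row rather than on the file size. The class name, method name and sample data are made up for illustration, and the naive comma split assumes no quoted fields. As far as I understand, Camel's Splitter EIP has a streaming mode aimed at the same goal, but I have not verified how it behaves at this scale.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical illustration only: names and data are assumptions, not part of Camel.
public class StreamingCsvSketch {

    /**
     * Reads CSV rows one at a time and hands each row to the handler.
     * Only the current line is held in memory. Returns the number of data rows seen.
     */
    public static long processRows(Reader source, boolean hasHeader, Consumer<String[]> rowHandler) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            boolean skipHeader = hasHeader;
            String line;
            while ((line = reader.readLine()) != null) {
                if (skipHeader) {          // drop the first line when it is a header
                    skipHeader = false;
                    continue;
                }
                if (line.isEmpty()) {      // ignore blank lines
                    continue;
                }
                // Naive split: assumes fields contain no quoted commas.
                rowHandler.accept(line.split(",", -1));
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // A real job would pass a FileReader over the multi-GB file instead.
        String sample = "store,sku,qty\n1001,A-1,3\n1002,B-7,12\n";
        long[] totalQty = {0};
        long rows = processRows(new StringReader(sample), true,
                cols -> totalQty[0] += Long.parseLong(cols[2]));
        System.out.println(rows + " rows, total qty " + totalQty[0]); // prints "2 rows, total qty 15"
    }
}
```

The point of the sketch is only that per-row handling must never accumulate the whole payload; whether Camel routes can guarantee this end to end (including any transformation and delivery steps) is exactly what we are asking.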

Any insights from you all would help us evaluate the options.

Reji Mathews Sr. Engineer - Middleware Integrations / SOA ( Open Source - Apache Camel & Jboss Fuse ESB | Mule ESB ) LinkedIn - http://in.linkedin.com/pub/reji-mathews/31/9a2/40a Twitter - reji_mathews