I have read some posts about the readLock policy for the file component
but I still have some doubts which hopefully can be answered here.
My route is like this:
As I understand it, this polls the directory every 10 minutes and picks
up at most 10000 files, if that many are present. Originally I had no
readLock policy and it worked well. But I have another, non-Camel
process writing files into the from directory. Sometimes the files are
quite big, around 1.2GB, and the process that writes them can be curl or
wget, meaning they are written in chunks at a rate that depends on
network speed etc.
I observed that with no readLock, files were sometimes processed before
being completely downloaded if the poll happened during the download. I
added readLock=changed and this solved the problem, but as documented
the performance suffers terribly and the behaviour is not consistent.
For example, sometimes the polling only picks up one file every ten
minutes when several are present, or it stops picking up files
altogether for several hours. I don't understand this. I found this
thread but can't read it without an account:
I thought perhaps changing to a different locking policy, like rename,
could help, but I have ultimately concluded that I should try to remove
the readLock altogether. Suppose I did this and had my other process
write to a different location and then move the files with the Linux
command mv once they are complete. Would this work? I mean, does mv
write the files partially, or could it suffer from the same problem if
the poll happened while mv was running?
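
To make the idea concrete, here is a minimal Java sketch of the
"write elsewhere, then move" approach. It assumes the staging directory
and the polled directory are on the same filesystem, where a move is a
rename rather than a copy; across filesystems a move has to copy the
data, so a partial file could still be visible. The class name
AtomicDrop, the helper publish, and the directory names are hypothetical
and only for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicDrop {

    // Hypothetical helper: moves a fully written file from staging into the
    // polled directory. ATOMIC_MOVE makes the JVM fail with
    // AtomicMoveNotSupportedException instead of silently falling back to
    // copy-and-delete when the two directories are on different filesystems.
    static Path publish(Path stagingFile, Path inboxDir) throws IOException {
        Path target = inboxDir.resolve(stagingFile.getFileName());
        return Files.move(stagingFile, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        // Assumed layout: both directories under one temp root (same filesystem).
        Path root = Files.createTempDirectory("drop-demo");
        Path staging = Files.createDirectory(root.resolve("staging"));
        Path inbox = Files.createDirectory(root.resolve("inbox"));

        // Simulate a chunked download into the staging area.
        Path partial = staging.resolve("big.dat");
        Files.write(partial, "chunk-1".getBytes(StandardCharsets.UTF_8));
        Files.write(partial, "chunk-2".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);

        // Only once the download is complete does the file appear in the inbox.
        Path published = publish(partial, inbox);
        System.out.println("published " + published.getFileName()
                + " (" + Files.size(published) + " bytes)");
    }
}
```

With this layout the poller would only ever see the file under its final
name after all of its bytes are in place, which is the property I am
hoping mv gives me.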
Or is there some other solution which could help? I know I could write
my own readLock strategy but I don't really consider that to be an
option as yet.