[CONF] Apache Camel: FTP2 (page edited)

Dhiraj Bokde (Confluence)

FTP2 has been edited by Claus Ibsen (Feb 13, 2009).

Change summary:

CAMEL-1295

Content:

FTP/SFTP Component

This component provides access to remote file systems over the FTP and SFTP protocols.

URI format

ftp://[username@]hostname[:port]/filename[?options]
sftp://[username@]hostname[:port]/filename[?options]

Where filename represents the underlying file name or directory. It can contain nested folders.
Currently, the username can only be provided as part of the hostname section of the URI (username@hostname).

If no username is provided, an anonymous login is attempted with no password.
If no port number is provided, Camel uses the default value for the protocol (ftp = 21, sftp = 22).
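
The defaulting rules above can be sketched with the JDK's java.net.URI alone. This is a minimal illustration, not Camel's own parsing code; the class and method names (FtpEndpointSketch, effectivePort, isAnonymous) are hypothetical.

```java
import java.net.URI;

// Hypothetical helper illustrating the defaults described above.
// It does not use or reproduce any Camel API.
final class FtpEndpointSketch {
    private FtpEndpointSketch() {}

    // Returns the explicit port if one is given, else the protocol default.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        String scheme = uri.getScheme();
        if ("ftp".equals(scheme)) {
            return 21;
        }
        if ("sftp".equals(scheme)) {
            return 22;
        }
        throw new IllegalArgumentException("Unsupported scheme: " + scheme);
    }

    // A URI without a username part means an anonymous login is attempted.
    static boolean isAnonymous(URI uri) {
        return uri.getUserInfo() == null;
    }
}
```

For example, effectivePort(URI.create("ftp://publicftpserver.com/download")) yields 21 and isAnonymous returns true for the same URI, while a URI with an explicit port such as ftp://admin@localhost:20123/antpath keeps 20123.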

Examples

ftp://someone@.../public/upload/images/holiday2008?password=secret&binary=true
ftp://someoneelse@...:12049/reports/2008/budget.txt?password=secret&binary=false&directory=false
ftp://publicftpserver.com/download

More information

This component is an extension of the File2 component, so there are more samples and details on the File2 component page.

URI Options

The options below are exclusive to the FTP2 component.

Name | Default Value | Description
password | null | Specifies the password to use to log in to the remote file system.
binary | false | Specifies the file transfer mode, BINARY or ASCII. The default is ASCII (false).
localWorkDirectory | null | When consuming, a local work directory can be used to store the remote file content directly in local files, to avoid loading the content into memory. This is beneficial if you consume a very big remote file, as it reduces memory usage. See below for more details.
passiveMode | false | FTP only: Sets whether to use passive mode connections. The default is active mode (false).
ftpClientConfig | null | FTP only: Reference to a bean in the registry of type org.apache.commons.net.ftp.FTPClientConfig. Use this option if you need to configure the client according to the FTP server's date format, locale, timezone, platform, etc. See the FTPClientConfig javadoc for more documentation.
knownHostsFile | null | SFTP only: Sets the known_hosts file so that the SFTP endpoint can do host key verification.
privateKeyFile | null | SFTP only: Sets the private key file so that the SFTP endpoint can do private key verification.
privateKeyFilePassphrase | null | SFTP only: Sets the private key file passphrase so that the SFTP endpoint can do private key verification.

More URI options

See File2, as all the options there also apply to this component.

Limitations

The option readLock can be used to force Camel not to consume files that are currently being written. However, this option is turned off by default, as it requires that the user has write access. There are other ways to avoid consuming files that are currently being written over FTP. For instance, you can write to a temporary destination and move the file after it has been written.

The FTP producer does not support appending to existing files. Any existing file on the remote server will be deleted before the file is written.

Message Headers

The following message headers can be used to affect the behavior of the component:

Header | Description
CamelFileName | Specifies the output file name (relative to the endpoint directory) to be used for the output message when sending to the endpoint. If this is not present and no expression is set either, a generated message ID is used as the filename instead.
CamelFileNameProduced | The actual absolute file path (path + name) of the output file that was written. This header is set by Camel and its purpose is to give end users the name of the file that was written.
CamelFileBatchTotal | Total number of files being consumed in this batch.
CamelFileBatchIndex | Current index out of the total number of files being consumed in this batch.
CamelFileHost | The remote hostname.
CamelFileLocalWorkPath | Path to the local work file, if a local work directory is used.

Using Local Work Directory

Camel supports consuming from remote FTP servers and downloading the files directly into a local work directory. This avoids reading the entire remote file content into memory, as it is streamed directly into the local file using FileOutputStream.

Camel will store the content in a local file with the same name as the remote file, though with .inprogress as the extension while the file is being downloaded. Afterwards, the file is renamed to remove the .inprogress suffix. Finally, when the Exchange is complete, the local file is deleted.
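
The download-then-rename idea can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not Camel's implementation: the class and method names (LocalWorkFileSketch, download) are hypothetical, the remote content is represented by any InputStream, and the final deletion on Exchange completion is left out.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch: stream remote content into "<name>.inprogress",
// then rename it to "<name>" once the download has finished.
final class LocalWorkFileSketch {
    private LocalWorkFileSketch() {}

    static Path download(InputStream remote, Path workDir, String fileName) throws IOException {
        Path inProgress = workDir.resolve(fileName + ".inprogress");
        Path finished = workDir.resolve(fileName);

        // Copy in fixed-size chunks so the whole file is never held in memory.
        try (OutputStream out = new FileOutputStream(inProgress.toFile())) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = remote.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }

        // Drop the .inprogress suffix only after the stream is fully written.
        return Files.move(inProgress, finished, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Because the final name only appears after the rename, anything watching the work directory never sees a half-written file under that name.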

So if you want to download files from a remote FTP server and store them as files, you need to route to a file endpoint such as:

from("ftp://[hidden email]?password=secret&localWorkDirectory=/tmp").to("file://inbox");

The route above is very efficient as it avoids reading the entire file content into memory. It downloads the remote file directly to a local file stream. The java.io.File handle is then used as the Exchange body. The file producer leverages this fact and can work directly on the java.io.File handle to store the file content in the inbox folder using java.nio.channels.FileChannel.

Samples

In the sample below we set up Camel to download all the reports from the FTP server once every hour (60 min) as BINARY content and store them as files on the local file system.

protected RouteBuilder createRouteBuilder() throws Exception {
    return new RouteBuilder() {
        public void configure() throws Exception {
            // we use a delay of 60 minutes (i.e. we poll the FTP server once per hour)
            long delay = 60 * 60 * 1000L;

            // from the given FTP server we poll (= download) all the files
            // from the public/reports folder as BINARY types and store this as files
            // in a local directory. Camel will use the filenames from the FTPServer

            // notice that the FTPConsumer properties must be prefixed with "consumer." in the URL
            // the delay parameter is from the FileConsumer component so we should use consumer.delay as
            // the URI parameter name. The FTP Component is an extension of the File Component.
            from("ftp://scott@localhost/public/reports?password=tiger&binary=true&consumer.delay=" + delay).
                    to("file://target/test-reports");
        }
    };
}

And the route using Spring DSL:

<route>
    <!-- delay is in milliseconds: 3600000 = 60 minutes, matching the Java route above -->
    <from uri="ftp://scott@localhost/public/reports?password=tiger&amp;binary=true&amp;delay=3600000"/>
    <to uri="file://target/test-reports"/>
</route>

Consuming a remote FTP server triggered by a route

The FTP consumer is built as a scheduled consumer to be used in the from part of a route. However, if you want to start consuming from an FTP server when triggered within a route, it's a bit cumbersome to do this in Camel 1.x (we plan to improve this in Camel 2.x). It is possible, though, as the code below demonstrates.

In the sample we have a Seda queue on which a message arrives containing the name of a file to poll from a remote FTP server. So we set up a basic FTP URL as:

// we use directory=false to indicate we only want to consume a single file
// we use delay=5000 to use a 5 sec delay between polls to avoid polling a second time before we stop the consumer
// this is because we only want to run a single poll and get the file
// file=getme/ is the path to the folder where the file is
private String getFtpUrl() {
    return "ftp://admin@localhost:" + getPort() + "?password=admin&binary=false&directory=false&consumer.delay=5000&file=getme/";
}

Then we have the route, where we use a Processor so we can use Java code. In this Java code we create the FTP consumer that downloads the file we want. After the download we can get the content of the file and put it in the original exchange, which continues being routed. As this is based on a unit test, it routes to a Mock endpoint.

from("seda:start").process(new Processor() {
    public void process(final Exchange exchange) throws Exception {
        // get the name of the file we want to fetch from the remote server from our custom header
        String filename = exchange.getIn().getHeader("myfile", String.class);

        // construct the total url for the ftp consumer
        String url = getFtpUrl() + filename;

        // create a ftp endpoint
        Endpoint ftp = context.getEndpoint(url);

        // create a polling consumer where we can poll the myfile from the ftp server
        PollingConsumer consumer = ftp.createPollingConsumer();

        // must start the consumer before we can receive
        consumer.start();

        // poll the file from the ftp server
        Exchange result = consumer.receive();

        // the result is the response from the FTP consumer (the downloaded file)
        // replace the body of the outer exchange with the content from the downloaded file
        exchange.getIn().setBody(result.getIn().getBody());

        // stop the consumer
        consumer.stop();
    }
}).to("mock:result");

Filter using org.apache.camel.component.file.GenericFileFilter

Camel supports pluggable filtering strategies. One strategy is to implement the built-in org.apache.camel.component.file.GenericFileFilter interface in Java. You can then configure the endpoint with such a filter to skip certain files before they are processed.

In the sample we have built our own filter that only accepts files whose names start with report.

public class MyFileFilter implements GenericFileFilter {

    public boolean accept(GenericFile file) {
        // we only want report files 
        return file.getFileName().startsWith("report");
    }
}

Then we can configure our route using the filter attribute to reference our filter (using # notation), which we have defined in the Spring XML file:

<!-- define our filter as a plain spring bean -->
   <bean id="myFilter" class="com.mycompany.MyFileFilter"/>

  <route>
    <from uri="ftp://[hidden email]?password=secret&amp;filter=#myFilter"/>
    <to uri="bean:processInbox"/>
  </route>

Filtering using ANT path matcher

The ANT path matcher is a filter that is shipped out-of-the-box in the camel-spring jar, so you need to depend on camel-spring if you are using Maven.
The reason is that we leverage Spring's AntPathMatcher to do the actual matching.

The file paths are matched with the following rules:

  • ? matches one character
  • * matches zero or more characters
  • ** matches zero or more directories in a path
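
To make the three rules concrete, here is a simplified matcher written with the JDK only. It is a sketch under stated assumptions and not Spring's AntPathMatcher, which handles more cases; the class name AntPathSketch and its methods are hypothetical, and paths are assumed to use / as the separator with no leading slash.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified illustration of the ?, * and ** rules.
final class AntPathSketch {
    private AntPathSketch() {}

    static boolean matches(String pattern, String path) {
        return matchSegments(pattern.split("/"), 0, path.split("/"), 0);
    }

    private static boolean matchSegments(String[] pat, int p, String[] dirs, int d) {
        if (p == pat.length) {
            // pattern exhausted: match only if the path is exhausted too
            return d == dirs.length;
        }
        if (pat[p].equals("**")) {
            // ** may swallow zero or more whole path segments
            for (int skip = d; skip <= dirs.length; skip++) {
                if (matchSegments(pat, p + 1, dirs, skip)) {
                    return true;
                }
            }
            return false;
        }
        return d < dirs.length
                && segmentMatches(pat[p], dirs[d])
                && matchSegments(pat, p + 1, dirs, d + 1);
    }

    // Within one segment: ? is exactly one character, * is zero or more.
    private static boolean segmentMatches(String glob, String name) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '?') {
                regex.append("[^/]");
            } else if (c == '*') {
                regex.append("[^/]*");
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), name);
    }
}
```

With this sketch, matches("**/subfolder/**/*day*", "a/subfolder/b/monday.txt") is true, matches("**/*.xml", "a/b/c.txt") is false, and matches("*.txt", "a/b.txt") is false because a single * does not cross directory boundaries.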

The sample below demonstrates how to use it:

<camelContext xmlns="http://camel.apache.org/schema/spring">
    <template id="camelTemplate"/>

    <!-- use myFilter as filter to allow setting ANT paths for which files to scan for -->
    <endpoint id="myFTPEndpoint" uri="ftp://admin@localhost:20123/antpath?password=admin&amp;recursive=true&amp;delay=10000&amp;initialDelay=2000&amp;filter=#myAntFilter"/>

    <route>
        <from ref="myFTPEndpoint"/>
        <to uri="mock:result"/>
    </route>
</camelContext>

<!-- we use the AntPathMatcherGenericFileFilter to use ant paths for includes and excludes -->
<bean id="myAntFilter" class="org.apache.camel.component.file.AntPathMatcherGenericFileFilter">
    <!-- include any file in the subfolder that has day in the name -->
    <property name="includes" value="**/subfolder/**/*day*"/>
    <!-- exclude all files with bad in name or .xml files. Use comma to separate multiple excludes -->
    <property name="excludes" value="**/*bad*,**/*.xml"/>
</bean>

Debug logging

This component logs at the TRACE level, which can be helpful when troubleshooting problems.
