
How to Download a File from URL in Java

This article covers different ways to read and download a file from a URL in Java and store it on disk, including plain Java IO, NIO, HttpClient, and the Apache Commons IO library.

Overview

There are a number of ways to download a file from a URL on the internet. This article explains them with the help of examples.

We will begin by using BufferedInputStream and the Files.copy() method in plain Java. Next, we will see how to achieve the same using the Java NIO package. We will also see how to use HttpClient, which provides a non-blocking way of downloading a file. Finally, we will use the third-party Apache Commons IO library to download a file.

Using Plain Java IO

First, we will see an example of using Java IO to download a file. Java IO provides APIs to read bytes from an InputStream and write them to a file on disk, while the Java NET package provides APIs to interact with a resource on the internet through its URL.

In order to use Java IO and Java NET, we need to import classes from the java.io and java.net packages into our class.

Using BufferedInputStream

The following is a simple example of using Java IO and Java NET to read a file from a URL. Here, we use BufferedInputStream to download the file.

// outputPath is a String holding the destination file path on disk
URL url = new URL("https://www.google.com/");
try (
        InputStream inputStream = url.openStream();
        BufferedInputStream bufferedInputStream = new BufferedInputStream(inputStream);
        FileOutputStream fileOutputStream = new FileOutputStream(outputPath)
) {
    byte[] bucket = new byte[2048];
    int numBytesRead;

    // read() returns -1 once the end of the stream is reached
    while ((numBytesRead = bufferedInputStream.read(bucket, 0, bucket.length)) != -1) {
        fileOutputStream.write(bucket, 0, numBytesRead);
    }
}

First, we created a URL instance by specifying the URL of the file or resource we want to download. Then, we opened an InputStream on the resource using the openStream method. Next, we wrapped the input stream in a BufferedInputStream, which reads from the underlying stream in larger chunks and so reduces the number of low-level read calls. We also created a FileOutputStream by providing the path on disk where we want the file to be saved.

Next, we use a byte[] bucket to read up to 2048 bytes at a time from the input stream and write them to the output stream, repeating until the stream is exhausted. This example demonstrates how we can use our own fixed-size buffer (for example, 2048 bytes) so that downloading large files does not consume a large amount of memory on our system.

Note: While dealing with Java File IO, we must close all open streams and readers. To do that, we instantiated the streams inside a try-with-resources block, which closes them automatically.
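Putting the steps above together, a minimal self-contained sketch might look like the following. The class name BufferedDownloader and the method name download are hypothetical and chosen only for illustration. The 2048-byte bucket matches the example above, and the returned byte count is an addition that is not part of the original snippet.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative, not from any library.
final class BufferedDownloader {

    // Copies the resource behind the URL to the target path in 2048-byte
    // chunks and returns the total number of bytes written.
    static long download(URL url, Path target) throws IOException {
        long totalBytes = 0;
        try (InputStream in = new BufferedInputStream(url.openStream());
             OutputStream out = Files.newOutputStream(target)) {
            byte[] bucket = new byte[2048];
            int numBytesRead;
            // read() returns -1 once the end of the stream is reached
            while ((numBytesRead = in.read(bucket, 0, bucket.length)) != -1) {
                out.write(bucket, 0, numBytesRead);
                totalBytes += numBytesRead;
            }
        }
        return totalBytes;
    }
}
```

A call such as BufferedDownloader.download(new URL("https://www.google.com/"), Path.of("page.html")) would save the resource to page.html and return its size in bytes. Since Java 9, InputStream also offers a transferTo(OutputStream) method that performs a similar loop internally.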

Using Files.copy()

While writing the previous example, we had to take care of a lot of logic ourselves. Thankfully, the Java Files class provides the copy method, which handles this logic internally.

The following is an example of using Files.copy() to download a file from a URL.

URL url = new URL("https://www.google.com");
try (InputStream inputStream = url.openStream()) {
    // Throws FileAlreadyExistsException if the target file already exists
    Files.copy(inputStream, Paths.get(outputPath));
}
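By default, Files.copy() fails with a FileAlreadyExistsException when the target file already exists. If overwriting is acceptable, one option is to pass StandardCopyOption.REPLACE_EXISTING, as in the sketch below. The class name FilesCopyDownloader and the method name download are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class; the names are illustrative, not from any library.
final class FilesCopyDownloader {

    // Copies the resource behind the URL to the target path, overwriting any
    // existing file, and returns the number of bytes copied.
    static long download(URL url, Path target) throws IOException {
        try (InputStream in = url.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Whether overwriting is the right behaviour depends on the application. Without the option, the exception acts as a safeguard against accidentally replacing an earlier download.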

Using Java NIO

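The Java NIO package offers channels as another way of transferring data. With channels, we do not need to manage a buffer in our own code, and depending on the source channel and the operating system, a FileChannel transfer can be more efficient than a simple read-and-write loop. Hence, we can easily work with large files. In order to use Java NIO channels, we need to create two channels. One channel connects to the source and the other to the target. Once the channels are set up, we can transfer data between them.

The following is an example of using NIO channels to read a file from the internet.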

URL url = new URL("https://www.google.com");
try (
        ReadableByteChannel inputChannel = Channels.newChannel(url.openStream());
        FileOutputStream fileOutputStream = new FileOutputStream(outputPath);
        FileChannel outputChannel = fileOutputStream.getChannel()
) {
    outputChannel.transferFrom(inputChannel, 0, Long.MAX_VALUE);
}
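The transferFrom method is documented as possibly transferring fewer bytes than requested, so a more defensive variant calls it in a loop until no more bytes arrive. The sketch below illustrates that idea under a few assumptions: the class name ChannelDownloader, the method name download, and the 16 MB chunk size are all hypothetical choices, and the source channel is assumed to be blocking, as it is when created from url.openStream().

```java
import java.io.IOException;
import java.net.URL;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper class; the names are illustrative, not from any library.
final class ChannelDownloader {

    // Arbitrary chunk size chosen for this sketch (16 MB per call).
    private static final long CHUNK_SIZE = 16L * 1024 * 1024;

    // Transfers the resource behind the URL to the target file chunk by chunk
    // and returns the total number of bytes written.
    static long download(URL url, Path target) throws IOException {
        try (ReadableByteChannel in = Channels.newChannel(url.openStream());
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long position = 0;
            long transferred;
            // With a blocking source, a result of 0 means the source is exhausted
            while ((transferred = out.transferFrom(in, position, CHUNK_SIZE)) > 0) {
                position += transferred;
            }
            return position;
        }
    }
}
```

Opening the FileChannel directly with FileChannel.open avoids the intermediate FileOutputStream, and the explicit open options make it clear that an existing file is truncated.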

Using Java HttpClient

We can also use the HttpClient provided by the java.net.http package, which is available since Java 11. The following is an example of using HttpClient to download a file and save it on disk.

HttpClient httpClient = HttpClient.newBuilder().build();

HttpRequest httpRequest = HttpRequest
        .newBuilder()
        .uri(new URI("https://www.google.com"))
        .GET()
        .build();

HttpResponse<InputStream> response = httpClient
        .send(httpRequest, responseInfo ->
                HttpResponse.BodySubscribers.ofInputStream());

Files.copy(response.body(), Paths.get(outputPath));

First, we create an instance of HttpClient using its builder. Next, we create an HttpRequest by providing the URI and the HTTP GET method type. Then we send the request along with a BodyHandler, which returns a BodySubscriber of InputStream type. Finally, we take the input stream from the HttpResponse and use the Files#copy() method to write it to a Path on disk.
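As an alternative to copying the InputStream ourselves, HttpClient ships with HttpResponse.BodyHandlers.ofFile(Path), which writes the response body straight to a file. The sketch below uses it and also checks the status code, since HttpClient does not treat a 404 or 500 response as an error by itself. The class name HttpClientDownloader, the method name download, and the decision to accept only status 200 are assumptions made for this illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class; the names are illustrative, not from any library.
final class HttpClientDownloader {

    // Downloads the resource at the URI into the target file and returns the
    // path of the saved file. Any status other than 200 is treated as a failure.
    static Path download(URI uri, Path target) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();

        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();

        // ofFile() creates the target file or truncates it if it already exists
        HttpResponse<Path> response =
                client.send(request, HttpResponse.BodyHandlers.ofFile(target));

        if (response.statusCode() != 200) {
            // The error page was written to the file, so remove it again
            Files.deleteIfExists(target);
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return response.body();
    }
}
```

The client in this sketch follows redirects, which the default client does not do. Whether that is desirable depends on the server being contacted.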

Using Java HttpClient Asynchronously

This section explains how to asynchronously download a file from a URL and save it to disk. To do that, we can use the sendAsync method of HttpClient, which returns a CompletableFuture, an implementation of the Future interface.

When we execute an asynchronous method, the program does not wait for the method to finish. Instead, it carries on with other work. We can check the future instance to see whether the execution has finished and the response is ready.

The next block of code demonstrates using HttpClient to download a file asynchronously and save it to disk.

HttpRequest httpRequest = HttpRequest
        .newBuilder()
        .uri(new URI("https://www.google.com"))
        .GET()
        .build();

Future<InputStream> futureInputStream =
        httpClient
                .sendAsync(httpRequest, HttpResponse.BodyHandlers.ofInputStream())
                .thenApply(HttpResponse::body);

InputStream inputStream = futureInputStream.get();
Files.copy(inputStream, Path.of(outputPath));

As shown in the example, we send an async request, which gives us a Future of InputStream. The get method on the Future blocks until the input stream is ready. Finally, we use the Files#copy method to write the file to disk.
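Calling get() right after sendAsync() makes the caller wait, which gives up most of the benefit of the asynchronous API. One possible alternative is to let HttpClient write the file itself and hand the CompletableFuture back to the caller, as sketched below. The class name AsyncDownloader and the method name downloadAsync are hypothetical names used only for this illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;

// Hypothetical helper class; the names are illustrative, not from any library.
final class AsyncDownloader {

    // Starts the download and returns immediately. The future completes with
    // the path of the saved file once the whole body has been written.
    static CompletableFuture<Path> downloadAsync(HttpClient client, URI uri, Path target) {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client
                .sendAsync(request, HttpResponse.BodyHandlers.ofFile(target))
                .thenApply(HttpResponse::body);
    }
}
```

The caller can then attach a callback, for example downloadAsync(client, uri, target).thenAccept(path -> System.out.println("Saved to " + path)), or call join() at the point where the file is actually needed.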

Using Apache Commons IO

The Apache Commons IO library provides a number of useful abstractions for general-purpose file IO. In order to read a file from a URL and save it to disk, we can use the copyURLToFile method provided by the FileUtils class. Here is an example of using Apache Commons IO to read a file from a URL and save it.

URL url = new URL("https://www.google.com");
FileUtils.copyURLToFile(url, new File(outputPath));

This looks a lot simpler and shorter. The copyURLToFile method internally uses the IOUtils.copy method (as explained in Using Apache Commons IO to copy InputStream to OutputStream). Thus, we do not need to manually read buffers from the input stream and write them to the output stream.

Alternatively, we can use another flavour of this method, which allows us to set connection timeout and read timeout values.

public static void copyURLToFile(
        URL source,
        File destination,
        int connectionTimeout,
        int readTimeout) throws IOException

The snippet shows the signature of the method, which we can call with specific timeout values given in milliseconds.

Summary

In this article, we learned how to download a file from a URL and store it on disk. We covered different ways of doing this: using the combination of plain Java IO and Java NET, using the Java NIO package, using HttpClient both synchronously and asynchronously, and finally using Apache Commons IO. For more on Java, please visit Java Tutorials.