Parse Large JSON Files in Java Using Gson Streaming

A guide on how to parse large JSON files into Java objects. Use Gson streaming to read a very large JSON file and convert it into objects.

Overview

Gson is a very popular API for parsing JSON strings into objects. The fromJson method provided by Gson is well suited to reading an entire JSON string and parsing it into Java objects in one go. To do that, the whole JSON string is first loaded into memory and then converted into an object. However, with a very large JSON file this can lead to an OutOfMemoryError. To avoid this, we can use the Gson streaming technique to parse a large file in chunks.

In this tutorial we use Gson streaming to parse a 400 MB JSON file efficiently into Java objects without loading it completely into memory. We will also monitor the amount of memory consumed by the program. But before we do that, let's begin with a quick setup.

Write Large JSON File

First, we will create a 400 MB JSON file using a Java program. Here is a sample Person record in JSON form.

{ "name":"John", "age":31, "city":"New York" }

We will create a JSON array of 10 million person records and store it in a JSON file.

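// target is the output file; person holds the sample JSON record shown above
FileOutputStream fos = new FileOutputStream(target);
OutputStreamWriter ow = new OutputStreamWriter(fos);

ow.write("[");
for (int i = 0; i < 10000000; i++) {
    // Separate records with a comma, but not before the first one
    if (i != 0) {
        ow.write(",");
    }
    ow.write(person);
}
ow.write("]");
ow.flush();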

We use an OutputStreamWriter to write each person record to the file. Don't forget to close all open streams and writers when you are done.
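
One way to make sure the file is always closed is a try-with-resources block. The following is a minimal sketch of that idea, not the original program: the class name LargeJsonWriter, the method writeLargeJson and the constant PERSON are hypothetical names chosen for illustration, and it uses Files.newBufferedWriter in place of the OutputStreamWriter shown above.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class LargeJsonWriter {

    // Sample record from this tutorial, repeated for every array element.
    static final String PERSON = "{ \"name\":\"John\", \"age\":31, \"city\":\"New York\" }";

    // Writes a JSON array containing `count` copies of PERSON to `target`.
    static void writeLargeJson(Path target, int count) throws IOException {
        // try-with-resources flushes and closes the writer, even if a write fails.
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write("[");
            for (int i = 0; i < count; i++) {
                if (i != 0) {
                    writer.write(",");
                }
                writer.write(PERSON);
            }
            writer.write("]");
        }
    }

    public static void main(String[] args) throws IOException {
        // Small demo run against a temporary file; raise the count for a large file.
        Path target = Files.createTempFile("persons", ".json");
        writeLargeJson(target, 1_000);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Calling writeLargeJson with a count of 10_000_000 would produce an array of the same shape as the one described above.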

Read Large JSON File by Streaming

Now that our input JSON file is ready, we will stream it and convert each record into a Java object using the memory-efficient Gson streaming approach.

To stream a JSON file, Gson provides the JsonReader class. Using this class, we can navigate through JSON objects and nested objects and iteratively convert them into Java objects.

The next example reads the large JSON file, iterates through its contents, and parses them into objects.

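private void readLargeJson(String path) throws IOException {
    // Reuse one Gson instance instead of creating a new one for every record
    Gson gson = new Gson();
    try (
        InputStream inputStream = Files.newInputStream(Path.of(path));
        JsonReader reader = new JsonReader(new InputStreamReader(inputStream));
    ) {
        reader.beginArray(); // consume the opening '[' of the array
        while (reader.hasNext()) {
            // Read only the next array element and bind it to a Person
            Person person = gson.fromJson(reader, Person.class);
            // System.out.println(person);
        }
        reader.endArray(); // consume the closing ']' of the array
    }
}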

We create an InputStream on the file and use it to create an InputStreamReader. Next, we wrap that reader in a JsonReader and use it to parse the JSON file.

As we are dealing with a JSON array of person objects, we call beginArray() to start streaming through the array elements. We then iterate over all the elements of the JSON array and convert each of them into a new Person object. Finally, we call endArray() to close the array. Note that we use a try-with-resources block to automatically close the streams and readers.

Similarly, if you are dealing with a large JSON object, you can use the beginObject() method to access its nested objects.
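
As an illustration of the beginObject() case, here is a minimal sketch that assumes a top-level JSON object whose "persons" field holds the array of records. The field name "persons", the class NestedJsonStreaming, the method streamPersons and the Person class are all hypothetical and not part of the original example. Unknown fields are skipped with skipValue(), so they are not bound to Java objects.

```java
import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Consumer;

class NestedJsonStreaming {

    // Hypothetical Person type matching the sample record in this tutorial.
    static class Person {
        String name;
        int age;
        String city;
    }

    // Streams a top-level JSON object and hands each element of its
    // "persons" array to `handler`, one record at a time.
    static void streamPersons(Reader source, Consumer<Person> handler) throws IOException {
        Gson gson = new Gson();
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginObject();
            while (reader.hasNext()) {
                String field = reader.nextName();
                if (field.equals("persons")) {
                    reader.beginArray();
                    while (reader.hasNext()) {
                        // Only the current record is bound to an object.
                        Person person = gson.fromJson(reader, Person.class);
                        handler.accept(person);
                    }
                    reader.endArray();
                } else {
                    // Ignore fields this sketch does not care about.
                    reader.skipValue();
                }
            }
            reader.endObject();
        }
    }

    public static void main(String[] args) throws IOException {
        String json = "{ \"source\":\"demo\", \"persons\":[ "
                + "{ \"name\":\"John\", \"age\":31, \"city\":\"New York\" }, "
                + "{ \"name\":\"Jane\", \"age\":28, \"city\":\"Boston\" } ] }";
        streamPersons(new StringReader(json), p -> System.out.println(p.name + " from " + p.city));
    }
}
```

Passing a handler rather than returning a list keeps the sketch in the spirit of streaming: each Person can be processed and then discarded, instead of all records accumulating in memory.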

Testing

Now we will stream the 400 MB JSON file that we created. To monitor memory consumption, we run the readLargeJson() method in a separate thread, while the main thread prints the amount of memory used, in MB, to the console at a fixed interval.

Source File Size 400
Memory used: 9
Memory used: 139
Memory used: 112
Memory used: 122
Memory used: 96
Memory used: 121
Memory used: 150
Memory used: 82
total time 35023

The output shows that memory consumption stayed well below the size of the file, peaking at about 150 MB in this run, so the 400 MB JSON file was never read into memory in its entirety.
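
The monitoring setup described in this section could be sketched roughly as follows. This is an assumption about how such a test might be written, not the code that produced the output above: the class MemoryMonitor and its methods are hypothetical names, and the workload in main is a placeholder standing in for a call to readLargeJson().

```java
class MemoryMonitor {

    // Heap currently in use by the JVM, in megabytes.
    static long usedMemoryMb() {
        Runtime runtime = Runtime.getRuntime();
        return (runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024);
    }

    // Runs `task` on a worker thread and prints the used memory every
    // `intervalMillis` until the task finishes. Returns the elapsed time in ms.
    static long runAndMonitor(Runnable task, long intervalMillis) throws InterruptedException {
        if (intervalMillis <= 0) {
            // join(0) would block until the worker ends, so nothing would be printed.
            throw new IllegalArgumentException("intervalMillis must be positive");
        }
        long start = System.currentTimeMillis();
        Thread worker = new Thread(task, "json-reader");
        worker.start();
        while (worker.isAlive()) {
            System.out.println("Memory used: " + usedMemoryMb());
            // Waits up to one interval, returning early if the worker finishes.
            worker.join(intervalMillis);
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        // Placeholder workload; the tutorial's test would call readLargeJson(path) here.
        Runnable task = () -> {
            try {
                Thread.sleep(300);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        long elapsed = runAndMonitor(task, 100);
        System.out.println("total time " + elapsed);
    }
}
```

Note that the figure reported this way includes objects that are no longer referenced but have not yet been garbage collected, which is one reason the readings can rise and fall between samples.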

Summary

In this quick tutorial we learned how to parse a very large JSON file in a memory-efficient way to avoid an OutOfMemoryError. We used the Gson API to stream a 400 MB JSON file and iteratively converted its records into Java objects.

To use the Jackson API to parse JSON objects, please visit Read JSON Strings into Java Objects with Jackson API.