Parse Large JSON Files in Java using Gson Streaming

A guide to parsing large JSON files into Java objects. Use Gson streaming to read a very large JSON file and convert it into objects.

Overview

Gson is a very popular API for parsing JSON strings into objects. Gson's fromJson method, when given a JSON string, reads the entire string and parses it into Java objects in one go: the string is first loaded into memory and then converted into an object. Large JSON documents can therefore cause an OutOfMemoryError. To avoid that, we can use Gson's streaming API to parse a large file piece by piece.

This tutorial uses Gson streaming to parse a 400 MB JSON file into Java objects without loading it entirely into memory. We will also monitor how much memory the program consumes. Before we do that, let's begin with a quick setup.

Write Large JSON File

First, we will create a 400 MB JSON file using a Java program. Here is a sample Person record in JSON form.

{ "name":"John", "age":31, "city":"New York" }

We will create a JSON array of 10 million person records and store it in a JSON file.

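import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// The sample record shown above; "target" is the path of the output file.
String person = "{ \"name\":\"John\", \"age\":31, \"city\":\"New York\" }";

// try-with-resources flushes and closes the writer and the underlying stream.
try (FileOutputStream fos = new FileOutputStream(target);
     OutputStreamWriter ow = new OutputStreamWriter(fos, StandardCharsets.UTF_8)) {
    ow.write("[");
    for (int i = 0; i < 10000000; i++) {
        if (i != 0) {
            ow.write(","); // separate records with commas
        }
        ow.write(person);
    }
    ow.write("]");
}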

We use an OutputStreamWriter to write each person record to the file. Remember to close the writer and the underlying stream when done, for example with a try-with-resources block.
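For readers who want to run the generator as is, it could be packaged as a small standalone class. This is only a sketch: the class name PersonFileWriter, the writePersons method, and the default file name persons.json are assumptions made for illustration, and it swaps the OutputStreamWriter for Files.newBufferedWriter.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: writes a JSON array of identical person records. */
final class PersonFileWriter {

    // The sample record from the article, written without extra whitespace.
    static final String PERSON =
            "{\"name\":\"John\",\"age\":31,\"city\":\"New York\"}";

    static void writePersons(Path target, int count) throws IOException {
        // The buffered writer is flushed and closed by try-with-resources.
        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write("[");
            for (int i = 0; i < count; i++) {
                if (i != 0) {
                    writer.write(","); // comma between records, none after the last
                }
                writer.write(PERSON);
            }
            writer.write("]");
        }
    }

    public static void main(String[] args) throws IOException {
        // Output path is taken from the first argument, if one is given.
        Path target = Path.of(args.length > 0 ? args[0] : "persons.json");
        writePersons(target, 10_000_000);
        System.out.println("Wrote " + Files.size(target) + " bytes to " + target);
    }
}
```

Taking the record count as a parameter makes it easy to produce a small file first and check that it is valid JSON before generating the full-size one.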

Read Large JSON File by Streaming

Now that our input JSON file is ready, we will stream it and convert each record into a Java object using Gson's memory-efficient streaming API.

To stream a JSON file, Gson provides the JsonReader class. With it, we can walk through JSON objects and nested objects one token at a time and convert them into Java objects iteratively.

The following example reads the large JSON file, iterates through its contents, and parses them into objects.

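import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

private void readLargeJson(String path) throws IOException {
    Gson gson = new Gson(); // create once and reuse for every record
    try (
        InputStream inputStream = Files.newInputStream(Path.of(path));
        JsonReader reader = new JsonReader(
            new InputStreamReader(inputStream, StandardCharsets.UTF_8));
    ) {
        reader.beginArray(); // consume the opening '['
        while (reader.hasNext()) {
            // reads exactly one array element from the stream
            Person person = gson.fromJson(reader, Person.class);
            //System.out.println(person);
        }
        reader.endArray(); // consume the closing ']'
    }
}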

First, we create an InputStream on the file and wrap it in an InputStreamReader. Next, we pass that reader to a JsonReader and use it to parse the JSON file.

As we are dealing with a JSON array of person objects, we call beginArray() to consume the opening bracket of the array. Then we iterate through all the elements of the array and convert each of them into a new Person object. Finally, we call endArray() to consume the closing bracket. The try-with-resources block closes the stream and the reader automatically.
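The Person class itself is not shown in this tutorial. A minimal sketch, assuming its fields are named after the keys of the sample record (the getters and toString() are illustrative additions), could look like this:

```java
/** Hypothetical Person type; field names mirror the keys of the sample record. */
final class Person {
    private String name;
    private int age;
    private String city;

    String getName() { return name; }
    int getAge() { return age; }
    String getCity() { return city; }

    @Override
    public String toString() {
        return "Person{name='" + name + "', age=" + age + ", city='" + city + "'}";
    }
}
```

Gson populates the private fields through reflection, so this sketch needs neither setters nor an explicit constructor.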

Similarly, if you are dealing with a large JSON object, you can call beginObject() to step into it and read its fields, including nested objects and arrays, one at a time.
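As an illustration of the object case, the sketch below streams an array that sits inside a top-level JSON object. The document layout (a "persons" field next to other fields), the class name NestedArrayReader, and the countPersons method are assumptions made for this example. A nested Person class is included so that the block compiles on its own, and a StringReader stands in for the file.

```java
import com.google.gson.Gson;
import com.google.gson.stream.JsonReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical example: streams an array nested inside a top-level object. */
final class NestedArrayReader {

    static final class Person {
        String name;
        int age;
        String city;
    }

    /** Returns the number of records read from the "persons" array. */
    static int countPersons(Reader source) throws IOException {
        Gson gson = new Gson();
        int count = 0;
        try (JsonReader reader = new JsonReader(source)) {
            reader.beginObject();                // consume '{'
            while (reader.hasNext()) {
                String field = reader.nextName();
                if (field.equals("persons")) {
                    reader.beginArray();         // consume '['
                    while (reader.hasNext()) {
                        // One record at a time; process it here as needed.
                        Person person = gson.fromJson(reader, Person.class);
                        count++;
                    }
                    reader.endArray();           // consume ']'
                } else {
                    reader.skipValue();          // skip fields we do not need
                }
            }
            reader.endObject();                  // consume '}'
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        String json = "{\"source\":\"demo\",\"persons\":["
                + "{\"name\":\"John\",\"age\":31,\"city\":\"New York\"},"
                + "{\"name\":\"Jane\",\"age\":28,\"city\":\"Boston\"}]}";
        System.out.println(countPersons(new StringReader(json)));
    }
}
```

Calling skipValue() on fields that are not needed lets the reader move past them, however large they are, without building objects for them.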

Testing

Now we will stream the 400 MB JSON file that we created. We run the readLargeJson() method in a separate thread, while the main thread prints the amount of memory used, in MB, to the console at a fixed interval.

Source File Size 400
Memory used: 9
Memory used: 139
Memory used: 112
Memory used: 122
Memory used: 96
Memory used: 121
Memory used: 150
Memory used: 82
total time 35023

The output shows that memory usage stayed between 9 MB and 150 MB throughout the run, well below the 400 MB size of the file, which indicates that the entire JSON file was never held in memory at once.
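The test harness is not listed in this tutorial. One possible way to set up the monitoring described above is sketched here. The class name MemoryMonitor and its methods are assumptions, and the sampling approach (total heap minus free heap, as reported by Runtime) is one common choice rather than necessarily the one used to produce the output above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical harness: runs a task on a worker thread and samples heap usage. */
final class MemoryMonitor {

    /** Heap currently in use by this JVM, in megabytes. */
    static long usedMemoryMb() {
        Runtime runtime = Runtime.getRuntime();
        return (runtime.totalMemory() - runtime.freeMemory()) / (1024 * 1024);
    }

    /**
     * Runs the task on a separate thread and records the used memory every
     * intervalMillis (must be greater than zero) until the task finishes.
     */
    static List<Long> runAndMonitor(Runnable task, long intervalMillis)
            throws InterruptedException {
        Thread worker = new Thread(task, "json-reader");
        List<Long> samples = new ArrayList<>();
        long start = System.currentTimeMillis();
        worker.start();
        do {
            long used = usedMemoryMb();
            samples.add(used);
            System.out.println("Memory used: " + used);
            worker.join(intervalMillis); // wait at most one interval for the task
        } while (worker.isAlive());
        System.out.println("total time " + (System.currentTimeMillis() - start));
        return samples;
    }
}
```

To use it, wrap the call to readLargeJson() in a Runnable that handles the IOException and pass it to runAndMonitor() together with the sampling interval. Note that the figure reported this way includes objects that are already unreachable but not yet garbage collected, so individual readings can overstate what the parser actually needs.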

Summary

In this quick tutorial, we learned how to parse a very large JSON file in a memory-efficient way to avoid an OutOfMemoryError. We used the Gson streaming API to read a 400 MB JSON file and iteratively convert it into Java objects.

To parse JSON objects with the Jackson API instead, please visit Read JSON Strings into Java Objects with Jackson API.