Fast, auto-generated streaming JSON parsing for Android

Instagram Engineering
Jan 24, 2015

The Instagram engineering team is constantly looking at how we can improve the speed, reliability and overall performance of the app. One thing we wanted to improve is how quickly people using the Android app can view “News” in their Instagram feed, the place you go to see when people tag you, like or comment on one of your posts, or to see what the people you follow are up to.

We started with Android after facing some special challenges on the platform when converting data between formats. Large amounts of unstructured data need to be parsed into meaningful constructs in an efficient manner. So, earlier this summer we developed a tool to automate this process, ig-json-parser, and today we’re happy to announce that we’re making it available to the open source community.

But first, a little background on why we’re doing this…

JSON is a prevalent data format for internet applications, and as such, serialization and deserialization to native objects on each platform is important. It is vital for these transformations to be correct, and desirable for them to be fast and efficient. On mobile platforms such as Android, there are additional challenges such as Dalvik-specific issues and extreme memory pressure. Jackson ObjectMapper provides a mechanism that achieves most of these objectives, but comes up short in a few areas. First, ObjectMapper incurs a significant one-time penalty the first time it encounters a type of object it has not previously serialized or deserialized. This penalty is especially onerous when it affects the startup time of a mobile application. Second, ObjectMapper performs a lot of memory allocations, which can stress the garbage collector. Finally, it retains a large memory footprint even after the operations are complete.

Jackson stream parsing offers an alternative that solves all of the issues that ObjectMapper presents. Unfortunately, it introduces a fatal flaw of its own: stream parsing requires a lot of repetitive handwritten code, which is burdensome to write and prone to mistakes. Because of this, we have traditionally used stream parsers only in the limited areas where ObjectMapper would significantly degrade performance.
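
To give a sense of the handwritten code involved, here is a minimal sketch of a Jackson stream parser for one small model. It assumes Jackson 2.x (jackson-core) is on the classpath. The Story class and its field names (id, text, like_count) are hypothetical, chosen only for illustration, and are not Instagram's actual data model.

```java
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import java.io.IOException;

final class StoryStreamParser {

    /** Hypothetical model class, for illustration only. */
    static final class Story {
        String id;
        String text;
        int likeCount;
    }

    // A JsonFactory is thread-safe and meant to be reused across parsers.
    private static final JsonFactory FACTORY = new JsonFactory();

    static Story parse(String json) throws IOException {
        try (JsonParser parser = FACTORY.createParser(json)) {
            if (parser.nextToken() != JsonToken.START_OBJECT) {
                throw new IOException("Expected the start of a JSON object");
            }
            Story story = new Story();
            // Inside an object, tokens alternate FIELD_NAME, value, ... until END_OBJECT.
            while (parser.nextToken() != JsonToken.END_OBJECT) {
                String fieldName = parser.getCurrentName();
                parser.nextToken(); // move from the field name to its value
                if ("id".equals(fieldName)) {
                    story.id = parser.getValueAsString();
                } else if ("text".equals(fieldName)) {
                    story.text = parser.getValueAsString();
                } else if ("like_count".equals(fieldName)) {
                    story.likeCount = parser.getValueAsInt();
                } else {
                    // Unknown field: skip it, including any nested object or array.
                    parser.skipChildren();
                }
            }
            return story;
        }
    }
}
```

Every field added to the model needs another branch in this loop, plus a matching line in the serializer, and the same pattern has to be repeated for every model class. That is the repetition, and the room for mistakes, described above.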

The ideal solution is to automatically generate the stream parsing code, which is what ig-json-parser does. It is a compile-time processor (known as a JSR-269 annotation processor) that takes model classes annotated with a few bits of data, and generates a stream parser. The mechanism is highly flexible, allowing users to plug in their own code snippets to customize the generated code.

The performance characteristics of the autogenerated parser are excellent. In the tables below, the test data is a list of Instagram stories.

On cold start, the benchmark yields:

On subsequent iterations of parsing the data, the gap is significantly smaller:

At the end of the day, delivering the best possible experience for people is what’s most important to us, and ig-json-parser has helped us do just that. We’re excited about the opportunities the tool presents and look forward to seeing how the community will use it within their own organizations.