In the last few months, we have been developing iPhone / iPad apps that frequently pull JSON data from a web API. We used Ruby on Rails to build the first version of both the website and the API service. After trying the first version, customers complained about loading speed. They said that data loading on the mobile app was SLOW. There are two reasons the Rails-based API calls were slow:
- (1) Rails is heavy and blocking (after an API call we do some logging, and the system shouldn't have to wait for the logging to finish before returning data to clients)
- (2) We need to join four SQL tables to query the data, combine the results, then convert them to JSON
Let's optimize them!
Solutions for (1) are:
The non-blocking approach was chosen because it's fast, it handles more requests, and it helps avoid managing additional job queues.
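To make the non-blocking idea concrete, here is a minimal sketch using Python's `asyncio`. It is not the stack used in the project; the names `write_log`, `handle_request` and the in-memory `LOG` list are hypothetical stand-ins. The point is only that the handler returns data to the client without waiting for the logging to finish.

```python
import asyncio

LOG = []  # hypothetical stand-in for the real log store


async def write_log(entry):
    """Simulate slow logging I/O (the 50 ms delay is an assumption)."""
    await asyncio.sleep(0.05)
    LOG.append(entry)


async def handle_request(category, month, background):
    """Build the response and schedule logging without awaiting it."""
    data = {"category": category, "month": month, "items": []}
    task = asyncio.create_task(write_log(f"served {category}/{month}"))
    # Keep a reference so the task is not garbage-collected early.
    background.add(task)
    task.add_done_callback(background.discard)
    return data  # returned before write_log has run


async def main():
    background = set()
    response = await handle_request("C", "M1", background)
    logged_before = len(LOG)  # logging has not completed yet
    await asyncio.gather(*background)  # drain pending logs before exit
    return response, logged_before, len(LOG)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Because logging runs as a background task in the same process, no separate job queue is needed for it.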
For (2), the first step is to de-normalize the data and pre-calculate the JSON. But there is one table that cannot be de-normalized using SQL: the listings table, which states which item will be shown in which month and in which category. Let's say item I will be shown in category C in months M1 and M2; then two rows, (I, C, M1) and (I, C, M2), must exist in the listings table. After de-normalization and JSON pre-calculation, we only need to join two tables to get the items for a given category and month, then combine the pre-calculated item JSON and return it to clients.
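The two-table version can be sketched with Python's built-in `sqlite3`. The table and column names (`items`, `item_json`, `listings`) and the sample rows are assumptions made for illustration, not the project's real schema. Each item row carries its pre-calculated JSON, so answering a request is one join plus string concatenation, with no per-request serialization.

```python
import json
import sqlite3


def build_db():
    """Create an in-memory database with hypothetical sample data."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, item_json TEXT NOT NULL)")
    db.execute("CREATE TABLE listings (item_id INTEGER, category TEXT, month TEXT)")
    db.execute("CREATE INDEX idx_listings ON listings (category, month)")
    items = [(1, {"id": 1, "name": "I"}), (2, {"id": 2, "name": "J"})]
    # JSON is serialized once, at write time, not on every API call.
    db.executemany(
        "INSERT INTO items VALUES (?, ?)",
        [(item_id, json.dumps(doc)) for item_id, doc in items],
    )
    db.executemany(
        "INSERT INTO listings VALUES (?, ?, ?)",
        [(1, "C", "M1"), (1, "C", "M2"), (2, "C", "M2")],
    )
    return db


def items_json(db, category, month):
    """Join the two tables and glue the stored JSON strings into an array."""
    rows = db.execute(
        "SELECT i.item_json FROM listings l "
        "JOIN items i ON i.id = l.item_id "
        "WHERE l.category = ? AND l.month = ? ORDER BY i.id",
        (category, month),
    ).fetchall()
    return "[" + ",".join(row[0] for row in rows) + "]"


if __name__ == "__main__":
    print(items_json(build_db(), "C", "M2"))
```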
The second step for (2) is choosing a database that is faster and better suited to de-normalizing and querying the data. Key-value stores are super fast but only support querying data by key. Distributed / Map-Reduce stores (Riak, HBase ..) are overkill. In this case, MongoDB seems to be a perfect fit. MongoDB is very fast (memory mapping), has a flexible data structure, and offers rich and fast queries (indexing). In MongoDB, the listings can be stored in the item document itself as an array of [category, month] pairs, which can then be indexed for fast querying.
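The document shape described above can be illustrated in plain Python, without a MongoDB server. This is only a simulation: the field names (`_id`, `json`, `listings`) and the sample documents are assumptions, and `build_index` merely mimics what an index over the array of pairs provides, namely a direct lookup from a (category, month) pair to the matching items.

```python
from collections import defaultdict

# Hypothetical item documents: each embeds its own listings as
# an array of [category, month] pairs, plus pre-calculated JSON.
ITEMS = [
    {"_id": 1, "json": '{"id":1,"name":"I"}', "listings": [["C", "M1"], ["C", "M2"]]},
    {"_id": 2, "json": '{"id":2,"name":"J"}', "listings": [["C", "M2"]]},
]


def build_index(items):
    """Map every (category, month) pair to the ids of items listing it."""
    index = defaultdict(list)
    for doc in items:
        for category, month in doc["listings"]:
            index[(category, month)].append(doc["_id"])
    return index


def find_items(items, index, category, month):
    """Look up items by pair through the index, with no table join."""
    by_id = {doc["_id"]: doc for doc in items}
    return [by_id[item_id] for item_id in index.get((category, month), [])]


if __name__ == "__main__":
    idx = build_index(ITEMS)
    print([doc["json"] for doc in find_items(ITEMS, idx, "C", "M2")])
```

With the listings embedded in the item, the separate listings table and the remaining join both disappear.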
The third step for (2) is minimizing the data size of each item. MongoDB tries to keep indexes and recently used data in memory, so a smaller data size means less memory is needed. A smaller data size also means faster data transfer between the database and the app instances. To minimize data size I use the following tricks:
After a lot of micro-optimization, item data size was reduced to one third (33%) of its original size.
The final result is amazing. On the server side, API queries are 20 times faster and often take less than 10ms. Everybody is happy :)