Schema Design Tactics: Grouping Small Entities for Greater Cache Efficiency


This is a fairly standard pattern that applies to most databases, including MongoDB, but I think it is worth mentioning. Imagine you have a large number of small documents/records, perhaps 100 bytes each, but in any case smaller than the page size the database uses for caching. The problem is that if one such document is hot, its entire page is held in the page cache, and that can be wasteful when the other objects in the page are not frequently used.

As an example, consider the diagram above. Each black rectangle represents one page in the page cache, which might be 4 KB. Suppose each object/record is 512 bytes, so eight fit per page.
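To put rough numbers on this, here is a small Python sketch of the arithmetic. It uses the 4 KB page and 512-byte record from the example. The function names, the count of 1,000 hot records, and the worst-case assumption that hot records land one per page are illustrative only and do not describe how any particular database lays out its data.

```python
PAGE_SIZE = 4 * 1024   # 4 KB page, as in the example
RECORD_SIZE = 512      # bytes per record, as in the example


def records_per_page(page_size: int = PAGE_SIZE, record_size: int = RECORD_SIZE) -> int:
    """Number of whole records that fit in one page."""
    return page_size // record_size


def useful_fraction(hot_records_on_page: int,
                    page_size: int = PAGE_SIZE,
                    record_size: int = RECORD_SIZE) -> float:
    """Fraction of a cached page occupied by records that are actually hot."""
    capacity = records_per_page(page_size, record_size)
    if not 0 <= hot_records_on_page <= capacity:
        raise ValueError("hot_records_on_page must be between 0 and the page capacity")
    return hot_records_on_page * record_size / page_size


def pages_cached(hot_records: int, hot_per_page: int) -> int:
    """Pages that must stay cached to hold all hot records (ceiling division)."""
    return -(-hot_records // hot_per_page)


if __name__ == "__main__":
    print(records_per_page())       # 8 records fit in one page
    print(useful_fraction(1))       # 0.125: one hot record, seven cold neighbours
    print(useful_fraction(8))       # 1.0: every record on the page is hot
    # Hypothetical workload: 1,000 hot records.
    print(pages_cached(1000, 1))    # 1000 pages if hot records are scattered one per page
    print(pages_cached(1000, 8))    # 125 pages if hot records are grouped together
```

Under these assumptions, scattering the hot records costs eight times as much cache as grouping them, which is the waste the grouping tactic is meant to avoid.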

