Efficient Techniques for Fuzzy and Partial Matching in MongoDB

This blog post describes a number of techniques for efficiently finding documents in MongoDB that have a number of attributes similar to those in a supplied query without being an exact match. This kind of "fuzzy" searching helps users avoid missing important information because of slight differences in how it was entered.

Where users enter data items manually, rather than choosing from a list of options, there is a reasonable probability that different users will enter the same data in different forms. Names may contain misspellings or typographic errors; telephone numbers may be entered with or without spaces or international prefixes. When typing addresses, people may enter sub-items in the wrong form boxes.

In an application that is aware of these issues, the data entry interface can be made somewhat smarter: telephone numbers can have an enforced format, and addresses can be verified against a geo database. Unfortunately, other data elements such as names have no such safeguards and are therefore frequently entered in multiple formats.

