Search is a must-have for any website or application. A simple search widget can allow users to comb through your entire blog. Or allow customers to browse your inventory. Building a custom photo gallery? Add a search box. Website search functionality is available from a variety of third-party vendors. Or you can take the DIY approach and build the entire backend to answer search API calls.

Lunr.js works on the client side via JavaScript. Instead of sending calls to a backend, Lunr looks up search terms in an index built in the browser itself. This avoids expensive back-and-forth network calls between the browser and your server. There are plenty of tutorials online to showcase Lunr’s website search functionality. But you can actually use Lunr.js to search any array of JavaScript objects.

In this how-to, I’ll build a search index for the top 100 books of all time. After that, I’ll show you how to pre-build the index so it loads faster. I’ll also show you how to make the most of Lunr’s search options. And finally, I’ll show off findmymastodon.com, a real-world implementation of Lunr.

Getting started with Lunr.js

Create a new HTML page called lunr.html. We’ll use this file throughout this guide. At the top of lunr.html, load the main Lunr library.

<script src="https://unpkg.com/lunr/lunr.js"></script>


Note: You can find the complete code here

Loading the dataset

Next, create a variable called my_big_json. This variable will contain our main dataset as a JSON string. Define the variable in lunr.html within <script> tags.

var my_big_json = '[{"author":"Chinua Achebe","country":"Nigeria","imageLink":"images/things-fall-apart.jpg","language":"English","link":"en.wikipedia.org/wiki/Things_Fall_Apart","pages":209,"title":"Things Fall Apart","year":1958}]';

(Note: JSON truncated. Please see source for full JSON.)

We need to parse this data as JSON so JavaScript can process it.

my_big_json = JSON.parse(my_big_json);

for (var i = 0; i < my_big_json.length; i++) {

    console.log(i + 1, " -> ", my_big_json[i]);
}


To ensure that the loading process was successful, I’m iterating over the JSON dataset and printing each entry. You should see the following in your console log.

[Screenshot: console output after loading the Lunr search dataset into JSON in JavaScript]

Building the search index

Now we will build the search index in a variable called idx using the lunr() function. This step requires 3 things:

  1. Lunr returns a document reference for every document that matches a search query. And we need to tell Lunr which field in our dataset should be the reference. Usually this reference is a numeric ID unique to every document. Since our dataset does not contain such a field I’ll use the link field as the reference field.

  2. Lunr also requires the list of fields which should be part of the search index. For this example, I want to search on author, title, and country.

  3. Lastly, Lunr requires our dataset… the my_big_json variable.
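If your own dataset has no field that is unique per document, one option is to add a numeric ID before indexing. The sketch below is only an illustration: the helper name withIds and the id field are assumptions, not part of Lunr or of this tutorial's dataset.

```javascript
// Hypothetical helper: return copies of the documents with a numeric `id`
// field based on array position. The original objects are not modified.
// If a document already has its own `id`, that value is kept.
function withIds(docs) {
    return docs.map(function (doc, i) {
        return Object.assign({ id: i }, doc);
    });
}

var sample = withIds([
    { title: 'Things Fall Apart' },
    { title: 'Ramayana' }
]);
console.log(sample[1].id); // 1
```

You would then pass 'id' to this.ref() instead of 'link'. Keep in mind that Lunr hands references back as strings, so compare them accordingly when retrieving documents.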

We can now call the lunr() function to build our search index idx.

var idx = lunr(function () {
 
    this.ref('link')
    this.field('author')
    this.field('title')
    this.field('country')

    my_big_json.forEach(function (doc) {
        this.add(doc)
    }, this)
});


If there were no errors in your browser’s console log then our index build was successful. Let’s take it for a test drive.

Looking for books in our dataset with Lunr

Let’s first see if there are any documents containing the word adventures:

var results = idx.search("adventures");
console.log('Results: ', results.length);

We should see the following in our console log:

Results:  1


Alright. So there is 1 document that matches the search query adventures. But how do we display it?

As I mentioned before, Lunr returns the reference of the matching document. But not the document itself. Meaning if we print results we’ll see the following.

[Screenshot: the results array from searching the best books of all time with Lunr, printed in the console]

Here results is an array with a single element. And that element’s ref field holds the book’s link. This is because we used the link field as the reference. To show the full document, we’ll need to work a bit harder.

results = idx.search("adventures");
console.log('Results: ', results.length);

var results_full = results.map(function (item) {
    // Find the document whose link matches the returned reference.
    return my_big_json.find(function (value) {
        return value.link === item.ref;
    });
});
console.log(results_full);


In the above code, we’re iterating through the results using the map() function. For each reference in the result set, we then find the matching document in my_big_json. So results_full contains the complete search results.

0: Object { author: "Mark Twain", country: "United States", imageLink: "images/the-adventures-of-huckleberry-finn.jpg", … }
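The lookup above scans the whole dataset once for every search result. That is fine for 100 books, but for larger datasets one alternative is to index the documents by their reference field up front. Here is a minimal sketch of that idea. The helper names indexByRef and resolveResults are my own and not part of Lunr.

```javascript
// Hypothetical helper: build a Map from reference value to document.
// Keys are stored as strings because Lunr returns references as strings.
function indexByRef(docs, refField) {
    var byRef = new Map();
    docs.forEach(function (doc) {
        byRef.set(String(doc[refField]), doc);
    });
    return byRef;
}

// Hypothetical helper: turn Lunr-style results ([{ ref: ... }, ...])
// into the full documents they point to.
function resolveResults(results, byRef) {
    return results.map(function (item) {
        return byRef.get(item.ref);
    });
}

// Usage with this tutorial's variables:
//   var booksByLink = indexByRef(my_big_json, 'link');
//   var results_full = resolveResults(idx.search('adventures'), booksByLink);
```

The Map is built once when the page loads, and every search after that resolves its references without re-scanning the dataset.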


Let’s repeat the search for the keyword india. The code is as follows.

results = idx.search("india");
console.log('Results: ', results.length);

var results_full = results.map(function (item) {
    // Find the document whose link matches the returned reference.
    return my_big_json.find(function (value) {
        return value.link === item.ref;
    });
});
console.log(results_full);


And the results are:

Results:  4
0: Object { author: "Kālidāsa", country: "India", imageLink: "images/the-recognition-of-shakuntala.jpg", … }
1: Object { author: "Valmiki", country: "India", imageLink: "images/ramayana.jpg", … }
2: Object { author: "Vyasa", country: "India", imageLink: "images/the-mahab-harata.jpg", … }
3: Object { author: "Salman Rushdie", country: "United Kingdom, India", imageLink: "images/midnights-children.jpg", … }


And it’s that simple! Adding search for any array of JSON objects only requires 5 easy steps:

  1. Call Lunr.js.

  2. Identify the reference fields and the search fields.

  3. Build the search index by iterating over the dataset.

  4. Call the search() method to search the index and return the matching references.

  5. And finally, retrieve the documents for the matching references.

Prebuilding Lunr.js’s search index

You may have noticed that the search index takes some time to build on every page refresh. The time may be imperceptible right now. But it grows with the size of the dataset, and every visitor’s browser has to repeat the work. Lunr allows pre-building the search index to make the search more responsive.

There are 2 ways of pre-building the index. The first method is to serialise the index after building it. Since we’ve already created the index in our tutorial we’ll use this method.

var serializedIdx = JSON.stringify(idx);
console.log(serializedIdx);


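We would then ship the serialised string with the page instead of rebuilding the index on every load. Our index load command would also change. Note that my_big_json is still needed to turn the matching references back into full documents.

var idx = lunr.Index.load(JSON.parse(serializedIdx));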


The second method involves running the above commands from the command line, for example in a Node script. This method is great for running in a CI/CD pipeline. I’ll talk about this method more later on.
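As a rough idea of what such a build step could look like, here is a sketch of a function that builds the index and returns it as a JSON string. It reuses the fields from this tutorial. The function name buildSerializedIndex and the file names in the comments are assumptions for illustration, and lunr is passed in as a parameter so the function does not care how the library was loaded.

```javascript
// Hypothetical helper: build a Lunr index over `docs` and serialise it.
// `lunr` is the Lunr library itself, e.g. the result of require('lunr').
function buildSerializedIndex(lunr, docs) {
    var idx = lunr(function () {
        this.ref('link');
        this.field('author');
        this.field('title');
        this.field('country');

        docs.forEach(function (doc) {
            this.add(doc);
        }, this);
    });
    // JSON.stringify() serialises the index, as in the snippet above.
    return JSON.stringify(idx);
}

// Example wiring in a Node script (file names are assumptions):
//   var fs = require('fs');
//   var lunr = require('lunr');
//   var docs = JSON.parse(fs.readFileSync('books.json', 'utf8'));
//   fs.writeFileSync('books-index.json', buildSerializedIndex(lunr, docs));
```

The page would then download the generated file and pass its parsed contents to lunr.Index.load(), so the browser never has to build the index itself.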

Lunr.js search tips and tricks

Web search engines like DuckDuckGo provide a way to prioritise search terms. For example, a + sign will include a search term and - will exclude it.

Lunr provides similar features to prioritise search terms or search fields. A + prefix requires a term to be present and a - prefix excludes documents that contain it. The ~ symbol allows fuzzy matching by edit distance. The ^ symbol boosts the weight of a search term. And the : symbol restricts a term to a specific field.

For example, to find the best books published in England but not by Shakespeare, our Lunr query will be:

results = idx.search("country:England -author:Shakespeare");


This gives me 2 results:

0: {author: 'Geoffrey Chaucer', country: 'England', imageLink: 'images/the-canterbury-tales.jpg', language: 'English', link: 'https://en.wikipedia.org/wiki/The_Canterbury_Tales', …}
1: {author: 'Laurence Sterne', country: 'England', imageLink: 'images/the-life-and-opinions-of-tristram-shandy.jpg', language: 'English', link: 'https://en.wikipedia.org/wiki/The_Life_and_Opinions_of_Tristram_Shandy,_Gentleman', …}
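If you build queries from form inputs rather than typing them by hand, a small helper can assemble the operators described above. This is only a sketch: buildQuery and its clause format are hypothetical and not part of Lunr. It simply produces a string you can pass to idx.search().

```javascript
// Hypothetical helper: build a Lunr query string from a list of clauses.
// Each clause looks like:
//   { term: 'word', field: 'title', presence: '+', fuzzy: 1, boost: 10 }
// Only `term` is required.
function buildQuery(clauses) {
    return clauses.map(function (c) {
        var part = '';
        if (c.presence) part += c.presence;   // '+' must match, '-' must not
        if (c.field) part += c.field + ':';   // restrict the term to one field
        part += c.term;
        if (c.fuzzy) part += '~' + c.fuzzy;   // allowed edit distance
        if (c.boost) part += '^' + c.boost;   // weight relative to other terms
        return part;
    }).join(' ');
}

var query = buildQuery([
    { field: 'country', term: 'England' },
    { presence: '-', field: 'author', term: 'Shakespeare' }
]);
console.log(query); // country:England -author:Shakespeare
```

The output matches the query used above, so idx.search(query) should return the same two books.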

Lunr in action: findmymastodon.com!

findmymastodon.com helps users find, well, Mastodon instances. These instances often cater to specific languages, themes, and interests. And so the website needed a search engine.

I started by first building the dataset using Python. The dataset contains metadata for thousands of Mastodon instances currently live. I then used Node to create a serialised Lunr index from the original JSON. This Lunr index is then loaded as a static asset. Search executes against this index and the Mastodon instances, if found, display on the search page. You can browse the JavaScript source code here.

Conclusion

Lunr can prove to be a useful ally in webdev. It’s especially great for static websites which don’t rely on an active backend. Support for pre-built indices using CLI is a huge performance bonus. This allows using CI/CD to fetch data from a remote backend and create the index. Meaning the backend no longer needs to remain active to serve search requests.

Client-side search is a new thing for me. I’ve been a DevOps/cloud guy for a long time and I’m eager to find out how this can improve security and optimise costs. Web browsers are becoming more sophisticated every day. Letting them handle the heavy lifting for search might bring better performance, improved user experiences, and lower cloud costs.

Thanks for reading this how-to, and I hope it sparked some creative ideas in your head. Happy coding :)