I recently tackled ways to make a search experience more responsive. Autocomplete suggestions are a small but very impactful area to optimize. Upon digging into it, I found a rabbit hole of subtle optimizations to really make the search experience first-class.
The setup
Let's start with a dead-simple implementation.
<input type="search" placeholder="Type here" />
<pre><!-- we'll put autocomplete suggestions here --></pre>
I'm setting up a basic event listener on the input field which calls a function each time the input changes. This function is where we'll do the majority of our work.
const input = document.querySelector('input');
const results = document.querySelector('pre');
async function handleNewInput(query) {
// TODO: add our logic here
results.innerText = 'set suggestions here';
}
// listen for typing
input.addEventListener('input', e => handleNewInput(e.target.value));
Then we'll simulate an AJAX request which retrieves the suggestions for a given query.
// simulate an AJAX call to fetch results
function getSuggestions(query) {
return new Promise(resolve => {
// simulate a small network delay (in milliseconds)
const delay = 200;
window.setTimeout(() => {
const suggestions = [
`${query} one`,
`${query} two`,
`${query} three`,
].join("\n");
resolve(suggestions);
}, delay);
});
}
In reality, this would probably return a decoded JSON object, have more details, etc., but it's sufficient for this demo.
A naive implementation
Let's try the most obvious approach:
async function handleNewInput(query) {
results.innerText = await getSuggestions(query);
}
Seems to work pretty well!
But wait--our getSuggestions() function doesn't simulate network conditions very well. It returns a response in exactly 200ms. In reality, requests will have varying response times, and may even arrive out of order. So let's update our getSuggestions() simulation with something more realistic:
function getSuggestions(query) {
return new Promise(resolve => {
const delay = 50 + (Math.random() * 350); // 50ms to 400ms
window.setTimeout(() => {
// ...
}, delay);
})
}
A randomized delay between 50ms and 400ms simulates more turbulent network conditions. Our autocomplete needs to handle conditions like that, so let's use this for testing.
Here's how that works once we randomize the delay time. Try typing a few search terms in that field (type rapidly).
This implementation doesn't handle out-of-order responses very well. If I quickly type a search term like "beach ball", I might end up with suggestions for "beach bal", or whichever HTTP request finished last. Not great.
Enforcing strict ordering
We can solve this problem by giving each request a sequential order number, then remembering the number of the last request whose response we've used. Any response we receive that belongs to an earlier request than that is silently dropped.
So say we issue requests #1, #2, and #3.
- We start with a lastReqSeen variable initialized to 0.
- A response from #2 comes back first. Since 2 is greater than our lastReqSeen of 0, we set lastReqSeen to 2 and display the results.
- Then we get a response from #1. Since 1 is less than our lastReqSeen of 2, we silently drop it and do nothing with the results.
- Then we get #3's response back. It's greater than our lastReqSeen of 2, so we update lastReqSeen to 3 and display the results.
- And so on and so forth.
Here's how we might implement that. We'll need to modify getSuggestions() so that it takes a request number argument along with the query, and we'll need two persistent variables: one tracking the last request number that's completed, and one holding the order number we'll assign to the next request we fire.
let lastReqSeen = -1; // remember the last request that we've seen
let nextReqNumber = 0; // the order number that we'll give the next request
async function handleNewInput(query) {
// send off the request and wait for a response
const {suggestions, reqNum} = await getSuggestions(query, nextReqNumber++);
// if we saw another later request already, drop this result
if (reqNum <= lastReqSeen) {
return;
}
// otherwise, use it
results.innerText = suggestions;
lastReqSeen = reqNum; // and remember this request as the last one seen
}
function getSuggestions(query, reqNum) { // capture the request number
return new Promise(resolve => {
// ...
window.setTimeout(() => {
// ...
resolve({ suggestions, reqNum }); // include the request number
}, delay);
})
}
That gives us this:
That's much better! Results no longer appear out of order, and assuming that all of the autocomplete HTTP requests return successfully, I always see suggestions for my full search term eventually (no "beach bal" suggestions when I type "beach ball").
But there are still some problems here. The suggestions change with every single character, too fast to read. And this generates an HTTP request for each and every character typed, which is unnecessary additional load on our search backend.
Slow things down
So let's limit the rate of requests we send to the server. One common and easy way to do this is debouncing. We'll use Lodash's _.debounce() function for this, with a noticeable but short delay like 300ms. Let's wrap the input event handler with _.debounce():
input.addEventListener(
'input',
_.debounce(e => handleNewInput(e.target.value), 300)
);
Try typing something quickly and then pausing for a moment. You'll see results show up shortly after you pause.
Our HTTP requests are now issued much less frequently, which minimizes server load. But it's not great: users who type fast will finish typing their whole query before they ever see our suggestions.
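For intuition, here's a minimal homegrown debounce. This is a simplified sketch of roughly what _.debounce() does, not Lodash's actual implementation (which supports extra options):

```javascript
// A minimal debounce sketch: the wrapped function only fires after
// `wait` milliseconds with no new calls.
function debounce(fn, wait) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer); // every new call cancels the previously scheduled one
    timer = setTimeout(() => fn.apply(this, args), wait);
  };
}

// Three rapid calls collapse into a single invocation after the pause:
let calls = 0;
const update = debounce(() => calls++, 50);
update(); update(); update();
```

This is exactly why fast typists see nothing until they pause: the timer keeps resetting while input keeps arriving.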
Keep the flow going
A better solution to this would be throttling. Instead of waiting for a pause in the input, we can continually send autocomplete queries at a regular pace while the user is typing, while still limiting the overall rate of requests. This minimizes backend load and keeps the suggestions from changing too quickly.
Notice that this is how Google's auto-suggest works (as of right now); if you type fast, you only get updated suggestions every so often.
Throttling is similar to debouncing, except:
- It invokes the handler immediately on the first trigger (the leading edge), instead of waiting for a pause.
- New triggers don't reset the timer; a trigger arriving during the wait window simply replaces the pending trailing call with its own.
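For intuition, here's a minimal homegrown throttle with leading and trailing calls enabled. Again, this is a simplified sketch of roughly what _.throttle() does, not Lodash's actual implementation:

```javascript
// A minimal throttle sketch: fire immediately on the leading edge, then at
// most once per `wait` ms, with a trailing call using the latest arguments.
function throttle(fn, wait) {
  let lastCall = 0;
  let timer = null;
  let lastArgs = null;
  return function (...args) {
    const now = Date.now();
    const remaining = wait - (now - lastCall);
    lastArgs = args; // always remember the latest arguments
    if (remaining <= 0) {
      lastCall = now;
      fn.apply(this, args); // leading edge: fire immediately
    } else if (!timer) {
      timer = setTimeout(() => { // trailing edge: fire once the window closes
        timer = null;
        lastCall = Date.now();
        fn.apply(this, lastArgs);
      }, remaining);
    }
  };
}

// Three rapid calls: the first fires immediately, the third fires
// as the trailing call, and the second is superseded.
const received = [];
const search = throttle(q => received.push(q), 100);
search('b'); search('be'); search('bea');
```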
Lodash has a convenient _.throttle() function that we can use. Let's replace _.debounce() with _.throttle() and see how that works:
input.addEventListener(
'input',
_.throttle(e => handleNewInput(e.target.value), 300)
);
Now requests are sent at most once every 300ms, even if the user is typing quickly.
Beautiful! We regularly get updated results, those results stay around long enough to read at a glance, and we're minimizing our server load.
Is there a better way?
But wait. Our code here is messy and a little hard to follow. We've got global variables, we're relying on a specific function instance to stick around (the result of _.throttle()), and any further adjustments would junk things up even more.
There are definitely ways to manage this by encapsulating our vars, etc. But we can simplify a lot of this manual work using an event-based library like RxJS.
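For illustration, one hypothetical way to encapsulate that state is a factory function. The names createAutocomplete, fetchFn, and render are invented for this sketch:

```javascript
// Encapsulates the request-ordering state in a closure instead of globals.
function createAutocomplete(fetchFn, render) {
  let lastReqSeen = -1;  // last request number whose response we've used
  let nextReqNumber = 0; // order number for the next request
  return async function handleNewInput(query) {
    const reqNum = nextReqNumber++; // capture the number before awaiting
    const suggestions = await fetchFn(query);
    if (reqNum <= lastReqSeen) return; // a later request already rendered
    lastReqSeen = reqNum;
    render(suggestions);
  };
}
```

Capturing reqNum before the await also means we no longer have to thread the request number through getSuggestions() itself.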
Enter Observables
An observable is effectively an object which emits values synchronously or asynchronously. It's a similar abstraction to Promises, which let us reason more simply about one-time asynchronous operations. But it generalizes that concept and extends it to cover multiple separate emissions.
Observables are nothing without operators. Operators allow us to manipulate, transform, combine, and do all manner of operations on the emissions of observables.
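To make that concrete, here's a toy sketch of the idea. The names fromArray and map are invented for this illustration; this is not the real RxJS API, just the shape of the abstraction:

```javascript
// A toy observable: an object that pushes values to a subscriber.
function fromArray(values) {
  return {
    subscribe(next) {
      values.forEach(next); // emit each value, synchronously in this toy case
    },
  };
}

// A toy operator: takes an observable, returns a new observable
// whose emissions are transformed.
function map(source, fn) {
  return {
    subscribe(next) {
      source.subscribe(v => next(fn(v)));
    },
  };
}

const received = [];
map(fromArray([1, 2, 3]), n => n * 10).subscribe(v => received.push(v));
// received is now [10, 20, 30]
```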
Let's look at how we would implement the same search behavior using RxJS principles. We can drop Lodash, and just use RxJS:
const { fromEvent, from, asyncScheduler } = rxjs;
const { map, throttleTime, switchAll } = rxjs.operators;
const input = document.querySelector('input');
const results = document.querySelector('pre');
fromEvent(input, 'input') // #1
.pipe( // #2
throttleTime(300, asyncScheduler, { leading: true, trailing: true }), // #3
map(e => e.target.value), // #4
map(query => from(getSuggestions(query))), // #5
switchAll() // #6
).subscribe(suggestions => results.innerText = suggestions); // #7
Let's step through this:
1. Create an Observable instance which emits each input event.
2. Pass the emitted events through a chain of operators. This returns a new observable at the end.
3. As the first operation, throttle events to every 300ms, making sure to include both leading and trailing events. Ignore asyncScheduler; it's out of scope for this article.
4. Pull just the search query out of the input event.
5. Fetch the suggestions from the server, using from() to transform the request promise into an Observable. This is a little hard to follow: at this point, our pipeline is an observable emitting observables, each of which emits one suggestion result. RxJS calls this a "higher-order observable".
6. As the final operator, coalesce the higher-order observable back into a simple observable of suggestion results. The magic here is that switchAll() automatically gives later emissions precedence over earlier ones: if the second request finishes before the first, the first request is ignored when it completes.
7. Use the output of the final observable to set the contents of the suggestions element.
The downside here is that you need to understand the paradigms that RxJS uses. But once you've got that, the resulting code is far easier to reason about. We don't have any variables tracking intermediate state, and the code is simple enough that we don't even need to decompose it into smaller functions.
The result is functionally identical to our homegrown version:
Taking it further
There are several things that can be done to further optimize the user experience and performance of suggestions in the search bar.
- Specially handle empty string queries by immediately returning a "no suggestions" response. This prevents useless HTTP requests from being sent to the backend.
- Once you fetch suggestions for a query, cache those results client-side. If the user backspaces their entire query, you can display cached suggestions instead of re-sending those HTTP requests.
- Slap a cache layer or CDN in front of the autocomplete results endpoint if the results aren't personalized to the user.
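As an example of the caching idea, here's a hypothetical sketch. The names getCachedSuggestions and suggestionCache are invented, the stub getSuggestions stands in for a real fetch, and a production cache would also want an eviction policy:

```javascript
// Stub standing in for a real suggestions fetch (assumed to resolve to a string):
function getSuggestions(query) {
  return Promise.resolve(`${query} one\n${query} two\n${query} three`);
}

// A simple client-side cache keyed on the query string.
const suggestionCache = new Map();

async function getCachedSuggestions(query) {
  if (suggestionCache.has(query)) {
    return suggestionCache.get(query); // serve from cache, no network request
  }
  const suggestions = await getSuggestions(query);
  suggestionCache.set(query, suggestions);
  return suggestions;
}
```

With this in place, backspacing through a query replays cached results instantly instead of re-issuing HTTP requests.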