AI SEARCH PILOT

The smartest move was sounding less smart

ROLE

Content Design + Copy System

YEAR

2023

FORM

Short-form product copy

ORG

Traveloka

TEAM

Designers, PMs, Backend Engineers

STATUS

Shipped

NUMBERS

Directional: longer, more specific queries

Everyone treated this project as an AI problem, which is how you end up planning to run an LLM on every search a user makes. That gets expensive fast, and it gives the model too much credit. Most of what people type is a small set of predictable words. You can answer predictable words with copy rules, and save the model for the times someone actually says something hard.

The situation

Universal Search is the one search bar that has to handle everything Traveloka sells, hotels and flights and trains and the rest. The pilot wanted it to take real preferences, the specific things people care about when they travel.

Nobody was typing real preferences, though. The usual search was a product and a place, "hotel bali," done. People hadn't learned to ask for more because the old search had never given them a reason to. (Research did find one opportunity: users who'd written off Universal Search would try it again if it looked new. So novelty was at least a foot in the door.)

This was late 2023, peak everyone-needs-an-AI-story. We scoped a pilot instead of a full build because the compute was real money and nobody was sure the demand was there. Design, product, and engineering ran it together. I handled the UI copy and the copy-based rules underneath the search suggestions and query understanding.

The approach

An AI search bar is only as useful as what people are willing to type into it, and ours had two gaps underneath. Users barely typed ("hotel bali" and nothing else), and the system couldn't afford to run the model on every search just to find out what they meant. That sounds like a prompt problem and a cost problem. Both, it turned out, just needed copy solutions.

01
Sounding less smart on purpose
Calling it "smart search" or "AI-powered" was right there, and I didn't reach for it. Promising intelligence either scares off people who don't want a chatty search bar (based on our user testing), or writes a check the MVP can't cash.
So the feature just showed up as something fresher rather than something cleverer, riding the design system refresh we had going at the time. No mention of a model, just a search that suddenly looked worth poking at, tuned over a lot of copy tests until people actually poked.
02
Showing people what a good query looks like
People won't type what they don't know they're allowed to type, so the suggestions had to teach by example. I wrote the logic behind them: each suggestion either zoomed in on what you'd started typing or zoomed back out, the two directions people already move in when they search without thinking about it.
The wording mapped to whatever criteria fit, a facility, a price range, a neighborhood, so a suggestion read like a sentence someone might say instead of a filter someone might tick.
Rough first pass, but it held up enough to become the thing later iterations built on.
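
A minimal sketch of the zoom-in / zoom-out idea, in Python. Everything named here is hypothetical: the phrases, the area hierarchy, and the function names are invented for illustration, and the shipped logic was tuned through copy tests rather than hard-coded like this.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical zoom-in phrasings, one per kind of criterion
# (facility, price range, neighborhood).
ZOOM_IN_PHRASES = {
    "facility": "with a pool",
    "price": "under 1 million",
    "neighborhood": "near the beach",
}

# Hypothetical place hierarchy used to zoom out to a broader area.
PARENT_AREA = {"seminyak": "bali", "ubud": "bali"}


@dataclass
class Suggestion:
    text: str
    direction: str  # "in" narrows the query, "out" broadens it


def suggest(query: str) -> List[Suggestion]:
    """Build zoom-in suggestions, plus one zoom-out when a broader area is known."""
    words = query.lower().split()
    if not words:
        return []
    base = " ".join(words)
    # Zoom in: append a criterion phrased the way a person would say it.
    suggestions = [
        Suggestion(f"{base} {phrase}", "in")
        for phrase in ZOOM_IN_PHRASES.values()
    ]
    # Zoom out: swap any known place for its parent area.
    broader = [PARENT_AREA.get(word, word) for word in words]
    if broader != words:
        suggestions.append(Suggestion(" ".join(broader), "out"))
    return suggestions
```

Under these assumptions, `suggest("hotel seminyak")` offers three narrower phrasings and one broader one, and every suggestion is a full query a person could have typed, not a filter label.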
03
Rules where rules were enough
Engineering had figured the model would parse every query, because what else would. But queries repeat. People ask for the same handful of things in roughly the same handful of ways, and that can be written down.
Working with one of the backend engineers, I built a map from likely phrasings to the parameters sitting behind them, so the ordinary stuff resolved on rules and the model got reserved for the genuinely weird queries.
One catch: there was no query history to learn from yet, since the whole habit we wanted didn't exist. We borrowed the vocabulary from the one place people had already described these places in their own words, which is the product reviews.
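
The rules-first routing can be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not the pilot's actual mapping: the patterns, parameter names, product and place lists, and the `call_model` callback are all hypothetical.

```python
import re
from typing import Callable, Optional

# Hypothetical phrase-to-parameter rules. In the pilot the vocabulary was
# borrowed from product reviews; these entries are invented.
RULES = [
    (re.compile(r"\bnear (the )?beach\b"), {"location_tag": "beachfront"}),
    (re.compile(r"\b(with )?(a )?(swimming )?pool\b"), {"facility": "pool"}),
    (re.compile(r"\b(cheap|budget)\b"), {"price_tier": "low"}),
]
PRODUCTS = {"hotel", "villa", "flight", "train"}  # hypothetical
PLACES = {"bali", "makassar", "puncak"}           # hypothetical


def parse_with_rules(query: str) -> Optional[dict]:
    """Return search parameters if the rules explain the whole query, else None."""
    text = query.lower()
    params: dict = {}
    for pattern, rule_params in RULES:
        # Remove each matched phrase so only unexplained words remain.
        text, hits = pattern.subn(" ", text)
        if hits:
            params.update(rule_params)
    for token in text.split():
        if token in PRODUCTS:
            params["product"] = token
        elif token in PLACES:
            params["place"] = token
        else:
            return None  # An unrecognized word: leave this query to the model.
    return params or None


def understand(query: str, call_model: Callable[[str], dict]) -> dict:
    """Resolve ordinary queries on rules; spend the model only on the rest."""
    params = parse_with_rules(query)
    if params is not None:
        return {"source": "rules", "params": params}
    return {"source": "model", "params": call_model(query)}
```

In this sketch the model is a fallback, not the default: a query such as "hotel bali with a pool" never reaches `call_model`, while one with any word the rules don't know does.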

None of this made the model smarter. It made the model optional, which on an AI project is a stranger thing to argue for, and the better one.

Artifacts

FIG. 01

The query suggestion logic I built, first iteration. The whole logic was zoom in or zoom out, dressed in words a person would actually use.

FIG. 02

The ordinary queries weren't routed to the LLM. They resolved here, against the mapping I built with a backend engineer from the language found in product reviews, where users had already said what they meant.

FIG. 03

The case for sounding simple. Copy test participants mostly responded to the first copy iteration, which had intelligent-sounding section titles, with something along the lines of “too long; didn’t read.”

The impact

The MVP went out ahead of its own capability, which is a normal pilot problem, but the queries that came back were the whole point. People started typing like they were talking: "villa puncak from 675k to 1.5mio," "30th flight ticket from balikpapan to makassar." The search couldn't fully answer those at the time. What mattered was that people were suddenly willing to ask for exactly the thing, on the bet that the bar would get it. Give them a reason to trust it and they get specific.

WHAT DIDN'T WORK

The first copy iterations were the miss. We tried labeling results "100% match" or a partial match, with an emoji, on the theory that showing our precision would feel like the kind of straight talk a smart system does. Copy testing was blunt about it: the numbers just made people stop and do math (80% of what, exactly), so we cut them and let the product cards show which criteria a result actually hit. Transparency was the right instinct, but percentages were the wrong unit.

WHAT THIS UNLOCKED

First, the workflow. Copy and logic turned out to go hand in hand in this kind of project, so the doc the backend engineer and I shared stopped being a handoff and became a place we both kept working, version after version. Iterative by design instead of by accident, which is the kind of thing you only get to keep once you've built it once.

And the messy queries we started to get paid us back twice. Every "closest guesthouse from makassar port near losari beach" the system couldn't fully answer was the feature pointing at where to go next: which phrasings to teach the rules, where the model's budget was actually worth spending, which capability the next iteration owed people. We'd shipped a thing that generated its own roadmap.
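
One way to read "which phrasings to teach the rules" is as a simple count of the words the rules did not recognize. The Python sketch below assumes a hypothetical log of unanswered queries and a hypothetical known vocabulary; it shows the idea and is not a description of the team's tooling.

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def unresolved_terms(
    queries: Iterable[str], known_vocabulary: Set[str]
) -> List[Tuple[str, int]]:
    """Count words the rules don't know yet, most frequent first."""
    counts: Counter = Counter()
    for query in queries:
        for token in query.lower().split():
            if token not in known_vocabulary:
                counts[token] += 1
    return counts.most_common()
```

The words at the top of that list are candidates for new rules, and whatever stays rare is where the model's budget is more plausibly worth spending.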

Somewhere in there I found where my work actually lives. Not at the tail end tidying strings, but up in the logic, where a call about words is also a call about cost and about what gets built. On most search projects before this, I'd braced for the build phase to break on me. This was the first one I got to help steer.
