Everyone treated this project as an AI problem, which is how you end up planning to run an LLM on every search a user makes. That gets expensive fast, and it gives the model too much credit. Most of what people type is a small set of predictable words. You can answer predictable words with copy rules, and save the model for the times someone actually says something hard.
The situation
Universal Search is the one search bar that has to handle everything Traveloka sells, hotels and flights and trains and the rest. The pilot wanted it to take real preferences, the specific things people care about when they travel.
Nobody was typing real preferences, though. The usual search was a product and a place, "hotel bali," done. People hadn't learned to ask for more because the old search had never given them a reason to. (Research did find one opportunity: users who'd written off Universal Search would try it again if it looked new. So novelty was at least a foot in the door.)
This was late 2023, peak everyone-needs-an-AI-story. We scoped a pilot instead of a full build because the compute was real money and nobody was sure the demand was there. Design, product, and engineering ran it together. I handled the UI copy and the copy-based rules underneath the search suggestions and query understanding.
The approach
An AI search bar is only as useful as what people are willing to type into it, and ours had two gaps underneath. Users barely typed ("hotel bali" and nothing else), and the system couldn't afford to run the model on every search just to find out what they meant. While it sounds like a prompt problem and a cost problem… both, it turned out, just needed copy solutions.
None of this made the model smarter. It made the model optional, which on an AI project is a stranger thing to argue for, and the better one.
Artifacts
The impact
The MVP went out ahead of its own capability, which is a normal pilot problem, but the queries that came back were the whole point. People started typing like they were talking: "villa puncak from 675k to 1.5mio," "30th flight ticket from balikpapan to makassar." The search couldn't fully answer those at the time. What mattered was that people were suddenly willing to ask for exactly the thing, on the bet that the bar would get it. Give them a reason to trust it and they get specific.
WHAT DIDN'T WORK
The first copy iterations were the miss. We tried labeling results "100% match" or partial, with an emoji, on the theory that showing our precision would feel like the kind of straight-talk a smart system does. Copy testing was blunt about it: the numbers just made people stop and do math (80% of what, exactly), so we cut them and let the product cards show which criteria a result actually hit. Transparency was the right instinct, but percentages were the wrong unit.
WHAT THIS UNLOCKED
First, the workflow. Copy and logic turned out to go hand in hand in this kind of project, so the doc the backend engineer and I shared stopped being a handoff and became a place we both kept working, version after version. Iterative by design instead of by accident, which is the kind of thing you only get to keep once you've built it once.
And the messy queries we started to get paid us back twice. Every "closest guesthouse from makassar port near losari beach" the system couldn't fully answer was the feature pointing at where to go next: which phrasings to teach the rules, where the model's budget was actually worth spending, which capability the next iteration owed people. We'd shipped a thing that generated its own roadmap.
Somewhere in there I found where my work actually lives. Not at the tail end tidying strings, but up in the logic, where a call about words is also a call about cost and about what gets built. Most search projects before this, I'd braced for the build phase to break on me. This was the first one I got to help steer.


