When we tell prospective clients that we’ve built AI agents handling 80% of inbound lead qualification with no human in the loop for most conversations, the reactions split cleanly: half are excited, half are skeptical.
Both reactions are reasonable. AI agents in production are still rare enough that results vary wildly: a few are excellent, most are bad. This post is about how to build one that lands in the first camp, with concrete details from a system we built for a 200-agent real estate brokerage in Dubai.
The problem we were solving
Falcon Estates was getting 600-900 inbound inquiries per month from property portal listings, paid ads, and organic search. Most were poorly qualified: wrong budget, wrong area, wrong timeframe. Agents were spending most of their time triaging instead of selling.
The goal: filter out the bottom 70% before a human got involved, while making the top 30% feel like they got world-class service from minute one.
Why WhatsApp, not chat on the website
WhatsApp is how Dubai property buyers actually communicate. Building a slick on-site chat widget would have been faster, but its engagement rates would have been a fraction of WhatsApp’s. We went where the customers were.
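This also gave us a major advantage: WhatsApp message templates, which are required for any message a business sends outside the 24-hour customer service window, must be pre-approved by Meta. That forces a level of discipline that web chat doesn’t. Every template the AI uses has been reviewed.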
The architecture
- Front door: WhatsApp Business API via Meta’s Cloud API.
- Orchestration: n8n workflow that receives the webhook, calls the LLM, and routes responses.
- LLM: Claude Sonnet (Anthropic) for the qualification conversation. We tested GPT-4o and Gemini 1.5 Pro — Claude held the conversation tone better and was significantly less prone to confidently inventing property details that didn’t exist.
- Memory: Each conversation has a session stored in a Postgres database, so the agent doesn’t ask the same question twice.
- CRM integration: Every answer is pushed to HubSpot in real time. Hot leads create a deal and ping a Slack channel for the human team.
- Fallback: If the agent’s uncertainty exceeds 70% (it returns a confidence score with each reply), or if the user explicitly asks for a human, it hands off cleanly with full context.
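To make the fallback rule concrete, here is a minimal Python sketch of the per-turn routing decision. It is an illustration, not the production n8n workflow: the names `AgentReply`, `wants_human`, and `route` are hypothetical, the phrase list is invented for the example, and we assume uncertainty is simply 1 minus the reported confidence.

```python
from dataclasses import dataclass

# Assumption: "more than 70% uncertain" is read as (1 - confidence) > 0.70.
# The real system may define uncertainty differently.
UNCERTAINTY_THRESHOLD = 0.70

# Hypothetical phrases treated as an explicit request for a human.
HUMAN_REQUEST_PHRASES = ("human", "real person", "speak to an agent", "talk to someone")


@dataclass
class AgentReply:
    text: str
    confidence: float  # 0.0 to 1.0, reported alongside the model's reply


def wants_human(user_message: str) -> bool:
    """True if the message contains one of the explicit handoff phrases."""
    lowered = user_message.lower()
    return any(phrase in lowered for phrase in HUMAN_REQUEST_PHRASES)


def route(user_message: str, reply: AgentReply) -> str:
    """Return "handoff" or "send_reply" for a single conversation turn."""
    uncertainty = 1.0 - reply.confidence
    if wants_human(user_message) or uncertainty > UNCERTAINTY_THRESHOLD:
        return "handoff"
    return "send_reply"
```

Checking the explicit request before anything else means a user who asks for a person is never kept talking to the bot, however confident the model is.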
The prompt
The full system prompt is too long to include here — it’s about 1,800 words. But the key principles:
- Define the role narrowly. The agent is a “property concierge for Falcon Estates” — not a general-purpose chatbot. This eliminates 90% of off-topic drift.
- List explicit qualification questions, in order. Budget, area, property type, timeframe, financing, contact details. The agent gathers these conversationally, but it has a clear checklist.
- Forbid invented inventory. The agent can only mention properties from a list provided in the prompt. “I don’t know” is an acceptable answer.
- Spell out handoff triggers. If the user mentions cash, urgency, or specific high-value units, the agent transfers to a human in that same message.
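The principles above can be sketched in a few lines of Python. This is a simplified illustration rather than our actual 1,800-word prompt: the field names, the trigger keywords, the example listing name, and the functions `next_question`, `should_hand_off`, and `build_system_prompt` are all hypothetical.

```python
from typing import Optional

# Checklist order taken from the post: budget, area, property type,
# timeframe, financing, contact details. Field names are hypothetical.
QUALIFICATION_FIELDS = [
    "budget", "area", "property_type", "timeframe", "financing", "contact_details",
]

# Hypothetical keywords standing in for the "cash" and "urgency" triggers.
HANDOFF_KEYWORDS = ("cash", "urgent", "this week")


def next_question(answers: dict) -> Optional[str]:
    """Return the first checklist field without an answer, or None when complete."""
    for field in QUALIFICATION_FIELDS:
        if not answers.get(field):
            return field
    return None


def should_hand_off(user_message: str, high_value_units: list) -> bool:
    """True if the message hits a trigger keyword or names a high-value unit."""
    lowered = user_message.lower()
    if any(keyword in lowered for keyword in HANDOFF_KEYWORDS):
        return True
    return any(unit.lower() in lowered for unit in high_value_units)


def build_system_prompt(inventory: list, answers: dict) -> str:
    """Assemble a narrow role, an inventory allow-list, and the next checklist item."""
    listing_lines = "\n".join(f"- {name}" for name in inventory)
    pending = next_question(answers) or "none, all details collected"
    return (
        "You are a property concierge for Falcon Estates.\n"
        "Only mention properties from this list. If unsure, say you don't know.\n"
        f"{listing_lines}\n"
        f"Next detail to gather: {pending}\n"
    )
```

For example, `next_question({"budget": "2M AED"})` returns `"area"`, so an agent that already has a budget stored in its session moves on to the next item instead of asking again.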
The results
Four months after launch:
- 340% growth in sales-qualified leads
- Average response time: 4 minutes (was 11 hours)
- 80% of conversations fully handled by the AI without human intervention
- 47% increase in deals closed per agent per quarter
What we’d do differently
If we were starting this build today, we’d consider:
- Voice instead of WhatsApp text for the highest-value conversations. WhatsApp now supports voice notes via the API. Voice would further improve how the responses feel.
- Tool-using agents instead of pure conversation. Tool-calling features in current model APIs (Anthropic’s tool use, OpenAI’s function calling) let the agent directly query the property database, book viewings, and so on. Worth the additional complexity for some scenarios.
- A more rigorous hallucination test suite. We have one. It could be 3x larger.
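As an illustration of what one check in a hallucination suite could look like, here is a minimal Python sketch. It is not our actual suite. It assumes every listing carries a reference code such as "FE-1042", which is a format invented for this example, and the function names `invented_listings` and `failing_prompts` are hypothetical.

```python
import re

# Assumption: listings are identified by codes like "FE-1042".
# This format is made up for the example.
LISTING_REF = re.compile(r"\bFE-\d{4}\b")


def invented_listings(reply: str, allowed_refs: set) -> set:
    """Return listing references mentioned in the reply that are not in the allowed set."""
    mentioned = set(LISTING_REF.findall(reply))
    return mentioned - allowed_refs


def failing_prompts(cases: list, allowed_refs: set) -> list:
    """Given (prompt, recorded_reply) pairs, return prompts whose reply cites an unknown listing."""
    return [
        prompt
        for prompt, reply in cases
        if invented_listings(reply, allowed_refs)
    ]
```

A check like this only catches invented reference codes. Invented prices, amenities, or availability would need their own checks, which is part of why a larger suite is worth the effort.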
Should you build one?
Good candidates: businesses with high inbound volume where qualification is the bottleneck. Real estate, education, healthcare, legal, automotive, B2B services.
Bad candidates: businesses where the entire sales process is consultative and depends on rapport (e.g. high-end wealth management, complex enterprise software). AI can support these — but “80% of inbound on autopilot” isn’t the right model.
If you’re in the first camp, we can help.