Agent Builder is now generally available (GA). Get started with an Elastic Cloud trial, and check out the documentation for Agent Builder here.
Build a “chat with your website” experience in under an hour using Elasticsearch Serverless, Jina Embeddings v5, Elastic Open Web Crawler, and Elastic Agent Builder.
By the end, you’ll have a working agent that can search your crawled pages, cite relevant passages, and answer questions grounded in your content, no custom chunking or embedding pipeline required.
In this guide, you’ll:
- Start an Elasticsearch Serverless project.
- Create an index using the new semantic_text field powered by Jina Embeddings v5.
- Crawl any website using Elastic Crawler Control (a.k.a. Crawly), an open source UI + API wrapper around the Elastic Open Web Crawler.
- Chat with that data using the Elastic Agent Builder in Kibana.
What you’ll walk away with:
- A repeatable pattern you can point at any website/docs source.
- Chat that stays grounded in your content.
Prerequisites
- An Elasticsearch Serverless (Search) project + an API key with write permissions.
- Docker + Docker Compose (to run the crawler UI).
- git (to clone the repo).
1. Start an Elasticsearch Serverless project
First, we need a serverless project to host our data.
1. Log in to your Elastic Cloud Console.
2. Click Create project.
3. Select Search as the project type. (This type is optimized for vector search and retrieval.)
4. Give it a name (for example, es-labs-jina-guide), and click Create.

5. Important: Save the Elasticsearch endpoint and API Key provided when the project is created. You’ll need these for the crawler.

2. Create the index
Elasticsearch Serverless supports semantic_text, which handles chunking and embedding generation automatically. We’ll use the .jina-embeddings-v5-text-small model, hosted on GPUs on the Elastic Inference Service.
Create the index with a semantic_text field. This tells Elasticsearch to automatically chunk and vectorize any content written to that field using the Jina inference endpoint.
In Kibana Dev Tools, run:
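For example, a minimal mapping could look like the following. The index and field names are illustrative, and the inference_id shown assumes the hosted Jina v5 endpoint is exposed under the model’s name; run GET _inference in your project to confirm the exact ID available to you.

```
PUT web-crawl-jina
{
  "mappings": {
    "properties": {
      "body": {
        "type": "semantic_text",
        "inference_id": ".jina-embeddings-v5-text-small"
      }
    }
  }
}
```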
3. Run the Elastic Open Web Crawler

Crawly is one example of how an application can be constructed around the functionalities that the Open Web Crawler provides.
The application wraps the Elastic Open Crawler in a FastAPI service that manages crawler processes and persists execution data. A React front end provides the interface for configuring and monitoring crawls.
Under the hood, the crawler service (see crawler.py) spawns JRuby processes via subprocess.Popen, allowing multiple concurrent crawls. Each execution's configuration, status, and logs are persisted to disk (for now).
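To make the pattern concrete, here is a minimal sketch (not the actual crawler.py) of how a service can manage concurrent crawler subprocesses this way. The class name, the placeholder command, and the status strings are all assumptions for illustration; the real service would invoke the JRuby crawler binary with a generated config file.

```python
import subprocess

class CrawlManager:
    """Sketch of a service that runs each crawl as its own OS process."""

    def __init__(self):
        self.processes = {}

    def start(self, crawl_id, config_path):
        # Each crawl is an independent process, so multiple crawls
        # can run concurrently without blocking the API server.
        proc = subprocess.Popen(
            ["echo", "crawl", config_path],  # placeholder for the real JRuby crawler command
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
        )
        self.processes[crawl_id] = proc
        return proc.pid

    def status(self, crawl_id):
        # poll() returns None while the process is still running,
        # and the exit code once it has finished.
        code = self.processes[crawl_id].poll()
        return "running" if code is None else f"exited({code})"
```

The key design point is that crawl state lives in separate processes rather than threads, so a crashing or hung crawl cannot take down the FastAPI service itself.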

Clone the repository:
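For example (the repository URL and directory name below are placeholders; substitute the actual Crawly repository):

```shell
git clone <crawly-repo-url>
cd <crawly-repo-dir>
```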
Create an env.local file with your Elasticsearch credentials:
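A sketch of what that file might contain; the variable names here are assumptions, so match them to whatever the repo’s compose file actually reads, and use the endpoint and API key you saved in step 1:

```
ELASTICSEARCH_HOST=<your-elasticsearch-endpoint>
ELASTICSEARCH_API_KEY=<your-api-key>
```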
Start the services:
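Assuming a standard Docker Compose setup at the repo root:

```shell
docker compose up -d
```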
Access the UI at http://localhost:16700

You don’t need seed_urls unless you want to target specific pages, so your config can be as simple as this:
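A minimal sketch of such a config, following the Open Web Crawler’s YAML format (the URL and index name are placeholders; the index should match the one created in step 2):

```yaml
domains:
  - url: https://www.elastic.co/search-labs
output_sink: elasticsearch
output_index: web-crawl-jina
```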
From there, you can start a crawl on any website and check its progress:

Once the crawl finishes, you can query the content in Elasticsearch directly or chat with the freshly crawled pages in Agent Builder.
4. Chat with data in Kibana
Now that the data is indexed and vectorized, we can start chatting with the data using the Elastic Agent Builder.
- Open Kibana, and navigate to Agents (under the "Search" section).
- Test the agent:
- In the chat window, ask a question like, "What is the difference between sparse and dense vectors?"
The agent will search your Jina-embedded data, retrieve the relevant snippets from the Search Labs blog posts, and generate an answer.


You can also chat with the data directly via Kibana API:
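For example, a hedged sketch of such a request: the agent ID is a placeholder, the request shape follows the Agent Builder converse API, and you should substitute your own Kibana URL and API key.

```shell
curl -X POST "${KIBANA_URL}/api/agent_builder/converse" \
  -H "Authorization: ApiKey ${API_KEY}" \
  -H "Content-Type: application/json" \
  -H "kbn-xsrf: true" \
  -d '{
    "agent_id": "<your-agent-id>",
    "input": "What is the difference between sparse and dense vectors?"
  }'
```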
Use conversation_id to resume an existing conversation with an agent in Elastic Agent Builder. If you don’t provide it on the initial request, the API starts a new conversation and returns a newly generated ID in the streaming response.
Summary
You now have a working “chat with your website” stack: your site gets crawled, indexed, auto-embedded with semantic_text + Jina v5, and surfaced through an agent in Kibana that answers questions grounded in your pages.
From here, you can point the same setup at docs, support content, or internal wikis and iterate on relevance in minutes.




