Hybrid search with Java: LangChain4j Elasticsearch integration

Learn how to use hybrid search in LangChain4j via its Elasticsearch integrations, with a complete Java example.

Elasticsearch is packed with new features to help you build the best search solutions for your use case. Learn how to put them into action in our hands-on webinar on building a modern Search AI experience. You can also start a free cloud trial or try Elastic on your local machine now.

In our previous article on hybrid search with Elasticsearch in LangChain, we explained why hybrid search can help retrieve better results than simple vector search, along with how it works. We recommend reading that article first.

In addition to Python and JavaScript, the LangChain ecosystem also includes a community-driven Java project called LangChain4j, which is the focus of this article. We'll show how powerful hybrid search can be by writing a complete application using LangChain4j, Elasticsearch, and Ollama.

Setting up the environment

Running a local Elasticsearch instance

Before running the examples, you'll need Elasticsearch running locally. The easiest way is using the start-local script:
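If you have Docker installed, the script can be fetched and run in one line (shown as documented at the time of writing; check the Elasticsearch docs for the current form):

```
curl -fsSL https://elastic.co/start-local | sh
```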

After starting, you'll have:

  • Elasticsearch at http://localhost:9200.
  • Kibana at http://localhost:5601.

Your API key is stored in the .env file (under the elastic-start-local folder) as ES_LOCAL_API_KEY.

> Note: This script is for local testing only. Do not use it in production. For production installations, refer to the official documentation for Elasticsearch.

Running a local Ollama instance

You’ll also need to connect your application to an embedding model. Although you can choose any provider supported by LangChain4j (check the complete list), for this example we’ll be using Ollama, which can easily be set up locally by following the quickstart.
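Once Ollama is running, pull an embedding model. The model name below is just an example choice, not necessarily the one the demo uses:

```
ollama pull nomic-embed-text
```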

Let’s start coding

The idea for the application is simple: Given a dataset of movies (taken from an IMDb dataset on Kaggle), we want to be able to find movies whose descriptions are relevant to our queries. This demo uses a subset of the data, which has been cleaned. You can download the dataset used for this article from our GitHub repo, along with the full code for this demo.

Step 1: Dependencies and environment

Open your favorite integrated development environment (IDE) and create a new blank project, preferably with a modern Java version (we’re using Java 24) and a Gradle/Maven version to match (in our case, Gradle 9.0).

We only need three dependencies:
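In Gradle (Kotlin DSL), they might look like the following; the version numbers are placeholders, so check Maven Central for the latest releases:

```kotlin
dependencies {
    // CSV parsing for the dataset
    implementation("com.fasterxml.jackson.dataformat:jackson-dataformat-csv:2.19.0")
    // LangChain4j integrations for the vector store and the embedding model
    implementation("dev.langchain4j:langchain4j-elasticsearch:1.2.0")
    implementation("dev.langchain4j:langchain4j-ollama:1.2.0")
}
```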

The first one is needed to ingest the data that we’ll embed and query; the other two are the necessary LangChain4j dependencies to connect and manage our Elasticsearch vector store and Ollama embedding model.

The best way to connect to the external services is to define environment variables and read them at the start of our main function:
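A minimal sketch of that setup, assuming the variable names from the start-local .env file (adjust them to match your environment):

```java
public class Env {
    // Read an environment variable, falling back to a default for local runs.
    static String env(String name, String fallback) {
        String value = System.getenv(name);
        return (value == null || value.isBlank()) ? fallback : value;
    }

    public static void main(String[] args) {
        String esUrl     = env("ES_LOCAL_URL", "http://localhost:9200");
        String esApiKey  = env("ES_LOCAL_API_KEY", "");
        String ollamaUrl = env("OLLAMA_BASE_URL", "http://localhost:11434");
        System.out.println("Elasticsearch: " + esUrl);
    }
}
```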

Step 2: Ingesting the dataset

Since the dataset is a CSV file, we’ll use Jackson’s jackson-dataformat-csv module to easily read the data and map it to a Java class, defined as:
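One possible shape for that class is a Java record; the field names here are illustrative and should match your CSV’s header:

```java
public record Movie(String name, int year, String genre, String overview,
                    String director, String star) {
    // Merge all fields into one text representation for embedding (helper is illustrative).
    public String toText() {
        return String.join(" - ",
                name, String.valueOf(year), genre, overview, director, star);
    }
}
```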

Now we can create an instance of CsvSchema mapping the CSV structure and read the file into an iterator:
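A sketch of that step with Jackson’s CsvMapper; the file name is an assumption:

```java
CsvMapper csvMapper = new CsvMapper();
// Derive the column layout from the Movie class and skip the header row.
CsvSchema schema = csvMapper.schemaFor(Movie.class).withHeader();
MappingIterator<Movie> movies = csvMapper
        .readerFor(Movie.class)
        .with(schema)
        .readValues(new File("imdb_movies.csv"));
```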

Each row needs to be embedded first, and then both the embedded content and the text representation will be ingested by Elasticsearch.

Let’s start by creating an instance of the Ollama embedding model class:
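With LangChain4j’s Ollama module, this can look like the following (the model name is whatever embedding model you pulled into your Ollama instance):

```java
EmbeddingModel embeddingModel = OllamaEmbeddingModel.builder()
        .baseUrl(ollamaUrl)                // e.g. http://localhost:11434
        .modelName("nomic-embed-text")     // example model choice
        .build();
```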

And then the Elasticsearch vector store, which needs an instance of the Elasticsearch Java RestClient:
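A sketch, assuming the start-local API key is available in an esApiKey string and using an index name of our choosing:

```java
RestClient restClient = RestClient
        .builder(HttpHost.create("http://localhost:9200"))
        .setDefaultHeaders(new Header[]{
                new BasicHeader("Authorization", "ApiKey " + esApiKey)})
        .build();

EmbeddingStore<TextSegment> embeddingStore = ElasticsearchEmbeddingStore.builder()
        .restClient(restClient)
        .indexName("movies")               // index name is our choice
        .build();
```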

For the ingestion loop, the LangChain4j library requires the data to be split into two lists: one for the vector representations and one for the original text. We’ll set up both lists and fill them in the loop:
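For example:

```java
List<Embedding> embeddings = new ArrayList<>();
List<TextSegment> textSegments = new ArrayList<>();
```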

Here, Embedding and TextSegment are both library-specific classes.

We’ll iterate over the movie dataset, use the embedding model to retrieve the vector representation of each movie (a text representation of all its fields merged), and add the name separately as metadata so that the results are easier to read.
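A sketch of that loop; the accessor names follow the dataset class and are illustrative:

```java
while (movies.hasNext()) {
    Movie movie = movies.next();
    // Text representation of all the fields, merged.
    String text = String.join(" - ",
            movie.name(), movie.genre(), movie.overview(), movie.director());
    // The movie name goes into metadata so the results are easier to read.
    TextSegment segment = TextSegment.from(text, Metadata.from("name", movie.name()));
    embeddings.add(embeddingModel.embed(segment).content());
    textSegments.add(segment);
}
```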

Finally, the embedding list and text list are passed to the vector store’s addAll() method, which handles sending the data to the vector store asynchronously:
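Which is a single call:

```java
embeddingStore.addAll(embeddings, textSegments);
```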

Step 3: Querying

Our goal is to find movies with time loops in the plot, so our prompt will be:
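The exact wording lives in the demo repo; something along these lines:

```java
public class Prompt {
    // A query targeting time-loop plots, per the article's goal.
    static final String QUESTION =
            "movies where the characters are stuck in a time loop, reliving the same day";
}
```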

Let’s try a simple vector search first by creating a content retriever with the default k-nearest neighbor (kNN) query configuration, then running the query and printing the results:
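A sketch using LangChain4j’s EmbeddingStoreContentRetriever; maxResults is an arbitrary choice here:

```java
String question = "movies where the characters are stuck in a time loop, reliving the same day";

ContentRetriever retriever = EmbeddingStoreContentRetriever.builder()
        .embeddingStore(embeddingStore)
        .embeddingModel(embeddingModel)
        .maxResults(5)
        .build();

// Retrieve the most similar segments and print the movie names from metadata.
List<Content> results = retriever.retrieve(Query.from(question));
results.forEach(content -> System.out.println(
        content.textSegment().metadata().getString("name")));
```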

This outputs:

Now let’s see how hybrid search performs:
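The retrieval code stays the same; what changes is the store configuration. The configuration class name below is an assumption and depends on your langchain4j-elasticsearch version (check the library’s docs), but the idea is to build the store with a hybrid (BM25 + kNN + RRF) configuration instead of the default kNN one:

```java
EmbeddingStore<TextSegment> hybridStore = ElasticsearchEmbeddingStore.builder()
        .restClient(restClient)
        .indexName("movies")
        // Hybrid configuration combining a match query and kNN, fused with RRF.
        .configuration(ElasticsearchConfigurationHybrid.builder().build())
        .build();
```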

Why these results?

This query (“time loop / reliving the same day”) is a great case where hybrid search tends to shine because the dataset contains literal phrases that BM25 can match and vectors can still capture meaning.

  • Vector-only (kNN) embeds the query and tries to find semantically similar plots. Using a broad sci‑fi dataset, this can drift into “trapped / altered reality / memory loss / high-stakes sci‑fi” even when there’s no time-loop concept. That’s why results like “The Witch: Part 1 – The Subversion” (amnesia) and “The Maze Runner” (trapped / escape) can appear.
  • Hybrid (BM25 + kNN + reciprocal rank fusion [RRF]) rewards documents that match keywords and meaning. Movies whose descriptions explicitly mention “time loop” or “relive the same day” get a strong lexical boost, so titles like “Edge of Tomorrow” (relive the same day over and over again…) and “Boss Level” (trapped in a time loop that constantly repeats the day…) rise to the top.

Hybrid search doesn’t guarantee that every result is perfect; it balances lexical and semantic signals, so you may still see some non-time-loop sci‑fi in the tail of the top‑k.

The main takeaway is that hybrid search helps anchor semantic retrieval with exact textual evidence when the dataset contains those keywords. Check the previous article for more information on how hybrid search works.

Full code example

You can find the full demo code on GitHub.

Conclusion

In this article, we demonstrated how to use hybrid search in LangChain4j through its Elasticsearch integrations, with a complete Java example. This article is an extension of a previous article, which presents the LangChain integrations for Python and JavaScript and introduces and explains hybrid search. We’re planning to continue our collaboration with LangChain4j in the future by contributing to the embedding models with our Elasticsearch Inference API.

Ready to build state-of-the-art search experiences?

Sufficiently advanced search isn’t achieved with the efforts of one. Elasticsearch is powered by data scientists, ML ops, engineers, and many more who are just as passionate about search as you are. Let’s connect and work together to build the magical search experience that will get you the results you want.
