How to Build a Knowledge Graph Chatbot with Neo4j and Chainlit
Ship a Python knowledge graph chatbot using Neo4j, Chainlit, and GPT-4o: Auto-generate Cypher, visualize results, and answer complex data questions accurately.
In an earlier post, we went through how to build a knowledge graph using Neo4j and add semantic search capabilities to really boost a retrieval-augmented generation (RAG) pipeline. We looked at extracting structured entities and relationships from messy unstructured content, getting everything into a graph database, and then doing hybrid retrieval with both vector embeddings and graph queries. If you want to dig deeper into making retrieval work better, check out our guide on retrieval tricks to boost answer accuracy in RAG pipelines.
Here's what I've discovered about knowledge graphs and large language models like GPT-4o: they complement each other remarkably well. I was honestly surprised at how good these models are at turning natural language into Cypher queries. When you give them a clear picture of your graph schema, with the nodes and relationships laid out, they can reason through it effectively. What this means for you is that you can ask questions about your data in plain English, and the model translates them into powerful graph queries. If you're thinking about expanding to handle multiple documents and more advanced retrieval, our guide on building multi-document agents for advanced retrieval and summarization has everything you need to know.
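To make that concrete, here's a minimal sketch of what "giving the model a clear picture of your schema" looks like in practice. The `SCHEMA` string and the `build_cypher_prompt` helper are hypothetical, just to show the shape of the prompt; the real schema we use appears later in `agents.yaml`.

```python
# Minimal sketch: embed the graph schema in the prompt so an LLM can
# translate a plain-English question into Cypher. The schema text and
# helper name here are illustrative, not part of the final app.
SCHEMA = """
(:Researcher {name, specialization})-[:AFFILIATED_WITH]->(:Institution {name})
(:Project {file_name, start_date})-[:HAS_RESEARCHER]->(:Researcher)
"""

def build_cypher_prompt(question: str) -> str:
    return (
        "You are a Neo4j expert. Given this graph schema:\n"
        f"{SCHEMA}\n"
        "Write a single Cypher query that answers the question. "
        "Use WITH for grouping, never SQL's GROUP BY.\n"
        f"Question: {question}"
    )

prompt = build_cypher_prompt("Which researchers work on AI projects?")
```

You would send `prompt` to the model of your choice; the rest of this post delegates that step to a CrewAI agent.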

So let's take this to the next level. I'm going to show you how to build an interactive chatbot interface using Chainlit and GPT-4o that lets you have actual conversations with your graph database. We'll pass in conversation history, format the knowledge graph context so the LLM understands it properly, pull out the Cypher from the response, run the query, and then return results as both natural language answers and visual charts. By the time we're done, you'll have a fully working conversational interface for your Neo4j knowledge graph.
Setup
First, let's get the required packages installed. Just run this command:
```
%pip install neomodel chainlit plotly crewai python-dotenv
```

Step 1: Configure Neomodel and Test Your Neo4j Integration
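One detail in the setup script deserves a note: Neomodel wants a single bolt-style URL with the credentials embedded, while Neo4j Aura hands you a `neo4j+s://` URI plus a separate username and password. The conversion is simple string work; here it is in isolation, with made-up placeholder credentials:

```python
# Rebuild an Aura-style URI into the single DATABASE_URL string Neomodel
# expects: bolt+s://user:password@host. All values below are placeholders.
uri = "neo4j+s://abc123.databases.neo4j.io"
user = "neo4j"
password = "secret"

host = uri.replace("neo4j+s://", "")             # strip the scheme
bolt_url = f"bolt+s://{user}:{password}@{host}"  # embed the credentials
print(bolt_url)
```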
```python
%%writefile setup.py
import os
from neomodel import db, config

# Load environment variables (Neo4j credentials) from a .env file
from dotenv import load_dotenv, find_dotenv
_ = load_dotenv(find_dotenv())

# Read the connection settings
uri = os.getenv("NEO4J_URI")
user = os.getenv("NEO4J_USERNAME")
password = os.getenv("NEO4J_PASSWORD")

if not all([uri, user, password]):
    raise ValueError("Missing one or more of: NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD")

# Convert neo4j+s:// to bolt+s:// for Neomodel
host = uri.replace("neo4j+s://", "")
bolt_url = f"bolt+s://{user}:{password}@{host}"
config.DATABASE_URL = bolt_url

# Test the connection
try:
    results, _ = db.cypher_query("RETURN 'Connection successful' AS message")
    print(results[0][0])
except Exception as e:
    print(f"Connection failed: {e}")
```

Step 2: Create the Tool
Now we need to build the tool that lets us query our knowledge graph. It's going to take natural language input, figure out the right Cypher query, run it against the Neo4j database, and give us back the results. Actually, if you're interested in learning more about designing tools for agent frameworks, you should definitely look at our guide on how to build flexible tools for CrewAI agents.
```python
%%writefile tools/query_knowledge_graph.py
from setup import *
from crewai.tools import tool

@tool
def query_knowledge_graph(query: str):
    """Query the Neo4j Knowledge Graph"""
    results, meta = db.cypher_query(query)
    return results
```

Test the Tool
Alright, with the tool ready to go, let's test it out by running a sample query against our knowledge graph.
```python
from tools.query_knowledge_graph import query_knowledge_graph

# Test it: count nodes by label
query = """
MATCH (n)
WITH labels(n) AS lbls
UNWIND lbls AS label
RETURN label, count(*) AS count
ORDER BY count DESC;
"""
print(query_knowledge_graph.run(query))
```

Output:

```
Using Tool: query_knowledge_graph
[['Researcher', 157], ['Project', 46], ['ResearchArea', 20], ['Institution', 20]]
```

Step 3: Create Your Agent
Time to set up the agent and its tasks using the standard CrewAI format. This agent is going to be responsible for understanding what users are asking and generating the right Cypher queries to pull data from the knowledge graph. If you want to see how this fits into bigger multi-agent systems and more complex workflows, take a look at our practical lessons from building multi-agent systems with CrewAI and LangChain.
Create the agents.yaml file
When I was designing the agent prompt, I had a few specific strategies in mind to make sure the LLM would work effectively. Let me walk you through what actually worked:
- **Expertise framing:** We tell the agent it's an expert analyst who knows Neo4j inside and out and can explain results clearly.
- **Resilience:** The agent doesn't give up. If a query doesn't work, it tries different approaches until it finds the data.
- **Polite tone:** Every response needs to be courteous, professional, and actually helpful to the user.
- **Cypher guidance:** This one's important. We specifically tell it to use WITH instead of SQL-style GROUP BY to avoid common mistakes.
- **Schema awareness:** We give it the complete graph structure: all the nodes, properties, and relationships. That way it can generate accurate Cypher queries.
```yaml
%%writefile config/agents.yaml
graph_analyst:
  role: >
    Knowledge Graph Analyst
  goal: >
    Answer user questions with precise, grounded insights by querying structured data from a knowledge graph.
  backstory: >
    <expertise>
    You're an expert analyst trained in interpreting user questions, retrieving information from a Neo4j
    knowledge graph, and delivering clear, well-structured answers. You bridge human language with
    graph-based reasoning to produce reliable insights grounded in real data.
    </expertise>
    <tenacity>
    You are tenacious when retrieving information from the Neo4j knowledge graph. If a query yields no results, you DO NOT give up and say that nothing was found.
    Instead, you must rethink the question, try alternative keywords, adjust filters, explore related entity types, or reframe the query entirely.
    Retry as many times as necessary until you find a meaningful result.
    </tenacity>
    <tone>
    You always respond in a polite, helpful, and courteous manner. Your tone is respectful, professional, and friendly, ensuring the user feels supported and understood.
    </tone>
    <cypher_guidance>
    Do not use SQL syntax like GROUP BY in Cypher. Instead, use the WITH clause to group data before returning it.
    Example:
    Use:
    WITH YEAR(p.start_date) AS year, COUNT(*) AS count
    RETURN year, count
    Not:
    RETURN YEAR(p.start_date), COUNT(*) GROUP BY YEAR(p.start_date)
    </cypher_guidance>
    <schema>
    <node name="Institution">
      Represents universities or organizations. Properties include:
      - name (string, required, unique): Name of the institution
      - type (string): Type of institution
      - address (string): Address of the institution
      - context (string): Additional contextual info
    </node>
    <node name="Researcher">
      Represents individuals working on projects. Properties include:
      - name (string, required): Name of the researcher
      - role (string): Their role in the project
      - specialization (string): Their expertise
      - commitment_percent (int): Level of involvement
      - honorarium_amount (float): Payment amount
      - honorarium_currency (string): Currency type
      - honorarium_frequency (string): Payment frequency
      Relationships:
      - AFFILIATED_WITH → Institution
    </node>
    <node name="ResearchArea">
      Represents a field or topic of research. Properties:
      - name (string, required, unique)
    </node>
    <node name="Project">
      Represents research projects. Properties include:
      - file_name (string, required, unique): Unique identifier for the project
      - type (string): Type of project
      - summary_description (string): Project summary
      - start_date, end_date (string): Project duration
      Relationships:
      - HAS_PARTICIPANT → Institution
      - HAS_RESEARCHER → Researcher
      - COVERS_TOPIC → ResearchArea
    </node>
    </schema>
```

Create the tasks.yaml file
To make sure the LLM gives accurate, respectful answers with helpful visualizations, I built the task prompt with these strategies:
- **Contextual grounding:** We include the entire conversation history plus the latest user input, so the agent can maintain continuity and reuse previous context when it makes sense.
- **Smart query planning:** The LLM uses history or metadata whenever possible, and only queries Neo4j when it actually needs to.
- **Cypher correctness:** We give it specific guidance to avoid syntax errors, especially around WITH and RETURN ordering. Trust me, this saves a lot of headaches.
- **Persistence:** The agent can't just return empty results. If a query fails, it has to try again with a different approach.
- **Polished communication:** Responses need to be clear, concise, and always courteous.
- **Visual output:** When it makes sense, the agent creates bar graphs, line graphs, pie charts, or markdown tables to make the data easier to understand.
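Since the agent returns chart data as free-form JSON inside its reply, it's worth understanding the contract the chart examples below establish: a supported `type`, parallel `labels` and `values` lists, and a `title`. This `validate_chart_payload` helper is a hypothetical addition, not part of the app code; it just makes the contract explicit:

```python
import json

# Hypothetical validator for the chart payload contract used in the task
# prompt: a dict with a supported "type" and parallel "labels"/"values" lists.
def validate_chart_payload(raw):
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if data.get("type") not in {"bar", "line", "pie"}:
        return None
    labels, values = data.get("labels"), data.get("values")
    if not isinstance(labels, list) or not isinstance(values, list):
        return None
    if len(labels) != len(values):
        return None
    return data

ok = validate_chart_payload(
    '{"type": "bar", "labels": ["AI", "NLP"], "values": [12, 8], "title": "Projects"}'
)
bad = validate_chart_payload('{"type": "scatter", "labels": [], "values": []}')
```

A check like this is optional, but it protects the plotting code from a model that improvises an unsupported chart type.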
````yaml
%%writefile config/tasks.yaml
answer_user_question:
  description: |
    <CONVERSATION_HISTORY>
    {history}
    </CONVERSATION_HISTORY>
    <NEW_USER_QUESTION>
    {input}
    </NEW_USER_QUESTION>
    <INSTRUCTIONS>
    <analysis>
    Carefully analyze the user's question to understand what information is being requested.
    Identify the relevant entities, relationships, and any applicable filters.
    Use the metadata and conversation history if it contains sufficient context to answer the question directly.
    Querying the Neo4j knowledge graph is optional and should only be performed if the answer cannot be fully determined from prior messages.
    </analysis>
    <querying>
    If a query is needed, formulate and execute the appropriate Cypher query against the Neo4j graph to retrieve the necessary data.
    <cypher_guidance>
    In Cypher, the `RETURN` clause must always come at the end of the query.
    You cannot place `RETURN` before a `WITH` clause in the same logical flow.
    Invalid:
    MATCH (p:Project)
    RETURN SUBSTRING(p.start_date, 0, 4) AS year, COUNT(*) AS project_count
    WITH year, project_count
    ORDER BY year
    Correct:
    MATCH (p:Project)
    WITH SUBSTRING(p.start_date, 0, 4) AS year, COUNT(*) AS project_count
    ORDER BY year
    RETURN year, project_count
    Always use `WITH` to perform intermediate aggregation or transformation, and only use `RETURN` at the final step of the query.
    </cypher_guidance>
    </querying>
    <requirements>
    <IMPORTANT>
    - A Cypher query against the Neo4j graph should ALWAYS retrieve data.
    - DO NOT accept or return an empty result.
    - If the query yields no results, you DO NOT give up and respond that nothing was found.
    - Instead, you must rethink the question, try alternative keywords, adjust filters, explore related entity types, or reframe the query entirely.
    - Retry as many times as necessary until you find a meaningful result.
    </IMPORTANT>
    </requirements>
    <response_style>
    Once data is retrieved, synthesize a concise, well-structured answer in natural language, backed by facts from the graph.
    Always present your answer in a courteous, respectful, and helpful tone.
    Even when correcting a misunderstanding or clarifying limitations, remain friendly and supportive.
    </response_style>
    <visualization>
    If the answer involves numerical comparisons, trends over time, or proportional breakdowns, include a chart.
    <chart_types>
    - "bar": for categorical comparisons
    - "line": for trends over time
    - "pie": for showing proportions
    </chart_types>
    <chart_example type="bar">
    ```json
    {
      "type": "bar",
      "labels": ["AI", "Cybersecurity", "Data Science"],
      "values": [12, 8, 15],
      "title": "Projects by Research Area"
    }
    ```
    </chart_example>
    <chart_example type="line">
    ```json
    {
      "type": "line",
      "labels": ["2019", "2020", "2021", "2022"],
      "values": [5, 12, 18, 25],
      "title": "Number of Projects Over Time"
    }
    ```
    </chart_example>
    <chart_example type="pie">
    ```json
    {
      "type": "pie",
      "labels": ["Government", "Academic", "Private"],
      "values": [30, 45, 25],
      "title": "Funding Sources by Sector"
    }
    ```
    </chart_example>
    <markdown_table_guidance>
    If the result is best represented in tabular form, use a **Markdown table** instead of a chart.
    Example:
    | Name           | Institution         | Specialization        |
    |----------------|---------------------|-----------------------|
    | Dr. Rachel Liu | GreenTech Institute | AI for Urban Planning |
    | Dr. Nina Feld  | GreenTech Institute | Multilingual NLP      |
    | Dr. M. Rinaldi | GreenTech Institute | AI for Education      |
    Always include clear headers and ensure that the data is aligned and readable.
    </markdown_table_guidance>
    </visualization>
    </INSTRUCTIONS>
  expected_output: >
    A concise, well-structured answer in natural language, backed by data retrieved from the knowledge graph.
    Include any key entities, metrics, or facts that support the answer.
    If appropriate, include JSON-encoded chart data inside a ```json code block for visualization.
  agent: graph_analyst
````

Create the Crew
```python
%%writefile crew.py
from crewai import Agent, Crew, Task, Process
from crewai.project import CrewBase, agent, task, crew
from crewai.agents.agent_builder.base_agent import BaseAgent
from typing import List

from tools.query_knowledge_graph import query_knowledge_graph

@CrewBase
class KnowledgeGraphAnsweringCrew:
    """Crew that understands user questions, queries a knowledge graph, and composes grounded answers."""

    agents: List[BaseAgent]
    tasks: List[Task]

    # Paths to YAML configuration files
    agents_config = 'config/agents.yaml'
    tasks_config = 'config/tasks.yaml'

    @agent
    def graph_analyst(self) -> Agent:
        return Agent(
            config=self.agents_config['graph_analyst'],
            verbose=True
        )

    @task
    def answer_user_question(self) -> Task:
        return Task(
            config=self.tasks_config['answer_user_question'],
            tools=[query_knowledge_graph]
        )

    @crew
    def crew(self) -> Crew:
        return Crew(
            agents=self.agents,
            tasks=self.tasks,
            process=Process.sequential,
            verbose=False,
        )
```

Step 4: Create Our Chatbot
Now we're ready to bring everything together. In this step, we'll build the actual chatbot interface that connects our agent, the task, and the knowledge graph. This is what lets users ask questions in natural language and get back rich, structured responses that actually make sense.
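Before reading the full script, it helps to see its core trick in isolation: the agent's reply may embed a fenced ```json block with chart data, which we pull out with a regex and parse, leaving clean prose behind. A standalone sketch of that step:

```python
import json
import re

# Split a model reply into (prose, chart_data): find the first ```json fence,
# parse its contents, and strip the whole fence from the visible text.
def split_reply(reply):
    match = re.search(r"```json(.*?)```", reply, re.DOTALL)
    if not match:
        return reply, None
    try:
        data = json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        data = None
    return reply.replace(match.group(0), "").strip(), data

# Build a sample reply programmatically so the fence characters don't
# collide with the surrounding article markup
fence = "`" * 3
sample = (
    "There are 12 AI projects.\n"
    f"{fence}json\n"
    '{"type": "bar", "labels": ["AI"], "values": [12]}\n'
    f"{fence}"
)
text, chart = split_reply(sample)
```

The chatbot script below implements the same idea as `extract_json_and_clean_reply`, then hands the parsed payload to Plotly.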
```python
%%writefile chat.py
import plotly.graph_objects as go
import chainlit as cl
import json
import re

from setup import *  # Environment config and Neo4j connection
from crew import KnowledgeGraphAnsweringCrew  # The CrewAI implementation from Step 3

# Format conversation history as Markdown for context injection
def format_history_as_markdown(history):
    md = ""
    for msg in history:
        author = msg["author"].capitalize()
        content = msg["content"].strip()
        md += f"**{author}:** {content}\n\n"
    return md

# Extract and parse a JSON code block from the reply (if present)
def extract_json_and_clean_reply(reply: str):
    pattern = r"```json(.*?)```"
    match = re.search(pattern, reply, re.DOTALL)
    json_data = None
    if match:
        json_block = match.group(1).strip()
        try:
            json_data = json.loads(json_block)
        except json.JSONDecodeError:
            json_data = None
        # Remove the JSON code block from the reply text
        reply = reply.replace(match.group(0), "").strip()
    return reply, json_data

# Generate a Plotly chart from JSON data (bar, line, pie supported)
def generate_plot_from_json(data: dict) -> go.Figure:
    chart_type = data.get("type", "")
    title = data.get("title") or f"{chart_type.capitalize()} Chart"
    if chart_type == "bar":
        fig = go.Figure(data=[
            go.Bar(x=data.get("labels", []), y=data.get("values", []))
        ])
    elif chart_type == "line":
        fig = go.Figure(data=[
            go.Scatter(x=data.get("labels", []), y=data.get("values", []), mode="lines")
        ])
    elif chart_type == "pie":
        fig = go.Figure(data=[
            go.Pie(labels=data.get("labels", []), values=data.get("values", []))
        ])
    else:
        print(f"[WARN] Unsupported chart type: {chart_type}")
        fig = go.Figure()
    fig.update_layout(title=title)
    return fig

# Chainlit message handler for incoming user messages
@cl.on_message
async def on_message(message: cl.Message):
    # Retrieve previous conversation history from the session
    history = cl.user_session.get("history") or []

    # Instantiate and prepare the Crew
    crew_base = KnowledgeGraphAnsweringCrew()
    crew = crew_base.crew()

    # Package user input and formatted history for the agent
    user_input = {
        "input": message.content,
        "history": format_history_as_markdown(history)
    }

    # Run the Crew without blocking the event loop
    result = await cl.make_async(crew.kickoff)(inputs=user_input)
    reply = result.raw_output if hasattr(result, "raw_output") else str(result)

    # Extract JSON chart data (if any) and remove it from the visible reply
    reply_no_code, chart_data = extract_json_and_clean_reply(reply)

    # Update conversation history
    history.append({"author": "user", "content": message.content})
    history.append({"author": "assistant", "content": reply})
    cl.user_session.set("history", history)

    # If a valid chart type is found, display it inline with the message
    if chart_data and chart_data.get("type") in {"bar", "line", "pie"}:
        fig = generate_plot_from_json(chart_data)
        await cl.Message(
            content=reply_no_code,
            elements=[cl.Plotly(name="chart", figure=fig, display="inline")]
        ).send()
    else:
        # No chart detected: send the full reply as-is
        await cl.Message(content=reply).send()
```

Run the App
To get the chatbot running, just use this command. It starts up the Chainlit server and makes your chatbot available at http://localhost:8000.
```
!chainlit run chat.py --host 0.0.0.0 --port 8000
```
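For reference, `setup.py` expects a `.env` file alongside your code defining the three connection variables. The values below are placeholders for an Aura instance; use your own credentials:

```shell
# .env — placeholder values; replace with your own Neo4j credentials
NEO4J_URI=neo4j+s://abc123.databases.neo4j.io
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=your-password-here
```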
Conclusion
Working with relational data through LLMs can be genuinely challenging. You've got multiple tables and complex relationships everywhere, and while large language models can theoretically generate SQL, in practice they often struggle to consistently produce queries that actually work. I've run into this problem more times than I can count across various projects.
Neo4j and knowledge graphs offer a powerful alternative. Even with a complex schema full of different node types and relationships, LLMs like GPT-4o are surprisingly good at generating correct Cypher queries, and they do it consistently. Combine that with semantic search and you get the best of both worlds: the flexibility of a vector database plus the structured querying power of a graph database, all wrapped in a conversational interface that anyone can use.
If you want to make your chatbot even more effective, consider using in-context learning techniques to improve LLM responses.