Sorry for posting less and less lately, but as you probably know, AI is changing insanely fast. Every time I start writing an article about some cool new thing, it risks being outdated before I even hit publish.
So I’ve decided to focus on writing about use cases I’m actually building for my customers: stuff they love and that really works in real life.
And this one? It’s pretty damn awesome.
How I Built a RAG Chatbot Over Confluence with Chainlit, Postgres pgvector, and Python
In our company, most of the documentation lives in Confluence. Like many teams, we wondered how to tap into this wealth of internal knowledge with LLMs. So I built a Retrieval-Augmented Generation (RAG) chatbot that connects directly to our Confluence spaces and pages.
Here’s a straightforward walkthrough of how I did it using Postgres with the pgvector extension, OpenAI embeddings, and Chainlit for the chatbot interface.
Stack used
- Python
- PostgreSQL with the pgvector extension
- OpenAI embeddings (model text-embedding-3-small)
- Chainlit for a minimal frontend chatbot
- Atlassian Confluence API to pull pages
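All the snippets below read credentials from a .env file. Mine looks roughly like this (the variable names match the code; the values are placeholders for your own credentials):
ATLASSIAN_API_TOKEN=your-confluence-api-token
OPENAI_API_KEY=sk-...
POSTGRES_URL=postgresql://user:password@localhost:5432/ragdb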
Step 1 – Pull Confluence Content
To get started, I fetch pages from a specific Confluence space using their REST API. Here’s a simple Python snippet that pulls pages with their HTML content:
import os

import requests
from dotenv import load_dotenv
from requests.auth import HTTPBasicAuth

load_dotenv()

CONFLUENCE_URL = "https://your-domain.atlassian.net/wiki"
SPACE_KEY = "ENG"

auth = HTTPBasicAuth("your_email", os.getenv("ATLASSIAN_API_TOKEN"))

def fetch_confluence_pages():
    """Fetch every page in the space along with its HTML (storage-format) body."""
    results = []
    start = 0
    while True:
        url = (
            f"{CONFLUENCE_URL}/rest/api/content"
            f"?spaceKey={SPACE_KEY}&expand=body.storage&start={start}&limit=25"
        )
        resp = requests.get(url, auth=auth)
        resp.raise_for_status()
        data = resp.json()
        pages = data.get("results", [])
        if not pages:
            break
        for page in pages:
            title = page["title"]
            content = page["body"]["storage"]["value"]  # HTML in storage format
            results.append((title, content))
        if len(pages) < 25:  # fewer than a full batch means we hit the last page
            break
        start += 25
    return results
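Before indexing anything, it’s worth a quick sanity check that the fetch works. Something like this, run from the same file:

if __name__ == "__main__":
    # Just list what the space returns; no indexing yet.
    pages = fetch_confluence_pages()
    print(f"Fetched {len(pages)} pages from space {SPACE_KEY}")
    for title, _ in pages[:5]:
        print("-", title)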
Below is a screenshot of a Confluence page I indexed and want to query with the chatbot:

Step 2 – Store Content in Postgres with Embeddings
I clean the HTML content to plain text, generate embeddings with OpenAI, then store it in Postgres using the pgvector extension:
import os

import openai
import psycopg2
from bs4 import BeautifulSoup
from dotenv import load_dotenv

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

conn = psycopg2.connect(os.getenv("POSTGRES_URL"))
cur = conn.cursor()

# Make sure the pgvector extension is available before using the vector type.
cur.execute("CREATE EXTENSION IF NOT EXISTS vector;")
cur.execute("""
    CREATE TABLE IF NOT EXISTS confluence_pages (
        id SERIAL PRIMARY KEY,
        title TEXT,
        content TEXT,
        embedding vector(1536)  -- text-embedding-3-small returns 1536 dimensions
    );
""")

# fetch_confluence_pages comes from Step 1 (same module or imported from it).
for title, html in fetch_confluence_pages():
    plain_text = BeautifulSoup(html, "html.parser").get_text()
    if len(plain_text) > 20:  # skip empty or near-empty pages
        embedding = openai.embeddings.create(
            model="text-embedding-3-small",
            input=plain_text,
        ).data[0].embedding
        # str(list) produces "[x, y, ...]", which the ::vector cast accepts.
        cur.execute("""
            INSERT INTO confluence_pages (title, content, embedding)
            VALUES (%s, %s, %s::vector)
        """, (title, plain_text, str(embedding)))

conn.commit()
cur.close()
conn.close()
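A few hundred pages search fine with a sequential scan, but if your corpus grows, pgvector can index the embeddings for approximate nearest-neighbor search. A minimal sketch, assuming pgvector 0.5 or newer (which added HNSW; older versions can use ivfflat instead):

import os

import psycopg2

conn = psycopg2.connect(os.getenv("POSTGRES_URL"))
cur = conn.cursor()
# vector_l2_ops matches the <-> (L2 distance) operator used in the search step.
cur.execute("""
    CREATE INDEX IF NOT EXISTS confluence_pages_embedding_idx
    ON confluence_pages USING hnsw (embedding vector_l2_ops);
""")
conn.commit()
cur.close()
conn.close()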
Step 3 – Search Relevant Pages by Semantic Similarity
Once the data is stored, I can query it semantically. Here’s how to get the closest pages for any question:
def search_confluence_docs(question):
    """Return the 3 pages closest to the question in embedding space."""
    conn = psycopg2.connect(os.getenv("POSTGRES_URL"))
    cur = conn.cursor()
    # Embed the question with the same model used for the documents.
    embedding = openai.embeddings.create(
        model="text-embedding-3-small",
        input=question,
    ).data[0].embedding
    # <-> is pgvector's L2 distance operator: smallest distance = most similar.
    cur.execute("""
        SELECT title, content
        FROM confluence_pages
        ORDER BY embedding <-> %s::vector
        LIMIT 3;
    """, (str(embedding),))
    results = cur.fetchall()
    cur.close()
    conn.close()
    return results
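You can test retrieval on its own from a REPL before wiring up the UI (the question here is just an example):

from search import search_confluence_docs

for title, content in search_confluence_docs("How do we deploy to staging?"):
    print(title, "->", content[:120].replace("\n", " "))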
Step 4 – Serve a Chatbot with Chainlit
I chose Chainlit because it’s quick to set up and gives you an interactive chat UI out of the box. First, install it:
pip install chainlit
Here’s the chatbot code that queries Postgres and prompts GPT-4:
import os

import chainlit as cl
import openai
from dotenv import load_dotenv

from search import search_confluence_docs

load_dotenv()
openai.api_key = os.getenv("OPENAI_API_KEY")

@cl.on_message
async def main(message: cl.Message):
    question = message.content
    # Retrieve the three most relevant Confluence pages and build the context block.
    matches = search_confluence_docs(question)
    context = "\n---\n".join(f"{title}:\n{content}" for title, content in matches)
    prompt = f"""You are a helpful assistant. Use the context below to answer the question.

Context:
{context}

Question:
{question}
"""
    response = openai.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    # Chainlit handlers are async, so the reply has to be awaited.
    await cl.Message(content=response.choices[0].message.content).send()
Step 5 – Run and Test
Run your chatbot locally with:
chainlit run main.py
Then open http://localhost:8000 in your browser.
Here’s a screenshot of me asking a question in Chainlit and the bot replying with answers citing the indexed pages:

Summary Architecture
Indexing: [Confluence] → [HTML → plain text] → [OpenAI embeddings] → [pgvector in Postgres]
Query: [User question] → [Vector similarity search] → [Inject context → GPT-4 → Chainlit UI]
What’s next?
- Set up a nightly sync job to keep Confluence data fresh (see the cron sketch after this list)
- Implement user access control for sensitive content
- Extend to other data sources like Jira, GitLab, or internal docs
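For the nightly sync, a plain cron entry is enough to start with. A sketch, assuming the Step 2 ingestion code lives in a script called ingest.py (a real version should upsert by page ID rather than blindly re-inserting):

# crontab entry: re-index every night at 02:00 (ingest.py is whatever you named the Step 2 script)
0 2 * * * cd /path/to/project && python3 ingest.py >> sync.log 2>&1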
This system makes it easy for the team to get context-aware answers from our Confluence knowledge base. If you’re working with Confluence and want to make your docs more accessible, give this approach a try.
