Every time it almost worked, something broke. But I learned a lot trying to get a free, fast, intelligent chatbot running on my Gatsby portfolio.
First came the realization that building an AI chatbot would not be a weekend project. I thought it would be, but two weeks later, I was knee-deep in vector mismatches, missing models, and context recipe testing.
What followed was the hard truth that even AI has a hard time helping you build other AI. It can only take you so far until things work. Sort of.
Let’s just say there’s nothing like watching your custom chatbot spew completely made-up facts about your career 😒
It soon became apparent this wasn’t going to be plug-and-play.
So why do it? Why spend time even integrating a custom AI when there are so many premade options?
I’ll tell you why: the learning experience. There’s no better way to get an in-depth look at handling AI models behind the scenes than building one yourself. It’s basic facts 💁♀️
Without further ado, let’s dive in.
The Initial Objective
I wanted a site-wide AI chatbot that could:
- Answer questions about my work and experience
- Understand and use the content I’d already written in JSON files
- Run fast and feel responsive
- Not cost a dime (a free solution for the limited utility it would provide)
The MVP stack I settled on consisted of:
- GatsbyJS site (my portfolio is already built with Gatsby)
- Supabase for vector search and serverless functions
- Hugging Face (via Xenova) for local embeddings
- Client-side and server-side caching
- Lightweight model for response generation
I wrote a step-by-step guide on how I got the core chatbot running. This post is everything that happened next—from surprises and mistakes to lessons and breakthroughs.
The Setup: Theory vs Reality
On paper, the workflow looked clean:
1. Recursively parse all JSON files in the data/ directory
2. Break content into chunks for embedding
3. Generate embeddings with Xenova/all-MiniLM-L6-v2
4. Store vectors in Supabase with pgvector
5. At query time, find relevant chunks and generate a reply using a local model
Everything worked well enough until I hit step five. That’s when things broke down, and here’s a breakdown of the what and why (pun intended).
Real-World Challenges and Solutions
Parsing Nested Files
Going into this, I didn’t understand context much, so I failed to consider nested paths in my file discovery script.
Solution: Recursively walk the data/ directory. Strip paths and extract the base filename (e.g. projects) for contentType fields. This allowed the system to treat all projects data the same, regardless of depth.
```javascript
// Process all JSON files in the data directory
async function processPortfolioContent() {
  const dataDir = path.join(process.cwd(), 'src/data');

  // Recursively find all JSON files, however deeply nested
  function findJsonFiles(dir) {
    const files = fs.readdirSync(dir);
    return files.flatMap(file => {
      const fullPath = path.join(dir, file);
      if (fs.statSync(fullPath).isDirectory()) return findJsonFiles(fullPath);
      return fullPath.endsWith('.json') ? [fullPath] : [];
    });
  }

  // Get all JSON files in the data directory
  const files = findJsonFiles(dataDir);

  for (const file of files) {
    // The base filename (e.g. "projects") becomes the contentType
    const contentType = path.basename(file, '.json');

    // Read and parse the JSON file
    const rawData = fs.readFileSync(file, 'utf8');
    const data = JSON.parse(rawData);

    console.log(`Processing ${contentType} data...`);

    // Process each section of the content
    await processDataRecursively(data, contentType);
  }

  console.log('Content processing complete!');
}
```

The parentKey Format
I built a key format like projects:[0].description to track the source of each content chunk. It mimicked JavaScript object paths and was great for debugging.
The good:
- Clear traceability back to the original content
- Fine-grained control over chunk-level retrieval
The bad:
- Too granular. It split up semantically connected information
- Retrievals surfaced orphaned chunks, lacking the necessary context
Tip ⚠️
The lesson from this was that a little context fragmentation helps with retrieval, but too much breaks logical associations.
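One way to cap that fragmentation is to keep whole subtrees together whenever their serialized text is short enough, and only split deeper when a chunk would be too large. Here’s a rough sketch of the idea; chunkByParent and the 500-character threshold are illustrative choices of mine, not the exact code from my script:

```javascript
// Flatten parsed JSON into chunks, but stop descending once a subtree's
// serialized text fits under maxChars, so related fields stay together.
function chunkByParent(node, parentKey, maxChars = 500) {
  // Leaf values (strings, numbers, etc.) are atomic chunks
  if (typeof node !== 'object' || node === null) {
    return [{ key: parentKey, content: String(node) }];
  }
  const text = JSON.stringify(node);
  // Small enough: keep the whole subtree as one chunk
  if (text.length <= maxChars) {
    return [{ key: parentKey, content: text }];
  }
  // Too big: recurse one level deeper, building parentKey-style paths
  return Object.entries(node).flatMap(([key, value]) => {
    const childKey = Array.isArray(node)
      ? `${parentKey}[${key}]`
      : `${parentKey}.${key}`;
    return chunkByParent(value, childKey, maxChars);
  });
}
```

Calling chunkByParent(projects, 'projects') then yields one chunk per project when each fits under the threshold, instead of one orphaned chunk per field.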
Keeping Content Fresh
Once I updated a JSON file, I needed the chatbot to reflect the new info. But the system didn’t auto-sync changes.
Solution: Rerun the processPortfolioContent() script manually to:
- Regenerate embeddings
- Push new or updated vectors to Supabase
To avoid duplicates or mismatches, I added checks in the Supabase client to determine whether to insert, update, or delete vectors and a flag to decide which action to run.
```javascript
async function upsertPortfolioContent({
  content,
  contentType,
  contextKey,
  embedding,
  action = 'create', // or 'update' or 'delete'
}) {
  const content_type = `${contentType}:${contextKey}`;

  if (action === 'delete') {
    const { error } = await supabase
      .from('portfolio_content')
      .delete()
      .eq('content_type', content_type);
    if (error) console.error('Delete failed:', error);
    else console.log(`Deleted ${content_type} content`);
    return;
  }

  // Check if the entry already exists
  const { data: existing, error: fetchError } = await supabase
    .from('portfolio_content')
    .select('id')
    .eq('content_type', content_type)
    .maybeSingle();

  if (fetchError) {
    console.error('Check failed:', fetchError);
    return;
  }

  if (existing && action === 'update') {
    // Update the existing row
    const { error } = await supabase
      .from('portfolio_content')
      .update({ content, embedding })
      .eq('content_type', content_type);
    if (error) console.error('Update failed:', error);
    else console.log(`Updated ${content_type} content`);
  } else if (!existing && ['create', 'update'].includes(action)) {
    // Insert a new row
    const { error } = await supabase
      .from('portfolio_content')
      .insert({ content, content_type, embedding });
    if (error) console.error('Insert failed:', error);
    else console.log(`Stored ${content_type} content`);
  }
}
```

Supabase Edge Function Constraints
Supabase Edge Functions run in a Deno environment, and that brought some unexpected limitations.
NPM Import Issues
Attempting to use Xenova’s models via npm:@xenova/transformers@2 failed.
Error: Cannot find module 'npm:@xenova/transformers@2'
Deno doesn’t support Node-style NPM imports unless bundled. Supabase Edge runs unbundled ESM by default, so the fix was to use a CDN-hosted ESM version, like: import * as transformers from 'https://esm.sh/@xenova/transformers'
Not a shocking insight, but if you haven’t worked in a Deno environment before (🙋♀️), it can certainly catch you off guard.
Local File Access
Xenova’s transformer models attempt to read model files from disk, but Supabase Edge Functions don’t allow file access. This results in an error like: The URL must be of scheme file.
A couple of solution options are to:
- Run Xenova locally for dev and testing
- Switch to Hugging Face’s hosted inference API for deployment
I opted for the second option. While it introduced rate limits, it was simple and stable. However, it wasn’t the best choice, as you’ll see later on.
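For reference, the hosted call boils down to a single fetch. This is a hedged sketch: buildEmbeddingRequest is an illustrative helper of mine, and the endpoint and payload shape reflect the Inference API as I used it, so double-check the current Hugging Face docs before copying it:

```javascript
// Build the request for Hugging Face's hosted inference API instead of
// loading model files from disk (which Edge Functions forbid).
const HF_MODEL = 'sentence-transformers/all-MiniLM-L6-v2';

function buildEmbeddingRequest(text, apiKey) {
  return {
    url: `https://api-inference.huggingface.co/models/${HF_MODEL}`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ inputs: text }),
    },
  };
}

// Usage inside the Edge Function (API key pulled from the environment):
// const { url, options } = buildEmbeddingRequest(query, Deno.env.get('HF_API_KEY'));
// const embedding = await (await fetch(url, options)).json();
```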
Making the Chatbot Feel Instant
I wanted the bot to stream answers the way ChatGPT does—not just dump the entire response after several seconds.
To achieve this effect, I opted for fake streaming with Server-Sent Events (SSE). This consisted of:
- Generating a full response
- Splitting it into tokens or words
- Sending chunks incrementally using setTimeout()
The result was better UX, no additional token cost, and, most importantly, a feeling that it’s conversational.
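The core of the trick is tiny. In this sketch, splitIntoChunks keeps the whitespace attached to each word so the rejoined text matches the original, and streamFake is an illustrative driver whose onChunk callback could append to an SSE response or to React state; the 30 ms delay is an arbitrary choice:

```javascript
// Split a finished reply into word-sized chunks, preserving whitespace
// so that chunks.join('') reproduces the original text exactly.
function splitIntoChunks(text) {
  return text.match(/\S+\s*/g) || [];
}

// Reveal the chunks one at a time to fake a token stream.
function streamFake(text, onChunk, delayMs = 30) {
  const chunks = splitIntoChunks(text);
  chunks.forEach((chunk, i) => {
    setTimeout(() => onChunk(chunk), i * delayMs);
  });
}
```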
The Disappearing Model
I originally used mistralai/Mistral-7B-Instruct-v0.2 and it worked extremely well—until one day it didn’t.
The model disappeared from Hugging Face with no warning. I tested alternatives but hit a wall:
- Some weren’t instruction-tuned
- Others ignored the prompt entirely
- Many generated hallucinations or filler content
Examples included invented job roles, fictional side projects, and contact info that didn’t exist in my data.
Eventually, I switched to Gemini models, which solved those problems thanks to features like:
- Long context windows (up to 2 million tokens)
- System instructions to steer output
- Context caching to reduce token usage in repeated queries
Gemini’s ability to maintain long context threads and stick to the data significantly improved answer quality.
Embedding Incompatibilities
Now, for some more learning pains. I ran into vector dimension errors when trying to switch from Xenova embeddings to Gemini’s embedding model.
Error: Error: different vector dimensions 384 and 768
The reason for this is that Supabase pgvector expects all embeddings in a table to have the same dimensionality.
- Xenova/all-MiniLM-L6-v2 = 384D
- Gemini embedding = 768D
You can’t mix and match. Re-embed everything with the new model if switching.
Tip: Consistency in embeddings is crucial for valid similarity search!
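A cheap guard saves a failed round trip to Supabase: validate the dimensionality before upserting. assertEmbeddingDim is an illustrative helper of mine, not part of pgvector or the Supabase client:

```javascript
// Fail fast if an embedding's dimensionality doesn't match what the
// pgvector column was created with, instead of letting the database
// reject it at insert or search time.
const EXPECTED_DIM = 768; // Gemini embeddings; was 384 for all-MiniLM-L6-v2

function assertEmbeddingDim(embedding, expected = EXPECTED_DIM) {
  if (!Array.isArray(embedding) || embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding?.length ?? 'no'} dimensions, expected ${expected}`
    );
  }
  return embedding;
}
```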
UI State Bugs
Two separate state issues created unwanted side effects in the chat UI.
1. Replaying Old Messages on Reopen
If I closed the chatbot and reopened it, the conversation replayed line by line. The issue stemmed from restoring the state multiple times.
Fix: Only load history once, on mount. If a session exists, skip the welcome message.
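Stripped of the React specifics, the fix looks like this. hydrateChatOnce is a framework-agnostic sketch with illustrative names; in the real component, the guard lives in a useEffect with an empty dependency array:

```javascript
// Module-level flag: hydration happens exactly once per page load,
// even if the chat widget unmounts and remounts.
let hasHydrated = false;

function hydrateChatOnce(loadHistory, showWelcome) {
  if (hasHydrated) return; // already restored; do nothing on reopen
  hasHydrated = true;
  const history = loadHistory(); // e.g. from sessionStorage
  if (history && history.length > 0) {
    return history; // existing session: restore silently, skip the welcome
  }
  showWelcome(); // fresh session: greet the visitor
  return [];
}
```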
2. State Not Shared Across Components
This one was a first for me. I used useChatbot() in both <Chatbot /> and <ChatbotLoader />, creating isolated instances.
Fix: Centralize state with a shared provider. Ensure all chatbot-related components pull from the same store.
Ironically, I did not proceed with creating a custom provider but decided to stick to the solo custom hook. Because of this, I needed to manage the active state of the chatbot independently.
```javascript
export default function ChatbotLoader() {
  // const { isOpen, toggleChat } = useChatbot();
  const [isActive, setIsActive] = useState(false);

  // If the chatbot is not open, show just the fab button
  if (!isActive) {
    return (
      <Tooltip
        title="Chat with EVE"
        placement="top-end"
        componentsProps={{
          tooltip: {
            sx: {
              bgcolor: '#c71585',
              color: 'white',
              boxShadow: '0 4px 20px rgba(199, 21, 133, 0.25)',
            },
          },
        }}>
        <Box
          sx={{
            position: 'fixed',
            bottom: 35,
            '@media(max-width: 600px)': {
              bottom: 25,
            },
            right: '6%',
            zIndex: 1400,
          }}>
          <Zoom in={true}>
            <Fab
              onClick={() => setIsActive(!isActive)}
              aria-label="open chat"
              data-gtm-track="chat-button"
              sx={{
                bgcolor: '#c71585',
                color: 'white',
                '&:hover': {
                  bgcolor: '#a01269',
                },
                boxShadow: '0 4px 20px rgba(199, 21, 133, 0.25)',
              }}>
              <ChatIcon />
            </Fab>
          </Zoom>
        </Box>
      </Tooltip>
    );
  }

  return (
    <Suspense fallback={<CircularProgress />}>
      {isActive && (
        <LazyChatbot
          isOpen={isActive}
          onOpen={setIsActive}
        />
      )}
    </Suspense>
  );
}
```

Why, might you ask? Well, each instance of the hook has its own independent state, which makes it challenging to keep track of changes and manage state updates across both components.
Note 😓
This translates to synchronization problems where changes in one component’s state aren’t reflected in others.
And, trust me when I tell you, it took a small brain whirl to pinpoint what I was seeing because, of course, I’d forget the fact I skipped creating a provider.
Learn from my mistakes, people.
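For completeness, the provider-style fix I skipped could be as small as a module-level store. This is a sketch with illustrative names, not code from my repo; in the app, these functions would be exported from a shared module that both components import:

```javascript
// One shared state object plus a subscribe mechanism, so <Chatbot /> and
// <ChatbotLoader /> read and write the same isOpen flag instead of each
// hook call creating its own isolated copy.
const listeners = new Set();
let state = { isOpen: false };

function getChatState() {
  return state;
}

function setChatState(patch) {
  state = { ...state, ...patch };
  // Notify every subscribed component of the new state
  listeners.forEach(listener => listener(state));
}

function subscribe(listener) {
  listeners.add(listener);
  return () => listeners.delete(listener); // returns an unsubscribe function
}
```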
Why I Stuck with JavaScript
I had the option to switch to Python, but kept the entire stack in JavaScript. Here’s why:
Benefits:
- One language across the frontend and backend
- Native compatibility with Supabase Edge Functions (Deno = TypeScript/JS)
- Faster cold starts than Python
- No need for a separate CI/CD setup
- Browser-compatible ML libraries like transformers.js
When Python Might Be Better:
- Advanced ML tasks or model training
- Need for Python-native libraries (spaCy, NLTK, etc.)
- Existing Python infrastructure
In my case, staying in JS kept the build process simpler and more consistent.
Rate Limits and Performance
Language choice didn’t impact rate limits. What mattered was:
- Volume of vector DB operations
- Model generation time
- Cache hit/miss ratio
I optimized performance with aggressive caching:
- In-memory client-side cache for recent queries
- Supabase cache layer for duplicate prompts
- Selective reprocessing of content only when it changes
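The client-side piece of that caching can be tiny. This sketch normalizes the query so trivially different phrasings hit the same entry, and caps the map so it never grows unbounded; the names and the 50-entry cap are illustrative choices of mine:

```javascript
// In-memory cache for recent queries, keyed by a normalized form
// so "Hello   World" and "hello world" share one entry.
const queryCache = new Map();
const MAX_ENTRIES = 50;

function normalize(query) {
  return query.trim().toLowerCase().replace(/\s+/g, ' ');
}

function cacheGet(query) {
  return queryCache.get(normalize(query));
}

function cacheSet(query, answer) {
  if (queryCache.size >= MAX_ENTRIES) {
    // Evict the oldest entry (Maps iterate in insertion order)
    queryCache.delete(queryCache.keys().next().value);
  }
  queryCache.set(normalize(query), answer);
}
```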
Final Thoughts
This project taught me a lot more than how to build a chatbot. It forced me to understand the real-world limitations of serverless AI, vector search, model behavior, and frontend state management.
I learned how to debug Deno constraints, deal with disappearing models, fake a stream, and make chatbots behave like actual humans.
Now my chatbot:
- Understands my portfolio
- Feels responsive
- Costs nothing to run
- Can answer questions about my projects with accuracy
It wasn’t easy. But it’s totally doable. If you’re trying to build your own AI chatbot, hopefully this post saves you from some of the potholes I hit. And if you hit something new, let me know—I’m still learning too.
Want to try building one yourself? Start by checking out the project repo on GitHub.
Have questions or comments? Start a conversation in the comments; otherwise, I’ll see you in the next one 🙂