Every time it almost worked, something broke. But I learned a lot trying to get a free, fast, intelligent chatbot running on my Gatsby portfolio.
First came the realization that building an AI chatbot would not be a weekend project. I thought it would be, but two weeks later, I was knee-deep in vector mismatches, missing models, and context recipe testing.
What followed was the hard truth that even AI has a hard time helping you build other AI. It can only take you so far until things work. Sort of.
Let’s just say there’s nothing like watching your custom chatbot spew completely made-up facts about your career 😒
It soon became apparent this wasn’t going to be plug-and-play.
So why do it? Why spend time even integrating a custom AI when there are so many premade options?
I’ll tell you why: the learning experience. There’s no better way to get an in-depth look at handling AI models behind the scenes than building one yourself. It’s basic facts 💁♀️
Without further ado, let’s dive in.
The Initial Objective
I wanted a site-wide AI chatbot that could:
- Answer questions about my work and experience
- Understand and use the content I’d already written in JSON files
- Run fast and feel responsive
- Not cost a dime (a free solution for the limited utility it would provide)
The MVP stack I settled on consisted of:
- GatsbyJS site (my portfolio is already built with Gatsby)
- Supabase for vector search and serverless functions
- Hugging Face (via Xenova) for local embeddings
- Client-side and server-side caching
- Lightweight model for response generation
I wrote a step-by-step guide on how I got the core chatbot running. This post is everything that happened next—from surprises and mistakes to lessons and breakthroughs.
The Setup: Theory vs Reality
On paper, the workflow looked clean:
1. Recursively parse all JSON files in the data/ directory
2. Break content into chunks for embedding
3. Generate embeddings with Xenova/all-MiniLM-L6-v2
4. Store vectors in Supabase with pgvector
5. At query time, find relevant chunks and generate a reply using a local model
Everything worked well enough until I hit step five. That’s when things broke down, and here’s a breakdown of the what and why (pun intended).
Real-World Challenges and Solutions
Parsing Nested Files
Going into this, I didn’t understand context much, so I failed to consider nested paths in my file discovery script.
Solution: Recursively walk the data/ directory. Strip paths and extract the base filename (e.g. projects) for contentType fields. This allowed the system to treat all projects data the same, regardless of depth.
```javascript
// Process all JSON files in the data directory
async function processPortfolioContent() {
  const dataDir = path.join(process.cwd(), 'src/data');

  // Recursively find all JSON files, however deeply nested
  function findJsonFiles(dir) {
    const files = fs.readdirSync(dir);
    return files.flatMap(file => {
      const fullPath = path.join(dir, file);
      if (fs.statSync(fullPath).isDirectory()) return findJsonFiles(fullPath);
      return fullPath.endsWith('.json') ? [fullPath] : [];
    });
  }

  // Get all JSON files in the data directory
  const files = findJsonFiles(dataDir);

  for (const file of files) {
    // The base filename (e.g. "projects") becomes the contentType
    const contentType = path.basename(file, '.json');

    // Read and parse the JSON file
    const rawData = fs.readFileSync(file, 'utf8');
    const data = JSON.parse(rawData);

    console.log(`Processing ${contentType} data...`);

    // Process each section of the content
    await processDataRecursively(data, contentType);
  }

  console.log('Content processing complete!');
}
```

The parentKey Format
I built a key format like projects:[0].description to track the source of each content chunk. It mimicked JavaScript object paths and was great for debugging.
The good:
- Clear traceability back to the original content
- Fine-grained control over chunk-level retrieval
The bad:
- Too granular. It split up semantically connected information
- Retrievals surfaced orphaned chunks, lacking the necessary context
Tip ⚠️
The lesson from this was that a little context fragmentation helps with retrieval, but too much breaks logical associations.
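One way to cap that fragmentation is to keep whole subtrees together whenever their serialized text is short enough, and only split deeper when a chunk would be too large. Here’s a rough sketch of the idea; chunkByParent and the 500-character threshold are illustrative choices of mine, not the exact code from my script:

```javascript
// Flatten parsed JSON into chunks, but stop descending once a subtree's
// serialized text fits under maxChars, so related fields stay together.
function chunkByParent(node, parentKey, maxChars = 500) {
  // Leaf values (strings, numbers, etc.) are atomic chunks
  if (typeof node !== 'object' || node === null) {
    return [{ key: parentKey, content: String(node) }];
  }
  const text = JSON.stringify(node);
  // Small enough: keep the whole subtree as one chunk
  if (text.length <= maxChars) {
    return [{ key: parentKey, content: text }];
  }
  // Too big: recurse one level deeper, building parentKey-style paths
  return Object.entries(node).flatMap(([key, value]) => {
    const childKey = Array.isArray(node)
      ? `${parentKey}[${key}]`
      : `${parentKey}.${key}`;
    return chunkByParent(value, childKey, maxChars);
  });
}
```

Calling chunkByParent(projects, 'projects') then yields one chunk per project when each fits under the threshold, instead of one orphaned chunk per field.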
Keeping Content Fresh
Once I updated a JSON file, I needed the chatbot to reflect the new info. But the system didn’t auto-sync changes.
Solution: Rerun the processPortfolioContent() script manually to:
- Regenerate embeddings
- Push new or updated vectors to Supabase
To avoid duplicates or mismatches, I added checks in the Supabase client to determine whether to insert, update, or delete vectors and a flag to decide which action to run.
```javascript
async function upsertPortfolioContent({
  content,
  contentType,
  contextKey,
  embedding,
  action = 'create', // or 'update' or 'delete'
}) {
  const content_type = `${contentType}:${contextKey}`;

  if (action === 'delete') {
    const { error } = await supabase
      .from('portfolio_content')
      .delete()
      .eq('content_type', content_type);
    if (error) console.error('Delete failed:', error);
    else console.log(`Deleted ${content_type} content`);
    return;
  }

  // Check if the entry already exists
  const { data: existing, error: fetchError } = await supabase
    .from('portfolio_content')
    .select('id')
    .eq('content_type', content_type)
    .maybeSingle();

  if (fetchError) {
    console.error('Check failed:', fetchError);
    return;
  }

  if (existing && action === 'update') {
    // Update the existing row
    const { error } = await supabase
      .from('portfolio_content')
      .update({ content, embedding })
      .eq('content_type', content_type);
    if (error) console.error('Update failed:', error);
    else console.log(`Updated ${content_type} content`);
  } else if (!existing && ['create', 'update'].includes(action)) {
    // Insert a new row
    const { error } = await supabase
      .from('portfolio_content')
      .insert({ content, content_type, embedding });
    if (error) console.error('Insert failed:', error);
    else console.log(`Stored ${content_type} content`);
  }
}
```

Supabase Edge Function Constraints
Supabase Edge Functions run in a Deno environment, and that brought some unexpected limitations.
NPM Import Issues
Attempting to use Xenova’s models via npm:@xenova/transformers@2 failed.
Error: Cannot find module 'npm:@xenova/transformers@2'
Deno doesn’t support Node-style NPM imports unless bundled. Supabase Edge runs unbundled ESM by default, so the fix was to use a CDN-hosted ESM version, like: import * as transformers from 'https://esm.sh/@xenova/transformers'
Not a shocking insight, but if you haven’t worked in a Deno environment before (🙋♀️), it can certainly catch you off guard.
Local File Access
Xenova’s transformer models attempt to read model files from disk, but Supabase Edge Functions don’t allow file access. This results in an error like: The URL must be of scheme file.
A couple of solution options are to:
- Run Xenova locally for dev and testing
- Switch to Hugging Face’s hosted inference API for deployment
I opted for the second option. While it introduced rate limits, it was simple and stable. However, it wasn’t the best choice, as you’ll see later on.
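For reference, the hosted call boils down to a single fetch. This is a hedged sketch: buildEmbeddingRequest is an illustrative helper of mine, and the endpoint and payload shape reflect the Inference API as I used it, so double-check the current Hugging Face docs before copying it:

```javascript
// Build the request for Hugging Face's hosted inference API instead of
// loading model files from disk (which Edge Functions forbid).
const HF_MODEL = 'sentence-transformers/all-MiniLM-L6-v2';

function buildEmbeddingRequest(text, apiKey) {
  return {
    url: `https://api-inference.huggingface.co/models/${HF_MODEL}`,
    options: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ inputs: text }),
    },
  };
}

// Usage inside the Edge Function (API key pulled from the environment):
// const { url, options } = buildEmbeddingRequest(query, Deno.env.get('HF_API_KEY'));
// const embedding = await (await fetch(url, options)).json();
```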
Making the Chatbot Feel Instant
I wanted the bot to stream answers the way ChatGPT does—not just dump the entire response after several seconds.
To achieve this effect, I opted for fake streaming with Server-Sent Events (SSE). This consisted of:
- Generating a full response
- Splitting it into tokens or words
- Sending chunks incrementally using setTimeout()
The result was better UX, no additional token cost, and, most importantly, a feeling that it’s conversational.
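The core of the trick is tiny. In this sketch, splitIntoChunks keeps the whitespace attached to each word so the rejoined text matches the original, and streamFake is an illustrative driver whose onChunk callback could append to an SSE response or to React state; the 30 ms delay is an arbitrary choice:

```javascript
// Split a finished reply into word-sized chunks, preserving whitespace
// so that chunks.join('') reproduces the original text exactly.
function splitIntoChunks(text) {
  return text.match(/\S+\s*/g) || [];
}

// Reveal the chunks one at a time to fake a token stream.
function streamFake(text, onChunk, delayMs = 30) {
  const chunks = splitIntoChunks(text);
  chunks.forEach((chunk, i) => {
    setTimeout(() => onChunk(chunk), i * delayMs);
  });
}
```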
The Disappearing Model
I originally used mistralai/Mistral-7B-Instruct-v0.2 and it worked extremely well—until one day it didn’t.
The model disappeared from Hugging Face with no warning. I tested alternatives but hit a wall:
- Some weren’t instruction-tuned
- Others ignored the prompt entirely
- Many generated hallucinations or filler content
Examples included invented job roles, fictional side projects, and contact info that didn’t exist in my data.
Eventually, I switched to Gemini models, which solved those problems thanks to features like:
- Long context windows (up to 2 million tokens)
- System instructions to steer output
- Context caching to reduce token usage in repeated queries
Gemini’s ability to maintain long context threads and stick to the data significantly improved answer quality.
Embedding Incompatibilities
Now, for some more learning pains. I ran into vector dimension errors when trying to switch from Xenova embeddings to Gemini’s embedding model.
Error: Error: different vector dimensions 384 and 768
The reason for this is that Supabase pgvector expects all embeddings in a table to have the same dimensionality.
- Xenova/all-MiniLM-L6-v2 = 384D
- Gemini embedding = 768D
You can’t mix and match. Re-embed everything with the new model if switching.
Tip: Consistency in embeddings is crucial for valid similarity search!
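A cheap guard saves a failed round trip to Supabase: validate the dimensionality before upserting. assertEmbeddingDim is an illustrative helper of mine, not part of pgvector or the Supabase client:

```javascript
// Fail fast if an embedding's dimensionality doesn't match what the
// pgvector column was created with, instead of letting the database
// reject it at insert or search time.
const EXPECTED_DIM = 768; // Gemini embeddings; was 384 for all-MiniLM-L6-v2

function assertEmbeddingDim(embedding, expected = EXPECTED_DIM) {
  if (!Array.isArray(embedding) || embedding.length !== expected) {
    throw new Error(
      `Embedding has ${embedding?.length ?? 'no'} dimensions, expected ${expected}`
    );
  }
  return embedding;
}
```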
UI State Bugs
Two separate state issues created unwanted side effects in the chat UI.
1. Replaying Old Messages on Reopen
If I closed the chatbot and reopened it, the conversation replayed line by line. The issue stemmed from restoring the state multiple times.
Fix: Only load history once, on mount. If a session exists, skip the welcome message.
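Stripped of the React specifics, the fix looks like this. hydrateChatOnce is a framework-agnostic sketch with illustrative names; in the real component, the guard lives in a useEffect with an empty dependency array:

```javascript
// Module-level flag: hydration happens exactly once per page load,
// even if the chat widget unmounts and remounts.
let hasHydrated = false;

function hydrateChatOnce(loadHistory, showWelcome) {
  if (hasHydrated) return; // already restored; do nothing on reopen
  hasHydrated = true;
  const history = loadHistory(); // e.g. from sessionStorage
  if (history && history.length > 0) {
    return history; // existing session: restore silently, skip the welcome
  }
  showWelcome(); // fresh session: greet the visitor
  return [];
}
```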
2. State Not Shared Across Components
This one was a first for me. I used useChatbot() in both <Chatbot /> and <ChatbotLoader />, creating isolated instances.
Fix: Centralize state with a shared provider. Ensure all chatbot-related components pull from the same store.
Ironically, I did not proceed with creating a custom provider but decided to stick to the solo custom hook. Because of this, I needed to manage the active state of the chatbot independently.
```javascript
export default function ChatbotLoader() {
  // const { isOpen, toggleChat } = useChatbot();
  const [isActive, setIsActive] = useState(false);

  // If the chatbot is not open, show just the fab button
  if (!isActive) {
    return (
      <Tooltip
        title="Chat with EVE"
        placement="top-end"
        componentsProps={{
          tooltip: {
            sx: {
              bgcolor: '#c71585',
              color: 'white',
              boxShadow: '0 4px 20px rgba(199, 21, 133, 0.25)',
            },
          },
        }}>
        <Box
          sx={{
            position: 'fixed',
            bottom: 35,
            '@media(max-width: 600px)': {
              bottom: 25,
            },
            right: '6%',
            zIndex: 1400,
          }}>
          <Zoom in={true}>
            <Fab
              onClick={() => setIsActive(!isActive)}
              aria-label="open chat"
              data-gtm-track="chat-button"
              sx={{
                bgcolor: '#c71585',
                color: 'white',
                '&:hover': {
                  bgcolor: '#a01269',
                },
                boxShadow: '0 4px 20px rgba(199, 21, 133, 0.25)',
              }}>
              <ChatIcon />
            </Fab>
          </Zoom>
        </Box>
      </Tooltip>
    );
  }

  return (
    <Suspense fallback={<CircularProgress />}>
      {isActive && (
        <LazyChatbot
          isOpen={isActive}
          onOpen={setIsActive}
        />
      )}
    </Suspense>
  );
}
```

Why, might you ask? Well, each instance of the hook has its own independent state, which makes it challenging to keep track of changes and manage state updates across both components.
Note 😓
This translates to synchronization problems where changes in one component’s state aren’t reflected in others.
And, trust me when I tell you, it took a small brain whirl to pinpoint what I was seeing because, of course, I’d forget the fact I skipped creating a provider.
Learn from my mistakes, people.
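For completeness, the provider-style fix I skipped could be as small as a module-level store. This is a sketch with illustrative names, not code from my repo; in the app, these functions would be exported from a shared module that both components import:

```javascript
// One shared state object plus a subscribe mechanism, so <Chatbot /> and
// <ChatbotLoader /> read and write the same isOpen flag instead of each
// hook call creating its own isolated copy.
const listeners = new Set();
let state = { isOpen: false };

function getChatState() {
  return state;
}

function setChatState(patch) {
  state = { ...state, ...patch };
  // Notify every subscribed component of the new state
  listeners.forEach(listener => listener(state));
}

function subscribe(listener) {
  listeners.add(listener);
  return () => listeners.delete(listener); // returns an unsubscribe function
}
```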
Why I Stuck with JavaScript
I had the option to switch to Python, but kept the entire stack in JavaScript. Here’s why:
Benefits:
- One language across the frontend and backend
- Native compatibility with Supabase Edge Functions (Deno = TypeScript/JS)
- Faster cold starts than Python
- No need for a separate CI/CD setup
- Browser-compatible ML libraries like transformers.js
When Python Might Be Better:
- Advanced ML tasks or model training
- Need for Python-native libraries (spaCy, NLTK, etc.)
- Existing Python infrastructure
In my case, staying in JS kept the build process simpler and more consistent.
Rate Limits and Performance
Language choice didn’t impact rate limits. What mattered was:
- Volume of vector DB operations
- Model generation time
- Cache hit/miss ratio
I optimized performance with aggressive caching:
- In-memory client-side cache for recent queries
- Supabase cache layer for duplicate prompts
- Selective reprocessing of content only when it changes
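The client-side piece of that caching can be tiny. This sketch normalizes the query so trivially different phrasings hit the same entry, and caps the map so it never grows unbounded; the names and the 50-entry cap are illustrative choices of mine:

```javascript
// In-memory cache for recent queries, keyed by a normalized form
// so "Hello   World" and "hello world" share one entry.
const queryCache = new Map();
const MAX_ENTRIES = 50;

function normalize(query) {
  return query.trim().toLowerCase().replace(/\s+/g, ' ');
}

function cacheGet(query) {
  return queryCache.get(normalize(query));
}

function cacheSet(query, answer) {
  if (queryCache.size >= MAX_ENTRIES) {
    // Evict the oldest entry (Maps iterate in insertion order)
    queryCache.delete(queryCache.keys().next().value);
  }
  queryCache.set(normalize(query), answer);
}
```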
Final Thoughts
This project taught me a lot more than how to build a chatbot. It forced me to understand the real-world limitations of serverless AI, vector search, model behavior, and frontend state management.
I learned how to debug Deno constraints, deal with disappearing models, fake a stream, and make chatbots behave like actual humans.
Now my chatbot:
- Understands my portfolio
- Feels responsive
- Costs nothing to run
- Can answer questions about my projects with accuracy
It wasn’t easy. But it’s totally doable. If you’re trying to build your own AI chatbot, hopefully this post saves you from some of the potholes I hit. And if you hit something new, let me know—I’m still learning too.
Want to try building one yourself? Start by checking out the project repo on GitHub.
Have questions or comments? Start a conversation in the comments; otherwise, I’ll see you in the next one 🙂