ScribbleGPT: Building a Basic Handwriting-Driven Chatbot

Creating ScribbleGPT, a chatbot inspired by Tom Riddle's diary from the Harry Potter series, was a fun and exciting challenge. The bot’s user interface replicates the experience of writing in Riddle’s diary where messages appear and vanish after a few seconds. In this article, I’ll walk through the technical aspects of bringing this magical experience to life.

HTML Canvas to Write On

The foundation of ScribbleGPT’s interface is the HTML Canvas where users write their messages. I started building one from scratch, which seemed pretty simple: listen to the pointer down and up events and track the points moved through in between. But the crude implementation produced very jagged lines. So, rather than write the smoothing myself, I opted for the Signature_Pad library, which is ideal for handwriting thanks to its built-in smoothing functions. A few other things made it a good fit:

  • Simple to Use: Signature_Pad offers a clean, straightforward way to capture input, which is perfect for this hack. It literally took 10 seconds to set up, thanks to awesome documentation and really intuitive API methods.

  • Programmatic Access: Users write through the Signature_Pad component, but the response also has to be written back to the same canvas. Since we still have direct access to the canvas, we can take user input via the pad and draw the AI response onto the canvas ourselves.

  • Stroke Detection: When a user finishes a stroke, an endStroke event is fired, and that is where we can start the text detection process. To avoid unnecessary recognitions, we can add debounce logic so that fewer recognition and GPT calls are made.

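import { useEffect, useRef } from "react";
import SignaturePad from "signature_pad";

// Scale the canvas backing store to the device pixel ratio so strokes stay crisp.
function resizeCanvas(canvas) {
  const ratio = Math.max(window.devicePixelRatio || 1, 1);
  canvas.width = canvas.offsetWidth * ratio;
  canvas.height = canvas.offsetHeight * ratio;

  canvas.getContext("2d").scale(ratio, ratio);
}

export default function Home() {
  const canvasRef = useRef(null);

  useEffect(() => {
    const canvas = canvasRef.current;
    if (!canvas) {
      return;
    }

    const writingPad = new SignaturePad(canvas);

    resizeCanvas(canvas);

    // Detach the pad's event listeners when the component unmounts.
    return () => writingPad.off();
  }, []); // Run once on mount instead of after every render

  return (
    <div style={{ height: "100vh" }}>
      <canvas ref={canvasRef} style={{ width: "100%", height: "100%", background: "#f3dfc1" }} />
    </div>
  );
}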
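
The debounce logic mentioned in the Stroke Detection bullet could look roughly like the sketch below. The `debounce` helper, the 1500 ms delay, and the `recognizeHandwriting` function in the commented wiring are all assumptions for illustration and are not taken from the project's code. The wiring also assumes a Signature_Pad version that exposes `addEventListener` for the endStroke event.

```javascript
// Hypothetical helper: runs `fn` only after `waitMs` milliseconds have
// passed without another call, so a burst of strokes triggers one call.
function debounce(fn, waitMs) {
  let timerId = null;
  return function debounced(...args) {
    if (timerId !== null) {
      clearTimeout(timerId); // a newer stroke arrived, so restart the wait
    }
    timerId = setTimeout(() => {
      timerId = null;
      fn(...args);
    }, waitMs);
  };
}

// Assumed wiring inside the useEffect shown earlier
// (recognizeHandwriting is a placeholder name):
//
//   const recognize = debounce(() => recognizeHandwriting(writingPad.toData()), 1500);
//   writingPad.addEventListener("endStroke", recognize);
```

With this in place, recognition starts only once the writer pauses, rather than after every single stroke of a word.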

Handwriting Recognition

Initially, I explored Tesseract.js for text recognition. Unfortunately, it didn’t work well for handwriting, especially in the case of cursive or varied styles. It’s better suited for printed text, and training it for handwriting would have required significant customization.

Instead, I found Google Input Tools to be a great alternative. Though the API isn’t officially documented, and there’s a risk it could stop working at any moment, it does a pretty solid job of recognizing handwriting. I leveraged its handwriting input tool, and so far, it has proven reliable for this project.

Once the endStroke event is fired, we get the stroke points from the pad and convert them to the input format Google Input Tools expects: for each point group (stroke), an x array, a y array, and an empty array.

{
  "options": "enable_pre_space",
  "requests": [{
    "writing_guide": {
      "writing_area_width": 800,
      "writing_area_height": 800
    },
    "ink": [
      [
        [<x_array>],
        [<y_array>],
        []
      ]
    ],
    "language": "en"
  }]
}
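
As a rough sketch of that conversion, the helpers below turn Signature_Pad point groups into the request shape above. They assume `toData()` returns an array of groups, each with a `points` array of `{ x, y }` objects, as recent Signature_Pad versions do. The function names and the coordinate rounding are my own assumptions, and the endpoint the body is posted to is left out.

```javascript
// Hypothetical helper: one [xs, ys, []] triple per stroke (point group).
function toInk(pointGroups) {
  return pointGroups.map((group) => {
    // Rounding is an assumption; it just keeps the payload compact.
    const xs = group.points.map((p) => Math.round(p.x));
    const ys = group.points.map((p) => Math.round(p.y));
    return [xs, ys, []];
  });
}

// Hypothetical helper: wraps the ink in the request body shown above.
function buildRecognitionRequest(pointGroups, width, height, language = "en") {
  return {
    options: "enable_pre_space",
    requests: [
      {
        writing_guide: {
          writing_area_width: width,
          writing_area_height: height,
        },
        ink: toInk(pointGroups),
        language,
      },
    ],
  };
}

// Example with two short strokes:
const exampleBody = buildRecognitionRequest(
  [
    { points: [{ x: 10, y: 20 }, { x: 12, y: 24 }] },
    { points: [{ x: 40, y: 22 }] },
  ],
  800,
  800
);
console.log(JSON.stringify(exampleBody.requests[0].ink));
```

Passing the canvas's actual width and height as the writing area, instead of a fixed 800, is probably the safer choice when the canvas fills the viewport.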

Creating the Tom Riddle Persona with OpenAI

The next piece of the puzzle was making ScribbleGPT behave like the cunning, mysterious character from the books. Luckily, GPT-4o already has a wealth of knowledge about Harry Potter lore, so a simple system prompt was enough to set the stage for Riddle-like responses:

new SystemMessage("You are Tom Marvolo Riddle from Harry Potter books. Always respond in the way Riddle does.")

User: Hi
RiddleGPT: Greetings. What brings you to seek an audience with me, Tom Marvolo Riddle?

User: My name is Shanmukha
RiddleGPT: Ah, Shanmukha. A name of power, perhaps? Tell me, what drives you? What ambitions do you harbor in the depths of your soul?

User: What is a Horcrux?
RiddleGPT: Ah, a Horcrux. It is a most ingenious and powerful form of dark magic. A Horcrux is an object in which a dark wizard or witch has hidden a fragment of their soul, thus granting them a form of immortality. By splitting one's soul and embedding a part of it into an object, one can ensure that they cannot be truly killed as long as the Horcrux remains intact. Creating a Horcrux requires the ultimate act of evil: murder, which rips the soul apart. Only those who dare to transcend the boundaries of morality and embrace the darkest of arts can achieve such a feat.
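
For a conversation like the one above to stay coherent across turns, the system prompt and the earlier turns have to be sent along with each new message. Here is a minimal sketch of assembling that list, using plain role/content objects (the shape OpenAI's chat API accepts) instead of the SystemMessage class shown above. The `buildMessages` helper and the idea of keeping a `history` array are assumptions, not necessarily how the project does it.

```javascript
const RIDDLE_PROMPT =
  "You are Tom Marvolo Riddle from Harry Potter books. Always respond in the way Riddle does.";

// Hypothetical helper: system prompt first, then earlier turns, then the
// text just recognized from the canvas.
function buildMessages(history, userText) {
  return [
    { role: "system", content: RIDDLE_PROMPT },
    ...history,
    { role: "user", content: userText },
  ];
}

// Example: the second turn of the conversation above.
const exampleHistory = [
  { role: "user", content: "Hi" },
  {
    role: "assistant",
    content: "Greetings. What brings you to seek an audience with me, Tom Marvolo Riddle?",
  },
];
const exampleMessages = buildMessages(exampleHistory, "My name is Shanmukha");
console.log(exampleMessages.length);
```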

Wrapping Text to Fit the Canvas Width

Another challenge was that the canvas fillText method doesn’t handle line breaks or text wrapping. So, I had to manually break the chatbot's response into multiple lines:

  • Build up a line word by word, measuring its width each time, for as long as it stays within the max width we set

  • Once adding the next word would exceed that width, move on to the next line and repeat the step above

function wrapText(ctx, text, x, y, maxWidth, lineHeight) {
  const words = text.split(' ');
  const lines = [];
  let line = '';

  for (let i = 0; i < words.length; i++) {
    const testLine = line + words[i] + ' ';
    const testWidth = ctx.measureText(testLine).width;

    // Start a new line once adding this word would exceed maxWidth
    if (testWidth > maxWidth && i > 0) {
      lines.push(line);
      line = words[i] + ' ';
    } else {
      line = testLine;
    }
  }

  lines.push(line);

  // Draw each line of text, moving down by lineHeight each time
  for (let i = 0; i < lines.length; i++) {
    ctx.fillText(lines[i], x, y + (i * lineHeight));
  }
}

Disappearing Effect

Another challenge was implementing the disappearing effect. It's a bit tricky since CSS properties can't be applied to what is drawn inside a canvas, so I had to explore alternative solutions. (Though now, come to think of it, I could've gotten away with changing the opacity of the entire canvas itself? :) )

I used a second canvas (I just learnt about OffscreenCanvas) to hold a snapshot of the drawing. By repeatedly redrawing that snapshot onto the visible canvas with a lower alpha each time, I achieved the desired fade-out effect.

function fadeOutAndClear(canvas, ctx) {
  let alpha = 1.0;  // Start with full opacity

  // Snapshot the current drawing so it can be redrawn at lower opacity
  const offscreenCanvas = new OffscreenCanvas(canvas.width, canvas.height);
  const offCtx = offscreenCanvas.getContext("2d");
  offCtx.drawImage(canvas, 0, 0);

  // Create an interval to gradually reduce the opacity
  const fadeInterval = setInterval(function () {
    alpha = Math.max(0, alpha - 0.05);  // Decrease opacity, never below 0

    ctx.save();
    // Work in raw pixels so the snapshot isn't scaled again on high-DPI screens
    ctx.setTransform(1, 0, 0, 1, 0, 0);
    ctx.clearRect(0, 0, canvas.width, canvas.height);  // Clear to redraw with lower opacity
    ctx.globalAlpha = alpha;  // Set the current opacity
    ctx.drawImage(offscreenCanvas, 0, 0);
    ctx.restore();  // Restore the original scale and opacity for future use

    if (alpha <= 0) {
      clearInterval(fadeInterval);  // Stop when fully transparent; the canvas is now clear
    }
  }, 100);
}
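
For comparison, the simpler idea from the parenthetical above, fading the whole canvas element with CSS, might look something like this. It is an untested sketch of an alternative and not what ScribbleGPT uses. The `fadeOutWithCss` name and the 2000 ms default are assumptions.

```javascript
// Hypothetical alternative: let a CSS transition fade the canvas element,
// then clear the drawing and make the element visible again.
function fadeOutWithCss(canvas, ctx, durationMs = 2000) {
  canvas.style.transition = `opacity ${durationMs}ms ease-out`;
  canvas.style.opacity = "0";

  canvas.addEventListener(
    "transitionend",
    function onFadeEnd() {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      canvas.style.transition = "none"; // snap back without animating
      canvas.style.opacity = "1";
    },
    { once: true }
  );
}
```

One catch: the parchment background is set on the canvas itself, so it would fade along with the ink. Moving the background colour to the wrapping div would keep the page visible while the writing disappears.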

Bringing It All Together

GitHub repo for the entire code
