Welcome to my blog. Sharing ideas on math, programming, and AI systems.

Learning Addition with GPT

Inspired by Andrej Karpathy’s nanoGPT and his excellent YouTube series [1], I decided to train my own transformer model on a simple dataset. Additionally, I aimed to calculate precisely how well the model performs. This is not trivial when working with text, as evaluations often rely on a so-called vibe check, which is inherently subjective. A natural choice for objective evaluation is to train the model to generate text representing equations in the form $x + y = z$....
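To make the idea of an objective evaluation concrete, here is a minimal sketch of how equations of the form $x + y = z$ could be generated and scored by exact match. The function names, the operand range, and the `oracle` stand-in for a trained model are all assumptions made for illustration; they are not taken from the post's code.

```python
import random


def make_example(rng, max_operand=99):
    """Return a (prompt, answer) pair such as ("12 + 34 = ", "46")."""
    x = rng.randint(0, max_operand)
    y = rng.randint(0, max_operand)
    return f"{x} + {y} = ", str(x + y)


def exact_match_accuracy(generate, examples):
    """Fraction of prompts for which generate(prompt) equals the true sum."""
    correct = sum(generate(prompt) == answer for prompt, answer in examples)
    return correct / len(examples)


def oracle(prompt):
    """Hypothetical stand-in for a trained model: parses the prompt and adds."""
    left, right = prompt.split("=")[0].split("+")
    return str(int(left) + int(right))


rng = random.Random(0)  # fixed seed so the evaluation set is reproducible
examples = [make_example(rng) for _ in range(1000)]
accuracy = exact_match_accuracy(oracle, examples)
```

Swapping `oracle` for a function that samples from a real model would turn the vibe check into a single number between 0 and 1.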

January 11, 2025 · 7 min · 1348 words · v4nn4

Transformers Dashboard

Since the publication of the now famous 2017 paper Attention is All You Need [1], many large language models based on the transformer architecture have emerged. Fortunately, some studies [2][3] have compiled extensive data on many published models, including the dimensions of their transformers. Much like my experience learning about CNNs and their increasing complexity, I wanted to analyze LLM transformers. Which models are the largest? What is the optimal size for the feed-forward layer?...

May 4, 2024 · 4 min · 681 words · v4nn4

Glyph generation

Imagine a simple square grid with nine dots like the one above. What if you were challenged to draw as many unique shapes or “glyphs” as possible by connecting these dots? At first glance, it seems straightforward, but let’s dive deeper into the complexity. Each of the nine dots can be connected in pairs, forming lines that are the strokes of our glyphs. Calculating all possible pairings, we find there are $(9\times8)/2=36$ unique strokes....
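The stroke count can be checked by brute force. This is a small sketch that assumes the dots sit on a 3×3 grid indexed by (row, column); the variable names are mine.

```python
from itertools import combinations

# The nine dots of the grid, identified by (row, column).
dots = [(row, col) for row in range(3) for col in range(3)]

# A stroke is an unordered pair of distinct dots.
strokes = list(combinations(dots, 2))
stroke_count = len(strokes)  # equals (9 * 8) / 2
```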

April 26, 2024 · 5 min · 874 words · v4nn4

Some thoughts on training LeNet

Since my last blog post, Training LeNet on Armenian script, I have made some significant improvements to the training process. Model simplification: the model takes as input a mean and standard deviation for normalizing pixel intensities. These values are calibrated on the training set before initiating the gradient descent loop for adjusting the weights and biases. To make things simpler, I hardcoded those parameters. This way the dependency between model and training set only happens in the gradient descent loop....
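As a rough illustration of the calibration step, the sketch below computes a mean and standard deviation once on toy pixel values and then freezes them as constants. The helper names and the toy data are assumptions; the actual training code may be organized differently.

```python
from statistics import fmean, pstdev


def calibrate(pixels):
    """Compute the normalization statistics from training pixels."""
    return fmean(pixels), pstdev(pixels)


def normalize(pixels, mean, std):
    """Shift and scale pixel intensities with fixed statistics."""
    return [(p - mean) / std for p in pixels]


# Toy stand-in for the training set's pixel intensities (assumption).
train_pixels = [0.0, 0.0, 0.5, 1.0, 1.0]

# Calibrate once, then treat the values as hardcoded constants so the
# model no longer needs the training set at construction time.
HARDCODED_MEAN, HARDCODED_STD = calibrate(train_pixels)
normalized = normalize(train_pixels, HARDCODED_MEAN, HARDCODED_STD)
```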

December 30, 2023 · 4 min · 694 words · v4nn4

Diffusion models and time reversal

I recently spent some time reading about the algorithms behind Stable Diffusion and similar image generation models. They have been linked with an interesting 40-year-old result on diffusion processes [1]. In short, this result states that there exists an explicit path from an initial probability distribution $p_0$ to random noise (a normal distribution), and that this path can be reversed. One application of this concept is sampling: we can draw a sample from random noise and use the backward diffusion to obtain a sample from $p_0$....
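For readers who want the statement in symbols, here is a sketch in the notation commonly used for diffusion models. The drift $f$, the noise scale $g$, and the Brownian motions $W_t$ and $\bar{W}_t$ are notation I am assuming here, and I am assuming the result in question is the classical reverse-time formula for diffusions. The forward path starts from $X_0 \sim p_0$ and follows

$$
dX_t = f(X_t, t)\,dt + g(t)\,dW_t ,
$$

and, writing $p_t$ for the distribution of $X_t$, the reversed path run from $t = T$ down to $t = 0$ follows

$$
dX_t = \left[ f(X_t, t) - g(t)^2 \,\nabla_x \log p_t(X_t) \right] dt + g(t)\,d\bar{W}_t .
$$

Sampling then amounts to drawing $X_T$ from the noise distribution and integrating the second equation backward, which requires an estimate of the score $\nabla_x \log p_t$.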

October 24, 2023 · 9 min · 1876 words · v4nn4

Training LeNet-5 on Armenian script

Following Tinkering with Tesseract, I wanted to gain a better understanding of how OCR systems work. So, I decided to start by building my own character recognition engine using PyTorch. The code is available at v4nn4/hynet. Generating a dataset: first, we visualize the alphabet in our target font, Mk_Parz_U-Italic:

    from PIL import Image, ImageDraw, ImageFont
    import matplotlib.pyplot as plt

    caps = range(0x531, 0x557)
    smalls = range(0x561, 0x588)
    letters = [f"{chr(a)}{chr(b)}" for (a, b) in zip(caps, smalls)]
    letters = [" "....

October 16, 2023 · 8 min · 1666 words · v4nn4

Tinkering with Tesseract

I have recently been experimenting with Tesseract, an Optical Character Recognition (OCR) engine developed by Google. My primary objective was to extract text from scans of a 1920s Armenian newspaper and execute search queries on it. Terms like պատերազմ (war) or Ֆրանսիա (France), for instance, are likely to appear in the document. Some initial observations on the document. Image segmentation: there are a lot of different text blocks in the raw document, and distinguishing between them might be challenging....

October 13, 2023 · 4 min · 782 words · v4nn4