Vector Databases for Absolute Beginners — Part 1

6 min read · Dec 8, 2023

I’ve been struggling for the past few weeks to understand what vector databases are and why they are such a popular buzzword in AI.

Every article, video, or course I’ve found on vector databases has left me confused after the first few sentences. There’s a lot of jargon like “vectors”, “embeddings”, “RAG”, “semantic search”, etc., and it’s hard to understand vector databases without first knowing what these are.

Let’s start at the beginning.

Typical (Non-Vector) Databases

A typical database is either an SQL database or a NoSQL database. SQL databases store tabular data, which looks similar to an Excel spreadsheet. A table has rows and columns with data inside, like this list of contacts:

| Name | Email | Age |
| Janac | | 30 |
| John | | 20 |
| Juan | | 41 |

Many NoSQL databases store data as documents in a format called JSON. Here’s the equivalent of the above table in JSON:

[
  { "name": "janac", "email": "", "age": 30 },
  { "name": "john", "email": "", "age": 20 },
  { "name": "juan", "email": "", "age": 41 }
]
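To make the link between the two formats concrete, here is a minimal sketch that turns the rows of the contacts table into JSON. The names `rows`, `columns`, and `documents` are made up for this illustration, and the emails are left blank as in the table above.

```javascript
// Hypothetical data: the contact rows from the table, one inner array per row.
const columns = ["name", "email", "age"];
const rows = [
  ["janac", "", 30],
  ["john", "", 20],
  ["juan", "", 41],
];

// Pair each value with its column name to build one object per row.
const documents = rows.map((row) =>
  Object.fromEntries(columns.map((column, i) => [column, row[i]]))
);

// Serialize the objects to a JSON string (indented by 2 spaces for readability).
const json = JSON.stringify(documents, null, 2);
console.log(json);
```

The same information is there in both cases. The table keeps the column names once at the top, while each JSON object carries its own field names.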

There are dozens of other, more specialized types of databases that are optimized for a specific purpose, such as time series, graph, kdb, and GIS databases, but we won’t be covering those in this article.

What are vectors?

A vector is a list of numbers. For example, 1, 2, 3, 4, 5 can be a vector. In programming languages, we call a list of numbers an array. Here’s what an array looks like:

[1,2,3,4,5]

We can assign a vector to a variable:

const vector = [1,2,3,4,5];
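As a quick sketch of what you can already do with such a variable, the snippet below reads the vector's length and individual entries. The variable names `dimensions`, `first`, and `last` are only for illustration. "Dimension" is a common name for the number of entries in a vector.

```javascript
const vector = [1, 2, 3, 4, 5];

// The number of entries is often called the vector's dimension.
const dimensions = vector.length;

// Entries are read by position, counting from 0.
const first = vector[0];
const last = vector[vector.length - 1];

console.log(dimensions, first, last); // prints: 5 1 5
```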

Okay, but what do we do with this list of numbers?

We can assign each number some sort of meaning. For example, we can create a vector representing a color. A simple way to represent colors is to use a Red Green Blue (RGB) code.

The RGB code for hot pink is a list of three numbers: 255, 51, 255, representing “how much” red, green and blue is in this color, respectively.
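Here is a small sketch of that color as a vector in code. It assumes the usual 8-bit RGB convention, where each number ranges from 0 to 255. The helper `toCssRgb` is a hypothetical name invented for this example, not part of any library.

```javascript
// Each position has a fixed meaning: [red, green, blue], each from 0 to 255.
const hotPink = [255, 51, 255];

// Unpack the vector into named parts to make the meaning explicit.
const [red, green, blue] = hotPink;

// Hypothetical helper: format a color vector as a CSS rgb() string.
function toCssRgb(color) {
  const [r, g, b] = color;
  return `rgb(${r}, ${g}, ${b})`;
}

console.log(toCssRgb(hotPink)); // prints: rgb(255, 51, 255)
console.log(green < red); // prints: true (this color has far less green than red)
```

The numbers only mean something because we agreed on the order. Swap two positions and the same three numbers describe a different color.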



