Vector Databases for Absolute Beginners — Part 1

janac
6 min read · Dec 8, 2023

I’ve been struggling for the past few weeks to understand what vector databases are, and why they are such a popular buzzword in AI.

Every article, video, or course I’ve found on vector databases has left me confused after the first few sentences. There’s a lot of jargon like “vectors”, “embeddings”, “RAG”, “semantic search”, etc., and it’s hard to understand vector databases without first knowing what these are.

Let’s start at the beginning.

Typical (Non-Vector) Databases

A typical database is either an SQL database or a NoSQL database. SQL databases store tabular data, which looks similar to an Excel spreadsheet: rows and columns with data inside, like this list of contacts:


| Name  | Email           | Age |
|-------|-----------------|-----|
| Janac | janac@gmail.com | 30  |
| John  | john@gmail.com  | 20  |
| Juan  | juan@gmail.com  | 41  |
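
To make the table idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for an SQL database. The table name `contacts` and the "older than 25" filter are my own illustrative assumptions, not something a particular database requires.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Columns mirror the contact list above (table name is an assumption).
conn.execute("CREATE TABLE contacts (name TEXT, email TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO contacts (name, email, age) VALUES (?, ?, ?)",
    [
        ("Janac", "janac@gmail.com", 30),
        ("John", "john@gmail.com", 20),
        ("Juan", "juan@gmail.com", 41),
    ],
)

# Ask for every contact older than 25, youngest first.
rows = conn.execute(
    "SELECT name, age FROM contacts WHERE age > ? ORDER BY age", (25,)
).fetchall()
print(rows)  # [('Janac', 30), ('Juan', 41)]

conn.close()
```

The point to notice is that the query matches on exact values in a column (`age > 25`), which is what SQL databases are built for.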

Many NoSQL databases (the document-oriented kind, such as MongoDB) store data in a JSON-like format. Here’s the equivalent of the above table in JSON:

[
  {
    "name": "janac",
    "email": "janac@gmail.com",
    "age": 30
  },
  {
    "name": "john",
    "email": "john@gmail.com",
    "age": 20
  },
  {
    "name": "juan",
    "email": "juan@gmail.com",
    "age": 41
  }
]
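
Here is a rough sketch of working with those JSON documents, using only Python's built-in `json` module. A real document database would do the storing and filtering for you; the plain list comprehension below is only my stand-in for that, and the "older than 25" filter is an illustrative assumption.

```python
import json

# The same contacts as above, written as a JSON string.
raw = """
[
  {"name": "janac", "email": "janac@gmail.com", "age": 30},
  {"name": "john", "email": "john@gmail.com", "age": 20},
  {"name": "juan", "email": "juan@gmail.com", "age": 41}
]
"""

# Parse the JSON text into a list of Python dictionaries.
contacts = json.loads(raw)

# Keep the names of contacts older than 25 (a stand-in for a database query).
older = [c["name"] for c in contacts if c["age"] > 25]
print(older)  # ['janac', 'juan']
```

Each contact is a self-contained document with its own field names, rather than a row that has to fit the columns of a table.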

There are dozens of other, more specialized types of databases, each optimized for a specific purpose, like time series, graph-db, kdb, GIS, etc., but we won’t be covering those in this…

janac

Most of my writing is about software. I enjoy summarizing and analyzing books and self-help videos. I am a senior software consultant at LazerTechnologies.com.