Using the new vector data type in SQL Server 2025
Part two of a two-part series of blogs
SQL Server 2025 introduces a completely new data type - the vector - which is aimed squarely at AI developers.
In this blog
This example shows how you can use the vector data type to store embeddings for AI model prompts in SQL Server, and how you can then use the vector_distance function to determine how similar different text phrases are.
You'll need a good knowledge of Python and an OpenAI API account to get this to work. For the moment I'm just trying to answer the question "Why on earth would anyone need to use the new vector field type?".
Begin by creating a table to hold our embeddings:
-- get rid of any old version of the table
-- we are about to create
DROP TABLE IF EXISTS tblEmbedding
-- create a table to hold some embeddings
CREATE TABLE tblEmbedding(
EmbeddingId int primary key identity,
EmbeddingText varchar(50) NULL,
Tokens vector(1536, float32)
)
Note that you don't need to specify the float32 data type when creating your vector, as this is the default. As to why we've created a vector containing 1536 numbers - watch and wait!
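Before wiring up the OpenAI API, it may help to see what the vector column actually stores: just 1,536 floating-point numbers, which SQL Server's vector type will accept as a JSON array. A minimal sketch (the numbers here are random placeholders, not a real embedding):

```python
import json
import random

# A vector(1536) column holds a list of 1536 floating-point numbers.
# SQL Server accepts this as a JSON array, so a plain Python list
# serialised with json.dumps is already in the right shape.
# (These are random placeholder values, not a real embedding.)
dummy_embedding = [round(random.uniform(-1, 1), 6) for _ in range(1536)]
tokens_json = json.dumps(dummy_embedding)

print(len(dummy_embedding))  # 1536, matching the column definition
```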
Here's some code to add 3 phrases into our table:
from openai import OpenAI
import pyodbc
import json
# API key for OpenAI not shown here, for obvious reasons!
client = OpenAI()
def add_embedding(
text_phrase: str
) -> None:
""""
A function to add the embeddings for a given
text phrase into a SQL Server table
""""
# create the embedding for a text query
response = client.embeddings.create(
input=text_phrase,
model="text-embedding-3-small"
)
# get the embedding for this (the 1536 length array of parameters
# to feed into a large language model)
embed_results = response.data[0].embedding
# connect to the database
conn = pyodbc.connect(
"DRIVER={ODBC Driver 17 for SQL Server};"
"SERVER=.\\sql2025;"
"DATABASE=VectorPlayground;"
"Trusted_Connection=yes;"
)
cursor = conn.cursor()
# convert embedding to JSON (the SQL Server vector type
# accepts JSON arrays as input)
tokens_json = json.dumps(embed_results)
# insert everything into the SQL Server table (I couldn't
# get the parameter to accept JSON though, so have converted
# the JSON into text and from there to a vector data type)
sql = """"
INSERT INTO tblEmbedding (EmbeddingText, Tokens)
VALUES (
?,
CAST(CAST(? AS NVARCHAR(MAX)) AS VECTOR(1536, float32))
)
""""
cursor.execute(sql, (text_phrase, tokens_json))
conn.commit()
# slightly inefficiently, close SQL Server connection
# each time round
cursor.close()
conn.close()
# try out adding some embeddings
add_embedding("Wise Owl Training")
add_embedding("The curfew tolls the knell of parting day")
add_embedding("This too shall pass")
print("Done!")
The text-embedding-3-small model generates vectors containing 1536 numbers - which is why our table's column was declared as vector(1536, float32). Here's what the table contains after running this code:

The 3 phrases as text and numbers.
You can use the VECTOR_DISTANCE function to see how close each phrase is to each of the others:
SELECT
e1.EmbeddingText AS 'Source text',
e2.EmbeddingText AS 'Target text',
-- calculate distances
VECTOR_DISTANCE('cosine',e1.Tokens, e2.Tokens) as 'Cosine',
VECTOR_DISTANCE('euclidean',e1.Tokens, e2.Tokens) as 'Euclidean',
VECTOR_DISTANCE('dot',e1.Tokens, e2.Tokens) as 'Dot'
FROM
tblEmbedding as e1
CROSS JOIN tblEmbedding as e2
WHERE
-- return each pair of phrases only once
e1.EmbeddingId > e2.EmbeddingId
ORDER BY
Cosine
As this shows, there are 3 ways to measure how close two vectors are: using a cosine, Euclidean or dot product algorithm. Here's what this query would show for our example:

The two phrases which are "most similar" are shown first.
I hope that's given an idea of the sort of problem that the vector data type has been introduced to solve!
© Wise Owl Business Solutions Ltd 2025. All Rights Reserved.