Using the new vector data type in SQL Server 2025
Part two of a two-part series of blogs
SQL Server 2025 introduces a completely new data type - the vector - which is aimed squarely at AI developers.
In this blog
This example shows how you can use the vector data type to store embeddings for AI model prompts in SQL Server, and how you can then use the vector_distance function to determine how similar different text phrases are.
You'll need a good knowledge of Python and an OpenAI API account to get this to work. For the moment I'm just trying to answer the question "Why on earth would anyone need to use the new vector field type?".
Begin by creating a table to hold our embeddings:
-- get rid of any old version of the table
-- we are about to create
DROP TABLE IF EXISTS tblEmbedding
-- create a table to hold some embeddings
CREATE TABLE tblEmbedding(
EmbeddingId int primary key identity,
EmbeddingText varchar(50) NULL,
Tokens vector(1536, float32)
)
Note that you don't need to specify the float32 data type when creating your vector, as this is the default. As to why we've created a vector containing 1536 numbers - watch and wait!
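Before wiring up the OpenAI API, it may help to see what the vector column actually stores: just 1,536 floating-point numbers, which SQL Server's vector type will accept as a JSON array. A minimal sketch (the numbers here are random placeholders, not a real embedding):

```python
import json
import random

# A vector(1536) column holds a list of 1536 floating-point numbers.
# SQL Server accepts this as a JSON array, so a plain Python list
# serialised with json.dumps is already in the right shape.
# (These are random placeholder values, not a real embedding.)
dummy_embedding = [round(random.uniform(-1, 1), 6) for _ in range(1536)]
tokens_json = json.dumps(dummy_embedding)

print(len(dummy_embedding))  # 1536, matching the column definition
```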
Here's some code to add 3 phrases into our table:
from openai import OpenAI
import pyodbc
import json
# API key for OpenAI not shown here, for obvious reasons!
client = OpenAI()
def add_embedding(
text_phrase: str
) -> None:
""""
A function to add the embeddings for a given
text phrase into a SQL Server table
""""
# create the embedding for a text query
response = client.embeddings.create(
input=text_phrase,
model="text-embedding-3-small"
)
# get the embedding for this (the 1536 length array of parameters
# to feed into a large language model)
embed_results = response.data[0].embedding
# connect to the database
conn = pyodbc.connect(
"DRIVER={ODBC Driver 17 for SQL Server};"
"SERVER=.\\sql2025;"
"DATABASE=VectorPlayground;"
"Trusted_Connection=yes;"
)
cursor = conn.cursor()
# convert embedding to JSON (the SQL Server vector type
# accepts JSON arrays as input)
tokens_json = json.dumps(embed_results)
# insert everything into the SQL Server table (I couldn't
# get the parameter to accept JSON though, so have converted
# the JSON into text and from there to a vector data type)
sql = """"
INSERT INTO tblEmbedding (EmbeddingText, Tokens)
VALUES (
?,
CAST(CAST(? AS NVARCHAR(MAX)) AS VECTOR(1536, float32))
)
""""
cursor.execute(sql, (text_phrase, tokens_json))
conn.commit()
# slightly inefficiently, close SQL Server connection
# each time round
cursor.close()
conn.close()
# try out adding some embeddings
add_embedding("Wise Owl Training")
add_embedding("The curfew tolls the knell of parting day")
add_embedding("This too shall pass")
print("Done!")
The text-embedding-3-small model generates vectors containing 1536 numbers - which is why our table's column was declared as vector(1536, float32). Here's what the table contains after running this code:

The 3 phrases as text and numbers.
You can use the VECTOR_DISTANCE function to see how close each phrase is to each of the others:
SELECT
e1.EmbeddingText AS 'Source text',
e2.EmbeddingText AS 'Target text',
-- calculate distances
VECTOR_DISTANCE('cosine',e1.Tokens, e2.Tokens) as 'Cosine',
VECTOR_DISTANCE('euclidean',e1.Tokens, e2.Tokens) as 'Euclidean',
VECTOR_DISTANCE('dot',e1.Tokens, e2.Tokens) as 'Dot'
FROM
tblEmbedding as e1
CROSS JOIN tblEmbedding as e2
WHERE
-- return each pair of phrases only once
e1.EmbeddingId > e2.EmbeddingId
ORDER BY
Cosine
As this shows, there are 3 ways to measure how close two vectors are: using a cosine, Euclidean or dot product algorithm. Here's what this query would show for our example:

The two phrases which are "most similar" are shown first.
I hope that's given an idea of the sort of problem that the vector data type has been introduced to solve!
© Wise Owl Business Solutions Ltd 2025. All Rights Reserved.