
Today I Learned This Part I: What are word2vec Embeddings?

Rahul Agarwal
Apr 09, 2017


Recently, Quora put out a question-similarity competition on Kaggle. This is the first NLP problem I have attempted, so there is a lot to learn. The one thing that blew my mind was word2vec embeddings.

Until now, whenever I heard the term word2vec, I pictured it as a way to create a bag-of-words vector for a sentence.

For those who don’t know bag of words: if we have a series of sentences (documents)

  1. This is good - [1,1,1,0,0]

  2. This is bad - [1,1,0,1,0]

  3. This is awesome - [1,1,0,0,1]

Bag of words would encode them using the index 0: This, 1: is, 2: good, 3: bad, 4: awesome.
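
Here is a minimal sketch of that encoding using scikit-learn’s CountVectorizer (scikit-learn is just my choice of tool here; note that its columns come out in alphabetical order rather than the manual order above):

    from sklearn.feature_extraction.text import CountVectorizer

    # The three example "documents" from above
    docs = ["This is good", "This is bad", "This is awesome"]

    # CountVectorizer builds the vocabulary and the count vectors;
    # it lowercases by default and orders columns alphabetically
    vectorizer = CountVectorizer()
    bow = vectorizer.fit_transform(docs)

    print(vectorizer.get_feature_names_out())
    # ['awesome' 'bad' 'good' 'is' 'this']
    print(bow.toarray())
    # [[0 0 1 1 1]
    #  [0 1 0 1 1]
    #  [1 0 0 1 1]]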

But word2vec is much more powerful than that.

What word2vec does is create vectors for words. By that I mean we have a 300-dimensional vector for every word (and common bigrams too) in a dictionary.
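
As a rough sketch, this is how you could load the pretrained 300-dimensional Google News vectors with gensim (the file name and the gensim calls are my assumptions; any pretrained word2vec model works the same way):

    from gensim.models import KeyedVectors

    # Load the pretrained 300-dimensional Google News vectors
    # (assumes the binary file has already been downloaded locally)
    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True
    )

    vec = w2v["king"]   # one dense 300-dimensional vector per word
    print(vec.shape)    # (300,)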

How does that help?

We can use these embeddings in multiple scenarios, but the most common are:

A. Using word2vec embeddings, we can find the similarity between words. Assume you have to answer whether these two statements signify the same thing (a quick code sketch of word-level similarity follows the example):

  1. President g…
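
A minimal sketch of how such word-level similarity can be computed with the same pretrained gensim vectors as above (the specific calls and example words are my own assumptions, not part of the original example):

    from gensim.models import KeyedVectors

    w2v = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True
    )

    # Cosine similarity between two word vectors
    print(w2v.similarity("good", "awesome"))

    # Nearest neighbours of a word in the embedding space
    print(w2v.most_similar("president", topn=5))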
