# Positional embeddings in transformers EXPLAINED | Demystifying positional encodings.

AI Coffee Break with Letitia · 2021-07-12



💫 Short Summary

Positional embeddings are added to transformer neural networks to encode the order of input sequences, since transformers process their input in parallel and do not inherently model sequence order. They tell the transformer where each element sits within the sequence and are typically implemented with sine and cosine functions whose frequencies vary across the embedding dimensions. Such embeddings matter for many types of data, including text, graphs, and images.

✨ Highlights

📊 Transcript

✦

Positional embeddings are essential for transformers to understand the order of the input sequence.

00:00 Transformers process input in parallel and do not inherently understand the order of the sequence.

Positional embeddings are added to the initial vector representation of the input to indicate the order or position of each element within the sequence.

These embeddings help the transformer know that one piece comes after the other and in a specific order.
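As a concrete sketch of this combination step (a NumPy toy example with made-up shapes; random vectors stand in for the learned embeddings), the positional embedding is combined with the token embedding by a plain element-wise sum:

```python
import numpy as np

# Toy setup: 4 tokens, embedding dimension 8 (both made up for illustration).
seq_len, d_model = 4, 8
rng = np.random.default_rng(0)

# Stand-ins for the learned token embeddings and the positional embeddings.
token_embeddings = rng.normal(size=(seq_len, d_model))
positional_embeddings = rng.normal(size=(seq_len, d_model))  # one vector per position

# The transformer's input is simply the element-wise sum of the two.
transformer_input = token_embeddings + positional_embeddings
print(transformer_input.shape)  # (4, 8)
```

Note that the two are added, not concatenated, so the input keeps the same dimensionality as the token embeddings.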

✦

Positional embeddings must assign the same identifier to a given position every time, and their values should stay bounded so that they do not push the input vectors into very distinct subspaces.

03:29 The solution to this problem, as presented in the 'Attention is All You Need' paper, involves using periodic functions like sine and cosine to model the positional embeddings.

Sine and cosine functions with increasing frequency are used to differentiate the positional values for tokens in the sequence.

This approach provides enough information to ensure that the transformer can accurately understand the order of the sequence.
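The sine/cosine scheme from 'Attention is All You Need' (Vaswani et al., 2017) can be sketched directly from its formula, PE(pos, 2i) = sin(pos / 10000^(2i/d)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d)); the interleaved even/odd layout below follows that paper:

```python
import numpy as np

def sinusoidal_positional_embeddings(seq_len: int, d_model: int) -> np.ndarray:
    """Sinusoidal positional embeddings, following Vaswani et al. (2017):

    PE[pos, 2i]   = sin(pos / 10000**(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i / d_model))
    """
    positions = np.arange(seq_len)[:, None]                       # (seq_len, 1)
    div_terms = 10000.0 ** (np.arange(0, d_model, 2) / d_model)   # (d_model/2,)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(positions / div_terms)  # even dimensions: sine
    pe[:, 1::2] = np.cos(positions / div_terms)  # odd dimensions: cosine
    return pe

pe = sinusoidal_positional_embeddings(seq_len=50, d_model=16)
print(pe.shape)  # (50, 16)
print(pe[0])     # position 0: all sine entries are 0, all cosine entries are 1
```

Every value stays in [-1, 1], which keeps the positional signal bounded relative to the token embeddings.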

✦

Sine and cosine functions with increasing frequency are used to encode the position of tokens in the sequence, providing enough information for the transformer to understand the order.

07:50 The last dimension broadly indicates where the token is in the sequence, but only at coarse resolution.

The second to last dimension has higher frequency and resolution to pin down the exact position in the sequence.
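This resolution difference can be checked numerically. The sketch below uses the standard Vaswani et al. formulation (the exact dimension ordering is an assumption of that formulation): the first dimension oscillates through many full periods over 100 positions, while the last sine dimension barely moves and so only coarsely localizes the token:

```python
import numpy as np

# Build the standard sinusoidal encodings for 100 positions, 16 dimensions.
seq_len, d_model = 100, 16
positions = np.arange(seq_len)[:, None]
div_terms = 10000.0 ** (np.arange(0, d_model, 2) / d_model)
pe = np.zeros((seq_len, d_model))
pe[:, 0::2] = np.sin(positions / div_terms)
pe[:, 1::2] = np.cos(positions / div_terms)

# Dimension 0 is sin(pos): highest frequency, full -1..1 swing (fine resolution).
# Dimension -2 is sin(pos / ~3162): lowest frequency, nearly constant here (coarse).
print(np.ptp(pe[:, 0]))   # close to 2
print(np.ptp(pe[:, -2]))  # close to 0
```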

Sine and cosine embeddings work well for text data, but there are other types of data, like graphs or images, that may require different positional embeddings.

💫 FAQs about This YouTube Video

### 1. What are positional embeddings in transformers?

Positional embeddings are added to the initial vector representation of the input in a transformer neural network to encode the order of the input sequence, allowing the transformer to understand the positions of elements in the sequence.

### 2. Why do we need positional embeddings in transformers?

Positional embeddings are needed in transformers because the transformer architecture does not inherently model the order of the input sequence. By adding positional embeddings, the transformer can understand the positions of elements in the sequence.

### 3. How do positional embeddings work in transformers?

Positional embeddings work by adding information about the position of elements in the input sequence to the initial vector representation in a transformer neural network. This allows the transformer to differentiate the positions of elements and understand the order of the sequence.

### 4. What is the purpose of positional embeddings in transformers?

The purpose of positional embeddings in transformers is to enable the model to encode the order of the input sequence, as the transformer architecture lacks inherent understanding of input sequence order. By incorporating positional embeddings, the transformer can effectively process and understand the positions of elements in the sequence.

### 5. How are positional embeddings added to the initial vector representation in transformers?

Positional embeddings are simply added to the initial vector representation of the input in a transformer neural network. This addition allows the transformer to incorporate information about the positions of elements in the input sequence, enabling it to understand the order of the sequence.
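Putting the pieces together, a minimal end-to-end sketch (toy vocabulary, random embedding table, and made-up token ids, all hypothetical) shows why the addition matters: a token that occurs twice gets identical embeddings from the table, but different transformer inputs once positions are added:

```python
import numpy as np

# Hypothetical toy setup: 5 tokens, dimension 8, vocabulary of 1000.
seq_len, d_model, vocab_size = 5, 8, 1000
rng = np.random.default_rng(42)

embedding_table = rng.normal(size=(vocab_size, d_model))
token_ids = np.array([3, 17, 3, 7, 99])  # made-up ids; token 3 appears twice

# Standard sinusoidal positional embeddings.
positions = np.arange(seq_len)[:, None]
div_terms = 10000.0 ** (np.arange(0, d_model, 2) / d_model)
pe = np.zeros((seq_len, d_model))
pe[:, 0::2] = np.sin(positions / div_terms)
pe[:, 1::2] = np.cos(positions / div_terms)

# Plain addition, no concatenation: same token, different positions,
# now yields different input vectors.
transformer_input = embedding_table[token_ids] + pe
```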
