Can BERT do the next-word-predict task?
BERT is bidirectional, so it is not obvious how it could predict the next word. How can we adapt BERT to do the next-word-prediction task?

neural-network deep-learning attention-mechanism transformer bert
asked 20 hours ago by jet
Have you seen the original publication? It seems to be addressing prediction at the sentence level, as explained in its section 3.3.2. – mapto, 20 hours ago
1 Answer
BERT can't be used for next-word prediction, at least not with the current state of research on masked language modeling.

BERT is trained on a masked language modeling task, so you cannot "predict the next word" with it. You can only mask a word and ask BERT to predict it given the rest of the sentence (context to both the left and the right of the masked word).

Because of this, you can't sample text from BERT as if it were a normal autoregressive language model. However, BERT can be viewed as a Markov random field language model and used for text generation in that way. See the article "BERT has a Mouth, and It Must Speak: BERT as a Markov Random Field Language Model" for details; the authors released source code and a Google Colab notebook.
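As an illustration of the masked-word prediction described above, here is a minimal sketch using the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is mentioned in the answer; they are just one common way to try this):

# Minimal sketch of BERT's masked-word prediction (illustrative, not from the answer).
# Assumes the Hugging Face `transformers` package and the `bert-base-uncased` checkpoint.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT fills in [MASK] using context on BOTH sides of the gap.
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], prediction["score"])

# Putting [MASK] at the end of the text only approximates next-word prediction:
# the model still conditions on the closing period and was never trained to
# generate text left to right, which is the limitation the answer describes.
for prediction in fill_mask("I went to the store and bought some [MASK]."):
    print(prediction["token_str"], prediction["score"])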
answered 19 hours ago, edited 8 hours ago by ncasas
Thank you very much. – jet, 3 hours ago