Why is the model outputting UNK tokens? Shouldn't it be able to point to unknown words from the input? #32

From: https://github.com/abisee/pointer-generator/blob/master/beam_search.py#L111

When decoding words, the token id is changed to the unknown id whenever `t >= vocab.size()`, that is, whenever it is one of the temporary ids assigned to in-article OOV words. So if the decoder points to such a token, it seems to produce [UNK] in the output. Is that correct? According to the paper, the decoder should be able to point to that token and copy the actual source word rather than the unknown token. I think handling OOVs is the whole purpose of the pointer-generator model. But in my decoding experiments the model often outputs unknown tokens.
I tried replacing the 50k vocabulary with the full vocabulary, but I get CUDA device-side assert errors.
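
To make the question concrete, here is a minimal sketch of how I understand the extended-vocabulary bookkeeping described in the paper. It is not the repository's code: the `Vocab` class and the helper names (`article_to_extended_ids`, `to_embedding_ids`, `output_ids_to_words`) are hypothetical and only illustrate the idea. Source words missing from the vocabulary get temporary ids starting at `vocab.size()`; those ids are swapped for the [UNK] id only when an embedding has to be looked up, while the emitted ids are mapped back to the original source words.

```python
UNK = "[UNK]"


class Vocab:
    """Toy fixed vocabulary; id 0 is reserved for [UNK] (an assumption of this sketch)."""

    def __init__(self, words):
        self._id2word = [UNK] + list(words)
        self._word2id = {w: i for i, w in enumerate(self._id2word)}

    def size(self):
        return len(self._id2word)

    def word2id(self, word):
        # Words outside the vocabulary fall back to the [UNK] id.
        return self._word2id.get(word, self._word2id[UNK])

    def id2word(self, idx):
        return self._id2word[idx]


def article_to_extended_ids(words, vocab):
    """Map source words to ids, giving each in-article OOV a temporary id >= vocab.size()."""
    ids, oovs = [], []
    unk_id = vocab.word2id(UNK)
    for w in words:
        i = vocab.word2id(w)
        if i == unk_id and w != UNK:
            if w not in oovs:
                oovs.append(w)
            ids.append(vocab.size() + oovs.index(w))
        else:
            ids.append(i)
    return ids, oovs


def to_embedding_ids(token_ids, vocab):
    """Replace temporary OOV ids with the [UNK] id so an embedding lookup stays in range."""
    unk_id = vocab.word2id(UNK)
    return [t if t < vocab.size() else unk_id for t in token_ids]


def output_ids_to_words(token_ids, vocab, oovs):
    """Map emitted ids back to words, resolving temporary ids to the copied source words."""
    words = []
    for t in token_ids:
        if t < vocab.size():
            words.append(vocab.id2word(t))
        else:
            words.append(oovs[t - vocab.size()])
    return words


vocab = Vocab(["the", "cat", "sat"])
ext_ids, oovs = article_to_extended_ids(["the", "zyzzyva", "sat"], vocab)
embedding_ids = to_embedding_ids(ext_ids, vocab)
decoded = output_ids_to_words(ext_ids, vocab, oovs)
```

If this reading is right, the [UNK] replacement would only affect what is fed to the embedding layer at the next decoder step, and a copied OOV word should still come out as the source word once the emitted ids are mapped back. Whether that is what happens at the linked line is exactly what I am unsure about.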
