Why is the model outputting UNK tokens? Shouldn't it be able to point to unknown words from the input? #32

@Rhuax

From: https://github.com/abisee/pointer-generator/blob/master/beam_search.py#L111

When decoding, the token id is changed to the unknown id if t >= vocab.size(). So if the decoder points to that particular token, it produces [UNK] in the output. Is this correct? According to the paper, the decoder should be able to point to that token and copy it, rather than emitting the unknown token. I think handling OOVs is the whole purpose of the pointer-generator model, but in my decoding experiments the model often outputs unknown tokens.
I tried replacing the 50k vocabulary with the full vocabulary, but I get CUDA device-side assert errors.
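
To make the question concrete, here is a minimal sketch of the two id mappings as I understand them. It is not the repository's code: the function names, the toy vocabulary, and the choice of 0 as the [UNK] id are all assumptions made for illustration. The first function mirrors the substitution at the linked line (ids outside the fixed vocabulary become [UNK] so an embedding lookup stays in range), and the second shows how I would expect output ids at or above the vocabulary size to be resolved back to the article's own OOV words.

```python
UNK_ID = 0  # assumed id of [UNK] in this toy vocabulary (hypothetical)


def to_embedding_ids(token_ids, vocab_size, unk_id=UNK_ID):
    """Replace temporary in-article OOV ids (>= vocab_size) with unk_id.

    This keeps every id inside the range an embedding table of size
    vocab_size can look up.
    """
    return [t if t < vocab_size else unk_id for t in token_ids]


def ids_to_words(token_ids, id2word, article_oovs):
    """Convert output ids to words using the extended vocabulary.

    Ids below len(id2word) come from the fixed vocabulary; larger ids
    index into the list of OOV words found in the source article.
    """
    vocab_size = len(id2word)
    words = []
    for t in token_ids:
        if t < vocab_size:
            words.append(id2word[t])
        else:
            words.append(article_oovs[t - vocab_size])
    return words


# Toy data: a 4-word vocabulary and one article-specific OOV word.
id2word = ["[UNK]", "the", "cat", "sat"]
article_oovs = ["zyzzyva"]
decoded_ids = [1, 4, 3]  # id 4 is the first in-article OOV

print(to_embedding_ids(decoded_ids, len(id2word)))
print(ids_to_words(decoded_ids, id2word, article_oovs))
```

If the substitution only applies to what is fed back into the decoder, and the emitted ids are still converted with the article's OOV list, then a pointed-to OOV word should come out as the word itself rather than [UNK]. That is the behaviour I expected from the paper.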
