Rouge scores mismatch

Thank you for the work you have put into this project.

We used the model provided [here](https://drive.google.com/open?id=1QqSaxcJGllVPSFea2c2iCV5_dtjJijVe) under the section "Train with pointer generation + coverage loss enabled" to decode.
The ROUGE scores we obtained differ slightly from those posted in the README.

Our ROUGE scores:
ROUGE-1:
rouge_1_f_score: 0.3680 with confidence interval (0.3658, 0.3701)
rouge_1_recall: 0.4234 with confidence interval (0.4208, 0.4261)
rouge_1_precision: 0.3471 with confidence interval (0.3446, 0.3496)

ROUGE-2:
rouge_2_f_score: 0.1485 with confidence interval (0.1464, 0.1507)
rouge_2_recall: 0.1706 with confidence interval (0.1682, 0.1731)
rouge_2_precision: 0.1407 with confidence interval (0.1385, 0.1429)

ROUGE-L:
rouge_l_f_score: 0.3327 with confidence interval (0.3306, 0.3349)
rouge_l_recall: 0.3827 with confidence interval (0.3802, 0.3853)
rouge_l_precision: 0.3139 with confidence interval (0.3116, 0.3164)
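
In case it helps with the comparison, here is a minimal sketch of how report lines in the format above could be parsed and checked against reference values. The function names `parse_rouge_report` and `outside_interval` are our own and are not part of this repository or of any ROUGE package. The reference values have to be copied from the README by hand; none are assumed here.

```python
import re

# Matches lines in the format shown above, e.g.
# "rouge_1_f_score: 0.3680 with confidence interval (0.3658, 0.3701)"
LINE_RE = re.compile(
    r"^(rouge_\w+):\s+([\d.]+)\s+with confidence interval\s+"
    r"\(([\d.]+),\s*([\d.]+)\)"
)


def parse_rouge_report(text):
    """Return {metric: (score, ci_low, ci_high)} for every matching line."""
    scores = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            name, score, low, high = match.groups()
            scores[name] = (float(score), float(low), float(high))
    return scores


def outside_interval(scores, reference):
    """List metrics whose reference value lies outside our confidence interval.

    `reference` is a hypothetical dict {metric: value} that the caller
    fills in from the README.
    """
    return [
        name
        for name, value in reference.items()
        if name in scores
        and not (scores[name][1] <= value <= scores[name][2])
    ]


# Three of the lines reported above, copied verbatim.
report = """
ROUGE-1:
rouge_1_f_score: 0.3680 with confidence interval (0.3658, 0.3701)
rouge_1_recall: 0.4234 with confidence interval (0.4208, 0.4261)
rouge_1_precision: 0.3471 with confidence interval (0.3446, 0.3496)
"""

scores = parse_rouge_report(report)
```

Calling `outside_interval(scores, reference)` with the README numbers in `reference` would list the metrics for which the gap is larger than the reported confidence interval.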

Which config parameters should we use to reproduce the scores reported in the README?

