- Training models
- Georgian Wikipedia ka
- Karakalpak Wikipedia kaa
- Kabyle Wikipedia kab
- Kabardian Wikipedia kbd
- Kabiyè Wikipedia kbp
- Kongo Wikipedia kg
- Kikuyu Wikipedia ki
- Kazakh Wikipedia kk
- Kalaallisut Wikipedia kl
- Khmer Wikipedia km
- Kannada Wikipedia kn
- Komi-Permyak Wikipedia koi, see T308135#8632750
- Karachay-Balkar Wikipedia krc
- Kashmiri Wikipedia ks
- Colognian Wikipedia ksh
- Kurdish Wikipedia ku
- Komi Wikipedia kv
- Cornish Wikipedia kw
- Kyrgyz Wikipedia ky, see T308135#8629471
- Models verification
- Publish Datasets
- Populate the excluded section titles
- Deploy back-end
- Check how the model works on the wikis, see T308135#8629471
- In Search, use hasrecommendation:link to find articles
- Test them on https://api.wikimedia.org/service/linkrecommendation/apidocs/#/default/get_v1_linkrecommendations__project___domain___page_title_
- Inform communities
- Deploy front-end, see 940347
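The API check in the list above can be scripted. This is a minimal sketch, assuming the REST path mirrors the operation name in the apidocs URL (project, then domain, then page title). The helper name `linkrec_url` and the example page title are hypothetical, and the helper only replaces spaces with underscores; it does not URL-encode the title.

```shell
#!/bin/sh
# Build the link recommendation API URL for one article.
# Assumption: the path follows the operation name shown in the apidocs
# (project / domain / page title); "domain" is the language code, e.g. "ka".
linkrec_url() {
    project="$1"
    domain="$2"
    # MediaWiki titles use underscores in place of spaces.
    title=$(printf '%s' "$3" | tr ' ' '_')
    printf 'https://api.wikimedia.org/service/linkrecommendation/v1/linkrecommendations/%s/%s/%s\n' \
        "$project" "$domain" "$title"
}

# Example: print the URL for a (hypothetical) page title on Georgian Wikipedia.
linkrec_url wikipedia ka "Main Page"
```

The printed URL can then be passed to `curl -s` to inspect the recommendations for that page by hand.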
Description
Details
- Due Date
- Aug 1 2023, 10:00 AM
Status | Assigned | Task
---|---|---
Open | lbowmaker | T307881 Scaling of link suggestions service
Open | Trizek-WMF | T304110 [EPIC] Deploy "add a link" to all Wikipedias
Resolved | Urbanecm_WMF | T308135 Deploy "add a link" to 10th round of wikis
Resolved | kevinbazira | T329817 Kyrgyz Wikipedia model training pipeline failed
Event Timeline
18/19 models were trained successfully in the 10th round of wikis.
The Kyrgyz Wikipedia (kywiki) pipeline did not complete successfully and is being investigated in T329817.
Model evaluation has been completed and below are the backtesting results:
Wiki | Precision | Recall |
---|---|---|
kawiki | 0.82 | 0.34 |
kaawiki | 0.77 | 0.34 |
kabwiki | 0.82 | 0.65 |
kbdwiki | 0.80 | 0.41 |
kbpwiki | 0.88 | 0.60 |
kgwiki | 0.96 | 0.81 |
kiwiki | 0.96 | 0.89 |
kkwiki | 0.85 | 0.41 |
klwiki | 0.74 | 0.35 |
kmwiki | 0.70 | 0.21 |
knwiki | 0.79 | 0.22 |
koiwiki | 0.94 | 0.13 |
krcwiki | 0.65 | 0.20 |
kswiki | 0.98 | 0.83 |
kshwiki | 0.81 | 0.52 |
kuwiki | 0.88 | 0.40 |
kvwiki | 0.82 | 0.38 |
kwwiki | 0.83 | 0.53 |
CCing @MGerlach, in case he would like to add comments on the backtesting evaluation.
The conclusion from the backtesting results is that most of the languages look fine, except:
- klwiki (0.74) and kmwiki (0.70) have a precision slightly lower than the recommended one (0.75).
- krcwiki has a low precision (0.65).
- koiwiki has a low recall (0.13).
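The screening described above can be reproduced with a small filter. This is a sketch, not the evaluation code used by the team: the 0.75 precision threshold is the recommended value quoted in the list, while the recall floor of 0.15 and the function name `flag_models` are assumptions made for illustration.

```shell
#!/bin/sh
# Flag models whose backtesting scores fall below chosen thresholds.
# Input on stdin: one "wiki precision recall" triple per line.
# MIN_PRECISION defaults to 0.75, the recommended value quoted above;
# MIN_RECALL defaults to 0.15, a hypothetical cut-off for this sketch only.
flag_models() {
    awk -v min_p="${MIN_PRECISION:-0.75}" -v min_r="${MIN_RECALL:-0.15}" '
        NF >= 3 && $2 + 0 < min_p + 0 { print $1, "low-precision", $2 }
        NF >= 3 && $3 + 0 < min_r + 0 { print $1, "low-recall", $3 }
    '
}

# A few rows copied from the backtesting table.
flag_models <<'EOF'
klwiki 0.74 0.35
kmwiki 0.70 0.21
koiwiki 0.94 0.13
krcwiki 0.65 0.20
kswiki 0.98 0.83
EOF
```

With these rows the filter reports klwiki, kmwiki and krcwiki for precision and koiwiki for recall, matching the list above.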
Talked to @MGerlach about these results and he said:
@kostajh, we published datasets for the 17 of 19 models that passed the evaluation in this round.
Thanks! cc @Sgs, in case you want to incorporate this into the deployment work you're doing.
From the comments in T308135#8632750, I understand that koiwiki is not problematic to deploy and its dataset can be used. That gives me a count of 18/19, since kywiki has been moved to T308136. Is that correct, @kevinbazira?
I ran this script for adding the link-recommendation task type and populating the excluded sections entries:
for WIKI in kawiki kaawiki kabwiki kbdwiki kbpwiki kgwiki kiwiki kkwiki klwiki kmwiki knwiki kswiki kshwiki kuwiki kvwiki kwwiki; do
    ORIGIN=`mwscript getConfiguration.php $WIKI --settings 'wgCanonicalServer' --format json | jq --raw-output '.wgCanonicalServer'`
    mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
        --page MediaWiki:NewcomerTasks.json \
        --create-only \
        --json \
        --summary "Growth features configuration boilerplate ([[phab:$PHAB]])" \
        link-recommendation \
        '{ "type": "link-recommendation", "group": "easy" }'
    jq "select(.wiki==\"$WIKI\" and .probability > 0.25) | .section" wiki_sections.jsonl \
        | jq --slurp --compact-output "unique" \
        | mwscript extensions/GrowthExperiments/maintenance/changeWikiConfig.php $WIKI \
            --page MediaWiki:NewcomerTasks.json \
            --json \
            --summary "machine-generated configuration for excluding sections from link recommendations ([[phab:$PHAB]]), feel free to improve" \
            link-recommendation.excludedSections \
            "`cat`"
    echo "$ORIGIN/wiki/MediaWiki:NewcomerTasks.json"
    echo "$ORIGIN/w/index.php?title=MediaWiki:NewcomerTasks.json&diff=next"
    echo "Press <Enter> to continue"
    read # give time for manual verification
done
Waiting for confirmation on koiwiki. Some observations after running the script:
- kabwiki has only one excluded section in English
- kbpwiki has no excluded sections and no other newcomer tasks enabled
- kgwiki has only excluded sections in French
- kiwiki has only excluded sections in English
- klwiki has only excluded sections in English
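To see why a wiki can end up with few or unexpected excluded sections, it helps to run the script's section filter on its own. The sketch below assumes each line of `wiki_sections.jsonl` is a JSON object with `wiki`, `section` and `probability` fields (inferred from the jq expression in the script). The sample rows are invented for illustration, and `excluded_sections` is a hypothetical helper that passes the wiki name with `--arg` instead of interpolating it into the filter string.

```shell
#!/bin/sh
# Reproduce the section filter from the script on sample data.
# Assumption: each line of the input file is a JSON object with the
# fields "wiki", "section" and "probability" (inferred from the jq filter).
excluded_sections() {
    # $1 = wiki database name, $2 = path to the JSON Lines file
    jq --arg wiki "$1" 'select(.wiki == $wiki and .probability > 0.25) | .section' "$2" \
        | jq --slurp --compact-output 'unique'
}

# Invented sample rows, not real data from wiki_sections.jsonl.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"wiki": "kabwiki", "section": "References", "probability": 0.90}
{"wiki": "kabwiki", "section": "History", "probability": 0.10}
{"wiki": "kgwiki", "section": "Liens externes", "probability": 0.60}
{"wiki": "kgwiki", "section": "Notes", "probability": 0.40}
{"wiki": "kgwiki", "section": "Notes", "probability": 0.30}
EOF

excluded_sections kabwiki "$sample"   # prints ["References"]
excluded_sections kgwiki "$sample"    # prints ["Liens externes","Notes"]
rm -f "$sample"
```

Only sections above the 0.25 probability cut-off survive, and `unique` both de-duplicates and sorts them, so the list written to `link-recommendation.excludedSections` is only as good as the rows the dataset contains for that wiki.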
@Sgs, yes, koiwiki's dataset can be used. Regarding kywiki: 17/19 models were published in this round because kywiki's training pipeline did not complete successfully in the 10th round. The bug that caused this was fixed in T329817#8635930, and kywiki was then added to the 11th round, where its training pipeline ran successfully, it passed the backtesting evaluation, and its dataset was published here:
https://analytics.wikimedia.org/published/datasets/one-off/research-mwaddlink/kywiki/
Change 935723 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: Enable backend of link recommendation 10th round wikis
Change 935723 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: Enable backend of link recommendation 10, 11, 12th round wikis
Mentioned in SAL (#wikimedia-operations) [2023-07-11T13:03:28Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:935723|GrowthExperiments: Enable backend of link recommendation 10, 11, 12th round wikis (T308135 T308136 T308137)]]
Mentioned in SAL (#wikimedia-operations) [2023-07-11T13:04:58Z] <urbanecm@deploy1002> sgimeno and urbanecm: Backport for [[gerrit:935723|GrowthExperiments: Enable backend of link recommendation 10, 11, 12th round wikis (T308135 T308136 T308137)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
Mentioned in SAL (#wikimedia-operations) [2023-07-11T13:13:13Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:935723|GrowthExperiments: Enable backend of link recommendation 10, 11, 12th round wikis (T308135 T308136 T308137)]] (duration: 09m 45s)
Change 940347 had a related patch set uploaded (by Sergio Gimeno; author: Sergio Gimeno):
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink task frontend in 10th round of wikis
@Trizek-WMF I can confirm that all the wikis from this round have now produced abundant results, including koiwiki and kywiki, which were previously dropped from the round because of model pipeline issues. However, two wikis have generated a very low number of results:
I think this round is ready for announcement and frontend enabling (aside from kgwiki and klwiki, which may be OK as well). I'm out of office for the next two weeks; I've left the configuration change (940347) ready to backport so another engineer can take over in case you want to progress the task before I'm back.
Per a Slack discussion, the release date was changed to August 01 (tech news change, newsletter change).
Change 940347 merged by jenkins-bot:
[operations/mediawiki-config@master] GrowthExperiments: enable AddLink task frontend in 10th round of wikis
Mentioned in SAL (#wikimedia-operations) [2023-08-01T08:22:38Z] <urbanecm@deploy1002> Started scap: Backport for [[gerrit:940347|GrowthExperiments: enable AddLink task frontend in 10th round of wikis (T308135)]]
Mentioned in SAL (#wikimedia-operations) [2023-08-01T08:24:20Z] <urbanecm@deploy1002> sgimeno and urbanecm: Backport for [[gerrit:940347|GrowthExperiments: enable AddLink task frontend in 10th round of wikis (T308135)]] synced to the testservers mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, and mw-debug kubernetes deployment (accessible via k8s-experimental XWD option)
Mentioned in SAL (#wikimedia-operations) [2023-08-01T08:33:30Z] <urbanecm@deploy1002> Finished scap: Backport for [[gerrit:940347|GrowthExperiments: enable AddLink task frontend in 10th round of wikis (T308135)]] (duration: 10m 52s)
Re-checked:
kgwiki - 7 results | Special:NewcomerTasksInfo lists 7 tasks for link-recommendation | Special:Homepage doesn't display any available task types |
klwiki - 2 results | Special:NewcomerTasksInfo lists 66 tasks for expand and 2 for link-recommendation | Special:Homepage displays only the expand task type |
krcwiki - no results matching the query | Special:NewcomerTasksInfo reports "No data is available." | |
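The result counts in a re-check like this can also be fetched through the MediaWiki Action API rather than the search page. This is a sketch under two assumptions: each wiki is a Wikipedia whose database name is the language code plus `wiki`, and the `hasrecommendation:link` keyword from the checklist works through the API's `list=search` as it does in the search box. The helper name `search_count_url` is hypothetical.

```shell
#!/bin/sh
# Build a MediaWiki Action API search URL that reports how many articles
# match the search keyword hasrecommendation:link on a given wiki.
# Assumption: the wiki is a Wikipedia whose database name is "<code>wiki".
search_count_url() {
    # Strip the "wiki" suffix; database names use "_" where domains use "-".
    code=$(printf '%s' "${1%wiki}" | tr '_' '-')
    printf 'https://%s.wikipedia.org/w/api.php?action=query&list=search&srsearch=hasrecommendation%%3Alink&srinfo=totalhits&srlimit=1&format=json\n' "$code"
}

# Print the URLs for the three wikis that were re-checked.
for wiki in kgwiki klwiki krcwiki; do
    search_count_url "$wiki"
done
```

Fetching one of these URLs with `curl -s` and reading `.query.searchinfo.totalhits` from the JSON response gives the number of matching articles.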