Wikimedia cross-wiki coordination and L10n/i18n. Mainly active on Wikiquote, Wiktionary, Wikisource, Commons, Wikidata, Wikibooks. And of course Meta-Wiki, translatewiki.net.
Contact me by MediaWiki.org email or user talk.
Also, I've tried the link from a recent post and it doesn't even work: it produces an empty post after one or two redirects. It seems nobody is using those links, as nobody has noticed.
Another reason to do this is that Facebook doesn't even allow sharing links to some Wikimedia projects.
Thanks for the update on the XML data dumps list. I see there's progress on the other side: https://phabricator.wikimedia.org/T382947#10476420 . Hopefully this will allow the dumps to be re-enabled soon.
IIRC these (and the OAI feeds) were added back in the day when the WMF got some corporate contribution to provide specialised data feeds. I imagine any contractual obligations have long expired (if they even existed), but I don't know who could verify that.
The query itself will remain, so getting fresh results should be nothing more than a submit query away.
By running more tests and using the Mann-Whitney U test we can tell whether a performance regression is statistically significant. That way we can make sure we only alert on real regressions, which decreases the number of false alerts and the time spent investigating them.
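An illustrative sketch of the idea (not the actual alerting code): compare timing samples from two runs with a one-sided Mann-Whitney U test and alert only when the slowdown is statistically significant. The normal approximation, the alpha threshold, and all names here are assumptions for illustration.

```python
# Sketch, assuming timing samples in milliseconds from a baseline run and a
# candidate run. Uses a normal approximation to the U distribution.
import math

def _midranks(values):
    """Ranks (1-based), averaging ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def mann_whitney_p(x, y):
    """One-sided p-value for H1: x tends to be smaller than y."""
    n1, n2 = len(x), len(y)
    ranks = _midranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # tie correction omitted
    z = (u1 - mu) / sigma
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # P(Z <= z)

def is_regression(baseline_ms, candidate_ms, alpha=0.01):
    """Alert only when the candidate run is significantly slower."""
    return mann_whitney_p(baseline_ms, candidate_ms) < alpha

baseline = [101, 99, 103, 100, 98, 102, 100, 101, 99, 100]
noisy = [104, 97, 105, 99, 96, 103, 100, 102, 98, 101]       # same speed, noisier
slower = [121, 119, 123, 120, 118, 122, 120, 121, 119, 120]  # real regression

print(is_regression(baseline, noisy))   # False: noise alone doesn't alert
print(is_regression(baseline, slower))  # True: a genuine slowdown alerts
```

The point of the rank test is that it makes no normality assumption about the timing distributions, so noisy but unshifted samples don't trigger an alert.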
We certainly don't want to be in the way. Feel free to delete the VMs. I was hoping to double-check there's nothing to salvage in the local mounts, but usually there shouldn't be anything anyway.
As an update, I created the account and luckily we were still in time for this round of submissions (CLDR 46). It's always a good time to ask me for a CLDR account! Six months tend to fly by.
Maybe it could be retrieved from a very early dump or by some other means.
@Hydriz Can I upgrade the VMs to Debian 11 one of these weekends? The only reason not to that I can think of is that some scripts may require Python 2, but that's still available in Debian 11.
@HShaikh Please don't propagate myths. https://aeon.co/essays/the-tragedy-of-the-commons-is-a-false-and-dangerous-myth
I'm closing this task as unclear and not pertaining to MediaWiki core, mostly because it mixes different user groups and permissions, some of which are Wikimedia-specific.
This reminds me a bit of https://www.wikidata.org/wiki/Wikidata:Primary_sources_tool , which I believe focused on identifying easy concepts like numbers. I've not used it in years.
https://www.mediawiki.org/wiki/Special:RecentChanges?useskin=vector&uselang=ksh after disabling JavaScript:
@Mazevedo Here's an example old ticket which may or may not be relevant any more. :)
Do you want to focus on the exonyms in languages which are supported by MediaWiki core (or at least translatewiki.net) but not in CLDR?
That was with all namespaces.
Current status
After the latest run
Mostly fixed upstream.
Not clear to me why this doi:10.1038/s41586-023-06291-2 got an arXiv ID but not a PMC ID https://en.wikipedia.org/w/index.php?title=PubMed&diff=prev&oldid=1195324840
The new round seems to be going fine so far https://en.wikipedia.org/w/index.php?title=Special:Contributions/OAbot&target=OAbot&dir=prev&offset=20240107000000&limit=50
The non-Unpaywall side continues at T228702.
We're still discarding excess merges from Dissemin, similar to the 2019 logic https://github.com/dissemin/oabot/commit/e3c74bff735c1ef16ee333dde2ac4bdd20949635 . We're not currently using the Dissemin title matches, but if we did, it would not be enough to check for a title, author, and year match: https://en.wikipedia.org/w/index.php?title=User_talk%3AOAbot&diff=1194216712&oldid=1193993325 .
There are over 6.5 million PMC matches and only some 640k matches by title and author, of which some 62k appear without a PMCID match, so perhaps we can just ignore those europepmc matches:
$ lbzip2 -dc unpaywall_snapshot_2022-03-09_sorted.jsonl.bz2 | grep '"is_oa": true' | grep pmc | grep -c "oa repository (via pmcid lookup)"
6499014
$ lbzip2 -dc unpaywall_snapshot_2022-03-09_sorted.jsonl.bz2 | grep '"is_oa": true' | grep pmc | grep -c "oa repository (via OAI-PMH title and first author match)"
637491
$ lbzip2 -dc unpaywall_snapshot_2022-03-09_sorted.jsonl.bz2 | grep '"is_oa": true' | grep pmc | grep "oa repository (via OAI-PMH title and first author match)" | grep -vc "oa repository (via pmcid lookup)"
62310
Both papers on Unpaywall have evidence "oa repository (via OAI-PMH title and first author match)" although the PMC side exposes a link to the correct DOI. The CrossRef API has the page range like "113-128", "283-288", so it may be possible to check for the number of pages.
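A minimal sketch of the page-count check suggested above: derive the number of pages implied by a CrossRef-style page field such as "113-128", so it could be compared with the page count of the candidate full text. The function name and regex are assumptions, not OAbot code.

```python
# Hypothetical helper: turn a CrossRef "page" field into a page count.
import re

def pages_from_range(page_field):
    """Return the page count implied by a range like '113-128', or None."""
    m = re.fullmatch(r"(\d+)\s*[-\u2013]\s*(\d+)", page_field.strip())
    if not m:
        return None  # single page, article number, roman numerals, etc.
    first, last = int(m.group(1)), int(m.group(2))
    if last < first:
        return None  # truncated ranges like "283-8" need extra handling
    return last - first + 1

print(pages_from_range("113-128"))   # 16
print(pages_from_range("283-288"))   # 6
print(pages_from_range("e0123456"))  # None (article identifier, not a range)
```

Page fields are messy (truncated ranges, article IDs, en dashes), so this could only be a heuristic filter, not a hard check.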
So we won't suggest edits like this either https://en.wikipedia.org/w/index.php?title=Saccharomyceta&curid=68064105&diff=1194087545&oldid=1182890284 as we don't get non-repository URLs from other sources.
A sample of what kind of URLs we're talking about
Only 35k or so of these are in the best_oa_location (sometimes even when a separate match for arxiv exists, like doi:10.1002/rsa.20071 / oai:CiteSeerX.psu:10.1.1.237.8456 / oai:arXiv.org:math/0209357 ).
Not sure how to narrow this down; we're talking about some 500k matches from CiteSeerX (out of 900k):
$ lbzip2 -dc unpaywall_snapshot_2022-03-09_sorted.jsonl.bz2 | grep citeseerx | grep "oa repository (via OAI-PMH doi match)" | jq -r 'select(.oa_locations | .[] | .endpoint_id == "CiteSeerX.psu" and .evidence == "oa repository (via OAI-PMH doi match)" )|.doi' | wc -l
505747
$ lbzip2 -dc unpaywall_snapshot_2022-03-09_sorted.jsonl.bz2 | grep -c citeseerx
887759
Another example where URL priorities changed: https://en.wikipedia.org/w/index.php?title=Balbinot_1&diff=prev&oldid=1193722831 (but there was no doi-access=free).
The recent change to sort all URLs https://github.com/dissemin/oabot/commit/ddab25a5ee71e2f23fe4b8dfb5a28c8da333a922 allowed the bot to perform https://en.wikipedia.org/w/index.php?title=Serafim_Kalliadasis&diff=prev&oldid=1193717235 , while previously it would probably only have suggested the first URL https://eprints.qut.edu.au/134215/1/134215p.pdf . http://hdl.handle.net/10044/1/55290 is the 3rd suggestion from Unpaywall and https://arxiv.org/abs/1609.05938 is the 8th.
That's hopefully fixed in https://github.com/dissemin/oabot/commit/1cd61525a8cc5d8378e60f63555cf291e1bb4660
I've manually updated the leaderboard with https://github.com/nemobis/oabot/commit/4917289ac7b49ca5176129d9f19ae5355ac84b72
The last row created was
https://en.wikipedia.org/w/index.php?title=Lyman_E._Johnson&diff=prev&oldid=1191724248 was not supposed to happen as the existing URL returns a PDF.
Latest run
Still room for improvement
Some doi-access=free being re-added now:
$ find -maxdepth 1 -type f -print0 | xargs -0 -P16 -n1 jq '.proposed_edits|.[]| select(.proposed_change|contains("doi-access=free")) | .orig_string' | grep doi | grep -Eo 'doi *= *[^"|]+' | grep -Eo '10\.[0-9]+/[a-z]+(\.([a-z]{,8}|[0-9-]{9})\b)?' | sort | uniq -c | sort -nr | head -n 40
    546 10.1146/annurev
    409 10.1007/s
    186 10.4202/app.
    178 10.1016/j.
    176 10.1016/j.cub
    156 10.1126/science.
    124 10.1038/s
     96 10.1016/j.cretres
     84 10.1111/pala.
     78 10.1017/jpa.
     72 10.1074/jbc.
     66 10.1002/ar.
     61 10.5252/geodiversitas
     56 10.11646/zootaxa.
     52 10.5852/ejt.
     52 10.5852/cr
     52 10.1016/j.palaeo
     52 10.1002/spp
     48 10.1016/j.jhevol
     46 10.1093/zoolinnean
     44 10.5962/bhl.part
     44 10.1111/j.
     42 10.1016/s
     41 10.3140/bull.geosci
     39 10.1016/j.cell
     39 10.1002/ajb
     38 10.4049/jimmunol.
     38 10.1017/pab.
     33 10.1038/nature
     32 10.1111/j.1475-4983
     31 10.37828/em.
     31 10.1093/mnras
     28 10.1111/j.1096-3642
     27 10.5962/p.
     27 10.2476/asjaa.
     25 10.7203/sjp.
     25 10.1016/j.revpalbo
     23 10.1002/ajpa.
     21 10.24425/agp.
     21 10.1093/bioinformatics
Currently with some 160k pages found:
$ find -maxdepth 1 -type f -print0 | xargs -0 -P8 -n1 jq '.proposed_edits|.[]| select(.proposed_change|contains("subscription")) | .orig_string' | grep -Eo '\| *url *= *http[^|}]+' | cut -d/ -f3 | sort | uniq -c | sort -nr | head -n 30
  15725 www.jstor.org
  14451 dx.doi.org
  12927 doi.org
   9520 www.sciencedirect.com
   6442 www.researchgate.net
   5630 www.tandfonline.com
   5491 onlinelibrary.wiley.com
   4498 www.cambridge.org
   3824 pubmed.ncbi.nlm.nih.gov
   3477 link.springer.com
   3182 muse.jhu.edu
   3024 linkinghub.elsevier.com
   2928 www.nature.com
   2770 journals.sagepub.com
   2065 www.academia.edu
   1934 pubs.acs.org
   1896 academic.oup.com
   1736 www.persee.fr
   1520 www.science.org
   1473 semanticscholar.org
   1247 www.journals.uchicago.edu
   1210 archive.org
   1128 books.google.com
    956 ieeexplore.ieee.org
    854 www.oxforddnb.com
    789 brill.com
    707 doi.wiley.com
    646 www.semanticscholar.org
    620 zenodo.org
    571 www.degruyter.com
After a broader run
$ find -maxdepth 1 -type f -print0 | xargs -0 -P8 -n1 jq '.proposed_edits|.[]| select(.proposed_change|contains("subscription")) | .orig_string' | grep -Eo '\| *url *= *http[^|}]+' | cut -d/ -f3 | sort | uniq -c | sort -nr | head -n 30
   3020 dx.doi.org
   2666 www.jstor.org
   2569 doi.org
   2116 www.sciencedirect.com
   1217 www.researchgate.net
   1105 onlinelibrary.wiley.com
   1011 www.tandfonline.com
    822 www.cambridge.org
    789 pubmed.ncbi.nlm.nih.gov
    748 linkinghub.elsevier.com
    685 link.springer.com
    630 www.nature.com
    522 journals.sagepub.com
    453 muse.jhu.edu
    435 pubs.acs.org
    361 www.academia.edu
    351 semanticscholar.org
    341 academic.oup.com
    338 www.science.org
    301 archive.org
    244 www.persee.fr
    210 www.journals.uchicago.edu
    187 books.google.com
    180 ieeexplore.ieee.org
    157 pubs.geoscienceworld.org
    150 doi.wiley.com
    149 www.semanticscholar.org
    120 pubs.rsc.org
    119 brill.com
    108 link.aps.org
How to sample JSTOR DOIs which look closed:
$ find -maxdepth 1 -type f -print0 | xargs -0 -P8 -n1 jq '.proposed_edits|.[]| select(.proposed_change|contains("doi-access=|")) | .orig_string' | grep 2307 | grep -Eo "10.2307/[0-9]+" | sort | shuf -n 40
Currently the most represented domains would be:
$ find -maxdepth 1 -type f -mtime -1 -print0 | xargs -0 -n1 jq '.proposed_edits|.[]| select(.proposed_change|contains("subscription")) | .orig_string' | grep -Eo '\| *url *= *http[^|}]+' | cut -d/ -f3 | sort | uniq -c | sort -nr | head -n 30
    916 dx.doi.org
    723 www.sciencedirect.com
    658 doi.org
    519 www.jstor.org
    312 onlinelibrary.wiley.com
    292 linkinghub.elsevier.com
    267 www.researchgate.net
    221 www.tandfonline.com
    218 www.cambridge.org
    204 link.springer.com
    182 pubmed.ncbi.nlm.nih.gov
    179 www.nature.com
    152 journals.sagepub.com
    131 pubs.acs.org
    102 www.science.org
     94 academic.oup.com
     93 semanticscholar.org
     87 archive.org
     79 www.academia.edu
     74 pubs.geoscienceworld.org
     55 doi.wiley.com
     54 www.journals.uchicago.edu
     52 pubs.rsc.org
     50 muse.jhu.edu
     49 www.semanticscholar.org
     47 ieeexplore.ieee.org
     43 iopscience.iop.org
     42 link.aps.org
     37 xlink.rsc.org
     35 aip.scitation.org
Need to check how many url-access=limited we'd add to non-DOI citations like AdsAbs https://en.wikipedia.org/w/index.php?title=T_Scorpii&diff=prev&oldid=1188735108
We should not replace an existing url-access with another for the same URL, as happened in https://en.wikipedia.org/w/index.php?title=Soft_skills&diff=prev&oldid=1188731807 (even though I'd argue the archive.org inlibrary items are more "limited" than "registration").
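A minimal sketch of the guard this implies: before proposing url-access, check whether the citation already carries one and, if so, propose nothing. The regex and function names are made up for illustration; the real bot parses templates properly rather than regexing wikitext.

```python
# Hypothetical guard: never replace an existing url-access value.
import re

def has_url_access(citation_wikitext):
    """True if the template already sets a url-access parameter."""
    return re.search(r"\|\s*url-access\s*=\s*\w", citation_wikitext) is not None

def propose_url_access(citation_wikitext, value):
    """Return a proposed parameter addition, or None if one already exists."""
    if has_url_access(citation_wikitext):
        return None  # leave the editor-chosen value alone
    return "|url-access=" + value

cite = "{{cite book |title=Example |url=https://archive.org/details/x |url-access=registration}}"
print(propose_url_access(cite, "limited"))  # None: parameter already present
```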
I've manually deleted the older suggestions so now the numbers will be lower.
find ~/www/python/src/bot_cache -mtime +3 -delete
Some ISSNs
$ find ~/www/python/src/bot_cache -type f -exec jq '.proposed_edits | .[] | .orig_string' {} \; | grep issn | grep -Eo 'issn *= *[0-9-]{8,9}' | grep -Eo '[0-9-]{8,9}' | sort | uniq -c | sort -nr | head -n 40
     87 0036-8075
     46 0004-637
     45 1476-4687
     45 0004-6256
     39 0191-2917
     39 0098-7484
     33 0028-0836
     28 1044-0305
     25 0067-0049
     24 0080-4606
     24 0021-8693
     19 2156-2202
     19 1396-0466
     18 1538-4365
     17 0148-0227
     17 0031-4005
     17 0022-0949
     16 0950-9232
     16 0304-3975
     16 0278-2715
     16 0140-6736
     16 0035-8711
     16 0028-646
     16 0002-7294
     15 1944-8007
     15 1538-4357
     15 0301-4223
     15 0031-949
     15 0006-3568
     15 0003-9926
     14 2330-4804
     14 1475-4983
     14 0271-5333
     13 0272-4634
     13 0097-3165
     13 0080-4630
     12 2515-5172
     12 1631-0683
     12 1364-5021
     12 0094-8276
Or to catch some more ISSNs:
$ find ~/www/python/src/bot_cache -type f -exec jq '.proposed_edits | .[] | .orig_string' {} \; | grep doi= | grep -Eo 'doi *=[^"|]+' | grep -Eo '10\.[0-9]+/[a-z]+\b(\.?([a-z]{,8}|[0-9-]{8,9})\b)?' | sort | uniq -c | sort -nr | head -n 40
    390 10.1126/science.
    260 10.1001/jama.
    244 10.1074/jbc.
    235 10.1038/sj.onc
    155 10.1098/rsbm.
    116 10.1098/rstb.
    111 10.1525/aa.
    110 10.1098/rspa.
    104 10.1242/jeb.
    104 10.1111/j.
    100 10.5210/fm.
    100 10.1377/hlthaff.
     99 10.1016/j.
     91 10.1098/rstl.
     86 10.1093/mnras
     74 10.1242/jcs.
     68 10.1167/iovs.
     68 10.1001/archinte.
     62 10.1542/peds.
     61 10.1111/j.1469-8137
     60 10.1098/rsta.
     57 10.1111/j.1558-5646
     55 10.1001/archneur.
     53 10.1111/j.1096-3642
     52 10.1001/archpsyc.
     48 10.3732/ajb.
     46 10.1002/art.
     43 10.1038/sj.mp
     43 10.1016/j.febslet
     42 10.1093/hmg
     41 10.1111/j.1432-1033
     41 10.1016/j.jacc
     40 10.1093/acrefore
     40 10.1001/archopht.
     39 10.1098/rspb.
     39 10.1093/molbev
     38 10.1001/archpedi.
     37 10.1242/dev.
     37 10.1111/j.1475-4983
     36 10.1016/j.jasms
Some of the most common DOI segments slated for doi-access=free removal in today's run:
$ find ~/www/python/src/bot_cache -type f -exec jq '.proposed_edits | .[] | .orig_string' {} \; | grep doi= | grep -Eo 'doi *=[^"|]+' | grep -Eo '10\.[0-9]+/[a-z]+(\.([a-z]{,8}|[0-9-]{9})\b)?' | sort | uniq -c | sort -nr | head -n 40
    392 10.1126/science.
    351 10.1074/jbc.
    260 10.1001/jama.
    236 10.1038/sj.onc
    209 10.1007/s
    176 10.1016/s
    173 10.1038/s
    155 10.1098/rsbm.
    147 10.1146/knowable
    139 10.1038/d
    116 10.1098/rstb.
    111 10.1525/aa.
    110 10.1098/rspa.
    104 10.1242/jeb.
    104 10.1111/j.
    100 10.5210/fm.
    100 10.1377/hlthaff.
     99 10.1016/j.
     91 10.1098/rstl.
     86 10.1093/mnras
     76 10.1242/jcs.
     75 10.1167/iovs.
     68 10.1001/archinte.
     62 10.1542/peds.
     61 10.1111/j.1469-8137
     60 10.1098/rsta.
     57 10.1111/j.1558-5646
     55 10.1001/archneur.
     53 10.1111/j.1096-3642
     52 10.1001/archpsyc.
     48 10.3732/ajb.
     46 10.1038/nature
     46 10.1002/art.
     45 10.1038/sj.mp
     43 10.1016/j.febslet
     42 10.1111/j.1432-1033
     42 10.1093/hmg
     41 10.1016/j.jacc
     41 10.1007/bf
     40 10.1093/acrefore
You can look at the effect of the captcha on known-human users (e.g. IPs from some institutional range).
And currently
$ find ~/www/python/src/cache -type f -exec jq '.proposed_edits | .[] | .orig_string' {} \; | grep url= | grep -Eo 'url=[^"|]+' | cut -d/ -f3 | sort | uniq -c | sort -nr | head -n 40
   1427 doi.org
   1229 dx.doi.org
   1180 www.sciencedirect.com
    940 www.jstor.org
    875 web.archive.org
    736 onlinelibrary.wiley.com
    606 www.researchgate.net
    591 www.nature.com
    586 www.tandfonline.com
    408 www.cambridge.org
    376 archive.org
    337 link.springer.com
    328 linkinghub.elsevier.com
    310 www.escholarship.org
    302 journals.sagepub.com
    283 www.academia.edu
    265 academic.oup.com
    261 pubmed.ncbi.nlm.nih.gov
    259 www.biodiversitylibrary.org
    244 books.google.com
    238 www.science.org
    224 babel.hathitrust.org
    220 zenodo.org
    212 nrs.harvard.edu
    184 ieeexplore.ieee.org
    177 digitalcommons.law.yale.edu
    176 www.journals.uchicago.edu
    166 urn.kb.se
    164 pubs.acs.org
    123 www.bioone.org
    118 nbn-resolving.de
    117 philarchive.org
    110 muse.jhu.edu
    110 link.aps.org
    105 www.research.manchester.ac.uk
    100 bioone.org
     87 www.aeaweb.org
     86 www.osti.gov
     79 pubs.rsc.org
     77 dspace.lboro.ac.uk
I made reports upstream for Journal of Biological Chemistry (already fixed), Journal of Asian Studies/Duke University Press, Annual Review of Public Health, AAS journals, AME journals. I manually removed their doi-access=free removals in the queue (they were around 10 % of the total, I think, including all 10.1146/annurev DOIs some of which are not open yet).