User-to-user communication and collaboration - this is unmaintained software: https://www.mediawiki.org/wiki/Structured_Discussions/Deprecation
(Homepage)
See also: Growth-Team
So, thinking about next steps (to address on Monday):
Change #1157155 had a related patch set uploaded (by Pppery; author: Pppery):
[mediawiki/extensions/Flow@master] Ignore revisions by Flow talk page manager when importing LQT
Change #1156959 had a related patch set uploaded (by Pppery; author: Pppery):
[mediawiki/extensions/Flow@master] Do not try to import pages that already redirect to Flow
Change #1156952 had a related patch set uploaded (by Pppery; author: Pppery):
[mediawiki/extensions/Flow@master] Fix misc stuff in LQT import code
Change #1156945 had a related patch set uploaded (by Pppery; author: Pppery):
[mediawiki/extensions/LiquidThreads@master] Misc fixes for LQT
Good catch – I'd initially eaten the whole anchor and then thought better of it.
In any case: the output lists of topics from the extracted and full-history versions of the Flow database dump were identical.
Mostly unrelated to the above but I found that working with ElementTree was a lot more pleasant when I could have a look at the object in Jupyter: https://phabricator.wikimedia.org/P77955
Right then, here we go!
Hah, don't apologise – a lot of why I'm posting in so much detail to this thread is that it's important to get this right and it's reassuring that you're here to point me in the right direction. I'll run the other dump through this script tomorrow, with modifications as needed, and hopefully by the time you're online for the day I'll have a relatively convincing list of threads and the topics they should point at.
Not all Flow topics necessarily appear in the "flow" dump. Topics that were "hidden" are only found in the FlowHistory dump, and topics that were outright deleted may not necessarily be in any of the dumps (I'm not sure how that is handled). This isn't likely to affect that many topics, but take note.
For now I'm just grabbing the list of redirects, valid or otherwise - I did indeed realise as soon as I'd posted that I should have phrased that better. Processing every revision of everything in a namespace was the thing that I really wanted behind me, but now I've got a JSON file of Flow topics that should load nice and quickly. It's end of day for me now, but hopefully by the end of tomorrow I'll have a list mapping from LQT pages to the most recent valid Flow topic, if it exists.
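A minimal sketch of the mapping step described here, assuming each LQT page already has a list of (timestamp, redirect target) pairs and that the valid Flow topics are available as a set of titles. The function names and the data shapes are hypothetical, not taken from the actual script.

```python
def latest_valid_topic(redirects, valid_topics):
    """Return the most recent redirect target that is a valid Flow topic.

    redirects: iterable of (timestamp, target) pairs. Timestamps are assumed
    to be ISO 8601 UTC strings as found in MediaWiki dumps, which sort
    chronologically when compared as plain strings.
    """
    for _timestamp, target in sorted(redirects, reverse=True):
        if target in valid_topics:
            return target
    return None  # this page never pointed at a valid topic


def build_mapping(redirects_by_page, valid_topics):
    """Map each LQT page title to its most recent valid Flow topic, or None."""
    return {
        page: latest_valid_topic(redirects, valid_topics)
        for page, redirects in redirects_by_page.items()
    }
```

Pages that map to None are the ones that would need a closer manual look.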
The LQT->LQT redirects in that list are fine; they are what normally happens when a thread is moved in LQT. The one LQT->Flow list was a case I hadn't foreseen where double-redirect bots turned the previous case into a redirect to an invalid Flow topic. Since there was only one of them I undid the bot edit and returned it to the previous state; the topic in question is on a page that remains LQT today.
I didn't want to hit up the API 12,627 times for revision history, as the rate limit would mean it would take about 2.5 hours to execute. I've therefore downloaded the complete revision history, which is about 305MB (and close to 5GB uncompressed). I then created the script below to extract the details of redirects.
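A minimal sketch of this kind of extraction, not the script referred to above. It assumes the standard MediaWiki XML export layout (`page` > `title`, `revision` > `timestamp` and `text`) and streams the file with `ElementTree.iterparse` so the multi-gigabyte history never has to fit in memory. The function names are hypothetical, and the Portuguese redirect keyword in the pattern is an assumption about the wiki's localisation.

```python
import re
import xml.etree.ElementTree as ET

# Matches the target of a wikitext redirect such as "#REDIRECT [[Topic:Abc]]".
# "#REDIRECIONAMENTO" is assumed to be the localised keyword; adjust per wiki.
REDIRECT_RE = re.compile(
    r"^\s*#(?:REDIRECT|REDIRECIONAMENTO)\s*:?\s*\[\[([^\]|]+)", re.IGNORECASE
)


def local_name(tag):
    """Strip the XML namespace that export files attach to every tag."""
    return tag.rsplit("}", 1)[-1]


def iter_redirect_revisions(source, title_prefix=""):
    """Yield (title, timestamp, target) for every revision that is a redirect.

    source: a filename or binary file object containing the XML dump.
    title_prefix: only report pages whose title starts with this string.
    """
    title = ""
    for _event, elem in ET.iterparse(source, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
        elif name == "revision":
            fields = {local_name(child.tag): (child.text or "") for child in elem}
            match = REDIRECT_RE.match(fields.get("text", ""))
            if match and title.startswith(title_prefix):
                yield title, fields.get("timestamp", ""), match.group(1).strip()
            elem.clear()  # drop the revision text once it has been inspected
        elif name == "page":
            elem.clear()
```

A compressed dump can be passed in directly as a file object, for example `bz2.open(path, "rb")`, where `path` stands for wherever the history dump was saved.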
All looks right to me. The correct thing to pass for throot is the LQT page name, not the ID.
So, as I understand it, I want to query everything in the Thread namespace and work out the status of each of these
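One way to enumerate the namespace is the Action API's `list=allpages` module, following the `continue` block until it runs out. The sketch below is only an illustration: the endpoint URL, the namespace ID 90 for Thread, the User-Agent string and the function names are all assumptions to check against the wiki's own configuration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://pt.wikibooks.org/w/api.php"  # assumed endpoint
THREAD_NS = 90  # assumed ID of the LQT Thread namespace


def fetch_json(params):
    """Perform one GET request against the API and decode the JSON reply."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    request = urllib.request.Request(
        url, headers={"User-Agent": "lqt-audit-sketch/0.1"}  # hypothetical UA
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_namespace_titles(namespace, fetch=fetch_json):
    """Yield every page title in a namespace, following API continuation."""
    params = {
        "action": "query",
        "list": "allpages",
        "apnamespace": namespace,
        "aplimit": "max",
        "format": "json",
    }
    while True:
        data = fetch(params)
        for page in data["query"]["allpages"]:
            yield page["title"]
        if "continue" not in data:
            break
        # Merge the continuation tokens into the next request.
        params = {**params, **data["continue"]}
```

Calling `list(iter_namespace_titles(THREAD_NS))` would then give the titles to classify; passing a different `fetch` callable makes it possible to try the loop without touching the network.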
Change #1154016 merged by jenkins-bot:
[mediawiki/extensions/Flow@master] Use middleware instead of AbortEmailNotification hook
Change #1154016 had a related patch set uploaded (by Pmiazga; author: Pmiazga):
[mediawiki/extensions/Flow@master] Use middleware instead of AbortEmailNotification hook
I don't know how you're so quick at finding these! I appreciate the example.
Oh, and, in case it's useful, here's another example of everything put together:
Don't worry, you've not led me too far astray. I spotted the Topico: namespace and worked out I could use it to look at individual threads, but I didn't mentally connect it to the URL you mentioned and I didn't realise the significance of it being a namespace in terms of what I could do with the API. I had imagined LQTs as being somehow "on" the root page, but really it's more like the root page renders all of the threads that are connected to it if it's got that special parser instruction.
Sorry, I forgot Phabricator ate characters at the end of links. The URL I meant to post was https://pt.wikibooks.org/wiki/Especial:Todas_as_p%C3%A1ginas/Thread%3A (a list of all pages in the LQT thread namespace). And it seems that mistake has led you pretty far astray, sorry.
Once again thank you for taking the time to discuss this complicated situation. I think I'm starting to understand things.
Ah, I see. I think I was confused because this particular page doesn't have any threads, so the second example is helpful. This is clearly going to take some care to get right. I think I'll try to identify which pages fall into which categories next, both to solidify and confirm my understanding of the situation and also so that we have a good idea of when to script and when to manually amend.
First off, any LQT page that isn't any flavor of archive can probably safely be converted to Flow and archived using the convertLqtPageOnLocalWiki.php script. This means it somehow escaped all of the mess without ever being touched.
Here's an example of what I was trying to refer to:
I'm starting to get my head around this and it's even thornier than I thought.
Ah – thank you for the clarification. It sounds like it wouldn't really aid us much as a dry run, then.
If by frozen you mean https://github.com/wikimedia/operations-mediawiki-config/blob/bb48e35598c79ef5dc0524973c9d3ac35336f604/wmf-config/liquidthreads.php#L6 then the script would do nothing, because freezing is implemented by forcibly making all LQT pages not actually LQT (wgLiquidThreadsAllowUserControl = false disables the useliquidthreads parser function).