User talk:Connel MacKenzie/archive-2006-05
- If you are here at the top of the page, you are lost. Go here instead.
Normalizing janoun calls
From Template talk:janoun:
- [...] python replace.py -xml -regex "{{janoun\\|(\\[\\[)?(.*?)(]])?\\|(\\[\\[)?(.*?)(]])?}}" "{{subst:janoun|\\2|\\5}}"
- [...] do you still want it "subst:"ed? --Connel MacKenzie T C 23:06, 1 May 2006 (UTC)
- Yes, please run the above script to replace calls like "{{janoun|[[いわ]]|[[iwa]]}}" on 岩 with calls like "{{janoun|いわ|iwa}}". Let me know when the script completes, because I then want to change the template to have it wikilink the two parameters and to include a defaulting category index parameter as you mention above. Rodasmith 23:18, 1 May 2006 (UTC)
- Now I'm horribly confused. You want me to "subst:" all existing entries, so that you can change the template only for future entries? Or do you want only a subset "subst:"ed? --Connel MacKenzie T C 01:21, 2 May 2006 (UTC)
- My apologies. I accidentally included the "subst:" portion from the script on Template talk:janoun. I did not intend to subst anything at all. Instead, what I meant was to have the script strip wikilink markup from the two parameters wherever the template is called. Doing so would let me move forward with two changes: (1) add wikilink markup to the template around its first and second parameters; and (2) add an indexing parameter that defaults to the first given parameter. I would then remove the "subst" recommendation from the template's documentation. Updated request follows:
python replace.py -xml -regex "{{janoun\\|(\\[\\[)?(.*?)(]])?\\|(\\[\\[)?(.*?)(]])?}}" "{{janoun|\\2|\\5}}"
- Please forgive the confusion my copy-paste caused. Does the above script look right to you for stripping wikilink markup? Rodasmith 01:52, 2 May 2006 (UTC)
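(For reference, a quick way to sanity-check that pattern locally, before handing it to replace.py, is to run it through Python's re module. The sample calls below are only illustrative; the wikilinked form borrows the いわ/iwa example above.)
import re

pattern = r"{{janoun\|(\[\[)?(.*?)(]])?\|(\[\[)?(.*?)(]])?}}"
replacement = r"{{janoun|\2|\5}}"

# One wikilinked call and one already-plain call; both should come out plain.
for call in ["{{janoun|[[いわ]]|[[iwa]]}}", "{{janoun|いわ|iwa}}"]:
    print(re.sub(pattern, replacement, call))
# Expected output, both times: {{janoun|いわ|iwa}}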
- Thank you for clearing up my confusion. Either something is wrong with the standard replace.py 'bot (unlikely) or I'm still passing it the wrong parameters, as it isn't changing any entries. By changing "-xml" to "-ref:Template:janoun" it uses only the relevant entries. Here is the command I used...do you see any typos?
$ python replace.py -ref:Template:janoun "{{janoun\\|(\\[\\[)?(.*?)(]])?\\|(\\[\\[)?(.*?)(]])?}}" "{{janoun|\\2|\\5}}"
--Connel MacKenzie T C 03:44, 2 May 2006 (UTC)
- OK, I'm a friggin idiot. I forgot the "-regex" switch. --Connel MacKenzie T C 05:23, 2 May 2006 (UTC)
- Final command was this:
$ python replace.py -ref:Template:janoun -regex "{{janoun\|(\[\[)?(.*?)(]])?\|(\[\[)?(.*?)(]])?}}" "{{janoun|\2|\5}}"
...in progress...
Looks great! (E.g.: [1]) I jumped the gun a bit and already did the template parameter wikilinks etc. Thanks for your help! Rodasmith 06:09, 2 May 2006 (UTC)
- Thank you. It was a fun (albeit sometimes frustrating) exercise, and I know this much more about "regex"s now. The process is still looping, pulling 20 items at a time, then sleeping for a minute, but I think it is stuck reloading the same lists. (The terminal doesn't display the Japanese characters, so all I see is "No changes were necessary in ?" or "No changes were necessary in ???" over and over.) I'll just hit ^C in a little while. --Connel MacKenzie T C 06:20, 2 May 2006 (UTC)
(from Template talk:janoun—Rodasmith) I'm not sure what the category.py bot would do. Would it only strip the redundant "Category:Japanese language" from entries with both "Category:Japanese language" and "Category:Japanese noun"? If so, that would be helpful, but it would help even further if that behavior applied to entries that are in "Category:Japanese nouns" due to their inclusion of {{janoun}}.
If instead category.py would move all "Category:Japanese language" entries to "Category:Japanese noun", that would not be so good, because many entries in "Category:Japanese language" are obviously not nouns. Rodasmith 17:09, 2 May 2006 (UTC)
Hi again, Connel. Articles in the misguided category Category:Furigana need to be moved to Category:Hiragana. Would the following command do that?
$ python category.py move "Furigana" "Hiragana"
Thanks in advance. Rodasmith 21:12, 4 May 2006 (UTC)
- Yup. That is it. Reading the talk page of that category, and the About Japanese language page here, I'll start this up in a few moments. PLEASE put new requests at the bottom of the page. Until I get around to archiving it again, I'm losing conversations herein. --Connel MacKenzie T C 00:50, 5 May 2006 (UTC)
- Ugh. Well, I stopped the bot, as it isn't doing anything. Twice a week, the categories are rebuilt - the change that the 'bot made to Template:furigana will be visible once the rebuild is run (Saturday, I think.)
- In the meantime, do you want 'bot replacement of template:furigana with template:hiragana? --Connel MacKenzie T C 00:59, 5 May 2006 (UTC)
- Um, well, a move did the trick, alright. My offer for 'bot correcting the individual entries remains, if needed. OTOH, that might be a good cleanup list for someone: Special:Whatlinkshere/Template:furigana. --Connel MacKenzie T C 01:08, 5 May 2006 (UTC)
- I followed Special:Whatlinkshere/Template:furigana and manually changed the few entries that used the old {{temp|furigana2}} because they were broken (showing "REDIRECT" within the page). The others can be manually reviewed in time. Thanks for your help! Rodasmith 05:34, 5 May 2006 (UTC)
Trolls
Oops, I was just reading through TR...didn't research the user enough. - TheDaveRoss 15:05, 2 May 2006 (UTC)
Word Characteristics
Allow me to start with a bit of an apology and at the same time clarify my position on the issues we have so far discussed. (I have begun to develop my user page in hopes that doing so will encourage communication should future issues arise.)
I have no contention with you except in regard to one matter, and that matter is the need to verify a claim. I am speaking about the en.wikpedia.org site, which is easy enough for any user to eliminate as a problem, whether it is malicious or not, by simply doing slow pings (one per minute) to the site until their IP address is blocked by the site. Since I have done this I cannot now determine from my IP whether this site is malicious as you claim. The only evidence of malice I have is from two popup ads that appeared when linking to the site. Surely this is not sufficient grounds for labeling the site malicious. Therefore it does appear that the only basis for your contention is that the site bears an almost identical name. Consequently Wiktionary users have been deprived of any truthful information about this site. If the Wiktionary site is worthy of respect then the provision of such information must be allowed. Otherwise you relegate the characterization of the Wiktionary site to that of a selfish and spoiled child. You cannot protect the Wiktionary by covering up such information, but rather subject the Wiktionary site to ridicule and disdain for doing such a thing. You need to describe the malicious nature of this site and tell why you have problems with the en.wikpedia.org site, because it is only one spelling mistake away from accidental user viewing and your duty here is to protect them. That said, I will go on now to something else and to the reason for this visit.
I am writing as the result of accidentally discovering your table of Gutenberg words with each word classified according to parts of speech. I am working on a similar word classification project intended to optimize the order in which the characteristics of all words are listed. If I were only using the parts of speech to classify words, then the values of each particular part of speech, such as it being open or closed, would be used to optimize their order so as to minimize the number of queries necessary to identify any particular word according to the parts of speech.
In the classification project I am working on, the parts of speech are not the characteristics but rather a single characteristic in which the actual parts of speech, such as nouns and verbs, become the values. Using a multiple-state scheme means that each part of speech can have multiple values designated by a prefix or suffix added to the part of speech.
I am excited about this project because it has the potential of providing not only dynamic classification of words based upon every possible criteria but because the order of that criteria will be optimized.
I sincerely hope we can now put any differences aside and contribute to this project together. If not then you may still share the benefits of the project without any obligation of contribution and I will proceed as best I can.
Have a great day.
-- [email protected] 15:52, 2 May 2006 (UTC)
Cheers
Dear Connel, I sent email your way the other day; drop me a line if you have received it. Regards, +sj + 20:05, 3 May 2006 (UTC)
Pedians' user pages protected?
Good idea, but as you protected User:Musical Linguist, they can't edit their own user page? Or is it as with the custom monobook pages, that the user can always edit their own pages? —Vildricianus 16:31, 4 May 2006 (UTC)
- That one was a rather special case, that I discussed privately with that user at length. --Connel MacKenzie T C 16:34, 4 May 2006 (UTC)
FmtTransBot?
- FmtTransBot: 'bot add the {{top}}/{{mid}}/{{bottom}} to translation sections, and balance columns of entries that do (only change if unbalanced by three or more?)
Really? I thought we were supposed to put A-I in the first column and J-Z in the second. —Vildricianus 17:31, 4 May 2006 (UTC)
- I've been removing those comments semi-automatically for quite some time now. Initially, when the translation table format was put forward as a formatting option, the guideline of A-I, J-Z was reasonable and naturally balanced most tables. After a short while, people realized that that was being misinterpreted as a rigid guideline (instead of a neat way to end up with balanced columns most of the time.) I strive for minimal vertical-pixel use when I balance a table. I still do not have a satisfactory (to me, at least) way of automating this yet. --Connel MacKenzie T C 17:50, 4 May 2006 (UTC)
?
who are you? and why did u contact me
cool thanks you i feel hella special now, what exactly did u think was a good contribution? and thanks for being a sexy bioch, hmm i wonder if thats in there
Templates
I just created template:en-plural to make it faster to fill out redlinks for plurals (such as are created by template:en-noun-reg). I got the following message from SemperBlotto.
- Hi there. It is my understanding that we don't allow templates to have ==language== or ===anything=== within them. The problem comes when people accidentally edit the template when meaning to edit part of the word. Have a talk with Connel and see if that is still the case. Cheers SemperBlotto 16:42, 5 May 2006 (UTC)
No way to force people to use subst: so do we just want to abandon it (delete) or edit it to remove headings and just use it, if at all, for the body? JillianE 16:52, 5 May 2006 (UTC)
- I'll take a closer look tonight, but SB is correct; we do not include headings in templates (as that triggers special logic in the MediaWiki software for the section-edit button.) Also, there is {{p}} which you may find interesting. I'll talk more later, when I return. --Connel MacKenzie T C 17:02, 5 May 2006 (UTC)
- One other thing, before I log off; see the template {{new_en_plural}} and try it out by entering a plural form of a word in the search box and pressing [Go]. Note that "preload" templates in this form are always transcluded, rather than just referenced. Try a few, and you'll see. --Connel MacKenzie T C 17:06, 5 May 2006 (UTC)
RE: Quick reminder
We don't add categories to entries that will end up being huge, such as "English language" or "English nouns"...we are gradually eliminating them, instead. --Connel MacKenzie T C 01:55, 6 May 2006 (UTC)
- Wow. Thanks so much for telling me. I was going through a lot of entries tonight for anagrams and I was thinking to myself as I went "Why do all these people add language entries without adding the category. I guess I'll do it for them." Thanks for catching my mistake quickly and correcting me. --Think Fast 02:03, 6 May 2006 (UTC)
Hi Connel. You asked me some fairly specific points so I will try to explain them. English is a word which can be used in a few different ways - to indicate the language as a linguistic entity (1.1), one's ability to use this language (1.2 and 1.3), and also an English translation of something (1.4). Sure, they are all closely-related; that is why I decided to put them as subsenses rather than ‘full’ senses, if you like. Personally I think I agree with you that 1.2 and 1.3 are identical, but since a previous editor had put them in separately, I decided to leave them that way.
The subsense thing, which clearly annoys you, is not something I am going to fight you or anyone else over - if you want to get rid of them in this or any other entry then be my guest, I am not in the habit of objecting if people change my edits, far from it. I think they make a kind of intuitive sense though – those first few defs are all obviously related in a way the others aren't.
As for the translations, I suppose I vaguely hoped that all the #1 defs would have the same translation, but as with any page you are obviously welcome to expand that section if it is necessary. On the billiards sense, I agree that it would be better placed on the relevant lower-case page. Again, this was something entered by a previous editor which I didn't disturb. Widsith 18:05, 6 May 2006 (UTC)
- Hmmm. OK. Is this difference output showing something for the billiards sense incorrectly? --Connel MacKenzie T C 18:13, 6 May 2006 (UTC)
Huh, funny...can't understand that at all. Like I say, I agree with you it should be at english. Widsith 18:47, 6 May 2006 (UTC)
Japanese kanji normalization bot requests
Many Japanese kanji entries (e.g. 反) were created with a language header of "== [[Japanese]] [[Kanji]] ==". I'd like to replace that header with the more normalized "==Japanese==" [newline] "===Kanji===". I believe the following command would do that:
python replace.py -xml -regex "== \[*Japanese]* \[*Kanji]* ==" "==Japanese==\n===Kanji==="
Could you perhaps run that command?
After that, I'd also like to normalize kanji entries by having them use {{kanji}}. Assuming that replace.py spans line breaks, I believe the following would do that:
python replace.py -xml -regex "===\[*Kanji]*===\n+\* '''Readings'''\n\*\* '''\[\[On]]''': (.*)\n\*\* '''\[\[Kun]]''': (.*)" "===Kanji===\n{{kanji|\1|\2}}"
There's certainly no rush to execute the above, so just let me know when you get around to it. Rodasmith 20:39, 6 May 2006 (UTC)
- Um, OK. Some questions and comments first:
- I figured out that it is one \ not two, on this last pass.
- Yes, it handles newlines exactly as you expect.
- Is ===Kanji=== really a valid heading? If it is, I do not understand why.
- Once the first pass is run, I'll have to wait another month to get an updated XML dump. I don't think that is what you want. I think I should first do a run of this:
python replace.py -xml -regex "== \[*Japanese]* \[*Kanji]* ==\n+\* '''Readings'''\n\*\* '''\[\[On]]''': (.*)\n\*\* '''\[\[Kun]]''': (.*)" "==Japanese==\n===Kanji===\n{{kanji|\1|\2}}"
right? --Connel MacKenzie T C 01:28, 7 May 2006 (UTC)
- Ahh. I see. I was under the impression from the earlier discussions that your shell treated backslash as an escape character, requiring us to double each regex backslash, but I see now that is not the case. It's also good to know about the delayed post-command XML update constraint. So, taking advantage of this opportunity to improve the clarity of the results, I propose the following, building on your merge above:
python replace.py -xml -regex "== *\[*Japanese]*[ =\n]+\[*Kanji]*[ =\n]+\* '''Readings'''\n\*\* '''\[\[On]]''': (.*)\n\*\* '''\[\[Kun]]''': (.*)" "==Japanese==\n===Kanji===\n{{kanji|on readings=\1|kun readings=\2}}"
- How does that look to you? BTW, "
===Kanji===
" is a valid header, per WT:AJ. It is analogous to "===Symbol===
" in the entry "C". Rodasmith
- Drat. I ran this last night (as you have it at the end here), and it seems to have done nothing. I think introducing some [] groups and some () groups confused the python-regex-parser too much.
- <random thought>Let me backpedal for a moment. The problem may have been " -xml " more than anything else. Rather than traverse the whole XML dump, why don't I just generate a wikified list, for the input list, so it doesn't have to ever do the several-hours long dump traversal again.</random thought>
- Another thought, or two: your e-mail address link (e-mail this user) does not work. Please contact me via e-mail, (or irc "Connel" in #Wiktionary, or skype, etc.) so I can set you up to do this yourself. You probably have a lot more insight on debugging the regex statements than I do. --Connel MacKenzie T C 17:35, 7 May 2006 (UTC)
- Re: ===Kanji=== - ahhhh. That does make perfect sense. I don't know why I didn't recognize that earlier. --Connel MacKenzie T C 17:35, 7 May 2006 (UTC)
- I guess I forgot to enable e-mail on the preferences page. That's fixed now. Feel free to use the "e-mail this user" link on my user page. I didn't verify the regex above, so I'm curious about the specific problem, but also would be happy to help generate the wikified list. Would that be a string or page of the following form?
*[[PAGE_NAME_0]]
*[[PAGE_NAME_1]]
*[[PAGE_NAME_2]]
[...]
*[[PAGE_NAME_N]]
- If so, what method have we of generating that list? Is there some sort of Wiktionary-wide grep? Rodasmith 18:46, 7 May 2006 (UTC)
- E-mail sent. --Connel MacKenzie T C 19:06, 7 May 2006 (UTC)
- No. My programming language of choice is MUMPS. I import the XML dump monthly, and manipulate it very quickly and easily within the MUMPS context. That makes detecting and generating the various lists that I do go very fast - 30 seconds is the typical time it takes to traverse my import of pages-latest-full-history.xml, for example. (I'm pretty sure it would be half that if I had fast disks/cpu.)
- I choose to keep the MUMPS code simple though, so I use either cut-n-paste or 'curl' or pywikipediabot.py to update Wiktionary, rather than doing it directly. That means that it is very difficult right now to share my laptop's db publicly.
- The "list" format is one line per entry, starting with "" ending with "\n" and nothing else. That is what pywikipediabot.py wants, so that is what it gets. --Connel MacKenzie T C 19:18, 7 May 2006 (UTC)
- Thanks. Yes, I received your e-mail. I have created and uploaded my public key to User:Rodasmith/Public key. Rodasmith 20:44, 7 May 2006 (UTC)
- Well, it appears that my ssh client here (away from my home PC) lacks whatever authentication method your host provides, because I get authentication failures trying to connect from this machine. After a few failed attempts, your machine seems to have attempted to have me authenticate with a password, but that obviously couldn't fly. I also tried briefly from my phone's terminal services client, but that was really just a pipe dream (maybe I'll find a decent ssh client for it later). Anyway, it will probably be easier for me to address when I'm working from home, though, so I'll resume this effort tomorrow (Monday) or Tuesday. When I go that route, I'd be happy to reduce the encryption key size if that would be better for your host. In the meantime, thanks for your help! Rodasmith 22:30, 7 May 2006 (UTC)
- Thanks much for the access. I feel like such a n00b for asking, but where is replace.py? Never mind, I found it: /home/python/pywikipedia. Rodasmith 01:39, 8 May 2006 (UTC)
- Would you prefer I make my own copy of the modules at /home/python/pywikipedia or should I use the ones there? If I use the ones there, I may need write access to throttle.log. Rodasmith 02:05, 8 May 2006 (UTC)
- Sorry, that must've been as frustrating for you, as it was for me. I got called away to other things several times there. Protections should all be open there now, and user-config.txt is set to your username (and account ownership.) The text files ja*.txt are what you are probably looking for, as input for the "-file" switch. --Connel MacKenzie T C 02:21, 8 May 2006 (UTC)
Now *that* was funny...
No problem! Actually, it was a rather lucky coincidence, considering that I've been away for a few days because my computer went boom and I've just gotten back on now. :) --Rory096 04:07, 7 May 2006 (UTC)
Themes
Replying to your question at User talk:Hippietrail: this is the fix Meta has for its grey background (monobook.css).
#content { background: #f4f4f4; }
.ns-0 * #content { background: #f4f4f4; }
#p-cactions li { background: #f4f4f4; }
.ns-0 * #p-cactions li { background: #f4f4f4; }
#p-cactions li a { background-color: #f4f4f4; }
.ns-0 * #p-cactions li a { background-color: #f4f4f4; }
Cheers. —Vildricianus | t | 18:41, 7 May 2006 (UTC)
- [2]. I can probably find out more css to get the entire screen customized; this was just pinched in seconds' time. Later, though. —Vildricianus | t | 20:55, 7 May 2006 (UTC)
- Wow. Thank you! --Connel MacKenzie T C 21:14, 7 May 2006 (UTC)
Here's a basic grey skin for you. Customize as you please. I've also added some of the better tricks of my css to it, notably the compact sidebar tweak. I've tested it in Netscape, but perhaps some things may not display properly, so please tell me which parts are bad. Cheers. —Vildricianus | t | 22:16, 8 May 2006 (UTC)
- I now also have a black skin, but it isn't as good as the grey one because of the many light tables (top/mid/bot, Wikipedia, Ncik's inflection boxes etc) in the content area. —Vildricianus | t | 20:18, 11 May 2006 (UTC)
Where did I get the material from? Benítez 04:54, 8 May 2006 (UTC)
- You were kind enough to provide a "bibliography" (in Wikipedia style, by the way, which is not correct here.) --Connel MacKenzie T C 05:01, 8 May 2006 (UTC)
- A bibliography is a recommended list for further reading. Even if they were from those books (which they are NOT) how can a single-word definition be copyrighted? There's no other way to define an abbreviation. If you don't like the bibliography though, feel free to remove it. Benítez 05:06, 8 May 2006 (UTC)
- Hey, OK. You bet. --Connel MacKenzie T C 05:08, 8 May 2006 (UTC)
pce3
I initially blocked him for 1 week, and have not done anything about it since. If you had blocked him between the last thing I saw him post and when I did I might have screwed it up, my bad. - TheDaveRoss 16:06, 8 May 2006 (UTC)
- Well, I think we can leave it alone for now, but in the future, open a tab to check the block log before applying the block, please. --Connel MacKenzie T C
- The Block log tells me it was you, Connel, who blocked after TheDaveRoss had done so. —Vildricianus | t | 18:08, 8 May 2006 (UTC)
- Wow, I'm quite an idiot. But, IIRC, he was blocked for much shorter durations first? I'm not seeing evidence of those blocks now - perhaps they were by numeric IP address? --Connel MacKenzie T C 18:47, 8 May 2006 (UTC)
- My first block was his IP indeed. Afterwards, it was only Wutykaze's 2 hour block and the 1 week block by the two of you, which will expire 20:26, 9 May. —Vildricianus | t | 19:12, 8 May 2006 (UTC)
Language (resp)
okay! Thanks a lot!--Ricardo 19:17, 8 May 2006 (UTC)
Template-driven plural vote
Hi Connel. FYI, I removed the trailing period from the vote option for the template-driven solution, since the period is already in {{plural of}}. I'm sure it doesn't change your vote, but I wanted to notify you since you already voted. Rodasmith 05:48, 10 May 2006 (UTC)
- Erm, thank you. I guess I should have notified you and Davilla properly as well. --Connel MacKenzie T C 05:57, 10 May 2006 (UTC)
ParserFunctions do work here
Made #ifeq work in Noarticletext. Please adapt further. —Vildricianus | t | 13:53, 10 May 2006 (UTC)
- Very friggin' strange. Was I doing something wrong, or was the server just giving the wrong response during heavy load time? --Connel MacKenzie T C 02:05, 11 May 2006 (UTC)
WikiSaurus - compromise proposed (/more)
copied from WT:BP: A possible compromise between the "tough criteria for WikiSaurus" camp and the "don't lose even the least valuable synonyms" camp. Introduce, in WikiSaurus, a xxx/more subpage for the problem pages. Cull the trash from the main page (by whatever criteria), but don't just delete it, put it in the /more page. In the main page indicate that new entries not meeting the tough criteria have to be put in the /more page, where they can be researched for verifiability, and perhaps later promoted to the main page. With this I would then suggest we might even protect the main WikiSaurus page. Admins would then be responsible for checking the /more pages every so often to see if there are any terms that could be promoted to the main page, as they meet the criteria. Thus we would meet two purposes. The main WikiSaurus page would be kept up to our "standard" (which I have to point out is very subjectively applied), whilst the /more page would capture every possible synonym, and would in effect be a specific protologism page.--Richardb 23:26, 10 May 2006 (UTC)
- Richard, yes, I liked that idea, and I proposed it last month. When it got no response whatsoever, I may have turned up the heat some. I hope you are not offended, and are able to show the community two or three examples (preferably the most vulgar ones) for your experiment. I wish you luck with it; you can expect my full support on this experiment. The only superior idea I've seen so far, is using categories, but that had problems of its own. Feel free to quote me on that, if you need to remind me. --Connel MacKenzie T C 02:04, 11 May 2006 (UTC)
Noarticletext magic
Hey man. I've sort of seen what you've been up to. Nice.
I noticed it always shows one self-link though, which is ugly. I'm trying to find a way in CSS to detect ones like that and set display to none. We need a selector which matches a class of .did-you-mean which contains an element strong. But I don't know how to do that. I tried > and + but neither worked.
Another thing is that I think the link which emulates the Go button should be as prominent as the "did you mean" links, perhaps worded in a way that says it will go to an article with the same spelling but different capitalization, if such exists.
Of course if there's a way to tell if we're coming from an outside link we can make some javascript which emulates the Go button... but would this mean an infinite loop? — Hippietrail 01:08, 11 May 2006 (UTC)
- Aha that seems to be impossible, but I found another way. I also implemented my own suggestion. See what you think. — Hippietrail 01:36, 11 May 2006 (UTC)
Sorry, but I've been away for a long time (what, 10 hours?) Please be a little more specific, as you lost me with your last comment.
- WOW! Yes, exactly that. w00t! --Connel MacKenzie T C 06:37, 11 May 2006 (UTC)
- As long as only "Noarticletext" has the Javascript "[Go]" button auto-clicked, I think we are OK. [Go] will get a person to either Nogomatch or Newarticletext, thereby preventing a looping-redirect situation. --Connel MacKenzie T C 02:00, 11 May 2006 (UTC)
- I started working on this but I'm postponing it for now:
- We can't just put a #REDIRECT on the page because it doesn't work with the magic required to create the destination of the redirect - for me anyway.
- If we can put a (script) tag right into the special page it would be the quickest but I know we can't use it on normal pages - maybe we can on special pages though?
- What's the best way to get the spelling of the nonexistent article in javascript? my cite tab code needs a few lines to get it but maybe there's a quicker way?
- When you have the spelling you can put it in the search form but I don't yet know how to use javascript to hit one of several form submit buttons.
- Anyway, that's what I know, if you want to play with it. — Hippietrail 20:42, 11 May 2006 (UTC)
document.searchBody.searchInput.value = pagename; document.searchBody.searchGoButton.press();
--Connel MacKenzie T C 21:15, 11 May 2006 (UTC)
- Thanks but that didn't work. I also tried .searchform. between searchBody and searchInput which failed too. This is what I get from trying to learn JavaScript and CSS without a reference manual. Lots of basic stuff I don't know. Also "pagename" didn't work. Where is that supposed to be getting set anyway? — Hippietrail 21:46, 11 May 2006 (UTC)
- OK, I'll check into Vild's monobook.js the next time I have a free minute. 'pagename' was supposed to be whatever your function returned as the pagename variable (I thought you said you already had that...ooops.)
- Ah I thought you were telling me "no need to use all those lines of code, just use 'pagename'"! — Hippietrail 22:46, 11 May 2006 (UTC)
- I'll try running IE later. The last time I tried it, I don't think it could even render Main Page without errors, given the amount of NS custom stuff in my monobook. --Connel MacKenzie T C 22:42, 11 May 2006 (UTC)
- Oh well let me know. Thanks again. — Hippietrail 22:46, 11 May 2006 (UTC)
Throwing uc: into the mix as well as ucfirst: and lcfirst: is a fine thing, but the extra formatting looks pretty ugly to me, and right now it's distractingly different in the ucfirst and lcfirst cases.
Also we're going to have to pay attention to how it behaves in other namespaces. I just noticed that if you try going to a nonexistent user's talk page, you get a bizarre result... —Scs 06:32, 11 May 2006 (UTC)
- OK, I'll look into that. Keep in mind I may have changed it while you were looking. :-) --Connel MacKenzie T C 06:36, 11 May 2006 (UTC)
- I'm not getting bizarre results. I am only (correctly) getting nothing. Are you seeing something I'm not? --Connel MacKenzie T C 06:40, 11 May 2006 (UTC)
- Try http://en.wiktionary.org/wiki/User_talk:Connel. Oddly the behavior for http://en.wiktionary.org/wiki/User:Connel is different. —Scs 13:11, 11 May 2006 (UTC)
- I see now. I must've missed the word "talk" yesterday.
- Perhaps we could run both halves of the equation through "localurle" before/during the comparison, to convert " " to "_" for the second half? That of course, makes the syntax longer and uglier. --Connel MacKenzie T C 14:44, 11 May 2006 (UTC)
- Actually, we might not have to -- I've just discovered the existence of variants like {{PAGENAMEE}} and {{FULLPAGENAMEE}} (note the double 'E', for "encoded"). See http://meta.wikimedia.org/wiki/Help:Magic_words. (I haven't tried playing with them yet.) —Scs 02:54, 12 May 2006 (UTC)
- Hmm. Well, we can't use {{FULLPAGENAMEE}}, because we need to apply ucfirst: to just the article name. But I really would have thought {{NAMESPACEE}} would have helped -- but I tried it, and it didn't. —Scs 03:43, 12 May 2006 (UTC)
- Fixed it a different way. See Wiktionary_talk:Project-Noarticletext#namespaces. —Scs 04:01, 12 May 2006 (UTC)
- OK, I got the Javascript redirects working properly...for more of the cases than the wikisyntax did too. There remain a handful of problems (categories, category talk, image talk, user talk, and WikiSaurus pages) but since they are all outside of NS:0 I'm not too worried. --Connel MacKenzie T C 07:49, 13 May 2006 (UTC)
optional diacriticals
Hi Connel. I didn't want to get into it too much on the vote page, but I had to point out that piped diacritical marks are not contrary to Wiktionary practice and are in fact very common – with certain languages. Old English is one – the language is not written with macrons, but they are printed in dictionaries and study books to help learners. The same goes for stress accents in Russian, and vowel markings in Arabic and Hebrew. For some discussions of this, see User_talk:Stephen_G._Brown#GSub, User_talk:Stephen_G._Brown#Arabic vowels, and on OE in particular Wiktionary:Beer_parlour_archive/October-December_05#Diacriticals_in_Old_English. It also applies to Latin. The point in all these cases is that the diacritical marks are not usually considered part of the language itself, only a tool for those learning or studying it. An accent in OE or Latin is not like an accent in, say, French or Spanish. There is a question over whether accented forms should have redirects but that is something for later; the main thing is to make sure unaccented forms have entries, because that is how these languages are used. Widsith 06:46, 11 May 2006 (UTC)
- Ah. OK then. Sorry for adding to the confusion. --Connel MacKenzie T C 06:59, 11 May 2006 (UTC)
Mobile
Hi. Writing from my mobile, so forgive brevity. The script is running bg. I'll monitor as possible. Rod (A. Smith) 02:44, 13 May 2006 (UTC)
FYI, I'm also watching minor edits ("hide self"). (Typing on this MDA is tolerable.) Rod (A. Smith) 03:29, 13 May 2006 (UTC)
Redirects from cap to lc
Hi,
I'm a little curious about your recent redirects. Have you been keeping up with the conversation on WT:BP regarding upper/lowercase redirects? As soon as I find my Javascript handbook, I'll figure out how to make the first choice do an auto-redirect. So soon, we won't have to worry so much about uppercase-to-lowercase redirects. One from lower-case to the proper case will suffice.
As it gets better testing, Hippietrail, Scs, Vildricianus or I will announce it more properly on WT:BP. But I figured I'd give you a heads-up, as you are entering lots of redirects these days, perhaps unnecessarily.
--Connel MacKenzie T C 03:54, 13 May 2006 (UTC)
Oh dear... actually, before I started on this course, I had this conversation with SemperBlotto:
- Tell me if I understand this correctly - if a word has no senses that differ between uppercase and lowercase (e.g. ostentatious and Ostentatious), then the latter should redirect to the former; but if there are senses unique to the uppercase (e.g. creek and Creek), then there should be two separate articles with only the senses unique to the uppercase in the uppercase article and a "see also" pointing to the lowercase. Si? bd2412 T 15:18, 28 April 2006 (UTC)
- Yes. But 1) there is no great priority in generating things like Ostentatious - they will only be red-linked if capitalized as the first word of a sentence. 2) creek and Creek BOTH have to have a disambiguation "see also" at the top to point to the other (and so on if multiple forms e.g. if there was also a CREEK) SemperBlotto 15:24, 28 April 2006 (UTC)
- Thanks. Done. bd2412 T 15:25, 28 April 2006 (UTC)
- But to further complicate things - the name of a small river would be Something Creek ! SemperBlotto 15:34, 28 April 2006 (UTC)
...so I've been following from that and making redirects from the caps of all entries I had started or done some work on before. Although a lot of my redirects today were from different usages (e.g. poly-theistic to polytheistic; PanDeism to pandeism). bd2412 T 04:02, 13 May 2006 (UTC)
- You misunderstand my concern I think. If you want to enter the redirects, knock yourself out. But if an entirely lowercase entry exists, then no matter what upper-lower-case combination they type, they will get that entry (as soon as I get this friggin Javascript working, that is.) Even now, you should see a link to that page, in those cases. --Connel MacKenzie T C 04:09, 13 May 2006 (UTC)
- Query - under the automated system of which you speak, if I redirect pan-theism to pantheism, does this automatically cause Pan-theism and Pan-Theism to lead there as well? bd2412 T 04:36, 13 May 2006 (UTC)
- Query 2: is this automation effective only with respect to the search window, or also with respect to links typed, e.g., in an article? bd2412 T 04:41, 13 May 2006 (UTC)
- The method is similar for internal and external links. You should now start seeing internal links telling you there are alternate entries. For external links, they'll be auto-redirecting, I think. We're all still in a quasi-experimental mode with this right now. --Connel MacKenzie T C 04:47, 13 May 2006 (UTC)
- Very well, then - I shall direct my activities to creating new entries and cleaning up old ones, rather than making oodles of cap-->lc links. Cheers! bd2412 T 04:53, 13 May 2006 (UTC)
- By the way, the Go/Search thing has done the case correction thing for about eight months now (we've been decapitalized for nine or ten months?) The changes being made now are for direct external links and wikified links. --Connel MacKenzie T C 04:58, 13 May 2006 (UTC)
- How about foreign script capital and lowercase characters? bd2412 T 04:04, 14 May 2006 (UTC)
- I tested some Cyrillic a few days ago and it was working then but things have been tweaked a lot. Another case is a page with at least the first character not case-changeable, like say a kana letter or a period. — Hippietrail 04:10, 14 May 2006 (UTC)
- Just to let you know, the magic redirect code didn't work for me in IE but I only tried quick and have to log off now. Maybe take a look at my js page to see if I did something wrong. Good night. — Hippietrail 04:45, 14 May 2006 (UTC)
- This worked yesterday, but today something is affecting the "span" tag, lopping off the class name. Your Javascript seems fine, if I can get the class name back in the span tag. Unfortunately, I have no clue what changed. When I "View Source" of the html now, I see no class name, just the tag (which I must say, seems pretty useless.) --Connel MacKenzie T C 06:50, 14 May 2006 (UTC)
- Well, this is live now. Seems to be working. I exclude all talk namespace searches, to prevent the false positives when there is a space in the namespace name.
- If I get the inclination, I'll add a Javascript check for "Wikisaurus" and do a redirect to the search page.
- Also, if I get the inclination, I'll do a similar JavaScript trick to do "stemming" to find the root form of a term. (i.e. -s, -es, -ed, -er, -ing.) Perhaps even removing a final doubled letter.
- While I'm at it, I suppose I could add a SOUNDEX search off my box or toolserver.
sql OR xml grep request
Could I ask you a favour? My soft template code is working great for the italbrac case but I need to find some articles that use the different variations:
- (foo) - 3702
- (foo) - 731
- (foo): - 254
- (foo): - 115
- (foo): - 24
- (foo:) - 4
- (foo:) - 1
- (foo:) - 0
To see which ones really exist and which are more popular. I understand if it's too much work and you can't get around to it of course... — Hippietrail 03:48, 14 May 2006 (UTC)
- Lemme fix the IE JavaScript problem first...
- But, you don't want examples of the satanic # (foo) (bar) ? --Connel MacKenzie T C 05:13, 14 May 2006 (UTC) :-)
- Whoa. This will require me re-running the template-transcluder thing. I'm not certain where I left off with it...I know I had it working for all inflection templates, but I didn't test anything beyond that. Yeah, I'm certainly going to attack the JS problem in IE first. --Connel MacKenzie T C 05:16, 14 May 2006 (UTC)
- Do you have a way to do a simple grep without taking templates into account to give a rough estimate? I've got people telling me which is most common but I prefer a scientific answer. As for the satanic (foo) (bar) that can wait. First solve the little problem that leads to the big problem (-: — Hippietrail 23:19, 14 May 2006 (UTC)
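(For a rough estimate of that sort, one could count regex matches straight over the XML dump. The two patterns below are placeholders for whichever italbrac markup variants are actually being compared, and the dump filename and namespace URI are assumptions.)
import re
import xml.etree.ElementTree as ET
from collections import Counter

NS = "{http://www.mediawiki.org/xml/export-0.3/}"  # varies by dump version
variants = {
    "italics outside parens": re.compile(r"^# ''\([^)\n]+\)''", re.M),
    "italics inside parens":  re.compile(r"^# \(''[^)\n]+''\)", re.M),
}

counts = Counter()
for _, elem in ET.iterparse("enwiktionary-pages-articles.xml"):
    if elem.tag == NS + "page":
        text = elem.findtext(NS + "revision/" + NS + "text") or ""
        for name, pat in variants.items():
            counts[name] += len(pat.findall(text))
        elem.clear()
print(counts)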
- There ya go. Interpreting those numbers is more difficult, as the formats do go back and forth in a wheel-war manner. That is to say, I know I've converted more than one thousand from the evil 3702 to the closer-to-correct 254. Apparently, some others have been just as active POV-"correcting" my entries...apparently moreso.
- Please also review the first section of Template talk:cattag. --Connel MacKenzie T C 07:47, 16 May 2006 (UTC)
wUnit
Is there a wiki equivalent of jUnit/nUnit for testing wiki templates? If not, I'm going to start building one. Rod (A. Smith) 01:20, 15 May 2006 (UTC)
- Oooh, that would be fantastic. I loved jUnit...I miss it a lot. To my knowledge, nothing like it exists yet. --Connel MacKenzie T C 01:38, 15 May 2006 (UTC)
- Actually, User:Gangleri on yi.wikt: (and others?) has his own test-server somewhere. If you check bugzilla, you may find it. He had rudimentary tests laid out for most of his own bug reports there. You can often find him on IRC. --Connel MacKenzie T C 01:44, 15 May 2006 (UTC)
- Yes, he does have a user page here also (sorry, I forgot) with some basic tests linked. --Connel MacKenzie T C 01:46, 15 May 2006 (UTC)
Wiktionary
Referring to the message you left on my Wikipedia talk page: I don't really know what you're referring to. Please could you link me to the posting you saw that you are replying to? Timwi 09:08, 16 May 2006 (UTC)
- I replied (well, sort of). Timwi 16:29, 16 May 2006 (UTC)
Transwikis - lowercase?
Your recent transwikis seem to be getting lost, as they are not in the "Transwiki:" pseudo-namespace, but instead in "transwiki:". I don't think that is correct - is it? --Connel MacKenzie T C 17:03, 16 May 2006 (UTC)
I responded to your question about transwikification capitalization at my talk page. Thanks. TheProject 17:09, 16 May 2006 (UTC)
- I've been told to switch to the lowercase version -- either way, what shows up on the logs should be correct. Let me know if they aren't. TheProject 17:08, 16 May 2006 (UTC)
- Entry names are supposed to be lowercase, but the pseudo-namespace should not, IIRC. So yes, the first character after the colon should be lower case, but it still should be "Transwiki:" not "transwiki:". Thanks for keeping these moving. --Connel MacKenzie T C 17:12, 16 May 2006 (UTC)
- Just looking down the transwiki log and clicking some of the links, it seems that a lot of the Transwiki: articles redirect to their lowercase equivalent. I'm just wondering though -- does it actually matter? If I transwiki into lowercase and then link to lowercase in the log, it should still work. (Is stuff actually getting "lost"?) TheProject 17:20, 16 May 2006 (UTC)
- Above at my talk page again. Thanks. TheProject 17:25, 16 May 2006 (UTC)
- "Lost" was perhaps inaccurate. It is more likely that entries will get needed attention if they are closer to what they shouold be. But since the Transwiki log itself links to wherever it is they ended up, it seems like they are not entirely lost after all. --Connel MacKenzie T C 17:32, 16 May 2006 (UTC)
scholarly training of administrators
Can you tell me what kind of training you and the other "administrators" have (and what exactly is an administrator)? Are any of you trained in linguistics, for instance? And do you work together in any coordinated fashion, as a kind of editorial board, or are you merely veteran users working essentially independently and coordinating only here in these User pages? In other words, is there an editorial body here at Wiktionary, and if so, how does it work? I haven't been able to find answers to these questions on the site itself, though this kind of explanation is always included in printed dictionaries and is essential to how we evaluate their value. If the answers to all these questions are actually here somewhere, and I've just missed them, I'd be grateful if you could point me to them. With many thanks.
- 19:38, May 16, 2006 User:Noah
- Please sign your messages on talk pages. A Wiktionary sysop is someone trusted by the Wiktionary community. Linguistic credentials are not necessarily helpful for gaining people's confidence.
- Neologisms are put through our verification process. Blatant nonsense is either pushed through our deletion process, or "shot on sight." Subtle problems are discussed in the tea room. In every regard, this is just a regular Wiki.
Thanks much, but I'm a bit confused by what you mean. I get that this is a wiki, and that there is no editorial board, just trusted users. But when you say linguistic credentials are not helpful for gaining people's confidence, do you mean that people using a dictionary don't care whether the people writing it know anything about language or linguistics? In what else do we place our confidence in the dictionaries we use? I don't intend any of this by way of a challenge, I'm just very curious about how this works. Yours, Noah
- You are welcome, much. Please sign talk page entries (and other discussion pages) with the wikimagic "~~~~" characters. That puts the automatic timestamp and your username in place of the four tildes.
- By "not helpful" I was making a generalization, based on what I've seen. Working with others nicely is far more important than credentials...at least for getting approval for "sysophood." Disclaimer: I am no poster child, in this regard. Last year, we had one or two dictionary authors contribute, but they were unable to keep their interest up. This year, we've had a host of vandals, copying entries from other copyright-protected sources. Often it is difficult to distinguish the latter from the former; the cleanup tasks have diverted many resources from more productive activities.
- Do people have confidence in Wiki technology? That question is hard to guess at. While I hope that they someday will be able to depend on Wiktionary (and sibling projects) I doubt that we've progressed that far, just yet. --Connel MacKenzie T C 02:43, 17 May 2006 (UTC)
Auto-balancing translation tables
I created {{Translations}} to help with balancing translation tables. I thought I'd run it past you before posting it for critique on BP. I hope it will replace the recommendations for {{top}}, {{mid}}, and {{bottom}}. What think ye? Rod (A. Smith) 02:27, 17 May 2006 (UTC)
- Let me be the first to say w00t! --Connel MacKenzie T C 02:47, 17 May 2006 (UTC)
- P.S. Test it on water, peace and I love you (or sandbox copies) to make sure it won't overload when there are 10,000 translations for a word. --Connel MacKenzie T C 02:47, 17 May 2006 (UTC)
- Hmm. I've run into the problem you pointed out recently regarding #ifeq and <span id="foo"/>. It occurs when using {{f}} et al., because those templates expand to spans. I'm looking for your post to try to find the resolution. Do you remember where your post was about that? Rod (A. Smith) 02:50, 17 May 2006 (UTC)
- I don't remember where the post was, but the skinny is that "=" is a very special character within templates. So I passed in {{...|span=<span id="foo" />}} instead. --Connel MacKenzie T C 02:52, 17 May 2006 (UTC)
- It looks rather server-abusive. Perhaps we should consider using it only within personal monobook contexts, subst:'ed. --Connel MacKenzie T C 02:51, 17 May 2006 (UTC)
- The current format avoids the left-right-left-right-... stuff due to many vocal complaints. One of these days, I'll revisit User:Connel MacKenzie/monobook.js and use what I did for the TTBC section, to balance these "properly." I think that will be a better short-term solution for us. Once it's written in JavaScript, it becomes much easier to rewrite in python (for a 'bot to do.) People complain about template over-use as it is; it might be better not to push it too hard. --Connel MacKenzie T C 03:23, 17 May 2006 (UTC)
- OK, I can't find a usable syntax to make the #if of {{Translations}} accept {{f}}'s spans anyway. Maybe the parser functions are just too quirky to rely on for performing much complex logic. If so, my development efforts would be better spent writing python-based cleanup scripts.
- Ah. Yes, I had to pass the span= parameter down from {{alternate pages}} to {{didyoumean}} (originally from Wiktionary:Project-Noarticletext.) --Connel MacKenzie T C 04:22, 17 May 2006 (UTC)
- Aside, though, I am curious about server load issues. I certainly want to avoid creating processor-heavy solutions but I want to balance that caution with taking full advantage of the MediaWiki software's capabilities.
- From my understanding of content rendering engines (based mostly on my experience building a similar web content rendering system in 1999), the MediaWiki servers should only evaluate templates when they must render a "dirty" page that contains them, i.e. one missing a render cache entry because it or any of its included templates were modified after its last rendering. Typically, that means they evaluate only once per edit submission. Logic based on string equality (like #if/#ifeq) typically has a big O complexity less than that of parsing an incoming Xml-ish page (e.g. wiki markup), which also must occur once per edit submission.
- If the media wiki software is similar to other content rendering systems, I'd be surprised if templates like {{Translations}} are noticeably any more server intensive than a page without templates. I cannot find any relevant performance discussions of the parser functions, though, so I may just be unaware of some basic aspect of MediaWiki's content rendering system. What resource could I read to get a better understanding of server load for various media wiki functions? Rod (A. Smith) 03:55, 17 May 2006 (UTC)
- I honestly hadn't thought that far through it. I'll trust that your guess is correct, since everything is cached, every-which-way possible. That suggests that the only performance hit is on first-view, where the server has to go get a couple pages instead of just one. I admit, that seems very negligible.
- The l-r-l-r-l-r problem is still pretty significant though. OTOH, perhaps the people advocating it the strongest, aren't here anymore?
- If you feel like writing some Python for it, I'll offer these thoughts:
- You'll have to use primitive Kerning to determine approximately how many lines one line will consume. Estimating a display of 800x600 might be too low for the typical visitor, I'd guess.
- The goal for each section should be to use the minimum vertical pixels.
- Translation section's sub-section title-detection may get tricky: three formats I know of are ";foo", "bar" and "(fubar)".
- Robot edits should be skipped if only one line will move, per sub-section.
- Remember that many pages have multiple translation sections, sometimes at ===level three===, sometimes at ====level four====, sometimes at =====level five===== and sometimes at ======level six======.
- Remember not to include [=]=Translations to be checked=[=].
- Remember not to split Serbian/Croatian nor zh (trad)/zh (simp) across columns.
- Skip translation sections marked with {{rfc-trans}}.
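(A very rough sketch of how such a balancing pass might start, in Python since the 'bot would be pywikipedia-based: estimate each line's rendered height with a crude characters-per-line guess standing in for the "primitive kerning" idea above, then pick the split point that minimizes the taller column. The chars_per_line value is an assumption, and the sub-section, Serbian/Croatian, and rfc-trans rules from the list are not handled here.)
def estimated_height(line, chars_per_line=60):
    # Crude kerning stand-in: how many rendered lines will this take?
    return max(1, -(-len(line) // chars_per_line))

def balance(lines):
    # Choose the split index that minimizes the taller of the two columns.
    heights = [estimated_height(l) for l in lines]
    total = sum(heights)
    best_split, best_height, running = 0, total, 0
    for i, h in enumerate(heights + [0]):
        if max(running, total - running) < best_height:
            best_split, best_height = i, max(running, total - running)
        running += h
    return lines[:best_split], lines[best_split:]

def rebuild(lines):
    # Wrap the two balanced columns in the {{top}}/{{mid}}/{{bottom}} templates.
    left, right = balance(lines)
    return "\n".join(["{{top}}"] + left + ["{{mid}}"] + right + ["{{bottom}}"])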
- Thanks for the tips. I'll take up the scripted translation table balancing project later, as I need to catch up on anon review et al. In preparation for that project, I suppose I should apply for the bot account you mentioned last week. Is that just done as an informal request on BP? Rod (A. Smith) 05:01, 17 May 2006 (UTC)
- BWAHAHAHAHAHAHAHAHA
- OK, seriously though. The English Wiktionary has some rather draconian rules regarding 'bots. It is not simple like Wikipedia, where you either have the flag, or not. Here, each 'bot task needs its own User account. Each task needs its own vote/approval on WT:BP, then approval on Meta:. Reusing a 'bot account for another task is taboo. So, the 12,000 edits you did under your account would be one 'bot account/'bot approval process. Any future 'bot tasks then need separate approval. ALTERNATELY, you can just run each 'bot task under your own account name. I've gone out of my way to conform to these ignorant/asinine 'bot requirements. I've been run through the wringer each time. The lack of understanding of what the 'bot flag is for, is astounding. All the 'bot flag does, is mask the flood of edits from Special:Recentchanges.
- OTOH, since what you propose would be an ongoing refinement 'bot, it would really need the 'bot flag. (That is, every single time a translation is added, a translation table can become "unbalanced." So it would have to be re-run after each XML dump, or perhaps as a RecentChanges-monitoring 'bot.)
- To start the process, you first run a test batch of 10 entries. Then, when kinks are worked out, you run a test batch of 100. Then you post a semi-formal explanation on WT:BP, with a typical vote layout (kind of like WT:A,) and wait a couple weeks. If approved, you then post another formal request on Meta:. After about a week, your 'bot account will have the 'bot flag.
- --Connel MacKenzie T C 06:36, 17 May 2006 (UTC) But hey, I'm not bitter.
Japanese input files
Hi Connel. Several kanji entries were missed during the bulk run of the python script. I'm not sure whether that's because of some problem with the script, an interruption, or an incomplete data file. Are you aware of how japanese_kanji.txt and ja_kanji_urlencode.txt were created? Rod (A. Smith) 03:25, 19 May 2006 (UTC)
- Hi. Yes, I converted your first regex to a MUMPS pattern match when I went through my imported copy of the XML dump. Could you give one or two examples of what was missed? We still don't have a new XML dump just yet. Perhaps a Wiktionary search would now be a better way of generating a list of remaining entries. --Connel MacKenzie T C 14:09, 19 May 2006 (UTC)
- Some examples of skipped entries are 静, 梅, and 茸, although the last two have subsequently been fixed by Izumi5. I can't tell whether they're in japanese_kanji.txt, though, since my terminal emulator isn't keen on Unicode. Maybe I should run the replacement on everything that links to Kanji, since all of them so far seem to do so. Rod (A. Smith) 15:29, 19 May 2006 (UTC)
- It looks to me like I forgot to search for lower-case "kanji" when I made the list. The 505 remaining entries look as though they have all been "touched" since the original upload.
- I think working off that list is a very good idea, but your original regex's won't suffice, of course.
I'll put that list in /home/rodasmith/japanese_kanji_2.txt. --Connel MacKenzie T C 02:49, 20 May 2006 (UTC)
- I seem to be having the same difficulties, with strange characters. Rather than trying to force it, why not replace "-file japanese_kanji.txt" with "-ref Kanji"? That still works, IIRC. --Connel MacKenzie T C 03:05, 20 May 2006 (UTC)
- Will do! Rod (A. Smith) 03:46, 20 May 2006 (UTC)
Connel,
any progress on this? In my book it's pretty important, but too complicated for more than one person to work on it. Are you getting anywhere with it? --Richardb 10:04, 19 May 2006 (UTC)
- Richard, thanks for your interest in it. When I last edited color/colour, I had the kinks worked out, I thought. Vild was going to start attacking other entries using this method. I was aware only that it needed more discussion, but haven't heard a peep until now. If there is a specific problem remaining with the approach, could you spell out what it is for me? The only remaining limitation I was aware of was lack of community support/comprehension. --Connel MacKenzie T C 14:14, 19 May 2006 (UTC)
Re wiktionary linking
Regarding your question left in the sandbox, {{wikipediapar}} is one of our cross-project linking templates, along with {{wikipedia}}. Most templates here start with a lowercase character. --Connel MacKenzie T C 19:52, 16 May 2006 (UTC)
- Thanks Connel. That's exactly what I was looking for. Cheers. Donama 06:58, 20 May 2006 (UTC)
. . . is what I call my local hardware store! (You can get anything you want there) SemperBlotto 07:06, 20 May 2006 (UTC)
- Heh...I was so focused on finding literary references (and my, there are a lot) that I forgot that most important line. Do you think you can fix it up some? --Connel MacKenzie T C 14:12, 20 May 2006 (UTC)
- OK. The format may need a slight tweak. SemperBlotto 14:21, 20 May 2006 (UTC)
BP revamp
Wiktionary talk:Beer parlour#Another idea. It just keeps bugging me. Anyway, what do you think? —Vildricianus 19:21, 20 May 2006 (UTC)
- I wonder...where would you be doing the transclusion? The same page that has the inputbox? --Connel MacKenzie T C 20:16, 20 May 2006 (UTC)
- No idea. Wiktionary:Beer parlour (all)? —Vildricianus 20:25, 20 May 2006 (UTC)
Javascript topic number 45,352
I've got a whole bunch of new javascript ripped from Wikipedia, so if you have a minute, perhaps you can find useful stuff there. It's working way better than before now. Cheers. —Vildricianus 17:55, 21 May 2006 (UTC)
- Dude, you are too quick for me. I'm preoccupied with four other major things right now, however, so I'll give you these as requests:
- 45,353: I haven't refactored my Monobook yet, to enable all the lovely buttons you have over the edit box.
- 45,354: I need a widget that parses the text of a page into ==Objects== with ===Child objects=== under them and ====grandchild objects====, etc. (a rough sketch of the parsing follows below). I meant to get to this after doing the TTBC stuff, but other things cropped up, and I'd guess that someone has already written such a thing.
- 45,355: I need more handlers for the Image: namespace (Commons links, Commons RFD links, etc.)
- 45,356: I never got around to the "change all wikilinks on page to: 1) edit, 2) history, 3) protect, 4) delete, 5) move, 6) watch" thing. Is that there?
- 45,357: I'd like something that adds interwiki links to all known Wikimedia projects (off by default, turned on by a click) from the current page. So, 100+ Wiktionary links, 200+ Wikipedia links, 50+ Wikisource links, etc.
- 45,358: I'd like to see a central area where *everyone* can load the Wiktionary version of pop-ups, so that all regular contributors have the Javascript "rollback" available.
- Back to parsing the XML dump for now... --Connel MacKenzie T C 18:16, 21 May 2006 (UTC)
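- For request 45,354 above, here is a rough sketch of the parsing idea in Python (purely illustrative; the function name and the tree-of-dictionaries shape are my own assumptions, not an existing widget, and a real gadget would presumably be written in JavaScript):

import re

def parse_headings(wikitext):
    # Build a nested tree from ==Heading== / ===Subheading=== lines.
    root = {"level": 1, "title": None, "children": []}
    stack = [root]
    for line in wikitext.splitlines():
        m = re.match(r"^(={2,6})\s*(.*?)\s*\1\s*$", line)
        if not m:
            continue
        node = {"level": len(m.group(1)), "title": m.group(2), "children": []}
        # Pop back up until we find the nearest shallower heading.
        while stack[-1]["level"] >= node["level"]:
            stack.pop()
        stack[-1]["children"].append(node)
        stack.append(node)
    return root

Feeding it the wikitext of an entry gives a tree whose top-level children are the ==Language== sections, with ===Part of speech=== and deeper headers nested under them.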
Porguguese
Your joke went over my head. Were you suggesting I'm not a native speaker of English? Rod (A. Smith) 23:22, 21 May 2006 (UTC)
- Sorry about the joke...I thought for some reason it was a Drago-ism. After saving, I looked at the history, and realized it was you, therefore a simple typo (considering it was correct two out of three times.) --Connel MacKenzie T C 23:24, 21 May 2006 (UTC)
- Ah, I see. Drago sure is prolific here. Does he know he's a 'bot? ;-) Rod (A. Smith) 03:42, 22 May 2006 (UTC)
Using javascript
Having come across cirrostratus, I'd like to remind you to have a thorough look at pages you edit using your javascript before saving. Ncik 21:09, 22 May 2006 (UTC)
- Good catch there Ncik. It's been thousands of edits since I missed one - glad to know you're still watching my edits closely...keeps me honest. :-) --Connel MacKenzie T C 23:13, 22 May 2006 (UTC)
Primetime? I don't see him in the undelete history. —Vildricianus 23:04, 22 May 2006 (UTC)
- Sockpuppet User:John Hill. Or did I goof? Perhaps another CheckUser request on meta: is in order. BTW, there is talk on Wikisource about trying to get CheckUser - and failing. Meta: has an odd requirement that there be at least 25 "support" votes in order to grant a CheckUser flag. I therefore don't think Wiktionary will get any checkusers this year. --Connel MacKenzie T C 23:11, 22 May 2006 (UTC)
- I'll reply with CheckUser results by e-mail. --Connel MacKenzie T C 18:49, 23 May 2006 (UTC)
- I heard about the Wikisource CheckUser troubles. Seems indeed likely to strike us, too - unless we can urge all of the active users to show some interest, for once.
- BTW, I don't think John Hill = Primetime. I could be wrong, but see User:John Hill and contributions. —Vildricianus 19:00, 23 May 2006 (UTC)
Random English
I'm now part of the replacing team: User:Vildricianus/replace.js. But that's not what I wanted to say; rather, you're using the XML dumps to make rnd-en-wikt work? Looking for the ==English== header? Any chance to make it main-namespace-only? There's an error ratio of ~0.5% due to ==English== headers appearing on user talk pages. Apart from that minor fact, it's flawless, if moderately slow. —Vildricianus 20:25, 24 May 2006 (UTC)
- I have a work-around to wait 1 second and retry when the Caché license limit error occurs. I will tweak it for NS:0, but I won't re-run it until the next XML dump (as I don't want to inadvertently reset the "Cleanup Random" thing until then...and I generate both lists at the same time.)
- Take a look at the last link on my toolserver page..."Some numbers..." (just below the cleanup link.) Not all of those languages are linked on my toolserver page yet. (Yes, sometimes, I'm just lazy, that way.) --Connel MacKenzie T C 23:56, 24 May 2006 (UTC)
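- For the main-namespace-only tweak, the filter itself is straightforward when walking the dump; a hedged sketch follows (the namespace-prefix list and the exact header match are assumptions, and the real run happens inside the Caché import rather than in Python):

import xml.etree.cElementTree as ET

# Title prefixes that mark non-main-namespace pages (abridged list).
NAMESPACES = ("Talk:", "User:", "User talk:", "Wiktionary:", "Wiktionary talk:",
              "Template:", "Template talk:", "Category:", "Category talk:",
              "Image:", "Image talk:", "MediaWiki:", "Help:",
              "Appendix:", "Index:", "Transwiki:", "WikiSaurus:")

def english_entries(dump_path):
    # Yield NS:0 titles whose wikitext contains an ==English== header.
    title = None
    for event, elem in ET.iterparse(dump_path):
        tag = elem.tag.rsplit("}", 1)[-1]   # strip the XML namespace prefix
        if tag == "title":
            title = elem.text
        elif tag == "text":
            text = elem.text or ""
            if title and not any(title.startswith(p) for p in NAMESPACES) \
               and "==English==" in text:
                yield title
        elif tag == "page":
            elem.clear()                    # release the bulk of this page's memory

That should drop the user talk pages responsible for the ~0.5% error ratio; main-namespace entries whose titles happen to contain a colon (rare) would still need a second look.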
rfc on "へ/compare に"
Hi Connel. Your "?" comment on your removal of "noinclude" from "へ/compare に" suggests you're wondering either why I excluded {{rfc}}
from the including pages or why I used an included page at all.
If you're wondering the former, I thought the RFC really just applies to the shared content instead of the including entries, but feel free to correct me if I misinterpreted the scope of that RFC. If you're wondering about the latter, it's a pretty poor attempt to show my suggestion for how to share content between related entries, similar to my oft-noted "theatre"/"theater" example. Whatever your question, I agree with your rfc, as the content is questionable for inclusion in individual dictionary entries. Rod (A. Smith) 23:35, 25 May 2006 (UTC)
- Actually, the answer to the mystery is much simpler: I messed up. I didn't realize (until discussing on the RFC page) that it was being used as if it were a template. Then the phone rang...and I forgot to go back and clean it up. Please move it to the template namespace. --Connel MacKenzie T C 23:40, 25 May 2006 (UTC)
- I don't want to be contrary, but that seems like a strange request. Is there some reason not to include main namespace pages? Rod (A. Smith) 23:56, 25 May 2006 (UTC)
- Umm, the same reasons color/colour use
{{color-colour (noun)}}
. The main reasons are: 1) the template namespace is less confusing when something is being used as a template; 2) cleanup activities focus on the main namespace (NS:0), which is why I keep running across へ/compare に; 3) items in NS:0 are searched from the [Go] button by default, but nothing else is; 4) the same principle that applies to Appendix:, Index:, Transwiki:, WikiSaurus:, etc.: things that aren't main-namespace-type things (with the notable exception of Main Page) don't belong in the main namespace.
- That said, I could be completely wrong about へ/compare に. Perhaps the discussion should return now to the RFC page? --Connel MacKenzie T C 00:20, 26 May 2006 (UTC)
Sanskrit
Hi Connel! I've noticed that there are quite a few transliterations in Sanskrit here on Wiktionary, but I'm not sure how they should be handled. Is there a transliteration policy for Sanskrit, or should both Devanagari and transliterations be added as entries? --Dijan 00:29, 26 May 2006 (UTC)
- Sorry, I have no idea. I'm assuming the cleanup entries I find were entered in good faith, but beyond basic syntax, I try not to touch them. --Connel MacKenzie T C 00:45, 26 May 2006 (UTC)
- The Sanskrit Wikipedia is written in Devanagari. I'm assuming Devanagari should be used to write entries in Sanskrit. Thanks. --Dijan 01:26, 26 May 2006 (UTC)
145,000
Hi Connel. I can't figure out when we passed 145,000. It could well be your CheatBot. SemperBlotto 07:24, 26 May 2006 (UTC)
- Yes, it certainly was, but for some reason, the first two hundred 'bot entries did not affect the article count. When I returned later and ran the rest of them, they did. But it was at 145,003 when I saw DG's entry had hit the milestone. {sigh} :-) There's always 150,000... --Connel MacKenzie T C 07:56, 26 May 2006 (UTC)
- When I had written that, I thought I was joking. natures it is, in honor of TheCheatBot entering plurals (and the other bots eventually entering all other forms.) --Connel MacKenzie T C 17:17, 5 June 2006 (UTC)
Latest language templates
I doubt you will, but if you ever get bored, here are the latest language templates to be subst:'ed, with the number of main-namespace uses (a possible command sketch follows after this thread):
- Special:Whatlinkshere/Template:sv 166
- Special:Whatlinkshere/Template:fr 132
- Special:Whatlinkshere/Template:io 119
- Special:Whatlinkshere/Template:fi 81
- Special:Whatlinkshere/Template:pt 70
- Special:Whatlinkshere/Template:it 52
- Special:Whatlinkshere/Template:lv 45
Yup, that's all that's left! (At least I think I didn't forget any). —Vildricianus 20:34, 26 May 2006 (UTC)
- OK, all done. --Connel MacKenzie T C 00:14, 29 May 2006 (UTC)
- Crazy! Thanks! —Vildricianus 16:15, 29 May 2006 (UTC)
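- For the record, each of these can be knocked out with the same sort of replace.py run used elsewhere on this page; a hedged example for the first one (assuming the calls are bare {{sv}} with no parameters - MediaWiki performs the substitution when the bot saves the page):

$ python replace.py -ref:Template:sv -regex "\{\{sv\}\}" "{{subst:sv}}"

The same command should work for the other six templates by swapping in the language code.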
God knows what happened but it started bugging all of a sudden. I managed to kick it out with your <includeonly></includeonly> trick. Cheers. —Vildricianus 21:28, 26 May 2006 (UTC)
- Heh. I usually bracket the ":" between the includeonlys. At least something, as I'm convinced there are circumstances where it will make a difference (if not in MediaWiki, then within my off-line template subst routines.) --Connel MacKenzie T C 22:56, 26 May 2006 (UTC)
Sorry dude, I didn't see you had reworked this thing already. I'm sure you like the new version as well. What do we do with the old one? Preserve it in a nostalgia archive? :-) —Vildricianus 22:27, 28 May 2006 (UTC)
- :-) Well done. Is there a place for me to add the Alphabetical list of topics back in there at some point? --Connel MacKenzie T C 22:31, 28 May 2006 (UTC)
- Ah yes, the headers. Say, you did/do that automatically, don't you? You may want to redo it perhaps: I reworked Wiktionary:Beer parlour archive/January-March 06 - it was inordinately large. —Vildricianus 23:16, 28 May 2006 (UTC)
- Yes, I did it semi-automatically. Where should I put it now, when I run it again? --Connel MacKenzie T C 23:43, 28 May 2006 (UTC)
- I've put it at the very bottom. —Vildricianus 22:12, 30 May 2006 (UTC)
TheCheatBot, subst'ing?
Looks like you subst:'ed {{plural of}}
in its latest additions? It produces things like this:
# <span class='use-with-mention'>Plural of <span class='mention'>[[knorr|{{{2|knorr}}}]]</span>.</span>
Are you sure that's ok? Looks quite magical for the average user (did I miss a mention or announcement of this? Sorry then.) —Vildricianus 13:29, 29 May 2006 (UTC)
- I'm sure that was an accident. Can Cheatbot go back and "de-subst"? Rod (A. Smith) 14:07, 29 May 2006 (UTC)
- Ugh. When Ncik threatened to start vandalizing
{{plural of}}
, I experimented with subst:ing the template on the generation pass. Later, when I ran the "201-to-end" entries, I forgot to undo that before running. What I was focusing on checking was the list of entries, trying to catch "incorrect" plural entries, not looking at the entry contents.
- The easiest way to "undo" it would be to delete those thousand-plus entries and re-run it. OTOH, cryptic or not, that part of the entry is not likely to be legitimately changed; it may be best to just leave them as they are?
- --Connel MacKenzie T C 17:27, 29 May 2006 (UTC)
- Mmm, perhaps not necessary to delete them. Subst'ing the template is also not necessary anymore - the template should anyway remain protected to prevent heavy server load. —Vildricianus 22:08, 30 May 2006 (UTC)
- If you had a file with a wikified list of pages that TheCheatBot has edited, we could feed it to replace.py. If you have no such list, maybe replace.py accepts "
-links:http://en.wiktionary.org/w/index.php?title=Special:Contributions&limit=2000&target=TheCheatBot
" as a parameter. In either case, the following would "unsubst"{{plural of}}
:replace.py -regex "# <span class='use-with-mention'>Plural of <span class='mention'>\[\[(?P<word>.*)\|\{\{\{2\|\g<word>}}}]]</span>\.</span>" "# {{plural of|\g<word>}}"
- If you have no list and "-link:http..." is invalid, ignore this suggestion. Rod (A. Smith) 23:20, 30 May 2006 (UTC)
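- As a fallback, if replace.py can't take that URL directly, a throwaway scrape of the contributions page can produce a wikified list for "-file"; a rough sketch (the HTML pattern is an assumption, so eyeball the output before feeding it to the bot):

import re, urllib

url = ("http://en.wiktionary.org/w/index.php?"
       "title=Special:Contributions&limit=2000&target=TheCheatBot")
html = urllib.urlopen(url).read()

# Pull linked page titles out of the contributions list; this will also catch a few
# interface links, so skip anything carrying a namespace prefix.
titles = set(re.findall(r'<a href="/wiki/[^"]*" title="([^"]+)"', html))
out = open("cheatbot_pages.txt", "w")
for t in sorted(titles):
    if ":" not in t:
        out.write("[[%s]]\n" % t)
out.close()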
- I have a partial list, as it got overwritten when I had to restart it for some unicode exceptions in the middle. But shuffling together a list from a raw Special:Contributions is an easy fallback, if the pywikipediabot code can't get the list itself.
- I'm trying to understand <word> in your syntax above, and failing miserably. Is that, one word or any multiple? Or do you mean I should substitute it in, time and again, for each word? --Connel MacKenzie T C 06:48, 2 June 2006 (UTC)
You shouldn't need to modify "word" at all. In short, "it should just work." :-) "word" is the name of a named capture group.
- (Brief diversion: You're probably familiar with numbered capture groups and backreferences, e.g. "(foo*)", "\0", and "\1". When you begin to use named capture groups, you'll find that you can make your pattern self-documenting to some extent. Once you know about named groups, it becomes much easier to read regex expressions that use well-named capture groups than it is to read the same expressions using numbered capture groups.)
Anyway, "(?P<word>.*)
" creates a named capture group with the name "word" and the regex pattern ".*
". Due to the "word" capture group's position, it matches a string preceded by "span class='mention'>[[" and follwed by "|{{{2|".
Specifically, the "word" capture group matches and captures "knorr" in the example above. Then, the backreference "\g<word>
" in the match pattern refers to the string captured, i.e. "knorr". It also refers to "knorr" in the replacement pattern.
In short, "it should just work." :-) Rod (A. Smith) 08:11, 2 June 2006 (UTC)
- Running now... --Connel MacKenzie T C 20:54, 5 June 2006 (UTC)
- Rerunning now (corrected.) The second <word> thing didn't work at all, so I replaced it with (.*) and it is blasting the entries correctly now. Someone please remind me to run it for "raw" entries (ones not created by TheCheatBot in the first place) after the next XML dump. --Connel MacKenzie T C 03:39, 6 June 2006 (UTC)
Hmm. There are so many ways to specify named groups, I must have chosen the wrong syntax. I thought the Python syntax was "(?P<groupName>...)
", but I guess not, oh well. BTW, I'm working on a Wiktionary-specific programatic editing framework and hope to expose it through I.E. plugins and web services. I guess I should say something about it on WT:GP, but not 'till there's actually something to show. Anyway, thanks for fixing the regex and running the cleanup. Rod (A. Smith) 05:57, 6 June 2006 (UTC)
Pentecost
You seem to have rfv-sensed the Shavout sense (I found it in category:RFV), but it isn't on the RFV page.
Anyway, having discovered that Shavout was a misspelling of Shavuot I found loads of books.google cites, ancient and modern, Jewish and Gentile, starting with the obvious (in hindsight) point that Pentecost is mentioned in the Bible before the Christian pentecostal miracle has occurred, so it appears to refer there to a Jewish festival. So I added some quotes, and did a whole lot more to other senses, including reordering historically. (Actually, it wasn't that simple -- I started out last night on a completely different track -- see History if you want a laugh.)
And then, only then (slap), I read the Talk page and found that we've been here before (though without any positive results that time -- not even noticing the spelling). Anyway, if you're happy with my cites, perhaps you would remove the tag, rather than listing the rfv.
I've also picked up that there are minor errors in Passover & Easter, each implying that they normally fall on the same day (which seems unlikely given that, in many years, the Eastern Orthodox churches can't even agree with the Western churches re the date for Easter). I'll check and amend soon, but no time now. --Enginear 06:43, 30 May 2006 (UTC)
- I'll suppress my laughter, and remove the rfv now! :-) --Connel MacKenzie T C 06:45, 30 May 2006 (UTC)
Yo CMK, on lenition there was Category: Experimental interwiki translation links, which I removed during my sporadic Special:Categories sweep. Did you know that we have a template Template:t that works similarly to what you had there? (I occasionally use it when I remember, and I believe Paul G does too.) --Newnoise (Shout louder) 11:29, 30 May 2006 (UTC)
- I was aware of several such experiments, but I know they received criticism, and none gained popular support. --Connel MacKenzie T C 14:21, 30 May 2006 (UTC)
More CSS
WT:CUSTOM#Change headers, last note. You wanted Courier new for spotting spurious usernames, didn't you? —Vildricianus 16:37, 30 May 2006 (UTC)
- Sweetness! Thank you. Actually, it may be more important to identify usernames than titles, but this works if I visit their userpage. (Can RecentChanges even be messed with?) --Connel MacKenzie T C 16:41, 30 May 2006 (UTC)
- The content of most Special pages is within the class special, so you could set
.special { font-family:courier new }
- but that will also change Wanted pages & co, and I don't know how useful it is anyway (doesn't work with "enhanced recent changes" btw). JavaScript could probably do more if you want all usernames displayed in Courier. You can always look at the source code of pages and see which parts are defined by a class. In CSS, classes or IDs are easy to mess with. Most other things are beyond me. —Vildricianus 17:28, 30 May 2006 (UTC)
- The content of most Special pages is withing the class special, so you could set
CommonsTicker
Hi - I have set up CommonsTicker and posted test entries on Wiktionary:CommonsTicker. The bot is not active yet; I just have to throw the switch. The ticker page is not "styled" yet; if you want, copy the required bits from meta:User:Duesentrieb/CommonsTicker#CSS to the global CSS of Wiktionary. There are a few options for CommonsTicker:
- newest on top (current) or oldest on top
- hide minor changes (overwriting own uploads, deletion of old versions; currently shown)
- append new (current) or update all - in append mode, old entries have to be archived/deleted manually; in update mode, old entries get removed automatically, and so do all annotations, etc.
Talk to me at de:Benutzer Diskussion:Duesentrieb or, better, catch me in IRC - I'll be there for another couple of hours; I'm there pretty much every day -- CommonsTicker 22:31, 30 May 2006 (UTC)
Spodding
Hi - just wondering why you deleted my entry on spodding?
- Unsigned comment from User:82.32.209.136
- I used the [rollback] feature hoping to uncover some valid content from previous entries. Since there is none, I tagged the remaining entry with
{{rfv}}
. --Connel MacKenzie T C 21:43, 31 May 2006 (UTC)
Would you mind me rewording the entire message? It's big, flashy and too pink to be the welcoming message I'd want to receive. Looks like we could do with some brevity here. —Vildricianus 21:45, 31 May 2006 (UTC)
- No. I tried a couple of times to solicit help rewording it, but each time it got longer! --Connel MacKenzie T C 22:13, 31 May 2006 (UTC)
- Please take a read now. Made it quite simple and similar to the standard welcome template. —Vildricianus 15:23, 2 June 2006 (UTC)
- How about pointing your python machine to subst all remnants of welcoming sprees? There are about 200. Could you do that or does that actually take up time? (I've tried, but I guess I'll never be able to set up a bot myself). — Vildricianus 22:07, 3 June 2006 (UTC)
- I wanted to do them manually to make sure they got the proper date stuffed in, instead of the default date. But since that mechanism has changed, that may no longer be possible. OK, I'll fire them off now. --Connel MacKenzie T C 22:12, 3 June 2006 (UTC)
- Fortunately we don't have barnstars over here, or someone would have to give it to you. — Vildricianus 22:22, 3 June 2006 (UTC)
- Probably Template:welcome needs the same treatment (390 transclusions). — Vildricianus 12:12, 4 June 2006 (UTC)