Wiktionary:Beer parlour/2009/June

This is an archive page that has been kept for historical purposes. The conversations on this page are no longer live.

June 2009

Term for “pertaining to”?

Does anyone know what the grammatical term, if any, is for “a word pertaining to a subject”, such as “verbal” for “pertaining to words”? (In English these are often formed by using a Latin term plus -al.)

This came up due to a quandary all too familiar to the lexicographically inclined: I couldn’t think of a word. Specifically, “pertaining to dance”. I was tongue-tied. Mortified. It was awful. (The word, of course, is terpsichorean.)

I ask not idly – think of the category we could make! And an appendix for the irregular forms beyond counting!

Your assistance, as ever, kindly appreciated in this matter, dear sirs and madams.

—Nils von Barth (nbarth) (talk) 22:57, 1 June 2009 (UTC)
I think the word you're looking for is adjective or adjectival. We already have a category for those. --EncycloPetey 13:27, 2 June 2009 (UTC)

I don't think so. I think (s)he's looking for a term for these sorts of (semantic) relationships:

noun            adjective
word(s)         verbal
dance           terpsichorean
sheep           ovine
nickname(s)     hypocoristic
star(s)         stellar

which would be a good word to have. If there isn't one already, maybe we should create one!

The relationship is kind of subtle; in the general case, the adjective can mean "of (a) ___", "pertaining to (a) ___", "being (a) ___", "resembling (a) ___", and maybe other things as well, but it seems like each specific noun–adjective pair has duked it out separately and come up with different rules for what the adjective is allowed to mean. :-P

RuakhTALK 15:04, 2 June 2009 (UTC)

"Derived adjectives" would give us 'rainy' from 'rain', 'childish' from 'child', etc. Or are we concerned with cases where the adjective uses a different root than the corresponding noun? Something like "suppletive adjectives" maybe? kwami 18:36, 2 June 2009 (UTC)
Yes, that seems to work. (Note though that "suppletive adjective" is often shorthand for "suppletive adjectival paradigm", such as good ~ better.)
David Beck (2006) Aspects of the theory of morphology, in the section "The typology of suppletion", p 421 example (9), has
Noun ~ corresponding denominal adjective (= relational adjective)
with examples like father ~ paternal, earth ~ terrestrial, sun ~ solar. Footnote 12 (p 461) then says,
This is a well known phenomenon: borrowed (here, Latinate) adjectives standing in suppletion relation to the native (Germanic) nouns. Compare these to the non-suppletive adjectives fatherly, earthly, sunny, etc. with more concrete meanings.
Though he never specifically calls them "suppletive adjectives", the fact he contrasts them to "non-suppletive adjectives" makes it pretty clear that's accidental. kwami 18:57, 2 June 2009 (UTC)[reply]
Ah, so "denominal adjective" or "relational adjective" must be the term we want. Thanks! :-)   —RuakhTALK 19:09, 2 June 2009 (UTC)
Depends on how narrow a category you want. A denominal adjective is any adjective formed from a noun root. "Fatherly" is a derived denominal, "paternal" is a suppletive denominal. "Relational" is an adjective used to classify a noun, not the word itself. So in "musical instrument", the adj. is relational, in "she's quite musical" it is not. kwami 20:51, 2 June 2009 (UTC)[reply]
Paul Georg Meyer (1997) Coming to know: studies in the lexical semantics and pragmatics of academic English also uses "suppletive adjective(s)" on p 130.
Found a snippet from an abstract posted online: "Although many linguists have referred to [collateral adjectives] (paternal, vernal) as 'suppletive' adjectives with respect to their base nouns (father, spring), the nature of ..."
As for the term "collateral adjective", I found the following: Tetsuya Koshiishi, "Collateral adjectives, Latinate vocabulary, and English morphology", Studia Anglica Posnaniensia, Jan 2002.[1] The intro explains the term:
The purpose of this paper is to study the nature of collateral adjectives and the Latinate vocabulary in English together with some morphological problems relating to them. English abounds in pairs like the following, in which adjective counterparts are difficult to relate in terms of their shape to the base nouns:
(1) spring — vernal; fall (American) — autumnal; dog — canine; wolf — lupine; arm — brachial; iron — ferric; father — paternal; day — diurnal; summer — aestival; winter — hibernal; cat — feline; horse — equine; heart — cardiac; ice — gelid, glacial; mother — maternal; church — ecclesiastical, ecclesial (2)
Pyles — Algeo (1970: 129) call the above adjectives collateral adjectives (CAs). According to their definition, CAs are "[adjectives which] are closely related in meaning but quite different in form from their corresponding nouns, like equine and horse". It seems that this terminology is strictly theirs, and I have not seen any other literature on word formation referring to it.
In the tradition of lexicography, on the other hand, the term CA was once used in the dictionaries published by Funk and Wagnalls in the 1950's. (3) Those dictionaries are peculiar in that they describe CAs in the entry of the base nouns. However, all of the dictionaries of that company are now out of print and hence we seldom see this term used in the lexicographic field as well.
He continues,
Although in its strict sense, vernal is not etymologically collateral to spring (i.e. no "same stock" can be assumed between them etymologically), I still refer to it as a CA because it is an adjective of Latinate origin used dissociatively.
That is, this author defines "collateral adjectives" as cognate, like father ~ paternal, but not pairs like sheep ~ ovine. But if you read the paper, that appears to be an assumption on the part of the author, due to the etymology of "collateral". Wikipedia at w:Collateral adjective would include sheep ~ ovine. If Funk and Wagnalls did as well, then we would have no reason not to ourselves.
Okay, it looks like F&W covered all such pairs, not just cognate pairs: "Another category in which the F&W prides itself are 'collateral adjectives', which are 'adjectival forms of the noun so remote in spelling that they may not be brought to mind' [p 7a] [arm .. n. 1. Anat. ... <> Collateral adjective: brachial.]."[2] Now, "brachial" is not cognate with "arm". Nor are other pairs I can see in Google snippet view (I don't have an F&W dictionary): dawn ~ auroral, heat ~ thermal, flood ~ diluvial. F&W was published at least up to 1984.
I think "suppletive adjective" and "collateral adjective" are probably equally good for sheep ~ ovine. "Denominal adjective" would be the term if you also want to cover "sheepish". kwami 21:38, 2 June 2009 (UTC)[reply]

Thank you all so kindly, particularly kwami!

The tasks now before us are these, I perceive:

  • A general category – say, Suppletives or Suppletion or Suppletive forms, in Category:Grammar.
  • A specific category therein called Suppletive denominal adjectives or Collateral adjectives, for such words as ovine.
  • Properly, a containing category, Suppletive adjectives, for such words as better, best, worse, worst (compared to good and bad, but not denominal).
  • For completeness, a category Derived denominal adjectives for such words as sheepish, and a category Denominal adjectives to contain this category and that of suppletive denominal adjectives.
  • Similarly, Suppletive verbs for such verbs as be, and Suppletive nouns for such nouns as people.
  • Move Wikipedia’s List of irregular English adjectives to Wiktionary, under Appendix:Irregular English adjectives or Appendix:English suppletive adjectives or the like.

So I ask you, O Beer Parlourians all:

  • Do these seem the right items (categories, tasks)?
  • How find you these nomenclatural propositions?
—Nils von Barth (nbarth) (talk) 00:22, 3 June 2009 (UTC)
I'd say that denominal adjectives are part of the lexicon, rather than part of the grammar.
Suppletive nouns may only make sense as suppletive plurals. That might be a clearer cat. name.
For clarity, we might want a more specific term for good ~ better than just "suppletive adjectives", since that term would cover father ~ paternal as well. Irregular adj. or suppletive adj. forms / paradigms would work. kwami 01:03, 3 June 2009 (UTC)[reply]
Ok, I’ve made a category at Category:Suppletion, and various subcategories, together with Category:Irregular inflections to contain it – hope it looks good!
Once List of irregular English adjectives gets transwikied, I’ll fix it up (and categorize the words therein) and we should be sorted – thanks!
—Nils von Barth (nbarth) (talk) 00:30, 5 June 2009 (UTC)
Moved discussion to WT:RFDO. msh210 04:56, 8 June 2009 (UTC)

POS prominence

There are frequently comments at WT:FEED to the effect that pages are "too cluttered" or that users "can't find the definition". While the nesting hierarchy of headers is pretty clear from the table of contents, it's not so clear just looking at the page itself, especially if there is a long etymology, several etymologies, or a lot of alternative spellings. Structuring entries so that the POS is the top-level header is, in my view, extremely undesirable. Therefore I wonder if we should consider ways in which the POS line(s) could be made to stand out more. Two options that suggest themselves are 1) they could have a differently-coloured or -shaded background, or 2) they could be accompanied by some sort of picture-style icon (in the manner of other Wiktionaries, only ours would ONLY use it for POS headers). Any thoughts? (Through PREFS, I have inflection templates display as coloured boxes where available, which virtually eliminates this problem; but that obviously isn't the default.) Ƿidsiþ 14:24, 3 June 2009 (UTC)

This seems like a good approach to an important goal. I've wondered whether we couldn't make all of the headers other than the language and PoS headers smaller by default, while retaining their structural role. Icons would be an additional attention-directing tool for new and occasional visitors.
Icons might be a good way to draw attention to important content concealed under show/hide boxes, such as long etymology and pronunciation sections. It would be handy for registered users to be able to suppress the icons to quieten down the visual impact. Horizontalizing alternative spellings and pronunciation would also help. DCDuring TALK 15:48, 3 June 2009 (UTC)[reply]
Just one comment: whatever we do we'd better make sure the section headers remain editable. The stupid uneditable sections at fr:wikt are the single major reason I don't touch that project if I can avoid it. Circeus 16:31, 3 June 2009 (UTC)
I like this idea, and believe that User:Hippietrail has some javascript that will divide the pages up into sections (allowing our current layout to be colourised). Making headings smaller may not help that much, as then what little structure there is becomes harder to follow, but it may be worth experimenting with. As Widsith says, structuring entries by POS is extremely undesirable, and dividing them by Etymology and Pronunciation is, in my opinion, even worse. I would strongly support a large one-time-effort to restructure our entries to make them useful and usable, for editors, readers and robots. Conrad.Irwin 16:36, 3 June 2009 (UTC) (Section moved after some auto-conflict resolution misplaced my comment)[reply]
On a semi-related note, I am particularly keen on an approach taken by http://www.wolframalpha.com/input?i=alphabet whereby they assign a short "word" to identify each definition. Even where the word isn't particularly appropriate, it gives some initial context which aids in understanding the definition, and it could be used to link to individual definitions easily. Conrad.Irwin 16:36, 3 June 2009 (UTC)[reply]

Someone pretty recently wrote (at WT:FEED) "the definition is sometimes not clearly marked and you need to look all over the page" and I'll copy my response here:

We get that complaint a lot. Well, more precisely, there seem to me to be a lot of people who write, here at WT:FEED, that they went to a page and couldn't find the definition. Two options that I can think of off the top of my head, and which have surely been considered already, are:
  • Make the definitions slightly larger (in font size) by means of CSS. This would require a single edit to whatever css file: body.ns-0 #bodyContent ol li{...}.
  • Instead of ==Language==, ===POS===, infl, #defn, ====Other stuff====, use ==Language==, ===POS===, infl, ====Definitions====, #defn, ====Other stuff====. This would, of course, require lots and lots of edits, though most of them would be bot-doable.
As I said, I have little doubt these have been suggested before, but I don't know where or when.

End-quote. I also like DCDuring's idea of making headers' font sizes smaller (except POS and language headers'), though.—msh210 16:30, 3 June 2009 (UTC)
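The second option in the quoted list (adding a Definitions header above the numbered definition lines) is described as mostly bot-doable. A minimal sketch of what such a bot pass might look like follows, written in Python. The function name `add_definitions_headers` is hypothetical, the header is placed one level below whatever header precedes the definitions, and real entries would need far more care (templates, nested lists, non-standard layouts) than this illustration handles.

```python
import re

# Matches a wiki header line such as "===Noun===" and captures its
# "=" run and its title.
_HEADER = re.compile(r"^(=+)\s*(.*?)\s*\1\s*$")


def add_definitions_headers(wikitext):
    """Insert a Definitions header before each run of "#" definition lines.

    Hypothetical illustration only: the header level is assumed to be one
    deeper than the most recent header, and sections that already sit
    directly under a Definitions header are left alone.
    """
    out = []
    level = 2                 # assumed level if no header has been seen yet
    has_defs_header = False   # True once this section has its header
    for line in wikitext.split("\n"):
        match = _HEADER.match(line)
        if match:
            level = len(match.group(1))
            has_defs_header = match.group(2) == "Definitions"
        elif line.startswith("#") and not has_defs_header:
            marks = "=" * (level + 1)
            out.append(marks + "Definitions" + marks)
            has_defs_header = True
        out.append(line)
    return "\n".join(out)


entry = "==English==\n===Noun===\n{{en-noun}}\n\n# A thing.\n\n====Synonyms====\n* item"
print(add_definitions_headers(entry))
```

Running the function a second time on its own output changes nothing, which is the property a bot would want before being let loose on many pages.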

Wiktionary:Layout woes shows some ideas, though many of them have not been discussed in detail. Conrad.Irwin 16:50, 3 June 2009 (UTC)
I hope that whatever we do does not require changing the structure or, worse, debating about changing the structure. Nor should it make headings uneditable. Do we have a test suite of a few different types of entries (fully compliant and not-fully compliant; with/without images; short/long) so that the consequences of different style-sheet decisions could be easily tested and seen?
The WolframAlpha thing depends on the existence of an appropriate synonym or context for each sense, which would itself be a good exercise. It would be interesting to test on something like "head". MWOnline uses bold synonyms in some of their definitions, but with a different layout, more like ours. Could we lead off a definition with a bold synonym? Perhaps either context or the bolded synonym could appear in ToC instead of some of the lower level headings. DCDuring TALK 17:06, 3 June 2009 (UTC)[reply]
  • Indeed I have a JavaScript extension User:Hippietrail/addstructure.js which I made at least a couple of years ago which wraps all sections in correctly nested divs with classNames generated from the headings. CSS styles can then be applied.
    It still works unchanged now. I think it may not work with cirwin's "paper view" though and it doesn't do the addloadhook startup stuff that became standard some time after I wrote it. Any of the technically inclined may like to play with it. — hippietrail 18:24, 3 June 2009 (UTC)[reply]
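The nesting step described above (wrapping each section in a container that holds all of its deeper subsections, with a class name taken from the heading) can be sketched independently of the browser. The Python below is not the code of User:Hippietrail/addstructure.js; it is a hedged illustration of the same idea, and the names `nest_sections` and `css_class_for` as well as the `sec-` class prefix are assumptions made for the example.

```python
def css_class_for(title):
    """Derive a class name from a heading title (naming scheme is assumed)."""
    return "sec-" + "-".join(title.lower().split())


def nest_sections(headings):
    """Turn a flat list of (level, title) headings into a nested tree.

    A stack holds the chain of currently open sections. A new heading
    closes every open section at the same or a deeper level, then becomes
    a child of whatever remains on top of the stack.
    """
    root = {"level": 0, "title": None, "css_class": "page", "children": []}
    stack = [root]
    for level, title in headings:
        if level < 1:
            raise ValueError("heading levels start at 1")
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {
            "level": level,
            "title": title,
            "css_class": css_class_for(title),
            "children": [],
        }
        stack[-1]["children"].append(node)
        stack.append(node)
    return root


tree = nest_sections([
    (2, "English"), (3, "Etymology"), (3, "Noun"), (4, "Derived terms"),
    (2, "French"), (3, "Noun"),
])
print([child["title"] for child in tree["children"]])
```

With a tree like this, a style rule keyed on the class of a part-of-speech section could shade or otherwise highlight that whole block without touching the headers themselves.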

Animations policy

At Commons, looking for an image suitable for barber pole, I found a gif animation, which is now inserted there. (There are perfectly fine still photos of barber poles at Commons.) Should we have a policy against animations? Arguably they detract from our overall serious (?, boring?) image. An animation that initiated only at the request of the user might be different. DCDuring TALK 16:14, 3 June 2009 (UTC)[reply]

No. Though now you've brought it up there are going to be endless arguments about exactly when animation should be used :p. In my opinion, almost any animation is fine, and most animation is good; the situations I would not want to see are where there are several animations clustered together on a page, or where the animation is not directly relevant to the topic (i.e. a video of someone running is not relevant to "leg") but as both of these also apply to static imagery I see no reason to differentiate. I strongly dislike the idea of policy, but I would have no problem with people creating Help:Illustration (or similar) that can give advice as to what looks tasteful. We don't currently have a forum for debating difference in opinion for page appearance, I would suggest that WT:RFC can perform this role, if and when such a disagreement arises. Conrad.Irwin 16:23, 3 June 2009 (UTC)[reply]
I agree with Conrad.—msh210 16:33, 3 June 2009 (UTC)
Me, too. BTW, if anyone wants to go nuts with this, entries like Geneva wheel and escapement — all the clock mechanisms, really — would be a good place to start. —RuakhTALK 17:21, 3 June 2009 (UTC)
Yeah, let's forbid animations, because god knows they're completely useless. — [ R·I·C ] opiaterein 17:25, 3 June 2009 (UTC)
Stroke order hadn't occurred to me. Still, shouldn't it be at user initiative? At barber pole, the animation makes a minimal contribution to the entry and is somewhat distracting to old codgers (sample of 1). Is it a good illustration for guideline purposes of minimum suggested value? DCDuring TALK 17:40, 3 June 2009 (UTC)[reply]
I for one find animations distracting. An animation that starts only after I click some button is okay with me. But I am not proposing anything like a ban on animations, cemented by a 75%-majority rule needed for its removal. I just think that in each case the pros of using an animation should be weighed against the con of the distraction effect. --Dan Polansky 18:16, 3 June 2009 (UTC)[reply]
Limited use of animation can be a very good thing. Besides stroke order, animation can be more enlightening than static images for (1) encoding Sign Languages, (2) illustrating actions (e.g. parkour), and (3) representing 3-D shapes (e.g. dodecahedron). --EncycloPetey 02:47, 4 June 2009 (UTC)[reply]
I'm nuts about animation. Look at drip. You couldn't do justice to the verb with a static image. bd2412 T 06:30, 4 June 2009 (UTC)
I do appreciate that animations can be useful. Still, I have difficulties concentrating on reading the definition when I see something moving in one screen corner, which may be untypical of an average reader though. In parkour, I would ideally like to have a button to pause the animation, calmly read the definition, and then run the animation again, which I currently cannot do. Having this option, we can combine the benefit of having an animation with the benefit of having a calm screen.
An alternative to animation for "parkour"
A still image can contain an animation of sorts, as shown in the image for "parkour":
I still think that, per individual case, if the job of the animation can be achieved using a still image, an animation should better be avoided.
Let us see what Wikipedia has on animations: W:Wikipedia:Image_use_policy#Animated_images:
Inline animations should be used sparingly; a static image with a link to the animation is preferred unless the animation has a very small file size.
My concern is not about size though; it is about distraction, and about lacking control for the user to disable this distraction. --Dan Polansky 08:33, 4 June 2009 (UTC)
I had up to now suppressed my itch to start ranting about the evolved visual systems of humans, which give increased priority to items that move, until I came across a guideline from Jakob Nielsen.
Let me quote:
"These days, tiger-avoidance is less of an issue, but anything that moves in your peripheral vision still dominates your awareness: it is very hard to, say, concentrate on reading text in the middle of a page if there is a spinning logo up in the corner."
--Dan Polansky 09:05, 4 June 2009 (UTC)
I've asked the person who provided the Commons barber-pole animation about whether user-initiated animation is feasible. I find that particular animation makes me ill. In the meantime I'm going to insert a photo. DCDuring TALK 12:13, 4 June 2009 (UTC)
“Sometimes moving” isn't even a defining characteristic of the term barber pole—there is no reason to substitute a movie for a good diagram in this entry. Michael Z. 2009-06-05 18:23 z
Dancing kitties would illustrate this dodecahedron even better.

Goodness, many of the hand-picked examples aren't even very useful, are they? What on Earth does jiggling a dodecahedron contribute to the definition of dodecahedron? And traditional illustrations of calligraphy are more effective than a letter redrawing itself.[3][4][5][6][7] Please keep in mind that a dictionary only needs to illustrate a thing with characteristics difficult to define simply (like fraktur type), not explain its technical (encyclopedic) details.

Why wait around to see the strokes?

But the worst thing is that an unstoppable moving image on the page makes it almost impossible for me to comprehend even a sentence. Drives me freaking nuts.

Gifs are the worst kind of movie file. Any animation's motion should be user-controlled, as required by accessibility standards,[8][9] or clicked through to, via a link. Michael Z. 2009-06-05 04:41 z

I don't feel that a rotating dodecahedron or barber's pole adds anything; it seems gratuitous and irritating (even though most modern browsers are capable of disabling animations per site). But the calligraphic stroke orders are good; these would have to be very large otherwise and full of directional arrows. Equinox 10:44, 5 June 2009 (UTC)[reply]
While we are at it: let's forbid anything three dimensional like dodecahedrons entirely, Equinox. After all if it does not conform to a proper 2D computer screen, it cannot possibly be real, can it? And let's reinstate the old Utah law that pi equals three. So much easier! Jcwf 19:11, 5 June 2009 (UTC)[reply]
Yeah, clearly my calmly stated opinion on animations means I want to exterminate the Jews. Thanks for clarifying. Equinox 19:40, 5 June 2009 (UTC)[reply]
P.S. Sorry to sound irritated, but then try to teach three-dimensional objects like molecules or unit cells to freshman chemists that are of the computer screen-flatland-generation, like I do. They are hopeless... And yes, they get annoyed at rotating tetrahedra and such, too. That is why I insist on showing them. Contrary to their strongly held belief they do live in a 3D world. So do you. Jcwf 19:11, 5 June 2009 (UTC)
I've taught geometry to high school students, and they don't interpret static "3-D" drawings as three-dimensional. Classes like woodshop, drafting, etc are gone from US high schools, and the result is that the only experience students have with "3-D" is in computer animation. If it isn't moving or physically present, then it isn't interpreted as three-dimensional. --EncycloPetey 22:15, 5 June 2009 (UTC)[reply]

After edit conflict: In light of the comment immediately above, the discussion may not have concluded.

  • I take the sense of this discussion (so far) to be:
    1. No desire for explicit policy or even guidelines.
    2. Acceptance of animation that contributes notably to the content of an important sense in the entry.
    3. A strong preference for user-initiated animation over other animation, where feasible.
    4. General user-oriented Web good practice provides useful guidance.
    5. A preference to have links to off-page (usually off-wikt) animations that might be low value, distracting, or not user-initiated.
    6. WT:RFC or WT:TR are the appropriate forums for reviewing animation content at this time.
    Is that about right? DCDuring TALK 19:21, 5 June 2009 (UTC)
Thank you. Michael Z. 2009-06-06 15:39 z
Have a look at Mills Mess; it's just about impossible to explain a juggling motion with words and still images. But I'll happily abide by a consensus decision. Mglovesfun 19:37, 5 June 2009 (UTC)
What about the potential for converting it to something user-initiated? It seems more valuable than the barber-pole animation. BTW, shouldn't the English entry be at Mills' mess, with "Mills Mess" being the German translation? DCDuring TALK 19:47, 5 June 2009 (UTC)[reply]
I would like to point out that .gifs are not binary; they have three states. A .gif image can be static, it can loop continuously, or it can play once and then remain static. An image which plays through once and is then static on a relevant frame is often very useful, and it can be replayed by reloading the page or by reloading the image. - [The]DaveRoss 19:50, 5 June 2009 (UTC)
Although the play-once solution isn't perfect, there are some cases where it would be good. Is that something that we can do without editing the gifs on Commons? What would be the required edits to the gifs or to our Image link to Commons? DCDuring TALK 19:58, 5 June 2009 (UTC)[reply]
Not only that, but if I understand correctly (I've never done this myself), you can actually set it to play n times, where n is any non-zero integer in the unsigned two-byte range (i.e., any integer 1–65535). And for JS-enabled users, I think we can create a "replay" link that basically removes and re-adds the image. (But I haven't actually tried this. And the n times thing would need to be specified within the gif itself on Commons, which means it's not easily tweakable.) —RuakhTALK 20:23, 5 June 2009 (UTC)[reply]
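For anyone wanting to see where that n lives in the file: animated gifs normally carry the repeat count in a "NETSCAPE2.0" application extension block as a two-byte little-endian unsigned integer, with the stored value 0 conventionally meaning "loop forever". The Python sketch below reads and rewrites that field on raw bytes. The function names are hypothetical, the byte search is deliberately naive (it assumes the marker sequence does not also occur inside image data), and whether a viewer treats the stored count as total plays or as additional repeats has varied, so the result should be checked in the target browsers.

```python
import struct

# Fixed leading bytes of the looping extension: extension introducer (0x21),
# application-extension label (0xFF), block size 11, the identifier
# "NETSCAPE2.0", sub-block size 3, sub-block id 1. The two bytes that
# follow hold the loop count.
_LOOP_PREFIX = b"\x21\xff\x0bNETSCAPE2.0\x03\x01"


def read_loop_count(gif):
    """Return the stored loop count, or None if no looping block is found."""
    pos = gif.find(_LOOP_PREFIX)
    start = pos + len(_LOOP_PREFIX)
    if pos == -1 or start + 2 > len(gif):
        return None
    (count,) = struct.unpack_from("<H", gif, start)
    return count


def set_loop_count(gif, count):
    """Return a copy of the gif bytes with the stored loop count replaced."""
    if not 0 <= count <= 0xFFFF:
        raise ValueError("loop count must fit in an unsigned two-byte field")
    pos = gif.find(_LOOP_PREFIX)
    start = pos + len(_LOOP_PREFIX)
    if pos == -1 or start + 2 > len(gif):
        raise ValueError("no NETSCAPE2.0 looping extension found")
    return gif[:start] + struct.pack("<H", count) + gif[start + 2:]


# A tiny hand-built byte string standing in for a real file (not a
# displayable image): header, screen descriptor, looping block, trailer.
sample = (b"GIF89a" + b"\x01\x00\x01\x00\x00\x00\x00"
          + _LOOP_PREFIX + b"\x00\x00" + b"\x00" + b"\x3b")
print(read_loop_count(sample), read_loop_count(set_loop_count(sample, 3)))
```

Because the count is baked into the file, changing it means uploading a modified copy of the image, which matches the point above that it is not easily tweakable from the entry.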
That would seem to be an approach that might work according to what little I've gleaned from WikiCommons Village pump. DCDuring TALK 10:36, 6 June 2009 (UTC)
Let's just try to replace gifs with real movie files—javascript workarounds will always be hacks. For the same quality, they can probably be as small as gifs or smaller, if the video format incorporates frame-differencing or whatever. A movie has the following properties that gif + javascript doesn't:
  1. Can meet web accessibility standards.
  2. Players are ready to embed today, rather than inventing a gif player.
  3. A movie can start in an initial state, without motion.
  4. A movie controller is a visual signal to the reader that they can start and control the movie.
  5. A movie can be started, stopped, looped an arbitrary number of times, scanned forwards and backwards, at any desired rate. This lets the reader actually analyze motion at their own pace, rather than relying on us guessing what won't drive them mental.
  6. A movie can support appropriate compression for animation or video imagery, at various frame rates and resolutions, incorporating sound, titles or captions, etc.
The question is how? Free and open .ogg, which lacks decent browser plugins, proprietary WMV or QuickTime, or Flash, which is ubiquitous but lacks free authoring tools? Michael Z. 2009-06-06 15:37 z
How about this: a default condition of "animation runs when user clicks a button to make it run", and stops running when the user clicks said button again; but with an option to set preferences to automatically run them, still having a button that lets the user stop it from running. bd2412 T 22:00, 5 June 2009 (UTC)[reply]
FYI, in most versions of FF and IE you can stop all gif animations by just pressing Esc on the page. Don't know about Opera or Webkit browsers.
JS code to switch between the static and animated versions would be great. One could even imagine a bot creating static (or single time animated) versions of existing animated gifs. --Bequw¢τ 18:33, 6 June 2009 (UTC)
So, if I understand, there might be a kludge using JS that generated an escape on loading the gif to stop it, and on mouseover or click-on reloaded the image (without another escape) to start it, generating another escape on another click or mouse-out to stop it. Thus mouse-over or click would activate the gif animation. To be useful this would have to work for anyone with Java enabled and we would have to hope that most users would have java enabled. I hope that the facts conform to our hopes. DCDuring TALK 18:58, 6 June 2009 (UTC)[reply]
JavaScript, not Java. At least it would be an improvement, but not ideal. Michael Z. 2009-06-08 03:21 z

How should we handle statutory/regulatory legal definitions? There are many terms (e.g. assault, gross income, solicitation, unfair competition) which are defined in statutes, which, in some cases, may vary from jurisdiction to jurisdiction, and others (e.g. ground beef, cream cheese) which have national "standards of identity" established by U.S. federal agencies. Should we include such definitions? bd2412 T 23:39, 5 June 2009 (UTC)

Significant issue. I think it may be a question of how rather than whether to include or refer to such statutory or regulatory definitions. Those "official definitions" may in some cases be the sole reason for keeping a compound entry (perhaps semisweet chocolate). I would like to avoid needlessly cluttering an entry with many definitions differing only in truly minor details. Indeed, I would like to exclude minor details from any definition, which might be a means to reduce the number of senses.
It would be a service if we could refer people to a source of appropriate jurisdiction-specific detailed definitions. But external links would themselves be a clutter. Would it make sense to have an "Appendix:External Links to Sources of Regulatory Definitions" with headings for the major categories (eg, Food, Air transportation, Pharmaceuticals) and have a single link to the appropriate section of the appendix? DCDuring TALK 00:47, 6 June 2009 (UTC)[reply]
We could appendicize them internally as well, which would actually be quite useful. I can think of a number of reasons why a legal practitioner might want to know the definition of battery or conspiracy according to the laws of each of the 50 states (especially when figuring out where to bring an action or whether case law of other states is useful). As for the statutory food-identity definitions, that's federal law so there is only one for the country, which should be included in our entries (or perhaps summarized with a link if it's one that's filled with bizarre and lengthy exceptions). bd2412 T 02:27, 6 June 2009 (UTC)
I don't see why we would want to have the full text. Our standard is to have one- or two-line glosses. Folks have repeatedly complained about longer "encyclopedic" definitions. If a full legal definition is longer, then it is not really dictionary material.
We are also not set up to maintain mirrors of text on other sites. I would think we would be happy to have live links.
The US is not the only English-speaking country whose regulatory and legal pronouncements are relevant. The UK, Canada, Oz, NZ, EU, India, Ireland, and several others may do so. In some cases provincial, state and even municipal laws govern.
In any event, collecting the links seems like a feasible first step, whatever subsequent steps we may be able to take with the vast increase in volunteers, technical resources, bandwidth, server capacity etc that are just around the corner. DCDuring TALK 03:55, 6 June 2009 (UTC)[reply]
The thing is, the legal definition can be exacting. For example, the U.S. Food & Drug Administration standard of identity for "ground beef" states: "(a) "Chopped Beef" or "Ground Beef" shall consist of chopped fresh and/or frozen beef with or without seasoning and without the addition of beef fat as such, shall not contain more than 30 percent fat, and shall not contain added water, phosphates, binders, or extenders. When beef cheek meat (trimmed beef cheeks) is used in the preparation of chopped or ground beef, the amount of such cheek meat shall be limited to 25 percent..." So, if any of the conditions of this definition are not met (say, the cheek meat is 26%, or it contains added phosphates), then the meat in question is legally not "ground beef". bd2412 T 04:49, 6 June 2009 (UTC)
I think that we could give the general definition, with a note mentioning that legal definitions also exist, and that readers interested should use the link to Wikipedia for more information. Legal definitions should not contradict the general definition, only be more precise. The existence of a legal definition seems to be a good reason to include a phrase, and should be added to CFI. But the length of the definition is not the issue: some definitions have to be long (e.g. you cannot define topological space in a few words, simplifying the mathematical definition would make it useless). Lmaltier 05:58, 6 June 2009 (UTC)[reply]
But our definition of topological space is completely wrong: a topological space is not a set for which a topology exists (that would make it a rather useless concept: a topology exists for every set), but rather a set-plus-topology. And I don't think that's just an incidental fact about one entry; in our zeal to cover the technical criteria for a topological space (which is arguably encyclopedic), we've failed in our more basic obligation to explain what sort of thing it is (which is more clearly dictionaric). (It's not impossible to do both — in this case, we can replace our current definition of topological space with a brief and correct definition in terms of topology, and move the formal criteria to [[topology]] — but I think it's difficult and risky, so it's worth contemplating whether we really want to do both.) —RuakhTALK 14:43, 6 June 2009 (UTC)[reply]
(Actually, I shouldn't say it's completely wrong: mathematicians will frequently speak of "a topological space X", where X is actually the set, and the specific topology of interest is either obvious, irrelevant, or mentioned elsewhere in the surrounding context. But our definition doesn't even cover that use accurately, since the mere existence of a possible topology is not enough to permit this sort of locution.) —RuakhTALK 14:50, 6 June 2009 (UTC)[reply]
I agree with you, but I feel that the definition should have been only slightly changed. As it is now, it is not very useful, much less useful than the previous one (there are several definitions in topology, which one is intended is not clear at all). Lmaltier 14:12, 7 June 2009 (UTC)[reply]
The legal definition is not necessarily merely "more precise" - sometimes statutes provide a definition that is counterintuitive to the common definition. For example, burglary for the longest time could, by definition (at common law) only be committed at night (if it was during the day, it simply was not burglary). In fact, the common law definition of burglary was "the breaking and entering the house of another in the night time, with intent to commit a felony therein, whether the felony be actually committed or not", which required all kinds of case law to strain out the niceties of what constituted a "dwelling" and what constituted "breaking". Every U.S. state has since modified this definition by statute, said statutes modifying the common law by specifying structures other than a "dwelling", or times other than "at night", or not requiring a "breaking" but merely an illegal entry. At the very least, we should probably have the common law definition for common law crimes/torts, and the definitions used in model codes and uniform acts if they have been adopted by multiple states. With respect to the FDA "standards of identity" for food items, I think there are only about 500 of them, and I think they should all be included. There are some other situations where items in commerce are defined by statute, particularly in international treaties, where the definitions would be useful for us to have as well. bd2412 T 04:01, 7 June 2009 (UTC)[reply]
What can we do now to see what would prove useful and feasible? What should we do with semisweet chocolate and ground beef? They seem to be interesting cases. The US national regulatory definitions are good places to start. Should we put the entire US FDA regulatory definitions on the respective Citations pages? I would be wary of putting it on the main page, where it might drive some users away from Wiktionary. What else should we do with these two? Is anyone inclined to search out other Anglophone regulatory definitions? Should any corresponding Francophone regulations for "viande hachée" get parallel treatment, even though the defining regulations are in French? That would conflict with the idea of all definitions here being in English. It seems to me to argue for treating the full regulatory definition as a citation rather than a definition. DCDuring TALK 16:19, 7 June 2009 (UTC)[reply]
I think we should have the definitions, for different jurisdictions, of words that are defined by statute or regulation (though not, I think, those defined by common law: that's too difficult, I think). Yes, [[assault]] may have a lot of definitions, but, hey, that's what subsenses are for  :-) . This would be a great boon for anyone who needs to know the legal definition. The big problem imo is how to include a sense which the law defines using three paragraphs of the Uniform Commercial Code along with several annotations. Or, to use bd2412's example above, how to include a U.S.-legal definition of ground beef. The ideal would be to summarize the law briefly; for ground beef, perhaps "chopped beef without fillers, without added fat, with fat content at most 30% of the whole, and with cheek-meat content at most 25%". Where this is not possible because the law is too detailed, perhaps leave off some detail — in fact, I did that in my proposed definition of ground beef by not specifying what's meant by fillers, but it can be done to a greater extent, so that the U.S.-legal definition of bockwurst can be reduced to perhaps "uncured, comminuted food comprising at least 70% meat, some of which is pork, and containing also eggs, liquid, onions (or the like), and sometimes other ingredients, as specified in 9 CFR 319.281". (To reply to a question, above, of DCDuring, yes, we should also include France's regulatory definitions of French terms, but in English translation, of course, like all our definitions.) And any such legal-type definition must specify that it is such, as by a context tag.—msh210 04:45, 8 June 2009 (UTC)[reply]
I think as long as the definition indicates that the food must be "with specified quantities" of certain ingredients, we don't have to say exactly what those quantities are; we can leave a link to the FDA page which hosts their definition. Regarding concepts which will have multiple jurisdictional definitions, it won't actually be that much. Many states either adopt a uniform act or simply copy one another. As for other countries, as raised by DCDuring above, I suppose we'll have to have the French limitations translated for the page on "viande hachée", but I'd be surprised if the French government itself has not released official translations for importers (and if other countries have not also generally done so). bd2412 T 05:46, 8 June 2009 (UTC)[reply]
By the way, the entire FDA catalog of standards of identity for foods is available here, from section 131 forward. bd2412 T 06:54, 8 June 2009 (UTC)[reply]
So how do we decide which specific bits of law, from which countries, are worthy of entries? Perhaps there is New Zealand legislation whereby a "sheepfold" only counts if it has more than 30 sheep in it, and no NZer has heard of it, except lawyers. There are thousands of acts and contracts and case histories, and almost all of them define terms. Equinox 00:38, 9 June 2009 (UTC)[reply]
Perhaps there's a usage restricted to the local slang. I don't see how this would be much different (you'd be surprised how many legal definitions are just codifications of what is essentially slang as used by lawyers). As for judicial opinions, judges interpret the law, and to the extent that they define legal terms, they are either doing so pursuant to the statute, or pursuant to a long line of cases which reflect variations of the same basic definition. bd2412 T 22:33, 10 June 2009 (UTC)[reply]
I can see that having all possible legal definitions will never work in theory, but it might well work in practice. Most of the possible definitions will come from a small number of jurisdictions, mostly Anglophone. Few will display the energy to provide properly formatted material and veteran contributors may selectively withhold help. The result will be a small number of entries, perhaps with a high degree of topical relevance to some news story.
"Veteran contributors may selectively withhold help"! This is the kind of pragmatic say-what-you-meanism that would spark a hysterical committee investigation on Wikipedia. And you're probably right. Equinox 23:57, 10 June 2009 (UTC)[reply]
At Wikipedia can they force volunteers to do what the volunteers think is not good for Wikipedia? I wasn't talking about a conspiracy to suppress anything. If some veteran decided to help someone to do something I thought was bad, I might get cranky and I might engage in angry forum discussions and e-mail exchanges. Period. Seriously, I just don't think that we are likely to be overwhelmed with poor quality legal content. Even some really good ideas die because of lack of support. Bad ideas mostly have less of a chance. This is why I don't think we should make a point of eliminating the Shorthand header. It is a useful reminder. DCDuring TALK 00:34, 11 June 2009 (UTC)[reply]
No one is talking about forcing anyone to do anything here. But if editors (such as myself) plan to add statutory or regulatory definitions, I'd like to know that that's considered appropriate. bd2412 T 01:16, 11 June 2009 (UTC)[reply]
That was just at Wikipedia. I'm sure some way of doing fuller legal/regulatory entries would be good in most people's opinion. Let's try to do a couple of them and see what people like and how hard it is to do them well. Whatever it is it will be better than arguing in the abstract. "Semisweet chocolate" and "ground beef" seem like good places to start, but anything of sufficient interest to someone to get them to do the work would be fine too, don't you think? For example, I might take a run at deepening "Value at Risk", "VaR", the bank regulatory concept with its official definition, if I can. Or the prudent man rule. Or cap and trade. We could go back and improve skim milk, non-fat milk, 1% milk, 2% milk, light cream, heavy cream, half-and-half. We could define a frankfurter in a regulatory sense, if anyone has the stomach for it. DCDuring TALK 02:04, 11 June 2009 (UTC)[reply]

Here is what I would propose for milk, by way of example:

  1. {{context|US|legal|lang=und}} Under the standard of identity established by U.S. Food and Drug Administration,[10] the lacteal secretion, practically free from colostrum, obtained by the complete milking of one or more healthy cows, and including the addition of limited amounts of vitamin A, vitamin D, and other carriers or flavoring ingredients identified as safe and suitable.

We could create a category for FDA definitions, as well, which would inherently indicate that the context is U.S. and legal. bd2412 T 19:23, 11 June 2009 (UTC)[reply]

So, then, you propose listing this as a separate definitional sense, with its own Translations table and Synonyms? I see some rather odd consequences of this sort of format. --EncycloPetey 19:30, 11 June 2009 (UTC)[reply]
There wouldn't be a point to having a translation table or synonyms, as no other language will have a word that uniquely corresponds to the U.S. legal definition of milk, and no synonyms at all since each legal definition is unique (ok, there are some exceptions, but they are minor). However, it is key to remember that if you sell cow juice in the U.S. and you fortify it with vitamin C instead of vitamins A or D, you can not legally call it milk. That's the point of having a legal definition at all. bd2412 T 19:57, 11 June 2009 (UTC)[reply]
I wonder if any countries effectively use the FDA's standards, but provide translations into a language of their own. In principle, that could lead to an exact correspondence between regulatory definitions. Does it make sense for Spanish leche to repeat the exact words and contexts as this sense of "milk"? It would then, of course, be a translation. Other languages spoken in the US would seem to warrant the same treatment.— This unsigned comment was added by DCDuring (talk • contribs).
If another country, say Chile, has the same definition for leche as we do for milk then I can see the translation listed with a Chile qualifier. I wonder how often that would happen, though. msh210 21:14, 11 June 2009 (UTC)[reply]
Is Spanish-language labeling and advertising of milk sold in the US exempt from FDA labeling and advertising requirements? I think not. So leche in a US TV advert or newspaper circular should have the same restricted meaning as bd2412's definition of milk. The same might apply to lait sold in the far northern reaches of New England where the population is partially Francophone and may get milk with Canadian bilingual labeling. Hawaii and some US possessions might also have issues of this sort in other languages.
Of course, we could decide that this has nothing to do with consumers, that it is only a jargon of lawyers, regulators and regulatees, so only the official language of the courts and the regulators matters. DCDuring TALK 23:13, 11 June 2009 (UTC)[reply]
According to the FDA Compliance Manual, all food sold in the U.S. must be labeled in English, with the exception of Puerto Rico (where it can be labeled in Spanish or both languages). I don't think there is any place in the U.S. where you can legally sell food that does not bear an English identification of the food being sold (although the law may be laxly enforced in some places). bd2412 T 01:01, 12 June 2009 (UTC)[reply]
As far as the FDA is concerned, all the regs specify is that if you use a foreign language, you have to include the same information, but they don't bother with specifying what the translations into Spanish or any other language would be. — Carolina wren discussió 01:27, 12 June 2009 (UTC)[reply]
Still, I think you could get in trouble selling something that does not qualify under FDA standards for milk, and calling it leche. bd2412 T 01:46, 12 June 2009 (UTC)[reply]
On one side it says "Milk"; on another it says "Leche". The issue is that the FDA standards undoubtedly are intended to protect consumers who do not read or speak English as well as those who do. So the word leche when used in an ad or on a package for milk sold in the US must mean the same thing as "milk". Incidentally, I regularly go to stores that have merchandise that bears labels in English and another language. That Spanish alone can appear on food labels in Puerto Rico is notable. DCDuring TALK 02:01, 12 June 2009 (UTC)[reply]
Now we're talkin'. Including some form of FDA and standard of identity among the contexts (after the first) should lead to the creation of one or more categories once there are enough entries to populate them. Putting them in the context would also shorten the definition, which, at three lines, is long. I think our format would say the link should appear under a Notes section which means it needs to appear between <ref> tags. DCDuring TALK 21:09, 11 June 2009 (UTC)[reply]
I agree: now we're talking; and I agree: the context should be in a context tag, not in the definition proper. That said, I don't see why "FDA" needs to be in the tag: the fact that it's a legal standard of identity in the U.S. is what's contextual; which agency defined it is, if anything, etymological. And I agree: the link should not be in the definition: perhaps in References or Etymology. msh210 21:14, 11 June 2009 (UTC)[reply]
I merely thought "FDA" would be helpful to the user by providing some brief context that they are more likely to be familiar with than "standards of identity". DCDuring TALK 02:01, 12 June 2009 (UTC)[reply]

Here would be the legal definition of artificially sweetened canned figs:

  1. {{context|US|legal|lang=und}} Under the standard of identity established by U.S. Food and Drug Administration,[11] mature figs of the light or dark varieties packaged in water artificially sweetened with saccharin, sodium saccharin, or a combination of both, having a specified density, to which lemon juice, concentrated lemon juice or organic acids are added as necessary to reduce the pH of the finished product to pH 4.9 or below, and optionally containing any combination of natural and artificial flavoring, spice, vinegar, unpeeled segments of citrus fruits, and salt.

I realize that this sounds like a "sum of parts" case (figs, which are canned, and which are sweetened, and the sweetener is artificial) but if you try canning your own figs with a different artificial sweetener, or other ingredients that deviate from the FDA definition, or your pH is above the limit, then you will be admonished for selling a "misbranded" product (i.e., the FDA will say that your use of the phrase "artificially sweetened canned figs" does not correctly identify your product, and is in violation of the law, and your cans of figs will be confiscated and destroyed). bd2412 T 20:29, 11 June 2009 (UTC)[reply]

The reason that the FDA doesn't specify a generic artificially sweetened for canned fruit in general seems to be that only certain canned fruits have been requested to be labeled artificially sweetened. The actual regulations support using a SoP construction. Save for the base product and the referred to base regulation, the exact same verbiage is used in defining artificially sweetened for apricots, cherries, figs, fruit cocktail, peaches, pears, and pineapple, but no standards for artificially sweetened applesauce, berries, plums, or prunes are defined. — Carolina wren discussió 01:34, 12 June 2009 (UTC)[reply]
If we can be reasonably certain that the FDA uses "artificially sweetened" the same way every time (e.g. "sweetened with saccharin, sodium saccharin, or a combination of both"), then we could simply have that definition of the phrase, and spare the multitude of more SOP-like possibilities (but see nonstandardized breaded composite shrimp units and fried clams made from minced clams - "The common or usual name of the food product that resembles and is of the same composition as fried clams, except that it is composed of comminuted clams, shall be fried clams made from minced clams"). bd2412 T 07:22, 12 June 2009 (UTC)[reply]

Okay, pursuant to the above I've created Template:standard of identity and Category:Standards of identity, and added corresponding definitions to milk and ground beef. How does it look? bd2412 T 04:12, 13 June 2009 (UTC)[reply]

The template needs help from someone who knows how to make a context template (i.e., not me). I like the done entries, but the context tag should specify the country. And why is milk rfd'ed? msh210 22:17, 14 June 2009 (UTC)[reply]
How does this work for non-food terms? For example, some states' codes, following an old version of the UCC (section 2-319 et seq.), define the terms "F.O.B.", "F.A.S.", "C.I.F.", "C.&F.", "ex-ship", etc. msh210 22:17, 14 June 2009 (UTC)[reply]
And what about wine appellations? msh210 19:54, 16 June 2009 (UTC)[reply]

Links to next and previous entry per language

I'm releasing a new JavaScript extension I've been working on for testing and comments.

Please add this to your monobook.js

importScript('User:Hippietrail/nearbypages.js');

Then clear your cache: hold the control key and click reload on most browsers.

Links are currently added in two places.

  1. At the left between the "navigation" box and the "toolbox" will be added one box for each language in the current page. Each is named after the relevant language or namespace. Or "browse" when it can't find any language headings in the page such as in edit mode or on a nonexisting page.
    ◄ links to the previous page and ► to the next page. Between them is a link to the current page in bold or in red if it doesn't exist.
  2. Below each language heading in the page. These have only the ◄ and ► buttons.

When in a namespace other than the entry/definition namespace only the navigation column links appear.

The links come from the latest dump file, not from the database. This means for new changes the links may be wrong. Currently there are new dump files released every 4 or 5 days.

Correct alphabetical order is used for all languages for which there is a locale on the Toolserver. For all others the fallback is currently "en_US.utf8".

There is no specific support for languages with more than one script yet.

In all cases a better sorting algorithm is used than elsewhere on Wiktionary such as Category pages. Basically punctuation, capitalization, and spaces between words in a term are treated as secondary with primary attention paid just to the letters themselves.
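A two-level ordering of this kind can be sketched as follows. This is a minimal illustration, not the actual code behind nearbypages.js or the Toolserver back end: the helper names `primaryKey` and `compareTerms` are made up for the example, and it ignores locale-specific alphabetical order entirely.

```javascript
// Sketch only: hypothetical helpers, not the extension's real implementation.

// Primary key: keep letters and digits only, lowercased, so that
// punctuation, capitalization, and spaces do not affect the main ordering.
function primaryKey(term) {
  return term.toLowerCase().replace(/[^\p{L}\p{N}]/gu, '');
}

// Compare by primary key first; use the raw string only to break ties.
function compareTerms(a, b) {
  const ka = primaryKey(a);
  const kb = primaryKey(b);
  if (ka !== kb) return ka < kb ? -1 : 1;
  if (a === b) return 0;
  return a < b ? -1 : 1;
}

const terms = ['ice cream', 'Ice', 'ice-cold', 'iced', 'ice'];
const sorted = [...terms].sort(compareTerms);
// → ['Ice', 'ice', 'ice-cold', 'ice cream', 'iced']
```

A plain code-point sort would place "ice cream" before "ice-cold" because a space sorts before a hyphen; with the primary key, "icecold" precedes "icecream" on the letters alone.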

I'd like to hear who prefers just the links in the left column, just the links below each language heading, or both.

Note that namespaces and redirects are treated much as languages too but always with the default sorting sequence. American English for now.

Note that some namespaces are not in the normal dump files and as such will not get the links. This includes talk pages and user pages.

Cirwin has suggested it should be put in common.js - if you have rights to make such a change and think it's a good idea then please be bold.

All feedback appreciated. — hippietrail 03:52, 6 June 2009 (UTC)[reply]

Thank you for doing this, the functionality is great and very helpful. I tested it in Firefox v3.0.10:
  • Occasionally, after a few navigation clicks, the previous/next words will no longer appear, I have to refresh the cache again.
  • Sometimes it loads slowly, it takes 2-3 seconds after the rest of the page already loaded.
  • I am not sure if the left navigator solution is needed.
  • Formatting under the language header: Can it be one line? Conrad had a good example for formatting.
  • The ◄ and ► images are used for audio link in the index. Could they be replaced by ← and → ?
  • What other browser version and type will this work? --Panda10 12:10, 6 June 2009 (UTC)[reply]
  • Can you add linking to the specific language on a multi-language page? For example, a is a multi-language page. When I go to it from a Hungarian entry, can it jump to the Hungarian section? --Panda10 12:14, 6 June 2009 (UTC)[reply]
I like it (more so than some other linking projects like the ranking boxes). And I think it would be very helpful when you want to look for similar words, or just for making wikt seem more like a bound dictionary. A couple things,
  • I have the interwiki links showing below the language header and sometimes the vertical spacing is too little and your links run over the other ones a bit (rendering on Google Chrome on MS Vista).
  • I prefer just the in-entry links. The navigation links are a bit long (it makes the presentation more obvious to show the current entry title between the previous and next ones, but it makes them longer). They get even longer still on long page names (see pneumoultramicroscopicossilicovulcanoconiótico). The left-pane navigation turns into 5 lines because the arrow & word (for both the prev & next entries) don't fit on the same line. Worse, the whole words don't even show anyways! Smart abbreviation might help, but I'm not sure about the consistency of left-pane widths.
  • I'd be much more tempted to use it consistently if the in-entry links were consolidated onto one line, though you might have some technical limitations. Maybe they could even be on the same line as the interwiki link, though that might be confusing.
  • Could you add a link to the Index pages when they exist (now that Conrad has them more updated) to the in-entry links? Maybe something like (prev - Index - next)?
  • On a, I was startled to see previous links starting with 'z'. While wrap-around might be interesting (especially to get to the 'last' word in the index) it might be unintuitive for others.
Thanks for the great work. --Bequw¢τ 18:24, 6 June 2009 (UTC)[reply]

First a few quick responses. I might respond at more length soon:

  • On links not appearing or being slow to appear, that's most likely to be when the Toolserver is under heavy load. The Toolserver often slows down due to the many and varied tasks it does. It shouldn't be anything to do with the cache. It does seem much less reliable on MSIE for me though.
  • The left navigator solution may have more appeal soon when I increase the number of previous and next links. It is closest to how other online dictionaries such as Encarta do it. But yes both are still experimental for now.
  • I am thinking of ways to format on one line. You may see it soon.
  • I could use ← and → but on the computers I use they don't stand out well. I haven't seen the other places where ◄ and ► are used.
  • I have tested the feature on Firefox 3.0/3.1/3.5 Google Chrome 2 Opera 9/10 and Safari Windows. I have only tested on Windows, not on Mac or *nix. I did some work today to make sure it works on Internet Explorer but it seemed sketchy. I'd like more feedback on this.
  • I have now added linking to the correct language section of the page.
  • I personally really dislike the Gutenberg ranking sections. I think that data is very interesting but definitely doesn't deserve to take up so much space right at the top of articles.
  • I haven't yet tried the interwiki links under the language headings. That's another JavaScript extension by cirwin I think. I'll look into it soon.
  • I'll look into adding a link to the Index pages too though it might be tricky to link to the specific page. I think this shows that there are several concepts which are language specific which would benefit from being done in a consistent manner somewhere near the language headings: random, next/previous, indexes, interwikis.
  • The problem you saw on a where some entries wrapped around to "z" is now fixed. There were several minor problems with both the back end and front end that I knew were there but which I ironed out today.
  • Thanks for your comments and please refresh your caches to check on the problems that are now fixed. — hippietrail 09:59, 7 June 2009 (UTC)[reply]
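The two fixes mentioned above (no wrap-around at the ends of the list, and linking to the language section of the target page) can be sketched roughly like this. The function names `neighbors` and `sectionLink` are hypothetical, the `/wiki/` path prefix is an assumption about the site's URL layout, and the real extension gets its data from the Toolserver rather than from an in-memory array.

```javascript
// Sketch only: hypothetical helpers, not the code in nearbypages.js.

// Given the sorted titles for one language, return the previous and next
// titles, or null at either end of the list (no wrap-around).
function neighbors(sortedTitles, current) {
  const i = sortedTitles.indexOf(current);
  if (i === -1) return { prev: null, next: null };
  return {
    prev: i > 0 ? sortedTitles[i - 1] : null,
    next: i < sortedTitles.length - 1 ? sortedTitles[i + 1] : null,
  };
}

// Build a link that jumps to the given language section of a page.
// Assumes section anchors are the heading text with spaces as underscores.
function sectionLink(title, language) {
  const page = encodeURIComponent(title.replace(/ /g, '_'));
  return '/wiki/' + page + '#' + language.replace(/ /g, '_');
}

const hungarian = ['a', 'ab', 'abba'];
const first = neighbors(hungarian, 'a');
const link = sectionLink('a', 'Hungarian');
```

With this list, the first entry has no previous link at all, where a wrap-around version would have pointed at the last entry in the list.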

OK I've got the extension working with MSIE now and I've also listened to your opinions and changed my beautiful arrow symbols to the ugly ones you all prefer (-: — hippietrail 00:27, 8 June 2009 (UTC)[reply]

Just a couple of comments on the current version:
  • I like it.
  • I like the in-entry version much better than the sidebar one. One result of the sidebar version is that stuff in the sidebar will be at variable height (depending, roughly, on how many languages we have a word defined in), which makes me have to search more for the link I want in the sidebar: a bad thing.
  • There was some talk of including User:Conrad.Irwin/iwiki.js into the default .js. (It's currently a PREF.) In fact, I seem to recall — though perhaps incorrectly — that that was the plan. Although not strictly relevant to this discussion, I'd like to take this opportunity to urge that.
  • I think that for those with iwiki.js and nearbypages.js, they should be integrated so as to appear on one line. We currently have a <div> for the preceding word, a div for the succeeding, and a div for iw; perhaps these can be spans instead, or divs with style set so they float next to each other?
  • Even if the iwiki.js and nearbypages.js links can't (or won't) be on one line, at least the two nearbypages.js links should be.
  • In the in-entry version, the font size can (and should I think) be smaller, à la iwiki.js.
msh210 04:13, 8 June 2009 (UTC)[reply]
I haven't much to say beyond what is mentioned above. Integration of iwiki.js and a link to the index would be cool where the index is well-constructed (I can create a list of which words are where for the indices I generate, American Sign language might be a problem). And you should exclude form-of entries from the index (I can give you the code I use for that for the indices if I haven't already). My idea for formatting would be something similar to what follows, with perhaps more efficient wording for the central section - this would allow the adding of a few more links if we suddenly decided we needed them, though I can't imagine what for (perhaps for Wikipedia links - would be nicer than the floating boxes by a long shot). Conrad.Irwin 15:34, 8 June 2009 (UTC)[reply]
Hungarian


  • I have finally gotten around to finishing the next version for testing. There are now more than one previous and next term. Comments and suggestions are again appreciated. — hippietrail 15:50, 11 June 2009 (UTC)[reply]
I like it. The new arrow looks good, too. Thanks. --Panda10 22:44, 11 June 2009 (UTC)[reply]
Improvement, a much better look! A couple things, (1) I lost the sub-lang-heading iwiki link (was this by design?)found them, and (2) on entries created recently, the entry name in the list on the left is red (hinting the entry doesn't exist, when it does). Keep up the good work. Are you going to put it in WT:PREFS? --Bequw¢τ 00:59, 12 June 2009 (UTC)[reply]
  • I haven't done anything to interact with cirwin's iwiki script. I still see it on my machine and it still doesn't match of course. Getting two javascript extensions to interact is a bit tricky so I won't attempt it until it's clear that nearbypages won't change much more. I also made my own version of the iwiki thing as an experiment which checks all the relevant wikis live but it seems that the interwicket bot doesn't miss a beat so there's no gain.
  • Yes the red link in the left is on purpose though it looks like an edit link it always displays the page rather than editing it. This is because the links are generated from the latest XML dump so can be up to five days stale and a red link on the left is more often a too-new entry rather than a nonexistent entry.
  • I will add it to WT:PREFS when I think I've had enough positive comments. Then I'll work with cirwin to nicely interact with his iwiki script too. I will also be able to add some options and variables such as which of the two display areas you want and how much of each of prev and next context you want.
  • Thanks for the feedback! — hippietrail 02:03, 12 June 2009 (UTC)[reply]

One Million Words in English

According to http://www.languagemonitor.com/ we're about to hit 1,000,000 words (where they seem to define a "word" as something that has been used 25,000 times), a press release is available. Conrad.Irwin 10:39, 6 June 2009 (UTC)[reply]

Nah, that's bogus. They've discussed it a few times on Language Log; see e.g. http://languagelog.ldc.upenn.edu/nll/?p=972. —RuakhTALK 14:55, 6 June 2009 (UTC)[reply]
Yeah, basically the millionth word's been rescheduled like 5 or 6 times over a period of ca. a year to match with the release date of the guy's book. Circeus 13:52, 12 June 2009 (UTC)[reply]

This category is currently defined as including "all languages spoken in Armenia, Azerbaijan and Georgia". Not only does this give the wrong impression that the Caucasus is limited to these three countries, but it also excludes the more famous Caucasian languages of Russia (Chechen for example). Because some languages spoken only in Russia are already in that category anyway, the definition should be extended to include some of the Russian administrative regions. -- Prince Kassad 18:20, 6 June 2009 (UTC)[reply]

Fixed, thanks. —RuakhTALK 19:08, 6 June 2009 (UTC)[reply]

Adding references

There's a new user who's adding references to German pages, but adding the source in the edit summary - I think it would be better to add it to the etymology section, like in Lob - is there a better way to reference other books in our entries? I'd like to inform the user (User:MaEr) how we should do this too, as it seems like useful information to have. --Jackofclubs 08:25, 7 June 2009 (UTC)[reply]

In the wikipedias I'm used to adding sources, with <ref>...</ref> and ==References== <references/>.
I tried it here but it looked quite strange, with an additional "Notes" header, so I guess this cannot be the right way.
And I could not find out where to place the "References" header(s): only one for the whole article or for every language one. This is not documented, as far as I can see. --MaEr 10:21, 7 June 2009 (UTC)[reply]
Hmm, maybe a page Wiktionary:References can be written? --Jackofclubs 10:40, 7 June 2009 (UTC)[reply]
WT:ELE#References -- Prince Kassad 10:53, 7 June 2009 (UTC)[reply]
Unfortunately, this was one of the places that could not help me.
But when searching in the page later, I found some other interesting statements: WT:ELE#The_essentials seems to suggest that you are free to put the references anywhere. WT:ELE#A_very_simple_example suggests a per-language-section "References" but does not use the <references/> tag. --MaEr 11:08, 7 June 2009 (UTC)[reply]
I created a new reference template for the Etymologisches Wörterbuch der deutschen Sprache - at Template:R:EWddS . Do you think you could use it? --Jackofclubs 12:45, 7 June 2009 (UTC)[reply]
The template looks fine! May I suggest a parameter for the lemma? Maybe an optional one?
Did you notice that I do not use the most recent edition of the dictionary? The hard-core etymologist (or library inhabitant) probably does not use the 22nd but the 24th edition. See also w:de:Etymologisches Wörterbuch der deutschen Sprache. So maybe we need another parameter which controls the edition (and the ISBN). --MaEr 13:07, 7 June 2009 (UTC)[reply]
I've created Wiktionary:References as a redirect to WT:ELE#References. If needed, Wiktionary:References can become a guideline of its own. --Dan Polansky 12:59, 7 June 2009 (UTC)[reply]

I added (as a suggestion) two parameters to Template:R:EWddS: ed for the edition of dictionary and hw for the headword of the article within the dictionary. Feel free to correct or comment details — I never did any template programming before and English is not my native language. It still is easy to change things because I used the new parameters in Spiegel only. --MaEr 19:20, 8 June 2009 (UTC)[reply]

'Variables' in page names

Following debates such as X one's Y off and I'll see your X and raise you Y, are these sort of titles acceptable, if not how do we deal with pages that need some sort of variable in the title, and what others are allowable? A few examples:

  1. amuse oneself
  2. work one's butt off
  3. milk it -- currently up for deletion

Or in other languages

  1. s'appeler (French)
  2. llamarse (Spanish)

Of course in Spanish the particle is quite often attached to the verb, so stuff like llámeme and llamarte are one word, but still sum of parts. Indeed in German you can go on ad infinitum creating one-word terms that are still SoP. Mglovesfun 19:57, 7 June 2009 (UTC)[reply]

Although the headwords with "one", "someone", etc are not very searchable for normal users, we seem to have accepted them. I believe that previous discussions have not yielded any consensus on how to handle formulas other than these. There is some thought, not opposed, that we should have redirects from all major pronoun forms (eg, "amuse myself", "amuse themselves", "amuse herself", etc). That would seem to be a job for a bot, though it seems a good idea to leave a redirect behind when moving one of these to a "oneself" form or similar. DCDuring TALK 20:38, 7 June 2009 (UTC)[reply]
I'm gonna start an article for s'appeler tomorrow as I don't think that's SoP. But I wouldn't then create all the conjugated forms - surely putting see appeler covers this. Actually, WT:CFI avoids (or doesn't yet refer to) quite a lot of these issues, such as what is a term; indeed the phrase sum of parts doesn't seem to be in there even once. Mglovesfun 22:52, 7 June 2009 (UTC)[reply]
Wiktionary:About French says to cover s'appeler at [[appeler]], as we currently do; but for Spanish we seem to do it the opposite way, with separate entries for e.g. lavar and lavarse (and the former not even linking prominently to the latter). It would probably be nice to do them both the same way, but as long as we present things such that readers expecting one approach will notice we're taking the opposite approach, I don't think it's a big deal. —RuakhTALK 00:04, 8 June 2009 (UTC)[reply]
I too have pondered over the issue of reflexive verbs. So far, most Swedish reflexive verbs are defined in the entry for the actual verb (lata, not lata sig) but I'm not sure that's ideal for a number of particle verbs where the word order tend to mess things up (ta till sig versus ta sig till). On the other hand, to put everything in the full-word entry would make a couple of reflexive verbs harder to find, e.g. in the case of lata sig, where there doesn't exist any non-reflexive use, and a verb entry at lata would have to be made merely to supply the user with some kind of "see also". \Mike 10:07, 8 June 2009 (UTC)[reply]
I'd tend to think that the SoP 'rule' could easily be applied to reflexive verbs; so s'appeler in the sense of 'to call each other' (by telephone) is SoP, but in the sense of 'to have the name of', this meaning is not SoP (IMO) so I'm gonna start the article and if someone puts it up for deletion, I have no objection to that. As for WT:CFI it does say 'all words in all languages', so stuff like llámame which is really SoP ought to be fine on the grounds it is undeniably a word. That takes us to the definition of 'word' - is don't one word or two? How about to-morrow, is that one or two? Hence the reason that the CFI could do with a bit of work on it. Mglovesfun 11:47, 8 June 2009 (UTC)[reply]
Well, according to our current approach, I suppose "s'appeler" is SOP, in that "s'-" is used in its sense of "indicating a reflexive verb" and "appeler" is used in its sense of "(reflexive) to be called". But there's no law that says we have to do it that way. —RuakhTALK 18:59, 8 June 2009 (UTC)[reply]
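The redirect idea raised in this thread (pointing "amuse myself", "amuse themselves", and so on at the "oneself" entry) could be sketched roughly as below. This is only an illustration: the function name `redirect_titles` and the two pronoun lists are assumptions made for the example, not an existing bot or an agreed list of forms, and nothing here creates pages.

```python
from typing import List

# Assumed pronoun forms; the discussion only says "all major pronoun forms".
REFLEXIVE = ["myself", "yourself", "himself", "herself", "itself",
             "ourselves", "yourselves", "themselves"]
POSSESSIVE = ["my", "your", "his", "her", "its", "our", "their"]


def redirect_titles(headword: str) -> List[str]:
    """Return candidate redirect titles for a headword containing the
    placeholder "oneself" or "one's"; return [] if it has neither."""
    words = headword.split()
    if "oneself" in words:
        slot, forms = "oneself", REFLEXIVE
    elif "one's" in words:
        slot, forms = "one's", POSSESSIVE
    else:
        return []
    i = words.index(slot)  # only the first placeholder is substituted
    return [" ".join(words[:i] + [form] + words[i + 1:]) for form in forms]


if __name__ == "__main__":
    for title in redirect_titles("amuse oneself"):
        print(title, "-> amuse oneself")
```

A headword without a placeholder, such as "milk it", yields no candidates, so a bot built on this would leave such entries alone.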

Favicon

Alright. Am I the only one for whom the favicon is suddenly the little scrabble tile thing? And when did this happen? And is there a way that we can not have this happen until a definitive outcome of whatever the heck logo vote is going on? Call me a grumpy old fart (given that I'm turning eighteen in an hour) but I find this annoying and would rather not see sudden changes when I sign on with the intent of going through newpages, going through RC, and adding some entries. It's irksome. And perhaps meddlesome on someone's part -- or at the very least this sort of change should have been announced somewhere? --Neskaya kanetsv 05:50, 8 June 2009 (UTC)[reply]

To answer your first question, no: I, too, first noticed it this time that I signed on. I, too, was surprised to find a change made while meta's discussion is ongoing. Unlike you, though, I wasn't irked. But the main reason I'm commenting here is to wish you a happy birthday.  :-) --msh210 05:58, 8 June 2009 (UTC)[reply]
Call me grumbly and persnickety and what say you, I just generally don't like sudden changes, and I'd really, really rather they not happen during meta discussion. Also, thanks. --Neskaya kanetsv 06:06, 8 June 2009 (UTC)[reply]
No change for me yet, unfortunately (and the new favicon is not a scrabble tile!). This request (different favicons for Wiktionary and Wikipedia) has been delayed for a long time, and there is no reason why a new discussion about the logo should have caused an additional delay. Happy birthday. Lmaltier 06:59, 8 June 2009 (UTC)[reply]
The request was invalid, and, at any rate, no action should have been taken until after the new logo proceedings. The argument that "Wikipedia" and "Wiktionary" shouldn't be the same cuts no ice with me, why don't Wikipedia use something that looks like their logo? Conrad.Irwin 11:11, 8 June 2009 (UTC)[reply]
I too see a new favicon, the one that has reminded me of a scrabble tile from the first time I saw it. I am speaking of the effect of the favicon on me, of which I have direct unquestionable evidence, not of the effect on other people. I was slightly annoyed when I noticed it, but not annoyed enough to do anything about it. --Dan Polansky 07:49, 8 June 2009 (UTC)[reply]
Likewise, I see the icky Scrabble icon. On the one hand, we do need to be using something different from Wikipedia (with whom we seem to be perpetually confused), but on the other hand, I really dislike the Scrabble-style logo. --EncycloPetey 15:35, 8 June 2009 (UTC)[reply]
I like it! --Jackofclubs 18:21, 8 June 2009 (UTC)[reply]
I don't like it! --Vahagn Petrosyan 18:43, 8 June 2009 (UTC)[reply]
Er, I see the new icon on Firefox on MS Windows, but not on Firefox on Red Hat Linux. I don't know whether this is a caching issue, or some markup that different browsers interpret differently.--msh210 19:57, 8 June 2009 (UTC)[reply]
Now, I see it for fr.wiktionary, not for en.wiktionary... Lmaltier 20:00, 8 June 2009 (UTC)[reply]
Almost certainly caching. Try visiting <http://en.wiktionary.org/favicon.ico> and hitting your browser's refresh button. --RuakhTALK 20:02, 8 June 2009 (UTC)[reply]
For those interested, the relevant request to the server folk is at https://bugzilla.wikimedia.org/show_bug.cgi?id=16315 Conrad.Irwin 20:33, 8 June 2009 (UTC)[reply]
Also, for those who care, I'll be talking to RobH at some point today and pointing out that the action taken was perhaps a great deal premature, and asking him nicely to undo it. --Neskaya kanetsv 20:54, 8 June 2009 (UTC)[reply]
Reverted, and I thus withdraw from consensus discussion ;p --RobH 21:20, 8 June 2009 (UTC)[reply]
Thank you! —Neskaya kanetsv 22:32, 8 June 2009 (UTC)[reply]
No, please leave the change. I don't like it either, but having the same icon as Wikipedia is worse than having none! Its function is to differentiate the site in my history menu, so a non-differentiating icon is an absolute failure, when I jump between the two Wikis.
When there's a consensus it can be updated again, but we waited until hell froze over the last time, didn't we? Michael Z. 2009-06-09 00:16 z
I want the favicon to be animated. Can we make it take up the whole screen? Equinox 00:20, 9 June 2009 (UTC)[reply]
I agree with Mzajac. How many years should we wait? Changing it does not mean that it cannot be changed again if a decision on a new logo is taken, some day. Anyway, a consensus is impossible on such a subject. Therefore, the only possible way to take a decision is a vote (after a discussion). I'm not aware of any vote against the new favicon (or against the new logo). Lmaltier 05:50, 9 June 2009 (UTC)[reply]
The change was reverted partly because the new favicon was very badly drawn. See http://bug-attachment.wikimedia.org/attachment.cgi?id=6207 . Had it been good, I suspect that it would not have been changed back. Conrad.Irwin 09:58, 9 June 2009 (UTC)[reply]
Are you referring to the aliased icon mask, which gives it a jaggy outline on a coloured background? I believe I can fix that if I can get a copy of the icon file. Michael Z. 2009-06-12 20:50 z

(back to unindented thanks to tiny netbook!) The new favicon didn't display correctly in at least fifty percent of browsers, and was definitely unpleasant to look at for most people. We may need an updated one but we do not need one that makes us look like an icky Scrabble game. --Neskaya kanetsv 19:52, 9 June 2009 (UTC)[reply]

  • I finally had an idea for a logo for a dictionary that won't look like either a block of text, any other book, or any other multilanguage project. I've done a quick mockup (with the Japanese version of Microsoft Paint!) and put it in use as the favicon on my Toolserver homepage. For some reason the favicon doesn't work for me on Firefox 3 or Explorer 6 so in case you can't see it, here is a direct link. If anyone with more artistic talent or actual Gimp or Photoshop skillz would like to make a better realization of the concept please go ahead. It should scale up easily and be adaptable to other languages. Ideally it should look more like other WikiMedia project logos. Feedback appreciated. — hippietrail 03:39, 12 June 2009 (UTC)[reply]
  • I like it. It didn't work for me on FF either. My eyes are not great so I had to scale it up times two+ to realize what it was. But that might be just a "realization" issue. DCDuring TALK 04:23, 12 June 2009 (UTC)[reply]

Entry Layout for Abbreviations etc.?

A can of worms! Why not open it‽

What is the proper entry layout for Abbreviations, Acronyms, and Initialisms?

This has limited policy guidance, and inconsistent use – perhaps we might hash it out, if it’s not already been addressed, in some lost corner of the archives, beneath discarded beer bottles?

Policy:

…which cites:

…where I just wrote:

…summarizing previous discussion, AFAICT.

Previous discussion:

Motivation: we were having a discussion at RFC:LOL regarding how to format LOL.

What appears to be agreed so far is:

  • Ab/Ac/In are legit L3 headers (despite not being Parts of Speech), as per vote
  • Short forms should just expand, if the long form is used – e.g., NATO should simply expand to “North Atlantic Treaty Organization”. (as per WT:ELE).

Some questions – “Parts of Speech” is the main question.

Parts of Speech
Should Parts of Speech be used in addition to Ab/Ac/In? Always? Never? If an Ab/Ac/In is used as more than one part of speech?
  • What if it is used as only one part of speech?
  • What if only one expansion is used for these other parts of speech (as in SMS)?
Etymology – break up
Having multiple expansions in a single L3 header breaks the “Divide by etymology” principle. Breaking up each expansion into separate Etymology sections is rather long (bloated?), however.
Etymology – where?
Should etymology be given in the Ab/Ac/In section (as in NATO) or in a separate Etymology section (as in SNAFU)?

Examples:

  • SMS
    Cited in discussion – has an Initialism, with 4 expansions, a Noun (“a text message”), and a Verb (“to send a text message”), the noun and verb only associated with one expansion/etymology.
  • NATO
    Unambiguous expansion – can be used as a Noun (the organization) or an adjective (“NATO forces”). Do we want separate POS for this? Consider “blue”.
  • LOL
    Pronounced both as an Initialism (letter-by-letter) and as an Acronym (as a word) – currently listed as an Abbreviation, splitting the difference.
    Further, mostly used as an Interjection, but also used as a verb (as in “I LOLed/LOL’ed”).

Oh Beer Parlourians, what sayeth (sayst?) you?

Second-person plural (as in this case): “what say you?”; second-person singular: “what sayest thou?” or “what sayst thou?”; “sayeth” is third-person singular.  (u):Raifʻhār (t):Doremítzwr﴿ 12:49, 9 June 2009 (UTC)[reply]
—Nils von Barth (nbarth) (talk) 22:56, 8 June 2009 (UTC)[reply]
[e/c] Just the way I think of these, not necessarily anyone else's opinion, and may possibly even be counter to what has already been definitively established as policy for all I know (as Nbarth implies, though it's news to me): Ab, Ac, and In are second-rate headers, used when nothing else works, such as for phrases (where Phrase also works, but is not better). Where Noun or one of the other standard POS headers works, use it instead. The set of pages with Ab/In/Ac headers that could be a N/V/Adj/Adv/Prep/Cardinal number/... is a cleanup category. Not that Ab, Ac, and In are illegal (anti-policy), just that they're poor substitutes. So SMS, currently defined as

===Initialism===
# Short Message Service
# Sega Master System
# special mint set
# short man syndrome
===Noun===
pl SMSes
# A text message sent on a cell phone
===Verb===
conj SMSes SMSing SMSed
# To send a text message on a cell phone

should instead have just a noun and a verb section (perhaps proper noun for the video-game system, if attested). These can all be listed under the same etymology (all have "initialism" as their etymology) or split: I see advantages to either and have yet to be convinced which is better. Again, this is all just my own view, natch.—msh210 01:02, 9 June 2009 (UTC)[reply]
  1. The one issue that might be easy is the question of adjective use of the noun-type abbreviations. It would seem that such use is just like attributive use of the corresponding noun which we do not present as a separate part of speech unless it is attestably gradable or comparative. OTOH, I can imagine someone be said to be "more NATO than EU". We could simply ignore this kind of fairly trivial case.
  2. An abbreviation used as verb seems to need to be a separate PoS. Its etymology is almost always just the noun form of the same abbreviation.
  3. For all of the abbreviations that would be nouns, I would think no change would be required. The etymology and rudiments of pronunciation are already built in to the entry as we have it. Plurals are trivially formed by adding "s" in almost all cases. Exceptions could be noted by a hand-made additions to inflection or sense lines.
  4. The "texting"/"internet" abbreviations like "LOL" raise other issues because they function differently than their expanded forms, due to the medium.
  5. I don't know whether we really need to do anything with the abbreviations that are adverbs or other parts of speech that are not texting/internet.
  6. There is an issue I've noticed with headwords that are both normal English words and English abbreviations. The Abbreviation heading only makes sense to me if it is at the same level as the Etymology header for the English word. It also seems inappropriate to put a word below the abbreviations in these cases, though our standard is alphabetical order by part of speech, which always puts abbreviation and acronym at the top. DCDuring TALK 00:57, 9 June 2009 (UTC)[reply]
I agree with Msh210. I think that Initialism, abbreviation, etc. should belong to the etymology section, and that tele is a noun (currently, no POS is mentioned), and UNO is a proper noun. And I think that the OK page is OK (except for the Oklahoma sense: Oklahoma is a proper noun, but are such codes really words?). Lmaltier 05:59, 9 June 2009 (UTC)[reply]
In a small proportion of the cases what MSH proposes makes sense. But imposing the same structure on abbreviations as on real words will lead to extravagantly long entries in many cases. We have a number of instances of more than ten different organization names sharing a given abbreviation. WP disambiguation pages often show many that we don't even have. To show the entire set with etymologies that merely repeat the sense line would be a serious waste of space. Allowing etymology and pronunciation sections for each noun abbreviation may lead to those sections being 10 times longer (in vertical screen space consumed) with minimal additional information value. DCDuring TALK 11:43, 9 June 2009 (UTC)[reply]
In many cases, it might make sense to consider "Initialism." as a single etymology, and to detail meanings on each definition line, in order to make pages more readable and to save space (this technique does not work well when there are several pronunciations, or several genders, etc.). But my point was about the POS... Lmaltier 12:35, 9 June 2009 (UTC)[reply]

Abbreviations facts and guesses

We are missing some important facts that have to do with the relative importance of the considerations. What we know: as of March 22, 1334 L3 headers for abbreviations, 139 acronyms, 314 initialisms. Counting from categories we would get more than 3 times as many, many more initialisms. We also know that many abbreviation headers should be one of the others to faithfully carry out existing "best" practice.
My a priori assessment based on the English entries I've worked on (no file extensions) is that:
  1. most (60+% of entries, 80%+ of senses) abbreviations are just nouns, most (80+%) of them proper (or at least capitalized) nouns.
  2. there are few (<1%) abbreviation senses that are verbs
  3. an important 1% might be the internet/texting type abbreviations of various PoSs, but with important and troublesome common characteristics.
  4. the balance are adverbs, true adjectives, phrases, and items of uncertain classification (eg, file extensions and URL components).
  5. there are a number of abbreviations (< 100?) that are formatted as real parts of speech, some of which are not categorized as abbreviations.
We could expect there to be a number of entries that have very many senses. Our level of coverage is not high. The significance of many of the abbreviations is low. Given that we don't subject these to much selection pressure, we can expect there to be many more, especially of "low significance".
Are there sources of additional information? Does my a priori assessment clash with others'? Are there important considerations for languages other than English? for non-Latin scripts?
Do we need more facts? What can be readily collected by dump analysis? Do we need some kind of sample of entries for other characteristics ? DCDuring TALK 13:27, 9 June 2009 (UTC)[reply]
To illustrate the low level of completeness and therefore the likelihood of long entries, compare ABA (2 senses) with w:ABA (~30, almost all with article links). DCDuring TALK 13:37, 9 June 2009 (UTC)[reply]
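As one possible answer to "what can be readily collected by dump analysis?", the header counts quoted above could be reproduced with something like the sketch below. It assumes the dump has already been parsed into (title, wikitext) pairs by some other tool; the name `count_aai_headers` and the sample pages are made up for illustration, and the pattern only recognises exactly level-3 headers.

```python
import re
from collections import Counter

# Matches a level-3 header line such as "===Initialism===" or "=== Acronym ===".
# A level-4 header ("====Initialism====") does not match, because the fourth
# "=" appears where the header name is expected.
HEADER_RE = re.compile(
    r"^===\s*(Abbreviation|Acronym|Initialism)\s*===\s*$", re.MULTILINE
)


def count_aai_headers(pages):
    """Count L3 Abbreviation/Acronym/Initialism headers.

    `pages` is any iterable of (title, wikitext) pairs.
    """
    counts = Counter()
    for _title, text in pages:
        counts.update(HEADER_RE.findall(text))
    return counts


if __name__ == "__main__":
    sample = [  # hypothetical page texts, not real dump content
        ("SMS", "==English==\n===Initialism===\n# Short Message Service\n"
                "===Noun===\n# A text message sent on a cell phone\n"),
        ("NATO", "==English==\n=== Acronym ===\n"
                 "# North Atlantic Treaty Organization\n"),
    ]
    print(count_aai_headers(sample))
```

Counting per-entry senses or comparing against category membership would need more than this, but the same loop could be extended for it.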

Following on DC’s thoughtful analysis, there seem to be 3 main cases:

simple
Most a/a/i are a single term, used as a single POS, usually Noun – often Proper Noun.
many senses
Some a/a/i have many expansions – this is a concern for suggestions of a separate etymology for each expansion.
complicated
A few entries (1%), such as SMS, are more complicated – what works for them may be overkill for others.

A significant concern is at WT:CFI#Names of specific entities – from my reading, most expansions of ABA are not in our scope: they are just names, and not used attributively.

Some thoughts:

simple
What’s the best way to say: “This is an a/a/i used as a noun/adjective?”.
An existing example is OEM, which lists Initialism and then uses {en-noun} for the POS line, indicating plural, but otherwise does not indicate that this is a noun.
Another option is Etymology+Noun or Etymology+Adjective, as suggested above (msh).
many senses
A list is easiest; policing CFI for these may be tricky, and less-used senses are often just names, hence should only be at Wikipedia.
complex
Not sure if we can think of or address all cases.
One rule of thumb may be in complex cases to consider the a/a/i L3 header as an “Etymology” section of sorts, and have POS headers subordinate to it (L4), if there are multiple senses.
E.g., SMS would be split into 2 Initialisms sections, with Noun & Verb being subordinate to the first).

Note that giving the expansion in “Etymology”, the pronunciation (Acronym/Initialism) in “Pronunciation”, and the use in POS headers (as indicated by msh and Lmaltier) is consistent with other entries, though it takes up more space and has been rejected in the past – WT:VOTE.

Perhaps Etyl/Pron/POS would work for simple entries?

That is, the key questions one likely has for a simple a/a/i is:

  • what does it mean?
  • how is it pronounced (Acronym/Initialism, or some combo: JPEG, LOL)?
  • what part of speech is it?

…which are addressed by separate sections.

OTOH, this is bulky, especially for many senses.

—Nils von Barth (nbarth) (talk) 22:07, 9 June 2009 (UTC)[reply]
IMHO, the simple vs. many senses distinction seems hard to maintain. Simple entries tend to become many-sense entries. Unfortunately, I fear we have no choice but to design for the many-sense and complex cases, but make sure that a simple entry is not overburdened.
We have not been excluding abbreviations because the referent would not meet WT:CFI. I have occasionally put an abbreviation in RfV when the referent was not even in WP AFAICT. I am usually concerned with abbreviation entries only to keep them from cluttering up User:Robert Ullmann/Missing for which putting them inside a WP link is sufficient.
To me, the default case of a noun entry is adequately handled by existing practice. It is only a few unusual noun cases and the non-noun PoSes that might benefit from well-designed change. We could posit that all non-noun abbreviations should be treated as normal word entries and be done with it. We could then finesse the remaining exceptional noun cases on a case-by-case basis. DCDuring TALK 00:30, 10 June 2009 (UTC)[reply]
Why not keep the current practice for nouns (where possible), with a single difference: Noun or Proper noun as POS, instead of Initialism, etc.? Remember the KISS principle. Consistency is important, it makes things simpler. Lmaltier 07:04, 10 June 2009 (UTC)[reply]
Thanks for clarification on inclusion practice DC!
The main concerns with current practice, AFAICT, are:
  • No POS given in simple cases.
  • No guidance for complicated cases.
How’s this for a proposal?:
For a single POS
use === Acronym === etc. as an L3 header, immediately followed by ==== Noun ==== etc. as an L4 header – i.e., the a/a/i L3 header is a brief Etymology/Pronunciation gloss, but POS is indicated.
Likewise for a list
list multiple senses under a single Noun heading, or rather under 2 if some uses are countable and some are uncountable.
For complex cases
use multiple L3 Acronym/Initialism headers if senses are used/inflected differently (SMS), with L4 POS headers as above.
…and for terms like LOL that are pronounced both as Acronym and Initialism, use === Acronym/Initialism === as the L3 header.
I think this addresses concerns raised, scaling cleanly between simple and complicated cases, with the only change to existing practice in most cases being “adding a POS subheader, in addition to a/a/i header”.
—Nils von Barth (nbarth) (talk) 10:23, 10 June 2009 (UTC)[reply]
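To make the proposed layout concrete, here is a small sketch that prints the header skeleton it would produce: one L3 a/a/i header with L4 POS headers beneath it. The helper `render_entry` is hypothetical and exists only to show the nesting; it is not a Wiktionary tool, and the definitions are taken from the SMS example quoted earlier in the thread.

```python
def render_entry(aai_header, sections):
    """Render one L3 Abbreviation/Acronym/Initialism block.

    `sections` is a list of (pos, definitions) pairs; each POS becomes an
    L4 header under the L3 header, followed by its numbered definitions.
    """
    lines = ["===%s===" % aai_header]
    for pos, definitions in sections:
        lines.append("====%s====" % pos)
        lines.extend("# " + d for d in definitions)
    return "\n".join(lines)


if __name__ == "__main__":
    # Complex case: SMS split into two Initialism blocks, because only the
    # first expansion is also used as a noun and a verb.
    print(render_entry("Initialism", [
        ("Noun", ["Short Message Service", "A text message sent on a cell phone"]),
        ("Verb", ["To send a text message on a cell phone"]),
    ]))
    print(render_entry("Initialism", [
        ("Noun", ["Sega Master System", "special mint set", "short man syndrome"]),
    ]))
```

For a term like LOL the same call would take "Acronym/Initialism" as the L3 header, per the last point of the proposal.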
Why add a level? It's not needed. tele is a noun, UNO is a proper noun. KISS! Lmaltier 10:44, 10 June 2009 (UTC)[reply]
Economy of headers has a major value in reducing the visual complexity of our entries and in enhancing the usability of the Table of Contents. It also adds to economy in the use of vertical screen space increasing the amount of information on the first screen that a user sees. We need to give new users what they want to avoid falling farther behind the other online dictionaries in the number of users (I have facts on this.).
The biggest merit of the status quo is its economical use of headers. The single ab/ac/in header:
  1. distinguished these from normal PoSes (providing a reason why an Etymology was not required)
  2. gave the most important pronunciation information (but only for Acronyms and Initialisms)
  3. and took but one header-worth of vertical screen space.
From a consistency perspective the status quo is deficient, but the category of abbreviations is large enough to merit its own treatment rather than being forced onto a procrustean bed. That there are specialized dictionaries for abbreviations and specialized sections for abbreviations suggests that abbreviations do not fit all that well into the data structure for other lexical units.
Abbreviations do not normally need Etymology or Pronunciation headers (or PoS headers if users can be assumed to infer the PoS from the header). The etymology is the gloss. Some ac- and ab-type abbreviations are not served well in terms of pronunciation, but the same is true for all parts of speech from the point of view of the vast portion of users who don't know IPA and haven't figured out how to do audio in our format. Putting an audio icon and IPA at the end of the gloss would serve the cases that need pronunciation.
An alternative approach would be to cede to WP disambiguation-type pages all proper noun abbreviations or initialisms (which have no pronunciation issues). In my experience their coverage seemingly far exceeds ours, certainly for notable entities. The only-in template would speed users to those pages from here. DCDuring TALK 12:00, 10 June 2009 (UTC)[reply]
DC – any thoughts on indicating POS?
My major issue with the status quo is that POS is nowhere indicated – e.g., FUBAR is an adjective, while SNAFU is a noun, but this is not clear from the entries.
Separate POS headers are bulky, as noted; the shortest possible unambiguous solution is {{pos n}} (which displays an n, for “noun”), as suggested by Connel at WT:ELE/POS talk.
That is, I see three options:
  1. Status quo – POS nowhere indicated, shortest
  2. POS headers – either in addition to or instead of a/a/i header – clearest, most consistent, bulkiest: many variants:
    1. instead of a/a/i, (as msh and Lmaltier suggest, though failed in a previous vote)
    2. in addition to a/a/i as L3 (as Ullmann suggests),
    3. in addition to a/a/i as terse L4 (my suggestion above)
  3. Definition line – minimal change to status quo to indicate POS, as Connel suggests.
Any other suggestions or possibilities? What do y’all think?
—Nils von Barth (nbarth) (talk) 14:45, 10 June 2009 (UTC)[reply]
The status quo can bear some improvement. I am completely open on approaches to indicate PoS for non-nouns. Even treating as a normal PoS would be OK. The "pos-X" templates (under review for deletion) seem to do a very economical job of handling this, but don't address the problem of those few items that need pronunciation or have etymologies not obvious from the gloss. Would end-of-gloss pronunciation be acceptable? (BTW, don't you think that snafu, though etymologically an acronym, has entered the lexicon as an ordinary noun?) My own favored option would be a hybrid:
4 (= 1 + 2 + 3)
  1. Status quo for all abbreviations that are not exceptions in terms of actual content of entry.
  2. Exceptions are items meriting full PoS treatment by virtue of
    1. having entered the lexicon,
    2. having etymologies distinct from their gloss,
    3. needing an inflection line,
    4. having a need for long or multi-line pronunciations, or
    5. for such other reason as we might find acceptable.
  3. No pronunciations for initialisms.
  4. Pronunciations, if any, to be shown at the end of the gloss, if they can be, for non-exceptions.
  5. Appropriate category to be added to entries that are exceptions.
  6. In-gloss PoS indicators for non-nouns not otherwise exceptional.

Among a list of merits too long to repeat (;-}), this has the merit of fitting into an incremental process of altering entries. Almost any of our existing entries would conform. Only valuable content forces change. Many of the abbreviations that already are treated as true parts of speech already conform. In other words, it is very close to our actual best current practice. DCDuring TALK 16:03, 10 June 2009 (UTC)[reply]

Sounds good in the main – I’ll see about drafting something to summarize.
A question or two:
  1. Should the term be displayed via an inflection line ({en-noun}, {en-noun|-}, {en-proper noun}, etc.), or manually as word?
    I.e., this shows plural or (uncountable), if relevant, and categorizes as Noun/etc.
  2. Is the reasoning behind not including in-gloss PoS for nouns b/c they’ll likely be parsed as nouns anyway or it’s distracting? It seems to add clarity even for nouns.
—Nils von Barth (nbarth) (talk) 22:49, 10 June 2009 (UTC)[reply]

Would adding links to wikipedia be a bot job the way inter-language links are added? (Is it possible without a current WP dump?) RJFJR 14:09, 10 June 2009 (UTC)[reply]

I could see it being a bot job for ex. {{langcatboiler}}, but not really for the main namespace. -- Prince Kassad 14:18, 10 June 2009 (UTC)[reply]
I don't know how easy it is to do without error. Perhaps a process could be tested on a closely related but more manageable problem: linking entries in the taxonomic name category to Wikispecies under See also for the appropriate PoS. These entries usually would have WP links too. At the genus level, they might benefit from Wikicommons too. Perhaps also there are categories of nouns that would benefit from links to Wikicommons. A clean-up list of those that don't have good targets in the sister project would be useful because it is usually not hard to make a manual adjustment to find a good target article to link to.
I am among those here who have an aversion to the big sister-project link boxes as opposed to the more discreet things that fit into "See also". Some have stronger aversion than I. DCDuring TALK 15:02, 10 June 2009 (UTC)[reply]
I agree linking to WP by bot would be error-prone, but to Species per DCDuring might be a good idea. I think that if Commons has a page or category related to one of our entries, then we should find a good picture, include it in the entry, and not link to Commons. Links to Commons are saying "here are some pictures!" and imply that we're too lazy (or understaffed) to pick one of those pictures as a representative. (Incidentally, I'm with DCDuring: the discreet {{pedia}} beats the discrete {{wikipedia}} every time.)msh210 16:01, 10 June 2009 (UTC)[reply]
We've had bots try to do that and fail miserably. For countries, a bot often picks out a flag, so (deprecated template usage) Italia (as an example) is illustrated with a picture of the flag, rather than the country (although I'm not sure this specific case was done by bot, I've come across it often). Selecting a good picture really is a manual job, although it could perhaps be bot-assisted. --EncycloPetey 16:06, 10 June 2009 (UTC)[reply]
I must not have been clear: I did not intend for bots to add pictures! I merely meant that we should add pictures (viz, manually) rather than link to Commons (as DCDuring had tentatively suggested).msh210 16:17, 10 June 2009 (UTC)[reply]
Linking at the genus level is frought with problems, since Wikipedia is case-insensitive and sometimes disambiuates by moving the genus to a common name rather than its scientific name. There are situations, for example, where an animal and plant genus have the same name, so only one of them (or neither) is at the basic page name. We would want both linked, and there is no standard on WP for how the disambiguation genus page is named. There are also cases where the genus name also happens to be a fairly common word, so the genus page is not at the basic name but has instead a parenthetical component. There are also situations where a genus is monotypic, so the information is only at the species page and the genus is a redirect. I don't think a bot is sophisticated enough to handle all that. --EncycloPetey 16:11, 10 June 2009 (UTC)[reply]
Perhaps we could just have a labor-reducing, rather than a labor-eliminating approach. I usually add pictures by first adding a generic commonslite link to the headword and follow searches and links until I get a good image for import and a page or category, if there is one. I do a similar process for Species and WP to confirm a valid link. If the labor was reduced by inserting the sister project link templates and the entry were assigned to a clean-up list, we could accelerate the process significantly. Both vernacular and taxonomic names would benefit from this process because many vernacular name entries include what some contributor thought was the right taxonomic name.
Doing WP, Species, and Commons all at once gives benefit because WP and Species don't always agree and Commons is not perfectly consistent in using the latest taxonomic name, an obsolete one, a synonym, or a vernacular name. WP often and sometimes Species far surpass us in vernacular names, so we might generate many new entries or, at least, productive red-links. DCDuring TALK 16:39, 10 June 2009 (UTC)[reply]
You (EncycloPetey) seem to feel that we only want Wikipedia links when those are relevant to what we mention in our entry; but personally, I'd be quite happy with {{projectlink}}s to whatever Wikipedia might happen to have at its like-named article (as long as said article exists, or redirects to one that does). There are basically two uses for Wikipedia-links: "Here's more information about what you just read a definition for!" and "Here're other things you may have been looking for instead!". The former is not readily bottable, but the latter is, and IMHO is worthwhile. —RuakhTALK 01:30, 12 June 2009 (UTC) Clarification added 13:27, 20 June 2009 (UTC). I'm careful with my indentation, but often forget that not everyone else is, so the meaning of "you" isn't always clear, even when it theoretically should be.[reply]
No, I would often put multiple links in a single article, most commonly one to a dab page and one to something actually on the subject of one or another gloss, or to multiple Species and Pedia articles if more than one is referred to by a single vernacular name.
This is about entry quality improvement. I am just looking to make the process of adding pictures, species links, and content-specific WP links easier by having a bot:
  1. find the entries,
  2. enter some useful headers and templates, and
  3. put the so-templated entries on a list for the manual content enrichment.
It would be a real help for a fairly large group of entries equal to the union of entries with taxonomic names; entries in categories of animals, plants, fish, insects, fungi, etc; entries using the "spelink" template less those already with the multiple sister project links under a "See also" header. Taking a minute off the link-up process for each entry is good, but even better is having it all on a list.
Being able to do all these quickly could also help us add a large number of English vernacular names for species and genera as synonyms and thereby as red-links for new entries with starter content already available for a bot (potentially) or a human to generate new entries. DCDuring TALK 02:24, 12 June 2009 (UTC)[reply]
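The three bot steps listed above (find the entries, add headers and templates, put them on a list) can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing bot: entries are modelled as a plain dictionary of title to wikitext, the function names are hypothetical, and the template names are simply the ones mentioned in this thread; how they are actually invoked is assumed.

```python
# Hypothetical sketch of the three-step process; no real wiki API is used.
# Template names come from this discussion; their exact usage is an assumption.
SISTER_TEMPLATES = ("{{pedia}}", "{{commonslite}}")
SEE_ALSO_HEADER = "====See also===="


def needs_links(wikitext):
    """Step 1: an entry qualifies if it has none of the link templates yet."""
    return not any(template in wikitext for template in SISTER_TEMPLATES)


def add_link_stubs(wikitext):
    """Step 2: append a 'See also' header (if missing) and placeholder templates.

    The stubs are appended at the end of the entry; a human editor is
    expected to move or correct them during clean-up.
    """
    lines = wikitext.rstrip("\n").split("\n")
    if SEE_ALSO_HEADER not in lines:
        lines += ["", SEE_ALSO_HEADER]
    lines += ["* " + template for template in SISTER_TEMPLATES]
    return "\n".join(lines) + "\n"


def prepare_entries(entries):
    """Step 3: return the updated entries plus a clean-up list of touched titles."""
    updated = {}
    cleanup = []
    for title, text in entries.items():
        if needs_links(text):
            updated[title] = add_link_stubs(text)
            cleanup.append(title)
        else:
            updated[title] = text
    return updated, sorted(cleanup)
```

The point of returning the clean-up list separately is the one made above: the bot only reduces labor, and every templated entry still needs a manual check that the links point at good targets.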
A bot would be an excellent idea. It could go through and strip out all those damn Wikipedia boxes we have littered all over the place, especially the ones in section zero above any language header, and substitute a simple link. But that's not the purpose you had in mind, is it? I don't like the idea of automatically adding links to anything other than a disambiguation page since thoroughly linking articles can take some thought to determine which ones are most related. There are tons of Wikipedia articles that we should link to but do not for the simple reason that Wikipedia includes parenthesized annotations in titles. Also there are many cases where topics are coalesced and the most relevant article is under a different name. I suppose if you could find a way to determine which ones were related then that would be fine. No one who adds Wikipedia links now really puts any thought into it either, so a dumb machine could easily do a better job. DAVilla 12:01, 17 June 2009 (UTC)[reply]

After a bit of feedback and a new version I've added "nearbypages", my extension to provide links to previous and subsequent pages in alphabetic order of each language where possible. It's in the experimental section. So far I haven't added suboptions as I'm not sure WT:PREFS currently supports such a concept. But I may add options to turn on/off the navbar links separately from the language heading links, and to specify how many such links to add. Enjoy and please leave impressions here. — hippietrail 16:06, 12 June 2009 (UTC)[reply]

Where's this discussed? I love it that the sorting works.
I'd like to propose using single instead of double angle quotation marks for dividers ( ‹ › ): they work just as well, but reduce busy type-clutter. Also, the dividers should be preceded by a non-breaking space, so they won't randomly show up at the start or end of a line, depending on the wrap width.
Could also try a smaller font, so that it looks more like supplementary information and not part of the entry. --Michael Z. 2009-06-13 16:23 z
Contrarily, as I sometimes am, I'm really much more pleased with « and » than I would be with the single angle quotation marks, and I find the single angle quotation marks (‹›) difficult to distinguish. Maybe this could be made into a suboption for it, for instance having one or the other as default and the one that isn't default as a suboption checkbox, "Use «» instead of ‹›". That's I think the best compromise we're gonna find on this. However, I really like this. It's making a few tasks I have to do that are language-wide for Hiligaynon (template replacement and displaying an inflected form) a lot easier for me. —Neskaya kanetsv 20:20, 13 June 2009 (UTC)[reply]
Also, slightly weird that the sort order goes co-op, coop, Co-op. Perhaps punctuation characters should have more significance than capitalization, so the order would be coop, co-op, Co-op. --Michael Z. 2009-06-13 16:33 z
  • So far it's only been discussed here and not a lot yet. But now that it's released I'll take cirwin's lead and make its official talk page ... its talk page (-: User talk:Hippietrail/nearbypages.js
  • The separator to use is open so I'll go with whatever is most popular. You could just set it yourself with CSS if I didn't have to support stupid IE6. The original version used chunky solid triangles that looked like the play button icon but people complained. Then it used the skinny arrows but those didn't work so well with more than one link either side. I've made it random for now which will no doubt annoy people but you'll get to see them all. I'll set it permanently when it's clear which is most popular.
  • I will give it a CSS class so its font size and other characteristics can be set.
  • The nonbreaking space idea is a very good one. I've tried to implement it now.
  • The sort order is the one specified on the Toolserver for the best matching locale I could find. I assume it's a localized version of the standard Unicode sort order.

hippietrail 01:38, 14 June 2009 (UTC)[reply]
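The alternative ordering suggested above (coop, co-op, Co-op), where punctuation counts for more than capitalization, can be illustrated with a small multi-level sort key. This is a sketch only: it is not the Toolserver collation the extension actually uses, and the function name and the three-level scheme are assumptions made for the example.

```python
def sort_key(word):
    """Three-level key: letters first, then punctuation, then capitalization."""
    # Level 1: the letters and digits only, ignoring case and punctuation.
    base = "".join(ch for ch in word if ch.isalnum()).casefold()
    # Level 2: where punctuation occurs (forms without punctuation sort first).
    punctuation = tuple(0 if ch.isalnum() else 1 for ch in word)
    # Level 3: where capitals occur (lower-case forms sort first).
    capitals = tuple(1 if ch.isupper() else 0 for ch in word)
    return (base, punctuation, capitals)


print(sorted(["Co-op", "co-op", "coop"], key=sort_key))
# ['coop', 'co-op', 'Co-op']
```

Swapping the second and third levels of the key would instead make capitalization outrank punctuation.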

Greenlandic

Apparently, the name of this language is not consistent in Wiktionary. {{kl}} says "Greenlandic", while {{kal}} says "Kalaallisut". This results in inconsistent naming across Wiktionary (for example, Category:Greenlandic language but Category:Kalaallisut derivations) and the templates should be changed to be consistent. The only question is what name to choose. -- Prince Kassad 14:59, 13 June 2009 (UTC)[reply]

Greenlandic is the English name, which is used by Wikipedia, Wikimedia, and one of two names used by Ethnologue. --EncycloPetey 15:02, 13 June 2009 (UTC)[reply]
Greenlandic is probably the best name for us to use. It is clear and doesn't conflict with the dialect names. It appears to be used by Greenlanders.[13]
Ethnologue calls this Greenlandic Inuktitut[14] (or “Inuktitut, Greenlandic”), with alternate names Greenlandic and Kalaallisut. Apparently Kalaallisut is the most common and “standard” dialect, also called Western Greenlandic or West Greenlandic. The other two dialects are Eastern Greenlandic, East Greenlandic, or Tunumiit oraasiat, and “Polar Eskimo”, Northern Greenlandic, North Greenlandic, Thule Inuit, Inuktun, or Avanersuarmiutut. Michael Z. 2009-06-13 16:12 z
I say Greenlandic too, with polar directions for dialects if needed. Circeus 17:15, 13 June 2009 (UTC)[reply]

Pictures!

Alright, so bear with me while I hijack a small section of the Beer Parlour for something that will not only help the project but give me something to do.

I'm pretty soon going to have actual full time access to a dSLR -- actually, I'm buying one within two weeks from now. Unfortunately I really have no idea as to what I want to go out and take pictures of, so I decided to create a project for myself to do with it. While I'm aware that Commons has billions of images (or I might be slightly exaggerating) I'd like to go out and work on taking pictures for entries here. Specifically for entries here. So the project right now is that I'd like a list from anyone here of entries that they want pictures of. Nouns preferably. Any language works as long as the item is something I can logically find somewhere in the greater Los Angeles metro area. Suggestions can either be left here (I will be watching) or at User:Neskaya/pictures.

Thanks all, and I'm hoping that we actually get a couple hundred good pictures out of this. --Neskaya kanetsv 21:25, 13 June 2009 (UTC)[reply]

See Category:Requests for photographs for about 20. There are some long-standing requests there, including ones for a photo to replace the one at the top of WT:GP. Some may have already been satisfied. Take a look at lie detector, which doesn't really have a suitable image (one with a person hooked up would be good, but hard to get). The template {{rfphoto}} puts things there, though without explanation. DCDuring TALK 23:27, 13 June 2009 (UTC)[reply]
Well yes, but that category doesn't seem widely used, and there are a metric ton of entries without photographs at all. So this was a hope of getting people to think and figure out what other entries should reasonably have photographs. Household objects comes to mind too. --Neskaya kanetsv 00:25, 14 June 2009 (UTC)[reply]
I was just hoping to give the existing system a free ride on your enthusiasm and promotional efforts. Any way we get better visuals is fine with me. See also {{rfdrawing}}, for which the camera won't help. DCDuring TALK 01:16, 14 June 2009 (UTC)[reply]
Most of those things that I can possibly photograph will get photographed. Thank you. I still want more entries and want to force people to actually think about what could use a photo. :D --Neskaya kanetsv 01:41, 14 June 2009 (UTC)[reply]
Your approach is already working. I just added gravity boots and inversion table to the category (likely to be found in Lala Land, I think) based on a query at WT:FEEDBACK. I couldn't find anything at Commons for them. DCDuring TALK 01:58, 14 June 2009 (UTC)[reply]
I don't suppose you could be bothered to add anything that doesn't have a picture at all that you come across that can possibly have a photo? (Foreign language terms should all have photos too.) --Neskaya kanetsv 19:37, 14 June 2009 (UTC)[reply]
I'll be happy to, but nonkilling, carry someone's water, carry water for, patriotism, Caló didn't seem suited. zoot suit already has a vintage picture. From w:Zoot suit, carlango and tramas (pegged pants). Actually each element of a zoot suit would be good: the hat, with feather tapa, tarda; the extra-long watch-chain, the pointy shoes calcos would be great. All conveniently located (I hope) in some retro shop window in East LA. DCDuring TALK 20:11, 14 June 2009 (UTC)[reply]
Speaking of variously Hispanic items, though not ones on east LA (as East LA is going to take a bit more coordination -- not exactly somewhere someone without a car should be going on their own by public transit, and definitely not somewhere someone should be going with an expensive camera), I'm going to be going to Olvera St. at some point. Anything that you can think of there? It has got a bunch of little marketplace stalls with ethnic/cultural stuff and lots of touristy people there already so I shouldn't have too much trouble taking pictures of stuff -- if I know what I'm looking for. --Neskaya kanetsv 18:21, 15 June 2009 (UTC)[reply]
Looks like fun. w:List of Chicano Caló words and expressions didn't have any specific nouns that gave me any ideas. Produce, toys, and clothing come to mind, especially if you can get Calo or other Spanish dialect or Spanglish terms from the merchants. DCDuring TALK 18:58, 15 June 2009 (UTC)[reply]

Gutenberg word frequency rankings

While I'm collecting feedback on my new previous/next pages extension (nearbypages) I've boldly taken the liberty to change the format of the Gutenberg word frequency rankings to match it as best I could. If you love the rankings as they were and hate my changes I'll be happy to change it back in a little while. I'll even change the format of nearbypages to match the old rankings format if enough people prefer it.

Please comment here or on User talk:Hippietrail/nearbypages.js. hippietrail 23:13, 14 June 2009 (UTC)[reply]

A colorful little side-project of mine

A while ago (well, last summer actually), I began running the Wiktionary WOTD through a tool called the Transmonster. This generates five-color palettes that I subsequently uploaded to the ColourLovers (link to gallery) website. I'm up to some 200 of them (because I had a fallout between January and May >__>;;). It's not much of a serious thing, but I just thought I might as well let you all know. Circeus 03:26, 15 June 2009 (UTC)[reply]

Dog breed names - capitalization

I've noticed that some believe that all dog breed names should be capitalized. The practice seems to be widespread enough that it seems to be more than just a handful of grammatically ignorant people. Any thoughts / information / consensus on this? And should it be mentioned somewhere? -Oreo Priest talk 16:27, 15 June 2009 (UTC)[reply]

The same question applies to the common (i.e., English, not binomial scientific) name of bird and some other species, fwiw. msh210 16:50, 15 June 2009 (UTC)[reply]
There seems to be wider usage of capitalization for the pet breed names than for the vernacular names of animals. In pet names perhaps it's influenced by the fact that many breeds would be at least partially capitalized as a result of having a place name/ethnonym as part of the name (Rottweiler, English bulldog). So to be orthographically "fair" other breed names are capitalized (Poodle) and then the AKC names Standard Poodle.
Animal and plant vernacular names have an analogous "problem". Some are commonly referred to by their 2-part taxonomic names, of which the genus part is properly capitalized in technical writing. It seems only "fair" to capitalize a vernacular name. But capitalization seems less prevalent.
If we were to declare one form (for each class of names) to be the default standard, allowed alternative forms, and allowed evidence-based departure from the defaults, we would have a reasonable solution. Or they could be "only in Wikipedia" entries by default, perhaps with a translation section. DCDuring TALK 17:52, 15 June 2009 (UTC)[reply]
Ideally, each should be capitalized according to the most frequent usage. But determining this may be a lot of work, so we'd benefit from a default guideline (preferably not a rule or policy).
But I think that the breeders' association standards are prescriptive, and probably don't represent actual usage. Wouldn't one write “he had a poodle,” possibly adding an adjective indicating the dog's size (“he had a toy, miniature, or standard poodle”), just like “he had a dog?” Both “he had a Dog” and “he had a Poodle” look weird.
Or does capitalization differ between naming a breed and referring to individual animals? “The Poodle is a noble breed” vs. “poodles are noble animals?” Whatever we choose, I'd rather it be based on real usage than some organization's style guide. Michael Z. 2009-06-15 22:24 z
See [15] and the subarticles, for example. And I did a quick define:mastiff on Google and got predominantly (but not exclusively) sites that used the capital in all cases. -Oreo Priest talk 23:57, 15 June 2009 (UTC)[reply]
None of those is a good usage example. Look for mastiff where it appears in running text, not in a title or heading.[16] Also search americancorpus.org.[17] Michael Z. 2009-06-16 00:58 z


Depending on the AKC is like asking Kimberly-Clark whether we should capitalize "kleenex". I'd bet that most actual usage does not capitalize dog breed names much. But our attestation depends on print sources, which seem to capitalize them a lot. If we would like to formalize this a bit (ie guidelines), then we should sample a few types of breed names (and vernacular plant and animal names to get as much out of this as we can) and see which way it actually goes. Using COCA and BNC at BYU we can fairly conveniently do this.
A preliminary look would suggest that overwhelmingly the lower case wins for some well-known dog breeds. Even in the case of Rottweiler, lower case rottweiler represented 40% of usage (n=96, 4 indeterminate). Only "standard poodle" appeared (n=14); not Standard Poodle. Lower-case "dachshund" 94 vs 1 Dachshund (n=95, 1 indeterminate). German shepherd predominated with 88 vs 11 for German Shepherd (n=99, 1 indeterminate).
Also in Category:Dogs there are more breed names that are lower case or mixed upper and lower case. The capitalized parts are often ethnonyms or toponyms. Hardly any breeds with compound names had the full complement of capitalization combinations.
Nevertheless, I believe that some of the folks motivated to enter the breed names will prefer to capitalize them and we would have no trouble finding a sufficient number of citations for all "recognized" (by AKC et al) breeds and other breeds as well. I would be happy if all of the capitalized names were "only in Wikipedia", but I don't think many others like that approach in any of the other proposed applications. DCDuring TALK 00:59, 16 June 2009 (UTC)[reply]
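As a rough check on the corpus counts quoted above, the lower-case share for each breed name can be computed from the reported figures. This is a minimal sketch: only the pairs with explicit counts in this thread are used (the Rottweiler figure was given only as a percentage), indeterminate hits are left out, and the function name is made up for the example.

```python
def lowercase_share(lower, capitalized):
    """Fraction of determinate hits that use the lower-case spelling."""
    return lower / (lower + capitalized)


# (lower-case hits, capitalized hits) as reported in the comment above.
counts = {
    "standard poodle": (14, 0),
    "dachshund": (94, 1),
    "German shepherd": (88, 11),
}

for name, (lower, capitalized) in counts.items():
    share = lowercase_share(lower, capitalized)
    print(f"{name}: {share:.1%} lower case")
```

On these figures the lower-case form accounts for roughly 89% to 100% of determinate hits, which is consistent with the "overwhelmingly lower case" reading above.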
Interesting citation: “Note: For this book, breed names will be capitalized; breed types (and general groupings) will not.”[18]
After looking at about five entries, I'm guessing that our definitions are mostly inadequate, lacking both the senses of “a dog breed” and “a dog of the breed”. When I say “a bichon piddled on the rug,” I don't mean that “a class of toy dogs” piddled, I mean “a dog belonging to the bichon breed” did.
And I shall speculate freely that the one is capitalized much more often than the other. Michael Z. 2009-06-16 01:17 z
I didn't look carefully enough to be sure, but there was not all that much difference. The sample of capitalized instances turned out to be quite small. The Rottweiler case would be the best one to review on COCA.
All of our taxonomic name entries are like that, ignoring the logical need for two senses on the grounds, I think, that it is more or less a fixed rule of grammar that assures us that both senses are possible. We have similar situations in Proper nouns of other types and even in nouns where countability and uncountability can occur in the usage of almost every noun according to predictable rules. I don't know whether we can make that kind of assumption for dog breeds and plant and animal vernacular names. DCDuring TALK 01:34, 16 June 2009 (UTC)[reply]
I can see how that might be standard practice in print dictionaries for native speakers, but perhaps not so perfect in a learner's dictionary. There may be exceptions (I can think of some for ethnonyms, but not dog breeds). In the long run, we may as well be precise about this. Michael Z. 2009-06-16 01:47 z
I think that most of the population of users of taxonomic names can be expected to know (or, at least, learn) the rules, whereas the users of vernacular names and the breed names may include more of the folks who should not be expected to know or learn them. If so, we can focus our efforts to add more precise and refined senses where they will do the most good, without prejudicing our ability to eventually improve even the taxon entries. In any case, EP prefers that we not expend effort on two- and three-part taxonomic names, leaving them for WikiSpecies and Wikipedia. DCDuring TALK 02:20, 16 June 2009 (UTC)[reply]
This duality of meaning (specific member and general class) is a common feature of all English nouns. I can say "Lamps light rooms." In which case I am speaking of lamps in general. Or, I can say "My lamp needs a new bulb." In which case I am speaking of a particular lamp. We do not need to add this duality of meaning to every noun entry in every language on Wiktionary. --EncycloPetey 03:04, 16 June 2009 (UTC)[reply]
Perhaps it's as open and shut as you say, but there seems to me to be a small difference in how different types of nouns are defined and used. A taxon is never defined as an individual; it is defined as a class: a subfamily, genus, etc. It may be used to identify an individual. A countable common noun is almost always (normatively: always) defined as an individual. Lamp is defined as "A device [] ". It may be used in the singular as a class name, but rarely. One could say "The lamp was a great invention", but not so naturally "The lamp is used to light things." when one means "Lamps are used to light things."
Dog breeds are in between these two in usage, I think, which is why we are banging the keyboards about it. DCDuring TALK 03:33, 16 June 2009 (UTC)[reply]
Did I give the impression that I thought that was open and shut? I certainly didn't mean to imply that, and was suggesting quite the opposite. I agree with the points you've just made. Note that a taxon is always defined as a Proper noun denoting a class of members. This is directly connected with the way the nomenclature codes are written. We went through a big discussion on the issue of capitalization in plant names in the Plants Group on Wikipedia some time back. Similar discussions have happened from time to time in the other taxon-specific groups. --EncycloPetey 04:14, 16 June 2009 (UTC)[reply]
I think the need for a separate sense might be strictly governed by usage. How often do we speak of “the lamp”? Maybe some specific treatises on “Thomas Edison's light bulb” generalize this noun. But can't “the poodle” refer to the breed as easily and as often as it refers to a particular beast, warranting two separate senses? The ultimate test is the specific usage for each word, but until someone takes the time to estimate or measure usage for each one, I'd like to encourage defining these separately.
This is also a good application for subsenses, or at least compound definitions like “or an individual of the breed”. Michael Z. 2009-06-16 04:50 z
Please keep in mind that dogs aren't the only things that have breeds or cultivars. Your proposal potentially affects half a million current or future Wiktionary entries, and possibly more. This isn't a decision to be made lightly, and applies to most common names of living organisms. "The cheetah is the fastest animal on land." "The monarch butterfly spends the winter in Mexico." "The geranium is popular as a house plant." etc. --EncycloPetey 05:18, 16 June 2009 (UTC)[reply]

I think that capitalizing an animal name insists on the fact that this name is used with a generic sense (animal, or "generic" animal, belonging to this category of animals, as opposed to other categories), but does not really change the meaning. This is one of the cases where capitalization is possible to express something special without changing the meaning. Other cases are personalized nouns (e.g. Truth), beginning of sentences, book titles, shouting in Internet forums, etc. In such cases, I think that a single entry should be created (e.g. mastiff). There is no reason to create two pages for each plant name and each animal name. We could add many millions of such pages (don't forget that this issue also exists in other languages), but this would not help readers at all, this would only confuse them. When both forms are used, determining which is the most frequent is not relevant: assuming that You is used more often than you should not lead us to create You instead of you, and this is a similar case. Lmaltier 06:46, 16 June 2009 (UTC)[reply]

What I explain is the policy adopted by other dictionaries, and it's a good policy. However, Webster's policy is much too extremist (e.g. they write new jersey pine). When the capitalized form is the normal form (e.g. Newfoundland, Thunnus albacares, New Jersey pine...), this form should be privileged, and the uncapitalized form may be created in addition if it's also in use. Lmaltier 07:14, 16 June 2009 (UTC)[reply]

Context label cleanup

The following context labels can mean different things, and are used differently by some dictionaries. We need to define what we mean by them. Once we make up our minds, I'll draft up some documentation for the template or category pages, or for WT:GLOSS.

(By the way, some of the hide links and 250 links on “what links here” pages seem to be completely broken today.) Michael Z. 2009-06-16 04:32 z

{{slang}}

Over 500 inclusions. The concept of slang is defined in different ways. The Oxford Guide to Practical Lexicography says this label “indicates that the item is non-standard language used by the named group” [emphasis sic], but “in some dictionaries, ‘slang’ is considered a register label, meaning ‘even more informal than very informal’.” (p. 228) Unfortunately, we also have {{cant}} (20 inclusions) and {{jargon}} (12), which mean pretty much the same thing, and WT:GLOSS is no help. I suggest that we merge these three, and define them similarly to Oxford. Michael Z. 2009-06-16 04:32 z

  • To me "slang" is informal language, but "jargon" can be formal. If you work in a formal environment, you wouldn't likely speak to your boss using slang words, but your conversation could well be full of jargon. Here I am using "jargon" to mean "A technical terminology unique to a particular subject." (ety 1 def 1) and "slang" to mean "Vernacular language outside of conventional usage." (1st synonym + def 1). I do not support merging these two. Thryduulf 22:51, 16 June 2009 (UTC)[reply]
Well, of the three English terms which carry {jargon}, I don't think any falls under your definition of jargon (although I admit I'm not clear on what the label represents in the two which aren't prison slang).
Do we currently need this usage label at all? To represent technical terminology particular to a subject, I would simply apply the subject label, like medicine. This is clearer and supplies more information than just jargon. Michael Z. 2009-06-16 23:42 z
"Jargon", as used here, seemed to be an expression of a negative attitude by the tagger toward the entry. "Buzzword" (now gone) was used the same way. "Cant" has some particularly linguistic meaning, but I don't think it has any use in the portions of a general dictionary that are supposed to be for normal users. I think it could often be replaced in its use here by "obsolete|_|slang". Do we have {{argot}} too? In all of this group "slang" seems like the keeper. DCDuring TALK 00:53, 17 June 2009 (UTC)[reply]

{{vulgar slang}}

20 inclusions. {{vulgar}} and {{slang}} represent two different things, and this template looks like “sum of parts”. I'd like to replace it with {{vulgar|slang}}. Michael Z. 2009-06-16 04:32 z

Seems very sensible to me. DCDuring TALK 00:43, 17 June 2009 (UTC)[reply]
Orphaned by replacing all 18 uses, then found that it had been discussed without conclusion in 2006 and 2007. Anyway, if anyone objects, they can still have a crack at WT:RFDO#Template:vulgar slang. Michael Z. 2009-06-25 03:19 z

{{offensive}}

Over 150 transclusions. Does this mean the term expresses an attitude towards the referent, like {{endearing}}, {{pejorative}}, {{ethnic slur}}, or simply risks offending a reader or listener, like {{vulgar}}? Let's pick one, or let's decide to retire this vague wording, and I'll get to work on the resolution. I think every instance can be safely replaced with {{pejorative}}, {{vulgar}}, or both. Michael Z. 2009-06-16 04:32 z

"Offensive" is a hypernym of "ethnic slur" and a hyponym of "pejorative". To me the term "ethnic slur" is the one of no clear value. (I don't get "endearing" either.) I did not think that it was intended as a synonym or a hypernym for "vulgar". There are terms that the user does not view as pejorative, that no one views as vulgar, but are nevertheless taken by auditors or readers as offensive. I'd be happy to provide examples. DCDuring TALK 00:41, 17 June 2009 (UTC)[reply]
Now I'm looking at the Oxford book I cited above, which seems to use offensive the other way. These terms fall into two categories (this book considers these both subclasses of register, but other books don't):
  1. attitude or approval of the speaker to the subject: affectionate, endearing, appreciative, approving, disapproving, derogatory, pejorative, insult, strong insult, slur
  2. taboo or vulgar language: rude, offensive, vulgar, taboo
My main problem with offensive is that its nature is not clear—it could be taken as offending, i.e. “insulting”. But we could continue using it as long as we agree on the meaning of the label.
ethnic slur is part definition and part usage—I think “ethnic” ought to be evident from the definition, and a label like pejorative or insult sufficient to describe the usage. racist and sexist are similar labels also used in some dictionaries. Michael Z. 2009-06-17 01:42 z
I'm not wedded to which terms are used and would prefer that we be consistent with user expectations. Other dictionaries have done some research on and have helped shape users' expectations. Their views should be accorded some weight, especially in the absence of any usability research budget here. I still think there is value in distinguishing personal insults from other pejorative terms. If offensive and vulgar were combined and labeled vulgar or offensive, that would be fine. But I believe that some of the terms labeled as offensive are in fact more properly considered "insulting". So, three terms seem important to me: offensive/vulgar, insults directed at people, non-personal pejoratives ("rust-bucket", "jingoism"). Many of these need to be refined via usage notes. If additional tags would significantly reduce the need for usage notes, they should be considered.
I think we should take a little time to make sure that all the items tagged with labels that are to be removed are properly tagged as "vulgar/offensive" or "insulting" if they are. "Pejoratives" are less important. DCDuring TALK 02:23, 17 June 2009 (UTC)[reply]

Over 400 transclusions. In most dictionaries this is a regional label, indicating that the scope is too complicated to express within the constraints of print. Example at User:Mzajac/Dialect labels#Dialectal. Let's define it as such, and resolve to substitute detailed regional labels whenever the information is available. Michael Z. 2009-06-16 04:32 z

I changed the category description. Please review and improve. Michael Z. 2009-06-25 12:58 z

Over 125 transclusions. Looks like it's the same as dialectal. Merge? Michael Z. 2009-06-16 04:32 z

  • I would think of a word marked "regional" as a word that was used in two or more dialects in a similar geographical location, for example a word used in the dialects of Cornwall, Devon, Somerset and Bristol. I agree though that wherever possible we should have more detail than just either of these labels, so I support this merger. Thryduulf 22:51, 16 June 2009 (UTC)[reply]
    I start looking at this from the practical point of view of merging, and see that the meanings overlap a great deal, but regional and dialectal are not identical. E.g. bunny hug is a regionalism from Saskatchewan, where people speak the same basic dialect of Canadian English found from Vancouver through Toronto. I'll hold off on changing this one. Michael Z. 2009-06-23 12:32 z

Invitation to Kosovo for Wiktionary

Hi guys, I would like to invite you to Kosovo for our software conference. It includes topics on Wikimedia and Wiktionary. I have been recruiting people. Please come; speakers might get sponsored, so get your talks submitted.

mike http://www.kosovasoftwarefreedom.org/

Refactored WT:PREFS

After years of procrastinating I've finally begun to refactor WT:PREFS. Please check that nothing has changed. Refresh your caches (control F5 etc).

You shouldn't see any difference at all. The code has been rearranged to make it more modular, easier to add to, easier to maintain and improve.

If anything doesn't behave as before please let me know. If something is drastically broken feel free to revert.

In MediaWiki:Common.js I have made this change:

before: importScript('User:Connel_MacKenzie/custom.js');
now: importScript('User:Hippietrail/custom.js');

If you'd like to look at the code: User:Hippietrail/custom.js

hippietrail 12:08, 17 June 2009 (UTC)[reply]

  • I've now added partial support for disabled (greyed out) items.
    Some features are disabled due to certain problems such as broken servers, one feature of mine only works with JavaScript 1.7 or better.
    Next step is to disable/enable all options when the master switch is toggled. — hippietrail 00:51, 18 June 2009 (UTC)[reply]
  • All controls are now greyed out or enabled as the master switch is toggled. Let me know if there are any problems. I've tested it with all major Windows browsers. — hippietrail 04:09, 19 June 2009 (UTC)[reply]
    Awesome. It improves the control 100%, because I always assumed that you'd have to save settings to change its state. Now the greying out directly reflects whether it is in effect.
    But there is still a weird disconnect in having a non-modal control to change the entire activate state, but clicking a link to refresh the page to change individual behaviours. Michael Z. 2009-06-19 04:15 z
    Yes that's next. I've always hated it not having "OK" and "Cancel" buttons but Connel, who originally implemented it, insisted it was impossible. That was what I really wanted to fix but the code was pretty crufty so first I wanted to refactor it making sure I didn't break any features.
    It's somewhat complicated by the fact that it works with two ways of storing and retrieving preference settings from the browser cookies. It seems the following options have all been broken for some time but I don't know if anyone has complained: WiktionaryPreferencesTimeUTC, WiktionaryPreferencesTickClock, WiktionaryPreferencesShowNav, WiktionaryDisableAutoRedirect
    If I can be sure nobody uses those options I would be glad to remove them and decomplexify the code.
    With any luck I'll have OK/Cancel code tonight Sydney time. — hippietrail 05:26, 19 June 2009 (UTC)[reply]
No, those preferences work for me using the current WT:PREFS, ShowNav is a particularly "used" one (maybe I gave you bad information last night, sorry). Conrad.Irwin 09:30, 19 June 2009 (UTC)[reply]
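The greying-out behaviour described earlier in this thread can be illustrated with a small sketch. This is not the code in User:Hippietrail/custom.js; the function name `applyMasterSwitch` and the `{ name, unavailable }` data shape are assumptions made purely for illustration.

```javascript
// Hypothetical data shape: each control is { name, unavailable }.
// `unavailable` marks a feature that cannot work right now (for example
// because of a broken server), so it stays greyed out regardless of the
// master switch.
function applyMasterSwitch(controls, masterOn) {
  return controls.map(control => ({
    name: control.name,
    // A control is greyed out when the master switch is off,
    // or when the feature itself is currently unavailable.
    disabled: !masterOn || control.unavailable === true
  }));
}
```

In a browser, the computed `disabled` value would presumably be copied onto each form element, so that the page reflects the master switch immediately without saving or reloading.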

Conjugated verb phrases

Are there guidelines about conjugated verb phrases (e.g. wastes time)? If they are allowed, why not creating pages such as appelé un chat un chat, appelant un chat un chat, appelât un chat un chat, etc. (from appeler un chat un chat)? I cannot imagine that they might be added. It would be ridiculous, and 100% useless. Lmaltier 14:04, 17 June 2009 (UTC)[reply]

I would think that the default should be to not inflect idioms. (Is waste time really an idiom?) If there were something unusual about the inflection, perhaps, but I can't think of an example. DCDuring TALK 15:06, 17 June 2009 (UTC)[reply]
How about possessives like eat one's hat? Of which there certainly are examples in French: bête comme ses deux pieds. Personally I've often added a note about agreement (cf. attacher sa tuque). Circeus 11:24, 19 June 2009 (UTC)[reply]

Pronominal verbs

(separated from the above topic)

There could be something to it... things like m'appele and t'appeles and s'appele can be confusing. But I do think there's more productive things to do...there's some more French one-word verbs to conjugate, if you like. --Jackofclubs 15:40, 17 June 2009 (UTC)[reply]
It would be m'appelle, t'appelles, s'appelle, etc. Creating them might be considered, it's not ridiculous. But this is another issue (s'appeler is not a verb phrase, it's a pronominal verb). Lmaltier 16:33, 17 June 2009 (UTC)[reply]
… though [[Wiktionary:About French]] currently says that not even [[s'appeler]] should exist. —RuakhTALK 16:41, 17 June 2009 (UTC)[reply]
I propose we change that, admittedly I did start s'appeler on the simple enough premise that it's not sum of parts (s'appeler could literally mean call each other (by phone, I mean)). Mglovesfun 17:36, 17 June 2009 (UTC)[reply]
Yeah, I think I agree. Most dictionaries do not have separate entries for idioms, but rather list them under the most salient word; such dictionaries, unsurprisingly, cover "s'appeler" at their entry for "appeler". We, however, put idioms on their own page (which doesn't work quite so well, but probably won't change any time soon), so yeah, the consistent thing for us to do would be to put "s'appeler" on its own page as well. But right now we cover it both at [[appeler]] and at [[s'appeler]], which is not so good. —RuakhTALK 21:42, 17 June 2009 (UTC)[reply]
The basic argument we came up with on fr.wikt (with virtually no opposition) was that se + infinitive entries are acceptable if not sum of parts. So se laver is sum of parts, because it's just to wash oneself, but se passer isn't because it means to happen (in fact I've been meaning to add se passer and se produire to fr.wikt for a while now). Mglovesfun 21:53, 17 June 2009 (UTC)[reply]
FWIW I could have sworn the absolutive meaning of the reflexive (which IMHO covers se produire) was considered a grammatical feature and did not usually warrant special definition? Circeus 03:13, 18 June 2009 (UTC)[reply]
Source? Mglovesfun 04:45, 18 June 2009 (UTC)[reply]
All dictionaries I have tried have special definitions for se produire, these definitions are needed. And it's the case for most pronominal verbs in French. Lmaltier 17:39, 18 June 2009 (UTC)[reply]
I'm not disagreeing as to whether definitions are needed. There are clearly cases where they should be given. What I disagree about is whether a separate se produire page is warranted. What I've taken to doing is creating one for idioms in the reflexive, but not the verbs themselves. Though I wouldn't be against redirects from such entries. As a side note, I'm really not too keen on "sub entries" of the type found at fr:. IMHO either these are definitions with labels like reflexive and pronominal, or they are different entries. Of course, I am opposed to conjugated reflexive entries just as I am for idioms, if only because it feels silly to have an entry for "m'appelle" in the first person singular, but not the third... Circeus 11:35, 19 June 2009 (UTC)[reply]
When something warrants a definition, this "something" also warrants a page. Don't you agree? But providing info about the pronominal verb in the page of the simple verb is also needed (at least a soft redirect, or more...) I agree with you about conjugated entries such as m'appelle, it's similar to mother's or l'eau. Lmaltier 16:56, 19 June 2009 (UTC)[reply]
(Let's start from the left again) Where's the appropriate place to have a vote on this? I can think of some 'good' changes I can make, but I don't want to do them now just to have someone revert them all. Mglovesfun 04:48, 18 June 2009 (UTC)[reply]
I propose that we continue this discussion here. Mglovesfun 13:29, 19 June 2009 (UTC)[reply]

Another idiom

Please help the Wikipedia editors with opinions as to whether this would be an idiom that we take. Uncle G 19:54, 17 June 2009 (UTC)[reply]

Done. DCDuring TALK 21:02, 17 June 2009 (UTC)[reply]

Typical collocations

The German Wiktionary has a header “Charakteristische Wortkombinationen”, “typical word combinations”. Do we have something similar? I think we should. For example, in , it says this is typically used with the verb versetzen: in ~ versetzen. H. (talk) 08:46, 18 June 2009 (UTC)[reply]

We have "Derived terms" but only for combinations that meet CFI. For combinations that don't, we include example sentences or information under "Usage notes". --EncycloPetey 14:59, 18 June 2009 (UTC)[reply]
Unfortunately any combination that involves the headword properly bolded in a usage example will not be found by the search engine. Your suggestion would help overcome the deficiency.
We also sometimes have restrictions on collocations in context tags or in the sense line. DCDuring TALK 14:20, 19 June 2009 (UTC)[reply]
This comes up all the time, and our guidelines' failure to address this seems contrary to our mandate and to the instincts of many editors. Important collocations are often listed at RfD with the (warranted) justification that they are merely sum-of-parts, but many editors claim that they belong in the dictionary as “set phrases”—of course this is wrong, as being a set phrase is not a CFI.
Our ELE also doesn't give us any reasonable way of including unlinked collocations in entries. These are not simply derived terms so their significance is lost if they are piled in there, and they are likely to be removed if they don't have an entry. The best we can do is to persistently shoehorn them into Usage notes and see if some conventional format arises.
This is a feature which is very important in dictionaries for language learners. We need to resolve to address such needs, and we need a hard-working editor to introduce this and other such dictionary features, and we need to support her or him. Michael Z. 2009-06-20 14:31 z

See also #Collocations, below. Michael Z. 2009-06-26 04:27 z

Broken web bug on Wiktionary??

I've been having intermittent problems with Wiktionary for the past few days.

  • Often page loading completes but only a totally blank page is shown.
  • Sometimes an alert appears saying "This doesn't look like a Wiktionary page. No can do."
  • Both problems have occurred on various browsers and various machines.

My hunch is that they are a side effect of a badly programmed hit counter or tracker web bug of some kind that only activates randomly once out of every so many hits. Does anyone have any idea what it is, how to fix or remove it, or if the problems even share the same cause? — hippietrail 09:04, 18 June 2009 (UTC)[reply]

I haven't had any such problems myself. My bot has had difficulties, but that was from a server that had gone rogue and should be fixed now. --EncycloPetey 14:57, 18 June 2009 (UTC)[reply]
Is this while logged in, while logged out, or both? —RuakhTALK 15:00, 18 June 2009 (UTC)[reply]
I'm getting it right now while logged in using Google Chrome on a work computer and submitting an edit for the page miga. The Chrome debugger tells me there were two Google JavaScript files included. I'm pretty sure the devs don't like that kinda thing. But neither script included the error message.
So I think the Google web bug thingies are causing a conflict in some other piece of JavaScript. It might even be one of my own old scripts but a search in Wiktionary and a search in Google both find no hint of the error message... — hippietrail 02:49, 20 June 2009 (UTC)[reply]
  • Found it. One of Conrad's older js extensions was clashing with one of my newer extensions. The Google bugs were a red herring I think. The error message itself came from Conrad's parser.js aka "paper view". But where are the Google bugs coming from? — hippietrail 02:37, 21 June 2009 (UTC)[reply]

New toy to play with

In my copious free time this winter I have crafted a new toy for you all to play with.

In WT:PREFS, turn on "For each language section add interwiki and random links." and tell me if you like it. — hippietrail 11:26, 19 June 2009 (UTC)[reply]

I haven’t been able to find it. I searched for "for each language" but it was nowhere to be found in the PREFS. What does it do, anyway? —Stephen 11:50, 19 June 2009 (UTC)[reply]
Oh sorry you will have to refresh your browser cache, control+F5 on most browsers. It will be the last item under the heading "Experiments – these are likely to be buggy and may not work in very common browsers." — hippietrail 00:10, 20 June 2009 (UTC)[reply]
Cleared my cache in Safari (cmd-opt-E) and forced reload (shift-Reload) several times on WT:PREFS, but I don't see this. I don't usually have caching problems at all. Is it still installed? Michael Z. 2009-06-20 14:40 z
Ditto. What's more, I don't see anything like it in User:Connel MacKenzie/custom.js or User:Hippietrail/custom.js. —RuakhTALK 14:54, 20 June 2009 (UTC)[reply]
Apologies again. I neglected to copy my development version onto the public version after testing it on all browsers. It should work if you refresh your caches this time. — hippietrail 02:38, 21 June 2009 (UTC)[reply]

Main Page redesign saga: Part 3

Wiktionary:Main_Page/2009_redesign

Okay, discussion had slowed down and I sort of got sidetracked for a while. I implemented as many suggestions as I could in the proposed redesign and would like final input before I start nagging the people over at commons for icon retracings and stuff. Circeus 11:28, 19 June 2009 (UTC)[reply]

Apparently this isn't an official vote yet, but a discussion to get us started. Mglovesfun 13:31, 19 June 2009 (UTC)[reply]

Announcing this June's Solstice Competition. Its open and close dates are not yet set, so as to allow editors to amend the rules. msh210 22:39, 19 June 2009 (UTC)[reply]

The competition has begun. msh210 20:51, 22 June 2009 (UTC)[reply]

This template has a lang parameter but it is not used. Would you support creating a category Category:English compounds and the appropriate FL counterparts by using this template? Compound words using this template would then be automatically added to the category. --Panda10 22:10, 20 June 2009 (UTC)[reply]

The word "compound" is ambiguous. The relevant category already exists as Category:English compound words. --EncycloPetey 22:15, 20 June 2009 (UTC)[reply]
Ok, great, then Category:English compound words and the appropriate FL categories. The question remains: would it make sense to add this new functionality to the template? --Panda10 22:26, 20 June 2009 (UTC)[reply]
Hard to see a downside to this. If we are going to refine the category or even maintain it, much of the activity would be by language. I noticed at least one instance of what looked to me like a single character having the template, so the interpretation of this will definitely be by script and language.
It seems as if the template is used in nearly a thousand entries, many (most?) of which are not English, but at least a couple of hundred are English. There are at present 12 in Category:English compound words. I wonder how many other languages have "compound word categories", whether the naming is consistent, and how many entries are so categorized. One preparatory step would be to get the compound template onto the items in those categories, assuming that they are properly categorized. I assume that most English use of the template have not inserted the lang= parameter. Is that also true for the other languages? Can we automate or accelerate the insertion of the lang=? Autoformat? DCDuring TALK 23:33, 20 June 2009 (UTC)[reply]
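As a rough illustration of the automation asked about above, the sketch below appends a lang= parameter to calls of the compound template that lack one. It assumes the template is written literally as `{{compound|...}}` and that the language code to insert is already known for the section being edited; `addLangParam` is a hypothetical name, not an existing bot or AutoFormat function. Calls containing nested templates are deliberately left untouched.

```javascript
// Hypothetical helper: add "|lang=<code>" to {{compound|...}} calls that
// do not already carry a lang= parameter.
function addLangParam(wikitext, langCode) {
  // [^{}]* keeps the match inside one simple template call; calls with
  // nested templates do not match and are left exactly as they were.
  return wikitext.replace(/\{\{compound\|([^{}]*)\}\}/g, (match, params) => {
    const hasLang = params.split('|').some(p => /^\s*lang\s*=/.test(p));
    return hasLang ? match : '{{compound|' + params + '|lang=' + langCode + '}}';
  });
}
```

Any real run would still need a human or a per-section language check to choose the right code, since many of the transclusions are not English.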

ijs and ijs

I just discovered these two pages: ijs and ijs, the first one with ij, the second one with ij. The contents treat the same word. What is the correct form? Or do we need both? --MaEr 18:20, 21 June 2009 (UTC)[reply]

Given this is not (AFAIK) even an alternative spelling (purely a typographical issue, like the use of æ/ae in English or œ/oe in French), I say it's better to pick only one (is it a letter, a digraph or a ligature anyway??) of the version and redirect the others (either directly or as an {{alternative form of}}). Circeus
Custom in the past has been to avoid using "virtual" ligature combinations in any page name. They're neither easy to type nor easy to recognize for what they are. This applies also to the dž digraph in South Slavic languages. --EncycloPetey 19:23, 21 June 2009 (UTC)[reply]
There are a couple of "forcibly ligatured" words finding their way into the English section as well, fisherwomen and firſt. It seems to me we should delete the version with ligatures; good fonts will add the tasteful ligatures back anyway (though mine seems to only put one between "r" and "ſ") and it just leads to confusion when firſt and firſt and firſt and firſt are all distinct. (Not to mention first and first of course). I'd settle for leaving redirects behind if people want to create these things, but I see no reason to create them or keep them. Conrad.Irwin 20:02, 21 June 2009 (UTC)[reply]
Dutch ij is considered a digraph. In alphabetizing, words with ij are in general found under i not y. Capitalized it does become IJ not Ij. Otherwise I do not think that it needs a special symbol, although there is a bit of a problem if stress marks are added. This is optional in Dutch, but the spelling is regulated. The proper spelling has an acute on both i and j which I do not know how to do. Even the above digraph symbol produces ij́, which is not correct. Jcwf 22:11, 21 June 2009 (UTC)[reply]
Obviously, then ij needs heavy refactoring because it refers only to a specific/incidental typesetting of ij, the digraph, which we currently have no entry for, and would take most of the content. ij should have a translingual section too, while ij should not. Circeus 23:06, 23 June 2009 (UTC)[reply]

Citations

The current draft proposal at Wiktionary:Citations apparently deals only with English, and some of the ill-designed templates it suggests to use (like {{citation}}) are based on that assumption. Given the ever-growing application of Citations: namespace format (>2k pages) as laid out in that proposal, I think it's high time its deficiencies be discussed before more damage is done by its usage.

First of all, the template {{citation}} is broken and should be terminated. It creates an L2 section and that behavior is to be avoided by the unwritten common template practice (unnecessarily complicates already complex parsing of wiki-code). Its purpose is to list variant spellings whose usage Citations page should illustrate, but these are already listed at the corresponding mainspace page(s) in the ==Alternative spellings/forms== section, which is one click away.

The suggested practice is to put ====Quotations==== as L4 section, which doesn't make sense if all that it contains is a soft redirect to the corresponding Citations page by means of {{seeCites}} template. This makes it needlessly duplicate for every PoS of every etymology, as one can see at the exemplary entry hinder.

Now, the obvious thing to do would be to follow the same formatting scheme in the citationspace as in the mainspace, i.e. L2 section names separated by ----, each explicitly categorizing in [[Category:Xxx citations]]. (Category:Citations was apparently deleted recently, I've created Category:Citations by language which seems more appropriate). That way individual languages can be linked to by means of lang= parameter of {{seeCites}}. Within the L2s, senses should all (regardless of etymology/PoS) be listed as L3s in a sequence they appear in the corresponding mainspace entry. As a first L3, perhaps a duplicate of ===Alternative spellings/forms=== should be made, to let the readers know which spellings are being grouped.

Thoughts? --Ivan Štambuk 22:24, 21 June 2009 (UTC)[reply]

If {{seeCites}} is used to categorize in cat:French citations, then the entry, not the citations page, will be categorized, which is not, I think, the desired effect. I'm not sure why citations need to be categorized by language at all, but, if they are to be, then doing so in {{citation}}, not seeCites, would seem to be the way to go. (What would be the purpose of such categorization?) msh210 23:06, 21 June 2009 (UTC)[reply]
We need the citations page to have a link to each form that it cites, if at all possible, so that those checking whatlinkshere for a page prior to deleting it (as all admins do, of course) will know whether the word has been cited. That is currently accomplished by {{citation}}, but poorly: the template displays only up to some low number (four IIRC) of terms, and should be fixed if its use for this is to continue. msh210 23:06, 21 June 2009 (UTC)[reply]
I, for one, have no opinion as to the order of language, POS, or etymology sections in the citations namespace, or of what level the headers should be or whether they should be template-generated: I don't care. msh210 23:06, 21 June 2009 (UTC)[reply]
Well, I wasn't suggesting that {{seeCites}} be used for categorization in the first place. Categories would be added manually in the Citation: namespace L2 section, e.g. [[Category:French citations|entry]]. The purpose of such categorizations scheme would be obviously to categorize all the citations on a per-language basis, so that the interested editors could maintain them. So far there is no way to list all the citations pages for a particular language. Also, sort key must be mandatory, or else all the citations would be sorted under "C".
For doubtful entries (variant spellings or such) that need citations other than in the corresponding Citations:{{PAGENAME}} page, creators should create the ===Quotations=== section with {{seeCites}} linking to the appropriate Citations page (by using the first unnamed parameter). It's the burden of the entry creator to provide the evidence that the doubtful word or variant spelling exists. Also, as I said, methinks that that kind of behavior, if needed to be implemented at all, should be accomplished by means of a L3 section ===Alternative spellings/forms=== which should wikilink to all the entries for which citations are being provided, not by using the awkward {{citation}} template. --Ivan Štambuk 08:28, 22 June 2009 (UTC)[reply]
One use I have made of Citations namespace is for the removal of quotations that are arguably not valid for attestation or for which it is not clear what sense they might support. I have sometimes been using headers to create bins for sorting such quotes. It seems clear that I should use context tags for the quotations that have problems so as not to interfere with desirable permanent structuring of these pages. Please bring such pages to my attention if you notice them. I will undertake to make some proposals about context tags to mark attestation issues in the near future. Does anyone remember prior discussion of this?
One aspect of quotation sorting is the sorting by alternative forms and spellings. This is usually just a temporary thing. Arguably, at the close of an RfV, the quotations not involving the headword and its inflected forms should be moved to the page of citation space that exactly corresponds to the spelling in the quote. If the alternative form or spelling is being used to support attestation, I suppose it ought to be linked in the citation space.
This makes me wonder what functionality we really need from the citations template.
I regularly observe problems in connecting individual citations (especially the uplifting or humorous literary ones) with individual senses of the headword, sometimes even the part of speech. (See Citations:lagging for today's example.) Any structure for citations pages needs to preserve our ability to accommodate such ambiguous citations. DCDuring TALK 17:02, 23 June 2009 (UTC)[reply]
Initially my idea for Citations pages was to have all inflections and certain other forms of a word on the same citations page. I've argued this at length. For instance, it makes no sense to separate capitalizations, since it would be impossible to determine for a lowercase word that begins a sentence, nor hyphenation, if a word is split on two lines. But groupthink and the introduction of the citations tab steamrollered right over this. DAVilla 06:48, 24 June 2009 (UTC)[reply]

Suggested enhancement to search logic for terms containing possessive pronouns

Suppose the search logic for terms containing possessive pronouns is enhanced, so that whenever a user enters such a term (e.g., off his rocker or off your rocker), if the initial search fails, the code substitutes one's for the possessive pronoun and repeats the search? If the second search gets a hit, the user is auto-redirected to the version of the term containing one's (off one's rocker). Is this a good idea? If so, is it doable? -- WikiPedant 05:38, 22 June 2009 (UTC)[reply]
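The proposed fallback can be sketched as follows. This is only an illustration of the idea, not existing Wiktionary or MediaWiki code: the names `toOnesForm` and `resolveTitle`, the pronoun list, and the `pageExists` callback are all assumptions. Note that a word such as "her" is not always possessive, so a real implementation would produce some false candidates; they are harmless as long as the second lookup simply fails.

```javascript
// Hypothetical helper, not part of MediaWiki or any existing script.
// Possessive forms that entry titles conventionally generalise to "one's".
const POSSESSIVES = ['my', 'your', 'his', 'her', 'its', 'our', 'their'];

// Return the title with each possessive replaced by "one's",
// or null when nothing was replaced (no new title to try).
function toOnesForm(title) {
  const pattern = new RegExp('\\b(' + POSSESSIVES.join('|') + ')\\b', 'gi');
  const candidate = title.replace(pattern, "one's");
  return candidate === title ? null : candidate;
}

// `pageExists` is an assumed callback (title -> boolean) standing in for
// whatever existence check is actually available.
function resolveTitle(title, pageExists) {
  if (pageExists(title)) return title;          // first search succeeded
  const candidate = toOnesForm(title);
  if (candidate !== null && pageExists(candidate)) return candidate;
  return null;                                  // both searches failed
}
```

For example, `toOnesForm('off his rocker')` yields `off one's rocker`, which is the title the user would be redirected to if that page exists.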

Currently the only supported auto-redirects are ones where server-side MediaWiki markup generates the bluelinks and JavaScript handles the redirection. Thus, it's basically limited to what combinations of uc:, lc:, ucfirst:, and lcfirst: can accomplish. (Though for entries with capital letters in the middle, we sometimes augment this with manual redirects from the all-lowercase form.) If we want to support other kinds of redirects — and that would be really nice — we need either (1) to write JavaScript code that generates a list of permutations to try and then queries the API to see which of them are bluelinks, or (2) to create an external page (possibly on the toolserver) that contains a bag of bluelinked page-titles, implements this logic, and generates appropriately-redirecting JavaScript. The latter is more flexible algorithmically (it wouldn't need to generate all possible relevant bluelinks; for example, without getting too technical, it could normalize the page-titles, e.g. storing [[burst somebody's bubble]] under "burst one's bubble", so when someone looks up "burst his bubble", it would check its index for "burst one's bubble" and find [[burst somebody's bubble]]), but has major downsides (it would probably be editable only by whoever's hosting it; we'd have to trust that person not to steal our passwords, or to use our passwords for good and not for evil :-P  ; and its index wouldn't be up-to-the-second). The former approach is much more limited (how many permutations can we try? how much language-specific code can we build in?), but is clearly safer, and would still be much more comprehensive than what we've currently got. If nothing else, I'd love it if it could remove Hebrew diacritics (vowels, chanting notations, etc.), replace fullwidth English characters with normal ones, change the I-J ligature to a normal "ij", and so on. —RuakhTALK 00:21, 24 June 2009 (UTC)[reply]
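The character-level clean-ups wished for at the end of the comment above could look roughly like this. It is a sketch under stated assumptions: `normalizeTitle` is a hypothetical name, and the Hebrew range used here covers the combining marks in the Unicode Hebrew block (cantillation marks and vowel points) while leaving punctuation such as maqaf alone. Checking which resulting titles are bluelinks would still need a separate existence query, which is not shown.

```javascript
// Hypothetical helper: produce a simplified lookup title.
function normalizeTitle(title) {
  return title
    // Strip Hebrew cantillation marks and vowel points (combining marks only).
    .replace(/[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]/g, '')
    // Map fullwidth ASCII variants (U+FF01..U+FF5E) to their normal forms,
    // which sit exactly 0xFEE0 code points lower.
    .replace(/[\uFF01-\uFF5E]/g, ch => String.fromCharCode(ch.charCodeAt(0) - 0xFEE0))
    // Replace the single-character IJ ligatures with two plain letters.
    .replace(/\u0132/g, 'IJ')
    .replace(/\u0133/g, 'ij');
}
```

A redirect script could try the original title first and the normalized title second, in the same spirit as the existing case-based redirects.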

Excessive cognates revisited

Straw poll: (1) Do you think this is too many cognates to list for an Old English entry? (2) Should modern languages be included in a list of such cognates? Note that these questions are specific to Old English entries here. --EncycloPetey 02:34, 23 June 2009 (UTC)[reply]

Yes. For a given word, there should never be more than one or two useful cognates, ideally from contemporary languages. Such material is interesting, but best used in appendices. Circeus 03:05, 23 June 2009 (UTC)[reply]
I don’t see any problem with that entry. It’s such a stub that even such an exhaustive list of cognates cannot possibly be regarded as detracting from more useful information or whatever. One thing though: Why is an Old English word listed as a cognate in an Old English entry? –Shouldn’t lang (long) be listed in a Related terms section, rather than in the Etymology section?  (u):Raifʻhār (t):Doremítzwr﴿ 03:16, 23 June 2009 (UTC)[reply]
This IP has been doing that a lot, including adding an etymology section to an Old English word that has nothing more in it than {{etyl|ang}}, which is doubly wrong because it categorizes the word as if it were a modern English word derived from Old English. I've seen improvement in the anon's edits, but have not gotten a response to any comments or edits made. --EncycloPetey 03:22, 23 June 2009 (UTC)[reply]
Yes, that seems like too many. I have always thought that cognates should mostly be found by going to the list of descendants of ancestors. In the case in point one could not do that because the protoGermanic conjectural ancestor is not a permitted entry and there is no alternative home for the cognate terms. Without a suitable home, I suppose we could just have a show/hide for reducing really long lists to a single line. Or could we have a WikiCognate analogous to WikiSaurus? DCDuring TALK 03:28, 23 June 2009 (UTC)[reply]
We have a mechanism for including PIE roots, so I don't see why the same mechanism couldn't be used for proto-Germanic. See dēns for an example of this. --EncycloPetey 03:31, 23 June 2009 (UTC)[reply]
I haven't had much reason to go back that far so this was the first I had seen of it. That would be lovely for the Germanic reconstructed languages. Is that where reconstructed terms from, say, Vulgar Latin, would go, too? Could this contributor be introduced to the attractions of this approach? DCDuring TALK 04:05, 23 June 2009 (UTC)[reply]
I wouldn't put reconstructed terms from documented languages into such a format. PIE and proto-Germanic are reconstructed languages. For recorded languages, this format probably shouldn't be used. --EncycloPetey 14:55, 23 June 2009 (UTC)[reply]
Where do/should they go? DCDuring TALK 16:10, 23 June 2009 (UTC)[reply]
On the page. They look just fine. DAVilla 06:09, 24 June 2009 (UTC)[reply]

(1) I see no problem with adding a bunch of cognates to such short articles. Especially to Old English ones, which, I think, often are accessed through etymology sections of Modern English entries and thus are interesting etymology-wise. (2) Modern language cognates are enclosed in parentheses after their parent Old language, so I don't see any problem with this either. As for making Proto-Appendices like Appendix:Proto-Germanic *dagaz, and moving cognates there, well, that's a reasonable alternative but practice shows people are reluctant to create such appendices. --Vahagn Petrosyan 05:34, 23 June 2009 (UTC)[reply]

I agree completely (with Vahagn Petrosyan). Even for readers who are actually looking up Old English words directly, I'd bet most would be interested in other Germanic cognates. It's not a language like Modern French, that people learn for practical reasons. —RuakhTALK 00:29, 24 June 2009 (UTC)[reply]

Is what is written at dēns meant to avoid the inclusion in the entry of the long tree of terms at Appendix:Proto-Indo-European *h₃dónts#Descendants? If so, I think I get your point…  :-S Cognates are really useful, but I’d shy away from that many lest they swamp the entry.  (u):Raifʻhār (t):Doremítzwr﴿ 14:48, 23 June 2009 (UTC)[reply]

Unorthodox request.

Hello Wiktionary community. I have a confession to make. You see, as suspected but not proved, I am in fact Wonderfool. And Wonderfools have never been keen on serious long-term admin work. So this year, instead of being dangerous and going on a spree, I'll be amical and request desysopping the polite way (I'll delete the main page, of course, but that's all). And I think it would be wonderful if I could remain a Wiktionarian, and be open about my WFness (i.e. don't send me underground so I'm forced into clandestine editing and hopping from IP address to IP address and town to town and continent to continent). This way, I can edit hardcore French stuff, which I haven't done properly for about a year, without worrying about being blocked. And it'd be nice to run User:Keenebot2 again - there are tens of thousands of pages waiting to be rapidly added to this project in my off-wiki files. Anyway, I propose a mini-poll to allow WF to edit here, but without boring adminship duties. If not, then I'll probably see you again in 2010 under a new name. Regards --Jackofclubs 07:00, 24 June 2009 (UTC)[reply]

Great contributor with thousands of quality edits. It's better to have him around with known non-sysop account than waste time looking out for additional sockpuppets. Unless of course we agree that he be indef blocked for betraying the trust of community (whatever that means). --Ivan Štambuk 08:36, 24 June 2009 (UTC)[reply]
Re: "It's better to have him around with known non-sysop account than waste time looking out for additional sockpuppets": This seems to be a false dilemma: just because he has a known non-sysop account, that doesn't mean he'll stop creating additional sockpuppets and getting them sysopped. That said, I'm inclined to agree that we might as well formally let him edit, especially since we can't really stop him anyway. —RuakhTALK 12:03, 24 June 2009 (UTC)[reply]
Keep this very valuable contributor. Especially because he's still got his User:Equinox account with which he can do his sysop-related stuff (and he's good at that too). I say we indulge his kinks and let him delete the main page once in a while: this is a predictable benign prankster who fools no one but contributes greatly to Wiktionary. --Vahagn Petrosyan 08:49, 24 June 2009 (UTC)[reply]
"I saw Goody Vahagn with the Devil!" Equinox 14:19, 25 June 2009 (UTC)[reply]
Know the w:Clan Macdonald of Clanranald motto ;-) ? --Duncan 14:09, 26 June 2009 (UTC)[reply]
This is ridiculous. We're not having a vote on a banning just because the banned user comes here with a sock and starts a vote telling us to do so. We've never voted on such things in the first place, certainly not without discussion. Wonderfool will stay blocked pending a decision otherwise, and not the other way around. Wonderfool's presence is much more toxic than you two are willing to admit. Aside from the nonsense he contributes along with the good, he has contributed to a culture of suspicion among Wiktionarians, as the plight of Equinox (who you so demonstrably attack) shows. Besides which, I'm not sure what in his history leads you to think that this message is to be believed in any way; it seems like a rather transparent ploy to me, considering his personality. Dominic·t 10:12, 24 June 2009 (UTC)[reply]
I think most of the regulars here were from the start pretty confident that Jackofclubs & Equinox are WF's socks. Same edit and writing style, same language interests...the other day I saw Jackofclubs adding some Czech entries, and I remember WF's last sock talking on IRC of having started to learn Czech. I mean, it was pretty much obvious. I agree that it is silly to just let him "get away with this", but methinks it's equally silly to ban him (indefinitely, or for a period), as he's obviously willing to spend a great deal of time adding valuable content to this project. Esp. now that he doesn't have the sysop flag and cannot do any kind of real damage, and esp. after this, when his edits and behavior are going to be much more scrutinized. --Ivan Štambuk 10:35, 24 June 2009 (UTC)[reply]
[Citation needed] Conrad.Irwin 21:18, 24 June 2009 (UTC)[reply]
I don't consider Wonderfool to be toxic. I consider him to be an idiot, but he contributes plenty of good entries and I see little benefit to us in blocking him. Unfortunately, there is little else we CAN do to him and it's obviously kind of silly to let users who delete the main page on a regular basis just keep editing with impunity. Ƿidsiþ 10:21, 24 June 2009 (UTC)[reply]
If we were able to ensure that you only had one account, then we would get some benefit out of this arrangement; as we can't trust you to stick to this, there is only a triumph to you having "won" an account off Wiktionary. I'd be very happy to see you turn over a new leaf, but (for old time's sake) it seems likely that you'll be banned and deleted. (Maybe next time, when you get nominated for admin, you can say "No" beforehand and explain; you'd be in a much stronger negotiating position if you could demonstrate some maturity.) Conrad.Irwin 21:18, 24 June 2009 (UTC)[reply]
I recall that he used to sprinkle his "good entries" with approximately 15% bullshit under the belief that it approximates real life (thinking that university grades = real life). And just because he’s using one sock doesn’t mean he won’t be doing others. If you listen to him, one of these days over half of our active editors will be him. And just wait till he becomes a steward and desysops the remaining non-him editors. He suffers from a mental condition that will not permit him to quietly edit as a reformed Wonderfool. This is just part of his next plan. —Stephen 01:38, 25 June 2009 (UTC)[reply]
If he wants to show that he’s turned over a new leaf, have him submit to the same scrutiny that we have for checkusers. He must provide his real name, address, passport, etc. If he won’t do it, as I suspect he won’t, he has not changed a bit. —Stephen 02:07, 25 June 2009 (UTC)[reply]
Despite all his B.S., subtle (and not-so-subtle) vandalism, and stupidity, I have to admit I have a soft spot for WF; but this strikes me as a very good idea. Starting this discussion was not a show of good faith: good faith would have precluded re-deleting the main page. But submitting his identity to the Foundation would be a genuine show of good faith, IMHO (assuming the identity was truly his). —RuakhTALK 02:30, 25 June 2009 (UTC)[reply]
I agree completely with Stephen, every comment and suggestion. Because of the foolishness, he would get one login so that at least we would know which edits to monitor, with full immunity for past behavior but real-world consequences (even criminal charges) if he continues. I also agree with Conrad that he could have simply turned down adminship, which is why we should not trust him to seek a quiet role as he claims to desire. DAVilla 05:53, 25 June 2009 (UTC)[reply]
We are still cleaning up the mess he created under previous account names. For example, editing as Drago he created thousands of erroneous entries in Hungarian (among other languages), which I still encounter and which users still have to correct on a regular basis. In all this time, has he ever done anything to clean up his mess? No. --EncycloPetey 02:17, 25 June 2009 (UTC)[reply]
Drago was WF?! Dear lord... In that case, he better stay blocked ad infinitum. I also think that Stephen's suggestion of nothing less than providing a real-life identity to the Foundation could be the first step towards gaining trust from the community. --Ivan Štambuk 03:51, 25 June 2009 (UTC)[reply]
Not confirmed, you understand, but I'm sure WF will tell us what he thinks at some point in the future, true or not. He's had so many aliases at this point that no one person here (except maybe Semper) has kept track of all of them. He does regularly delete the main page, deliberately, and regularly tries to become sysop under a new name after each new community booting. --EncycloPetey 03:57, 25 June 2009 (UTC)[reply]
No, Drago was actually Hungarian. If he had stuck to Hungarian entries, he would have been invaluable, but he wanted to do everything else. He had made a lifelong habit of collecting every two-bit bilingual dictionary he could get his hands on under the belief that a dictionary that costs 25 cents was as reliable as one that costs €25. In the end he realized the unfortunate error and abandoned the project. —Stephen 15:01, 25 June 2009 (UTC)[reply]
I'm not Wonderfool; I am Spartacus! Though I'm beginning to feel more like w:Bishop Berkeley. Actually someone with my initials and last name will be at the NYC WikiChapter meeting in July. But will it really be me?
More seriously, I would like to have as many proxies as Wonderfool seems to have in elections, especially for levels beyond admin. DCDuring TALK 17:02, 25 June 2009 (UTC)[reply]

{{chiefly}} > {{mainly}}

These both mean the same thing: that some term or sense is largely restricted to a particular context, but also used outside of it. As far as I know, any dictionary uses one or the other. We use mainly more times, so I propose we (I) replace chiefly with mainly and redirect the one to the other. Objections? Michael Z. 2009-06-25 04:03 z

Yikes. Also {{mostly}} and {{usually}}. Michael Z. 2009-06-25 04:54 z
I think each has its nuance, and since having them all does no harm and allows editors to word context tags as they choose, I’d prefer we kept them all. That said, if I had to choose only one from between them, I’d opt to keep {{chiefly}}.  (u):Raifʻhār (t):Doremítzwr﴿ 11:59, 25 June 2009 (UTC)[reply]
Since these are our own keywords, then we must define their “nuances”. Unless we do, they do harm exactly because editors can use them, variously, as they choose, and readers are forced to guess what they mean. Just having four synonyms implies to the reader that there is a meaningful difference in the way they are used, but there isn't.
Please tell me: what is the difference in usage we express when we say that the usage of meter is “mostly US,” gray is “mainly US”, but grande is “chiefly US.” There is zero chance that any three editors have been using these to indicate the same nuanced difference.
I'd be happy to merge them to chiefly, too. At least mainly and mostly, but perhaps usually indicates frequency rather than majority. Michael Z. 2009-06-25 12:45 z
Actually, yes; thinking about it more clearly, I agree with the proposal to collapse {{mainly}} and {{mostly}} into {{chiefly}} but to retain {{usually}} as a distinct qualifier.  (u):Raifʻhār (t):Doremítzwr﴿ 13:19, 25 June 2009 (UTC)[reply]
I don't see a difference in meaning between {{mainly}} and {{chiefly}}, but I always use {{mainly}}. I suspect usage preference is an inter-Pondian matter, with North Americans tending toward {{mainly}} and Brits toward {{chiefly}}. Since we accept spellings from both sides of the Pond on an equal footing, it seems only fair to keep both templates. -- WikiPedant 16:56, 25 June 2009 (UTC)[reply]
I believe that will be the case, Michael just (rightly) wants some of them to be redirects. --Bequw¢τ 23:03, 25 June 2009 (UTC)[reply]
I'll be skeptical about a North American–British difference until I see some evidence. Even if there is a difference in usage, I betcha there's no lack of understanding of either term in any place. Let's just pick one, the simplest.
It's a bad idea to keep two labels and designate them as the same. Just having them implies a difference, so this will confuse readers.
And to revisit nuance, this is completely unnecessary from that point of view. These labels are arguably not necessary at all, because every single restricted usage is mainly in some particular context, and sometimes or often found outside of it. Absolutely nothing is completely exclusive to some dictionary-label context. Let's not get too uptight about the nuance, and let's not try too hard to confuse both Brits and Americans equally. Let's just K.I.S.S. Michael Z. 2009-06-26 04:39 z

Okay, so there seems to be no objection to merging three of these. I've also found another synonym used in Wiktionary and elsewhere: especially.

The question is, which label do we choose?

The dictionary stats are from Norri 1996,[19] and refer specifically to qualifiers for regional labels. Chiefly is used by dictionaries from both the UK and USA.

I don't really have any strong preference. They all mean the same thing, they're all clear, and three have good precedent in use by dictionaries. Especially has the advantage that we can abbreviate it to esp., if we ever find the need to shorten our label blocks (which sometimes get too long). Michael Z. 2009-06-29 03:05 z

I reckon we keep {{chiefly}}. {{especially}} is problematic — it is not synonymous with {{chiefly}}; consider {{especially|formal}} and {{chiefly|formal}}: whereas the latter means “generally restricted to formal contexts, though sometimes used outside thereof”, the former means something more like “this really is only appropriate in formal contexts; you’ll sound like a tit if you use it at any other time”.  (u):Raifʻhār (t):Doremítzwr﴿ 10:54, 29 June 2009 (UTC)[reply]
You mean especially can mean “very much,” in addition to “principally?” Michael Z. 2009-06-29 13:43 z
Yes.  (u):Raifʻhār (t):Doremítzwr﴿ 00:45, 30 June 2009 (UTC)[reply]

So then I'll recompose the proposal: if there's no objection, I'd like to merge:

Since these are all popular, and they are very close synonyms as used by dictionaries, I would create redirects. I'll also add an entry to Appendix:Glossary. Michael Z. 2009-06-30 12:47 z

Okay, since there's no objection, I'll gradually get started on this. Michael Z. 2009-07-02 01:43 z
I orphaned and redirected {{particularly}}. About half the occurrences were incorrectly used as an isolated label (or as a conjunction between two definitions), and the other half with typed-in context descriptions. Michael Z. 2009-07-02 02:04 z
Filed WT:RFDO#Template:particularly. Michael Z. 2009-07-02 02:17 z
Filed WT:RFDO#Template:primarily. Michael Z. 2009-07-02 02:26 z
If these are deleted rather than redirected, won’t they still end up being used as plain-text fillers supported by {{context}}? If they’re all meant to be synonymous, is there a problem with simply having {{chiefly}} substituted for all the above candidates for deletion?  (u):Raifʻhār (t):Doremítzwr﴿ 09:24, 2 July 2009 (UTC)[reply]
Not sure, so I'm filing RFD for primarily because it was unused, particularly because it is practically unused and ambiguous in meaning, and maybe especially because it is ambiguous as you pointed out. The way I figure, having a redirect may encourage use, but should we encourage these ones? Michael Z. 2009-07-02 12:43 z
In my opinion, having them display a different word from that intended is more likely to discourage an editor from using one of the redirected qualifiers, and it makes substitution of any of them for {{chiefly}} a harmless, bot-able clean-up task.  (u):Raifʻhār (t):Doremítzwr﴿ 21:51, 2 July 2009 (UTC)[reply]
That does make sense to me. The way {particularly} was (mis)used, however, I'd still be tempted to delete it, or flag it for an editor's attention rather than bot-replace it. I did a survey of 50 examples of {mostly}, however, and it looks like {chiefly} could be swapped in for it with no new problems. Michael Z. 2009-07-03 00:15 z
Sure; since you’re the one who’s done the research, do what you think is best.  (u):Raifʻhār (t):Doremítzwr﴿ 00:40, 3 July 2009 (UTC)[reply]
Redirected {{mostly}}, after looking at every transclusion. Michael Z. 2009-07-03 05:19 z
Ditto {{especially}}. Michael Z. 2009-07-04 14:57 z
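
The "harmless, bot-able clean-up task" mentioned in this thread can be sketched roughly as follows. This is only an illustration, not the bot that was actually run: the function name `normalize_qualifiers` is hypothetical, and the list of synonyms is limited to {{mainly}} and {{mostly}}, the two qualifiers the thread agreed to collapse into {{chiefly}}.

```python
import re

# Assumption for illustration: qualifier templates treated as synonyms of {{chiefly}}.
SYNONYMS = ("mainly", "mostly")

# Match the template name only when it is followed by "|" or "}}",
# so that a longer template name that merely starts with "mostly" is left alone.
_PATTERN = re.compile(r"\{\{\s*(?:%s)\s*(?=\||\}\})" % "|".join(SYNONYMS))


def normalize_qualifiers(wikitext: str) -> str:
    """Rewrite {{mainly|...}} and {{mostly|...}} as {{chiefly|...}}."""
    return _PATTERN.sub("{{chiefly", wikitext)
```

Because the redirected templates display the same label as {{chiefly}}, a substitution like this should not change what readers see; it only tidies the wikitext. {{usually}} is deliberately left untouched, since the thread kept it as a distinct qualifier.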

{{markedly}} > {{very}}

Redirect? We also have {{extremely}}, which is still more intense, {{somewhat}}, which either means “not so much” or “just sometimes”, and {{quite}}, which means nothing at all. Michael Z. 2009-06-25 05:03 z

Keep all per §: {{chiefly}} > {{mainly}}, except {{quite}}; delete that one as semantically vacuous.  (u):Raifʻhār (t):Doremítzwr﴿ 12:01, 25 June 2009 (UTC)[reply]
Why keep? Why is fuck v. 1 “extremely vulgar,” fuck v. 3 “markedly vulgar,” but fuck n. 1 merely conventionally “vulgar?” How do we know that cameltoe is “markedly vulgar” but not “very vulgar”, and what is the difference between these? This is not helpful to anyone. Michael Z. 2009-06-25 12:51 z
Well, {{very}}, like {{quite}}, is too ambiguous to be used with any precision, so delete those two. However, {{extremely}} and {{somewhat}} are useful, so keep them. I also think {{markedly}} is useful, although it may be better worded as {{pointedly}}; what do you think?  (u):Raifʻhār (t):Doremítzwr﴿ 13:25, 25 June 2009 (UTC)[reply]
Very is an intensifier, and extremely is a stronger one. So if we really want to cover a full range of nuance, even though there is no objective basis like corpus analysis for any of our determinations, then we have rare, very rare, and extremely rare.
Where the heck does somewhat rare fall in this continuum? It sounds to me like “only a little rare,” or “more rare than rare, but not really very rare,” or maybe just “rare.” Michael Z. 2009-06-26 04:44 z
Re: “It sounds to me like [] ‘more rare than rare, [] ’”: Really? I'm absolutely shocked. To me "somewhat" is unambiguously a de-intensifier. The scale is then (obscure)(extremely rare)(very rare)(rare)(somewhat rare)(ordinary, but I felt like adding a context tag)(overly common; consider substituting a synonym). (That last one would only be used for words like the, that some so-called "writers" even use multiple times per sentence!)   Actually, I suppose the difference between "obscure" and "extremely rare" has to do with the word's transparency; if they existed, "highfalutin'itude" would presumably be extremely rare, while "yasezieuazy" would presumably be obscure. —RuakhTALK 12:31, 26 June 2009 (UTC)[reply]
NOAD defines somewhat, in full, as “to a moderate extent or by a moderate amount,” M–W as “in some degree or measure : slightly,” RH as “in some measure or degree; to some extent,” W3 as “in some degree or measure; a little,” and OED as “In a certain degree or measure; to some (slight or small) extent; slightly, a little; rather.” This is going to help our readers a lot.
I'll get right on creating {{more or less}} and {{kinda}} for you. Michael Z. 2009-06-26 12:50 z
I'm sorry, I know you're trying to be funny, but I don't get the joke. Maybe you get nothing out of a "somewhat _____" sense label, but on what basis do you assume that no one else does? Obviously there are multiple editors using such labels, and I can assure you, it's not because we find them useless. —RuakhTALK 22:17, 26 June 2009 (UTC)[reply]
I'm trying to make some definite points:
  1. Somewhat is not clearly or even necessarily a deintensifier. The definitions I quoted show that it means “to some undefined degree, possibly a small one or not.” It is useless as a label modifier, because it doesn't even clearly imply your intended meaning to the reader. Slightly would work, if this was at all desirable.
  2. Five degrees of a labelled context, plus neutral, plus presumably five degrees of its opposite implies that our labelling has some fantastic degree of precision. Even dictionaries based on statistical corpus analysis don't pretend that such expression can be meaningful, so we'd just be fooling ourselves (but to our embarrassment, not the readers).
Multiple editors find this useful? It appears in a total of 14 articles, but it is not useful to any reader.
Here's a plum found in two entries: “dated or somewhat archaic,” using an overly-precise qualifier to imply twelve full levels of datedness, while specifying a range of four for the term. Lurvely. Michael Z. 2009-06-27 03:10 z
I think "somewhat" is clearly and necessarily a deintensifier, unless you'd expect an encyclopedia to tell you that "some humans are animals" and "some liquids are made of matter"? "Slightly dated" risks being too precise in some cases; we don't have the omniscience to know exactly how dated something is, nor the arrogance to fake it. I think a vague deintensifier, such as "somewhat", is quite reasonable in many cases. Your chief complaint seems to be that a reader can't look at "somewhat dated" and know exactly how dated that is; but if we start making stuff up, the reader is left only thinking he knows. When forced to choose between precision and accuracy, I prefer accuracy, every time.
That said, I agree that "dated or somewhat archaic" seems a bit silly, since "dated" and "somewhat archaic" are nearly synonymous; I imagine the writer meant "dated, perhaps archaic".
RuakhTALK 18:27, 30 June 2009 (UTC)[reply]
Heh, my favourite dictionary defines somewhat, in full, as “to some extent.” Michael Z. 2009-06-26 04:51 z

Back to the topic at hand. {{markedly}} is used in 2 entries. It is not an intensifier, like very, or extremely. (And markedly is not really a synonym for pointedly, which refers to the attitude with which one makes a remark, unless you reduce both to completely genericized senses). Its literal meaning is “clearly noticeable; evident,” a quality not really definable in, and irrelevant to lexicographical labels. To further confuse things, it has a different, specific linguistic meaning (see marked).

This is used in exactly two entries, to form markedly vulgar. This means very vulgar, but distinguishing it with the ambiguous term just creates confusion. Let's remove the 2 instances of {{markedly}}, and file for deletion. (If you have a problem with {{very}}, then please go ahead and file for deletion.) Michael Z. 2009-06-27 16:06 z

Regionally ambiguous qualifiers

The problem with quite and somewhat (and rather) is that they have both paucal and intensifying meanings, the paucal sense being more common in the UK and the intensifying sense more common in North America.--Brett 14:17, 30 June 2009 (UTC)[reply]
  • Should we deprecate the use of these three and any similar ambiguous qualifiers?
    This seems to be something where there is some potential for agreement in the short term. Eliminating them might help clarify the issues with the other qualifiers.
    I wonder whether we couldn't use the existing templates to assign entries using these to Category:Entries with ambiguous qualifiers. This would enable us to locate these and replace them with more suitable terms or eliminate them completely. Because they are terms likely to be used by contributors not sensitive to the regional differences in valence, it might be desirable to not actually create the category, so that it appeared red at the bottom of the page. Knowledgeable contributors could occasionally look for instances of items wanting to be in the category to correct them. DCDuring TALK 15:46, 30 June 2009 (UTC)[reply]
    There were 10 uses of {{quite}} and {{somewhat}}. "Rather" does not seem to have such a template, but a search for "rather rare" showed 39 hits, all English proper nouns. All of them look like cases where we would much prefer that the entry have qualifiers in the form of {{context}} or equivalent context templates. I suspect that there are substantial numbers of uses of "rather", "quite", and "somewhat" outside of templated contexts.
    Consequently, I wonder why we are debating the relatively minor problem of redundant qualifiers as opposed to bringing entries into a higher level of conformity to WT:ELE and undocumented good practices. Once entries use some templates they are easier to identify and to standardize. More complete Templatization would make discussions about templates much more meaningful. DCDuring TALK 16:13, 30 June 2009 (UTC)[reply]
I'm surprised to hear that "quite" has paucal meanings in the U.K. — and shocked to hear that "somewhat" and "rather" have intensifying meanings here in the U.S.! I mean, sure, they're sometimes used as a form of understatement ("I was rather surprised to see him grow wings and fly away, but in retrospect I suppose it was to be expected"), but I can't imagine reading a dictionary entry and wondering to myself, "Hmm, does the dictionary actually mean what it's saying, or is it trying to be clever?". Even UrbanDictionary has very little of this sort of understatement IME. —RuakhTALK 18:17, 30 June 2009 (UTC)[reply]
Cambridge International Advanced Learners Dictionary reports both paucal and intensifying senses for "rather" without regional qualification. But the Cambridge American report it as "to a noticeable degree; somewhat". I would think we would want to avoid this kind of thing. Vacuity is one thing but actual ambiguity is quite another. DCDuring TALK 19:18, 30 June 2009 (UTC)[reply]
Yes, I agree. (There's a similar problem with {{vulgar}}.) —RuakhTALK 19:45, 30 June 2009 (UTC)[reply]
What's ambiguous about vulgar? Michael Z. 2009-07-01 00:56 z
It sometimes means something like "colloquial", "vernacular", or "low-brow", and sometimes something like "dirty", "crass", or "offensive". —RuakhTALK 02:32, 1 July 2009 (UTC)[reply]
But I think the former meaning hasn't been used in dictionary labels in many decades. Incidentally, the CanOD uses coarse slang where the NOAD uses vulgar slang. Michael Z. 2009-07-01 14:40 z

I have two problems with such templates.

  1. Ambiguity, as mentioned above. In theory this can be completely resolved by simply defining what we mean by a particular label. But labels should be self-explanatory. If we want to be able to specify a range of intensities, then let's just use clear ones like slightly, [neutral], very, extremely, or rarely, sometimes, [neutral], often, chiefly.
  2. Fantastical precision. We don't need six grades of anything. Since all of these labels are based on our judgment and not on any corpus analysis, any finer grading than less, neutral, and more is just silly.

But don't take my word for it. Let's look in our professional dictionaries, and see what qualifiers they use in their labelling. (Can we find any surveys, or comprehensive lists of dictionary labels?) Michael Z. 2009-07-01 00:56 z

It's not "fantastical precision", it's diversity of wording, such as is unavoidable in a wiki, and IMHO not a bad thing. I doubt there's any one editor who distinguishes between very many different gradations, but I don't see what's gained by forcing editors to choose among other editors' preferred gradations. —RuakhTALK 02:32, 1 July 2009 (UTC)[reply]
The scale you delineated above in detail (obscure, extremely rare, very rare, rare, somewhat rare) does represent fantastical precision. Dictionaries don't use such labels. Dictionaries don't use diversity of wording; they define labels in their style manuals and stick to them. They “force” junior lexicographers to choose among the chief editor's preferred labels. They do this for consistency and predictable meaning of their labelling.
A free-form approach to labelling is what lexicographers used in the 16th and 17th centuries, and will distinguish us by our amateurism. Michael Z. 2009-07-01 14:40 z
That's odd, I thought the OED was a dictionary. My mistake. —RuakhTALK 15:12, 1 July 2009 (UTC)[reply]
By the way, it's not really precision. As I said in my above comment, "obscure" and "extremely rare" basically convey the same level of rarity (but differ in another way); and "extremely rare" overlaps with "very rare", "very rare" with "rare", "rare" with "somewhat rare". In the absence of the fantastical precision that you seem to be advocating, where each term has a specific, precise, and objective meaning, this sort of variety is unavoidable and not undesirable. Now, that doesn't necessarily mean that all of these are always appropriate — if something is really "extremely rare", it's probably worth examining the possibility that it's restricted to a very narrow context that we should try to elucidate — but eliminating "extremely rare" altogether is not going to help us any. —RuakhTALK 15:27, 1 July 2009 (UTC)[reply]
Naw. I'd be very happy if we put rare on any usage that seems notably rare, and nuked all the qualifiers. Then our labelling would be closer to the realistic precision of our determinations, and more like professional practice in lexicography.
(To say that the scale is fuzzy does not make it a relative rather than absolute scale—there is no way to make every John's “very rare” more rare than Jane's “somewhat rare”. We have no realistic way to judge that a usage is more or less frequent than others on five levels. And I must say that putting obscure on entries is absolutely opaque, even if you don't insist that it means two things at once.)
These qualifiers are just so much noise to keep editors busy and readers confused. Michael Z. 2009-07-02 01:41 z
Nuking these qualifiers might not be a bad idea … though it wouldn't surprise me if you were the only editor they kept busy, and the only reader confused. —RuakhTALK 02:04, 2 July 2009 (UTC)[reply]
Ooh, nice one! But enough about me... Michael Z. 2009-07-02 13:17 z

German inflection templates

I'm planning to unlink all the German inflection templates ending in -unc because of redundancy. This would affect the following templates:

The templates ending in -unc are supposed to be used for words which have no plural. I've added code to the plain templates so they work with these nouns too (by specifying plural=-), making the -unc templates unnecessary. -- Prince Kassad 17:55, 25 June 2009 (UTC)[reply]

As long as you're revising, have you considered a unified {{de-noun}} that would accept gender as the first parameter? It would be easier to make the change all at once, and we have bots that could do a simple replacement like that. --EncycloPetey 21:07, 25 June 2009 (UTC)[reply]
There's a {{de-noun}}, but it hasn't got wide acceptance because it wants a diminutive parameter. (This parameter could probably be changed to be optional, to facilitate transition.) -- Prince Kassad 21:17, 25 June 2009 (UTC)[reply]
The template {{de-noun}} has been created only recently, in April 2009 by Opiaterein, hence its low frequency of use. Its diminutive parameter can be set to "-", meaning "no diminutive". See also {{de-noun}}, specifically the sketchy documentation that I have just created, including examples of use. I think using unified {{de-noun}} instead of the other six mentioned templates would be nice. --Dan Polansky 07:55, 26 June 2009 (UTC)[reply]
I've been thinking that if the majority of German words don't have a diminutive, the template should print nothing unless a diminutive is specified. It's not easy finding that out, however. -- Prince Kassad 11:52, 26 June 2009 (UTC)[reply]
I don't know what the template should do for diminutives. My uninformed first guess would be that it should print nothing about diminutives unless diminutive is explicitly specified. --Dan Polansky 14:33, 26 June 2009 (UTC)[reply]
I can write a template that places those entries that lack a diminutive into an attention-style category for later cleanup. --EncycloPetey 15:02, 26 June 2009 (UTC)[reply]
I've done the needed changes, now we can deploy a bot that replaces the templates with {{de-noun}}. -- Prince Kassad 12:02, 30 June 2009 (UTC)[reply]
User:Opiaterein has a bot suited for this. --EncycloPetey 15:38, 1 July 2009 (UTC)[reply]
All done, deleting deprecated templates ^_^ — [ R·I·C ] opiaterein 18:18, 4 July 2009 (UTC)[reply]
OK, so what's the next step? Right now, {{de-verb}} is a redirect that calls the conjugation table, and is used in the Conjugation section of many pages. I think the next step is to make sure {{de-verb-conj}} is the one being called in the Conjugation section by making the appropriate replacement. Once this is completed correctly, there shouldn't be any entries that call {{de-verb}}, at which point we can re-write that template to generate an appropriate inflection line and begin using it. --EncycloPetey 13:40, 5 July 2009 (UTC)[reply]
The current conjugation templates for German verbs are copied from the German Wiktionary and not very useful, since they only contain a select few conjugated forms. The first step therefore is to create new conjugation templates (Opiaterein already created the boiler {{de-conj}}) and add them to the entries on German verbs. Then, we can start caring about the inflection line. -- Prince Kassad 14:48, 5 July 2009 (UTC)[reply]
Yes, a robot should be run that replaces the uses of "de-verb" with "de-verb-conj", so that {{de-verb}} can be turned into a template to be put on the inflection line similarly to {{en-verb}}, which it currently is not, being a redirect to "de-verb-conj".
There are, for verbs, already two German templates to be put on the inflection line: {{de-verb-strong}}, and {{de-verb-weak}}. Possibly, these could be, after the robot is run, merged into {{de-verb}}.
A possibly trivial remark, just to ensure that we share context: From what I have understood, some selected conjugation information is placed on the inflection line using what is slightly misleadingly called "inflection templates" such as the planned "de-verb" or the existing "de-verb-strong", while the full conjugation table is placed under the heading "Conjugation" using a "conjugation template" (which is not an "inflection template"), such as {{de-conj}}, used in träumen. An example of the use of both: verstehen. The inflection line has so far also been taken care of using {{infl|de|verb}}. --Dan Polansky 07:01, 6 July 2009 (UTC)[reply]
My concern is mostly that the old conjugation template {{de-verb-conj}} only contains a select few conjugations, and not all. That's why I'm in the process of replacing it by {{de-conj}} which does show all conjugations. -- Prince Kassad 16:31, 6 July 2009 (UTC)[reply]
For me {{de-conj}} still is quite incomplete (why doesn't it contain Perfekt, Plusquamperfekt, Futur I + II and passive forms?). The templates the German wikt uses are much better. 79.238.33.133 19:54, 22 August 2009 (UTC)[reply]
It's no problem adding these, since you don't actually need to inflect anything with these tenses, similar to {{fr-conj}}. -- Prince Kassad 20:48, 22 August 2009 (UTC)[reply]
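The bot replacement discussed in this thread (moving calls from "de-verb" to "de-verb-conj" so that the first name can be reused) can be sketched roughly as follows. This is only an illustration in Python, not the code of any bot mentioned here: the function name rename_template and the sample parameters are made up, and a real bot would also have to fetch and save pages and cope with things like capitalised template names.

```python
import re


def rename_template(wikitext: str, old: str, new: str) -> str:
    """Rename calls to template `old` into calls to template `new`.

    Hypothetical helper for illustration only. The lookahead requires the
    name to be followed by "|" or "}}", so longer names that merely start
    with `old` (for example "de-verb-strong") are left untouched.
    """
    pattern = re.compile(r"\{\{\s*" + re.escape(old) + r"\s*(?=\||\}\})")
    return pattern.sub("{{" + new, wikitext)


# Sample wikitext with made-up parameters.
page = "====Conjugation====\n{{de-verb|a|b|c}}\n{{de-verb-strong|x}}"
print(rename_template(page, "de-verb", "de-verb-conj"))
```

Running the function a second time changes nothing, because "de-verb-conj" no longer matches the pattern for "de-verb". That makes it reasonably safe to rerun over pages that were already converted.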

Collocations

See also #Typical collocations, above.

We waste so much time debating inclusion of terms which clearly don't meet our CFI. There's clearly a need for this. Let's create a Collocations subheading, which lists common collocations, and may have notes about their usage, meaning, etc. Only entries will be linked, other collocations will remain unlinked text. Comments or objections? Let's try this out. Michael Z. 2009-06-26 02:22 z

There are other approaches. For one partial example, mimicking aspects of the MW 1913/Online style of phrasal usage examples, see dead#Adjective. The ones added have multiple phrases on a single line, with multiple bolded headwords. The collocations added were ones found at COCA that did not clearly fit with the other senses that we showed. DCDuring TALK 03:19, 26 June 2009 (UTC)[reply]
I like that a lot. I don't have much time these days, but I'd like to go through a bunch of sum-of-parts RFVs and RFDs and see how they can be inserted into the respective entries.
For instance, adding on television and in television to television. Might not always be ideal to do it as examples/quotations. I could see on television being used in sense 1 (“he's on television a lot these days”) or 3 (“let's see what's on the television”), or arguably 3. It may be desirable or more efficient to simply list common collocations on and in, but on the other hand having good examples with the senses does fill the bill.
By the way, would in television refer to sense 1, or another sense of “the television business,” or either? I guess a corpus search would help decide. Michael Z. 2009-06-26 04:26 z
I'd love to see an example of the Collocations header applied to a polysemic word (2 senses would be enough). We should think through which kinds of collocations are most worthwhile and at which term they would go. To me preposition-noun collocations are high value. I would place them at the noun. Phrasal verbs are mostly resolved. I can't see much value in normally having collocations at the entries for prepositions, determiners, pronouns, and conjunctions. The non "-ly" adverbs might be worth it. Adjective-noun (dead) and modal-verb-verb collocations might be interesting too. DCDuring TALK 17:32, 26 June 2009 (UTC)[reply]
We can just use {{sense}} to distinguish between senses, as is done in synonyms and derived terms. H. (talk) 11:38, 27 June 2009 (UTC)[reply]

I don't think that limiting collocations by word class (POS) is a useful approach. Different words will have different kinds of collocates. Sometimes a word will genuinely collocate with only one determiner or even with the verb be. Rather a statistical approach is likely to be the most helpful, though this will have to be tempered by a number of factors.

An MI score of 3.0 or more is generally considered to show a genuine relationship between two words. The thing is, this score changes depending on how you search. For example, a search for words occurring within 4 words either side of help in the COCA returns an MI of 6.91 for defray, but if you search for 3 words either side, defray's MI is only 6.75. And if you only search for one word to the right, that goes up to 8.75. And then there's the question of whether you search for the form help, the verb help, the noun help, the lemmas help, etc.

Next, you need to consider not just the strength of the connection, but also its frequency. Back to our help example, contextualize has an MI of 3.29, but the collocation occurs only 10 times in that position, whereas the collocation with cope (MI=3.25) occurs over 340 times.

Along with frequency, you need to consider the range, in other words, the number of documents in which it is found. One document might use a particular collocation so often that it skews the results. Similarly, the context of various documents might be the same. Still with our help example, context has a very high MI score (MI=4.3) and a high frequency (over 1600), and occurs in a wide range of documents, but most of these documents have been taken from various web sites, all of which include the sentence, "Find Documents with Similar Topics Help Below are concepts discussed in this document."

Finally, transparency should be considered. For sit, chair has an MI of 3.52, but it should be obvious to anyone who lives in the real world that sit is used with chair and may not be worthy of mention, whereas the relationship between help and cope is more opaque and may bear having attention drawn to it.--Brett 14:00, 30 June 2009 (UTC)[reply]
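To make the criteria above concrete, here is a rough Python sketch. It assumes one common formulation of MI, log2 of (co-occurrences × corpus size) / (node frequency × collocate frequency × span); whether a given corpus interface computes it exactly this way is an assumption. The function names, the thresholds other than 3.0, and the document counts in the sample data are all made up for illustration; only the MI values and the approximate frequencies for contextualize and cope come from the comment above.

```python
import math


def mi_score(cooc, freq_node, freq_collocate, corpus_size, span=1):
    """Mutual information of a node word and a collocate (assumed formula).

    `span` is the number of word positions searched around the node,
    which is one reason the score shifts when the search window changes.
    """
    return math.log2((cooc * corpus_size) / (freq_node * freq_collocate * span))


def shortlist(candidates, min_mi=3.0, min_freq=20, min_docs=5):
    """Keep collocates that pass the strength, frequency and range checks."""
    return [
        c["word"]
        for c in candidates
        if c["mi"] >= min_mi and c["freq"] >= min_freq and c["docs"] >= min_docs
    ]


# "docs" (number of distinct documents) is hypothetical sample data.
collocates_of_help = [
    {"word": "contextualize", "mi": 3.29, "freq": 10, "docs": 8},
    {"word": "cope", "mi": 3.25, "freq": 340, "docs": 200},
]
print(shortlist(collocates_of_help))
```

With these thresholds, contextualize is dropped for low frequency even though its MI is slightly higher. Transparency (the sit/chair case) is not something the numbers capture, so that step would still need an editor's judgment.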

I've been noticing some of the issues you most helpfully raise in working with collocates of "in" (a perhaps overambitious project). (See Appendix:Collocations of in, which is so crude that I think I need to move it to my user space.) A more manageable case comes up in WT:TR#have in mind. I'll try applying MI to in mind to see if it helps.
The purpose of limits is just the practical one of focusing efforts of a few contributors in a single area so they can offer each other assistance. The area selected should be one where even an incomplete effort is likely to bear fruit. I had long been wondering about the quality of our preposition definitions. Category:English prepositional phrases is already large, but also incomplete. We seem to be at an impasse with noun-noun multi-word entries. And almost any effort that brings one to look at any somewhat homogeneous class reveals outright errors, inconsistencies that often suggest the existence of unresolved issues.
We are limited in the kinds of corpus-based work we can do. Google's target user is not us. It is useful for attestation, but much less so for collocation. The BYU databases offer much more power, but the "range" issue becomes apparent for less common collocations where a single document could be more than half of total usage, even in my limited experience.
The consideration of "opacity" is not one that appears in our rules for inclusion, but is an example of the kind of user consideration that needs to guide us in some of our choices. DCDuring TALK 16:41, 30 June 2009 (UTC)[reply]
I think you guys are taking this too far. In the initial discussion above, I was only thinking of typical combinations like “comment on”, “comment about”, neither of which I think deserve their own entry, but it is useful for a language learner to know that these are the prepositions the word comment is used with.
I don’t really see the use of a computational linguistics-like collocations database like it would be used in information retrieval. (That is, of course they have their use, but don’t belong in Wiktionary!) H. (talk) 10:19, 29 July 2009 (UTC)[reply]
I think we are trying to compensate for the inadequacies of search. Clearly, that cannot be efficient compared to lobbying for some greater recognition of the distinctive needs of a dictionary in search. BTW, would you know how I could find the English stopwords in wikimedia search. They are used, I think, in some of the fuzzy searches that generate search results when the exact search fails. For multiword searches pronouns and even determiners (or at least some of each) might be useful stopwords for us. DCDuring TALK 11:56, 29 July 2009 (UTC)[reply]

Bot flag request for CarsracBot

Bot flag request for iw on other namespaces than the main namespace.

I request the botflag for user:CarsracBot. Interwicket is doing a good job on the main space, but that is the only trick it can do. My bot can also interwiki user namespace (on request), category and other namespaces. I know that in the namespace I may only edit on request of a single user without a bot. It is useless to edit the normal namespace as long as the Interwicket bots are active. More info can be found here Carsrac 17:03, 26 June 2009 (UTC)[reply]

What does "in the namespace I may only edit on request of a single user without a bot" mean? Clarify please. —msh210 17:22, 29 June 2009 (UTC)[reply]
The bot only interwikis user pages at the request of the users themselves. Carsrac 11:13, 5 July 2009 (UTC)[reply]
I've seen severe problems in interwiki bots run on the Wikipedias. They fight with each other over what is "correct", with two bots going into an automated edit war. Also, they can't understand corrections when they are made, reverting the corrections made by users. How does your bot solve these problems? --EncycloPetey 18:12, 29 June 2009 (UTC)[reply]
If it is a simple situation the bot runs autonomously. It follows strict rules, and those are the rules the editors agreed on. If there is a 'bot war' it is mostly because the bot owners made different decisions. If it looks like my bot is in an edit war, please tell me and I will solve it and make sure that the edit war ends. BTW if a bot war is going on, inform one of the global bot owners and they will end it one way or another. About the last remark of EncycloPetey: the bot owners understand the corrections by users and by their bots and can correct the mistakes one of the two has made. So if you see that my bot made a mistake, please tell me on a user page you think I read frequently. Carsrac 11:13, 5 July 2009 (UTC)[reply]
I've never found it to be that simple. Every time I've pointed out Bot errors to people on Wikipedia, the bot owner tells me I'm wrong and the bot is working just fine (even when it gets into an edit war). Usually, it takes someone clever who doesn't own a bot to find a way to stop the bot, usually by some sort of kludge. Can you point to a situation (on another project) where you had to clean up after your bot, and did so willingly and quickly? One of the biggest concerns we have here is that we've had bots (or editors) create numerous bad entries, but not ever clean up after themselves. We expect that a bot owner has the responsibility to clean up any and all problems created by the bot. Again, using a WP example, I'm involved in a project on WP (well, not so much these days) where a bot generated hundreds (thousands) of bad articles but it's the project members and other helpful editors who are doing the cleanup (slowly). This is the third time such a thing has happened for that one specialized project. I therefore have little faith in WP-style bots, so can you demonstrate something that will give me confidence in your bot? --EncycloPetey 13:17, 5 July 2009 (UTC)[reply]
I don't know what kind of demonstration you want, but one of the actions I did with my bot is adding the links back from shipredirects in the Polish wiki. It was something that would be too much work to undo by hand, so I found a way to do it with my bot. It is an example showing that I read the messages and questions on my talk pages, and that a simple question gave me a lot of work to sort out. Carsrac 21:11, 9 July 2009 (UTC)[reply]
I'd like to see a bot for the Project namespace. We don't have any interwiki bot for that one yet, and it would be definitely useful for comparing policies of different Wiktionaries or something. -- Prince Kassad 12:22, 5 July 2009 (UTC)[reply]
I would give it a try to run my interwiki bot on that namespace. I will do some test runs to see if there is work to be done and how much work it is. Carsrac 21:11, 9 July 2009 (UTC)[reply]
Thanks for the pointer; yes, a lot of work needed to be done outside the main namespace. The project or Wiktionary namespace is where a lot of manual bot work can be done. Too bad I have no bot flag on the English Wiktionary. And I will not run without a bot flag. Carsrac 18:59, 14 July 2009 (UTC)[reply]

(just a note: this bot was behaving badly in NS:0 on a number of wikts 2 days, linking things it should not have (rh and Rh); presumably running in error w/o the -wiktionary option. Interwicket has fixed/will fix them all) Robert Ullmann 06:49, 17 July 2009 (UTC)[reply]

{{somewhat}} > {{slightly}}

14 transclusions. Discussed in #{markedly} > {very}, above. somewhat means “to some extent,” or “to a slight extent,” or “to a moderate extent” (our current definition is lacking). As a label, it is unclear.

I'm going to replace this with {{slightly}}, which clearly carries the intended meaning. I hope this isn't controversial. Michael Z. 2009-06-27 16:28 z

({{slightly}} is already used in 6 entries.) Michael Z. 2009-06-27 16:57 z
I don't understand your goal here. You'd be better off pretending to "fix" all the various ways we define similar things. How come ovine is "Of, pertaining to, resembling, or being a sheep", whereas cygnine is "Of, concerning, pertaining to, resembling, or having the characteristics of a swan or swans"? Are we expressing some qualitative difference between the relationship of ovine to sheep and the relationship of cygnine to swan(s)? No, I don't think so; there's just variation in our phrasing, and that's unavoidable. Unless you're prepared to thoroughly standardize and define exactly what "slightly" means, and exactly what the boundary is between "slightly X" and simply "X", then I plan to continue using "somewhat", no matter what you want to do about it. Until the above conversation, I never realized there was even a special template for it; I'd been transcluding it accidentally, if you will, by typing (for example) {{context|somewhat|_|dated}}. Worse come to worst, you redirect {{somewhat}} to {{slightly}}, and I have to start circumventing you by typing {{context|<nowiki/>somewhat|_|dated}}. (That is, unless you convince me either (1) that you're right or (2) that the majority of editors think you're right.) —RuakhTALK 19:15, 27 June 2009 (UTC)[reply]
(I should clarify that if I thought you really could standardize qualifiers in a useful way, I'd go along with it. I just don't think that's possible.) —RuakhTALK 19:18, 27 June 2009 (UTC)[reply]
My goal is to get rid of a label which can mean three different things, and so likely to mislead readers who don't make the same assumption as some editor has, therefore being worse than useless. I'm puzzled why you insist on keeping it. What does somewhat mean to you, and what do you think it means to a reader who encounters it, not having read any documentation (that is, any reader, since we have none yet)?
(And no, I didn't plan to redirect {{somewhat}}; I planned to delete it.)
To indulge my avicidal tendencies (that is, to kill two birds with one stone), why not answer in the form of brief documentation, suitable for Appendix:Glossary or Template talk:somewhat#Documentation? Michael Z. 2009-06-27 21:53 z
It means "somewhat". I know you know what that means, firstly because I believe you live in Canada (right?), and secondly because you've filled this page with various dictionaries' definitions of it (all of which are spot-on). What are these "three different things" you think it can mean? Tell me them, and I'll tell you which two are wrong. ;-)   I think it's good to formulate useful definitions of, and encourage the appropriate use of, our common sense labels, like "rare" and "archaic" and "British", that we use all over the place and that populate large categories; but it seems useless to try to define the less common ones, such as "of a ship" or "Eastern US", nor to needlessly restrict the qualifiers of such labels, such as "now" and "somewhat" and "occasionally" and "formerly". There exist dictionaries that have fairly specific lists of sense labels (which they often use abbreviations and/or symbols for), but I just don't see that as a goal, and you've done nothing to convince me it's one. —RuakhTALK 22:10, 27 June 2009 (UTC)[reply]
Okay, here's a scenario. I'm a reader, and I find a sense labelled somewhat dated. I'm puzzled because I've mainly heard somewhat used conversationally, as a mild intensifier, but sometimes ironically so (like raTHER).
I want to figure out exactly what it's meant to represent in this reference. Of course I check Wiktionary's documentation at Appendix:Glossary, but I guess this project is in progress because no one has defined this work term. So I look in the handy CanOD on my desktop, which tells me that somewhat means “to some extent.” That makes no sense at all, because dated means “dated to some extent”, no?, and someone must have added the qualifier to change the meaning. So let's look for other definitions: Random House says “in some measure or degree; to some extent.”
So far, there is no clue how somewhat dated differs from dated. Why did someone bother adding the qualifier? Why?
But I remember that the NOAD is built into my Mac. Handy contextual menu takes me to “to a moderate extent or by a moderate amount”—but wouldn't dated be to a moderate amount? I guess the editor added the qualifier to ensure that I would know this is moderately dated. I guess the important point is that the word is clearly not slightly dated, or extremely dated (isn't the latter equivalent to archaic?).
I might even stumble on M–W or W3's definitions, which say that its definitions include “slightly” or “a little”. So does somewhat mean “less dated than to the normal degree?”
Sounds weird, anyway. I could ask at the info desk, but I might get “It means "somewhat". I know you know what that means” Ruakh, the point isn't that I know what it means (I know it means something too vaguely-defined to be a useful dictionary label). The point is that we all have an understanding that it means the same thing. If we can't define our own use of terms, then using them promotes misunderstanding, rather than the opposite. Michael Z. 2009-06-28 16:25 z
<sarcasm>By golly, you're right! We should also standardize all our definitions to be identical, because right now, a reader might not know exactly what we mean by a given definition!</sarcasm> —RuakhTALK 22:49, 28 June 2009 (UTC)[reply]
I guess we're not communicating, because I can't even tell what point you're trying to make. I'm not talking about writing definitions. I'm talking about knowing what a label means when you enter it. Since you'd rather not say, then this looks like a dead end conversation. Michael Z. 2009-06-29 00:20 z
Re: "I'm not talking about writing definitions. I'm talking about knowing what a label means when you enter it.": I don't see this distinction so clearly as you do. As I said above, there are some labels that I consider to be worth standardizing, linkifying, and categorizing; but there are other labels, and especially qualifiers of labels, that I see as basically freeform: as long as they're normal Standard English, I don't see the problem, even if every single entry had a different and unique one! Same with definitions: some (non-lemma forms, alternative spellings, etc.) are worth standardizing, linkifying, and categorizing; but there are other definitions that are obviously freeform. Unless you can present some sort of argument for why everything in a context label must be clearly well-defined and standardized, I think we'll just have to agree to disagree. —RuakhTALK 01:43, 29 June 2009 (UTC)[reply]
From what I've seen in practical lexicography books, it is recommended to define the meaning of labels in a style guide. Some dictionaries even present a bit of this information to the reader. This is not part of writing definitions—applying labels is something most lexicographers do by choosing items from a drop-down menu, and which puts entries into defined categories. It seems common sense to define what they mean, especially when they are as vague as somewhat.
But I'll try playing the creative labelling game. I'll change slightly back to somewhat, where I don't think the label meant “slightly”. But when I edit an entry with a label which conveys no meaning to me, then I'll change it to something that does, or remove it. Michael Z. 2009-06-29 01:22 z
Sounds good. Very wiki-ish. :-)   —RuakhTALK 01:43, 29 June 2009 (UTC)[reply]

As per the Licensing update (i.e., new contributions licensed under Creative Commons Attribution-Share Alike 3.0 Unported (CC), legacy content dual-licensed under CC & FDL (AFAICT)), should we:

--Nils von Barth (nbarth) (talk) 11:59, 28 June 2009 (UTC)[reply]
The Main Page also needs to be updated to reflect this. -- Prince Kassad 17:32, 28 June 2009 (UTC)[reply]
I've started developing the updates. I've finished formatting the new license page, here. --Blurpeace 01:43, 30 June 2009 (UTC)[reply]
I've updated the Wiktionary:Copyrights page as well. Check this diff for the edits (and to correct or improve any edits I have made). –blurpeace ( t / c ) 04:31, 30 June 2009 (UTC)[reply]
Thanks blurpeace!
  • I've marked Wiktionary:Copyrights as a draft policy; trust this is ok and accurate.
  • The main page still needs updating, as noted by Prince Kassad; I dursn't (and can't) do it because, duh, it's the main page - admins?
--Nils von Barth (nbarth) (talk) 19:09, 3 July 2009 (UTC)[reply]
Luckily Kassad can edit the main page... —Neskaya kanetsv 17:09, 7 July 2009 (UTC)[reply]