Wikidata:Property proposal/National Union Catalog ID

National Union Catalog ID

Originally proposed at Wikidata:Property proposal/Creative work

Done: National Union Catalog ID (P10816) (Talk and documentation)

Description	The identifier for this work in The National Union Catalog, Pre-1956 Imprints: a Cumulative Author List Representing Library of Congress Printed Cards and Titles Reported by Other American Libraries (Q107434107)
Data type	External identifier
Domain	items of type version, edition or translation (Q3331189)
Allowed values	regex: `N[A-Z]? \d{7}(\.\d+)?`
Example 1	Robert Carter: His Life and Work (Q106913633): `NC 0506050` volume (P478): 113 page(s) (P304): 470
Example 2	In Memoriam: Faded and Other Poems (Q106914750): `NJ 0071762` volume (P478): 278 page(s) (P304): 639
Example 3	Authoritative Christianity, First Council of Nicaea, vol. 1 (1891) (Q106917163): `NC 0407550` volume (P478): 108 page(s) (P304): 541
Planned use	adding NUC authority control to editions
Number of IDs in source	the number of items in the NUC (hundreds of thousands)
Expected completeness	always incomplete (Q21873886) (as new volumes are being published)
Distinct-values constraint	yes
Wikidata project	WikiProject Books (Q8487081)

Motivation

The NUC is a huge catalogue of all works catalogued by a group of libraries in the US, including the Library of Congress. Of particular interest are the 754 volumes of the Pre-1956 Imprints, which are, for many works, the only place they are catalogued. Inductiveload (talk) 20:45, 19 September 2021 (UTC)

It's described at en:National Union Catalog.
- Many entries do not have an 'identification'. These entries all represent "alternate entries", in library-speak....redirects. There is little point to trying to enter those on WD, in my opinion.
- The catalog was compiled from copies of catalog cards submitted by hundreds of US research libraries, starting in the 1950s, of (what was intended to be) every title they held which was not already represented by a Library of Congress printed card. This was added to the works that already had cards (meaning, works that have an LCCN and are held by the Library of Congress).
- This means, specifically, that the NUC was intended to include every US publication that does not have an LCCN (as well as those that do), and many of the libraries that contributed to the NUC are not contributors to WorldCat..... there are a massive number of works in the NUC that have probably never been electronically cataloged, and are not in the Library of Congress.
- The LoC almost never obtained copies of later editions unless there was a new copyright claim registered, which is why many such editions do not have LCCNs.
- The 'human brains' that created this were far better than the OCLC algorithm at merging duplicates, as well.
- When you find references in LCCN or LoC authorities records to 'the old catalog'.... this is it. Much of the catalog of the Library of Congress that you find online, for older works, was a transcription from the printed card (often with info left out)... these are replaced as the LoC re-catalogs the actual books, and will probably take decades. This is where you look for complete bibliographic information... the entries are 'photostatic' copies of the submitted library catalog cards, cataloged by actual professional librarians from books they were actually holding, in the original binding (so, correct pagination).
- Additionally, the NUC itself also gives abbreviations for what libraries submitted information on each book, which is probably the most valuable bit of information... these libraries mainly still hold the same works. This allows you to search for digitized versions at those libraries based on the actual cataloged search terms, instead of whatever Google or the Internet Archive mangled it into.
- This is also the information one would need to possibly use interlibrary loan to obtain a copy.
TLDR; referring to the actual Mansell catalog is how you resolve the not-uncommon problem of either zero or way too many, and not matched, entries for a pre-1956 book in WorldCat, or obtain realistically accurate 'lists' of multiple editions of the same book. This is the pre-digital WorldCat actual people spent decades on. Jarnsax (talk) 21:43, 19 September 2021 (UTC)

Discussion

Support ofc. Jarnsax (talk) 23:18, 19 September 2021 (UTC)
- The exact construction of the identifier is the letter N, followed by the first letter of the "main entry" name (the card title) or null if it is a symbol, then a space, then a seven digit 'sequential but not consecutive' number, with inserted entries given the same identifier as the previous entry, with the addition of an unpadded sequential number after a decimal point. Jarnsax (talk) 23:29, 19 September 2021 (UTC)
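
The construction described above, together with the "Allowed values" regex in the proposal, can be sketched as a small parser. This is only an illustration in Python: the names `NUC_ID`, `NucId` and `parse_nuc_id` are hypothetical, and the capture groups are added to the proposal's regex for convenience.

```python
import re
from typing import NamedTuple, Optional

# The proposal's regex, N[A-Z]? \d{7}(\.\d+)?, with capture groups added
# (the grouping is an assumption made for this sketch).
NUC_ID = re.compile(r"N([A-Z]?) (\d{7})(?:\.(\d+))?")


class NucId(NamedTuple):
    letter: Optional[str]     # first letter of the main entry, None if a symbol
    number: int               # seven-digit number, sequential but not consecutive
    insertion: Optional[int]  # unpadded suffix after the decimal point, if any


def parse_nuc_id(value: str) -> Optional[NucId]:
    """Return the parts of an NUC identifier, or None if it is malformed."""
    match = NUC_ID.fullmatch(value)
    if match is None:
        return None
    letter, number, insertion = match.groups()
    return NucId(
        letter or None,
        int(number),
        int(insertion) if insertion is not None else None,
    )
```

For instance, `parse_nuc_id("NC 0506050")` (Example 1) yields the letter `C` and the number 506050, while a value without the zero padding, such as `NC 506050`, is rejected.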
Comment if it's an identifier, why would it need volume and page numbers as qualifiers? --- Jura 05:57, 20 September 2021 (UTC)

Because there is no online portal for the NUC, just a huge pile of page scans, so actually finding an item in the thousands of volumes of the NUC is a bit of a mission if you have the number but don't know exactly where it is. The IDs are sequential, but you still need to locate the volume and then scrub through that volume to find it. But, if you've just looked up an item and found it, adding the volume/number is easy, because you have the page in front of you. For example, example 1 above has a page scan at https://babel.hathitrust.org/cgi/pt?id=mdp.39015082906424&view=1up&seq=480.
Probably would be better if there was an item for the volume, then it could be published in (P1433) → the volume item (which would then link to Hathi, the IA, Commons, etc), but it'll still need a page(s) (P304) and perhaps even column (P3903). Inductiveload (talk) 10:21, 20 September 2021 (UTC)
I see. Eventually I suppose the statements could provide a guess in which volume/page to find a new value. --- Jura 10:56, 20 September 2021 (UTC)

With enough items, you could certainly book-end a range of pages that a given ID would land in, though it would take a lot of items before it landed you right on the page reliably, since there are something like 500k pages in the catalog and, AFAIK, no machine-readable way to ingest the data. Inductiveload (talk) 13:00, 20 September 2021 (UTC)
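
A rough sketch of that book-ending idea, in Python, using the three examples from the proposal as the known points. It assumes that ordering by initial letter and then by number follows the volume/page order of the catalog, which the discussion suggests but does not confirm; `KNOWN` and `bracket` are hypothetical names.

```python
from bisect import bisect_left

# Known (letter, number) -> (volume, page) pairs, taken from Examples 1-3.
KNOWN = sorted([
    (("C", 407550), (108, 541)),
    (("C", 506050), (113, 470)),
    (("J", 71762), (278, 639)),
])


def bracket(letter, number):
    """Return the nearest known locations below and above the given ID.

    Each side is a (volume, page) tuple, or None when no known ID lies on
    that side. An ID that is already known is returned as both bounds.
    """
    keys = [key for key, _ in KNOWN]
    target = (letter, number)
    i = bisect_left(keys, target)
    if i < len(keys) and keys[i] == target:
        return KNOWN[i][1], KNOWN[i][1]
    lower = KNOWN[i - 1][1] if i > 0 else None
    upper = KNOWN[i][1] if i < len(KNOWN) else None
    return lower, upper
```

Under that assumption, an ID such as `NC 0450000` would be expected somewhere between volume 108, page 541 and volume 113, page 470, which shows how wide the range stays until many more items carry the qualifiers.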
FWIW, since the scans are on Hathi, it's possible to permalink to the scan of a specific page (using a handle url). It's down near the bottom of the left menu on Hathi. Jarnsax (talk) 16:06, 20 September 2021 (UTC)
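
For illustration, a page-scan link of the form quoted earlier in this discussion could be assembled as below. This Python sketch assumes the query parameters `id`, `view` and `seq` behave as in that one example URL; `hathi_page_url` is a hypothetical helper, and it does not produce the handle-based permalink mentioned above. Note that `seq` is the scan sequence number, which need not equal the printed page (Example 1 is on page 470 but at seq 480).

```python
from urllib.parse import urlencode

# Base address copied from the example scan link in the discussion.
HATHI_PAGE_TURNER = "https://babel.hathitrust.org/cgi/pt"


def hathi_page_url(volume_id: str, seq: int) -> str:
    """Build a single-page view link for one scanned volume.

    volume_id is the Hathi volume identifier (e.g. "mdp.39015082906424");
    seq is the scan sequence number, not the printed page number.
    """
    query = urlencode({"id": volume_id, "view": "1up", "seq": seq})
    return f"{HATHI_PAGE_TURNER}?{query}"
```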

Support —MasterRus21thCentury (talk) 06:06, 12 April 2022 (UTC)
@Inductiveload, Jarnsax, Jura1, MasterRus21thCentury: Done ArthurPSmith (talk) 17:46, 16 June 2022 (UTC)
