Archive for November, 2010

Perception (originally posted 1-25-2009)   no comments

Posted at 10:07 pm in phenomenology

Sometimes it’s like pulling teeth ….

Okay, so I’ve been working with noesis, which is a matter of perception. The work all boils down to a basic question: how do we organize knowledge conceptually when every concept is perceived differently by everybody? The answer is that the whole process is a huge and constant brain-massage, wherein we shove a little to the left and then a little to the right and back and forth and on and on, trying to get everybody to agree to a common set of perceptions. Which, of course, was the idea behind universal KOS.

I thought this cartoon was perfect:

Written by lazykoblog on November 17th, 2010


language (originally posted 7-20-2010)   no comments

Posted at 10:04 pm in linguistics

I’ve been reading a lot about languages, because my work with Otlet and the recent conference at the Mundaneum raised a lot of linguistic issues, and also because artificial language is closely related to classification, particularly in the identification of concepts and the constriction or expansion of their expression.

I want to recommend a new book by Arika Okrent, In the Land of Invented Languages: A Celebration of Linguistic Creativity, Madness, and Genius (New York: Spiegel & Grau, 2010). Learn why Klingon is the only successful invented language, and on the way to that revelation, watch what she has to say about concepts.

I’m now reading another book by John McWhorter, The Power of Babel: A Natural History of Language (New York: Perennial, 2001). Look at the gem I just found on pp. 53-54:

Atoms are not the irreducible entities that scientists once supposed; instead, atoms are complexes of subatomic particles. In the same way, viewed up close, most “languages” are actually bundles of variations on a general theme, dialects.

By this I do not mean that there is “a language” that is surrounded by variations called “dialects.” As will end up being a kind of mantra for this chapter, “dialects is all there is.”

A Martian, presented with [German and Schwäbisch], would find no way of designating one as “the real one” and the other one as “a variation;” they would just look like two similar systems, just as a Burmese and a Siamese cat are to us different but equal versions of the same basic entity.

And in fact there is no “default” cat; there are only types of cat. Language change parallels biological evolution not only in creating different “languages” equivalent to species, but in that most languages consist of an array of dialects equivalent to subspecies.

This goes to the problem of mutual exclusivity, doesn’t it?

Now I really am going to go make a cup of tea. By the way, since I know nobody actually reads this, I feel liberated about going completely off into tangents. So: for the first time in over 20 years, today I have been home for more than a month! Okay, I drove to Cherry Hill to get the car worked on but I was only gone an hour. No overnights anywhere. And, while I’m at it, today I’m finally through teaching! Hurray. No more papers to grade for at least six weeks.

Written by lazykoblog on November 17th, 2010


Bibliocentrism (originally posted 5-24-2009)   no comments

Posted at 10:02 pm in Uncategorized

My paper for the ethics symposium (see my last post) was called “Bibliocentrism, Cultural Warrant, and the Ethics of Resource Description.” The point of the paper was that the pervasiveness of bibliocentrism constitutes an act of harm against the user-community by preventing Wilsonian exploitation. As usual, I don’t think I said that in my presentation (but I did say that in the paper). My method was case study–I took 3 “biographies” and 4 “nonbooks” and tried to demonstrate the uselessness of title-page transcription for facilitating exploitation. The connection being, of course, that bibliocentrism leads to standards that treat everything like a book (because books are good and nonbooks are nongood) and does even that badly.

I think listeners probably got the idea that I think title-page transcription should be replaced by digital imaging. That’s correct. I do think that. I used Morgan’s FDR as an example, to demonstrate how one could link from the table of contents–say concerning the Newport Incident–to the source list and then directly to the Newport archives where the sources are held. But there is much more, of course.

The ethical issue is that bibliocentrism, like racism or sexism or most “isms,” is endemic, and is perpetuated by the participation, tacit or otherwise, of everyone in the resource description community. An ethical response requires first an admission of complicity and then a commitment to change.

I ended the presentation (but not the paper) with an image of Otlet’s “grinder,” simply to demonstrate one possibility for an alternative to a bibliocentric catalog. (Documents go into the grinder to be disassembled into facts, which come out newly synthesized as raw knowledge, classified using UDC. See my new favorite book: European modernism and the information society, ed. by Boyd Rayward, for both the image–the grinder is on the cover–and more explanation.)

But, there is still more. I’ve been working with Jeff Gabel on his citation-chasing project, specifically now on the second phase which uses MDS to analyze co-assignment of LCSH to citation-chased monographs. (I’ll post something specific about this soon, so bear with me.) If this technique were to be employed, it would require systems that have all citations in digital form, and I don’t mean in separate citation indexes. So one very useful thing a non-bibliocentric catalog could do would be link to the digitized citations in everything.

Written by lazykoblog on November 17th, 2010


Hjorland’s Lifeboats (originally posted 1-5-2009)   no comments

Posted at 9:59 pm in epistemology

Birger Hjorland and his colleagues have created several lifeboats. You can find them here:

http://www.iva.dk/jni/lifeboat/

http://www.iva.dk/bh/lifeboat_ko/home.htm

http://www.iva.dk/bh/Core%20Concepts%20in%20LIS/home.htm

Here is a note from Dr. Hjorland (isko-l 9-03-2006):

The site is intented both to be my own working space, a source for my own students, and a source for everybody that might find it useful for their own research, teaching and practical work.
As you see it is overall very comprehensive and ambitious. It is not however finished, and it is not intended ever to be finished. It is intended to grow, to function as my own memory about bibliographical references, links, full-text sources, definitions, quotes and data that I encounter in my reading and scanning and which I believe I will need in the future (or might be helpful, in student projects etc).

Some parts of it will, however, be relative finished. When I prepare lectures, I place bibliographical references, definitions, quotes and data for the students and provide a list of the pages I am referring to in my teaching. This way, I hope to gradually build up a textbook, or some textbooks of an interactive nature.

If you consider when a single entry is dated you will observe, that many entries are updated very often. If a student ask a question, if i read something in a journal, if somebody discuss with me (or send a friendly or an angry message) I often update some entries. Of course this is a very demanding task, and I always need to do more work, that time allows me to do. So, please do not be angry because you feel something is missing, but suggest how you would like it changed.

Signed contributions from you are also welcome. The intention is to make a tool that might help bring us forward by working together. I believe such a tool is much needed.

Written by lazykoblog on November 17th, 2010


OCLC, music cataloging, and Ralph Papakhian (originally posted 7-20-2010)   no comments

Posted at 9:54 pm in cataloging

Ralph Papakhian, head of technical services in the music library at Indiana University, passed away last winter. Ralph was an inspiration to many people over the years. Back in the early days (ah, youth), when I was the head of music cataloging at Illinois, he and I had something of a music cataloging cabal going on, which is a fancy way of saying we helped each other out. We conducted a huge study of music in the then-new OCLC, which was a good survey of the conditions nascent automated music cataloging operations had to deal with in those days.* Here is a photo (I knew I had this someplace!) of my graduate assistant Constance Wernersbach (Connie) and Gretchen, my beloved cat, who graced my life from 1973 until 1985. Connie was sorting OCLC printouts and calculating some variable, and Gretchen was helping. You can tell, just look at her ears all perked up there! That’s the floor of our living room in Urbana, which, as you can see, was then under renovation … I finished that woodwork the night Lech Walesa took over the government of Poland. Oh my, how the world changes. Well, notice Connie is using an old-fashioned calculator; this was before home computers, folks! (That’s Brad’s gateleg table, for those of you who’ve been to visit us; it’s still sitting in our front hall. He got it at a flea market in New Orleans.) Sorry … brief moment there ….


I remember the study chiefly for being the impetus for my doctorate. It was such a difficult problem to sort through all of the data and it was very frustrating to discover one had gathered data that wouldn’t be useful in the end. I said that famously to Arlene Taylor, and she said “come to Chicago and learn how to do it right,” by which she meant learn research methods for real, and I did, and the rest is history, I suppose.

But we won an award for that study and we dined out on it for a while. When Ralph passed away, I was asked to contribute a paper to a Festschrift, and the first thing I thought of was his study of personal name frequency in music catalogs. I will eventually write an essay about the theoretical value of that study, which reaches far beyond the obvious influence of the stated findings.

Be that as it may, I was reminded also of the OCLC study, and a little bird suggested I should perhaps replicate that study. I thought that was a good idea, and as I was about to teach music cataloging at UWM, I also thought using it as the backdrop for the course might be a good thing for those students.** Plus, it would bring them within Ralph’s orbit, however briefly. So, instead of having those students write term papers I’ve divided them into three different sets of small groups over the course of the semester and they’ve all been working on a replication of sorts. Eventually we will all be coauthors of an article that will appear in the Papakhian Festschrift.

Of course, we can’t exactly replicate the original study, nor would we want to. The original study included analysis of the music holdings by comparing OCLC to the Basic Music Library, to see whether one could (at that time) expect to find standard repertory. There’s clearly no point in replicating that–even in 1981 OCLC had copy for 91.5% of the essential scores and books. But we did replicate the analysis of timeliness for new publications. We searched lists of recently published books on music, music scores, and musical sound recordings from December 2009 and March 2010. Interestingly, while the books and recordings were almost all present, only 70% of the scores were found. So that’s just a small preliminary indication about rapidity of coverage. And I guess it also tells us what sorts of things an original music cataloger is likely to need to be working on these days.

We also received random samples of bibliographic records for scores and recordings from OCLC (thanks to Ed O’Neill and the OCLC Research Division). We searched all of those records and took some basic bibliographic “demographics.” Here are some tidbits from the preliminary analysis:

Scores: 44% have AACR2 descriptions, 10% have pre-AACR2 ISBD descriptions, 24% are full level, 50% are M level (less than full, from batch-loading), 94% have Source: d (not LC), dates entered are consistently even from 1972 until 2002, and the rate of record replacement is constant over time.

Sound recordings: 19% are full level, 48% are M level, 87% have Source: d, dates entered fall into two large clusters early and late, but otherwise are consistent.

(Results, such as they are, are drawn with 95% confidence +/- 5%.) Interestingly, only about half contain cataloging matched to current standards. That is another indicator of the kind of work music cataloging divisions can look forward to.

In the original study, because both Ralph and I ran cataloging departments, we included a sample of workforms from our two divisions, so we could comment on the sorts of changes we were making. For the replication we located the bibliographic records that had been changed the most (in all cases, more than 7 times, and as many as 22 times). Those are being analyzed independently by the students. When they’ve turned in their results I’ll post a summary here.

Cat-lovers among you might be wondering about Gretchen, so here she is, basking in the glory of the award she won for this research (okay, it was the afternoon sun in my study on High Street in Urbana). Still, she was always majestic.

(*Oops, sorry–the original study was Smiraglia, Richard P. and Papakhian, Arsen R. 1981. Music in the OCLC Online Union Catalog: a review. Notes 38: 267-74.)

(**We’ll have to ask the students what they think, but pedagogically, I’ve used the study to drive blogging exercises all through the course. I opened with the tale of how I dropped an entire drawer of music shelflist, and had to refile it, and in reading something like 1100 cards learned a lot. So each of these students has analyzed several hundred bibliographic records, in addition to the 7 they’ve created. I think it’s a good learning opportunity.)

Forgot to post a summary. The research is ongoing, of course; here are some basic results:

Most of the recently published works were found, which means coverage in WorldCat is excellent. That’s quite a change from the early days. The lowest rate of coverage, less than 70%, was for scores. Searchers’ notes raise some interesting questions, including the noise created by apparent duplicates (these are caused by multiple batch-loadings from different sources) and the advantage of having unique item numbers to search with (instead of name-title combinations). Of 306 records for scores: 44% have AACR2 descriptions, 10% have pre-AACR2 ISBD descriptions, 24% are full level, 50% are M level (less than full, from batch-loading), 94% have Source: d (not LC), dates entered are consistently even from 1972 until 2002, and the rate of record replacement is constant over time. Of 309 records for sound recordings: 19% are full level, 48% are M level, 87% have Source: d, dates entered fall into two large clusters early and late, but otherwise are consistent.
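For anyone who wants to check what a "95% confidence +/- 5%" statement implies for samples of about 300 records, here is a minimal sketch in Python. It assumes a simple random sample and the textbook normal approximation for a proportion; the post does not say how the interval was actually derived, so the function names and the method here are illustrative assumptions, not a reconstruction of the study's procedure.

```python
import math

Z_95 = 1.96  # standard normal critical value for 95% confidence


def margin_of_error(p, n, z=Z_95):
    """Half-width of the normal-approximation interval for a proportion p
    observed in a simple random sample of n records (assumed method)."""
    return z * math.sqrt(p * (1 - p) / n)


def sample_size(e, p=0.5, z=Z_95):
    """Smallest n giving a margin of error of at most e, using the
    worst-case proportion p = 0.5 unless told otherwise."""
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


if __name__ == "__main__":
    # Proportions and sample size reported above for scores (n = 306).
    for label, p in [("AACR2 descriptions", 0.44), ("Source: d", 0.94)]:
        print(f"{label}: +/- {margin_of_error(p, 306):.3f}")
    print("records needed for +/- 0.05:", sample_size(0.05))
```

The margin shrinks as a proportion moves away from 50%, so a single "+/- 5%" is best read as a rough overall figure rather than a guarantee for every percentage listed.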

Written by lazykoblog on November 17th, 2010


Delphic cataloging (originally posted 7-19-2010)   no comments

Posted at 9:46 pm in cataloging

I’ve just completed a Delphi study using the editorial board of CCQ as participants to posit a research agenda as part of the Year of Cataloging Research.

It was fun, and the results will appear in an editorial in CCQ in the fall (I’m told it will be in vol. 48, no. 8).

As happens with most research, some startling non-results revealed themselves. That is, a couple of issues came to the surface which weren’t part of the actual study, but I guess I can get away with writing about them here.

One was the frequent iteration of the sentiment that classification isn’t important because its only purpose is shelving (see below July 7 under “Cataloging for all *time*?”). Reflecting on this leads me in a couple of directions. I guess the first gut reaction is that we’re certainly not succeeding as a domain (KO I mean) if that point of view is still prominent among working professionals. I get this all the time from LIS students, and then I work really hard to convince them that classification is for more than shelving (and I don’t always succeed with them, because all they have to do is look around their libraries, or look at American Libraries for stories that abandon Dewey for Barnes-and-Noble-style groupings). But on deeper reflection it seems to me we have to work harder to make the case for classification as a fundamental aspect of information retrieval. Or for that matter, of information itself. It isn’t just that classification could be used to facilitate more efficacious retrieval, although it could and that has been known all through my career in the field. But there also is the empirical knowledge that phenomena generate inherent heuristics for their own classification, and these can provide natural means of translating among KOS.

Another surprise, and I guess I really shouldn’t have been surprised, was the number of folks who called for gathering basic empirical data about catalogs and cataloging. There has been quite a lot of that research, and my two papers on “theory” both addressed the cumulation of those data (“Further progress in theory in knowledge organization,” Canadian journal of information and library science 26 n2/3 (2002): 30-49; and “The progress of theory in knowledge organization,” Library trends 50 (2002): 300-49). But of course those pieces just marked a way-station, as it were; there could and should be a lot more empirical evidence-gathering. But there also is the continuing problem that research results aren’t disseminated in the domain.

Written by lazykoblog on November 17th, 2010


Chaim Zins and the Map of Human Knowledge (originally posted 1-5-2009)   no comments

Posted at 9:41 pm in domain analysis

See his paper in the 2006 ISKO Proceedings. Then please visit the site:

http://www.success.co.il/

Written by lazykoblog on November 17th, 2010


NASKO domain analysis (originally posted 6-19-2009)   no comments

Posted at 9:37 pm in domain analysis

Greetings from rainy Syracuse. This is a small group (around 20 people at any given moment), but the program and the business meeting have both been fascinating. The 10 papers are all really pithy. Based around the concept of North American pioneers, the proceedings are here: http://iskocus.org/nasko2009-proceedings.php. As part of my paper, which was developed using author co-citation analysis of North American KO authors, I ran up a quick co-citation analysis from the 10 papers. There were two clusters–Smiraglia, Miksa, and Shera in one; Hjorland, Mai, Tennis, Olson, and Bates in the other–and the conceptual basis of the clusters seems to be bibliographic classification and fundamentals of KO. The link between the clusters stretches from Shera to Hjorland (which I thought was fascinating). The cluster on faceting (La Barre, Cochrane, Richmond, Ranganathan) dropped off while I was trying to make the map readable; when I have more time I’ll run it again. Anyway, stay tuned ….
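For readers who haven’t met author co-citation analysis, the counting step underneath it can be sketched in a few lines of Python. This is only a toy illustration: the author labels "A" through "D" and the reference lists are invented placeholders, not data from the NASKO papers, and the clustering and mapping done for the real analysis are not shown.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists):
    """Count how many papers cite each pair of authors together."""
    counts = Counter()
    for cited in reference_lists:
        # Deduplicate and sort so each unordered pair is counted
        # once per citing paper, always under the same key.
        for pair in combinations(sorted(set(cited)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: each inner list holds the authors one paper cites.
papers = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "D"],
    ["C", "D"],
]

counts = cocitation_counts(papers)

if __name__ == "__main__":
    for pair, n in counts.most_common():
        print(pair, n)
```

Pairs that are cited together often end up close on the map; a pair that is never co-cited simply has a count of zero, which is how whole clusters can appear or drop away depending on the threshold chosen.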

Written by lazykoblog on November 17th, 2010


Roll over Lubetzky! (originally posted 7-19-2010)   no comments

Posted at 9:33 pm in cataloging

I’m finally finished (for this summer) teaching music cataloging. For the non-credit institute we created, post-institute, 20 “perfect” bibliographic records for scores and recordings, and for the for-credit course I created a “backlog” for the students to catalog that eventually included 21 “perfect” bibliographic records for scores, recordings, and videos. I think it’s safe to say I’m now officially very worried about the intensely silly detail required for even minimal cataloging. My, how things have changed while I wasn’t looking!

As the header here says, Lubetzky would roll over in his grave (or perhaps I should have said my grandmother might have said that, had she known Lubetzky … never mind) to discover what has become of “Is this rule necessary?” Because the answer, most profoundly, is “no, no, a thousand times no!”

It’s bad enough that we’re still transcribing title pages instead of using actual images (ummm, anyone noticed Amazon.com lately?), right down to contents notes instead of browsable pdfs. It’s bad enough, also, that we duplicate data from the AACR2-dictated legible part of the record in the MARC21-encoded part of the record–one says “duplicate” with a grain of salt, because of course, the same thing is slightly different in each place. What’s up with 033 paired with 518? Why, if we gave up coding 047 and 048, are we still coding the fixed field “comp”? Why are we coding 041 for sung and closed-captioned languages and giving it in a note? But then logically we arrive at questions like why are we coding UPCs and ISMNs and ISBNs and on and on and on? When will these become automatic and not part of the cataloging process? Since I’m on a roll, let’s ask why we’re using $4 relator codes in headings.

In addition to making me crazy, it just makes me wonder which part of this is actually cataloging that should be considered professional work and taught in graduate schools, and which part ought to be fully automated?

The authority work, of course, is the knowledge organization component of cataloging, certainly of music cataloging, because it is this part where we establish relationships among performing ensembles and opera companies, for instance; it is this part where we establish uniform titles under composer headings for all of the instantiations of works; in both of these we are creating alphabetico-classed arrangements for the purpose of both collocation and disambiguation. That’s knowledge organization. This is the part Lubetzky would likely approve.

But all the rest–folks, it’s 2010, enough already!

Written by lazykoblog on November 17th, 2010


Instantiation as metonymy (originally posted 1-24-2009)   no comments

Posted at 9:30 pm in instantiation

In the meantime I prepared a paper on superworks for a volume edited by Arlene Taylor. This was fun, because often when I talk about cultural forces as catalyst for instantiation people’s eyes glaze over, but this time I was able to use Brokeback Mountain as an example. It had been fun to see the movie in the US one week, and a week later find a novelization in English with the movie actors on the cover in a bookstore in Amsterdam.

Constellations of works exist with abundance in the bibliographic universe. While this is good news for library users–cultural forces drive the marketplace to see to it that a wide variety of useful instantiations evolves–it presents a challenge for information retrieval. A simple citation for a work might be the anchoring node for a large family of related works. The future of sophisticated information retrieval depends on the development of integrated repositories that allow informed selection among the plethora of entities that share intellectual content. Achieving this goal will bring us much closer to Wilson’s notion of exploitative control of humankind’s store of recorded knowledge.

I recently published a larger meta-analysis of the concept of instantiation (A meta-analysis of instantiation as a phenomenon of information objects. Culture del testo e del documento 9, no.25, gennaio-aprile 2008, pp. 5-25). Here is the abstract:

“Instantiation” is the phenomenon observed among information objects of all types, in which multiple iterations of the information content exist and must be collocated and disambiguated in a retrieval system. The phenomenon has been observed among bibliographic works, cultural heritage artifacts, archival documents, scientific models, and ontological constructs. Studies have demonstrated some consistent theoretical parameters for the concept of instantiation, such as the importance of canonicity as a catalyst for instantiation, positive correlation of age of progenitor with large instantiation sets, and positive correlation of age of progenitor with complexity of instantiation sets. In the present paper all relevant terms are defined, an epistemological analysis of the concept of instantiation is presented in summary form, and a meta-analysis of the phenomenon of instantiation is performed using empirical evidence from several studies. The result demonstrates theoretical consistency across studies, suggesting the importance of the phenomenon for the development of the semantic Web, as well as pan- and inter-institutional digital libraries incorporating representations of both documentary and artifactual information resources.

At the moment I am interested in understanding the concept of instantiation as a form of metonymy.

Written by lazykoblog on November 17th, 2010
