Notions

Ideas, supposings, in the workings. And the occasional thread.

Category Archives: Readings

A joke and thesauri reviewed

Posted on February 15, 2019 by Julie

There wasn’t necessarily any logic to what I went through on Day 4 other than they were mostly things in print sitting on my table. I’m pretty sure I’ve heard this one before but one of my readings actually began with a joke: 

How many librarians does it take to change a light bulb? LC doesn’t change light bulbs. They have INCANDESCENT LAMPS, not light bulbs.

Heh. I think I might modify that punch line for my topic to say LCSH can’t change light bulbs because they already have incandescent lamps. It takes a lot for LC to make changes and I haven’t even read the Sanford Berman book that started the major call to update biased terminology about groups of people. (I have the book but I figured out that for research leave it was better to focus on shorter pieces to get through more. I think I’m a slow book reader too.) And LCSH has both light bulbs (http://id.loc.gov/authorities/subjects/sh90003210) and incandescent lamps (http://id.loc.gov/authorities/subjects/sh85041761) so maybe the joke is now they don’t have to change anything at all.

From there I explored more about different thesauri. For instance, I’m pretty sure now that A Women’s Thesaurus from 1987 is not connected to the Women’s Thesaurus that is the basis for the online thesaurus at Atria: Institute on gender equality and women’s history in the Netherlands. The Atria one seems to have been an international effort within Europe, published in the Netherlands in the early 1990s, whereas the one I have in print is from the late 1980s and describes its first edition as very U.S.-focused, with a wish to expand internationally. The chronology suggests there could be a connection, and there is some overlap in the highest-level categories, but I don’t have enough info yet to say for sure.

I also reviewed more about the only Linked Data vocabulary I’ve encountered so far, Homosaurus.org. There are no connections to LCSH or any other controlled vocabulary. A Women’s Thesaurus also has no connections to LCSH but it did say in the front matter that when an LCSH term worked, that term was used. They talked about doing that to maintain compatibility with LCSH but there’s no indication in this print version where terms match with LCSH so I’m not sure if you’re just supposed to know that or do that work on your own. And there was also an indication that this thesaurus is set up for electronic use but there is no indication of the format or how to get it (and my guess is that in the late 1980s, it was still pretty local and probably involved floppy disks). Possibilities with both to consider, though, depending on what I want to try doing.

I also got into more details about the work of Hope A. Olson and Dennis B. Ward to link up A Women’s Thesaurus and DDC classification. That project happened in the 1990s and tells me there probably was an electronic version of the thesaurus available somehow. I have another article to review from them that goes into even more detail and I’m kind of thinking there’s going to be something cool there that they figured out.

Posted in Metadata, Readings, Research Leave 2019 | Leave a reply

Gathering and connecting

Posted on February 13, 2019 by Julie

Wow, today. I found the best Venn diagram from Hope A. Olson comparing the “mainstream core” perspective offered through classifications like DDC and LCC and controlled vocabs like LCSH to what that “core” actually represents. It’s a smaller set within the entirety of Everyone and limited to the following Boolean combination:

white AND male AND straight AND European AND Christian AND middle-class AND able-bodied AND Anglo

The result is a very tiny splotch within all of those concentric circles. It’s pretty awesome and I have needed this in my life before now.
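To get a feel for how quickly that Boolean combination shrinks, here is a minimal sketch. It assumes, purely for illustration, that each of the eight attributes is a yes/no flag and that "Everyone" is every possible combination of those flags. That is a toy model and not real demographic data, and the function name `narrow` is made up for this example.

```python
from itertools import product

# The eight attributes from the Boolean combination above.
ATTRIBUTES = ["white", "male", "straight", "European",
              "Christian", "middle-class", "able-bodied", "Anglo"]

# Assumption: every attribute is a simple yes/no flag, so "Everyone" is each
# possible combination of flags (a toy model, not real demographic data).
everyone = [dict(zip(ATTRIBUTES, flags))
            for flags in product([True, False], repeat=len(ATTRIBUTES))]


def narrow(population, attributes):
    """Yield (attribute, remaining_count) as each AND is applied in turn."""
    remaining = population
    for attribute in attributes:
        remaining = [person for person in remaining if person[attribute]]
        yield attribute, len(remaining)


steps = list(narrow(everyone, ATTRIBUTES))
core = [person for person in everyone
        if all(person[attribute] for attribute in ATTRIBUTES)]

for attribute, count in steps:
    print(f"AND {attribute:<12} -> {count} of {len(everyone)} combinations left")
```

In this toy model each AND halves what is left, so the "core" ends up as a single combination out of all the possible ones.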

I also got my hands on two separate thesauri for women’s issues (On Equal Terms: A Thesaurus for Nonsexist Indexing and Cataloging from 1977 compiled by Joan K. Marshall and A Women’s Thesaurus from 1987 edited by Mary Ellen S. Capek). I’m interested to see what the differences are between them. They are both in print but A Women’s Thesaurus might have been adapted for online use at Atria, the Institute on gender equality and women’s history in the Netherlands. I’m still figuring that out for sure.

I also sorted out various LGBTQ thesaurus sources from my readings (and in my head). There’s a classification scheme by Dee Michel and David Moore (International Gay and Lesbian Archives Classification System) and there’s another thesaurus from the Netherlands (A Queer Thesaurus: An International Thesaurus of Gay and Lesbian Index Terms) that has become a Linked Data vocabulary called Homosaurus.org. The College of the Holy Cross is now hosting that Linked Data vocabulary online and I still need to check it out some more but IHLIA in Amsterdam is using it to support online searching. I like their concept of supplying Broader Terms, Related Topics, Narrower Terms, and Used For in that visual way. I don’t think there’s a way to activate that without conducting a search first so you don’t start off with that help, but it is an example implementation of controlled vocabulary help.
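The Broader Terms / Related Topics / Narrower Terms / Used For help described above can be sketched as a small data structure. This is only an illustration of the idea and not how IHLIA or Homosaurus.org actually implement it. The class `ThesaurusEntry`, the functions `build_lookup` and `suggest`, and the sample terms are all invented placeholders that do not come from any real vocabulary.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One preferred term and its thesaurus relationships."""
    term: str
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    related: list = field(default_factory=list)
    used_for: list = field(default_factory=list)  # non-preferred variants


def build_lookup(entries):
    """Map preferred terms and their Used For variants to the same entry."""
    lookup = {}
    for entry in entries:
        lookup[entry.term.lower()] = entry
        for variant in entry.used_for:
            lookup[variant.lower()] = entry
    return lookup


def suggest(lookup, query):
    """Return the vocabulary help for a query, or None if nothing matches."""
    entry = lookup.get(query.strip().lower())
    if entry is None:
        return None
    return {
        "Preferred term": entry.term,
        "Broader Terms": entry.broader,
        "Narrower Terms": entry.narrower,
        "Related Topics": entry.related,
        "Used For": entry.used_for,
    }


# Invented placeholder terms, not taken from Homosaurus or any other vocabulary.
ENTRIES = [
    ThesaurusEntry(
        term="Periodicals",
        broader=["Publications"],
        narrower=["Newspapers", "Magazines"],
        related=["Newsletters"],
        used_for=["Serials"],
    ),
]

lookup = build_lookup(ENTRIES)
print(suggest(lookup, "serials"))
```

Because the lookup does not depend on a search having been run, the same help could in principle be offered before a researcher types anything, for example as a browsable list of preferred terms.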

I am also seeing that some controlled vocabularies are supplying connections to LCSH. The BC First Nations Subject Headings from the Xwi7xwa Library at the University of British Columbia is in PDF format online but identifies connected terms from LCSH. And the Lavender Library, Archives and Cultural Exchange in Sacramento, California has finding aids that include LC subject headings along with their own subject headings. The only reason I know anything about the Lavender Library is because the LGBTQ+ Library at Indiana University uses a classification system based on the one used at Lavender Library (called the LLACE Classification Scheme after the full name of the library) and they were awesome and shared it with me.

I’ve also encountered some vocabularies that are discussed but don’t seem to be available (at least not to me). EBSCO has a thesaurus for its LGBT Life resource but I cannot come up with the thesaurus no matter what I try so I don’t think we have access to that through my institution (we have access to the contents of LGBT Life but not the thesaurus). It was mentioned in one of my readings and would be good to see but it doesn’t look like an option for now.

My list of classification systems and controlled vocabularies is growing and getting a little more organized but I have more to review and learn about and a lot more to understand.

Posted in Metadata, Readings, Research Leave 2019 | Leave a reply

Reading, listing, and still learning

Posted on February 12, 2019 by Julie

Today involved digging into details about different classification schemes and controlled vocabularies and I realized I have enough to start a list! I’m interested to see how this list grows and what meta-characteristics they have in common. So far I’m tracking if the classification scheme or controlled vocabulary is available online, if it is in Linked Data format, and where I am finding it (online resource, in a book, in print some other way, etc).
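A list like this can be kept as simple records, one per scheme or vocabulary, with the three meta-characteristics as fields. The sketch below is one possible shape and not the actual list. The class and field names are hypothetical, and the sample rows only restate what other posts on this page say about where each source was found.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackedScheme:
    name: str
    kind: str          # "classification scheme" or "controlled vocabulary"
    online: bool       # available online in some form
    linked_data: bool  # published in a Linked Data format
    found_in: str      # online resource, book, PDF online, etc.


# Sample rows based on what the posts on this page report so far.
TRACKED = [
    TrackedScheme("Homosaurus", "controlled vocabulary", True, True, "online resource"),
    TrackedScheme("BC First Nations Subject Headings", "controlled vocabulary", True, False, "PDF online"),
    TrackedScheme("A Women's Thesaurus (1987)", "controlled vocabulary", False, False, "book"),
    TrackedScheme("On Equal Terms (1977)", "controlled vocabulary", False, False, "book"),
]


def summarize(schemes):
    """Count how many tracked schemes share each meta-characteristic."""
    return {
        "total": len(schemes),
        "online": sum(scheme.online for scheme in schemes),
        "linked_data": sum(scheme.linked_data for scheme in schemes),
        "by_source": dict(Counter(scheme.found_in for scheme in schemes)),
    }


summary = summarize(TRACKED)
print(summary)
```

Adding a new meta-characteristic later means adding one field to the record, which keeps the list easy to grow.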

My readings today were about American Indian classification and subject heading issues in Dewey Decimal Classification, Library of Congress Classification, and Library of Congress Subject Headings as well as more information about Dorothy B. Porter and her work to organize, increase, and provide access to the African and African American collections that became the Moorland-Spingarn Research Center at Howard University. Practices for classifying American Indian resources have placed much of this content in the historic past under sections of the catalog about the history of North America (in both DDC and LCC) as if American Indians don’t even exist anymore. And Porter recalled a time when many libraries grouped anything by an African American author under a DDC heading for colonization (and migration). There are clunky ways to somewhat work within these classification systems but only to a point and only for some material. DDC’s limited room to expand and the slow pace of change in LC just seem to allow these problems to languish. So new classification schemes and controlled vocabularies have been developed and I’m learning how they have been used and how they can be applied to aid in the research process. This is where my thoughts turn to Linked Data possibilities but they aren’t well-formed thoughts yet.

And just to make sure I have some warning lights going off in my head regarding Linked Data, I also read about issues of bias in Knowledge Graphs related to the Semantic Web:

  • data bias (Linked Data from sources being mostly about Europe, Japan, Australia, and the US)
  • schema bias (depending on the ontology you can get very different results for a concept like the article’s example, theater)
  • inferential bias (taking data from a source like DBpedia and running inference results in high confidence assumptions from the graph that say things like: “if X is a US president, X is male”).

That graph could use some more learning. 
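The inferential bias item can be illustrated with a rough sketch of how a rule's "confidence" might be computed from a graph. This uses a simple association-rule style ratio as an assumption. It is not necessarily the method from the article, and the triples are placeholder people rather than real DBpedia data.

```python
def rule_confidence(triples, premise, conclusion):
    """Share of subjects matching `premise` that also match `conclusion`.

    `premise` and `conclusion` are (predicate, object) pairs.
    Returns None when nothing in the graph matches the premise.
    """
    with_premise = {s for s, p, o in triples if (p, o) == premise}
    if not with_premise:
        return None
    with_both = {s for s, p, o in triples
                 if (p, o) == conclusion and s in with_premise}
    return len(with_both) / len(with_premise)


# Hypothetical toy graph: placeholder people, not real DBpedia data.
toy_graph = [
    ("person1", "role", "US president"), ("person1", "gender", "male"),
    ("person2", "role", "US president"), ("person2", "gender", "male"),
    ("person3", "role", "US president"), ("person3", "gender", "male"),
    ("person4", "role", "senator"), ("person4", "gender", "female"),
]

premise = ("role", "US president")
conclusion = ("gender", "male")

before = rule_confidence(toy_graph, premise, conclusion)

# One counterexample shows the "rule" was only a pattern in the data so far.
extended_graph = toy_graph + [
    ("person5", "role", "US president"), ("person5", "gender", "female"),
]
after = rule_confidence(extended_graph, premise, conclusion)

print(f"confidence before: {before:.2f}, after: {after:.2f}")
```

The high confidence here says something about what the data happens to contain, not about what must be true, which is why the result needs context to be read responsibly.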

This brings up something that is coming across in other readings. Bias on its own isn’t necessarily a problem. Everyone has implicit biases. The problem comes when that implicit bias becomes systemic and is presented as the appropriate or authorized way to organize and interpret classifications and subject matter. Bias without recognition or documentation, without transparency, is a problem. Or in the case of this knowledge graph example, results without context show bias.

Posted in Ideas, Metadata, Readings, Research Leave 2019 | Leave a reply

Reading and Learning

Posted on February 11, 2019 by Julie

Today I read more from Safiya Noble’s book Algorithms of Oppression and then Jessie Daniels’ article from 2013 on “Race and Racism in Internet Studies: A Review and Critique.” I am seeing more how the racism that underlies the United States as a country is reflected and even magnified on the Internet through commercial search engines, the pornography industry online, and the capitalistic way the Internet has developed over time, supporting a dominant color-blind “norm” (aka white and male). Noble shows this through her study of Google search results and Daniels shows this through her critique of Internet studies’ unwillingness to center race or the history of racism as it is manifested in online spaces.

I was also made aware of my own complicity in supporting racist narratives through a meme I had included in a previous blog post. I thought I knew the story behind the meme but I didn’t. I removed it from the blog post and have to accept the fact that I am part of the problem. It was a learning moment to see that what I had shared had a source based on a racist Internet meme, but that is what it is. And I am learning. 

The United States is race-based in how we function. Racial divisions are defined by those in power, a group also defined by race (White), and all other groups are kept in an Other category that exists still today online. Acknowledging this racism (but not accepting it) is one way to figure out strategies to counteract it.

Within the academic library, we have to serve our researchers in a more inclusive way. I think part of this is by being more transparent about the deficiencies of our description, not because we are bad at describing things but because we are part of the aforementioned society and that is reflected in the metadata choices we make and the controlled vocabularies we use. I’m not sure yet how that transparency can be best expressed, but I do think there are ways. 

Posted in Metadata, Readings, Research Leave 2019 | Leave a reply

All the slices of pie

Posted on October 28, 2014 by Julie

Readings on combining and exposing library data sets

I feel like I’m seeing calls across a variety of subject domains for sharing data and making it easily available and reusable. National funding models in the U.S. are beginning to require sharing of data so this idea of providing your data for others to use is kind of catching on.

I also finally read Aaron Swartz’s posthumously published “A Programmable Web: An Unfinished Work,” which is an important read for a multitude of reasons. He makes his own call for exposing data in ways that make it easy for people to grab data they want or get all of the data and make use of it however they want (Chs. 5-7). His ideas implement this around JSON and web-based technology. I like that but I think there’s probably also still a place for XML in exchanging data in a standardized way or communicating data at an institutional level (feeding our data into DPLA, for example).
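As a small illustration of the JSON and XML point, the same record can be exposed both ways using only Python's standard library. This is a sketch under stated assumptions: the field names `title` and `creator` and the `record` wrapper element are hypothetical and are not the schema DPLA or any other aggregator actually expects.

```python
import json
import xml.etree.ElementTree as ET

# A single record with hypothetical field names.
record = {
    "title": "A Programmable Web: An Unfinished Work",
    "creator": "Aaron Swartz",
}


def to_json(data):
    """Serialize the record as JSON, the web-friendly option."""
    return json.dumps(data, indent=2)


def to_xml(data, root_name="record"):
    """Serialize the same record as a flat XML element."""
    root = ET.Element(root_name)
    for key, value in data.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


json_text = to_json(record)
xml_text = to_xml(record)

print(json_text)
print(xml_text)
```

Both outputs carry the same information, so the choice between them is mostly about who is on the receiving end.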

With a goal of combining our library data for discovery, access, and reuse, I’ve been trying to uncover a literature review of sorts on combining data sets within a library context. I’ve come upon ideas about how to evaluate and compare data sets for commonalities and how to think about providing data in ways that are actually useful and understandable to researchers outside of the library context. Following is the current state of an annotated bibliography, plus some delicious slices of pie because, well, pie:

Slice of Cherry Blueberry Pie by digidi via flickr

Abed, Alea. (2014). Podcast: Project Blacklight, Hydra and libraries in the digital age. Lucidworks. http://www.lucidworks.com/blog/podcast-project-blacklight-hydra-and-libraries-in-the-digital-age/

Bess Sadler from Stanford University discusses Project Hydra and what is happening in new developments. They are trying to improve discovery and access for digital libraries by adding a technology stack onto the inventory system that has been digital repositories up to now. They are also improving this inventory system by providing self-deposit interfaces. Two new areas of work highlighted were GeoBlacklight for GIS data and displaying archival collections effectively in Blacklight.

Maple-Bourbon Pumpkin Pie by djwtwo via flickr

Breeding, Marshall. (2005). Plotting a new course for metasearch. Computers in Libraries, 25:2, pp. 27-29.

Breeding makes the case for a giant central search of content instead of federated searching (searching against multiple targets). This provides a single access point instead of multiple search interfaces and lessens the burden of searching multiple targets and needing multiple indexes. Making this switch can be difficult since different providers don’t always make metadata openly available for combining.

Emde, Judith Z., Sara E. Morris, and Monica Claassen-Wilson. (2009). Testing an academic library website for usability with faculty and graduate students. Evidence Based Library and Information Practice, 4:4, pp. 24-36.

This article describes findings from a usability study of a library website. Findings include that graduate students tend to get results that are too broad from federated searching. They have to use quotation marks to be precise and results can be too mixed, making it hard to tell what is what. Federated searching is most helpful to graduate students to point out resources or databases they have not previously used. Another finding was that graduate students want subject-specific searching or limited combined subject searching, not cross-subject searching. Subject-specific resource help is most useful when given within a context, such as a course.

Hofmann, Melissa A. and Sharon Q. Yang. (2011). How next-gen r u? A review of academic OPACs in the United States and Canada. Computers in Libraries 31:6, pp. 26-29.

This initial study, which was followed up in 2011, found that of 260 academic libraries surveyed, very few were using federated searching to combine data sources and most were still only offering catalog searching. If there was a discovery layer tool in use, it tended to provide faceted navigation.

Hofmann, Melissa A. (2012). “Discovering” what’s changed: a revisit of the OPACs of 260 academic libraries. Library Hi Tech 30:2, pp. 253-274.

In this 2011 follow-up to a 2009 study that found that discovery layers were not in wide use among academic online library catalogs, more institutions are using discovery layers but there are weaknesses in what these tools can do in terms of unified one-stop searching, recommended items, and relevancy display based on circulation statistics. Interest is shown in the eXtensible Catalog (XC) Metadata Toolkit because it “aggregates metadata from various silos, normalizes (cleans-up) metadata of varying levels of quality, and transform[s]… metadata into a consistent format for use in the discovery layer.” [p. 261]

Dutch Apple Pie a la mode by mattmendoza via flickr

Johnson, Thomas. (2013). Indexing linked bibliographic data with JSON-LD, BibJSON and Elasticsearch. The Code4Lib Journal, 19. http://journal.code4lib.org/articles/7949

This article describes using JSON to map RDF into JSON-LD (linked data). The main point of interest for me is that indexes were not actually combined but kept separate. This helped to include context along with the index and allowed for different mappings based on discrepancies between data sources. There were no performance issues querying across multiple indexes using JSON.
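The idea of keeping indexes separate, each with its own mapping, can be sketched in plain Python. This is a stand-in for illustration only. It is not Elasticsearch and not the code from the article, and the index names, field names, and records below are all hypothetical.

```python
# Hypothetical records: each source keeps its own field names.
INDEXES = {
    "catalog": [
        {"title": "Maps of the river valley", "creator": "Example, Ann"},
        {"title": "Pie recipes", "creator": "Sample, Bo"},
    ],
    "repository": [
        {"dc_title": "River valley survey data", "dc_creator": "Example, Ann"},
    ],
}

# One mapping per source: shared search field -> that source's field name.
MAPPINGS = {
    "catalog": {"title": "title", "author": "creator"},
    "repository": {"title": "dc_title", "author": "dc_creator"},
}


def search_all(indexes, mappings, field, text):
    """Search every index separately and keep the source with each hit."""
    needle = text.lower()
    hits = []
    for source, records in indexes.items():
        source_field = mappings[source].get(field)
        if source_field is None:
            continue  # this source has no equivalent field
        for record in records:
            if needle in record.get(source_field, "").lower():
                hits.append({"source": source, "record": record})
    return hits


hits = search_all(INDEXES, MAPPINGS, "title", "river valley")
for hit in hits:
    print(hit["source"], "->", hit["record"])
```

Because every hit carries its source, the context of where a record came from survives the combined search, which is the part of the article's approach that interested me most.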

All We are saying is Give Pie a Chance by bitzcelt via flickr

Kipp, Margaret E. I. (2005). Complementary or discrete contexts in online indexing: A comparison of user, creator, and intermediary keywords. Canadian Journal of Information & Library Sciences 29:4, pp. 419-436.

This article describes a study comparing descriptors assigned by different actors in the metadata creation process. 165 articles from CiteULike (a bookmarking web service similar to del.icio.us) were compared based on user-provided tags, author-provided keywords, and intermediary-provided descriptors using the Voorbij scale along with structured thesauri from INSPEC and Library Literature to identify broader, narrower, and related terms. The study found that user tags are quite different from author- and intermediary-provided descriptors and can supplement a controlled vocabulary entryway to content. Additionally, providing both abbreviations and long-form terms helped to expand content use to interdisciplinary research.

Limani, Fidan and Vladimir Radevski. (2013). Enrichment of digital libraries with Web 2.0: Resources for enhanced user search experience. 8th Annual South-East European Doctoral Student Conference: Infusing Research and Knowledge in South-East Europe. South-East European Research Center: Thessaloniki, Greece, 2013. pp. 294-300.

This article proposes connecting “traditional” scientific research resources (indexed, categorized, and searchable) with scientific Web 2.0 data (socially maintained scholarly library services like blogs and wikis) by tagging those Web 2.0 data sources with authoritative links. This introduces Semantic Web connections to tie together these data sources and expose digital library collections more effectively, reducing the “search span and effort” on the part of the user. [p. 299]

key lime pie by roboppy via flickr

Stephens, Owen. (2011). Mashups and open data in libraries. Serials: The Journal for the Serials Community 24:3, pp. 245-250.

Stephens argues that making data open involves more than just licensing – it should refer to “the ease with which data can be used, taking into consideration aspects such as format and access mechanisms.” [p. 246] The most common ways library data is shared are via XML, JSON, and, increasingly, RDF but these “formats offered are usually familiar only to those who specialize in library data.” [p. 247] Offering APIs to access data makes it easier to understand and use the data, allowing mashups to occur and making new ways to use data possible.

Thomas, Marliese, Dana M. Caudle, and Cecilia M. Schmitz. (2009). To tag or not to tag? Library Hi Tech, 27:3, pp. 411-434.

This article describes a study comparing user-contributed tags to controlled vocabulary subject headings (LCSH). The comparison identifies broader, narrower, and related terms, and finds new terms among the tags that can be brought in to enhance the controlled vocabulary used in a system (a “collabulary”). Kipp’s modification of the Voorbij scale was used to look at tags compared to hierarchical relationships from a thesaurus. Tagging is generally for personal use (such as finding something later) so there needs to be an incentive to create tags.

Apple Pie by belochkavita via flickr

Tillett, Barbara B. (2000). Authority control on the web. In: Bicentennial Conference on Bibliographic Control for the New Millennium: Confronting the Challenges of Networked Resources and the Web (Washington DC, November 15-17, 2000).

This report discusses the concept of a “mandatory minimal set of data elements… in all authority records to facilitate international exchange or use.” [p. 5] It shows growing support for authority control to manage different sources of common metadata and the idea of common core data points for aligning and relating records from different sources.

Voorbij, Henk J. (1998). Title keywords and subject descriptors: A comparison of subject search entries of books in the humanities and social sciences. Journal of Documentation, 54:4, pp. 466-476.

This article describes results from two studies – one where librarians compared subject descriptors and words in titles for 475 catalog records and rated them on a scale of 1 (subject is the same as the title) to 7 (subject is not at all in the title) and a second where librarians searched on subject and title words for the same topic. Findings suggest that subject descriptors enhanced recall for searches and 37% of the first study’s records were enhanced by subject descriptors. [The scale used for comparison has been used in other studies (Thomas, et al., 2009; Kipp, 2006) with variations in what is being compared but focusing on comparing different types of metadata.]

 

Posted in Metadata, Readings | Leave a reply
