I don’t know who I was kidding thinking I was going to get sucked back into this research leave and article work while we had a house full of people – it did not happen and we had a crazy great time. I did spend just a little bit of the first weekend of vaca finding a few more articles that I’m going to read through tomorrow, but I gave myself a break on posting about it because it really was only an hour or so. Back at it in the new year!
Today I tried to understand more about Fedora, Hydra, and how they connect to triplestores (or more like how triplestores connect to Fedora). I worked through the Dive into Hydra and Dive into Hydra-Works tutorials as well as an older ActiveFedora 7 tutorial on working with RDF metadata in ActiveFedora directly. None of them actually showed me anything with external triplestores though. According to the Fedora 4.x wiki documentation, a triplestore can be set up to work with Fedora/Hydra but I still don’t know if I understand its purpose or how it helps to have all of this Fedora data in RDF now. What I’m understanding is the following:
External searches against Fedora 4 data can supposedly happen using triplestores, but any external triplestore seems to be used only to help with CRUD calls (create, read, update, delete) to and from Fedora via a Java Message Service (JMS) indexer, not for any end-user search and discovery interfaces. The current way to connect a triplestore to Fedora involves a messenger like fcrepo-message-consumer, and its documentation carries the explicit caveat that Fedora doesn’t support blank nodes. That tells me that getting complex hierarchical RDF from Fedora into a triplestore isn’t going to work, because Fedora can’t send it over that way. I had thought about using the external triplestore to manage the full metadata of an object and putting RDF properties into Fedora only if they were simple and indexable (useful for identifying a Fedora object as unique). But Fedora is still the main place for all original source metadata, and it isn’t handling RDF when it is complex and hierarchical.
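To make the blank node caveat concrete, here is a minimal sketch in plain Python (no RDF library) of what "complex hierarchical RDF" looks like once it is written out as triples. Everything in it is made up for illustration: the `flatten` and `skolemize` helpers, the `http://example.org/object1` URI, and the sample record are hypothetical and are not part of Fedora, Hydra, or fcrepo-message-consumer. Nested values become blank nodes (`_:b0`, `_:b1`), and the second step shows one generic workaround, replacing each blank node with a hash URI under the object so that every node has a name. Whether Fedora 4 or any particular triplestore actually handles it this way is something I have not confirmed.

```python
from itertools import count


def flatten(subject, data, triples=None, ids=None):
    """Flatten nested dict metadata into (subject, predicate, object) triples.

    Each nested dict becomes a blank node labelled _:b0, _:b1, ...
    This is a toy model of RDF, not a real serializer.
    """
    if triples is None:
        triples = []
    if ids is None:
        ids = count()
    for predicate, value in data.items():
        if isinstance(value, dict):
            node = f"_:b{next(ids)}"          # unnamed intermediate node
            triples.append((subject, predicate, node))
            flatten(node, value, triples, ids)
        else:
            triples.append((subject, predicate, value))
    return triples


def skolemize(triples, base):
    """Give every blank node a URI under `base` (e.g. a hash URI).

    Toy assumption: any term starting with "_:" is a blank node label.
    """
    def swap(term):
        if isinstance(term, str) and term.startswith("_:"):
            return f"{base}{term[2:]}"
        return term

    return [(swap(s), p, swap(o)) for s, p, o in triples]


# Hypothetical record: a creator with a nested affiliation.
record = {
    "dc:title": "Example object",
    "dc:creator": {
        "foaf:name": "A. Person",
        "schema:affiliation": {"foaf:name": "Example University"},
    },
}

triples = flatten("http://example.org/object1", record)
flat = skolemize(triples, "http://example.org/object1#")

for triple in flat:
    print(triple)
```

The point of the second step is that the hierarchy survives (the creator still points at an affiliation), but nothing in the result depends on blank nodes. The trade-off is that the intermediate nodes now need stable names that someone has to mint and maintain.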
All this to say that I still have questions. I don’t think I understand enough about triplestores (can they manage complex hierarchical triples?) and I don’t think I understand the purpose of the external triplestore on Fedora 4 (is it just a data endpoint, another way to index Fedora data, or does it actually help manage the RDF triples in Fedora?). This warrants further investigation online and possibly some actual conversation with real people who know things about Fedora 4 and Hydra. Asking questions on listservs always makes me feel slightly stupid and like I haven’t tried hard enough, but I don’t have all the time in the world to explore this on my own. Tomorrow I need to write more and read those other articles I found, so maybe I’ll get up the gumption by the end of the day?