MW2008 - Aggregating Museum Data: Use Issues

Exploring Museum Collections On-line: The Quantitative Method

Frankie Roberto, Science Museum, United Kingdom

Three problems with museum data.

  • 1. Getting it (APIs) – Screen scraping or an FOI (Freedom of Information) request
  • 2. Structure (Metadata) – Some logic involved
  • 3. Dodgy Data (Hard work) – Have to assume data is “good enough”

    Data from the FOI requests includes curator, object, country, year, and acquisition method. It needs a mapping process: not everything maps cleanly, and some values can be collapsed into something simpler, for example “edged weapon” becomes just “weapon”. The same goes for countries. Some cases are tricky, though: what country is “Asia”? This is where you have to say “good enough”.
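
A minimal sketch of what such a mapping step might look like, assuming simple lookup tables. The table contents, field names, and the `unknown` fallback are illustrative assumptions, not the mapping actually used in the talk.

```python
# Hypothetical lookup tables; the mapping used in the talk is not given in these notes.
OBJECT_TYPE_MAP = {
    "edged weapon": "weapon",
    "firearm": "weapon",
    "coin": "currency",
}
COUNTRY_MAP = {
    "england": "United Kingdom",
    "uk": "United Kingdom",
    "united kingdom": "United Kingdom",
    "australia": "Australia",
}

# Fallback for values that do not map to anything (the "good enough" case).
UNMAPPED = "unknown"


def simplify(value, table):
    """Return the simplified category for a raw value, or UNMAPPED if none fits."""
    key = (value or "").strip().lower()
    return table.get(key, UNMAPPED)


def normalize_record(record):
    """Copy a raw record and add simplified object-type and country fields."""
    cleaned = dict(record)
    cleaned["type_simple"] = simplify(record.get("object"), OBJECT_TYPE_MAP)
    cleaned["country_simple"] = simplify(record.get("country"), COUNTRY_MAP)
    return cleaned


# Invented sample rows, only to show the behavior.
raw = [
    {"object": "Edged weapon", "country": "England", "year": 1850},
    {"object": "Coin", "country": "Asia", "year": 1700},
]
for row in map(normalize_record, raw):
    print(row["type_simple"], "|", row["country_simple"])
```

A region such as “Asia” falls through to the fallback value here rather than being forced onto a country, which is one way of accepting “good enough”.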

    Frankie demos a locally hosted website showing the aggregated data. Putting the data into more generic silos makes it much easier to parse for viewing and searching.

    Issues: every object is counted equally (small coins are each counted separately, so there are many more of them), there are no photos, and user interaction is not available. The prototype is at museum-collections.org.

    Uniting The Shanty Towns: Data Combining Across Multiple Institutions

    Seb Chan, Powerhouse Museum, Australia

    People like order, but if you look closer you get mess. Mess is good, yet mess makes mashups hard. Can we agree on standards? Let’s start with calendars. We figured it can’t be hard, it’s just a calendar. But it was: everyone has a different CMS, or no CMS at all. How do we do it? We could have people do it by hand, but that’s too much work. So we scrape, aggregate, build a nice backend, and use sites we can trust. Then we can offer a nice frontend, with RSS and iCal feeds for all these aggregated sites.
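
As a rough illustration of the last step (turning aggregated events into an iCal feed), here is a minimal sketch using only the Python standard library. It assumes the scraping has already produced event dicts with `uid`, `title`, and a timezone-aware `start`; those field names, the sample events, and the `PRODID` value are assumptions made for the example, not details from the talk.

```python
from datetime import datetime, timezone


def escape_text(text):
    """Escape backslash, semicolon, comma, and newline for an iCalendar TEXT value."""
    return (
        text.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def to_ical(events, now=None):
    """Render event dicts as a minimal iCalendar string, earliest event first."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//museum-calendar-sketch//EN",
    ]
    for event in sorted(events, key=lambda e: e["start"]):
        # Convert to UTC so the trailing "Z" is accurate.
        start = event["start"].astimezone(timezone.utc)
        lines += [
            "BEGIN:VEVENT",
            "UID:" + event["uid"],
            "DTSTAMP:" + stamp,
            "DTSTART:" + start.strftime("%Y%m%dT%H%M%SZ"),
            "SUMMARY:" + escape_text(event["title"]),
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"


# Invented events from two hypothetical sources, already scraped into dicts.
museum_a = [{"uid": "a-1@example.org", "title": "Curator talk, gallery 2",
             "start": datetime(2008, 4, 12, 10, 0, tzinfo=timezone.utc)}]
museum_b = [{"uid": "b-7@example.org", "title": "Family workshop",
             "start": datetime(2008, 4, 10, 14, 30, tzinfo=timezone.utc)}]

feed = to_ical(museum_a + museum_b)
print(feed)
```

The sketch skips long-line folding, end times, and de-duplication across sources, all of which a real aggregator would likely need.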

    Semantic web: why can’t we use it for collections? We write themes, add tags, track searches, and so on, but there has to be a better way. Use Calais, a text analysis tool that dynamically generates metadata tags. It’s work humans can do, but this is automatic, which saves a lot of time. It doesn’t always come up with proper tags (but again, it is “good enough”). Once the entities in a record are identified, you can start connecting it to other data in mashups (for example, if a company such as Google is pulled out of a record, you can automatically mash it up with stock prices, locations, etc.).
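
To make the “tag, then connect” idea concrete, here is a small sketch. It does not call Calais; a tiny hand-written gazetteer stands in for the extraction service, and the entity list, record IDs, and record texts are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical gazetteer standing in for an entity-extraction service.
KNOWN_ENTITIES = {
    "google": ("Company", "Google"),
    "sydney": ("City", "Sydney"),
    "powerhouse museum": ("Organization", "Powerhouse Museum"),
}


def extract_tags(text):
    """Return the set of (type, name) entities whose key appears in the text."""
    lowered = text.lower()
    return {entity for key, entity in KNOWN_ENTITIES.items() if key in lowered}


def build_index(records):
    """Map each entity tag to the IDs of the records that mention it, in ID order."""
    index = defaultdict(list)
    for record_id, text in sorted(records.items()):
        for tag in extract_tags(text):
            index[tag].append(record_id)
    return dict(index)


# Invented sample records.
records = {
    "obj-1": "Server rack donated by Google to the Powerhouse Museum.",
    "obj-2": "Poster advertising a Google event held in Sydney.",
}
index = build_index(records)
print(index[("Company", "Google")])
```

Once records share a tag such as `("Company", "Google")`, that tag becomes the join key for pulling in outside data about the same entity. Plain substring matching is crude and will produce some wrong tags, in keeping with the “good enough” theme.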

    Take it to the next level. If you know where Google is, as in our example, you can mash that up with your own location: put your location into a search page and it will show you what is near you (including Google, if you happen to be near it), along with all the other data associated with the original record.
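
A “what is near me” search of this kind can be sketched with the haversine great-circle distance. This is one possible approach, not necessarily what was demoed; the place list, its approximate coordinates, and the function names are assumptions for the example.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearby(places, lat, lon, radius_km):
    """Return (distance_km, name) pairs within radius_km, nearest first."""
    hits = []
    for place in places:
        distance = haversine_km(lat, lon, place["lat"], place["lon"])
        if distance <= radius_km:
            hits.append((round(distance, 1), place["name"]))
    return sorted(hits)


# Approximate coordinates, for illustration only.
places = [
    {"name": "Powerhouse Museum", "lat": -33.878, "lon": 151.200},
    {"name": "Science Museum", "lat": 51.4978, "lon": -0.1745},
]
here = (-33.8568, 151.2153)  # a visitor somewhere in central Sydney

for distance, name in nearby(places, here[0], here[1], radius_km=10):
    print(name, distance, "km")
```

With a 10 km radius only the Sydney entry is returned; each hit could then be joined back to its full collection record.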
