27 April 2016

Research Data Thing 7/23 - Data citation for access & attribution

I first explored Infogr.am in 2012 when I was starting to notice infographics appearing on the web more frequently. Then there seemed to be a huge growth in the addition of interactive visualisations and a proliferation of tools for novices to make their own. I have used it a few times here, here and for work purposes.

The power of visualisations is their ability to make engagement with data easier. They capture the eye more compellingly than tables of data, and they (especially infographics) often include a layer of interpretation to get the 'reader' started with exploring the data. So there is some similarity between data visualisations/infographics and research 'publications'/articles/books in this interpretive role. Referencing or citing data visualisations has become more prevalent as their number has grown on the web, and I think this has contributed to a greater appreciation for the underlying data. Another contributing factor could be the rise of data journalism, with publications such as the Guardian's datablog.

A useful site for learning more about using data is Storytelling with Data.

As an exercise for the Challenge Me section, here is a map I made using Google Fusion Tables with data from the State Library of Queensland, which lists the locations of public libraries. The data is available under a CC licence. In the process of correcting the geocoding that Google automated I ended up messing up some columns of data, so I have not displayed opening hours and contact information as they would be mislabelled. The original source data is correct for these columns. I've run out of motivation, but ideally it would be cool to merge in another data source showing some other type of service, or demographic data for the region.

Source: State Library of Queensland (2016), Queensland Public Libraries February 2016, http://data.gov.au/dataset/queensland-public-libraries (Accessed 27 April 2016)


22 April 2016

Playing with Periscope

I had a little play with the Periscope live broadcasting app just to see how it worked.

The video quality is not wonderful, so there are limitations on how you could use this. Sound was OK.

Here is my test. You need the app to view it.

15 April 2016

Research Data Thing 6/23 - Data Curation & Preservation

I found the list of preservation tools on the ANDS data preservation page very helpful, as it led me on a path through to more tools and services such as MyExperiment, where researchers can share their experiment workflows and plans.

Perusing the University policies on digital preservation highlights the need for suitable resourcing to ensure that appropriate curation is carried out. The metadata requirements go well beyond a simple description of the research data. They extend to preservation metadata such as:

  • Provenance as it changes over time
  • Authenticity
  • What actions have been taken with the data, e.g. copying to new file formats
  • What technology is required to access the data. This will also change over time with format obsolescence
  • Rights management - this too can change over time for example embargo periods will lapse and data may then become open
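
To make the list above a bit more concrete for myself, here is a rough sketch of how those preservation metadata elements might be held in a single record. This is only an illustration: the class and field names (PreservationRecord, PreservationEvent, embargo_until and so on) are my own invention and are not taken from any university policy or metadata standard.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class PreservationEvent:
    """One action taken with the data, e.g. copying to a new file format."""
    when: date
    action: str
    agent: str


@dataclass
class PreservationRecord:
    """Hypothetical record; field names are illustrative assumptions."""
    title: str
    checksum_sha256: str              # authenticity: fixity value to re-check later
    file_format: str                  # technology needed to access the data
    required_software: List[str]
    rights: str                       # rights statement once any embargo lapses
    embargo_until: Optional[date] = None
    events: List[PreservationEvent] = field(default_factory=list)  # provenance over time

    def is_open(self, today: date) -> bool:
        """Data counts as open when no embargo is set or the embargo has lapsed."""
        return self.embargo_until is None or today >= self.embargo_until

    def migrate(self, new_format: str, when: date, agent: str) -> None:
        """Log a format migration as a provenance event, then update the format."""
        self.events.append(
            PreservationEvent(when, f"migrated {self.file_format} to {new_format}", agent)
        )
        self.file_format = new_format
```

The point of the sketch is that provenance and rights are not fixed values: the events list grows every time something is done to the data, and is_open gives a different answer once the embargo date has passed.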

Additionally institutions must commit to forward resourcing to maintain and upgrade preservation activities and systems.

This week I went through all three of the things and also had a little play with DROID, provided by the UK National Archives. It is excellent if you need to quickly identify the file formats in a batch of files to be managed. It links to the PRONOM file format registry to give you more information about each file type.

Screenshot: DROID identifying digital objects in a folder
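
For anyone curious about the general idea behind this kind of identification, here is a minimal sketch that recognises a handful of formats from the first few bytes of each file rather than trusting the file extension. It is not how DROID itself is implemented and it does not use the PRONOM registry; the function names and the tiny signature table are my own assumptions, included only to illustrate signature matching.

```python
from pathlib import Path
from typing import Dict, Union

# A few well-known leading byte signatures ("magic numbers").
# A real registry holds far more formats and more detailed rules.
SIGNATURES = [
    (b"%PDF-", "PDF"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"PK\x03\x04", "ZIP container (also used by DOCX, XLSX, ODT)"),
]


def identify_bytes(header: bytes) -> str:
    """Return a format name if the header starts with a known signature."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_folder(folder: Union[str, Path]) -> Dict[str, str]:
    """Identify every file directly inside a folder by its leading bytes."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            with path.open("rb") as fh:
                results[path.name] = identify_bytes(fh.read(16))
    return results
```

Calling identify_folder on a folder path returns a dictionary mapping each file name to a guessed format, which is roughly the kind of batch report that makes a tool like DROID handy.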

11 April 2016

Research Data Thing 5/23 - Data Sharing

One of the questions raised in Thing 5 is about different ways to make data more openly available.

I'm not sure whether there are shades of openness, but I think there are certainly open data sets which are more easily found and accessed than others. Factors include:

  • marketing/promotion - are researchers and data stewards communicating about the data at appropriate times? For example on release, when the data is updated or augmented, when new data formats are released, and when related topics are in the news.
  • audience targeting - are the communications in the most relevant channels, as well as more general ones for those serendipitous connections?
  • search engine optimisation - is the repository well indexed and is the data description good?
  • identification of the data - the use of DOIs for journal articles is now starting to expand to data sets. In Australia the ANDS DOI service is expanding to cover grey literature. A DOI provides a unique identifier that can be used in other services (altmetrics, research data management systems, etc.) and I believe this would have a spin-off effect of improving findability.