Connecting the dots: Creating a joined-up approach to Data Management Plans

This was originally published on the Library Research Plus blog on 6th December 2018 and is re-posted with kind permission.

Eight months on from a major revision of data management planning processes at the University of Manchester, we're often asked how we work, so we thought it might be useful to share how we created a process that gives researchers maximum value from creating a Data Management Plan (DMP) while assisting the University's compliance with GDPR.

The University of Manchester has required a DMP for every research project for nearly five years, as have most major UK research funders, and throughout this period we had an internal data management planning tool. Whilst this tool was heavily used, we wanted something more user-friendly and easier to maintain. We were also keen on a tool which would allow Manchester researchers to collaborate with researchers at other institutions, so we turned to DMPonline, maintained by the Digital Curation Centre. Once the decision had been taken to move to DMPonline, we took the opportunity to consider links to the other procedures researchers complete before starting a project, to see if we could improve the process and experience.

The One Plan That Rules Them All

We brought together representatives from the Library, Information Governance Office, Research IT, ethics and research support teams to map out the overlaps in forms researchers have to complete before beginning research. We also considered what additional information the University needed to collect to ensure compliance with GDPR. We established that whilst there were several different forms required for certain categories of research, the DMP is the one form used by all research projects across the University and so was the most appropriate place to be the ‘information asset register’ for research required under GDPR.

We also agreed on common principles that:

  • Researchers should not have to fill in the same information twice;
  • Where possible, questions would be multiple choice or short-form, to minimise completion time;
  • DMP templates should be as short as possible whilst capturing all of the information needed to provide services and assist in GDPR compliance.

To achieve this we carefully considered all existing forms. We identified where there were overlaps and agreed on wording we could include in our DMP templates that would fulfil the needs of all teams – not an easy task! We also identified where duplicate questions could be removed from other forms. The agreed wording was added to our internal template, and as a separate section at the beginning of every funder template as the 'Manchester Data Management Outline', to ensure consistency across every research project at the University.

The Journey of a DMP

Once we had agreed on the questions to be asked, we designed a process to share information between services with minimal input from researchers. The journey of a DMP begins, once a researcher has created their plan, with an initial check of the 'Manchester Data Management Outline' section by the Library's Research Data Management (RDM) team. Here we're looking for any significant issues and giving researchers advice on best practice. We ensure that all researchers who create plans are contacted, so that everyone benefits from the process, even if that is just confirmation that they are doing the right thing.

If the issues identified suggest the potential for breaches of GDPR or a need for significant IT support, these plans are sent to the Information Governance Office and Research IT respectively. At this point all researchers are also offered the option of having their full DMP reviewed, using DMPonline’s ‘request feedback’ button.

If researchers take up this service – and more than 200 have in the first eight months – we review their plans within DMPonline, using the commenting functionality, and return the feedback to the researcher within 10 working days.

If a research project requires ethics approval, researchers are prompted whilst filling in their ethics form to attach their DMP and any feedback they have received from the Library or other support services. This second step was introduced shortly after the move to DMPonline so that we could ensure that the advice being given was consistent. These processes ensure that all the relevant services have the information they need to support effective RDM with minimal input from researchers.

Implementation

On 17th April a message was sent to all researchers informing them of the change in systems and new processes. Since then Manchester researchers have created more than 2000 DMPs in DMPonline, demonstrating brilliant engagement with the new process. Sharing information between support services has already paid dividends – we identified issues with the handling of audio and video recordings of participants, which contributed to the development of a new Standard Operating Procedure.

Next Steps

Whilst we have seen significant activity in DMPonline and a lot of positive feedback about our review service there are still improvements to our service that we would like to make. We are regularly reviewing the wording of our questions in DMPonline to ensure that they are as clear as possible; for example, we have found that there is frequent confusion around the terminology used for personal, sensitive, anonymised and pseudonymised data. There are also still manual steps in our process, especially for researchers applying for ethics approval, and we would like to explore how we could eliminate these.

Our new data management planning process is a significant improvement, and all the services involved in RDM-related support at Manchester now have a much richer picture of the research we support. The University of Manchester has a distributed RDM service and this process has been a great opportunity to strengthen these links and work more closely together. Our service does not yet meet the ambitious aims of machine-actionable DMPs, but we hope that it offers an improved experience for the researcher and is a first step towards semi-automated plans, at least from a researcher perspective.

A Research Data Librarian’s experience of OpenCon2017

This was originally published on the Library Research Plus blog on 6th December 2017 and is re-posted with their kind permission.

After following and participating in the OpenCon Librarian calls for much of the last year I was delighted to win a partial scholarship to OpenCon 2017. The monthly calls had raised my awareness of the variety of Open Access, Education and Data initiatives taking place elsewhere and I was keen to learn more about others’ advocacy efforts with students, librarians, policy makers, social entrepreneurs and researchers from around the world.

Too often when discussing Open Access and Data it seems that researchers, librarians and policy makers are at separate conferences and having separate conversations; so it is great that OpenCon brings together such a diverse group of people to work across national, disciplinary and professional boundaries. Thus I was very excited to arrive in Berlin for a long weekend working with a dedicated group of advocates on how to advance Open Research and Education.

The weekend started with a panel of inspiring early career professionals discussing the initiatives they are working on, which showed the many different levels at which it is possible to influence academic culture. These included Kholoud Al Ajarma's work enabling refugee children to tell their stories through photography, the Bullied into Bad Science campaign which supports early career researchers in publishing ethically, Robin Champieux's efforts to effect grassroots cultural change, and research into how open science is (or is not!) being incorporated into Review, Promotion and Tenure in American and Canadian universities. Learning about projects working at the individual, institutional and national levels was a great way to get inspired about what could be achieved in the rest of the conference.

This emphasis on taking practical action was a theme of the weekend: OpenCon is not an event where you spend much time listening! After sharing how we all came to be interested in 'Open' during the stories of self on Saturday afternoon, we plunged into regional focus groups on Sunday, working on how we can effect cultural change as individuals in large institutions.

The workshops used design thinking, so we spent time thinking through the goals, frustrations and preoccupations of each actor. This meant that when we were coming up with strategies for cultural change they were focused on what is realistic for the people involved rather than reaching for a technical solution with no regard to context. This was a great chance to talk through the different pressures facing researchers and librarians, understand each other’s points of view and come up with ways we can work in alliance to advocate for more openness.

During the do-athon (think a more inclusive version of a hackathon) I spent much of my time working with a group led by Zoe Wake Hyde looking at Open Humanities and Social Sciences, which was born out of one of the unconference sessions on the previous day.

When discussing Open Research, and particularly Open Data, the conversation is frequently geared towards the type of research and publishing which occurs in the physical sciences and so solutions do not take account of the challenges faced by the Humanities and Social Sciences. These challenges include a lack of funding, less frequent publishing which puts more pressure on each output, and the difficulties of making monographs Open Access. Often at conferences there are only a couple of us who are interested in the Humanities and Social Sciences so it was great to be able to have in depth discussions and start planning possible actions.

During the initial unconference session we talked about the differences (and potential conflicts) between Digital Humanities and Open Humanities, the difficulties in finding language to advocate effectively for Open in the Humanities, and the difficulty of sharing qualitative social sciences data. It was reassuring to hear others are having similar difficulties in getting engagement in these disciplines and, whilst trying to avoid it turning into a therapy session, to discuss how we could give the Humanities and Social Sciences a higher profile within the Open movement. It was by no means all discussion and, true to stereotype, several of our group spent the afternoon working on their own, getting to grips with the literature in this area.

It was inspiring to work together with an international group of early career researchers, policy makers and librarians to get from an initial discussion about the difficulties we are all facing to a draft toolkit for advocates in little over 24 hours. Our discussions have continued since leaving Berlin and we hope to have a regular webchat to share best practice and support each other.

Whilst getting involved with practical projects was a fantastic opportunity my main takeaway from the weekend was the importance of developing a wider and more inclusive perspective on Open Research and Education. It is easy to lose sight of these broader goals when working on these issues every day and getting bogged down in funder compliance, the complications of publisher embargoes and the technical difficulties of sharing data.

The Diversity, Equity and Inclusion panel focused on the real-world impact of openness and the importance of being critical in our approaches to it. Denisse Albornoz spoke powerfully on recognising the potential for Open Research to perpetuate unequal relationships across the world, with wealthy scientists being the only ones able to afford to publish (as opposed to being the only ones able to afford to read the literature) and so silencing those in developing countries. Tara Robertson highlighted the complicated consent issues exposed through opening up historic records, Thomas Mboa focused on how Open Access prioritises Western issues over those important in Africa, and Siko Bouterse spoke about the Whose Knowledge project, which campaigns to ensure knowledge from marginalised communities is represented on the Internet.

This panel, much like the whole of OpenCon, left me reflecting on how we can best advance Open Access and Open Data and re-energised to make a start with new allies from around the world.

Making data open: resources, gaps and incentives

This was originally published on the Software Sustainability Institute blog on 24th May 2017 under a CC-BY-NC licence.

By Naomi Penfold, eLife, Penny Andrews, University of Sheffield, M. H. Beals, Loughborough University, Rosie Higman, University of Cambridge, Callum Iddon, Science and Technologies Facilities Council, Cyril Pernet, University of Edinburgh, Diana Suleimenova, Brunel University London.

This post is part of the Collaborations Workshops 2017 speed blogging series.

What resources already exist and what’s needed next?

Data sharing relies on having somewhere the data can be accessed, typically a repository. Some researchers are lucky enough to have university repositories; others have to rely on external resources, such as Zenodo, or on the disciplinary repositories listed in Re3data. This is a simple but necessary first step: identifying the most suitable place to host data.

It is also worth noting that open data does not just mean posting your research dataset online with your publication. The FORCE11 community advocates for open and FAIR data: Findable, Accessible, Interoperable, and Reusable. Understanding all the best practices and resources available to help achieve these goals can be intimidating for a researcher who wants to start sharing their data.

It can be confusing to know how best to share data: what formats are best? Which data should be shared? How best can data be managed so that it can be shared later? Both the Digital Curation Centre and DataCite provide great resources to support this. There are also tools available to help access, and make the best use of, data that is already available in semi-open states (open, but difficult to access, particularly in machine-readable form). Below is a (not at all exhaustive) list of some key tools and resources to help any researcher to get started with open research data:

Literacy gap

As well as ensuring that different disciplinary communities can adopt these resources in an appropriate way, their adoption also relies on researchers having the skills to use them. Many researchers teach themselves coding skills or learn some basics through Software Carpentry and Data Carpentry, but this relies on enthusiasts doing so in their spare time. However, there remain many researchers in academia who understandably struggle with basic data management, such as keeping regular back-ups. When setting out best practices for open data, whether they are tools, frameworks or standards, it is important to recognise these disparities in data and software literacy. Standards and frameworks should be accessible to different skill levels and allow researchers to develop their skills and move "up" the framework.
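
To give a concrete sense of the modest scripting involved in one such basic habit, here is a minimal sketch (the folder name is hypothetical) that records a SHA-256 checksum for every file in a dataset folder, so a researcher can later verify that a back-up or shared copy arrived intact:

```python
import hashlib
from pathlib import Path

def build_manifest(data_dir):
    """Record a SHA-256 checksum for every file under data_dir, so a
    back-up or shared copy can later be verified against the original."""
    manifest = {}
    for path in sorted(Path(data_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[str(path.relative_to(data_dir))] = digest
    return manifest

# Usage (hypothetical folder name):
#   manifest = build_manifest("my_dataset")
#   ...then compare against a manifest built from the back-up copy
```

Nothing here is beyond the level taught in an introductory Carpentries lesson, which is rather the point: the gap is one of confidence and time, not of inherent difficulty.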

Workflows mid-project

It is often stated that it is important to introduce best practices early on in a project. However, one of the key issues facing widening the range of academics involved in Open Data, both in its use and its creation, is the dynamic and fast-moving landscape of tools, standards and expectations. It is rare for a principal investigator or sole researcher to be both at the stage of their career where they are producing data suitable for open dissemination and at the true "start" of a project. Even those entering the field as postgraduates often come with an assortment of materials from previous study or supervisors that must somehow be integrated into their new project's workflow. A key factor in the dissemination of Open Data practices may therefore lie in the ability to adapt methods to existing workflows and datasets. This may mean incremental or iterative improvement of metadata, documentation and format as the project progresses, rather than a halting and reformatting of existing materials. In other cases, it may mean providing tools that automatically reformat data created using existing proprietary or eclectic storage practices, highlighting where additional information is required. In any case, any new practice must be made immediately actionable to learners lest it become lost in the "for my next project" mental file.
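
As one illustration of adapting methods to existing materials rather than reformatting everything at once, the sketch below converts a single legacy file to plain CSV and writes a small provenance sidecar alongside it (the tab-delimited legacy format and the file names are assumptions for illustration), so migration can proceed file by file as the project touches them:

```python
import csv
import json
from datetime import date
from pathlib import Path

def migrate_to_csv(legacy_path, out_dir):
    """Convert one tab-delimited legacy file to plain CSV and write a JSON
    sidecar recording provenance, so a backlog of files can be migrated
    incrementally rather than halting the project to reformat everything."""
    legacy_path = Path(legacy_path)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_csv = out_dir / (legacy_path.stem + ".csv")
    with legacy_path.open(newline="") as src, out_csv.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter="\t"):
            writer.writerow(row)
    # Sidecar notes where the open copy came from and when it was made.
    sidecar = {"source": legacy_path.name,
               "converted": date.today().isoformat(),
               "format": "text/csv"}
    out_csv.with_suffix(".json").write_text(json.dumps(sidecar, indent=2))
    return out_csv
```

The design point is the incrementalism: each call migrates one file and leaves a record, so partial progress is still useful progress.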

Incentives and compliance

Even with sufficient knowledge and understanding of best practices in research data sharing, few researchers are likely to adopt these practices without appropriate incentives. How to achieve this remains an open question, and it is not a foregone conclusion that there will be a single mechanism to incentivise all data creators to practise Open Data. Nor is there a single stakeholder responsible for the incentives. That said, funding bodies hold the obvious stick and carrot: several funders of scientific research have requested that researchers share their data for many years. The Engineering and Physical Sciences Research Council has been singled out as a funder that goes beyond a policy statement and demands compliance in order for the individual to receive another grant in the future (see "Research data management and openness: The role of data sharing in developing institutional policies and practices" by Rosie Higman and Stephen Pinfield). Success is also contingent on the community, contributors and users alike. Should we be demanding that open data users eat their own dog food and contribute to the open data pool too? Or does this conflict with the definition of open data published by the Open Data Institute: "Open Data is data anyone can access, use and share"?

There is a dilemma between taking the carrot or the stick approach to getting researchers to share non-publication research outputs such as data and code. For the Open Access movement, it appears that only the stick has worked, with threats from funders, including the Higher Education Funding Council for England, Research Councils UK, Wellcome and the Bill & Melinda Gates Foundation, that money will no longer be forthcoming if researchers do not comply. We are at an earlier stage in the Open Data movement, and it may be possible to take a different route when it comes to advocacy and incentives to share. Incentives can seem abstract: beyond the life sciences the "it saves lives" argument doesn't always work. Punishing researchers with fewer resources, e.g. those in the Global South, for not sharing or for sharing lower-quality data is unlikely to help achieve the social justice goals of Open. Further, research data managers and librarians feel uncomfortable with the role of "policing" compliance with policies around Open and sharing, preferring an advocacy role. However, as funders begin to enforce their Open Data policies, responsibility for monitoring and incentivising data sharing has to fall somewhere.

Certainly, a means to surface the value that comes from the extra effort and time taken to document, structure and share data effectively is sorely needed. Moreover, adequate attention needs to be paid to concerns about Open Data: will a better-resourced group perform my next research project before I get the chance (the leapfrog scoop)? Will my Open Data be distorted in media for which there is no appropriate channel for debate? Is it ever possible to publish data about individuals, given that individuals may be identifiable even when data are anonymised, and that closed data can still be accessed by anyone able to mimic researcher credentials?

Where next?

Regardless of self-study efforts – or even approaching the local curation expert for advice – enacting Open Data remains intimidating where there is a skills gap, or a lack of appreciation of the individual researcher's situation and expectations. If Open Data is to be promoted then the significant issues regarding research assessment need to be addressed at the same time.

References

  1. DCC (2014). ‘Five steps to decide what data to keep: a checklist for appraising research data v.1’. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides

  2. Jones, S. (2011). ‘How to Develop a Data Management and Sharing Plan’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides

  3. Ball, A. (2014). ‘How to License Research Data’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides

  4. Ball, A. & Duke, M. (2015). ‘How to Cite Datasets and Link to Publications’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides

  5. Ball, A. & Duke, M. (2015). ‘How to Track the Impact of Research Data with Metrics’. DCC How-to Guides. Edinburgh: Digital Curation Centre. Available online: http://www.dcc.ac.uk/resources/how-guides

Open at scale: sharing images in the Open Research Pilot

This was originally published on the Unlocking Research blog on 8th May 2017 under CC-BY licence and was co-authored with Dr Ben Steventon.

Dr Ben Steventon is one of the participants in the Open Research Pilot. He is working with the Office of Scholarly Communication to make his research process more open and here reports on some of the major challenges he perceives at the beginning of the project.

The Steventon Group is a new group, established last year, which studies embryonic development, focusing in particular on the zebrafish. To investigate problems in this area the group uses time-lapse imaging and tracks cells in 3D visualisations, which presents many challenges when it comes to data sharing, challenges they hope to address through the Wellcome Trust Open Research Pilot. Whilst the difficulties this group is facing are specific to a particular type of research, they highlight some common challenges across open research: sharing large files, dealing with proprietary software, and joining up the different outputs of a group.

Sharing imaging data 

The data created by time-lapse imaging and cell tracking is frequently on a scale that presents a technical, as well as financial, challenge. The raw data consists of several terabytes of film, which is then compressed into 500GB files for analysis. These compressed files are of high enough quality to be used for analysis, but they are still not small enough to be shared easily. In addition, the group generates spreadsheets of tracking data, which can be shared easily but are meaningless without the original imaging files and specific software to connect the two pieces of data. One solution we are considering is the Image Data Resource, which is working to make imaging datasets in the life sciences, previously unshareable due to their size, available to the scientific community to re-use.

Making it usable

The software used in this type of research is a major barrier to making the group's work reproducible. The Imaris software the group uses costs thousands of pounds, so anything shared in its proprietary formats is only accessible to an extremely small group of researchers at wealthier institutions, which is in direct opposition to the principles of Open Research. It is possible to use Fiji, an open-source alternative, to recreate tracking from the imaging files and tracking spreadsheets; however, the data annotation originally performed in Imaris will be lost when the images are not saved in the proprietary formats.
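
To illustrate why keeping the tracking spreadsheets in an open format matters, here is a minimal sketch of re-assembling per-cell 3D trajectories from such an export; the column names (cell_id, t, x, y, z) are hypothetical, as the actual layout will depend on the software that produced the file:

```python
import csv
from collections import defaultdict

def load_trajectories(csv_path):
    """Group tracking rows (hypothetical columns: cell_id, t, x, y, z)
    into one time-ordered 3D trajectory per cell, so tracking data in an
    open CSV remains analysable without the original proprietary tool."""
    trajectories = defaultdict(list)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            trajectories[row["cell_id"]].append(
                (float(row["t"]), float(row["x"]),
                 float(row["y"]), float(row["z"])))
    for points in trajectories.values():
        points.sort()  # order each trajectory by time
    return dict(trajectories)
```

A few lines of open tooling recover the trajectories; what cannot be recovered this way is the annotation layer locked inside the proprietary files, which is exactly the loss described above.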

An additional problem in such analyses is sharing the protocols that detail the methodologies applied, from the preparation of the samples all the way through data generation and analysis. This is a common problem with standard peer-review journals, which often limit the space available for the description of methods. The group are exploring new ways to communicate their research protocols and have created an article for the Journal of Visualized Experiments, but these are time-consuming to create and so are not always possible. Open peer-review platforms potentially offer a solution for sharing detailed protocols more rapidly, as do specialist platforms such as Wellcome Open Research and protocols.io.

Increasing efficiency by increasing openness

Whilst the file size and proprietary software in this type of research presents some barriers to sharing, there are also opportunities through sharing to improve practice across the community. Currently there are several different software packages being used for visualisation and tracking. Therefore, sharing more imaging data would allow groups to try out different types of images on different tools and make better purchasing decisions with their grant money. Furthermore, there is a great frustration in this area that lots of people are working on different algorithms for different datasets, so greater sharing of these algorithms could reduce the amount of time wasted creating algorithms when it might be possible to adapt a pre-existing one.

Shifting models of scholarly communication

As we move towards a model of greater openness, research groups face a new difficulty in working out how best to present their myriad outputs. The Steventon group intends to publish data (in some form), protocols and a preprint at the same time as submitting their papers to a traditional journal. This will make their work more reproducible, and it also allows researchers who are interested in different aspects of their work to access the parts that interest them. These outputs will link to one another through citations, but this relies on close reading of the different outputs and checking references. The Steventon group would like to make the links between the different aspects of their work more obvious and browsable, so the context is clear to anyone interested in the lab's work. As the research of the group is so visual, it would be appropriate to represent the different aspects of their work in a more appealing form than a list of links.

The Steventon lab is attempting to link and contextualise their work through their website, and it is possible to cross-reference resources in many repositories (including Cambridge's Apollo), but they would like a more sustainable solution. They work in areas with crossovers to other disciplines – some people may be interested in their methodologies, others in the particular species they work on, and others still in the particular developmental processes they are researching. There are opportunities here for openness to increase the discoverability of interdisciplinary research and we will be exploring this, as well as the issues around sharing images and proprietary software, as part of the Open Research Pilot.

Published 8 May 2017
Written by Rosie Higman and Dr Ben Steventon

Strategies for engaging senior leadership with RDM – IDCC discussion

This was originally published on the Unlocking Research blog on 5th May 2017 under a CC-BY licence.

This blog post gathers key reflections and take-home messages from a Birds of a Feather discussion on the topic of senior management engagement with RDM, and while written by a small number of attendees, the content reflects the wider discussion in the room on the day. [Authors: Silke Bellanger, Rosie Higman, Heidi Imker, Bev Jones, Liz Lyon, Paul Stokes, Marta Teperek*, Dirk Verdicchio]

On 20 February 2017, stakeholders interested in different aspects of data management and data curation met in Edinburgh to attend the 12th International Digital Curation Conference (IDCC), organised by the Digital Curation Centre. Apart from discussing novel tools and services for data curation, the take-home message from many presentations was that successful development of Research Data Management (RDM) services requires the buy-in of a broad range of stakeholders, including senior institutional leadership.

Summary

The key strategies for engaging senior leadership with RDM that were discussed were:

  • Refer to doomsday scenarios and risks to reputations
  • Provide high profile cases of fraudulent research
  • Ask senior researchers to self-reflect and ask them to imagine a situation of being asked for supporting research data for their publication
  • Refer to the institutional mission statement / value statement
  • Collect horror stories of poor data management practice from your research community
  • Know and use your networks – know who your potential allies are and how they can help you
  • Work together with funders to shape new RDM policies
  • Don’t be afraid to talk about the problems you are experiencing – most likely you are not alone and you can benefit from exchanging best practice with others

Why is it important to talk about engaging senior leadership in RDM?

Endorsement of RDM services by senior management is important because frequently it is a prerequisite for the initial development of any RDM support services for the research community. However, the sensitive nature of the topic (both financially and sometimes politically as well) means there are difficulties in openly discussing the issues that RDM service developers face when proposing business cases to senior leadership. This means the scale of the problem is unknown and is often limited to occasional informal discussions between people in similar roles who share the same problems.

This situation prevents those developing RDM services from exchanging best practice and addressing these problems effectively. In order to flesh out common problems faced by RDM service developers and to start identifying possible solutions, we organised an informal Birds of a Feather discussion on the topic during the 12th IDCC conference. The session was attended by approximately 40 people, including institutional RDM service providers, senior organisational leaders, researchers and publishers.

What is the problem?

We started by fleshing out the problems, which vary greatly between institutions. Many participants said that their senior management was disengaged with the RDM agenda and did not perceive good RDM as an area of importance to their institution. Others complained that they did not even have the opportunity to discuss the issue with their senior leadership. So the problems identified were both with the conversations themselves, as well as with accessing senior management in the first place.

We explored the type of senior leadership groups that people had problems engaging with. Several stakeholders were identified: top level institutional leadership, heads of faculties and schools, library leadership, as well as some research team leaders. The types of issues experienced when interacting with these various stakeholder groups also differed.

Common themes

Next we considered if there were any common factors shared between these different stakeholder groups. One of the main issues identified was that people’s personal academic/scientific experience and historic ideals of scientific practice were used as a background for decision making.

Senior leaders, like many other people, tend to look at problems with their own perspective and experience in mind. In particular, within the rapidly evolving scholarly communication environment what they perceive as community norms (or in fact community problems) might be changing and may now be different for current researchers.

The other common issue was the lack of tangible metrics to measure and assess the importance of RDM which could be used to persuade senior management of its usefulness. The difficulty in applying objective measures to RDM activities is mostly due to the fact that every researcher undertakes some RDM by default, so it is challenging to find an example of a situation without any RDM activity that could be used as a baseline for an evidence-based cost-benefit analysis of RDM. The work conducted by Jisc in this area might be able to provide some solutions. Current results from this work can be found on the Research Data Network website.

What works?

The core of our discussion was focused on exchanging effective methods of convincing managers and how to start gathering evidence to support the case for an RDM service within an institution.

Doomsday scenarios

We all agreed that one strategy that works for almost all audience types is the doomsday scenario – a disaster that can happen when researchers do not adhere to good RDM practice. This could be as simple as asking individual senior researchers what they would do if someone accused them of falsifying research data five years after they had published the corresponding research paper. Would they have enough evidence to reject such accusations? The possibility of being confronted with their own potential undoing helped convince many senior managers of the importance of RDM.

Other doomsday scenarios which seemed to convince senior leaders were related to broader institutional crises, such as risk of fire. Useful examples are the fire which destroyed the newly built Chemistry building at the University of Nottingham, the fire which destroyed valuable equipment and research at the University of Southampton (£120 million worth of equipment and facilities), the recent fire at the Cancer Research UK Manchester Institute and a similar disaster at the University of Santa Cruz.

Research integrity and research misconduct

Discussion of doomsday scenarios led us to talk about research integrity issues. Reference to documented cases of fraudulent research helped some institutions convince their senior leadership of the importance of good RDM. These cases included the fraudulent research by Diederik Stapel at Tilburg University and by Erin Potts-Kant at Duke University, where $200 million in grants was awarded based on fake data. This led to a longer discussion about research reproducibility and who owns the problem of irreproducible research – individual researchers, funders, institutions or perhaps publishers. We concluded that responsibility is shared, and that perhaps the main reason for the current reproducibility crisis lies in the flawed reward system for researchers.

Research ethics and research integrity are directly connected to good RDM practice and are also the core ethical values of academia. We therefore reflected on the importance of referring to the institutional value statement/mission statement or code of conduct when advocating/arguing for good RDM. One person admitted adding a clear reference to the institutional mission statement whenever asking senior leadership for endorsement for RDM service improvements. The UK Concordat on Open Research Data is a highly regarded external document listing core expectations on good research data management and sharing, which might be worth including as a reference. In addition, most higher education institutions will have mandates in teaching and research, which might allow good RDM practice to be endorsed through their central ethics committees.

Bottom up approaches to reach the top

The discussion about ethics and the ethos of being a researcher started a conversation about the importance of bottom up approaches in empowering the research community to drive change and bring innovation. The more researcher champions who make the case for important services to senior leadership, the better: researcher voices are often louder than those of librarians, or those running central support services, so consider who will best help to champion your cause.

Collecting testimonies from researchers about the difficulties of working with research data when good data management practice was not adhered to is also a useful approach. Shared examples of these included horror stories such as data loss from stolen laptops (when data had not been backed up), newly started postdocs inheriting projects and the need to re-do all the experiments from scratch due to lack of sufficient data documentation from their predecessor, or lost patent cases. One person mentioned that what worked at their institution was an ‘honesty box’ where researchers could anonymously share their horror data management stories.

We also discussed the potential role of whistle-blowers, especially given the fact that reputational damage is extremely important for institutions. There was a suggestion that institutions should add consequences of poor data management practice to their institutional risk registers. The argument that good data management practice leads to time and efficiency savings also seems to be powerful when presented to senior leadership.

The importance of social networks

We then discussed the importance of using one’s relationships in getting senior management’s endorsement for RDM. The key to this is getting to know the different stakeholders, their interests and priorities, and thinking strategically about target groups: who are potential allies? Who are the groups who are most hesitant about the importance of RDM? Why are they hesitant? Could allies help with any of these discussions? A particularly powerful example was from someone who had a Nobel Prize winner ally, who knew some of the senior institutional leaders and helped them to get institutional endorsement for their cause.

Can people change?

The question was asked whether anyone had an example of a senior leader changing their opinion, not necessarily about RDM services. Someone suggested that in the case of unsupportive leadership, persistence and patience are required and that sometimes it is better to count on a change of leadership than a change of opinions. Another suggestion was that rebranding the service tends to be more successful than hoping for people to change. Again, knowing the stakeholders and their interests is helpful in getting to know what is needed and what kind of rebranding might be appropriate. For example, shifting the emphasis from sharing of research data and open access to supporting good research data management practice and increasing research efficiency was something that had worked well at one institution.

This also led to a discussion about the perception of RDM services and whether their governance structure made a difference to how they were perceived. There was a suggestion that presenting RDM services as endeavours from inside or outside the Library could make a difference to people’s perceptions. At one science-focused institution anything coming from the library was automatically perceived as a waste of money and not useful for the research community and, as a result, all business cases for RDM services were bound to be unsuccessful due to the historic negative perception of the library as a whole. Opinion seemed to confirm that in places where libraries had not yet managed to establish themselves as relevant to 21st century academics, pitching library RDM services to senior leadership was indeed difficult. A suggested approach is to present RDM services as collaborative endeavours, and as joint ventures with other institutional infrastructure or service providers, for example as a collaboration between the library and the central IT department. Again, strong links and good relationships with colleagues at other University departments proved to be invaluable in developing RDM services as joint ventures.

The role of funding bodies

We moved on to discuss how endorsement for RDM at an institutional level often occurs in conjunction with external drivers. Institutions need to be sustainable and require external funding to support their activities, and therefore funders and their requirements are often key drivers for institutional policy changes. This can happen on two different levels. Funding is often provided on the condition that any research data generated as a result needs to be properly managed during the research lifecycle, and is shared at the end of the project.

Non-compliance with funders’ policies can result in financial sanctions on current grants or ineligibility for individual researchers to apply for future grant funding, which can lead to a financial loss for the University overall. Some funders, such as the Engineering and Physical Sciences Research Council (EPSRC) in the United Kingdom, have clear expectations that institutions should support their researchers in adhering to good research data management practice by providing adequate infrastructure and policy framework support, therefore directly requesting institutions to support RDM service development.

Could funders do more?

There was consensus that funding bodies could perhaps do more to support good research data management, especially given that many non-UK funders do not yet have requirements for research data management and sharing as a condition of their grants. There was also a useful suggestion that funders should make more effort to ensure that their policies on research data management and sharing are adhered to, for example by performing spot-checks on research papers acknowledging their funding to see if supporting research data was made available, as the EPSRC have been doing recently.

Similarly, if funders did more to review and follow up on data management plans submitted as part of grant applications, it would help convince researchers and senior leadership of the importance of RDM. Currently not all funders require that researchers submit data management plans as part of grant applications. Although some pioneering work on implementing active data management plans has started, people taking part in the discussion were not aware of any funding body having a structured process in place to review and follow up on data management plans. There was a suggestion that institutions should perhaps be more proactive in working with funders to shape new policies. It would be useful to have institutional representatives at funders’ meetings to ensure greater collaboration.

Future directions and resources

Overall we felt that it was useful to exchange tips and tricks so we can avoid making the same mistakes. Also, for those who had not yet managed to secure endorsement for RDM services from their senior leaders it was reassuring to understand that they were not the only ones having difficulty. Community support was recognised as valuable and worth maintaining. We discussed what would be the best way of ensuring that the advice exchanged during the meeting was not lost, and also how an effective exchange of ideas on how best to engage with senior leadership should be continued. First of all we decided to write up a blog post report of the meeting and to make it available to a wider audience.

Secondly, Jisc agreed to compile the various resources and references mentioned and to create a toolkit of techniques, with examples, for making business cases for RDM. An initial set of resources useful in making the case can be found on the Research Data Network webpages. The current resources include a high-level business case, some case studies and miscellaneous resources, including videos, slide decks, infographics and links to external toolkits. Further resources are under development and are being added on a regular basis.

The final tip to all RDM service providers was that the key to success was making the service relevant, and that persistence in advocating for the good cause is necessary. RDM service providers should not be shy about sharing the importance of their work with their institution, and should be proud of the valuable work they are doing. Research datasets are vital assets for institutions and need to be managed carefully; leveraging this is key to making senior leadership understand that providing RDM services is essential in supporting institutional business.

The art of software maintenance

This was originally published on the Unlocking Research blog on 29th January 2017 under a CC-BY license.

When it comes to software management there are probably more questions than answers to problems – that was the conclusion of a recent workshop hosted by the Office of Scholarly Communication (OSC) as part of a national series on software sustainability, sharing and management, funded by Jisc. The presentations and notes from the day are available, as is a Storify from the tweets.

The goal of these workshops was to flesh out the current problems in software management and sharing and try to identify possible solutions. The researcher-led nature of this event provided researchers, software engineers and support staff with a great opportunity to discuss the issues around creating and maintaining software collaboratively and to exchange good practice among peers.

Whilst this might seem like a niche issue, an increasing number of researchers are reliant on software to complete their research, and for them the paper at the end is merely an advert for the research it describes. Stephen Eglen described this in his talk as an ‘inverse problem’ – papers are published and widely shared but it is very hard to get to the raw data and code from this end product, and the data and code are what is required to ensure reproducibility.

These workshops were inspired by our previous event in 2015, where Neil Chue Hong and Shoaib Sufi spoke with researchers at Cambridge about software licensing and Open Access. Since then the OSC has had several conversations with Daniela Duca at Jisc and together we came up with an idea of organising researcher-led workshops across several institutions in the UK.

Opening up software in a ‘post-expert world’

We began the day with a keynote from Neil Chue Hong from the Software Sustainability Institute who outlined the difficulties and opportunities of being an open researcher in a ‘post-expert world’ (the slides are available here). Reputation is crucial to a researcher’s role and therefore researchers seek to establish themselves as experts. On the other hand, this expert reputation can be tricky to maintain since making mistakes is an inevitable part of research and discovery, something poorly understood outside of academia. Neil introduced Croucher’s Law to help us understand this: everyone will make mistakes, even an expert, but an expert will be aware of this so will automate and share their work as much as possible.

Accepting that mistakes are inevitable in many ways makes sharing less intimidating. Papers are retracted regularly due to errors and Neil gave examples from a variety of disciplines and career stages where people were open about their errors so their communities were accepting of the mistakes. In fact, once you accept that we will all make mistakes then sharing becomes a good way to get feedback on your code and to help you fix bugs and errors.

This feeds into another major theme of the workshop which Neil introduced; that researchers need to stop aiming for perfect and adopt ‘good enough’ software practices for achievable reproducibility. This recognises that one of the biggest barriers to sharing is the time it takes to learn software skills and prepare data to the ‘best’ standards. Good enough practices mean accepting that your work may not be reproducible forever but that it is more important to share your code now so that it is at least partially reproducible now. Stephen Eglen built on this with his paper on ‘Towards standard practices for sharing computer code and programs in neuroscience’ which includes providing data, code, tests for your code and using licences and DOIs.
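As a concrete illustration of the ‘tests for your code’ recommendation, here is a minimal sketch; the function and values are invented for this example and are not taken from Eglen’s paper.

```python
# Hypothetical example: a small analysis helper shipped with a sanity test,
# in the spirit of the 'good enough' practices described above.
def normalise(values):
    """Scale a list of measurements so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]

def test_normalise_sums_to_one():
    # Even a single test like this catches many refactoring mistakes.
    result = normalise([2, 3, 5])
    assert abs(sum(result) - 1.0) < 1e-9

test_normalise_sums_to_one()
print("test passed")
```

Even a test this small, run routinely, embodies Croucher’s Law: it assumes mistakes will happen and automates catching them.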

Both speakers and the focus groups in the afternoon highlighted that political work is needed, as well as cultural change, to normalise code sharing. Many journals now ask for evidence of the data which supports articles and the same standards should apply to software code. Similarly, if researchers ask for access to data when reviewing articles then it makes sense to ask for the code as well.

Automating your research: Managing software

Whilst sharing code can be seen as the end of the lifecycle of research software, writing code with the intention of sharing it was repeatedly highlighted as a good way to make sure it is well-written and documented. This was one of several ‘selfish’ reasons to share, where sharing also helps the management of software, through better collaboration, the ability to track your work and being able to use students’ work after they leave.

Croucher’s Law demonstrates one of the main benefits of automating research through software; the ability to track the mistakes to improve reproducibility and make fixing mistakes easier. There were lots of tools mentioned throughout the day to assist with managing software from the well-known version control and collaboration platform Github to the more dynamic such as Jupyter notebooks and Docker. As well as these technical tools there was also discussion of more straightforward methods to maintain software such as getting a code buddy who can test your code and creating appropriate documentation.

Despite all of these tools and methods to improve software management it was recognised by many participants that automating research through software is not a panacea; the difficulties of working with a mix of technical and non-technical people formed the basis of one of the focus groups.

Sustaining software

Managing software appropriately allows it to be shared, but re-using it in the long (or even medium) term means putting time into sustaining code and making sure it is written in a way that is understandable to others. The main recommendations from our speakers and focus groups for ensuring sustainability were to use standards, create thorough documentation and embed extensive comments within your code.

As well as thinking about the technical aspects of sustaining software there was also discussion of what is required to motivate people to make their code re-usable. Contributing to a community seemed to be a big driver for many participants so finding appropriate collaborators is important. However larger incentives are needed and creating and maintaining software is not currently well-rewarded as an academic endeavour. Suggestions to rectify this included more software-oriented funding streams, counting software as an output when assessing academics, and creating a community of software champions to mirror the Data Champions scheme we recently started in Cambridge.

Next steps

This workshop was part of a national discussion around research software so we will be looking at outcomes of other workshops and wider actions the Office of Scholarly Communication can support to facilitate sharing and sustaining research software. Apart from Cambridge, five other institutions held similar workshops (Bristol, Birmingham, Leicester, Sheffield, and the British Library). As one of the next steps, all organisers of these events want to meet up to discuss the key issues raised by researchers, to see what national steps should be taken to better support the community of researchers and software engineers, and to consider if there are any remaining problems with software which could require a policy intervention.

However, following the maxim to ‘think global, act local’, Neil’s closing remarks urged everyone to consider the impact they can have by influencing those directly around them to make a huge difference to how software is managed, sustained and shared across the research community.

Creating a research data community

This originally appeared on the Unlocking Research blog on 30th November 2016 under a CC-BY license and was co-authored with Hardy Schwamm.

Are research institutions engaging their researchers with Research Data Management (RDM)? And if so, how are they doing it? In this post, Rosie Higman (@RosieHLib), Research Data Advisor, University of Cambridge, and Hardy Schwamm (@hardyschwamm), Research Data Manager, Lancaster University explore the work they are doing in their respective institutions.

Whilst funder policies were the initial catalyst for many RDM services at UK universities there are many reasons to engage with RDM, from increased impact to moving towards Open Research as the new normal. And a growing number of researchers are keen to get involved! These reasons also highlight the need for a democratic, researcher-led approach if the behavioural change necessary for RDM is to be achieved. Following initial discussions online and at the Research Data Network event in Cambridge on 6 September, we wanted to find out whether and how others are engaging researchers beyond iterating funder policies.

At both Cambridge and Lancaster we are starting initiatives focused on this, respectively Data Champions and Data Conversations. The Data Champions at Cambridge will act as local experts in RDM, advocating at a departmental level and helping the RDM team to communicate across a fragmented institution. We also hope they will form a community of practice, sharing their expertise in areas such as big data and software preservation. The Lancaster University Data Conversations will provide a forum to researchers from all disciplines to share their data experiences and knowledge. The first event will be on 30 January 2017.

Having presented our respective plans to the RDM Forum (RDMF16) in Edinburgh on 22nd November, we ran breakout sessions where small groups discussed the approaches our and other universities were taking; the results, summarised below, highlight the different forms that engagement with researchers can take.

Targeting our training

RDM workshops seem to be the most common way research data teams are engaging with researchers, typically targeting postgraduate research students and postdoctoral researchers. A recurrent theme was the need to target workshops for specific disciplinary groups, including several workshops run jointly between institutions where this meant it was possible to get sufficient participants for smaller disciplines. Alongside targeting disciplines some have found inviting academics who have experience of sharing their data to speak at workshops greatly increases engagement.

As well as focusing workshops so they are directly applicable to particular disciplines, several institutions have had success in linking their workshop to a particular tangible output, recognising that researchers are busy and are not interested in a general introduction. Examples of this include workshops around Data Management Plans, and embedding RDM into teaching students how to use databases.

An issue many institutions are having is getting the timing right for their workshops: too early and research students won’t have any data to manage or even be thinking about it; too late and students may have got into bad data management habits. Finding the Goldilocks time which is ‘just right’ can be tricky. Two solutions to this problem were proposed: having short online training available before more in-depth training later on, and having a one-hour session as part of an induction followed by a two-hour session 9-18 months into the PhD.

Tailored support

Alongside workshops, the most popular way to get researchers interested in RDM was through individual appointments, so that the conversation can be tailored to their needs, although this obviously presents a problem of scalability when most institutions only have one individual staff member dedicated to RDM.

There are two solutions to this problem which were mentioned during the breakout session. Firstly, some people are using a ‘train the trainer’ approach to involve other research support staff who are based in departments and already have regular contact with researchers. These people can act as intermediaries and are likely to have a good awareness of the discipline-specific issues which the researchers they support will be interested in.

The other option discussed was holding drop-in sessions within departments, where researchers know the RDM team will be on a regular basis. These have had mixed success at many institutions but seem to work better when paired with a more established service such as the Open Access or Impact team.

What RDM services should we offer?

We started the discussion at the RDM Forum thinking about extending our services beyond mere compliance in order to create an “RDM community” where data management is part of good research practice and contributes to the Open Research agenda. This is the thinking behind the new initiatives at Cambridge and Lancaster.

However, there were also some critical or sceptical voices at our RDMF16 discussions. How can we promote an RDM community when we struggle to persuade researchers to comply with institutional and funder policies? All RDM support teams are small and have many other tasks aside from advocacy and training. Some expressed concern that they lacked the skills to market their services beyond the traditional methods used by libraries. We need to consider and address these concerns about capacity and skill sets as we attempt to engage researchers beyond compliance.

Summary

It is clear from our discussions that there is a wide variety of RDM-related activities at UK universities which stretch beyond enforcing compliance, but engaging large numbers of researchers is an ongoing concern. We also realised that many RDM professionals are not very good at practising what we preach and sharing our materials, so it’s worth highlighting that training materials can be shared on the RDM training community on Zenodo as long as they have an open license.

Many thanks to the participants at our breakout session at the RDMForum 16, and Angus Whyte for taking notes which allowed us to write this piece. You can follow previous discussions on this topic on Gitter.

Making the connection: research data network workshop

This was originally published on the Unlocking Research blog on 14th September 2016 under a CC-BY licence.

During International Data Week 2016, the Office of Scholarly Communication is celebrating with a series of blog posts about data. The first post was a summary of an event we held in July. This post reports on the second Jisc research data network workshop, held in Cambridge.

Image: Corpus Christi

Following the success of hosting the Data Dialogue: Barriers to Sharing event in July we were delighted to welcome the Research Data Management (RDM) community to Cambridge for the second Jisc research data network workshop. The event was held in Corpus Christi College, with meals held in the historic dining room.

RDM services in the UK are maturing and efforts are increasingly focused on connecting disparate systems, standardising practices and making platforms more usable for researchers. This is also reflected in the recent Concordat on Research Data which links the existing statements from funders and government, providing a more unified message for researchers.

The practical work of connecting the different systems involved in RDM is being led by the Jisc Research Data Shared Services project which aims to share the cost of developing services across the UK Higher Education sector. As one of the pilot institutions we were keen to see what progress has been made and find out how the first test systems will work. On a personal note it was great to see that the pilot will attempt to address much of the functionality researchers request but that we are currently unable to fully provide, including detailed reporting on research data, links between the repository and other systems, and a more dynamic data display.

Context for these attempts to link, standardise and improve RDM systems was provided in the excellent keynote by Dr Danny Kingsley, head of the Office of Scholarly Communication at Cambridge, reminding us about the broader need to overhaul the reward systems in scholarly communications. Danny drew on the Open Research blogposts published over the summer to highlight some of the key problems in scholarly communications: hyperauthorship, peer review, flawed reward systems, and, most relevantly for data, replication and retraction. Sharing data will alleviate some of these issues but, as Danny pointed out, this will frequently not be possible unless data has been appropriately managed across the research lifecycle. So whilst trying to standardise metadata profiles may seem irrelevant to many researchers it is all part of this wider movement to reform scholarly communication.

Making metadata work

Metadata models will underpin any attempts to connect repositories, preservation systems, Current Research Information Systems (CRIS), and any other systems dealing with research data. Metadata presents a major challenge both in terms of capturing the wide variety of disciplinary models and needs, and in persuading researchers to provide enough metadata to make preservation possible without putting them off sharing their research data. Dom Fripp and Nicky Ferguson are working on developing a core metadata profile for the UK Research Data Discovery Service. They spoke about their work on developing a community-driven metadata standard to address these problems. For those interested (and GitHub literate) the project is available here.

They are drawing on national and international standards, such as the Portland Common Data Model, trying to build on existing work to create a standard which will work for the Shared Services model. The proposed standard will have gold, silver and bronze levels of metadata and will attempt to reward researchers for providing more metadata. This is particularly important as the evidence from Dom and Nicky’s discussion with researchers is that many researchers want others to provide lots of metadata but are reluctant to do the same themselves.
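To make the tiered approach concrete, here is a purely illustrative sketch of what bronze- versus gold-level records might contain; the field names and values are assumptions invented for this example, not taken from the actual profile being developed.

```python
# Illustrative only: field names are invented, not from the real
# UK Research Data Discovery Service metadata profile.
bronze = {  # the minimum needed to find and cite the dataset
    "title": "Example imaging dataset",
    "creator": "A. Researcher",
    "publication_year": 2016,
    "identifier": "doi:10.0000/example",  # placeholder DOI
}

gold = {
    **bronze,  # gold builds on bronze rather than replacing it
    "description": "Full experimental context and collection conditions",
    "methodology": "Protocol documented in an accompanying README",
    "file_formats": ["TIFF", "CSV"],
    "licence": "CC-BY-4.0",
    "related_publication": "doi:10.0000/paper",  # placeholder DOI
}

# The extra fields are what make reuse and preservation realistic.
print(sorted(set(gold) - set(bronze)))
```

Rewarding researchers for moving from bronze to gold could be as simple as surfacing richer records more prominently in the discovery service.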

We have had some success with researchers filling in voluntary metadata fields for our repository, Apollo, but this seems to depend to a large extent on how aware researchers are of the role of metadata, something which chimes with Dom and Nicky’s findings. Those creating metadata are often unaware of the implications of how they fill in fields, so creating consistency across teams, let alone disciplines and institutions, can be a struggle. Any Cambridge researchers who wish to contribute to this metadata standard can sign up to a workshop with Jisc in Cambridge on 3rd October.

Planning for the long-term

A shared metadata standard will assist with connecting systems and reducing researchers’ workload, but if replicability, a key problem in scholarly communications, is going to be possible, digital preservation of research data needs to be addressed. Jenny Mitcham from the University of York presented the work she has been undertaking alongside colleagues from the University of Hull on using Archivematica for preserving research data and linking it to pre-existing systems (more information can be found on their blog).

Jenny highlighted the difficulties they encountered getting timely engagement from both internal stakeholders and external contractors, as well as linking multiple systems with different data models, again underlining the need for high quality and interoperable metadata. Despite these difficulties they have made progress on linking these systems and in the process have been able to look into the wide variety of file formats currently in use at York. This has led to conversations with The National Archives about improving the coverage of research file formats in PRONOM (a registry of file formats for preservation purposes), work which will be extremely useful for the Shared Services pilot.

In many ways the project at York and Hull felt like a precursor to the Shared Services pilot; highlighting both the potential problems in working with a wide range of stakeholders and systems, as well as the massive benefits possible from pooling our collective knowledge and resources to tackle the technical challenges which remain in RDM.

Championing RDM training

This originally appeared on the Unlocking Research blog on 14 September 2016 under a CC-BY license and was co-authored with Dr Marta Teperek.

During International Data Week 2016, the Office of Scholarly Communication is celebrating with a series of blog posts about data. The first post was a summary of an event we held in July. This post looks at the challenges associated with financially supporting RDM training.

The problem

There is a desperate need for training in research data management. Our significant engagement with researchers at the University of Cambridge over the past 18 months has indicated to us that research data cannot be effectively shared if it has not been properly managed during the research lifecycle. Researchers cannot be expected to share their data at the end of their research project if they are unable to locate their data, if the data is not correctly labelled, or if it lacks the metadata needed to make it re-usable. We have stated this on several occasions, including in our response to the draft UK Concordat on Open Research Data, which was released in its final form on 28th July this year.

To test whether our beliefs were in line with researchers’ needs, last year we conducted a short survey on research data management needs among our academic community. Of those responding, 94% of researchers indicated that it would be ‘useful’ or ‘very useful’ to have workshops on research data management (see our earlier blog post for the full discussion of the survey results). In response to this we developed a 1.5-hour introductory workshop on research data management and started delivering it to researchers in Cambridge in July 2015.

Train early, train often

Our workshops have evolved substantially since the initial sessions, in response to the feedback we collect. The workshops are now considerably improved: not only are they more interactive, but they have also been extended to 2.5 hours to allow time for more practical examples of good data management.

The feedback from our workshops was overwhelmingly positive. As a result, many departments identified research data management skills as core competencies needed by every PhD student and asked us to deliver our workshops as part of their compulsory training for PhD students. This is fantastic news for both our team and the awareness of RDM at Cambridge, and we have therefore accepted all these individual requests.

But sometimes you can be a little too successful. We recently received another request, from the Graduate School of Life Sciences, to make our workshop compulsory for all their PhD students. While this is great news, there are 400 new PhD students every year at the Graduate School of Life Sciences. The maximum capacity of our workshop is 20 attendees, which means we would need to deliver 20 workshops throughout the year to cover this cohort. And when we consider that the Graduate School of Life Sciences is one of five schools accepting PhD students in Cambridge, we need to think about how we would respond if other schools approached us with similar requests.

Another issue is that while our improved workshop covers basic research data management needs, researchers have told us they need more in-depth, discipline-specific training. We recognise this, and do try to provide training that is directed at the audience – a challenge when we need to serve all researchers across the University, from arts and humanities through social and life sciences, to medical research and particle physics (and many, many other disciplines).

So we have identified a broad need for RDM training, a specific need for training for PhD students as they begin their research, and a need for discipline-specific support for all researchers. But there are only two staff members in the Research Data Team who can deliver RDM training, and delivering training is only one of many tasks we need to undertake for the Research Data Facility to function. We simply do not have the capacity to meet this obvious need. After discussion we have agreed to deliver four workshops out of the requested 20 for the Graduate School of Life Sciences and to reconsider the situation in the summer break next year.

The plan – Data Champions

We have begun to think both about how we can meet RDM training needs given current staff capacity and how we can link the experts in different aspects of data management that we know exist around the University of Cambridge. There is already an active OpenCon Cambridge group which promotes the benefits of Open Access, Open Data and Open Education; but we wanted to focus on all aspects of RDM.

We have started to develop the idea of having Data Champions in each department, institute or college who can act as the local experts as well as delivering some discipline-specific training to their community. This has the advantage of increasing the number of trainers across the University, making the workshops tailored to the relevant audience, and building a community of experts.

The University of Cambridge is not alone in facing this problem, and several other universities are pursuing, or already have, Data Champions in some form. The idea was recently discussed both on Jisc RDM mailing lists and at the Research Data Network meeting held at Corpus Christi College, Cambridge last week. At this meeting several different models were proposed, including a national network of champions who could advocate within disciplines at a senior level.

Whilst a national network of champions would be great in the long term, we still have an immediate problem within the University of Cambridge, and so we have launched a Call for Data Champions to help us raise awareness and increase the amount of training available. The call is open until 17th October and we welcome any research or support staff, or research students, with an interest in RDM. There will be support available in learning how to deliver RDM training, a template workshop provided, the opportunity to influence the future of RDM services, and a website built to showcase our Data Champions.

We hope to bring all the Data Champions together towards the end of the Michaelmas term so they can start delivering workshops in 2017. We hope to foster a community of experts who can share their knowledge with each other and the Research Data Team so we are in a better position to support our researchers.

The future

The community of Data Champions we hope to bring together at Cambridge will begin to ameliorate some of the problems we are facing with regard to RDM training. However, we will still struggle with staff capacity, and at some point researchers will need to be appropriately rewarded for sharing data and supporting others to do the same, a theme of our recent Open Research discussion event.

So we have a temporary solution, and one which we hope will significantly improve the RDM training available at Cambridge, but the issue will not be solved until the underlying incentives for sharing data, publications and all the other research outputs have been addressed.

Written by Rosie Higman and Dr Marta Teperek