Friday 15 October 2010

Some use cases are in, some use cases are out

Over the last few weeks we've been thrashing out the details of what's possible within DURA: the technical architecture of what connects to what among our system components (at least two of the three: Mendeley, Symplectic, and the repository).

We started by asking the big questions - how many different possible architectures could exist within the overall scope of DURA? what use cases might this sort of system support long term? - and then narrowed down to a set of use cases we feel are feasible within the project timescale and budget.  This means we aren't going to address every possible use case in the project, but will get some solid groundwork done, which other efforts can build upon in the future (by adding different repository platforms, or Repository Junction, or adding support for other reference manager systems, and so on).


So, what are we going to aim to do in the first instance?  The project has identified three major use cases we'd like to support:


Use case 1

The researcher in this case is on a campus which has Symplectic deployed (including the Repository Tools system, which connects to the institutional repository). They already have a Symplectic account and a Mendeley account.

The researcher starts using the DURA system when they are logged in to Symplectic.

This use case must work for Cambridge (ie. with our single sign on system and related requirements, and our DSpace repository) and ideally will work for the general case (ie. another campus with Symplectic and a repository) too.




Use case 2

The researcher has a Symplectic account, which they have used, and also a Mendeley account.

The researcher starts using the DURA system when they are using Mendeley.


This use case must work for Cambridge (ie. with our single sign on system and related requirements, and our DSpace repository) and ideally will work for the general case (ie. another campus with Symplectic and a repository) too.


Ideally, this also supports the case where the researcher hasn't used Symplectic yet, but does have an account (which will have been automatically created for them by their institution). (This might require a subtly different workflow.)


Optionally, this also includes the case where the researcher actually isn't eligible to use Symplectic on campus - for example, if they are a PhD student whose research publications are not monitored centrally by their university. In this case, the pattern of behaviour should smoothly segue into:




Use case 3


The researcher is a Mendeley user, who either isn't eligible to use Symplectic at their institution, or who is at a campus which doesn't have Symplectic deployed.

This use case must work for Cambridge (ie. with our single sign on system and related requirements) and ideally will work for the general case (ie. another campus with a DSpace repository) too.


The researcher starts using the DURA system when they are using Mendeley.
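To make the boundaries between the three use cases concrete, here is a minimal sketch of how they could be told apart. The `Researcher` fields, the `select_use_case` function and the entry point strings are all hypothetical names invented for illustration; they are not part of any DURA, Mendeley or Symplectic code, and the "ideal" and "optional" variants of use case 2 are not modelled separately.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    """Hypothetical summary of what the system would need to know about a user."""
    campus_has_symplectic: bool
    eligible_for_symplectic: bool
    has_mendeley_account: bool


def select_use_case(researcher: Researcher, entry_point: str) -> int:
    """Return 1, 2 or 3 for the use case that applies.

    entry_point is "symplectic" or "mendeley": the system the researcher
    is using when they start using DURA.
    """
    if entry_point not in ("symplectic", "mendeley"):
        raise ValueError(f"unknown entry point: {entry_point!r}")
    if not researcher.has_mendeley_account:
        # All three use cases assume the researcher is a Mendeley user.
        raise ValueError("all three use cases assume a Mendeley account")

    can_use_symplectic = (
        researcher.campus_has_symplectic and researcher.eligible_for_symplectic
    )

    if entry_point == "symplectic":
        if not can_use_symplectic:
            raise ValueError("cannot start in Symplectic without Symplectic access")
        return 1

    # Starting from Mendeley: use case 2 with Symplectic access,
    # otherwise fall through to use case 3.
    return 2 if can_use_symplectic else 3
```

A PhD student on a Symplectic campus who isn't eligible for Symplectic and starts from Mendeley would come out as use case 3, which is the "smooth segue" described under use case 2.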

Wednesday 29 September 2010

What is success?

It's always fun to think seriously about success at the start - much more encouraging than thinking in terms of risks and all the things which could go wrong.

So, what could go right, how would we know it had gone right, and which things going right should we focus on?

We've already blogged about one thing we could measure and which could go right - deposit rates into the IR (or possibly subsequent access rates); this is a very measurable element and in fact measures itself.

User satisfaction is another great thing. If we can create a community of happy researchers and academics who are using Mendeley and our project deposit system without problems, and who feel it benefits them in some way, then that's another good success for DURA. To assess if we manage this, we'll need to do some combination of user testing, interviews and surveys (which will give us specific information about how the researchers using our tool feel about it as well as what the experience of using the tool is like) and potentially measure support requests and usage levels, which give us an indirect measure of how well things are going for users, but which could be affected by other factors too.  We are already thinking hard about user experience, particularly around setting up the deposit system for the first time for a user, which is where we hit the exciting technical challenges around authentication and authorisation. Getting the setup process right will be key, because without that, no one will make it through to the truly simple day to day operation of deposit, where we hope to have "no UI" because it will automagically happen whilst researchers use Mendeley normally.

Another form of success which we'd love to have is to be playing an active role in a thriving community of people, all contributing to awesome scholarly infrastructure around repositories and access and preservation. Our engagement with SWORD2 and the whole JISC DEPO programme is part of that, and so is our connection to the community of Mendeley users at Cambridge and beyond. This is a bit harder to measure...

So we will probably focus on the first two kinds of success - happy users, and checking the deposit rates. The system design, including the user interfaces but also the overall technology architecture decisions we make, will play a major role in making sure our researchers find the system easy to use and useful; these are things we are working on today. Later on, we'll also have to make sure we have seamless deployment plans and good support systems in place, as well as the processes to make sure we know how satisfied our users are.  The deposit rates we'll look at later on, and again the pilot deployment and publicity and so forth will be the big areas which will affect this kind of success.

Having written all that, it's back to the bits of the project which are not all successes - yet. The real challenge in this coming term for me locally is the package of institutional issues around coordinating diverse bits of the university to come together around the project and our forthcoming pilot deployments (of which more soon) - plus our partner companies. These aren't all technical issues - there are policy issues and communication challenges, translating between teams with very different backgrounds and priorities, and of course the inevitably slow progress of other university activities which DURA may depend on later. But that's the fun of a project like DURA, bringing together lots of different things to deliver something new :)

Wednesday 15 September 2010

Counting

One of the requirements of the JISCdepo programme is that the repositories engaged in it should have an analytics engine of some sort so that deposit rates can be observed over the course of the projects.

But what does this mean, particularly for DURA?

Firstly - what are we counting? The most obvious thing to count for a project around deposit would be deposits! 

Irrefutable proof?
One thing we must remember is that even if we count the deposits going into a repository, and observe an increase during a project to do with deposit, it does not prove that the project made a difference.

We might reasonably expect deposits to be going up anyway, as awareness of and interest in repositories and openness increase.  And there may be other initiatives underway which could alter the deposit rate within the institution - perhaps general repository publicity, perhaps some particularly large research datasets being stored during the period, perhaps an enthusiastic staff member evangelising effectively. So we can observe changes in deposit rate, but we cannot necessarily draw meaningful conclusions about the effectiveness of projects to increase deposit from them.

So we must consider results of counting deposit with some caution - it is certainly valuable information, but not necessarily a vindication, by itself, of a single project to increase deposit rates.



Counting somewhere else
In DURA we are super lucky though, because all the deposits from our project will come from other places, and those other places are engaged with the project and so we can count things there.

If we're depositing from Mendeley direct to the institutional repository, Mendeley will be counting and we can access that data.  If we're depositing from Symplectic to the institutional repository, Symplectic's institutional deployment will be counting, and at least for the case of Cambridge where we'll be doing our test and pilot deployments, we can access that data.

Even better, by counting in Mendeley or in Symplectic, we can tell exactly what submissions come from our project rather than anywhere else, so it's real data which will help us assess the project's success.
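As a rough illustration of counting at the source, the sketch below tallies deposit records by where they came from. The record layout (a dict with a `source` key) and the source labels are assumptions made up for this example; the real data available from Mendeley or from a Symplectic deployment may look quite different.

```python
from collections import Counter

# Assumed labels for deposits that arrive through the project's routes.
PROJECT_SOURCES = ("mendeley", "symplectic")


def count_deposits_by_source(deposits):
    """Tally deposit records by their 'source' field."""
    return Counter(record["source"] for record in deposits)


def count_project_deposits(deposits, project_sources=PROJECT_SOURCES):
    """Count only the deposits that came through the project's routes."""
    counts = count_deposits_by_source(deposits)
    return sum(counts[source] for source in project_sources)


# Made-up sample data, purely for illustration.
sample_deposits = [
    {"id": "a1", "source": "mendeley"},
    {"id": "a2", "source": "symplectic"},
    {"id": "a3", "source": "web-form"},
    {"id": "a4", "source": "mendeley"},
]
```

With the sample above, `count_project_deposits(sample_deposits)` gives 3, leaving out the one deposit made through another route.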



What is success anyway?
For the purposes of this discussion, we're only going to consider success in terms of counting things. 

We could easily count deposits during the project. We'd end up with some deposit counts from our trial code work, and some deposit counts from our pilot version (coming later).  

I think that these counts are useful to the project internally, but less useful to everyone else. They don't really show the meaningful impact of our work because even during the pilot phase, we may still be ironing out bugs and improving the experience. Also, our work aims to make deposit an integrated part of research workflow on an ongoing basis, and people's initial use of our system is more likely to reflect experimentation than an ongoing engagement.

So, real project success in terms of deposit count will need to be monitored after the formal project ends. We are considering reviewing the deposit count over the 12 months after the end of the project, which would capture both a reasonable embedding period and, conveniently, a full academic cycle.  We have yet to decide what a good metric for success would be - double the existing annual rate of academic paper deposit?  Do chip in with your thoughts in the comments.
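One way the 12 month review could be computed is sketched below. The function names are hypothetical, the doubling factor is only the candidate target floated above rather than a decision, and any dates passed in are up to the caller.

```python
from datetime import date


def one_year_after(start: date) -> date:
    """Return the same calendar day one year later (29 Feb becomes 28 Feb)."""
    try:
        return start.replace(year=start.year + 1)
    except ValueError:
        # start was 29 February and the following year is not a leap year
        return start.replace(year=start.year + 1, day=28)


def deposits_in_follow_up(deposit_dates, project_end: date) -> int:
    """Count deposits made after project_end, up to and including one year later."""
    window_end = one_year_after(project_end)
    return sum(1 for d in deposit_dates if project_end < d <= window_end)


def meets_target(follow_up_count: int, baseline_annual_count: int,
                 factor: float = 2.0) -> bool:
    """True if the follow-up year reaches `factor` times the baseline year."""
    return follow_up_count >= factor * baseline_annual_count
```

The baseline here would be the existing annual rate of academic paper deposit, measured over a comparable 12 month period before the project.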

(You may spot that we haven't really talked about counting access to papers in the repository. That's a topic for another day, but also a slightly less relevant one for DURA.)

Tuesday 14 September 2010

Science Online London 2010

Science Online London was a great event - lots of interesting and lively people from a variety of communities, and some really excellent speakers. Martin Rees, Aleks Krotoski, and Evan Harris stood out for me.


The most relevant bits to DURA were to do with repositories. One important rationale for DURA is that  integration of deposit with reference management (a normal researcher task) might increase deposit rates without anyone on repository staff needing to chivvy the academics along. Science Online attendees were reminded of the breakdown of costs for repositories, where outreach, acquisition and ingest can be up to 55% of the overall costs.  Ouch!

If we can help researchers deposit their works without needing to add to their already busy schedules, this should help deposit rates, and potentially reduce repository costs.

Friday 27 August 2010

Mini update and project team picture

The last few weeks have been busy with meetings and calls - with Symplectic, Mendeley, the DSpace at Cambridge team, the REF project at Cambridge, JISC, and so on. We're fleshing out the details of the project a little more, and identifying the issues which are likely to become big topics of discussion later on - the biggest of these is authentication and authorisation, as we bridge the world of modern web systems and institutional identity. (This connects up to SWORD - and what SWORD2 will be like. We imagine using mostly SWORD1.3, but feeding suggestions into SWORD2 would be a useful contribution.)
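For readers who haven't met SWORD, a deposit is essentially an HTTP POST of a package to a repository collection, with a few extra headers. The sketch below only builds such a set of headers and sends nothing. The header names are taken from the SWORD 1.3 profile as best understood here and should be checked against the specification; the function name, file name and user name are made up, and none of this is DURA's actual deposit code.

```python
from typing import Dict, Optional

# Packaging identifier commonly used for DSpace METS packages (assumed suitable here).
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"


def build_deposit_headers(filename: str,
                          packaging: str = METS_DSPACE_SIP,
                          on_behalf_of: Optional[str] = None,
                          dry_run: bool = False) -> Dict[str, str]:
    """Build the HTTP headers for a SWORD 1.3 style package deposit."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,
        # X-No-Op asks the server to process the request without storing anything.
        "X-No-Op": "true" if dry_run else "false",
    }
    if on_behalf_of:
        # Mediated deposit: the authenticated client deposits for another user.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers
```

The mediated deposit header is where the authentication and authorisation questions bite: a system depositing on a researcher's behalf has to be trusted by the repository to name that researcher.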


You can now see our project's information on the JISC programme information site.  

I'll be at Science Online London next Friday and Saturday, so if you'd like to learn more about the project or have any questions, find me there!


Whilst visiting Symplectic and Mendeley this week we took pictures of the teams - so you can see we all exist, at least in a slightly blurry way :)

Wednesday 11 August 2010

Mendeley API launched

Mendeley have opened their API for developers!

Read all about it here, discuss with others here.

This is exciting news for our project, as well as for others interested in building upon or around Mendeley.

Tuesday 3 August 2010

IR, CRIS, ...

There's an interesting post over on the SONEX blog about institutional repositories and current research information systems, or CRISs. CRISes? Anyway:  read it here.

Tuesday 6 July 2010

Project Plan Post 7 of 7: Budget (and sustainability!)

Our budget includes commercially sensitive details, so is not shared in full here.

Instead, below you will find some information about how we'll be making sure the project is sustainable and well embedded into the user base and community - providing value for money!

The total cost of the project is £369,288. This can be broken down into the institutional contributions, and JISC's contribution:



We can also break the total cost down into the individual components of the project:

Rather than worrying about risks around sustainability, we've chosen to focus on what success would look like.


From a Symplectic perspective, success would mean that academics in institutions that use Symplectic who had also signed up to Mendeley would have another key data source available to them for finding their publications.  For the data sourced from Mendeley (much like arXiv), academics would have the ability to directly deposit from Mendeley into whatever local digital repository was configured for the Symplectic system (be it ePrints, DSpace, Fedora or one of the three commercial technologies that are currently being developed for Symplectic's system).  The deposit mechanism can be made "single click" for the most part and the project may be able to allow people to configure options that will allow full text to automatically be deposited on "approval" of a record.

A user might be able to initiate a "push" from their Mendeley account to an appropriate Symplectic system.  A considerable amount of work might need to be done on authentication methodology here to ensure that target Symplectic systems remain secure.

A further advantage would be that Mendeley would be able to pull data concerning a particular academic from the Symplectic API.  Again security would need to be understood.  However, if successful, an academic could connect to their Symplectic account from Mendeley and automatically pull all their pre-disambiguated records into Mendeley (subject to the records coming from open sources such as PubMed, arXiv or being manually entered).  Where full text exists in a digital repository, a handle will be provided or possibly a file stream object for Mendeley to consume.

Specific budget components: end user engagement and sustainability of embedded deposit solutions


The budget includes 7 person-months for user engagement and support. This will be split between the project partners to deliver coherent support over multiple channels; this is needed because end users will be encountering the project in several ways - through the University's Symplectic deployment, through Mendeley, and possibly through contact with project or repository (DSpace@Cambridge) staff. Laura will be responsible for coordinating these efforts.

The project manager will act as a direct point of contact for other projects in the programme, JISC, and other relevant work. Laura will connect such projects and useful learning to the project team more broadly; in addition she will proactively engage other projects as appropriate to ensure that the programme is well connected internally - for the benefit of other projects, not just this one :)

The budget contains various non-staff costs, which will be allocated to specific activities by the project manager in agreement with the project leads as necessary:

    • £10k hardware
    • £5k travel and expenses split between partners
    • £5k dissemination cost including attending events, sharing information with users and the community, etc
    • £2.5k evaluation costs (assume that this will mostly be used by the University of Cambridge)
    • up to £2.5k contingency funds for the unexpected
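As a quick check on the arithmetic, the non-staff items above can be added up and set against the total project cost given earlier in this post. The dictionary keys are just labels for this example, and contingency is counted at its upper bound.

```python
# Non-staff costs in pounds, as listed above (contingency at its maximum).
NON_STAFF_COSTS_GBP = {
    "hardware": 10_000,
    "travel and expenses": 5_000,
    "dissemination": 5_000,
    "evaluation": 2_500,
    "contingency (upper bound)": 2_500,
}

TOTAL_PROJECT_COST_GBP = 369_288

non_staff_total = sum(NON_STAFF_COSTS_GBP.values())
non_staff_share = non_staff_total / TOTAL_PROJECT_COST_GBP
```

That gives up to £25k of non-staff costs, a little under 7% of the total.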

Project Plan Post 6 of 7 - WP4: Requirements design, testing and dissemination, project management

WP4 forms the basis for WP1-3 and integrates them. Its objectives are the requirements design, the testing and dissemination of the solution, and project management. Laura James at CARET, University of Cambridge, is responsible for delivery of this work package.

During testing, users can make additional suggestions to improve the system. These suggestions will be evaluated and will be the basis for further development (only partially within the scope of this WP). The project management task is responsible for the co-ordination of the project in both administrative and technical terms, aiming to achieve effective operation of the project as well as timely delivery of high-quality results. Specific management structures and techniques have been devised to support the following objectives:
  • Organisation and running of project meetings and achieving common understanding within the project.
  • Setting up services for electronic documentation storage.
  • Production of high quality technical documentation.
  • Establishment of an efficient system of electronic communication and report storage.
  • Coordinating and managing end user communications methods, and managing user expectations
  • Aligning the needs and wishes of project stakeholders, including the project partners and JISC, and ensuring all stay informed about project progress
  • Identifying project success factors and monitoring progress towards these, taking action when required
  • Disseminating project learning to the wider sector community, including at events (potentially continuing past project end)

Task 4.1
Definition of the requirements, in terms of the standards, algorithms, methods and implementation technologies to be used for the project, to ensure successful technological cooperation during the project. (Partners: Mendeley 0.5PM, Symplectic 0.5PM, University of Cambridge 0.5PM)

Task 4.2

Testing and dissemination of the developed solution among selected users and researchers at the University of Cambridge. Evaluation of user feedback and testing results to establish a basis for further improvement / development of the system. Ongoing end user engagement through multiple channels. (Partners: Mendeley 3PM, Symplectic 1PM, University of Cambridge 3PM)

Task 4.3

Project and development management and project administration
  • Project development and implementation management 
  • Organisation of a kick-off meeting, intermediary meetings and mid-term assessment of
    project progress 
  • General contract administration including consortium agreement
  • Financial administration and liaison with JISC 
  • Solution of possible conflicts and misunderstandings and overcoming encountered
    problems throughout the project period
(Partners: Mendeley 2PM, Symplectic 1PM, University of Cambridge 2PM)

Task 4.4
Reporting – To guarantee smooth running of project activities and timely delivery of planned outcomes.

  • Production of management reports and integrated cost statements 
  • Instruction and supervision of partners for efficient reporting of the progress of their
    activities
  • Compilation of joint intermediate and final reports
(Partners: University of Cambridge 1PM)
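The person-month (PM) allocations for tasks 4.1 to 4.4 can be totalled per partner with a few lines of code. The figures below are copied from the task descriptions above; the data structure and function name are just one convenient way to lay them out.

```python
from collections import defaultdict

# Person-months per partner for each WP4 task, as stated above.
WP4_PERSON_MONTHS = {
    "4.1": {"Mendeley": 0.5, "Symplectic": 0.5, "University of Cambridge": 0.5},
    "4.2": {"Mendeley": 3, "Symplectic": 1, "University of Cambridge": 3},
    "4.3": {"Mendeley": 2, "Symplectic": 1, "University of Cambridge": 2},
    "4.4": {"University of Cambridge": 1},
}


def totals_by_partner(tasks):
    """Sum person-months across tasks for each partner."""
    totals = defaultdict(float)
    for allocation in tasks.values():
        for partner, person_months in allocation.items():
            totals[partner] += person_months
    return dict(totals)


wp4_totals = totals_by_partner(WP4_PERSON_MONTHS)
wp4_overall = sum(wp4_totals.values())
```

This comes to 5.5PM for Mendeley, 2.5PM for Symplectic and 6.5PM for the University of Cambridge, or 14.5PM across the work package.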




Deliverables

D 4.1 Requirements definition document (month 1) 

D 4.2 Dissemination, testing, and feedback report (month 12) 

D 4.3 Project kick-off meeting (Month 1) 

D 4.4 Project meetings (incl. minutes) (months 3, 6, 9) and final project report (month 12)


D 4.5 Sharing learnings with wider community via event(s) (month 12+)

D 4.6 Report on end user engagement and its implications for sustainability (month 12)

Project Plan Post 6 of 7 - WP3: Integration in IR Systems

This work package has two objectives: (1) to build on work packages 1 and 2 by integrating Mendeley with Symplectic, and (2) to create a means of secondary deposit and a notification system (without Symplectic) with the researcher’s institution (task 3.4), as well as to investigate the use of OA-RJ for authentication. Jason Hoyt at Mendeley will be responsible for delivery.

The Mendeley Desktop application is capable of extracting metadata from PDF files on a researcher’s computer. This means that a user does not need to manually enter authorship details about their publications. Using this initially extracted metadata, Mendeley will ping the Symplectic servers, which will then verify author information (see WP2). Symplectic will then either deposit or update any records at the end-user’s institutional repository. Symplectic will ensure that unwanted deposits, such as master’s theses, can be excluded.

Repository Junction was established through a previous JISC funded grant, OA-RJ. OA-RJ is able to determine a user’s institutional repository via IP. Mendeley will investigate the use of OA-RJ to properly authenticate a user to their IR. Where Symplectic is already in use, such as at the University of Cambridge, it will be used in combination with OA-RJ to authenticate and securely deposit documents and their metadata from the Mendeley Desktop client. The use of OA-RJ at institutions that have not installed Symplectic will also be investigated, but not implemented during the course of the DURA project.
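The idea of finding a user's institutional repository from their IP address can be sketched as a lookup over known network ranges. This is not OA-RJ's actual interface or data: the table, the repository URLs and the function name are invented for illustration, and the address ranges are ones reserved for documentation. It covers only the lookup step and does not authenticate anyone.

```python
import ipaddress
from typing import Optional

# Hypothetical mapping from institutional network ranges to repository URLs.
REPOSITORY_BY_NETWORK = {
    ipaddress.ip_network("192.0.2.0/24"): "https://repository.example.ac.uk/",
    ipaddress.ip_network("198.51.100.0/24"): "https://ir.example.edu/",
}


def find_repository(client_ip: str) -> Optional[str]:
    """Return the repository URL whose network contains client_ip, if any."""
    address = ipaddress.ip_address(client_ip)
    for network, repository_url in REPOSITORY_BY_NETWORK.items():
        if address in network:
            return repository_url
    return None
```

A researcher working off campus would not match any institutional range in a scheme like this, which is one reason an IP lookup alone cannot stand in for authentication.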

Task 3.1
Create a secure API to transmit metadata to Symplectic for author lookup via its databases. Here, Mendeley will first verify authorship details using Symplectic and return the result to the end-user for confirmation. (Partners: Mendeley 1PM, Symplectic 1PM)

Task 3.2

Create a secure API to transmit the verified authored document to Symplectic. Once the user has confirmed the details of the document, this task will ensure the document is delivered to Symplectic for deposit into the appropriate IR. (Partners: Mendeley 1PM, Symplectic 1PM)

Task 3.3
Create APIs to enable a notification system between the institution and Mendeley’s client software to verify primary and secondary deposits. (Partners: Mendeley 1PM, University of Cambridge 1PM)

Task 3.4

Develop and implement APIs to deposit works directly into Cambridge and other universities based on an open standards API. (Partners: Mendeley 4PM, University of Cambridge 1PM)

Task 3.5

Load balancing on the Mendeley Desktop platform. Since transmitting documents via Mendeley will consume bandwidth and put additional strain on system resources, a solution will be developed that ensures tasks 3.1 and 3.2 work cohesively with the Mendeley software already developed.
(Partners: Mendeley 2PM)

Task 3.6

Investigate the use of OA-RJ. Working with Cambridge, we will determine if OA-RJ is a suitable alternative when Symplectic is not already installed. A complete solution, however, will not be integrated into Mendeley Desktop. (Partners: Mendeley 1PM, University of Cambridge 1PM)






Deliverables


D 3.1 Connect Mendeley to Symplectic and transmit user documents (month 9)

The metadata and full text of documents will need to be securely transmitted to Symplectic for inclusion into the IR. An application programming interface will be built into the Mendeley Desktop client based on tasks 3.1-3.

D 3.2 Open APIs developed and implemented within Mendeley to use as alternate means of depositing content (month 9) 
APIs will be created to confirm document deposits in the user interface built in work package 1. This will also let the researcher’s institution send requests for secondary deposits, such as into the NHS. Additionally, it will provide a means to deposit materials into the IR outside of the need for Symplectic. The project will develop open source APIs for this communication available for use by any organization. These APIs will then be integrated and tested within Mendeley and Cambridge.

D 3.3 Report on the use of OA-RJ (month 10)

Based on the investigation of using OA-RJ to identify Mendeley users with the University of Cambridge, a report will be created and shared on this project blog.

Project Plan Post 6 of 7 - WP2: Symplectic interface

This work package will allow data to be transferred between Mendeley and Symplectic in a bi-directional way. It will also extend Symplectic’s data API to give Mendeley mediated access to the institutional digital repository (allowing Mendeley to interface to DSpace, ePrints, Fedora and, in time, further repositories) through a single, consistent data API.  Daniel Hook at Symplectic will be responsible for delivery of this work package.

As part of this work package, authentication methodology will be investigated. Each institution has a separate authentication methodology (Active Directory, LDAP, Kerberos, Shibboleth), and there are often variants on these main themes. An approach will be taken that may be based on institutional authentication methodology or may be independent of the local system, depending on the outcome of analysis. The methodology chosen will be based on inclusiveness: allowing the largest number of institutions to participate with the lowest entry requirements while maintaining an appropriate level of data security.

Task 2.1
Develop Mendeley data source (to pull both bibliographic data and full text).
(Partners: Symplectic 1PM, Mendeley 0.5PM)

Task 2.2
Develop API extension to allow Mendeley to pull bibliographic data and full-text from Symplectic / Repository respectively. (Partners: Symplectic 1PM, Mendeley 0.5PM)

Task 2.3
Investigate authentication and develop methodology for authentication between Mendeley and Symplectic. (Partners: Symplectic 2PM, Mendeley 1PM)

Task 2.4

Develop API to allow Mendeley to push both bibliographic and full-text data into Symplectic, including authentication methodology. (Partners: Symplectic 2PM, Mendeley 1PM)


Deliverables

D 2.1 Mendeley as a data source (month 3)
Use Mendeley API to authenticate user and pull data from their account. Provide user interface to support Mendeley API usage (authentication with Mendeley etc.). Implement “throughload” functionality for Mendeley data source.

D 2.2 Mendeley back-sync from Symplectic (month 4)
Extend existing Symplectic data API to allow full-text to be located in local institutional repository and harvested into Mendeley.

D 2.3 Authentication methodology (month 5)

Documentation describing results of analysis. Explanation of methodology for authentication between Mendeley and Symplectic – source code for authentication mechanism.

D 2.4 Mendeley “Push” functionality (month 7)
Create API for specific integration with Mendeley – handle authentication of a user; account matching; and accepting bibliographic metadata into Symplectic’s software (matching to existing articles in user’s profile) and passing full-text through into target repositories.
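The "matching to existing articles in user’s profile" step in D 2.4 could work along the following lines. This is only a sketch under assumed rules (match on DOI first, then on a normalised title) and an assumed record layout (dicts with optional `doi` and `title` keys); it is not a description of how Symplectic's software actually matches records.

```python
import re
from typing import List, Optional


def normalise_title(title: str) -> str:
    """Lower-case a title and collapse punctuation and whitespace runs."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def find_existing_record(incoming: dict, profile_records: List[dict]) -> Optional[dict]:
    """Return the profile record matching the incoming one, or None.

    Assumed rule: a DOI match wins; otherwise fall back to an exact
    match on the normalised title.
    """
    incoming_doi = (incoming.get("doi") or "").lower()
    if incoming_doi:
        for record in profile_records:
            if (record.get("doi") or "").lower() == incoming_doi:
                return record

    incoming_title = normalise_title(incoming.get("title") or "")
    if not incoming_title:
        return None
    for record in profile_records:
        if normalise_title(record.get("title") or "") == incoming_title:
            return record
    return None
```

A record pushed from Mendeley that finds a match would update the existing article, and one that finds none would be added as new; real matching would need to cope with less tidy metadata than this.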

Project Plan Post 6 of 7 - WP1: Open/Standardised interface

The objective of work package 1 is to create an interface within the existing Mendeley Desktop software client to deposit records into institutional repositories.  Jason Hoyt at Mendeley is responsible for delivering this work package.

Scientists and researchers use Mendeley for their own reference management. Mendeley offers a feature called “My Publications” where the user can simply drag & drop their own publications (i.e. the articles they have authored) into a designated collection in Mendeley Desktop. These publications (either metadata only or metadata and full-text) then sync to the user’s profile page on Mendeley Web. This work package intends to create an interface between Mendeley and an institutional repository of the university the user is affiliated with to deposit these publications and the respective metadata into the institutional repository. In an example use case, a researcher would click the “sync” button in the Mendeley client. The researcher’s institution would then respond to confirm deposits. In some cases, deposits might additionally be required in a secondary repository, for example the NHS database. In such cases, the institution would also send a request to confirm deposit into the secondary repository. The logic and APIs for this will be developed in work package 3, while work package 1 will manage the user interface requirements.

Task 1.1
Determine the user interface requirements for the authentication form and document selection. Mock ups based on the UI/UX needs will then be created. This task will be performed by the Mendeley project manager and Mendeley’s lead designer. (Partners: Mendeley 2PM, Symplectic 1PM, University of Cambridge 1PM)

Task 1.2
Mock ups from task 1.1 will be used to build the client-side and server-side logic to handle authentication and document selection. (Partners: Mendeley 1PM)

Task 1.3
Finally, the mock ups from task 1.1 will guide the development of the actual user interface in the Mendeley Desktop client. (Partners: Mendeley 2PM)



Deliverables

D 1.1 User interface requirements document (month 2)
This document, based on task 1.1, will be an internal document for project partners and will be uploaded to the project’s Basecamp account.

D 1.2 Client-side authentication interface (month 9)
The researcher will need to provide appropriate authentication credentials in a securely transmitted form to their IR. To support this, the Mendeley Desktop client will need to be extended to provide an input form for those credentials. The existence of this form and its use will also need to be communicated to the user in line with the current Mendeley Desktop user interface. The actual implementation will occur in project month 9 once WP3 deliverable 3.1 is complete.

D 1.3 User interface for selecting documents to be deposited (month 9)
Not all documents may be appropriate for deposit into the IR. Therefore a simple method is needed to designate documents for inclusion or exclusion. The actual implementation will occur in project month 9 once WP3 deliverable 3.1 is complete.

Project Plan Post 6 of 7: Projected Timeline, Workplan & Overall Project Methodology

Responsibilities

Laura, as project manager, will be responsible for managing the processes by which the project engages with end users, and feeds the information thus obtained back into the development work.  Responsibility for monitoring each individual end user communication channel (eg Mendeley support systems, DSpace support systems, CARET support) will lie with the existing personnel responsible for those channels, who will be aware of the project and in close contact with the relevant project team members.

Laura will also be responsible for the dissemination plan, ensuring that others in the programme and beyond learn about the project's outputs and learnings, and that JISC stay up to date with the work so that they can support the project in sharing findings with the wider sector in appropriate ways.

Responsibility for the technical work packages is detailed in each work package below.

The formal agreements between the partner organisations will be the responsibility of John Norman at CARET.


Work packages
The various work packages are described below, with deliverables and responsible project team member shown.  Further detail on each work package follows in subsequent blog posts, so that this post isn't incredibly long :)

  • WP1: Open/Standardised interface
    • Responsible: Jason Hoyt, Mendeley
    • Deliverables:
      • D 1.1 User interface requirements document (month 2)
      • D 1.2 Client-side authentication interface (month 9)
      • D 1.3 User interface for selecting documents to be deposited (month 9)
  • WP2: Symplectic interface
    • Responsible: Daniel Hook, Symplectic
    • Deliverables:
      • D 2.1 Mendeley as a data source (month 3)
      • D 2.2 Mendeley back-sync from Symplectic (month 4)
      • D 2.3 Authentication methodology (month 5)
      • D 2.4 Mendeley “Push” functionality (month 7)
  • WP3: Integration in IR Systems
    • Responsible: Jason Hoyt, Mendeley
    • Deliverables:
      • D 3.1 Connect Mendeley to Symplectic and transmit user documents (month 9)
      • D 3.2 Open APIs developed and implemented within Mendeley to use as alternate means of depositing content (month 9) 
      • D 3.3 Report on the use of OA-RJ (month 10)
  • WP4: Requirements design, testing and dissemination, project management
    • Responsible: Laura James, CARET, University of Cambridge
    • Deliverables: 
      • D 4.1 Requirements definition document (month 1) 
      • D 4.2 Dissemination, testing, and feedback report (month 12)
      • D 4.3 Project kick-off meeting (month 1)
      • D 4.4 Project meetings (incl. minutes) (months 3, 6, 9) and final project report (month 12)
      • D 4.5 Sharing learnings with wider community via event(s) (month 12+)
      • D 4.6 Report on end user engagement and its implications for sustainability (month 12)

         

 Workplan timings



Monday 5 July 2010

Project Plan Post 5 of 7: Project Team Relationships and End User Engagement

Project Team
The project team is split between Mendeley, Symplectic and the University of Cambridge.

Daniel Hook - Symplectic - Project Technical Lead                 
Caroline Rouault - Symplectic - Project Manager 
John Fearns - Symplectic - Head Architect 
Richard Jones - Symplectic - Repository systems architect 

Jan Reichelt - Mendeley - Project director  
Jason Hoyt - Mendeley - Project manager 
Kris Jack - Mendeley - Senior Data Mining Engineer   
Nick Jones - Mendeley - Lead Web Engineer  
Robert Knight - Mendeley - Lead Desktop Engineer    


Laura James - CARET, University of Cambridge - Project manager  
John Norman - CARET, University of Cambridge - Project director 


Elin Stangeland - University Library, University of Cambridge - DSpace liaison  

Patricia Killiard - University Library, University of Cambridge - Project director  



End user engagement

We have a number of channels for end user engagement. Firstly, both CARET and the UL work with researchers and others around the University of Cambridge and operate support desks for existing systems (such as CARET's support for the institutional VRE). We can use these support systems, particularly to monitor for any incoming enquiries.

Mendeley are able to contact their Cambridge users, and are already doing this to ask permission for CARET to contact them directly about this project. In addition, Mendeley operates a university advisor system, through which a volunteer researcher supports local users and connects them to Mendeley. There is also the conventional user support system operated by Mendeley, which may be able to set up a subdomain to handle support specifically about this project. Mendeley also offer a strong user engagement ethos, exemplified by events such as open office Fridays. We hope to work with the incoming university advisor for Cambridge, as well as through our direct user contacts.

Symplectic already have a pilot system in place at Cambridge. We will also work with the local management of the pilot system to establish whether our project will work directly with that, or with a separate instance. In either case, we will work with the pilot project to ensure that clear communication lines exist for researchers using our system vs their system, and to make sure that there is no resulting communication confusion. Daniel at Symplectic has been heavily involved in the Cambridge pilot and can ensure that this relationship operates smoothly. 

In terms of promotion of our project, then, we have a variety of methods available, including using existing channels at Cambridge (such as posters, word of mouth at events where researchers congregate, the librarian network, the VRE announcement system, etc).  We note that it will be important to record the reasons if any user declines to use our system, or expresses any concerns about it, as well as the responses of the enthusiastic. The support systems of CARET and Mendeley both support automatic issue recording and tracking so that issues are not lost or forgotten.


Management of end user communications and end user concerns will be the responsibility of Laura, the project manager at Cambridge.

In addition to the deliverables offered in the proposal, we will also create a report about end user engagement and what this means for sustainability.


External and partner stakeholder management

Day to day, Daniel and Jason will ensure the project progresses at each of their respective companies. Laura will be the project manager responsible for overall project delivery, collaboration and communication (including to JISC and the broader community); this is a 0.25FTE role.  John Norman will be responsible for setting up the project's formal collaboration agreement.

Technical decision-making will be the joint responsibility of Jason and Daniel (the technical leads), with Laura bringing in neutral technical experts (such as Ian Boston) from Cambridge if required to mediate and reach a good conclusion. 



Internal project team management and communication

As all the project collaborators are based in London or Cambridge, even face to face meetings should be feasible, particularly at the start and end of the project.  Kickoff and major review meetings, including those involving JISC, will be face to face meetings.

All partners are experienced at working collaboratively and remotely. We will use a combination of online tools to support our collaboration, chosen to match existing work practices at the project partner institutions (so that, as far as possible, we avoid introducing new tools which do not fit smoothly into established working practices). We anticipate using an email list (hosted on CamTools, our Cambridge VRE), a shared Subversion repository or Google Code site (to be chosen by project technical team contributors), and possibly a BaseCamp site.  We plan to use Skype (including chat) to keep in touch as well as the old staples of email and phone, which we are already using to coordinate our planning.

Laura will be responsible for sharing communication details, monitoring that effective communication is happening at an appropriate frequency, and changing communication methods if necessary to correct any problems, in consultation with the rest of the active project team.

Friday 2 July 2010

Project Plan Post 4 of 7: IPR (Creative Commons Use & Open Source Software Licence)

All APIs developed during the course of this project will be open sourced, whilst all solutions currently implemented in either Mendeley or Symplectic will remain proprietary to protect their business models.

Symplectic is currently unable to make open source its already existing API and Repository Tools software as a part of this project. However, the “push” functionality referred to elsewhere, which would allow a user of the Mendeley interface to apply data to the Symplectic system, will be made open source. Symplectic’s open source code, together with documentation, will be provided under GPL and will be deposited on Google Code.

While Mendeley aims to create a “freemium” end-user product (software that has both a basic free version and a paid premium version), the APIs developed to connect the institutional repository to Mendeley will be open sourced, thoroughly documented, and hosted by both Cambridge and Mendeley, as well as on sourceforge.net and Google Code under GPL license.

As well as code, there may be IPR issues around the content to be transferred/produced in the system we create; there will be an ongoing strand around IPR in the DURA project in which we will investigate these issues and disseminate what we learn on this blog. We do not anticipate major problems in this area, though.

All project documentation including this blog is available under a Creative Commons CC-BY licence.

Project Plan Post 3 of 7: Risk Analysis and Success Plan

General risk analysis

The biggest threat to DURA having a meaningful impact is the lack of a sufficient user base using Mendeley as a reference management tool. This risk is mitigated by several factors:
  • (1) There are very low barriers to adoption: Mendeley solves an urgent problem for researchers (document and data organisation), it is available on multiple platforms, and it is available for free (and future planned paid-for features will not take away anything that is free now); 
  • (2) Universities and libraries, with librarians being the advisors for many researchers, already have and will have an increased incentive to motivate their researchers to use reference management software, even more so if this software is integrated with repository and reporting software, such as Symplectic.
Risk analysis WP 1: Open/Standardised interface
This carries a very low risk that the delivered interface will be confusing to the end-user, which could result in incorrect documents being deposited or none at all. To prevent this, user interface designs will be tested and prototyped with selected end-users prior to full implementation. There’s also a risk that the interface for other institutional repositories isn’t understandable and thus inaccessible – however, the project will make sure to adhere to common industry practices and standards (example).

Risk analysis WP 2: Symplectic interface
There is a low risk that authentication models will fail due to complexities within existing client institutions. To address this, Symplectic will work with Mendeley so that a variety of authentication modes are supported, and with client organisations to put appropriate measures in place so that authentication is possible.

Risk analysis WP 3: Integration in institutional repository systems
There is a low level risk that authentication could fail. Standard practice would be to attempt re-authentication until a success response is returned. There is also a low level risk that the transmission from Mendeley to Symplectic could fail. This could be due to bandwidth issues or general connection errors. To address this, large files will be throttled to ensure that all bandwidth needs are met and that system resources are not overloaded.
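
As a rough illustration of the two mitigations above, here is a minimal Python sketch. It is not project code: the function names, the cap on the number of attempts, the exponential back-off and the chunk size are all our own assumptions (the plan only says "re-authenticate until success" and "throttle large files"), and the login call is passed in as a placeholder.

```python
import time
from typing import Callable, Iterator


def authenticate_with_retry(
    attempt_login: Callable[[], bool],
    max_attempts: int = 5,          # assumption: we cap retries rather than loop forever
    base_delay: float = 1.0,        # assumption: seconds to wait after the first failure
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call attempt_login until it returns True or max_attempts is used up."""
    for attempt in range(max_attempts):
        if attempt_login():
            return True
        # Wait a little longer after each failure, but not after the last one.
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return False


def throttled_chunks(
    payload: bytes,
    chunk_size: int = 64 * 1024,    # assumption: 64 KiB per chunk
    pause: float = 0.0,             # assumption: seconds to pause between chunks
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[bytes]:
    """Yield payload in fixed-size chunks, pausing between them."""
    for start in range(0, len(payload), chunk_size):
        if start and pause:
            sleep(pause)
        yield payload[start:start + chunk_size]
```

A caller would pass its own login function (for example, one that posts credentials to the IR) as `attempt_login`, and send each chunk from `throttled_chunks` in turn. Capping the retries is a deliberate departure from "retry until success", so that a permanently wrong password does not loop forever.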

Collaboration risk
A final, significant risk is that this is a collaboration between two commercial companies and a university.  We will be working with our university Legal Services team to ensure that a simple, readable but effective agreement is put in place.  We will also use a range of online collaboration and project management tools (ideally ones which the project team are already familiar with and using in other work) to ensure that the three partners stay in close contact, and that the project manager can detect any problems at the earliest opportunity. Regular communication between the active team members and senior management at all three organisations will also be used, to make sure that problems get tackled early and don't fester. Most of the project team have worked on collaborative projects before. We will also hold kickoff meetings so that as many of the team as possible can meet face to face.  With all partners in London or Cambridge, this should be fairly easy logistically :)

Project Plan Post 2 of 7: Wider Benefits to Sector & Achievements for Host Institution

JISC DURA seeks to significantly encourage academic engagement with institutional repository technologies by integrating the concept of deposit into the academic workflow. The current low population of repositories is due to several barriers:
  • i) lack of confidence in understanding copyright laws – academics do not wish to deposit articles that may not be of appropriate copyright status;
  • ii) rekeying bibliographic information into a digital repository interface; 
  • iii) finding a full-text version to deposit; 
  • iv) finding time to carry out the activity of deposit.
DURA addresses each of these barriers – by integrating with the RoMEO data service, copyright information can be shown to an academic and used intelligently as part of the framework developed here to ensure that appropriate articles are deposited; by using Mendeley and Symplectic bibliographic data, rekeying of data is minimised; and by using Mendeley as a full-text data source, barrier iii) is significantly reduced across all fields engaging in journal publication. The final barrier is the real key to success in increasing repository engagement. By providing easy access to deposit in an institutional repository through existing well-used tools such as Mendeley and Symplectic, the required effort of uploading full-text and exchanging metadata is reduced to clicking a button in either system. Moving repository deposit activity away from the repository interface, and into simple intuitive tools that source data automatically on behalf of academic users, removes the final barrier to academic engagement and opens the door to sustainable institutional mandates for deposit.

The central idea is two-fold:
  • (1) To allow academics, who use Mendeley regularly to manage their publications lists, to upload their publications metadata and full-text directly into institutional repositories. This end-user driven approach is hoped to increase deposit rates, since a direct integration into the workflow of a researcher significantly reduces barriers to and efforts for depositing data, as discussed earlier. The impact of this approach will also increase along with a more widespread adoption of Mendeley, which has a very clear value proposition to researchers. 
  • (2) Where available at a university, Symplectic’s research information management software will, on the one hand, acquire data from Mendeley alongside its other data sources, and on the other hand mediate deposit into the repository through its Repository Tools technology.

That means that DURA essentially results in two possible ways of depositing content into a university’s institutional repository: (1) universities can directly link their repository to the Mendeley API, which should be developed during this project; or (2) where available, the integration can take place via Symplectic’s software. In the second case the data is already clean and disambiguated, full copyright data sourced from RoMEO is available to the repository, and the subtle “gearing” between the dynamic data in the researcher’s publication list and the more static repository is automatically handled by an existing technological solution.


The evolution of the national policy environment has been rapid over the past few years, with most major UK research funders now insisting that research outputs stemming from the projects they fund be deposited into an Open Access repository. Many universities have similar policies. However, the deposit rate is too low, and repositories remain under-used. One reason for this is that researchers do not, at the time when they might deposit an item, see a benefit outweighing the cost of their time in doing so (known as a ‘negative feedback loop’). Furthermore, while most researchers have at least one repository in which to deposit their outputs, many have more than one, and the lack of join-up between them, and between them and other services, has meant that ‘deposit’ is less straightforward than it should be. What is immediately required is the embedding of the complete deposit solution into the authoring or related research process.

DURA is focussed on the tools the individual researcher chooses to maintain his/her collections of research papers as a core element of their research process. The proposition is to allow the researcher to continue to use his/her preferred tool while satisfying other demands (such as institutional deposit mandates or REF reporting requirements) at little extra effort. The primary benefit to users is therefore a time saving in meeting reporting requirements and minimal distraction from research activity. A secondary benefit is the knowledge that by synchronising with institutional systems to save their own time, they will also increase the integrity of the publications databases on which they depend.

The research tools popular with researchers for managing papers and references include Zotero, Mendeley, RefWorks and EndNote. The institutional systems that create demands on the active researcher for publication reporting/deposit include DSpace, Fedora, EPrints and Symplectic. In DURA we focus on Mendeley, Symplectic and DSpace (integrating with Symplectic, however, will automatically cover EPrints and Fedora as well), but the aim is to develop open, standardised interfaces, infrastructure, and software technology to permit information exchange between researcher applications and institutional systems such that institutional requirements can be met at almost no cost and some benefit to researchers themselves.

Mendeley is an exemplar of the new generation of researcher-focussed end-user tools for managing publications as part of the research process. It is a new tool but use is growing rapidly and it is clearly popular with its users. Symplectic is not an institutional repository, but a system for gathering and validating the publication lists of the academics known to be employed at a particular institution, with a particular emphasis on HEFCE REF reporting requirements. DSpace is a widely used institutional repository.

We anticipate that authentication will be provided using Shibboleth, so that Mendeley can establish a link between itself and the target system. Shibboleth is an ideal technology for this since it already incorporates the "Where are you from?" concept. We will also evaluate the potential of Repository Junction to facilitate account matching. Once the accounts are linked, synchronisation options can be set according to the systems being linked. The publications associated with the researcher in both systems will be compared and the researcher will have an opportunity to push publications from one system to the other in order to fill gaps. This could be automatic or manual according to user preferences. In some cases, the end-user may wish to receive Symplectic notifications of potential publication matches, to assist in maintaining the researcher's list of his/her own publications while in the Mendeley client. It will also be possible to initiate a Mendeley synchronisation from within the Symplectic interface. (In the context of the Symplectic software, Mendeley will appear as a new data source alongside existing sources such as Web of Knowledge, Scopus and PubMed.)
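
To make the "compare and fill gaps" step a little more concrete, here is a small Python sketch of one way it could work. Everything in it is an assumption for illustration: the `Publication` record, the matching rule (DOI if present, otherwise normalised title plus year) and the function names are ours, and do not describe how Mendeley or Symplectic actually match records.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Publication:
    """Hypothetical minimal record; real systems hold far richer metadata."""
    title: str
    year: int
    doi: str = ""


def match_key(pub: Publication) -> tuple:
    """Key used to decide whether two records describe the same paper."""
    if pub.doi:
        return ("doi", pub.doi.strip().lower())
    # Fall back to a whitespace- and case-normalised title plus year.
    return ("title", " ".join(pub.title.lower().split()), pub.year)


def find_gaps(
    mendeley: Iterable[Publication],
    symplectic: Iterable[Publication],
) -> Tuple[List[Publication], List[Publication]]:
    """Return (records to push to Symplectic, records to push to Mendeley)."""
    in_mendeley = {match_key(p): p for p in mendeley}
    in_symplectic = {match_key(p): p for p in symplectic}
    to_symplectic = [p for k, p in in_mendeley.items() if k not in in_symplectic]
    to_mendeley = [p for k, p in in_symplectic.items() if k not in in_mendeley]
    return to_symplectic, to_mendeley
```

In manual mode the two lists would be shown to the researcher to confirm; in automatic mode they would simply be pushed. A record without a DOI will not match the same paper recorded with a DOI in the other system, so a real implementation would need fuzzier matching than this.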

The net result should be a near-zero extra cost to complying with institutional deposit and reporting requirements from within the system chosen by the researcher to support their research activity, with a possible benefit of prompting from Symplectic of potential author matches from publication databases. The choice might be: you can learn how to deposit in DSpace or how to use Symplectic to meet your reporting/deposit obligations, or you can click 'synchronise' in Mendeley and forget about it after a few simple configuration choices.

Overall, the sector and the University of Cambridge will benefit from what the end users will benefit from:

Researcher benefits
Researchers are naturally inclined to participate in institutional repositories, but are often unaware of the protocols to do so. Often, the requirements are as simple as sending in a bibliography list of the author’s publications. Despite the simplicity, many do not participate because of misunderstanding the effort or complexity required. By taking advantage of the natural reference management workflow that most researchers already participate in, repository input can be made as easy as clicking a sync button in the reference manager. No extra effort is required to create the bibliography or gather materials. The researcher will also benefit from a highly accessible method of retrieving literature already in the IR via the same interface.

Repository manager benefits
Faculty participation in IRs is typically in the range of 10-15%. Again, this is mainly due to the perceived complexity and amount of effort by the researcher to participate. The key benefit then for the repository manager is that since no additional work is required of the researcher, it is highly likely that deposits will greatly increase.

 

Project Plan Post 1 of 7: Aims, Objectives and Final Outputs of the project

The overall goal of DURA is to change the way researchers, departments, and universities deposit information into institutional repositories via an integration of a workflow reference management tool and institutional repositories. The information to be deposited includes both metadata and full-text content of research papers that have been authored by researchers affiliated with a specific institution.

Aims:
  1. Derive “automated” methodology to source appropriate copies of full text from Mendeley for academics within institutions and deposit these into the institutional digital repository.
  2. Minimise rekeying of bibliographic information by academics by repurposing Mendeley metadata in Symplectic Elements (allowing the academic to do much of the bibliographic management in Mendeley if they wish).
  3. Enrich an academic’s Mendeley experience by fetching full text from appropriate digital repositories into their Mendeley user area.
 
The project partners are the University of Cambridge, Mendeley, and Symplectic. At Cambridge, both CARET and the UL are involved.

Mendeley

Mendeley is a personal academic publication library and social networking site.  The software is not open source; however, it is free to use and APIs are available to process data.  Academics sign up individually and are able to maintain bibliographic lists of publications that they use in their research (this may include publications that they’ve created themselves).  In addition, Mendeley offers the academic the opportunity to upload a full text version of the publication to an area of free webspace so that these documents may be accessed by the academic wherever they may be.  As part of the upload exercise, the Mendeley software parses the full text and extracts key data so that the academic is not required to rekey bibliographic information.

Full text and bibliographic lists are managed through a variety of interfaces: a web-interface, a desktop tool that may be downloaded and installed locally, and an iPhone and iPad app (to be released).
A key feature of the Mendeley software is the ability to download a formatted bibliography in a variety of formats and citation styles for the inclusion in a CV or academic paper.

Symplectic
Symplectic provides an academic publications management solution for universities.  The interface is entirely web-based.  Universities buy the software on licence centrally and interface the software with other key university systems: Human Resources/Identity management, authentication, Content Management System, Grants/Financial systems, Digital Repository.  Use of the system is often mandated centrally by the university.  The software is neither open source nor free; however, Symplectic has started releasing certain pieces of software in an open source environment.   Symplectic is focussed on maintaining personal publications lists rather than a wider body of work and hence has a slightly different bias in terms of its data content to Mendeley.  Like Mendeley, Symplectic also tries to minimise rekeying of data – this is mainly through the use of APIs to third party data providers such as Web of Science, Scopus, PubMed, arXiv and DBLP.

Full text may be uploaded to the local digital repository through Symplectic’s web interface.  The link between Symplectic Elements and the institutional repository is extremely close.  This is achieved through a complex set of webservices, part of which are installed on the repository and part of which exist in the Symplectic Elements system itself.  These web services allow the repository and the Elements system to participate in a two-way conversation, allowing each system to understand the state of the other with respect to each individual article which they hold.






Project overview

  • A.  Symplectic to Mendeley:  Using Symplectic’s existing API to allow Mendeley to pull records for a particular user – in particular the link to the Institutional Repository, so that a file can be streamed from the repository to Mendeley.  This will require us to develop a handshake model that matches institutional requirements with regard to security – Shibboleth?
  • B.  Mendeley to Symplectic: 
    • i) Using Mendeley’s existing API to allow Symplectic to pull data for a particular user.  This will require the user to enter their Mendeley account details into the Symplectic Elements system so that the system can authenticate with Mendeley to retrieve metadata and full text; 
    • ii) Allowing a user to “push” data from their Mendeley account to Symplectic by pushing a “Sync with Symplectic” button.  This will also require some form of authentication – Shibboleth?
  • C.  Symplectic pushes full text and metadata to IR (including DSpace, EPrints and Fedora).  Existing technology: Symplectic Repository Tools.
  • D.  Repository gives Symplectic information on demand concerning article level statuses.  Existing technology: Symplectic Repository Tools.
  • E.  Mendeley to Repository.  Mendeley develop a SWORD-based methodology for deposit into the Repository for institutions who don’t subscribe to Symplectic – this will be DSpace specific in this project. Assume that this will be:
    • a) authenticated somehow – Shibboleth?; 
    • b) “fire and forget” (ie. Mendeley won’t retain state information about the article in the IR).  To be developed by Mendeley with Symplectic support.
  • F.  Repository to Mendeley. Allowing Mendeley to acquire full text from the IR and associate with an academic so that this may be loaded into their Mendeley user area. To be developed by Mendeley.
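
For route E, a SWORD deposit is essentially an HTTP POST of a packaged file to a repository collection URL. The Python sketch below only builds such a request and does not send it. Treat it as an assumption-laden illustration: the header names and the METS packaging identifier follow our understanding of the SWORD 1.3 profile as used by DSpace and should be checked against the target repository's service document, the function names are ours, and authentication is left out because the plan has not settled on a mechanism.

```python
from urllib.request import Request

# Packaging identifier for DSpace METS packages (assumption: SWORD 1.3 profile).
METS_DSPACE_SIP = "http://purl.org/net/sword-types/METSDSpaceSIP"


def sword_headers(filename: str, on_behalf_of: str = "") -> dict:
    """Build the HTTP headers for a zipped package deposit."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": METS_DSPACE_SIP,
    }
    if on_behalf_of:
        # Mediated deposit: the depositing service acts for a named researcher.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers


def build_deposit_request(
    collection_url: str,   # hypothetical: taken from the repository's service document
    package: bytes,
    filename: str,
    on_behalf_of: str = "",
) -> Request:
    """Prepare (but do not send) a 'fire and forget' deposit request."""
    return Request(
        collection_url,
        data=package,
        headers=sword_headers(filename, on_behalf_of),
        method="POST",
    )
```

"Fire and forget" here means the caller would send the request, check for a success status, and keep no record of the resulting item in the IR.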

Thursday 1 July 2010

Still planning!

Some project planning documents will shortly appear on this blog. We haven't formally started any project, but these are so we can share our thinking with JISC and each other, and gather feedback in the, err, comments.