Wednesday 15 September 2010

Counting

One of the requirements of the JISCdepo programme is that the repositories taking part should have an analytics engine of some sort, so that deposit rates can be observed over the course of the projects.

But what does this mean, particularly for DURA?

Firstly - what are we counting? The most obvious thing to count for a project around deposit would be deposits! 

Irrefutable proof?
One thing we must remember is that even if we count the deposits going into a repository, and observe an increase during a deposit-focused project, it does not prove that the project made a difference.

We might reasonably expect deposits to be going up anyway, as awareness of and interest in repositories and openness increase. And there may be other initiatives underway which could alter the deposit rate within the institution - perhaps general repository publicity, perhaps some particularly large research datasets being stored during the period, perhaps an enthusiastic staff member evangelising effectively. So we can observe changes in deposit rate, but we cannot necessarily draw meaningful conclusions from them about the effectiveness of projects to increase deposit.

So we must treat deposit counts with some caution - they are certainly valuable information, but not necessarily a vindication, by themselves, of a single project to increase deposit rates.



Counting somewhere else
In DURA we are super lucky though, because all the deposits from our project will come from other places, and those other places are engaged with the project and so we can count things there.

If we're depositing from Mendeley direct to the institutional repository, Mendeley will be counting and we can access that data.  If we're depositing from Symplectic to the institutional repository, Symplectic's institutional deployment will be counting, and at least for the case of Cambridge where we'll be doing our test and pilot deployments, we can access that data.

Even better, by counting in Mendeley or in Symplectic, we can tell exactly which submissions come from our project rather than from anywhere else, so it's real data which will help us assess the project's success.
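
To make the idea of counting at the source concrete, here is a minimal sketch in Python. It assumes each deposit can be reduced to a date plus a label for the tool it came from; the record layout, the source labels and the function names are all made up for illustration, and are not what Mendeley or Symplectic actually expose.

```python
from collections import Counter
from datetime import date

# Hypothetical deposit records: (deposit date, originating tool).
# The layout and the source labels are illustrative assumptions only.
deposits = [
    (date(2010, 10, 4), "mendeley"),
    (date(2010, 10, 9), "symplectic"),
    (date(2010, 10, 9), "direct"),
    (date(2010, 11, 2), "mendeley"),
]

# Sources we would attribute to the project (assumed labels).
PROJECT_SOURCES = {"mendeley", "symplectic"}


def count_by_source(records):
    """Return a Counter mapping each source label to its number of deposits."""
    return Counter(source for _, source in records)


def project_deposits(records):
    """Count only the deposits that arrived via a project-attributed source."""
    return sum(1 for _, source in records if source in PROJECT_SOURCES)


counts = count_by_source(deposits)
project_total = project_deposits(deposits)
```

The point of separating the two functions is that the per-source breakdown stays useful even if we later change our minds about which sources count as "ours".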



What is success anyway?
For the purposes of this discussion, we're only going to consider success in terms of counting things. 

We could easily count deposits during the project. We'd end up with some deposit counts from our trial code work, and some deposit counts from our pilot version (coming later).  

I think that these counts are useful to the project internally, but less useful to everyone else. They don't really show the meaningful impact of our work because even during the pilot phase, we may still be ironing out bugs and improving the experience. Also, our work aims to make deposit an integrated part of research workflow on an ongoing basis, and people's initial use of our system is more likely to reflect experimentation than an ongoing engagement.

So, real project success in terms of deposit count will need to be monitored after the formal project ends. We are considering reviewing the deposit count over the 12 months after the end of the project, which would capture a reasonable embedding period and, conveniently, a full academic cycle. We have yet to decide what a good metric for success would be - double the existing annual rate of academic paper deposit? Do chip in with your thoughts in the comments.
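
As a rough sketch of how that 12-month comparison might be worked out, the Python below counts deposits in the year following a project end date and expresses them as a ratio of a baseline annual count. The function names are invented for illustration, the project end date is whatever you pass in, and "double the existing rate" is only the candidate threshold floated above, not a decided metric.

```python
from datetime import date


def one_year_after(d):
    """Return the same calendar day one year later (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # d is 29 February and the following year is not a leap year.
        return d.replace(year=d.year + 1, day=28)


def count_in_year_from(deposit_dates, start):
    """Count deposits dated on or after `start` and before one year later."""
    end = one_year_after(start)
    return sum(1 for d in deposit_dates if start <= d < end)


def rate_ratio(deposit_dates, project_end, baseline_annual_count):
    """Ratio of post-project annual deposits to the baseline annual count.

    A value of 2.0 would correspond to "double the existing annual rate".
    """
    if baseline_annual_count <= 0:
        raise ValueError("baseline_annual_count must be positive")
    return count_in_year_from(deposit_dates, project_end) / baseline_annual_count
```

Bear in mind the earlier caution: a ratio above 1 on repository-wide counts shows that deposits went up, not that the project caused it, so the same calculation is more telling when fed only the deposits counted in Mendeley or Symplectic.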

(You may spot that we haven't really talked about counting access to papers in the repository. That's a topic for another day, but also a slightly less relevant one for DURA.)
