DSpace@Cambridge Project Proposal
The following edited extracts are taken from the original project
proposal submitted by Cambridge University Library and the MIT Libraries
to the Cambridge-MIT Institute (CMI). The proposal was submitted in July
2002 and funding was awarded in November 2002.
DSpace@Cambridge: a project to extend DSpace into Cambridge University
and the United Kingdom
1. Summary of project
Cambridge University Library, in conjunction with Cambridge University
Computing Service, joins the DSpace federation initiative, and together
with the MIT Libraries develops methods to collect, preserve, and make
accessible digital content which is used in undergraduate programmes and
other activities; and maximises the value of existing educational assets
through the implementation of a long term digital preservation strategy.
Strategies will be established regarding deployment costs, intellectual
property rights issues, organisational change, institutional governance,
and policy issues, based on the proposed implementation, and these lessons
will be applied in support of the shared goals of MIT and Cambridge University,
in order to establish a digital object preservation service in Cambridge
University.
2. Objectives of project
The project will focus on developing a system to support the storage
and use of digital materials for undergraduate learning and other purposes.
It will thereby contribute to the following CMI themes relating to the
evolution of future technology:
• the development of new modes of teaching and new uses of technology
to enhance learning;
• the continuing revolutions in information technology and the intelligence
sciences;
• understanding the nature and impact of new digital media;
• the understanding, design and operation of large-scale, complex systems;
• blending technology, management, economics, policy, and education.
It will pursue these objectives through specific goals:
a. Identify needs within the institution for digital asset management
services, and a deployment strategy to satisfy these needs.
b. Establish DSpace at Cambridge to preserve learning materials, internally
produced digital objects, and externally published digital objects, and
to enable their appropriate use.
c. Demonstrate applicable Cedars and CAMiLEON approaches within an operational
DSpace context. Share technologies between institutions.
d. Work with other institutional entities to agree on standards applicable
for the deposit of educational materials in a digital repository (AMPS
and CARET, for example). Collaboratively develop interoperability standards
with Learning Management Systems using emerging standards from the Open
Knowledge Initiative work, or other similar initiatives.
e. Position the institutions to deliver archived educational digital
assets through emerging Learning Management Systems.
f. Leverage the business model created at MIT to produce the Cambridge
University business model, including costs, funding streams, and institutional
structures, in co-operation with the business schools at both institutions.
g. Inform national digital repository strategies.
h. Produce pragmatic approaches that enable practical solutions to be
applied to rights management issues.
3. Overview
3.1 DSpace is a digital repository system that the MIT Libraries and
Hewlett-Packard have built as a collaborative project. Its purpose is
twofold: "to create and establish an electronic system that captures,
preserves and communicates the intellectual output of MIT's faculty and
researchers"; and to extend the DSpace technology into "a federation
of systems [that] makes available the collective intellectual resources
of the world's leading research institutions". DSpace functions as
an institutional repository for organizations that must capture, manage,
and preserve digital material in a variety of formats for a variety of
purposes. DSpace was designed from the outset to be an “open source”
system that could be freely given to other institutions, and the MIT Libraries
plan to develop a federation of institutions using DSpace to solve a
number of problems:
• Helping make the system more useful and sustainable over time
by collaborating on its growth and development with other interested institutions.
The open source model, and the federation contract, will ensure that useful
developments made by one institution become part of the basic DSpace system
and available to the rest of the federation. This will allow the system
to advance far more rapidly than would be possible at just one institution.
• Gaining more influence in important policy and standards development,
and in the development of the commercial sector by combining a significant
number of influential organizations into one voice with a common goal.
The DSpace federation will provide an important testbed in the area of
digital libraries, archives, and repositories, and should become one of
the major influences on future developments in these areas.
• Developing new solutions to digital collection management, preservation,
and scholarly communication by supporting interoperation among institutions
running DSpace. This includes functions such as “virtual”
collections or publications distributed across several institutions, cross-institutional
searching, and distributed services that federation members can take advantage
of (e.g. data cleanup or enhancement, format migrations for preservation,
and so on).
3.2 The initial target institutions for DSpace are other research libraries,
cultural heritage institutions, and government agencies that have a mandate
to support research and research-based education using digital materials.
The DSpace federation is being defined by the MIT Libraries in consultation
with many other institutions that are interested in deploying it, and
will be fully defined during the fall of 2002. The system will be made
generally available as Open Source in the fall of 2002.
3.3 As DSpace becomes a robust, scalable, production quality system at
the MIT Libraries, we are beginning to see the actual demand and potential
for the system in a number of areas.
3.3.1 Scholarly Communication
• The SPARC organization, an alliance of universities, research
libraries, and organizations including both the MIT Libraries and Cambridge
University Library, is a constructive response to market dysfunctions
in the scholarly communication system. These dysfunctions have reduced
dissemination of scholarship and constrained libraries. SPARC serves as
a catalyst for action, helping to create systems that expand information
dissemination and use in a networked digital environment while responding
to the needs of scholars and academe. One of its current priorities
is the promotion of institutional repositories, and it has identified
DSpace as a leader in that category. The MIT Libraries are in discussion
with SPARC about an active collaboration to promote DSpace internationally
as a means of advancing SPARC's agenda.
• DSpace supports the Open Archives Initiative’s Protocol
for Metadata Harvesting, a new NISO standard that promotes the ability
to create “virtual” catalogs and collections of material from
distributed institutions. It has been applied in a number of domains (especially
preprint archives) and is a promising technology for solving some of the
problems of institutionally-based publishing. OAI is a component of the
recent JISC FAIR programme in the UK, and the MIT Libraries are in discussions
with a UKOLN FAIR project that is interested in developing e-print archives
for the UK about possible collaborations. The MIT Libraries are also in
discussions with the University of Illinois, who have developed a set
of OAI harvesting tools under a grant from the Andrew W. Mellon Foundation,
about the possibilities for integrating OAI further in DSpace to support
new types of publications.
• MIT Libraries are discussing a relationship with the Electronic
Publishing Initiative at Columbia University (EPIC) about developing a
common set of publishing tools that would be integrated with DSpace in
support of new publishing ventures such as faculty-based journals or “e-communities”.
Both the MIT Press and Cambridge University Press are interested in participating
in such an initiative.
3.3.2 Digital Preservation
• DSpace provides both a secure repository on which to build a
digital preservation program and a large test bed of digital material
on which to perform badly needed research in this area. As an institutional
repository, DSpace accepts digital material in non-standard formats including
both proprietary formats (e.g. Microsoft Word or Real Networks RealAudio
files) and one-time items (e.g. a program written by a faculty member
for a physics visualization used in a course). These are, of course, very
fragile and difficult to preserve.
• Cambridge University Library has been closely involved in a number
of important digital preservation initiatives in the UK, and is an acknowledged
leader in this field with several JISC initiatives completed already.
CUL was one of three main partners in the Cedars project. It is closely
linked with the CAMiLEON project, and plans to work closely with the SHERPA
project (part of JISC's FAIR programme), the remit of which is to investigate
e-print archive clusters. CUL is also collaborating with other legal deposit
libraries to prepare for the likely extension of UK legal deposit legislation
to cover electronic resources. All these initiatives will inform the development
of DSpace.
• DSpace also shows potential for solving the difficult problem
of archiving web sites. Both MIT and Cambridge University produce numerous
web sites of significant and persistent value, especially in the area
of course web sites that support undergraduate education, and their libraries
are being asked to help develop the technologies and procedures to preserve
them. A recent community standard of the US Digital Library Federation
called METS (Metadata Encoding and Transmission Syntax) shows promise
in how to accomplish this, and MIT Libraries are represented on the editorial
board for the standard to ensure that it develops in this direction. METS
will be adopted broadly both in the US (e.g. the Library of Congress,
University of California, Harvard and others, including the MIT Libraries),
and the UK (the British Library, UKOLN, and Oxford are all committed to
its adoption).
3.3.3 Policy and Standards Development
• The preceding discussion of METS is an excellent example of where
a DSpace federation can have an impact on standards development and adoption.
There are many others, such as the choice of and support for
a standard language for intellectual property rights management such as
the international Open Digital Rights Language (ODRL) initiative. If the
DSpace federation implements one such standard then it will go a long
way towards driving its adoption by other stakeholders in the digital
library and institutional repository fields.
• Also in the area of technical standards adoption are Learning
Management Systems. MIT is the lead institution for an important project
funded by the Andrew W. Mellon Foundation to design a standard, open architecture
for Learning Management Systems called the Open Knowledge Initiative (OKI),
to which Cambridge University is also an important contributor. The OKI
project team at MIT is collaborating closely with the Libraries on the
question of what services the DSpace institutional repository can offer
the campus Learning Management System, both to support current courses
and the ongoing pedagogical use of course material. Together with other
OKI participants, we are developing the standards for how these systems
will interoperate.
• An example of a non-technical aspect of this is in the area of
copyright policy. Institutions are struggling with developing local policy
for copyrighted material, both in the sense of faculty who publish in
commercial or scholarly society journals and in the sense of reusing copyrighted
materials for research and educational purposes. DSpace is positioned
for both of these activities, and can help drive policy discussions at
each institution. Then the federation can collectively determine if there
are best practices in this regard, and if it should develop standards
across the federation (and by implication, beyond) for copyright policy.
3.3.4 Research
• DSpace has always been envisioned to be a research platform in
a number of areas, some of which have already been touched on (e.g. digital
preservation, rights management). Another area is that of metadata support
in the context of the Internet and World Wide Web. MIT Libraries have
been funded to work together with the Hewlett-Packard Corporation (in
particular their research labs in Bristol, UK), the W3C, and an MIT researcher
from the Lab for Computer Science on the problem of implementing new W3C
standards for semantic interoperability of metadata (the Semantic Web
Activity) in the context of real institutional repositories such as DSpace.
The result of that project will be a solid understanding of the degree
to which DSpace can support complex community-defined metadata schemas
and how well they will interoperate to support resource discovery across
extremely disparate material. We will also be examining how to extend
DSpace to support the use of Web Services for distributed collection management,
the goal being to reduce costs and other barriers to maintaining DSpace
collections over time.
3.4 As the first DSpace federation partner, the Cambridge University
Library brings to the DSpace project proven expertise in digital preservation
techniques and an international perspective, and will be a strong partner
in collaborations with separately funded Learning Management System projects
such as OKI. CUL will also provide expertise and leadership in the UK,
both in advancing the DSpace federation programme and in contributing
to the development of UK digital preservation policies through initiatives
such as JISC programmes and the Digital Preservation Coalition.
4. Objectives
4.1 The DSpace@Cambridge project seeks to ensure, firstly, that the collective
digital intellectual resources of Cambridge University are systematically
captured, preserved, and made accessible for both present-day and long-term
use; and secondly, that the knowledge gained in developing this service
for Cambridge University can be disseminated as a model for adoption by
other UK universities.
4.2 Digital materials are increasingly being created and used in Cambridge
University for teaching, research, and related activities. These materials
may be either "born digital" or a digitised version of analogue
material. While substantial resources are being directed towards the creation
of such material, comparatively little effort has so far been directed
towards ensuring (a) that related materials from different sources can
be brought together and made immediately available in a co-ordinated manner
to users throughout the University; and (b) that digital materials can
be stored and retrieved in the longer term.
4.3 Cambridge University Library's established primary role is to provide
the University with a central repository of knowledge supporting the University's
teaching, research, and related activities, both in the present day and
for the long-term future. In this role it has acquired, preserved, and
made accessible both the intellectual output of the University and the
output of 3000 years of intellectual activity elsewhere, in printed and
manuscript form. Through its status as a legal deposit library it has
a similar, broader, responsibility to the national and international academic
community.
4.4 The greatest challenge facing the University Library now is to ensure
that it can continue to fulfil these roles for digital material - which
will include not only text but also images, audio, video, datasets, and
multimedia formats for which there is no print alternative - as it has
done hitherto for traditional printed and manuscript materials. To do
so, it must provide the University and the broader community with a managed
preservation and delivery service for digital resources. DSpace offers
a means of fulfilling this responsibility.
4.5 The Project's broad objectives are:
• to develop DSpace as a service managed by Cambridge University
Library for the University as a whole, for the acquisition, preservation,
and delivery of digital material;
• to further develop DSpace, together with the MIT Libraries, to
support new instructional technologies such as Learning Management Systems
in support of the teaching mission of the university;
• to provide a UK exemplar of DSpace in support of the MIT Libraries’
efforts to build an international federation of DSpace users, and to relate
DSpace technology, organisational procedures, and rights management issues
to other digital preservation initiatives in the UK which inform national
digital preservation strategies.
4.6 By federating, in a UK context, with DSpace at MIT Libraries, Cambridge
will be able to develop a "sustainable long-term digital storage
repository that provides an opportunity to explore issues surrounding
access control, rights management, versioning, retrieval, community feedback,
and flexible publishing capabilities". Cambridge University Library
will itself bring to this development the knowledge and experience it
has gained as a partner in the Cedars Project, which has focussed on specific
aspects of digital preservation such as rights management, metadata, and
organisational issues. It will capitalise on relevant outcomes from the
related CAMiLEON Project investigating emulation as a digital preservation
strategy. The CU Librarian is a member of the UK librarians' and publishers'
joint committee on voluntary legal deposit of electronic resources, through
which CUL will be exploring the long-term preservation of legal deposit
materials.
4.7 Material for populating DSpace in the context of Cambridge University
will potentially come from a variety of both internal and external sources,
including, for example:
• learning materials created by CARET, by teaching departments,
and by individuals
• papers written by teaching and research staff
• technical reports
• theses
• databanks
• administrative documents (the official University archive)
• personal files, working notes, etc.
• purchased digital texts
• bibliographic datasets
• hosting of e-journals published by Cambridge academics
• legal deposit of digital publications
4.8 The Project will concentrate initially on learning materials and
other internally-produced objects, while mindful of the need to develop
a repository that can handle other types of material as well. These materials
will be received in different formats, and we will examine whether the
repository should be concerned with preserving the content or the physical
format or both.
4.9 The value of DSpace will derive, not simply from its ability to preserve
digital material in a retrievable form, thus relieving departments and
individuals from the need to make their own arrangements for storage and
retrieval, but also from the added value it can impart to its output. For example:
• There will be an emphasis on indexing of digital material, both
to identify the object and to enable searching across objects for specific
content. This will greatly enhance DSpace's potential as a tool for resource
discovery, and will promote the exchange of ideas and co-operation in
the production of learning materials.
• With additional development, DSpace could provide a publish-on-demand
service, e.g. for the distribution of committee papers and other current
business documents to defined audiences; and for the production of lecture
handouts.
• Access to DSpace materials may require secure authentication of
users, utilising the University Card or other forms of accreditation for
this purpose.
4.10 Cambridge University's DSpace service will offer all members of
the University the resources they require to archive and make available
the digital objects they create in the course of their work (although
not necessarily reproducing the original context and functionality of
certain materials, such as administrative records that were available
from an SAP system or records management system). The Project will place
particular emphasis on community support for the service among affiliates
of the CMI programme, developing close working relationships with key
producers and managers of digital materials within each university.
4.11 In Cambridge the key central producers and managers will include:
• The Computing Service. The Computing Service provides the infrastructure
of computing facilities and related services in support of research and
teaching in the University of Cambridge.
• The Centre for Applied Research in Educational Technologies (CARET).
CARET provides Cambridge University with a central focus for online education,
with the remit to facilitate cross-disciplinary collaborations, identify
and evaluate the innovative use of technology, promote the use of teaching
technologies across all educational sectors, and address applied research
questions to improve teaching and learning.
• The Educational Technology Service (ETS). The ETS is a new central
service being created jointly by CARET and the Computing Service to support
the deployment and delivery of online educational environments and new
educational resources throughout the University, working directly with
staff in Departments and Colleges to provide the necessary support, guidance
and training that they require to implement new technology-based teaching
for their own courses. In the immediate future, the service will be delivered
through a variety of commercial and open source products which will allow
departments to begin to use these immediately. Over the next eighteen
months the ETS expects to deploy a new open source Learning Management
System based on the Open Knowledge Initiative at MIT and anticipates that
it will migrate to this in the future.
• The Management Information Services Division. The MISD provides
and maintains administrative computing systems for the Central Administrative
offices, including Payroll, Personnel, Student Records, the University
Finance System, and the central administration's web systems. Its responsibilities
include security of data and of equipment, development and support of
network services, database and systems management, and backup and recovery
facilities.
• The Press & Publications Office. The PPO's main activities
include managing the editorial policy of the central administration's
web site, and running the University's online news and events service.
• Cambridge University Press. Cambridge University Press, an integral
part of the University, will in future publish in new formats and media
and invest in technological change to improve its production, distribution
and information systems.
4.12 These relationships will seek to establish agreed standards for the
creation, deposit and use of learning materials. At both MIT and Cambridge,
DSpace is being considered as a repository of record to house OKI-compliant
educational materials. The Open Knowledge Initiative (OKI), a collaboration
funded initially by the Andrew W. Mellon Foundation and led by MIT and
involving Cambridge University and six other US institutions, is developing
a scalable and sustainable open-source reference system for internet-enabled
education. It aims to create an architectural platform that addresses
key management functions in assembling, delivering and accessing educational
resources, and then to encourage the development of applications that
use this platform. The Project will additionally develop a capability
for the production of metadata packages and filters for types of material
(representation networks), especially for the OKI architecture.
4.13 In its initial phase the Project will develop a defensible cost
model for the installation and support of DSpace within Cambridge University.
It will subsequently develop a sustainable business model for establishing
a self-supporting long-term service, and appropriate alternative exit
strategies. These models will be produced in collaboration with the Judge
Institute of Management Studies.
4.14 The Project will propagate its findings and experience across other
UK universities. The results of the Cambridge project, building on those
gained from the original MIT Libraries/Hewlett-Packard collaboration,
will be disseminated through the National Competitiveness Network, and
through other conferences and publications. There will also be extensive
consultation with other UK initiatives in this field, notably those of
the JISC DNER such as the Digital Preservation Coalition, FAIR (Focus
on Access to Institutional Resources) Programme, SPARC Europe, and the
AHDS (Arts and Humanities Data Service). Consultation will ensure that
the DSpace project complements and helps to advance these initiatives.
4.15 There may be an opportunity to explore, in collaboration with Nature
Publishing Group, the role DSpace@Cambridge might play in providing perpetual
archives of scientific material that has been published on the web, and
in addressing issues surrounding continued access to such material.
4.16 To ensure that DSpace fulfils its intended role, there will need
to be a University-wide programme of communication and consultation to
raise awareness of the issues surrounding digital storage, develop a set
of corporate procedures to encourage use of the service, and establish
an agreed institutional strategy for digital preservation.
4.17 The Project will identify short term and long term measures of success,
using a combination of user evaluation, technical evaluation by peer review
and formal testing, assessment by external experts, and a formal reporting
mechanism in keeping with the CMI project evaluation requirements.
5. Outcomes and deliverables
5.1 An understanding of the need for digital asset management, and an
institutional strategy that will meet these needs within the Cambridge
University context.
5.2 An installed and supported DSpace system customised to the Cambridge
environment, managed by Cambridge University Library, and federated between
Cambridge and MIT.
5.3 A business model for running DSpace at Cambridge University based
on knowledge gained through the development of the MIT business model.
It will include costs, funding streams, institutional structures, and
exit strategies, in co-operation with business consultants familiar with
both institutions.
5.4 An understanding of the localization issues with DSpace in a new
institution. The project will spend considerable time exploring which
changes to the DSpace systems are localizations, of value only to the
particular institution implementing the system, and which are enhancements
to the system that would be of general interest and value to members of
the DSpace federation. This exploration will help MIT Libraries formulate
the contractual requirements for future DSpace federators, and understand
how to manage distributed enhancements to a centrally developed and supported
system.
5.5 Investigation of the application of digital preservation recommendations
such as those of Cedars and CAMiLEON within an operational DSpace context,
and the appropriate and necessary system developments to support these.
5.6 The formulation, in collaboration with other institutional entities,
of standards applicable for the deposit and use of educational materials
in a digital repository, working collaboratively with the Open Knowledge
Initiative and other such initiatives. The ability to apply archived digital
assets for delivery via emerging Learning Management Systems.
5.7 Pragmatic solutions to rights management issues.
6. Project management and evaluation
6.1 Management
The operational organisation of the project will be overseen
• at CUL by a project manager who will report on a day-to-day basis
to the CUL project advisor.
• at MIT by a project liaison who will report on a day-to-day basis
to the MIT project advisor.
The project will be directed by a Project Management Committee, which
will monitor progress on the project and provide direction on a
day-to-day basis. The Committee will be chaired by the project manager,
and its membership will include members of the project team and key staff
from CUL and MIT Libraries.
Strategic direction will be provided by a Project Advisory Board, meeting
at six-month intervals throughout the project, which will provide general
guidance and advice and will contribute to the evaluative process. It
will consist of both Project Leaders, at least one other senior staff member
each from CUL and MIT Libraries, two external UK members, and two external
US members. Meetings of the Project Advisory Board will be convened by
the Project Manager on the schedule indicated.
Other groups may be set up as required by the Management Committee or
the Advisory Board to provide additional guidance on specific aspects
of the project.
6.2 Cambridge University Library
• The Cambridge project leader will be the University Librarian.
• The CUL project advisor will be the Senior Under-Librarian responsible
for digital library co-ordination.
6.3 MIT
• The MIT project leader will be the Director of Libraries.
• The MIT project advisor will be the Associate Director for Technology
and DSpace Project Director.
• The MIT aspects of the project will be managed by a dedicated
full-time project liaison based in the MIT Libraries.
6.4 Evaluation
Progress reports to CMI will be compiled and submitted at six-month
intervals from the start of the project. These will include an assessment
of the effectiveness of the management structure, a review of work to date,
and an assessment of progress towards the project objectives.
Technical evaluation will be carried out against the project objectives:
• by a formal test programme involving CU users as well as selected
other UK and US institutions.
• by peer review using external experts forming a technical group
established for the purpose.
Summative evaluation will be carried out through:
• the final project report to CMI
• dissemination and feedback through the National Competitiveness
Network
• dissemination through other workshops and publications
7. Methodology
7.1 The project is divided into three phases that together will run for
three years. We begin by doing detailed project planning and hiring critical
staff members, continue by implementing the DSpace system at the Cambridge
University Library and planning for its transition to operational status
there, and conclude by running a pilot project to deploy the system at
the Cambridge University Library and studying its impact and operational
requirements.