The scientific names and classification of species of animals, plants, microbes and other organisms are subject to change as knowledge improves. Biologists in most fields, and professionals in other disciplines who work with species of organisms, therefore need a list of the accepted names of these species, including synonyms that may have been used, to ensure that the data they manage is complete and up to date. No single complete list yet exists, but one is under construction by the Species 2000 and ITIS Catalogue of Life consortium described below.

A significant long-term barrier to creating such a catalogue of life has been the lack of coordination of the information about known species, which is held in varied formats and distributed among taxonomic specialists world-wide. The Species 2000 Interoperability Coordination Environment (SPICE) architecture (http://biodiversity.cs.cf.ac.uk/spice) addresses this by providing a widely used database federation that accesses distributed databases containing authoritative taxonomic information. It provides users and client software with information about the scientific name and synonyms for every known organism, "common" names, key literature references, geographical distribution, etc.

The current SPICE architecture was developed through international collaboration in the EU-funded Species 2000 europa (EuroCat) project (2003-2006). It builds on software developed in a previous BBSRC/EPSRC-funded research project which ran until 2001. It comprises Global Species Databases (GSDs); wrappers that translate between GSD data models and SPICE's Common Data Model (CDM); a Common Access System (CAS) that assembles information from the wrapped databases; and a Web front end. Wrapper/CAS communication uses either a CORBA-based or an HTTP/XML-based protocol, and the CAS provides a Web Service interface through which other systems can access SPICE directly.

Cardiff led the software workpackage in the Species 2000 europa project, which addressed the architectural design and development of SPICE. The Cardiff team's research involved defining and implementing the CDM, the CAS and the wrapper software components, and defining and overseeing the development of the Web front end. The system can operate with 200+ autonomous databases, each controlled by its supplying institution and having its own structure. It is designed to provide current information as taxonomic knowledge evolves, to handle heterogeneity and to be scalable. Experience with previous prototypes led to experimentation with caching strategies and to the development of scalable CORBA-based and HTTP-based approaches, which have been empirically validated. The CORBA-based approach gives significantly better performance, but it is not ideal for use outside an intranet and so is no longer actively used.
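The division of labour between wrappers and the CAS can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names, the three-field record standing in for the CDM, and the two toy source databases are hypothetical and are not taken from the SPICE code base, and plain in-process calls stand in for the CORBA or HTTP/XML protocol.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class CdmSpecies:
    """Hypothetical, heavily simplified stand-in for a CDM record."""
    accepted_name: str
    synonyms: tuple[str, ...]
    source_database: str


class Wrapper(Protocol):
    """What the hub expects of every wrapped database (assumed interface)."""
    def search(self, name: str) -> list[CdmSpecies]: ...


class LegumeWrapper:
    """Wraps a toy database that stores one dict per accepted species."""

    def __init__(self, rows: list[dict]) -> None:
        self._rows = rows

    def search(self, name: str) -> list[CdmSpecies]:
        wanted = name.lower()
        hits = []
        for row in self._rows:
            names = [row["valid_name"], *row["syns"]]
            if any(n.lower() == wanted for n in names):
                hits.append(
                    CdmSpecies(row["valid_name"], tuple(row["syns"]), "LegumeDB")
                )
        return hits


class FishWrapper:
    """Wraps a toy database of (name, status, accepted_name) tuples."""

    def __init__(self, records: list[tuple[str, str, str]]) -> None:
        self._records = records

    def search(self, name: str) -> list[CdmSpecies]:
        wanted = name.lower()
        accepted = {acc for (n, _status, acc) in self._records if n.lower() == wanted}
        hits = []
        for acc in sorted(accepted):
            syns = tuple(
                n for (n, status, a) in self._records
                if a == acc and status == "synonym"
            )
            hits.append(CdmSpecies(acc, syns, "FishDB"))
        return hits


class CommonAccessSystem:
    """Toy hub: forwards a query to every wrapper and pools the CDM records."""

    def __init__(self, wrappers: list[Wrapper]) -> None:
        self._wrappers = list(wrappers)

    def search(self, name: str) -> list[CdmSpecies]:
        results: list[CdmSpecies] = []
        for wrapper in self._wrappers:
            results.extend(wrapper.search(name))
        return results


# Illustrative data only; each source keeps its own native structure.
LEGUME_ROWS = [{"valid_name": "Vicia faba", "syns": ["Faba vulgaris"]}]
FISH_RECORDS = [
    ("Salmo trutta", "accepted", "Salmo trutta"),
    ("Salmo fario", "synonym", "Salmo trutta"),
]

cas = CommonAccessSystem([LegumeWrapper(LEGUME_ROWS), FishWrapper(FISH_RECORDS)])
for record in cas.search("Faba vulgaris"):
    print(record.accepted_name, "from", record.source_database)
```

The point of the sketch is that the hub never sees a source's native structure: each wrapper alone knows how its database is organised, so adding a new database means writing one new wrapper rather than changing the hub.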

SPICE supports individual users searching for species information interactively using the Web, but it also supports organisations using its Web Services to obtain a "taxonomic backbone" for their own services. These organisations include GBIF (Global Biodiversity Information Facility), the Encyclopedia of Life and SpeciesBase programmes, the Consortium for the Barcode of Life, GenBank and the World Conservation Union. Species 2000 has formed an agreement with ITIS (Integrated Taxonomic Information System) in North America to deliver "The Catalogue of Life (CoL)", combining Species 2000's catalogue and parts of ITIS' catalogue, using SPICE software. The Catalogue of Life currently covers over one million of the estimated 1.8 million known species, using data supplied by 50 institutions. The Species 2000 secretariat reports that the web portal currently receives 40 million hits per year, from tens of thousands of individual users.

The international Taxonomic Databases Working Group (TDWG), now known as "Biodiversity Information Standards", has recently developed a Taxon Concept Schema (TCS) and proposals for using Life Science Identifiers (LSIDs) within systems that exchange and use biodiversity data. These are very relevant to our work, and Cardiff University, as provider of SPICE, is among those organisations consulted in the development of these standards. Moreover, we are currently funded by TDWG/GBIF to develop a prototype version of SPICE and other software that is able to use these new standards. These developments will make possible interoperation with a wider range of biodiversity resources, and will provide a mechanism for tracking changes that occur due to instability in taxonomic nomenclature and new discoveries. A further outcome of this current work will be the opportunity for us to influence the development of TCS and of conventions for using LSIDs in biodiversity informatics, as TCS and the LSID conventions are still evolving.

You can use Spice from the following link to the Catalogue of Life Dynamic Checklist. Note that the search term expected is a scientific name for a species of organism, such as "Faba vulgaris", which the system will tell you is a synonym for Vicia faba, the broad bean. The name may be truncated, for example to include just the genus part. So, for example, searching for "Abrus" will list the names of the species in the genus, from which you can select one to view in more detail.
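The search behaviour described above can be sketched in a few lines. This is an illustration only: the in-memory checklist, the function name and the use of case-insensitive prefix matching for truncated names are assumptions, not the actual Catalogue of Life implementation.

```python
# Toy checklist: accepted name -> list of synonyms (illustrative data only).
CHECKLIST = {
    "Vicia faba": ["Faba vulgaris"],
    "Abrus precatorius": [],
    "Abrus pulchellus": [],
}


def search(term: str, checklist: dict[str, list[str]] = CHECKLIST) -> list[tuple[str, str]]:
    """Return sorted (matched_name, accepted_name) pairs.

    A name matches when it starts with the search term, ignoring case,
    so a truncated term such as a bare genus matches every species in it.
    """
    prefix = term.strip().lower()
    if not prefix:
        return []  # an empty term would otherwise match everything
    matches = []
    for accepted, synonyms in checklist.items():
        for name in [accepted, *synonyms]:
            if name.lower().startswith(prefix):
                matches.append((name, accepted))
    return sorted(matches)


print(search("Faba vulgaris"))  # the synonym is paired with its accepted name
print(search("Abrus"))          # a genus-only term lists the species in the genus
```

When the matched name differs from the accepted name in a returned pair, the matched name is a synonym, which is how a query for "Faba vulgaris" leads to Vicia faba.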

The web interface you see when using the link above was written by ETI at the University of Amsterdam. It uses a Web Service to communicate with the actual Spice software written in Cardiff and running on a server at the University of Reading, UK. The resulting data sets that you see have come from a federated array of databases, via wrappers to achieve interoperability and a cache database to ensure availability.
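One way a cache can keep results available when a provider database is unreachable is sketched below. The strategy shown (query the live source first, fall back to the last cached answer on failure) is an assumption chosen for illustration; the text above states only that a cache database is used to ensure availability, not how it is populated or consulted. The class and function names are hypothetical.

```python
from typing import Callable


class CachedSource:
    """Wraps a live lookup function and remembers its last good answers."""

    def __init__(self, fetch: Callable[[str], list[str]]) -> None:
        self._fetch = fetch
        self._cache: dict[str, list[str]] = {}

    def lookup(self, name: str) -> tuple[list[str], bool]:
        """Return (result, served_from_cache)."""
        try:
            result = self._fetch(name)
        except ConnectionError:
            # Provider unreachable: serve the cached copy if there is one.
            if name in self._cache:
                return self._cache[name], True
            raise
        self._cache[name] = result  # refresh the cache on every success
        return result, False


# Simulated provider that can be switched off (illustrative only).
provider_state = {"up": True}


def fetch_from_provider(name: str) -> list[str]:
    if not provider_state["up"]:
        raise ConnectionError("provider database unreachable")
    return ["Vicia faba"] if name == "Faba vulgaris" else []


source = CachedSource(fetch_from_provider)
print(source.lookup("Faba vulgaris"))  # live answer
provider_state["up"] = False
print(source.lookup("Faba vulgaris"))  # same answer, now from the cache
```

A query that was never answered while the provider was up still fails in this sketch; a production system would more likely fill its cache in advance so that every name remains answerable.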

Spice is the software which provides the Common Access System (CAS) for Species 2000. In other words it implements a hub, which gathers data from the providers, integrates it and makes it available to users and other software through interfaces.

There are currently two hubs operating with the Spice software. One is the Global hub, which integrates data from Global Species Databases (GSDs) and delivers the Species 2000 Dynamic Checklist. The other is the European hub, which integrates the three main European species checklist databases (Fauna Europaea, Euro+Med PlantBase and ERMS, the European Register of Marine Species) into a regional species checklist for Europe. This is seen as a prototype for further regional hubs in other parts of the world.

Species 2000 operates a working federated biodiversity information system to deliver a Catalogue of Life, consisting of basic information about all species of organisms together with the hierarchy in which they are arranged. Species data providers interoperate in this federation according to a number of conventions and principles. These protocols and standards are intended to be open and available for others to use when building similar federated information systems. They are described further in Species 2000 data standards.

