Third Meeting, Part Two, Amsterdam, 1-2 December 1999



Attendance: George MacKenzie, Rob Mildren, NAS; Jaap Kloosterman, IISG; Goran Kristiansson, Lena Wilhelmsson, RA; Peter Horsman, outside expert.

1. System Profiles

The aim of the audit of IT systems in this workpackage is to identify those aspects that will permit effective exchange of data between systems and with outside users, which is the ultimate aim of the project. It was agreed that one way of expressing this would be to compile a set of criteria for future participation in the EUAN project. A list was drafted for discussion and is given in Annex 1. This will be developed into part of the specification for the prototype.

2. Computing Approaches

2.1 The meeting identified three possible approaches, which fall into two basic types: the centralised and the de-centralised.

2.2 The de-centralised approach would be based on the Z39.50 protocol, and involve sending a query around a variety of databases. For this to work, each database must be set up to receive and process the query.
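The query fan-out described in 2.2 can be sketched as follows. This is a minimal illustration only, not a Z39.50 implementation: the function names (federated_search, make_target) and the sample record titles are invented for the example, and each partner database is represented by a simple in-memory search function.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for one partner database: takes a query string and
# returns matching record titles. A real deployment would speak Z39.50 here.
SearchFn = Callable[[str], List[str]]


def federated_search(query: str, targets: Dict[str, SearchFn]) -> Dict[str, List[str]]:
    """Send the same query to every target and collect the hits per target."""
    results: Dict[str, List[str]] = {}
    for name, search in targets.items():
        try:
            results[name] = search(query)
        except Exception:
            # A target that cannot process the query contributes no hits
            # instead of failing the whole search.
            results[name] = []
    return results


def make_target(records: List[str]) -> SearchFn:
    """Build an in-memory target doing a case-insensitive substring match."""
    def search(query: str) -> List[str]:
        return [r for r in records if query.lower() in r.lower()]
    return search


# Invented sample data, for illustration only.
targets = {
    "NAS": make_target(["Court of Session records", "Burgh charters"]),
    "RA": make_target(["Royal charters", "Parish registers"]),
}
hits = federated_search("charters", targets)
```

The sketch makes the dependency in 2.2 visible: every target must supply its own working search function, otherwise it simply returns nothing.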

2.3 The centralised approach would, by contrast, bring together a subset of the data from the different databases into a text index, using software, such as Fulcrum or Optosof, already used in the library world. This index would be used for searching, rather than the databases themselves. The text index would be based on agreed data elements, conforming to the 13 core ISAD elements identified in the archival workpackages. Updating the text index could be achieved in one of two ways.

2.3.1 In model 1, the individual system databases would export a set of data at regular intervals in an agreed format to the text index. This is illustrated in Annex 2.
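As a rough sketch of model 1, the central text index could be built from the exported records along the following lines. The record layout (institution, record identifier, text) and the names build_index and search are assumptions made for the example; the agreed export format and the actual indexing software are not specified here.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical export record: (institution, record id, text taken from the
# agreed data elements).
Record = Tuple[str, str, str]
Index = Dict[str, Set[Tuple[str, str]]]


def build_index(exports: List[Record]) -> Index:
    """Map each lower-cased word to the (institution, record id) pairs containing it."""
    index: Index = defaultdict(set)
    for institution, record_id, text in exports:
        for word in text.lower().split():
            index[word].add((institution, record_id))
    return index


def search(index: Index, word: str) -> List[Tuple[str, str]]:
    """Look the word up in the index only; the source databases are not contacted."""
    return sorted(index.get(word.lower(), set()))


# Invented sample export, for illustration only.
exports = [
    ("NAS", "GD1", "Estate papers and charters"),
    ("RA", "SE-1", "Royal charters"),
]
index = build_index(exports)
matches = search(index, "Charters")
```

Searching touches only the index, so results are as current as the most recent export.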

2.3.2 In model 2, a web crawler would regularly search the individual systems, extracting new data and copying it to a series of searchable pages. This is illustrated in Annex 3.

2.4 Both the centralised and de-centralised approaches have advantages and disadvantages.


De-centralised (Z39.50)

For:
  • well established, mature standard
  • widely used in library and museum world
  • accesses actual data, which is therefore fully up to date

Against:
  • old technology, using low level programming language
  • complex to implement, requiring specialized skills
  • expensive to develop and maintain systems


Centralised (Text Index)

For:
  • ease of handling data
  • uses established formats (MARC-AMC, EAD)
  • easy to maintain
  • well established software (Fulcrum etc.) available
  • allows flexibility among institutions taking part

Against:
  • user is looking at a subset of the data, which may not be fully up to date, rather than at the database
  • need to update regularly
  • possibility of data redundancy
  • danger from changes at institutional level
  • data inconsistencies (i.e. it is easy to add new data, but it will not recognise deletions)

2.5 Intelligent agent software can be used to enhance either the text index or the web crawler model. In the centralised approaches, a suitable exchange format needs to be agreed. It was agreed that MARC-AMC or EAD would be suitable as a data standard and could be implemented by IISG and RA. It was also agreed that Unicode could be used for character sets. Dates should be exported in ISO 8601 format, though the display format would be set either by the user's web browser or by the user directly.
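The separation between export format and display format for dates might look like the sketch below. The function names, the record fields and the two display styles are assumptions for the example, as is the choice of UTF-8 as the Unicode encoding; the minutes specify only ISO 8601 and Unicode.

```python
import json
from datetime import date


def export_date(d: date) -> str:
    """Export a date in ISO 8601 calendar form (YYYY-MM-DD)."""
    return d.isoformat()


def display_date(iso_text: str, style: str = "dmy") -> str:
    """Render an exported date for display; the style is the user's choice."""
    d = date.fromisoformat(iso_text)
    if style == "dmy":
        return d.strftime("%d/%m/%Y")
    if style == "mdy":
        return d.strftime("%m/%d/%Y")
    return iso_text  # fall back to the exported form


# Hypothetical record with a non-ASCII name, serialised as UTF-8 (assumed encoding).
record = {"creator": "Göran", "date": export_date(date(1999, 12, 13))}
payload = json.dumps(record, ensure_ascii=False).encode("utf-8")
```

The exported value stays the same whichever display style a user selects.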

3. Authorities

3.1 It was agreed that authorities represent a potential problem. In each partner institution, there are de facto authorities for place names and for personal and organizational names, which follow institutional or national rules. When descriptions are brought together at the EUAN project level, these authorities may differ from one another. The risk increases with the number of partners in EUAN and with the level of detail of the descriptions included. This is really a question for the archival group, but it could have an impact on the technical implementation. It was concluded that, with the current five partners and top level descriptions only, the problem is manageable.

3.2 Two means of dealing with it were identified:

  • using the Swedish model, in which the National Archives and the National Library each compile their own authorities and then meet periodically to exchange and co-ordinate them;
  • using a pragmatic approach in which an authority is always sought at the next level up, and if a suitable one is not found, one is created and then passed up.
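One possible reading of the pragmatic approach is sketched below. The three-level hierarchy (institution, national, project), the class and function names, and the simple lower-case matching rule are all assumptions for illustration; the minutes do not define the levels or how names are compared.

```python
from typing import Dict, Optional


class AuthorityLevel:
    """One level of authority control, with an optional level above it."""

    def __init__(self, name: str, parent: Optional["AuthorityLevel"] = None):
        self.name = name
        self.parent = parent
        self.entries: Dict[str, str] = {}  # normalised key -> authorised form


def find_authority(level: AuthorityLevel, name: str) -> Optional[str]:
    """Look for the name at this level, then at each level above it."""
    key = name.strip().lower()
    current: Optional[AuthorityLevel] = level
    while current is not None:
        if key in current.entries:
            return current.entries[key]
        current = current.parent
    return None


def find_or_create(level: AuthorityLevel, name: str) -> str:
    """Reuse an existing authority if one is found; otherwise create one and pass it up."""
    found = find_authority(level, name)
    if found is not None:
        return found
    form = name.strip()
    current: Optional[AuthorityLevel] = level
    while current is not None:
        current.entries[form.lower()] = form
        current = current.parent
    return form


# Hypothetical hierarchy with one existing project-level entry.
project = AuthorityLevel("EUAN")
national = AuthorityLevel("national", parent=project)
institution = AuthorityLevel("institution", parent=national)
project.entries["edinburgh"] = "Edinburgh"
```

In this reading, a name already held higher up is reused unchanged, while a new name becomes available to every level above the one that created it.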

3.3 It was also agreed to consult the archival group further on this question. In the longer term, there may be a role for the European Commission in co-ordinating work on name authorities at a trans-national level.

3.4 The best vehicle for exchanging name authority data is likely to be a new Document Type Definition (DTD) based on ISAAR(CPF), which has been drafted by Daniel Pitti. P-G Ottosson in RA is currently testing the first draft.

13 December 1999

