...

Page Properties

Status

DECIDED

Description

Given our shared understanding that some level of duplication of bibliographic records for the same title exists in our system, how should title counts work?

Decision summary

Use counts of MMS IDs and accept possible overcounting. This is the best approach because record duplication is widely known and understood, and other methods of deduplication are too unreliable and inconsistent.

Owning group

AASAP Team sils-aasa-l@listserv.ucop.edu

Approver

Consulted

AASAP Team members

Resource Management and Acquisitions experts from UCB

Informed

Leadership Group

Resource Management Group

Decision-making process

Informal consultation among team members, expert consultation, discussion.

Priority

Target decision date

Date decided

...

Use MMS ID because it is easy and consistent, and because other options will result in undercounting. We will end up overcounting titles where there are multiple bibliographic records for the same thing, especially for electronic resources. However, there is both 1) widespread understanding of the limitations of this approach and 2) widespread acceptance of its utility.

...

Stakeholder group

Impact

UC Libraries

Determinations around what and how we report are for the most part managed/owned by the UC Libraries (i.e., shared ownership).

CDL

CDL analysts, who are responsible for constructing report queries at the Network Zone according to templates agreed upon by the UC Libraries, must exclude items based on a variety of parameters…

...

Background

System-wide, some duplication exists in title-level records. Cataloging rules provide guidance about when separate title-level records are required and when campuses should use the same bibliographic record to represent multiple items. However, in practice, bibliographic records are duplicated with some regularity, especially for electronic materials. For example:

  • CDL OA eBook title duplication.
    From CDL e-resources Acquisitions. Summarized email from cdlacq on Sept 21, 2022:

    • CDL turns on full yearly CZ collections.

    • If publishers add some open access titles to the yearly CZ frontlist collections, or give them to us when we receive MARC records for our package, then those titles may end up duplicating titles from the same publisher under a full Open Access collection that we have turned on in the NZ.

    • We determined that it is OK to leave these as they are for now, to save staff time and make collections available to users faster in the NZ, compared to the work it would take to clean them up now or before processing a new collection. This is not exactly a new issue. We may consider internally whether there are ways to mitigate this in the future, when we have more capacity.

    • Sample: Ex Libris Discovery - 9780806192116 (exlibrisgroup.com)

  • Vendor-supplied, non-provider-neutral, non-OCLC record duplication, like those from Cassidy cataloging for resources in Westlaw and LexisNexis.

    • 13,700 bibs in UCI Law Westlaw collection

    • 16,650 MMS IDs for Distinct count MMS ID where the Title + Author Combined and Normalized match

    • Example: separate, non-provider-neutral, non-OCLC bib records for every law review in Lexis & Westlaw

  • Electronic record duplication when campuses use different record sets. For example, UCI Law is using OCLC records for an eBook package without vendor-supplied records, which is time-consuming but gets us the OCLC updates. But another campus is using non-OCLC CZ records. Both of those sets of records have linked NZ bibs.

  • MARCIVE government document records. It’s known that there are duplicative records in the system.

Options Considered

ARL instructions for title counts:

...

Alma Options (Count distinct)

Notes

Title Normalized, separated by format

This undercounts unique titles because works with different authors and editions are not distinguished.

Title Author Combined and Normalized, separated by format

This undercounts unique titles because works in different editions are not distinguished.

Concat(Title Author Combined and Normalized, Edition), separated by format

This can undercount unique titles because generically-named titles that are actually different things do not have unique-enough metadata in these fields to distinguish them.

MMS ID

This overcounts unique titles.

This is especially true for electronic resources, where we are using bibliographic records of varying quality from many different sources in order to provide access.
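
The trade-offs among the four counting options can be illustrated with a small sketch. It is not the actual Alma Analytics query: the records are invented, the field names are hypothetical, and normalize() is a simplified stand-in for however Alma normalizes titles and authors. In the invented data, the electronic records represent five distinct titles (991 and 992 are duplicate bibs for one title, and 996 and 997 are different works that share a generic title).

```python
import re
from collections import defaultdict

# Invented sample records. Field names are hypothetical, not Alma column names.
RECORDS = [
    {"mms_id": "991", "title": "Annual Report", "author": "Smith, J.", "edition": "1st", "format": "electronic"},
    # 992 is a duplicate bib for the same title as 991 (e.g. a vendor record).
    {"mms_id": "992", "title": "Annual report.", "author": "Smith, J.", "edition": "1st", "format": "electronic"},
    {"mms_id": "993", "title": "Annual Report", "author": "Smith, J.", "edition": "2nd", "format": "electronic"},
    {"mms_id": "994", "title": "Annual Report", "author": "Jones, A.", "edition": "1st", "format": "electronic"},
    {"mms_id": "995", "title": "Annual Report", "author": "Jones, A.", "edition": "1st", "format": "physical"},
    # 996 and 997 are different works with the same generic title and sparse metadata.
    {"mms_id": "996", "title": "Proceedings", "author": "", "edition": "", "format": "electronic"},
    {"mms_id": "997", "title": "Proceedings", "author": "", "edition": "", "format": "electronic"},
]


def normalize(text):
    """Simplified stand-in for normalization: lowercase, drop punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def count_distinct(records, key_func):
    """Count distinct key values, separated by format."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["format"]].add(key_func(rec))
    return {fmt: len(keys) for fmt, keys in seen.items()}


# One key function per option in the table above.
KEYS = {
    "title": lambda r: normalize(r["title"]),
    "title_author": lambda r: normalize(r["title"] + " " + r["author"]),
    "title_author_edition": lambda r: normalize(
        r["title"] + " " + r["author"] + " " + r["edition"]
    ),
    "mms_id": lambda r: r["mms_id"],
}

counts = {name: count_distinct(RECORDS, fn) for name, fn in KEYS.items()}

for name, by_format in counts.items():
    print(name, by_format)
```

For the electronic records in this invented sample, the three normalized-field options return 2, 3, and 4 against a true count of 5, while MMS ID returns 6. The pattern matches the notes in the table: the field-based options undercount to varying degrees, and MMS ID overcounts by exactly the number of duplicate bibliographic records.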

...

How much concern about duplication exists in the system? In other words, do we already have widespread acceptance for possible overcounting based on shared and longstanding professional expertise about metadata limitations?
Answer: There is some concern but it tends to be context-specific. Systemwide, there does already exist a shared understanding among campus experts that 1) libraries regularly report record counts as an approximation for title counts, and 2) title count data is inherently imprecise.

...