Statistics: Dates for item counts
Status | In progress |
|---|---|
Description | Determine which Analytics field(s) should be used when counting items to ensure we include things we have and exclude things we don’t have. |
Decision summary | Use “Physical Items”.”Item Creation Date” to count physical items based on date. Use “E-Inventory“.”Portfolio Activation Date”.”Portfolio Activation Date” for electronic items. Adjust queries to account for migration artifacts and Alma date value inconsistencies when necessary. |
Owning group | AASAP Team sils-aasa-l@listserv.ucop.edu |
Approver | |
Consulted | AASAP Team members consulted locally. |
Informed | Leadership Group |
Decision-making process | |
Priority | |
Target decision date | Apr 24, 2023 |
Date decided | |
Recommendations
All materials: account for Alma date variable quirks. Adjust queries to account for inconsistent behavior for dates due to Alma’s method of manipulating and storing date variables, as well as migration artifacts. For example, many Network Zone portfolios migrated with a null value for Activation Date.
Physical materials: use “Physical Items”.”Item Creation Date”
Electronic materials: use “E-Inventory“.”Portfolio Activation Date”.”Portfolio Activation Date”
Using the data that currently exists in these fields provides the best balance of costs and benefits: it avoids significant, immediate local cleanup projects while providing counts that are within an acceptable margin of error.
Impact
Stakeholder group | Impact |
|---|---|
UC Libraries | Determinations around what and how we report are for the most part managed/owned by the UC Libraries (i.e., shared ownership). |
CDL | CDL analysts, who are responsible for building report queries at the Network Zone according to templates agreed upon by the UC Libraries, will have to exclude items and titles based on a variety of date parameters. |
Reasoning
Background
The AASA-PT Harmonization group reviewed all the UCOP statistics data that can be retrieved via Alma Analytics. Factors considered during this review were the existing UCOP, ACRL and ARL requirements.
UCOP statistics are collected for risk management purposes.
ARL reporting guidelines ask institutions to be consistent in their reporting.
ACRL and ARL reporting requirements focus on catalogued material.
CEAL reporting requirements ask about both catalogued and un-catalogued material.
UCOP and other statistical requirements call for counts as of a snapshot in time, usually the end of the most recent fiscal year. Campuses have therefore used various approaches to exclude material added after the end of the fiscal year when running reports after that date. For example, a report run on August 1 would be set up to exclude material added after July 1. Alma Analytics has a variety of date fields that could be used to accomplish this. However, each of these fields has drawbacks stemming from campus procedures, migration artifacts, and other data issues. The following variables were reviewed:
Physical
Lifecycle
Item Creation Date v. Creation Date
Item Receiving Date v. Receiving Date v. Receiving Date (Calendar)
Item Modification Date v. Modification Date
Electronic
Lifecycle
Portfolio Modification Date v. Modification Date
Portfolio Creation Date
Portfolio Activation Date
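The snapshot cutoff described above (a report run on August 1 that excludes material added after July 1) can be sketched in a few lines. This is a minimal illustration, not the team's report logic: the record layout, the `in_snapshot` helper, the `keep_undated` switch, and the 2023 dates are all assumptions made for the example.

```python
from datetime import date

def in_snapshot(added_on, cutoff, keep_undated=False):
    """Return True if a record belongs in a snapshot taken at `cutoff`.

    `added_on` may be None when the date field is empty. How to treat
    such records is a policy choice, so it is exposed as a parameter
    (hypothetical; the source does not prescribe one).
    """
    if added_on is None:
        return keep_undated
    # Mirror the example in the text: exclude material added after the cutoff.
    return added_on <= cutoff

# Hypothetical report run on August 1 with a July 1 cutoff.
CUTOFF = date(2023, 7, 1)
records = [
    ("item-1", date(2023, 3, 14)),  # added before the cutoff
    ("item-2", date(2023, 7, 20)),  # added after the cutoff
    ("item-3", None),               # no date recorded
]
kept = [record_id for record_id, added_on in records if in_snapshot(added_on, CUTOFF)]
print(kept)
```

Making the treatment of undated records an explicit parameter keeps that choice visible instead of leaving it to whatever a comparison against an empty value happens to do.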
Observations about the current data:
For counts of items added or withdrawn by fiscal year, options include:
For Added:
Physical
Use “Physical Items“.”Physical Item Details”.”Lifecycle” = Active AND
Use “Physical Items”.”Item Creation Date” OR
Use “Physical Items”.”Item Receiving Date” and research significant impacts at San Diego, Davis, Irvine, Los Angeles, Riverside, San Francisco, and Santa Barbara when dates are added into the data extract in Alma Analytics.
Electronic
Use “E-Inventory”.”Portfolio”.”Lifecycle” = Active AND
Use “E-Inventory“.”Portfolio Activation Date”.”Portfolio Activation Date”
For Withdrawn:
Physical
Use “Physical Items”.”Item Modification Date”.”Item Modification Date” AND
Use “Physical Items”.”Physical Item Details”.”Lifecycle” = Deleted and/or None or NULL AND/OR
Find out reasons behind the usage of the Deleted value and harmonize on a field and values for deletion notes
Electronic
Use “E-Inventory”.”Portfolio”.”Lifecycle” = Deleted and/or None or NULL AND
Use “E-Inventory“.”Portfolio Modification Date”.”Portfolio Modification Date”
Note: As of May 2023, ExLibris is reviewing the differences in counts that appear when dates are used in Analytics reports.
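The physical-item options above combine a lifecycle test with a date-range test. The sketch below expresses that combination over plain Python dictionaries. It is an illustration under assumptions: the dictionary keys, the sample records, and the July-to-June fiscal year bounds are hypothetical, and real reports would be built in Alma Analytics, not in Python.

```python
from datetime import date

# Lifecycle values treated as withdrawn in the options above.
# The string "None" is the literal lifecycle value; Python None stands in for NULL.
WITHDRAWN_LIFECYCLES = {"Deleted", "None", None}

def in_range(value, start, end):
    """True when `value` is a date inside [start, end]; empty dates never match."""
    return value is not None and start <= value <= end

def count_added(items, start, end):
    """Active items whose creation date falls inside the range."""
    return sum(
        1 for item in items
        if item["lifecycle"] == "Active" and in_range(item["created"], start, end)
    )

def count_withdrawn(items, start, end):
    """Deleted/None/NULL items whose modification date falls inside the range."""
    return sum(
        1 for item in items
        if item["lifecycle"] in WITHDRAWN_LIFECYCLES
        and in_range(item["modified"], start, end)
    )

# Hypothetical fiscal year and sample records.
FY_START, FY_END = date(2022, 7, 1), date(2023, 6, 30)
items = [
    {"lifecycle": "Active", "created": date(2022, 9, 1), "modified": date(2022, 9, 1)},
    {"lifecycle": "Active", "created": date(2021, 1, 5), "modified": date(2022, 12, 1)},
    {"lifecycle": "Active", "created": None, "modified": date(2023, 1, 9)},
    {"lifecycle": "Deleted", "created": date(2015, 4, 2), "modified": date(2023, 2, 10)},
    {"lifecycle": None, "created": None, "modified": date(2023, 8, 15)},
]
added = count_added(items, FY_START, FY_END)
withdrawn = count_withdrawn(items, FY_START, FY_END)
print(added, withdrawn)
```

The electronic options have the same shape, with the portfolio lifecycle, activation date, and modification date in place of the item fields.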
Physical
Lifecycle
Three values are available for “Physical Items“.”Physical Item Details”.”Lifecycle”:
Active items are active and discoverable in Primo.
Deleted items are not discoverable in Primo.
None items are not discoverable in Primo. These items are also associated with records that lack creation dates and possibly other pertinent metadata; consultation with Ex Libris is ongoing.
For withdrawn counts, filtering based on lifecycle = “Deleted” or “None” causes:
Exclusion of items that were deleted from the repository after the reporting deadline but before the report run date. For example, if a weeding project happens in August 2023 and hundreds of print items and their associated records are removed, a report looking for the items from before July 2023 that runs in October 2023 will exclude those items, even though they were still in the collection before July 2023.
Inclusion of items whose records were deleted because they were added by mistake, which do not represent items actually removed from the collection.
Modification Date
There are two modification date options:
“Physical Items”.”Item Modification Date”, which cleans up and harmonizes the value available in “Physical Items”.”Physical Items Details”.”Modification Date”. ExLibris documentation says: “The Item Modification date is the date of the last change to the item. Therefore, if on January 19, 2016 the item was in process type Acquisition, on January 20, 2016 the item was in process type Request, and on January 21, 2016 the item was in process type Loan (and no additional change to the item was made after January 21), the Item Modification Date is January 21, 2016.“
“Physical Items”.”Physical Items Details”.”Modification Date”, which allows us to know when the item record is modified. ExLibris documentation says it “Holds the date the physical item was modified“.
Alma Analytics data-based findings:
When crossed with Lifecycle = Deleted, modification dates could be used to count items withdrawn from the collection (see Prototype Version 1 at https://docs.google.com/spreadsheets/d/1IBV9tKyvO3xq-UZeuLoeEViSmBwgp2YO/edit#gid=552748451).
Another option after a full fiscal year in Alma is to use [count last year] + [count of items added this FY] - [current count] = Withdrawn.
Based on the reported knowledge and practices, “Physical Items”.”Item Modification Date” is the best variable to use to determine when the items' records were last modified, especially when the record of the item was deleted.
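The count-difference option mentioned above, [count last year] + [count of items added this FY] - [current count] = Withdrawn, is simple arithmetic. A minimal sketch follows; the function name and the sample figures are hypothetical and are not taken from any campus report.

```python
def withdrawn_by_difference(count_last_year, added_this_fy, current_count):
    """Derive withdrawals from three counts instead of from deletion records."""
    withdrawn = count_last_year + added_this_fy - current_count
    if withdrawn < 0:
        # A negative result means the three inputs are not consistent
        # with each other (for example, additions were undercounted).
        raise ValueError("inputs are inconsistent: withdrawn count is negative")
    return withdrawn

# Hypothetical figures for one campus.
example = withdrawn_by_difference(
    count_last_year=1_000_000,
    added_this_fy=12_000,
    current_count=1_004_500,
)
print(example)
```

This approach depends on all three counts being taken with the same criteria, so any error in the added count flows directly into the withdrawn figure.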
Creation Date
Automatic date and time stamps for when records are created are a norm in relational databases. In Alma physical items, there are two creation dates:
“Physical Items”.”Item Creation Date”, which cleans up and harmonizes the value available in “Physical Items”.”Physical Items Details”.”Creation Date”. ExLibris documentation says this “Stores the item creation date in a date format such as 2/29/2014” and that “All date dimensions include dates up to and including 30 years back and 20 years forward. So, for example, if today is March 17, 2021: The earliest loan date that would appear is March 17, 1991. The latest due date that would appear is March 17, 2041.”
“Physical Items”.”Physical Items Details”.”Creation Date”, which allows us to know when the record on the item was created. ExLibris documentation says that this “Holds the date the physical item was created” and that “This date is assigned by Alma when the physical item is created.“
Note: ExLibris documentation does not completely align with findings.
Alma Analytics data findings:
Initial findings for creation date are at https://docs.google.com/document/d/125dx-__jZOHNRH0ysuRLtEr1j33HjV7e/.
14% of records (4.9 million of 34+ million total records in Alma Analytics) show no creation date. Therefore, filtering on Creation Date does not provide accurate counts of records created in a specific fiscal year or date range, because records with a NULL creation date cannot be assigned to any range. (See also https://docs.google.com/spreadsheets/d/12B7ICBG4DLiViDerm4EwN1Y5BtKByngn/).
Recent findings from comparing report designs that include and exclude creation date show that a significant number of records are dropped when a date field such as Creation Date is included in the criteria. Compare https://docs.google.com/spreadsheets/d/1IBV9tKyvO3xq-UZeuLoeEViSmBwgp2YO/edit#gid=552748451 where Creation Date is included against https://docs.google.com/spreadsheets/d/1eoN1qFGTYA4JQG_SY89UvhznCvgGZj6w/ where Creation Date is excluded. See Table 3.
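The interaction between a date criterion and records that have no date can be illustrated with a small sketch. It shows only the general behavior of a range comparison over data with missing values. Whether this is what happens inside Alma Analytics is not confirmed by the findings above, and the sample dates and function names are hypothetical.

```python
from datetime import date

def count_with_date_criterion(creation_dates, start, end):
    """Count records whose creation date falls in [start, end].

    A record with no date (None) cannot satisfy the comparison,
    so it is left out even when the range is very wide.
    """
    return sum(1 for d in creation_dates if d is not None and start <= d <= end)

def count_without_date_criterion(creation_dates):
    """Count every record, dated or not."""
    return len(creation_dates)

# Hypothetical sample: five records, two of them with no creation date.
creation_dates = [
    date(2019, 5, 2),
    date(2021, 11, 30),
    None,
    date(2022, 8, 9),
    None,
]
WIDE_START, WIDE_END = date(1900, 1, 1), date(2100, 12, 31)

with_dates = count_with_date_criterion(creation_dates, WIDE_START, WIDE_END)
without_dates = count_without_date_criterion(creation_dates)
print(with_dates, without_dates)
```

In this toy data the two undated records account for the whole gap between the counts, which is the pattern to look for when comparing report builds with and without a date field.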
Based on the following and our knowledge of the campuses' current practices, “Physical Items”.”Item Creation Date” would be the best variable to use, if reporting counts of items added into the catalog by fiscal year and/or a particular date range is needed:
“Physical Items”.”Physical Items Details”.”Creation Date” is the raw data transferred from Alma to Alma Analytics.
“Physical Items”.”Item Creation Date” is the processed variable based on “Physical Items”.”Physical Items Details”.”Creation Date”.
Institution Name | Version 1 (Item Creation Date Used) | Version 2 (Item Creation Date Not Used) |
|---|---|---|
Northern Regional Library Facility | 7,596,443 | 7,596,443 |
Southern Regional Library Facility | 6,971,497 | 6,971,497 |
UC San Diego | 1,704,653 | 2,363,938 |
University of California Berkeley | 5,983,372 | 5,983,372 |
University of California Davis | 1,961,543 | 3,024,327 |
University of California Irvine | 1,466,838 | 2,010,303 |
University of California Los Angeles | 4,437,059* | 4,436,031* |
University of California Riverside | 1,235,630 | 2,359,278 |
University of California San Francisco | 72,977 | 162,356 |
University of California, Merced | 154,615 | 154,615 |
University of California, Santa Barbara | 1,968,100 | 2,581,878 |
University of California, Santa Cruz | 1,067,382 | 1,067,382 |
Total | 34,620,109 | 38,711,420 |
Table 3. Comparison of counts when creation date is used. The above table compares the total counts at each institution when Item Creation Date is used and not used in the report build. For details behind the counts, see https://docs.google.com/spreadsheets/d/1xOqWAWqn1P7uF5gkZbPZ5rOCDrWctZxencx4yf50_z0/edit#gid=471658574, where Version 1 is at https://docs.google.com/spreadsheets/d/1IBV9tKyvO3xq-UZeuLoeEViSmBwgp2YO/edit#gid=552748451 and Version 2 is at https://docs.google.com/spreadsheets/d/1eoN1qFGTYA4JQG_SY89UvhznCvgGZj6w/. Causes are not yet determined, and institutions impacted are highlighted in red.
* We cannot determine why the Version 2 count for UCLA is smaller than the Version 1 count. Versions 1 and 2 were both run on May 22, 2023, between 8 and 9 a.m. Although the UCLA portion of Version 2 did not finish running until 8:57 a.m., there should not be that much of a difference in the output. The numbers in Table 4, which were based on Version 1’s build, show that the UCLA count was closer to Version 2 as well. See Table 4.
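The per-institution gaps in Table 3 can be computed directly from its two columns. The sketch below uses the figures exactly as given in the table; the `TABLE_3` and `differences` names are illustrative only.

```python
# (Version 1: Item Creation Date used, Version 2: Item Creation Date not used)
TABLE_3 = {
    "Northern Regional Library Facility": (7_596_443, 7_596_443),
    "Southern Regional Library Facility": (6_971_497, 6_971_497),
    "UC San Diego": (1_704_653, 2_363_938),
    "University of California Berkeley": (5_983_372, 5_983_372),
    "University of California Davis": (1_961_543, 3_024_327),
    "University of California Irvine": (1_466_838, 2_010_303),
    "University of California Los Angeles": (4_437_059, 4_436_031),
    "University of California Riverside": (1_235_630, 2_359_278),
    "University of California San Francisco": (72_977, 162_356),
    "University of California, Merced": (154_615, 154_615),
    "University of California, Santa Barbara": (1_968_100, 2_581_878),
    "University of California, Santa Cruz": (1_067_382, 1_067_382),
}

def differences(table):
    """Return Version 2 minus Version 1 for institutions where the counts differ."""
    return {name: v2 - v1 for name, (v1, v2) in table.items() if v2 != v1}

diffs = differences(TABLE_3)
total_gap = sum(diffs.values())
for name, gap in sorted(diffs.items(), key=lambda pair: pair[1], reverse=True):
    print(f"{name}: {gap:,}")
print(f"Total: {total_gap:,}")
```

Seven institutions show a difference. UCLA is the only one where Version 2 is lower than Version 1, which matches the footnote above.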
Receiving Date
There are three available receiving date options:
“Physical Items”.”Item Receiving Date”
“Physical Items”.”Physical Items Details”.”Receiving Date (calendar)”
“Physical Items”.”Physical Items Details”.”Receiving Date”
Alma:
[Screenshot: Receive New Material screen]
The Receiving Date variable in Alma Analytics comes from a “Received Date” field in Alma that is currently validated. End users can pick a date from a calendar drop-down or type one into the field; either way, the value must resolve to a calendar date for the record to save.
Alma Analytics:
Future date values are found in receiving dates (see https://docs.google.com/spreadsheets/d/1DtYRcjTn9WVCJsEUggGQGmWE-1Yr5hW2/).
Blank and null values are found in receiving dates. This is true for records added after migration. See https://docs.google.com/document/d/125dx-__jZOHNRH0ysuRLtEr1j33HjV7e/.
Data review for migrated data shows that some date values are read as strings rather than dates, and Alma Analytics cannot filter these correctly. See https://docs.google.com/spreadsheets/d/12B7ICBG4DLiViDerm4EwN1Y5BtKByngn/.
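The three problems listed above (future dates, blank or null values, and dates stored as strings) could be surfaced in an exported data set with a small classifier. This is a sketch under assumptions: the `FORMATS` tuple is a guess at possible string layouts, since the findings do not say which formats the migrated values use, and the function name, labels, and sample values are hypothetical.

```python
from datetime import date, datetime

# Assumed string layouts; the actual formats in migrated data are not documented here.
FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def classify_receiving_date(raw, today):
    """Label one exported receiving-date value."""
    if raw is None or (isinstance(raw, str) and not raw.strip()):
        return "blank"
    if isinstance(raw, datetime):
        raw = raw.date()
    was_string = isinstance(raw, str)
    if was_string:
        parsed = None
        for fmt in FORMATS:
            try:
                parsed = datetime.strptime(raw.strip(), fmt).date()
                break
            except ValueError:
                continue
        if parsed is None:
            return "unparseable string"
        raw = parsed
    if raw > today:
        return "future"
    return "string, parseable" if was_string else "ok"

# Fixed reference date so the example is reproducible.
TODAY = date(2023, 5, 22)
samples = [date(2022, 10, 3), "2021-04-17", "sometime in 2020", "", None, date(2031, 1, 1)]
labels = [classify_receiving_date(value, TODAY) for value in samples]
print(labels)
```

A tally of these labels per campus would show how much of the receiving-date data can be used in a date-range filter without cleanup.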
Institution-Level Consultation findings:
Some campuses report that they do not use this date.
Based on the following and our knowledge of the campuses' current practices, “Physical Items”.”Item Receiving Date” would be the best variable to use, if reporting counts of items added into the collection by fiscal year and/or a particular date range is needed:
“Physical Items”.”Physical Items Details”.”Receiving Date” is the raw data transferred from the corresponding free-text field in Alma to Alma Analytics.
“Physical Items”.”Physical Items Details”.”Receiving Date (Calendar)” is the raw data transferred from the corresponding calendar dropdown menu in Alma to Alma Analytics.
“Physical Items”.”Item Receiving Date” is the harmonized, cleaned-up variable derived from “Physical Items”.”Physical Items Details”.”Receiving Date” and “Physical Items”.”Physical Items Details”.”Receiving Date (Calendar)”.
Institution Name | A: Item Creation Date | B: Item Receiving Date | C: No Dates | D: Impact (numeric) | E: Impact (more than 500,000 in difference) | F: Impact (percent) | G: Impact (more than 5% in difference) |
|---|---|---|---|---|---|---|---|
Northern Regional Library Facility | 7,597,208 | 7,567,096 | 7,597,208 | 30,112 | not significant | 0.40% | not significant |
Southern Regional Library Facility | 6,973,790 | 6,871,888 | 6,973,790 | 101,902 | not significant | 1.46% | not significant |
UC San Diego | 1,704,821 | 1,688,838 | 2,364,098 | 675,260 | significant | 28.56% | significant |
University of California Berkeley | 5,983,706 | 5,912,356 | 5,983,706 | 71,350 | not significant | 1.19% | not significant |
University of California Davis | 1,961,837 | 1,872,691 | 3,024,594 | 1,151,903 | significant | 38.08% | significant |
University of California Irvine | 1,466,925 | 1,455,819 | 2,010,391 | 554,572 | significant | 27.59% | significant |
University of California Los Angeles | 4,436,011 | 4,003,694 | 4,436,011 | 432,317 | not significant | 9.75% | significant |
University of California Riverside | 1,235,634 | 1,169,536 | 2,359,105 | 1,189,569 | significant | 50.42% | significant |
University of California San Francisco | 72,946 | 71,391 | 162,115 | 90,724 | not significant | 55.96% | significant |
University of California, Merced | 155,356 | 151,862 | 155,356 | 3,494 | not significant | 2.25% | not significant |
University of California, Santa Barbara | 1,968,635 | 1,923,660 | 2,582,326 | 658,666 | significant | 25.51% | significant |
University of California, Santa Cruz | 1,067,475 | 1,044,417 | 1,067,475 | 23,058 | not significant | 2.16% | not significant |
Grand Total | 34,624,344 | 33,733,248 | 38,716,175 | 4,982,927 | significant | 12.87% | significant |
Table 4. Comparison of counts and impacts when the dates considered are used in build. The above table compares the counts and shows the calculated impacts based on the counts with and without the dates listed.
Item Creation Date and Item Receiving Date are the harmonized, cleaned-up variables that ExLibris provides in Alma Analytics; they would be useful for filtering counts of additions and withdrawals by specific dates and date ranges, including fiscal years. This table compares these two variables against a build without any date fields.
Impact (numeric) (D) is calculated as (C-B) or (C-A), whichever is greater. If the difference is more than 500,000, impact (E) is significant.
Impact (percent) (F) is calculated as (C-B)/C or (C-A)/C, whichever is greater. If the difference is more than 5%, impact (G) is significant and the Institution Name is highlighted in red.
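The impact rules for Table 4 can be written out as a short function. The sketch below follows the definitions of D, E, F, and G given above and checks them against rows from the table; the function name, the dictionary layout, and rounding to two decimals are choices made for the example.

```python
def impact(a, b, c, numeric_threshold=500_000, percent_threshold=5.0):
    """Compute Table 4 columns D-G from columns A, B, and C.

    a: count using Item Creation Date
    b: count using Item Receiving Date
    c: count using no dates
    """
    d = max(c - b, c - a)                       # D: larger numeric difference
    f = max((c - b) / c, (c - a) / c) * 100     # F: larger percent difference
    return {
        "D": d,
        "E": "significant" if d > numeric_threshold else "not significant",
        "F": round(f, 2),
        "G": "significant" if f > percent_threshold else "not significant",
    }

# Rows taken from Table 4.
san_diego = impact(1_704_821, 1_688_838, 2_364_098)
los_angeles = impact(4_436_011, 4_003_694, 4_436_011)
merced = impact(155_356, 151_862, 155_356)
print(san_diego)
print(los_angeles)
print(merced)
```

The UCLA row shows why both thresholds are reported: its numeric difference is under 500,000, so E is not significant, while its percent difference exceeds 5%, so G is significant.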