National Archive Project for Women Religious

Draft Version 0.1

This standard defines the metadata exchange profile and technical requirements for the National Archives for the Preservation of Women Religious (NAPWR) portal. The portal will provide unified discovery across four hub locations while maintaining distributed authority and respecting local practices.

1. Why a Standardized Approach

Women religious archives are geographically dispersed, described unevenly, and often not digitized. A common exchange profile allows NAPWR to expose consistent, high-value discovery metadata while respecting local practices, staffing realities, and phased digitization.

1.1 Distributed ArchivesSpace, Unified Portal

NAPWR will operate with four hub locations, each maintaining its own ArchivesSpace instance as the system of record for its holdings. The portal will function as a service provider that harvests standardized metadata from each hub (via ArchivesSpace's built-in OAI-PMH interface and/or approved exports), and indexes it for portal-wide search, facets, and browse.
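The harvesting step can be sketched roughly as follows. This is a minimal illustration rather than the portal's actual harvester: the base URL `https://hub.example.org/oai`, the function names, and the sample response are assumptions made for the example. The request arguments (`verb`, `metadataPrefix`, `resumptionToken`) and the XML namespace come from the OAI-PMH 2.0 protocol.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL for one hub."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it replaces metadataPrefix on follow-up requests.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (list of record identifiers, resumption token or None)."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:hub.example.org:/repositories/2/resources/1</identifier>
      </header>
    </record>
    <resumptionToken>abc123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

print(build_list_records_url("https://hub.example.org/oai"))
print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL, keep requesting with the returned token until none is supplied, and record which hub each identifier came from so the portal can link back to the hub record.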

See the Example of Platform Structure, Capabilities and Access for a preview of a potential design.

Key Principle: Records remain authoritative at the hub, and the portal always links users back to the hub record for complete context and requests. This mirrors the "harvesting model for partner content" approach used in ACDAP.
1.2 We Do Not Need to Digitize Everything

NAPWR discovery does not require full digitization. For many items, a thumbnail (or representative preview image) plus a stable link back to the hub's ArchivesSpace record is sufficient to:

  • Confirm relevance
  • Support browsing
  • Route researchers to the correct repository for access

This approach aligns with the pattern of a "preview" URL plus "available at / web view" links: preview supports portal browsing, while isShownAt and isShownBy point to hub context and the best-available access copy.

1.3 Rights, Privacy, and Responsible Access

NAPWR will require standardized rights statements (RightsStatements.org URIs and/or CC Public Domain Mark where applicable) and expects partners to protect privacy by excluding or redacting PII from portal-shared representations. NAPWR will also encourage remediation or contextualization of harmful legacy descriptions before portal exposure.

1.4 Outcomes
  • A shared public portal with consistent browsing and filtering (congregation, hub, record type, date span, place, subjects, languages, ministries/charisms, etc.)
  • Reduced duplication of effort and clearer pathways for congregations seeking partner support
  • A scalable foundation for future enhancements (full digital delivery, IIIF, citation builder, enrichment services)
2. NAPWR Metadata Model: Core Exchange Profile

Note: This is written as an exchange profile (what the portal needs), not a mandate to redesign anyone's local ArchivesSpace practices. See the Guide to Converting ArchivesSpace Records to Portal-Ready Metadata.

🚩 Don't Worry🚩 The Crosswalk-Mapping process is broken down into manageable steps at the Converting ArchivesSpace Records to Portal-Ready Metadata webpage.

2.1 Minimum Entities

What we're describing:

  • Hub Repository (one of the four)
  • Component/Item (Archival Object and/or Digital Object as exposed)
  • Collection/Resource (ArchivesSpace Resource / the women religious body)
  • Title (Name given to the resource)
  • Source (Finding Aid URI/URL)
  • Digital Representation (optional; may be thumbnail-only)
2.2 Required Fields

The portal should contain these fields: 

🟢 View the Controlled Vocabulary Workbook and refer to the Metadata Template Interactive Worksheet for guidance. 🟢

🟢 The Names/Agent standardization workflow is located on the NAPWR Portal Name Standard webpage. 🟢

 

Contributing Hub (Repository)

Definition: Hub name and location, in citation-ready form
Mapping: "Contributing HUB" in required fields

Title

Definition: Name given to the resource
Application: User-friendly where possible; accept local practice

Date (display)

Definition: Human-readable creation/coverage date
Application: Use "undated" when needed; use "approximately" for uncertainty

Date (machine / indexed)

Definition: Computer-readable date for indexing
Application: ISO 8601/W3CDTF formats; use ranges with / when needed (W3C)

Creator (primary) - 🚩See the Names/Agent Transforms Guidance and Examples🚩

Definition: Entity primarily responsible (person, family, or corporate body)
Application: Prefer LCNAF form when available; otherwise, create a local authorized form 

Rights (URI)

Definition: Standard rights statement for portal display and reuse clarity
Application: Use RightsStatements.org URI, or CC Public Domain Mark

Language (code)

Definition: Language of the resource
Application: ISO 639-3 alpha-3 codes (repeatable) (iso639-3.sil.org)

Local Identifier

Definition: Unique identifier within the hub's system

Physical/Intellectual Location

Definition: Where the item sits in the collection hierarchy, together with its physical location (collection/box/folder or equivalent)

Landing Page / Source Link

Definition: Stable URL that links to the hub's full record (portal links back)
Mapping: "Available at" / "edm:isShownAt"

Record Type

Definition: Nature/genre of the resource
Application: Prefer Getty AAT for type terms; allow one NAPWR local list if needed
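A hub could check records against the required fields before export with something like the sketch below. The snake_case keys are hypothetical stand-ins for the field names above, and the checks are deliberately loose: the date pattern covers only the YYYY, YYYY-MM, and YYYY-MM-DD forms (optionally as a range with /), and the rights check only tests the URI prefix. Treating every field as mandatory is also an assumption; undated material may need an exception for the machine date.

```python
import re

# Hypothetical keys standing in for the required fields in section 2.2.
REQUIRED_FIELDS = [
    "contributing_hub", "title", "date_display", "date_machine", "creator",
    "rights_uri", "language", "local_identifier", "location",
    "landing_page", "record_type",
]

# W3CDTF-style date: YYYY, YYYY-MM, or YYYY-MM-DD.
_DATE = r"\d{4}(?:-(?:0[1-9]|1[0-2])(?:-(?:0[1-9]|[12]\d|3[01]))?)?"
# A single date, or a start/end range separated by "/".
DATE_OR_RANGE = re.compile("^" + _DATE + "(?:/" + _DATE + ")?$")

RIGHTS_PREFIXES = (
    "http://rightsstatements.org/vocab/",
    "https://rightsstatements.org/vocab/",
    "http://creativecommons.org/publicdomain/mark/",
    "https://creativecommons.org/publicdomain/mark/",
)

# ISO 639-3 codes are three lowercase letters; this checks shape only.
LANG_CODE = re.compile(r"^[a-z]{3}$")


def validate_record(record):
    """Return a list of human-readable problems; an empty list means OK."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            errors.append(f"missing required field: {field}")

    date = record.get("date_machine") or ""
    if date and not DATE_OR_RANGE.match(date):
        errors.append(f"date_machine is not a W3CDTF date or range: {date}")

    rights = record.get("rights_uri") or ""
    if rights and not rights.startswith(RIGHTS_PREFIXES):
        errors.append(f"rights_uri is not a recognized rights URI: {rights}")

    languages = record.get("language") or []
    if isinstance(languages, str):
        languages = [languages]  # the field is repeatable; accept one value
    for code in languages:
        if not LANG_CODE.match(code):
            errors.append(f"language is not a three-letter code: {code}")
    return errors
```

Returning a list of problems rather than raising on the first one lets a hub review all issues in a record in a single pass.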
2.3 Required If Available

Strongly recommended for best portal UX:

Preview (thumbnail URL)

Definition: URL of thumbnail representing the object.
Application: If no digitization exists, hubs may supply a representative preview (collection-level image) or a standard placeholder image
Mapping: edm:preview

Web View / Best-Available Digital Representation

Definition: Direct URL to best-available representation.
Mapping: edm:isShownBy
2.4 Recommended Fields

These fields power the portal facets:

  • Subjects (Topical): LCSH where feasible (with URI) (id.loc.gov)
  • Missions: Local Dictionary
  • Place (spatial): Getty TGN preferred for place normalization (with URI) (Getty TGN)
  • Extent: Brief physical extent for context (pages/photos/hours)
  • Description/Abstract: Short narrative snippet for search results
2.5 Citations

The portal generates citations in Chicago style, assembled from title, date, identifier, box/folder, collection, repository, and URL.
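The assembly order can be sketched as a small function. This is an illustration only, not the portal's citation builder: the function and parameter names are hypothetical, and it simply joins the non-empty parts in the order listed above without applying any further Chicago formatting rules.

```python
def build_citation(title, date, identifier, container, collection,
                   repository, url=None):
    """Join the non-empty citation parts in a fixed order, period-separated."""
    parts = [title, date, identifier, container, collection, repository, url]
    # Drop missing parts and any trailing period so separators are not doubled.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return ". ".join(cleaned) + "."


print(build_citation(
    title="The Missionary Catechist. Volume 1, Number 2",
    date="January 1, 1925",
    identifier="HARC_010_7_1_1_2_0001",
    container="Box 1",
    collection="Our Lady of Victory Missionary Sisters Collection (HARC-010)",
    repository="Heritage and Research Center at Saint Mary’s, Notre Dame, Indiana",
))
```

Run on these values, the sketch reproduces the sample citation that follows.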

"The Missionary Catechist. Volume 1, Number 2. January 1, 1925. HARC_010_7_1_1_2_0001. Box 1. Our Lady of Victory Missionary Sisters Collection (HARC-010). Heritage and Research Center at Saint Mary’s, Notre Dame, Indiana."
3. Controlled Vocabularies & “Starting Strategy”

A practical approach: store both (a) the human-readable label and (b) the authoritative URI when available. Use reconciliation tools (e.g., OpenRefine, Python) against the sources below. See Mapping-EAD-DC-Portal Controlled Vocabulary and Mapping for a complete list of controlled terms, notes, and examples.
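As a rough illustration of the label-plus-URI pattern, the sketch below pairs each local label with a URI when an exact match exists. The lookup table and its `example.org` URIs are placeholders, not real authority records; real reconciliation would run against the actual authority sources (for example through OpenRefine) and would need human review of near matches.

```python
# Placeholder authority table: normalized label -> URI (hypothetical values).
AUTHORITY = {
    "nuns": "https://example.org/authority/nuns",
    "missions": "https://example.org/authority/missions",
}


def reconcile(labels, authority):
    """Return one {"label", "uri"} pair per input label.

    The label is always kept; the URI is None when no match is found,
    so unmatched terms stay visible for later review.
    """
    results = []
    for label in labels:
        # Normalize whitespace and case for matching only.
        key = " ".join(label.split()).casefold()
        results.append({"label": label.strip(), "uri": authority.get(key)})
    return results


print(reconcile(["Nuns", "House chronicles"], AUTHORITY))
```

Keeping the original label alongside the URI means the portal can still display and facet on a term when no authority match is available.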

3.1 Authority Sources (Recommended)
3.2 NAPWR Local Controlled Lists

These are the lists the portal can reliably facet on, even when external authority control is uneven.

(Click on the Controlled Vocabs Sheet for guidance.)

 

NAPWR Hub (4 values)

Boston College Catholic Religious Archives Repository
Heritage and Research Center at Saint Mary's
Santa Clara University Archives and Special Collections
Women Religious Archives Collaborative
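An export script could guard the hub value with a check along these lines. This is a sketch that assumes the four labels above are the controlled forms and that matching is exact; the function name is hypothetical.

```python
NAPWR_HUBS = (
    "Boston College Catholic Religious Archives Repository",
    "Heritage and Research Center at Saint Mary's",
    "Santa Clara University Archives and Special Collections",
    "Women Religious Archives Collaborative",
)


def check_hub(value):
    """Return the value if it exactly matches a controlled hub label."""
    if value in NAPWR_HUBS:
        return value
    raise ValueError(
        f"{value!r} is not a controlled hub label; expected one of {NAPWR_HUBS}"
    )


print(check_hub("Women Religious Archives Collaborative"))
```

Raising an error instead of silently correcting the value keeps near-miss labels (abbreviations, missing words) from reaching the portal's hub facet.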

Congregation (Corporate Body)

Preferred form + variant forms (for legacy search)

Content Category (Series-level)

  • Governance    
  • Formation
  • Ministries
  • Community Life
  • Spirituality/Charism
  • Missions
  • Education
  • Healthcare
  • Social Justice
  • Administration/Finance
  • Property/Buildings
  • Publications
  • Visual Materials
  • Audio/Oral History
  • Artifacts/Objects
  • Digital Records

Access Flag

  • Open
  • Restricted
  • On-site only
  • Permission required (portal display rules)

Digitization Status

  • Thumbnail only
  • Partial digital
  • Fully digitized (helps manage expectations)
5.1 Contributing Hub

Field: Contributing Hub → dcterms:publisher

Identifies which NAPWR hub provided the record. Use one of the four agreed labels. Values must match the forms in the elements-controlled tab exactly.

Example: "Heritage and Research Center at Saint Mary's"
5.2 Women Religious Archival Genre (Record Type)

Field: Women Religious Archival Genre (Record type) → dcterms:type

Describes what kind of record the object is (genre/form), not the subject. Uses a shared list of terms such as:

  • General chapter records
  • House chronicles
  • Necrologies
  • Newsletters
  • Photographic prints
  • Oral history interviews

Each term has a short scope note and a suggested Getty AAT mapping in the workbook. Choose one main genre for the portal; keep extra local genres in your own system if needed.

5.3 Carrier / Technical Format

Field: Carrier / technical format → dcterms:format

Records the technical format using MIME / IANA media types when possible. Examples:

  • application/pdf – PDF access copies
  • image/tiff – master scans
  • image/jpeg – access images
  • audio/wav – preservation audio
  • audio/mpeg – MP3 access audio
  • video/mp4 – access video

Note: ArchivesSpace has a pre-loaded set of carriers and formats called 'Containers' and 'Instances'.
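One way to fill this field during export is a lookup from file extension to media type. The sketch below covers only the media types listed above; the extension list and function name are assumptions, and unknown extensions return None so they can be reviewed by hand.

```python
from pathlib import Path

# Extension -> media type, limited to the examples listed in this section.
MEDIA_TYPES = {
    ".pdf": "application/pdf",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".wav": "audio/wav",
    ".mp3": "audio/mpeg",
    ".mp4": "video/mp4",
}


def media_type_for(filename):
    """Return the media type for a file name, or None if it is not listed."""
    return MEDIA_TYPES.get(Path(filename).suffix.lower())


print(media_type_for("HARC_010_7_1_1_2_0001.tif"))
```

An explicit table is used here instead of a system-wide guess so that the exported values stay identical across hubs and operating systems.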

5.4 Extent Type

Field: Extent → dcterms:extent

During mapping, separate the number from the unit (e.g., 12 + linear feet), then combine them into a human-readable string for the portal.

Common extent units include: linear feet, cubic feet, boxes, folders, items, volumes, reels (audio), cassettes (audio), videocassettes, film reels, photographs, photo albums, negatives, slides, digital files, gigabytes, megabytes, minutes (sound recordings), minutes (moving images).

Note: ArchivesSpace has a preloaded list of Extent Types.
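The split-and-recombine step might look like the following sketch. It assumes extents arrive as a number followed by a unit (as in "12 linear feet"); anything else returns None for manual review. The function names are hypothetical.

```python
import re

# A number (integer or decimal), whitespace, then the unit text.
EXTENT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(.+?)\s*$")


def split_extent(text):
    """Split '12 linear feet' into (12.0, 'linear feet'); None if no match."""
    match = EXTENT.match(text)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


def format_extent(number, unit):
    """Recombine into a display string, dropping a trailing '.0'."""
    return f"{number:g} {unit}"


number, unit = split_extent("12 linear feet")
print(format_extent(number, unit))
```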

5.5 Language

Field: Language → dcterms:language

Uses ISO 639-3 codes as the controlled values (e.g., eng, fra, zho). The workbook provides both the label and code for common languages used in NAPWR collections. For the portal export, use the code; labels can be handled locally.

5.6 Catholic Subject Headings (Women Religious)

Field: dcterms:subject

NAPWR will use a shared list of Catholic subject headings focused on women religious and canon law. Use these terms when LC/LCSH does not adequately cover the concept. Do not create local variations if a NAPWR heading already exists; propose new terms to the group so they can be added to the shared list.

5.7 How Hubs Use the Lists

The mapping workbook is the authoritative home for these controlled lists. Local systems (ArchivesSpace, etc.) can keep their own labels, but export scripts should normalize to NAPWR values for portal delivery. Any changes to the lists should be coordinated through the NAPWR working group and reflected in an updated version of the workbook.

6. Roundtripping EAD to Excel

In this workflow, each hub keeps its local ArchivesSpace instance as the system of record, exports EAD, converts the EAD into an Excel editing sheet, and then generates a NAPWR portal-ready CSV (Dublin Core/DCTERMS) that the portal can ingest and index while always linking back to the hub record.

See a breakdown of workflow in the Converting ArchivesSpace Records to Portal-Ready Metadata Guide.

Note: Use oXygen XML Editor 23.1 or the XML editor available at your institution. Download the 'aspace-plus-excel-at-yale-2021-03-30.xpr' project file for this process.

6.1 Convert EAD → Excel Using oXygen XML Editor

Goal: Produce a flat table where each row = a "portal record" (often a component/archival object, or a series-level record, depending on what you want exposed).

6.2 Map Excel to Portal Worksheet Using Python

Goal: Produce a pre-defined, mapped worksheet that includes NAPWR portal-ready mappings.
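A minimal sketch of this mapping step is shown below, using only the standard library. It assumes the Excel sheet has first been saved as CSV, and the source column names (`hub`, `title`, `unit_id`, and so on) are hypothetical; a hub would substitute the headings its own EAD conversion produces. The target columns follow the DCTERMS mappings named in this document, plus dcterms:title and dcterms:identifier.

```python
import csv
import io

# Hypothetical source column -> portal (DCTERMS) column.
COLUMN_MAP = {
    "hub": "dcterms:publisher",
    "title": "dcterms:title",
    "unit_id": "dcterms:identifier",
    "genre": "dcterms:type",
    "format": "dcterms:format",
    "extent": "dcterms:extent",
    "language": "dcterms:language",
    "subject": "dcterms:subject",
}


def map_rows(source_csv_text):
    """Rename and reorder columns; return the portal-ready CSV as text."""
    reader = csv.DictReader(io.StringIO(source_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # Missing source columns become empty cells rather than errors.
        writer.writerow({
            target: (row.get(source) or "").strip()
            for source, target in COLUMN_MAP.items()
        })
    return out.getvalue()


sample = (
    "hub,title,unit_id,genre,format,extent,language,subject\n"
    "Women Religious Archives Collaborative,Sample title,ID_001,"
    "Newsletters,application/pdf,1 items,eng,Missions\n"
)
print(map_rows(sample))
```

Working directly on the .xlsx file would need a third-party library; saving to CSV first keeps the sketch dependency-free.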

See the simplified walk-through at the EAD Transform Guide.

Note: This is a starting place for creating a basic workflow. Fields, metadata, and other elements can be adjusted and improved as communication proceeds.
7. Existing Born-Digital or Digitized Files

For existing digitized and born-digital materials, the project will accept files at the standards at which they were digitized, with the exception of materials of such poor quality that they will not meet the project's goals.

See the digitization guides: (1) Scanning Text and Images; (2) Audio Digitization Guidelines; (3) Moving Images and Video Digitization Guide.


 

It is recommended that newly digitized materials for the project follow the digitization guidelines for Master Preservation Copies provided below. However, access copies (as defined in section 2.5.2 of the Federal Agencies Digitization Guidelines Initiative) may be hosted by partner institutions and used for harvest into the portal.

 

🚨 Note: The full A/V Preservation and Digitization Handbook can be reviewed for reference. 🚨

🚨 Note: The full Oral History Interviews Data Curation Primer is available for use. 🚨

7.1 Digitization Guidelines (FADGI-aligned)

The following standards are informed by the Federal Agencies Digitization Guidelines Initiative (FADGI) and the Library of Congress Recommended Formats Statement, and are for Master Preservation Copies. Use the A/V Preservation and Digitization Handbook for guidance with media. 

 

Master Preservation Copies

Material Type | Resolution | Color | File Format | Min. Bit Depth
Textual | Minimum 400 dpi; 600 preferred | Color preferred to grayscale | Uncompressed TIFF | 24
Visual (photographic, artwork, maps, posters) | Minimum 600 dpi | Color preferred to grayscale | Uncompressed TIFF | 24
Audio | 44.1 kHz/16-bit or higher | n/a | Uncompressed WAV, or MP3 (access copy) | n/a
Video | 10 bits | n/a | Uncompressed MOV or MPEG-4, or MP4 (access copy) | n/a

 

Preferred Access Copy Formats

Material Type | Resolution | Color | Access Format
Textual | Minimum 300 dpi | Color preferred to grayscale | .pdf
Visual (photographic, artwork, maps, posters) | Minimum 300 dpi | Color preferred to grayscale | .jpg
Audio | Refer to A/V Process | n/a | Compressed or uncompressed WAV or MP3
Video | Refer to A/V Process | n/a | Compressed or uncompressed MOV, MPEG-4, or MP4
7.2 File Naming Convention

Institutions should use a stable, provenance-carrying naming convention for all digital files. Names should meaningfully encode:

  • collection or congregation code - (example: HARC_010)
  • series / box / folder / item - (example: 7_1_1_2_0001)

This ensures files remain traceable both inside the institution and when served to the NAPWR portal.

Output:     HARC_010_7_1_1_2_0001
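File names of this shape can be built and checked with a short helper. This is a sketch under assumptions: the pattern (uppercase letters followed by underscore-separated numeric segments) is inferred from the single example above, and the function name is hypothetical. Hubs with other code schemes would need a different pattern.

```python
import re

# Assumed pattern, inferred from the example HARC_010_7_1_1_2_0001.
FILE_NAME = re.compile(r"^[A-Z]+(?:_\d+)+$")


def build_file_name(collection_code, levels):
    """Join the collection code and hierarchy levels with underscores."""
    name = "_".join([collection_code, *(str(level) for level in levels)])
    if not FILE_NAME.match(name):
        raise ValueError(f"unexpected file name: {name!r}")
    return name


print(build_file_name("HARC_010", ["7", "1", "1", "2", "0001"]))
```

Passing the levels as strings preserves zero padding such as 0001, which keeps file names sorting in shelf order.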