Identity Resolution

Identity resolution maps workspace-specific table rows to workspace-agnostic global identity records using securely hashed identifiers. No plaintext identity data is ever stored in the global layer. This is the mechanism that makes the CSP citadel model work: an overseer firm can reason across its client base ("how many of my clients have John Smith as a director?") without any client workspace ever exposing John Smith's plaintext name to a sibling.

Three Layers

Layer 1: GlobalIdentity
  Workspace-agnostic person or company.
  Types: INDIVIDUAL, ENTITY.
       │
Layer 2: GlobalIdentifier
  Hashed identity field on a GlobalIdentity.
  Types: EMAIL, PASSPORT_NUMBER, EMIRATES_ID, COMPANY_REG_NUMBER,
         TAX_ID, NATIONAL_ID.
  HMAC-SHA256 keyed by IDENTITY_HASH_SECRET. Versioned.
       │
Layer 3: IdentityLink
  Maps a workspace TableRow → a GlobalIdentity.
  Includes confidence score (0-1), disputed flag, expiry.

A GlobalIdentity is the shared anchor. Multiple workspaces can each have an IdentityLink pointing at the same GlobalIdentity for their local row. The cleartext identifying data lives only in the workspace that owns the row.
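
As a rough illustration, the three layers could be modelled as below. This is only a sketch: the field names (value_hash, key_version, table_row_id, expires_at, and so on) are assumptions made for the example, not the actual schema. Only the type names, the 0-1 confidence range, the disputed flag and the expiry come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class IdentityType(Enum):
    INDIVIDUAL = "INDIVIDUAL"
    ENTITY = "ENTITY"


class IdentifierType(Enum):
    EMAIL = "EMAIL"
    PASSPORT_NUMBER = "PASSPORT_NUMBER"
    EMIRATES_ID = "EMIRATES_ID"
    COMPANY_REG_NUMBER = "COMPANY_REG_NUMBER"
    TAX_ID = "TAX_ID"
    NATIONAL_ID = "NATIONAL_ID"


@dataclass(frozen=True)
class GlobalIdentity:
    # Layer 1: workspace-agnostic anchor, carries no plaintext identity data.
    id: str
    type: IdentityType


@dataclass(frozen=True)
class GlobalIdentifier:
    # Layer 2: one hashed identity field attached to a GlobalIdentity.
    global_identity_id: str
    type: IdentifierType
    value_hash: str   # HMAC-SHA256 hex digest, never the plaintext (assumed field name)
    key_version: int  # which IDENTITY_HASH_SECRET version produced the hash (assumed)


@dataclass
class IdentityLink:
    # Layer 3: maps one workspace TableRow to a GlobalIdentity.
    workspace_id: str
    table_row_id: str
    global_identity_id: str
    confidence: float
    disputed: bool = False
    expires_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        # The confidence score is documented as a value between 0 and 1.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
```

Two workspaces that hold the same person would each create their own IdentityLink with the same global_identity_id, while their cleartext rows stay local.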

Resolution Flow

Row created/updated in Individuals or Companies table
   │
   ├─ Identity resolution triggered automatically
   ├─ Check: is identity resolution enabled for this workspace?
   │   (requires active oversight relationship)
   ├─ Extract identity fields by table type:
   │      INDIVIDUALS: Email, Passport Number, Emirates ID
   │      COMPANIES:   Company Number, Tax ID
   ├─ Normalise each field, then hash it with HMAC-SHA256
   ├─ Search GlobalIdentifier for matching hashes
   ├─ Compute confidence score
   ├─ Decision:
   │      High confidence: Auto-link to existing GlobalIdentity
   │      Low confidence:  Queue for human review
   │      No match:        Create new GlobalIdentity
   └─ Resolution logged for audit trail
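
The extract, normalise, hash and search steps of the flow above can be sketched as follows. The normalisation rule (strip whitespace, ignore case) and the in-memory index keyed by (field, hash) are assumptions for illustration. The real normalisation rules and storage are not described here.

```python
import hashlib
import hmac

# Identity fields per table type, as listed in the flow above.
FIELDS_BY_TABLE = {
    "INDIVIDUALS": ["Email", "Passport Number", "Emirates ID"],
    "COMPANIES": ["Company Number", "Tax ID"],
}


def normalise(value: str) -> str:
    # Assumed rule: remove all whitespace and ignore case.
    return "".join(value.split()).casefold()


def hash_identifier(secret: bytes, value: str) -> str:
    # HMAC-SHA256 over the normalised value, returned as a hex digest.
    message = normalise(value).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def hash_row(secret: bytes, table: str, row: dict) -> dict:
    # Hash only the identity fields that are present and non-empty.
    return {
        field: hash_identifier(secret, row[field])
        for field in FIELDS_BY_TABLE[table]
        if row.get(field)
    }


def find_matches(index: dict, hashes: dict) -> dict:
    # index is a hypothetical lookup: (field, hash) -> GlobalIdentity id.
    # Returns GlobalIdentity id -> set of field names that matched.
    matches = {}
    for field, digest in hashes.items():
        identity_id = index.get((field, digest))
        if identity_id is not None:
            matches.setdefault(identity_id, set()).add(field)
    return matches
```

An empty result from find_matches corresponds to the "No match" branch. Otherwise the matched field names feed the confidence score.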

Confidence Scoring

Match criteria	Score	Action
1 email match	0.3	Queued for human verification
1 government ID match	0.5	Queued for human verification
2 field matches	0.7	Auto-linked
3+ field matches	0.9	Auto-linked
All government IDs match	1.0	Auto-linked

A match that scores below the auto-link threshold (the 0.3 and 0.5 rows above) goes to the identity review queue at Stakeholders > Requests, where ADMIN reviewers can approve, reject, or escalate it.
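
One way to read the scoring table as code is sketched below. Several details are assumptions: the threshold value of 0.7 (taken as the lowest auto-linked score in the table), the set of fields counted as government IDs, and the rule that "all government IDs match" needs at least two government IDs on the row so that it does not collide with the single-ID case.

```python
# Assumption: every identity field except Email counts as a government ID.
GOVERNMENT_ID_FIELDS = {
    "Passport Number", "Emirates ID", "National ID", "Company Number", "Tax ID",
}

# Assumption: lowest score that the table marks as auto-linked.
AUTO_LINK_THRESHOLD = 0.7


def confidence(matched: set, present: set) -> float:
    # matched: fields whose hashes matched; present: fields filled in on the row.
    if not matched:
        return 0.0
    gov_present = present & GOVERNMENT_ID_FIELDS
    gov_matched = matched & GOVERNMENT_ID_FIELDS
    if len(gov_present) >= 2 and gov_matched == gov_present:
        return 1.0  # all government IDs match
    if len(matched) >= 3:
        return 0.9  # 3+ field matches
    if len(matched) == 2:
        return 0.7  # 2 field matches
    if gov_matched:
        return 0.5  # 1 government ID match
    return 0.3      # 1 email match


def decide(score: float) -> str:
    if score == 0.0:
        return "CREATE_NEW"  # no match: create a new GlobalIdentity
    if score >= AUTO_LINK_THRESHOLD:
        return "AUTO_LINK"
    return "REVIEW"          # queued for human verification
```

For example, a row whose only match is its passport number scores 0.5 and is queued for review rather than linked.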

AI-tool surface for the queue:

get_identity_review_queue - list pending matches.
resolve_identity_review - approve / reject from the AI side.

Hashing & Key Rotation

IDENTITY_HASH_SECRET is the HMAC key. It is held server-side alongside the session signing secret but is a deliberately separate value: reusing the session secret would conflate two unrelated trust domains.

Versioned rotation is supported:

IDENTITY_HASH_SECRET - the active key.
IDENTITY_HASH_SECRET_V2 - the next key during a rotation window.
IDENTITY_HASH_SECRET_V3 - and so on.

During rotation, every GlobalIdentifier write produces hashes under all configured key versions, and a lookup succeeds if it matches under any active version. Once the rotation window closes, the old version's hashes are decommissioned and the column that held them is dropped.

The result is zero-downtime rotation, with no window in which some identifiers sit in the old keyspace and others in the new.
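
A minimal sketch of dual-key writes and lookups during a rotation window follows. The environment variable names come from the list above. Treating the unsuffixed IDENTITY_HASH_SECRET as version 1, and storing hashes as a dict keyed by version, are assumptions for the example.

```python
import hashlib
import hmac


def load_keys(env: dict) -> dict:
    # Collect configured key versions: IDENTITY_HASH_SECRET is treated as
    # version 1 (assumption), then _V2, _V3, ... until one is missing.
    keys = {}
    if "IDENTITY_HASH_SECRET" in env:
        keys[1] = env["IDENTITY_HASH_SECRET"].encode("utf-8")
    version = 2
    while f"IDENTITY_HASH_SECRET_V{version}" in env:
        keys[version] = env[f"IDENTITY_HASH_SECRET_V{version}"].encode("utf-8")
        version += 1
    return keys


def _digest(key: bytes, normalised_value: str) -> str:
    return hmac.new(key, normalised_value.encode("utf-8"), hashlib.sha256).hexdigest()


def hash_all_versions(keys: dict, normalised_value: str) -> dict:
    # Write path: one hash per configured key version.
    return {version: _digest(key, normalised_value) for version, key in keys.items()}


def matches(stored: dict, keys: dict, normalised_value: str) -> bool:
    # Lookup path: accept a match under any active key version.
    for version, key in keys.items():
        expected = stored.get(version)
        if expected is not None and hmac.compare_digest(_digest(key, normalised_value), expected):
            return True
    return False
```

In practice load_keys would be passed os.environ. Once the window closes, removing the old variable leaves only the new version active, and hashes written under both keys keep matching.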

Document Presence

Document presence is the discovery layer that lets an overseer workspace know a document exists for a shared identity - without revealing the document's content. Only existence, type, and expiry are shared.

File linked with KYC or EVIDENCE role to a row
   │
   ├─ Presence trigger fires
   ├─ Identity link found for the row
   ├─ DocumentPresence record created (status = VALID)
   └─ If row is a CSP stakeholder, cross-workspace entity presence refreshed

What's exposed in a presence record:

globalIdentityId (the shared anchor, hashed under the same keyspace).
documentClass (PASSPORT, MEMORANDUM_OF_ASSOCIATION, ...).
expiryDate (if applicable).
status (VALID, EXPIRED, SUPERSEDED, PENDING, REVOKED).

What's not exposed:

The file bytes.
The extracted field values.
The owning workspace's internal IDs.
Any indication of who else has presence for the same identity.
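
The exposed/not-exposed split above amounts to a projection: only four attributes are copied out of the owning workspace's file record. A sketch, where the shape of file_record and the snake_case field names are hypothetical:

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class PresenceStatus(Enum):
    VALID = "VALID"
    EXPIRED = "EXPIRED"
    SUPERSEDED = "SUPERSEDED"
    PENDING = "PENDING"
    REVOKED = "REVOKED"


@dataclass(frozen=True)
class DocumentPresence:
    # The only attributes shared across workspaces.
    global_identity_id: str
    document_class: str
    expiry_date: Optional[date]
    status: PresenceStatus


def build_presence(global_identity_id: str, file_record: dict) -> DocumentPresence:
    # file_record is a hypothetical dict from the owning workspace. File bytes,
    # extracted values and internal IDs are deliberately never read here.
    return DocumentPresence(
        global_identity_id=global_identity_id,
        document_class=file_record["document_class"],
        expiry_date=file_record.get("expiry_date"),
        status=PresenceStatus.VALID,  # new presence records start as VALID
    )
```

Because the record type has no field for content or workspace IDs, a caller cannot leak them by accident.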

Access Grant Flow

When the overseer wants the actual file (not just presence), they request access:

Overseer sees presence:
  "John D. has a PASSPORT (expires 2027-03)"
       │
       ▼
POST /api/files/[fileId]/request-access
       │
       ├─ Routes to appropriate approver
       ├─ Auto-approve (within configured thresholds)
       └─ OR queued for ADMIN review
       │
       ▼
On approval: AccessGrant created
       │
       ▼
GET /api/files/shared/[fileId]?grantId=...
   → validates grant
   → streams file bytes

AccessGrant carries:

The file ID, requesting workspace, custodian workspace.
level: VIEW or DOWNLOAD.
status: ACTIVE, REVOKED, EXPIRED.

Grant lifecycle is fully audit-logged. Revocation cascades to DocumentPresence status updates - if a grant is revoked, the presence record's status reflects that immediately (no caching, no delay).
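
The "validates grant" step and revocation might look roughly like this. The field names, the audit log represented as a list, and the rule that a DOWNLOAD grant also permits VIEW are assumptions for the sketch, not confirmed behaviour.

```python
from dataclasses import dataclass
from enum import Enum


class GrantLevel(Enum):
    VIEW = 1
    DOWNLOAD = 2


class GrantStatus(Enum):
    ACTIVE = "ACTIVE"
    REVOKED = "REVOKED"
    EXPIRED = "EXPIRED"


@dataclass
class AccessGrant:
    id: str
    file_id: str
    requesting_workspace_id: str
    custodian_workspace_id: str
    level: GrantLevel
    status: GrantStatus = GrantStatus.ACTIVE


def validate_grant(grant: AccessGrant, file_id: str, workspace_id: str, wanted: GrantLevel) -> bool:
    # A grant is usable only while ACTIVE, for the file and requesting
    # workspace it was issued to, and at a sufficient level.
    return (
        grant.status is GrantStatus.ACTIVE
        and grant.file_id == file_id
        and grant.requesting_workspace_id == workspace_id
        and grant.level.value >= wanted.value  # assumption: DOWNLOAD implies VIEW
    )


def revoke(grant: AccessGrant, audit_log: list) -> None:
    # Flip the status and record the event; updating the related
    # DocumentPresence record is out of scope for this sketch.
    grant.status = GrantStatus.REVOKED
    audit_log.append(("GRANT_REVOKED", grant.id))
```

Checking the status on every request, rather than caching a validation result, is what lets a revocation take effect on the very next call.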

Sensitivity Levels

Files are classified by sensitivity. Each level defines a minimum role for operations:

Level	Default Access	Typical Content
`PUBLIC`	All roles	Published documents
`INTERNAL`	EDITOR+	Standard business files
`CONFIDENTIAL`	ADMIN+	KYC, financial, personal data
`RESTRICTED`	OWNER only	Board minutes, legal privilege

Cross-workspace access via grants must respect the sensitivity ceiling - an INTERNAL-level grant cannot stream a CONFIDENTIAL file.
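
The role floor and the grant ceiling are both ordered comparisons, which can be sketched as below. The VIEWER role (standing in for the lowest of "all roles") and the idea that a grant carries a sensitivity ceiling value are assumptions. The table names only EDITOR, ADMIN and OWNER.

```python
# Ordered from least to most privileged. VIEWER is a hypothetical lowest role.
ROLE_RANK = {"VIEWER": 0, "EDITOR": 1, "ADMIN": 2, "OWNER": 3}

# Ordered from least to most sensitive.
SENSITIVITY_RANK = {"PUBLIC": 0, "INTERNAL": 1, "CONFIDENTIAL": 2, "RESTRICTED": 3}

# Minimum role per level, from the table above.
MIN_ROLE = {
    "PUBLIC": "VIEWER",
    "INTERNAL": "EDITOR",
    "CONFIDENTIAL": "ADMIN",
    "RESTRICTED": "OWNER",
}


def role_can_access(role: str, sensitivity: str) -> bool:
    # A role qualifies if it ranks at or above the level's minimum role.
    return ROLE_RANK[role] >= ROLE_RANK[MIN_ROLE[sensitivity]]


def grant_can_stream(grant_ceiling: str, file_sensitivity: str) -> bool:
    # A grant may stream a file only at or below its sensitivity ceiling.
    return SENSITIVITY_RANK[file_sensitivity] <= SENSITIVITY_RANK[grant_ceiling]
```

With these tables, grant_can_stream("INTERNAL", "CONFIDENTIAL") is False, matching the example in the text.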

Global Profile Register

GlobalProfile aggregates identities at the org level. It lets an overseer build a unified view of "who is John Smith across all my client workspaces?" without ever crossing the cleartext boundary.

Endpoint	Purpose
`GET /api/oversight/identities`	Search profiles. Access follows the workspace + overseer security model.
`GET /api/oversight/identities/[id]`	Single profile with risk scoring + compliance signals + audit trail.

AI-tool surface: resolve_row_identity, get_document_presence, list_access_grants, list_workspace_grants, request_file_access, check_file_sensitivity.