Identity Resolution
Identity resolution maps workspace-specific table rows to workspace-agnostic global identity records using securely hashed identifiers. No plaintext identity data is ever stored in the global layer. This is the mechanism that makes the CSP citadel model work: an overseer firm can reason across its client base ("how many of my clients have John Smith as a director?") without any client workspace ever exposing John Smith's plaintext name to a sibling.
Three Layers
Layer 1: GlobalIdentity
Workspace-agnostic person or company.
Types: INDIVIDUAL, ENTITY.
│
Layer 2: GlobalIdentifier
Hashed identity field on a GlobalIdentity.
Types: EMAIL, PASSPORT_NUMBER, EMIRATES_ID, COMPANY_REG_NUMBER,
TAX_ID, NATIONAL_ID.
HMAC-SHA256 keyed by IDENTITY_HASH_SECRET. Versioned.
│
Layer 3: IdentityLink
Maps a workspace TableRow → a GlobalIdentity.
Includes confidence score (0-1), disputed flag, expiry.
A GlobalIdentity is the shared anchor. Multiple workspaces can each have an IdentityLink pointing at the same GlobalIdentity for their local row. The cleartext identifying data lives only in the workspace that owns the row.
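The three layers can be pictured as plain records. The following is a minimal sketch, not the actual schema: the field names (`hash_hex`, `key_version`, `table_row_id`, `expires_at`) are assumptions chosen for illustration, and only the type enums and the confidence range come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class IdentityType(Enum):
    INDIVIDUAL = "INDIVIDUAL"
    ENTITY = "ENTITY"


class IdentifierType(Enum):
    EMAIL = "EMAIL"
    PASSPORT_NUMBER = "PASSPORT_NUMBER"
    EMIRATES_ID = "EMIRATES_ID"
    COMPANY_REG_NUMBER = "COMPANY_REG_NUMBER"
    TAX_ID = "TAX_ID"
    NATIONAL_ID = "NATIONAL_ID"


@dataclass(frozen=True)
class GlobalIdentity:
    id: str
    type: IdentityType


@dataclass(frozen=True)
class GlobalIdentifier:
    global_identity_id: str
    type: IdentifierType
    hash_hex: str        # HMAC-SHA256 digest; never the plaintext value
    key_version: int     # hypothetical field: which secret version produced it


@dataclass
class IdentityLink:
    workspace_id: str
    table_row_id: str
    global_identity_id: str
    confidence: float                      # 0.0 to 1.0
    disputed: bool = False
    expires_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


# Two workspaces link their own local rows to the same shared anchor.
anchor = GlobalIdentity(id="gid-1", type=IdentityType.INDIVIDUAL)
links = [
    IdentityLink("ws-a", "row-17", anchor.id, confidence=0.9),
    IdentityLink("ws-b", "row-42", anchor.id, confidence=0.7),
]
```

Neither link carries any cleartext: each workspace keeps the name, email, and document numbers on its own row, and the global layer only sees the anchor ID.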
Resolution Flow
Row created/updated in Individuals or Companies table
│
├─ Identity resolution triggered automatically
├─ Check: is identity resolution enabled for this workspace?
│ (requires active oversight relationship)
├─ Extract identity fields by table type:
│ INDIVIDUALS: Email, Passport Number, Emirates ID
│ COMPANIES: Company Number, Tax ID
├─ Normalise each field, then hash it with HMAC-SHA256
├─ Search GlobalIdentifier for matching hashes
├─ Compute confidence score
├─ Decision:
│ High confidence: Auto-link to existing GlobalIdentity
│ Low confidence: Queue for human review
│ No match: Create new GlobalIdentity
└─ Resolution logged for audit trail
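The extract, normalise, hash, and search steps of the flow above can be sketched as follows. This is an illustration under stated assumptions: the normalisation rule (lower-case, strip whitespace), the field-name prefix in the hashed message, the in-memory `index` dict standing in for the GlobalIdentifier lookup, and the secret literal are all hypothetical.

```python
import hashlib
import hmac
from collections import Counter

# Field names per table type, taken from the flow above.
IDENTITY_FIELDS = {
    "INDIVIDUALS": ["Email", "Passport Number", "Emirates ID"],
    "COMPANIES": ["Company Number", "Tax ID"],
}


def normalise(value: str) -> str:
    # Assumed rule: remove all whitespace and lower-case the rest.
    return "".join(value.split()).lower()


def hash_field(secret: bytes, field: str, value: str) -> str:
    # Assumed: prefix the field name so equal values in different
    # fields produce different digests.
    message = f"{field}:{normalise(value)}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def find_candidates(secret: bytes, table_type: str, row: dict, index: dict) -> Counter:
    """Count matching hashed fields per GlobalIdentity id.

    `index` maps digest -> global identity id and stands in for the
    GlobalIdentifier search.
    """
    matches = Counter()
    for field in IDENTITY_FIELDS[table_type]:
        value = row.get(field)
        if not value:
            continue  # missing fields simply do not contribute
        identity_id = index.get(hash_field(secret, field, value))
        if identity_id is not None:
            matches[identity_id] += 1
    return matches


secret = b"example-only-secret"  # placeholder, not a real key
index = {
    hash_field(secret, "Email", "john@example.com"): "gid-1",
    hash_field(secret, "Passport Number", "AB1234567"): "gid-1",
}
row = {"Email": " John@Example.com ", "Passport Number": "ab 1234567"}
candidates = find_candidates(secret, "INDIVIDUALS", row, index)
```

Here both fields normalise to the values already indexed, so `candidates` reports two matching fields for `gid-1`; the match count then feeds the confidence score.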
Confidence Scoring
| Match criteria | Score | Action |
|---|---|---|
| 1 email match | 0.3 | Queued for human verification |
| 1 government ID match | 0.5 | Queued for human verification |
| 2 field matches | 0.7 | Auto-linked |
| 3+ field matches | 0.9 | Auto-linked |
| All government IDs match | 1.0 | Auto-linked |
Matches that score below the auto-link threshold (in the table above, the single-field matches at 0.3 and 0.5) go to the identity review queue at Stakeholders > Requests. ADMIN reviewers can approve, reject, or escalate.
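The scoring table can be read as a small decision function. The sketch below is one possible reading, not the actual implementation. It assumes that every identifier type other than EMAIL counts as a government ID, that "all government IDs match" means the row supplies at least two government IDs and each one matched, and that the auto-link threshold is 0.7 (the lowest score the table auto-links).

```python
AUTO_LINK_THRESHOLD = 0.7  # assumed from the table: 0.7 auto-links, 0.5 does not

# Assumption: everything except EMAIL is treated as a government ID.
GOVERNMENT_ID_FIELDS = {
    "PASSPORT_NUMBER", "EMIRATES_ID", "NATIONAL_ID",
    "COMPANY_REG_NUMBER", "TAX_ID",
}


def score(matched: set, supplied: set) -> float:
    """Map the set of matched identifier types to a confidence score."""
    gov_supplied = supplied & GOVERNMENT_ID_FIELDS
    gov_matched = matched & GOVERNMENT_ID_FIELDS
    if len(gov_supplied) >= 2 and gov_matched == gov_supplied:
        return 1.0
    if len(matched) >= 3:
        return 0.9
    if len(matched) == 2:
        return 0.7
    if len(matched) == 1:
        return 0.5 if gov_matched else 0.3
    return 0.0


def decide(confidence: float) -> str:
    """Turn a confidence score into one of the three flow outcomes."""
    if confidence == 0.0:
        return "CREATE_NEW"
    if confidence >= AUTO_LINK_THRESHOLD:
        return "AUTO_LINK"
    return "QUEUE_FOR_REVIEW"


supplied = {"EMAIL", "PASSPORT_NUMBER", "EMIRATES_ID"}
email_only = decide(score({"EMAIL"}, supplied))
two_fields = decide(score({"EMAIL", "PASSPORT_NUMBER"}, supplied))
```

With these inputs the email-only match is queued for review and the two-field match is auto-linked, matching the first and third rows of the table.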
AI-tool surface for the queue:
- get_identity_review_queue - list pending matches.
- resolve_identity_review - approve / reject from the AI side.
Hashing & Key Rotation
IDENTITY_HASH_SECRET is the HMAC key. The secret is held server-side alongside the session signing secret but is deliberately separate - reusing the session secret would conflate two unrelated trust domains.
Versioned rotation is supported:
- IDENTITY_HASH_SECRET - the active key.
- IDENTITY_HASH_SECRET_V2 - the next key during a rotation window.
- IDENTITY_HASH_SECRET_V3 - and so on.
During rotation, every GlobalIdentifier write produces hashes under all configured key versions. Verification (lookup) checks against any active version. Once the rotation window closes, the old version's hashes are decommissioned and the column is dropped.
The result is zero-downtime rotation: because writes cover every configured version, there is never a window in which some identifiers exist only in the old keyspace and others only in the new.
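A minimal sketch of the dual-write and any-version lookup described above. The integer version numbers, the dict-of-digests storage shape, and the secret literals are assumptions for illustration; in a deployment the keys would come from IDENTITY_HASH_SECRET and IDENTITY_HASH_SECRET_V2.

```python
import hashlib
import hmac


def hash_under_all_versions(keys: dict, normalised_value: str) -> dict:
    """Write path: one digest per configured key version."""
    message = normalised_value.encode("utf-8")
    return {
        version: hmac.new(key, message, hashlib.sha256).hexdigest()
        for version, key in keys.items()
    }


def matches_any_version(keys: dict, normalised_value: str, stored: dict) -> bool:
    """Lookup path: succeed if the digest under any active version matches."""
    candidate = hash_under_all_versions(keys, normalised_value)
    return any(
        version in stored and hmac.compare_digest(stored[version], digest)
        for version, digest in candidate.items()
    )


# Placeholder secrets; version 1 stands for the current key, 2 for the next.
rotating = {1: b"old-secret", 2: b"new-secret"}
new_only = {2: b"new-secret"}

# A row written during the rotation window carries both digests...
stored = hash_under_all_versions(rotating, "ab1234567")

# ...so it still resolves after the old key has been retired.
still_found = matches_any_version(new_only, "ab1234567", stored)
```

`hmac.compare_digest` is used for the comparison so that lookup time does not depend on how many leading characters of the digest match.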
Document Presence
Document presence is the discovery layer that lets an overseer workspace know a document exists for a shared identity - without revealing the document's content. Only existence, type, and expiry are shared.
File linked with KYC or EVIDENCE role to a row
│
├─ Presence trigger fires
├─ Identity link found for the row
├─ DocumentPresence record created (status = VALID)
└─ If row is a CSP stakeholder, cross-workspace entity presence refreshed
What's exposed in a presence record:
- globalIdentityId (the shared anchor, hashed under the same keyspace).
- documentClass (PASSPORT, MEMORANDUM_OF_ASSOCIATION, ...).
- expiryDate (if applicable).
- status (VALID, EXPIRED, SUPERSEDED, PENDING, REVOKED).
What's not exposed:
- The file bytes.
- The extracted field values.
- The owning workspace's internal IDs.
- Any indication of who else has presence for the same identity.
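One way to enforce the exposed / not-exposed split is an explicit allow-list projection: the presence record is built field by field, so anything not named cannot leak. This is a sketch only; the `file_record` dict and its keys are hypothetical stand-ins for the owning workspace's internal file row.

```python
from dataclasses import asdict, dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DocumentPresence:
    global_identity_id: str
    document_class: str
    expiry_date: Optional[date]
    status: str


def to_presence(file_record: dict, global_identity_id: str) -> DocumentPresence:
    """Copy only the shareable fields; new records start as VALID."""
    return DocumentPresence(
        global_identity_id=global_identity_id,
        document_class=file_record["document_class"],
        expiry_date=file_record.get("expiry_date"),
        status="VALID",
    )


# Hypothetical internal record held by the owning workspace.
file_record = {
    "file_id": "file-991",
    "workspace_id": "ws-a",
    "content": b"%PDF-...",
    "extracted_fields": {"passport_number": "AB1234567"},
    "document_class": "PASSPORT",
    "expiry_date": date(2027, 3, 31),
}

presence = to_presence(file_record, "gid-1")
exposed = asdict(presence)
```

The resulting `exposed` dict contains only the four presence fields; the file bytes, extracted values, and workspace-internal IDs never enter it.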
Access Grant Flow
When the overseer wants the actual file (not just presence), they request access:
Overseer sees presence:
"John D. has a PASSPORT (expires 2027-03)"
│
▼
POST /api/files/[fileId]/request-access
│
├─ Routes to appropriate approver
├─ Auto-approve (within configured thresholds)
└─ OR queued for ADMIN review
│
▼
On approval: AccessGrant created
│
▼
GET /api/files/shared/[fileId]?grantId=...
→ validates grant
→ streams file bytes
AccessGrant carries:
- The file ID, requesting workspace, custodian workspace.
- level: VIEW or DOWNLOAD.
- status: ACTIVE, REVOKED, EXPIRED.
Grant lifecycle is fully audit-logged. Revocation cascades to DocumentPresence status updates - if a grant is revoked, the presence record's status reflects that immediately (no caching, no delay).
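The "validates grant" step and the audit-logged revocation can be sketched as below. This is an illustration under assumptions: the field names, the rule that DOWNLOAD implies VIEW, and the audit tuple format are hypothetical, and the cascade to DocumentPresence is not modelled.

```python
from dataclasses import dataclass

LEVEL_RANK = {"VIEW": 1, "DOWNLOAD": 2}  # assumed: DOWNLOAD also permits VIEW


@dataclass
class AccessGrant:
    id: str
    file_id: str
    requesting_workspace: str
    custodian_workspace: str
    level: str               # VIEW or DOWNLOAD
    status: str = "ACTIVE"   # ACTIVE, REVOKED, EXPIRED


def validate_grant(grant: AccessGrant, file_id: str, workspace: str, needed: str) -> bool:
    """Return True only for an active grant on this file, held by this
    workspace, at a sufficient level."""
    return (
        grant.status == "ACTIVE"
        and grant.file_id == file_id
        and grant.requesting_workspace == workspace
        and LEVEL_RANK[grant.level] >= LEVEL_RANK[needed]
    )


def revoke(grant: AccessGrant, audit_log: list) -> None:
    """Mark the grant revoked and record the lifecycle event."""
    grant.status = "REVOKED"
    audit_log.append(("GRANT_REVOKED", grant.id))


audit_log = []
grant = AccessGrant("g-1", "file-991", "ws-overseer", "ws-a", level="VIEW")

before = validate_grant(grant, "file-991", "ws-overseer", "VIEW")
revoke(grant, audit_log)
after = validate_grant(grant, "file-991", "ws-overseer", "VIEW")
```

The same check fails for a different file, a different requesting workspace, or a DOWNLOAD request against a VIEW grant, so a grant ID alone is never sufficient to stream bytes.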
Sensitivity Levels
Files are classified by sensitivity. Each level defines a minimum role for operations:
| Level | Default Access | Typical Content |
|---|---|---|
| PUBLIC | All roles | Published documents |
| INTERNAL | EDITOR+ | Standard business files |
| CONFIDENTIAL | ADMIN+ | KYC, financial, personal data |
| RESTRICTED | OWNER only | Board minutes, legal privilege |
Cross-workspace access via grants must respect the sensitivity ceiling - an INTERNAL-level grant cannot stream a CONFIDENTIAL file.
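Both checks, minimum role within a workspace and sensitivity ceiling across workspaces, reduce to rank comparisons. The sketch below assumes a strict role ordering and introduces a hypothetical lowest role, VIEWER, to stand for "All roles"; neither is confirmed by the table.

```python
# Assumed ordering; VIEWER is a hypothetical lowest role standing in for "All roles".
ROLE_RANK = {"VIEWER": 0, "EDITOR": 1, "ADMIN": 2, "OWNER": 3}

# Minimum role per sensitivity level, read off the table above.
MIN_ROLE = {
    "PUBLIC": "VIEWER",
    "INTERNAL": "EDITOR",
    "CONFIDENTIAL": "ADMIN",
    "RESTRICTED": "OWNER",
}

SENSITIVITY_RANK = {"PUBLIC": 0, "INTERNAL": 1, "CONFIDENTIAL": 2, "RESTRICTED": 3}


def role_can_access(role: str, file_level: str) -> bool:
    """In-workspace check: does the role meet the level's minimum?"""
    return ROLE_RANK[role] >= ROLE_RANK[MIN_ROLE[file_level]]


def within_ceiling(grant_ceiling: str, file_level: str) -> bool:
    """Cross-workspace check: a grant never reaches above its ceiling."""
    return SENSITIVITY_RANK[file_level] <= SENSITIVITY_RANK[grant_ceiling]


editor_reads_kyc = role_can_access("EDITOR", "CONFIDENTIAL")
internal_grant_streams_kyc = within_ceiling("INTERNAL", "CONFIDENTIAL")
```

Both results are `False`: an EDITOR is below the ADMIN minimum for CONFIDENTIAL files, and an INTERNAL ceiling sits below CONFIDENTIAL.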
Global Profile Register
GlobalProfile aggregates identities at the org level. It lets an overseer build a unified view of "who is John Smith across all my client workspaces?" without ever crossing the cleartext boundary.
| Endpoint | Purpose |
|---|---|
| GET /api/oversight/identities | Search profiles. Workspace + overseer security model. |
| GET /api/oversight/identities/[id] | Single profile with risk scoring + compliance signals + audit trail. |
AI-tool surface: resolve_row_identity, get_document_presence, list_access_grants, list_workspace_grants, request_file_access, check_file_sensitivity.
See Also
- Intelligence Pipeline - the broader pipeline this slots into.
- Document Extraction - how identifiers come into the system.