
Project Handbook

This handbook documents the repository as it exists in code today.

  • Repository root: securevault/
  • Active application: secure-vault
  • Stack: Next.js App Router, React 19, TypeScript, Drizzle ORM, MariaDB, Redis, Cloudflare R2, optional Gemini-powered semantic indexing

How To Read This Document

  • Part 1 is written for non-technical readers, product stakeholders, demo reviewers, and new teammates.
  • Part 2 is written for engineers who need to understand the implementation, architecture boundaries, data flow, and operational dependencies.
  • Shared-preview security and browser-copying limits live in Shared Preview Protection.
  • The dedicated HTTP API reference lives in API Reference.
  • Docker and Compose runtime notes live in Docker and Compose.
  • Playwright execution and case coverage live in Playwright Coverage.

Part 1: Simplified Overview For Non-Technical Readers

What SecureVault Is

SecureVault is a secure file storage application. It lets people upload files, organize them into folders, preview and download them later, share them with other people through links, recover deleted items from trash, and optionally search files using semantic AI indexing.

The application is designed around four core promises:

  • files should stay private and encrypted
  • each user should only see their own data
  • shared access should be explicit and controllable
  • AI features should be additive, not required for normal file storage

Important scope note:

  • this is server-managed encryption, not end-to-end encryption
  • the server decrypts files for authorized preview, download, sharing, and indexing operations
  • the goal is strong encrypted-at-rest handling and scoped access control inside the application boundary

What A User Can Do Today

  • create an account and sign in
  • upload files through a resumable chunked upload flow
  • organize files into folders
  • rename, move, delete, and restore files and folders
  • preview and download files
  • generate share links for files or folders
  • protect shared links by allowed email plus OTP verification
  • see storage usage and biggest files
  • see activity such as uploads, share creation, share revocation, and share access
  • reset a forgotten password with email OTP
  • optionally run semantic indexing for PDFs and images, then search indexed content semantically

What The App Looks Like At A High Level

Core User Journeys

1. Sign In And Enter The Workspace

  • Users log in or sign up through dedicated auth pages.
  • Once authenticated, they enter the dashboard workspace.
  • The main working areas are Files, Storage, Trash, Activity, and Settings.

2. Upload Files

  • The browser splits a file into chunks.
  • The server creates an upload session and a file record.
  • Each chunk is encrypted and stored in object storage.
  • When all chunks are done, the file becomes available in the file library.
  • If semantic indexing is enabled and the file is eligible, indexing starts after the upload finishes.

3. Share Files Or Folders

  • A signed-in owner can create a share link for a file or folder.
  • A link can be public or restricted to specific email addresses.
  • Restricted links require an OTP sent to an allowed email address.
  • Access events are logged so the owner can see when a shared link was used.
  • Shared previews use layered deterrents to reduce casual saving and inspection, but verified viewers can still capture what appears on screen.

4. Delete And Recover

  • Delete actions move files and folders into trash instead of destroying them immediately.
  • Trashed items can be restored.
  • Permanent deletion removes metadata and stored chunks and reclaims storage usage.

Simple Workflow Diagram

Why This Matters

SecureVault is not just a simple file browser. The project combines:

  • secure authentication
  • encrypted-at-rest file handling
  • controlled sharing
  • quota and lifecycle management
  • optional AI search

That makes it suitable as a strong demo for a secure storage product, especially for a hackathon or portfolio setting where both user value and engineering depth matter.

Part 2: Technical Reference For Engineers

Repository Structure

| Path | Purpose |
| --- | --- |
| secure-vault/src/app | Next.js route groups, layouts, pages, API route handlers |
| secure-vault/src/components | UI and page-level client components |
| secure-vault/src/hooks | React Query hooks and upload queue hooks |
| secure-vault/src/lib/auth | session, cookies, current-user loading, password reset, request metadata |
| secure-vault/src/lib/crypto | AES helpers, key hierarchy, stream crypto, filename sanitization |
| secure-vault/src/lib/db | Drizzle connection, schema, CRUD helpers |
| secure-vault/src/lib/files | file explorer and storage dashboard query logic |
| secure-vault/src/lib/sharing | share link lifecycle, OTP flow, share access session |
| secure-vault/src/lib/storage | chunking helpers and Cloudflare R2 integration |
| secure-vault/src/lib/upload | browser upload scheduler and upload concurrency coordination |
| secure-vault/src/lib/ai | semantic indexing config, queueing, worker, providers |
| secure-vault/src/lib/search | filename search and hybrid semantic search |
| secure-vault/tests | unit, component, integration, and Playwright end-to-end coverage |
| compose.yaml | local MariaDB and Redis services plus optional app and worker containers |

Runtime Architecture

SecureVault is implemented as a Next.js monolith with clear internal service boundaries:

  • server-rendered pages and layouts handle auth gating and initial data loading
  • client components handle rich workspace interaction
  • server actions cover authenticated dashboard mutations
  • route handlers cover JSON APIs, streaming downloads, uploads, sharing, and cron endpoints
  • MariaDB stores all durable relational state
  • Cloudflare R2 stores encrypted file chunks and thumbnails
  • Redis backs rate limiting and optional queued background work

Why MariaDB Is Central To The Submission

MariaDB is one of the project's strengths, not just a dependency.

  • It carries the core operational model: users, sessions, folders, files, upload sessions, shares, access logs, quotas, trash, and embedding jobs all live in one relational system.
  • It supports workflows that benefit from transactional behavior, including OTP reset consumption, session invalidation, share governance, and upload finalization.
  • It also supports one of the differentiators of the project: semantic retrieval, where vectors and chunk metadata are stored in MariaDB and ranked there before being returned to the UI.
  • That makes the MariaDB story reviewer-friendly: the database powers both the reliable product backbone and the more novel search capability.

Frontend Architecture

The user-facing app is primarily split into:

  • (auth) for login, signup, forgot password, and reset password
  • (dashboard) for the authenticated workspace
  • s/[token] for public or restricted shared-link access

Key frontend patterns:

  • src/app/providers.tsx wires QueryClientProvider and UploadQueueProvider
  • dashboard layout uses getCurrentUser() and redirects unauthenticated users to /login
  • page routes load initial data on the server, then hand off to client page-content components
  • React Query hooks keep explorer, trash, storage, and current-user state in sync
  • the upload queue is a singleton state machine exposed through context plus useSyncExternalStore
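
The singleton-plus-useSyncExternalStore pattern in the last bullet can be sketched as follows. This is a minimal illustration, not the project's actual store: the class name, item shape, and method names are hypothetical. It only shows the two functions React's useSyncExternalStore expects (a subscribe function and a snapshot getter that returns a stable reference until state changes).

```typescript
// Hypothetical, simplified upload queue store; all names are illustrative.
type UploadStatus = "queued" | "uploading" | "done" | "failed";

interface UploadItem {
  id: string;
  fileName: string;
  status: UploadStatus;
}

type Listener = () => void;

class UploadQueueStore {
  private items: readonly UploadItem[] = [];
  private listeners = new Set<Listener>();

  // Shape expected by useSyncExternalStore's first argument:
  // registers a listener and returns an unsubscribe function.
  subscribe = (listener: Listener): (() => void) => {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  };

  // Shape expected by useSyncExternalStore's second argument:
  // returns the same array reference until the state is replaced.
  getSnapshot = (): readonly UploadItem[] => this.items;

  enqueue(id: string, fileName: string): void {
    this.replace([...this.items, { id, fileName, status: "queued" }]);
  }

  setStatus(id: string, status: UploadStatus): void {
    this.replace(
      this.items.map((item) => (item.id === id ? { ...item, status } : item)),
    );
  }

  // Swaps in a new immutable array, then notifies every subscriber.
  private replace(next: readonly UploadItem[]): void {
    this.items = next;
    this.listeners.forEach((listener) => listener());
  }
}

// Module-level singleton shared by every consumer.
const uploadQueueStore = new UploadQueueStore();
```

A client component could then read the queue with `useSyncExternalStore(uploadQueueStore.subscribe, uploadQueueStore.getSnapshot)`, optionally behind a context provider so tests can swap the store.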

Important current-state note:

  • / is a completed product-facing landing page for reviewers and first-time users
  • the real product entry point is the authenticated dashboard under /files, /activity, /storage, /settings, and /trash
  • for demos and onboarding, treat / as the public entry point and /files as the authenticated workspace home

Authentication And Session Model

Current auth behavior is implemented in server actions and shared auth utilities.

  • login and signup are server actions
  • passwords are hashed with Argon2
  • the app stores hashed session and refresh tokens in MariaDB
  • auth cookies are __Secure-session and __Secure-refresh
  • current-user resolution happens server-side through the session cookie
  • dashboard protection currently comes from layout-level user checks rather than a top-level middleware file
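
As a rough sketch of the "store hashed tokens, send raw tokens in cookies" idea above: the cookie name and the httpOnly, secure, and sameSite=strict attributes come from this handbook, while SHA-256 as the token hash, the 32-byte token length, and the function names are assumptions made for illustration.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Cookie name taken from the handbook; the rest of this sketch is illustrative.
const SESSION_COOKIE = "__Secure-session";

interface IssuedSession {
  rawToken: string; // sent to the browser only, never persisted
  tokenHash: string; // the value that would be stored in the sessions table
  setCookie: string; // value for the Set-Cookie response header
}

// Assumption: tokens are hashed with SHA-256 before storage.
function hashToken(rawToken: string): string {
  return createHash("sha256").update(rawToken).digest("hex");
}

function issueSession(maxAgeSeconds: number): IssuedSession {
  // 32 random bytes, encoded so the value is safe inside a cookie.
  const rawToken = randomBytes(32).toString("base64url");
  const setCookie = [
    `${SESSION_COOKIE}=${rawToken}`,
    "Path=/",
    `Max-Age=${maxAgeSeconds}`,
    "HttpOnly",
    "Secure",
    "SameSite=Strict",
  ].join("; ");
  return { rawToken, tokenHash: hashToken(rawToken), setCookie };
}
```

Resolving the current user then amounts to hashing the incoming cookie value and looking up the matching row, so a leaked database dump does not reveal usable session tokens.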

Password reset is exposed through route handlers rather than server actions:

  • POST /api/auth/password-reset/request-otp
  • POST /api/auth/password-reset/reset

The reset implementation uses:

  • email normalization
  • OTP hashing
  • attempt counting
  • short-lived OTP validity windows
  • transaction-level protection against concurrent token consumption
  • full session invalidation after password reset

Encryption And File Security

The application uses a three-tier key hierarchy:

  • master key from MASTER_ENCRYPTION_KEY
  • one user encryption key per user
  • one file encryption key per file

Implementation details confirmed in code:

  • the master key must be a 64-character hex string
  • UEKs and FEKs are generated as 32-byte random values
  • UEKs are encrypted with the master key
  • FEKs are encrypted with the owning user UEK
  • file chunks store per-chunk IV and auth tag metadata in file_chunks
  • shared downloads never expose raw storage keys to the client
  • this is application-managed encryption at rest; the app server performs decrypt operations for authorized reads
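
The key hierarchy above can be illustrated with a small key-wrapping sketch. It assumes AES-256-GCM with a 12-byte IV for wrapping, which fits the IV and auth tag metadata mentioned above but is not confirmed for key wrapping specifically; the function and type names are hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface WrappedKey {
  iv: Buffer;
  authTag: Buffer;
  ciphertext: Buffer;
}

// The handbook states the master key must be a 64-character hex string (32 bytes).
function parseMasterKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) {
    throw new Error("MASTER_ENCRYPTION_KEY must be a 64-character hex string");
  }
  return Buffer.from(hex, "hex");
}

// Encrypts one key under another (assumed: AES-256-GCM, random 12-byte IV).
function wrapKey(plainKey: Buffer, wrappingKey: Buffer): WrappedKey {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", wrappingKey, iv);
  const ciphertext = Buffer.concat([cipher.update(plainKey), cipher.final()]);
  return { iv, authTag: cipher.getAuthTag(), ciphertext };
}

// Decrypts a wrapped key; throws if the auth tag does not verify.
function unwrapKey(wrapped: WrappedKey, wrappingKey: Buffer): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", wrappingKey, wrapped.iv);
  decipher.setAuthTag(wrapped.authTag);
  return Buffer.concat([decipher.update(wrapped.ciphertext), decipher.final()]);
}

// Three tiers: master key wraps the UEK, the UEK wraps the FEK.
function createKeyChain(masterKey: Buffer): { wrappedUek: WrappedKey; wrappedFek: WrappedKey } {
  const uek = randomBytes(32);
  const fek = randomBytes(32);
  return { wrappedUek: wrapKey(uek, masterKey), wrappedFek: wrapKey(fek, uek) };
}
```

Reading a file reverses the chain: unwrap the UEK with the master key, unwrap the FEK with the UEK, then decrypt each chunk with the FEK and that chunk's own IV and auth tag.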

Upload Architecture

The upload path is one of the most important parts of the app.

Client-side:

  • files are split into chunks by src/lib/storage/chunker.ts
  • UploadManager schedules a bounded number of active uploads
  • UploadJob handles init, resume, chunk upload, completion, and semantic follow-up
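
To make the chunking step concrete, here is a minimal sketch of how byte ranges could be planned for a file, using the 5 MiB chunk size documented in this handbook. The function and type names are hypothetical and do not reflect the actual exports of src/lib/storage/chunker.ts.

```typescript
const CHUNK_SIZE_BYTES = 5 * 1024 * 1024; // 5 MiB, per the documented limit

interface ChunkRange {
  index: number;
  start: number; // inclusive byte offset
  end: number; // exclusive byte offset
}

// Splits [0, fileSize) into consecutive ranges of at most chunkSize bytes.
function planChunks(fileSize: number, chunkSize: number = CHUNK_SIZE_BYTES): ChunkRange[] {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const ranges: ChunkRange[] = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index += 1) {
    ranges.push({ index, start, end: Math.min(start + chunkSize, fileSize) });
  }
  return ranges;
}
```

In the browser, each range maps directly onto `file.slice(range.start, range.end)`, so chunks can be read lazily rather than loading the whole file into memory.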

Server-side:

  • POST /api/upload/init creates upload state and file metadata
  • GET /api/upload/status supports resume
  • POST /api/upload/start claims a concurrency slot
  • POST /api/upload/chunk uploads one chunk
  • POST /api/upload/complete finalizes the upload
  • POST /api/upload/release releases the concurrency slot
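
One plausible ordering of these endpoints for a single file is sketched below. The paths and methods come from the list above; the exact sequence (for example, claiming the slot after init and releasing it after completion) and the helper name are assumptions, and request payloads are deliberately left out because they are not documented here.

```typescript
interface UploadStep {
  method: "GET" | "POST";
  path: string;
  chunkIndex?: number;
}

// Builds the assumed request sequence for one upload.
// `uploadedChunks` lists chunk indexes the server already has (resume case).
function planUploadSteps(
  totalChunks: number,
  uploadedChunks: ReadonlySet<number> = new Set<number>(),
  resuming: boolean = false,
): UploadStep[] {
  const steps: UploadStep[] = [];
  // A fresh upload creates state; a resumed one asks what is already stored.
  if (resuming) {
    steps.push({ method: "GET", path: "/api/upload/status" });
  } else {
    steps.push({ method: "POST", path: "/api/upload/init" });
  }
  steps.push({ method: "POST", path: "/api/upload/start" });
  for (let i = 0; i < totalChunks; i += 1) {
    if (!uploadedChunks.has(i)) {
      steps.push({ method: "POST", path: "/api/upload/chunk", chunkIndex: i });
    }
  }
  steps.push({ method: "POST", path: "/api/upload/complete" });
  steps.push({ method: "POST", path: "/api/upload/release" });
  return steps;
}
```

A real client would also release the slot when an upload fails or is cancelled, so that a stuck job does not hold one of the user's concurrency slots.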

Upload coordination details:

  • upload concurrency is server-aware, not just a UI convenience
  • active upload slots are claimed and released explicitly
  • retries honor Retry-After where available
  • if Redis is disabled in local development, coordination can fall back to a no-op adapter
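
Honoring Retry-After can be sketched as below. The header accepts either a number of seconds or an HTTP date; the exponential fallback (1 second doubling up to 30 seconds) and the function name are assumptions for illustration, not the project's actual retry policy.

```typescript
// Returns how long to wait before retrying, in milliseconds.
// `retryAfter` is the raw Retry-After header value, or null when absent.
function retryDelayMs(retryAfter: string | null, attempt: number, now: number = Date.now()): number {
  // Assumed fallback: exponential backoff capped at 30 seconds.
  const fallback = Math.min(1000 * 2 ** attempt, 30_000);
  if (retryAfter === null) return fallback;

  const trimmed = retryAfter.trim();
  // Form 1: delay in whole seconds, e.g. "120".
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;

  // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
  const retryAt = Date.parse(trimmed);
  if (Number.isNaN(retryAt)) return fallback;
  return Math.max(0, retryAt - now);
}
```

With fetch, the header value would come from `response.headers.get("Retry-After")`, which already returns null when the server did not send one.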

Operational limits confirmed in code:

| Limit | Current value |
| --- | --- |
| Upload chunk size | 5 MiB |
| Maximum file upload size | 100 MiB |
| Maximum active uploads per user | 3 |
| Upload session expiry | 24 hours |
| Default storage quota | 1 GiB |
| Trash retention | 30 days |
| PDF semantic indexing size cap | 10 MiB |
| Eligible image types for semantic indexing | JPEG, PNG, WEBP, GIF, AVIF |
| Eligible document type for upload and indexing | PDF |

File, Folder, Trash, And Storage Model

The file explorer is metadata-first and user-scoped.

  • files are only listed when status = ready and deleted_at is null
  • folders are user-scoped and soft-deletable
  • trash is implemented as soft deletion on both files and folders
  • permanent deletion purges rows and attempts to delete related R2 objects
  • storage dashboards count active files separately from trash, but used quota is tracked on the user record, so trashed items keep counting against it until they are permanently deleted

Trash behavior:

  • deleting a folder cascades soft deletion to descendants and contained files
  • restoring a child requires its parent folder to be restored first
  • permanent deletion also removes share links in scope
  • cron cleanup can purge expired trash and stale uploads
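
The restore and retention rules above can be expressed as two small predicates. This is a sketch over in-memory rows: the row shapes and function names are hypothetical, and only the 30-day retention period is taken from this handbook.

```typescript
const TRASH_RETENTION_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, per the documented limit

interface FolderRow {
  id: string;
  parentId: string | null;
  deletedAt: Date | null;
}

// A trashed file or folder; `folderId` is its containing folder (null at root).
interface TrashedItem {
  folderId: string | null;
  deletedAt: Date | null;
}

// An item can be restored only if it is in trash and its parent is not.
function canRestore(item: TrashedItem, folders: ReadonlyMap<string, FolderRow>): boolean {
  if (item.deletedAt === null) return false; // not in trash
  if (item.folderId === null) return true; // root-level item has no parent to check
  const parent = folders.get(item.folderId);
  return parent !== undefined && parent.deletedAt === null;
}

// True once an item has sat in trash for the full retention period.
function isPurgeDue(deletedAt: Date, now: Date): boolean {
  return now.getTime() - deletedAt.getTime() >= TRASH_RETENTION_MS;
}
```

A cron cleanup pass could select rows where `isPurgeDue` holds and hand them to the same permanent-deletion path a user triggers manually.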

Sharing Model

Sharing is implemented for both individual files and folders.

  • a share link targets either file_id or folder_id
  • links can be public or restricted to an allowlist of emails
  • restricted links require OTP verification
  • verified share access is stored in a share access session
  • access events are logged
  • download counts can be capped
  • share-link creation warns owners that restricted previews are tied to allowed emails, while screen capture remains possible
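
The checks a share request passes through can be sketched as a single decision function. The field names, the decision labels, and the order of checks are assumptions for illustration; the real share_links schema and session handling may differ.

```typescript
// Hypothetical in-memory view of a share link row.
interface ShareLink {
  restricted: boolean;
  allowedEmails: readonly string[];
  expiresAt: Date | null;
  revokedAt: Date | null;
  maxDownloads: number | null;
  downloadCount: number;
}

type ShareDecision = "allow" | "revoked" | "expired" | "otp_required" | "download_limit_reached";

// `verifiedEmail` is the email proven via OTP in the share access session, if any.
function evaluateShareAccess(
  link: ShareLink,
  verifiedEmail: string | null,
  now: Date,
  wantsDownload: boolean,
): ShareDecision {
  if (link.revokedAt !== null) return "revoked";
  if (link.expiresAt !== null && now.getTime() >= link.expiresAt.getTime()) return "expired";
  if (link.restricted) {
    const email = verifiedEmail === null ? "" : verifiedEmail.trim().toLowerCase();
    const allowed = link.allowedEmails.map((entry) => entry.trim().toLowerCase());
    if (email === "" || !allowed.includes(email)) return "otp_required";
  }
  if (wantsDownload && link.maxDownloads !== null && link.downloadCount >= link.maxDownloads) {
    return "download_limit_reached";
  }
  return "allow";
}
```

Running the same function for both preview and download routes keeps the rules in one place, which matches the handbook's point that validation is repeated before any bytes are returned.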

Shared Preview Protection Model

Shared previews are protected by layered controls rather than a promise of impossible copying.

  • /s/{token} is server-rendered after share-token, expiry, revocation, and restricted-session checks
  • preview and download API routes repeat token/session/folder-scope validation before returning bytes
  • restricted links use email allowlists plus OTP to tie access to known recipients
  • browser-facing preview responses are no-store and include inline, same-origin, no-referrer, nosniff, and noindex/noarchive headers
  • shared image and PDF page previews use a protected CSS-background renderer instead of native image elements
  • shared pages block common right-click, save, source, and DevTools keyboard shortcuts as a deterrent
  • shared PDF preview serves rendered WebP page images instead of exposing the original PDF for inline viewing

The limitation is explicit: if a verified viewer can see a preview, they can still take a screenshot or use tools outside the page's control. For the detailed threat model and engineering checklist, read Shared Preview Protection.
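
As an illustration of the response headers described above, the sketch below builds a header set for a shared preview response. The Cache-Control value is stated in this handbook; mapping "inline", "same-origin", "no-referrer", "nosniff", and "noindex/noarchive" onto these specific standard headers is an assumption, as is the function name.

```typescript
// Builds headers for a browser-facing shared preview response.
// The header choices are an assumed mapping of the handbook's description.
function buildSharedPreviewHeaders(contentType: string): Headers {
  return new Headers({
    "Content-Type": contentType,
    // Never stored by browsers or shared caches.
    "Cache-Control": "private, no-store",
    // Render in the page instead of prompting a save dialog.
    "Content-Disposition": "inline",
    // Other origins may not embed the resource.
    "Cross-Origin-Resource-Policy": "same-origin",
    // The share URL is not leaked through the Referer header.
    "Referrer-Policy": "no-referrer",
    // Browsers must not guess a different content type.
    "X-Content-Type-Options": "nosniff",
    // Crawlers should neither index nor archive the response.
    "X-Robots-Tag": "noindex, noarchive",
  });
}
```

In a route handler the result can be passed straight to `new Response(body, { headers })`. These headers are deterrents and hygiene only, consistent with the limitation stated above.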

Shared PDF Image Preview Architecture

Shared PDF preview is intentionally different from owned preview.

  • owned preview continues to use the existing authenticated file preview route
  • shared PDF preview uses dedicated manifest and page-image routes
  • visitors receive rendered image/webp pages instead of original PDF bytes
  • authorization happens before any cache lookup
  • server-side caching is layered so repeated shared page requests stay fast without weakening access control

The layered cache model is:

  • Redis stores a short-lived page-response cache keyed by share token, file, page, and render version
  • R2 plus pdf_preview_pages stores durable encrypted rendered page derivatives
  • original encrypted PDF chunks remain the source of truth when a derivative does not exist yet

Important implementation rules:

  • the browser-facing response remains Cache-Control: private, no-store
  • Redis is an optimization, not an authority source
  • revoked, expired, or unauthorized requests fail before Redis or R2 can be used
  • the Redis key includes the share token, so different links to the same file do not share hot-cache entries
  • Redis TTL is bounded by min(remaining share-link lifetime, 24 hours)
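
The key and TTL rules above translate into two small helpers. The key components and the min(remaining lifetime, 24 hours) bound come from this handbook; the key prefix, separator, function names, and the treatment of links without an expiry are assumptions.

```typescript
const MAX_CACHE_TTL_SECONDS = 24 * 60 * 60; // 24-hour upper bound

interface PageCacheKeyParts {
  shareToken: string;
  fileId: string;
  page: number;
  renderVersion: number;
}

// Including the share token keeps hot-cache entries separate per link.
// The "shared-pdf-page" prefix and ":" separator are illustrative.
function buildPageCacheKey(parts: PageCacheKeyParts): string {
  return [
    "shared-pdf-page",
    parts.shareToken,
    parts.fileId,
    String(parts.page),
    `v${parts.renderVersion}`,
  ].join(":");
}

// TTL = min(remaining share-link lifetime, 24 hours), never negative.
// Assumption: a link without an expiry simply gets the 24-hour cap.
function cacheTtlSeconds(shareExpiresAt: Date | null, now: Date): number {
  if (shareExpiresAt === null) return MAX_CACHE_TTL_SECONDS;
  const remaining = Math.floor((shareExpiresAt.getTime() - now.getTime()) / 1000);
  return Math.max(0, Math.min(remaining, MAX_CACHE_TTL_SECONDS));
}
```

A TTL of 0 would mean "do not cache". Both helpers should only run after authorization succeeds, so a revoked or expired link never reaches the cache lookup.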

For engineers, the main code paths are:

Semantic Indexing And Search

Semantic indexing is optional and environment-gated.

  • the browser triggers indexing only after upload success
  • only eligible PDFs and image types are indexed
  • indexing supports inline and queued execution modes
  • queued execution requires Redis
  • status is tracked per file and modality in embedding_jobs
  • vectors and encrypted extracted text are stored in embedding_chunks

Search modes:

  • filename search: simple scoped metadata search
  • semantic search: query embedding plus hybrid ranking across the user’s own indexed chunks

Behavioral expectations:

  • semantic indexing is optional and never blocks a successful upload from becoming downloadable
  • if indexing is disabled, skipped, or fails, the file still remains usable in normal storage flows
  • current indexing eligibility is limited to PDFs and selected image MIME types
  • PDFs larger than 10 MiB are stored normally but skipped for semantic indexing
  • the schema supports encrypted extracted text fields alongside vectors, but the core user-facing contract is semantic retrieval rather than exposing raw extracted text directly
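
The eligibility rules above can be summarized in one decision function. The MIME list and the 10 MiB PDF cap come from this handbook's limits; the function name, the result shape, and the reason labels are hypothetical.

```typescript
const PDF_INDEXING_MAX_BYTES = 10 * 1024 * 1024; // 10 MiB cap for PDFs

// MIME types for the documented eligible image formats.
const INDEXABLE_IMAGE_TYPES: ReadonlySet<string> = new Set([
  "image/jpeg",
  "image/png",
  "image/webp",
  "image/gif",
  "image/avif",
]);

type IndexingDecision =
  | { index: true; modality: "pdf" | "image" }
  | { index: false; reason: "disabled" | "unsupported_type" | "pdf_too_large" };

// Decides whether a finished upload should be sent to semantic indexing.
// A negative decision never affects normal storage, preview, or download.
function decideIndexing(mimeType: string, sizeBytes: number, enabled: boolean): IndexingDecision {
  if (!enabled) return { index: false, reason: "disabled" };
  const type = mimeType.trim().toLowerCase();
  if (type === "application/pdf") {
    // PDFs larger than the cap are stored normally but skipped here.
    return sizeBytes > PDF_INDEXING_MAX_BYTES
      ? { index: false, reason: "pdf_too_large" }
      : { index: true, modality: "pdf" };
  }
  if (INDEXABLE_IMAGE_TYPES.has(type)) return { index: true, modality: "image" };
  return { index: false, reason: "unsupported_type" };
}
```

Here `enabled` would reflect the SEMANTIC_INDEXING_ENABLED setting; keeping the decision separate from the upload path mirrors the rule that indexing never blocks a file from becoming downloadable.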

Data Model

Primary tables:

| Table | Purpose |
| --- | --- |
| users | account identity, password hash, encrypted UEK, quota counters |
| sessions | hashed session and refresh tokens plus device metadata |
| folders | user-owned hierarchical folders |
| files | file metadata, encrypted FEK, lifecycle state, thumbnails, trash state |
| file_chunks | one row per stored chunk with R2 key and crypto metadata |
| upload_sessions | resumable upload state and expiry |
| share_links | public or restricted share links |
| share_link_emails | allowlisted emails for restricted links |
| share_link_otps | OTP challenge state for restricted share access |
| share_link_access_logs | share access audit-style records |
| file_versions | versioning scaffold for future evolution |
| embedding_jobs | per-file semantic indexing job state |
| embedding_chunks | chunk text metadata and vector embeddings |
| password_reset_tokens | password-reset OTP state |
| email_verification_tokens | schema support for verification flow |

Infrastructure And Dependencies

Local Compose services:

  • MariaDB 12
  • Redis 8
  • optional web container
  • optional worker container for the embeddings worker

Environment split:

  • host-run app development uses secure-vault/.env.local
  • containerized Compose app runs use the repo-root .env

Important environment variables:

| Group | Variables |
| --- | --- |
| Database | DATABASE_HOST, DATABASE_PORT, DATABASE_NAME, DATABASE_USER, DATABASE_PASSWORD, optional DATABASE_SSL_MODE, DATABASE_SSL_CA, DATABASE_SSL_CERT, DATABASE_SSL_KEY |
| Encryption | MASTER_ENCRYPTION_KEY |
| Object storage | R2_ACCOUNT_ID, R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME |
| Redis | REDIS_URL, DISABLE_REDIS |
| Email | RESEND_API_KEY, NEXT_PUBLIC_APP_URL |
| Cron | CRON_SECRET |
| Semantic indexing | SEMANTIC_INDEXING_ENABLED, SEMANTIC_INDEXING_EXECUTION_MODE, SEMANTIC_INDEXING_PROVIDER, GEMINI_API_KEY, related tuning variables |

Security Controls Confirmed In Code

  • signed-in dashboard access is enforced server-side
  • auth cookies are httpOnly, sameSite=strict, and secure
  • session and refresh tokens are stored hashed, not raw
  • when database TLS is enabled through DATABASE_SSL_MODE, certificate verification stays on instead of disabling trust checks
  • rate limiting exists for login, signup, password reset, sharing OTP, uploads, and downloads
  • CSP and other security headers are configured in next.config.ts
  • filenames are sanitized before persistence
  • sharing checks are owner-scoped and token-scoped
  • semantic search is user-scoped before returning ranked results

Testing Strategy

The repository has meaningful automated coverage across:

  • unit tests for crypto, auth, rate limiting, sharing, upload logic, search, and semantic indexing
  • component tests for dashboard, files, trash, activity, and auth pages
  • route tests for uploads, sharing, download, password reset, search, cron, and embeddings
  • Playwright end-to-end flows for upload, sharing, semantic indexing, password reset, trash, storage search, and activity

Current Implementation Notes

These are important for anyone onboarding:

  • the dashboard experience is real, and / is now the completed landing page that routes users into the live product
  • email verification is modeled in the schema and UI, but new signups are currently auto-verified as a hackathon shortcut; a fuller production flow would send a verification message through Resend or a similar provider before activating the account
  • refresh token utilities exist, but the active route surface is centered on session-cookie validation
  • the semantic indexing subsystem is feature-gated and can be unavailable without the correct environment

Related documents and code paths:

  1. README.md
  2. Shared Preview Protection
  3. API Reference
  4. Docker and Compose
  5. Playwright Coverage
  6. secure-vault/src/app
  7. secure-vault/src/lib/auth
  8. secure-vault/src/lib/upload
  9. secure-vault/src/lib/sharing
  10. secure-vault/src/lib/ai
  11. secure-vault/src/lib/db/schema

Built with VitePress and deployed through GitHub Pages.