Shrutik Architecture Overview
This document provides a comprehensive overview of Shrutik’s system architecture, design principles, and technical decisions.
System Architecture
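Shrutik follows a layered, service-oriented architecture inspired by microservices: requests pass through an Nginx reverse proxy to a FastAPI backend, Celery workers handle background jobs, and domain services sit on top of PostgreSQL, Redis, and file storage.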
High-Level Architecture
graph TB
subgraph "Presentation Layer"
WEB[React Frontend]
MOBILE[Mobile App]
API_DOCS[API Documentation]
end
subgraph "API Gateway Layer"
NGINX[Nginx Reverse Proxy]
RATE_LIMIT[Rate Limiting]
AUTH_MW[Authentication Middleware]
end
subgraph "Application Layer"
API[FastAPI Backend]
WORKER[Celery Workers]
SCHEDULER[Task Scheduler]
end
subgraph "Business Logic Layer"
AUTH_SVC[Authentication Service]
VOICE_SVC[Voice Recording Service]
TRANS_SVC[Transcription Service]
CONSENSUS_SVC[Consensus Service]
EXPORT_SVC[Export Service]
ADMIN_SVC[Admin Service]
end
subgraph "Data Layer"
POSTGRES[(PostgreSQL)]
REDIS[(Redis)]
FILES[File Storage]
end
subgraph "External Services"
CDN[Content Delivery Network]
EMAIL[Email Service]
MONITORING[Monitoring & Logging]
end
WEB --> NGINX
MOBILE --> NGINX
NGINX --> API
API --> AUTH_SVC
API --> VOICE_SVC
API --> TRANS_SVC
WORKER --> CONSENSUS_SVC
WORKER --> EXPORT_SVC
AUTH_SVC --> POSTGRES
VOICE_SVC --> POSTGRES
VOICE_SVC --> FILES
TRANS_SVC --> POSTGRES
TRANS_SVC --> REDIS
API --> REDIS
WORKER --> REDIS
FILES --> CDN
API --> EMAIL
API --> MONITORING
Design Principles
1. Modularity
- Service-Oriented: Clear separation between different business domains
- Loose Coupling: Services communicate through well-defined interfaces
- High Cohesion: Related functionality grouped together
2. Scalability
- Horizontal Scaling: Stateless services that can be scaled independently
- Async Processing: Heavy operations handled by background workers
- Caching Strategy: Multi-layer caching for performance optimization
3. Reliability
- Error Handling: Comprehensive error handling and recovery mechanisms
- Health Checks: Automated monitoring and alerting
- Data Integrity: ACID transactions and data validation
4. Security
- Authentication: JWT-based authentication with refresh tokens
- Authorization: Role-based access control (RBAC)
- Data Protection: Encryption at rest and in transit
5. Maintainability
- Clean Code: Following Python and TypeScript best practices
- Documentation: Comprehensive API and code documentation
- Testing: High test coverage with unit, integration, and E2E tests
Technology Stack
Backend Technologies
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | FastAPI | High-performance async API framework |
| Database | PostgreSQL | Primary data storage with ACID compliance |
| Cache/Queue | Redis | Caching, session storage, and message queue |
| Background Jobs | Celery | Async task processing |
| Audio Processing | Librosa, PyDub | Audio analysis and manipulation |
| Authentication | JWT | Stateless authentication |
| Validation | Pydantic | Data validation and serialization |
| ORM | SQLAlchemy | Database abstraction layer |
| Migrations | Alembic | Database schema migrations |
Frontend Technologies
| Component | Technology | Purpose |
|---|---|---|
| Framework | React 18 | Component-based UI framework |
| Meta Framework | Next.js | Full-stack React framework |
| Language | TypeScript | Type-safe JavaScript |
| Styling | Tailwind CSS | Utility-first CSS framework |
| State Management | Zustand | Lightweight state management |
| HTTP Client | Axios | Promise-based HTTP client |
| Audio Recording | MediaRecorder API | Browser audio recording |
| Testing | Jest, React Testing Library | Unit and integration testing |
Infrastructure Technologies
| Component | Technology | Purpose |
|---|---|---|
| Containerization | Docker | Application containerization |
| Orchestration | Docker Compose | Multi-container application management |
| Reverse Proxy | Nginx | Load balancing and SSL termination |
| Monitoring | Prometheus, Grafana | Metrics collection and visualization |
| Logging | Structured logging | Centralized log management |
| CI/CD | GitHub Actions | Automated testing and deployment |
Data Architecture
Database Schema Design
erDiagram
%% Core Relationships
USERS ||--o{ VOICE_RECORDINGS : "creates"
USERS ||--o{ TRANSCRIPTIONS : "creates"
USERS ||--o{ QUALITY_REVIEWS : "performs"
USERS ||--o{ EXPORT_AUDIT_LOGS : "performs"
USERS ||--o{ EXPORT_DOWNLOADS : "downloads"
USERS ||--o| EXPORT_BATCHES : "creates (optional)"
LANGUAGES ||--o{ SCRIPTS : "has"
LANGUAGES ||--o{ VOICE_RECORDINGS : "recorded in"
LANGUAGES ||--o{ TRANSCRIPTIONS : "transcribed in"
SCRIPTS ||--o{ VOICE_RECORDINGS : "recorded from"
VOICE_RECORDINGS ||--|{ AUDIO_CHUNKS : "divided into (1 to many)"
AUDIO_CHUNKS ||--o{ TRANSCRIPTIONS : "has many"
AUDIO_CHUNKS ||--o| TRANSCRIPTIONS : "has one consensus"
TRANSCRIPTIONS ||--o{ QUALITY_REVIEWS : "reviewed by"
EXPORT_BATCHES ||--o{ EXPORT_AUDIT_LOGS : "generates"
EXPORT_BATCHES ||--o{ EXPORT_DOWNLOADS : "downloaded by"
%% Entities with attributes
USERS {
int id PK
string name
string email UK
string password_hash
string role "(enum: userrole)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
LANGUAGES {
int id PK
string name
string code UK
timestamptz created_at
}
SCRIPTS {
int id PK
int language_id FK
text text
string duration_category "(enum: durationcategory)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
VOICE_RECORDINGS {
int id PK
int user_id FK
int script_id FK
int language_id FK
string file_path
float duration
string status "(enum: recordingstatus)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
AUDIO_CHUNKS {
int id PK
int recording_id FK
int chunk_index
string file_path
float start_time
float end_time
float duration
text sentence_hint
json meta_data
timestamptz created_at
int transcript_count
boolean ready_for_export
float consensus_quality
int consensus_transcript_id FK "optional"
int consensus_failed_count
}
TRANSCRIPTIONS {
int id PK
int chunk_id FK
int user_id FK
int language_id FK
text text
float quality
float confidence
boolean is_consensus
boolean is_validated
json meta_data
timestamptz created_at
timestamptz updated_at
}
QUALITY_REVIEWS {
int id PK
int transcription_id FK
int reviewer_id FK
string decision "(enum: reviewdecision)"
float rating
text comment
json meta_data
timestamptz created_at
}
EXPORT_BATCHES {
int id PK
string batch_id UK
string archive_path
string storage_type "(enum: storagetype)"
int chunk_count
bigint file_size_bytes
json chunk_ids
string status "(enum: exportbatchstatus)"
boolean exported
text error_message
int retry_count
string checksum
int compression_level
string format_version
json recording_id_range
json language_stats
float total_duration_seconds
json filter_criteria
timestamptz created_at
timestamptz completed_at
int created_by_id FK "optional"
}
EXPORT_AUDIT_LOGS {
int id PK
string export_id
int user_id FK
string export_type
string format
json filters_applied
int records_exported
bigint file_size_bytes
string ip_address
string user_agent
timestamptz created_at
}
EXPORT_DOWNLOADS {
int id PK
string batch_id FK
int user_id FK
timestamptz downloaded_at
string ip_address
string user_agent
}
Data Flow Patterns
1. Voice Recording Data Flow
User Input → Frontend → API → Database → File Storage → Background Processing → Chunking → Database Update
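The chunking step in this flow could be sketched as follows. This is a minimal illustration, not Shrutik's actual chunker: the fixed-length window and the names `ChunkSpan`, `plan_chunks`, and `max_chunk_seconds` are assumptions, and a real implementation might choose boundaries from the audio itself. Only the field names `chunk_index`, `start_time`, `end_time`, and `duration` come from the AUDIO_CHUNKS entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkSpan:
    """Timing fields mirroring AUDIO_CHUNKS (chunk_index, start_time, end_time)."""
    chunk_index: int
    start_time: float
    end_time: float

    @property
    def duration(self) -> float:
        return self.end_time - self.start_time


def plan_chunks(total_duration: float, max_chunk_seconds: float = 10.0) -> list[ChunkSpan]:
    """Split a recording into consecutive spans of at most max_chunk_seconds.

    Fixed-length windows are an assumption made for illustration only.
    """
    if total_duration <= 0 or max_chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    spans = []
    start = 0.0
    index = 0
    while start < total_duration:
        end = min(start + max_chunk_seconds, total_duration)
        spans.append(ChunkSpan(index, start, end))
        start = end
        index += 1
    return spans
```

Each resulting span would correspond to one AUDIO_CHUNKS row written in the final "Database Update" step.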
2. Transcription Data Flow
User Request → API → Database Query → Cache Check → Response → User Input → Validation → Database Save → Consensus Trigger
3. Consensus Calculation Flow
Transcription Submit → Background Job → Collect Related → Calculate Similarity → Weight Quality → Update Consensus → Notify Users
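The "Calculate Similarity → Weight Quality → Update Consensus" steps might look roughly like the sketch below. The similarity measure (`difflib.SequenceMatcher`), the weighting formula, the `min_score` threshold, and all names here are assumptions for illustration; the source does not specify how Shrutik computes consensus.

```python
from difflib import SequenceMatcher
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    transcription_id: int
    text: str
    quality: float  # hypothetical weight in [0, 1], e.g. from TRANSCRIPTIONS.quality


def similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity in [0, 1]."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def pick_consensus(candidates: list[Candidate], min_score: float = 0.8) -> Optional[Candidate]:
    """Return the candidate that best agrees with the others, or None.

    Each candidate is scored by its quality-weighted average similarity
    to every other candidate for the same chunk.
    """
    if len(candidates) < 2:
        return None
    best, best_score = None, 0.0
    for cand in candidates:
        others = [c for c in candidates if c.transcription_id != cand.transcription_id]
        total_weight = sum(o.quality for o in others)
        if total_weight == 0:
            continue
        score = sum(similarity(cand.text, o.text) * o.quality for o in others) / total_weight
        if score > best_score:
            best, best_score = cand, score
    return best if best_score >= min_score else None
```

A `None` result would map naturally onto incrementing `consensus_failed_count` on the chunk, while a match would set `consensus_transcript_id`; that mapping is also an assumption.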
API Design
RESTful API Principles
Shrutik follows REST architectural principles with some pragmatic adaptations:
- Resource-Based URLs: /api/recordings, /api/transcriptions
- HTTP Methods: GET, POST, PUT, DELETE for CRUD operations
- Status Codes: Proper HTTP status codes for different scenarios
- JSON Format: Consistent JSON request/response format
- Pagination: Cursor-based pagination for large datasets
- Versioning: API versioning through URL path (/api/v1/)
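To make the cursor-based pagination principle concrete, here is a small sketch over an in-memory list. The cursor encoding (base64-wrapped JSON holding the last seen `id`) and the `items` / `next_cursor` response fields are hypothetical; the source does not document the actual cursor format.

```python
import base64
import json
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Wrap the last seen id in an opaque, URL-safe token."""
    return base64.urlsafe_b64encode(json.dumps({"last_id": last_id}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    return int(json.loads(base64.urlsafe_b64decode(cursor.encode()))["last_id"])


def paginate(rows: list[dict], limit: int, cursor: Optional[str] = None) -> dict:
    """Return one page of rows (sorted by ascending 'id') plus the next cursor.

    Roughly equivalent to: WHERE id > :last_id ORDER BY id LIMIT :limit + 1
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    last_id = decode_cursor(cursor) if cursor else 0
    # Fetch one extra row to learn whether another page exists.
    window = [r for r in rows if r["id"] > last_id][: limit + 1]
    page = window[:limit]
    has_more = len(window) > limit
    return {
        "items": page,
        "next_cursor": encode_cursor(page[-1]["id"]) if has_more else None,
    }
```

Unlike offset pagination, this approach stays stable when rows are inserted between requests, because each page starts strictly after the last id the client has seen.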
API Structure
/api/
├── auth/
│ ├── POST /login
│ ├── POST /register
│ ├── POST /refresh
│ └── POST /logout
├── recordings/
│ ├── GET /
│ ├── POST /sessions
│ ├── POST /upload
│ └── GET /{id}/progress
├── transcriptions/
│ ├── GET /
│ ├── POST /tasks
│ ├── POST /submit
│ └── POST /skip
├── chunks/
│ ├── GET /{id}/audio
│ └── GET /{id}/info
├── admin/
│ ├── GET /stats/platform
│ ├── GET /users
│ └── GET /performance/dashboard
└── export/
├── POST /dataset
└── GET /jobs/{id}/status
Authentication & Authorization
sequenceDiagram
participant C as Client
participant A as API
participant Auth as Auth Service
participant DB as Database
C->>A: POST /auth/login
A->>Auth: Validate Credentials
Auth->>DB: Check User
DB-->>Auth: User Data
Auth->>Auth: Generate JWT
Auth-->>A: JWT + Refresh Token
A-->>C: Authentication Response
Note over C: Store JWT in memory/secure storage
C->>A: GET /recordings (with JWT)
A->>Auth: Validate JWT
Auth->>Auth: Check Expiry & Signature
Auth-->>A: User Context
A->>A: Check Permissions
A-->>C: Protected Resource
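The "Generate JWT" and "Check Expiry & Signature" steps in the diagram can be illustrated with a standard-library-only sketch of an HS256 token. This is for explanation only and assumes details the source does not state: the claim names (`sub`, `role`, `iat`, `exp`), the 15-minute default lifetime, and the function names are all hypothetical, and a production service would normally rely on a maintained JWT library rather than hand-rolled code.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(user_id: int, role: str, secret: bytes,
                ttl_seconds: int = 900, now: Optional[float] = None) -> str:
    """Build a signed HS256 token: header.payload.signature."""
    issued_at = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"sub": str(user_id), "role": role,
               "iat": issued_at, "exp": issued_at + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(token: str, secret: bytes, now: Optional[float] = None) -> Optional[dict]:
    """Return the payload if the signature is valid and the token is unexpired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                            hashlib.sha256).digest()
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except (ValueError, UnicodeError):
        return None
    current = time.time() if now is None else now
    if payload.get("exp", 0) <= current:
        return None
    return payload
```

The returned payload plays the role of the "User Context" handed back to the API, which then performs its own permission check.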
Performance Architecture
Caching Strategy
graph LR
subgraph "Client Side"
BROWSER[Browser Cache]
LOCAL[Local Storage]
end
subgraph "CDN Layer"
CDN[Content Delivery Network]
end
subgraph "Application Layer"
API_CACHE[API Response Cache]
DB_CACHE[Database Query Cache]
SESSION[Session Cache]
end
subgraph "Database Layer"
DB[(PostgreSQL)]
REDIS[(Redis)]
end
BROWSER --> CDN
CDN --> API_CACHE
API_CACHE --> DB_CACHE
DB_CACHE --> REDIS
SESSION --> REDIS
DB_CACHE --> DB
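The path from the query cache to Redis and then PostgreSQL resembles the cache-aside pattern, sketched below. The in-memory `TTLCache` is only a stand-in for Redis, and the key format, TTL value, and `get_or_load` helper are assumptions for illustration.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """In-memory stand-in for a Redis-style get / set-with-expiry store."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if expires_at <= self._clock():
            del self._store[key]  # lazily evict expired entries
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._store[key] = (self._clock() + ttl_seconds, value)


def get_or_load(cache: TTLCache, key: str, loader: Callable[[], Any],
                ttl_seconds: float = 60.0) -> Any:
    """Cache-aside read: try the cache, fall back to the loader, then populate."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = loader()  # e.g. a database query
    cache.set(key, value, ttl_seconds)
    return value
```

A short TTL bounds how stale a cached row can be without requiring explicit invalidation on every write; whether Shrutik also invalidates on write is not stated in the source.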
Performance Optimizations
Backend Optimizations
- Connection Pooling: Database connection pooling with configurable limits
- Query Optimization: Indexed queries and efficient SQL patterns
- Async Processing: Non-blocking I/O for concurrent request handling
- Background Jobs: Heavy operations moved to background workers
- Response Compression: Gzip compression for API responses
Frontend Optimizations
- Code Splitting: Dynamic imports for reduced bundle size
- Lazy Loading: Components and routes loaded on demand
- Image Optimization: Optimized images with Next.js Image component
- Caching: Aggressive caching of static assets and API responses
- Service Workers: Offline functionality and background sync
Database Optimizations
- Indexing Strategy: Proper indexes on frequently queried columns
- Query Optimization: Efficient queries with proper joins and filters
- Read Replicas: Separate read replicas for analytics queries
- Partitioning: Table partitioning for large datasets
Security Architecture
Security Layers
graph TB
subgraph "Network Security"
FIREWALL[Firewall Rules]
DDoS[DDoS Protection]
SSL[SSL/TLS Encryption]
end
subgraph "Application Security"
AUTH[Authentication]
AUTHZ[Authorization]
VALIDATION[Input Validation]
SANITIZATION[Data Sanitization]
end
subgraph "Data Security"
ENCRYPTION[Encryption at Rest]
BACKUP[Secure Backups]
AUDIT[Audit Logging]
end
FIREWALL --> AUTH
DDoS --> AUTH
SSL --> AUTH
AUTH --> ENCRYPTION
AUTHZ --> ENCRYPTION
VALIDATION --> BACKUP
SANITIZATION --> AUDIT
Security Measures
Authentication & Authorization
- JWT Tokens: Stateless authentication with short-lived access tokens
- Refresh Tokens: Secure token refresh mechanism
- Role-Based Access: Granular permissions based on user roles
- Session Management: Secure session handling with Redis
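Role-based access can be pictured as a lookup from role to permitted actions. The role names and permission strings below are hypothetical; the source only states that users carry a `role` drawn from the `userrole` enum.

```python
from enum import Enum


class Role(str, Enum):
    # Hypothetical role names; the real userrole enum values are not given.
    CONTRIBUTOR = "contributor"
    ADMIN = "admin"


# Hypothetical permission strings in "resource:action" form.
PERMISSIONS: dict[Role, frozenset[str]] = {
    Role.CONTRIBUTOR: frozenset({"recordings:create", "transcriptions:create"}),
    Role.ADMIN: frozenset({
        "recordings:create", "transcriptions:create",
        "users:list", "export:create",
    }),
}


def has_permission(role: Role, permission: str) -> bool:
    return permission in PERMISSIONS.get(role, frozenset())


def require(role: Role, permission: str) -> None:
    """Raise PermissionError when the role lacks the permission."""
    if not has_permission(role, permission):
        raise PermissionError(f"role {role.value!r} lacks {permission!r}")
```

In a request handler, `require` would run after the token has been validated, matching the "Check Permissions" step of the authentication sequence.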
Data Protection
- Input Validation: Comprehensive input validation using Pydantic
- SQL Injection Prevention: Parameterized queries with SQLAlchemy
- XSS Protection: Content Security Policy and input sanitization
- CSRF Protection: CSRF tokens for state-changing operations
Infrastructure Security
- HTTPS Enforcement: All communications encrypted with TLS
- Security Headers: Comprehensive security headers implementation
- Rate Limiting: Protection against abuse and DoS attacks
- File Upload Security: Secure file upload with type validation
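One common way to implement rate limiting is a token bucket, sketched below. The choice of algorithm, the parameter names, and the per-client usage are assumptions; the source does not say which algorithm Shrutik or its Nginx layer uses.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate_per_second`."""

    def __init__(self, rate_per_second: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_second
        self.capacity = float(capacity)
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Consume `cost` tokens if available; return whether the request passes."""
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A service would typically keep one bucket per client identifier (for example per user or IP address) and answer with HTTP 429 when `allow()` returns `False`.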
Monitoring & Observability
Monitoring Stack
graph LR
subgraph "Application"
APP[Shrutik Application]
METRICS[Metrics Collection]
LOGS[Structured Logging]
TRACES[Distributed Tracing]
end
subgraph "Collection"
PROMETHEUS[Prometheus]
LOKI[Loki]
JAEGER[Jaeger]
end
subgraph "Visualization"
GRAFANA[Grafana Dashboards]
ALERTS[Alert Manager]
end
APP --> METRICS
APP --> LOGS
APP --> TRACES
METRICS --> PROMETHEUS
LOGS --> LOKI
TRACES --> JAEGER
PROMETHEUS --> GRAFANA
LOKI --> GRAFANA
JAEGER --> GRAFANA
PROMETHEUS --> ALERTS
Key Metrics
Application Metrics
- Request Rate: Requests per second by endpoint
- Response Time: P50, P95, P99 response times
- Error Rate: Error percentage by endpoint and status code
- Throughput: Data processing throughput
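For readers unfamiliar with the P50/P95/P99 notation, the sketch below computes nearest-rank percentiles from raw latency samples. It is illustrative only: monitoring systems such as Prometheus usually estimate quantiles from histogram buckets rather than from raw samples, and the function name is hypothetical.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Example: response times of 1..100 ms
latencies_ms = [float(ms) for ms in range(1, 101)]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
```

The higher percentiles matter because a small fraction of slow requests can dominate user-perceived latency even when the median looks healthy.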
Business Metrics
- User Engagement: Active users, session duration
- Data Quality: Transcription accuracy, consensus rates
- System Usage: Recording uploads, transcription submissions
- Performance: Audio processing times, consensus calculation speed
Infrastructure Metrics
- System Resources: CPU, memory, disk usage
- Database Performance: Query times, connection pool status
- Cache Performance: Hit rates, memory usage
- Network: Bandwidth usage, connection counts
Deployment Architecture
Environment Strategy
graph LR
subgraph "Development"
DEV_LOCAL[Local Development]
DEV_DOCKER[Docker Development]
end
subgraph "Testing"
TEST_UNIT[Unit Tests]
TEST_INTEGRATION[Integration Tests]
TEST_E2E[E2E Tests]
end
subgraph "Staging"
STAGING[Staging Environment]
UAT[User Acceptance Testing]
end
subgraph "Production"
PROD[Production Environment]
MONITORING[Production Monitoring]
end
DEV_LOCAL --> TEST_UNIT
DEV_DOCKER --> TEST_INTEGRATION
TEST_UNIT --> TEST_E2E
TEST_INTEGRATION --> STAGING
TEST_E2E --> STAGING
STAGING --> UAT
UAT --> PROD
PROD --> MONITORING
Deployment Pipeline
1. Code Commit: Developer pushes code to repository
2. Automated Testing: Unit, integration, and E2E tests run
3. Build Process: Docker images built and tagged
4. Staging Deployment: Automatic deployment to staging
5. Manual Testing: QA and user acceptance testing
6. Production Deployment: Manual approval and deployment
7. Health Checks: Automated health verification
8. Monitoring: Continuous monitoring and alerting
Future Architecture Considerations
Scalability Enhancements
- Microservices: Further decomposition into microservices
- Event-Driven Architecture: Event sourcing and CQRS patterns
- Kubernetes: Container orchestration for better scaling
- Service Mesh: Advanced service-to-service communication
Performance Improvements
- Edge Computing: Edge nodes for global content delivery
- Advanced Caching: Distributed caching with Redis Cluster
- Database Sharding: Horizontal database partitioning
- GraphQL: More efficient data fetching
AI/ML Integration
- Automated Quality Assessment: ML-based quality scoring
- Smart Chunk Assignment: AI-driven task assignment
- Real-time Transcription: Automatic transcription assistance
- Anomaly Detection: ML-based fraud and quality detection
This architecture provides a solid foundation for Shrutik’s current needs while maintaining flexibility for future growth and enhancements.