Shrutik Documentation
Welcome to the comprehensive documentation for Shrutik (শ্রুতিক), the open-source voice data collection platform designed to help communities build high-quality voice datasets in their native languages.
Shrutik means “listener” in Bengali, reflecting our mission to listen to and preserve diverse voices from around the world.
About This Documentation
This documentation is built with mdBook and provides guides, API references, and tutorials for users, developers, and administrators.
Enhanced Features
- Interactive Mermaid Diagrams: Zoom, pan, and view complex flowcharts in fullscreen
- Professional Styling: Custom theme with Shrutik branding and improved readability
- Responsive Design: Optimized experience on desktop and mobile devices
- Status Badges: Color-coded indicators for different content types
- Enhanced Navigation: Improved sidebar, search, and user experience
Interactive Diagram Controls
- Zoom: Use mouse wheel or +/- buttons to zoom in/out
- Pan: Drag to move around when zoomed in
- Reset: Double-click or press ‘0’ to reset view
- Fullscreen: Click the fullscreen button for better viewing
- Mobile: Touch-friendly controls for mobile devices
Documentation Overview
Getting Started
- Getting Started Guide - Quick setup and first steps
- Docker Local Setup - Complete Docker development guide
- Local Development - Native development environment setup
Architecture & Design
- System Architecture - Complete system design overview
- API Reference - Comprehensive API documentation
- Flowcharts - Visual system flow documentation
Contributing
- Contributing Guide - How to contribute to Shrutik
- Engineering Conventions - Development standards and philosophy
- Code of Conduct - Community guidelines
Additional Resources
- Audio Processing Modes - Audio processing capabilities
- Troubleshooting - Common issues and solutions
- FAQ - Frequently asked questions
Quick Navigation
For New Users
- Getting Started - Set up Shrutik in minutes
- Docker Local Setup - Run everything with Docker
- User Guide - Learn how to contribute voice data
For Developers
- Docker Local Setup - Quick Docker development setup
- Local Development - Native development environment
- Architecture Overview - Understand the system design
- API Reference - Integrate with Shrutik APIs
- Contributing Guide - Contribute code and features
For System Administrators
- Docker Local Setup - Deploy with Docker
- Deployment Guide - Production deployment strategies
- Monitoring & Health Checks - System monitoring
For Researchers & Data Scientists
- API Reference - Export datasets
- Architecture - Understand data structure
- Quality Control - Data quality processes
Visual Documentation
System Flows
- System Architecture - High-level system overview
- Voice Recording Flow - Complete recording process
- Transcription Workflow - Transcription and consensus
Technical Diagrams
- API Request Flow - API request lifecycle
- Database Operations - Data flow patterns
- Caching Strategy - Performance optimization
Development Resources
Setup & Configuration
- Environment Setup - Development environment
- Configuration Guide - Environment variables
- Testing Guide - Testing strategies
Code Standards
- Engineering Conventions - Development philosophy and standards
- Coding Standards - Code style guidelines
- API Design - RESTful API principles
- Database Design - Schema and patterns
Deployment Options
| Option | Complexity | Use Case | Documentation |
|---|---|---|---|
| Docker Compose | Low | Development, Small Teams | Docker Deployment |
| Kubernetes | High | Production, Enterprise | Deployment Guide |
| Cloud Platforms | Medium | Managed Services | Deployment Guide |
| Bare Metal | Medium | On-Premises | Deployment Guide |
Community & Support
Get Help
- Discord Community - Real-time community support
- GitHub Issues - Bug reports and feature requests
- GitHub Discussions - Community discussions
Contribute
- Voice Data - Contribute recordings and transcriptions
- Code - Develop features and fix bugs
- Documentation - Improve guides and tutorials
- Translation - Translate to new languages
Stay Updated
- GitHub Repository - Source code and releases
- Twitter - Latest updates and announcements
Additional Resources
External Links
- FastAPI Documentation - Backend framework
- React Documentation - Frontend framework
- PostgreSQL Documentation - Database
- Redis Documentation - Caching and queues
Research Papers
- Voice Data Collection Best Practices - Academic research
- Crowdsourcing for Language Technology - Methodology
- Quality Control in Voice Datasets - Quality assurance
What’s New
Recent Updates
- Performance Optimization - Added comprehensive caching and rate limiting
- CDN Integration - Optimized audio delivery with CDN support
- Enhanced Monitoring - Real-time performance metrics and dashboards
- Security Improvements - Advanced authentication and authorization
Coming Soon
- Mobile App - Native mobile applications for iOS and Android
- AI Assistance - ML-powered transcription assistance
- Multi-language UI - Interface translations for global accessibility
- Cloud Integration - Enhanced cloud platform support
License & Legal
- CC BY-NC-SA 4.0 License - Creative Commons license for non-commercial use
- Privacy Policy - Data privacy and protection
- Code of Conduct - Community guidelines
Need help? Join our Discord community or check our GitHub discussions.
Found an issue? Please report it on GitHub.
Want to contribute? Read our Contributing Guide to get started.
Together, we’re building a more inclusive digital future, one voice at a time.
Getting Started with Shrutik
Welcome to Shrutik! This guide will help you set up and start using the platform in just a few minutes.
Overview
Shrutik is a voice data collection platform that allows communities to contribute voice recordings and transcriptions in their native languages. You can either contribute data or set up your own instance of the platform.
Quick Setup Options
Option 1: Docker (Recommended)
The fastest way to get Shrutik running is with Docker:
# Clone the repository
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik
# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev
# Copy Docker environment configuration
cp .env.example .env
# Build images and start all services
docker compose up --build -d
Access the platform:
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
Note: For detailed Docker setup instructions, see our comprehensive Docker Local Setup Guide for configuration details, troubleshooting, and switching between local/Docker environments.
Option 2: Local Development
For development or customization:
# Clone and setup
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik
# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
To start backend, frontend, and Celery worker, see the Local Setup Guide.
Verify Setup
Once you’ve successfully started the services using either Option 1 (Docker) or Option 2 (Local Development), confirm that everything is running correctly:
Check Backend Health
The backend provides a simple health endpoint to verify that the FastAPI server is up and running.
curl http://localhost:8000/health
Check Frontend
curl http://localhost:3000
First Steps
For Contributors
- Register an Account: Visit http://localhost:3000 and create an account
- Start Recording: Begin with voice recordings or transcriptions
- Track Progress: Monitor your contributions in the dashboard
For Administrators
- Access Admin Panel: Login with your admin account
- Configure Languages: Add supported languages and scripts
- Manage Users: Review user registrations and assign roles
- Monitor Quality: Review transcription quality and consensus
Contributing Voice Data
Recording Guidelines
- Environment: Record in a quiet environment
- Equipment: Use a good quality microphone
- Duration: Keep recordings between 2 and 10 seconds
- Content: Read the provided text clearly and naturally
Transcription Guidelines
- Accuracy: Transcribe exactly what you hear
- Formatting: Follow language-specific formatting rules
- Quality: Rate the audio quality honestly
- Consensus: Multiple transcriptions improve dataset quality
Troubleshooting
Common Issues
Services won’t start:
# All services
docker compose logs -f
# Specific service (example: backend)
docker compose logs -f backend
# Restart all services
docker compose restart
# Restart a single service
docker compose restart backend
# Or check status
docker compose ps
Database connection errors:
# Stop services and remove volumes
docker compose down -v --remove-orphans
# (Optional) Clean unused Docker resources
docker system prune -f
# Rebuild and start all services
docker compose up -d --build
# Run migrations inside the backend container
docker compose exec backend python scripts/init-db.py
# If that fails, try the fallback
docker compose exec backend python scripts/simple-init.py
Permission errors:
# Fix file permissions
sudo chown -R $USER:$USER uploads/
chmod -R 755 uploads/
Getting Help
- Documentation: Check our comprehensive docs
- GitHub Issues: Report bugs and request features
- Discord: Join our community for real-time help
- Email: Contact us at onuronon.dev@gmail.com
Next Steps
- Local Development Guide - Set up development environment
- Docker Local Setup - Docker development environment
- API Reference - Integrate with external systems
- Contributing Guide - Contribute to the project
- Architecture Overview - Understand the system design
Welcome to the Community
You’re now ready to start using Shrutik! Whether you’re contributing voice data, developing features, or deploying your own instance, you’re part of a global movement to make voice technology more inclusive.
Join our community channels to connect with other contributors and stay updated on the latest developments.
Docker Local Setup Guide
This guide explains how to run Shrutik completely with Docker on your local machine, including all the configuration changes needed to switch from local development to Docker.
Quick Docker Setup
Prerequisites
- Docker 20.10+
- Docker Compose 2.0+
- Git
1. Clone the Repo
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik
# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev
2. Configure Environment for Docker
Use the Docker-specific environment file:
cp .env.example .env
Make sure DATABASE_URL is set correctly in the .env file:
DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection
When running inside Docker, services communicate using their Docker Compose service names.
Available Environment Files:
- .env.example - Template with all available options
3. Configure Frontend
cd frontend
cp .env.example .env
4. Start All Containers
Use this when running the app for the first time or after changing Dockerfiles, requirements.txt, or package.json:
docker compose up -d --build
Regular use (no changes)
For normal daily use, when the images are already built:
docker compose up -d
Check service status:
docker compose ps
5. Initialize the Database
Run migrations:
docker compose exec backend alembic upgrade head
Create admin user:
docker compose exec backend python scripts/create_admin.py --name "Admin" --email admin@example.com
6. Access the Application
- Frontend → http://localhost:3000
- API Docs → http://localhost:8000/docs
- Health Check → http://localhost:8000/health
Configuration Changes Explained
Key Differences: Local vs Docker
| Component | Local Development | Docker |
|---|---|---|
| Database URL | localhost:5432 | postgres:5432 |
| Redis URL | localhost:6379 | redis:6379 |
| Frontend API URL | http://localhost:8000 | http://localhost:8000 |
| File Paths | ./uploads | /app/uploads |
Development Workflow
Start services
docker compose up -d
Stop everything
docker compose down
Stop AND remove volumes (fresh reset)
docker compose down -v
View logs
docker compose logs -f
Specific service logs:
docker compose logs -f backend
Rebuild after changing requirements
docker compose build --no-cache
docker compose up -d
Shell into a container
docker compose exec backend bash
Check backend health
curl http://localhost:8000/health
Database Management
Run migrations:
docker compose exec backend alembic upgrade head
Auto-generate migration:
docker compose exec backend alembic revision --autogenerate -m "message"
Connect to PostgreSQL:
docker compose exec postgres psql -U postgres -d voice_collection
Redis Debugging
Test Redis:
docker compose exec redis redis-cli ping
Restart Redis:
docker compose restart redis
Troubleshooting
Port in use
Check:
sudo lsof -i :6379
sudo lsof -i :5432
Kill process:
sudo kill <pid>
Backend not starting
docker compose logs backend
Frontend not loading
docker compose logs frontend
docker compose build frontend --no-cache
docker compose up -d frontend
Local Development Guide
This guide covers setting up Shrutik for local development, including all the tools and configurations needed for contributing to the project.
Prerequisites
System Requirements
- Python: 3.11 or higher
- Node.js: 20 or higher
- PostgreSQL: 15 or higher
- Redis: 7 or higher
- Git: Latest version
Development Tools (Recommended)
- IDE: VS Code with Python and TypeScript extensions
- API Testing: Postman or Insomnia
- Database GUI: pgAdmin or DBeaver
- Redis GUI: RedisInsight
Setup Instructions
1. Clone and Navigate
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik
# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev
2. Backend Setup
Create Virtual Environment
# Create virtual environment
python -m venv venv
# Activate virtual environment
# On Linux/Mac:
source venv/bin/activate
# On Windows:
venv\Scripts\activate
Install Dependencies
# Install Python dependencies
pip install -r requirements.txt
Database Setup
# Start PostgreSQL (if not running)
sudo systemctl start postgresql # Linux
brew services start postgresql # Mac
# Switch to PostgreSQL user (Linux)
sudo -i -u postgres
# Create database
createdb voice_collection
# Exit postgres user shell (Linux)
exit
# Set environment variables
cp .env.example .env
Edit .env:
# Development Database
DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection
# Redis
REDIS_URL=redis://localhost:6379/0
# Development Settings
DEBUG=true
USE_CELERY=true
# File Storage
UPLOAD_DIR=uploads
MAX_FILE_SIZE=104857600
# Security (use a secure key in production)
SECRET_KEY=dev-secret-key-change-in-production
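At startup these variables are loaded into a typed settings object. Below is a minimal standard-library sketch of that pattern (the project itself uses Pydantic for validation); the defaults mirror the example values above:

```python
import os
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the development .env example above.
    database_url: str = field(
        default_factory=lambda: os.environ.get(
            "DATABASE_URL",
            "postgresql://postgres:password@localhost:5432/voice_collection",
        )
    )
    redis_url: str = field(
        default_factory=lambda: os.environ.get("REDIS_URL", "redis://localhost:6379/0")
    )
    debug: bool = field(
        default_factory=lambda: os.environ.get("DEBUG", "false").lower() == "true"
    )
    max_file_size: int = field(
        default_factory=lambda: int(os.environ.get("MAX_FILE_SIZE", "104857600"))
    )


settings = Settings()
```

Because every value has a default, the application can still boot without a .env file; Docker simply overrides the hostnames through its own environment.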
Run Database Migrations
# Run database migrations
alembic upgrade head
# Create admin user
python scripts/create_admin.py --name "AdminUser" --email admin@example.com
Follow the prompts to create your first admin user.
3. Frontend Setup
# Navigate to frontend directory
cd frontend
# Install dependencies
npm install
# Copy environment file
cp .env.example .env
4. Start Development Services
Start Services
Terminal 1 - Backend:
source venv/bin/activate
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Terminal 2 - Celery Worker:
source venv/bin/activate
celery -A app.core.celery_app worker --loglevel=info
Terminal 3 - Frontend:
cd frontend
npm start
Development Configuration
Alembic Configuration
Alembic is configured to automatically use the correct database URL from your environment variables:
- The alembic/env.py file reads from settings.DATABASE_URL
- No manual configuration changes are needed when switching environments
- Migrations work seamlessly in both local and Docker environments
Switching Between Local and Docker
When switching between local development and Docker, you need to update these configurations:
1. Environment Variables (.env file)
Local Development:
DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection
REDIS_URL=redis://localhost:6379/0
Docker:
DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection
REDIS_URL=redis://redis:6379/0
Note: The frontend API URL stays the same in both setups, since the browser accesses the backend from the host machine.
2. Quick Switch Commands
Switch to Docker:
# Stop local services
pkill -f uvicorn
pkill -f celery
# Update config for Docker
cp .env.example .env
# Start Services
docker compose up -d
Switch to Local:
# Stop Docker
docker compose down
Then follow the earlier instructions in this guide to start the services locally.
Complete Docker Guide: For detailed Docker setup instructions, troubleshooting, and configuration explanations, see our Docker Local Setup Guide.
🧪 Testing
Backend Tests
# Run all tests
pytest
# Run with coverage
pytest --cov=app
# Run specific test file
pytest tests/test_auth.py
# Run with verbose output
pytest -v
Frontend Tests
cd frontend
# Run tests
npm test
# Run with coverage
npm run test:coverage
# Run E2E tests
npm run test:e2e
Integration Tests
# Start test environment
docker compose -f docker-compose.test.yml up -d
# Run integration tests
pytest tests/integration/
# Cleanup
docker compose -f docker-compose.test.yml down -v
Debugging
Backend Debugging
VS Code Configuration
Create .vscode/launch.json:
{
"version": "0.2.0",
"configurations": [
{
"name": "FastAPI Debug",
"type": "python",
"request": "launch",
"program": "${workspaceFolder}/venv/bin/uvicorn",
"args": ["app.main:app", "--reload", "--host", "0.0.0.0", "--port", "8000"],
"console": "integratedTerminal",
"envFile": "${workspaceFolder}/.env.development"
}
]
}
Logging Configuration
Enable debug logging in .env.development:
LOG_LEVEL=DEBUG
Frontend Debugging
Browser DevTools
- Use React Developer Tools extension
- Enable source maps for debugging TypeScript
- Use Network tab to debug API calls
VS Code Configuration
Install recommended extensions:
- ES7+ React/Redux/React-Native snippets
- TypeScript Importer
- Prettier - Code formatter
- ESLint
Database Management
Migrations
# Create new migration
alembic revision --autogenerate -m "Description of changes"
# Apply migrations
alembic upgrade head
# Rollback migration
alembic downgrade -1
# Check migration status
alembic current
Database Reset
# Drop and recreate database
dropdb voice_collection
createdb voice_collection
alembic upgrade head
python scripts/create_admin.py
Common Development Tasks
Adding New API Endpoints
- Create schema in app/schemas/
- Add model in app/models/ (if needed)
- Implement service in app/services/
- Create router in app/api/
- Register router in app/main.py
- Add tests in tests/
Adding New Frontend Components
- Create component in frontend/src/components/
- Add TypeScript types in frontend/src/types/
- Implement API calls in frontend/src/services/
- Add routing in frontend/src/pages/
- Add tests in frontend/src/__tests__/
Database Schema Changes
- Modify models in app/models/
- Generate migration: alembic revision --autogenerate -m "description"
- Review and edit migration file if needed
- Apply migration: alembic upgrade head
- Update tests and documentation
Performance Optimization
Development Performance
# Use faster database for development
export DATABASE_URL="postgresql://postgres:password@localhost:5432/voice_collection"
# Disable Celery for faster startup
export USE_CELERY=false
# Use development Redis
export REDIS_URL="redis://localhost:6379/1"
Hot Reload Configuration
Backend hot reload is enabled by default with --reload flag.
Frontend hot reload configuration in frontend/next.config.js:
module.exports = {
reactStrictMode: true,
swcMinify: true,
experimental: {
esmExternals: false
}
}
Additional Resources
- API Documentation - Complete API reference
- Architecture Overview - System design details
- Contributing Guide - Contribution guidelines
- Docker Local Setup - Docker development environment
Happy coding! 🎉
Shrutik Architecture Overview
This document provides a comprehensive overview of Shrutik’s system architecture, design principles, and technical decisions.
System Architecture
Shrutik follows a modern, microservices-inspired architecture with clear separation of concerns and scalable design patterns.
High-Level Architecture
graph TB
subgraph "Presentation Layer"
WEB[React Frontend]
MOBILE[Mobile App]
API_DOCS[API Documentation]
end
subgraph "API Gateway Layer"
NGINX[Nginx Reverse Proxy]
RATE_LIMIT[Rate Limiting]
AUTH_MW[Authentication Middleware]
end
subgraph "Application Layer"
API[FastAPI Backend]
WORKER[Celery Workers]
SCHEDULER[Task Scheduler]
end
subgraph "Business Logic Layer"
AUTH_SVC[Authentication Service]
VOICE_SVC[Voice Recording Service]
TRANS_SVC[Transcription Service]
CONSENSUS_SVC[Consensus Service]
EXPORT_SVC[Export Service]
ADMIN_SVC[Admin Service]
end
subgraph "Data Layer"
POSTGRES[(PostgreSQL)]
REDIS[(Redis)]
FILES[File Storage]
end
subgraph "External Services"
CDN[Content Delivery Network]
EMAIL[Email Service]
MONITORING[Monitoring & Logging]
end
WEB --> NGINX
MOBILE --> NGINX
NGINX --> API
API --> AUTH_SVC
API --> VOICE_SVC
API --> TRANS_SVC
WORKER --> CONSENSUS_SVC
WORKER --> EXPORT_SVC
AUTH_SVC --> POSTGRES
VOICE_SVC --> POSTGRES
VOICE_SVC --> FILES
TRANS_SVC --> POSTGRES
TRANS_SVC --> REDIS
API --> REDIS
WORKER --> REDIS
FILES --> CDN
API --> EMAIL
API --> MONITORING
Design Principles
1. Modularity
- Service-Oriented: Clear separation between different business domains
- Loose Coupling: Services communicate through well-defined interfaces
- High Cohesion: Related functionality grouped together
2. Scalability
- Horizontal Scaling: Stateless services that can be scaled independently
- Async Processing: Heavy operations handled by background workers
- Caching Strategy: Multi-layer caching for performance optimization
3. Reliability
- Error Handling: Comprehensive error handling and recovery mechanisms
- Health Checks: Automated monitoring and alerting
- Data Integrity: ACID transactions and data validation
4. Security
- Authentication: JWT-based authentication with refresh tokens
- Authorization: Role-based access control (RBAC)
- Data Protection: Encryption at rest and in transit
5. Maintainability
- Clean Code: Following Python and TypeScript best practices
- Documentation: Comprehensive API and code documentation
- Testing: High test coverage with unit, integration, and E2E tests
Technology Stack
Backend Technologies
| Component | Technology | Purpose |
|---|---|---|
| Web Framework | FastAPI | High-performance async API framework |
| Database | PostgreSQL | Primary data storage with ACID compliance |
| Cache/Queue | Redis | Caching, session storage, and message queue |
| Background Jobs | Celery | Async task processing |
| Audio Processing | Librosa, PyDub | Audio analysis and manipulation |
| Authentication | JWT | Stateless authentication |
| Validation | Pydantic | Data validation and serialization |
| ORM | SQLAlchemy | Database abstraction layer |
| Migrations | Alembic | Database schema migrations |
Frontend Technologies
| Component | Technology | Purpose |
|---|---|---|
| Framework | React 18 | Component-based UI framework |
| Meta Framework | Next.js | Full-stack React framework |
| Language | TypeScript | Type-safe JavaScript |
| Styling | Tailwind CSS | Utility-first CSS framework |
| State Management | Zustand | Lightweight state management |
| HTTP Client | Axios | Promise-based HTTP client |
| Audio Recording | MediaRecorder API | Browser audio recording |
| Testing | Jest, React Testing Library | Unit and integration testing |
Infrastructure Technologies
| Component | Technology | Purpose |
|---|---|---|
| Containerization | Docker | Application containerization |
| Orchestration | Docker Compose | Multi-container application management |
| Reverse Proxy | Nginx | Load balancing and SSL termination |
| Monitoring | Prometheus, Grafana | Metrics collection and visualization |
| Logging | Structured logging | Centralized log management |
| CI/CD | GitHub Actions | Automated testing and deployment |
Data Architecture
Database Schema Design
erDiagram
%% Core Relationships
USERS ||--o{ VOICE_RECORDINGS : "creates"
USERS ||--o{ TRANSCRIPTIONS : "creates"
USERS ||--o{ QUALITY_REVIEWS : "performs"
USERS ||--o{ EXPORT_AUDIT_LOGS : "performs"
USERS ||--o{ EXPORT_DOWNLOADS : "downloads"
USERS ||--o| EXPORT_BATCHES : "creates (optional)"
LANGUAGES ||--o{ SCRIPTS : "has"
LANGUAGES ||--o{ VOICE_RECORDINGS : "recorded in"
LANGUAGES ||--o{ TRANSCRIPTIONS : "transcribed in"
SCRIPTS ||--o{ VOICE_RECORDINGS : "recorded from"
VOICE_RECORDINGS ||--|{ AUDIO_CHUNKS : "divided into (1 to many)"
AUDIO_CHUNKS ||--o{ TRANSCRIPTIONS : "has many"
AUDIO_CHUNKS ||--o| TRANSCRIPTIONS : "has one consensus"
TRANSCRIPTIONS ||--o{ QUALITY_REVIEWS : "reviewed by"
EXPORT_BATCHES ||--o{ EXPORT_AUDIT_LOGS : "generates"
EXPORT_BATCHES ||--o{ EXPORT_DOWNLOADS : "downloaded by"
%% Entities with attributes
USERS {
int id PK
string name
string email UK
string password_hash
string role "(enum: userrole)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
LANGUAGES {
int id PK
string name
string code UK
timestamptz created_at
}
SCRIPTS {
int id PK
int language_id FK
text text
string duration_category "(enum: durationcategory)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
VOICE_RECORDINGS {
int id PK
int user_id FK
int script_id FK
int language_id FK
string file_path
float duration
string status "(enum: recordingstatus)"
json meta_data
timestamptz created_at
timestamptz updated_at
}
AUDIO_CHUNKS {
int id PK
int recording_id FK
int chunk_index
string file_path
float start_time
float end_time
float duration
text sentence_hint
json meta_data
timestamptz created_at
int transcript_count
boolean ready_for_export
float consensus_quality
int consensus_transcript_id FK "optional"
int consensus_failed_count
}
TRANSCRIPTIONS {
int id PK
int chunk_id FK
int user_id FK
int language_id FK
text text
float quality
float confidence
boolean is_consensus
boolean is_validated
json meta_data
timestamptz created_at
timestamptz updated_at
}
QUALITY_REVIEWS {
int id PK
int transcription_id FK
int reviewer_id FK
string decision "(enum: reviewdecision)"
float rating
text comment
json meta_data
timestamptz created_at
}
EXPORT_BATCHES {
int id PK
string batch_id UK
string archive_path
string storage_type "(enum: storagetype)"
int chunk_count
bigint file_size_bytes
json chunk_ids
string status "(enum: exportbatchstatus)"
boolean exported
text error_message
int retry_count
string checksum
int compression_level
string format_version
json recording_id_range
json language_stats
float total_duration_seconds
json filter_criteria
timestamptz created_at
timestamptz completed_at
int created_by_id FK "optional"
}
EXPORT_AUDIT_LOGS {
int id PK
string export_id
int user_id FK
string export_type
string format
json filters_applied
int records_exported
bigint file_size_bytes
string ip_address
string user_agent
timestamptz created_at
}
EXPORT_DOWNLOADS {
int id PK
string batch_id FK
int user_id FK
timestamptz downloaded_at
string ip_address
string user_agent
}
Data Flow Patterns
1. Voice Recording Data Flow
User Input → Frontend → API → Database → File Storage → Background Processing → Chunking → Database Update
2. Transcription Data Flow
User Request → API → Database Query → Cache Check → Response → User Input → Validation → Database Save → Consensus Trigger
3. Consensus Calculation Flow
Transcription Submit → Background Job → Collect Related → Calculate Similarity → Weight Quality → Update Consensus → Notify Users
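The "calculate similarity, weight quality" steps above can be sketched with difflib; this is a deliberately simplified stand-in for the production Consensus Service:

```python
from difflib import SequenceMatcher


def pick_consensus(transcriptions: list[dict]) -> dict:
    """Pick the candidate most similar to its peers, weighted by reported quality.

    Each dict has 'text' (str) and 'quality' (float in 0..1). Illustrative only.
    """
    def score(candidate: dict) -> float:
        others = [t for t in transcriptions if t is not candidate]
        if not others:
            return candidate["quality"]
        # Average pairwise string similarity against the other candidates.
        similarity = sum(
            SequenceMatcher(None, candidate["text"], o["text"]).ratio() for o in others
        ) / len(others)
        return similarity * candidate["quality"]

    return max(transcriptions, key=score)


best = pick_consensus([
    {"text": "the quick brown fox", "quality": 0.9},
    {"text": "the quick brown fox", "quality": 0.8},
    {"text": "the quik brown fx", "quality": 0.6},
])
```

An outlier transcription loses twice here: it agrees less with its peers and carries a lower quality weight.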
API Design
RESTful API Principles
Shrutik follows REST architectural principles with some pragmatic adaptations:
- Resource-Based URLs: /api/recordings, /api/transcriptions
- HTTP Methods: GET, POST, PUT, DELETE for CRUD operations
- Status Codes: Proper HTTP status codes for different scenarios
- JSON Format: Consistent JSON request/response format
- Pagination: Cursor-based pagination for large datasets
- Versioning: API versioning through URL path (/api/v1/)
API Structure
/api/
├── auth/
│ ├── POST /login
│ ├── POST /register
│ ├── POST /refresh
│ └── POST /logout
├── recordings/
│ ├── GET /
│ ├── POST /sessions
│ ├── POST /upload
│ └── GET /{id}/progress
├── transcriptions/
│ ├── GET /
│ ├── POST /tasks
│ ├── POST /submit
│ └── POST /skip
├── chunks/
│ ├── GET /{id}/audio
│ └── GET /{id}/info
├── admin/
│ ├── GET /stats/platform
│ ├── GET /users
│ └── GET /performance/dashboard
└── export/
├── POST /dataset
└── GET /jobs/{id}/status
Authentication & Authorization
sequenceDiagram
participant C as Client
participant A as API
participant Auth as Auth Service
participant DB as Database
C->>A: POST /auth/login
A->>Auth: Validate Credentials
Auth->>DB: Check User
DB-->>Auth: User Data
Auth->>Auth: Generate JWT
Auth-->>A: JWT + Refresh Token
A-->>C: Authentication Response
Note over C: Store JWT in memory/secure storage
C->>A: GET /recordings (with JWT)
A->>Auth: Validate JWT
Auth->>Auth: Check Expiry & Signature
Auth-->>A: User Context
A->>A: Check Permissions
A-->>C: Protected Resource
Performance Architecture
Caching Strategy
graph LR
subgraph "Client Side"
BROWSER[Browser Cache]
LOCAL[Local Storage]
end
subgraph "CDN Layer"
CDN[Content Delivery Network]
end
subgraph "Application Layer"
API_CACHE[API Response Cache]
DB_CACHE[Database Query Cache]
SESSION[Session Cache]
end
subgraph "Database Layer"
DB[(PostgreSQL)]
REDIS[(Redis)]
end
BROWSER --> CDN
CDN --> API_CACHE
API_CACHE --> DB_CACHE
DB_CACHE --> REDIS
SESSION --> REDIS
DB_CACHE --> DB
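The API response and query caches in this diagram typically follow a cache-aside pattern: check the cache, fall back to the database, then populate the cache with a TTL. An illustrative sketch with a plain dict standing in for Redis:

```python
import time

cache: dict[str, tuple[float, dict]] = {}  # key -> (expires_at, value)


def query_database(recording_id: int) -> dict:
    # Stand-in for a real PostgreSQL query.
    return {"id": recording_id, "status": "processed"}


def get_recording(recording_id: int, ttl: float = 60.0) -> dict:
    key = f"recording:{recording_id}"
    entry = cache.get(key)
    if entry and entry[0] > time.monotonic():
        return entry[1]                      # cache hit
    value = query_database(recording_id)     # cache miss: fall through to the database
    cache[key] = (time.monotonic() + ttl, value)
    return value
```

With Redis the TTL is handled server-side (SETEX), so stale entries expire without any application bookkeeping.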
Performance Optimizations
Backend Optimizations
- Connection Pooling: Database connection pooling with configurable limits
- Query Optimization: Indexed queries and efficient SQL patterns
- Async Processing: Non-blocking I/O for concurrent request handling
- Background Jobs: Heavy operations moved to background workers
- Response Compression: Gzip compression for API responses
Frontend Optimizations
- Code Splitting: Dynamic imports for reduced bundle size
- Lazy Loading: Components and routes loaded on demand
- Image Optimization: Optimized images with Next.js Image component
- Caching: Aggressive caching of static assets and API responses
- Service Workers: Offline functionality and background sync
Database Optimizations
- Indexing Strategy: Proper indexes on frequently queried columns
- Query Optimization: Efficient queries with proper joins and filters
- Read Replicas: Separate read replicas for analytics queries
- Partitioning: Table partitioning for large datasets
Security Architecture
Security Layers
graph TB
subgraph "Network Security"
FIREWALL[Firewall Rules]
DDoS[DDoS Protection]
SSL[SSL/TLS Encryption]
end
subgraph "Application Security"
AUTH[Authentication]
AUTHZ[Authorization]
VALIDATION[Input Validation]
SANITIZATION[Data Sanitization]
end
subgraph "Data Security"
ENCRYPTION[Encryption at Rest]
BACKUP[Secure Backups]
AUDIT[Audit Logging]
end
FIREWALL --> AUTH
DDoS --> AUTH
SSL --> AUTH
AUTH --> ENCRYPTION
AUTHZ --> ENCRYPTION
VALIDATION --> BACKUP
SANITIZATION --> AUDIT
Security Measures
Authentication & Authorization
- JWT Tokens: Stateless authentication with short-lived access tokens
- Refresh Tokens: Secure token refresh mechanism
- Role-Based Access: Granular permissions based on user roles
- Session Management: Secure session handling with Redis
Data Protection
- Input Validation: Comprehensive input validation using Pydantic
- SQL Injection Prevention: Parameterized queries with SQLAlchemy
- XSS Protection: Content Security Policy and input sanitization
- CSRF Protection: CSRF tokens for state-changing operations
Infrastructure Security
- HTTPS Enforcement: All communications encrypted with TLS
- Security Headers: Comprehensive security headers implementation
- Rate Limiting: Protection against abuse and DoS attacks
- File Upload Security: Secure file upload with type validation
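Rate limiting is commonly implemented as a token bucket per client. A minimal in-process sketch (a shared deployment would keep the counters in Redis so all API instances see them):

```python
import time
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float          # maximum burst size
    refill_rate: float       # tokens added per second
    tokens: float = 0.0
    last_refill: float = 0.0

    def __post_init__(self):
        self.tokens = self.capacity
        self.last_refill = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(
            self.capacity, self.tokens + (now - self.last_refill) * self.refill_rate
        )
        self.last_refill = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit; the API would answer HTTP 429


bucket = TokenBucket(capacity=3, refill_rate=1.0)
results = [bucket.allow() for _ in range(5)]  # burst of 5 against capacity 3
```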
Monitoring & Observability
Monitoring Stack
```mermaid
graph LR
    subgraph "Application"
        APP[Shrutik Application]
        METRICS[Metrics Collection]
        LOGS[Structured Logging]
        TRACES[Distributed Tracing]
    end
    subgraph "Collection"
        PROMETHEUS[Prometheus]
        LOKI[Loki]
        JAEGER[Jaeger]
    end
    subgraph "Visualization"
        GRAFANA[Grafana Dashboards]
        ALERTS[Alert Manager]
    end
    APP --> METRICS
    APP --> LOGS
    APP --> TRACES
    METRICS --> PROMETHEUS
    LOGS --> LOKI
    TRACES --> JAEGER
    PROMETHEUS --> GRAFANA
    LOKI --> GRAFANA
    JAEGER --> GRAFANA
    PROMETHEUS --> ALERTS
```
Key Metrics
Application Metrics
- Request Rate: Requests per second by endpoint
- Response Time: P50, P95, P99 response times
- Error Rate: Error percentage by endpoint and status code
- Throughput: Data processing throughput
Business Metrics
- User Engagement: Active users, session duration
- Data Quality: Transcription accuracy, consensus rates
- System Usage: Recording uploads, transcription submissions
- Performance: Audio processing times, consensus calculation speed
Infrastructure Metrics
- System Resources: CPU, memory, disk usage
- Database Performance: Query times, connection pool status
- Cache Performance: Hit rates, memory usage
- Network: Bandwidth usage, connection counts
Deployment Architecture
Environment Strategy
```mermaid
graph LR
    subgraph "Development"
        DEV_LOCAL[Local Development]
        DEV_DOCKER[Docker Development]
    end
    subgraph "Testing"
        TEST_UNIT[Unit Tests]
        TEST_INTEGRATION[Integration Tests]
        TEST_E2E[E2E Tests]
    end
    subgraph "Staging"
        STAGING[Staging Environment]
        UAT[User Acceptance Testing]
    end
    subgraph "Production"
        PROD[Production Environment]
        MONITORING[Production Monitoring]
    end
    DEV_LOCAL --> TEST_UNIT
    DEV_DOCKER --> TEST_INTEGRATION
    TEST_UNIT --> TEST_E2E
    TEST_INTEGRATION --> STAGING
    TEST_E2E --> STAGING
    STAGING --> UAT
    UAT --> PROD
    PROD --> MONITORING
```
Deployment Pipeline
- Code Commit: Developer pushes code to repository
- Automated Testing: Unit, integration, and E2E tests run
- Build Process: Docker images built and tagged
- Staging Deployment: Automatic deployment to staging
- Manual Testing: QA and user acceptance testing
- Production Deployment: Manual approval and deployment
- Health Checks: Automated health verification
- Monitoring: Continuous monitoring and alerting
Future Architecture Considerations
Scalability Enhancements
- Microservices: Further decomposition into microservices
- Event-Driven Architecture: Event sourcing and CQRS patterns
- Kubernetes: Container orchestration for better scaling
- Service Mesh: Advanced service-to-service communication
Performance Improvements
- Edge Computing: Edge nodes for global content delivery
- Advanced Caching: Distributed caching with Redis Cluster
- Database Sharding: Horizontal database partitioning
- GraphQL: More efficient data fetching
AI/ML Integration
- Automated Quality Assessment: ML-based quality scoring
- Smart Chunk Assignment: AI-driven task assignment
- Real-time Transcription: Automatic transcription assistance
- Anomaly Detection: ML-based fraud and quality detection
This architecture provides a solid foundation for Shrutik’s current needs while maintaining flexibility for future growth and enhancements.
API Reference
This document provides comprehensive documentation for the Shrutik API, including authentication, endpoints, request/response formats, and examples.
🔗 Base URL
- Development: `http://localhost:8000`
Authentication
Shrutik uses JWT (JSON Web Token) based authentication with refresh tokens for secure API access.
Authentication Flow
```mermaid
sequenceDiagram
    participant C as Client
    participant A as API
    participant DB as Database
    C->>A: POST /api/auth/login
    A->>DB: Validate credentials
    DB-->>A: User data
    A-->>C: JWT + Refresh Token
    Note over C: Store tokens securely
    C->>A: GET /api/recordings (with JWT)
    A->>A: Validate JWT
    A-->>C: Protected resource
    Note over C: JWT expires
    C->>A: POST /api/auth/refresh
    A->>A: Validate refresh token
    A-->>C: New JWT
```
Authentication Endpoints
Login
POST /api/auth/login
Content-Type: application/json
{
"email": "user@example.com",
"password": "secure_password"
}
Response:
{
"access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
"token_type": "bearer",
"user": {
"id": 1,
"email": "user@example.com",
"name": "John Doe"
}
}
Register
POST /api/auth/register
Content-Type: application/json
{
"email": "newuser@example.com",
"password": "secure_password",
"name": "Jane Smith",
"preferred_language": "bn"
}
Refresh Token
POST /api/auth/refresh
Content-Type: application/json
{
"refresh_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}
Logout
POST /api/auth/logout
Authorization: Bearer <access_token>
Using Authentication
Include the JWT token in the Authorization header for all protected endpoints:
Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
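Because access tokens are short-lived, clients typically check the token's `exp` claim and refresh before it expires rather than waiting for a 401. Below is a minimal sketch of that check, assuming a standard JWT whose payload carries an `exp` timestamp; the helper names are illustrative and not part of any Shrutik SDK.

```python
import base64
import json
import time


def jwt_is_expired(token: str, leeway: int = 30) -> bool:
    """Decode the (unverified) JWT payload and check its `exp` claim.

    Signature verification happens server-side; this only decides
    when the client should call /api/auth/refresh.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload["exp"] <= time.time() + leeway


def fake_jwt(exp: float) -> str:
    """Build an unsigned token for demonstration only."""
    header = base64.urlsafe_b64encode(b'{"alg":"HS256"}').decode().rstrip("=")
    body = base64.urlsafe_b64encode(json.dumps({"exp": exp}).encode()).decode().rstrip("=")
    return f"{header}.{body}.sig"


expired = jwt_is_expired(fake_jwt(time.time() - 60))   # token already past exp
fresh = jwt_is_expired(fake_jwt(time.time() + 3600))   # token valid for an hour
```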
Response Format
All API responses follow a consistent format:
Success Response
{
"data": {
// Response data
},
"message": "Operation successful",
"timestamp": "2024-01-01T12:00:00Z"
}
Error Response
{
"error": {
"code": "VALIDATION_ERROR",
"message": "Invalid input data",
"details": {
"field": "email",
"issue": "Invalid email format"
}
},
"timestamp": "2024-01-01T12:00:00Z"
}
Pagination Response
{
"data": [
// Array of items
],
"pagination": {
"total": 150,
"page": 1,
"per_page": 20,
"total_pages": 8,
"has_next": true,
"has_prev": false
}
}
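Clients that need every record can walk the `pagination` envelope until `has_next` is false. The sketch below shows that loop under the envelope format above; `fake_fetch` is a stand-in for the real HTTP call, and all helper names are hypothetical.

```python
def iterate_pages(fetch_page, per_page=20):
    """Yield every item across all pages of a paginated Shrutik response."""
    page = 1
    while True:
        body = fetch_page(page=page, per_page=per_page)
        yield from body["data"]
        if not body["pagination"]["has_next"]:
            break
        page += 1


def fake_fetch(page, per_page):
    """Fake endpoint returning 30 items across two pages, envelope as documented."""
    items = list(range(1, 31))
    start = (page - 1) * per_page
    return {
        "data": items[start:start + per_page],
        "pagination": {
            "total": 30,
            "page": page,
            "per_page": per_page,
            "total_pages": 2,
            "has_next": page * per_page < 30,
            "has_prev": page > 1,
        },
    }


all_items = list(iterate_pages(fake_fetch))
```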
Voice Recordings API
Create Recording Session
Start a new recording session for a specific script.
POST /api/recordings/sessions
Authorization: Bearer <token>
Content-Type: application/json
{
"script_id": 123,
"language_id": 1,
"metadata": {
"device_info": "iPhone 14",
"environment": "quiet_room"
}
}
Response:
{
"session_id": "uuid-string",
"script": {
"id": 123,
"content": "আমি বাংলায় কথা বলি।",
"language": "Bengali",
"difficulty": "easy"
},
"expires_at": "2024-01-01T14:00:00Z"
}
Upload Recording
Upload an audio file for a recording session.
POST /api/recordings/upload
Authorization: Bearer <token>
Content-Type: multipart/form-data
session_id: uuid-string
duration: 5.2
audio_format: wav
file_size: 1048576
sample_rate: 44100
channels: 1
audio_file: <binary_data>
Response:
{
"recording_id": 456,
"status": "uploaded",
"processing_job_id": "job-uuid",
"estimated_processing_time": 30
}
Get User Recordings
Retrieve paginated list of user’s recordings.
GET /api/recordings?skip=0&limit=20&status=processed
Authorization: Bearer <token>
Response:
{
"recordings": [
{
"id": 456,
"script_id": 123,
"language": "Bengali",
"duration": 5.2,
"status": "processed",
"chunks_count": 3,
"created_at": "2024-01-01T12:00:00Z"
}
],
"total": 50,
"page": 1,
"per_page": 20,
"total_pages": 3
}
Get Recording Progress
Check processing progress for a recording.
GET /api/recordings/456/progress
Authorization: Bearer <token>
Response:
{
"recording_id": 456,
"status": "processing",
"progress_percentage": 75,
"current_step": "chunking_audio",
"estimated_completion": "2024-01-01T12:05:00Z",
"chunks_created": 2
}
Transcriptions API
Get Transcription Task
Request audio chunks for transcription.
POST /api/transcriptions/tasks
Authorization: Bearer <token>
Content-Type: application/json
{
"quantity": 5,
"language_id": 1,
"skip_chunk_ids": [10, 15, 20],
"difficulty_preference": "mixed"
}
Response:
{
"session_id": "transcription-session-uuid",
"chunks": [
{
"id": 789,
"recording_id": 456,
"chunk_index": 1,
"file_path": "/chunks/chunk_789.wav",
"duration": 3.5,
"sentence_hint": "Greeting phrase",
"transcription_count": 2
}
],
"total_available": 1500
}
Submit Transcriptions
Submit transcriptions for audio chunks.
POST /api/transcriptions/submit
Authorization: Bearer <token>
Content-Type: application/json
{
"session_id": "transcription-session-uuid",
"transcriptions": [
{
"chunk_id": 789,
"language_id": 1,
"text": "আমি বাংলায় কথা বলি।",
"quality": 4.5,
"confidence": 0.95,
"metadata": {
"time_taken": 45,
"difficulty_rating": 3
}
}
],
"skipped_chunk_ids": [790]
}
Response:
{
"submitted_count": 1,
"skipped_count": 1,
"transcriptions": [
{
"id": 1001,
"chunk_id": 789,
"text": "আমি বাংলায় কথা বলি।",
"quality": 4.5,
"is_consensus": false,
"created_at": "2024-01-01T12:00:00Z"
}
],
"message": "Successfully submitted 1 transcriptions"
}
Skip Chunk
Skip a difficult or unclear audio chunk.
POST /api/transcriptions/skip
Authorization: Bearer <token>
Content-Type: application/json
{
"chunk_id": 790,
"reason": "poor_audio_quality",
"comment": "Background noise makes it unclear"
}
Get User Transcriptions
Retrieve user’s transcription history.
GET /api/transcriptions?skip=0&limit=20&language_id=1
Authorization: Bearer <token>
Response:
{
"transcriptions": [
{
"id": 1001,
"chunk_id": 789,
"text": "আমি বাংলায় কথা বলি।",
"quality": 4.5,
"confidence": 0.95,
"is_consensus": true,
"is_validated": true,
"created_at": "2024-01-01T12:00:00Z"
}
],
"total": 100,
"page": 1,
"per_page": 20,
"total_pages": 5
}
Audio Chunks API
Get Chunk Audio
Retrieve audio file for a specific chunk.
GET /api/chunks/789/audio
Authorization: Bearer <token>
Response: Binary audio data with optimized headers
Headers:
Content-Type: audio/wav
Cache-Control: public, max-age=3600
Accept-Ranges: bytes
Content-Length: 1048576
Get Chunk Info
Get metadata about an audio chunk.
GET /api/chunks/789/info
Authorization: Bearer <token>
Response:
{
"chunk_id": 789,
"recording_id": 456,
"duration": 3.5,
"start_time": 1.2,
"end_time": 4.7,
"transcription_count": 3,
"file_size": 1048576,
"optimized_url": "https://cdn.example.com/chunks/789.wav",
"alternatives": [
{
"format": ".mp3",
"url": "https://cdn.example.com/chunks/789.mp3",
"mime_type": "audio/mpeg"
}
]
}
Admin API
Platform Statistics
Get comprehensive platform statistics (admin only).
GET /api/admin/stats/platform
Authorization: Bearer <admin_token>
Response:
{
"users": {
"total": 1500,
"active_last_30_days": 450,
"new_this_month": 75
},
"recordings": {
"total": 5000,
"total_duration_hours": 250.5,
"processed": 4800,
"pending": 200
},
"transcriptions": {
"total": 15000,
"consensus_reached": 12000,
"average_quality": 4.2
},
"languages": {
"supported": 5,
"most_active": "Bengali"
}
}
User Management
Get users for management (admin only).
GET /api/admin/users?role=contributor&limit=50
Authorization: Bearer <admin_token>
Performance Dashboard
Get performance metrics (admin only).
GET /api/admin/performance/dashboard
Authorization: Bearer <admin_token>
Response:
{
"system_metrics": {
"cpu_usage": 45.2,
"memory_usage": 67.8,
"disk_usage": 23.1,
"active_connections": 150
},
"cache_performance": {
"hit_rate": 85.5,
"memory_used": "512MB",
"keys_count": 15000
},
"database_performance": {
"connection_pool": {
"total_connections": 20,
"active_connections": 8,
"idle_connections": 12
},
"slow_queries": 2
}
}
Export API
Create Dataset Export
Request a dataset export job.
POST /api/export/dataset
Authorization: Bearer <token>
Content-Type: application/json
{
"format": "csv",
"language_ids": [1, 2],
"include_audio": true,
"quality_threshold": 4.0,
"consensus_only": true,
"date_range": {
"start": "2024-01-01",
"end": "2024-12-31"
}
}
Response:
{
"job_id": "export-job-uuid",
"status": "queued",
"estimated_completion": "2024-01-01T12:30:00Z",
"estimated_size_mb": 150
}
Get Export Status
Check export job status.
GET /api/export/jobs/export-job-uuid/status
Authorization: Bearer <token>
Response:
{
"job_id": "export-job-uuid",
"status": "completed",
"progress_percentage": 100,
"download_url": "https://api.example.com/downloads/dataset-uuid.zip",
"file_size_mb": 145.7,
"expires_at": "2024-01-08T12:00:00Z"
}
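Since exports run as background jobs, clients generally poll the status endpoint until the job reaches a terminal state. A hedged sketch of that loop follows; `get_status` stands in for the `GET /api/export/jobs/<job_id>/status` request, and the canned status sequence is purely illustrative.

```python
import time


def wait_for_export(get_status, poll_seconds=0, max_polls=50):
    """Poll an export job until it completes or fails."""
    for _ in range(max_polls):
        status = get_status()
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("export job did not finish in time")


# Demo with a canned sequence of responses
states = iter([
    {"status": "queued"},
    {"status": "processing", "progress_percentage": 60},
    {"status": "completed",
     "download_url": "https://api.example.com/downloads/dataset-uuid.zip"},
])
result = wait_for_export(lambda: next(states))
```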
Scripts API
Get Available Scripts
Retrieve scripts available for recording.
GET /api/scripts?language_id=1&difficulty=easy&limit=20
Authorization: Bearer <token>
Response:
{
"scripts": [
{
"id": 123,
"content": "আমি বাংলায় কথা বলি।",
"language": {
"id": 1,
"name": "Bengali",
"code": "bn"
},
"difficulty": "easy",
"estimated_duration": 3.5,
"recording_count": 25
}
],
"total": 500,
"page": 1,
"per_page": 20
}
Languages API
Get Supported Languages
Retrieve list of supported languages.
GET /api/languages
Response:
{
"languages": [
{
"id": 1,
"name": "Bengali",
"code": "bn",
"script": "Bengali",
"active": true,
"recording_count": 5000,
"transcription_count": 15000
},
{
"id": 2,
"name": "Hindi",
"code": "hi",
"script": "Devanagari",
"active": true,
"recording_count": 3000,
"transcription_count": 9000
}
]
}
Search API
Search Transcriptions
Search through transcriptions (admin only).
GET /api/search/transcriptions?q=greeting&language_id=1&limit=20
Authorization: Bearer <admin_token>
Health Check
System Health
Check system health and status.
GET /health
Response:
{
"status": "healthy",
"checks": {
"database": true,
"redis": true,
"disk_space": true,
"memory": true
},
"performance": {
"database_pool": {
"total_connections": 20,
"active_connections": 5
},
"cache_status": true
},
"timestamp": "2024-01-01T12:00:00Z"
}
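The top-level `status` field presumably aggregates the individual checks; one plausible way to collapse them, shown only as an illustration of the response shape rather than the server's actual logic:

```python
def overall_status(checks: dict) -> str:
    """Collapse individual health checks into a single status string.

    Illustrative only: the real endpoint may use finer-grained states.
    """
    return "healthy" if all(checks.values()) else "degraded"


status = overall_status({
    "database": True,
    "redis": True,
    "disk_space": True,
    "memory": True,
})
```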
Metrics
Performance Metrics
Get performance metrics (admin only).
GET /metrics
Authorization: Bearer <admin_token>
Error Codes
HTTP Status Codes
| Code | Description |
|---|---|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 422 | Validation Error |
| 429 | Rate Limited |
| 500 | Internal Server Error |
Custom Error Codes
| Code | Description |
|---|---|
| VALIDATION_ERROR | Input validation failed |
| AUTHENTICATION_FAILED | Invalid credentials |
| INSUFFICIENT_PERMISSIONS | User lacks required permissions |
| RESOURCE_NOT_FOUND | Requested resource not found |
| RATE_LIMIT_EXCEEDED | Too many requests |
| SESSION_EXPIRED | Recording/transcription session expired |
| FILE_TOO_LARGE | Uploaded file exceeds size limit |
| UNSUPPORTED_FORMAT | Audio format not supported |
| PROCESSING_ERROR | Audio processing failed |
| CONSENSUS_PENDING | Transcription consensus not yet reached |
Rate Limits
Default Limits
| User Type | Requests/Minute |
|---|---|
| Anonymous | 60 |
| Authenticated | 300 |
| Admin | 1000 |
| Sworik Developer | 2000 |
Endpoint-Specific Limits
| Endpoint | Limit | Window |
|---|---|---|
| /api/auth/login | 10/min | 1 minute |
| /api/recordings/upload | 20/min | 1 minute |
| /api/transcriptions/submit | 100/min | 1 minute |
| /api/chunks/*/audio | 10/sec | 1 second |
Rate Limit Headers
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 299
X-RateLimit-Reset: 1640995200
Retry-After: 60
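On a 429 response, a well-behaved client honors `Retry-After` when present and otherwise falls back to the `X-RateLimit-Reset` epoch time. A small sketch of that policy (header handling only, no HTTP client implied; helper name is hypothetical):

```python
def seconds_to_wait(headers: dict, now: float) -> float:
    """Decide how long to pause after a rate-limited response.

    Prefers Retry-After; falls back to the X-RateLimit-Reset epoch.
    """
    if "Retry-After" in headers:
        return float(headers["Retry-After"])
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)


wait = seconds_to_wait(
    {"X-RateLimit-Limit": "300", "X-RateLimit-Remaining": "0",
     "X-RateLimit-Reset": "1640995200", "Retry-After": "60"},
    now=1640995140.0,
)
```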
Security
API Security Best Practices
- Always use HTTPS in production
- Store JWT tokens securely (not in localStorage for web apps)
- Implement proper CORS policies
- Validate all inputs on client and server
- Use refresh tokens for long-lived sessions
- Implement rate limiting to prevent abuse
- Log security events for monitoring
Content Security Policy
Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; media-src 'self' blob:; connect-src 'self' wss:
SDKs and Libraries
JavaScript/TypeScript SDK
```bash
npm install @shrutik/sdk
```

```typescript
import { ShrutikClient } from '@shrutik/sdk';

const client = new ShrutikClient({
  baseURL: 'https://api.yourdomain.com',
  apiKey: 'your-api-key'
});

// Get transcription task
const task = await client.transcriptions.getTask({
  quantity: 5,
  languageId: 1
});

// Submit transcription
await client.transcriptions.submit({
  sessionId: task.sessionId,
  transcriptions: [{
    chunkId: 789,
    text: 'Transcribed text',
    quality: 4.5
  }]
});
```
Python SDK
```bash
pip install shrutik-sdk
```

```python
from shrutik import ShrutikClient

client = ShrutikClient(
    base_url='https://api.yourdomain.com',
    api_key='your-api-key'
)

# Get transcription task
task = client.transcriptions.get_task(
    quantity=5,
    language_id=1
)

# Submit transcription
client.transcriptions.submit(
    session_id=task.session_id,
    transcriptions=[{
        'chunk_id': 789,
        'text': 'Transcribed text',
        'quality': 4.5
    }]
)
```
Testing
API Testing with curl
```bash
# Login
curl -X POST https://api.yourdomain.com/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","password":"password"}'

# Get recordings (with token)
curl -X GET https://api.yourdomain.com/api/recordings \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Upload recording
curl -X POST https://api.yourdomain.com/api/recordings/upload \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -F "session_id=uuid" \
  -F "duration=5.2" \
  -F "audio_format=wav" \
  -F "file_size=1048576" \
  -F "audio_file=@recording.wav"
```
For additional support, join our Discord community or check our GitHub repository.
Audio Processing Modes
The Voice Data Collection Platform supports two modes for audio processing to accommodate different development and deployment scenarios.
Processing Modes
1. Asynchronous Processing (Celery) - Recommended for Production
When to use:
- Production deployments
- Development with full background processing
- When you want non-blocking audio uploads
- When processing large audio files
How it works:
- User uploads audio file
- File is saved and marked as `UPLOADED`
- Celery task is queued for background processing
- User gets immediate response
- Background worker processes audio into chunks
- Status updates to `PROCESSED` when complete
- User can check progress via API
Setup:
```bash
# In .env file
USE_CELERY=true

# Start services
redis-server
celery -A app.core.celery_app worker --loglevel=info
uvicorn app.main:app --reload
```
Benefits:
- ✅ Non-blocking uploads
- ✅ Scalable (multiple workers)
- ✅ Retry mechanisms
- ✅ Progress tracking
- ✅ Monitoring via Flower
Drawbacks:
- ❌ More complex setup
- ❌ Requires Redis
- ❌ Requires Celery workers
2. Synchronous Processing - Simple Development
When to use:
- Quick local development
- Testing without Celery setup
- Simple deployments
- When immediate results are needed
How it works:
- User uploads audio file
- File is saved and marked as `PROCESSING`
- Audio processing happens immediately in the request
- Chunks are created during the upload request
- Status updates to `PROCESSED` before response
- User gets complete results immediately
Setup:
```bash
# In .env file
USE_CELERY=false

# Start service
uvicorn app.main:app --reload
```
Benefits:
- ✅ Simple setup (no Redis/Celery needed)
- ✅ Immediate results
- ✅ Easier debugging
- ✅ No additional services required
Drawbacks:
- ❌ Blocking uploads (slower response)
- ❌ No retry mechanisms
- ❌ Single-threaded processing
- ❌ No progress tracking
Automatic Mode Detection
The system automatically detects which mode to use:
```python
def _is_celery_available(self) -> bool:
    # 1. Check configuration
    if not settings.USE_CELERY:
        return False
    # 2. Check if workers are running
    try:
        inspect = celery_app.control.inspect()
        stats = inspect.stats()
        return stats is not None and len(stats) > 0
    except Exception:
        return False
```
Fallback Logic:
- If `USE_CELERY=false` → always use synchronous processing
- If `USE_CELERY=true` but no workers are running → fall back to synchronous processing
- If `USE_CELERY=true` and workers are available → use Celery processing
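These fallback rules amount to a small dispatch function. The sketch below mirrors them without touching Celery itself; `worker_stats` stands in for the result of `celery_app.control.inspect().stats()`, and the function name is illustrative.

```python
def choose_mode(use_celery_setting: bool, worker_stats) -> str:
    """Return "celery" or "sync" following the documented fallback rules.

    worker_stats is None (no workers responded) or a dict of
    worker name -> stats, as inspect().stats() would return.
    """
    if not use_celery_setting:
        return "sync"          # USE_CELERY=false: always synchronous
    if worker_stats:
        return "celery"        # configured and workers available
    return "sync"              # configured for Celery but no workers: fall back


mode = choose_mode(True, {"worker1@host": {}})
```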
API Behavior Differences
Upload Response
Celery Mode:
```json
{
  "id": 123,
  "status": "uploaded",
  "message": "File uploaded successfully, processing queued"
}
```
Synchronous Mode:
```json
{
  "id": 123,
  "status": "processed",
  "chunks_created": 5,
  "message": "File uploaded and processed successfully"
}
```
Progress Tracking
Celery Mode:
```text
# Check progress
GET /api/recordings/123/progress
{
  "status": "processing",
  "progress": 45,
  "chunks_created": 0
}

# Later...
GET /api/recordings/123/progress
{
  "status": "processed",
  "progress": 100,
  "chunks_created": 5
}
```
Synchronous Mode:
```text
# Progress is always complete
GET /api/recordings/123/progress
{
  "status": "processed",
  "progress": 100,
  "chunks_created": 5
}
```
Configuration Options
Environment Variables
```bash
# Enable/disable Celery
USE_CELERY=true|false

# Celery configuration (when enabled)
REDIS_URL=redis://localhost:6379/0
JOB_MAX_RETRIES=3
JOB_RETRY_DELAY=60
```
Runtime Detection
The system logs which mode is being used:
```text
# Celery mode
INFO: Queued audio processing task abc123 for recording 456

# Synchronous mode
INFO: Celery not available, processing recording 456 synchronously...
INFO: Successfully processed recording 456 into 5 chunks
```
Development Workflow
For Frontend Development
Use synchronous mode for simplicity:
```bash
USE_CELERY=false
uvicorn app.main:app --reload
```
For Full-Stack Development
Use Celery mode to test complete workflow:
```bash
USE_CELERY=true
./scripts/start-local-dev.sh
```
For Production Testing
Always use Celery mode:
```bash
USE_CELERY=true
# + proper Redis/Celery setup
```
Monitoring and Debugging
Celery Mode Monitoring
```bash
# Check worker status
celery -A app.core.celery_app inspect active

# Monitor via Flower
celery -A app.core.celery_app flower --port=5555

# Check job status via API
GET /api/jobs/active
```
Synchronous Mode Debugging
```bash
# Check logs for processing errors
tail -f logs/app.log

# Processing happens in main thread
# Errors appear immediately in response
```
Performance Considerations
Celery Mode
- Throughput: High (parallel processing)
- Response Time: Fast (immediate return)
- Resource Usage: Distributed across workers
- Scalability: Horizontal (add more workers)
Synchronous Mode
- Throughput: Limited (sequential processing)
- Response Time: Slow (includes processing time)
- Resource Usage: Single process
- Scalability: Vertical only
Error Handling
Celery Mode
- Automatic retries with exponential backoff
- Failed tasks can be manually retried
- Detailed error tracking in job monitoring
- Notifications for failures
Synchronous Mode
- Immediate error response
- No automatic retries
- Simpler error debugging
- Direct error messages
Migration Between Modes
From Synchronous to Celery
- Set `USE_CELERY=true`
- Start Redis and Celery workers
- Existing processed recordings work normally
- New uploads use background processing
From Celery to Synchronous
- Set `USE_CELERY=false`
- Stop Celery workers (optional)
- Existing queued tasks will fail
- New uploads use synchronous processing
Note: In-progress Celery tasks will fail when switching to synchronous mode. Complete or cancel them first.
Best Practices
Development
- Use synchronous mode for quick testing
- Use Celery mode when testing full workflow
- Monitor logs for processing errors
Production
- Always use Celery mode
- Set up proper monitoring
- Configure retry mechanisms
- Use multiple workers for scalability
Testing
- Test both modes in CI/CD
- Verify fallback behavior
- Test error scenarios in both modes
Shrutik System Flowcharts
This directory contains visual documentation of Shrutik’s system flows and processes using Mermaid diagrams. These flowcharts help developers and contributors understand the system architecture and data flow.
Available Flowcharts
Core System Flows
- Overall System Architecture - High-level system overview
- Voice Recording Flow - Complete voice recording process
- Transcription Workflow - Transcription and consensus process
- User Authentication Flow - User registration and authentication
- Data Processing Pipeline - Audio processing and chunking
Technical Flows
- API Request Flow - API request lifecycle
- Database Operations - Database interaction patterns
- Caching Strategy - Caching and performance optimization
- Background Jobs - Celery task processing
How to Read These Diagrams
Symbols and Conventions
- Rectangles: Processes or services
- Diamonds: Decision points
- Circles: Start/end points
- Cylinders: Databases or storage
- Clouds: External services
- Arrows: Data flow direction
Color Coding
- Blue: User interactions
- Green: Successful operations
- Red: Error conditions
- Yellow: Processing/waiting states
- Purple: External services
🔧 Updating Flowcharts
When making changes to the system:
- Review Affected Diagrams: Check which flowcharts need updates
- Update Mermaid Code: Modify the diagram code
- Test Rendering: Ensure diagrams render correctly
- Update Documentation: Sync with code changes
Mermaid Syntax Reference
```mermaid
graph TD
    A[Start] --> B{Decision?}
    B -->|Yes| C[Process]
    B -->|No| D[Alternative]
    C --> E[End]
    D --> E
```
📚 Additional Resources
Contributing
To contribute new flowcharts or update existing ones:
- Follow the naming convention: `kebab-case.md`
- Include a description and context
- Use consistent styling and colors
- Test diagram rendering
- Update this README if adding new diagrams
These visual guides complement our technical documentation and help make Shrutik more accessible to contributors of all backgrounds.
Voice Recording Flow
This flowchart details the complete process of voice recording in Shrutik, from user interaction to final storage and processing.
Complete Voice Recording Process
```mermaid
flowchart TD
    START([User Starts Recording]) --> AUTH{User Authenticated?}
    AUTH -->|No| LOGIN[Redirect to Login]
    LOGIN --> AUTH
    AUTH -->|Yes| SCRIPT[Get Script for Recording]
    SCRIPT --> PERM{Microphone Permission?}
    PERM -->|No| REQ_PERM[Request Microphone Access]
    REQ_PERM --> PERM_GRANT{Permission Granted?}
    PERM_GRANT -->|No| ERROR_PERM[Show Permission Error]
    PERM_GRANT -->|Yes| PERM
    PERM -->|Yes| SETUP[Setup Audio Recording]
    SETUP --> DISPLAY[Display Script Text]
    DISPLAY --> READY[Show Record Button]
    READY --> RECORD_START[User Clicks Record]
    RECORD_START --> RECORDING[Recording Audio...]
    RECORDING --> MONITOR{Monitor Recording}
    MONITOR --> CHECK_DURATION{Duration < Max?}
    CHECK_DURATION -->|No| AUTO_STOP[Auto Stop Recording]
    CHECK_DURATION -->|Yes| USER_STOP{User Stops?}
    USER_STOP -->|No| MONITOR
    USER_STOP -->|Yes| STOP_REC[Stop Recording]
    AUTO_STOP --> STOP_REC
    STOP_REC --> VALIDATE[Validate Audio]
    VALIDATE --> VALID{Audio Valid?}
    VALID -->|No| ERROR_AUDIO[Show Audio Error]
    ERROR_AUDIO --> READY
    VALID -->|Yes| PREVIEW[Show Audio Preview]
    PREVIEW --> USER_ACTION{User Action}
    USER_ACTION -->|Re-record| READY
    USER_ACTION -->|Cancel| CANCEL[Cancel Recording]
    USER_ACTION -->|Submit| PREPARE[Prepare Upload]
    PREPARE --> CREATE_SESSION[Create Recording Session]
    CREATE_SESSION --> SESSION_VALID{Session Created?}
    SESSION_VALID -->|No| ERROR_SESSION[Session Creation Error]
    SESSION_VALID -->|Yes| UPLOAD[Upload Audio File]
    UPLOAD --> UPLOAD_PROGRESS[Show Upload Progress]
    UPLOAD_PROGRESS --> UPLOAD_COMPLETE{Upload Complete?}
    UPLOAD_COMPLETE -->|No| UPLOAD_ERROR[Upload Error]
    UPLOAD_ERROR --> RETRY{Retry Upload?}
    RETRY -->|Yes| UPLOAD
    RETRY -->|No| CANCEL
    UPLOAD_COMPLETE -->|Yes| SAVE_DB[Save to Database]
    SAVE_DB --> QUEUE_PROCESSING[Queue for Processing]
    QUEUE_PROCESSING --> SUCCESS[Show Success Message]
    SUCCESS --> NEXT_ACTION{User Next Action}
    NEXT_ACTION -->|Record Another| SCRIPT
    NEXT_ACTION -->|View Progress| DASHBOARD[Go to Dashboard]
    NEXT_ACTION -->|Logout| LOGOUT[Logout User]
    CANCEL --> CLEANUP[Cleanup Resources]
    ERROR_PERM --> CLEANUP
    ERROR_SESSION --> CLEANUP
    CLEANUP --> END([End])
    DASHBOARD --> END
    LOGOUT --> END
    %% Background Processing (Async)
    QUEUE_PROCESSING -.-> BG_START[Background Processing Starts]
    BG_START -.-> VALIDATE_FILE[Validate Audio File]
    VALIDATE_FILE -.-> CHUNK_AUDIO[Intelligent Audio Chunking]
    CHUNK_AUDIO -.-> SAVE_CHUNKS[Save Audio Chunks]
    SAVE_CHUNKS -.-> UPDATE_STATUS[Update Recording Status]
    UPDATE_STATUS -.-> NOTIFY_USER[Notify User of Completion]
    %% Styling
    classDef userAction fill:#e3f2fd
    classDef process fill:#e8f5e8
    classDef decision fill:#fff3e0
    classDef error fill:#ffebee
    classDef success fill:#e0f2f1
    classDef background fill:#f3e5f5
    class START,RECORD_START,USER_STOP,USER_ACTION,NEXT_ACTION userAction
    class SETUP,DISPLAY,RECORDING,VALIDATE,PREPARE,UPLOAD,SAVE_DB process
    class AUTH,PERM,PERM_GRANT,CHECK_DURATION,VALID,SESSION_VALID,UPLOAD_COMPLETE,RETRY decision
    class ERROR_PERM,ERROR_AUDIO,ERROR_SESSION,UPLOAD_ERROR error
    class SUCCESS,NOTIFY_USER success
    class BG_START,VALIDATE_FILE,CHUNK_AUDIO,SAVE_CHUNKS,UPDATE_STATUS background
```
Process Breakdown
1. User Authentication & Setup
```mermaid
sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant DB as Database
    U->>F: Access Recording Page
    F->>B: Check Authentication
    B->>DB: Validate Session
    DB-->>B: User Data
    B-->>F: Authentication Status
    alt Not Authenticated
        F->>U: Redirect to Login
        U->>F: Login Credentials
        F->>B: Authenticate User
        B-->>F: JWT Token
    end
    F->>B: Request Script
    B->>DB: Get Available Script
    DB-->>B: Script Data
    B-->>F: Script Content
    F->>U: Display Script
```
2. Audio Recording Process
```mermaid
sequenceDiagram
    participant U as User
    participant F as Frontend
    participant M as MediaRecorder
    participant V as Validation
    U->>F: Click Record Button
    F->>M: Start Recording
    M-->>F: Recording Started
    F->>U: Show Recording UI
    loop During Recording
        M->>F: Audio Data Chunks
        F->>V: Validate Duration
        V-->>F: Status Update
    end
    U->>F: Stop Recording
    F->>M: Stop Recording
    M-->>F: Final Audio Blob
    F->>V: Validate Audio Quality
    V-->>F: Validation Result
    F->>U: Show Preview/Options
```
3. File Upload & Processing
```mermaid
sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant S as Storage
    participant Q as Queue
    participant W as Worker
    U->>F: Submit Recording
    F->>B: Create Recording Session
    B-->>F: Session ID
    F->>B: Upload Audio File
    B->>S: Store Audio File
    S-->>B: File Path
    B->>Q: Queue Processing Job
    B-->>F: Upload Success
    F->>U: Show Success Message
    Q->>W: Process Audio File
    W->>S: Read Audio File
    W->>W: Validate & Chunk Audio
    W->>S: Save Audio Chunks
    W->>B: Update Status
    B->>F: Notify Completion (WebSocket)
    F->>U: Show Processing Complete
```
🔍 Validation Steps
Audio Quality Validation
- Duration Check: 1-60 seconds
- Format Validation: Supported audio formats
- File Size: Maximum 100MB
- Sample Rate: Minimum quality requirements
- Noise Level: Basic noise detection
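These checks can be expressed as a simple gate before upload. The limits below come from the list above; the allowed-format set and function name are assumptions for illustration only.

```python
MAX_FILE_BYTES = 100 * 1024 * 1024  # 100MB cap from the validation list


def validate_audio(duration_s: float, size_bytes: int, fmt: str,
                   allowed_formats=("wav", "mp3", "webm")) -> list:
    """Return a list of validation errors; an empty list means the clip passes.

    The format whitelist here is illustrative, not the platform's actual set.
    """
    errors = []
    if not 1 <= duration_s <= 60:
        errors.append("duration must be between 1 and 60 seconds")
    if size_bytes > MAX_FILE_BYTES:
        errors.append("file exceeds 100MB limit")
    if fmt.lower() not in allowed_formats:
        errors.append(f"unsupported format: {fmt}")
    return errors


ok = validate_audio(5.2, 1_048_576, "wav")       # passes all checks
too_long = validate_audio(75.0, 1_048_576, "wav")  # fails the duration check
```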
Security Validation
- File Type: MIME type verification
- Malware Scan: Basic security checks
- User Permissions: Recording quota limits
- Session Validation: Valid recording session
Performance Optimizations
Frontend Optimizations
- Progressive Upload: Chunked file upload
- Compression: Client-side audio compression
- Caching: Cache user preferences and scripts
- Offline Support: Queue recordings when offline
Backend Optimizations
- Async Processing: Background job processing
- Connection Pooling: Database connection optimization
- Caching: Redis caching for frequent data
- CDN Integration: Optimized file delivery
Error Handling
Common Error Scenarios
- **Microphone Access Denied**
  - Show clear instructions
  - Provide alternative options
  - Guide user through browser settings
- **Network Connection Issues**
  - Implement retry logic
  - Show connection status
  - Queue uploads for later
- **File Upload Failures**
  - Automatic retry with exponential backoff
  - Resume interrupted uploads
  - Clear error messages
- **Audio Quality Issues**
  - Real-time quality feedback
  - Recording tips and guidance
  - Option to re-record
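For the upload-failure case, "automatic retry with exponential backoff" usually means doubling the delay between attempts up to a cap. A minimal schedule sketch, with jitter omitted for clarity (the function name and constants are illustrative):

```python
def backoff_delays(attempts: int, base: float = 1.0, cap: float = 30.0):
    """Exponential backoff schedule for upload retries: base * 2^i, capped."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


delays = backoff_delays(5)
```

In practice a random jitter is usually added to each delay so simultaneous clients don't retry in lockstep.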
Error Recovery
```mermaid
flowchart LR
    ERROR[Error Occurs] --> LOG[Log Error Details]
    LOG --> CLASSIFY{Error Type}
    CLASSIFY -->|Network| RETRY[Automatic Retry]
    CLASSIFY -->|Validation| USER_FIX[User Action Required]
    CLASSIFY -->|System| FALLBACK[Fallback Method]
    RETRY --> SUCCESS{Retry Success?}
    SUCCESS -->|Yes| CONTINUE[Continue Process]
    SUCCESS -->|No| USER_FIX
    USER_FIX --> GUIDE[Show User Guidance]
    FALLBACK --> ALTERNATIVE[Alternative Flow]
    GUIDE --> CONTINUE
    ALTERNATIVE --> CONTINUE
```
Monitoring & Analytics
Key Metrics
- Recording Success Rate: Percentage of successful recordings
- Average Recording Duration: User engagement metrics
- Upload Success Rate: Technical performance metrics
- Processing Time: Background job performance
- Error Rates: System reliability metrics
User Experience Metrics
- Time to First Recording: Onboarding effectiveness
- Recording Abandonment Rate: UX friction points
- Retry Attempts: Error recovery effectiveness
- User Satisfaction: Quality ratings and feedback
This comprehensive flow ensures a smooth, reliable voice recording experience while maintaining high quality standards and robust error handling.
Transcription Workflow
This flowchart details the complete transcription process in Shrutik, including task assignment, transcription submission, consensus building, and quality control.
💡 Pro Tip: All diagrams below are interactive! Use your mouse wheel to zoom, drag to pan, and double-click to reset. Click the fullscreen button for a better view of complex diagrams.
Complete Transcription Process
flowchart TD
START([User Requests Transcription Task]) --> AUTH{User Authenticated?}
AUTH -->|No| LOGIN[Redirect to Login]
LOGIN --> AUTH
AUTH -->|Yes| REQ_TASK[Request Transcription Task]
REQ_TASK --> TASK_PARAMS[Specify Task Parameters]
TASK_PARAMS --> FIND_CHUNKS[Find Available Chunks]
FIND_CHUNKS --> FILTER[Apply Filters]
FILTER --> EXCLUDE[Exclude User's Previous Work]
EXCLUDE --> AVAILABLE{Chunks Available?}
AVAILABLE -->|No| NO_CHUNKS[No Chunks Available]
NO_CHUNKS --> SUGGEST[Suggest Alternatives]
SUGGEST --> END_NO_WORK([End - No Work])
AVAILABLE -->|Yes| SELECT[Select Random Chunks]
SELECT --> CREATE_SESSION[Create Transcription Session]
CREATE_SESSION --> LOAD_AUDIO[Load Audio Files]
LOAD_AUDIO --> OPTIMIZE[Optimize Audio Delivery]
OPTIMIZE --> PRESENT[Present Chunks to User]
PRESENT --> USER_WORK[User Transcribes Audio]
USER_WORK --> TRANSCRIBE{Transcription Action}
TRANSCRIBE -->|Skip Chunk| SKIP_CHUNK[Record Skip Reason]
TRANSCRIBE -->|Transcribe| ENTER_TEXT[Enter Transcription Text]
TRANSCRIBE -->|Submit All| VALIDATE_SUBMISSION[Validate Submission]
SKIP_CHUNK --> UPDATE_SKIP[Update Skip Metadata]
UPDATE_SKIP --> NEXT_CHUNK{More Chunks?}
ENTER_TEXT --> QUALITY_RATE[Rate Audio Quality]
QUALITY_RATE --> CONFIDENCE[Set Confidence Level]
CONFIDENCE --> SAVE_DRAFT[Save Draft Locally]
SAVE_DRAFT --> NEXT_CHUNK
NEXT_CHUNK -->|Yes| PRESENT
NEXT_CHUNK -->|No| VALIDATE_SUBMISSION
VALIDATE_SUBMISSION --> CHECK_REQUIRED{Required Fields?}
CHECK_REQUIRED -->|Missing| SHOW_ERRORS[Show Validation Errors]
SHOW_ERRORS --> USER_WORK
CHECK_REQUIRED -->|Complete| SUBMIT[Submit Transcriptions]
SUBMIT --> PROCESS_SUBMISSION[Process Submission]
PROCESS_SUBMISSION --> VALIDATE_SESSION{Valid Session?}
VALIDATE_SESSION -->|No| SESSION_ERROR[Session Error]
SESSION_ERROR --> ERROR_RECOVERY[Error Recovery]
VALIDATE_SESSION -->|Yes| CHECK_DUPLICATES{Check Duplicates}
CHECK_DUPLICATES -->|Found| DUPLICATE_ERROR[Duplicate Error]
DUPLICATE_ERROR --> ERROR_RECOVERY
CHECK_DUPLICATES -->|None| SAVE_TRANSCRIPTIONS[Save Transcriptions]
SAVE_TRANSCRIPTIONS --> UPDATE_STATS[Update User Stats]
UPDATE_STATS --> TRIGGER_CONSENSUS[Trigger Consensus Calculation]
TRIGGER_CONSENSUS --> SUCCESS[Show Success Message]
SUCCESS --> CLEANUP_SESSION[Cleanup Session]
CLEANUP_SESSION --> NEXT_ACTION{User Next Action}
NEXT_ACTION -->|Continue| REQ_TASK
NEXT_ACTION -->|View Progress| DASHBOARD[Go to Dashboard]
NEXT_ACTION -->|Logout| LOGOUT[Logout User]
ERROR_RECOVERY --> RETRY{Retry Submission?}
RETRY -->|Yes| SUBMIT
RETRY -->|No| SAVE_DRAFT
DASHBOARD --> END_SUCCESS([End - Success])
LOGOUT --> END_SUCCESS
%% Background Consensus Process
TRIGGER_CONSENSUS -.-> BG_CONSENSUS[Background Consensus Process]
BG_CONSENSUS -.-> COLLECT_TRANSCRIPTIONS[Collect All Transcriptions for Chunk]
COLLECT_TRANSCRIPTIONS -.-> CALCULATE_SIMILARITY[Calculate Text Similarity]
CALCULATE_SIMILARITY -.-> WEIGHT_QUALITY[Weight by Quality Scores]
WEIGHT_QUALITY -.-> DETERMINE_CONSENSUS[Determine Consensus Text]
DETERMINE_CONSENSUS -.-> UPDATE_CONSENSUS[Update Consensus in Database]
UPDATE_CONSENSUS -.-> NOTIFY_CONTRIBUTORS[Notify Contributors]
%% Styling
classDef userAction fill:#e3f2fd
classDef process fill:#e8f5e8
classDef decision fill:#fff3e0
classDef error fill:#ffebee
classDef success fill:#e0f2f1
classDef background fill:#f3e5f5
class START,USER_WORK,TRANSCRIBE,NEXT_ACTION userAction
class REQ_TASK,FIND_CHUNKS,SELECT,LOAD_AUDIO,SAVE_TRANSCRIPTIONS process
class AUTH,AVAILABLE,CHECK_REQUIRED,VALIDATE_SESSION,CHECK_DUPLICATES decision
class SESSION_ERROR,DUPLICATE_ERROR,SHOW_ERRORS error
class SUCCESS,NOTIFY_CONTRIBUTORS success
class BG_CONSENSUS,COLLECT_TRANSCRIPTIONS,CALCULATE_SIMILARITY,WEIGHT_QUALITY background
Task Assignment Algorithm
flowchart LR
subgraph "Task Request Parameters"
LANG[Language Preference]
QTY[Quantity Requested]
SKIP[Skip List]
DIFFICULTY[Difficulty Level]
end
subgraph "Filtering Process"
ALL_CHUNKS[All Available Chunks]
FILTER_LANG[Filter by Language]
FILTER_USER[Exclude User's Work]
FILTER_SKIP[Exclude Skip List]
FILTER_STATUS[Filter by Status]
PRIORITIZE[Prioritize by Need]
end
subgraph "Selection Strategy"
RANDOM[Random Selection]
BALANCED[Balance Difficulty]
QUALITY[Quality Distribution]
FINAL[Final Chunk List]
end
LANG --> FILTER_LANG
QTY --> RANDOM
SKIP --> FILTER_SKIP
DIFFICULTY --> BALANCED
ALL_CHUNKS --> FILTER_LANG
FILTER_LANG --> FILTER_USER
FILTER_USER --> FILTER_SKIP
FILTER_SKIP --> FILTER_STATUS
FILTER_STATUS --> PRIORITIZE
PRIORITIZE --> RANDOM
RANDOM --> BALANCED
BALANCED --> QUALITY
QUALITY --> FINAL
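The filtering pipeline above can be sketched in a few lines; the chunk schema (`id`, `language`, `status`, `transcribed_by`) is illustrative, not Shrutik's actual model:

```python
import random

def assign_chunks(chunks, user_id, language, skip_ids, quantity, seed=None):
    """Filter available chunks per the pipeline above, then sample at random."""
    candidates = [
        c for c in chunks
        if c["language"] == language            # filter by language
        and c["status"] == "available"          # filter by status
        and user_id not in c["transcribed_by"]  # exclude user's own work
        and c["id"] not in skip_ids             # exclude skip list
    ]
    rng = random.Random(seed)
    return rng.sample(candidates, min(quantity, len(candidates)))

chunks = [
    {"id": 1, "language": "bn", "status": "available", "transcribed_by": set()},
    {"id": 2, "language": "bn", "status": "available", "transcribed_by": {7}},
    {"id": 3, "language": "en", "status": "available", "transcribed_by": set()},
    {"id": 4, "language": "bn", "status": "complete", "transcribed_by": set()},
]
task = assign_chunks(chunks, user_id=7, language="bn", skip_ids=set(), quantity=5)
print([c["id"] for c in task])  # [1] -- only chunk 1 passes every filter
```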
Consensus Algorithm
flowchart TD
CHUNK[Audio Chunk] --> COLLECT[Collect All Transcriptions]
COLLECT --> COUNT{Transcription Count}
COUNT -->|< 3| NEED_MORE[Need More Transcriptions]
COUNT -->|≥ 3| ANALYZE[Analyze Transcriptions]
ANALYZE --> SIMILARITY[Calculate Text Similarity]
SIMILARITY --> CLUSTER[Group Similar Transcriptions]
CLUSTER --> WEIGHT[Apply Quality Weights]
WEIGHT --> SCORE[Calculate Consensus Scores]
SCORE --> THRESHOLD{Above Threshold?}
THRESHOLD -->|No| NEED_MORE
THRESHOLD -->|Yes| SELECT_CONSENSUS[Select Consensus Text]
SELECT_CONSENSUS --> VALIDATE_CONSENSUS[Validate Consensus Quality]
VALIDATE_CONSENSUS --> MARK_COMPLETE[Mark Chunk as Complete]
NEED_MORE --> PRIORITY[Increase Priority for Assignment]
MARK_COMPLETE --> UPDATE_CONTRIBUTORS[Update Contributor Stats]
%% Consensus Calculation Details
subgraph "Similarity Calculation"
LEVENSHTEIN[Levenshtein Distance]
SEMANTIC[Semantic Similarity]
PHONETIC[Phonetic Matching]
COMBINED[Combined Score]
end
SIMILARITY --> LEVENSHTEIN
SIMILARITY --> SEMANTIC
SIMILARITY --> PHONETIC
LEVENSHTEIN --> COMBINED
SEMANTIC --> COMBINED
PHONETIC --> COMBINED
COMBINED --> CLUSTER
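A minimal sketch of the consensus step, using `difflib.SequenceMatcher` as a stand-in for the combined Levenshtein/semantic/phonetic score (the threshold and minimum count here are assumptions, not the engine's actual tuning):

```python
from difflib import SequenceMatcher

def consensus_text(transcriptions, min_count=3, threshold=0.8):
    """Pick a consensus transcription, or None if agreement is insufficient.

    Each candidate is scored by its average pairwise similarity to the
    other transcriptions; the best candidate wins if it clears the threshold.
    """
    if len(transcriptions) < min_count:
        return None  # need more transcriptions
    best_text, best_score = None, 0.0
    for i, candidate in enumerate(transcriptions):
        others = [t for j, t in enumerate(transcriptions) if j != i]
        score = sum(SequenceMatcher(None, candidate, t).ratio()
                    for t in others) / len(others)
        if score > best_score:
            best_text, best_score = candidate, score
    return best_text if best_score >= threshold else None

texts = ["the quick brown fox", "the quick brown fox", "the quikc brown fox"]
print(consensus_text(texts))        # the quick brown fox
print(consensus_text(["only one"]))  # None -- below min_count
```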
Quality Control Process
sequenceDiagram
participant U as User
participant F as Frontend
participant B as Backend
participant Q as Quality Engine
participant DB as Database
participant N as Notification
U->>F: Submit Transcription
F->>B: Send Transcription Data
B->>DB: Save Transcription
B->>Q: Trigger Quality Check
Q->>DB: Get Related Transcriptions
Q->>Q: Calculate Quality Metrics
alt Quality Issues Detected
Q->>DB: Flag for Review
Q->>N: Notify Moderators
else Quality Acceptable
Q->>Q: Update Quality Score
end
Q->>B: Quality Assessment Complete
B->>F: Update UI Status
F->>U: Show Completion Status
Note over Q: Background Consensus Process
Q->>Q: Check Consensus Threshold
alt Consensus Reached
Q->>DB: Update Consensus Text
Q->>N: Notify Contributors
else Need More Transcriptions
Q->>DB: Increase Chunk Priority
end
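The quality engine's pre-consensus checks might look like the following sketch; the heuristics and bounds are illustrative, not the production rules:

```python
def quality_check(text, audio_seconds, chars_per_second_bounds=(5, 25)):
    """Cheap heuristics a quality engine might run before consensus.

    Flags obviously suspect submissions: empty text, or a text length
    implausible for the audio duration (the bounds are illustrative).
    """
    issues = []
    if not text.strip():
        issues.append("empty")
    else:
        cps = len(text) / audio_seconds
        low, high = chars_per_second_bounds
        if not (low <= cps <= high):
            issues.append("length_mismatch")
    return {"ok": not issues, "issues": issues}

print(quality_check("the quick brown fox jumps", 2.0))
# {'ok': True, 'issues': []}
print(quality_check("hi", 8.0))
# {'ok': False, 'issues': ['length_mismatch']}
```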
Progress Tracking
Individual User Progress
graph LR
subgraph "User Metrics"
TOTAL[Total Transcriptions]
ACCURACY[Accuracy Rate]
SPEED[Average Speed]
QUALITY[Quality Score]
end
subgraph "Achievements"
BADGES[Achievement Badges]
LEVELS[Experience Levels]
STREAKS[Contribution Streaks]
RANKINGS[Leaderboards]
end
TOTAL --> BADGES
ACCURACY --> LEVELS
SPEED --> RANKINGS
QUALITY --> STREAKS
System-wide Progress
graph TD
subgraph "Dataset Metrics"
CHUNKS_TOTAL[Total Audio Chunks]
CHUNKS_TRANSCRIBED[Transcribed Chunks]
CONSENSUS_REACHED[Consensus Achieved]
QUALITY_VALIDATED[Quality Validated]
end
subgraph "Language Coverage"
LANG_SUPPORTED[Supported Languages]
DIALECT_COVERAGE[Dialect Coverage]
SPEAKER_DIVERSITY[Speaker Diversity]
DOMAIN_COVERAGE[Domain Coverage]
end
CHUNKS_TOTAL --> CHUNKS_TRANSCRIBED
CHUNKS_TRANSCRIBED --> CONSENSUS_REACHED
CONSENSUS_REACHED --> QUALITY_VALIDATED
QUALITY_VALIDATED --> LANG_SUPPORTED
LANG_SUPPORTED --> DIALECT_COVERAGE
DIALECT_COVERAGE --> SPEAKER_DIVERSITY
SPEAKER_DIVERSITY --> DOMAIN_COVERAGE
Optimization Strategies
Performance Optimizations
- Caching: Cache frequently accessed chunks and user data
- Preloading: Preload next chunks while user works on current ones
- CDN: Optimize audio delivery through CDN
- Compression: Compress audio for faster loading
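As a sketch of the caching idea, frequently requested chunk metadata can be memoized; in the real service the cache would live in Redis rather than in-process, and `chunk_metadata` is a hypothetical lookup:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)
def chunk_metadata(chunk_id):
    """Stand-in for an expensive database or storage lookup."""
    # In the real service this would query PostgreSQL / Redis.
    return {"id": chunk_id, "duration_s": 6.4}

chunk_metadata(42)  # miss: computes and caches
chunk_metadata(42)  # hit: served from cache
print(chunk_metadata.cache_info().hits)  # 1
```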
User Experience Optimizations
- Smart Assignment: Assign chunks based on user expertise and preferences
- Progress Indicators: Clear progress tracking and feedback
- Keyboard Shortcuts: Efficient transcription interface
- Auto-save: Prevent data loss with automatic saving
Quality Optimizations
- Difficulty Balancing: Mix easy and challenging chunks
- Context Provision: Provide helpful context and hints
- Real-time Feedback: Immediate quality feedback
- Consensus Weighting: Weight transcriptions by contributor reliability
Error Handling & Recovery
Common Error Scenarios
1. Session Timeout
   - Auto-save work in progress
   - Seamless session renewal
   - Recovery of unsaved work
2. Network Interruption
   - Offline work capability
   - Automatic retry mechanisms
   - Queue submissions for later
3. Audio Loading Issues
   - Fallback audio formats
   - Progressive loading
   - Error reporting and alternatives
4. Consensus Conflicts
   - Human review escalation
   - Weighted voting systems
   - Quality threshold adjustments
Recovery Mechanisms
flowchart LR
ERROR[Error Detected] --> CLASSIFY{Error Classification}
CLASSIFY -->|Temporary| AUTO_RETRY[Automatic Retry]
CLASSIFY -->|User Error| USER_GUIDANCE[User Guidance]
CLASSIFY -->|System Error| ESCALATE[Escalate to Support]
AUTO_RETRY --> SUCCESS{Retry Success?}
SUCCESS -->|Yes| CONTINUE[Continue Process]
SUCCESS -->|No| USER_GUIDANCE
USER_GUIDANCE --> RESOLVED{Issue Resolved?}
RESOLVED -->|Yes| CONTINUE
RESOLVED -->|No| ESCALATE
ESCALATE --> SUPPORT[Support Intervention]
SUPPORT --> CONTINUE
This comprehensive transcription workflow ensures high-quality data collection while providing an engaging and efficient experience for contributors.
Shrutik System Architecture
This diagram shows the high-level architecture of the Shrutik voice data collection platform, including all major components and their interactions.
Overall System Architecture
graph TB
subgraph "Client Layer"
WEB[Web Browser]
MOBILE[Mobile App]
API_CLIENT[API Client]
end
subgraph "Load Balancer & Proxy"
NGINX[Nginx<br/>Load Balancer]
end
subgraph "Application Layer"
FRONTEND[React Frontend<br/>Next.js]
BACKEND[FastAPI Backend<br/>Python]
WORKER[Celery Workers<br/>Background Jobs]
end
subgraph "Caching Layer"
REDIS[(Redis<br/>Cache & Queue)]
end
subgraph "Database Layer"
POSTGRES[(PostgreSQL<br/>Primary Database)]
REPLICA[(PostgreSQL<br/>Read Replica)]
end
subgraph "Storage Layer"
LOCAL_STORAGE[Local File Storage<br/>Audio Files]
CDN[CDN<br/>Static Assets]
BACKUP[Backup Storage<br/>S3/MinIO]
end
subgraph "External Services"
SMTP[Email Service<br/>SMTP]
MONITORING[Monitoring<br/>Prometheus/Grafana]
LOGGING[Logging<br/>ELK Stack]
end
subgraph "Processing Pipeline"
AUDIO_PROC[Audio Processing<br/>Librosa/PyDub]
CONSENSUS[Consensus Engine<br/>Quality Control]
EXPORT[Data Export<br/>Multiple Formats]
end
%% Client connections
WEB --> NGINX
MOBILE --> NGINX
API_CLIENT --> NGINX
%% Load balancer routing
NGINX --> FRONTEND
NGINX --> BACKEND
%% Application connections
FRONTEND --> BACKEND
BACKEND --> REDIS
BACKEND --> POSTGRES
BACKEND --> REPLICA
WORKER --> REDIS
WORKER --> POSTGRES
WORKER --> LOCAL_STORAGE
%% Processing connections
WORKER --> AUDIO_PROC
WORKER --> CONSENSUS
BACKEND --> EXPORT
%% Storage connections
BACKEND --> LOCAL_STORAGE
FRONTEND --> CDN
LOCAL_STORAGE --> BACKUP
%% External service connections
BACKEND --> SMTP
BACKEND --> MONITORING
BACKEND --> LOGGING
%% Styling
classDef client fill:#e1f5fe
classDef app fill:#e8f5e8
classDef data fill:#fff3e0
classDef external fill:#f3e5f5
classDef processing fill:#e0f2f1
class WEB,MOBILE,API_CLIENT client
class FRONTEND,BACKEND,WORKER app
class POSTGRES,REPLICA,REDIS,LOCAL_STORAGE data
class SMTP,MONITORING,LOGGING external
class AUDIO_PROC,CONSENSUS,EXPORT processing
Component Descriptions
Client Layer
- Web Browser: Primary interface for contributors using React/Next.js frontend
- Mobile App: Future mobile application for voice contributions
- API Client: External integrations and automated systems
Load Balancer & Proxy
- Nginx: Handles SSL termination, load balancing, and static file serving
- Routes requests to appropriate backend services
- Implements rate limiting and security headers
Application Layer
- React Frontend: User interface built with Next.js and TypeScript
- FastAPI Backend: RESTful API server with automatic documentation
- Celery Workers: Background job processing for audio tasks
Caching Layer
- Redis: Serves multiple purposes:
- Session storage and caching
- Message queue for Celery
- Rate limiting counters
- Real-time data caching
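The rate-limiting counters can be sketched as a fixed-window limiter; here a plain dict stands in for Redis `INCR`/`EXPIRE`:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter; a dict stands in for Redis INCR/EXPIRE."""

    def __init__(self, limit, window_s, clock=time.time):
        self.limit, self.window_s, self.clock = limit, window_s, clock
        self.counters = {}  # (key, window) -> request count

    def allow(self, key):
        window = int(self.clock() // self.window_s)  # which window are we in?
        bucket = (key, window)
        self.counters[bucket] = self.counters.get(bucket, 0) + 1
        return self.counters[bucket] <= self.limit

# Frozen clock for a deterministic demo: 3 requests allowed per minute.
limiter = FixedWindowLimiter(limit=3, window_s=60, clock=lambda: 0)
results = [limiter.allow("user:7") for _ in range(5)]
print(results)  # [True, True, True, False, False]
```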
Database Layer
- PostgreSQL Primary: Main database for all application data
- PostgreSQL Replica: Read-only replica for analytics and reporting
- Supports horizontal scaling and high availability
Storage Layer
- Local File Storage: Audio files and uploads stored locally or on network storage
- CDN: Content delivery network for static assets and optimized audio delivery
- Backup Storage: Automated backups to S3-compatible storage
External Services
- Email Service: SMTP for user notifications and system alerts
- Monitoring: Prometheus and Grafana for system monitoring
- Logging: Centralized logging with ELK stack or similar
Processing Pipeline
- Audio Processing: Intelligent audio chunking and format conversion
- Consensus Engine: Quality control and transcription consensus algorithms
- Data Export: Multiple format support for dataset export
Data Flow Patterns
1. Voice Recording Flow
User → Frontend → Backend → Storage → Worker → Audio Processing → Database
2. Transcription Flow
User → Frontend → Backend → Database → Consensus Engine → Quality Metrics
3. API Request Flow
Client → Nginx → Backend → Cache/Database → Response → Client
🚀 Scalability Considerations
Horizontal Scaling
- Frontend: Multiple instances behind load balancer
- Backend: Stateless API servers can be scaled horizontally
- Workers: Auto-scaling based on queue length
- Database: Read replicas for query distribution
Performance Optimization
- Caching: Multi-layer caching strategy with Redis
- CDN: Global content delivery for static assets
- Database: Connection pooling and query optimization
- Background Jobs: Async processing for heavy operations
High Availability
- Load Balancing: Multiple instances of each service
- Database Replication: Primary-replica setup with automatic failover
- Health Checks: Automated monitoring and alerting
- Backup Strategy: Regular automated backups
Security Architecture
Authentication & Authorization
- JWT-based authentication with refresh tokens
- Role-based access control (RBAC)
- API key authentication for external clients
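A JWT-style flow reduces to signing and verifying claims. The stdlib-only sketch below illustrates the idea; it is not the platform's actual auth code (a real deployment would use a JWT library and load the secret from configuration):

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"change-me"  # illustrative; never hard-code secrets in practice

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(claims: dict) -> str:
    """Produce a compact signed token: payload.signature (JWT-like)."""
    payload = _b64(json.dumps(claims, sort_keys=True).encode())
    sig = _b64(hmac.new(SECRET, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{sig}"

def verify_token(token: str) -> Optional[dict]:
    """Return the claims if the signature checks out, else None."""
    payload, _, sig = token.rpartition(".")
    expected = _b64(hmac.new(SECRET, payload.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))

token = sign_token({"sub": 7, "role": "contributor"})
print(verify_token(token))        # {'role': 'contributor', 'sub': 7}
print(verify_token(token + "x"))  # None -- tampered signature
```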
Data Protection
- HTTPS/TLS encryption for all communications
- Database encryption at rest
- Secure file upload validation
- Input sanitization and validation
Network Security
- Firewall rules and network segmentation
- Rate limiting and DDoS protection
- Security headers and CORS configuration
- Regular security audits and updates
Monitoring & Observability
Metrics Collection
- Application performance metrics
- System resource monitoring
- Business metrics and analytics
- Error tracking and alerting
Logging Strategy
- Structured logging with correlation IDs
- Centralized log aggregation
- Log retention and archival policies
- Security event logging
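Correlation IDs can be attached with a logging filter; this is a minimal stdlib sketch, not Shrutik's actual logging setup:

```python
import logging
import uuid

class CorrelationFilter(logging.Filter):
    """Attach a correlation id to every record so logs for one request
    can be grouped in the aggregator."""

    def __init__(self, correlation_id):
        super().__init__()
        self.correlation_id = correlation_id

    def filter(self, record):
        record.correlation_id = self.correlation_id
        return True

logger = logging.getLogger("shrutik.demo")
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(correlation_id)s %(levelname)s %(message)s"))
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.addFilter(CorrelationFilter(uuid.uuid4().hex[:8]))

logger.info("chunk saved")  # e.g. "3f9a1c2b INFO chunk saved"
```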
Health Checks
- Service health endpoints
- Database connectivity checks
- External service dependency monitoring
- Automated failover mechanisms
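A health endpoint typically aggregates named dependency probes; the sketch below (with hypothetical `db_ping`/`storage_ping` checks) shows the shape:

```python
def run_health_checks(checks):
    """Run named check callables; overall status is 'ok' only if all pass."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:  # a failing dependency must not crash the probe
            results[name] = f"error: {exc}"
    status = "ok" if all(v == "ok" for v in results.values()) else "degraded"
    return {"status": status, "checks": results}

def db_ping():  # stand-in for a real connectivity probe
    pass

def storage_ping():  # stand-in that simulates a failing dependency
    raise IOError("mount unreachable")

report = run_health_checks({"database": db_ping, "storage": storage_ping})
print(report["status"])  # degraded
```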
Deployment Architecture
Development Environment
- Local development with Docker Compose
- Hot reload for rapid development
- Isolated test databases
- Mock external services
Staging Environment
- Production-like environment for testing
- Automated deployment pipeline
- Integration testing
- Performance testing
Production Environment
- Multi-zone deployment for high availability
- Blue-green deployment strategy
- Automated rollback capabilities
- Comprehensive monitoring and alerting
This architecture supports Shrutik’s mission of democratizing voice technology while maintaining high performance, security, and scalability standards.
Contributing to Shrutik
Thank you for your interest in contributing to Shrutik! This guide will help you get started with contributing to our open-source voice data collection platform.
Ways to Contribute
Voice Data Contribution
- Record Voice Samples: Contribute voice recordings in your native language
- Transcribe Audio: Help transcribe audio clips to improve dataset quality
- Quality Review: Review and validate transcriptions from other contributors
- Language Support: Help add support for new languages and dialects
Code Contribution
- Bug Fixes: Fix reported issues and improve stability
- Feature Development: Implement new features and enhancements
- Performance Optimization: Improve system performance and scalability
- Testing: Write and improve test coverage
- Documentation: Improve code documentation and API references
Documentation
- User Guides: Improve setup and usage documentation
- Developer Docs: Enhance technical documentation
- Translations: Translate documentation to other languages
- Tutorials: Create tutorials and examples
Design & UX
- UI/UX Improvements: Enhance user interface and experience
- Accessibility: Improve accessibility features
- Mobile Responsiveness: Optimize for mobile devices
- Branding: Improve visual design and branding
Getting Started
1. Set Up Development Environment
Follow our Local Development Guide to set up your development environment. Alternatively, you can develop with Docker; see the Docker Local Setup guide.
2. Find an Issue
- Browse open issues
- Look for issues labeled `good first issue` for beginners
- Check issues labeled `help wanted` for areas needing assistance
- Join our Discord to discuss ideas
3. Fork and Clone
# Fork the repository on GitHub
# Then clone your fork
git clone https://github.com/YOUR_USERNAME/shrutik.git
cd shrutik
# Add upstream remote
git remote add upstream https://github.com/Onuronon-lab/Shrutik.git
Development Workflow
1. Create a Branch
Important: All PRs must be submitted to the deployment-dev branch, not master.
Before starting development, please review our Engineering Conventions for branch naming, commit messages, and coding standards.
# Update deployment-dev branch
git checkout deployment-dev
git pull origin deployment-dev
# Create a feature branch following our naming convention
git checkout -b feat/your-feature-name
# or for bug fixes
git checkout -b fix/issue-number-description
2. Make Changes
- Write code
- Add tests for new functionality
- Update documentation as needed
- Ensure all tests pass
3. Commit Changes
Follow our Engineering Conventions for commit message format.
# Stage your changes
git add .
# Commit with a descriptive message following conventional commits
git commit -m "feat: add voice recording validation
- Add audio quality validation
- Implement duration checks
- Add error handling for invalid formats
- Update tests and documentation
Fixes #123"
4. Push and Create PR
# Push to your fork
git push origin feat/your-feature-name
# Create a Pull Request to deployment-dev on GitHub
PR Guidelines:
- Target the `deployment-dev` branch (not master!)
- Fill out the PR template completely
- Ensure all CI checks pass
- Code must be formatted (see Code Formatting section)
Code Formatting
We use automated code formatters to maintain consistent code style and eliminate formatting-related merge conflicts.
Tools & Configuration
- Backend (Python): Black (88 chars), isort, flake8
- Frontend (TypeScript/React): Prettier (100 chars), ESLint
Quick Setup
1. Install formatting tools:
pip install black isort flake8
cd frontend && npm install && cd ..
2. Set up pre-commit hooks (recommended):
./scripts/setup_pre_commit.sh
This auto-formats your code on every commit!
Using Pre-commit Hooks
Once set up, just commit normally:
git add .
git commit -m "feat: your changes"
# ✨ Code is automatically formatted before commit!
Before Submitting a PR
If not using pre-commit hooks, format manually:
# Format entire codebase
./scripts/format_code.sh
# Review changes
git diff
# Commit and push
git add .
git commit -m "style: format code"
git push
Manual Formatting Commands
# Format everything
./scripts/format_code.sh
# Backend only
black app/ tests/ scripts/
isort app/ tests/ scripts/
# Frontend only
cd frontend
npm run format
npm run lint:fix
CI/CD Checks
Our GitHub Actions workflow automatically checks formatting on all PRs to deployment-dev. If formatting fails:
./scripts/format_code.sh
git add .
git commit -m "style: fix formatting"
git push
Skipping Hooks (Emergency Only)
git commit --no-verify -m "emergency fix"
Note: Use sparingly! The CI will still check formatting.
Troubleshooting
| Problem | Solution |
|---|---|
| Tools not found | pip install black isort flake8 |
| Prettier not found | cd frontend && npm install |
| Hooks not running | pre-commit install |
Style Guidelines
Python
# ✅ Good (Black formatted)
def calculate_total(items: list[dict], tax_rate: float = 0.1) -> float:
"""Calculate total with tax."""
subtotal = sum(item["price"] for item in items)
return subtotal * (1 + tax_rate)
TypeScript/React
// ✅ Good (Prettier formatted)
const UserCard = ({ name, email }: UserCardProps) => {
return (
<div className="user-card">
<h2>{name}</h2>
<p>{email}</p>
</div>
);
};
Benefits:
- ✅ Zero formatting conflicts in PRs
- ✅ Faster code reviews (focus on logic)
- ✅ Consistent codebase
- ✅ Automatic on every commit
For more details, see docs/FORMATTING.md
Commit Message Guidelines
We follow the Conventional Commits specification as outlined in our Engineering Conventions:
<type>[optional scope]: <description>
[optional body]
[optional footer(s)]
Types
- `feat`: New feature
- `fix`: Bug fix
- `docs`: Documentation changes
- `style`: Code style changes (formatting, etc.)
- `refactor`: Code refactoring
- `test`: Adding or updating tests
- `chore`: Maintenance tasks
Examples
feat(auth): add OAuth2 authentication
fix(api): resolve transcription submission error
docs(readme): update installation instructions
test(voice): add unit tests for audio processing
Testing Guidelines
Running Tests
# Backend tests
pytest
# Frontend tests
cd frontend && npm test
# Integration tests
pytest tests/integration/
# E2E tests
cd frontend && npm run test:e2e
Writing Tests
Backend Tests (Python)
# tests/test_transcription.py
import pytest
from app.services.transcription_service import TranscriptionService
def test_create_transcription(db_session):
"""Test transcription creation."""
service = TranscriptionService(db_session)
transcription = service.create_transcription(
chunk_id=1,
user_id=1,
text="Test transcription"
)
assert transcription.text == "Test transcription"
Frontend Tests (TypeScript/Jest)
// frontend/src/__tests__/VoiceRecorder.test.tsx
import { render, screen } from '@testing-library/react';
import VoiceRecorder from '../components/VoiceRecorder';
test('renders voice recorder component', () => {
render(<VoiceRecorder />);
const recordButton = screen.getByRole('button', { name: /record/i });
expect(recordButton).toBeInTheDocument();
});
Test Coverage
- Maintain minimum 80% test coverage
- Write tests for all new features
- Include edge cases and error scenarios
- Test both happy path and error conditions
Coding Standards
Please refer to our Engineering Conventions for detailed coding standards and philosophy. The following sections provide specific implementation guidelines.
Python (Backend)
Code Style
- Follow PEP 8 style guide
- Use Black for code formatting
- Use isort for import sorting
- Use flake8 for linting
# Format code
black app/
isort app/
# Check linting
flake8 app/
Code Structure
# Good: Clear function with type hints and docstring
from typing import Optional
from sqlalchemy.orm import Session
def get_user_by_email(db: Session, email: str) -> Optional[User]:
"""
Retrieve user by email address.
Args:
db: Database session
email: User email address
Returns:
User object if found, None otherwise
"""
return db.query(User).filter(User.email == email).first()
Error Handling
# Good: Specific exception handling
try:
user = create_user(db, user_data)
except ValidationError as e:
logger.error(f"User validation failed: {e}")
raise HTTPException(status_code=400, detail=str(e))
except DatabaseError as e:
logger.error(f"Database error: {e}")
raise HTTPException(status_code=500, detail="Internal server error")
TypeScript/React (Frontend)
Code Style
- Use Prettier for code formatting
- Use ESLint for linting
- Follow React best practices
- Use TypeScript for type safety
# Format code
npm run format
# Check linting
npm run lint
Component Structure
// Good: Typed React component with proper structure
interface VoiceRecorderProps {
onRecordingComplete: (audioBlob: Blob) => void;
maxDuration?: number;
}
export const VoiceRecorder: React.FC<VoiceRecorderProps> = ({
onRecordingComplete,
maxDuration = 60
}) => {
const [isRecording, setIsRecording] = useState(false);
// Component logic here
return (
<div className="voice-recorder">
{/* JSX here */}
</div>
);
};
Database
Migrations
# Good: Clear migration with proper naming
"""Add voice quality metrics
Revision ID: 001_add_voice_quality
Revises: 000_initial
Create Date: 2024-01-01 12:00:00.000000
"""
from alembic import op
import sqlalchemy as sa
def upgrade():
op.add_column('transcriptions',
sa.Column('quality_score', sa.Float, nullable=True))
def downgrade():
op.drop_column('transcriptions', 'quality_score')
Documentation Standards
Code Documentation
- Use clear, descriptive docstrings
- Document all public functions and classes
- Include parameter types and return values
- Provide usage examples for complex functions
API Documentation
- Use OpenAPI/Swagger annotations
- Document all endpoints, parameters, and responses
- Include example requests and responses
- Document error codes and messages
User Documentation
- Write clear, step-by-step instructions
- Include screenshots and examples
- Test all instructions on a fresh environment
- Keep documentation up-to-date with code changes
Code Review Process
Submitting a Pull Request
- Title: Clear, descriptive title
- Description: Explain what and why
- Testing: Describe how you tested the changes
- Screenshots: Include for UI changes
- Breaking Changes: Document any breaking changes
PR Template
## Description
Brief description of changes
## Type of Change
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update
## Testing
- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed
## Screenshots (if applicable)
## Checklist
- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
- [ ] Tests added/updated
Review Criteria
Reviewers will check for:
- Functionality: Does the code work as intended?
- Code Quality: Is the code clean and maintainable?
- Testing: Are there adequate tests?
- Documentation: Is documentation updated?
- Performance: Are there any performance implications?
- Security: Are there any security concerns?
Internationalization
Adding New Languages
- Language Configuration: Add the language to `app/models/language.py`
- Frontend Translations: Add translations to `frontend/src/locales/`
- Backend Messages: Update error messages and notifications
- Documentation: Translate key documentation
Translation Guidelines
- Use proper Unicode support for all scripts
- Test with right-to-left languages
- Consider cultural context in translations
- Use native speakers for translation review
🎤 Voice Data Guidelines
Recording Quality
- Environment: Quiet, echo-free environment
- Equipment: Good quality microphone
- Format: WAV or high-quality MP3
- Duration: 2-10 seconds per clip
- Content: Clear, natural speech
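The 2-10 second guideline is easy to check mechanically; this sketch assumes the clip's frame count and sample rate are already known (e.g. read from the WAV header):

```python
def clip_is_valid(n_frames, sample_rate, min_s=2.0, max_s=10.0):
    """Check a clip's duration against the 2-10 second guideline."""
    duration = n_frames / sample_rate
    return min_s <= duration <= max_s

print(clip_is_valid(n_frames=96_000, sample_rate=16_000))  # True  (6.0 s)
print(clip_is_valid(n_frames=16_000, sample_rate=16_000))  # False (1.0 s)
```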
Transcription Guidelines
- Accuracy: Transcribe exactly what is spoken
- Formatting: Follow language-specific conventions
- Punctuation: Include appropriate punctuation
- Quality Rating: Rate audio quality honestly
Recognition
Contributor Recognition
- Contributors are listed in our CONTRIBUTORS.md file
- Significant contributors may be invited to join the core team
- We highlight contributions in our release notes
- Annual contributor appreciation events
Badges and Achievements
- First-time contributor badge
- Language champion badges
- Code contributor levels
- Community helper recognition
Getting Help
Community Support
- Discord: Join our server for real-time help
- GitHub Discussions: Ask questions and share ideas
- Office Hours: Weekly community calls (schedule in Discord)
Mentorship Program
- New contributors can request mentorship
- Experienced contributors can volunteer as mentors
- Structured onboarding for major contributions
Contact
- General Questions: community@shrutik.org
- Technical Issues: dev@shrutik.org
- Security Issues: security@shrutik.org (private)
📜 Code of Conduct
We are committed to providing a welcoming and inclusive environment. Please read our Code of Conduct before contributing.
Our Standards
- Be Respectful: Treat everyone with respect and kindness
- Be Inclusive: Welcome contributors from all backgrounds
- Be Collaborative: Work together towards common goals
- Be Patient: Help others learn and grow
📄 License
By contributing to Shrutik, you agree that your contributions will be licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This ensures that all contributions remain available for educational and non-commercial use while requiring attribution to the original creators.
Thank you for contributing to Shrutik! Together, we’re building a more inclusive digital future. 🎉
Engineering Conventions & Philosophy
Why this exists
This document is not about rules for the sake of rules. It exists to reduce confusion, remove unnecessary discussion, and protect engineering quality.
Conventions are not constraints, they are agreements. Agreements let teams move fast without stepping on each other.
1. Philosophy (Read this first)
- Engineering is about clarity, not cleverness.
- If something needs explanation, it’s already slightly wrong.
- Conventions exist so that:
- No one has to ask questions repeatedly
- No one has to justify decisions emotionally
- The system explains itself
We don’t optimize for personal preference. We optimize for collective understanding and future maintainability.
Everything here follows one principle:
Do not raise unnecessary questions for the next person reading your work.
That next person might be your teammate. Or future you at 3 AM.
2. Branch Naming Convention
Branch names must follow this format:
<prefix>/<short-description>
Allowed prefixes
- `feat/` → New features
- `fix/` → Bug fixes
- `hotfix/` → Critical production fixes
- `docs/` → Documentation-only changes
Examples
- `feat/auth-verification`
- `fix/password-reset-token`
- `docs/api-guidelines`
Why this matters
- Branch lists should be scannable at a glance
- Prefixes instantly communicate intent
- Consistency removes cognitive load
If every branch uses a different word (feature/, new/, stuff/), the system slowly becomes noisy.
Noise kills velocity.
3. Commit Message Convention
We follow Conventional Commits.
Format:
<type>(<scope>): <clear, concrete description>
Allowed types
- `feat` → New functionality
- `fix` → Bug fix
- `docs` → Documentation
- `refactor` → Code restructure without behavior change
- `test` → Tests
- `chore` → Tooling / config
Examples
- `feat(auth): add email verification flow`
- `fix(auth): prevent reset token reuse`
- `docs(readme): add setup instructions`
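These rules are mechanical enough to lint. A hedged sketch of a commit-message check (the regex simply mirrors the allowed types above; it is not an official Conventional Commits parser):

```python
import re

COMMIT_RE = re.compile(
    r"^(feat|fix|docs|refactor|test|chore)"  # allowed types
    r"(\([a-z0-9-]+\))?"                     # optional scope
    r": \S.*$"                               # ': ' then a concrete description
)

for msg in ["feat(auth): add email verification flow",
            "fix: prevent reset token reuse",
            "improved stuff"]:
    print(msg, "->", bool(COMMIT_RE.match(msg)))
# feat(auth): add email verification flow -> True
# fix: prevent reset token reuse -> True
# improved stuff -> False
```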
What commit messages are not
- Not marketing
- Not self-evaluation
- Not emotion
Avoid words like:
- strong
- robust
- powerful
- improved (without context)
A commit message should describe what changed, not how good it feels.
If something is buggy → it’s wrong. If something works → that’s the baseline, not an achievement.
4. Pull Requests
- A PR should do one logical thing
- The title should summarize the change
- The description should answer:
- What changed?
- Why was it needed?
No philosophy debates inside PRs. If a rule is violated, it will be requested to change, not discussed.
5. Source of Truth
- WIP project docs are not the source of truth
- External standards, official documentation, and established practices take priority
Always question:
- outdated docs
- informal assumptions
- “this is how we’ve been doing it”
Engineering grows by questioning, not by accepting.
6. Ego & Engineering
- Software is never perfect
- Everything has limits
- Everything breaks eventually
That’s exactly why we aim for:
- clarity over cleverness
- simplicity over ego
- consistency over preference
Having strong opinions is good. Letting conventions decide instead of ego is better.
7. Final Note
These conventions are not optional. They exist so we can:
- move faster
- argue less
- build things that last
If something here feels strict, that’s intentional. Discipline is what gives freedom later.
Clean systems scale. Messy ones don’t.
Follow the convention. Save your energy for real problems.
Code of Conduct
Our Pledge
We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.
Our Standards
Examples of behavior that contributes to a positive environment for our community include:
- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall community
- Using welcoming and inclusive language
- Being respectful of differing cultural backgrounds and languages
- Encouraging and supporting new contributors
Examples of unacceptable behavior include:
- The use of sexualized language or imagery, and sexual attention or advances of any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others’ private information, such as a physical or email address, without their explicit permission
- Discrimination or harassment based on any protected characteristic
- Other conduct which could reasonably be considered inappropriate in a professional setting
Language and Cultural Sensitivity
Given Shrutik’s mission to support underrepresented languages and communities:
- Be respectful of all languages, dialects, and accents
- Avoid making assumptions about language proficiency or cultural backgrounds
- Be patient with non-native speakers of any language
- Celebrate linguistic diversity and cultural differences
- Provide translations or explanations when using technical terms
- Be mindful that humor and expressions may not translate across cultures
Voice Data Contribution Guidelines
When contributing voice data or transcriptions:
- Respect the privacy and consent of all speakers
- Do not submit recordings without proper consent
- Be honest about audio quality and transcription accuracy
- Respect cultural and religious sensitivities in content
- Follow platform guidelines for appropriate content
- Report any inappropriate or harmful content
Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.
Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.
Scope
This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.
Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at conduct@shrutik.org. All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the reporter of any incident.
Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:
1. Correction
Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.
Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.
2. Warning
Community Impact: A violation through a single incident or series of actions.
Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.
3. Temporary Ban
Community Impact: A serious violation of community standards, including sustained inappropriate behavior.
Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.
4. Permanent Ban
Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.
Consequence: A permanent ban from any sort of public interaction within the community.
Reporting Guidelines
If you experience or witness unacceptable behavior, please report it by:
- Email: onuronon.dev@gmail.com
- Discord: Direct message to moderators
- GitHub: Use the report feature or contact maintainers
When reporting, please include:
- Your contact information
- Names (usernames, real names) of any individuals involved
- Your account of what occurred, including any available records (screenshots, logs, etc.)
- Any additional information that may be helpful
Response Process
- Acknowledgment: We will acknowledge receipt of your report within 24 hours
- Investigation: We will investigate the matter thoroughly and fairly
- Decision: We will make a decision based on our guidelines and communicate it to all parties
- Follow-up: We will follow up to ensure the resolution is effective
Appeals Process
If you disagree with a moderation decision:
- Send an appeal to appeals@shrutik.org within 30 days
- Include your reasoning and any additional information
- The appeal will be reviewed by different community leaders
- The appeal decision is final
Community Resources
Support Channels
- Discord Community: https://discord.gg/9hZ9eW8ARk
- GitHub Discussions: https://github.com/Onuronon-lab/Shrutik/discussions
Recognition
We believe in recognizing positive contributions to our community:
- Community Champions: Monthly recognition for helpful community members
- Mentorship Program: Opportunities to guide new contributors
- Speaking Opportunities: Invitations to represent Shrutik at events
- Contributor Spotlight: Featured stories of community members
Continuous Improvement
This Code of Conduct is a living document. We regularly review and update it based on:
- Community feedback and suggestions
- Evolving best practices in open source communities
- Lessons learned from enforcement experiences
- Changes in our community’s needs and composition
To suggest improvements, please:
- Open an issue on GitHub with the “code-of-conduct” label
- Join discussions in our Discord #community-guidelines channel
- Email suggestions to conduct@shrutik.org
Acknowledgments
This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.
For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.
Contact Information
- General Conduct Questions: onuronon.dev@gmail.com
Remember: We’re all here because we believe in making voice technology more inclusive. Let’s work together to create a welcoming space where everyone can contribute their unique perspectives and talents.
Thank you for helping make Shrutik a welcoming, inclusive community for everyone! 🎤✨
Contributors
Thank you to all the amazing people who have contributed to Shrutik (শ্রুতিক)! This project exists because of the collective effort of developers, linguists, designers, and community members from around the world.
Core Team
Project Founders
- [Ifrun Kader Ruhin] - Project Creator & Lead Developer
  - GitHub: @ifrunruhin12
  - Role: Architecture, Backend Development, Project Vision
Core Maintainers
- [Maintainer Name] - Lead Frontend Developer
  - GitHub: @maintainer
  - Role: Frontend Architecture, UI/UX Design
- [Maintainer Name] - DevOps & Infrastructure
  - GitHub: @devops
  - Role: Deployment, CI/CD, Performance Optimization
💻 Code Contributors
Major Contributors (50+ commits)
- [Contributor Name] - @username
  - Contributions: Audio processing pipeline, performance optimizations
  - Languages: Python, JavaScript
Regular Contributors (10+ commits)
- [Contributor Name] - @username
  - Contributions: API development, database design
  - Languages: Python, SQL
- [Contributor Name] - @username
  - Contributions: Frontend components, accessibility improvements
  - Languages: TypeScript, React
First-Time Contributors
- [Furqan Ahmed] - @furqanRupom
  - First contribution: Bug fix in audio validation
  - Date: 2025-11-02
Voice Data Contributors
Language Champions
These contributors have made significant voice data contributions in their native languages:
Bengali (বাংলা)
- [Contributor Name] - 500+ recordings, 1000+ transcriptions
- [Contributor Name] - 300+ recordings, 800+ transcriptions
- [Contributor Name] - 200+ recordings, 600+ transcriptions
Quality Reviewers
- [Reviewer Name] - 2000+ transcription reviews
- [Reviewer Name] - 1500+ transcription reviews
- [Reviewer Name] - 1200+ transcription reviews
Documentation Contributors
Documentation Team
- [Doc Contributor] - @username
  - Contributions: API documentation, user guides
  - Specialty: Technical writing
- [Doc Contributor] - @username
  - Contributions: Deployment guides, troubleshooting
  - Specialty: DevOps documentation
Translators
- [Translator Name] - Bengali translation lead
- [Translator Name] - Hindi translation lead
- [Translator Name] - Tamil translation lead
Design Contributors
UI/UX Designers
- [Designer Name] - @username
  - Contributions: User interface design, user experience research
  - Tools: Figma, Adobe XD
- [Designer Name] - @username
  - Contributions: Logo design, branding, visual identity
  - Tools: Illustrator, Photoshop
Accessibility Experts
- [A11y Expert] - @username
  - Contributions: Accessibility audits, WCAG compliance
  - Specialty: Screen reader optimization
Research Contributors
Academic Researchers
- Dr. [Researcher Name] - [University/Institution]
  - Contributions: Voice quality metrics, consensus algorithms
  - Publications: [Link to relevant papers]
- Prof. [Researcher Name] - [University/Institution]
  - Contributions: Linguistic analysis, dialect classification
  - Expertise: Computational linguistics
Data Scientists
- [Data Scientist] - @username
  - Contributions: Quality control algorithms, statistical analysis
  - Tools: Python, R, Machine Learning
Discord Moderators
- [Moderator Name] - Senior Moderator
- [Moderator Name] - Community Moderator
- [Moderator Name] - Technical Support Moderator
🏅 Special Recognition
Milestone Achievements
🥇 Gold Contributors (Exceptional Impact)
- [Contributor Name] - First to reach 1000 voice contributions
- [Contributor Name] - Implemented critical security features
- [Contributor Name] - Led successful community outreach campaign
🥈 Silver Contributors (Significant Impact)
- [Contributor Name] - Major performance optimizations
- [Contributor Name] - Comprehensive testing framework
- [Contributor Name] - Multi-language support implementation
🥉 Bronze Contributors (Notable Impact)
- [Contributor Name] - Bug fixes and stability improvements
- [Contributor Name] - Documentation improvements
- [Contributor Name] - Community support and mentoring
Annual Awards (2026)
🏆 Contributor of the Year
[Winner Name] - For outstanding contributions across code, community, and voice data
🌟 Rising Star
[Winner Name] - For exceptional growth and impact as a new contributor
🤝 Community Champion
[Winner Name] - For building bridges between technical and linguistic communities
Technical Excellence
[Winner Name] - For innovative solutions and architectural improvements
📊 Contribution Statistics
Overall Stats (as of 2026)
- Total Contributors: x+
- Code Contributors: x
- Voice Contributors: x+
- Documentation Contributors: x
- Countries Represented: x+
- Languages Supported: x
🎯 Contribution Types
Code Contributions
- Backend Development: x contributors
- Frontend Development: x contributors
- DevOps & Infrastructure: x contributors
- Testing & QA: x contributors
- Security: x contributors
Non-Code Contributions
- Voice Recordings: x contributors
- Transcriptions: x contributors
- Quality Reviews: x contributors
- Documentation: x contributors
- Translation: x contributors
- Design: x contributors
- Community Management: x contributors
🚀 How to Join
Want to see your name here? Here’s how you can contribute:
For Developers
- Check our Contributing Guide
- Look for issues labeled good first issue
- Join our Discord for technical discussions
For Voice Contributors
- Visit our platform at onuronon.org
- Register and start recording in your native language
- Help transcribe and review audio from others
For Designers
- Check our design needs in GitHub issues
- Share your portfolio and design ideas
- Help improve user experience and accessibility
For Linguists & Researchers
- Join our research discussions on Discord
- Contribute to quality metrics and algorithms
- Help with linguistic analysis and validation
🙏 Acknowledgments
Special Thanks
- Open Source Community - For the amazing tools and libraries we build upon
- Academic Partners - For research collaboration and validation
- Early Adopters - For testing and feedback during development
- Funding Partners - For supporting the project’s growth
- Language Communities - For trusting us with their voices and stories
Inspiration
This project is inspired by the belief that technology should serve all communities, regardless of the language they speak. We’re grateful to everyone who shares this vision and contributes to making it a reality.
Recognition Requests
If you’ve contributed to Shrutik but don’t see your name here:
- Open an issue with the “recognition” label
- Email us at onuronon.dev@gmail.com
- Message us on Discord
We want to make sure everyone gets the recognition they deserve!
🔄 Updates
This file is updated monthly to recognize new contributors. The next update is scheduled for the first week of each month.
Last Updated: October 2025
Thank you to everyone who makes Shrutik possible! Your contributions, big and small, are building a more inclusive digital future. ✨
“Alone we can do so little; together we can do so much.” - Helen Keller
Troubleshooting
This guide covers common issues and their solutions when working with Shrutik.
Docker Issues
Services Won’t Start
Problem: Docker services fail to start or crash immediately.
Solutions:
# Check logs for all services
docker compose logs -f
# Check logs for a specific service
docker compose logs -f backend
docker compose logs -f postgres
docker compose logs -f redis
# Restart all services
docker compose restart
# Clean restart (removes containers, networks, and volumes)
docker compose down -v --remove-orphans
docker system prune -f # optional: remove unused Docker resources
docker compose up -d # start services again
Port Already in Use
Problem: Error messages about ports 3000, 5432, 6379, or 8000 being in use.
Solutions:
# Find processes using ports
sudo lsof -i :8000
sudo lsof -i :3000
sudo lsof -i :5432
sudo lsof -i :6379
# Kill processes using specific ports
sudo lsof -ti:8000 | xargs kill -9
sudo lsof -ti:3000 | xargs kill -9
# Or use netstat
netstat -tulpn | grep :8000
Environment Configuration
Problem: Migration fails due to .env misconfigurations.
Solution:
# Incorrect
DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection
# Correct
DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection
# Incorrect
REDIS_URL=redis://localhost:6379/0
# Correct
REDIS_URL=redis://redis:6379/0
Database Migrations Not Applied
Problems:
- alembic upgrade head did not run or failed.
- Tables such as users, recordings, etc., are missing.
- The application may return errors like: relation "users" does not exist.
Solutions:
# Run database migrations
alembic upgrade head
# Verify tables exist
psql -U postgres -d voice_collection -c "\dt"
⚠️ Always run migrations after configuring environment variables and before starting the backend or running tests.
Database Connection Issues
Problem: Backend can’t connect to PostgreSQL database.
Solutions:
# Check database status
docker compose exec postgres pg_isready -U postgres
# Check database logs
docker compose logs -f postgres
# Reset database and remove containers, volumes, and networks
docker compose down -v --remove-orphans
# Optional: prune unused Docker resources
docker system prune -f
# Start all services
docker compose up -d
# Run database migrations inside the backend container
docker compose exec backend alembic upgrade head
# Or use a custom initialization script if you have one
docker compose exec backend python scripts/init-db.py
Redis Connection Issues
Problem: Backend can’t connect to Redis.
Solutions:
# Test Redis connection
docker compose exec redis redis-cli ping
# Check Redis logs
docker compose logs -f redis
# Restart Redis
docker compose restart redis
Database Issues
Problem: PostgreSQL connection errors in local development.
Solutions:
# Check PostgreSQL status
sudo systemctl status postgresql
# Start PostgreSQL
sudo systemctl start postgresql
# Create database if missing
createdb voice_collection
# Run migrations
alembic upgrade head
Permission Errors
Problem: File permission errors, especially with uploads directory.
Solutions:
# Fix upload directory permissions
mkdir -p uploads
sudo chown -R $USER:$USER uploads/
chmod -R 755 uploads/
# Fix general project permissions
sudo chown -R $USER:$USER .
Application Issues
Admin User Creation Fails
Problem: Cannot create admin user or login fails.
Solutions:
# Ensure the database is migrated
# Local environment
alembic upgrade head # see Local Database docs for details.
# Docker environment
docker compose exec backend alembic upgrade head # see Docker Database docs for details
# Create admin user
# Local
python scripts/create_admin.py --name "AdminUser" --email admin@example.com
# Docker
docker compose exec backend python scripts/create_admin.py --name "AdminUser" --email admin@example.com
# Check users in database
# Local
psql -U postgres -d voice_collection -c "SELECT * FROM users;"
# Docker
docker compose exec postgres psql -U postgres -d voice_collection -c "SELECT * FROM users;"
File Upload Issues
Problem: Audio file uploads fail or return errors.
Solutions:
- Check file size: Ensure files are under 100MB (default limit)
- Check file format: Supported formats: .wav, .mp3, .m4a, .flac, .webm
- Check permissions: Ensure uploads directory is writable
- Check disk space: Ensure sufficient disk space available
# Check upload directory
ls -la uploads/
df -h # Check disk space
API Errors
Problem: API endpoints return 500 errors or unexpected responses.
Solutions:
# Check backend logs
docker compose logs -f backend
# Check API health
curl http://localhost:8000/health
# Check specific endpoint
curl -X GET http://localhost:8000/api/auth/me \
-H "Authorization: Bearer YOUR_TOKEN"
Frontend Issues
Frontend Won’t Load
Problem: Frontend shows blank page or connection errors.
Solutions:
# Check frontend logs
docker compose logs -f frontend
# Verify API connection
curl http://localhost:8000/health
# Check environment variables
cat frontend/.env
Build Errors
Problem: Frontend build fails with dependency or compilation errors.
Solutions:
# Clear node modules and reinstall
cd frontend
rm -rf node_modules package-lock.json
npm install
# Clear Next.js cache
rm -rf .next
# Rebuild
npm run build
Debugging Tips
Enable Debug Logging
Add to your .env file:
DEBUG=true
LOG_LEVEL=DEBUG
Check Service Health
# Backend health check (works for both local and Docker)
curl http://localhost:8000/health
# Database connection
# Local PostgreSQL
pg_isready -U postgres -d voice_collection
# Docker PostgreSQL
docker compose exec postgres pg_isready -U postgres -d voice_collection
# Redis connection
# Local Redis
redis-cli ping
# Docker Redis
docker compose exec redis redis-cli ping
Monitor Resource Usage
# Docker resource usage
docker stats
# System resource usage
htop
df -h
free -h
Common Local Issues
Port already in use:
# Find and kill process using port 8000
lsof -ti:8000 | xargs kill -9
Database connection issues:
# Check PostgreSQL status
sudo systemctl status postgresql
# Restart PostgreSQL
sudo systemctl restart postgresql
# Create database if missing
createdb voice_collection
# Run migrations
alembic upgrade head
Redis connection issues:
# Check Redis status
redis-cli ping
# Start Redis
redis-server
Getting Help
If you’re still experiencing issues:
- Search existing issues: Check GitHub Issues
- Create detailed issue: Include:
- Operating system and version
- Docker/Docker Compose versions
- Complete error messages
- Steps to reproduce
- Relevant log outputs
- Join community: Discord Server
- Check documentation: Review relevant sections in this documentation
Frequently Asked Questions
General Questions
What is Shrutik?
Shrutik (শ্রুতিক) is an open-source voice data collection platform designed to help communities build high-quality voice datasets in their native languages. The name “Shrutik” means “listener” in Bengali, reflecting our mission to listen to and preserve diverse voices.
What languages does Shrutik support?
Shrutik is designed to support any language. Currently, it comes pre-configured with Bengali (Bangla), but administrators can easily add support for additional languages through the admin interface.
Is Shrutik free to use?
Yes! Shrutik is free and open-source under the Creative Commons BY-NC-SA 4.0 License. You can use it for learning, education, and non-commercial projects. Commercial use requires separate permission.
Technical Questions
What are the system requirements?
For Docker (Recommended):
- Docker 20.10+
- Docker Compose 2.0+
- 4GB RAM minimum, 8GB recommended
- 10GB free disk space
For Local Development:
- Python 3.11+
- Node.js 18+
- PostgreSQL 13+
- Redis 6+
- 8GB RAM recommended
How do I backup my data?
Database Backup:
# Create database backup
docker compose exec postgres pg_dump -U postgres voice_collection > backup.sql
# Restore from backup
docker compose exec -T postgres psql -U postgres voice_collection < backup.sql
File Uploads Backup:
# Backup uploads directory
tar -czf uploads-backup.tar.gz uploads/
Usage Questions
How do I add a new language?
- Log in as an admin user
- Go to the admin dashboard
- Navigate to “Languages” section
- Click “Add Language”
- Enter language name and ISO code
- Add scripts/texts for that language
What audio formats are supported?
Shrutik supports these audio formats:
- WAV (recommended for quality)
- MP3
- M4A
- FLAC
- WebM
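As an illustration, a quick local pre-check of a file's extension against the list above could look like this. The helper name is hypothetical; the server performs its own validation regardless:

```shell
# Hypothetical helper: check a filename's extension against the supported list.
supported_audio() {
  case "${1##*.}" in
    wav|mp3|m4a|flac|webm) return 0 ;;
    *) return 1 ;;
  esac
}

supported_audio "clip.wav" && echo "clip.wav: supported"
supported_audio "clip.ogg" || echo "clip.ogg: not supported"
```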
What’s the maximum file size for uploads?
The default maximum file size is 100MB. This can be configured in the environment variables:
MAX_FILE_SIZE=104857600 # 100MB in bytes
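The value is simply megabytes expressed in bytes, so other limits can be computed the same way:

```shell
# 100 MB expressed in bytes, the MAX_FILE_SIZE default above.
echo $((100 * 1024 * 1024))   # 104857600

# A 50 MB limit, for comparison:
echo $((50 * 1024 * 1024))
```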
How does the transcription consensus system work?
Shrutik uses a multi-contributor consensus system:
- Multiple users transcribe the same audio
- The system compares transcriptions
- When transcriptions match (or are very similar), they’re marked as “consensus”
- High-consensus transcriptions are considered high-quality data
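The exact similarity metric is not specified here, but the core idea of "compare after normalization" can be sketched as follows. This is purely illustrative, not Shrutik's actual algorithm:

```shell
# Illustrative only: exact match after lowercasing and squeezing spaces.
# Shrutik's real consensus logic likely tolerates larger differences.
normalize() {
  echo "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' '
}

a=$(normalize "Hello   World")
b=$(normalize "hello world")
[ "$a" = "$b" ] && echo "consensus"
```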
Can I export my data?
Yes! Administrators can export data through the admin API:
- Audio files and metadata
- Transcriptions and consensus data
- User statistics and contributions
- Quality metrics
Development Questions
How do I contribute to Shrutik?
- Fork the repository on GitHub
- Set up your development environment
- Make your changes
- Write tests for new features
- Submit a pull request
See our Contributing Guide for detailed instructions.
How do I report bugs?
- Check if the issue already exists in GitHub Issues
- If not, create a new issue with:
- Clear description of the problem
- Steps to reproduce
- Expected vs actual behavior
- System information
- Error logs
How do I request new features?
Create a feature request in GitHub Issues with:
- Clear description of the feature
- Use case and benefits
- Proposed implementation (if you have ideas)
Can I customize the UI?
Yes! The frontend is built with Next.js and React. You can:
- Modify the existing components
- Add new pages and features
- Customize styling and themes
- Add support for new languages in the UI
Privacy and Security
How is user data protected?
Shrutik implements several security measures:
- Password hashing with bcrypt
- JWT token-based authentication
- Role-based access control
- Input validation and sanitization
- CORS protection
- Rate limiting
Can I run Shrutik offline?
Yes! Shrutik can run completely offline once deployed. All processing happens locally on your infrastructure.
How do I configure HTTPS?
For production deployments, configure HTTPS using:
- Reverse proxy (nginx, Apache)
- Load balancer with SSL termination
- Cloud provider SSL certificates
Example nginx configuration is available in our deployment guides.
Community and Support
Where can I get help?
- Documentation: This documentation site
- GitHub Issues: For bugs and feature requests
- Discord: Join our community
- Email: Contact the maintainers
How can I stay updated?
- Watch the GitHub repository for releases
- Join our Discord community
- Follow our social media channels
- Subscribe to our newsletter (coming soon)
Can I hire someone to help with deployment?
While Shrutik is open-source and free, you can:
- Hire freelance developers familiar with the stack
- Contact the core team for consulting services
- Engage with the community for paid support
Troubleshooting
The application won’t start
See our detailed Troubleshooting Guide for common issues and solutions.
I forgot my admin password
Reset your admin password:
# Using Docker
docker compose exec backend python scripts/create_admin.py --name "AdminUser" --email admin@example.com
# Local development
python scripts/create_admin.py --name "AdminUser" --email admin@example.com
This will create a new admin user or update the existing one.
The database is corrupted
If your database becomes corrupted:
- Stop all services
- Restore from backup (if available)
- Or reset the database:
# Stop and remove all containers, volumes, and networks
docker compose down -v --remove-orphans
# Optional: prune unused Docker resources
docker system prune -f
# Start services (build images if necessary)
docker compose up -d --build
# Wait a few seconds for Postgres and Redis to be ready
# Run database migrations
docker compose exec backend alembic upgrade head
# Or use a custom initialization script
docker compose exec backend python scripts/init-db.py
# Create Admin user
docker compose exec backend python scripts/create_admin.py
Still have questions?
If your question isn’t answered here:
- Check our Troubleshooting Guide
- Search GitHub Issues
- Join our Discord community
- Create a new issue on GitHub
We’re here to help!
Page not available yet
This page will be available in a future update.
You can continue navigating using the sidebar or search.