
Shrutik Documentation

Welcome to the comprehensive documentation for Shrutik (শ্রুতিক), the open-source voice data collection platform designed to help communities build high-quality voice datasets in their native languages.

Shrutik means “listener” in Bengali, reflecting our mission to listen to and preserve diverse voices from around the world.

About This Documentation

This documentation is built with mdBook and provides comprehensive guides, API references, and tutorials for users, developers, and administrators.

Enhanced Features

  • Interactive Mermaid Diagrams: Zoom, pan, and view complex flowcharts in fullscreen
  • Professional Styling: Custom theme with Shrutik branding and improved readability
  • Responsive Design: Optimized experience on desktop and mobile devices
  • Status Badges: Color-coded indicators for different content types
  • Enhanced Navigation: Improved sidebar, search, and user experience

Interactive Diagram Controls

  • Zoom: Use mouse wheel or +/- buttons to zoom in/out
  • Pan: Drag to move around when zoomed in
  • Reset: Double-click or press ‘0’ to reset view
  • Fullscreen: Click the fullscreen button for better viewing
  • Mobile: Touch-friendly controls for mobile devices

Documentation Overview

Getting Started

Architecture & Design

Contributing

Additional Resources

Quick Navigation

For New Users

  1. Getting Started - Set up Shrutik in minutes
  2. Docker Local Setup - Run everything with Docker
  3. User Guide - Learn how to contribute voice data

For Developers

  1. Docker Local Setup - Quick Docker development setup
  2. Local Development - Native development environment
  3. Architecture Overview - Understand the system design
  4. API Reference - Integrate with Shrutik APIs
  5. Contributing Guide - Contribute code and features

For System Administrators

  1. Docker Local Setup - Deploy with Docker
  2. Deployment Guide - Production deployment strategies
  3. Monitoring & Health Checks - System monitoring

For Researchers & Data Scientists

  1. API Reference - Export datasets
  2. Architecture - Understand data structure
  3. Quality Control - Data quality processes

Visual Documentation

System Flows

Technical Diagrams

Development Resources

Setup & Configuration

Code Standards

Deployment Options

| Option | Complexity | Use Case | Documentation |
|--------|------------|----------|---------------|
| Docker Compose | Low | Development, Small Teams | Docker Deployment |
| Kubernetes | High | Production, Enterprise | Deployment Guide |
| Cloud Platforms | Medium | Managed Services | Deployment Guide |
| Bare Metal | Medium | On-Premises | Deployment Guide |

Community & Support

Get Help

Contribute

Stay Updated

Additional Resources

Research Papers

What’s New

Recent Updates

  • Performance Optimization - Added comprehensive caching and rate limiting
  • CDN Integration - Optimized audio delivery with CDN support
  • Enhanced Monitoring - Real-time performance metrics and dashboards
  • Security Improvements - Advanced authentication and authorization

Coming Soon

  • Mobile App - Native mobile applications for iOS and Android
  • AI Assistance - ML-powered transcription assistance
  • Multi-language UI - Interface translations for global accessibility
  • Cloud Integration - Enhanced cloud platform support

Need help? Join our Discord community or check our GitHub discussions.

Found an issue? Please report it on GitHub.

Want to contribute? Read our Contributing Guide to get started.


Together, we’re building a more inclusive digital future, one voice at a time.


Getting Started with Shrutik

Welcome to Shrutik! This guide will help you set up and start using the platform in just a few minutes.

Overview

Shrutik is a voice data collection platform that allows communities to contribute voice recordings and transcriptions in their native languages. You can either contribute data or set up your own instance of the platform.

Quick Setup Options

Option 1: Docker Setup (Recommended)

The fastest way to get Shrutik running is with Docker:

# Clone the repository
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik

# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev

# Copy Docker environment configuration
cp .env.example .env

# Build images and start all services
docker compose up --build -d

Access the platform:

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000
  • API Documentation: http://localhost:8000/docs

Note: For detailed Docker setup instructions, see our comprehensive Docker Local Setup Guide for configuration details, troubleshooting, and switching between local/Docker environments.

Option 2: Local Development

For development or customization:

# Clone and setup
git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik

# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev

# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

To start backend, frontend, and Celery worker, see the Local Setup Guide.

Verify Setup

Once you’ve successfully started the services using either Option 1 (Docker) or Option 2 (Local Development), confirm that everything is running correctly:

Check Backend Health

The backend provides a simple health endpoint to verify that the FastAPI server is up and running.

curl http://localhost:8000/health

Check Frontend

curl http://localhost:3000
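Rather than checking each endpoint by hand, the two curl checks above can be wrapped in a small script. This is an illustrative stdlib-only sketch; the URLs are the defaults from this guide:

```python
import urllib.request
import urllib.error

def check_service(url: str, timeout: float = 2.0) -> bool:
    """Return True if the service at `url` responds with HTTP 2xx."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or HTTP error status
        return False

if __name__ == "__main__":
    for name, url in [("backend", "http://localhost:8000/health"),
                      ("frontend", "http://localhost:3000")]:
        print(f"{name}: {'OK' if check_service(url, timeout=1.0) else 'DOWN'}")
```

If either service reports DOWN, check the container logs as described in the Troubleshooting section.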

First Steps

For Contributors

  1. Register an Account: Visit http://localhost:3000 and create an account
  2. Start Recording: Begin with voice recordings or transcriptions
  3. Track Progress: Monitor your contributions in the dashboard

For Administrators

  1. Access Admin Panel: Login with your admin account
  2. Configure Languages: Add supported languages and scripts
  3. Manage Users: Review user registrations and assign roles
  4. Monitor Quality: Review transcription quality and consensus

Contributing Voice Data

Recording Guidelines

  • Environment: Record in a quiet environment
  • Equipment: Use a good quality microphone
  • Duration: Keep recordings between 2 and 10 seconds
  • Content: Read the provided text clearly and naturally

Transcription Guidelines

  • Accuracy: Transcribe exactly what you hear
  • Formatting: Follow language-specific formatting rules
  • Quality: Rate the audio quality honestly
  • Consensus: Multiple transcriptions improve dataset quality

Troubleshooting

Common Issues

Services won’t start:

# All services
docker compose logs -f

# Specific service (example: backend)
docker compose logs -f backend

# Restart all services
docker compose restart

# Restart a single service
docker compose restart backend

# Or check status
docker compose ps

Database connection errors:

# Stop services and remove volumes
docker compose down -v --remove-orphans
# (Optional) Clean unused Docker resources
docker system prune -f
# Rebuild and start all services
docker compose up -d --build

# Run migrations inside the backend container
docker compose exec backend python scripts/init-db.py
# If that fails, try the fallback
docker compose exec backend python scripts/simple-init.py


Permission errors:

# Fix file permissions
sudo chown -R $USER:$USER uploads/
chmod -R 755 uploads/

Getting Help

  • Documentation: Check our comprehensive docs
  • GitHub Issues: Report bugs and request features
  • Discord: Join our community for real-time help
  • Email: Contact us at onuronon.dev@gmail.com

Next Steps

Welcome to the Community

You’re now ready to start using Shrutik! Whether you’re contributing voice data, developing features, or deploying your own instance, you’re part of a global movement to make voice technology more inclusive.

Join our community channels to connect with other contributors and stay updated on the latest developments.

Docker Local Setup Guide

This guide explains how to run Shrutik completely with Docker on your local machine, including all the configuration changes needed to switch from local development to Docker.

Quick Docker Setup

Prerequisites

  • Docker 20.10+

  • Docker Compose 2.0+

  • Git

1. Clone the Repo

git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik

# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev

2. Configure Environment for Docker

Use the Docker-specific environment file:

cp .env.example .env

Make sure DATABASE_URL is set correctly in the .env file:

DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection

When running inside Docker, services communicate using their Docker Compose service names.

Available Environment Files:

  • .env.example - Template with all available options

3. Configure Frontend

cd frontend
cp .env.example .env

4. Start All Containers

Use this when running the app for the first time or after changing Dockerfiles, requirements.txt, or package.json:

docker compose up -d --build

Regular use (no changes)

For normal daily use, when the images are already built:

docker compose up -d

Check service status:

docker compose ps

5. Initialize the Database

Run migrations:

docker compose exec backend alembic upgrade head

Create admin user:

docker compose exec backend python scripts/create_admin.py --name "Admin" --email admin@example.com

6. Access the Application

  • Frontend: http://localhost:3000
  • Backend API: http://localhost:8000
  • API Documentation: http://localhost:8000/docs

Configuration Changes Explained

Key Differences: Local vs Docker

| Component | Local Development | Docker |
|-----------|-------------------|--------|
| Database URL | localhost:5432 | postgres:5432 |
| Redis URL | localhost:6379 | redis:6379 |
| Frontend API URL | http://localhost:8000 | http://localhost:8000 |
| File Paths | ./uploads | /app/uploads |

Development Workflow

Start services

docker compose up -d

Stop everything

docker compose down

Stop AND remove volumes (fresh reset)

docker compose down -v

View logs

docker compose logs -f

Specific service logs:

docker compose logs -f backend

Rebuild after changing requirements

docker compose build --no-cache
docker compose up -d

Shell into a container

docker compose exec backend bash

Check backend health

curl http://localhost:8000/health

Database Management

Run migrations:

docker compose exec backend alembic upgrade head

Auto-generate migration:

docker compose exec backend alembic revision --autogenerate -m "message"

Connect to PostgreSQL:

docker compose exec postgres psql -U postgres -d voice_collection

Redis Debugging

Test Redis:

docker compose exec redis redis-cli ping

Restart Redis:

docker compose restart redis

Troubleshooting

Port in use

Check:

sudo lsof -i :6379
sudo lsof -i :5432

Kill process:

sudo kill <pid>

Backend not starting

docker compose logs backend

Frontend not loading

docker compose logs frontend
docker compose build frontend --no-cache
docker compose up -d frontend

Local Development Guide

This guide covers setting up Shrutik for local development, including all the tools and configurations needed for contributing to the project.

Prerequisites

System Requirements

  • Python: 3.11 or higher
  • Node.js: 20 or higher
  • PostgreSQL: 15 or higher
  • Redis: 7 or higher
  • Git: Latest version

Recommended Tools

  • IDE: VS Code with Python and TypeScript extensions
  • API Testing: Postman or Insomnia
  • Database GUI: pgAdmin or DBeaver
  • Redis GUI: RedisInsight

Setup Instructions

1. Clone and Navigate

git clone https://github.com/Onuronon-lab/Shrutik.git
cd Shrutik

# Switch to the deployment-dev branch
git fetch origin
git switch deployment-dev

2. Backend Setup

Create Virtual Environment

# Create virtual environment
python -m venv venv

# Activate virtual environment
# On Linux/Mac:
source venv/bin/activate
# On Windows:
venv\Scripts\activate

Install Dependencies

# Install Python dependencies
pip install -r requirements.txt

Database Setup

# Start PostgreSQL (if not running)
sudo systemctl start postgresql  # Linux
brew services start postgresql   # Mac

# Switch to PostgreSQL user (Linux)
sudo -i -u postgres

# Create database
createdb voice_collection

# Exit postgres user shell (Linux)
exit

# Set environment variables
cp .env.example .env

Edit .env:

# Development Database
DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection

# Redis
REDIS_URL=redis://localhost:6379/0

# Development Settings
DEBUG=true
USE_CELERY=true

# File Storage
UPLOAD_DIR=uploads
MAX_FILE_SIZE=104857600

# Security (use a secure key in production)
SECRET_KEY=dev-secret-key-change-in-production
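The application reads these values through its settings module (Pydantic, per the tech stack). As a rough stdlib-only sketch of the same pattern, with defaults mirroring the .env above (the class and field names here are illustrative, not Shrutik's actual settings module):

```python
import os
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Settings:
    # Defaults mirror the .env file above; deployments override via environment.
    database_url: str = field(default_factory=lambda: os.environ.get(
        "DATABASE_URL",
        "postgresql://postgres:password@localhost:5432/voice_collection"))
    redis_url: str = field(default_factory=lambda: os.environ.get(
        "REDIS_URL", "redis://localhost:6379/0"))
    debug: bool = field(default_factory=lambda: os.environ.get(
        "DEBUG", "false").lower() == "true")
    max_file_size: int = field(default_factory=lambda: int(os.environ.get(
        "MAX_FILE_SIZE", "104857600")))

settings = Settings()
```

Because every value has an environment-variable override, the same code runs unchanged in local and Docker environments.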

Run Database Migrations

# Run database migrations
alembic upgrade head

# Create admin user
python scripts/create_admin.py --name "AdminUser" --email admin@example.com

Follow the prompts to create your first admin user.

3. Frontend Setup

# Navigate to frontend directory
cd frontend

# Install dependencies
npm install

# Copy environment file
cp .env.example .env

4. Start Development Services

Start Services

Terminal 1 - Backend:

source venv/bin/activate
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Terminal 2 - Celery Worker:

source venv/bin/activate
celery -A app.core.celery_app worker --loglevel=info

Terminal 3 - Frontend:

cd frontend
npm start

Development Configuration

Alembic Configuration

Alembic is configured to automatically use the correct database URL from your environment variables:

  • The alembic/env.py file reads from settings.DATABASE_URL
  • No manual configuration changes needed when switching environments
  • Migrations work seamlessly in both local and Docker environments

Switching Between Local and Docker

When switching between local development and Docker, you need to update these configurations:

1. Environment Variables (.env file)

Local Development:

DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection
REDIS_URL=redis://localhost:6379/0

Docker:

DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection
REDIS_URL=redis://redis:6379/0

2. Frontend API URL

The frontend's API URL (http://localhost:8000) stays the same for local Docker, since the browser accesses the backend from the host machine.

3. Quick Switch Commands

Switch to Docker:

# Stop local services
pkill -f uvicorn
pkill -f celery

# Update config for Docker
cp .env.example .env

# Start Services 
docker compose up -d

Switch to Local:

# Stop Docker
docker compose down

Then follow the instructions above to start the services locally.

Complete Docker Guide: For detailed Docker setup instructions, troubleshooting, and configuration explanations, see our Docker Local Setup Guide.

Testing

Backend Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=app

# Run specific test file
pytest tests/test_auth.py

# Run with verbose output
pytest -v

Frontend Tests

cd frontend

# Run tests
npm test

# Run with coverage
npm run test:coverage

# Run E2E tests
npm run test:e2e

Integration Tests

# Start test environment
docker compose -f docker-compose.test.yml up -d

# Run integration tests
pytest tests/integration/

# Cleanup
docker compose -f docker-compose.test.yml down -v

Debugging

Backend Debugging

VS Code Configuration

Create .vscode/launch.json:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "FastAPI Debug",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/venv/bin/uvicorn",
            "args": ["app.main:app", "--reload", "--host", "0.0.0.0", "--port", "8000"],
            "console": "integratedTerminal",
            "envFile": "${workspaceFolder}/.env.development"
        }
    ]
}

Logging Configuration

Enable debug logging in .env.development:

LOG_LEVEL=DEBUG

Frontend Debugging

Browser DevTools

  • Use React Developer Tools extension
  • Enable source maps for debugging TypeScript
  • Use Network tab to debug API calls

VS Code Configuration

Install recommended extensions:

  • ES7+ React/Redux/React-Native snippets
  • TypeScript Importer
  • Prettier - Code formatter
  • ESLint

Database Management

Migrations

# Create new migration
alembic revision --autogenerate -m "Description of changes"

# Apply migrations
alembic upgrade head

# Rollback migration
alembic downgrade -1

# Check migration status
alembic current

Database Reset

# Drop and recreate database
dropdb voice_collection
createdb voice_collection
alembic upgrade head
python scripts/create_admin.py

Common Development Tasks

Adding New API Endpoints

  1. Create schema in app/schemas/
  2. Add model in app/models/ (if needed)
  3. Implement service in app/services/
  4. Create router in app/api/
  5. Register router in app/main.py
  6. Add tests in tests/

Adding New Frontend Components

  1. Create component in frontend/src/components/
  2. Add TypeScript types in frontend/src/types/
  3. Implement API calls in frontend/src/services/
  4. Add routing in frontend/src/pages/
  5. Add tests in frontend/src/__tests__/

Database Schema Changes

  1. Modify models in app/models/
  2. Generate migration: alembic revision --autogenerate -m "description"
  3. Review and edit migration file if needed
  4. Apply migration: alembic upgrade head
  5. Update tests and documentation

Performance Optimization

Development Performance

# Point at the local development database
export DATABASE_URL="postgresql://postgres:password@localhost:5432/voice_collection"

# Disable Celery for faster startup
export USE_CELERY=false

# Use development Redis
export REDIS_URL="redis://localhost:6379/1"

Hot Reload Configuration

Backend hot reload is enabled by default with --reload flag.

Frontend hot reload configuration in frontend/next.config.js:

module.exports = {
  reactStrictMode: true,
  swcMinify: true,
  experimental: {
    esmExternals: false
  }
}

Additional Resources

Happy coding! 🎉

Shrutik Architecture Overview

This document provides a comprehensive overview of Shrutik’s system architecture, design principles, and technical decisions.

System Architecture

Shrutik follows a modern, microservices-inspired architecture with clear separation of concerns and scalable design patterns.

High-Level Architecture

graph TB
    subgraph "Presentation Layer"
        WEB[React Frontend]
        MOBILE[Mobile App]
        API_DOCS[API Documentation]
    end

    subgraph "API Gateway Layer"
        NGINX[Nginx Reverse Proxy]
        RATE_LIMIT[Rate Limiting]
        AUTH_MW[Authentication Middleware]
    end

    subgraph "Application Layer"
        API[FastAPI Backend]
        WORKER[Celery Workers]
        SCHEDULER[Task Scheduler]
    end

    subgraph "Business Logic Layer"
        AUTH_SVC[Authentication Service]
        VOICE_SVC[Voice Recording Service]
        TRANS_SVC[Transcription Service]
        CONSENSUS_SVC[Consensus Service]
        EXPORT_SVC[Export Service]
        ADMIN_SVC[Admin Service]
    end

    subgraph "Data Layer"
        POSTGRES[(PostgreSQL)]
        REDIS[(Redis)]
        FILES[File Storage]
    end

    subgraph "External Services"
        CDN[Content Delivery Network]
        EMAIL[Email Service]
        MONITORING[Monitoring & Logging]
    end

    WEB --> NGINX
    MOBILE --> NGINX
    NGINX --> API
    API --> AUTH_SVC
    API --> VOICE_SVC
    API --> TRANS_SVC
    WORKER --> CONSENSUS_SVC
    WORKER --> EXPORT_SVC
    
    AUTH_SVC --> POSTGRES
    VOICE_SVC --> POSTGRES
    VOICE_SVC --> FILES
    TRANS_SVC --> POSTGRES
    TRANS_SVC --> REDIS
    
    API --> REDIS
    WORKER --> REDIS
    
    FILES --> CDN
    API --> EMAIL
    API --> MONITORING

Design Principles

1. Modularity

  • Service-Oriented: Clear separation between different business domains
  • Loose Coupling: Services communicate through well-defined interfaces
  • High Cohesion: Related functionality grouped together

2. Scalability

  • Horizontal Scaling: Stateless services that can be scaled independently
  • Async Processing: Heavy operations handled by background workers
  • Caching Strategy: Multi-layer caching for performance optimization

3. Reliability

  • Error Handling: Comprehensive error handling and recovery mechanisms
  • Health Checks: Automated monitoring and alerting
  • Data Integrity: ACID transactions and data validation

4. Security

  • Authentication: JWT-based authentication with refresh tokens
  • Authorization: Role-based access control (RBAC)
  • Data Protection: Encryption at rest and in transit

5. Maintainability

  • Clean Code: Following Python and TypeScript best practices
  • Documentation: Comprehensive API and code documentation
  • Testing: High test coverage with unit, integration, and E2E tests

Technology Stack

Backend Technologies

| Component | Technology | Purpose |
|-----------|------------|---------|
| Web Framework | FastAPI | High-performance async API framework |
| Database | PostgreSQL | Primary data storage with ACID compliance |
| Cache/Queue | Redis | Caching, session storage, and message queue |
| Background Jobs | Celery | Async task processing |
| Audio Processing | Librosa, PyDub | Audio analysis and manipulation |
| Authentication | JWT | Stateless authentication |
| Validation | Pydantic | Data validation and serialization |
| ORM | SQLAlchemy | Database abstraction layer |
| Migrations | Alembic | Database schema migrations |

Frontend Technologies

| Component | Technology | Purpose |
|-----------|------------|---------|
| Framework | React 18 | Component-based UI framework |
| Meta Framework | Next.js | Full-stack React framework |
| Language | TypeScript | Type-safe JavaScript |
| Styling | Tailwind CSS | Utility-first CSS framework |
| State Management | Zustand | Lightweight state management |
| HTTP Client | Axios | Promise-based HTTP client |
| Audio Recording | MediaRecorder API | Browser audio recording |
| Testing | Jest, React Testing Library | Unit and integration testing |

Infrastructure Technologies

| Component | Technology | Purpose |
|-----------|------------|---------|
| Containerization | Docker | Application containerization |
| Orchestration | Docker Compose | Multi-container application management |
| Reverse Proxy | Nginx | Load balancing and SSL termination |
| Monitoring | Prometheus, Grafana | Metrics collection and visualization |
| Logging | Structured logging | Centralized log management |
| CI/CD | GitHub Actions | Automated testing and deployment |

Data Architecture

Database Schema Design

erDiagram
    %% Core Relationships
    USERS ||--o{ VOICE_RECORDINGS : "creates"
    USERS ||--o{ TRANSCRIPTIONS : "creates"
    USERS ||--o{ QUALITY_REVIEWS : "performs"
    USERS ||--o{ EXPORT_AUDIT_LOGS : "performs"
    USERS ||--o{ EXPORT_DOWNLOADS : "downloads"
    USERS ||--o| EXPORT_BATCHES : "creates (optional)"

    LANGUAGES ||--o{ SCRIPTS : "has"
    LANGUAGES ||--o{ VOICE_RECORDINGS : "recorded in"
    LANGUAGES ||--o{ TRANSCRIPTIONS : "transcribed in"

    SCRIPTS ||--o{ VOICE_RECORDINGS : "recorded from"

    VOICE_RECORDINGS ||--|{ AUDIO_CHUNKS : "divided into (1 to many)"

    AUDIO_CHUNKS ||--o{ TRANSCRIPTIONS : "has many"
    AUDIO_CHUNKS ||--o| TRANSCRIPTIONS : "has one consensus"

    TRANSCRIPTIONS ||--o{ QUALITY_REVIEWS : "reviewed by"

    EXPORT_BATCHES ||--o{ EXPORT_AUDIT_LOGS : "generates"
    EXPORT_BATCHES ||--o{ EXPORT_DOWNLOADS : "downloaded by"

    %% Entities with attributes
    USERS {
        int id PK
        string name
        string email UK
        string password_hash
        string role "(enum: userrole)"
        json meta_data
        timestamptz created_at
        timestamptz updated_at
    }

    LANGUAGES {
        int id PK
        string name
        string code UK
        timestamptz created_at
    }

    SCRIPTS {
        int id PK
        int language_id FK
        text text
        string duration_category "(enum: durationcategory)"
        json meta_data
        timestamptz created_at
        timestamptz updated_at
    }

    VOICE_RECORDINGS {
        int id PK
        int user_id FK
        int script_id FK
        int language_id FK
        string file_path
        float duration
        string status "(enum: recordingstatus)"
        json meta_data
        timestamptz created_at
        timestamptz updated_at
    }

    AUDIO_CHUNKS {
        int id PK
        int recording_id FK
        int chunk_index
        string file_path
        float start_time
        float end_time
        float duration
        text sentence_hint
        json meta_data
        timestamptz created_at
        int transcript_count
        boolean ready_for_export
        float consensus_quality
        int consensus_transcript_id FK "optional"
        int consensus_failed_count
    }

    TRANSCRIPTIONS {
        int id PK
        int chunk_id FK
        int user_id FK
        int language_id FK
        text text
        float quality
        float confidence
        boolean is_consensus
        boolean is_validated
        json meta_data
        timestamptz created_at
        timestamptz updated_at
    }

    QUALITY_REVIEWS {
        int id PK
        int transcription_id FK
        int reviewer_id FK
        string decision "(enum: reviewdecision)"
        float rating
        text comment
        json meta_data
        timestamptz created_at
    }

    EXPORT_BATCHES {
        int id PK
        string batch_id UK
        string archive_path
        string storage_type "(enum: storagetype)"
        int chunk_count
        bigint file_size_bytes
        json chunk_ids
        string status "(enum: exportbatchstatus)"
        boolean exported
        text error_message
        int retry_count
        string checksum
        int compression_level
        string format_version
        json recording_id_range
        json language_stats
        float total_duration_seconds
        json filter_criteria
        timestamptz created_at
        timestamptz completed_at
        int created_by_id FK "optional"
    }

    EXPORT_AUDIT_LOGS {
        int id PK
        string export_id
        int user_id FK
        string export_type
        string format
        json filters_applied
        int records_exported
        bigint file_size_bytes
        string ip_address
        string user_agent
        timestamptz created_at
    }

    EXPORT_DOWNLOADS {
        int id PK
        string batch_id FK
        int user_id FK
        timestamptz downloaded_at
        string ip_address
        string user_agent
    }

Data Flow Patterns

1. Voice Recording Data Flow

User Input → Frontend → API → Database → File Storage → Background Processing → Chunking → Database Update
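The chunking step above can be sketched as splitting a recording's duration into fixed-length windows. This is illustrative only; the real chunker may use silence detection rather than fixed boundaries:

```python
def chunk_boundaries(duration: float, chunk_len: float = 10.0):
    """Split a recording of `duration` seconds into (start, end) windows."""
    chunks = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_len, duration)  # last chunk may be shorter
        chunks.append((start, end))
        start = end
    return chunks
```

Each (start, end) pair would correspond to one AUDIO_CHUNKS row, with `chunk_index` given by its position in the list.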

2. Transcription Data Flow

User Request → API → Database Query → Cache Check → Response → User Input → Validation → Database Save → Consensus Trigger

3. Consensus Calculation Flow

Transcription Submit → Background Job → Collect Related → Calculate Similarity → Weight Quality → Update Consensus → Notify Users
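A minimal sketch of the "Calculate Similarity" step, using stdlib difflib to pick the transcript closest on average to all the others. The production consensus service also weights transcriber quality, which is omitted here:

```python
from difflib import SequenceMatcher

def consensus(transcripts: list) -> str:
    """Return the transcript most similar, on average, to all the others."""
    def avg_similarity(i: int) -> float:
        others = [t for j, t in enumerate(transcripts) if j != i]
        if not others:
            return 1.0  # a lone transcript is trivially the consensus
        me = transcripts[i]
        return sum(SequenceMatcher(None, me, o).ratio() for o in others) / len(others)
    best = max(range(len(transcripts)), key=avg_similarity)
    return transcripts[best]
```

With several agreeing transcripts and one outlier, the agreeing text wins; this mirrors why multiple transcriptions per chunk improve dataset quality.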

API Design

RESTful API Principles

Shrutik follows REST architectural principles with some pragmatic adaptations:

  • Resource-Based URLs: /api/recordings, /api/transcriptions
  • HTTP Methods: GET, POST, PUT, DELETE for CRUD operations
  • Status Codes: Proper HTTP status codes for different scenarios
  • JSON Format: Consistent JSON request/response format
  • Pagination: Cursor-based pagination for large datasets
  • Versioning: API versioning through URL path (/api/v1/)
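Cursor-based pagination can be sketched as encoding the last-seen id into an opaque cursor token. The function names and cursor format here are illustrative, not Shrutik's actual API:

```python
import base64
from typing import Optional

def encode_cursor(last_id: int) -> str:
    return base64.urlsafe_b64encode(str(last_id).encode()).decode()

def decode_cursor(cursor: Optional[str]) -> int:
    if not cursor:
        return 0  # no cursor: start from the beginning
    return int(base64.urlsafe_b64decode(cursor.encode()).decode())

def paginate(rows: list, cursor: Optional[str], limit: int = 2):
    """Return rows with id greater than the cursor, plus a next_cursor."""
    last_id = decode_cursor(cursor)
    page = [r for r in rows if r["id"] > last_id][:limit]
    # A short page means we reached the end; no next cursor.
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return page, next_cursor
```

Unlike offset pagination, the cursor stays stable when earlier rows are inserted or deleted, which matters for large, actively growing datasets.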

API Structure

/api/
├── auth/
│   ├── POST /login
│   ├── POST /register
│   ├── POST /refresh
│   └── POST /logout
├── recordings/
│   ├── GET /
│   ├── POST /sessions
│   ├── POST /upload
│   └── GET /{id}/progress
├── transcriptions/
│   ├── GET /
│   ├── POST /tasks
│   ├── POST /submit
│   └── POST /skip
├── chunks/
│   ├── GET /{id}/audio
│   └── GET /{id}/info
├── admin/
│   ├── GET /stats/platform
│   ├── GET /users
│   └── GET /performance/dashboard
└── export/
    ├── POST /dataset
    └── GET /jobs/{id}/status

Authentication & Authorization

sequenceDiagram
    participant C as Client
    participant A as API
    participant Auth as Auth Service
    participant DB as Database

    C->>A: POST /auth/login
    A->>Auth: Validate Credentials
    Auth->>DB: Check User
    DB-->>Auth: User Data
    Auth->>Auth: Generate JWT
    Auth-->>A: JWT + Refresh Token
    A-->>C: Authentication Response

    Note over C: Store JWT in memory/secure storage

    C->>A: GET /recordings (with JWT)
    A->>Auth: Validate JWT
    Auth->>Auth: Check Expiry & Signature
    Auth-->>A: User Context
    A->>A: Check Permissions
    A-->>C: Protected Resource

Performance Architecture

Caching Strategy

graph LR
    subgraph "Client Side"
        BROWSER[Browser Cache]
        LOCAL[Local Storage]
    end
    
    subgraph "CDN Layer"
        CDN[Content Delivery Network]
    end
    
    subgraph "Application Layer"
        API_CACHE[API Response Cache]
        DB_CACHE[Database Query Cache]
        SESSION[Session Cache]
    end
    
    subgraph "Database Layer"
        DB[(PostgreSQL)]
        REDIS[(Redis)]
    end
    
    BROWSER --> CDN
    CDN --> API_CACHE
    API_CACHE --> DB_CACHE
    DB_CACHE --> REDIS
    SESSION --> REDIS
    DB_CACHE --> DB

Performance Optimizations

Backend Optimizations

  • Connection Pooling: Database connection pooling with configurable limits
  • Query Optimization: Indexed queries and efficient SQL patterns
  • Async Processing: Non-blocking I/O for concurrent request handling
  • Background Jobs: Heavy operations moved to background workers
  • Response Compression: Gzip compression for API responses
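The payoff of response compression is easy to demonstrate: repetitive JSON payloads, typical of list endpoints, shrink dramatically under gzip.

```python
import gzip
import json

# A payload shaped like a list-endpoint response; repetition compresses well.
payload = json.dumps(
    [{"id": i, "status": "completed"} for i in range(200)]
).encode()
compressed = gzip.compress(payload)
print(f"{len(payload)} bytes -> {len(compressed)} bytes")
```

In practice the web framework or reverse proxy applies this transparently when the client sends `Accept-Encoding: gzip`.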

Frontend Optimizations

  • Code Splitting: Dynamic imports for reduced bundle size
  • Lazy Loading: Components and routes loaded on demand
  • Image Optimization: Optimized images with Next.js Image component
  • Caching: Aggressive caching of static assets and API responses
  • Service Workers: Offline functionality and background sync

Database Optimizations

  • Indexing Strategy: Proper indexes on frequently queried columns
  • Query Optimization: Efficient queries with proper joins and filters
  • Read Replicas: Separate read replicas for analytics queries
  • Partitioning: Table partitioning for large datasets
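Whether a query actually uses an index can be checked with EXPLAIN QUERY PLAN. This sketch uses stdlib sqlite3 for portability (Shrutik itself runs PostgreSQL, where EXPLAIN plays the same role); the table and index names are illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE transcriptions (id INTEGER PRIMARY KEY, chunk_id INTEGER, text TEXT)"
)
con.execute("CREATE INDEX idx_transcriptions_chunk ON transcriptions (chunk_id)")

# Ask the planner how it would execute a lookup by chunk_id.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM transcriptions WHERE chunk_id = ?", (42,)
).fetchall()
detail = " ".join(row[-1] for row in plan)
print(detail)  # the plan should mention the index, not a full table scan
```

Running the same check on frequently queried columns (here, a foreign key used in joins) is the core of an indexing strategy.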

Security Architecture

Security Layers

graph TB
    subgraph "Network Security"
        FIREWALL[Firewall Rules]
        DDoS[DDoS Protection]
        SSL[SSL/TLS Encryption]
    end
    
    subgraph "Application Security"
        AUTH[Authentication]
        AUTHZ[Authorization]
        VALIDATION[Input Validation]
        SANITIZATION[Data Sanitization]
    end
    
    subgraph "Data Security"
        ENCRYPTION[Encryption at Rest]
        BACKUP[Secure Backups]
        AUDIT[Audit Logging]
    end
    
    FIREWALL --> AUTH
    DDoS --> AUTH
    SSL --> AUTH
    AUTH --> ENCRYPTION
    AUTHZ --> ENCRYPTION
    VALIDATION --> BACKUP
    SANITIZATION --> AUDIT

Security Measures

Authentication & Authorization

  • JWT Tokens: Stateless authentication with short-lived access tokens
  • Refresh Tokens: Secure token refresh mechanism
  • Role-Based Access: Granular permissions based on user roles
  • Session Management: Secure session handling with Redis
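The sign-and-verify idea behind JWTs can be sketched with stdlib hmac. This is a simplified illustration, not Shrutik's actual token implementation (which uses standard JWT handling per the tech stack), and the secret is the throwaway development value:

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"dev-secret-key-change-in-production"  # never ship this value

def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_token(payload: dict, ttl: int = 900) -> str:
    """Serialize claims with an expiry and append an HMAC-SHA256 signature."""
    body = _b64(json.dumps({**payload, "exp": int(time.time()) + ttl}).encode())
    sig = _b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())
    return f"{body}.{sig}"

def verify_token(token: str) -> Optional[dict]:
    """Return the claims if the signature and expiry check out, else None."""
    body, sig = token.rsplit(".", 1)
    expected = _b64(hmac.new(SECRET, body.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or wrongly signed token
    padded = body + "=" * (-len(body) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return claims if claims["exp"] > time.time() else None
```

Because verification needs only the shared secret, any API instance can validate a token without a database lookup, which is what makes the scheme stateless.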

Data Protection

  • Input Validation: Comprehensive input validation using Pydantic
  • SQL Injection Prevention: Parameterized queries with SQLAlchemy
  • XSS Protection: Content Security Policy and input sanitization
  • CSRF Protection: CSRF tokens for state-changing operations

Infrastructure Security

  • HTTPS Enforcement: All communications encrypted with TLS
  • Security Headers: Comprehensive security headers implementation
  • Rate Limiting: Protection against abuse and DoS attacks
  • File Upload Security: Secure file upload with type validation
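Rate limiting is commonly implemented as a token bucket. A minimal in-process sketch follows; in a deployment like the one described here, the real limiter would live at the gateway or in Redis so all instances share state:

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens/sec."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: caller should return HTTP 429
```

A per-client bucket (keyed by user id or IP) gives each caller an independent budget while still allowing short bursts.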

Monitoring & Observability

Monitoring Stack

graph LR
    subgraph "Application"
        APP[Shrutik Application]
        METRICS[Metrics Collection]
        LOGS[Structured Logging]
        TRACES[Distributed Tracing]
    end
    
    subgraph "Collection"
        PROMETHEUS[Prometheus]
        LOKI[Loki]
        JAEGER[Jaeger]
    end
    
    subgraph "Visualization"
        GRAFANA[Grafana Dashboards]
        ALERTS[Alert Manager]
    end
    
    APP --> METRICS
    APP --> LOGS
    APP --> TRACES
    
    METRICS --> PROMETHEUS
    LOGS --> LOKI
    TRACES --> JAEGER
    
    PROMETHEUS --> GRAFANA
    LOKI --> GRAFANA
    JAEGER --> GRAFANA
    
    PROMETHEUS --> ALERTS

Key Metrics

Application Metrics

  • Request Rate: Requests per second by endpoint
  • Response Time: P50, P95, P99 response times
  • Error Rate: Error percentage by endpoint and status code
  • Throughput: Data processing throughput
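
The percentile latencies above can be computed from raw samples with the standard library. This is a generic sketch, not the platform's actual metrics pipeline (which would normally delegate to Prometheus):

```python
import statistics

def latency_percentiles(samples_ms: list[float]) -> dict:
    """Compute P50/P95/P99 from raw response-time samples in milliseconds."""
    # quantiles(n=100) returns the 99 cut points P1..P99.
    cuts = statistics.quantiles(samples_ms, n=100)
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}
```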

Business Metrics

  • User Engagement: Active users, session duration
  • Data Quality: Transcription accuracy, consensus rates
  • System Usage: Recording uploads, transcription submissions
  • Performance: Audio processing times, consensus calculation speed

Infrastructure Metrics

  • System Resources: CPU, memory, disk usage
  • Database Performance: Query times, connection pool status
  • Cache Performance: Hit rates, memory usage
  • Network: Bandwidth usage, connection counts

Deployment Architecture

Environment Strategy

graph LR
    subgraph "Development"
        DEV_LOCAL[Local Development]
        DEV_DOCKER[Docker Development]
    end
    
    subgraph "Testing"
        TEST_UNIT[Unit Tests]
        TEST_INTEGRATION[Integration Tests]
        TEST_E2E[E2E Tests]
    end
    
    subgraph "Staging"
        STAGING[Staging Environment]
        UAT[User Acceptance Testing]
    end
    
    subgraph "Production"
        PROD[Production Environment]
        MONITORING[Production Monitoring]
    end
    
    DEV_LOCAL --> TEST_UNIT
    DEV_DOCKER --> TEST_INTEGRATION
    TEST_UNIT --> TEST_E2E
    TEST_INTEGRATION --> STAGING
    TEST_E2E --> STAGING
    STAGING --> UAT
    UAT --> PROD
    PROD --> MONITORING

Deployment Pipeline

  1. Code Commit: Developer pushes code to repository
  2. Automated Testing: Unit, integration, and E2E tests run
  3. Build Process: Docker images built and tagged
  4. Staging Deployment: Automatic deployment to staging
  5. Manual Testing: QA and user acceptance testing
  6. Production Deployment: Manual approval and deployment
  7. Health Checks: Automated health verification
  8. Monitoring: Continuous monitoring and alerting

Future Architecture Considerations

Scalability Enhancements

  • Microservices: Further decomposition into microservices
  • Event-Driven Architecture: Event sourcing and CQRS patterns
  • Kubernetes: Container orchestration for better scaling
  • Service Mesh: Advanced service-to-service communication

Performance Improvements

  • Edge Computing: Edge nodes for global content delivery
  • Advanced Caching: Distributed caching with Redis Cluster
  • Database Sharding: Horizontal database partitioning
  • GraphQL: More efficient data fetching

AI/ML Integration

  • Automated Quality Assessment: ML-based quality scoring
  • Smart Chunk Assignment: AI-driven task assignment
  • Real-time Transcription: Automatic transcription assistance
  • Anomaly Detection: ML-based fraud and quality detection

This architecture provides a solid foundation for Shrutik’s current needs while maintaining flexibility for future growth and enhancements.

API Reference

This document provides comprehensive documentation for the Shrutik API, including authentication, endpoints, request/response formats, and examples.

🔗 Base URL

  • Development: http://localhost:8000

Authentication

Shrutik uses JWT (JSON Web Token) based authentication with refresh tokens for secure API access.

Authentication Flow

sequenceDiagram
    participant C as Client
    participant A as API
    participant DB as Database

    C->>A: POST /api/auth/login
    A->>DB: Validate credentials
    DB-->>A: User data
    A-->>C: JWT + Refresh Token

    Note over C: Store tokens securely

    C->>A: GET /api/recordings (with JWT)
    A->>A: Validate JWT
    A-->>C: Protected resource

    Note over C: JWT expires

    C->>A: POST /api/auth/refresh
    A->>A: Validate refresh token
    A-->>C: New JWT

Authentication Endpoints

Login

POST /api/auth/login
Content-Type: application/json

{
  "email": "user@example.com",
  "password": "secure_password"
}

Response:

{
  "access_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "bearer",
  "user": {
    "id": 1,
    "email": "user@example.com",
    "name": "John Doe"
  }
}

Register

POST /api/auth/register
Content-Type: application/json

{
  "email": "newuser@example.com",
  "password": "secure_password",
  "name": "Jane Smith",
  "preferred_language": "bn"
}

Refresh Token

POST /api/auth/refresh
Content-Type: application/json

{
  "refresh_token": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Logout

POST /api/auth/logout
Authorization: Bearer <access_token>

Using Authentication

Include the JWT token in the Authorization header for all protected endpoints:

Authorization: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
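
In client code this usually reduces to two small helpers. The names and the refresh-once policy below are illustrative, not part of an official SDK:

```python
def auth_headers(access_token: str) -> dict:
    """Build the Authorization header for protected endpoints."""
    return {"Authorization": f"Bearer {access_token}"}

def should_refresh(status_code: int, already_retried: bool) -> bool:
    """Refresh the JWT once on a 401, then give up to avoid retry loops."""
    return status_code == 401 and not already_retried
```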

Response Format

All API responses follow a consistent format:

Success Response

{
  "data": {
    // Response data
  },
  "message": "Operation successful",
  "timestamp": "2024-01-01T12:00:00Z"
}

Error Response

{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid input data",
    "details": {
      "field": "email",
      "issue": "Invalid email format"
    }
  },
  "timestamp": "2024-01-01T12:00:00Z"
}

Pagination Response

{
  "data": [
    // Array of items
  ],
  "pagination": {
    "total": 150,
    "page": 1,
    "per_page": 20,
    "total_pages": 8,
    "has_next": true,
    "has_prev": false
  }
}
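
Client code can walk this envelope with a small generator. `fetch_page` is a stand-in for whatever HTTP call you use, so this is a sketch rather than official SDK code:

```python
from typing import Callable, Iterator

def iter_all_items(fetch_page: Callable[[int], dict]) -> Iterator[dict]:
    """Yield every item across pages using the pagination envelope above."""
    page = 1
    while True:
        body = fetch_page(page)
        yield from body["data"]
        if not body["pagination"]["has_next"]:
            break
        page += 1
```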

Voice Recordings API

Create Recording Session

Start a new recording session for a specific script.

POST /api/recordings/sessions
Authorization: Bearer <token>
Content-Type: application/json

{
  "script_id": 123,
  "language_id": 1,
  "metadata": {
    "device_info": "iPhone 14",
    "environment": "quiet_room"
  }
}

Response:

{
  "session_id": "uuid-string",
  "script": {
    "id": 123,
    "content": "আমি বাংলায় কথা বলি।",
    "language": "Bengali",
    "difficulty": "easy"
  },
  "expires_at": "2024-01-01T14:00:00Z"
}

Upload Recording

Upload an audio file for a recording session.

POST /api/recordings/upload
Authorization: Bearer <token>
Content-Type: multipart/form-data

session_id: uuid-string
duration: 5.2
audio_format: wav
file_size: 1048576
sample_rate: 44100
channels: 1
audio_file: <binary_data>

Response:

{
  "recording_id": 456,
  "status": "uploaded",
  "processing_job_id": "job-uuid",
  "estimated_processing_time": 30
}

Get User Recordings

Retrieve paginated list of user’s recordings.

GET /api/recordings?skip=0&limit=20&status=processed
Authorization: Bearer <token>

Response:

{
  "recordings": [
    {
      "id": 456,
      "script_id": 123,
      "language": "Bengali",
      "duration": 5.2,
      "status": "processed",
      "chunks_count": 3,
      "created_at": "2024-01-01T12:00:00Z"
    }
  ],
  "total": 50,
  "page": 1,
  "per_page": 20,
  "total_pages": 3
}

Get Recording Progress

Check processing progress for a recording.

GET /api/recordings/456/progress
Authorization: Bearer <token>

Response:

{
  "recording_id": 456,
  "status": "processing",
  "progress_percentage": 75,
  "current_step": "chunking_audio",
  "estimated_completion": "2024-01-01T12:05:00Z",
  "chunks_created": 2
}
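
Clients typically poll this endpoint until the status leaves `processing`. A sketch with the fetch and sleep functions injected for testability (the helper name and defaults are assumptions):

```python
import time
from typing import Callable

def wait_until_processed(fetch_progress: Callable[[], dict],
                         poll_seconds: float = 2.0,
                         max_polls: int = 30,
                         sleep=time.sleep) -> dict:
    """Poll the progress endpoint until the recording leaves 'processing'."""
    for _ in range(max_polls):
        progress = fetch_progress()
        if progress["status"] != "processing":
            return progress
        sleep(poll_seconds)
    raise TimeoutError("recording did not finish processing in time")
```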

Transcriptions API

Get Transcription Task

Request audio chunks for transcription.

POST /api/transcriptions/tasks
Authorization: Bearer <token>
Content-Type: application/json

{
  "quantity": 5,
  "language_id": 1,
  "skip_chunk_ids": [10, 15, 20],
  "difficulty_preference": "mixed"
}

Response:

{
  "session_id": "transcription-session-uuid",
  "chunks": [
    {
      "id": 789,
      "recording_id": 456,
      "chunk_index": 1,
      "file_path": "/chunks/chunk_789.wav",
      "duration": 3.5,
      "sentence_hint": "Greeting phrase",
      "transcription_count": 2
    }
  ],
  "total_available": 1500
}

Submit Transcriptions

Submit transcriptions for audio chunks.

POST /api/transcriptions/submit
Authorization: Bearer <token>
Content-Type: application/json

{
  "session_id": "transcription-session-uuid",
  "transcriptions": [
    {
      "chunk_id": 789,
      "language_id": 1,
      "text": "আমি বাংলায় কথা বলি।",
      "quality": 4.5,
      "confidence": 0.95,
      "metadata": {
        "time_taken": 45,
        "difficulty_rating": 3
      }
    }
  ],
  "skipped_chunk_ids": [790]
}

Response:

{
  "submitted_count": 1,
  "skipped_count": 1,
  "transcriptions": [
    {
      "id": 1001,
      "chunk_id": 789,
      "text": "আমি বাংলায় কথা বলি।",
      "quality": 4.5,
      "is_consensus": false,
      "created_at": "2024-01-01T12:00:00Z"
    }
  ],
  "message": "Successfully submitted 1 transcriptions"
}

Skip Chunk

Skip a difficult or unclear audio chunk.

POST /api/transcriptions/skip
Authorization: Bearer <token>
Content-Type: application/json

{
  "chunk_id": 790,
  "reason": "poor_audio_quality",
  "comment": "Background noise makes it unclear"
}

Get User Transcriptions

Retrieve user’s transcription history.

GET /api/transcriptions?skip=0&limit=20&language_id=1
Authorization: Bearer <token>

Response:

{
  "transcriptions": [
    {
      "id": 1001,
      "chunk_id": 789,
      "text": "আমি বাংলায় কথা বলি।",
      "quality": 4.5,
      "confidence": 0.95,
      "is_consensus": true,
      "is_validated": true,
      "created_at": "2024-01-01T12:00:00Z"
    }
  ],
  "total": 100,
  "page": 1,
  "per_page": 20,
  "total_pages": 5
}

Audio Chunks API

Get Chunk Audio

Retrieve audio file for a specific chunk.

GET /api/chunks/789/audio
Authorization: Bearer <token>

Response: Binary audio data with optimized headers

Headers:

Content-Type: audio/wav
Cache-Control: public, max-age=3600
Accept-Ranges: bytes
Content-Length: 1048576
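
Because the endpoint advertises `Accept-Ranges: bytes`, a client can resume an interrupted download or fetch a slice with a Range header. A minimal helper (illustrative, not part of an official SDK):

```python
from typing import Optional

def range_header(start_byte: int, end_byte: Optional[int] = None) -> dict:
    """Build an HTTP Range header to resume or partially fetch chunk audio."""
    suffix = "" if end_byte is None else str(end_byte)
    return {"Range": f"bytes={start_byte}-{suffix}"}
```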

Get Chunk Info

Get metadata about an audio chunk.

GET /api/chunks/789/info
Authorization: Bearer <token>

Response:

{
  "chunk_id": 789,
  "recording_id": 456,
  "duration": 3.5,
  "start_time": 1.2,
  "end_time": 4.7,
  "transcription_count": 3,
  "file_size": 1048576,
  "optimized_url": "https://cdn.example.com/chunks/789.wav",
  "alternatives": [
    {
      "format": ".mp3",
      "url": "https://cdn.example.com/chunks/789.mp3",
      "mime_type": "audio/mpeg"
    }
  ]
}

Admin API

Platform Statistics

Get comprehensive platform statistics (admin only).

GET /api/admin/stats/platform
Authorization: Bearer <admin_token>

Response:

{
  "users": {
    "total": 1500,
    "active_last_30_days": 450,
    "new_this_month": 75
  },
  "recordings": {
    "total": 5000,
    "total_duration_hours": 250.5,
    "processed": 4800,
    "pending": 200
  },
  "transcriptions": {
    "total": 15000,
    "consensus_reached": 12000,
    "average_quality": 4.2
  },
  "languages": {
    "supported": 5,
    "most_active": "Bengali"
  }
}

User Management

Get users for management (admin only).

GET /api/admin/users?role=contributor&limit=50
Authorization: Bearer <admin_token>

Performance Dashboard

Get performance metrics (admin only).

GET /api/admin/performance/dashboard
Authorization: Bearer <admin_token>

Response:

{
  "system_metrics": {
    "cpu_usage": 45.2,
    "memory_usage": 67.8,
    "disk_usage": 23.1,
    "active_connections": 150
  },
  "cache_performance": {
    "hit_rate": 85.5,
    "memory_used": "512MB",
    "keys_count": 15000
  },
  "database_performance": {
    "connection_pool": {
      "total_connections": 20,
      "active_connections": 8,
      "idle_connections": 12
    },
    "slow_queries": 2
  }
}

Export API

Create Dataset Export

Request a dataset export job.

POST /api/export/dataset
Authorization: Bearer <token>
Content-Type: application/json

{
  "format": "csv",
  "language_ids": [1, 2],
  "include_audio": true,
  "quality_threshold": 4.0,
  "consensus_only": true,
  "date_range": {
    "start": "2024-01-01",
    "end": "2024-12-31"
  }
}

Response:

{
  "job_id": "export-job-uuid",
  "status": "queued",
  "estimated_completion": "2024-01-01T12:30:00Z",
  "estimated_size_mb": 150
}

Get Export Status

Check export job status.

GET /api/export/jobs/export-job-uuid/status
Authorization: Bearer <token>

Response:

{
  "job_id": "export-job-uuid",
  "status": "completed",
  "progress_percentage": 100,
  "download_url": "https://api.example.com/downloads/dataset-uuid.zip",
  "file_size_mb": 145.7,
  "expires_at": "2024-01-08T12:00:00Z"
}

Scripts API

Get Available Scripts

Retrieve scripts available for recording.

GET /api/scripts?language_id=1&difficulty=easy&limit=20
Authorization: Bearer <token>

Response:

{
  "scripts": [
    {
      "id": 123,
      "content": "আমি বাংলায় কথা বলি।",
      "language": {
        "id": 1,
        "name": "Bengali",
        "code": "bn"
      },
      "difficulty": "easy",
      "estimated_duration": 3.5,
      "recording_count": 25
    }
  ],
  "total": 500,
  "page": 1,
  "per_page": 20
}

Languages API

Get Supported Languages

Retrieve list of supported languages.

GET /api/languages

Response:

{
  "languages": [
    {
      "id": 1,
      "name": "Bengali",
      "code": "bn",
      "script": "Bengali",
      "active": true,
      "recording_count": 5000,
      "transcription_count": 15000
    },
    {
      "id": 2,
      "name": "Hindi",
      "code": "hi",
      "script": "Devanagari",
      "active": true,
      "recording_count": 3000,
      "transcription_count": 9000
    }
  ]
}

Search API

Search Transcriptions

Search through transcriptions (admin only).

GET /api/search/transcriptions?q=greeting&language_id=1&limit=20
Authorization: Bearer <admin_token>

Health Check

System Health

Check system health and status.

GET /health

Response:

{
  "status": "healthy",
  "checks": {
    "database": true,
    "redis": true,
    "disk_space": true,
    "memory": true
  },
  "performance": {
    "database_pool": {
      "total_connections": 20,
      "active_connections": 5
    },
    "cache_status": true
  },
  "timestamp": "2024-01-01T12:00:00Z"
}

Metrics

Performance Metrics

Get performance metrics (admin only).

GET /metrics
Authorization: Bearer <admin_token>

Error Codes

HTTP Status Codes

| Code | Description |
|------|-------------|
| 200 | Success |
| 201 | Created |
| 400 | Bad Request |
| 401 | Unauthorized |
| 403 | Forbidden |
| 404 | Not Found |
| 422 | Validation Error |
| 429 | Rate Limited |
| 500 | Internal Server Error |

Custom Error Codes

| Code | Description |
|------|-------------|
| VALIDATION_ERROR | Input validation failed |
| AUTHENTICATION_FAILED | Invalid credentials |
| INSUFFICIENT_PERMISSIONS | User lacks required permissions |
| RESOURCE_NOT_FOUND | Requested resource not found |
| RATE_LIMIT_EXCEEDED | Too many requests |
| SESSION_EXPIRED | Recording/transcription session expired |
| FILE_TOO_LARGE | Uploaded file exceeds size limit |
| UNSUPPORTED_FORMAT | Audio format not supported |
| PROCESSING_ERROR | Audio processing failed |
| CONSENSUS_PENDING | Transcription consensus not yet reached |

Rate Limits

Default Limits

| User Type | Requests/Minute |
|-----------|-----------------|
| Anonymous | 60 |
| Authenticated | 300 |
| Admin | 1000 |
| Sworik Developer | 2000 |

Endpoint-Specific Limits

| Endpoint | Limit | Window |
|----------|-------|--------|
| /api/auth/login | 10/min | 1 minute |
| /api/recordings/upload | 20/min | 1 minute |
| /api/transcriptions/submit | 100/min | 1 minute |
| /api/chunks/*/audio | 10/sec | 1 second |

Rate Limit Headers

X-RateLimit-Limit: 300
X-RateLimit-Remaining: 299
X-RateLimit-Reset: 1640995200
Retry-After: 60
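
A client can honor these headers with a small helper. The fallback delay when `Retry-After` is missing is an assumption, not something the API specifies:

```python
from typing import Optional

def retry_delay(status_code: int, headers: dict,
                default: float = 60.0) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no retry is needed."""
    if status_code != 429:
        return None
    retry_after = headers.get("Retry-After")
    # Retry-After from this API is a number of seconds.
    return float(retry_after) if retry_after else default
```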

Security

API Security Best Practices

  1. Always use HTTPS in production
  2. Store JWT tokens securely (not in localStorage for web apps)
  3. Implement proper CORS policies
  4. Validate all inputs on client and server
  5. Use refresh tokens for long-lived sessions
  6. Implement rate limiting to prevent abuse
  7. Log security events for monitoring

Content Security Policy

Content-Security-Policy: default-src 'self'; script-src 'self' 'unsafe-inline'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; media-src 'self' blob:; connect-src 'self' wss:

SDKs and Libraries

JavaScript/TypeScript SDK

npm install @shrutik/sdk

import { ShrutikClient } from '@shrutik/sdk';

const client = new ShrutikClient({
  baseURL: 'https://api.yourdomain.com',
  apiKey: 'your-api-key'
});

// Get transcription task
const task = await client.transcriptions.getTask({
  quantity: 5,
  languageId: 1
});

// Submit transcription
await client.transcriptions.submit({
  sessionId: task.sessionId,
  transcriptions: [{
    chunkId: 789,
    text: 'Transcribed text',
    quality: 4.5
  }]
});

Python SDK

pip install shrutik-sdk

from shrutik import ShrutikClient

client = ShrutikClient(
    base_url='https://api.yourdomain.com',
    api_key='your-api-key'
)

# Get transcription task
task = client.transcriptions.get_task(
    quantity=5,
    language_id=1
)

# Submit transcription
client.transcriptions.submit(
    session_id=task.session_id,
    transcriptions=[{
        'chunk_id': 789,
        'text': 'Transcribed text',
        'quality': 4.5
    }]
)

Testing

API Testing with curl

# Login
curl -X POST https://api.yourdomain.com/api/auth/login \
  -H "Content-Type: application/json" \
  -d '{"email":"user@example.com","password":"password"}'

# Get recordings (with token)
curl -X GET https://api.yourdomain.com/api/recordings \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Upload recording
curl -X POST https://api.yourdomain.com/api/recordings/upload \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -F "session_id=uuid" \
  -F "duration=5.2" \
  -F "audio_format=wav" \
  -F "file_size=1048576" \
  -F "audio_file=@recording.wav"

For additional support, join our Discord community or check our GitHub repository.

Audio Processing Modes

The Voice Data Collection Platform supports two modes for audio processing to accommodate different development and deployment scenarios.

Processing Modes

1. Celery Processing - Asynchronous Background Tasks

When to use:

  • Production deployments
  • Development with full background processing
  • When you want non-blocking audio uploads
  • When processing large audio files

How it works:

  1. User uploads audio file
  2. File is saved and marked as UPLOADED
  3. Celery task is queued for background processing
  4. User gets immediate response
  5. Background worker processes audio into chunks
  6. Status updates to PROCESSED when complete
  7. User can check progress via API

Setup:

# In .env file
USE_CELERY=true

# Start services
redis-server
celery -A app.core.celery_app worker --loglevel=info
uvicorn app.main:app --reload

Benefits:

  • ✅ Non-blocking uploads
  • ✅ Scalable (multiple workers)
  • ✅ Retry mechanisms
  • ✅ Progress tracking
  • ✅ Monitoring via Flower

Drawbacks:

  • ❌ More complex setup
  • ❌ Requires Redis
  • ❌ Requires Celery workers

2. Synchronous Processing - Simple Development

When to use:

  • Quick local development
  • Testing without Celery setup
  • Simple deployments
  • When immediate results are needed

How it works:

  1. User uploads audio file
  2. File is saved and marked as PROCESSING
  3. Audio processing happens immediately in the request
  4. Chunks are created during the upload request
  5. Status updates to PROCESSED before response
  6. User gets complete results immediately

Setup:

# In .env file
USE_CELERY=false

# Start service
uvicorn app.main:app --reload

Benefits:

  • ✅ Simple setup (no Redis/Celery needed)
  • ✅ Immediate results
  • ✅ Easier debugging
  • ✅ No additional services required

Drawbacks:

  • ❌ Blocking uploads (slower response)
  • ❌ No retry mechanisms
  • ❌ Single-threaded processing
  • ❌ No progress tracking

Automatic Mode Detection

The system automatically detects which mode to use:

def _is_celery_available(self) -> bool:
    # 1. Check configuration
    if not settings.USE_CELERY:
        return False
    
    # 2. Check if workers are running
    try:
        inspect = celery_app.control.inspect()
        stats = inspect.stats()
        return stats is not None and len(stats) > 0
    except Exception:
        return False

Fallback Logic:

  1. If USE_CELERY=false → Always use synchronous processing
  2. If USE_CELERY=true but no workers → Fall back to synchronous processing
  3. If USE_CELERY=true and workers available → Use Celery processing
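
The fallback rules above can be sketched as a dispatcher with its collaborators injected. The function and parameter names are illustrative, not the platform's actual code:

```python
from typing import Callable

def process_upload(recording_id: int, use_celery: bool,
                   workers_available: Callable[[], bool],
                   queue_task: Callable[[int], None],
                   process_sync: Callable[[int], str]) -> str:
    """Apply the fallback logic: queue via Celery when possible, else inline."""
    if use_celery and workers_available():
        queue_task(recording_id)
        return "queued"
    # Either Celery is disabled or no workers responded: process inline.
    return process_sync(recording_id)
```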

API Behavior Differences

Upload Response

Celery Mode:

{
  "id": 123,
  "status": "uploaded",
  "message": "File uploaded successfully, processing queued"
}

Synchronous Mode:

{
  "id": 123,
  "status": "processed",
  "chunks_created": 5,
  "message": "File uploaded and processed successfully"
}

Progress Tracking

Celery Mode:

# Check progress
GET /api/recordings/123/progress
{
  "status": "processing",
  "progress": 45,
  "chunks_created": 0
}

# Later...
GET /api/recordings/123/progress
{
  "status": "processed", 
  "progress": 100,
  "chunks_created": 5
}

Synchronous Mode:

# Progress is always complete
GET /api/recordings/123/progress
{
  "status": "processed",
  "progress": 100,
  "chunks_created": 5
}

Configuration Options

Environment Variables

# Enable/disable Celery
USE_CELERY=true|false

# Celery configuration (when enabled)
REDIS_URL=redis://localhost:6379/0
JOB_MAX_RETRIES=3
JOB_RETRY_DELAY=60

Runtime Detection

The system logs which mode is being used:

# Celery mode
INFO: Queued audio processing task abc123 for recording 456

# Synchronous mode  
INFO: Celery not available, processing recording 456 synchronously...
INFO: Successfully processed recording 456 into 5 chunks

Development Workflow

For Frontend Development

Use synchronous mode for simplicity:

USE_CELERY=false
uvicorn app.main:app --reload

For Full-Stack Development

Use Celery mode to test complete workflow:

USE_CELERY=true
./scripts/start-local-dev.sh

For Production Testing

Always use Celery mode:

USE_CELERY=true
# + proper Redis/Celery setup

Monitoring and Debugging

Celery Mode Monitoring

# Check worker status
celery -A app.core.celery_app inspect active

# Monitor via Flower
celery -A app.core.celery_app flower --port=5555

# Check job status via API
GET /api/jobs/active

Synchronous Mode Debugging

# Check logs for processing errors
tail -f logs/app.log

# Processing happens in main thread
# Errors appear immediately in response

Performance Considerations

Celery Mode

  • Throughput: High (parallel processing)
  • Response Time: Fast (immediate return)
  • Resource Usage: Distributed across workers
  • Scalability: Horizontal (add more workers)

Synchronous Mode

  • Throughput: Limited (sequential processing)
  • Response Time: Slow (includes processing time)
  • Resource Usage: Single process
  • Scalability: Vertical only

Error Handling

Celery Mode

  • Automatic retries with exponential backoff
  • Failed tasks can be manually retried
  • Detailed error tracking in job monitoring
  • Notifications for failures

Synchronous Mode

  • Immediate error response
  • No automatic retries
  • Simpler error debugging
  • Direct error messages

Migration Between Modes

From Synchronous to Celery

  1. Set USE_CELERY=true
  2. Start Redis and Celery workers
  3. Existing processed recordings work normally
  4. New uploads use background processing

From Celery to Synchronous

  1. Set USE_CELERY=false
  2. Stop Celery workers (optional)
  3. Existing queued tasks will fail
  4. New uploads use synchronous processing

Note: In-progress Celery tasks will fail when switching to synchronous mode. Complete or cancel them first.

Best Practices

Development

  • Use synchronous mode for quick testing
  • Use Celery mode when testing full workflow
  • Monitor logs for processing errors

Production

  • Always use Celery mode
  • Set up proper monitoring
  • Configure retry mechanisms
  • Use multiple workers for scalability

Testing

  • Test both modes in CI/CD
  • Verify fallback behavior
  • Test error scenarios in both modes

Shrutik System Flowcharts

This section contains visual documentation of Shrutik’s system flows and processes using Mermaid diagrams. These flowcharts help developers and contributors understand the system architecture and data flow.

Available Flowcharts

Core System Flows

Technical Flows

How to Read These Diagrams

Symbols and Conventions

  • Rectangles: Processes or services
  • Diamonds: Decision points
  • Circles: Start/end points
  • Cylinders: Databases or storage
  • Clouds: External services
  • Arrows: Data flow direction

Color Coding

  • Blue: User interactions
  • Green: Successful operations
  • Red: Error conditions
  • Yellow: Processing/waiting states
  • Purple: External services

🔧 Updating Flowcharts

When making changes to the system:

  1. Review Affected Diagrams: Check which flowcharts need updates
  2. Update Mermaid Code: Modify the diagram code
  3. Test Rendering: Ensure diagrams render correctly
  4. Update Documentation: Sync with code changes

Mermaid Syntax Reference

graph TD
    A[Start] --> B{Decision?}
    B -->|Yes| C[Process]
    B -->|No| D[Alternative]
    C --> E[End]
    D --> E

📚 Additional Resources

Contributing

To contribute new flowcharts or update existing ones:

  1. Follow the naming convention: kebab-case.md
  2. Include a description and context
  3. Use consistent styling and colors
  4. Test diagram rendering
  5. Update this README if adding new diagrams

These visual guides complement our technical documentation and help make Shrutik more accessible to contributors of all backgrounds.

Voice Recording Flow

This flowchart details the complete process of voice recording in Shrutik, from user interaction to final storage and processing.

Complete Voice Recording Process

flowchart TD
    START([User Starts Recording]) --> AUTH{User Authenticated?}
    
    AUTH -->|No| LOGIN[Redirect to Login]
    LOGIN --> AUTH
    AUTH -->|Yes| SCRIPT[Get Script for Recording]
    
    SCRIPT --> PERM{Microphone Permission?}
    PERM -->|No| REQ_PERM[Request Microphone Access]
    REQ_PERM --> PERM_GRANT{Permission Granted?}
    PERM_GRANT -->|No| ERROR_PERM[Show Permission Error]
    PERM_GRANT -->|Yes| PERM
    
    PERM -->|Yes| SETUP[Setup Audio Recording]
    SETUP --> DISPLAY[Display Script Text]
    DISPLAY --> READY[Show Record Button]
    
    READY --> RECORD_START[User Clicks Record]
    RECORD_START --> RECORDING[Recording Audio...]
    
    RECORDING --> MONITOR{Monitor Recording}
    MONITOR --> CHECK_DURATION{Duration < Max?}
    CHECK_DURATION -->|No| AUTO_STOP[Auto Stop Recording]
    CHECK_DURATION -->|Yes| USER_STOP{User Stops?}
    
    USER_STOP -->|No| MONITOR
    USER_STOP -->|Yes| STOP_REC[Stop Recording]
    AUTO_STOP --> STOP_REC
    
    STOP_REC --> VALIDATE[Validate Audio]
    VALIDATE --> VALID{Audio Valid?}
    
    VALID -->|No| ERROR_AUDIO[Show Audio Error]
    ERROR_AUDIO --> READY
    
    VALID -->|Yes| PREVIEW[Show Audio Preview]
    PREVIEW --> USER_ACTION{User Action}
    
    USER_ACTION -->|Re-record| READY
    USER_ACTION -->|Cancel| CANCEL[Cancel Recording]
    USER_ACTION -->|Submit| PREPARE[Prepare Upload]
    
    PREPARE --> CREATE_SESSION[Create Recording Session]
    CREATE_SESSION --> SESSION_VALID{Session Created?}
    
    SESSION_VALID -->|No| ERROR_SESSION[Session Creation Error]
    SESSION_VALID -->|Yes| UPLOAD[Upload Audio File]
    
    UPLOAD --> UPLOAD_PROGRESS[Show Upload Progress]
    UPLOAD_PROGRESS --> UPLOAD_COMPLETE{Upload Complete?}
    
    UPLOAD_COMPLETE -->|No| UPLOAD_ERROR[Upload Error]
    UPLOAD_ERROR --> RETRY{Retry Upload?}
    RETRY -->|Yes| UPLOAD
    RETRY -->|No| CANCEL
    
    UPLOAD_COMPLETE -->|Yes| SAVE_DB[Save to Database]
    SAVE_DB --> QUEUE_PROCESSING[Queue for Processing]
    QUEUE_PROCESSING --> SUCCESS[Show Success Message]
    
    SUCCESS --> NEXT_ACTION{User Next Action}
    NEXT_ACTION -->|Record Another| SCRIPT
    NEXT_ACTION -->|View Progress| DASHBOARD[Go to Dashboard]
    NEXT_ACTION -->|Logout| LOGOUT[Logout User]
    
    CANCEL --> CLEANUP[Cleanup Resources]
    ERROR_PERM --> CLEANUP
    ERROR_SESSION --> CLEANUP
    CLEANUP --> END([End])
    
    DASHBOARD --> END
    LOGOUT --> END

    %% Background Processing (Async)
    QUEUE_PROCESSING -.-> BG_START[Background Processing Starts]
    BG_START -.-> VALIDATE_FILE[Validate Audio File]
    VALIDATE_FILE -.-> CHUNK_AUDIO[Intelligent Audio Chunking]
    CHUNK_AUDIO -.-> SAVE_CHUNKS[Save Audio Chunks]
    SAVE_CHUNKS -.-> UPDATE_STATUS[Update Recording Status]
    UPDATE_STATUS -.-> NOTIFY_USER[Notify User of Completion]

    %% Styling
    classDef userAction fill:#e3f2fd
    classDef process fill:#e8f5e8
    classDef decision fill:#fff3e0
    classDef error fill:#ffebee
    classDef success fill:#e0f2f1
    classDef background fill:#f3e5f5

    class START,RECORD_START,USER_STOP,USER_ACTION,NEXT_ACTION userAction
    class SETUP,DISPLAY,RECORDING,VALIDATE,PREPARE,UPLOAD,SAVE_DB process
    class AUTH,PERM,PERM_GRANT,CHECK_DURATION,VALID,SESSION_VALID,UPLOAD_COMPLETE,RETRY decision
    class ERROR_PERM,ERROR_AUDIO,ERROR_SESSION,UPLOAD_ERROR error
    class SUCCESS,NOTIFY_USER success
    class BG_START,VALIDATE_FILE,CHUNK_AUDIO,SAVE_CHUNKS,UPDATE_STATUS background

Process Breakdown

1. User Authentication & Setup

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant DB as Database

    U->>F: Access Recording Page
    F->>B: Check Authentication
    B->>DB: Validate Session
    DB-->>B: User Data
    B-->>F: Authentication Status
    
    alt Not Authenticated
        F->>U: Redirect to Login
        U->>F: Login Credentials
        F->>B: Authenticate User
        B-->>F: JWT Token
    end
    
    F->>B: Request Script
    B->>DB: Get Available Script
    DB-->>B: Script Data
    B-->>F: Script Content
    F->>U: Display Script

2. Audio Recording Process

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant M as MediaRecorder
    participant V as Validation

    U->>F: Click Record Button
    F->>M: Start Recording
    M-->>F: Recording Started
    F->>U: Show Recording UI
    
    loop During Recording
        M->>F: Audio Data Chunks
        F->>V: Validate Duration
        V-->>F: Status Update
    end
    
    U->>F: Stop Recording
    F->>M: Stop Recording
    M-->>F: Final Audio Blob
    F->>V: Validate Audio Quality
    V-->>F: Validation Result
    F->>U: Show Preview/Options

3. File Upload & Processing

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant S as Storage
    participant Q as Queue
    participant W as Worker

    U->>F: Submit Recording
    F->>B: Create Recording Session
    B-->>F: Session ID
    
    F->>B: Upload Audio File
    B->>S: Store Audio File
    S-->>B: File Path
    B->>Q: Queue Processing Job
    B-->>F: Upload Success
    F->>U: Show Success Message
    
    Q->>W: Process Audio File
    W->>S: Read Audio File
    W->>W: Validate & Chunk Audio
    W->>S: Save Audio Chunks
    W->>B: Update Status
    B->>F: Notify Completion (WebSocket)
    F->>U: Show Processing Complete

🔍 Validation Steps

Audio Quality Validation

  • Duration Check: 1-60 seconds
  • Format Validation: Supported audio formats
  • File Size: Maximum 100MB
  • Sample Rate: Minimum quality requirements
  • Noise Level: Basic noise detection
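
These gates can be expressed as one pure function. The duration and size limits come from the bullets above, while the format whitelist is an assumption for illustration:

```python
def validate_audio(duration_s: float, size_bytes: int,
                   audio_format: str) -> list[str]:
    """Return a list of problems; empty means the recording passes the gates."""
    problems = []
    if not 1 <= duration_s <= 60:
        problems.append("duration must be between 1 and 60 seconds")
    if size_bytes > 100 * 1024 * 1024:
        problems.append("file exceeds the 100 MB limit")
    # The accepted format set here is illustrative, not the platform's list.
    if audio_format.lower() not in {"wav", "mp3", "ogg", "webm"}:
        problems.append("unsupported audio format")
    return problems
```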

Security Validation

  • File Type: MIME type verification
  • Malware Scan: Basic security checks
  • User Permissions: Recording quota limits
  • Session Validation: Valid recording session

Performance Optimizations

Frontend Optimizations

  • Progressive Upload: Chunked file upload
  • Compression: Client-side audio compression
  • Caching: Cache user preferences and scripts
  • Offline Support: Queue recordings when offline

Backend Optimizations

  • Async Processing: Background job processing
  • Connection Pooling: Database connection optimization
  • Caching: Redis caching for frequent data
  • CDN Integration: Optimized file delivery

Error Handling

Common Error Scenarios

  1. Microphone Access Denied

    • Show clear instructions
    • Provide alternative options
    • Guide user through browser settings
  2. Network Connection Issues

    • Implement retry logic
    • Show connection status
    • Queue uploads for later
  3. File Upload Failures

    • Automatic retry with exponential backoff
    • Resume interrupted uploads
    • Clear error messages
  4. Audio Quality Issues

    • Real-time quality feedback
    • Recording tips and guidance
    • Option to re-record
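
The exponential backoff with jitter mentioned for upload retries can be sketched as a delay schedule; the parameters are illustrative defaults, not the platform's configured values:

```python
import random

def backoff_delays(max_retries: int = 3, base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing retry delays, each randomized with jitter."""
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        # Full jitter spreads retries so clients don't stampede together.
        yield random.uniform(0, delay)
```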

Error Recovery

flowchart LR
    ERROR[Error Occurs] --> LOG[Log Error Details]
    LOG --> CLASSIFY{Error Type}
    
    CLASSIFY -->|Network| RETRY[Automatic Retry]
    CLASSIFY -->|Validation| USER_FIX[User Action Required]
    CLASSIFY -->|System| FALLBACK[Fallback Method]
    
    RETRY --> SUCCESS{Retry Success?}
    SUCCESS -->|Yes| CONTINUE[Continue Process]
    SUCCESS -->|No| USER_FIX
    
    USER_FIX --> GUIDE[Show User Guidance]
    FALLBACK --> ALTERNATIVE[Alternative Flow]
    
    GUIDE --> CONTINUE
    ALTERNATIVE --> CONTINUE

Monitoring & Analytics

Key Metrics

  • Recording Success Rate: Percentage of successful recordings
  • Average Recording Duration: User engagement metrics
  • Upload Success Rate: Technical performance metrics
  • Processing Time: Background job performance
  • Error Rates: System reliability metrics

User Experience Metrics

  • Time to First Recording: Onboarding effectiveness
  • Recording Abandonment Rate: UX friction points
  • Retry Attempts: Error recovery effectiveness
  • User Satisfaction: Quality ratings and feedback

This comprehensive flow ensures a smooth, reliable voice recording experience while maintaining high quality standards and robust error handling.

Transcription Workflow

Interactive Diagrams
Zoom & Pan Enabled

This flowchart details the complete transcription process in Shrutik, including task assignment, transcription submission, consensus building, and quality control.

💡 Pro Tip: All diagrams below are interactive! Use your mouse wheel to zoom, drag to pan, and double-click to reset. Click the fullscreen button for a better view of complex diagrams.

Complete Transcription Process

flowchart TD
    START([User Requests Transcription Task]) --> AUTH{User Authenticated?}
    
    AUTH -->|No| LOGIN[Redirect to Login]
    LOGIN --> AUTH
    AUTH -->|Yes| REQ_TASK[Request Transcription Task]
    
    REQ_TASK --> TASK_PARAMS[Specify Task Parameters]
    TASK_PARAMS --> FIND_CHUNKS[Find Available Chunks]
    
    FIND_CHUNKS --> FILTER[Apply Filters]
    FILTER --> EXCLUDE[Exclude User's Previous Work]
    EXCLUDE --> AVAILABLE{Chunks Available?}
    
    AVAILABLE -->|No| NO_CHUNKS[No Chunks Available]
    NO_CHUNKS --> SUGGEST[Suggest Alternatives]
    SUGGEST --> END_NO_WORK([End - No Work])
    
    AVAILABLE -->|Yes| SELECT[Select Random Chunks]
    SELECT --> CREATE_SESSION[Create Transcription Session]
    CREATE_SESSION --> LOAD_AUDIO[Load Audio Files]
    
    LOAD_AUDIO --> OPTIMIZE[Optimize Audio Delivery]
    OPTIMIZE --> PRESENT[Present Chunks to User]
    
    PRESENT --> USER_WORK[User Transcribes Audio]
    USER_WORK --> TRANSCRIBE{Transcription Action}
    
    TRANSCRIBE -->|Skip Chunk| SKIP_CHUNK[Record Skip Reason]
    TRANSCRIBE -->|Transcribe| ENTER_TEXT[Enter Transcription Text]
    TRANSCRIBE -->|Submit All| VALIDATE_SUBMISSION[Validate Submission]
    
    SKIP_CHUNK --> UPDATE_SKIP[Update Skip Metadata]
    UPDATE_SKIP --> NEXT_CHUNK{More Chunks?}
    
    ENTER_TEXT --> QUALITY_RATE[Rate Audio Quality]
    QUALITY_RATE --> CONFIDENCE[Set Confidence Level]
    CONFIDENCE --> SAVE_DRAFT[Save Draft Locally]
    SAVE_DRAFT --> NEXT_CHUNK
    
    NEXT_CHUNK -->|Yes| PRESENT
    NEXT_CHUNK -->|No| VALIDATE_SUBMISSION
    
    VALIDATE_SUBMISSION --> CHECK_REQUIRED{Required Fields?}
    CHECK_REQUIRED -->|Missing| SHOW_ERRORS[Show Validation Errors]
    SHOW_ERRORS --> USER_WORK
    
    CHECK_REQUIRED -->|Complete| SUBMIT[Submit Transcriptions]
    SUBMIT --> PROCESS_SUBMISSION[Process Submission]
    
    PROCESS_SUBMISSION --> VALIDATE_SESSION{Valid Session?}
    VALIDATE_SESSION -->|No| SESSION_ERROR[Session Error]
    SESSION_ERROR --> ERROR_RECOVERY[Error Recovery]
    
    VALIDATE_SESSION -->|Yes| CHECK_DUPLICATES{Check Duplicates}
    CHECK_DUPLICATES -->|Found| DUPLICATE_ERROR[Duplicate Error]
    DUPLICATE_ERROR --> ERROR_RECOVERY
    
    CHECK_DUPLICATES -->|None| SAVE_TRANSCRIPTIONS[Save Transcriptions]
    SAVE_TRANSCRIPTIONS --> UPDATE_STATS[Update User Stats]
    UPDATE_STATS --> TRIGGER_CONSENSUS[Trigger Consensus Calculation]
    
    TRIGGER_CONSENSUS --> SUCCESS[Show Success Message]
    SUCCESS --> CLEANUP_SESSION[Cleanup Session]
    CLEANUP_SESSION --> NEXT_ACTION{User Next Action}
    
    NEXT_ACTION -->|Continue| REQ_TASK
    NEXT_ACTION -->|View Progress| DASHBOARD[Go to Dashboard]
    NEXT_ACTION -->|Logout| LOGOUT[Logout User]
    
    ERROR_RECOVERY --> RETRY{Retry Submission?}
    RETRY -->|Yes| SUBMIT
    RETRY -->|No| SAVE_DRAFT
    
    DASHBOARD --> END_SUCCESS([End - Success])
    LOGOUT --> END_SUCCESS

    %% Background Consensus Process
    TRIGGER_CONSENSUS -.-> BG_CONSENSUS[Background Consensus Process]
    BG_CONSENSUS -.-> COLLECT_TRANSCRIPTIONS[Collect All Transcriptions for Chunk]
    COLLECT_TRANSCRIPTIONS -.-> CALCULATE_SIMILARITY[Calculate Text Similarity]
    CALCULATE_SIMILARITY -.-> WEIGHT_QUALITY[Weight by Quality Scores]
    WEIGHT_QUALITY -.-> DETERMINE_CONSENSUS[Determine Consensus Text]
    DETERMINE_CONSENSUS -.-> UPDATE_CONSENSUS[Update Consensus in Database]
    UPDATE_CONSENSUS -.-> NOTIFY_CONTRIBUTORS[Notify Contributors]

    %% Styling
    classDef userAction fill:#e3f2fd
    classDef process fill:#e8f5e8
    classDef decision fill:#fff3e0
    classDef error fill:#ffebee
    classDef success fill:#e0f2f1
    classDef background fill:#f3e5f5

    class START,USER_WORK,TRANSCRIBE,NEXT_ACTION userAction
    class REQ_TASK,FIND_CHUNKS,SELECT,LOAD_AUDIO,SAVE_TRANSCRIPTIONS process
    class AUTH,AVAILABLE,CHECK_REQUIRED,VALIDATE_SESSION,CHECK_DUPLICATES decision
    class SESSION_ERROR,DUPLICATE_ERROR,SHOW_ERRORS error
    class SUCCESS,NOTIFY_CONTRIBUTORS success
    class BG_CONSENSUS,COLLECT_TRANSCRIPTIONS,CALCULATE_SIMILARITY,WEIGHT_QUALITY background

Task Assignment Algorithm

flowchart LR
    subgraph "Task Request Parameters"
        LANG[Language Preference]
        QTY[Quantity Requested]
        SKIP[Skip List]
        DIFFICULTY[Difficulty Level]
    end
    
    subgraph "Filtering Process"
        ALL_CHUNKS[All Available Chunks]
        FILTER_LANG[Filter by Language]
        FILTER_USER[Exclude User's Work]
        FILTER_SKIP[Exclude Skip List]
        FILTER_STATUS[Filter by Status]
        PRIORITIZE[Prioritize by Need]
    end
    
    subgraph "Selection Strategy"
        RANDOM[Random Selection]
        BALANCED[Balance Difficulty]
        QUALITY[Quality Distribution]
        FINAL[Final Chunk List]
    end
    
    LANG --> FILTER_LANG
    QTY --> RANDOM
    SKIP --> FILTER_SKIP
    DIFFICULTY --> BALANCED
    
    ALL_CHUNKS --> FILTER_LANG
    FILTER_LANG --> FILTER_USER
    FILTER_USER --> FILTER_SKIP
    FILTER_SKIP --> FILTER_STATUS
    FILTER_STATUS --> PRIORITIZE
    
    PRIORITIZE --> RANDOM
    RANDOM --> BALANCED
    BALANCED --> QUALITY
    QUALITY --> FINAL
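The filtering pipeline above can be sketched as a single function over plain dicts. The field names (`language`, `status`, `transcribed_by`) are illustrative stand-ins for the real chunk model, and the difficulty-balancing step is omitted for brevity:

```python
import random

def assign_chunks(chunks, user_id, language, skip_ids, quantity, seed=None):
    """Filter out the user's own work and skipped chunks, then sample randomly."""
    pool = [
        c for c in chunks
        if c["language"] == language
        and c["status"] == "available"
        and user_id not in c["transcribed_by"]
        and c["id"] not in skip_ids
    ]
    rng = random.Random(seed)  # seedable for testing
    return rng.sample(pool, min(quantity, len(pool)))

chunks = [
    {"id": 1, "language": "bn", "status": "available", "transcribed_by": set()},
    {"id": 2, "language": "bn", "status": "available", "transcribed_by": {42}},
    {"id": 3, "language": "en", "status": "available", "transcribed_by": set()},
    {"id": 4, "language": "bn", "status": "complete", "transcribed_by": set()},
]
picked = assign_chunks(chunks, user_id=42, language="bn", skip_ids=set(), quantity=5)
print([c["id"] for c in picked])  # [1] — only chunk 1 survives every filter
```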

Consensus Algorithm

flowchart TD
    CHUNK[Audio Chunk] --> COLLECT[Collect All Transcriptions]
    COLLECT --> COUNT{Transcription Count}
    
    COUNT -->|< 3| NEED_MORE[Need More Transcriptions]
    COUNT -->|≥ 3| ANALYZE[Analyze Transcriptions]
    
    ANALYZE --> SIMILARITY[Calculate Text Similarity]
    SIMILARITY --> CLUSTER[Group Similar Transcriptions]
    CLUSTER --> WEIGHT[Apply Quality Weights]
    
    WEIGHT --> SCORE[Calculate Consensus Scores]
    SCORE --> THRESHOLD{Above Threshold?}
    
    THRESHOLD -->|No| NEED_MORE
    THRESHOLD -->|Yes| SELECT_CONSENSUS[Select Consensus Text]
    
    SELECT_CONSENSUS --> VALIDATE_CONSENSUS[Validate Consensus Quality]
    VALIDATE_CONSENSUS --> MARK_COMPLETE[Mark Chunk as Complete]
    
    NEED_MORE --> PRIORITY[Increase Priority for Assignment]
    MARK_COMPLETE --> UPDATE_CONTRIBUTORS[Update Contributor Stats]
    
    %% Consensus Calculation Details
    subgraph "Similarity Calculation"
        LEVENSHTEIN[Levenshtein Distance]
        SEMANTIC[Semantic Similarity]
        PHONETIC[Phonetic Matching]
        COMBINED[Combined Score]
    end
    
    SIMILARITY --> LEVENSHTEIN
    SIMILARITY --> SEMANTIC
    SIMILARITY --> PHONETIC
    LEVENSHTEIN --> COMBINED
    SEMANTIC --> COMBINED
    PHONETIC --> COMBINED
    COMBINED --> CLUSTER

Quality Control Process

sequenceDiagram
    participant U as User
    participant F as Frontend
    participant B as Backend
    participant Q as Quality Engine
    participant DB as Database
    participant N as Notification

    U->>F: Submit Transcription
    F->>B: Send Transcription Data
    B->>DB: Save Transcription
    
    B->>Q: Trigger Quality Check
    Q->>DB: Get Related Transcriptions
    Q->>Q: Calculate Quality Metrics
    
    alt Quality Issues Detected
        Q->>DB: Flag for Review
        Q->>N: Notify Moderators
    else Quality Acceptable
        Q->>Q: Update Quality Score
    end
    
    Q->>B: Quality Assessment Complete
    B->>F: Update UI Status
    F->>U: Show Completion Status
    
    Note over Q: Background Consensus Process
    Q->>Q: Check Consensus Threshold
    
    alt Consensus Reached
        Q->>DB: Update Consensus Text
        Q->>N: Notify Contributors
    else Need More Transcriptions
        Q->>DB: Increase Chunk Priority
    end

Progress Tracking

Individual User Progress

graph LR
    subgraph "User Metrics"
        TOTAL[Total Transcriptions]
        ACCURACY[Accuracy Rate]
        SPEED[Average Speed]
        QUALITY[Quality Score]
    end
    
    subgraph "Achievements"
        BADGES[Achievement Badges]
        LEVELS[Experience Levels]
        STREAKS[Contribution Streaks]
        RANKINGS[Leaderboards]
    end
    
    TOTAL --> BADGES
    ACCURACY --> LEVELS
    SPEED --> RANKINGS
    QUALITY --> STREAKS

System-wide Progress

graph TD
    subgraph "Dataset Metrics"
        CHUNKS_TOTAL[Total Audio Chunks]
        CHUNKS_TRANSCRIBED[Transcribed Chunks]
        CONSENSUS_REACHED[Consensus Achieved]
        QUALITY_VALIDATED[Quality Validated]
    end
    
    subgraph "Language Coverage"
        LANG_SUPPORTED[Supported Languages]
        DIALECT_COVERAGE[Dialect Coverage]
        SPEAKER_DIVERSITY[Speaker Diversity]
        DOMAIN_COVERAGE[Domain Coverage]
    end
    
    CHUNKS_TOTAL --> CHUNKS_TRANSCRIBED
    CHUNKS_TRANSCRIBED --> CONSENSUS_REACHED
    CONSENSUS_REACHED --> QUALITY_VALIDATED
    
    QUALITY_VALIDATED --> LANG_SUPPORTED
    LANG_SUPPORTED --> DIALECT_COVERAGE
    DIALECT_COVERAGE --> SPEAKER_DIVERSITY
    SPEAKER_DIVERSITY --> DOMAIN_COVERAGE

Optimization Strategies

Performance Optimizations

  • Caching: Cache frequently accessed chunks and user data
  • Preloading: Preload next chunks while user works on current ones
  • CDN: Optimize audio delivery through CDN
  • Compression: Compress audio for faster loading
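The chunk-caching idea can be illustrated with a minimal TTL cache. In Shrutik this role is served by Redis; the dict-based version below only shows the expiry logic, and the 300-second TTL and key format are illustrative:

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None or entry[0] < time.monotonic():
            self._store.pop(key, None)  # evict expired entries lazily
            return None
        return entry[1]

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

cache = TTLCache(ttl_seconds=300)
cache.set("chunk:17:audio_url", "/media/chunks/17.ogg")
print(cache.get("chunk:17:audio_url"))  # /media/chunks/17.ogg
```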

User Experience Optimizations

  • Smart Assignment: Assign chunks based on user expertise and preferences
  • Progress Indicators: Clear progress tracking and feedback
  • Keyboard Shortcuts: Efficient transcription interface
  • Auto-save: Prevent data loss with automatic saving

Quality Optimizations

  • Difficulty Balancing: Mix easy and challenging chunks
  • Context Provision: Provide helpful context and hints
  • Real-time Feedback: Immediate quality feedback
  • Consensus Weighting: Weight transcriptions by contributor reliability

Error Handling & Recovery

Common Error Scenarios

  1. Session Timeout

    • Auto-save work in progress
    • Seamless session renewal
    • Recovery of unsaved work
  2. Network Interruption

    • Offline work capability
    • Automatic retry mechanisms
    • Queue submissions for later
  3. Audio Loading Issues

    • Fallback audio formats
    • Progressive loading
    • Error reporting and alternatives
  4. Consensus Conflicts

    • Human review escalation
    • Weighted voting systems
    • Quality threshold adjustments

Recovery Mechanisms

flowchart LR
    ERROR[Error Detected] --> CLASSIFY{Error Classification}
    
    CLASSIFY -->|Temporary| AUTO_RETRY[Automatic Retry]
    CLASSIFY -->|User Error| USER_GUIDANCE[User Guidance]
    CLASSIFY -->|System Error| ESCALATE[Escalate to Support]
    
    AUTO_RETRY --> SUCCESS{Retry Success?}
    SUCCESS -->|Yes| CONTINUE[Continue Process]
    SUCCESS -->|No| USER_GUIDANCE
    
    USER_GUIDANCE --> RESOLVED{Issue Resolved?}
    RESOLVED -->|Yes| CONTINUE
    RESOLVED -->|No| ESCALATE
    
    ESCALATE --> SUPPORT[Support Intervention]
    SUPPORT --> CONTINUE

This comprehensive transcription workflow ensures high-quality data collection while providing an engaging and efficient experience for contributors.

Shrutik System Architecture

This diagram shows the high-level architecture of the Shrutik voice data collection platform, including all major components and their interactions.

Overall System Architecture

graph TB
    subgraph "Client Layer"
        WEB[Web Browser]
        MOBILE[Mobile App]
        API_CLIENT[API Client]
    end

    subgraph "Load Balancer & Proxy"
        NGINX[Nginx<br/>Load Balancer]
    end

    subgraph "Application Layer"
        FRONTEND[React Frontend<br/>Next.js]
        BACKEND[FastAPI Backend<br/>Python]
        WORKER[Celery Workers<br/>Background Jobs]
    end

    subgraph "Caching Layer"
        REDIS[(Redis<br/>Cache & Queue)]
    end

    subgraph "Database Layer"
        POSTGRES[(PostgreSQL<br/>Primary Database)]
        REPLICA[(PostgreSQL<br/>Read Replica)]
    end

    subgraph "Storage Layer"
        LOCAL_STORAGE[Local File Storage<br/>Audio Files]
        CDN[CDN<br/>Static Assets]
        BACKUP[Backup Storage<br/>S3/MinIO]
    end

    subgraph "External Services"
        SMTP[Email Service<br/>SMTP]
        MONITORING[Monitoring<br/>Prometheus/Grafana]
        LOGGING[Logging<br/>ELK Stack]
    end

    subgraph "Processing Pipeline"
        AUDIO_PROC[Audio Processing<br/>Librosa/PyDub]
        CONSENSUS[Consensus Engine<br/>Quality Control]
        EXPORT[Data Export<br/>Multiple Formats]
    end

    %% Client connections
    WEB --> NGINX
    MOBILE --> NGINX
    API_CLIENT --> NGINX

    %% Load balancer routing
    NGINX --> FRONTEND
    NGINX --> BACKEND

    %% Application connections
    FRONTEND --> BACKEND
    BACKEND --> REDIS
    BACKEND --> POSTGRES
    BACKEND --> REPLICA
    WORKER --> REDIS
    WORKER --> POSTGRES
    WORKER --> LOCAL_STORAGE

    %% Processing connections
    WORKER --> AUDIO_PROC
    WORKER --> CONSENSUS
    BACKEND --> EXPORT

    %% Storage connections
    BACKEND --> LOCAL_STORAGE
    FRONTEND --> CDN
    LOCAL_STORAGE --> BACKUP

    %% External service connections
    BACKEND --> SMTP
    BACKEND --> MONITORING
    BACKEND --> LOGGING

    %% Styling
    classDef client fill:#e1f5fe
    classDef app fill:#e8f5e8
    classDef data fill:#fff3e0
    classDef external fill:#f3e5f5
    classDef processing fill:#e0f2f1

    class WEB,MOBILE,API_CLIENT client
    class FRONTEND,BACKEND,WORKER app
    class POSTGRES,REPLICA,REDIS,LOCAL_STORAGE data
    class SMTP,MONITORING,LOGGING external
    class AUDIO_PROC,CONSENSUS,EXPORT processing

Component Descriptions

Client Layer

  • Web Browser: Primary interface for contributors using React/Next.js frontend
  • Mobile App: Future mobile application for voice contributions
  • API Client: External integrations and automated systems

Load Balancer & Proxy

  • Nginx: Handles SSL termination, load balancing, and static file serving
  • Routes requests to appropriate backend services
  • Implements rate limiting and security headers

Application Layer

  • React Frontend: User interface built with Next.js and TypeScript
  • FastAPI Backend: RESTful API server with automatic documentation
  • Celery Workers: Background job processing for audio tasks

Caching Layer

  • Redis: Serves multiple purposes:
    • Session storage and caching
    • Message queue for Celery
    • Rate limiting counters
    • Real-time data caching
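The "rate limiting counters" use can be sketched as a sliding-window limiter. In production this is typically an INCR/EXPIRE pattern on a Redis key; the in-memory version below shows the same logic, with illustrative limit and window values:

```python
import time
from collections import deque

class RateLimiter:
    def __init__(self, limit: int = 10, window_s: float = 60.0):
        self.limit, self.window_s = limit, window_s
        self._hits: dict[str, deque] = {}

    def allow(self, key: str) -> bool:
        now = time.monotonic()
        hits = self._hits.setdefault(key, deque())
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()  # drop hits that fell outside the window
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

rl = RateLimiter(limit=3, window_s=60)
print([rl.allow("user:42") for _ in range(4)])  # [True, True, True, False]
```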

Database Layer

  • PostgreSQL Primary: Main database for all application data
  • PostgreSQL Replica: Read-only replica for analytics and reporting
  • Supports horizontal scaling and high availability

Storage Layer

  • Local File Storage: Audio files and uploads stored locally or on network storage
  • CDN: Content delivery network for static assets and optimized audio delivery
  • Backup Storage: Automated backups to S3-compatible storage

External Services

  • Email Service: SMTP for user notifications and system alerts
  • Monitoring: Prometheus and Grafana for system monitoring
  • Logging: Centralized logging with ELK stack or similar

Processing Pipeline

  • Audio Processing: Intelligent audio chunking and format conversion
  • Consensus Engine: Quality control and transcription consensus algorithms
  • Data Export: Multiple format support for dataset export

Data Flow Patterns

1. Voice Recording Flow

User → Frontend → Backend → Storage → Worker → Audio Processing → Database

2. Transcription Flow

User → Frontend → Backend → Database → Consensus Engine → Quality Metrics

3. API Request Flow

Client → Nginx → Backend → Cache/Database → Response → Client

🚀 Scalability Considerations

Horizontal Scaling

  • Frontend: Multiple instances behind load balancer
  • Backend: Stateless API servers can be scaled horizontally
  • Workers: Auto-scaling based on queue length
  • Database: Read replicas for query distribution

Performance Optimization

  • Caching: Multi-layer caching strategy with Redis
  • CDN: Global content delivery for static assets
  • Database: Connection pooling and query optimization
  • Background Jobs: Async processing for heavy operations

High Availability

  • Load Balancing: Multiple instances of each service
  • Database Replication: Primary-replica setup with failover
  • Health Checks: Automated monitoring and alerting
  • Backup Strategy: Regular automated backups

Security Architecture

Authentication & Authorization

  • JWT-based authentication with refresh tokens
  • Role-based access control (RBAC)
  • API key authentication for external clients

Data Protection

  • HTTPS/TLS encryption for all communications
  • Database encryption at rest
  • Secure file upload validation
  • Input sanitization and validation

Network Security

  • Firewall rules and network segmentation
  • Rate limiting and DDoS protection
  • Security headers and CORS configuration
  • Regular security audits and updates

Monitoring & Observability

Metrics Collection

  • Application performance metrics
  • System resource monitoring
  • Business metrics and analytics
  • Error tracking and alerting

Logging Strategy

  • Structured logging with correlation IDs
  • Centralized log aggregation
  • Log retention and archival policies
  • Security event logging

Health Checks

  • Service health endpoints
  • Database connectivity checks
  • External service dependency monitoring
  • Automated failover mechanisms

Deployment Architecture

Development Environment

  • Local development with Docker Compose
  • Hot reload for rapid development
  • Isolated test databases
  • Mock external services

Staging Environment

  • Production-like environment for testing
  • Automated deployment pipeline
  • Integration testing
  • Performance testing

Production Environment

  • Multi-zone deployment for high availability
  • Blue-green deployment strategy
  • Automated rollback capabilities
  • Comprehensive monitoring and alerting

This architecture supports Shrutik’s mission of democratizing voice technology while maintaining high performance, security, and scalability standards.

Contributing to Shrutik

Thank you for your interest in contributing to Shrutik! This guide will help you get started with contributing to our open-source voice data collection platform.

Ways to Contribute

Voice Data Contribution

  • Record Voice Samples: Contribute voice recordings in your native language
  • Transcribe Audio: Help transcribe audio clips to improve dataset quality
  • Quality Review: Review and validate transcriptions from other contributors
  • Language Support: Help add support for new languages and dialects

Code Contribution

  • Bug Fixes: Fix reported issues and improve stability
  • Feature Development: Implement new features and enhancements
  • Performance Optimization: Improve system performance and scalability
  • Testing: Write and improve test coverage
  • Documentation: Improve code documentation and API references

Documentation

  • User Guides: Improve setup and usage documentation
  • Developer Docs: Enhance technical documentation
  • Translations: Translate documentation to other languages
  • Tutorials: Create tutorials and examples

Design & UX

  • UI/UX Improvements: Enhance user interface and experience
  • Accessibility: Improve accessibility features
  • Mobile Responsiveness: Optimize for mobile devices
  • Branding: Improve visual design and branding

Getting Started

1. Set Up Development Environment

Follow our Local Development Guide to set up your development environment. Alternatively, you can set everything up with Docker; see the Docker Local Setup guide.

2. Find an Issue

  • Browse open issues
  • Look for issues labeled good first issue for beginners
  • Check issues labeled help wanted for areas needing assistance
  • Join our Discord to discuss ideas

3. Fork and Clone

# Fork the repository on GitHub
# Then clone your fork
git clone https://github.com/YOUR_USERNAME/shrutik.git
cd shrutik

# Add upstream remote
git remote add upstream https://github.com/Onuronon-lab/Shrutik.git

Development Workflow

1. Create a Branch

Important: All PRs must be submitted to the deployment-dev branch, not master.

Before starting development, please review our Engineering Conventions for branch naming, commit messages, and coding standards.

# Update deployment-dev branch
git checkout deployment-dev
git pull origin deployment-dev

# Create a feature branch following our naming convention
git checkout -b feat/your-feature-name
# or for bug fixes
git checkout -b fix/issue-number-description

2. Make Changes

  • Write code
  • Add tests for new functionality
  • Update documentation as needed
  • Ensure all tests pass

3. Commit Changes

Follow our Engineering Conventions for commit message format.

# Stage your changes
git add .

# Commit with a descriptive message following conventional commits
git commit -m "feat: add voice recording validation

- Add audio quality validation
- Implement duration checks
- Add error handling for invalid formats
- Update tests and documentation

Fixes #123"

4. Push and Create PR

# Push to your fork
git push origin feat/your-feature-name

# Create a Pull Request to deployment-dev on GitHub

PR Guidelines:

  • Target the deployment-dev branch (not master!)
  • Fill out the PR template completely
  • Ensure all CI checks pass
  • Code must be formatted (see Code Formatting section)

Code Formatting

We use automated code formatters to maintain consistent code style and eliminate formatting-related merge conflicts.

Tools & Configuration

  • Backend (Python): Black (88 chars), isort, flake8
  • Frontend (TypeScript/React): Prettier (100 chars), ESLint

Quick Setup

1. Install formatting tools:

pip install black isort flake8
cd frontend && npm install && cd ..

2. Set up pre-commit hooks (recommended):

./scripts/setup_pre_commit.sh

This auto-formats your code on every commit!

Using Pre-commit Hooks

Once set up, just commit normally:

git add .
git commit -m "feat: your changes"
# ✨ Code is automatically formatted before commit!

Before Submitting a PR

If not using pre-commit hooks, format manually:

# Format entire codebase
./scripts/format_code.sh

# Review changes
git diff

# Commit and push
git add .
git commit -m "style: format code"
git push

Manual Formatting Commands

# Format everything
./scripts/format_code.sh

# Backend only
black app/ tests/ scripts/
isort app/ tests/ scripts/

# Frontend only
cd frontend
npm run format
npm run lint:fix

CI/CD Checks

Our GitHub Actions workflow automatically checks formatting on all PRs to deployment-dev. If formatting fails:

./scripts/format_code.sh
git add .
git commit -m "style: fix formatting"
git push

Skipping Hooks (Emergency Only)

git commit --no-verify -m "emergency fix"

Note: Use sparingly! The CI will still check formatting.

Troubleshooting

  • Tools not found: pip install black isort flake8
  • Prettier not found: cd frontend && npm install
  • Hooks not running: pre-commit install

Style Guidelines

Python

# ✅ Good (Black formatted)
def calculate_total(items: list[dict], tax_rate: float = 0.1) -> float:
    """Calculate total with tax."""
    subtotal = sum(item["price"] for item in items)
    return subtotal * (1 + tax_rate)

TypeScript/React

// ✅ Good (Prettier formatted)
const UserCard = ({ name, email }: UserCardProps) => {
  return (
    <div className="user-card">
      <h2>{name}</h2>
      <p>{email}</p>
    </div>
  );
};

Benefits:

  • ✅ Zero formatting conflicts in PRs
  • ✅ Faster code reviews (focus on logic)
  • ✅ Consistent codebase
  • ✅ Automatic on every commit

For more details, see docs/FORMATTING.md

Commit Message Guidelines

We follow the Conventional Commits specification as outlined in our Engineering Conventions:

<type>[optional scope]: <description>

[optional body]

[optional footer(s)]

Types

  • feat: New feature
  • fix: Bug fix
  • docs: Documentation changes
  • style: Code style changes (formatting, etc.)
  • refactor: Code refactoring
  • test: Adding or updating tests
  • chore: Maintenance tasks

Examples

feat(auth): add OAuth2 authentication
fix(api): resolve transcription submission error
docs(readme): update installation instructions
test(voice): add unit tests for audio processing
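A subject line in this format can be checked mechanically, for example in a commit-msg hook. This sketch mirrors the type list above; the scope charset and the optional `!` breaking-change marker follow the Conventional Commits spec:

```python
import re

TYPES = "feat|fix|docs|style|refactor|test|chore"
PATTERN = re.compile(rf"^({TYPES})(\([a-z0-9-]+\))?(!)?: .+")

def is_valid_subject(subject: str) -> bool:
    """True when the subject matches `<type>[optional scope]: <description>`."""
    return bool(PATTERN.match(subject))

print(is_valid_subject("feat(auth): add OAuth2 authentication"))  # True
print(is_valid_subject("added oauth stuff"))                      # False
```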

Testing Guidelines

Running Tests

# Backend tests
pytest

# Frontend tests
cd frontend && npm test

# Integration tests
pytest tests/integration/

# E2E tests
cd frontend && npm run test:e2e

Writing Tests

Backend Tests (Python)

# tests/test_transcription.py
import pytest
from app.services.transcription_service import TranscriptionService

def test_create_transcription(db_session):
    """Test transcription creation."""
    service = TranscriptionService(db_session)
    transcription = service.create_transcription(
        chunk_id=1,
        user_id=1,
        text="Test transcription"
    )
    assert transcription.text == "Test transcription"

Frontend Tests (TypeScript/Jest)

// frontend/src/__tests__/VoiceRecorder.test.tsx
import { render, screen } from '@testing-library/react';
import VoiceRecorder from '../components/VoiceRecorder';

test('renders voice recorder component', () => {
  render(<VoiceRecorder />);
  const recordButton = screen.getByRole('button', { name: /record/i });
  expect(recordButton).toBeInTheDocument();
});

Test Coverage

  • Maintain minimum 80% test coverage
  • Write tests for all new features
  • Include edge cases and error scenarios
  • Test both happy path and error conditions

Coding Standards

Please refer to our Engineering Conventions for detailed coding standards and philosophy. The following sections provide specific implementation guidelines.

Python (Backend)

Code Style

  • Follow PEP 8 style guide
  • Use Black for code formatting
  • Use isort for import sorting
  • Use flake8 for linting
# Format code
black app/
isort app/

# Check linting
flake8 app/

Code Structure

# Good: Clear function with type hints and docstring
from typing import Optional
from sqlalchemy.orm import Session

def get_user_by_email(db: Session, email: str) -> Optional[User]:
    """
    Retrieve user by email address.

    Args:
        db: Database session
        email: User email address

    Returns:
        User object if found, None otherwise
    """
    return db.query(User).filter(User.email == email).first()

Error Handling

# Good: Specific exception handling
try:
    user = create_user(db, user_data)
except ValidationError as e:
    logger.error(f"User validation failed: {e}")
    raise HTTPException(status_code=400, detail=str(e))
except DatabaseError as e:
    logger.error(f"Database error: {e}")
    raise HTTPException(status_code=500, detail="Internal server error")

TypeScript/React (Frontend)

Code Style

  • Use Prettier for code formatting
  • Use ESLint for linting
  • Follow React best practices
  • Use TypeScript for type safety
# Format code
npm run format

# Check linting
npm run lint

Component Structure

// Good: Typed React component with proper structure
interface VoiceRecorderProps {
  onRecordingComplete: (audioBlob: Blob) => void;
  maxDuration?: number;
}

export const VoiceRecorder: React.FC<VoiceRecorderProps> = ({
  onRecordingComplete,
  maxDuration = 60
}) => {
  const [isRecording, setIsRecording] = useState(false);

  // Component logic here

  return (
    <div className="voice-recorder">
      {/* JSX here */}
    </div>
  );
};

Database

Migrations

# Good: Clear migration with proper naming
"""Add voice quality metrics

Revision ID: 001_add_voice_quality
Revises: 000_initial
Create Date: 2024-01-01 12:00:00.000000

"""
from alembic import op
import sqlalchemy as sa

def upgrade():
    op.add_column('transcriptions',
        sa.Column('quality_score', sa.Float, nullable=True))

def downgrade():
    op.drop_column('transcriptions', 'quality_score')

Documentation Standards

Code Documentation

  • Use clear, descriptive docstrings
  • Document all public functions and classes
  • Include parameter types and return values
  • Provide usage examples for complex functions

API Documentation

  • Use OpenAPI/Swagger annotations
  • Document all endpoints, parameters, and responses
  • Include example requests and responses
  • Document error codes and messages

User Documentation

  • Write clear, step-by-step instructions
  • Include screenshots and examples
  • Test all instructions on a fresh environment
  • Keep documentation up-to-date with code changes

Code Review Process

Submitting a Pull Request

  1. Title: Clear, descriptive title
  2. Description: Explain what and why
  3. Testing: Describe how you tested the changes
  4. Screenshots: Include for UI changes
  5. Breaking Changes: Document any breaking changes

PR Template

## Description

Brief description of changes

## Type of Change

- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
- [ ] Documentation update

## Testing

- [ ] Unit tests pass
- [ ] Integration tests pass
- [ ] Manual testing completed

## Screenshots (if applicable)

## Checklist

- [ ] Code follows style guidelines
- [ ] Self-review completed
- [ ] Documentation updated
- [ ] Tests added/updated

Review Criteria

Reviewers will check for:

  • Functionality: Does the code work as intended?
  • Code Quality: Is the code clean and maintainable?
  • Testing: Are there adequate tests?
  • Documentation: Is documentation updated?
  • Performance: Are there any performance implications?
  • Security: Are there any security concerns?

Internationalization

Adding New Languages

  1. Language Configuration: Add language to app/models/language.py
  2. Frontend Translations: Add translations to frontend/src/locales/
  3. Backend Messages: Update error messages and notifications
  4. Documentation: Translate key documentation

Translation Guidelines

  • Use proper Unicode support for all scripts
  • Test with right-to-left languages
  • Consider cultural context in translations
  • Use native speakers for translation review

🎤 Voice Data Guidelines

Recording Quality

  • Environment: Quiet, echo-free environment
  • Equipment: Good quality microphone
  • Format: WAV or high-quality MP3
  • Duration: 2-10 seconds per clip
  • Content: Clear, natural speech

Transcription Guidelines

  • Accuracy: Transcribe exactly what is spoken
  • Formatting: Follow language-specific conventions
  • Punctuation: Include appropriate punctuation
  • Quality Rating: Rate audio quality honestly

Recognition

Contributor Recognition

  • Contributors are listed in our CONTRIBUTORS.md file
  • Significant contributors may be invited to join the core team
  • We highlight contributions in our release notes
  • Annual contributor appreciation events

Badges and Achievements

  • First-time contributor badge
  • Language champion badges
  • Code contributor levels
  • Community helper recognition

Getting Help

Community Support

  • Discord: Join our server for real-time help
  • GitHub Discussions: Ask questions and share ideas
  • Office Hours: Weekly community calls (schedule in Discord)

Mentorship Program

  • New contributors can request mentorship
  • Experienced contributors can volunteer as mentors
  • Structured onboarding for major contributions

Contact

  • General Questions: community@shrutik.org
  • Technical Issues: dev@shrutik.org
  • Security Issues: security@shrutik.org (private)

📜 Code of Conduct

We are committed to providing a welcoming and inclusive environment. Please read our Code of Conduct before contributing.

Our Standards

  • Be Respectful: Treat everyone with respect and kindness
  • Be Inclusive: Welcome contributors from all backgrounds
  • Be Collaborative: Work together towards common goals
  • Be Patient: Help others learn and grow

📄 License

By contributing to Shrutik, you agree that your contributions will be licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. This ensures that all contributions remain available for educational and non-commercial use while requiring attribution to the original creators.


Thank you for contributing to Shrutik! Together, we’re building a more inclusive digital future. 🎉

Engineering Conventions & Philosophy

Why this exists

This document is not about rules for the sake of rules. It exists to reduce confusion, remove unnecessary discussion, and protect engineering quality.

Conventions are not constraints; they are agreements. Agreements let teams move fast without stepping on each other.


1. Philosophy (Read this first)

  • Engineering is about clarity, not cleverness.
  • If something needs explanation, it’s already slightly wrong.
  • Conventions exist so that:
    • No one has to ask questions repeatedly
    • No one has to justify decisions emotionally
    • The system explains itself

We don’t optimize for personal preference. We optimize for collective understanding and future maintainability.

Everything here follows one principle:

Do not raise unnecessary questions for the next person reading your work.

That next person might be your teammate. Or future you at 3 AM.


2. Branch Naming Convention

Branch names must follow this format:

<prefix>/<short-description>

Allowed prefixes

  • feat/ → New features
  • fix/ → Bug fixes
  • hotfix/ → Critical production fixes
  • docs/ → Documentation-only changes

Examples

  • feat/auth-verification
  • fix/password-reset-token
  • docs/api-guidelines

Why this matters

  • Branch lists should be scannable at a glance
  • Prefixes instantly communicate intent
  • Consistency removes cognitive load

If every branch uses a different word (feature/, new/, stuff/), the system slowly becomes noisy. Noise kills velocity.
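Because the format is mechanical, it can be enforced by a simple check. A sketch of such a validator (hypothetical; nothing in the repo currently runs this):

```python
import re

ALLOWED_PREFIXES = ("feat", "fix", "hotfix", "docs")
BRANCH_RE = re.compile(rf"^({'|'.join(ALLOWED_PREFIXES)})/[a-z0-9][a-z0-9-]*$")

def valid_branch(name: str) -> bool:
    """True if a branch name follows <prefix>/<short-description>."""
    return BRANCH_RE.match(name) is not None

print(valid_branch("feat/auth-verification"))  # True
print(valid_branch("stuff/random-things"))     # False
```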


3. Commit Message Convention

We follow Conventional Commits.

Format:

<type>(<scope>): <clear, concrete description>

Allowed types

  • feat → New functionality
  • fix → Bug fix
  • docs → Documentation
  • refactor → Code restructure without behavior change
  • test → Tests
  • chore → Tooling / config

Examples

  • feat(auth): add email verification flow
  • fix(auth): prevent reset token reuse
  • docs(readme): add setup instructions
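Messages in this format can be linted mechanically. A minimal sketch (hypothetical; covers only the types listed above, not the full Conventional Commits grammar):

```python
import re

COMMIT_RE = re.compile(
    r"^(feat|fix|docs|refactor|test|chore)"  # type
    r"(\([a-z0-9-]+\))?"                     # optional scope
    r": .+"                                  # description
)

def valid_commit(message: str) -> bool:
    """True if a commit message matches <type>(<scope>): <description>."""
    return COMMIT_RE.match(message) is not None

print(valid_commit("feat(auth): add email verification flow"))  # True
print(valid_commit("made it way more robust"))                  # False
```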

What commit messages are not

  • Not marketing
  • Not self-evaluation
  • Not emotion

Avoid words like:

  • strong
  • robust
  • powerful
  • improved (without context)

A commit message should describe what changed, not how good it feels.

If something is buggy → it’s wrong. If something works → that’s the baseline, not an achievement.


4. Pull Requests

  • A PR should do one logical thing
  • The title should summarize the change
  • The description should answer:
    • What changed?
    • Why was it needed?

No philosophy debates inside PRs. If a rule is violated, reviewers will request a change rather than open a discussion.


5. Source of Truth

  • WIP project docs are not the source of truth
  • External standards, official documentation, and established practices take priority

Always question:

  • outdated docs
  • informal assumptions
  • “this is how we’ve been doing it”

Engineering grows by questioning, not by accepting.


6. Ego & Engineering

  • Software is never perfect
  • Everything has limits
  • Everything breaks eventually

That’s exactly why we aim for:

  • clarity over cleverness
  • simplicity over ego
  • consistency over preference

Having strong opinions is good. Letting conventions decide instead of ego is better.


7. Final Note

These conventions are not optional. They exist so we can:

  • move faster
  • argue less
  • build things that last

If something here feels strict, that’s intentional. Discipline is what gives freedom later.

Clean systems scale. Messy ones don’t.

Follow the convention. Save your energy for real problems.

Code of Conduct

Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

Our Standards

Examples of behavior that contributes to a positive environment for our community include:

  • Demonstrating empathy and kindness toward other people
  • Being respectful of differing opinions, viewpoints, and experiences
  • Giving and gracefully accepting constructive feedback
  • Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
  • Focusing on what is best not just for us as individuals, but for the overall community
  • Using welcoming and inclusive language
  • Being respectful of differing cultural backgrounds and languages
  • Encouraging and supporting new contributors

Examples of unacceptable behavior include:

  • The use of sexualized language or imagery, and sexual attention or advances of any kind
  • Trolling, insulting or derogatory comments, and personal or political attacks
  • Public or private harassment
  • Publishing others’ private information, such as a physical or email address, without their explicit permission
  • Discrimination or harassment based on any protected characteristic
  • Other conduct which could reasonably be considered inappropriate in a professional setting

Language and Cultural Sensitivity

Given Shrutik’s mission to support underrepresented languages and communities:

  • Be respectful of all languages, dialects, and accents
  • Avoid making assumptions about language proficiency or cultural backgrounds
  • Be patient with non-native speakers of any language
  • Celebrate linguistic diversity and cultural differences
  • Provide translations or explanations when using technical terms
  • Be mindful that humor and expressions may not translate across cultures

Voice Data Contribution Guidelines

When contributing voice data or transcriptions:

  • Respect the privacy and consent of all speakers
  • Do not submit recordings without proper consent
  • Be honest about audio quality and transcription accuracy
  • Respect cultural and religious sensitivities in content
  • Follow platform guidelines for appropriate content
  • Report any inappropriate or harmful content

Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.

Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at conduct@shrutik.org. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

1. Correction

Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

2. Warning

Community Impact: A violation through a single incident or series of actions.

Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

3. Temporary Ban

Community Impact: A serious violation of community standards, including sustained inappropriate behavior.

Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

4. Permanent Ban

Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

Consequence: A permanent ban from any sort of public interaction within the community.

Reporting Guidelines

If you experience or witness unacceptable behavior, please report it by:

  1. Email: onuronon.dev@gmail.com
  2. Discord: Direct message to moderators
  3. GitHub: Use the report feature or contact maintainers

When reporting, please include:

  • Your contact information
  • Names (usernames, real names) of any individuals involved
  • Your account of what occurred, including any available records (screenshots, logs, etc.)
  • Any additional information that may be helpful

Response Process

  1. Acknowledgment: We will acknowledge receipt of your report within 24 hours
  2. Investigation: We will investigate the matter thoroughly and fairly
  3. Decision: We will make a decision based on our guidelines and communicate it to all parties
  4. Follow-up: We will follow up to ensure the resolution is effective

Appeals Process

If you disagree with a moderation decision:

  1. Send an appeal to appeals@shrutik.org within 30 days
  2. Include your reasoning and any additional information
  3. The appeal will be reviewed by different community leaders
  4. The appeal decision is final

Community Resources

Support Channels

  • Discord Community: https://discord.gg/9hZ9eW8ARk
  • GitHub Discussions: https://github.com/Onuronon-lab/Shrutik/discussions

Recognition

We believe in recognizing positive contributions to our community:

  • Community Champions: Monthly recognition for helpful community members
  • Mentorship Program: Opportunities to guide new contributors
  • Speaking Opportunities: Invitations to represent Shrutik at events
  • Contributor Spotlight: Featured stories of community members

Continuous Improvement

This Code of Conduct is a living document. We regularly review and update it based on:

  • Community feedback and suggestions
  • Evolving best practices in open source communities
  • Lessons learned from enforcement experiences
  • Changes in our community’s needs and composition

To suggest improvements, please:

  1. Open an issue on GitHub with the “code-of-conduct” label
  2. Join discussions in our Discord #community-guidelines channel
  3. Email suggestions to conduct@shrutik.org

Acknowledgments

This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.

Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder.

For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations.

Contact Information

  • General Conduct Questions: onuronon.dev@gmail.com

Remember: We’re all here because we believe in making voice technology more inclusive. Let’s work together to create a welcoming space where everyone can contribute their unique perspectives and talents.

Thank you for helping make Shrutik a welcoming, inclusive community for everyone! 🎤✨

Contributors

Thank you to all the amazing people who have contributed to Shrutik (শ্রুতিক)! This project exists because of the collective effort of developers, linguists, designers, and community members from around the world.

Core Team

Project Founders

  • [Ifrun Kader Ruhin] - Project Creator & Lead Developer
    • GitHub: @ifrunruhin12
    • Role: Architecture, Backend Development, Project Vision

Core Maintainers

  • [Maintainer Name] - Lead Frontend Developer

    • GitHub: @maintainer
    • Role: Frontend Architecture, UI/UX Design
  • [Maintainer Name] - DevOps & Infrastructure

    • GitHub: @devops
    • Role: Deployment, CI/CD, Performance Optimization

💻 Code Contributors

Major Contributors (50+ commits)

  • [Contributor Name] - @username
    • Contributions: Audio processing pipeline, performance optimizations
    • Languages: Python, JavaScript

Regular Contributors (10+ commits)

  • [Contributor Name] - @username

    • Contributions: API development, database design
    • Languages: Python, SQL
  • [Contributor Name] - @username

    • Contributions: Frontend components, accessibility improvements
    • Languages: TypeScript, React

First-Time Contributors

  • [Furqan Ahmed] - @furqanRupom
    • First contribution: Bug fix in audio validation
    • Date: 2025-11-02

Voice Data Contributors

Language Champions

These contributors have made significant voice data contributions in their native languages:

Bengali (বাংলা)

  • [Contributor Name] - 500+ recordings, 1000+ transcriptions
  • [Contributor Name] - 300+ recordings, 800+ transcriptions
  • [Contributor Name] - 200+ recordings, 600+ transcriptions

Quality Reviewers

  • [Reviewer Name] - 2000+ transcription reviews
  • [Reviewer Name] - 1500+ transcription reviews
  • [Reviewer Name] - 1200+ transcription reviews

Documentation Contributors

Documentation Team

  • [Doc Contributor] - @username

    • Contributions: API documentation, user guides
    • Specialty: Technical writing
  • [Doc Contributor] - @username

    • Contributions: Deployment guides, troubleshooting
    • Specialty: DevOps documentation

Translators

  • [Translator Name] - Bengali translation lead
  • [Translator Name] - Hindi translation lead
  • [Translator Name] - Tamil translation lead

Design Contributors

UI/UX Designers

  • [Designer Name] - @username

    • Contributions: User interface design, user experience research
    • Tools: Figma, Adobe XD
  • [Designer Name] - @username

    • Contributions: Logo design, branding, visual identity
    • Tools: Illustrator, Photoshop

Accessibility Experts

  • [A11y Expert] - @username
    • Contributions: Accessibility audits, WCAG compliance
    • Specialty: Screen reader optimization

Research Contributors

Academic Researchers

  • Dr. [Researcher Name] - [University/Institution]

    • Contributions: Voice quality metrics, consensus algorithms
    • Publications: [Link to relevant papers]
  • Prof. [Researcher Name] - [University/Institution]

    • Contributions: Linguistic analysis, dialect classification
    • Expertise: Computational linguistics

Data Scientists

  • [Data Scientist] - @username
    • Contributions: Quality control algorithms, statistical analysis
    • Tools: Python, R, Machine Learning

Discord Moderators

  • [Moderator Name] - Senior Moderator
  • [Moderator Name] - Community Moderator
  • [Moderator Name] - Technical Support Moderator

🏅 Special Recognition

Milestone Achievements

🥇 Gold Contributors (Exceptional Impact)

  • [Contributor Name] - First to reach 1000 voice contributions
  • [Contributor Name] - Implemented critical security features
  • [Contributor Name] - Led successful community outreach campaign

🥈 Silver Contributors (Significant Impact)

  • [Contributor Name] - Major performance optimizations
  • [Contributor Name] - Comprehensive testing framework
  • [Contributor Name] - Multi-language support implementation

🥉 Bronze Contributors (Notable Impact)

  • [Contributor Name] - Bug fixes and stability improvements
  • [Contributor Name] - Documentation improvements
  • [Contributor Name] - Community support and mentoring

Annual Awards (2026)

🏆 Contributor of the Year

[Winner Name] - For outstanding contributions across code, community, and voice data

🌟 Rising Star

[Winner Name] - For exceptional growth and impact as a new contributor

🤝 Community Champion

[Winner Name] - For building bridges between technical and linguistic communities

Technical Excellence

[Winner Name] - For innovative solutions and architectural improvements

📊 Contribution Statistics

Overall Stats (as of 2026)

  • Total Contributors: x+
  • Code Contributors: x
  • Voice Contributors: x+
  • Documentation Contributors: x
  • Countries Represented: x+
  • Languages Supported: x

🎯 Contribution Types

Code Contributions

  • Backend Development: x contributors
  • Frontend Development: x contributors
  • DevOps & Infrastructure: x contributors
  • Testing & QA: x contributors
  • Security: x contributors

Non-Code Contributions

  • Voice Recordings: x contributors
  • Transcriptions: x contributors
  • Quality Reviews: x contributors
  • Documentation: x contributors
  • Translation: x contributors
  • Design: x contributors
  • Community Management: x contributors

🚀 How to Join

Want to see your name here? Here’s how you can contribute:

For Developers

  1. Check our Contributing Guide
  2. Look for issues labeled good first issue
  3. Join our Discord for technical discussions

For Voice Contributors

  1. Visit our platform at onuronon.org
  2. Register and start recording in your native language
  3. Help transcribe and review audio from others

For Designers

  1. Check our design needs in GitHub issues
  2. Share your portfolio and design ideas
  3. Help improve user experience and accessibility

For Linguists & Researchers

  1. Join our research discussions on Discord
  2. Contribute to quality metrics and algorithms
  3. Help with linguistic analysis and validation

🙏 Acknowledgments

Special Thanks

  • Open Source Community - For the amazing tools and libraries we build upon
  • Academic Partners - For research collaboration and validation
  • Early Adopters - For testing and feedback during development
  • Funding Partners - For supporting the project’s growth
  • Language Communities - For trusting us with their voices and stories

Inspiration

This project is inspired by the belief that technology should serve all communities, regardless of the language they speak. We’re grateful to everyone who shares this vision and contributes to making it a reality.

Recognition Requests

If you’ve contributed to Shrutik but don’t see your name here:

  1. Open an issue with the “recognition” label
  2. Email us at onuronon.dev@gmail.com
  3. Message us on Discord

We want to make sure everyone gets the recognition they deserve!

🔄 Updates

This file is updated monthly to recognize new contributors. The next update is scheduled for the first week of each month.

Last Updated: October 2025


Thank you to everyone who makes Shrutik possible! Your contributions, big and small, are building a more inclusive digital future. ✨

“Alone we can do so little; together we can do so much.” - Helen Keller

Troubleshooting

This guide covers common issues and their solutions when working with Shrutik.

Docker Issues

Services Won’t Start

Problem: Docker services fail to start or crash immediately.

Solutions:

# Check logs for all services
docker compose logs -f

# Check logs for a specific service
docker compose logs -f backend
docker compose logs -f postgres
docker compose logs -f redis

# Restart all services
docker compose restart

# Clean restart (removes containers, networks, and volumes)
docker compose down -v --remove-orphans
docker system prune -f  # optional: remove unused Docker resources
docker compose up -d    # start services again

Port Already in Use

Problem: Error messages about ports 3000, 5432, 6379, or 8000 being in use.

Solutions:

# Find processes using ports
sudo lsof -i :8000
sudo lsof -i :3000
sudo lsof -i :5432
sudo lsof -i :6379

# Kill processes using specific ports
sudo lsof -ti:8000 | xargs kill -9
sudo lsof -ti:3000 | xargs kill -9

# Or use netstat
netstat -tulpn | grep :8000

Environment Configuration

Problem: Migrations fail because .env points services at localhost instead of the Docker service names.

Solution: Inside a container, localhost refers to the container itself, so the database and Redis hosts must use the Compose service names (postgres and redis):

# Incorrect 
DATABASE_URL=postgresql://postgres:password@localhost:5432/voice_collection

# Correct 
DATABASE_URL=postgresql://postgres:password@postgres:5432/voice_collection

# Incorrect 
REDIS_URL=redis://localhost:6379/0

# Correct 
REDIS_URL=redis://redis:6379/0

Database Migrations Not Applied

Problems:

  • alembic upgrade head was not run, or it failed.

  • Tables such as users, recordings, etc. are missing.

  • The application may return errors like: relation "users" does not exist.

Solutions:

# Run database migrations
alembic upgrade head

# Verify tables exist
psql -U postgres -d voice_collection -c "\dt"

⚠️ Always run migrations after configuring environment variables and before starting the backend or running tests.

Database Connection Issues

Problem: Backend can’t connect to PostgreSQL database.

Solutions:

# Check database status
docker compose exec postgres pg_isready -U postgres

# Check database logs
docker compose logs -f postgres

# Reset database and remove containers, volumes, and networks
docker compose down -v --remove-orphans

# Optional: prune unused Docker resources
docker system prune -f

# Start all services
docker compose up -d

# Run database migrations inside the backend container
docker compose exec backend alembic upgrade head

# Or use a custom initialization script if you have one
docker compose exec backend python scripts/init-db.py

Redis Connection Issues

Problem: Backend can’t connect to Redis.

Solutions:

# Test Redis connection
docker compose exec redis redis-cli ping

# Check Redis logs
docker compose logs -f redis

# Restart Redis
docker compose restart redis

Database Issues

Problem: PostgreSQL connection errors in local development.

Solutions:

# Check PostgreSQL status
sudo systemctl status postgresql

# Start PostgreSQL
sudo systemctl start postgresql

# Create database if missing
createdb voice_collection

# Run migrations
alembic upgrade head

Permission Errors

Problem: File permission errors, especially with uploads directory.

Solutions:

# Fix upload directory permissions
mkdir -p uploads
sudo chown -R $USER:$USER uploads/
chmod -R 755 uploads/

# Fix general project permissions
sudo chown -R $USER:$USER .

Application Issues

Admin User Creation Fails

Problem: Cannot create admin user or login fails.

Solutions:

# Ensure the database is migrated
# Local environment
alembic upgrade head  # see Local Database docs for details.

# Docker environment
docker compose exec backend alembic upgrade head  # see Docker Database docs for details

# Create admin user 
# Local
python scripts/create_admin.py --name "AdminUser" --email admin@example.com

# Docker
docker compose exec backend python scripts/create_admin.py --name "AdminUser" --email admin@example.com

# Check users in database
# Local
psql -U postgres -d voice_collection -c "SELECT * FROM users;"

# Docker
docker compose exec postgres psql -U postgres -d voice_collection -c "SELECT * FROM users;"

File Upload Issues

Problem: Audio file uploads fail or return errors.

Solutions:

  1. Check file size: Ensure files are under 100MB (default limit)
  2. Check file format: Supported formats: .wav, .mp3, .m4a, .flac, .webm
  3. Check permissions: Ensure uploads directory is writable
  4. Check disk space: Ensure sufficient disk space available

# Check upload directory
ls -la uploads/
df -h  # Check disk space
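Items 1 and 2 above can be pre-checked before attempting an upload. A sketch (hypothetical helper; the 100MB figure matches the default MAX_FILE_SIZE):

```python
import os

ALLOWED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".flac", ".webm"}
MAX_FILE_SIZE = 100 * 1024 * 1024  # 100MB, the default limit

def upload_ok(path: str) -> bool:
    """Check extension and size before attempting an upload."""
    ext = os.path.splitext(path)[1].lower()
    return ext in ALLOWED_EXTENSIONS and os.path.getsize(path) <= MAX_FILE_SIZE
```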

API Errors

Problem: API endpoints return 500 errors or unexpected responses.

Solutions:

# Check backend logs
docker compose logs -f backend

# Check API health
curl http://localhost:8000/health

# Check specific endpoint
curl -X GET http://localhost:8000/api/auth/me \
  -H "Authorization: Bearer YOUR_TOKEN"

Frontend Issues

Frontend Won’t Load

Problem: Frontend shows blank page or connection errors.

Solutions:

# Check frontend logs
docker compose logs -f frontend

# Verify API connection
curl http://localhost:8000/health

# Check environment variables
cat frontend/.env

Build Errors

Problem: Frontend build fails with dependency or compilation errors.

Solutions:

# Clear node modules and reinstall
cd frontend
rm -rf node_modules package-lock.json
npm install

# Clear Next.js cache
rm -rf .next

# Rebuild
npm run build

Debugging Tips

Enable Debug Logging

Add to your .env file:

DEBUG=true
LOG_LEVEL=DEBUG
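On the backend side, honoring these variables might look like the following (a sketch; the application's actual logging setup may differ):

```python
import logging
import os

# Fall back to INFO when LOG_LEVEL is unset or not a valid level name
level_name = os.getenv("LOG_LEVEL", "INFO").upper()
level = getattr(logging, level_name, logging.INFO)
logging.basicConfig(level=level)

logging.getLogger(__name__).debug("debug logging enabled")
```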

Check Service Health

# Backend health check (works for both local and Docker)
curl http://localhost:8000/health

# Database connection

# Local PostgreSQL
pg_isready -U postgres -d voice_collection

# Docker PostgreSQL
docker compose exec postgres pg_isready -U postgres -d voice_collection

# Redis connection

# Local Redis
redis-cli ping

# Docker Redis
docker compose exec redis redis-cli ping

Monitor Resource Usage

# Docker resource usage
docker stats

# System resource usage
htop
df -h
free -h

Common Local Issues

Port already in use:

# Find and kill process using port 8000
lsof -ti:8000 | xargs kill -9

Database connection issues:

# Check PostgreSQL status
sudo systemctl status postgresql

# Restart PostgreSQL
sudo systemctl restart postgresql

# Create database if missing
createdb voice_collection

# Run migrations
alembic upgrade head

Redis connection issues:

# Check Redis status
redis-cli ping

# Start Redis
redis-server

Getting Help

If you’re still experiencing issues:

  1. Search existing issues: Check GitHub Issues
  2. Create a detailed issue, including:
    • Operating system and version
    • Docker/Docker Compose versions
    • Complete error messages
    • Steps to reproduce
    • Relevant log outputs
  3. Join community: Discord Server
  4. Check documentation: Review relevant sections in this documentation

Frequently Asked Questions

General Questions

What is Shrutik?

Shrutik (শ্রুতিক) is an open-source voice data collection platform designed to help communities build high-quality voice datasets in their native languages. The name “Shrutik” means “listener” in Bengali, reflecting our mission to listen to and preserve diverse voices.

What languages does Shrutik support?

Shrutik is designed to support any language. Currently, it comes pre-configured with Bengali (Bangla), but administrators can easily add support for additional languages through the admin interface.

Is Shrutik free to use?

Yes! Shrutik is free and open-source under the Creative Commons BY-NC-SA 4.0 License. You can use it for learning, education, and non-commercial projects. Commercial use requires separate permission.

Technical Questions

What are the system requirements?

For Docker (Recommended):

  • Docker 20.10+
  • Docker Compose 2.0+
  • 4GB RAM minimum, 8GB recommended
  • 10GB free disk space

For Local Development:

  • Python 3.11+
  • Node.js 18+
  • PostgreSQL 13+
  • Redis 6+
  • 8GB RAM recommended

How do I backup my data?

Database Backup:

# Create database backup
docker compose exec postgres pg_dump -U postgres voice_collection > backup.sql

# Restore from backup
docker compose exec -T postgres psql -U postgres voice_collection < backup.sql

File Uploads Backup:

# Backup uploads directory
tar -czf uploads-backup.tar.gz uploads/

Usage Questions

How do I add a new language?

  1. Log in as an admin user
  2. Go to the admin dashboard
  3. Navigate to “Languages” section
  4. Click “Add Language”
  5. Enter language name and ISO code
  6. Add scripts/texts for that language

What audio formats are supported?

Shrutik supports these audio formats:

  • WAV (recommended for quality)
  • MP3
  • M4A
  • FLAC
  • WebM

What’s the maximum file size for uploads?

The default maximum file size is 100MB. This can be configured in the environment variables:

MAX_FILE_SIZE=104857600  # 100MB in bytes

How does the transcription consensus system work?

Shrutik uses a multi-contributor consensus system:

  1. Multiple users transcribe the same audio
  2. The system compares transcriptions
  3. When transcriptions match (or are very similar), they’re marked as “consensus”
  4. High-consensus transcriptions are considered high-quality data
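The comparison in step 2 can be illustrated with a simple text-similarity ratio. A sketch using the standard library's difflib (illustrative only; the platform's actual consensus algorithm and threshold may differ):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1]; 1.0 means identical transcriptions."""
    return SequenceMatcher(None, a, b).ratio()

def has_consensus(transcriptions: list[str], threshold: float = 0.9) -> bool:
    """True if every pair of transcriptions is at least `threshold` similar."""
    return all(
        similarity(a, b) >= threshold
        for i, a in enumerate(transcriptions)
        for b in transcriptions[i + 1:]
    )

print(has_consensus(["hello world", "hello world", "hello  world"]))  # True
```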

Can I export my data?

Yes! Administrators can export data through the admin API:

  • Audio files and metadata
  • Transcriptions and consensus data
  • User statistics and contributions
  • Quality metrics

Development Questions

How do I contribute to Shrutik?

  1. Fork the repository on GitHub
  2. Set up your development environment
  3. Make your changes
  4. Write tests for new features
  5. Submit a pull request

See our Contributing Guide for detailed instructions.

How do I report bugs?

  1. Check if the issue already exists in GitHub Issues
  2. If not, create a new issue with:
    • Clear description of the problem
    • Steps to reproduce
    • Expected vs actual behavior
    • System information
    • Error logs

How do I request new features?

Create a feature request in GitHub Issues with:

  • Clear description of the feature
  • Use case and benefits
  • Proposed implementation (if you have ideas)

Can I customize the UI?

Yes! The frontend is built with Next.js and React. You can:

  • Modify the existing components
  • Add new pages and features
  • Customize styling and themes
  • Add support for new languages in the UI

Privacy and Security

How is user data protected?

Shrutik implements several security measures:

  • Password hashing with bcrypt
  • JWT token-based authentication
  • Role-based access control
  • Input validation and sanitization
  • CORS protection
  • Rate limiting

Can I run Shrutik offline?

Yes! Shrutik can run completely offline once deployed. All processing happens locally on your infrastructure.

How do I configure HTTPS?

For production deployments, configure HTTPS using:

  • Reverse proxy (nginx, Apache)
  • Load balancer with SSL termination
  • Cloud provider SSL certificates

Example nginx configuration is available in our deployment guides.

Community and Support

Where can I get help?

  1. Documentation: This documentation site
  2. GitHub Issues: For bugs and feature requests
  3. Discord: Join our community
  4. Email: Contact the maintainers

How can I stay updated?

  • Watch the GitHub repository for releases
  • Join our Discord community
  • Follow our social media channels
  • Subscribe to our newsletter (coming soon)

Can I hire someone to help with deployment?

While Shrutik is open-source and free, you can:

  • Hire freelance developers familiar with the stack
  • Contact the core team for consulting services
  • Engage with the community for paid support

Troubleshooting

The application won’t start

See our detailed Troubleshooting Guide for common issues and solutions.

I forgot my admin password

Reset your admin password:

# Using Docker
docker compose exec backend python scripts/create_admin.py --name "AdminUser" --email admin@example.com

# Local development
python scripts/create_admin.py --name "AdminUser" --email admin@example.com

This will create a new admin user or update the existing one.

The database is corrupted

If your database becomes corrupted:

  1. Stop all services
  2. Restore from backup (if available)
  3. Or reset the database:
# Stop and remove all containers, volumes, and networks
docker compose down -v --remove-orphans

# Optional: prune unused Docker resources
docker system prune -f

# Start services (build images if necessary)
docker compose up -d --build

# Wait a few seconds for Postgres and Redis to be ready

# Run database migrations
docker compose exec backend alembic upgrade head

# Or use a custom initialization script
docker compose exec backend python scripts/init-db.py

# Create Admin user
docker compose exec backend python scripts/create_admin.py

Still have questions?

If your question isn’t answered here:

  1. Check our Troubleshooting Guide
  2. Search GitHub Issues
  3. Join our Discord community
  4. Create a new issue on GitHub

We’re here to help!
