docs: rewrite install guide with venv steps and dataset sizing guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-18 20:40:23 +02:00
parent 6ddfcde8ee
commit 2e738df9ae
1 changed files with 114 additions and 42 deletions
@@ -81,58 +81,134 @@ All clips are exactly 8 seconds — the standard length for foley sound datasets
 Shortcuts are suppressed when a text field has focus.
-## Requirements
+## Installation
- Python 3.11+
+### Prerequisites
 - `ffmpeg` on `PATH`
 - PyQt6
 - python-mpv (requires libmpv)
-```
+- **Python 3.11+** — [python.org/downloads](https://www.python.org/downloads/)
-pip install -r requirements.txt
+- **ffmpeg** — video encoding
-```
+- **libmpv** — video playback
-### Platform setup
+### Quick start (all platforms)
-#### Linux
+The setup script creates a virtual environment and installs everything including PyTorch with CUDA support:
 ```bash
-# Arch
+# Linux / macOS
-pacman -S mpv ffmpeg python
+./setup_env.sh
-# Debian/Ubuntu
+# Windows (PowerShell)
-apt install libmpv-dev ffmpeg python3
+powershell -ExecutionPolicy Bypass -File setup-windows.ps1
 # Install Python deps
 pip install -r requirements.txt
 ```
-#### Windows
+Then run:
 ```bash
 # Linux / macOS
 ./8cut.sh
 # Windows
 8cut.bat
 ```
 The launch scripts auto-detect your venv or conda environment.
 ### Manual installation
 #### 1. Install system dependencies
 **Linux (Arch):**
 ```bash
 pacman -S python mpv ffmpeg
 ```
 **Linux (Debian/Ubuntu):**
 ```bash
 apt install python3 python3-venv libmpv-dev ffmpeg
 ```
 **Windows:**
 ```powershell
-# Install ffmpeg
+# ffmpeg
 winget install ffmpeg
-# Download mpv-2.dll from https://sourceforge.net/projects/mpv-player-windows/files/libmpv/
+# libmpv — download mpv-2.dll and place next to main.py
-# Place mpv-2.dll next to main.py or on PATH
+# https://sourceforge.net/projects/mpv-player-windows/files/libmpv/
 # Install Python deps
 pip install -r requirements.txt
 ```
-Or run the setup script:
+**macOS:**
-
+```bash
-```powershell
+brew install python mpv ffmpeg
 .\setup-windows.ps1
 ```
-#### macOS
+#### 2. Create a virtual environment
 A virtual environment keeps 8-cut's dependencies isolated from your system Python. This is strongly recommended — PyTorch alone is several GB and can conflict with other projects.
 **Using venv (recommended):**
 ```bash
 # Create the venv in the project directory
 python3 -m venv .venv
 # Activate it
 # Linux / macOS:
 source .venv/bin/activate
 # Windows (cmd):
 .venv\Scripts\activate.bat
 # Windows (PowerShell):
 .venv\Scripts\Activate.ps1
 ```
 **Using conda / miniforge:**
 ```bash
 conda create -n 8cut python=3.12
 conda activate 8cut
 ```
 You must activate the environment every time you open a new terminal before running 8-cut. The `8cut.sh` launcher does this automatically.
 #### 3. Install PyTorch
 PyTorch must be installed separately with the correct CUDA version for GPU acceleration. Without CUDA, audio scanning will fall back to CPU (much slower).
 **With NVIDIA GPU (CUDA 12.8):**
 ```bash
 pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
 ```
 **CPU only (no GPU):**
 ```bash
 pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
 ```
 Check available CUDA versions at [pytorch.org/get-started](https://pytorch.org/get-started/locally/).
 #### 4. Install project dependencies
 ```bash
 brew install mpv ffmpeg
 pip install -r requirements.txt
 ```
 #### 5. Verify
 ```bash
 python -c "import torch; print('PyTorch', torch.__version__, 'CUDA', torch.version.cuda)"
 python -c "import librosa, torchaudio, sklearn; print('All imports OK')"
 ```
 ### Running
 ```bash
 # With venv activated:
 python main.py
 # Or use the launcher (auto-activates venv/conda):
 ./8cut.sh        # Linux / macOS
 8cut.bat         # Windows
 ```
 ### GPU encoding
 Hardware encoders are auto-detected from ffmpeg. Available encoders by platform:
@@ -145,22 +221,12 @@ Hardware encoders are auto-detected from ffmpeg. Available encoders by platform:
 Enable the **HW** checkbox in the export controls to use GPU encoding.
-### Audio scanning dependencies
+### Optional: audio scanning
-Audio scanning requires additional packages (installed via `requirements.txt`):
+Audio scanning requires PyTorch (installed above). Embedding models are downloaded on first use and cached in `cache/downloads/`. A CUDA-capable GPU is strongly recommended for training and scanning speed.
 - `torch` and `torchaudio` — embedding model inference (CUDA recommended)
 - `scikit-learn` — classifier training
 - `joblib` — model persistence
 Models are downloaded on first use and cached in `cache/downloads/`.
 ## Usage
 ```
 python main.py
 ```
 Drop videos onto the queue or click **+ Open Files**. Scrub to your cut point, then press **Export** (or `E`).
 ### Export layout
@@ -224,6 +290,12 @@ Labels persist between exports so you can cut many clips of the same class witho
 Export clips manually from several videos. Clips from the same export folder (e.g. `mp4_Intense`) become your positive training class.
 **Minimum dataset:** ~20 clips from 2–3 different videos. This is enough for the classifier to learn a basic boundary, but expect noisy results — you'll need to mark hard negatives and retrain.
 **Ideal dataset:** 50–100+ clips from 5+ videos covering the full range of variation in your target sound (different recording conditions, distances, intensities). More variety in your positives makes the model generalize better to unseen footage. Negatives are sampled automatically from regions far from your markers, but adding explicit negatives of confusable sounds (e.g. thunder when training for explosions) significantly reduces false positives.
 The classifier improves iteratively: export a small initial set → train → scan → mark false positives as hard negatives → retrain. Each cycle sharpens the decision boundary without needing a large upfront dataset.
 ### 2. Train a classifier
 Click **Train** to open the training dialog: