docs: rewrite install guide with venv steps and dataset sizing guidance

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-18 20:40:23 +02:00
parent 6ddfcde8ee
commit 2e738df9ae
+114 -42
View File
@@ -81,58 +81,134 @@ All clips are exactly 8 seconds — the standard length for foley sound datasets
Shortcuts are suppressed when a text field has focus. Shortcuts are suppressed when a text field has focus.
## Requirements ## Installation
- Python 3.11+ ### Prerequisites
- `ffmpeg` on `PATH`
- PyQt6
- python-mpv (requires libmpv)
``` - **Python 3.11+** — [python.org/downloads](https://www.python.org/downloads/)
pip install -r requirements.txt - **ffmpeg** — video encoding
``` - **libmpv** — video playback
### Platform setup ### Quick start (all platforms)
#### Linux The setup script creates a virtual environment and installs everything including PyTorch with CUDA support:
```bash ```bash
# Arch # Linux / macOS
pacman -S mpv ffmpeg python ./setup_env.sh
# Debian/Ubuntu # Windows (PowerShell)
apt install libmpv-dev ffmpeg python3 powershell -ExecutionPolicy Bypass -File setup-windows.ps1
# Install Python deps
pip install -r requirements.txt
``` ```
#### Windows Then run:
```bash
# Linux / macOS
./8cut.sh
# Windows
8cut.bat
```
The launch scripts auto-detect your venv or conda environment.
### Manual installation
#### 1. Install system dependencies
**Linux (Arch):**
```bash
pacman -S python mpv ffmpeg
```
**Linux (Debian/Ubuntu):**
```bash
apt install python3 python3-venv libmpv-dev ffmpeg
```
**Windows:**
```powershell ```powershell
# Install ffmpeg # ffmpeg
winget install ffmpeg winget install ffmpeg
# Download mpv-2.dll from https://sourceforge.net/projects/mpv-player-windows/files/libmpv/ # libmpv — download mpv-2.dll and place next to main.py
# Place mpv-2.dll next to main.py or on PATH # https://sourceforge.net/projects/mpv-player-windows/files/libmpv/
# Install Python deps
pip install -r requirements.txt
``` ```
Or run the setup script: **macOS:**
```bash
```powershell brew install python mpv ffmpeg
.\setup-windows.ps1
``` ```
#### macOS #### 2. Create a virtual environment
A virtual environment keeps 8-cut's dependencies isolated from your system Python. This is strongly recommended — PyTorch alone is several GB and can conflict with other projects.
**Using venv (recommended):**
```bash
# Create the venv in the project directory
python3 -m venv .venv
# Activate it
# Linux / macOS:
source .venv/bin/activate
# Windows (cmd):
.venv\Scripts\activate.bat
# Windows (PowerShell):
.venv\Scripts\Activate.ps1
```
**Using conda / miniforge:**
```bash
conda create -n 8cut python=3.12
conda activate 8cut
```
You must activate the environment every time you open a new terminal before running 8-cut. The `8cut.sh` launcher does this automatically.
#### 3. Install PyTorch
PyTorch must be installed separately with the correct CUDA version for GPU acceleration. Without CUDA, audio scanning will fall back to CPU (much slower).
**With NVIDIA GPU (CUDA 12.8):**
```bash
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cu128
```
**CPU only (no GPU):**
```bash
pip install torch torchaudio --index-url https://download.pytorch.org/whl/cpu
```
Check available CUDA versions at [pytorch.org/get-started](https://pytorch.org/get-started/locally/).
#### 4. Install project dependencies
```bash ```bash
brew install mpv ffmpeg
pip install -r requirements.txt pip install -r requirements.txt
``` ```
#### 5. Verify
```bash
python -c "import torch; print('PyTorch', torch.__version__, 'CUDA', torch.version.cuda)"
python -c "import librosa, torchaudio, sklearn; print('All imports OK')"
```
### Running
```bash
# With venv activated:
python main.py
# Or use the launcher (auto-activates venv/conda):
./8cut.sh # Linux / macOS
8cut.bat # Windows
```
### GPU encoding ### GPU encoding
Hardware encoders are auto-detected from ffmpeg. Available encoders by platform: Hardware encoders are auto-detected from ffmpeg. Available encoders by platform:
@@ -145,22 +221,12 @@ Hardware encoders are auto-detected from ffmpeg. Available encoders by platform:
Enable the **HW** checkbox in the export controls to use GPU encoding. Enable the **HW** checkbox in the export controls to use GPU encoding.
### Audio scanning dependencies ### Optional: audio scanning
Audio scanning requires additional packages (installed via `requirements.txt`): Audio scanning requires PyTorch (installed above). Embedding models are downloaded on first use and cached in `cache/downloads/`. A CUDA-capable GPU is strongly recommended for training and scanning speed.
- `torch` and `torchaudio` — embedding model inference (CUDA recommended)
- `scikit-learn` — classifier training
- `joblib` — model persistence
Models are downloaded on first use and cached in `cache/downloads/`.
## Usage ## Usage
```
python main.py
```
Drop videos onto the queue or click **+ Open Files**. Scrub to your cut point, then press **Export** (or `E`). Drop videos onto the queue or click **+ Open Files**. Scrub to your cut point, then press **Export** (or `E`).
### Export layout ### Export layout
@@ -224,6 +290,12 @@ Labels persist between exports so you can cut many clips of the same class witho
Export clips manually from several videos. Clips from the same export folder (e.g. `mp4_Intense`) become your positive training class. Export clips manually from several videos. Clips from the same export folder (e.g. `mp4_Intense`) become your positive training class.
**Minimum dataset:** ~20 clips from 23 different videos. This is enough for the classifier to learn a basic boundary, but expect noisy results — you'll need to mark hard negatives and retrain.
**Ideal dataset:** 50100+ clips from 5+ videos covering the full range of variation in your target sound (different recording conditions, distances, intensities). More variety in your positives makes the model generalize better to unseen footage. Negatives are sampled automatically from regions far from your markers, but adding explicit negatives of confusable sounds (e.g. thunder when training for explosions) significantly reduces false positives.
The classifier improves iteratively: export a small initial set → train → scan → mark false positives as hard negatives → retrain. Each cycle sharpens the decision boundary without needing a large upfront dataset.
### 2. Train a classifier ### 2. Train a classifier
Click **Train** to open the training dialog: Click **Train** to open the training dialog: