docs: update README and add SVG logo

- Add timeline-style SVG banner with markers, playhead, and branding
- Rewrite README to reflect current features (batch export, SELVA
  annotation, group delete/overwrite, playlist, shortcuts)
- Remove outdated mask generation references
- Update test count to 49

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 02:58:47 +02:00
parent bcdda9c783
commit 2ef387d87b
2 changed files with 176 additions and 53 deletions
+81 -53
@@ -1,32 +1,57 @@
# 8-cut
[![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://github.com/ethanfel/8-cut/blob/master/LICENSE)
<p align="center">
<img src="assets/logo.svg" alt="8-cut — 8-second clips for SELVA datasets" width="720">
</p>
**Source:** https://github.com/ethanfel/8-cut
<p align="center">
<a href="https://github.com/ethanfel/8-cut/blob/master/LICENSE"><img src="https://img.shields.io/badge/License-GPLv3-blue.svg" alt="License: GPL v3"></a>
</p>
A desktop tool for cutting 8-second clips from video files, designed for building [SELVA](https://github.com/google-deepmind/selva) datasets.
## Overview
8-cut lets you scrub through a video, mark a cut point, and export a fixed 8-second clip with one keypress. It tracks every export in a local SQLite database so you can resume a session or switch between resolution variants of the same source without duplicating work.
8-cut lets you scrub through a video, mark a cut point, and export a batch of overlapping 8-second clips with one keypress. It tracks every export in a local SQLite database so you can resume a session or switch between resolution variants of the same source without duplicating work.
All clips are exactly 8 seconds — this is a hard constraint of the SELVA format.
All clips are exactly 8 seconds — a hard constraint of the SELVA format.
## Features
- **Frame-accurate scrubbing** — click or drag the timeline; arrow keys and J/K/L for frame-by-frame stepping
- **Keyboard shortcuts** — J/L step one frame, Shift+J/L step one second, Space/P play/pause, K pause and return to cursor, E export, M jump to next marker
- **Two export formats** — H.264/AAC MP4 or lossless WebP image sequence (frames + `.wav` audio extracted alongside)
- **Portrait crop** — crop to 9:16, 4:5, or 1:1 before export; adjustable horizontal crop position
- **Resize** — scale short side to a fixed pixel size (e.g. 256)
- **Export history** — timeline markers show previously exported clips; fuzzy filename matching detects resolution variants of the same file (e.g. `_2160p` vs `_1080p`)
- **Mask generation** — generate binary foreground masks per-frame using SAM2 (segmentation) or Depth Anything V2 (depth-based), via a bundled venv
- **Playlist** — drag-and-drop multiple files; duplicates are ignored
- **Frame-accurate scrubbing** — click or drag the timeline; arrow keys and J/L for frame-by-frame, Shift for 1-second steps
- **Batch export** — export multiple overlapping clips per cut point with configurable count and spread offset
- **Two export formats** — H.264 MP4 with lossless PCM audio, or WebP image sequence (frames + `.wav`)
- **Portrait crop** — crop to 9:16, 4:5, or 1:1 before export; click the video or crop bar to reposition
- **Random portrait** — optionally apply a random portrait crop to a subset of each batch
- **Resize** — scale short side to a fixed pixel size (e.g. 512)
- **SELVA annotation** — label and category fields saved to `dataset.json` and the clip database
- **Export history** — timeline markers show previously exported clips; double-click to enter overwrite mode; right-click to delete
- **Fuzzy matching** — detects resolution variants of the same file (`_2160p` vs `_1080p`) and shares markers between them
- **End-frame preview** — floating window shows the last frame of the selection region
- **Playlist** — drag-and-drop or use the Open Files button; right-click to remove items
- **Playback loop** — plays the exact selection region on loop so you can preview what will be exported
- **Group operations** — delete or overwrite acts on all sub-clips in a batch, not just one
## Keyboard shortcuts
| Key | Action |
|-----|--------|
| `Left` / `J` | Step back 1 frame |
| `Right` / `L` | Step forward 1 frame |
| `Shift+Left` / `Shift+J` | Step back 1 second |
| `Shift+Right` / `Shift+L` | Step forward 1 second |
| `Space` / `P` | Toggle play/pause |
| `K` | Pause and snap to cursor |
| `E` | Export |
| `M` | Jump to next marker (wraps) |
| `N` | Next file in playlist |
Shortcuts are suppressed when a text field has focus.
## Requirements
- Python 3.11+
- `ffmpeg` in `PATH`
- `ffmpeg` on `PATH`
- PyQt6
- python-mpv (requires libmpv)
@@ -34,15 +59,15 @@ All clips are exactly 8 seconds — this is a hard constraint of the SELVA forma
pip install -r requirements.txt
```
For mask generation tools, additional dependencies (PyTorch, transformers, segment-anything-2, opencv) are installed into `~/.8cut/venv/` via the Settings dialog.
### Platform notes
**Linux** — install libmpv via your package manager (`apt install libmpv-dev` / `pacman -S mpv`).
| Platform | libmpv |
|----------|--------|
| **Linux** | `apt install libmpv-dev` or `pacman -S mpv` |
| **macOS** | `brew install mpv` |
| **Windows** | Download `mpv-2.dll` from [mpv Windows builds](https://sourceforge.net/projects/mpv-player-windows/files/libmpv/) and place it in `PATH` or next to `main.py` |
**macOS** — install libmpv via Homebrew: `brew install mpv`.
**Windows**: `python-mpv` requires `mpv-2.dll` in `PATH` or in the same directory as `main.py`. Download it from the [mpv Windows builds](https://sourceforge.net/projects/mpv-player-windows/files/libmpv/) page (pick the latest `mpv-dev-x86_64-*.7z`, extract `mpv-2.dll`). Also ensure `ffmpeg.exe` is in `PATH` (e.g. via [winget](https://winget.run/): `winget install ffmpeg`).
Windows also needs `ffmpeg.exe` on `PATH` (e.g. `winget install ffmpeg`).
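To confirm the external tools are reachable before launching, a quick check along these lines may help. This is a minimal sketch, not part of 8-cut: `missing_tools` is a hypothetical helper name, and it only tests whether an executable can be found on `PATH`, not whether libmpv loads.

```python
import shutil


def missing_tools(tools=("ffmpeg",)):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("ffmpeg found on PATH")
```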
## Usage
@@ -50,49 +75,52 @@ For mask generation tools, additional dependencies (PyTorch, transformers, segme
python main.py
```
Drop a video onto the playlist or use the file picker. Scrub to your cut point, set the output folder and clip name, then press **Export** (or `E`).
Drop videos onto the queue or click **+ Open Files**. Scrub to your cut point, then press **Export** (or `E`).
### Export formats
### Export layout
| Format | Output |
|--------|--------|
| MP4 | `<folder>/<name>_NNN.mp4` — H.264 video + AAC audio |
| WebP sequence | `<folder>/<name>_NNN/frame_%04d.webp` — lossless WebP frames + `<name>_NNN.wav` PCM audio |
### Keyboard shortcuts
| Key | Action |
|-----|--------|
| `←` / `J` | Step back 1 frame |
| `→` / `L` | Step forward 1 frame |
| `Shift+←` / `Shift+J` | Step back 1 second |
| `Shift+→` / `Shift+L` | Step forward 1 second |
| `Space` / `P` | Toggle play/pause |
| `K` | Pause and snap video to cursor |
| `E` | Export clip |
| `M` | Jump to next export marker (wraps) |
Arrow keys and J/K/L are ignored when a text field has focus.
### Mask generation tools
> **Warning:** The mask generation feature is untested and may not work reliably. For production use, consider [ComfyUI](https://github.com/comfyanonymous/ComfyUI) instead.
Two standalone scripts live in `tools/`. They are run by the app via a managed venv but can also be called directly:
Each export creates a group subfolder containing the overlapping sub-clips:
```
python tools/sam_masks.py --input clip.mp4 --output masks_dir/
python tools/depth_masks.py --input clip.mp4 --output masks_dir/
output/
clip_001/
clip_001_0.mp4 # starts at cursor
clip_001_1.mp4 # starts at cursor + spread
clip_001_2.mp4 # starts at cursor + 2 * spread
clip_002/
...
```
Both output one binary PNG per frame (`frame_0000.png`, …) where white = foreground.
With WebP sequence format, each sub-clip becomes a directory of frames plus a `.wav`:
- **SAM2** (`sam_masks.py`) — uses `facebook/sam2-hiera-large`; center-point prompt propagated across all frames
- **Depth Anything V2** (`depth_masks.py`) — uses `depth-anything/Depth-Anything-V2-Large-hf`; Otsu threshold on the depth map
```
output/
clip_001/
clip_001_0/
frame_0001.webp
frame_0002.webp
...
clip_001_0.wav
```
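The layout above can be sketched as a small path planner. This is an illustration only, assuming the naming shown in the trees (`<name>_NNN/<name>_NNN_<i>`) and that sub-clip `i` starts at `cursor + i * spread`; `plan_batch` and its parameters are hypothetical names, not the app's actual functions.

```python
from pathlib import Path

CLIP_SECONDS = 8.0  # fixed clip length required by the SELVA format


def plan_batch(output_dir, name, group_index, cursor, count, spread, ext=".mp4"):
    """Return (path, start_time) pairs for one export group.

    Assumed layout: <output_dir>/<name>_NNN/<name>_NNN_<i><ext>,
    with sub-clip i starting at cursor + i * spread seconds.
    """
    group = f"{name}_{group_index:03d}"
    group_dir = Path(output_dir) / group
    return [
        (group_dir / f"{group}_{i}{ext}", cursor + i * spread)
        for i in range(count)
    ]


if __name__ == "__main__":
    for path, start in plan_batch("output", "clip", 1, cursor=10.0, count=3, spread=2.0):
        print(f"{path.as_posix()}  {start:.1f}s to {start + CLIP_SECONDS:.1f}s")
```

Because every sub-clip is 8 seconds long, any spread smaller than 8 seconds produces overlapping clips.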
### SELVA annotation
Set a **Label** (e.g. "dog barking") and **Category** (Human / Animal / Vehicle / Tool / Music / Nature / Sport / Other) before exporting. These are saved to:
- `dataset.json` in the export folder — one entry per clip with `path` and `label`
- The SQLite database — for recall when you revisit a marker
Labels persist between exports so you can cut many clips of the same class without retyping.
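As a rough illustration of the `dataset.json` side, the sketch below appends one entry per clip. It assumes the file is a top-level JSON list of objects with `path` and `label` keys, and that re-exporting a clip replaces its earlier entry; both are assumptions, and `append_annotation` is a hypothetical helper rather than the app's own code.

```python
import json
from pathlib import Path


def append_annotation(export_dir, clip_path, label):
    """Add or replace the entry for `clip_path` in <export_dir>/dataset.json."""
    dataset_file = Path(export_dir) / "dataset.json"
    if dataset_file.exists():
        entries = json.loads(dataset_file.read_text(encoding="utf-8"))
    else:
        entries = []
    # Assumption: an overwritten clip should not leave a duplicate entry behind.
    entries = [entry for entry in entries if entry.get("path") != clip_path]
    entries.append({"path": clip_path, "label": label})
    dataset_file.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```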
### Overwrite and delete
- **Double-click** a timeline marker to enter overwrite mode — the next export re-encodes all clips in that group to their original paths
- **Right-click** a marker to delete it from the database
- The **Delete** button removes all clips in a group from disk, database, and `dataset.json`
## Database
Export history is stored in `~/.8cut.db` (SQLite). The database records filename, start time, and output path for every clip. When you open a file, 8-cut checks whether a similar filename has been processed before (stripping resolution tags like `_2160p`, `_1080p`, codec tags, etc.) and pre-populates the timeline with existing markers.
Export history is stored in `~/.8cut.db` (SQLite). The database records filename, start time, output path, label, category, and all encoding settings for every clip. When you open a file, 8-cut fuzzy-matches the filename (stripping resolution tags like `_2160p`, codec tags, etc.) and pre-populates the timeline with existing markers.
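The fuzzy match can be pictured as reducing each filename to a normalized key and comparing keys. The sketch below is one possible approach under stated assumptions: the tag list and the `match_key` name are hypothetical, and the real matcher may strip a different set of tags.

```python
import re
from pathlib import Path

# Hypothetical tag list: resolution tags such as _2160p plus a few codec tags.
_TAGS = re.compile(
    r"[._\- ](?:\d{3,4}p|4k|x264|x265|h264|h265|hevc|av1)(?=[._\- ]|$)",
    re.IGNORECASE,
)


def match_key(filename):
    """Return a lowercase key shared by resolution variants of one source."""
    stem = Path(filename).stem  # drop the directory and the extension
    return _TAGS.sub("", stem).lower()


if __name__ == "__main__":
    print(match_key("holiday_2160p.mkv"))
    print(match_key("holiday_1080p_x265.mp4"))
```

Two files whose keys are equal would then share the same set of timeline markers.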
## Testing
@@ -100,7 +128,7 @@ Export history is stored in `~/.8cut.db` (SQLite). The database records filename
pytest tests/ -v
```
38 unit tests covering path builders, ffmpeg command generation, time formatting, and the processed-clips database.
49 unit tests covering path builders, ffmpeg command generation, time formatting, database operations, group queries, and annotation handling.
## License
+95
@@ -0,0 +1,95 @@
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 720 200">
<defs>
<linearGradient id="timeline-bg" x1="0" y1="0" x2="1" y2="0">
<stop offset="0%" stop-color="#1e1e1e"/>
<stop offset="100%" stop-color="#2a2a2a"/>
</linearGradient>
<linearGradient id="selection" x1="0" y1="0" x2="1" y2="0">
<stop offset="0%" stop-color="#3c82dc" stop-opacity="0.6"/>
<stop offset="100%" stop-color="#3c82dc" stop-opacity="0.3"/>
</linearGradient>
<linearGradient id="eight-grad" x1="0" y1="0" x2="0" y2="1">
<stop offset="0%" stop-color="#ffd230"/>
<stop offset="100%" stop-color="#e6a800"/>
</linearGradient>
</defs>
<!-- Background -->
<rect width="720" height="200" rx="12" fill="#161616"/>
<!-- Timeline track -->
<rect x="40" y="100" width="640" height="48" rx="4" fill="url(#timeline-bg)" stroke="#333" stroke-width="1"/>
<!-- Timeline lane -->
<rect x="40" y="112" width="640" height="24" rx="2" fill="#2a2a2a"/>
<!-- Selection region (8 seconds) -->
<rect x="240" y="100" width="140" height="48" rx="2" fill="url(#selection)"/>
<line x1="240" y1="100" x2="240" y2="148" stroke="#3c82dc" stroke-width="2" stroke-opacity="0.8"/>
<line x1="380" y1="100" x2="380" y2="148" stroke="#3c82dc" stroke-width="2" stroke-opacity="0.8"/>
<!-- Playhead -->
<line x1="240" y1="96" x2="240" y2="152" stroke="#ffd230" stroke-width="2"/>
<polygon points="234,96 246,96 240,104" fill="#ffd230"/>
<!-- Ruler ticks -->
<g stroke="#555" stroke-width="1">
<line x1="80" y1="100" x2="80" y2="108"/>
<line x1="120" y1="100" x2="120" y2="105"/>
<line x1="160" y1="100" x2="160" y2="108"/>
<line x1="200" y1="100" x2="200" y2="105"/>
<line x1="240" y1="100" x2="240" y2="108"/>
<line x1="280" y1="100" x2="280" y2="105"/>
<line x1="320" y1="100" x2="320" y2="108"/>
<line x1="360" y1="100" x2="360" y2="105"/>
<line x1="400" y1="100" x2="400" y2="108"/>
<line x1="440" y1="100" x2="440" y2="105"/>
<line x1="480" y1="100" x2="480" y2="108"/>
<line x1="520" y1="100" x2="520" y2="105"/>
<line x1="560" y1="100" x2="560" y2="108"/>
<line x1="600" y1="100" x2="600" y2="105"/>
<line x1="640" y1="100" x2="640" y2="108"/>
</g>
<!-- Export markers -->
<g>
<line x1="130" y1="100" x2="130" y2="148" stroke="#dc3c3c" stroke-width="2"/>
<rect x="130" y="102" width="14" height="12" rx="1" fill="#c83232"/>
<text x="137" y="112" font-family="sans-serif" font-size="9" fill="white" text-anchor="middle">1</text>
</g>
<g>
<line x1="390" y1="100" x2="390" y2="148" stroke="#dc3c3c" stroke-width="2"/>
<rect x="390" y="102" width="14" height="12" rx="1" fill="#c83232"/>
<text x="397" y="112" font-family="sans-serif" font-size="9" fill="white" text-anchor="middle">2</text>
</g>
<g>
<line x1="540" y1="100" x2="540" y2="148" stroke="#dc3c3c" stroke-width="2"/>
<rect x="540" y="102" width="14" height="12" rx="1" fill="#c83232"/>
<text x="547" y="112" font-family="sans-serif" font-size="9" fill="white" text-anchor="middle">3</text>
</g>
<!-- "8" numeral -->
<text x="100" y="72" font-family="'Helvetica Neue', Helvetica, Arial, sans-serif" font-size="72" font-weight="bold" fill="url(#eight-grad)">8</text>
<!-- "-cut" text -->
<text x="148" y="70" font-family="'Helvetica Neue', Helvetica, Arial, sans-serif" font-size="48" font-weight="300" fill="#cccccc">-cut</text>
<!-- Scissors icon near playhead -->
<g transform="translate(296, 82) scale(0.7)" fill="none" stroke="#999" stroke-width="2" stroke-linecap="round">
<circle cx="5" cy="5" r="4" />
<circle cx="5" cy="19" r="4" />
<line x1="9" y1="7" x2="20" y2="17"/>
<line x1="9" y1="17" x2="20" y2="7"/>
</g>
<!-- Tagline -->
<text x="400" y="72" font-family="'Helvetica Neue', Helvetica, Arial, sans-serif" font-size="14" fill="#777">8-second clips for SELVA datasets</text>
<!-- Duration label in selection -->
<text x="310" y="130" font-family="'Courier New', monospace" font-size="13" fill="#aad4ff" text-anchor="middle" opacity="0.9">8.0s</text>
<!-- Time labels -->
<text x="40" y="166" font-family="'Courier New', monospace" font-size="10" fill="#666">0:00</text>
<text x="230" y="166" font-family="'Courier New', monospace" font-size="10" fill="#e6a800">1:15</text>
<text x="640" y="166" font-family="'Courier New', monospace" font-size="10" fill="#666">5:00</text>
</svg>
