Everything you need to go hands-free
GestureOS pairs a state-of-the-art vision pipeline with a context engine that knows what you're doing — and what you mean.
21-point hand tracking
A palm detector finds your hand, then a landmark net maps 21 3D joints every frame — the same skeleton you see in the live overlay.
Runs on your device
ONNX models execute locally with DirectML / CUDA acceleration. No frames ever leave your machine. No network round-trips, no cloud, no creep.
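For the curious, here is a minimal sketch of how local acceleration could be selected with ONNX Runtime. The provider names and the `get_available_providers` / `InferenceSession` calls are ONNX Runtime's own; the preference order, the `choose_providers` and `open_session` helpers, and the model path are assumptions made for illustration, not GestureOS internals.

```python
# Assumed preference order: GPU back ends first, CPU as the fallback.
PREFERRED = (
    "CUDAExecutionProvider",
    "DmlExecutionProvider",   # DirectML
    "CPUExecutionProvider",
)


def choose_providers(available, preferred=PREFERRED):
    """Keep the preferred providers that this machine actually offers."""
    chosen = [name for name in preferred if name in available]
    if not chosen:
        raise RuntimeError("no usable ONNX Runtime execution provider found")
    return chosen


def open_session(model_path):
    """Open a local inference session (requires the onnxruntime package).

    model_path is a hypothetical path to a local .onnx file.
    """
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Everything here runs in-process: the session reads a model file from disk and no frame is sent anywhere.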
Sub-frame responsiveness
A Kalman-filtered tracker plus async inference keeps gestures buttery at 60fps, even while the model crunches in the background.
Context-aware profiles
GestureOS detects the focused app and swaps gesture maps automatically — gaming, meetings, and system control each get their own bindings.
Intent filtering
Temporal voting, hysteresis, and ghost-gesture suppression mean it only fires when you mean it. Wave goodbye to phantom clicks.
Live skeleton overlay
Watch what the engine sees in real time. Every joint, every confidence score, rendered as a glowing skeleton you can trust.
Map any gesture to any action
| Gesture | Name | Default action |
|---|---|---|
| 👆 | Point | Move cursor |
| 🤏 | Pinch | Click / select |
| ✊ | Fist | Grab & drag |
| ✋ | Open palm | Stop / release |
| 👍 | Thumbs up | Confirm / volume up |
| 👎 | Thumbs down | Cancel / volume down |
| 👋 | Wave | Switch app / dismiss |
| ✌️ | Peace | Screenshot / snap |
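As a rough sketch, the default table above can be thought of as a plain lookup from gesture to action, with remapping as a copy-and-override. The identifier strings and the `remap` helper are hypothetical names chosen for this example, not the GestureOS configuration format.

```python
# Default bindings from the table, using made-up identifier strings.
DEFAULT_BINDINGS = {
    "point": "move_cursor",
    "pinch": "click",
    "fist": "grab_drag",
    "open_palm": "release",
    "thumbs_up": "confirm",
    "thumbs_down": "cancel",
    "wave": "switch_app",
    "peace": "screenshot",
}


def remap(bindings, gesture, action):
    """Return a new binding table with one gesture rebound."""
    if gesture not in bindings:
        raise KeyError(f"unknown gesture: {gesture}")
    updated = dict(bindings)   # copy, so the defaults stay untouched
    updated[gesture] = action
    return updated


custom = remap(DEFAULT_BINDINGS, "peace", "mute_microphone")
```

Returning a copy keeps the defaults available, so a "reset to defaults" option stays trivial.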
The right bindings, automatically
Gaming
Map gestures to WASD, abilities, and camera control. Lean into a game without ever touching the keyboard.
Meetings
Mute, raise hand, and toggle camera with a flick. React with emoji gestures your teammates actually see.
System control
Scroll, switch windows, adjust volume, and navigate your desktop hands-free. Ideal for presentations and accessibility.
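A minimal sketch of how per-app profiles might resolve a gesture, assuming the focused process name is already known (detecting it is platform-specific and left out). The process names, profile contents, and function names are all hypothetical.

```python
# Hypothetical gesture maps, one per profile.
PROFILES = {
    "gaming": {"fist": "key_w", "pinch": "ability_1", "point": "camera_look"},
    "meetings": {"fist": "toggle_mute", "open_palm": "raise_hand", "wave": "toggle_camera"},
    "system": {"point": "move_cursor", "pinch": "click", "wave": "switch_window"},
}

# Hypothetical mapping from focused process name to profile.
APP_PROFILES = {
    "zoom.exe": "meetings",
    "teams.exe": "meetings",
    "game.exe": "gaming",
}


def profile_for(process_name, default="system"):
    """Pick the profile for the focused app, falling back to system control."""
    return APP_PROFILES.get(process_name.lower(), default)


def resolve_action(process_name, gesture):
    """Return the bound action, or None if the profile ignores this gesture."""
    return PROFILES[profile_for(process_name)].get(gesture)
```

With these tables, a fist in a meeting app toggles mute, while the same fist in a game presses W.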
Built on real computer-vision engineering
GestureOS isn't a demo — it's a tuned production pipeline.
Kalman-filtered tracking
Landmark positions are smoothed by a Kalman filter and velocity-clamped interpolation, so gestures stay steady even when frames are noisy.
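To illustrate the idea, here is a minimal sketch of a scalar Kalman filter applied to one landmark coordinate, followed by a per-frame step clamp. It assumes a simple random-walk motion model and made-up noise variances; the actual GestureOS filter and its parameters are not specified here.

```python
class ScalarKalman:
    """One-dimensional Kalman filter with a random-walk model (an assumption)."""

    def __init__(self, process_var=1e-3, measurement_var=1e-2):
        self.q = process_var       # how much the true value may drift per frame
        self.r = measurement_var   # how noisy each measurement is
        self.x = None              # current estimate
        self.p = 1.0               # variance of the estimate

    def update(self, z):
        if self.x is None:         # first measurement initialises the state
            self.x = z
            return self.x
        self.p += self.q                   # predict: uncertainty grows
        k = self.p / (self.p + self.r)     # Kalman gain, always in (0, 1)
        self.x += k * (z - self.x)         # correct towards the measurement
        self.p *= 1.0 - k                  # uncertainty shrinks after correcting
        return self.x


def clamp_step(previous, target, max_step):
    """Move from previous towards target by at most max_step per frame."""
    delta = max(-max_step, min(max_step, target - previous))
    return previous + delta


# Jittery measurements of a joint that is really sitting still at 0.5.
kf = ScalarKalman()
noisy = [0.5 + (0.1 if i % 2 else -0.1) for i in range(40)]
smooth = [kf.update(z) for z in noisy]
```

In practice one such filter would run per coordinate of each of the 21 joints, and the clamp would cap how far the rendered position may jump between frames.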
Confidence-scored classifier
Weighted confidence scoring replaces binary gates, so precision and recall can be tuned independently for each gesture.
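A minimal sketch of what weighted confidence scoring could look like for a single gesture. The feature names, weights, and threshold are invented for illustration; each feature is assumed to be a score between 0 and 1.

```python
# Hypothetical per-gesture feature weights and firing thresholds.
WEIGHTS = {
    "pinch": {
        "thumb_index_closeness": 0.60,
        "other_fingers_extended": 0.25,
        "hand_stability": 0.15,
    },
}
THRESHOLDS = {"pinch": 0.70}


def gesture_confidence(features, weights):
    """Weighted average of feature scores; missing features count as 0."""
    total = sum(weights.values())
    return sum(w * features.get(name, 0.0) for name, w in weights.items()) / total


def fires(gesture, features):
    """True when the weighted score clears this gesture's own threshold."""
    return gesture_confidence(features, WEIGHTS[gesture]) >= THRESHOLDS[gesture]


clear_pinch = {"thumb_index_closeness": 1.0, "other_fingers_extended": 0.5, "hand_stability": 0.0}
sloppy_pinch = {"thumb_index_closeness": 0.2, "other_fingers_extended": 1.0, "hand_stability": 1.0}
```

Because every gesture has its own threshold, raising one favours precision for that gesture (fewer false triggers) and lowering it favours recall, without touching any other gesture.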
Ghost-gesture suppression
Temporal voting, hysteresis, and an intent latch ensure an action only fires when you truly mean it — no phantom clicks.
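The three ideas can be sketched together in a few lines. This is an illustrative filter under assumed parameters (window size, vote counts, thresholds), and the `IntentFilter` name is hypothetical; it is not the GestureOS implementation.

```python
from collections import Counter, deque


class IntentFilter:
    """Temporal voting + hysteresis + a latch, fed one classifier result per frame."""

    def __init__(self, window=5, min_votes=4, release_votes=2,
                 on_threshold=0.7, off_threshold=0.4):
        self.votes = deque(maxlen=window)   # most recent per-frame votes
        self.min_votes = min_votes          # votes needed to fire
        self.release_votes = release_votes  # below this, the latch lets go
        self.on_threshold = on_threshold    # confidence needed to start
        self.off_threshold = off_threshold  # lower bar to keep going
        self.active = None                  # latched gesture, if any

    def update(self, label, confidence):
        """Return a gesture name on the frame it fires, otherwise None."""
        # Hysteresis: the gesture already in progress gets the easier threshold.
        holding = label is not None and label == self.active
        threshold = self.off_threshold if holding else self.on_threshold
        self.votes.append(label if label is not None and confidence >= threshold else None)
        counts = Counter(v for v in self.votes if v is not None)

        # Release the latch once the held gesture has nearly left the window.
        if self.active is not None and counts[self.active] < self.release_votes:
            self.active = None
        if not counts:
            return None

        # Temporal voting: a gesture needs enough recent frames behind it.
        winner, count = counts.most_common(1)[0]
        if count < self.min_votes or winner == self.active:
            return None    # too few votes, or already fired (latched)
        self.active = winner
        return winner
```

A single stray "pinch" frame never collects enough votes to fire, and a held pinch fires once rather than on every frame.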
Async vision pipeline
Heavy ONNX inference runs on a dedicated thread feeding a ring buffer, keeping capture and overlay smooth at 60fps.
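A minimal sketch of that pattern using Python's standard library: a worker thread runs a slow `infer` function and appends results to a fixed-size ring buffer, while the capture loop only ever does quick, non-blocking calls. The class and method names are hypothetical, and `infer` stands in for the real model call.

```python
import threading
from collections import deque


class InferenceWorker:
    """Runs infer(frame) on its own thread; results land in a ring buffer."""

    def __init__(self, infer, capacity=4):
        self._infer = infer
        self._results = deque(maxlen=capacity)   # ring buffer: oldest result drops off
        self._lock = threading.Lock()
        self._pending = None                     # newest frame not yet processed
        self._has_frame = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, frame):
        """Called by the capture loop. Overwrites any stale frame; never waits on inference."""
        with self._lock:
            self._pending = frame
            self._has_frame.set()

    def latest(self):
        """Called by the overlay. Returns the newest result, or None if there is none yet."""
        with self._lock:
            return self._results[-1] if self._results else None

    def _run(self):
        while not self._stop.is_set():
            if not self._has_frame.wait(timeout=0.1):
                continue                         # nothing new; re-check the stop flag
            with self._lock:
                frame, self._pending = self._pending, None
                self._has_frame.clear()
            if frame is None:
                continue
            result = self._infer(frame)          # slow work happens outside the lock
            with self._lock:
                self._results.append(result)

    def close(self):
        self._stop.set()
        self._thread.join()
```

If the model falls behind, older frames are simply skipped rather than queued, so the overlay always draws from the freshest result available.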