Ekstra Ekstra
Live runtime

One interaction.
Many inputs.

Ekstra normalizes motion from any sensor into a single event stream. Below — the inputs we already speak, what we do with them, and a live demo you can drive with your phone or your camera.

01 The inputs we speak

Any sensor, same event.

Each tile below is a real provider in the Ekstra runtime. They all produce the same signed Motion Packet shape — so whatever reads the stream doesn't need to care where it came from.

In-browser

Phone IMU

Gyroscope + accelerometer. Millisecond latency, works on any phone with a modern browser.

Used for: phone-as-controller, screen pointing, tilt input.
In-browser

Webcam hand pose

MediaPipe Hands running on-device. Index-finger tip drives position, pinch fires a tap. No uploads.

Used for: hands-free control at DOOH screens and retail kiosks.
Network

Public cameras

9,719 live DOT cameras across NYC, Seattle, Los Angeles, London, Toronto and 20+ more feeds.

Used for: traffic, presence, and curb occupancy signals at city scale.
Edge

Router WiFi presence

Device-hash presence + dwell on GL.iNet routers — no camera, no personal data.

Used for: retail foot traffic, DOOH audience counts at venues.
Headset

XR hands + pose

OpenXR hands and Apple Vision pose adapters. Same Motion Packet shape as every other input.

Used for: in-store XR, training, immersive signage.
Custom

Your own sensor

Python, TypeScript, Go, or Browser SDK — write one provider, the rest of the stack stays the same.

Used for: RFID, BLE, Lidar, CSI WiFi sensing, whatever you have.

…and more every quarter. CSI WiFi sensing, visionOS pose, custom hardware — anything you can stream over WebSocket. Write a provider once; every surface, app, and SDK already knows what to do with it.
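A provider can be very small. Below is a minimal sketch in TypeScript for a hypothetical RFID reader; the endpoint is the one named later on this page, while the Motion Address format, field values, and handshake are illustrative assumptions, not the published SDK.

// Minimal provider sketch: a hypothetical RFID reader streaming
// presence packets. Address format and handshake are assumptions.
import WebSocket from "ws";

const ws = new WebSocket("wss://ekstra.ai/ws");

// Called by your sensor driver whenever a tag passes the gate.
function onRfidRead(tagId: string): void {
  ws.send(JSON.stringify({
    motion_address: "motion://my-venue/rfid-gate-01", // assumed format
    signal_type: "presence",
    value: tagId,
    t: Date.now(),
    sig: "<ed25519 signature>", // see the signing sketch in section 02
  }));
}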

02 What Ekstra does with them

Every sensor goes through the same three stages.

Doesn't matter if the input is a phone in your pocket or a city camera in Brooklyn — once it's in the runtime, the downstream code is identical.

01 / Normalize

One Motion Packet.

Every provider emits the same shape. Signed with ed25519 at the edge, timestamped, addressed by a Motion Address.

{ motion_address, signal_type, value, t, sig }
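Read as TypeScript, one plausible version of that shape, with the edge signature done via ed25519 (tweetnacl here; the exact field types and encodings are assumptions, not a published schema):

// Assumed field types for the packet shape above.
import nacl from "tweetnacl";

type MotionPacket = {
  motion_address: string;  // the Motion Address the packet is filed under
  signal_type: string;     // e.g. "tap", "point", "dwell", "presence"
  value: unknown;          // signal payload, e.g. [x, y] for a point
  t: number;               // timestamp (epoch milliseconds assumed)
  sig: string;             // ed25519 signature, applied at the edge
};

// Sign the body before it leaves the device, so consumers can verify
// origin without trusting the transport.
const keys = nacl.sign.keyPair();

function signPacket(body: Omit<MotionPacket, "sig">): MotionPacket {
  const msg = new TextEncoder().encode(JSON.stringify(body));
  const sig = nacl.sign.detached(msg, keys.secretKey);
  return { ...body, sig: Buffer.from(sig).toString("base64") };
}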
02 / Detect

Primitives, not raw frames.

The runtime extracts tap, point, dwell, rotation, and presence — composable into higher-level gesture phrases.

TAP · POINT · DWELL · ROTATE · PRESENCE
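To make that concrete, here is a toy dwell detector over point packets. Thresholds and field names are illustrative; the runtime's real detectors aren't published on this page.

// Toy dwell detector: emit DWELL when a point holds still long enough.
type Point = { x: number; y: number; t: number }; // normalized [0,1], ms

const DWELL_MS = 800;      // how long the point must hold still
const DWELL_RADIUS = 0.05; // how far it may drift and still count

let anchor: Point | null = null;

function onPoint(p: Point): "DWELL" | null {
  if (!anchor || Math.hypot(p.x - anchor.x, p.y - anchor.y) > DWELL_RADIUS) {
    anchor = p; // moved too far: restart the dwell window
    return null;
  }
  return p.t - anchor.t >= DWELL_MS ? "DWELL" : null;
}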
03 / Route

To wherever it should go.

Your app, a physical screen, a map layer, or a signed on-chain receipt anchored on Solana.

app · screen · map · on-chain
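Routing, sketched: match on Motion Address, fan out to sinks. The patterns and sinks below are illustrative stand-ins for your app, screen, map layer, or receipt writer.

// Illustrative routing table: one normalized stream in, many sinks out.
type Packet = { motion_address: string; signal_type: string; value: unknown };

const sinks: Array<{ match: RegExp; send: (p: Packet) => void }> = [
  { match: /^motion:\/\/nyc\//,      send: (p) => console.log("map layer", p) },
  { match: /^motion:\/\/my-venue\//, send: (p) => console.log("screen", p) },
];

function route(p: Packet): void {
  for (const s of sinks) if (s.match.test(p.motion_address)) s.send(p);
}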
03 Try it yourself

Pick an input. Rotate the cube.

Both sources drive the same cube through the same handler. Phone tilt or your hand in front of the webcam: identical event shape downstream.
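Roughly the handler in question (element id and field names are assumptions): by the time an event reaches it, phone tilt and hand pose look the same.

// One handler for every source; it never branches on origin.
const cube = document.querySelector<HTMLElement>("#cube")!; // assumed id

function onOrientation(e: { rotX: number; rotY: number }): void {
  cube.style.transform = `rotateX(${e.rotX}deg) rotateY(${e.rotY}deg)`;
}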


Phone

Tilt your phone. Gyroscope → normalized 3-DOF orientation. Most immediate input — millisecond latency.

Scan the on-screen QR code with your phone camera to pair.
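On the phone side, the standard deviceorientation event is all it takes to feed that handler; the mapping below is an assumption, and iOS Safari additionally gates the event behind DeviceOrientationEvent.requestPermission().

// Phone tilt via the standard deviceorientation event.
declare function onOrientation(e: { rotX: number; rotY: number }): void; // the shared handler above

window.addEventListener("deviceorientation", (e) => {
  onOrientation({
    rotX: e.beta ?? 0,  // front-back tilt, degrees
    rotY: e.gamma ?? 0, // left-right tilt, degrees
  });
});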

Camera

Raise your hand in front of the camera. Index-finger tip drives rotation; pinch thumb+index to snap back to neutral.
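On the camera side, MediaPipe Hands reports 21 landmarks per hand: the index fingertip is landmark 8 and the thumb tip is landmark 4, so a pinch is well approximated by their distance. The threshold and app hooks below are assumptions; camera frame wiring is omitted.

// Index fingertip drives pointing; thumb+index distance detects pinch.
import { Hands } from "@mediapipe/hands";

// Hypothetical app hooks, stand-ins for your own handlers.
const pointAt = (x: number, y: number) => console.log("point", x, y);
const resetCube = () => console.log("snap to neutral");

const hands = new Hands({
  locateFile: (f) => `https://cdn.jsdelivr.net/npm/@mediapipe/hands/${f}`,
});

hands.onResults((results) => {
  const lm = results.multiHandLandmarks?.[0];
  if (!lm) return; // no hand in frame
  const pinch = Math.hypot(lm[4].x - lm[8].x, lm[4].y - lm[8].y) < 0.05;
  if (pinch) resetCube();
  else pointAt(lm[8].x, lm[8].y); // landmark coords are normalized [0,1]
});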

04 The wire

One event stream, every source.

Your phone, your webcam, and 9,719 city cameras all produce the same event shape. Below is the live feed — raw frames from wss://ekstra.ai/ws interleaved with city space updates.
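Tailing it yourself takes a few lines. Only the URL comes from this page; that each message is one JSON-encoded Motion Packet is an assumption.

// Tail the live feed and log one line per packet.
const feed = new WebSocket("wss://ekstra.ai/ws");

feed.onmessage = (msg: MessageEvent<string>) => {
  const packet = JSON.parse(msg.data); // assumed: one Motion Packet per message
  console.log(packet.motion_address, packet.signal_type, packet.value);
};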

05 The global network

You're looking at the Ekstra network, live.

Every dot is a real public camera already in the runtime: 9,719 cameras across NYC, Seattle, Los Angeles, London, Toronto and 20+ more feeds, served from /api/v1/cameras/near. Add yourself to the map and your device joins the same network.
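A plausible query against that endpoint; only the path comes from this page, and the parameter names and response shape are assumptions.

// Fetch cameras near a point (parameter names lat/lon are assumed).
const res = await fetch("https://ekstra.ai/api/v1/cameras/near?lat=40.71&lon=-74.00");
const cameras: Array<{ id: string; lat: number; lon: number }> = await res.json();
console.log(`${cameras.length} cameras near this point`);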

This is New York right now. Pan or zoom to explore the network. Tap Add yourself and your device becomes a pulse on the same map — wherever you are in the world.