Pose estimation, or the ability to detect humans and their poses from image data, is one of the most exciting — and most difficult — topics in machine learning and computer vision. Recently, Google shared PoseNet: a state-of-the-art pose estimation model that provides highly accurate pose data from image data (even when those images are blurry, low-resolution, or in black and white). This is the story of the experiment that prompted us to create this pose estimation library for the web in the first place.

Months ago, we prototyped a fun experiment called Move Mirror that lets you explore images in your browser, just by moving around. The experiment creates a unique, flipbook-like experience that follows your moves and reflects them with images of all kinds of human movement — from sports and dance to martial arts, acting, and beyond. We wanted to release the experience on the web, let others play with it, learn about machine learning, and share the experience with friends. Unfortunately we faced a problem: a publicly accessible web-specific model for pose estimation did not exist.

Typically, working with pose data means either having access to special hardware or having experience with C++/Python computer vision libraries. We thus saw a unique opportunity to make pose estimation more widely accessible by porting an in-house model to TensorFlow.js, a Javascript library that lets you run machine learning projects in the browser. We assembled a team, spent a few months developing the library, and ultimately released PoseNet, an open-source tool that allows any web developer to play with body-based interactions, entirely within the browser, no special cameras or C++/Python skills required.

With PoseNet out in the wild, we can finally release Move Mirror — a project that is a testament to the value that experimentation and play can add to serious engineering work. It was only through a true collaboration between research, product, and creative teams that we were able to build PoseNet and Move Mirror.