3D Gestures with Kinect

Role: Researcher, Prototyper
Size: 2 Interns
Client/Sponsor: Synaptics, Inc.
Duration: 3 months
Skills: User Research | Prototyping | Development
Methods: Participatory Design | Surveys | Usability Studies

 

Overview:
Over the course of a summer internship, our team of two interns researched and developed a prototype for scrolling through a document using 3D (in-air) gestures. Along the way, we discovered the challenges of gesture design, chiefly the need for an effective modal switch (a start and a stop) so that the system can tell when a gesture begins and ends. After several iterations of our prototype, we user-tested our final scrolling gesture against Synaptics’ established ChiralMotion scrolling. While many participants found the new gesture appealing and different, it had not been developed to the level of ChiralMotion and therefore could not compete in certain aspects. However, the positive feedback we received suggests there is potential to develop the gesture further.

 

Problem Description:
As the design of user interfaces advances, there is a growing need for interactions to become more natural and less dependent on peripherals such as a touchscreen, keyboard, or mouse. Gestural interaction has entered the market mainly through gaming systems such as the Nintendo Wii and Microsoft Kinect. However, not all tasks can be accomplished efficiently and effectively through gestures, owing to the constraints of gesture recognition and the limitations of depth cameras such as the Kinect.

This project explores whether gestures can be compelling in a productivity work environment. To arrive at an effective gestural scrolling interaction, we had to overcome several challenges of gestural design: the difficulty of detecting the “start” and “stop” of a gesture, latency between processing a gesture and executing the resulting action, fatigue (i.e., “gorilla arm”), and a limited gesture region.
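To make the “start” and “stop” problem concrete, here is a minimal sketch of one possible modal switch. It is not the switch we built. It assumes the hand's distance from the camera is available in millimeters and treats crossing a depth threshold as the start and stop signal. The class name, function name, and threshold values are hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ModalSwitch:
    """Hypothetical depth-threshold switch with hysteresis.

    The gesture engages when the hand comes closer than engage_mm and
    releases only once it retreats past release_mm. The gap between the
    two thresholds keeps the mode from flickering near the boundary.
    """
    engage_mm: float = 600.0   # assumed value, not from the project
    release_mm: float = 700.0  # assumed value; must exceed engage_mm
    active: bool = False

    def update(self, hand_depth_mm: float) -> bool:
        if not self.active and hand_depth_mm < self.engage_mm:
            self.active = True
        elif self.active and hand_depth_mm > self.release_mm:
            self.active = False
        return self.active


def scroll_deltas(samples, switch):
    """Turn (depth_mm, y_mm) hand samples into vertical scroll deltas.

    Movement is reported only while the switch is active, so the hand
    can be repositioned freely when the gesture is disengaged.
    """
    deltas = []
    prev_y = None
    for depth_mm, y_mm in samples:
        if switch.update(depth_mm):
            if prev_y is not None:
                deltas.append(y_mm - prev_y)
            prev_y = y_mm
        else:
            prev_y = None  # forget the last position once released
    return deltas
```

Using two thresholds rather than one is a design choice in this sketch: a single threshold would toggle the mode repeatedly whenever sensor noise moved the hand reading back and forth across it.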

 

Process and Technology:
Our process included surveying the current Kinect applications in the open-source community, brainstorming ideas that aligned with the company’s vision, developing iterative prototypes, running usability studies, and conducting an experiment comparing various scrolling methods.

We developed our scrolling gesture using Microsoft Visual Studio, the Kinect SDK, and the OpenCV 2.2 and EmguCV 2.2.1 libraries. As novices to Kinect development, we worked through a few small projects to familiarize ourselves with the technology. These projects included background extraction/replacement, simple blob detection, and sending a mouse wheel event to an external application (such as Adobe PDF Reader).
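As an illustration of what simple blob detection involves, the sketch below labels connected regions in a binary mask and computes the centroid of the largest one, which could stand in for a hand position. It is a pure-Python stand-in written for this write-up, not our OpenCV/EmguCV code. The function names and the use of 4-connectivity are assumptions.

```python
from collections import deque


def find_blobs(mask):
    """Return blobs as lists of (row, col) pixels, using 4-connectivity.

    mask is a list of equal-length rows; truthy cells are foreground.
    """
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                pr, pc = queue.popleft()
                blob.append((pr, pc))
                for nr, nc in ((pr - 1, pc), (pr + 1, pc), (pr, pc - 1), (pr, pc + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            blobs.append(blob)
    return blobs


def centroid(blob):
    """Mean (row, col) position of a blob's pixels."""
    n = len(blob)
    return (sum(r for r, _ in blob) / n, sum(c for _, c in blob) / n)
```

A caller would typically threshold a depth frame into such a mask, take `max(find_blobs(mask), key=len)` as the candidate hand region, and track its centroid from frame to frame.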

With each major iteration of our prototype, we ran informal surveys and conducted formal usability studies to understand users’ preferences and physiological constraints. The results of these studies played a significant role in our decisions to design for just one degree of freedom, to choose between two candidate gestures, and to select a modal switch.

In our final experiment, we compared our scrolling gesture against Synaptics’ ChiralMotion and traditional mouse scrolling. We gave users the task of locating a red, bolded line in an 1100-line document within a browser window. The task was repeated with the target line at various distances from the top of the document and in both directions (starting from the top or the bottom of the document). Our results showed that users performed the task more quickly with the traditional mouse and with ChiralMotion; however, this may be strongly influenced by their previous experience with mice. Many participants commented that our scrolling gesture was novel and fun and that, if refined to the level of ChiralMotion, it could be a primary scrolling tool for specific types of tasks.