Introducing the GPUImage framework

I'd like to introduce a new open source framework that I've written, called GPUImage. The GPUImage framework is a BSD-licensed iOS library (the source code for which can be found on GitHub) that lets you apply GPU-accelerated filters and other effects to images, live camera video, and movies. In comparison to Core Image (part of iOS 5.0), GPUImage allows you to write your own custom filters, supports deployment to iOS 4.0, and has a slightly simpler interface. However, it currently lacks some of the more advanced features of Core Image, such as facial detection.

UPDATE (4/15/2012): I've disabled comments, because they were getting out of hand. If you wish to report an issue with the project, or request a feature addition, go to its GitHub page. If you want to ask a question about it, contact me at the email address in the footer of this page, or post in the new forum I have set up for the project.

About a year and a half ago, I gave a talk at SecondConf where I demonstrated the use of OpenGL ES 2.0 shaders to process live video. The writeup and sample code that came out of that talk proved to be fairly popular, and I've heard from a number of people who have incorporated that video processing code into their iOS applications. However, the amount of code around the OpenGL ES 2.0 portions of that example made it difficult to customize and reuse. Since much of this code was just scaffolding for interacting with OpenGL ES, it could stand to be encapsulated in an easier-to-use interface.

Example of four types of video filters

Since then, Apple has ported some of their Core Image framework from the Mac to iOS. Core Image provides an interface for doing filtering of images and video on the GPU. Unfortunately, the current implementation on iOS has some limitations. The largest of these is the fact that you can't write your own custom filters based on their kernel language, like you can on the Mac. This severely restricts what you can do with the framework. Other downsides include a somewhat more complex interface and a lack of iOS 4.0 support. Others have complained about some performance overhead, but I've not benchmarked this myself.

Because of the lack of custom filters in Core Image, I decided to convert my video filtering example into a simple Objective-C image and video processing framework. The key feature of this framework is its support for completely customizable filters that you write using the OpenGL Shading Language. It also has a straightforward interface (which you can see some examples of below) and support for iOS 4.0 as a target.

Note that this framework is built around OpenGL ES 2.0, so it will only work on devices that support this API. This means that it will not work on the original iPhone, the iPhone 3G, or the 1st and 2nd generation iPod touches. All other iOS devices are supported.

The following is my first pass at documentation for this framework. An up-to-date version can be found within the framework repository on GitHub:

General architecture

GPUImage uses OpenGL ES 2.0 shaders to perform image and video manipulation much faster than could be done in CPU-bound routines. It hides the complexity of interacting with the OpenGL ES API in a simplified Objective-C interface. This interface lets you define input sources for images and video, attach filters in a chain, and send the resulting processed image or video to the screen, to a UIImage, or to a movie on disk.

Images or frames of video are uploaded from source objects, which are subclasses of GPUImageOutput. These include GPUImageVideoCamera (for live video from an iOS camera) and GPUImagePicture (for still images). Source objects upload still image frames to OpenGL ES as textures, then hand those textures off to the next objects in the processing chain.

Filters and other subsequent elements in the chain conform to the GPUImageInput protocol, which lets them take in the supplied or processed texture from the previous link in the chain and do something with it. Objects one step further down the chain are considered targets, and processing can be branched by adding multiple targets to a single output or filter.

For example, an application that takes in live video from the camera, converts that video to a sepia tone, then displays the video onscreen would set up a chain looking something like the following:

GPUImageVideoCamera -> GPUImageSepiaFilter -> GPUImageView
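
To make the output/target relationship concrete, here is a toy model of that chain in Python. It is deliberately not GPUImage code and should not be pasted into a project: every class and method name below is hypothetical, pixels are plain (r, g, b) tuples rather than OpenGL ES textures, and the sepia weights are a commonly used approximation assumed purely for illustration. It only sketches the idea that a source pushes frames to its targets, a filter is both a target and a source, and one output can feed several targets.

```python
class Output:
    """Anything that produces frames and forwards them to its targets."""

    def __init__(self):
        self.targets = []

    def add_target(self, target):
        # A target is any object with a new_frame(frame) method.
        self.targets.append(target)

    def notify_targets(self, frame):
        for target in self.targets:
            target.new_frame(frame)


class Picture(Output):
    """Stand-in for a still-image source: holds a list of (r, g, b) pixels."""

    def __init__(self, pixels):
        super().__init__()
        self.pixels = pixels

    def process(self):
        self.notify_targets(self.pixels)


class SepiaFilter(Output):
    """Receives frames (acts as a target) and forwards results (acts as an output)."""

    def new_frame(self, frame):
        toned = []
        for r, g, b in frame:
            # Assumed sepia weights, clamped to the 0-255 range.
            toned.append((
                min(255, round(0.393 * r + 0.769 * g + 0.189 * b)),
                min(255, round(0.349 * r + 0.686 * g + 0.168 * b)),
                min(255, round(0.272 * r + 0.534 * g + 0.131 * b)),
            ))
        self.notify_targets(toned)


class Collector:
    """Stand-in for an end point such as a view or a file writer."""

    def __init__(self):
        self.frames = []

    def new_frame(self, frame):
        self.frames.append(frame)


source = Picture([(100, 100, 100), (0, 0, 0)])
sepia = SepiaFilter()
screen = Collector()
raw_copy = Collector()

source.add_target(sepia)     # source -> filter
sepia.add_target(screen)     # filter -> end point
source.add_target(raw_copy)  # branch: the same source also feeds a second target

source.process()
```

After `source.process()` runs, `screen` holds the sepia-toned frame and `raw_copy` holds the untouched one, which is the branching behavior described above.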

EDIT (10/6/2014): The instructions and code from this original post were for the framework as it was at launch, and have grown more out of date over time. People were not heeding my instructions to read the project page itself for the latest, and were copying and pasting code from here, so I'm removing that part of the post.

Please consult the project page for how to use the current version of the framework, and be aware that tutorials written years ago may have drifted out of date with the current API.

Things that need work

This is just a first release, and I'll keep working on it to add more functionality. I also welcome any and all help with enhancing it. Right off the bat, these are the missing elements I can think of:

  • Images that exceed 2048 pixels wide or high currently can't be processed on devices older than the iPad 2 or iPhone 4S.
  • Currently, it's difficult to create a custom filter with additional attribute inputs and a modified vertex shader.
  • Many common filters aren't built into the framework yet.
  • Video capture and processing should be done on a background GCD serial queue.
  • I'm sure that there are many optimizations that can be made on the rendering pipeline.
  • The aspect ratio of the input video is not maintained; the video is instead stretched to fill the final image.
  • Errors in shader setup and other failures need to be explained better, and the framework needs to be more robust when encountering odd situations.

Hopefully, people will find this to be helpful in doing fast image and video processing within their iOS applications.