OpenGL ES



Relevant resources


Documentation


Video


Book


Web


My articles:


Sample code - Apple


Sample code - class


One of the things that surprised people the most when the iPhone SDK was first unveiled was that the iPhone supported 3-D graphics, and did so very well for such a portable device.  With the number of 3-D games out there on the App Store, and the dramatically increased graphics power of the iPhone 3G S and iPad, this is something we now take for granted.


The iPhone, particularly the newer models and the iPad, is a powerful mobile 3-D renderer.  To compare it with common devices (based on my benchmarks):


Device           Triangles / s
iPhone 3G        423,000
iPad             1,830,000
Nintendo DS      120,000
Sony PSP         5,000,000 - 8,000,000 (33,000,000 theoretical)
MacBook Air      2,150,000


3-D graphics on the iPhone are made possible using OpenGL ES.  The ES stands for Embedded Systems, making this the mobile version of the OpenGL (Open Graphics Library) graphics API.  OpenGL has been around since the early 1990s, and is a cross-platform API for performing hardware-accelerated 2-D and 3-D graphics.  Over the years, the OpenGL API has evolved, with more efficient approaches added for common tasks.  Unfortunately, that meant there were multiple ways of doing the same type of drawing, some more performant than others.


OpenGL ES presented an opportunity for people to start over and build an API using only the most efficient rendering operations from desktop OpenGL.  This means that OpenGL ES is very close to OpenGL, with differences including the following:



OpenGL ES is not object-oriented, like Cocoa, but is a procedural C API built around the concept of a state machine.  This means that you deal with OpenGL by putting it into a particular state, calling setup or drawing functions, then switching to the next appropriate state, repeating until the scene has been drawn.
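

For example, here is a minimal sketch of that pattern, using calls we'll cover in more detail later:


glEnable(GL_DEPTH_TEST);        // put OpenGL into the depth-testing state
glDepthFunc(GL_LEQUAL);         // configure that state
// ... issue drawing calls that rely on depth testing ...
glDisable(GL_DEPTH_TEST);       // leave the state when it's no longer needed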


The OpenGL ES standard is maintained by the Khronos Group, an industry organization that Apple is a part of, which manages several standards, including OpenGL, OpenCL, and  COLLADA.  The OpenGL ES 1.X and OpenGL ES 2.X specifications are available from this group.


3-D graphics terminology


Before we get into how to render content using OpenGL ES, it might be helpful to define some terms that you'll encounter when dealing with 3-D graphics.


A vertex is a point in 3-D space that defines a corner of a polygon (a triangle in the case of OpenGL ES).  This point has X, Y, and Z components to it.  For example, a triangle is defined by providing three vertices, with the edges of the triangle drawn between them.  OpenGL coordinate space is right-handed, meaning that if the positive X direction is to the right, and the positive Y direction upwards, then the positive Z direction will be toward the viewer.


The origin is the (0, 0, 0) coordinate.


A vector is a direction in 2-D or 3-D (3-D when talking about OpenGL ES).  It is defined by providing a coordinate, relative to the origin.  The vector has the direction and distance required to start from the origin and pass through this point.


A plane is a 2-D slice through 3-D space, with a given tilt and location.


A normal is a vector that is perpendicular to a plane.  You most often encounter normals when talking about lighting in OpenGL.


An index is a numerical representation of a location within an array of values.


A transform is a structure that contains instructions for how to rotate, scale, or otherwise distort an object, point, or the overall world coordinate system.  In OpenGL, a transform is a 4 x 4 matrix of values.  There is one primary transform, called the model view matrix, that is used to manipulate the coordinate system of OpenGL space.
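

As an illustration, here is the identity transform laid out the way OpenGL expects it: a flat array of 16 GLfloat values in column-major order.  Loading it leaves the coordinate system untouched.


static const GLfloat identityMatrix[16] = {
    1.0f, 0.0f, 0.0f, 0.0f,
    0.0f, 1.0f, 0.0f, 0.0f,
    0.0f, 0.0f, 1.0f, 0.0f,
    0.0f, 0.0f, 0.0f, 1.0f,
};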


Setting up an OpenGL ES view


Before we can do any OpenGL ES drawing, we need to have a place to draw to.  There is no specific UIView subclass that handles OpenGL drawing (like the Mac's NSOpenGLView), so we have to create one ourselves.  


Before we go through what's needed to set up an OpenGL drawing area, I should point out that you don't need to write all this setup code yourself.  Xcode's OpenGL ES Application new project template will create a generic implementation of this for you, and all you have to do is replace the rendering code with your own.  Still, it can be useful to know what that boilerplate code is doing.


All OpenGL ES content in an iPhone application is rendered through a CALayer subclass, CAEAGLLayer.  In order to host one of these within a view, we will need to subclass UIView and implement the following class method:


+ (Class)layerClass
{
    return [CAEAGLLayer class];
}


We've seen this method before.  It overrides the default layer type for a UIView (normally just CALayer) and causes the view to use a CAEAGLLayer instead.


Once the view has been configured to host the OpenGL layer, that layer needs to be configured.  Within the UIView's -initWithFrame: method, you'd typically place code like the following:


CAEAGLLayer *eaglLayer = (CAEAGLLayer *)self.layer;

eaglLayer.opaque = YES;

eaglLayer.drawableProperties = [NSDictionary dictionaryWithObjectsAndKeys: [NSNumber numberWithBool:FALSE], kEAGLDrawablePropertyRetainedBacking, kEAGLColorFormatRGBA8, kEAGLDrawablePropertyColorFormat, nil];


This makes sure the OpenGL layer is opaque (for performance reasons), and sets a couple of properties on the layer: the layer is set to not retain its contents after displaying them (a performance optimization), and the colorspace is set to 8 bits for each of the red, green, blue, and alpha components.


There are two implementations of OpenGL supported by iPhone OS devices: OpenGL ES 1.1 and OpenGL ES 2.0.  The original iPhone only supported OpenGL ES 1.1, but the iPhone 3G S, third-generation iPod touch, and iPad all support OpenGL ES 2.0 as well.  If you would like to use OpenGL ES 2.0 rendering (which we'll discuss a little later on), you first need to determine if that rendering pathway is supported on the device.  You can do that using code like the following:


context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES2];

if (!context || ![EAGLContext setCurrentContext:context] || ![self loadShaders])
{
    context = [[EAGLContext alloc] initWithAPI:kEAGLRenderingAPIOpenGLES1];
    if (!context || ![EAGLContext setCurrentContext:context] || ![self createFramebuffer])
    {
        [self release];
        return nil;
    }

    // Set up remainder of OpenGL ES 1.1 pathway
}
else
{
    // Set up OpenGL ES 2.0 rendering pathway
}


where context is an EAGLContext instance.  This first tries to create an OpenGL ES 2.0 context, set that context, and load a set of shaders for the rendering.  If any of those actions fails, it indicates that OpenGL ES 2.0 is not supported on the device.  Instead, it falls back to creating an OpenGL ES 1.1 context and setting up that rendering pathway.  We will discuss the differences between these versions of the OpenGL ES specification later.


An EAGLContext is an OpenGL drawing context, similar to the Core Graphics drawing contexts we've seen before.  


After setting up a context, a framebuffer must be constructed to be the final destination in the rendering pipeline.  It is what all of the pixels of your 3-D graphics will be rendered into.  Framebuffers also have one or more renderbuffers attached to them.  The renderbuffers are 2-D images of pixels, and typically come in one of three types: color, depth, and stencil.  The following code shows how to create a framebuffer and attach a color renderbuffer to it in OpenGL ES 1.1:


glGenFramebuffersOES(1, &defaultFramebuffer);

glGenRenderbuffersOES(1, &colorRenderbuffer);

glBindFramebufferOES(GL_FRAMEBUFFER_OES, defaultFramebuffer);

glBindRenderbufferOES(GL_RENDERBUFFER_OES, colorRenderbuffer);

glFramebufferRenderbufferOES(GL_FRAMEBUFFER_OES, GL_COLOR_ATTACHMENT0_OES, GL_RENDERBUFFER_OES, colorRenderbuffer);


where defaultFramebuffer and colorRenderbuffer are GLuint values called "names" in OpenGL.  A name is an integer handle to a particular object.


GLuint is an OpenGL-defined numerical type for an unsigned integer.  Other OpenGL numerical types you'll see in this section include GLfloat, GLshort, GLushort, GLubyte, and GLenum.



This same implementation in OpenGL ES 2.0 looks like:


glGenFramebuffers(1, &defaultFramebuffer);

glGenRenderbuffers(1, &colorRenderbuffer);

glBindFramebuffer(GL_FRAMEBUFFER, defaultFramebuffer);

glBindRenderbuffer(GL_RENDERBUFFER, colorRenderbuffer);

glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_RENDERBUFFER, colorRenderbuffer);


Note the lack of OES in the names of the OpenGL ES 2.0 functions.  OpenGL ES allows extensions that expand upon the base capabilities of the API, with the OES suffix marking extensions ratified by the Khronos Group.  In OpenGL ES 1.1, framebuffer objects are provided through such an extension, but they became common enough that they were rolled into the baseline specification for OpenGL ES 2.0.


To set up the initial size of the render buffer, you can use code like the following (OpenGL ES 1.1):


glBindRenderbufferOES(GL_RENDERBUFFER_OES, colorRenderbuffer);

[context renderbufferStorage:GL_RENDERBUFFER_OES fromDrawable:layer];

glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_WIDTH_OES, &backingWidth);

glGetRenderbufferParameterivOES(GL_RENDERBUFFER_OES, GL_RENDERBUFFER_HEIGHT_OES, &backingHeight);


This causes the color renderbuffer to be shared between Core Animation and OpenGL ES in the CAEAGLLayer, so that the content can be displayed.  The -renderbufferStorage:fromDrawable: method also causes the color renderbuffer to be sized to fit the layer.  The width and height of the color renderbuffer are then retrieved for later use.


If you are creating a depth renderbuffer (for managing shadows and other depth-related rendering), you would insert code like this after the above:


glGenRenderbuffersOES(1, &depthRenderbuffer);

glBindRenderbufferOES(GL_RENDERBUFFER_OES, depthRenderbuffer);

glRenderbufferStorageOES(GL_RENDERBUFFER_OES, GL_DEPTH_COMPONENT16_OES, backingWidth, backingHeight);

glFramebufferRenderbufferOES(GL_FRAMEBUFFER_OES, GL_DEPTH_ATTACHMENT_OES, GL_RENDERBUFFER_OES, depthRenderbuffer);


Finally, you can check the status of the created framebuffer to make sure everything was set up correctly using the following code:


if (glCheckFramebufferStatusOES(GL_FRAMEBUFFER_OES) != GL_FRAMEBUFFER_COMPLETE_OES)
{
    NSLog(@"Failed to make complete framebuffer object %x", glCheckFramebufferStatusOES(GL_FRAMEBUFFER_OES));
    return NO;
}


When done with the framebuffer, renderbuffers, and the OpenGL context, you can use code like the following to deallocate them:


if (defaultFramebuffer)
{
    glDeleteFramebuffersOES(1, &defaultFramebuffer);
    defaultFramebuffer = 0;
}

if (colorRenderbuffer)
{
    glDeleteRenderbuffersOES(1, &colorRenderbuffer);
    colorRenderbuffer = 0;
}

if ([EAGLContext currentContext] == context)
    [EAGLContext setCurrentContext:nil];

[context release];
context = nil;


Once all of that is configured, you have yourself a UIView subclass that can host OpenGL drawing content.


Lighting


Before actually rendering 3-D content to the screen, you may want to configure the lighting of the scene you're about to render.  An example of this is as follows:


const GLfloat lightAmbient[] = {0.2, 0.2, 0.2, 1.0};

const GLfloat lightDiffuse[] = {1.0, 1.0, 1.0, 1.0};

const GLfloat matAmbient[] = {1.0, 1.0, 1.0, 1.0};

const GLfloat matDiffuse[] = {1.0, 1.0, 1.0, 1.0};

const GLfloat lightPosition[] = {0.466, -0.466, 0, 0}; 

const GLfloat lightShininess = 20.0;

glEnable(GL_LIGHTING);

glEnable(GL_LIGHT0);

glEnable(GL_COLOR_MATERIAL);

glMaterialfv(GL_FRONT_AND_BACK, GL_AMBIENT, matAmbient);

glMaterialfv(GL_FRONT_AND_BACK, GL_DIFFUSE, matDiffuse);

glMaterialf(GL_FRONT_AND_BACK, GL_SHININESS, lightShininess);

glLightfv(GL_LIGHT0, GL_AMBIENT, lightAmbient);

glLightfv(GL_LIGHT0, GL_DIFFUSE, lightDiffuse);

glLightfv(GL_LIGHT0, GL_POSITION, lightPosition); 


In this example, we enable lighting in our scene using glEnable().  If you recall, OpenGL is a state machine, so when we enable or disable certain features, they remain in effect until we explicitly change them.


After enabling lighting overall, we enable light number 0 and the ability to specify different colors for your different vertices within your 3-D model.


We then set up several properties of the material to be used for our lighting model.  Note that the f at the end of glMaterialf() indicates that this particular value will be specified as a floating point number.  An fv in glMaterialfv() means that a pointer to a C array of floating point values will be given.  You'll see these suffixes in several OpenGL functions.  Another suffix you might run into is x, which means that the function takes fixed-point values (stored in integer types).


The material properties include the ambient and diffuse reflectance of the material, as well as the degree of shininess of the material.  The constant GL_FRONT_AND_BACK indicates that both front-facing and back-facing sides of the material will have these properties.


We then configure the properties of light 0.  The color and alpha channel of the diffuse and ambient light provided by this source are set, followed by its position in 3-D space (0.466, -0.466, 0).


Light 0 is slightly different from the other lights in that it has a default diffuse setting  of (1.0, 1.0, 1.0, 1.0).  All other lights have a default diffuse setting of  (0.0, 0.0, 0.0, 1.0).


Rendering properties


Many overall properties of our rendering can also be tweaked at this time.  For performance reasons, it's best to disable any features that you will not need in your rendered scene:


glDisable(GL_NORMALIZE);

glDisable(GL_ALPHA_TEST);

glDisable(GL_FOG);


You can adjust the way that shading is done to your model.  For smooth shading, you can use the following:


glShadeModel(GL_SMOOTH);


The other possible value for this is flat shading using GL_FLAT.


If you would like to enable depth testing for working with your depth buffer (for shading, etc.), you use the following:


glEnable(GL_DEPTH_TEST);

glDepthFunc(GL_LEQUAL);


In this case, we've set the depth testing function such that a given pixel will be drawn if its depth value is less than or equal to the stored depth value.


As a performance optimization, you may wish to cull backfaces (remove faces of polygons that point away from the camera) using code like the following:


glEnable(GL_CULL_FACE);

glCullFace(GL_BACK);


Drawing a simple model


Finally, after all of the scene properties have been set, it's time to actually render and display our model.  First, we may need to set the current context:


[EAGLContext setCurrentContext:context];


This call will be redundant if you only have one context.


Another potentially redundant call is one to bind the current framebuffer:


glBindFramebufferOES(GL_FRAMEBUFFER_OES, defaultFramebuffer);


Next, we will want to set up the viewport for our model.  The viewport is the rectangular region of the render surface that the scene's final coordinates are mapped into; here it is sized to match the renderbuffer.  This is accomplished using code like the following:


glViewport(0, 0, backingWidth, backingHeight);


After that is the projection matrix.  The projection matrix converts eye coordinates to clipping coordinates.  To just use the identity matrix (do nothing to the eye coordinates), you can use the following code:


glMatrixMode(GL_PROJECTION);

glLoadIdentity();


More on how to use the projection matrix (or rather how not to use it) can be found in the article "Help stamp out GL_PROJECTION abuse"  by Steve Baker.  See also the tutorial on OpenGL Transformation by Song Ho Ahn.


Next up is the model view matrix.  This matrix transforms the coordinates of each vertex and normal from object coordinates to eye coordinates by applying rotation, scaling, and other transformations.  In the following example,


glMatrixMode(GL_MODELVIEW);

glLoadIdentity();

glTranslatef(0.0f, (GLfloat)translationInY, 0.0f);


we set the current state to be modifying the model view matrix, start it off at the identity matrix, and then translate in Y by translationInY units.  There are several functions for manipulating the model view matrix, including glRotatef() and glScalef().  Note that you do not need to set the model view matrix back to the identity matrix when rendering each frame; you can apply rotations, translations, etc. to the previous value of the model view matrix.
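

As a quick illustration (the angle and scale values here are arbitrary), each call multiplies onto the current model view matrix, so the order of the calls matters:


glMatrixMode(GL_MODELVIEW);
glRotatef(45.0f, 0.0f, 0.0f, 1.0f);    // rotate 45 degrees about the Z axis
glScalef(2.0f, 2.0f, 2.0f);            // then scale uniformly by a factor of 2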


If you have precalculated the model view matrix, you can directly replace the matrix using code like the following:


glLoadMatrixf(currentModelViewMatrix);


Additionally, you can multiply the current model view matrix with another using 


glMultMatrixf(currentModelViewMatrix);


Often, you may want to manipulate the model view matrix for drawing, then revert to the state it was in before your manipulations.  To do this, first use 


glPushMatrix();


to push the current matrix onto the stack.  When done with your matrix manipulations, call 


glPopMatrix();


to return the model view matrix to the state it was when you pushed it.  This is similar to the pushing and popping of graphics states we saw back when we were drawing in Quartz.
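

A minimal sketch of the pattern, with a hypothetical per-object translation:


glPushMatrix();
glTranslatef(1.0f, 0.0f, 0.0f);    // position just this object
// ... draw the object ...
glPopMatrix();                     // restore the matrix for whatever is drawn next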


If you would like to clear the screen before drawing, you can use code like the following:


glClearColor(0.5f, 0.5f, 0.5f, 1.0f);

glClear(GL_COLOR_BUFFER_BIT);


This will uniformly color the screen grey.


When specifying the geometry to be drawn, you have several options.  The first is to create an array of vertices and potentially corresponding colors for those vertices, then draw those to the screen.  For example, the following code will draw a rectangle:


static const GLfloat squareVertices[] = {
    -0.5f, -0.33f,
     0.5f, -0.33f,
    -0.5f,  0.33f,
     0.5f,  0.33f,
};

static const GLubyte squareColors[] = {
    255, 255,   0, 255,
      0, 255, 255, 255,
      0,   0,   0,   0,
    255,   0, 255, 255,
};

glVertexPointer(2, GL_FLOAT, 0, squareVertices);
glEnableClientState(GL_VERTEX_ARRAY);
glColorPointer(4, GL_UNSIGNED_BYTE, 0, squareColors);
glEnableClientState(GL_COLOR_ARRAY);

glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);


In this example, a C array is set up that contains a set of 2-D vertices, along with a corresponding array of 4-byte RGBA values for the color of each vertex.


glVertexPointer() sets up the vertex array, with the arguments specifying that there are 2 dimensions to the points in the array (X, Y), the values in the array are GLfloats, the stride is 0, and the values in the array should come from squareVertices.  The stride portion tells how many bytes apart the vertices are in the array being fed in.  A value of 0 indicates that only vertices are present in this array.  As we'll talk about later, one performance optimization you can make is to interleave vertex, color, and texture data within one array, where the stride lets OpenGL know how to find all of the vertex values in that array.


The ability to use vertex arrays is enabled using the GL_VERTEX_ARRAY state.


Similarly, the color array is specified using glColorPointer(), which takes in the number of color components (4 for RGBA), the data type of the individual components (GLubyte), the stride (0), and the array to use (squareColors).  The use of color arrays needs to be enabled using the GL_COLOR_ARRAY state.


When both arrays have been specified, they can be sent to be rendered using glDrawArrays().  The parameters are the drawing mode, the offset from the start of the array to start drawing, and the number of indices to draw.


There are several drawing modes, including GL_POINTS, GL_LINES, GL_LINE_STRIP, GL_LINE_LOOP, GL_TRIANGLES, GL_TRIANGLE_STRIP, and GL_TRIANGLE_FAN.



Another way to draw geometry is to use indices.  Indices let you define how triangles are formed from the vertex arrays you are feeding in, rather than having OpenGL ES automatically assign triangles from the arrays of vertices you pass in.  An index array contains a series of integers that indicate what vertex to use for each point in the triangles to be drawn.  This can save on geometry that needs to be sent, due to vertices that may be reused in multiple triangles.  The iPhone hardware also works best with indexed triangle strips.
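

As an example, an index array describing the four-vertex square from earlier as a triangle strip might look like the following (the ordering here is just an illustration):


static const GLushort indices[] = {0, 1, 2, 3};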


Instead of drawing plain arrays, you draw indexed geometry using code like the following:


glDrawElements(GL_TRIANGLE_STRIP, 4, GL_UNSIGNED_SHORT, indices);


This particular draw call renders the geometry as a triangle strip, draws 4 indices, interprets those indices as GLushorts, and reads them from the indices array that describes the geometry to be drawn.


In addition to supplying vertices and colors, you can also determine how your geometry is lit by supplying normals for your geometry.  As mentioned before, a normal is a vector that is orthogonal to a particular plane.  In terms of lighting, to make a flat surface reflect light like a smooth, flat plane, you need to provide a normal to the plane that the surface lies within.  This can also be used to blunt sharp corners when lit, making sharp objects appear round.


Normals are specified in the same way as vertices, using an array of three-component values, only here each entry is a vector's components rather than a position.  To load a normal array, you use code like the following:


glNormalPointer(GL_FLOAT, 0, cubeNormals);

glEnableClientState(GL_NORMAL_ARRAY);


This is similar to what's used for vertices above, where we specify the data type used (GLfloat), the stride (0), and the name of the array containing the normals (cubeNormals).  We then enable the use of normal arrays in OpenGL.


When constructing the arrays for your model, I recommend using an NSMutableData instance.  That way, you can simply append data as you generate the model, without worrying about the memory allocation details, and extract the array at the end as a series of bytes that can be passed into a drawing call.
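

A minimal sketch of that approach (the vertex values are placeholders):


NSMutableData *vertexData = [[NSMutableData alloc] init];

GLfloat newVertex[3] = {0.0f, 1.0f, 0.0f};
[vertexData appendBytes:newVertex length:sizeof(newVertex)];
// ... append the rest of the model's vertices as they are generated ...

// When drawing, hand the raw bytes straight to OpenGL ES:
glVertexPointer(3, GL_FLOAT, 0, [vertexData bytes]);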


When done with your drawing commands, and you are ready to render your scene, you simply need to use code like the following:


glBindRenderbufferOES(GL_RENDERBUFFER_OES, viewRenderbuffer);

[context presentRenderbuffer:GL_RENDERBUFFER_OES];


Again, the glBindRenderbufferOES() may be redundant if you only have the one renderbuffer you are dealing with.


Vertex buffer objects


Above, we discussed ways of rendering geometry to the screen using arrays of triangles.  In addition to those approaches, you can use a structure called a vertex buffer object (VBO) to optimize your rendering.  VBOs let you push geometry into the memory of the GPU and manage it there, rather than having to send it over to the GPU every frame.  You can also set up a VBO in such a way that you can alter its geometry using direct memory access (DMA), freeing up the CPU and leading to better performance.


On iPhone OS devices prior to the iPhone 3G S, VBOs did not lead to a significant performance gain, but on the iPhone 3G S, the third-generation iPod touch, the iPad, and the Mac, they can lead to huge increases in rendering speed because VBOs are hardware accelerated on those platforms.


To create a VBO for vertices, colors, or normals, you use code like the following:


glGenBuffers(1, &m_vertexBufferHandle); 

glBindBuffer(GL_ARRAY_BUFFER, m_vertexBufferHandle); 

glBufferData(GL_ARRAY_BUFFER, [currentVertexBuffer length], (void *)[currentVertexBuffer bytes], GL_STATIC_DRAW); 


where currentVertexBuffer is an NSData or NSMutableData instance and m_vertexBufferHandle is a GLuint.  This generates a single buffer and assigns its handle to m_vertexBufferHandle, binds that buffer as an array of data items so that we can write to it, and then transfers the data from the memory of the application to the GPU.  The flag GL_STATIC_DRAW indicates that this particular buffer will be written to once and not changed afterward.  This lets the GPU perform optimizations on the buffered data.  This can also be GL_DYNAMIC_DRAW, in which case you are indicating that the buffer data will be altered at some point in the future.


A similar process is used for creating a VBO for indices:


glGenBuffers(1, &m_indexBufferHandle); 

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, m_indexBufferHandle);   

glBufferData(GL_ELEMENT_ARRAY_BUFFER, [currentIndexBuffer length], [currentIndexBuffer bytes], GL_STATIC_DRAW);     


Once the buffers are created and loaded with data, the local copies of that data can be freed.


To draw from these buffers, you can use code like the following:


glBindBuffer(GL_ARRAY_BUFFER, m_vertexBufferHandle); 


glVertexPointer(3, GL_SHORT, 20, (char *)NULL + 0);

glNormalPointer(GL_SHORT, 20, (char *)NULL + 8); 

glColorPointer(4, GL_UNSIGNED_BYTE, 20, (char *)NULL + 16);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, m_indexBufferHandle);    


glDrawElements(GL_TRIANGLES, m_numberOfIndicesForBuffers, GL_UNSIGNED_SHORT, NULL);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); 

glBindBuffer(GL_ARRAY_BUFFER, 0); 


This first binds a VBO containing interleaved vertex, normal, and color data (for performance reasons), and sets up the vertex, normal, and color arrays from this buffer.  Then, a VBO for the index array is bound.  Finally, glDrawElements() is called to draw the triangles to the screen.


Note that instead of arrays, a NULL value is passed in as part of the last argument for the array setup functions, as well as the call to glDrawElements().  This indicates that values from a VBO are to be used instead of the normal in-memory arrays.


When the drawing is finished, the buffers are unbound so that others can be used in their place for later drawing.


Finally, when you would like to free the memory for the VBOs, you need to use code like the following:


glDeleteBuffers(1, &m_indexBufferHandle);

glDeleteBuffers(1, &m_vertexBufferHandle);


Drawing textures


One of the keys to generating a realistic 3-D scene is the use of textures on your models.  Textures are images that can be wrapped around your 3-D models in various ways.


Textures on pre-iPhone 3G S devices need to be powers of two in dimension (256 x 256, 512 x 512).  Newer devices have extensions that provide limited support for non-power-of-two texture sizes.


When using textures in your application, you can start from images you load out of your application's resources and simply convert those to textures.  However, it is highly recommended that you compress your images to be used as textures ahead of time using the PowerVR (the brand of GPU in the current iPhone OS devices) texture compression tool texturetool.   This application is located at /Developer/Platforms/iPhoneOS.platform/Developer/usr/bin/texturetool.


Textures compressed in this manner can be used on the GPU in their compressed form, instead of being the full bitmapped images you normally would use.  For an example of how to automatically compress your PNG images to textures as part of the build process, see Apple's PVRTextureLoader sample application and its Encode Images Run Script build phase.


To use these compressed textures in your application, I again recommend using the PVRTextureLoader sample as a basis and lifting its PVRTexture class wholesale.  You can then use code like the following:


NSString *pathToTexture = [[NSBundle mainBundle] pathForResource:@"texture" ofType:@"pvrtc"];

texture = [[PVRTexture alloc] initWithContentsOfFile:pathToTexture];


which creates a new PVRTexture instance, properly loading the compressed texture for use in OpenGL ES.


You then can bind the texture using code like the following:


glEnable(GL_TEXTURE_2D);


glBindTexture(GL_TEXTURE_2D, texture.name);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAX_ANISOTROPY_EXT, 1.0f);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);


This binds the texture based on the name property of the PVRTexture instance.  It also sets some necessary properties for displaying the texture, including the interpolation filter to be applied when the texture is zoomed in and out, and setting the texture to be clamped to the edges of the model, rather than repeat across its face.


You then need to describe how the texture will be mapped to the face of your model.  A texture has its own coordinate system, with the bottom-left being (0.0, 0.0) and the upper-right being (1.0, 1.0), no matter the actual size in pixels of the texture.  An array of 2-D texture coordinates needs to be provided, with each texture coordinate being mapped to a particular vertex in the model.
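

For instance, a texture coordinate array that stretches the full texture across the four-vertex square from earlier might look like the following (the mapping here is just an illustration):


static const GLfloat textureCoordinates[] = {
    0.0f, 0.0f,
    1.0f, 0.0f,
    0.0f, 1.0f,
    1.0f, 1.0f,
};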


Once the texture array has been specified (or placed in a VBO), you can use code like the following to assign the array:


glTexCoordPointer(2, GL_FLOAT, 0, textureCoordinates);

glEnableClientState(GL_TEXTURE_COORD_ARRAY);


This specifies the number of coordinates per array element (2), the data type of each texture element (GLfloat), the stride (0), and the array containing the texture coordinates.


With this provided, the texture will be drawn over your model as part of glDrawArrays() or glDrawElements().


Debugging OpenGL


It is very hard to debug errors in OpenGL ES.  Normally, you just get a black screen if something goes wrong.


You can get some debugging information through the use of code like the following:


GLenum err = glGetError();
if (err != GL_NO_ERROR)
    NSLog(@"Error in frame. glError: 0x%04X", err);


CADisplayLink


One class you might notice in the OpenGL ES Application template is CADisplayLink.  This is a new class in iPhone OS 3.1 that handles a common task in OpenGL ES applications: managing updates of the display.  Typically, if you wanted to regularly update the display, such as in a game or for an animating element, you would have needed to create an NSTimer instance or something similar to fire a selector at a given interval.  However, this interval was not always regular and often did not match the refresh rate of the screen.


CADisplayLink solves this problem by matching update calls to screen refreshes.  You set up a CADisplayLink in a manner similar to the following:


displayLink = [CADisplayLink displayLinkWithTarget:self selector:@selector(drawView:)];

[displayLink setFrameInterval:1];

[displayLink addToRunLoop:[NSRunLoop currentRunLoop] forMode:NSDefaultRunLoopMode];


This causes the CADisplayLink to call the -drawView: method once per screen refresh (60 FPS).


When done, you can stop the CADisplayLink using code like the following:


[displayLink invalidate];

displayLink = nil;


CADisplayLink actually existed as a symbol in iPhone OS 3.0, so you can't use NSClassFromString() to determine if it exists on your user's device.  This is one of the rare cases where Apple suggests checking for a specific OS version, rather than a specific bit of functionality.  You can test for the existence of iPhone OS 3.1 using code like the following:


NSString *reqSysVer = @"3.1";

NSString *currSysVer = [[UIDevice currentDevice] systemVersion];

if ([currSysVer compare:reqSysVer options:NSNumericSearch] != NSOrderedAscending)

displayLinkSupported = TRUE;


OpenGL ES 2.0


With the iPhone 3G S, the third-generation iPod touch, and now the iPad, a new set of GPUs that support OpenGL ES 2.0 was added to the iPhone OS product line.  Most of what we've described has been relative to OpenGL ES 1.1 and its rendering pipeline.  OpenGL ES 2.0 adds a whole new set of capabilities for the devices that support it.


OpenGL ES 1.1 has a fixed-function graphics pipeline, where OpenGL ES 2.0 has a programmable one.  This means that under OpenGL ES 1.1, lighting and other functions obey set calculations that you cannot change.  In OpenGL ES 2.0, you can write programs to perform arbitrary functions using the GPU, letting you pull off some pretty stunning hardware-accelerated effects.


The default template application created when you start an OpenGL ES project now has both OpenGL ES 1.1 and OpenGL ES 2.0 renderers that perform the same task.  You might notice that the OpenGL ES 2.0 renderer is significantly more complex.  For simple drawing tasks, it is often easier to implement them in terms of OpenGL ES 1.1 than 2.0.


In OpenGL ES 2.0, you set up these functions by defining small programs called shaders.  These shaders are written using the OpenGL Shading Language (GLSL), which has a C-like syntax.  A vertex shader takes inputs in the form of per-vertex data (attributes) like coordinates and normals, and constant data (uniforms) like the light coordinates or the current matrix, and produces the transformed vertex position along with per-vertex outputs (varyings), such as texture coordinates, that are interpolated and passed on to the fragment shader.
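

Shaders are compiled and linked at runtime through the ES 2.0 C API.  As a rough sketch (error handling trimmed, and vertexShaderSource is assumed to be a C string holding GLSL source):


GLuint vertexShader = glCreateShader(GL_VERTEX_SHADER);
glShaderSource(vertexShader, 1, &vertexShaderSource, NULL);   // hand the GLSL source to OpenGL
glCompileShader(vertexShader);

GLuint program = glCreateProgram();
glAttachShader(program, vertexShader);
// ... create and attach a fragment shader in the same way ...
glLinkProgram(program);
glUseProgram(program);    // make this program current for subsequent drawing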


For rapid development of shaders, you can use Quartz Composer on the Mac, a developer tool that many people overlook.  It lets you apply a shader to a 3-D model and see the realtime results as you edit the shader program.





Performance optimization


Optimization of OpenGL ES could be a course by itself.  I'll just hit the highlights here.


First, you should understand the 3-D hardware inside of the devices you'll be developing for.  Devices released before the iPhone 3G S use Imagination Technologies' PowerVR MBX Lite, and the iPhone 3G S, third-generation iPod touch, and apparently the iPad use the PowerVR SGX.  These GPUs are tile-based deferred renderers, which means that they cache all the drawing commands for a frame, divide the frame into tiles, and then render each tile in one operation.


This tiling makes memory access more efficient, and provides performance advantages when doing depth testing and blending.  Hidden surface removal is also done automatically on the GPU, so you don't need to depth-sort your objects to optimize performance.  It is recommended, however, that you draw all opaque objects first, then ones with transparency, so that non-visible objects can be removed first.


This is different from what you'd find in your Mac's GPU, so things that work well in the Simulator may have vastly different behavior on the iPhone.  As always, test on the device as soon and as regularly as you can.


The PowerVR MBX Lite only supports OpenGL ES 1.1, and is very sensitive to memory usage.  It has a 24 MB limit on textures and render buffers, with a rated maximum texture size of 1024 x 1024 (although I've used views as large as 2048 x 2048 on iPhones using this GPU).  As mentioned previously, textures must be a power of two in dimension.


The PowerVR SGX has support for both OpenGL ES 1.1 and 2.0, and has a higher memory bandwidth than the MBX Lite.  There is no hard limit on textures and render buffers, and textures can have non-power-of-two dimensions.  These GPUs have hardware acceleration of VBOs, so if you use these objects you will see a significant speedup.


For more on the hardware-specific capabilities of these renderers, see Apple's Mastering OpenGL ES for iPhone - Parts 1 and 2 videos available to registered iPhone developers.


Ordered by what I have observed to make the most difference, performance-wise, here are the items you should be aware of:


Don't halt the rendering pipeline with glGet* functions.  The tile-based deferred renderer of the iPhone builds up a list of operations to be performed, then runs them in one operation in order to maximize performance.  If you do any operation that queries the current state of OpenGL, that will introduce a synchronization point and will require OpenGL to render everything up to that point so that you can extract a value.  This can lead to a serious performance degradation, so avoid this if at all possible.  This includes glGetError() calls, which should be stripped out of non-debugging builds (possibly by using compiler conditionals).  If you must query the OpenGL state, do so at the very end of your drawing, right before the renderbuffer is presented.
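

For the glGetError() checks in particular, one way to strip them from release builds (a sketch, assuming a DEBUG preprocessor flag is defined only for debug configurations) is a small macro:


#ifdef DEBUG
#define CHECK_GL_ERROR() do { \
    GLenum err = glGetError(); \
    if (err != GL_NO_ERROR) \
        NSLog(@"glError: 0x%04X", err); \
} while (0)
#else
#define CHECK_GL_ERROR()    // compiles away to nothing in release builds
#endif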


One trick I've employed to handle this is to use a Core Animation CATransform3D struct as a stand-in for the model view matrix.  It turns out that they share an identical internal structure, so the Core Animation functions for rotating, scaling, and otherwise manipulating a transform perform the same math that OpenGL does when rotating, scaling, etc. the model view matrix.  You can then maintain a transform, manipulate it, and then set the model view matrix to this transform using conversion methods like the following:


- (void)convertMatrix:(GLfloat *)matrix to3DTransform:(CATransform3D *)transform3D
{
    transform3D->m11 = (CGFloat)matrix[0];
    transform3D->m12 = (CGFloat)matrix[1];
    transform3D->m13 = (CGFloat)matrix[2];
    transform3D->m14 = (CGFloat)matrix[3];
    transform3D->m21 = (CGFloat)matrix[4];
    transform3D->m22 = (CGFloat)matrix[5];
    transform3D->m23 = (CGFloat)matrix[6];
    transform3D->m24 = (CGFloat)matrix[7];
    transform3D->m31 = (CGFloat)matrix[8];
    transform3D->m32 = (CGFloat)matrix[9];
    transform3D->m33 = (CGFloat)matrix[10];
    transform3D->m34 = (CGFloat)matrix[11];
    transform3D->m41 = (CGFloat)matrix[12];
    transform3D->m42 = (CGFloat)matrix[13];
    transform3D->m43 = (CGFloat)matrix[14];
    transform3D->m44 = (CGFloat)matrix[15];
}

- (void)convert3DTransform:(CATransform3D *)transform3D toMatrix:(GLfloat *)matrix
{
    matrix[0] = (GLfloat)transform3D->m11;
    matrix[1] = (GLfloat)transform3D->m12;
    matrix[2] = (GLfloat)transform3D->m13;
    matrix[3] = (GLfloat)transform3D->m14;
    matrix[4] = (GLfloat)transform3D->m21;
    matrix[5] = (GLfloat)transform3D->m22;
    matrix[6] = (GLfloat)transform3D->m23;
    matrix[7] = (GLfloat)transform3D->m24;
    matrix[8] = (GLfloat)transform3D->m31;
    matrix[9] = (GLfloat)transform3D->m32;
    matrix[10] = (GLfloat)transform3D->m33;
    matrix[11] = (GLfloat)transform3D->m34;
    matrix[12] = (GLfloat)transform3D->m41;
    matrix[13] = (GLfloat)transform3D->m42;
    matrix[14] = (GLfloat)transform3D->m43;
    matrix[15] = (GLfloat)transform3D->m44;
}


Once you've converted the CATransform3D to a model view matrix, you can use glLoadMatrixf() to set the current model view matrix.  This way, if you need to know the current state of the model view matrix, you can just query your transform without halting the OpenGL pipeline to look at the model view matrix.
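

Putting that together, a sketch of a per-frame update might look like the following, where currentCalculatedMatrix is a hypothetical CATransform3D instance variable holding the accumulated transform:


// Apply this frame's rotation to the Core Animation transform, not to OpenGL state
currentCalculatedMatrix = CATransform3DRotate(currentCalculatedMatrix, 0.1f, 0.0f, 1.0f, 0.0f);

GLfloat currentModelViewMatrix[16];
[self convert3DTransform:&currentCalculatedMatrix toMatrix:currentModelViewMatrix];
glLoadMatrixf(currentModelViewMatrix);   // push the result into the model view matrix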


Minimize the size of your geometry.  This is particularly critical on older iPhone devices.  If you profile your application in Instruments, and the Tiler Utilization statistic in the OpenGL ES instrument is nearly 100%, that indicates that your bottleneck is in the size of the geometry you're sending to the GPU.  Using indices is one way of minimizing geometry, allowing vertices to be reused.  Compressing textures is another.  


A way that I've found to be effective is to switch from using GLfloats (4 bytes) to GLshorts (2 bytes) for vertices and normals, and specify colors in terms of GLubytes (1 byte).  These smaller types still give you a good enough dynamic range to do accurate rendering, but they dramatically reduce the memory size of your models.


Precalculate your geometry.  One of the early performance bottlenecks I ran into was that I recalculated the vertices for each model for each frame.  The CPU on the iPhone was not able to keep up with this, and I saw horrible performance.  By calculating the locations once, and then storing in a VBO, I saw over a 4X improvement in rendering speed.


Minimize the number of drawing calls and state changes you make.  State changes in OpenGL are particularly expensive, so try to structure your rendering such that everything that needs to be drawn in one state is done in a batch, then everything that needs to be done in the next state, and so on.  Also try to minimize the number of drawing calls themselves.  Rather than drawing each atom in a molecule individually, I place them all together in one model and use a single drawing call for that.


Set your CAEAGLLayer's opaque property to YES.  Compositing an OpenGL layer onto other user interface elements is very expensive, so you almost always want to make sure that your layer is opaque.  The potential exception to this is in augmented reality applications, where you need to have it overlaid on a camera view.  User interface elements on top of an OpenGL layer do not cause that much of a slowdown, on the order of 5% in my tests.


Do not transform your CAEAGLLayer.  CAEAGLLayers or their containing views should not have scaling or rotation transforms applied to them, because that will lead to a slowdown during compositing.  The exception to this is 90-degree rotations on the PowerVR SGX systems, which can be handled in a hardware-accelerated manner.


Use backface culling.  As described above, you can enable backface culling in your scene to reduce the number of triangles that have to be processed, leading to a slight performance improvement in rendering.


Interleave your vertices, normals, and colors.  If you can, set up your VBOs or arrays such that you don't have separate arrays for vertices, normals, and colors, but interleave them in a single array.  That is, construct your array such that it has vertex 1, normal 1, color 1, vertex 2, normal 2, color 2 as its sequence of values, as sketched below.  This produces a slight speedup by grouping items that are likely to be used together next to each other in memory, allowing for more frequent cache hits.
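

One way to picture this is as a per-vertex C struct.  The layout below is a sketch matching the 20-byte stride and offsets used in the VBO drawing example earlier; the padding fields are assumptions added purely to keep each attribute at its expected offset:


typedef struct {
    GLshort position[3];   // bytes 0-5
    GLshort paddingA;      // bytes 6-7, so the normal starts at offset 8
    GLshort normal[3];     // bytes 8-13
    GLshort paddingB;      // bytes 14-15, so the color starts at offset 16
    GLubyte color[4];      // bytes 16-19, for a total stride of 20 bytes
} InterleavedVertex;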


Use the RGB565 colorspace instead of the RGBA8 colorspace.  By changing the default line


eaglLayer.drawableProperties = [NSDictionary dictionaryWithObjectsAndKeys: [NSNumber numberWithBool:FALSE], kEAGLDrawablePropertyRetainedBacking, kEAGLColorFormatRGBA8, kEAGLDrawablePropertyColorFormat, nil];


to read 


eaglLayer.drawableProperties = [NSDictionary dictionaryWithObjectsAndKeys: [NSNumber numberWithBool:FALSE], kEAGLDrawablePropertyRetainedBacking, kEAGLColorFormatRGB565, kEAGLDrawablePropertyColorFormat, nil];


you can see a slight performance improvement (~2% in my tests).  The RGB565 colorspace is optimized on the iPhone GPU hardware.  Note that you lose a little color fidelity this way, but it's an easy optimization to make.