Correcting XNA’s Gamma Correction

One thing I never used to pay attention to is gamma correction.  This is mainly because it rarely gets mentioned, and also because you can usually get pretty good results without ever even thinking about it.  However it only took a few days at my new job for me to realize just how essential it is if you want professional-quality results.

Lately I’ve been doing some research into inferred rendering (more on that later), and while working up a prototype renderer in XNA I decided that I would (for once) be gamma-correct throughout the pipeline.  So I went looking through the XNA Framework documentation for the framework’s equivalent of the D3DSAMP_SRGBTEXTURE sampler state (which automatically converts from sRGB to linear in the texture unit) and the D3DRS_SRGBWRITEENABLE render state (which automatically converts from linear to sRGB in the ROP)…and I didn’t find them.  The thought of these being left out struck me as odd, so I did a bit of searching on Google.  After refining my search terms I found this post by framework developer Shawn Hargreaves, confirming that those states were not exposed in the framework due to inconsistencies between Windows and Xbox.  After looking through some presentations again I concluded that he was talking about…

1.  The fact that the 360 uses a 4-segment piecewise linear approximation curve to perform conversion to and from sRGB, which gives quite different results compared to what you get with PC GPUs.

2.  The fact that blending behavior is different in DX9 and DX10-level GPUs, regardless of which API you use.  DX9 GPUs will perform framebuffer blending after conversion to sRGB (which is mathematically incorrect), while DX10 GPUs will do the blending in linear space and then convert the blended result to sRGB.  There is a cap to detect this behavior (D3DPMISCCAPS_POSTBLENDSRGBCONVERT) but it’s only available if you create an IDirect3D9Ex device.
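To make the difference in point 2 a bit more concrete, here’s a rough sketch of the two blending orders written out as plain HLSL functions.  This is illustration only: the real blending happens in fixed-function hardware, the function names are hypothetical, standard alpha blending is assumed, and a simple 2.2 gamma curve stands in for the real sRGB conversion.

```hlsl
// Illustration only: real framebuffer blending happens in fixed-function
// hardware. All names here are hypothetical, and a plain 2.2 gamma curve
// stands in for the actual sRGB transfer function.
float3 ToLinearApprox(float3 c) { return pow(c, 2.2f); }
float3 ToSRGBApprox(float3 c)   { return pow(c, 1.0f / 2.2f); }

// DX9-class behavior: the source color is converted to sRGB first, then
// blended against the sRGB-encoded value already in the framebuffer.
float3 BlendAfterConversion(float3 srcLinear, float3 dstSRGB, float alpha)
{
    float3 srcSRGB = ToSRGBApprox(srcLinear);
    return lerp(dstSRGB, srcSRGB, alpha);
}

// DX10-class behavior: the framebuffer value is decoded to linear, the
// blend happens in linear space, and the result is converted back to sRGB.
float3 BlendInLinearSpace(float3 srcLinear, float3 dstSRGB, float alpha)
{
    float3 dstLinear = ToLinearApprox(dstSRGB);
    return ToSRGBApprox(lerp(dstLinear, srcLinear, alpha));
}
```

Blending white over black at 50% alpha shows the gap: the first version stores 0.5, while the second stores roughly 0.73 (0.5 raised to 1/2.2), which is the encoded value that actually corresponds to half the light intensity.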

So yeah, that’s annoying.  But like most limitations in the framework you can work around them if you’re determined enough, and fortunately this one is a piece of cake.  Well…on the PC, at least.  So let’s start with the first half, sampling sRGB textures.  Like I mentioned before there’s a nice convenient sampler state in D3D9 that will do the sRGB->linear automatically, but XNA’s SamplerState just doesn’t have it.  But fortunately that’s not the only way to set sampler states…we can also get the Effects framework to do it for us by defining a sampler_state in our effect files.  So I took a peek at the D3D9 Effect States documentation, and added the appropriate state declaration to my effect file.  And it worked!  For the lazy, all you have to do is this (the important line is the SRGBTexture one):

texture2D DiffuseMap;
sampler2D DiffuseSampler = sampler_state
{
   Texture = <DiffuseMap>;
   SRGBTexture = true;
};

Okay now for the other half, sRGB writes.  Once again D3D9 has a convenient render state that does all of the work for us, and the Effects framework can set render states for us if we include them in a pass declaration.  But unfortunately this time the Effect States documentation didn’t have anything for SRGBWRITEENABLE.  Too determined to give up, I followed the standard convention of effect states and chopped the “D3DRS_” prefix off the render state name.  And hey, it worked!

technique Transparent
{
    pass Pass1
    {
       VertexShader = compile vs_3_0 TransparentVS();
       PixelShader = compile ps_3_0 TransparentPS();

       SRGBWriteEnable = true;
    }
}

So we’ve solved our gamma problems…at least if you’re only targeting the PC and you’re using Effects.  If you’re not using Effects, then I don’t know of any way to toggle those states.  It’s probably possible with some sort of interop/reflection voodoo, but I don’t know enough about these things to recommend it.

There’s also the Xbox 360 problem, which is actually two problems in one.  The first problem is that the Xbox 360 doesn’t use sampler and render states to control sRGB reads and writes.  It instead uses the D3D10 convention of having special surface formats for textures and render targets that control whether conversion takes place.  I don’t have access to my Xbox 360 at the moment so I can’t verify for sure, but I strongly suspect that the effect states won’t work.  And even if they did work you’d still have the second problem, which is that the Xbox uses that piecewise approximation curve (this presentation by Valve shows some of the nastiness that can occur with it).

Fortunately we can bypass those problems by doing the conversion ourselves in the shader.  The good news is that the code is a piece of cake…the bad news is that it’s not super cheap since it involves raising your RGB color value to a non-integral power. Here’s the code:

// Converts from linear RGB space to sRGB.
float3 LinearToSRGB(in float3 color)
{
    return pow(color, 1.0f / 2.2f);
}

// Converts from sRGB space to linear RGB.
float3 SRGBToLinear(in float3 color)
{
    return pow(color, 2.2f);
}
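Strictly speaking, a straight 2.2 power is only an approximation of sRGB.  The actual sRGB transfer function has a short linear segment near black and uses an exponent of 2.4 for the rest of the range.  If you need to match it more closely, something like the following sketch should do it.  The function names are hypothetical, and the inputs are assumed to be in the [0, 1] range (hence the saturate).

```hlsl
// Linear -> sRGB using the piecewise sRGB transfer function.
// Hypothetical names; assumes LDR input in [0, 1].
float3 LinearToSRGBExact(in float3 color)
{
    color = saturate(color);
    float3 lo = color * 12.92f;                              // linear segment near black
    float3 hi = 1.055f * pow(color, 1.0f / 2.4f) - 0.055f;   // power segment
    // step() returns 1 where color >= 0.0031308, which selects the power segment.
    return lerp(lo, hi, step(0.0031308f, color));
}

// sRGB -> linear, the inverse of the function above.
float3 SRGBToLinearExact(in float3 color)
{
    color = saturate(color);
    float3 lo = color / 12.92f;
    float3 hi = pow((color + 0.055f) / 1.055f, 2.4f);
    // step() returns 1 where color >= 0.04045, which selects the power segment.
    return lerp(lo, hi, step(0.04045f, color));
}
```

This costs a few more instructions than the plain pow version, so it’s only worth it if the difference in the darks actually matters for your content.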

Unfortunately with these you also have the problem that filtering and blending will be performed in sRGB space, and there’s not much you can do about that (aside from doing the filtering and blending yourself, but that would be way too expensive).
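For reference, here’s a minimal sketch of where the manual conversions would sit in a pixel shader.  Everything besides the two conversion functions is hypothetical (the shader name, the inputs, and the simple diffuse lighting are just there so there’s some linear-space math to do), so treat it as an outline rather than a drop-in shader.

```hlsl
texture2D DiffuseMap;
sampler2D DiffuseSampler = sampler_state
{
    Texture = <DiffuseMap>;
};

float3 SRGBToLinear(in float3 color) { return pow(color, 2.2f); }
float3 LinearToSRGB(in float3 color) { return pow(color, 1.0f / 2.2f); }

// Hypothetical light color, just so there is something to shade with.
float3 LightColor = float3(1.0f, 1.0f, 1.0f);

float4 ManualGammaPS(in float2 texCoord : TEXCOORD0,
                     in float3 normal   : TEXCOORD1,
                     in float3 lightDir : TEXCOORD2) : COLOR0
{
    // The texture unit has already filtered the texels by the time we get
    // the result, so the filtering was done on sRGB-encoded values.
    float4 texel = tex2D(DiffuseSampler, texCoord);
    float3 albedo = SRGBToLinear(texel.rgb);

    // Do the lighting math in linear space.
    float nDotL = saturate(dot(normalize(normal), normalize(lightDir)));
    float3 lit = albedo * LightColor * nDotL;

    // Encode back to sRGB on the way out. Alpha is left untouched, and any
    // framebuffer blending after this point works on the encoded values.
    return float4(LinearToSRGB(lit), texel.a);
}
```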

If you want to make these conversions a little cheaper, you can use a trick that my coworker showed me: round down the 2.2 to 2.0.  This gives you a simple square operation for conversion to linear (you can just multiply the value by itself), and a sqrt operation for conversion to sRGB.
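A minimal sketch of that approximation, with hypothetical function names.  Note that for a float3 color the square has to be per-component, so this uses color * color (a dot product of a float3 with itself would collapse to a single scalar).

```hlsl
// Gamma 2.0 approximation of the sRGB conversions (hypothetical names).

// sRGB -> linear: a per-component square instead of pow(color, 2.2f).
float3 SRGBToLinearFast(in float3 color)
{
    return color * color;
}

// Linear -> sRGB: a per-component square root instead of pow(color, 1/2.2f).
float3 LinearToSRGBFast(in float3 color)
{
    return sqrt(color);
}
```

The results come out slightly brighter in linear space than with 2.2: an encoded value of 0.5 decodes to 0.25 here, versus about 0.218 with the 2.2 exponent.  As long as you use the matching pair for both directions, the round trip is still consistent.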

More Post-Processing Tricks: Lens Flare

I was playing Killzone 2 the other day, which reminded me of the lens flare trick they used.  Unlike most games, which use some sprites controlled by an occlusion query, they applied the effect as a post-process similar to bloom.  The upside is that it works for all bright areas and not just pre-defined ones (like the sun), and you don’t have to do occlusion queries or anything like that since that’s handled automatically.  Plus it’s really easy to fit it into a post-processing chain, since you can use your bloom results as the input.  The downside is that it’s pretty far from realistic…I’m not sure that most people would like the end result.  This screen here shows the effect pretty clearly (it’s the orange and purple blobby areas by the left bad guy’s head, on the opposite side of the screen from the bright light source).

I haven’t seen anyone duplicate or even discuss the technique since before the game came out, so I figured I’d take a crack at deciphering it myself.  After some experimenting I came up with the following basic approach:

1.  Render a bloom buffer using standard downscale + threshold + blur
2.  Flip the texture coordinates by doing float2(1, 1) - texCoord
3.  Blur both towards the center of the screen and away from it
4.  Combine additively with the bloom buffer

To fake chromatic aberration, Killzone 2 uses a strong orange tint for areas closer to the center of the screen and a purple tint on the periphery.  On closer analysis it started to look like they were doing it in two passes, with a different tint and different texture coordinate scaling for each pass.  I decided to make my implementation the same, so I could produce similar results.  This is the shader code I came up with:


const static float4 vPurple = float4(0.7f, 0.2f, 0.9f, 1.0f);
const static float4 vOrange = float4(0.7f, 0.4f, 0.2f, 1.0f);
const static float fThreshold = 0.1f;

// Note: g_vSourceDimensions and PointSampler0 are not declared in this
// listing; they are assumed to be defined elsewhere in the effect file.
float4 LensFlarePS (in float2 in_vTexCoord : TEXCOORD0,
                    uniform int NumSamples,
                    uniform float4 vTint,
                    uniform float fTexScale,
                    uniform float fBlurScale) : COLOR0
{
    // The flare should appear on the opposite side of the screen from the
    // source of the light, so first we mirror the texture coordinate.
    // Then we normalize so we can apply a scaling factor.
    float2 vMirrorCoord = float2(1.0f, 1.0f) - in_vTexCoord;
    float2 vNormalizedCoord = vMirrorCoord * 2.0f - 1.0f;
    vNormalizedCoord *= fTexScale;

    // We'll blur towards the center of screen, and also away from it.
    float2 vTowardCenter = normalize(-vNormalizedCoord);
    float fBlurDist = fBlurScale * NumSamples;
    float2 vStartPoint = vNormalizedCoord + ((vTowardCenter / g_vSourceDimensions) * fBlurDist);
    float2 vStep = -(vTowardCenter / g_vSourceDimensions) * 2 * fBlurDist;

    // Do the blur and sum the samples
    float4 vSum = 0;
    float2 vSamplePos = vStartPoint;
    for (int i = 0; i < NumSamples; i++)
    {
        // Convert from [-1, 1] back to [0, 1] texture coordinates
        float2 vSampleTexCoord = vSamplePos * 0.5f + 0.5f;

        // Don't add in samples past texture border
        if (vSampleTexCoord.x >= 0 && vSampleTexCoord.x <= 1.0f
            && vSampleTexCoord.y >= 0 && vSampleTexCoord.y <= 1.0f)
        {
            float4 vSample = tex2D(PointSampler0, vSampleTexCoord);
            vSum += max(0, vSample - fThreshold) * vTint;
        }

        vSamplePos += vStep;
    }

    return vSum / NumSamples;
}

float4 CombinePS (in float2 in_vTexCoord : TEXCOORD0) : COLOR0
{
    float4 vColor = tex2D(PointSampler0, in_vTexCoord);
    vColor += tex2D(PointSampler1, in_vTexCoord);
    vColor += tex2D(PointSampler2, in_vTexCoord);
    return vColor;
}

technique LensFlareFirstPass
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 LensFlarePS(12, vOrange, 2.00f, 0.15f);

        ZEnable = false;
        ZWriteEnable = false;
        AlphaBlendEnable = false;
        AlphaTestEnable = false;
        StencilEnable = false;
    }
}

technique LensFlareSecondPass
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 LensFlarePS(12, vPurple, 0.5f, 0.1f);

        ZEnable = false;
        ZWriteEnable = false;
        AlphaBlendEnable = false;
        AlphaTestEnable = false;
        StencilEnable = false;
    }
}
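If it’s not obvious what the texture coordinate scale does, it can help to look at the mirror-and-scale step on its own.  The helper below is hypothetical (it isn’t part of the effect file above); it just pulls out the coordinate math from LensFlarePS and skips the blur offsets, so you can see which source texel a given output pixel ends up reading.

```hlsl
// Hypothetical helper: returns the texture coordinate that the flare pass
// reads from for a given output pixel, ignoring the blur offsets.
float2 FlareLookupCoord(float2 texCoord, float fTexScale)
{
    // Mirror through the center of the screen.
    float2 vMirrorCoord = float2(1.0f, 1.0f) - texCoord;

    // Move to [-1, 1] so the scale is applied relative to the screen center.
    float2 vNormalizedCoord = (vMirrorCoord * 2.0f - 1.0f) * fTexScale;

    // Back to [0, 1] texture coordinates.
    return vNormalizedCoord * 0.5f + 0.5f;
}
```

Take a bright spot at texture coordinate (0.7, 0.5).  With a scale of 2.0 the output pixel at (0.4, 0.5) reads from it, so that flare lands on the opposite side at half the distance from the center.  With a scale of 0.5 it’s the output pixel at (0.1, 0.5) that reads from it, so that flare lands at twice the distance, out towards the edge of the screen.  That lines up with the first pass (orange) sitting closer to the center and the second pass (purple) sitting on the periphery.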


Obviously the code is severely unoptimized, but it’s late and I’m tired.  Here’s a screen of what it looks like (ignore the obnoxious brightness and bloom, please):


Two Samples For The Price Of One

Today I have two XNA samples fresh out of the oven: a Motion Blur Sample, and a Depth Of Field Sample.  I figure all of the kids these days wanna add fancy post-processing tricks to their games, right?  The motion blur sample shows you how to do camera motion blur using a depth buffer, or full object motion blur using a velocity buffer.  The depth of field sample shows you how to do a standard blur-based DOF, a slightly-smarter blur-based DOF that doesn’t blur across edges, and the somewhat more physically accurate disc blur approach.

Get ‘em while they’re hot!
