Name Change

I got tired of having an awkward mouthful of a blog name, so I decided to shorten it to something much snappier.  Hence “MJP’s XNA Danger Zone” becomes simply “The Danger Zone”. I like it better already.

Actually the main reason for the change is that I’ve been taking a break from the XNA stuff so that I can finally play around with DX11 a bit. In fact I’ve been working on a simple and flexible DX11 sample framework, so you may see a few DX11 samples from me in the future.  It should be fun, since the new multi-threading features are really, really cool. I’d like to do some compute shader stuff too, especially since I haven’t gotten around to playing with CUDA yet. I think tessellation will have to wait a bit though, since I don’t think I’m going to buy a DX11 GPU until the Fermi-based GPUs from Nvidia come out. Until then, D3D_FEATURE_LEVEL_10_0 will have to do.

Inferred Rendering

So like I said in my last post, I’ve been doing some research into Inferred Rendering.  If you’re not familiar with the technique, Scott Kircher has the original paper and presentation materials hosted on his website.  The main topic of the paper is what they call “Discontinuity Sensitive Filtering”, or “DSF” for short.  Basically it’s standard 2×2 bilinear filtering, except in addition to sampling the texture you’re interested in you also sample what they call a “DSF buffer” containing depth, an instance ID (semi-unique for each instance rendered on-screen), and a normal ID (a semi-unique value identifying areas where the normals are continuous).  By comparing the values sampled from the DSF buffer with the values supplied for the mesh being rendered (they apply the DSF filter during the final pass of a light-prepass renderer, where meshes are re-rendered and sample from the lighting buffer), they can bias the bilinear weights so that texels not “belonging” to the object being rendered are automatically rejected.  They go through all of this effort so that they can do two things:

  1. They can use a lower-res G-Buffer and L-Buffer but still render their geometry at full res
  2. They can light transparent surfaces using a deferred approach, by applying a stipple pattern when rendering the transparents to the G-Buffer
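To make the filtering idea above a bit more concrete, here is a minimal sketch of a DSF-style lookup.  It is not the paper's code or mine: the sampler names, g_vLBufferSize, the depth epsilon, and the hard accept/reject test are all assumptions made for illustration, and it only uses depth and instance ID (no normal ID).  It also ignores the D3D9 half-pixel offset and has no fallback for the case where all four texels are rejected.

```hlsl
// Hypothetical resources; none of these names come from the paper.
sampler2D LightSampler;     // L-Buffer, point sampled
sampler2D DSFSampler;       // x = depth, y = instance ID (stored as a whole number)
float2 g_vLBufferSize;      // L-Buffer dimensions in texels
static const float fDepthEpsilon = 0.01f;   // assumed depth tolerance

float4 SampleLightingDSF(float2 vTexCoord, float fDepth, float fInstanceID)
{
    // Find the top-left texel of the 2x2 footprint and the bilinear fractions
    float2 vTexelPos = vTexCoord * g_vLBufferSize - 0.5f;
    float2 vBase = floor(vTexelPos);
    float2 vFrac = vTexelPos - vBase;

    float4 vSum = 0;
    float fWeightSum = 0;
    for (int y = 0; y < 2; y++)
    {
        for (int x = 0; x < 2; x++)
        {
            float2 vOffset = float2(x, y);
            float2 vSampleCoord = (vBase + vOffset + 0.5f) / g_vLBufferSize;

            // Standard bilinear weight for this texel
            float2 vW = lerp(1.0f - vFrac, vFrac, vOffset);
            float fWeight = vW.x * vW.y;

            // Zero the weight if the texel doesn't "belong" to this surface
            float2 vDSF = tex2D(DSFSampler, vSampleCoord).xy;
            fWeight *= (abs(vDSF.x - fDepth) < fDepthEpsilon) ? 1.0f : 0.0f;
            fWeight *= (abs(vDSF.y - fInstanceID) < 0.5f) ? 1.0f : 0.0f;

            vSum += tex2D(LightSampler, vSampleCoord) * fWeight;
            fWeightSum += fWeight;
        }
    }

    // Renormalize by the surviving weights (returns black if none survive)
    return vSum / max(fWeightSum, 0.0001f);
}
```

Even this stripped-down version needs eight texture fetches per pixel, which gives a feel for where the cost of the technique comes from.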

The second part is what’s interesting, so let’s talk about it.  Basically what they do is they break up the G-Buffer into 2×2 quads.  Then for transparent objects, an output mask is applied so that only one pixel in the quad is actually written to.  Then by rotating the mask, you can render up to 3 layers of transparency into the quad and still have opaques visible underneath.  For a visual, this is what a quad would look like if only one transparent layer was rendered:

So “T1” would be from the transparent surface, and “O” would be from opaque objects below it.  This is what it would look like if you had 3 transparent surfaces overlapping:

After laying out your G-Buffer, you then fill your L-Buffer (Lighting Buffer) with values just like you would with a standard Light Pre-pass renderer.  After you’ve filled your L-Buffer, you re-render your opaque geometry and sample your L-Buffer using a DSF filter so that only the texels belonging to opaque geometry get sampled.  Then you render your transparent geometry with blending enabled, each time adjusting your DSF sample positions so that the 4 nearest texels (according to the output mask you used when rendering it to the G-Buffer) are sampled.
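The stipple step when writing transparents to the G-Buffer could be sketched roughly like this.  This is only an illustration under my own assumptions: g_fLayerSlot, StippleClip, and the G-Buffer contents are hypothetical names and choices, and it relies on the ps_3_0 VPOS semantic to get the pixel position.

```hlsl
// Hypothetical: which of the 4 pixels in each 2x2 quad (0-3) this
// transparent layer is allowed to write to.
float g_fLayerSlot;

void StippleClip(float2 vScreenPos)
{
    // Position of this pixel inside its 2x2 quad: (0,0), (1,0), (0,1) or (1,1)
    float2 vQuadPos = fmod(floor(vScreenPos), 2.0f);
    float fSlot = vQuadPos.y * 2.0f + vQuadPos.x;

    // clip() discards the pixel when its argument is negative, so only
    // the pixel whose slot matches the layer's slot survives.
    clip(0.5f - abs(fSlot - g_fLayerSlot));
}

// Example G-Buffer pass for a transparent surface. The output layout
// (packed view-space normal) is just a placeholder.
float4 TransparentGBufferPS(in float2 vScreenPos : VPOS,
                            in float3 vNormalVS  : TEXCOORD0) : COLOR0
{
    StippleClip(vScreenPos);
    return float4(normalize(vNormalVS) * 0.5f + 0.5f, 1.0f);
}
```

With this layout you would give each of the (up to 3) transparent layers its own slot and leave the remaining slot untouched, so the opaque surface underneath stays visible in every quad.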

So you can light your transparents just like any other geometry, which is really cool stuff if you have a lot of dynamic lights and shadows (which you probably do if you’re doing deferred rendering in the first place).  But now come the downsides:

  1. Transparents end up being lit at 1/4 resolution, and opaques underneath transparents will be lit at either 3/4, 2/4, or 1/4 resolution.  How bad this looks mainly depends on whether you have high-frequency normal maps, since the lighting itself is generally low-frequency.  You’re also helped a bit by the fact that your diffuse albedo texture will still be sampled at full rate.  Here’s a screenshot comparing forward-rendered transparents (left-side), with deferred transparents (right-side):

    You can see that aliasing artifacts become visible on the transparent layers, due to the normal maps.  Even more noticeable is shadow map aliasing, which becomes worse on the transparent layers since it’s only sampled at 1/4 rate.  Here’s a screenshot showing the same comparison, this time with normal maps disabled:


    The aliasing becomes much less visible on the unshadowed areas with normal mapping disabled, since now the normals are much lower-frequency.  However you still have the same problem with shadow map aliasing.

  2. The DSF filtering is not cheap.  Or at least, the way I implemented it wasn’t cheap.  My code can probably be optimized a bit to reduce instructions, but unless I’m missing something big I don’t think you could make any big improvements.  If someone does figure out anything, please let me know!  Anyway when compiling my opaque pixel shader with fxc.exe (from the August 2009 SDK) using ps_3_0, I get a nice 11 instructions (9 math, 2 texture) when no DSF filtering is used.  When filtering is added in, it jumps up to a nasty 64 instructions (55 math, 9 texture)!  For transparents the shader jumps up again (71 math, 9 texture) since some additional math is needed to adjust the filtering in order to sample according to the stipple pattern.  Running the shaders through NVShaderPerf gives me the following:

    Here’s what I get with ATI’s GPU ShaderAnalyzer:

    So like I said, it’s definitely not free.  In the paper they mention that they also use a half-sized G-Buffer + L-Buffer which offsets the cost of the extra filtering.  When running my test app on my GTX 275 at half-res G-Buffer there’s almost no difference in framerate and at quarter-res it’s actually faster to defer the transparents.  Using a full-res G-Buffer/L-Buffer it’s quicker to forward-render the transparents, with 4 large point lights and 1 directional light + shadow.  So I’d imagine for a full-res G-Buffer/L-Buffer you’d need quite a few dynamic lights for it to pay off when going deferred for transparents.  But in my opinion, the decrease in quality when using a lower-res G-Buffer just isn’t worth it.  Here’s a screenshot showing deferred transparents with half-sized G-Buffer:

    Notice how bad the shadows look on the transparents, since now the shadow map is being sampled at 1/8th rate.  Even on the opaques you start to lose quite a bit of the normal map detail.

  3. You only get 3 layers of transparency.  However past 3 layers it would probably be really hard to notice that you’re missing anything, at least to the average player.
  4. Since you use instance ID’s to identify transparent layers, you’ll have problems with models that have multiple transparency levels (like a car, which has 4 windows)

Regardless, I think the technique is interesting enough to look into.  Personally when I read the paper I had major concerns about what shadows would look like on the transparents (especially with a lower-res L-Buffer), which is what led me to make a prototype with XNA so that I could evaluate some of the more pathological cases that could pop up.  If you’re also interested, I’ve uploaded the binary here, and the source here.  If you want to run the binary you’ll need the XNA 3.1 Redistributable, located here.

One thing you’ll notice about my implementation is that I didn’t factor in normals at all in the DSF filter, and instead I stored depth in a 16-bit component and instance ID in the other 16 bits.  This would give you much more than the 256 instances that the original implementation is limited to, at the expense of some artifacts around areas where the normal changes drastically on the same mesh.

Correcting XNA’s Gamma Correction

One thing I never used to pay attention to is gamma correction.  This is mainly because it rarely gets mentioned, and also because you can usually get pretty good results without ever even thinking about it.  However it only took a few days at my new job for me to realize just how essential it is if you want professional-quality results.

Lately I’ve been doing some research into inferred rendering (more on that later), and while working up a prototype renderer in XNA I decided that I would (for once) be gamma-correct throughout the pipeline.  So I went looking through the XNA Framework documentation for the framework’s equivalent of the D3DSAMP_SRGBTEXTURE sampler state (which automatically converts from sRGB to linear in the texture unit) and the D3DRS_SRGBWRITEENABLE render state (which automatically converts from linear to sRGB in the ROP)…and I didn’t find them.  The thought of these being left out struck me as odd, so I did a bit of searching on Google.  After refining my search terms I found this post by framework developer Shawn Hargreaves, confirming that those states were not exposed in the framework due to inconsistencies between Windows and Xbox.  After looking through some presentations again I concluded that he was talking about…

1.  The fact that the 360 uses a 4-segment piecewise linear approximation curve to perform conversion to and from sRGB, which gives quite different results compared to what you get with PC GPUs.

2.  The fact that blending behavior is different in DX9 and DX10-level GPUs, regardless of which API you use.  DX9 GPUs will perform framebuffer blending after conversion to sRGB (which is mathematically incorrect), while DX10 GPUs will do the blending in linear space and then convert the blended result to sRGB.  There is a cap to detect this behavior (D3DPMISCCAPS_POSTBLENDSRGBCONVERT) but it’s only available if you create an IDirect3D9Ex device.

So yeah, that’s annoying.  But like most limitations in the framework you can work around them if you’re determined enough, and fortunately this one is a piece of cake.  Well…on the PC, at least.  So let’s start with the first half, sampling sRGB textures.  Like I mentioned before there’s a nice convenient sampler state in D3D9 that will do the sRGB->linear automatically, but XNA’s SamplerState just doesn’t have it.  But fortunately that’s not the only way to set sampler states…we can also get the Effects framework to do it for us by defining a sampler_state in our effect files.  So I took a peek at the D3D9 Effect States documentation, and added the appropriate state declaration to my effect file.  And it worked!  For the lazy, all you have to do is this (the important line is the SRGBTexture one):

texture2D DiffuseMap;
sampler2D DiffuseSampler = sampler_state
{
   Texture = <DiffuseMap>;
   SRGBTexture = true;
};

Okay now for the other half, sRGB writes.  Once again D3D9 has a convenient render state that does all of the work for us, and the Effects framework can set render states for us if we include them in a pass declaration.  But unfortunately this time the Effect States documentation didn’t have anything for SRGBWRITEENABLE.  Too determined to give up, I followed the standard convention of effect states and chopped the “D3DRS_” prefix off the render state name.  And hey, it worked!

technique Transparent
{
    pass Pass1
    {
       VertexShader = compile vs_3_0 TransparentVS();
       PixelShader = compile ps_3_0 TransparentPS();

       SRGBWriteEnable = true;
    }
}

So we’ve solved our gamma problems…at least if you’re only targeting the PC and you’re using Effects.  If you’re not using Effects, then I don’t know of any way to toggle those states.  It’s probably possible with some sort of interop/reflection voodoo, but I don’t know enough about these things to recommend it.

There’s also the Xbox 360 problem, which is actually two problems in one.  The first problem is that the Xbox 360 doesn’t use sampler and render states to control sRGB read and writes.  It instead uses the D3D10 convention of having special surface formats for textures and render targets that control whether conversion takes place.  I don’t have access to my Xbox 360 at the moment so I can’t verify for sure, but I strongly suspect that the effect states won’t work.  And even if they did work you’d still have the second problem, which is that the Xbox uses that piecewise approximation curve  (this presentation by Valve shows some of the nastiness that can occur with it).

Fortunately we can bypass those problems by doing the conversion ourselves in the shader.  The good news is that the code is a piece of cake…the bad news is that it’s not super cheap since it involves raising your RGB color value to a non-integral power. Here’s the code:

// Converts from linear RGB space to sRGB.
float3 LinearToSRGB(in float3 color)
{
    return pow(color, 1/2.2f);
}
// Converts from sRGB space to linear RGB.
float3 SRGBToLinear(in float3 color)
{
    return pow(color, 2.2f);
}

Unfortunately with these you also have the problem that filtering and blending will be performed in sRGB space, and there’s not much you can do about that (aside from doing the filtering and blending yourself, but that would be way too expensive).

If you want to make these conversions a little cheaper, you can use a trick that my coworker showed me: round down the 2.2 to 2.0.  This gives you a simple square operation for conversion to linear (you can just multiply the value by itself), and a sqrt operation for conversion to sRGB.
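As a rough sketch, the gamma-2.0 versions could look like the following.  The function names are my own, chosen to mirror the two functions above.  Note that the square is done per component with a multiply: a dot() on a float3 would collapse the color to a single scalar.

```hlsl
// Approximate linear RGB -> sRGB conversion, assuming a gamma of 2.0
float3 LinearToSRGBApprox(in float3 color)
{
    return sqrt(color);
}

// Approximate sRGB -> linear RGB conversion, assuming a gamma of 2.0
float3 SRGBToLinearApprox(in float3 color)
{
    return color * color;
}
```

The two functions are still exact inverses of each other, so a texture that goes through both comes out unchanged.  What you give up is a bit of accuracy relative to the real curve, which mostly shows up as slightly different contrast in the darker tones.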

More Post-Processing Tricks: Lens Flare

I was playing Killzone 2 the other day, which reminded me of the lens flare trick they used.  Unlike most games, which use some sprites controlled by an occlusion query, they applied the effect as a post-process similar to bloom.  The upside is that it works for all bright areas and not pre-defined areas (the sun), and you don’t have to do occlusion queries or anything like that since that’s handled automatically.  Plus it’s really easy to fit it into a post-processing chain, since you can use your bloom results as the input.  The downside is that it’s pretty far from realistic…I’m not sure that most would like the end result.  This screen here shows the effect pretty clearly (it’s the orange and purple blobby areas by the left bad guy’s head, on the opposite side of the screen from the bright light source).

I haven’t seen anyone duplicate or even discuss the technique since before the game came out, so I figured I’d take a crack at deciphering it myself.  After some experimenting I came up with the following basic approach:

1.  Render a bloom buffer using standard downscale + threshold + blur
2.  Flip the texture coordinates by doing float2(1, 1) - texCoord
3.  Blur both towards the center of the screen and away from it
4.  Combine additively with the bloom buffer

To fake a chromatic aberration, Killzone 2 uses a strong orange tint for areas closer to the center of the screen and a purple tint on the periphery.  Upon some further close analysis it started to look like they were doing it in two passes with a different tint and different texture coordinate scaling for each pass.  I decided to make my implementation the same,  so I could produce similar results.  This is the shader code I came up with:


const static float4 vPurple = float4(0.7f, 0.2f, 0.9f, 1.0f);
const static float4 vOrange = float4(0.7f, 0.4f, 0.2f, 1.0f);
const static float fThreshold = 0.1f;

float4 LensFlarePS (    in float2 in_vTexCoord    : TEXCOORD0,
                        uniform int NumSamples,
                        uniform float4 vTint,
                        uniform float fTexScale,
                        uniform float fBlurScale)    : COLOR0
{
    // g_vSourceDimensions (the size of the source texture) and PointSampler0
    // are assumed to be declared elsewhere in the effect file.

    // The flare should appear on the opposite side of the screen as the
    // source of the light, so first we mirror the texture coordinate.
    // Then we normalize so we can apply a scaling factor.
    float2 vMirrorCoord = float2(1.0f, 1.0f) - in_vTexCoord;
    float2 vNormalizedCoord = vMirrorCoord * 2.0f - 1.0f;
    vNormalizedCoord *= fTexScale;

    // We'll blur towards the center of screen, and also away from it.
    float2 vTowardCenter = normalize(-vNormalizedCoord);
    float fBlurDist = fBlurScale * NumSamples;
    float2 vStartPoint = vNormalizedCoord + ((vTowardCenter / g_vSourceDimensions) * fBlurDist);
    float2 vStep = -(vTowardCenter / g_vSourceDimensions) * 2 * fBlurDist;

    // Do the blur and sum the samples
    float4 vSum = 0;
    float2 vSamplePos = vStartPoint;
    for (int i = 0; i < NumSamples; i++)
    {
        float2 vSampleTexCoord = vSamplePos * 0.5f + 0.5f;

        // Don't add in samples past texture border
        if (vSampleTexCoord.x >= 0 && vSampleTexCoord.x <= 1.0f
            && vSampleTexCoord.y >= 0 && vSampleTexCoord.y <= 1.0f)
        {
            float4 vSample = tex2D(PointSampler0, vSampleTexCoord);
            vSum += max(0, vSample - fThreshold) * vTint;
        }

        vSamplePos += vStep;
    }

    return vSum / NumSamples;
}

float4 CombinePS (in float2 in_vTexCoord    : TEXCOORD0) : COLOR0
{
    float4 vColor = tex2D(PointSampler0, in_vTexCoord);
    vColor += tex2D(PointSampler1, in_vTexCoord);
    vColor += tex2D(PointSampler2, in_vTexCoord);
    return vColor;
}

technique LensFlareFirstPass
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 LensFlarePS(12, vOrange, 2.00f, 0.15f);

        ZEnable = false;
        ZWriteEnable = false;
        AlphaBlendEnable = false;
        AlphaTestEnable = false;
        StencilEnable = false;
    }
}

technique LensFlareSecondPass
{
    pass p0
    {
        VertexShader = compile vs_3_0 PostProcessVS();
        PixelShader = compile ps_3_0 LensFlarePS(12, vPurple, 0.5f, 0.1f);

        ZEnable = false;
        ZWriteEnable = false;
        AlphaBlendEnable = false;
        AlphaTestEnable = false;
        StencilEnable = false;
    }
}

Obviously the code is severely unoptimized, but it's late and I'm tired.  Here's a screen of what it looks like (ignore the obnoxious brightness and bloom, please):


Two Samples For The Price Of One

Today I have two XNA samples fresh out of the oven: a Motion Blur Sample, and a Depth Of Field Sample.  I figure all of the kids these days wanna add fancy post-processing tricks to their games, right?  The motion blur sample shows you how to do camera motion blur using a depth buffer, or full object motion blur using a velocity buffer. The depth of field sample shows you how to do a standard blur-based DOF, a slightly-smarter blur-based DOF that doesn’t blur across edges, and the somewhat more physically accurate disc blur approach.

Get ‘em while they’re hot!

New Tutorial: Using PIX With XNA

Ladies and gentlemen, I present you with the most epic of tutorials: Using PIX With XNA.  This 37-page monster teaches PIX for the XNA programmer, and includes an in-depth explanation of the XNA/D3D9 relationship as well as 6 exercises that show you how to solve common problems (full source code and XNA 3.1 projects included).  I sure hope somebody finds this thing useful…it took me forever to write.

I originally intended to have this tutorial hosted on Ziggyware…in fact I finished this over a month ago and submitted it to Ziggy.  However as you may or may not know, Ziggy has become the unfortunate target of scumbag hackers who have repeatedly hijacked his site in order to deploy malware.  The whole thing absolutely sucks…I really wish that those assholes had decided to hijack a site that wasn’t the most comprehensive collection of community-created XNA resources.  I hope Ziggy figures out a way to shake them and get the site up and running again…but it looks doubtful.  Honestly I don’t think I’d want to keep dealing with the kinds of problems he’s gone through.

Scintillating Snippets: Storing Normals Using Spherical Coordinates

Update:  n00body posted this link in the comments, which is way more in-depth than my post.  Check it out!

If you’ve ever implemented a deferred renderer, you know that one of the important points is keeping your G-Buffer small enough to be reasonable in terms of bandwidth and your number of render targets.  Thanks to that constant struggle between good and evil, people have come up with some reasonably clever approaches towards packing necessary attributes in your G-Buffer.  One of the more popular approaches is that whole storing depth and reconstructing position thing, and another is packing normals so that you only need 2 components instead of 3.

One of the simpler and more common approaches is to only store the X and Y components of your view-space normals and then assume Z is positive (or negative, depending on whether you’re using right-handed or left-handed coordinates).  As far as I know, this was first proposed here by Guerrilla Games. However there’s a problem with this approach, which is that you can’t always assume the sign of your Z component when you’re using a perspective projection! This might seem weird at first (heck it took a while for someone to demonstrate to me why this is the case), but I assure you it’s true.  Insomniac has some good pictures here demonstrating the errors that occur.  So this means that if we want to use this technique and avoid errors, we have to pack the sign of Z somewhere in our two values. This is a little nasty, and takes away a bit of precision from one of your other values.
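For reference, the basic XY-only scheme described above could be sketched like this.  The function names and the fZSign parameter are my own inventions for illustration.  The decode simply trusts whatever sign you hand it, which is exactly the assumption that breaks down under a perspective projection.

```hlsl
// Stores only X and Y of a normalized view-space normal,
// remapped from [-1, 1] to [0, 1] for a fixed-point render target.
float2 EncodeNormalXY(float3 vNormalVS)
{
    return vNormalVS.xy * 0.5f + 0.5f;
}

// Rebuilds Z from the unit-length constraint x^2 + y^2 + z^2 = 1.
// fZSign (hypothetical) is +1 or -1 depending on your coordinate
// convention. It is an assumption, not something the encoding stores.
float3 DecodeNormalXY(float2 vEncoded, float fZSign)
{
    float3 vNormal;
    vNormal.xy = vEncoded * 2.0f - 1.0f;

    // saturate() guards against a tiny negative value caused by
    // quantization, which would otherwise produce a NaN in sqrt()
    vNormal.z = fZSign * sqrt(saturate(1.0f - dot(vNormal.xy, vNormal.xy)));
    return vNormal;
}
```

Any normal whose true Z has the opposite sign gets mirrored by the decode, which is where the artifacts come from.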

An alternative approach suggested to me a long time ago is to store the normal as a spherical coordinate.  Since a normal is always a unit vector with length = 1, you can (safely) assume that Rho = 1 and just store Theta and Phi.  Piece of cake!  All you have to do is implement the equations on the wiki page, take out the Rho’s, and you’ve got a two-component normal with excellent precision.

But wait, there’s more!  It turns out if you use some trig-fu, you can actually further optimize the conversions when Rho is equal to 1.  I was never actually good at simplifying equations with trig functions (I can do everything else, promise!) so I defer to the noble Pat Wilson who gave a quick rundown over in this thread.  Make sure you check out his set of screenshots that demonstrate the errors that occur from different normal storage options, so you can pick which method is right for you.

Also since this is Scintillating Snippets and it wouldn’t be much fun without a snippet, I’ll post the HLSL functions I use for encoding and decoding my normals.  Just remember, all of the credit goes to Mr. Wilson.  I just did the pilfering!

// Converts a normalized cartesian direction vector
// to spherical coordinates.
float2 CartesianToSpherical(float3 cartesian)
{
  float2 spherical;

  spherical.x = atan2(cartesian.y, cartesian.x) / 3.14159f;
  spherical.y = cartesian.z;

  return spherical * 0.5f + 0.5f;
}

// Converts a spherical coordinate to a normalized
// cartesian direction vector.
float3 SphericalToCartesian(float2 spherical)
{
  float2 sinCosTheta, sinCosPhi;

  spherical = spherical * 2.0f - 1.0f;
  sincos(spherical.x * 3.14159f, sinCosTheta.x, sinCosTheta.y);
  sinCosPhi = float2(sqrt(1.0 - spherical.y * spherical.y), spherical.y);

  return float3(sinCosTheta.y * sinCosPhi.x, sinCosTheta.x * sinCosPhi.x, sinCosPhi.y);    
}

Also keep in mind that these functions normalize the values to the range [0,1], so that you can store them in a regular fixed-point texture. If you’re using a floating point texture you can remove the division by PI if you wish (and the corresponding multiply by PI in the decode), as well as the “multiply by 0.5, add 0.5” in the encode and the matching “multiply by 2, subtract 1” in the decode.
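If you go the floating point route, the stripped-down versions might look something like this.  The “FP” function names are just mine, to keep them apart from the fixed-point versions above.

```hlsl
// Encode for a floating point render target: no remapping to [0,1].
// x holds the angle in radians (-PI to PI), y holds cos(phi), which is just Z.
float2 CartesianToSphericalFP(float3 cartesian)
{
    return float2(atan2(cartesian.y, cartesian.x), cartesian.z);
}

// Decode the values written by CartesianToSphericalFP.
float3 SphericalToCartesianFP(float2 spherical)
{
    float sinTheta, cosTheta;
    sincos(spherical.x, sinTheta, cosTheta);

    // sin(phi) recovered from cos(phi); saturate() avoids a NaN if
    // rounding pushes the stored Z slightly outside [-1, 1]
    float sinPhi = sqrt(saturate(1.0f - spherical.y * spherical.y));

    return float3(cosTheta * sinPhi, sinTheta * sinPhi, spherical.y);
}
```

This saves a few multiply-adds per pixel on each side, at the cost of needing a floating point format for the normal target.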

What’s good on the menu, waiter?

I remember reading someone say on gamedev.net that at some point everyone tries to write their own UI system, and usually gets it wrong.  Apparently he’s right (or at least about the first part), because I’ve gone ahead and written a menu/UI system.  While it initially started out as part of the engine/framework I’ve been working on for my game, as I worked on it I decided it might be better off if I decoupled it from the rest of the engine components and made it a standalone library/editor package so that other people could make use of it.

While designing and implementing I had these goals in mind:

  • Keep it simple!  Make menu elements useful by default, but don’t cram in tons of functionality with limited use.  Just let them be flexible enough so that they can be customized for unusual cases.
  • Cross-platform, with a focus on Xbox 360.  Should look identical on both, and expose the same functionality regardless of input method.
  • Page-based layout. A few of the other GUI packages out there seem to be aimed at recreating WinForms using XNA…and I think that’s silly.  You don’t want sizeable windows for a game (or at least not most games), you want menus that are logically divided up into pages that you can switch between.
  • A PC-only editor application that lets you visually design your menus.   The core library should be aware of the fact that it can run in a designer, and provide support for this.
  • Free and open-source!

What I ended up with is the CPX Menu System.  It actually came out better than I expected…the editor is very stable and works pretty nicely.  It could use some more fancy features (like tools for lining up menu items), but it definitely WORKS and I’m happy about that.  As for the menu item types included in the library itself…it’s pretty bare-bones but you can still do a lot with them.  I mean personally for my game I wouldn’t really need a whole lot more than what I put in the sample app.

Probably the biggest weakness it has is that working with content is a bit awkward.  Early on I struggled a lot with trying to come up with a good way to handle it…and I don’t feel like I ever really came up with a killer solution.  As of right now the way it works is that the editor app itself does not build any content at runtime.  This isn’t so nice, since you have to have Content compiled ahead of time before you run the app.  The upside is that the editor doesn’t depend on the content pipeline assemblies at all, so you can run it on a PC that doesn’t have the full XNA GS install.  Probably the easiest way to manage content is to just add all of your menu content to the CPXMenu project’s Content project.  If you do that, then you will always have the content available for the editor and your game (assuming you’re always building the editor in VS and running it that way).  Otherwise you can tell the editor to look for content in a specific path whenever it loads a project.  This is what I did for the sample app: it has its own Content project with some custom textures, so I set the editor to look in the output folder for that project.

I guess that’s it for now…at some point I suppose I’ll announce it on Ziggyware.  Maybe after I add some documentation explaining how to use the damned thing.  In the meantime, here’s some screenshots of the sample app and the editor:

Reconstructing Position From Depth, Continued

Picking up where I left off here

As I mentioned, you can also reconstruct a world-space position using the frustum ray technique.  The first step is that you need your frustum corners to be rotated so that they match the current orientation of your camera.  You can do this by transforming the frustum corners by a “camera world matrix”, which is a matrix representing the camera’s position and orientation in world-space.  If you don’t have this available you can just invert your view matrix.  Since we only need the rotation here, you can actually get away with transposing the upper-left 3×3 part of it (the rotation part of your view matrix should be orthogonal unless you’re doing something really really weird).  I’ll demonstrate doing it right in the vertex shader for the sake of simplicity, but you’d probably want to do it ahead of time in your application code.

// Vertex shader for rendering a full-screen quad
void QuadVS (   in float3 in_vPositionOS                : POSITION,
                in float3 in_vTexCoordAndCornerIndex    : TEXCOORD0,
                out float4 out_vPositionCS              : POSITION,
                out float2 out_vTexCoord                : TEXCOORD0,
                out float3 out_vFrustumCornerWS         : TEXCOORD1 )
{
    // Offset the position by half a pixel to correctly
    // align texels to pixels. Only necessary for D3D9 or XNA
    out_vPositionCS.x = in_vPositionOS.x - (1.0f/g_vOcclusionTextureSize.x);
    out_vPositionCS.y = in_vPositionOS.y + (1.0f/g_vOcclusionTextureSize.y);
    out_vPositionCS.z = in_vPositionOS.z;
    out_vPositionCS.w = 1.0f;

    // Pass along the texture coordinate and the position
    // of the frustum corner in world-space.  This frustum corner
    // position is interpolated so that the pixel shader always
    // has a ray from camera->far-clip plane
    out_vTexCoord = in_vTexCoordAndCornerIndex.xy;
    float3 vFrustumCornerVS = g_vFrustumCornersVS[in_vTexCoordAndCornerIndex.z];

    // The float3x3 cast makes it explicit that only the rotation
    // part of the camera world matrix is applied (no translation)
    out_vFrustumCornerWS = mul(vFrustumCornerVS, (float3x3)g_matCameraWorld);
}

So what we’ve done here is we’ve rotated (not translated, since vFrustumCornerVS is only a float3) the view-space frustum corner so that it now matches the camera’s orientation.  However it’s still centered around <0,0,0> and not the camera’s world-space position, so when we reconstruct position we’ll also add the camera’s world-space position:

// Pixel shader function for reconstructing world-space position
float3 WSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayWS)
{
	float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
	return g_vCameraPosWS + fPixelDepth * vFrustumRayWS;
}

And there it is. Easy peasy, lemon squeezy.

The other bit I hinted at was using this same technique with arbitrary geometry, for example the bounding volumes for a local light source.  For this we once again need a ray that points from the camera position through the pixel position to the far-clip plane.  We can do this in the pixel shader by using the view-space position of the pixel.

void VSBoundingVolume(  in float3 in_vPositionOS       : POSITION,
                        out float4 out_vPositionCS     : POSITION,
                        out float3 out_vPositionVS    : TEXCOORD0 )
{
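    // Extend the position to a float4 with w = 1 so that the
    // translation part of the matrices is applied
    out_vPositionCS = mul(float4(in_vPositionOS, 1.0f), g_matWorldViewProj);

    // Pass along the view-space vertex position to the pixel shader
    out_vPositionVS = mul(float4(in_vPositionOS, 1.0f), g_matWorldView).xyz;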
}

Then in our pixel shader, we calculate the ray and reconstruct position like this:

float3 VSPositionFromDepth(float2 vTexCoord, float3 vPositionVS)
{
    // Calculate the frustum ray using the view-space position.
    // g_fFarClip is the distance to the camera's far clipping plane.
    // Negating the Z component only necessary for right-handed coordinates
    float3 vFrustumRayVS = vPositionVS.xyz * (g_fFarClip/-vPositionVS.z);
    return tex2D(DepthSampler, vTexCoord).x * vFrustumRayVS;
}
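
One detail the function above leaves open is where vTexCoord comes from when you’re rendering arbitrary geometry instead of a full-screen quad.  A minimal sketch of one option, assuming ps_3_0 (so the VPOS semantic is available) and a depth texture that’s the same size as the render target, is below.  g_vDepthTextureSize, ScreenTexCoordFromVPos, and BoundingVolumePS are hypothetical names, not something from my sample code.

```hlsl
// Hypothetical: dimensions of the depth texture in texels. Assumed to
// match the dimensions of the current render target.
float2 g_vDepthTextureSize;

// Converts the pixel position supplied by VPOS into a texture coordinate.
// In D3D9 pixel centers sit on integer coordinates, so adding 0.5 lands
// on the center of the matching texel.
float2 ScreenTexCoordFromVPos(float2 vScreenPos)
{
    return (vScreenPos + 0.5f) / g_vDepthTextureSize;
}

// Example pixel shader for a light's bounding volume
float4 BoundingVolumePS(in float2 vScreenPos  : VPOS,
                        in float3 vPositionVS : TEXCOORD0) : COLOR0
{
    float2 vTexCoord = ScreenTexCoordFromVPos(vScreenPos);
    float3 vPixelPosVS = VSPositionFromDepth(vTexCoord, vPositionVS);

    // Lighting would go here. Outputting the position is just a placeholder.
    return float4(vPixelPosVS, 1.0f);
}
```

If you can’t use VPOS, the usual alternative is to pass the clip-space position down in another interpolator and do the divide by w yourself in the pixel shader.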

So there you go, I did your homework for you.  Now stop beating me up in the schoolyard!

EDIT: Fixed the code and explanation so that it actually works now!  Big thanks to Bill and Josh for pointing out the mistake.

Undo and Redo: Take 2

Please excuse the rhyming in the title…sometimes I just can’t help myself.  It’s a problem.

A few weeks ago I started working on a super-duper-secret project (to be revealed soon), a big part of which was a new editor.  Since I’m the kind of guy who gets all worked up about having proper undo and redo support, I took the opportunity to make it an up-front part of my design rather than just shoving it in afterwards.

One of the things I’d thought about for my map editor was having a well-defined boundary between the user’s input and actions that could be performed on the document.  For the map editor it was too late for that, but this time I could put it in from the start.  What I came up with was the ActionManager (yeah I know, bad name.  Sue me.).  It provides as public methods a variety of actions that can be performed on the document: adding a new item, removing an item, setting a property on an item, etc.  When one of these methods gets called it creates an IEditAction derivative, configures it, has the IEditAction “do” the action, and then pushes it onto the Undo stack.  So it’s similar to what I had previously in my map editor, except that the EditActions actually perform the action the first time around and all the Undo/Redo stuff is wrapped up in a nice class.  It’s also less error prone, because you go through the ActionManager layer rather than going directly to the document (this helps ensure that everything the user does goes through the proper Undo/Redo jazz).

I also managed to get it down to just three EditActions: AddRemoveItemAction, PropertyEditAction, and CompoundAction. The first is for adding and removing items to the document, the second is for whenever an item’s property is modified (this is the majority of actions), and the third just represents multiple AddRemoveItemActions and/or PropertyEditActions that are performed as the result of a single user action. It still doesn’t necessarily deal with the problem of having the number of EditActions explode as the app grows, but it helps that Reflection in .NET is awesome enough to let me use PropertyEditAction for just about everything.
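As a rough illustration of why Reflection keeps the action count down, a single property-editing action can work against any public property by name, and a compound action just replays a list of child actions. This is a sketch under assumptions, not the editor’s real classes: the IEditAction members, the constructor signatures, and the MapItem type are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical shape for IEditAction; the post names the interface but not its members.
public interface IEditAction
{
    void Do();
    void Undo();
}

// One action type covers every public property, looked up by name via reflection.
public sealed class PropertyEditAction : IEditAction
{
    private readonly object target;
    private readonly PropertyInfo property;
    private readonly object oldValue;
    private readonly object newValue;

    public PropertyEditAction(object target, string propertyName, object newValue)
    {
        this.target = target;
        property = target.GetType().GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("No public property named " + propertyName);

        // Capture the current value so Undo can restore it later.
        oldValue = property.GetValue(target, null);
        this.newValue = newValue;
    }

    public void Do() { property.SetValue(target, newValue, null); }
    public void Undo() { property.SetValue(target, oldValue, null); }
}

// Groups several actions so that one user gesture is one entry on the Undo stack.
public sealed class CompoundAction : IEditAction
{
    private readonly List<IEditAction> actions;

    public CompoundAction(IEnumerable<IEditAction> actions)
    {
        this.actions = new List<IEditAction>(actions);
    }

    public void Do()
    {
        foreach (IEditAction action in actions)
            action.Do();
    }

    // Undo in reverse order so later edits are unwound before earlier ones.
    public void Undo()
    {
        for (int i = actions.Count - 1; i >= 0; i--)
            actions[i].Undo();
    }
}

// Stand-in for a document item; purely illustrative.
public sealed class MapItem
{
    public string Name { get; set; }
    public int Index { get; set; }
}
```

For example, wrapping `new PropertyEditAction(item, "Name", "Crate")` and `new PropertyEditAction(item, "Index", 3)` in a CompoundAction changes both properties on Do and restores both on Undo.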

The one problem I still had to deal with was the stupid PropertyGrid. The PropertyGrid is fantastic, but it’s not really set up for Undo and Redo. Well that’s a lie, it sorta is. See, it raises a PropertyValueChanged event whenever a property value changes, and the EventArgs conveniently has an OldValue property that tells you what the previous value was. Great, right? Right…except for the fact that this is null when you have multiple objects selected on the PropertyGrid. Not so great.

This led me to approach #1: each time an item is supposed to be set onto the PropertyGrid, create a “proxy” item by cloning the original and set that onto the PropertyGrid. Then whenever a property value is changed, I can look up the “real” item, query it for the old property value, and then actually set the property value via the ActionManager. And this worked…at first. Where I ran into problems was where setting properties on an item affected the state of another item. For instance, items have an “Index” property that controls the index within a parent item’s children collection. So setting that property causes the item to send a request to the parent item for a reorder of the children, and that might fail based on the state of the parent. This means that if I leave my references hooked up properly in my clone I end up with a situation where actions like that get performed twice (and usually fail the second time), or if I “detach” a clone from all outside references I lose my error verification (not to mention the fact that I have to be very very careful in how I clone something).

This brought me to attempt #2, which I consider uglier but has actually worked out: every time the user selects a GridItem in the PropertyGrid, save the current state of the Property for all selected items so that I have an OldValue.

private GridItem GetRootReferenceGridItem(GridItem gridItem)
{
    GridItem rootItem = gridItem;
    if (!rootItem.PropertyDescriptor.ComponentType.IsValueType)
        return rootItem;

    while (gridItem.Parent.Parent != null)
    {
        gridItem = gridItem.Parent;
        if (gridItem.PropertyDescriptor != null
            && !gridItem.PropertyDescriptor.ComponentType.IsValueType)
        {
            rootItem = gridItem;
            break;
        }
    }

    return rootItem;
}

private void SetOldValues()
{
    GridItem gridItem = propertyGrid.SelectedGridItem;

    if (gridItem != null && gridItem.GridItemType == GridItemType.Property)
    {
        gridItem = GetRootReferenceGridItem(gridItem);

        oldValues = new object[selectedItems.Count];
        for (int i = 0; i < oldValues.Length; i++)
            oldValues[i] = gridItem.Value;
    }
    else
        oldValues = null;
}

void propertyGrid_SelectedGridItemChanged(object sender, SelectedGridItemChangedEventArgs e)
{
    if (e.NewSelection.PropertyDescriptor == null
        || e.OldSelection == null
        || e.OldSelection.PropertyDescriptor == null
        || e.NewSelection.PropertyDescriptor.Name != e.OldSelection.PropertyDescriptor.Name)
        SetOldValues();
}

void propertyGrid_PropertyValueChanged(object s, PropertyValueChangedEventArgs e)
{
    object[] items = new object[selectedItems.Count];

    // Trace backwards through the chain of properties until we find
    // the first property
    List<string> propertyChain = new List<string>();
    GridItem gridItem = GetRootReferenceGridItem(e.ChangedItem);
    string propertyName = gridItem.PropertyDescriptor.Name;

    while (gridItem.Parent.Parent != null)
    {
        gridItem = gridItem.Parent;
        if (gridItem.PropertyDescriptor != null)
            propertyChain.Add(gridItem.PropertyDescriptor.Name);
    }

    // Now walk the chain and find the owner of the property or field that was modified
    for (int i = 0; i < selectedItems.Count; i++)
    {
        items[i] = selectedItems[i];
        for (int j = propertyChain.Count - 1; j >= 0; j--)
            items[i] = ActionManager.GetPropertyOrFieldValue(items[i], propertyChain[j]);
    }

    actionManager.PropertyValueChanged(items, propertyName, oldValues);

    SetOldValues();
}

The main problem with this is that the PropertyGrid is now editing a “live” object: changes it makes to items actually affect their state. This unfortunately broke my “everything must go through the ActionManager” philosophy, but I couldn’t think of any better alternatives. So I added a new method to the ActionManager that allows me to “register” that a property value was changed after the fact. It basically works the same as the old ChangePropertyValue method, except that it doesn’t call “Do” on the PropertyEditAction after it creates it. So yeah, kinda ugly…but it works. Good enough, I guess.
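The difference between the two paths can be sketched roughly like this. Everything here is an assumption for illustration: the single-item RegisterPropertyChange method is a simplified, hypothetical stand-in for the array-based PropertyValueChanged call above, and the IEditAction members, PropertyEditAction constructor and MapItem type are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical shape for IEditAction; the post names the interface but not its members.
public interface IEditAction
{
    void Do();
    void Undo();
}

public sealed class PropertyEditAction : IEditAction
{
    private readonly object target;
    private readonly PropertyInfo property;
    private readonly object oldValue;
    private readonly object newValue;

    public PropertyEditAction(object target, string propertyName, object oldValue, object newValue)
    {
        this.target = target;
        property = target.GetType().GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("No public property named " + propertyName);
        this.oldValue = oldValue;
        this.newValue = newValue;
    }

    public void Do() { property.SetValue(target, newValue, null); }
    public void Undo() { property.SetValue(target, oldValue, null); }
}

public sealed class ActionManager
{
    private readonly Stack<IEditAction> undoStack = new Stack<IEditAction>();
    private readonly Stack<IEditAction> redoStack = new Stack<IEditAction>();

    private static object ReadProperty(object item, string propertyName)
    {
        PropertyInfo property = item.GetType().GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("No public property named " + propertyName);
        return property.GetValue(item, null);
    }

    // Normal path: the ActionManager applies the change itself.
    public void ChangePropertyValue(object item, string propertyName, object newValue)
    {
        object oldValue = ReadProperty(item, propertyName);
        IEditAction action = new PropertyEditAction(item, propertyName, oldValue, newValue);
        action.Do();
        Push(action);
    }

    // After-the-fact path: the PropertyGrid already wrote the new value to the live
    // object, so read it back and record the action without calling Do.
    public void RegisterPropertyChange(object item, string propertyName, object oldValue)
    {
        object newValue = ReadProperty(item, propertyName);
        Push(new PropertyEditAction(item, propertyName, oldValue, newValue));
    }

    private void Push(IEditAction action)
    {
        undoStack.Push(action);
        redoStack.Clear(); // assumption: a new action invalidates the redo history
    }

    public void Undo()
    {
        if (undoStack.Count == 0) return;
        IEditAction action = undoStack.Pop();
        action.Undo();
        redoStack.Push(action);
    }

    public void Redo()
    {
        if (redoStack.Count == 0) return;
        IEditAction action = redoStack.Pop();
        action.Do();
        undoStack.Push(action);
    }
}

// Stand-in for a document item; purely illustrative.
public sealed class MapItem
{
    public string Name { get; set; }
}
```

With several items selected, one way to keep the edit as a single Undo step would be to wrap the per-item actions in a CompoundAction before pushing.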