New Tutorial: Using PIX With XNA

Ladies and gentlemen, I present you with the most epic of tutorials: Using PIX With XNA.  This 37-page monster teaches PIX for the XNA programmer, and includes an in-depth explanation of the XNA/D3D9 relationship as well as 6 exercises that show you how to solve common problems (full source code and XNA 3.1 projects included).  I sure hope somebody finds it useful…it took me forever to write this thing.

I originally intended to have this tutorial hosted on Ziggyware…in fact I finished this over a month ago and submitted it to Ziggy.  However as you may or may not know, Ziggy has become the unfortunate target of scumbag hackers who have repeatedly hijacked his site in order to deploy malware.  The whole thing absolutely sucks…I really wish that those assholes had decided to hijack a site that wasn’t the most comprehensive collection of community-created XNA resources.  I hope Ziggy figures out a way to shake them and get the site up and running again…but it looks doubtful.  Honestly I don’t think I’d want to keep dealing with the kinds of problems he’s gone through.

Scintillating Snippets: Storing Normals Using Spherical Coordinates

Update:  n00body posted this link in the comments, which is way more in-depth than my post.  Check it out!

If you’ve ever implemented a deferred renderer, you know that one of the important points is keeping your G-Buffer small enough to be reasonable in terms of bandwidth and number of render targets.  Thanks to that constant struggle between good and evil, people have come up with some reasonably clever approaches to packing the necessary attributes into your G-Buffer.  One of the more popular approaches is that whole storing depth and reconstructing position thing, and another is packing normals so that you only need 2 components instead of 3.

One of the simpler and more common approaches is to only store the X and Y components of your view-space normals and then assume Z is positive (or negative, depending on whether you’re using right-handed or left-handed coordinates).  As far as I know, this was first proposed here by Guerrilla Games.  However there’s a problem with this approach, which is that you can’t always assume the sign of your Z component when you’re using a perspective projection!  This might seem weird at first (heck it took a while for someone to demonstrate to me why this is the case), but I assure you it’s true.  Insomniac has some good pictures here demonstrating the errors that occur.  So this means that if we want to use this technique and avoid errors, we have to pack the sign of Z somewhere in our two values.  This is a little nasty, and takes away a bit of precision from one of your other values.
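If you want to convince yourself of the sign problem without firing up a renderer, here’s a minimal Python sketch (not from the original discussion; the point and normal are made-up values).  It assumes a right-handed view space with the camera at the origin looking down -Z, where you’d normally expect visible normals to have a positive Z.  A surface off to the side of the view can face the eye and still have a normal with negative Z:

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def is_front_facing(position_vs, normal_vs):
    # The eye sits at the view-space origin, so the direction from the
    # surface point back to the eye is simply -position.
    to_eye = tuple(-c for c in position_vs)
    return dot(normal_vs, to_eye) > 0.0

# Hypothetical surface point off to the right side of the view.
position_vs = (5.0, 0.0, -5.0)
# Hypothetical normal that leans away from the camera along Z.
normal_vs = normalize((-1.0, 0.0, -0.5))

visible = is_front_facing(position_vs, normal_vs)
print("front-facing:", visible, "normal z:", normal_vs[2])
```

The surface is front-facing, yet reconstructing Z as a positive value from X and Y alone would flip the normal.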

An alternative approach suggested to me a long time ago is to store the normal as a spherical coordinate.  Since a normal is always a unit vector with length = 1, you can (safely) assume that Rho = 1 and just store Theta and Phi.  Piece of cake!  All you have to do is implement the equations on the wiki page, take out the Rho’s, and you’ve got a two-component normal with excellent precision.

But wait, there’s more!  It turns out if you use some trig-fu, you can actually further optimize the conversions when Rho is equal to 1.  I was never actually good at simplifying equations with trig functions (I can do everything else, promise!) so I defer to the noble Pat Wilson who gave a quick rundown over in this thread.  Make sure you check out his set of screenshots that demonstrate the errors that occur from different normal storage options, so you can pick which method is right for you.

Also since this is Scintillating Snippets and it wouldn’t be much fun without a snippet, I’ll post the HLSL functions I use for encoding and decoding my normals.  Just remember, all of the credit goes to Mr. Wilson.  I just did the pilfering!

// Converts a normalized cartesian direction vector
// to spherical coordinates.
float2 CartesianToSpherical(float3 cartesian)
{
  float2 spherical;

  spherical.x = atan2(cartesian.y, cartesian.x) / 3.14159f;
  spherical.y = cartesian.z;

  return spherical * 0.5f + 0.5f;
}

// Converts a spherical coordinate to a normalized
// cartesian direction vector.
float3 SphericalToCartesian(float2 spherical)
{
  float2 sinCosTheta, sinCosPhi;

  spherical = spherical * 2.0f - 1.0f;
  sincos(spherical.x * 3.14159f, sinCosTheta.x, sinCosTheta.y);
  sinCosPhi = float2(sqrt(1.0 - spherical.y * spherical.y), spherical.y);

  return float3(sinCosTheta.y * sinCosPhi.x, sinCosTheta.x * sinCosPhi.x, sinCosPhi.y);    
}

Also keep in mind that these functions normalize the values to the range [0,1], so that you can store them in a regular fixed-point texture.  If you’re using a floating point texture you can remove the division by PI if you wish (and the corresponding multiply by PI in the decode), as well as the “multiply by 0.5, add 0.5” in the encode (and the matching “multiply by 2, subtract 1” in the decode).
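For anyone who wants to sanity-check the encode/decode pair outside of a shader, here’s a minimal Python sketch that mirrors the two HLSL functions above line for line.  The function names and test normals are just for illustration, and it uses full double precision rather than whatever your render target gives you:

```python
import math

def cartesian_to_spherical(n):
    # Same math as the HLSL encode: theta / pi and z (which is cos(phi)),
    # both remapped from [-1, 1] to [0, 1].
    x, y, z = n
    theta = math.atan2(y, x) / math.pi
    return (theta * 0.5 + 0.5, z * 0.5 + 0.5)

def spherical_to_cartesian(s):
    # Undo the [0, 1] remap, then rebuild sin(phi) from cos(phi).
    theta = (s[0] * 2.0 - 1.0) * math.pi
    cos_phi = s[1] * 2.0 - 1.0
    sin_phi = math.sqrt(max(0.0, 1.0 - cos_phi * cos_phi))
    return (math.cos(theta) * sin_phi, math.sin(theta) * sin_phi, cos_phi)

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def round_trip_error(n):
    decoded = spherical_to_cartesian(cartesian_to_spherical(n))
    return max(abs(a - b) for a, b in zip(n, decoded))

test_normals = [
    (0.0, 0.0, 1.0),
    (1.0, 0.0, 0.0),
    (-1.0, 0.0, 0.0),
    normalize((1.0, 2.0, -3.0)),
    normalize((-0.3, -0.8, 0.1)),
]
worst_error = max(round_trip_error(n) for n in test_normals)
print("worst round-trip error:", worst_error)
```

Note that both signs of Z survive the round trip, which is the whole point compared to the “store X and Y” approach.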

What’s good on the menu, waiter?

I remember reading someone say on gamedev.net that at some point everyone tries to write their own UI system, and usually gets it wrong.  Apparently he’s right (or at least about the first part), because I’ve gone ahead and written a menu/UI system.  While it initially started out as part of the engine/framework I’ve been working on for my game, as I worked on it I decided it might be better off if I decoupled it from the rest of the engine components and made it a standalone library/editor package so that other people could make use of it.

While designing and implementing I had these goals in mind:

  • Keep it simple!  Make menu elements useful by default, but don’t cram in tons of functionality with limited use.  Just let them be flexible enough so that they can be customized for unusual cases.
  • Cross-platform, with a focus on Xbox 360.  Should look identical on both, and expose the same functionality regardless of input method.
  • Page-based layout. A few of the other GUI packages out there seem to be aimed at recreating WinForms using XNA…and I think that’s silly.  You don’t want sizeable windows for a game (or at least not most games), you want menus that are logically divided up into pages that you can switch between.
  • A PC-only editor application that lets you visually design your menus.   The core library should be aware of the fact that it can run in a designer, and provide support for this.
  • Free and open-source!

What I ended up with is the CPX Menu System.  It actually came out better than I expected…the editor is very stable and works pretty nicely.  It could use some more fancy features (like tools for lining up menu items), but it definitely WORKS and I’m happy about that.  As for the menu item types included in the library itself…it’s pretty bare-bones but you can still do a lot with them.  I mean personally for my game I wouldn’t really need a whole lot more than what I put in the sample app.

Probably its biggest weakness is that working with content is a bit awkward.  Early on I struggled a lot with trying to come up with a good way to handle it…and I don’t feel like I ever really came up with a killer solution.  As of right now the way it works is that the editor app itself does not build any content at runtime.  This isn’t so nice, since you have to have Content compiled ahead of time before you run the app.  The upside is that the editor doesn’t depend on the content pipeline assemblies at all, so you can run it on a PC that doesn’t have the full XNA GS install.  Probably the easiest way to manage content is to just add all of your menu content to the CPXMenu project’s Content project.  If you do that, then you will always have the content available for the editor and your game (assuming you’re always building the editor in VS and running it that way).  Otherwise you can tell the editor to look for content in a specific path whenever it loads a project.  This is what I did for the sample app: it has its own Content project with some custom textures, so I set the editor to look in the output folder for that project.

I guess that’s it for now…at some point I suppose I’ll announce it on Ziggyware.  Maybe after I add some documentation explaining how to use the damned thing.  In the meantime, here’s some screenshots of the sample app and the editor:

Reconstructing Position From Depth, Continued

Picking up where I left off here

As I mentioned, you can also reconstruct a world-space position using the frustum ray technique.  The first step is that you need your frustum corners to be rotated so that they match the current orientation of your camera.  You can do this by transforming the frustum corners by a “camera world matrix”, which is a matrix representing the camera’s position and orientation in world-space.  If you don’t have this available you can just invert your view matrix.  Since we only need the rotation here, you can get that by transposing the view matrix’s upper-left 3x3 (the rotation part of your view matrix should be orthogonal unless you’re doing something really really weird).  I’ll demonstrate doing it right in the vertex shader for the sake of simplicity, but you’d probably want to do it ahead of time in your application code.

// Vertex shader for rendering a full-screen quad
void QuadVS (	in float3 in_vPositionOS		: POSITION,
		in float3 in_vTexCoordAndCornerIndex	: TEXCOORD0,
		out float4 out_vPositionCS		: POSITION,
		out float2 out_vTexCoord		: TEXCOORD0,
		out float3 out_vFrustumCornerWS		: TEXCOORD1	)
{
	// Offset the position by half a pixel to correctly
	// align texels to pixels. Only necessary for D3D9 or XNA
	out_vPositionCS.x = in_vPositionOS.x - (1.0f/g_vOcclusionTextureSize.x);
	out_vPositionCS.y = in_vPositionOS.y + (1.0f/g_vOcclusionTextureSize.y);
	out_vPositionCS.z = in_vPositionOS.z;
	out_vPositionCS.w = 1.0f;

	// Pass along the texture coordinate and the position
	// of the frustum corner in world-space.  This frustum corner
	// position is interpolated so that the pixel shader always
	// has a ray from camera->far-clip plane
	out_vTexCoord = in_vTexCoordAndCornerIndex.xy;
	float3 vFrustumCornerVS = g_vFrustumCornersVS[in_vTexCoordAndCornerIndex.z];
	// Cast to float3x3 so that only the rotation is applied
	out_vFrustumCornerWS = mul(vFrustumCornerVS, (float3x3)g_matCameraWorld);
}

So what we’ve done here is we’ve rotated (not translated, since vFrustumCornerVS is only a float3) the view-space frustum corner so that it now matches the camera’s orientation.  However it’s still centered around <0,0,0> and not the camera’s world-space position, so when we reconstruct position we’ll also add the camera’s world-space position:

// Pixel shader function for reconstructing world-space position
float3 WSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayWS)
{
	float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
	return g_vCameraPosWS + fPixelDepth * vFrustumRayWS;
}

And there it is. Easy peasy, lemon squeezy.
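If you’d like to check the world-space math on the CPU, here’s a minimal Python sketch.  It assumes a right-handed view space (camera looking down -Z), normalized linear depth stored as -z / far, and a made-up camera that is only rotated around the Y axis; none of these values come from a real scene:

```python
import math

def rotate_y(v, angle):
    # Rotation part of a hypothetical camera world matrix (yaw only).
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)

def ws_position_from_depth(camera_pos, depth, frustum_ray_ws):
    # Same as the pixel shader: camera position + depth * ray.
    return tuple(p + depth * r for p, r in zip(camera_pos, frustum_ray_ws))

far_clip = 100.0
camera_pos = (10.0, 2.0, -3.0)
yaw = math.radians(30.0)

# A point we pretend was rendered, in view space, and its true world position.
point_vs = (1.5, -0.5, -20.0)
expected_ws = tuple(r + p for r, p in zip(rotate_y(point_vs, yaw), camera_pos))

# What the depth pass would store, and the ray the quad would interpolate:
# the view-space ray through this pixel, stretched to the far-clip plane.
depth = -point_vs[2] / far_clip
ray_vs = tuple(c * (far_clip / -point_vs[2]) for c in point_vs)
ray_ws = rotate_y(ray_vs, yaw)  # rotated only, never translated

result_ws = ws_position_from_depth(camera_pos, depth, ray_ws)
print(result_ws, expected_ws)
```

The translation only ever shows up once, as the camera position added at the end, which is why the ray itself must not be translated.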

The other bit I hinted at was using this same technique with arbitrary geometry, for example the bounding volumes for a local light source.  For this we once again need a ray that points from the camera position through the pixel position to the far-clip plane.  We can do this in the pixel shader by using the view-space position of the pixel.

void VSBoundingVolume(  in float4 in_vPositionOS       : POSITION,
                        out float4 out_vPositionCS     : POSITION,
                        out float3 out_vPositionVS     : TEXCOORD0 )
{
    // The position is a float4 (w = 1) so that the translation
    // in the matrices is applied
    out_vPositionCS = mul(in_vPositionOS, g_matWorldViewProj);

    // Pass along the view-space vertex position to the pixel shader
    out_vPositionVS = mul(in_vPositionOS, g_matWorldView).xyz;
}

Then in our pixel shader, we calculate the ray and reconstruct position like this:

float3 VSPositionFromDepth(float2 vTexCoord, float3 vPositionVS)
{
    // Calculate the frustum ray using the view-space position.
    // g_fFarClip is the distance to the camera's far clipping plane.
    // Negating the Z component is only necessary for right-handed coordinates
    float3 vFrustumRayVS = vPositionVS.xyz * (g_fFarClip/-vPositionVS.z);
    return tex2D(DepthSampler, vTexCoord).x * vFrustumRayVS;
}

So there you go, I did your homework for you.  Now stop beating me up in the schoolyard!
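Here’s the same idea as a minimal Python sketch, in case you want to verify it with numbers.  It assumes right-handed view space and depth stored as -z / far; the scene point and the spot on the bounding volume are made up, with the volume surface placed on the same eye ray as the scene point (which is what “covering the same pixel” means):

```python
def frustum_ray_from_position(position_vs, far_clip):
    # Stretch the eye->surface vector until it hits the far-clip plane
    # (z = -far_clip in right-handed view space).
    scale = far_clip / -position_vs[2]
    return tuple(c * scale for c in position_vs)

def vs_position_from_depth(depth, position_vs, far_clip):
    ray_vs = frustum_ray_from_position(position_vs, far_clip)
    return tuple(depth * c for c in ray_vs)

far_clip = 200.0

# Scene geometry that was written to the depth buffer.
scene_point_vs = (4.0, -1.0, -50.0)
stored_depth = -scene_point_vs[2] / far_clip

# Pixel on the light's bounding volume, closer to the camera but
# on the same ray from the eye.
volume_point_vs = tuple(c * 0.4 for c in scene_point_vs)

reconstructed = vs_position_from_depth(stored_depth, volume_point_vs, far_clip)
print(reconstructed)
```

It doesn’t matter how far along the ray the bounding volume sits, since the scale factor always normalizes the ray to the far plane.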

EDIT: Fixed the code and explanation so that it actually works now!  Big thanks to Bill and Josh for pointing out the mistake.

Undo and Redo: Take 2

Please excuse the rhyming in the title…sometimes I just can’t help myself.  It’s a problem.

A few weeks ago I started working on a super-duper-secret project (to be revealed soon), a big part of which was a new editor.  Since I’m the kind of guy who gets all worked up about having proper undo and redo support, I took the opportunity to make it an up-front part of my design rather than just shoving it in afterwards.

One of the things I’d thought about for my map editor was having a well-defined boundary between the user’s input and actions that could be performed on the document.  For the map editor it was too late for that, but this time I could put it in from the start.  What I came up with was the ActionManager (yeah I know, bad name.  Sue me.).  It provides as public methods a variety of actions that can be performed on the document: adding a new item, removing an item, setting a property on an item, etc.  When one of these methods gets called it creates an IEditAction derivative, configures it, has the IEditAction “do” the action, and then pushes it onto the Undo stack.  So it’s similar to what I had previously in my map editor, except that the EditActions actually perform the action the first time around and all the Undo/Redo stuff is wrapped up in a nice class.  It’s also less error prone, because you go through the ActionManager layer rather than going directly to the document (this helps ensure that everything the user does goes through the proper Undo/Redo jazz).

I also managed to get it down to just three EditActions: AddRemoveItemAction, PropertyEditAction, and CompoundAction.  The first is for adding and removing items to the document, the second is for whenever an item’s property is modified (this is the majority of actions), and the third just represents multiple AddRemoveItemActions and/or PropertyEditActions that are performed as the result of a single user action.  It still doesn’t necessarily deal with the problem of having the number of EditActions explode as the app grows, but it helps that Reflection in .NET is awesome enough to let me use PropertyEditAction for just about everything.
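To make the shape of this concrete, here’s a minimal Python sketch of the pattern.  The real editor is C# and its code isn’t shown in this post, so everything beyond the three action names and the ActionManager idea is an assumption for illustration: the document is a plain list, `getattr`/`setattr` stand in for .NET Reflection, and the method names are hypothetical.

```python
class PropertyEditAction:
    """Sets one attribute on one item and remembers the previous value."""
    def __init__(self, item, name, new_value):
        self.item, self.name = item, name
        self.old_value, self.new_value = getattr(item, name), new_value

    def do(self):
        setattr(self.item, self.name, self.new_value)

    def undo(self):
        setattr(self.item, self.name, self.old_value)


class AddRemoveItemAction:
    """Adds an item to (or removes it from) the document list."""
    def __init__(self, document, item, adding):
        self.document, self.item, self.adding = document, item, adding
        self.index = len(document) if adding else document.index(item)

    def _apply(self, add):
        if add:
            self.document.insert(self.index, self.item)
        else:
            self.document.remove(self.item)

    def do(self):
        self._apply(self.adding)

    def undo(self):
        self._apply(not self.adding)


class CompoundAction:
    """Several actions that count as a single user action."""
    def __init__(self, actions):
        self.actions = list(actions)

    def do(self):
        for action in self.actions:
            action.do()

    def undo(self):
        for action in reversed(self.actions):
            action.undo()


class ActionManager:
    def __init__(self, document):
        self.document = document
        self.undo_stack, self.redo_stack = [], []

    def _perform(self, action):
        # The action does the work the first time around, then gets
        # pushed so it can be undone later.
        action.do()
        self.undo_stack.append(action)
        self.redo_stack.clear()

    def add_item(self, item):
        self._perform(AddRemoveItemAction(self.document, item, True))

    def remove_item(self, item):
        self._perform(AddRemoveItemAction(self.document, item, False))

    def set_property(self, items, name, value):
        edits = [PropertyEditAction(item, name, value) for item in items]
        self._perform(CompoundAction(edits))

    def undo(self):
        if self.undo_stack:
            action = self.undo_stack.pop()
            action.undo()
            self.redo_stack.append(action)

    def redo(self):
        if self.redo_stack:
            action = self.redo_stack.pop()
            action.do()
            self.undo_stack.append(action)


class Item:
    """Hypothetical document item, only here so the sketch can run."""
    def __init__(self, name, width):
        self.name, self.width = name, width
```

The UI code only ever calls the ActionManager methods, never the document directly, so nothing can sneak past the undo stack.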

The one problem I still had to deal with was the stupid PropertyGrid.  The PropertyGrid is fantastic, but it’s not really set up for Undo and Redo.  Well that’s a lie, it sorta is.  See it raises a PropertyValueChanged event whenever a property value changes, and the EventArgs conveniently has an OldValue property that tells you what the previous value was.  Great, right?  Right…except for the fact that this is null when you have multiple objects selected on the PropertyGrid.  Not so great.

This led me to approach #1:  each time an item is supposed to be set onto the PropertyGrid, create a “proxy” item by cloning the original and set that onto the PropertyGrid.  Then whenever a property value is changed, I can look up the “real” item, query it for the old property value, and then actually set the property value via the ActionManager.  And this worked…at first.  Where I ran into problems was where setting properties on an item affected the state of another item.  For instance items have an “Index” property that controls the index within a parent item’s children collection.  So setting that property causes the item to send a request to the parent item for a reorder of the children, and that might fail based on the state of the parent.  This means that if I leave my references hooked up properly in my clone I end up with a situation where actions like that get performed twice (and usually fail the second time), or if I “detach” a clone from all outside references I lose my error verification (not to mention the fact that I have to be very very careful in how I clone something).

This brought me to attempt #2, which I consider uglier but has actually worked out: every time the user selects a GridItem in the PropertyGrid, save the current state of the Property for all selected items so that I have an OldValue.

private GridItem GetRootReferenceGridItem(GridItem gridItem)
{
    GridItem rootItem = gridItem;
    if (!rootItem.PropertyDescriptor.ComponentType.IsValueType)
        return rootItem;

    while (gridItem.Parent.Parent != null)
    {
        gridItem = gridItem.Parent;
        if (gridItem.PropertyDescriptor != null
            && !gridItem.PropertyDescriptor.ComponentType.IsValueType)
        {
            rootItem = gridItem;
            break;
        }
    }

    return rootItem;
}

private void SetOldValues()
{
    GridItem gridItem = propertyGrid.SelectedGridItem;

    if (gridItem != null && gridItem.GridItemType == GridItemType.Property)
    {
        gridItem = GetRootReferenceGridItem(gridItem);

        oldValues = new object[selectedItems.Count];
        for (int i = 0; i < oldValues.Length; i++)
            oldValues[i] = gridItem.Value;
    }
    else
        oldValues = null;
}

void propertyGrid_SelectedGridItemChanged(object sender, SelectedGridItemChangedEventArgs e)
{
    if (e.NewSelection.PropertyDescriptor == null
        || e.OldSelection == null
        || e.OldSelection.PropertyDescriptor == null
        || e.NewSelection.PropertyDescriptor.Name != e.OldSelection.PropertyDescriptor.Name)
        SetOldValues();
}

void propertyGrid_PropertyValueChanged(object s, PropertyValueChangedEventArgs e)
{
    object[] items = new object[selectedItems.Count];

    // Trace backwards through the chain of properties until we find
    // the first property
    List<string> propertyChain = new List<string>();
    GridItem gridItem = GetRootReferenceGridItem(e.ChangedItem);
    string propertyName = gridItem.PropertyDescriptor.Name;

    while (gridItem.Parent.Parent != null)
    {
        gridItem = gridItem.Parent;
        if (gridItem.PropertyDescriptor != null)
            propertyChain.Add(gridItem.PropertyDescriptor.Name);
    }

    // Now walk the chain and find the owner of the property or field that was modified
    for (int i = 0; i < selectedItems.Count; i++)
    {
        items[i] = selectedItems[i];
        object nextItem = items[i];
        for (int j = propertyChain.Count - 1; j >= 0; j--)
            items[i] = ActionManager.GetPropertyOrFieldValue(items[i], propertyChain[j]);
    }

    actionManager.PropertyValueChanged(items, propertyName, oldValues);

    SetOldValues();
}

The main problem with this is that the PropertyGrid is now editing a “live” object: changes it makes to items actually affect their state.  This unfortunately broke my “everything must go through the ActionManager” philosophy, but I couldn’t think of any better alternatives.  So I added a new method to the ActionManager that allows me to “register” that a property value was changed after the fact.  It basically works the same as the old ChangePropertyValue method, except that it doesn’t call “Do” on the PropertyEditAction after it creates it.  So yeah kinda ugly…but it works.  Good enough, I guess.
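The difference between the two entry points is small enough to show in a few lines.  This is a minimal Python sketch rather than the editor’s actual C#: the method names `change_property_value` and `property_value_changed` are modeled on the names mentioned in this post, but their bodies and the `Item` class are assumptions for illustration.

```python
class PropertyEditAction:
    def __init__(self, item, name, old_value, new_value):
        self.item, self.name = item, name
        self.old_value, self.new_value = old_value, new_value

    def do(self):
        setattr(self.item, self.name, self.new_value)

    def undo(self):
        setattr(self.item, self.name, self.old_value)


class ActionManager:
    def __init__(self):
        self.undo_stack, self.redo_stack = [], []

    def _push(self, action):
        self.undo_stack.append(action)
        self.redo_stack.clear()

    def change_property_value(self, item, name, new_value):
        # Normal path: the manager performs the edit itself.
        action = PropertyEditAction(item, name, getattr(item, name), new_value)
        action.do()
        self._push(action)

    def property_value_changed(self, item, name, old_value):
        # "After the fact" path: something else (the property grid) already
        # changed the live object, so just record it without calling do().
        self._push(PropertyEditAction(item, name, old_value, getattr(item, name)))

    def undo(self):
        if self.undo_stack:
            action = self.undo_stack.pop()
            action.undo()
            self.redo_stack.append(action)

    def redo(self):
        if self.redo_stack:
            action = self.redo_stack.pop()
            action.do()
            self.undo_stack.append(action)


class Item:
    """Hypothetical item with a single editable property."""
    def __init__(self, width):
        self.width = width


item = Item(width=10)
manager = ActionManager()

saved_old_value = item.width   # captured when the grid row was selected
item.width = 25                # the grid edits the live object directly
manager.property_value_changed(item, "width", saved_old_value)
```

Once the edit is registered, undo and redo treat it exactly like any other PropertyEditAction.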

There’s More Than One Way To Defer A Renderer

While the idea of deferred shading/deferred rendering isn’t quite as hot as it was a year or two ago (OMG, Killzone 2 uses deferred rendering!), it’s still a cool idea that gets discussed rather often.  People generally tend to be attracted to the way a “pure” deferred renderer neatly and cleanly separates your geometry from your lighting, as well as the idea of being able to throw lights everywhere in their scene.  However as anyone who’s done a little bit of research into the topic surely knows, it comes with a few drawbacks.  The main ones are that for MSAA you need to individually light all your subsamples (which isn’t doable in D3D9), and that for non-opaque objects you have to use forward rendering anyway.

The neat thing about the concepts involved with deferred shading is that you’re not at all locked into the typical “render depth+normals+diffuse+specular to a fat G-Buffer and then shade” approach.  I’m not sure enough people are aware of this, and appreciate it.  For example, you can just defer your shadow map calculations to gain the related performance and organization benefits, and then use standard forward rendering techniques for everything else.  Or you can reconfigure the deferred lighting pipeline to gain back the ability to have multiple materials, or the ability to multisample without shading individual subsamples.  Surely there are even more possibilities!

Recently while working on my own game, I was grappling with the issue of having my engine support more local light sources in a scene.  I was using standard forward lighting with up to 3 lights per pass (which was fine), but I really wanted to keep my DrawPrimitives calls to a minimum (due to how painful they can be on the 360).  This was a problem since I’m aggressively batching my mesh rendering using instancing, and sorting instances by which light affects them would cause my batches to increase.  Thus, I was using 3 “global” light sources per frame.  This has obvious drawbacks.

While I was thinking over solutions, I considered the importance of smaller local lights that are relatively far away in the scene.  At further distances, it’s not necessarily too important to have “correct” lighting.  In fact, we basically just need something that’s the right color, makes the area brighter, and doesn’t shade surfaces facing away from the light source.  So I thought: “I already have view-space depth…if I can calculate view-space normals I can get what I want by using a deferred pass”.  So I did exactly this…and it didn’t work very well.  The problem was that even though you can calculate a view-space normal from a depth value by calculating the partial derivatives and taking a cross product, the normals you calculate aren’t smoothly interpolated between vertices.  So what you get is something that looks an awful lot like flat shading.  Ewwwwwwwwwwww.
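Here’s a minimal Python sketch of why that happens.  In a shader you’d take ddx/ddy of the reconstructed view-space position; this sketch fakes that with finite differences on a made-up flat triangle face (right-handed view space, hypothetical plane equation).  No matter where on the face you sample, the cross product gives you the same face normal, so there’s nothing to smooth the lighting across the triangle:

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def surface_point(x, y):
    # Hypothetical flat face in view space: z = -10 + 0.5 * x.
    return (x, y, -10.0 + 0.5 * x)

def normal_from_derivatives(x, y, step=0.01):
    # Stand-in for ddx()/ddy() of the reconstructed position.
    p = surface_point(x, y)
    dpdx = sub(surface_point(x + step, y), p)
    dpdy = sub(surface_point(x, y + step), p)
    return normalize(cross(dpdx, dpdy))

samples = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5), (2.5, -1.5)]
normals = [normal_from_derivatives(x, y) for x, y in samples]
for n in normals:
    print(n)
```

Interpolated vertex normals, on the other hand, vary across the face, which is what gives you smooth shading on curved meshes.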

This led to approach #2:  in the depth-only pass, render to a RGBA16F surface instead of a R32F surface and render out depth + view-space normals as interpolated from the vertex normals.  This worked much better!  The only remaining issue (aside from the fact that I just hard-code a diffuse albedo and specular albedo) is that normal-maps aren’t used.  However even with those problems the results are still decent, as long as surface colors are primarily determined by your forward rendering pass and the local lights are just “extra”.  Here’s screenshots of a test scene with forward rendering, and then with the point lights deferred:

The results are clearly not as good as a full forward pass when you have them side-by-side, but I think they’re probably good enough…especially if I only use this technique for lights that are small or far-away.  The trick is going to be transferring smoothly from deferred to forward, but that’s certainly doable.

One downside that came with this was that since I was just additively blending in the lights, I couldn’t use my beloved LogLuv encoding for HDR.  My next-best option on the 360 was to normalize R10G10B10A2 to a range greater than [0,1].  I ended up having to normalize to [0,8] to get the dynamic range I wanted, and unfortunately this can give some visible banding in certain cases.  An alternative I’ll have to explore is rendering just the point lights to an R10G10B10A2 buffer, and then sending this to my forward rendering pass to be sampled and added to the result.  If I did this I could also use the light prepass approach, and gain back material parameters and proper MSAA for the point lights.
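To put a rough number on the banding, here’s a minimal Python sketch of what stretching a 10-bit channel over [0,8] does to the step size.  It assumes plain linear unsigned-normalized quantization and ignores gamma, dithering, and anything else the hardware might do, so treat it as back-of-the-envelope only:

```python
def step_size(bits, max_value):
    # Distance between two adjacent representable values.
    return max_value / ((1 << bits) - 1)

def quantize(value, bits, max_value):
    # Store 'value' in an unsigned-normalized channel that has been
    # rescaled to cover [0, max_value], then read it back.
    levels = (1 << bits) - 1
    clamped = min(max(value, 0.0), max_value)
    code = round(clamped / max_value * levels)
    return code * max_value / levels

step_10bit_0_to_8 = step_size(10, 8.0)
step_10bit_0_to_1 = step_size(10, 1.0)
step_8bit_0_to_1 = step_size(8, 1.0)

print("10-bit over [0,8]:", step_10bit_0_to_8)
print("10-bit over [0,1]:", step_10bit_0_to_1)
print(" 8-bit over [0,1]:", step_8bit_0_to_1)
```

Under these assumptions each step in the [0,8] buffer is about twice as coarse as a plain 8-bit [0,1] channel, which lines up with seeing bands in smooth gradients.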

Anyway I’m not saying that what I’m doing is that particularly interesting or useful, I’m just trying to demonstrate that there are many possibilities to explore.  It’s good to think out of the box every once in a while!

Scintillating Snippets: Reconstructing Position From Depth

There are times I wish I’d never responded to this thread over at GDnet, simply because of the constant stream of PM’s that I still get about it.  Wouldn’t it be nice if I could just pull out all the important bits, stick it on some blog, and then link everyone to it?  You’re right, it would be!

First things first: what am I talking about?  I’m talking about something that finds great use for deferred rendering: reconstructing the 3D position of a previously-rendered pixel (either in view-space or world-space) from a single depth value.  In practice, it’s really not terribly complicated.  You intrinsically know (or can figure out) the 2D position of any pixel when you’re shading it, which means that if you can sample a depth value you can get the whole 3D position.  However it’s still easy to get tripped up due to the fact that there’s several ways to go about it, coupled with the fact that many beginners aren’t very proficient at debugging their shaders.

Let’s talk about the first way to do it: storing post-projection z/w, combining it with x/w and y/w, transforming by the inverse of the projection matrix, and dividing by w.  In HLSL it looks something like this…

// Depth pass vertex shader
output.vPositionCS = mul(input.vPositionOS, g_matWorldViewProj);
output.vDepthCS.xy = output.vPositionCS.zw;

// Depth pass pixel shader (output z/w)
return input.vDepthCS.x / input.vDepthCS.y;

// Function for converting depth to view-space position
// in deferred pixel shader pass.  vTexCoord is a texture
// coordinate for a full-screen quad, such that x=0 is the
// left of the screen, and y=0 is the top of the screen.
float3 VSPositionFromDepth(float2 vTexCoord)
{
    // Get the depth value for this pixel
    float z = tex2D(DepthSampler, vTexCoord).r;
    // Get x/w and y/w from the viewport position
    float x = vTexCoord.x * 2 - 1;
    float y = (1 - vTexCoord.y) * 2 - 1;
    float4 vProjectedPos = float4(x, y, z, 1.0f);
    // Transform by the inverse projection matrix
    float4 vPositionVS = mul(vProjectedPos, g_matInvProjection);  
    // Divide by w to get the view-space position
    return vPositionVS.xyz / vPositionVS.w;  
}

For many this is the preferred approach since it works with hardware depth buffers.  It also may seem natural to some: we get depth by projection, we get position by un-projecting.  But what if we don’t have access to a hardware depth buffer?  If you’re targeting the PC and D3D9, sampling from a depth buffer as if it were a texture is not straightforward since it requires driver hacks.  If you’re using XNA, it’s not possible at all since the framework generally attempts to maintain cross-platform compatibility between the PC and the Xbox 360.  In these cases, we can simply render out a depth buffer ourselves using the vertex and pixel shader bits I posted above.  But is this really a good idea?  z/w is non-linear, and most of the precision will be dedicated to areas very close to the near-clip plane.
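Here’s a minimal Python sketch of the project/unproject round trip, plus a quick look at how lopsided z/w is.  It assumes a right-handed, row-vector perspective matrix laid out like the one XNA’s Matrix.CreatePerspectiveFieldOfView produces, and it writes out the product with the inverse projection matrix by hand instead of using a matrix library.  The field of view, clip planes and test point are made up:

```python
import math

def perspective_terms(fov_y, aspect, near, far):
    # Non-zero entries of the projection matrix (row-vector convention):
    # M11 = a, M22 = b, M33 = c, M43 = d, M34 = -1.
    b = 1.0 / math.tan(fov_y * 0.5)
    a = b / aspect
    c = far / (near - far)
    d = near * far / (near - far)
    return a, b, c, d

def project(point_vs, terms):
    # View space -> (x/w, y/w, z/w)
    a, b, c, d = terms
    x, y, z = point_vs
    w = -z
    return (a * x / w, b * y / w, (c * z + d) / w)

def unproject(ndc, terms):
    # [x, y, z, 1] times the inverse projection matrix, then divide by w.
    a, b, c, d = terms
    x, y, z = ndc
    pos = (x / a, y / b, -1.0, (z + c) / d)
    return tuple(v / pos[3] for v in pos[:3])

terms = perspective_terms(math.radians(45.0), 16.0 / 9.0, 1.0, 1000.0)

point_vs = (2.0, -1.0, -10.0)
ndc = project(point_vs, terms)
recovered = unproject(ndc, terms)

print("z/w at 10 units with far = 1000:", ndc[2])
print("recovered position:", recovered)
```

With a near plane of 1 and a far plane of 1000, a point only 10 units away already lands at a z/w of about 0.9, so everything from there to the far plane has to share the last tenth of the range.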

A different approach would be to render out normalized view-space z as our depth.  Since it’s view-space it’s linear which means we get uniform precision distribution, and this also means we don’t need to bother with projection or unprojection to reconstruct position.  Instead we can take the approach of Crytek and multiply the depth value with a ray pointing from the camera to the far-clip plane.  In HLSL it goes something like this:

// Shaders for rendering linear depth
void DepthVS(   in float4 in_vPositionOS    : POSITION,
                out float4 out_vPositionCS  : POSITION,
                out float  out_fDepthVS     : TEXCOORD0    )
{    
    // Figure out the position of the vertex in
    // view space and clip space
    float4x4 matWorldView = mul(g_matWorld, g_matView);
    float4 vPositionVS = mul(in_vPositionOS, matWorldView);
    out_vPositionCS = mul(vPositionVS, g_matProj);
    out_fDepthVS = vPositionVS.z;
}

float4 DepthPS(in float in_fDepthVS : TEXCOORD0) : COLOR0
{
    // Negate and divide by distance to far-clip plane
    // (so that depth is in range [0,1])
    // This is for right-handed coordinate system,
    // for left-handed negating is not necessary.
    float fDepth = -in_fDepthVS/g_fFarClip;
    return float4(fDepth, 1.0f, 1.0f, 1.0f);
}

// Shaders for deferred pass where position is reconstructed

// Vertex shader for rendering a full-screen quad
void QuadVS (	in float3 in_vPositionOS		: POSITION,
		in float3 in_vTexCoordAndCornerIndex	: TEXCOORD0,
		out float4 out_vPositionCS		: POSITION,
		out float2 out_vTexCoord		: TEXCOORD0,
		out float3 out_vFrustumCornerVS		: TEXCOORD1	)
{
	// Offset the position by half a pixel to correctly
	// align texels to pixels. Only necessary for D3D9 or XNA
	out_vPositionCS.x = in_vPositionOS.x - (1.0f/g_vOcclusionTextureSize.x);
	out_vPositionCS.y = in_vPositionOS.y + (1.0f/g_vOcclusionTextureSize.y);
	out_vPositionCS.z = in_vPositionOS.z;
	out_vPositionCS.w = 1.0f;

	// Pass along the texture coordinate and the position
	// of the frustum corner in view-space.  This frustum corner
        // position is interpolated so that the pixel shader always
        // has a ray from camera->far-clip plane
	out_vTexCoord = in_vTexCoordAndCornerIndex.xy;
	out_vFrustumCornerVS = g_vFrustumCornersVS[in_vTexCoordAndCornerIndex.z];
}

// Pixel shader function for reconstructing view-space position
float3 VSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayVS)
{
	float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
	return fPixelDepth * vFrustumRayVS;
}

As you can see the reconstruction is quite nice with linear depth: we only need a single multiply instead of the 4 MADD’s and a divide needed for unprojection.  If you’re curious about how to get the frustum corner positions I use, it’s rather easy with a little trig.  This tutorial walks you through it.  Or if you’re using XNA, there’s a super-convenient BoundingFrustum class that can take care of it for you.  My code for getting the positions looks something like this:

Matrix viewProjMatrix = viewMatrix * projMatrix;
BoundingFrustum frustum = new BoundingFrustum(viewProjMatrix);
frustum.GetCorners(frustumCornersWS);
Vector3.Transform(frustumCornersWS, ref viewMatrix, frustumCornersVS);
for (int i = 0; i < 4; i++)
    farFrustumCornersVS[i] = frustumCornersVS[i + 4];

The farFrustumCornersVS array is what I send to my vertex shader as shader constants. Then you just need to have an index in your quad vertices that tells you which vertex belongs to which corner (which you could also do with shader math, if you want).  Another approach would be to simply store the corner positions directly in the vertices as texCoord’s.
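For the “little trig” route, here’s a minimal Python sketch that computes the four far-plane corners directly and then runs the whole linear-depth reconstruction on the CPU.  It assumes a symmetric right-handed perspective frustum, a vertical field of view, and texture coordinates with (0,0) at the top-left; the corner names are just labels for this sketch and say nothing about the order BoundingFrustum.GetCorners returns them in:

```python
import math

def far_frustum_corners_vs(fov_y, aspect, far_clip):
    half_height = far_clip * math.tan(fov_y * 0.5)
    half_width = half_height * aspect
    # Right-handed view space: the far plane sits at z = -far_clip.
    return {
        "top_left": (-half_width, half_height, -far_clip),
        "top_right": (half_width, half_height, -far_clip),
        "bottom_left": (-half_width, -half_height, -far_clip),
        "bottom_right": (half_width, -half_height, -far_clip),
    }

def lerp(a, b, t):
    return tuple(x + (y - x) * t for x, y in zip(a, b))

def frustum_ray_vs(corners, u, v):
    # What the rasterizer does for us when it interpolates the corner
    # positions across the full-screen quad.
    top = lerp(corners["top_left"], corners["top_right"], u)
    bottom = lerp(corners["bottom_left"], corners["bottom_right"], u)
    return lerp(top, bottom, v)

def position_from_linear_depth(depth, ray_vs):
    return tuple(depth * c for c in ray_vs)

fov_y, aspect, far_clip = math.radians(60.0), 16.0 / 9.0, 100.0
corners = far_frustum_corners_vs(fov_y, aspect, far_clip)

# Pretend this point was rendered: store its depth, find its texcoord.
point_vs = (3.0, -2.0, -25.0)
depth = -point_vs[2] / far_clip
half_height = far_clip * math.tan(fov_y * 0.5)
half_width = half_height * aspect
on_far_plane = tuple(c * far_clip / -point_vs[2] for c in point_vs)
u = on_far_plane[0] / half_width * 0.5 + 0.5
v = 0.5 - on_far_plane[1] / half_height * 0.5

reconstructed = position_from_linear_depth(depth, frustum_ray_vs(corners, u, v))
print(reconstructed)
```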

Extra Credit:  this technique can also be used to reconstruct world-space position, if that’s what you’re after.  All you need to do is rotate (not translate) your frustum corner positions by the inverse of your view matrix to get them back into world space.  Then when you multiply the interpolated ray with your depth value, you simply add the camera position to the value (ends up being a single MADD).

Extra-Extra Credit: you can use this technique with arbitrary geometry too, not just quads.  You just need to figure out a texture coordinate for each pixel, which you can do by either interpolating the clip-space position and dividing x and y by w, or by using the VPOS semantic.  Then for your frustum ray you just calculate the eye->vertex vector and scale it so that it points all the way back to the far-clip plane.
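The clip-space route for the texture coordinate is just a perspective divide and a remap.  Here’s a minimal Python sketch of it, assuming D3D-style conventions (y up in clip space, v down in texture space) and ignoring the D3D9 half-pixel offset for brevity; the function name is made up:

```python
def texcoord_from_clip_position(position_cs):
    x, y, z, w = position_cs
    # Perspective divide takes x and y to [-1, 1]...
    ndc_x, ndc_y = x / w, y / w
    # ...then remap to [0, 1], flipping y so that v = 0 is the top.
    return (ndc_x * 0.5 + 0.5, -ndc_y * 0.5 + 0.5)

top_left = texcoord_from_clip_position((-4.0, 4.0, 1.0, 4.0))
center = texcoord_from_clip_position((0.0, 0.0, 1.0, 2.5))
bottom_right = texcoord_from_clip_position((7.0, -7.0, 3.0, 7.0))
print(top_left, center, bottom_right)
```

In a shader you’d pass the clip-space position down in an extra interpolator and do the divide per pixel, since dividing per vertex and interpolating the result isn’t perspective-correct.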

UPDATE:  Answers to extra credit questions here

Scintillating Snippets: Programmatically Adding Content To A Content Project

One of the tools I made for my current project is a model editor.  Basically it can import .fbx or .x models, and then you can apply my custom effects, set parameters, set textures, and then save it using my custom model format I named “.jsm” (it’s just XML…don’t tell anyone!).  Anyway one of the neat features I wanted it to have was the ability to add a model to my game’s Content project so that you wouldn’t have to manually do it through Visual Studio.  And since the Content Pipeline uses MSBuild, this is easy to do:

// Load up the content project
Engine.GlobalEngine.BinPath = System.Runtime.InteropServices.RuntimeEnvironment.GetRuntimeDirectory();
Project contentProject = new Project();
contentProject.Load(projectFileName);

// Add it
BuildItem newItem = contentProject.AddNewItem("Compile", "Models\\" + modelName + ".fbx");
newItem.SetMetadata("Link", "Models\\" + modelName + ".fbx");
newItem.SetMetadata("Name", modelName);
newItem.SetMetadata("Importer", "FbxImporter");
newItem.SetMetadata("Processor", "ModelProcessor");

// Save it
contentProject.Save(projectFileName);

This is of course the generic version and not the actual code I used, but you get the idea.  The “projectFileName” string should contain the path to your Content.contentproj file in your Content subfolder, and “modelName” is just a name for your model, minus the extension.

What’s going on is pretty simple:  I load up the Content project using the Engine and Project classes found in Microsoft.Build.BuildEngine.  Then I call AddNewItem, which creates a new BuildItem for the model and adds it to the Project.  The second string I pass to AddNewItem is the path to the model file relative to the .contentproj file.  The first bit of metadata (“Link”) specifies that I want to add the file as a link, not as a copy; its value determines how the file shows up in the project hierarchy (AKA how it will show up when you expand the Content node in Visual Studio).  The second bit of metadata (“Name”) is just a name associated with the file.  The third (“Importer”) specifies the ContentImporter to use, and the fourth (“Processor”) specifies the ContentProcessor to use.
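If it helps to picture what ends up in the .contentproj file, here's a rough sketch that builds the equivalent item element by hand with System.Xml.Linq. To be clear, this is not the MSBuild API and not how the snippet above works internally; it only illustrates the shape of the XML I'd expect the item and its metadata to serialize to. The class name and the "Ship" model name used below are made up.

```csharp
using System.Xml.Linq;

public static class ContentItemSketch
{
    // The XML namespace used by MSBuild project files of this era.
    public static readonly XNamespace MsBuild =
        "http://schemas.microsoft.com/developer/msbuild/2003";

    // Builds a <Compile Include="..."> element with one child element per
    // piece of metadata, mirroring the SetMetadata calls in the snippet.
    public static XElement BuildModelItem(string modelName)
    {
        string relativePath = "Models\\" + modelName + ".fbx";

        return new XElement(MsBuild + "Compile",
            new XAttribute("Include", relativePath),
            new XElement(MsBuild + "Link", relativePath),
            new XElement(MsBuild + "Name", modelName),
            new XElement(MsBuild + "Importer", "FbxImporter"),
            new XElement(MsBuild + "Processor", "ModelProcessor"));
    }
}
```

Calling ContentItemSketch.BuildModelItem("Ship").ToString() gives you something you can compare against an item that Visual Studio added by hand, which is a handy sanity check when the project refuses to build.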

Deferred Cascaded Shadow Maps

For my next sample I was planning on extending my deferred shadow maps sample to implement cascaded shadow maps.  I got an email asking about how to make the sample look decent with large viewing distances, which is exactly the problem CSMs solve.  So I decided to bump up my plans a little early and get the code up and running.  It’ll be a while before I get the write-up finished, but until then feel free to play around with the code (PC and 360 projects included).

Profiling Events vs. Virtual Functions On The 360

Over the past week or so I’ve been completely reworking my collision system in order to better decouple it from other areas of code, and also make it more flexible.  One part I got stuck on for a bit was deciding on the mechanism to use for notifying owners of collision components when the component collides with something.  I narrowed it down to two options:

-notify owners via the ICollisionOwner interface I was using

OR

-use an Event

I was leaning more towards events because I felt their semantics naturally fit with the usage pattern I was working toward.  If game entities want to be notified, they simply subscribe and they get notified.  This seemed cleaner and easier to understand than letting each collision component have some sort of “NotifyOwner” flag and then calling a virtual function if the flag was true.  However I was a little worried about performance…I hadn’t really used delegates on the 360 before and I wanted to make sure that the overhead wasn’t going to be something astronomical before proceeding. So I set up a simple test harness that vaguely resembled how I was going to use events:

public delegate void EventDelegate(object sender, ref Vector3 parameter);

public class EventServer
{
    public event EventDelegate SomeEvent;

    public void RaiseEvent()
    {
        Vector3 param = new Vector3();

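        // Event path: this is what gets timed when testing events
        if (SomeEvent != null)
            SomeEvent(this, ref param);

        // Virtual function path: comment out the event invocation above and
        // uncomment this loop to time the virtual calls instead
        //for (int i = 0; i < Handlers.Count; i++)
        //{
        //    if (Handlers[i].HandlesEvent)
        //        Handlers[i].HandleEventVirtual(this, ref param);
        //}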
    }

    public List<IEventHandler> Handlers = new List<IEventHandler>();
}

public interface IEventHandler
{
    void HandleEventVirtual(object sender, ref Vector3 parameter);
    bool HandlesEvent
    {
        get;
    }
}

public class EventHandler : IEventHandler
{
    EventServer server;
    bool handleEvent;

    public EventHandler(EventServer server, bool handleEvent)
    {
        this.server = server;
        this.handleEvent = handleEvent;  

        if (handleEvent)
            server.SomeEvent += new EventDelegate(HandleEvent);
    }

    void HandleEvent(object sender, ref Vector3 parameter)
    {
        parameter.Y += 0.001f;
    }

    public virtual void HandleEventVirtual(object sender, ref Vector3 parameter)
    {
        parameter.X += 0.001f;
    }

    public bool HandlesEvent
    {
        get { return handleEvent; }
    }
}

public class EventHandler2 : EventHandler
{
    public EventHandler2(EventServer server, bool handleEvent)
        : base(server, handleEvent)
    {
    }

    public override void HandleEventVirtual(object sender, ref Vector3 parameter)
    {
        base.HandleEventVirtual(sender, ref parameter);
        parameter.Normalize();
    }
}

Pretty simple setup: a class that will dole out events to a collection of handlers, with a derivative of the handler class also being thrown in just to make sure the compiler doesn’t do anything funky that will prevent us from actually getting virtual functions.  To test events we leave it like this; to test virtual functions we comment out the event invocation and use the virtual function call instead.  Any .NET junkies might notice I’ve violated the guidelines for creating custom event handlers by not using an EventArgs derivative…the reason is that EventArgs is a class, so creating a new instance would generate garbage every time the event fires.  And as we all know…the GC is not our friend on the Xbox.

I set it up to run with various amounts of event handlers distributed across various amounts of event servers.  I then set up the game class to fire off all the event servers in the Update function and use a Stopwatch to time how long it took.  I also averaged the timing results across 64 frames to smooth out the results.  This is what I got:
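The timing code itself isn't shown, but a minimal sketch of that kind of measurement (a Stopwatch around the work, averaged over a rolling window of frames) could look like the following. The AveragedTimer class and its members are hypothetical names, not what was actually used.

```csharp
using System;
using System.Diagnostics;

public class AveragedTimer
{
    readonly long[] samples;   // ring buffer of the most recent measurements
    int nextIndex;
    int sampleCount;

    public AveragedTimer(int windowSize)
    {
        samples = new long[windowSize];
    }

    // Records one measurement, overwriting the oldest once the window is full.
    public void AddSample(long ticks)
    {
        samples[nextIndex] = ticks;
        nextIndex = (nextIndex + 1) % samples.Length;
        if (sampleCount < samples.Length)
            sampleCount++;
    }

    // Times a single call to 'work' in Stopwatch ticks and records it.
    public long Measure(Action work)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        work();
        stopwatch.Stop();
        AddSample(stopwatch.ElapsedTicks);
        return stopwatch.ElapsedTicks;
    }

    // Average over however many samples have been recorded so far.
    public double AverageTicks
    {
        get
        {
            if (sampleCount == 0)
                return 0.0;

            long total = 0;
            for (int i = 0; i < sampleCount; i++)
                total += samples[i];
            return (double)total / sampleCount;
        }
    }
}
```

With a window of 64 you'd call Measure once per Update around the loop that raises the events and read AverageTicks when displaying results. On the 360 it's worth creating the Action delegate once and reusing it, so that the measurement itself doesn't generate garbage.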

Handlers:Servers    Events (ticks)        Virtual functions (ticks)
50:1                9                     22
500:1               710                   220
5000:1              163000 (3.26ms)       2200
5000:10             18600                 2200
5000:100            1000                  2200
5000:1000           820                   2200

The table shows the EventHandler:EventServer ratio on the left, followed by the time taken for invocation (in ticks) when using events and when using virtual functions.  The first few results are pretty interesting:  the virtual function method scales linearly with the number of handlers we have, while the time required for firing events grows much faster than linearly (roughly 80x, then roughly 230x, for each tenfold increase in handlers).   The bottom half of the results is even more interesting: the time taken goes way down as we start to distribute the handlers more evenly across servers.  In fact it goes down so much that it becomes quicker than virtual functions!  Crazy.
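A quick note on units, since ticks aren't very intuitive. The one entry given in both units (163000 ticks = 3.26ms) implies a tick rate of 50,000,000 ticks per second. That rate is inferred from the table rather than stated anywhere, so treat it as an assumption. The conversion is just a divide, sketched here with a hypothetical helper:

```csharp
public static class TickMath
{
    // Converts a tick count to milliseconds for a given tick rate.
    // On real hardware you'd pass Stopwatch.Frequency as ticksPerSecond.
    public static double TicksToMilliseconds(long ticks, long ticksPerSecond)
    {
        return ticks * 1000.0 / ticksPerSecond;
    }
}
```

At that assumed rate, TickMath.TicksToMilliseconds(163000, 50000000) comes out to 3.26, matching the table, and the 2200-tick virtual function result works out to 0.044ms.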

Anyway I had my answer: events would be fine with my setup.  I can’t foresee any reason why more than one handler would subscribe to the same collision component, and even if one did the overhead is basically minuscule for the numbers I’ll be working with.  But it’s always fun to experiment, right?