There are times I wish I’d never responded to this thread over at GDnet, simply because of the constant stream of PMs that I still get about it. Wouldn’t it be nice if I could just pull out all the important bits, stick them on some blog, and then link everyone to it? You’re right, it would be!
First things first: what am I talking about? I’m talking about something that’s very useful for deferred rendering: reconstructing the 3D position of a previously-rendered pixel (either in view-space or world-space) from a single depth value. In practice, it’s really not terribly complicated. You intrinsically know (or can figure out) the 2D position of any pixel when you’re shading it, which means that if you can sample a depth value you can get the whole 3D position. However, it’s still easy to get tripped up because there are several ways to go about it, and many beginners aren’t very proficient at debugging their shaders.
Let’s talk about the first way to do it: storing post-projection z/w, combining it with x/w and y/w, transforming by the inverse of the projection matrix, and dividing by w. In HLSL it looks something like this…
// Depth pass vertex shader
output.vPositionCS = mul(input.vPositionOS, g_matWorldViewProj);
output.vDepthCS.xy = output.vPositionCS.zw;

// Depth pass pixel shader (output z/w)
return input.vDepthCS.x / input.vDepthCS.y;

// Function for converting depth to view-space position
// in deferred pixel shader pass. vTexCoord is a texture
// coordinate for a full-screen quad, such that x=0 is the
// left of the screen, and y=0 is the top of the screen.
float3 VSPositionFromDepth(float2 vTexCoord)
{
    // Get the depth value for this pixel (stored in the red channel)
    float z = tex2D(DepthSampler, vTexCoord).r;
    // Get x/w and y/w from the viewport position
    float x = vTexCoord.x * 2 - 1;
    float y = (1 - vTexCoord.y) * 2 - 1;
    float4 vProjectedPos = float4(x, y, z, 1.0f);
    // Transform by the inverse projection matrix
    float4 vPositionVS = mul(vProjectedPos, g_matInvProjection);
    // Divide by w to get the view-space position
    return vPositionVS.xyz / vPositionVS.w;
}
For many this is the preferred approach since it works with hardware depth buffers. It also may seem natural to some: we get depth by projection, we get position by un-projecting. But what if we don’t have access to a hardware depth buffer? If you’re targeting the PC and D3D9, sampling from a depth buffer as if it were a texture is not straightforward since it requires driver hacks. If you’re using XNA, it’s not possible at all since the framework generally attempts to maintain cross-platform compatibility between the PC and the Xbox 360. In these cases, we can simply render out a depth buffer ourselves using the vertex and pixel shader bits I posted above. But is this really a good idea? z/w is non-linear, and most of the precision will be dedicated to areas very close to the near-clip plane.
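To put a number on that non-linearity, here’s a small sketch of converting a stored z/w value back to a linear distance along the view axis. It assumes a standard D3D-style perspective projection, where z/w is 0 at the near plane and 1 at the far plane. The function name LinearDepthFromZOverW and the constant g_fNearClip are made up for illustration and aren’t used anywhere else in this post.

```hlsl
// Hypothetical constants (assumptions for this sketch): distances to
// the near and far clip planes used to build the projection matrix.
float g_fNearClip;
float g_fFarClip;

// Converts a post-projection z/w value in [0,1] to the positive
// distance along the view axis. Assumes a standard D3D-style
// perspective projection (z/w = 0 at the near plane, 1 at the far plane).
float LinearDepthFromZOverW(float fZOverW)
{
    return (g_fNearClip * g_fFarClip) /
           (g_fFarClip - fZOverW * (g_fFarClip - g_fNearClip));
}
```

Running the math the other way, a point at twice the near-plane distance already lands at z/w = far / (2 * (far - near)), which is just over 0.5 whenever the far plane is much further away than the near plane. In other words, about half of the stored range is spent between the near plane and twice its distance.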
A different approach would be to render out normalized view-space z as our depth. Since it’s view-space it’s linear, which means we get uniform precision distribution, and this also means we don’t need to bother with projection or unprojection to reconstruct position. Instead we can take the approach of Crytek and multiply the depth value with a ray pointing from the camera to the far-clip plane. In HLSL it goes something like this:
// Shaders for rendering linear depth
void DepthVS( in float4 in_vPositionOS : POSITION,
              out float4 out_vPositionCS : POSITION,
              out float out_fDepthVS : TEXCOORD0 )
{
    // Figure out the position of the vertex in
    // view space and clip space
    float4x4 matWorldView = mul(g_matWorld, g_matView);
    float4 vPositionVS = mul(in_vPositionOS, matWorldView);
    out_vPositionCS = mul(vPositionVS, g_matProj);
    out_fDepthVS = vPositionVS.z;
}

float4 DepthPS(in float in_fDepthVS : TEXCOORD0) : COLOR0
{
    // Negate and divide by distance to far-clip plane
    // (so that depth is in range [0,1])
    // This is for right-handed coordinate system,
    // for left-handed negating is not necessary.
    float fDepth = -in_fDepthVS/g_fFarClip;
    return float4(fDepth, 1.0f, 1.0f, 1.0f);
}
// Shaders for deferred pass where position is reconstructed
// Vertex shader for rendering a full-screen quad
void QuadVS ( in float3 in_vPositionOS : POSITION,
              in float3 in_vTexCoordAndCornerIndex : TEXCOORD0,
              out float4 out_vPositionCS : POSITION,
              out float2 out_vTexCoord : TEXCOORD0,
              out float3 out_vFrustumCornerVS : TEXCOORD1 )
{
    // Offset the position by half a pixel to correctly
    // align texels to pixels. Only necessary for D3D9 or XNA
    out_vPositionCS.x = in_vPositionOS.x - (1.0f/g_vOcclusionTextureSize.x);
    out_vPositionCS.y = in_vPositionOS.y + (1.0f/g_vOcclusionTextureSize.y);
    out_vPositionCS.z = in_vPositionOS.z;
    out_vPositionCS.w = 1.0f;
    // Pass along the texture coordinate and the position
    // of the frustum corner in view-space. This frustum corner
    // position is interpolated so that the pixel shader always
    // has a ray from camera->far-clip plane
    out_vTexCoord = in_vTexCoordAndCornerIndex.xy;
    out_vFrustumCornerVS = g_vFrustumCornersVS[in_vTexCoordAndCornerIndex.z];
}

// Pixel shader function for reconstructing view-space position
float3 VSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayVS)
{
    float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
    return fPixelDepth * vFrustumRayVS;
}
As you can see, the reconstruction is quite nice with linear depth: we only need a single multiply instead of the four MADDs and a divide needed for unprojection. If you’re curious about how to get the frustum corner positions I use, it’s rather easy with a little trig. This tutorial walks you through it. Or if you’re using XNA, there’s a super-convenient BoundingFrustum class that can take care of it for you. My code for getting the positions looks something like this:
Matrix viewProjMatrix = viewMatrix * projMatrix;
BoundingFrustum frustum = new BoundingFrustum(viewProjMatrix);
frustum.GetCorners(frustumCornersWS);
Vector3.Transform(frustumCornersWS, ref viewMatrix, frustumCornersVS);
for (int i = 0; i < 4; i++)
    farFrustumCornersVS[i] = frustumCornersVS[i + 4];
The farFrustumCornersVS array is what I send to my vertex shader as shader constants. Then you just need to have an index in your quad vertices that tells you which vertex belongs to which corner (which you could also do with shader math, if you want). Another approach would be to simply store the corner positions directly in the vertices as texture coordinates.
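If you’d rather go the shader-math route, one way to do it is to build the ray straight from the quad’s texture coordinate. This is only a rough sketch: it assumes a symmetric perspective projection and a right-handed view space like the one used above, and g_fTanHalfFovY, g_fAspectRatio and the function name FrustumRayFromTexCoord are hypothetical names you’d fill in from your own projection parameters.

```hlsl
// Hypothetical constants (assumptions for this sketch)
float g_fTanHalfFovY;   // tan(vertical field of view / 2)
float g_fAspectRatio;   // viewport width / height
float g_fFarClip;       // distance to the far-clip plane

// Builds the view-space ray from the camera to the far-clip plane
// for a full-screen quad texture coordinate (x=0 left, y=0 top).
// Assumes a symmetric perspective projection and a right-handed
// view space, where the camera looks down -Z.
float3 FrustumRayFromTexCoord(float2 vTexCoord)
{
    // Map the texture coordinate to [-1,1], flipping y so that +y is up
    float2 vNDC = float2(vTexCoord.x * 2 - 1, (1 - vTexCoord.y) * 2 - 1);

    // Half-extents of the far plane in view space
    float fHalfHeight = g_fFarClip * g_fTanHalfFovY;
    float fHalfWidth = fHalfHeight * g_fAspectRatio;

    return float3(vNDC.x * fHalfWidth, vNDC.y * fHalfHeight, -g_fFarClip);
}
```

Calling this in the quad vertex shader with the corner texture coordinates should give the same far-plane corners as the constant array, so the pixel shader side stays unchanged. For a left-handed view space the z component would be +g_fFarClip instead.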
Extra Credit: this technique can also be used to reconstruct world-space position, if that’s what you’re after. All you need to do is rotate (not translate) your frustum corner positions by the inverse of your view matrix to get them back into world space. Then when you multiply the interpolated ray with your depth value, you simply add the camera position to the value (ends up being a single MADD).
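A minimal sketch of the world-space variant might look like the following. The constants g_matInvView and g_vCameraPosWS and both function names are assumptions made for illustration, and the code uses the same row-vector mul() convention as the shaders above.

```hlsl
// Hypothetical constants (assumptions for this sketch)
float4x4 g_matInvView;     // inverse of the view matrix
float3 g_vCameraPosWS;     // camera position in world space
sampler2D DepthSampler;    // linear depth, as written by DepthPS

// For the quad vertex shader: rotate a view-space frustum corner
// into world space. Casting to float3x3 keeps only the rotation
// part of the matrix, so no translation is applied.
float3 FrustumCornerVSToWS(float3 vFrustumCornerVS)
{
    return mul(vFrustumCornerVS, (float3x3)g_matInvView);
}

// For the pixel shader: scale the interpolated world-space ray by
// the stored linear depth and offset by the camera position.
float3 WSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayWS)
{
    float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
    return g_vCameraPosWS + fPixelDepth * vFrustumRayWS;
}
```

The rotation could just as easily be done once on the CPU before the corners are uploaded as constants, which saves the per-vertex multiply.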
Extra-Extra Credit: you can use this technique with arbitrary geometry too, not just quads. You just need to figure out a texture coordinate for each pixel, which you can do by either interpolating the clip-space position and dividing x and y by w, or by using the VPOS semantic. Then for your frustum ray you just calculate the eye->vertex vector and scale it so that it points all the way back to the far-clip plane.
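Here’s a rough sketch of the arbitrary-geometry case using the interpolated clip-space position rather than VPOS. It assumes the right-handed view space and D3D9-style half-pixel offset used earlier. The shader names, g_matWorldView and g_vRenderTargetSize are hypothetical. It also scales the ray per pixel from the interpolated view-space position, which is one safe choice since the scale factor depends on 1/z and so isn’t something you want to rely on interpolating correctly.

```hlsl
// Constants (names are assumptions for this sketch)
float4x4 g_matWorldView;
float4x4 g_matProj;
float g_fFarClip;
float2 g_vRenderTargetSize;   // size of the depth texture in pixels
sampler2D DepthSampler;       // linear depth, as written by DepthPS

void VolumeVS(in float4 in_vPositionOS : POSITION,
              out float4 out_vPositionCS : POSITION,
              out float4 out_vScreenPosCS : TEXCOORD0,
              out float3 out_vPositionVS : TEXCOORD1)
{
    float4 vPositionVS = mul(in_vPositionOS, g_matWorldView);
    out_vPositionCS = mul(vPositionVS, g_matProj);
    // POSITION can't be read in a D3D9 pixel shader, so pass a copy
    out_vScreenPosCS = out_vPositionCS;
    out_vPositionVS = vPositionVS.xyz;
}

float3 VSPositionFromDepthVolume(float4 vScreenPosCS, float3 vPositionVS)
{
    // Divide by w, then map x/w and y/w from [-1,1] to [0,1] (flipping y)
    float2 vTexCoord = vScreenPosCS.xy / vScreenPosCS.w;
    vTexCoord = float2(vTexCoord.x, -vTexCoord.y) * 0.5f + 0.5f;
    // Half-pixel offset, only needed for D3D9 or XNA
    vTexCoord += 0.5f / g_vRenderTargetSize;

    // Scale the eye->surface vector so it reaches the far-clip plane
    // (z is negated because view space is right-handed here)
    float3 vFrustumRayVS = vPositionVS * (g_fFarClip / -vPositionVS.z);

    float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
    return fPixelDepth * vFrustumRayVS;
}
```

The pixel shader would call VSPositionFromDepthVolume with the two interpolated TEXCOORD values from VolumeVS.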
UPDATE: Answers to extra credit questions here