There are times I wish I’d never responded to this thread over at GDnet, simply because of the constant stream of PMs that I still get about it. Wouldn’t it be nice if I could just pull out all the important bits, stick them on some blog, and then link everyone to it? You’re right, it would be!

First things first: what am I talking about? I’m talking about something that’s very useful for deferred rendering: reconstructing the 3D position of a previously-rendered pixel (either in view-space or world-space) from a single depth value. In practice, it’s really not terribly complicated. You intrinsically know (or can figure out) the 2D position of any pixel when you’re shading it, which means that if you can sample a depth value you can get the whole 3D position. However, it’s still easy to get tripped up, since there are several ways to go about it and many beginners aren’t very proficient at debugging their shaders.

Let’s talk about the first way to do it: storing post-projection z/w, combining it with x/w and y/w, transforming by the inverse of the projection matrix, and dividing by w. In HLSL it looks something like this…

// Depth pass vertex shader
output.vPositionCS = mul(input.vPositionOS, g_matWorldViewProj);
output.vDepthCS.xy = output.vPositionCS.zw;

// Depth pass pixel shader (output z/w)
return input.vDepthCS.x / input.vDepthCS.y;

// Function for converting depth to view-space position
// in deferred pixel shader pass. vTexCoord is a texture
// coordinate for a full-screen quad, such that x=0 is the
// left of the screen, and y=0 is the top of the screen.
float3 VSPositionFromDepth(float2 vTexCoord)
{
    // Get the depth value for this pixel
    float z = tex2D(DepthSampler, vTexCoord).r;

    // Get x/w and y/w from the viewport position
    float x = vTexCoord.x * 2 - 1;
    float y = (1 - vTexCoord.y) * 2 - 1;
    float4 vProjectedPos = float4(x, y, z, 1.0f);

    // Transform by the inverse projection matrix
    float4 vPositionVS = mul(vProjectedPos, g_matInvProjection);

    // Divide by w to get the view-space position
    return vPositionVS.xyz / vPositionVS.w;
}
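If you want to sanity-check this math on the CPU before blaming your shader, here’s a minimal Python sketch of the same round trip. It assumes a right-handed, D3D-style perspective projection (row vectors, depth mapped to [0, 1]), and instead of a general matrix inverse it uses the closed-form inverse of that particular matrix. The function names are made up for illustration; they aren’t from any library.

```python
import math

def perspective_terms(fov_y, aspect, near, far):
    """Non-trivial entries of a right-handed, D3D-style perspective matrix
    (row-vector convention, depth mapped to [0, 1]). Assumed layout."""
    m22 = 1.0 / math.tan(fov_y / 2.0)
    m11 = m22 / aspect
    m33 = far / (near - far)
    m43 = near * far / (near - far)
    return m11, m22, m33, m43

def project(pos_vs, terms):
    """View space -> (x/w, y/w, z/w)."""
    m11, m22, m33, m43 = terms
    x, y, z = pos_vs
    w = -z  # right-handed: the camera looks down -Z
    return (x * m11 / w, y * m22 / w, (z * m33 + m43) / w)

def unproject(ndc, terms):
    """(x/w, y/w, z/w) -> view space. Closed form of multiplying by the
    inverse projection matrix and then dividing by w."""
    m11, m22, m33, m43 = terms
    x_ndc, y_ndc, z_ndc = ndc
    z = -m43 / (z_ndc + m33)
    w = -z
    return (x_ndc * w / m11, y_ndc * w / m22, z)
```

Projecting a view-space point and unprojecting the result should give the original point back (up to floating-point error). If your shader disagrees with this for the same inputs, the usual suspects are the y flip in the texture coordinate and a transposed matrix.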

For many this is the preferred approach since it works with hardware depth buffers. It also may seem natural to some: we get depth by projection, we get position by un-projecting. But what if we don’t have access to a hardware depth buffer? If you’re targeting the PC and D3D9, sampling from a depth buffer as if it were a texture is not straightforward since it requires driver hacks. If you’re using XNA, it’s not possible at all since the framework generally attempts to maintain cross-platform compatibility between the PC and the Xbox 360. In these cases, we can simply render out a depth buffer ourselves using the vertex and pixel shader bits I posted above. But is this really a good idea? z/w is non-linear, and most of the precision will be dedicated to areas very close to the near-clip plane.
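To put rough numbers on that non-linearity, here’s a small Python sketch that prints z/w next to plain distance / far for a few distances. It assumes a D3D-style [0, 1] depth range and hypothetical clip planes at 1 and 1000. With those values, the first unit of distance past the near plane already uses up about half of the z/w range.

```python
def z_over_w(distance, near, far):
    """Post-projection depth in [0, 1] for a point `distance` units
    in front of the camera (D3D-style perspective projection assumed)."""
    return far * (distance - near) / ((far - near) * distance)

def linear_depth(distance, far):
    """Distance along the view axis, normalized by the far plane."""
    return distance / far

near, far = 1.0, 1000.0  # hypothetical clip planes
for d in (1.0, 2.0, 10.0, 100.0, 1000.0):
    print(f"{d:7.1f}  z/w = {z_over_w(d, near, far):.4f}  "
          f"linear = {linear_depth(d, far):.4f}")
```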

A different approach would be to render out normalized view-space z as our depth. Since it’s view-space it’s linear, which means we get a uniform distribution of precision, and it also means we don’t need to bother with projection or unprojection to reconstruct position. Instead we can take Crytek’s approach and multiply the depth value with a ray pointing from the camera to the far-clip plane. In HLSL it goes something like this:

// Shaders for rendering linear depth
void DepthVS(
    in float4 in_vPositionOS : POSITION,
    out float4 out_vPositionCS : POSITION,
    out float out_fDepthVS : TEXCOORD0 )
{
    // Figure out the position of the vertex in
    // view space and clip space
    float4x4 matWorldView = mul(g_matWorld, g_matView);
    float4 vPositionVS = mul(in_vPositionOS, matWorldView);
    out_vPositionCS = mul(vPositionVS, g_matProj);
    out_fDepthVS = vPositionVS.z;
}

float4 DepthPS(in float in_fDepthVS : TEXCOORD0) : COLOR0
{
    // Negate and divide by distance to far-clip plane
    // (so that depth is in range [0,1])
    // This is for right-handed coordinate system,
    // for left-handed negating is not necessary.
    float fDepth = -in_fDepthVS/g_fFarClip;
    return float4(fDepth, 1.0f, 1.0f, 1.0f);
}

// Shaders for deferred pass where position is reconstructed

// Vertex shader for rendering a full-screen quad
void QuadVS (
    in float3 in_vPositionOS : POSITION,
    in float3 in_vTexCoordAndCornerIndex : TEXCOORD0,
    out float4 out_vPositionCS : POSITION,
    out float2 out_vTexCoord : TEXCOORD0,
    out float3 out_vFrustumCornerVS : TEXCOORD1 )
{
    // Offset the position by half a pixel to correctly
    // align texels to pixels. Only necessary for D3D9 or XNA
    out_vPositionCS.x = in_vPositionOS.x - (1.0f/g_vOcclusionTextureSize.x);
    out_vPositionCS.y = in_vPositionOS.y + (1.0f/g_vOcclusionTextureSize.y);
    out_vPositionCS.z = in_vPositionOS.z;
    out_vPositionCS.w = 1.0f;

    // Pass along the texture coordinate and the position
    // of the frustum corner in view-space. This frustum corner
    // position is interpolated so that the pixel shader always
    // has a ray from camera->far-clip plane
    out_vTexCoord = in_vTexCoordAndCornerIndex.xy;
    out_vFrustumCornerVS = g_vFrustumCornersVS[in_vTexCoordAndCornerIndex.z];
}

// Pixel shader function for reconstructing view-space position
float3 VSPositionFromDepth(float2 vTexCoord, float3 vFrustumRayVS)
{
    float fPixelDepth = tex2D(DepthSampler, vTexCoord).r;
    return fPixelDepth * vFrustumRayVS;
}

As you can see, the reconstruction is quite nice with linear depth: we only need a single multiply instead of the 4 MADDs and a divide needed for unprojection. If you’re curious about how to get the frustum corner positions I use, it’s rather easy with a little trig. This tutorial walks you through it. Or if you’re using XNA, there’s a super-convenient BoundingFrustum class that can take care of it for you. My code for getting the positions looks something like this:

Matrix viewProjMatrix = viewMatrix * projMatrix;
BoundingFrustum frustum = new BoundingFrustum(viewProjMatrix);
frustum.GetCorners(frustumCornersWS);
Vector3.Transform(frustumCornersWS, ref viewMatrix, frustumCornersVS);
for (int i = 0; i < 4; i++)
    farFrustumCornersVS[i] = frustumCornersVS[i + 4];

The farFrustumCornersVS array is what I send to my vertex shader as shader constants. Then you just need to have an index in your quad vertices that tells you which vertex belongs to which corner (which you could also do with shader math, if you want). Another approach would be to simply store the corner positions directly in the vertices as texture coordinates.
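For the trig route, here’s a rough Python sketch of where the far-plane corners come from, and why multiplying the interpolated ray by linear depth gives the position back. It assumes a symmetric perspective frustum in a right-handed view space (camera looking down -Z) and depth stored as -z / far, as in the depth shaders above. The function names are hypothetical.

```python
import math

def far_corner_vs(fov_y, aspect, far, sx, sy):
    """View-space position on the far plane. sx and sy are -1 or +1 for
    the four corners (left/right, bottom/top); fractional values give the
    points in between, which is what interpolation produces."""
    half_h = far * math.tan(fov_y / 2.0)
    half_w = half_h * aspect
    return (sx * half_w, sy * half_h, -far)

def frustum_ray(fov_y, aspect, far, u, v):
    """Ray the pixel shader receives at texture coordinate (u, v),
    with v = 0 at the top of the screen."""
    sx = u * 2.0 - 1.0
    sy = (1.0 - v) * 2.0 - 1.0
    return far_corner_vs(fov_y, aspect, far, sx, sy)

def position_from_depth(depth, ray):
    """The single multiply from the pixel shader."""
    return tuple(depth * c for c in ray)
```

Because the ray always ends on the far plane, scaling it by -z / far lands exactly on the point that was rendered at that pixel.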

Extra Credit: this technique can also be used to reconstruct world-space position, if that’s what you’re after. All you need to do is *rotate* (not translate) your frustum corner positions by the inverse of your view matrix to get them back into world space. Then when you multiply the interpolated ray with your depth value, you simply add the camera position to the value (ends up being a single MADD).
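Here’s a minimal Python sketch of that world-space variant. It assumes the camera’s right, up, and back axes are available as world-space vectors (that is, the rotation part of the inverse view matrix) and that depth is stored as -z / far. The names are hypothetical.

```python
def rotate_to_world(v_vs, right, up, back):
    """Rotate (not translate) a view-space vector into world space.
    right/up/back are the camera axes expressed in world space."""
    x, y, z = v_vs
    return tuple(x * r + y * u + z * b for r, u, b in zip(right, up, back))

def ws_position_from_depth(depth, ray_ws, camera_pos):
    """camera_pos + depth * ray: the single MADD per component."""
    return tuple(c + depth * r for c, r in zip(camera_pos, ray_ws))
```

The translation has to stay out of the ray itself: the ray is a direction scaled by depth, so the camera position is added once at the end rather than being scaled along with it.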

Extra-Extra Credit: you can use this technique with arbitrary geometry too, not just quads. You just need to figure out a texture coordinate for each pixel, which you can do by either interpolating the clip-space position and dividing x and y by w, or by using the VPOS semantic. Then for your frustum ray you just calculate the eye->vertex vector and scale it so that it points all the way back to the far-clip plane.
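A rough Python sketch of those two pieces, under the same assumptions as before (right-handed view space, depth stored as -z / far, hypothetical function names). One caveat that is my assumption rather than something stated above: the scale factor far / -z is not linear across a triangle, so the sketch applies it to a single, already-interpolated view-space position, as you would per pixel.

```python
def ray_to_far_plane(pos_vs, far):
    """Scale the eye->point vector so that it ends on the far plane."""
    x, y, z = pos_vs
    scale = far / -z  # right-handed: visible points have z < 0
    return (x * scale, y * scale, z * scale)

def clip_to_texcoord(clip):
    """Clip-space position -> texture coordinate, with y = 0 at the top.
    The D3D9 half-pixel offset is left out of this sketch."""
    x, y, z, w = clip
    return ((x / w) * 0.5 + 0.5, 1.0 - ((y / w) * 0.5 + 0.5))
```

Any point of the light volume’s surface gives a usable ray, since the scene point seen through that pixel lies on the same line from the eye.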

UPDATE: Answers to extra credit questions here

UPDATE 2: More info here

*Closing the comments for now, because I keep getting spam comments*

Thank you for your beautiful article, it’s exactly what I needed!

Just 2 things:

1) I need to reconstruct the world space position… can you explain the process a bit more? What do you mean by “rotate the frustum corners by the matrix”?

2) For point lights I use a sphere mesh to approximate the light volume. How can I build the world position from that? I mean, using a full-screen quad, I can easily compute the current frustum corner… but with a sphere?

Again, thank you my friend!

I’m trying to do that but I have some problems. I described here in the GDNet forum using your technique, maybe you can help me.

http://www.gamedev.net/community/forums/topic.asp?topic_id=541689

Thanks for all !

Thank you for posting this all together! I can delete my 6 book marks to different parts of it now :)

Cheers,

Greg

Very interesting, thanks.

All that’s missing to make this perfect is an explanation of how to map a position in view space back into screen/texture space again.

My brain melts each time I try to do this and it’s stopping me from implementing a better SSAO algorithm than I have now.

Hi Paul. I replied in that thread on the XNA forums, but I’ll reply here too.

In my SSAO shader I just take the view-space position, transform it by the projection matrix, and then use this function to get a texture coordinate:

// Gets the screen-space texel coord from clip-space position
float2 CalcSSTexCoord (float4 vPositionCS)
{
    float2 vSSTexCoord = vPositionCS.xy / vPositionCS.w;
    vSSTexCoord = vSSTexCoord * 0.5f + 0.5f;
    vSSTexCoord.y = 1.0f - vSSTexCoord.y;
    vSSTexCoord += 0.5f / g_vTexDimensions;
    return vSSTexCoord;
}

g_vTexDimensions would be the dimensions of the texture you’re sampling. There might be a cheaper way of doing this, but projecting definitely works.

:) Thanks. My email’s been off over the weekend.

Sorry for doubting your article, but I think that the projective transform is not invertible in general – you lose information when you project. Having depth buffer values helps to restore some of the information, but since the depth map has a certain resolution, this approach will not work if the camera is far away from the projected object. Most of the object may have very similar depth values in the depth map, and so it will look flat after “unprojecting”. I think you should point this out, since it is misleading to say that projective transforms can be undone in general…

Right, if you store perspective z/w in most cases you’ll have inadequate precision in areas closer to the far clip plane. But it’s really a distribution of precision issue more than anything, due to the non-linear curve of z/w. My latest blog (https://mynameismjp.wordpress.com/2010/03/22/attack-of-the-depth-buffer/) demonstrates the sort of error you can expect for different depth formats…perhaps I’ll link it here so that people are aware of this issue. Thanks for bringing it up. :)

It looks like this technique of using the far bounding frustum coordinates will only work if you are using a perspective projection, is that right?

I have a parallel projection, and multiplying the depth by the frustum ray just doesn’t seem to make sense.

thank you for your post.

can i translate into korean and put on my blog?

Phil: indeed the technique is meant for a perspective projection. For an ortho projection it’s unnecessary since you can directly calculate view-space X and Y coordinates based on your clip-space XY and your projection parameters. You can do the same for view-space Z once you’ve sampled it from your depth buffer.
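For the orthographic case, a minimal Python sketch of that direct calculation might look like the following. It assumes a symmetric ortho volume whose view-space width and height you know, a right-handed view space, and depth stored as -z / far; the function name and parameters are hypothetical.

```python
def ortho_position_from_depth(u, v, depth, width, height, far):
    """View-space position under an orthographic projection.
    (u, v) is the texture coordinate, with v = 0 at the top."""
    x_ndc = u * 2.0 - 1.0
    y_ndc = (1.0 - v) * 2.0 - 1.0
    # X and Y don't depend on depth at all: no ray, no divide.
    return (x_ndc * width * 0.5, y_ndc * height * 0.5, -depth * far)
```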

ozael: absolutely, that’s no problem at all.

The way I wrote this yesterday, instead of using constants for the frustum corners, I just did mul(float4(x,y,1,1), mInvProj) per vertex to get far-plane positions in view-space. Then in pixel shader multiply with stored (floating point) view-z/farPlane.

At least visually it looks like it should and saves you from having to work out the frustum coordinates on CPU, so you could draw large number of screen-aligned triangles together, yet still do the unprojection work only once-per-vertex.

Hi MPJ,

I have spent nearly a week reading your thread on gamedev and trying several SSAO implementations.

I’m trying to make an SSAO prototype in RenderMonkey before integrating it into the engine that I’m working on. Currently my engine does not have developer-friendly shader management facilities, so I’m doing my best to make the effect as good as possible before integration.

I have read and understood all of the methods that you mentioned; however, the biggest piece is missing. I cannot calculate and feed view frustum corners to the vertex shader because of RenderMonkey. I’m trying to generate them by using the vertices of the screen-space quad mesh (which is a simple 3D rectangle with dimensions) and I cannot get a result. Can you write how to calculate view vectors by using only a screen-aligned quad in the vertex shader?

Here is my code which does not work;

G-Buffer Vertex Shader:

VS_OUTPUT vs_main(VS_INPUT Input)
{
    VS_OUTPUT Output;
    Output.Position = mul(Input.Position + float4(ModelPosition.xyz, 0.0f), ViewProjectionMatrix);
    float3 ObjectPosition = mul(Input.Position, ViewMatrix);
    Output._Position = ObjectPosition;
    Output.NormalDepth.xyz = mul(Input.Normal, ViewMatrix);
    Output.NormalDepth.w = ObjectPosition.z;
    return Output;
}

G-Buffer Pixel Shader:

PS_OUTPUT ps_main( PS_INPUT Input )
{
    PS_OUTPUT Output;
    Output.Position = float4(Input.Position, 1.0f);
    Output.NormalDepth.xyz = Input.NormalDepth.xyz;
    Output.NormalDepth.w = Input.NormalDepth.w / FarZ;
    return Output;
}

SSAO Vertex Shader;

VS_OUTPUT vs_main(float4 Position : POSITION, float2 Texcoord : TEXCOORD0)
{
    VS_OUTPUT Output;
    Output.Position = float4(Position.xy, 0.0f, 1.0f) + float4(-PixelSize.x, PixelSize.y, 0.0f, 0.0f);
    Output.ScreenTexcoord = Texcoord;
    Output.ViewVector.x = Position.x * tan(FOV / 2) * (ScreenSize.x / ScreenSize.y);
    Output.ViewVector.y = Position.y * tan(FOV / 2);
    Output.ViewVector.z = 1;
    return Output;
}

SSAO PixelShader:

float3 GetViewPosition(in float2 Texcoord, in float3 ViewVector)
{
    return ViewVector * tex2D(NormalDepthInput, Texcoord).w;
}

Thanks a lot.

@Orcun: checkout this thread, http://www.gamedev.net/community/forums/topic.asp?topic_id=506573

The method from Shader x5 book uses a technique that doesn’t require the frustum corners and the results are pretty nice.

@author, Thanks for taking the time to blog on this subject and providing the XNA example!

Hi, can the second method, rendering view-space z as depth, be used to store shadow depth?

I think it would partly solve the bias problem: since the distribution is linear, a constant bias should be enough to fix it. It seems too good to be true. Am I missing something?

If you’re manually writing out shadow depth to a render target, then you can use whatever depth metric you’d like. Linear Z/FarClip definitely works for that purpose. It’s only an issue if you only want to render to a depth buffer, in which case you don’t have a choice but to use z/w.

Thank you so much, I will run some tests :)

It works:

Thanks for opening my eyes, behold the linearity.

Me again, I figured out how to render to a depth buffer without using z/w. The pixel fragment should write zero as always, the vertex fragment change from this:

Out.position = mul(In.position, WorldViewProj);

to this:

float4 vpos = mul(In.position, WorldViewProj);

vpos.z = (vpos.z*vpos.w)/FarPlane;

Out.position = vpos;

It allows me to perform the depth comparison in hardware. I still don’t understand why everybody, even the SDK shadow map sample, uses z/w; there are even papers written with the purpose of fixing z-fighting in shadow maps.

Messing with your z value like that can (and will) screw up rasterization, early z-cull, and z compression (which is why people don’t do it).

One workable alternative is to use a floating point depth buffer, and flip the near and far planes of your projection (and also flip your depth test directions). When you do that, the non-linear distribution of precision in a floating point value *mostly* cancels out the non-linearity of z/w. That helps with precision, but not with the issue of applying a non-uniform bias.

Hi mpettineo, again. I just want to point out that the frustum corners method is valid only for the current pixel. It can’t be used directly to read neighbor samples because of the interpolation, but it can be modified by doing a manual lerp:

lerp(-corner.x,corner.x,uv.x);

lerp(-corner.y,corner.y,uv.y);

I spent the whole week trying to figure that out, because many people are getting confused by this.

Also, as you said, Z/FarPlane can be used only if it’s computed in the pixel program; interpolated from the vertex program, it falls into a nonlinear function with weird results. I don’t know why this page is still alive:

http://www.mvps.org/directx/articles/linear_z/linearz.htm

IT IS WRONG!, thanks for your time anyway, has been very helpful :)