Tuesday, March 21, 2006

2D to 3D and back...

So the last few posts have mostly been about DirectX, since that's what I've been playing around with. I've now created an interface for importing a mesh (3D object/model) and setting or changing its rotation, position and scale, while also being able to change some camera parameters. The app also saves a JPG image which can be laid on top of the video background to combine the two.

This is all great and basically what I need from Managed DirectX, so I won't revisit that for a few days. What I will do for a while now is research how to find the right angle and position of the 3D object with regard to the background movement and the position of tracked objects in the background video. The translation between the image reference frame and the 3D world reference frame is usually done using Camera Parameters. These parameters can be divided into:
  • Extrinsic parameters such as rotation and translation (position), which give us a matrix for transforming between the world and camera reference frames.
  • Intrinsic parameters like focal length and the principal point (also the skew/distortion, which is often 0 in modern cameras). These parameters are part of a matrix which helps us transform between the camera and image reference frames.
For example, the image point (x,y,1) would be obtained from the world coordinate (X,Y,Z,1) by doing the calculation M(int)M(ext)(X,Y,Z,1) and then dividing the result by its last component (the notation is not entirely correct, considering there's no mathematical notation in Blogger). I will dive deeper into each of the parameters later. For now, it's enough to say that the Camera Parameters can hopefully be retrieved through Camera Calibration.
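
To make the calculation above a bit more concrete, here is a minimal sketch in plain Python (not the C#/MDX code of the app). Everything specific in it is an assumption made up for illustration: the focal length, the principal point, zero skew, an identity rotation and a translation of 5 units along the camera's z-axis. Real values would come from Camera Calibration.

```python
# Sketch of the pinhole projection x ~ M_int * M_ext * (X, Y, Z, 1).
# All numeric values below are hypothetical, chosen only for illustration.

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def project(m_int, m_ext, world_point):
    """Project a 3D world point to 2D image coordinates."""
    X, Y, Z = world_point
    homogeneous = [[X], [Y], [Z], [1.0]]      # 4x1 column vector
    camera = mat_mul(m_ext, homogeneous)      # world frame -> camera frame (3x1)
    image = mat_mul(m_int, camera)            # camera frame -> image frame (3x1)
    w = image[2][0]                           # scale factor of the homogeneous point
    return (image[0][0] / w, image[1][0] / w)

# Intrinsic matrix: focal length f, principal point (cx, cy), zero skew (assumed).
f, cx, cy = 800.0, 320.0, 240.0
M_INT = [[f, 0.0, cx],
         [0.0, f, cy],
         [0.0, 0.0, 1.0]]

# Extrinsic matrix [R | t]: identity rotation, translation (0, 0, 5) (assumed).
M_EXT = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 5.0]]

x, y = project(M_INT, M_EXT, (1.0, 0.5, 0.0))
print(x, y)
```

With these made-up matrices, the world origin lands on the principal point, since it sits straight in front of the camera.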

Wednesday, March 08, 2006

Finished "Hello Triangle"

Most languages you learn usually have something called "Hello World" as a first tutorial - most often simply printing "Hello World!" on the screen. This could be in a message box, a browser or a console window, but whichever it is, they are all very simple and short - often around 5 lines of code or less. I just finished my first "Hello Triangle", which could be seen as the Computer Graphics programming language equivalent of "Hello World".

As can be seen in the image, it's a very simple render of a triangle whose three corners have different colours which blend into each other towards the middle.
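
The blending between the corner colours can be thought of as barycentric interpolation: each point inside the triangle gets a weighted mix of the three corner colours, with the weights summing to 1. The sketch below is plain Python rather than MDX code, and the corner positions and colours are made-up values, not the ones from my triangle.

```python
# Sketch of colour blending across a triangle using barycentric weights.
# Corner positions and colours are hypothetical example values.

def barycentric_weights(p, a, b, c):
    """Return the weights of point p with respect to triangle corners a, b, c."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb

def blend_colour(p, corners, colours):
    """Mix the three corner colours according to the weights of p."""
    weights = barycentric_weights(p, *corners)
    return tuple(sum(w * colour[i] for w, colour in zip(weights, colours))
                 for i in range(3))

CORNERS = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
COLOURS = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]   # red, green, blue

centre = blend_colour((4.0 / 3.0, 4.0 / 3.0), CORNERS, COLOURS)
print(centre)
```

At a corner the result is exactly that corner's colour, and at the centre all three contribute equally.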

Compared to what I know from OpenGL with C++, and what it took to make the first triangle there, this seemed more straightforward - using Managed DirectX with C#. Most commands do what their names say, and most of the time the things you need make sense - but it still takes some coding before actually getting a result. Compared to the 5 lines or less of "Hello World", this version of "Hello Triangle" took almost 100 lines... and it doesn't even move yet ;-)

The things you have to do, in short, are to initialize the graphics by setting up a device (graphics card) and some present parameters (for example windowed mode or full screen), and to create a buffer of vertices (the points in space/on screen which bind your shape together). Once this has been rendered it has to stay "alive", so you loop the application until it's shut down. After shutting down, it's always a good idea to dispose of the graphics resources in memory ;-)

Monday, March 06, 2006

Managed DirectX Tutorials

Since I'm just starting with MDX, I had to take some beginner tutorials on the subject. Most of those tutorials require some prior knowledge of C#, but with that in mind they are mostly well made and easy to follow. My three favorite beginner MDX tutorials are:
I believe all three work with the February DirectX SDK, even though I haven't completed all of them. While these take you part of the way, I believe getting a book is necessary sooner or later - but more on that another time.

Regarding my thesis project, it has been moving forward slowly. I now have a simple MDX application to build on. I will continue by finding out how to do image rectification before texturing the ground plane. This will be followed by inserting 3D models of houses on top of the ground plane, hopefully giving a realistic look. Of course, the models have to be scaled and rotated according to coordinates on the ground plane - but that shouldn't be too hard once the images/frames from the video feed have been rectified.

Obviously, image rectification doesn't only make it possible to insert 3D models, but also to actually create a 3D model of the ground plane itself, if enough information can be extracted from the original video.
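
As a rough idea of what rectifying a ground plane involves: the mapping between a flat plane seen in an image and its rectified, top-down version is commonly modelled as a 3x3 homography. The sketch below (plain Python, not part of my MDX app) only shows how such a matrix would be applied to a single pixel. The matrix values are invented for illustration; in practice the homography would have to be estimated, for example from four or more known point correspondences.

```python
# Sketch: applying a 3x3 homography to map an image point to rectified
# ground-plane coordinates. The matrix values are hypothetical.

def apply_homography(h, point):
    """Map a 2D point through homography h (a list of three rows)."""
    x, y = point
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (u / w, v / w)   # divide by the homogeneous scale factor

# Made-up homography with a small perspective term in the last row.
H = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.001, 1.0]]

rectified = apply_homography(H, (100.0, 200.0))
print(rectified)
```

Points on the row y = 0 are left where they are by this particular matrix, while points further down the image are pulled in more and more, which is the kind of perspective effect rectification is meant to undo.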