When inserting an object into a background - or simply onto some kind of ground plane seen in perspective - we of course need to know the normal of this plane to find out "which way is up?". This part is simple: the normal can be found either from the image of the absolute conic together with the horizon (often referred to as the vanishing line), or by finding the vanishing points, that is, the points where the projections of parallel lines on the plane intersect, for several directions. This is something we did think of in the project, with varied success in aligning object and image normals. What we didn't consider in the project was that there's not only an "up direction" for an object; there's of course also the direction in which the object faces/looks. To be able to rotate the object together with the plane it stands on, we can use a marker on the plane - such as an arrow - to find out how to rotate the object which has been inserted on top.
Of course, more - and more thorough - explanations of this can be found in the literature, such as the books we have used most frequently:
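To make the idea above a bit more concrete, here is a minimal sketch in Python with NumPy. It is not the code from our project. It assumes that the camera calibration matrix `K` is known, that the parallel lines and the arrow marker have already been found in the image, and that all points are given in pixel coordinates. The function names are made up for this illustration. The normal is taken as n ~ K^T l, where l is the vanishing line, and the facing direction is back-projected from the vanishing point of the arrow.

```python
import numpy as np


def line_through(p, q):
    """Homogeneous image line through two (x, y) image points."""
    return np.cross([p[0], p[1], 1.0], [q[0], q[1], 1.0])


def vanishing_point(seg_a, seg_b):
    """Homogeneous intersection of two image lines that are parallel in the scene.

    Each segment is a pair of (x, y) image points.
    """
    return np.cross(line_through(*seg_a), line_through(*seg_b))


def plane_normal(K, vp1, vp2):
    """Return (unit normal in the camera frame, vanishing line) of a plane.

    vp1 and vp2 are homogeneous vanishing points of two different directions
    lying in the plane. The sign of the normal is ambiguous.
    """
    horizon = np.cross(vp1, vp2)      # vanishing line through both points
    n = K.T @ horizon                 # n ~ K^T l
    return n / np.linalg.norm(n), horizon


def facing_direction(K, horizon, tail, head):
    """Unit 3D direction (camera frame) of an arrow drawn on the plane.

    tail and head are the (x, y) image points of the arrow. Assumes the
    arrow lies in the plane whose vanishing line is `horizon`.
    """
    tail = np.asarray(tail, dtype=float)
    head = np.asarray(head, dtype=float)
    # Vanishing point of the arrow: its image line meets the horizon there.
    vp = np.cross(line_through(tail, head), horizon)
    d = np.linalg.solve(K, vp)        # back-project: d ~ K^-1 vp
    # Image motion of the tail when moving along +d; flip d if that motion
    # points from head to tail instead of from tail to head.
    flow = vp[:2] - vp[2] * tail
    if flow @ (head - tail) < 0:
        d = -d
    return d / np.linalg.norm(d)
```

With the normal as "up" and the arrow direction as "forward", the third axis of the object's frame is simply their cross product. Note that the sign of the normal still has to be picked, for example so that it points towards the camera side of the plane.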
- Introductory Techniques for 3-D Computer Vision (Trucco, Verri)
- Multiple View Geometry in Computer Vision (Hartley, Zisserman)
The direction of the inserted object becomes especially interesting if we have, for example, a live video feed, where any object in the scene - the background - can be rotated at any given moment. This opens up another possible approach to the thesis project. At this point we are considering the following thesis projects:
- A board game played by people in different places, using simple bricks with letters - or other clear markers - as playing pieces. The game would be played using a camera which sends a live video feed from one player to another. This feed would be accompanied by CG models, which would replace the bricks on the board, to make it look more interesting and fun. For example, one player could put up a brick with an 'M', which the software interprets as a monster brick, thereby displaying an animated monster on the screen, which will move and rotate in the same way the board does relative to the camera. The other player (or perhaps the computer AI) could then find a reason to put up the 'W' brick, which suddenly grows a wall on the screen, in the position of the brick...
- The other approach is basically what I've outlined as future improvements of the previous project. This includes image rectification and putting the rectified images as textures on a 3D model derived from information in photos. Interaction in this version would be to combine several images into, for example, one big ground, ending at a castle in one direction and at the courtyard entrance in another. There would of course be a few challenges, which I'll report on in more detail if this approach is chosen. Either way, in the end this approach would give a virtual world in which people could "walk around", talk (either by text or microphone) and perhaps even find out curious facts about the surroundings - which could of course have commercial value for museums, city planners, architects and so on. There's no end to the possibilities!