Reference no: EM132393086
Task A - Motion Estimation and Visualization
Task A requires students to capture and then visualize the motion of the object in a given video clip (attached) (monkey_TaskA.mov).
The basic premise of motion capture is that, in most cases, consecutive video frames are similar except for the changes induced by objects moving within the frames. The idea is therefore to define a grid of block regions on two adjacent frames and then calculate the 2D displacement vector between matched blocks. The block-matching algorithm, step by step:
1. Define a grid block of size K x K, where K is preferably odd so that each grid block has a well-defined central coordinate. Each frame of size H x W in the provided video can then be divided into (H x W)/K^2 grid blocks in total.
2. For convenience, we use F_i and F_{i+1} to represent the i-th frame and the frame next to it.
3. For each grid block B_i at location (x, y) in frame F_i, we need to search for its matched grid block B'_{i+1} at location (x', y') in frame F_{i+1} (the next frame), namely the block with the minimum sum of squared differences (SSD) between B_i and B'_{i+1}. The SSD can be computed as
$$\mathrm{SSD}(B_i, B'_{i+1}) = \sqrt{\sum_{x}\sum_{y}\sum_{c}\bigl(B_i(x, y, c) - B'_{i+1}(x, y, c)\bigr)^{2}} \tag{1}$$
where x, y, c indicate the height index, width index and color-channel index, respectively.
The displacement from B_i to B'_{i+1} can be represented by the 2D vector (x' - x, y' - y). To speed up the search for the block in F_{i+1} that matches B_i in F_i, we can search only the blocks of F_{i+1} that lie within a certain radius of B_i, rather than all candidate grid blocks in F_{i+1}.
4. Represent the displacement vectors of frame F_i as a 3D matrix of size (H/K, W/K, 2).
5. Visualise the displacement vectors computed for Fi and place this visualization over the frame Fi. You need to draw arrows to represent the extracted displacement vectors.
6. Repeat steps 3-5 for all frames.
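The steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a reference solution: the function names `ssd` and `block_matching`, the default values of `k` and `radius`, and the choice to keep zero displacement when several candidates tie are all hypothetical. Frames are assumed to be NumPy arrays of shape (H, W, 3), and any rows or columns that do not fill a complete block are ignored.

```python
import numpy as np


def ssd(block_a, block_b):
    """Equation (1): root of the summed squared differences over x, y and c."""
    diff = block_a.astype(np.float64) - block_b.astype(np.float64)
    return float(np.sqrt(np.sum(diff ** 2)))


def block_matching(frame_a, frame_b, k=9, radius=1):
    """Return displacement vectors of shape (H // k, W // k, 2).

    Each vector is (x' - x, y' - y) in pixels, where x is the height (row)
    index and y the width (column) index. Only grid blocks at most `radius`
    blocks away from the source block are searched.
    """
    rows, cols = frame_a.shape[0] // k, frame_a.shape[1] // k
    vectors = np.zeros((rows, cols, 2), dtype=np.int64)
    for r in range(rows):
        for c in range(cols):
            src = frame_a[r * k:(r + 1) * k, c * k:(c + 1) * k]
            # Start from "no motion" so that ties keep a zero vector.
            best = (0, 0)
            best_cost = ssd(src, frame_b[r * k:(r + 1) * k, c * k:(c + 1) * k])
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue  # candidate block lies outside the frame
                    cand = frame_b[rr * k:(rr + 1) * k, cc * k:(cc + 1) * k]
                    cost = ssd(src, cand)
                    if cost < best_cost:
                        best_cost, best = cost, (dr * k, dc * k)
            vectors[r, c] = best
    return vectors
```

Calling `block_matching(F_i, F_next, k=9, radius=2)` on two consecutive frames would give the (H/K, W/K, 2) matrix of step 4. Because the search only visits grid-aligned candidates, displacements are multiples of K; a finer search over pixel offsets is a possible refinement.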
Hint: Prepare a supporting function arrowdraw() to draw these arrows in Python. Python is recommended for this task (you are welcome to use any other language you are familiar with). The submission for Task A should include all the files (source code and output video produced) used to perform motion estimation and visualization, together with a README file describing how to run these files to produce your output scene.
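One possible shape for the arrowdraw() helper is sketched below using NumPy only. The signature, the (row, col) point convention, and the parameters `head_len` and `head_angle` are assumptions made for illustration; a drawing routine from an imaging library could be used instead.

```python
import numpy as np


def draw_line(image, p0, p1, color):
    """Rasterise the segment between two (row, col) points by dense sampling."""
    n = int(max(abs(p1[0] - p0[0]), abs(p1[1] - p0[1]))) + 1
    rows = np.rint(np.linspace(p0[0], p1[0], n)).astype(int)
    cols = np.rint(np.linspace(p0[1], p1[1], n)).astype(int)
    # Drop samples that fall outside the image.
    keep = ((rows >= 0) & (rows < image.shape[0])
            & (cols >= 0) & (cols < image.shape[1]))
    image[rows[keep], cols[keep]] = color


def arrowdraw(image, start, end, color=(255, 255, 255),
              head_len=3.0, head_angle=0.5):
    """Draw an arrow from start to end, both (row, col), modifying image in place."""
    draw_line(image, start, end, color)
    d_row, d_col = end[0] - start[0], end[1] - start[1]
    if d_row == 0 and d_col == 0:
        return image  # zero displacement: a single dot, no arrow head
    # Angle of the direction pointing from the tip back towards the tail.
    back = np.arctan2(-d_row, -d_col)
    for side in (-head_angle, head_angle):
        barb_end = (end[0] + head_len * np.sin(back + side),
                    end[1] + head_len * np.cos(back + side))
        draw_line(image, end, barb_end, color)
    return image
```

For the motion field, each arrow would start at the centre of a grid block and end at that centre plus the block's displacement vector.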
Task B - Digital Video Processing
Task B requires students to replace the background and the marionette of the sample video provided (attached) (monkey_TaskB.mov).
Expected Outcomes: Replace the blue background with another dynamic background, which can be a programmed animation or a video you have found on the Internet. In the new video, render your own character to replace the moving monkey. Its behaviour should follow that of the monkey and reproduce the monkey's gestures as closely as possible. The replacement character should have at least five connected components, including a body, two arms and two legs.
There are various approaches to marionette replacement; the general steps are as follows:
1. The body of the monkey is labelled with five red markers, indicating its hands, feet and body.
2. Segment these red markers and the monkey first.
3. Use their spatial-temporal coordinates to track and represent the body motions.
4. It is a good practice to design a data structure to record the sequence of the captured body motions.
5. Assign these motion sequences captured as the spatial-temporal coordinates of your new character (and its parts).
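A minimal sketch of steps 2-4 is given below, using NumPy only. Everything named here is hypothetical: the colour thresholds in `red_mask` are guesses that would need tuning on the actual video, `MotionTrack` is just one possible data structure for the motion sequences, and the greedy nearest-neighbour matching in `update_tracks` is a simple stand-in for a proper tracker. Frames are assumed to be RGB arrays of shape (H, W, 3).

```python
from collections import deque
from dataclasses import dataclass, field

import numpy as np


def red_mask(frame, r_min=150, gb_max=90):
    """Boolean mask of strongly red pixels (threshold values are assumptions)."""
    r, g, b = (frame[..., i].astype(int) for i in range(3))
    return (r >= r_min) & (g <= gb_max) & (b <= gb_max)


def component_centroids(mask, min_pixels=4):
    """Centroids (row, col) of 4-connected regions, found by flood fill."""
    seen = np.zeros(mask.shape, dtype=bool)
    centroids = []
    for seed in zip(*np.nonzero(mask)):
        if seen[seed]:
            continue
        seen[seed] = True
        queue, pixels = deque([seed]), []
        while queue:
            r, c = queue.popleft()
            pixels.append((r, c))
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                        and mask[nr, nc] and not seen[nr, nc]):
                    seen[nr, nc] = True
                    queue.append((nr, nc))
        if len(pixels) >= min_pixels:  # ignore tiny specks
            centroids.append(tuple(np.mean(pixels, axis=0)))
    return centroids


@dataclass
class MotionTrack:
    """Positions of one marker over time as (frame_index, row, col) samples."""
    name: str
    samples: list = field(default_factory=list)

    def last_position(self):
        return self.samples[-1][1:]


def update_tracks(tracks, centroids, frame_index):
    """Append to each track the unclaimed centroid nearest its last position."""
    remaining = list(centroids)
    for track in tracks:
        if not remaining:
            break  # fewer detections than tracks in this frame
        last = np.array(track.last_position())
        nearest = min(remaining,
                      key=lambda p: np.sum((np.array(p) - last) ** 2))
        remaining.remove(nearest)
        track.samples.append((frame_index, *nearest))
```

The tracks would be initialised from the centroids of the first frame (one `MotionTrack` per marker) and then updated frame by frame; the recorded samples can afterwards drive the corresponding parts of the new character.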
Hint: Some morphological operations might be needed to enhance the segmentation of the red markers. Python is recommended for this task. The submission for Task B should include all the files (source code and output video produced) used to complete the background and marionette replacement, together with a README file describing how to run these files to produce your output scene.
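As an illustration of the morphological operations mentioned in the hint, the sketch below implements binary erosion, dilation, opening and closing with a 3 x 3 square structuring element in plain NumPy. The function names and the fixed 3 x 3 element are assumptions; pixels outside the image are treated as background, so closing can trim foreground that touches the image border. An image-processing library would normally provide equivalent routines.

```python
import numpy as np


def _shifted_views(mask):
    """Return the nine 3x3-neighbourhood shifts of a boolean mask."""
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    h, w = mask.shape
    return [padded[dr:dr + h, dc:dc + w]
            for dr in range(3) for dc in range(3)]


def erode(mask):
    """A pixel survives only if its whole 3x3 neighbourhood is foreground."""
    return np.logical_and.reduce(_shifted_views(mask))


def dilate(mask):
    """A pixel is set if any pixel in its 3x3 neighbourhood is foreground."""
    return np.logical_or.reduce(_shifted_views(mask))


def opening(mask):
    """Erosion then dilation: removes specks smaller than the element."""
    return dilate(erode(mask))


def closing(mask):
    """Dilation then erosion: fills small holes inside a region."""
    return erode(dilate(mask))
```

Applying `opening` to a thresholded marker mask removes isolated noisy pixels, while `closing` fills small gaps inside a marker before its centroid is computed.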
Task C - 3D Animation Scene
Task C requires students to program an interactive 3D animation scene with 3D rendering techniques applied. Your 3D animation scene should include the following scenarios:
- When the mouse is clicked at (x, y) on the screen, shoot a 3D ball with a random texture. The texture should be loaded at random from your texture pool, which should contain no more than 10 images. Feel free to use your preferred images from the Internet to build your own customized texture pool. New balls can be shot into the scene while the previous balls are still travelling.
- The space is constrained by 6 walls (left, right, ceiling, floor, the wall far ahead facing the screen and the wall where the camera is placed).
- Any ball shot by clicking the mouse flies away from the screen along the Z axis (the direction moving away from the camera), with a random direction in the XY plane. This means that the balls do not go straight ahead, but are instead shot away in a random direction.
- When a ball touches any of the walls, it bounces back, and its new direction is computed from its previous direction.
- The effects caused by gravity and friction should be modeled appropriately through the entire process.
- The energy of the balls (reflected in their speed and bounce height) should decay with the travelling time as well as with the number of bounces, so that the balls eventually come to rest on the floor.
- Your program should be able to resolve collisions between moving balls. You do not have to follow the exact physics equations of elastic collision and the principle of momentum conservation, but the collisions should be modeled smoothly.
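Ball-to-ball collisions can be handled in many ways. The sketch below shows one simple option, written in Python for illustration even though Processing is recommended for the task: overlapping balls are pushed apart, and their velocity components along the contact normal are exchanged with some loss. The function name `resolve_collision`, the equal-mass and equal-radius assumptions, and the `restitution` parameter are all hypothetical.

```python
import numpy as np


def resolve_collision(pos_a, vel_a, pos_b, vel_b, radius, restitution=0.9):
    """Return updated (pos_a, vel_a, pos_b, vel_b) for two equal-mass balls."""
    delta = pos_b - pos_a
    dist = float(np.linalg.norm(delta))
    if dist >= 2 * radius or dist == 0.0:
        # Not touching (or exactly coincident, where no normal is defined).
        return pos_a, vel_a, pos_b, vel_b
    normal = delta / dist  # unit vector from ball a to ball b
    # Push the balls apart so that they no longer overlap.
    overlap = 2 * radius - dist
    pos_a = pos_a - normal * overlap / 2
    pos_b = pos_b + normal * overlap / 2
    # Relative speed along the contact normal; positive means approaching.
    approach = float(np.dot(vel_a - vel_b, normal))
    if approach > 0:
        impulse = (1 + restitution) * approach / 2
        vel_a = vel_a - impulse * normal
        vel_b = vel_b + impulse * normal
    return pos_a, vel_a, pos_b, vel_b
```

With `restitution=1.0` two equal balls in a head-on impact simply swap their normal velocities; values below 1 remove some energy at each impact, which fits the requirement that the balls eventually settle.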
Hint: It may be easier to use object-oriented programming and model the ball as a class, defining its fields and methods carefully. Processing is recommended for this task. The submission for Task C should include all the files (source code and texture images) used to generate the 3D scene, together with a README file describing how to run these files to produce your output scene.
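To illustrate the object-oriented hint, here is a rough sketch of a ball class, written in Python rather than Processing; the same structure carries over to a Processing class. All names and constants are assumptions: `Ball`, `step`, `shoot`, the gravity vector (with +y taken as up), the `restitution` and `drag` factors, and the choice of -z as the direction away from the camera. Rendering and texture loading are left out.

```python
import random
from dataclasses import dataclass

import numpy as np

GRAVITY = np.array([0.0, -9.8, 0.0])  # assumed: +y is up, arbitrary units


@dataclass(eq=False)
class Ball:
    position: np.ndarray
    velocity: np.ndarray
    radius: float = 0.5
    texture_id: int = 0
    bounces: int = 0

    def step(self, dt, box_min, box_max, restitution=0.8, drag=0.1):
        """Advance one time step, then bounce off any wall that was crossed."""
        self.velocity = self.velocity + GRAVITY * dt
        self.velocity = self.velocity * (1.0 - drag * dt)  # crude friction
        self.position = self.position + self.velocity * dt
        for axis in range(3):
            low = box_min[axis] + self.radius
            high = box_max[axis] - self.radius
            if self.position[axis] < low:
                self.position[axis] = low
                # Reflect the component and lose some speed at each bounce.
                self.velocity[axis] = abs(self.velocity[axis]) * restitution
                self.bounces += 1
            elif self.position[axis] > high:
                self.position[axis] = high
                self.velocity[axis] = -abs(self.velocity[axis]) * restitution
                self.bounces += 1


def shoot(click_x, click_y, speed=10.0, spread=0.5, n_textures=10, rng=random):
    """Create a ball at the click position, heading into the scene (-z)."""
    direction = np.array([rng.uniform(-spread, spread),
                          rng.uniform(-spread, spread),
                          -1.0])
    direction /= np.linalg.norm(direction)
    return Ball(position=np.array([click_x, click_y, 0.0]),
                velocity=direction * speed,
                texture_id=rng.randrange(n_textures))
```

A main loop would call `shoot` on each mouse click, call `step` on every ball once per frame, and then resolve ball-to-ball collisions before drawing. Note that in this sketch a ball resting on the floor keeps registering small bounces; a speed threshold below which the ball is put to rest would be a natural refinement.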
Note - Need help with Task A and Task C.
Technical Report - The report should record the details of these tasks completed. It should be 5 - 10 pages, maximum 15 pages (single column, 1.5 line spacing, Word or PDF). Tables and figures can help you present your ideas clearly. For each task above, there will be a different topic to be focused on and be demonstrated in this technical report:
- The topic for Task A is the effort made to increase the accuracy and efficiency of your block-matching implementation. For example, the hyper-parameter K can be treated as a trade-off between accuracy and efficiency. This part of the report could be presented as a set of experiments over such implementation details, with the results serving as guidance toward the final implementation of the block-matching algorithm.
- The topic for Task C is the final effects achieved. For this, you can simply use screenshots with well-written captions to explain how your program behaves in different cases (specified by the seven bullet points in the Task C section), such as when a ball collides with other balls or touches the walls.
Attachment:- Assignment File.rar