Open Access System for Information Sharing


Thesis

Maintaining Spatiotemporal Coherence in Video Retargeting using Grid Optimization

Title
Maintaining Spatiotemporal Coherence in Video Retargeting using Grid Optimization
Authors
이호섭
Date Issued
2020
Publisher
Pohang University of Science and Technology (POSTECH)
Abstract
As videos are increasingly viewed on devices with different resolutions and aspect ratios, there is a growing need to reproduce video content on displays with various aspect ratios. Sharing video content through mobile devices has also become popular. Adapting video content to these display platforms therefore requires a technique that adjusts the size of each image to the aspect ratio and resolution of the output device. This technique is called video retargeting. Video retargeting can be used on devices such as personal computers, smartphones, tablets, televisions, and portable game consoles. Moreover, movies generally have different aspect ratios, so retargeting is an important tool for adapting them to various display platforms. Uniform scaling, cropping, and letterbox insertion are the standard approaches to this task. Uniform scaling resizes the image with the same scaling factor for all pixels to fit the desired aspect ratio. Cropping removes unnecessary surrounding regions from the image to fit the desired aspect ratio. Letterbox insertion adds black areas above and below the image to fit the desired aspect ratio. However, these standard methods cannot preserve the shape of important objects in the video sequence. In addition, letterbox insertion wastes display area. To overcome these limitations, various content-aware video retargeting methods have been proposed. Content-aware methods adjust the size of the image by deforming relatively less important regions. The key idea of content-aware video retargeting is to adjust the aspect ratio of the image while preserving the shape of the salient object and maintaining the temporal coherence of the video content.
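To make the three standard approaches concrete, the following minimal sketch computes only the geometry each one implies for a given source and target size. The function names and return conventions are assumptions chosen for illustration; they do not come from the thesis.

```python
def uniform_scale(src_w, src_h, dst_w, dst_h):
    """Stretch the frame to the target size; returns (x_scale, y_scale).

    Every pixel shares the same pair of factors, so objects are distorted
    whenever the two factors differ.
    """
    return dst_w / src_w, dst_h / src_h


def center_crop(src_w, src_h, dst_w, dst_h):
    """Largest centered window with the target aspect ratio: (x, y, w, h)."""
    target = dst_w / dst_h
    if src_w / src_h > target:
        # Source is wider than the target: trim left and right.
        w, h = round(src_h * target), src_h
    else:
        # Source is taller than the target: trim top and bottom.
        w, h = src_w, round(src_w / target)
    return (src_w - w) // 2, (src_h - h) // 2, w, h


def letterbox(src_w, src_h, dst_w, dst_h):
    """Fit the whole frame inside the target.

    Returns (scaled_w, scaled_h, bar_x, bar_y), where bar_x and bar_y are
    the black margins on each side horizontally and vertically.
    """
    s = min(dst_w / src_w, dst_h / src_h)
    w, h = round(src_w * s), round(src_h * s)
    return w, h, (dst_w - w) // 2, (dst_h - h) // 2


if __name__ == "__main__":
    # A 16:9 frame shown on a 4:3 display.
    print(uniform_scale(1920, 1080, 1440, 1080))
    print(center_crop(1920, 1080, 1440, 1080))
    print(letterbox(1920, 1080, 1440, 1080))
```

In this example uniform scaling squeezes the frame horizontally, cropping discards 240 columns on each side, and letterboxing leaves 135 unused rows above and below the picture, which illustrates the wasted display area mentioned above.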
Content-aware video retargeting methods generally consist of three steps: saliency map generation, grid map generation, and retargeted image generation. In the first step, the method analyzes the video content to detect the visually important region and generates a saliency map that roughly represents the region of the salient object. In the second step, it uses the saliency map to generate a grid map that represents the deformation of the original image. In the final step, it generates the retargeted image from the deformation information and the original image. Previous video retargeting methods mainly focus on either preserving the shape of the salient object or maintaining the temporal coherence of the video content. However, they have two major problems when applied to a wide range of video content. First, their results show that the shape of the salient object is not maintained consistently across consecutive frames. Second, they have difficulty maintaining the temporal coherence of static background regions across consecutive frames. Both problems arise because previous methods do not explicitly adjust the sizes of the grids that correspond to the salient objects and the static background regions according to the degree of spatiotemporal consistency. These two problems can significantly degrade the quality of the retargeted images. To solve them, this dissertation proposes a new video retargeting approach that preserves the shape of the salient object and maintains the temporal coherence of the static background regions in the video. The basic idea of the proposed method is to maintain the consistency of the content across consecutive frames by analyzing the degree of spatiotemporal consistency. This approach maintains the consistency of video content better than the previous methods.
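The generic three-step pipeline described above can be sketched on a tiny grayscale frame stored as a list of rows. Everything specific here is an assumption made for illustration: the gradient-based saliency proxy, the rule that column widths are proportional to saliency, the `eps` floor, and the nearest-column sampling. None of these choices is taken from the thesis or from any particular previous method.

```python
def column_saliency(frame):
    """Step 1 (assumed proxy): horizontal gradient magnitude averaged per column."""
    h, w = len(frame), len(frame[0])
    sal = [0.0] * w
    for row in frame:
        for x in range(w):
            left = row[max(x - 1, 0)]
            right = row[min(x + 1, w - 1)]
            sal[x] += abs(right - left) / h
    return sal


def grid_widths(sal, out_w, eps=0.1):
    """Step 2 (assumed rule): output column widths proportional to saliency + eps."""
    weights = [s + eps for s in sal]
    total = sum(weights)
    return [out_w * wt / total for wt in weights]


def render(frame, widths, out_w):
    """Step 3: fill each output pixel from the source column whose cell contains it."""
    # edges[i] is the output x-position of the left border of source column i.
    edges = [0.0]
    for wd in widths:
        edges.append(edges[-1] + wd)
    out = []
    for row in frame:
        new_row, col = [], 0
        for x in range(out_w):
            center = x + 0.5
            # Advance to the grid cell that covers this output pixel center.
            while col < len(widths) - 1 and edges[col + 1] <= center:
                col += 1
            new_row.append(row[col])  # nearest column, no blending
        out.append(new_row)
    return out


if __name__ == "__main__":
    # A bright object in the middle of a dark background, 6 columns wide.
    frame = [[0, 0, 9, 9, 0, 0],
             [0, 0, 9, 9, 0, 0]]
    sal = column_saliency(frame)
    widths = grid_widths(sal, out_w=4)
    print(widths)
    print(render(frame, widths, out_w=4))
```

With this toy input the flat border columns receive almost no width, so the reduction from six to four columns is absorbed by the background rather than by the object. A practical implementation would blend neighboring columns instead of copying the nearest one.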
The proposed method consists of the following three steps. First, it generates a saliency map that represents the rough area of the salient object in the given image by analyzing image characteristics. Second, it calculates, for each column of the image, spatial and temporal grid sizes that account for the spatiotemporal coherence of the video content. The spatial grid sizes are scaling factors obtained from the saliency values to preserve the shape of the salient object. The temporal grid sizes are scaling factors obtained from the grid positions of the previous frame to maintain the temporal coherence of the static background regions. The method formulates an objective function from the spatial and temporal grid sizes of each column and minimizes it to find the optimal grid sizes. Finally, it generates the retargeted image by performing image interpolation on the grids. Experimental results show that the proposed method outperforms the previous methods on the video datasets. Compared with the best results obtained by eight previous methods, the proposed method achieved improvements of ×1.19 in Bidirectional Similarity Measure, ×7.59 in Jittery Metric 1, and ×13.16 in Jittery Metric 2, and reduced the average computation time per pixel by a factor of 6.14. Furthermore, subjective evaluation through pairwise comparisons and average ranking scores showed that the preference rate and the average ranking score of the proposed method were 92.75% and 1.8785, respectively. From the experimental results, I conclude that the proposed method provides superior video retargeting quality with much lower computational complexity than the previous methods.
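The abstract does not state the objective function, so the following is only a hypothetical stand-in that shows how per-column spatial and temporal grid sizes could be combined in a single minimization. It assumes a quadratic objective, sum((w_i - s_i)^2 + lam * (w_i - t_i)^2), with the constraint that the column widths sum to the output width. The weight `lam`, the function name, and the use of the previous frame's widths as temporal targets are all assumptions.

```python
def optimize_widths(spatial, temporal, out_w, lam=1.0):
    """Minimize sum((w - s)^2 + lam * (w - t)^2) subject to sum(w) == out_w.

    Hypothetical objective for illustration only; it does not enforce
    positive widths, which a practical solver would have to handle.
    """
    n = len(spatial)
    # Unconstrained minimizer per column: weighted mean of the two targets.
    blend = [(s + lam * t) / (1.0 + lam) for s, t in zip(spatial, temporal)]
    # Every column has the same quadratic weight, so the sum constraint
    # (via a Lagrange multiplier) shifts all columns by the same amount.
    shift = (out_w - sum(blend)) / n
    return [b + shift for b in blend]


if __name__ == "__main__":
    spatial = [1.0, 3.0, 1.0]    # widths suggested by the current saliency
    temporal = [2.0, 2.0, 1.0]   # widths used in the previous frame (assumed)
    for lam in (0.0, 1.0, 100.0):
        print(lam, optimize_widths(spatial, temporal, out_w=5.0, lam=lam))
```

A larger `lam` pulls the result toward the previous frame's widths, which trades shape preservation for temporal stability. The thesis presumably balances these terms according to the measured degree of spatiotemporal consistency, but the abstract gives no formula for that.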
URI
http://postech.dcollection.net/common/orgView/200000292144
https://oasis.postech.ac.kr/handle/2014.oak/111784
Article Type
Thesis
Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
