Open Access System for Information Sharing

Thesis

Mixed Temporal Kernel Depthwise-Separable Convolution Network for Action Recognition

Abstract: This thesis proposes a mixed temporal kernel depthwise-separable convolution network that extracts important temporal information for action recognition. First, people recognize others' behavior by considering both the target person and the surrounding context. To reflect this, the I3D tail module extracts a rich feature map at the target person's location. Second, unlike image data, video data carries information along the temporal axis, but frames differ in the amount and quality of temporal information they contain. When we watch a video, we automatically select the frames we need. To imitate this behavior, we propose the mixed temporal kernel depthwise-separable convolution module, which selectively extracts temporal information. Overall, extracting a rich feature map of the target person and extracting effective temporal information both play an important role in action recognition. We train and test on the Atomic Visual Actions (AVA) dataset, which we consider the closest to real-world conditions because it comes from YouTube and has multiple action labels. We achieve 23.86 mAP on the AVA dataset, with performance improvements in 47 of the 60 categories.
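
The abstract does not give the internals of the mixed temporal kernel depthwise-separable module, so the following is only a minimal sketch of the general idea, under stated assumptions: channels are split into groups, each group is filtered along the time axis with its own kernel size (depthwise step), and a 1x1 pointwise step then mixes channels. The function names, the channel grouping rule, and the fixed averaging weights are all hypothetical and chosen for illustration; they are not taken from the thesis, where the weights would be learned.

```python
from typing import List, Sequence

Feature = List[List[float]]  # layout: [channels][time]


def depthwise_temporal_conv(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Filter one channel along time, zero-padded so the length is preserved."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [
        sum(w * padded[t + i] for i, w in enumerate(kernel))
        for t in range(len(x))
    ]


def mixed_temporal_depthwise(x: Feature, kernel_sizes: Sequence[int]) -> Feature:
    """Apply a different temporal kernel size to each group of channels.

    Assumption: channels are split into contiguous, roughly equal groups,
    and each kernel is a plain averaging filter (stand-in for learned weights).
    """
    groups = len(kernel_sizes)
    out = []
    for c, channel in enumerate(x):
        k = kernel_sizes[c * groups // len(x)]  # kernel size of this channel's group
        out.append(depthwise_temporal_conv(channel, [1.0 / k] * k))
    return out


def pointwise(x: Feature, weights: List[List[float]]) -> Feature:
    """1x1 convolution: mix channels independently at every time step."""
    steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(steps)]
        for row in weights
    ]


# Two channels over four frames: channel 0 uses kernel size 1, channel 1 uses 3.
features = [[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]]
mixed = mixed_temporal_depthwise(features, kernel_sizes=(1, 3))
fused = pointwise(mixed, weights=[[1.0, 1.0]])
```

In this sketch the short kernel leaves a channel's per-frame values unchanged while the longer kernel averages over neighbouring frames, so the pointwise step can combine short-range and longer-range temporal responses.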

URI: http://postech.dcollection.net/common/orgView/200000333782
https://oasis.postech.ac.kr/handle/2014.oak/111099
