An Efficient Human Instance-Guided Framework for Video Action Recognition

DC Field: Value
dc.date.accessioned: 2023-02-10T00:48:51Z
dc.date.available: 2023-02-10T00:48:51Z
dc.date.issued: 2021-12
dc.identifier.uri: https://ir.ymlib.yonsei.ac.kr/handle/22282913/192386
dc.description.abstract: In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have attempted to use two-stream networks with appearance and motion features, but most of these approaches focus on clip-level video action recognition. In contrast to traditional methods, which generally use entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent instance-level features using human boxes and keypoints, and these action region features are used as inputs to the temporal action head network, which makes the framework more discriminative. We also propose novel temporal action head networks consisting of various modules that reflect diverse temporal dynamics. In experiments, the proposed models achieve performance comparable to state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify their effectiveness. Finally, we analyze the confusion matrix and visualize the recognized actions at the human instance level when several people are present. (A minimal sketch of this instance-guided pipeline follows the metadata record below.)
dc.description.statementOfResponsibility: open
dc.language: English
dc.publisher: MDPI
dc.relation.isPartOf: SENSORS
dc.rights: CC BY-NC-ND 2.0 KR
dc.subject.MESH: Human Activities*
dc.subject.MESH: Humans
dc.subject.MESH: Motion
dc.subject.MESH: Neural Networks, Computer*
dc.subject.MESH: Recognition, Psychology
dc.subject.MESH: Vision, Ocular
dc.title: An Efficient Human Instance-Guided Framework for Video Action Recognition
dc.type: Article
dc.contributor.college: College of Medicine (의과대학)
dc.contributor.department: Dept. of Radiology (영상의학교실)
dc.contributor.googleauthor: Inwoong Lee
dc.contributor.googleauthor: Doyoung Kim
dc.contributor.googleauthor: Dongyoon Wee
dc.contributor.googleauthor: Sanghoon Lee
dc.identifier.doi: 10.3390/s21248309
dc.relation.journalcode: J03219
dc.identifier.eissn: 1424-8220
dc.identifier.pmid: 34960404
dc.subject.keyword: convolutional neural network
dc.subject.keyword: human action recognition
dc.subject.keyword: human detection
dc.subject.keyword: multiple human tracking
dc.subject.keyword: temporal sequence analysis
dc.citation.volume: 21
dc.citation.number: 24
dc.citation.startPage: 8309
dc.identifier.bibliographicCitation: SENSORS, Vol.21(24): 8309, 2021-12
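
The abstract describes per-instance action recognition: features are pooled inside each tracked person's box in every frame, and the resulting per-instance sequence is classified by a temporal action head. Below is a minimal sketch of that idea, not the authors' implementation; the temporal-convolution head, the feature shapes, all module and function names, and the assumption that box row i tracks the same person across frames are illustrative choices (Python, PyTorch/torchvision).

import torch
import torch.nn as nn
from torchvision.ops import roi_align

class TemporalActionHead(nn.Module):
    # Classifies a (T, C) feature sequence for a single human instance.
    def __init__(self, in_dim=256, num_classes=60):
        super().__init__()
        # A temporal convolution models short-range dynamics across frames
        # (an illustrative stand-in for the paper's head modules).
        self.temporal = nn.Conv1d(in_dim, in_dim, kernel_size=3, padding=1)
        self.classifier = nn.Linear(in_dim, num_classes)

    def forward(self, seq):
        # seq: (N, T, C) -> (N, C, T), since Conv1d expects channels first.
        x = self.temporal(seq.transpose(1, 2)).relu()
        x = x.mean(dim=2)             # average-pool over time
        return self.classifier(x)     # (N, num_classes) action logits

def instance_sequences(frame_feats, boxes_per_frame, out_size=7):
    # frame_feats: (T, C, H, W) backbone features for T frames.
    # boxes_per_frame: T tensors of shape (N, 4) in feature-map coordinates;
    # row i is assumed to be the same tracked person in every frame.
    steps = []
    for t, boxes in enumerate(boxes_per_frame):
        pooled = roi_align(frame_feats[t:t + 1], [boxes], output_size=out_size)
        steps.append(pooled.mean(dim=(2, 3)))   # (N, C) per frame
    return torch.stack(steps, dim=1)            # (N, T, C)

# Toy usage: two tracked people across eight frames of 256-channel features.
T, C, H, W = 8, 256, 28, 28
feats = torch.randn(T, C, H, W)
boxes = [torch.tensor([[2.0, 2.0, 12.0, 20.0],
                       [14.0, 3.0, 26.0, 24.0]]) for _ in range(T)]
head = TemporalActionHead(in_dim=C, num_classes=60)
logits = head(instance_sequences(feats, boxes))  # shape: (2, 60)

Average-pooling over time keeps this head deliberately cheap; the abstract's head networks combine several modules to capture richer temporal dynamics, which the sketch does not reproduce, and keypoint features would be pooled and concatenated analogously to the box features shown here.
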
Appears in Collections:
1. College of Medicine (의과대학) > Dept. of Radiology (영상의학교실) > 1. Journal Papers

