Prediction of upper limb function from simple activity of daily living using deep learning in patients with stroke

DC Field	Value	Language
dc.contributor.author	심다인	-
dc.date.accessioned	2025-04-18T05:05:24Z	-
dc.date.available	2025-04-18T05:05:24Z	-
dc.date.issued	2024-02	-
dc.identifier.uri	https://ir.ymlib.yonsei.ac.kr/handle/22282913/204907	-
dc.description.abstract	Upper limb function in patients with stroke is commonly assessed with clinical measures such as the Fugl-Meyer Assessment-Upper Extremity (FMA-UE) and the Box and Block Test (BBT), which are time-consuming and require trained clinicians. Three-dimensional (3D) computerized motion analysis is one alternative, but it is also time-consuming and requires expensive equipment. We therefore asked whether upper limb function could be predicted from a simple activity of daily living using deep learning. The aim of this study was to predict upper limb function in patients with stroke by applying deep learning to short two-dimensional (2D) videos of patients performing a simple activity of daily living. To achieve this, we first developed models to predict metrics representing upper limb function from 3D motion capture data of patients with stroke. We then developed similar models to predict the same metrics from keypoints extracted from 2D video data of patients with stroke. From 265 patients with stroke assessed between 2014 and 2023, we collected the FMA-UE score, the BBT score, and, from 3D motion capture, the temporospatial parameters Movement Time (MT), Index of Curvature (IC), and Number of Movement Units (NMU) together with the Arm Profile Score (APS). In addition, 2D video data recorded during the Reach & Grasp Cycle were collected from 103 patients with stroke between 2021 and 2023. Two versions of input data were used to train the deep learning models. First, we used 3D coordinate data to construct the 3D motion capture dataset for predicting metrics representing upper limb function. During 3D motion capture, we obtained 30 coordinate series per trial, consisting of the X, Y, and Z coordinates of 10 reflective markers: Trunk (4), Shoulder, Elbow, Wrist (2), Finger (2). Second, we used 330 video clips to construct the 2D video dataset for predicting the same metrics. 2D keypoints were extracted by pose estimation with RTMPose. 
We obtained 14 coordinate series per video, consisting of the X and Y coordinates of 7 upper limb keypoints: Trunk, Shoulder, Elbow, Wrist (2), Finger (2). A Convolutional Neural Network (CNN) and a Temporal Convolutional Network (TCN) were used to classify FMA-UE and BBT scores into 3 groups by severity of upper limb dysfunction and to estimate the temporospatial parameters and APS. The input data were divided into a training set (60%), a validation set (20%), and a test set (20%). The CNN performed better than the TCN for all predictions, regardless of whether 3D or 2D data were used. With 3D data, the CNN achieved accuracy, precision, recall, and F1-score above 90 for FMA-UE classification (91.13, 90.27, 90.35, and 90.31, respectively) and above 72 for BBT classification (79.03, 72.54, 73.96, and 73.24, respectively). The predicted MT, IC, NMU, and APS showed moderate to strong correlations with the true values (r=0.544, 0.755, 0.601, and 0.783, respectively). With 2D data, the CNN performed similarly, with every metric above 85 for FMA-UE classification (89.23, 88.39, 85.97, and 87.16, respectively) and above 73 for BBT classification (76.92, 73.79, 75.51, and 74.64, respectively). The predicted MT, IC, NMU, and APS again showed moderate to strong correlations with the true values (r=0.528, 0.703, 0.625, and 0.569, respectively). Deep learning gave highly promising results in predicting the upper limb function of patients with stroke from only a single 2D video recorded during a simple activity of daily living. Upper limb dysfunction could be classified by severity according to FMA-UE and BBT, and the predicted temporospatial parameters and APS showed moderate to strong correlations with the true values. Korean-language abstract: The most widely used methods for measuring upper limb function after stroke are clinical measures such as the Fugl-Meyer Assessment-Upper Extremity (FMA-UE) and the Box and Block Test (BBT). However, clinical measurement requires a skilled clinician and takes considerable time. An objective method for measuring upper limb function in patients with stroke is 3D upper limb motion analysis. 
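However, this method is limited by its need for expensive equipment and a large space in which the test can be performed. To address the limitations of these existing methods, we sought to predict upper limb function from a simple everyday movement using deep learning. The ultimate aim of this study was therefore to predict upper limb function with deep learning from 2D videos of patients with stroke. We pursued this in two steps. First, we used motion capture data from 3D upper limb motion analysis in patients with stroke to predict metrics representing upper limb function with deep learning, thereby testing the feasibility of the ultimate aim. Finally, we predicted upper limb function from the coordinates of 2D keypoints extracted from 2D videos of patients with stroke by a pose estimation algorithm. In this retrospective study, we collected FMA-UE scores, BBT scores, temporospatial parameters including Movement Time (MT), Index of Curvature (IC), and Number of Movement Units (NMU), and the Arm Profile Score (APS) from 265 patients with stroke who visited Sinchon Severance Rehabilitation Hospital between 2014 and 2023. We also collected 2D video data recorded during the Reach & Grasp Cycle from 105 patients with stroke between 2021 and 2023. Using the collected data, we developed deep learning models with two versions of input data. First, we used 3D coordinate data to construct a 3D motion capture dataset for predicting metrics representing upper limb function. During 3D motion capture, we obtained 30 coordinate series in total, consisting of the X, Y, and Z coordinates of 10 reflective markers on the trunk (4), shoulder, elbow, wrist (2), and finger (2). Second, we used 330 videos to construct a 2D video dataset for predicting upper limb function. 2D keypoints were extracted by pose estimation with Real-Time Models for pose estimation (RTMPose). From each 2D video, we obtained 14 coordinate series in total, consisting of the X and Y coordinates of 7 upper limb keypoints covering the trunk, shoulder, elbow, wrist (2), and finger (2). With each type of input data, a Convolutional Neural Network (CNN) and a Temporal Convolutional Network (TCN) were used to classify FMA-UE and BBT into 3 groups by severity of upper limb dysfunction and to estimate the temporospatial parameters and APS. Each dataset was split into 80% for training and validation and a separate 20% for model testing. Across all outcomes, the CNN outperformed the TCN. With the 3D motion capture data, the CNN achieved accuracy, precision, recall, and F1-score all above 90 for FMA-UE classification (91.13, 90.27, 90.35, and 90.31, respectively). For BBT classification, accuracy, precision, recall, and F1-score were all above 72 (79.03, 72.54, 73.96, and 73.24, respectively). The predicted APS and the temporospatial parameters MT, IC, and NMU showed moderate to strong correlations with the true values (r=0.783, 0.544, 0.755, and 0.601, respectively). With the 2D video data, the CNN achieved accuracy, precision, recall, and F1-score all above 85 for FMA-UE classification (89.23, 88.39, 85.97, and 87.16, respectively). For BBT classification, accuracy, precision, recall, and F1-score were all above 73 (76.92, 73.79, 75.51, and 74.64, respectively). 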
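The predicted APS and the temporospatial parameters MT, IC, and NMU showed moderate to strong correlations with the true values (r=0.569, 0.528, 0.703, and 0.625, respectively). Deep learning gave highly promising results in predicting the upper limb function of patients with stroke from only a single 2D video recorded during a simple activity of daily living. Despite the small amount of data, deep learning classified FMA-UE and BBT scores fairly accurately from a single simple movement. The true and predicted values of the temporospatial parameters and APS also showed moderate to strong correlations. In this study, deep learning predicted the results of conventional, more complex upper limb function assessments relatively accurately from recordings of a very simple everyday movement alone. The study confirms the potential of applying deep learning to simple video data of patients with stroke to predict upper limb function, with possible future use in health monitoring within digital healthcare.	-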
dc.description.statementOfResponsibility	open	-
dc.publisher	Graduate School, Yonsei University (연세대학교 대학원)	-
dc.rights	CC BY-NC-ND 2.0 KR	-
dc.title	Prediction of upper limb function from simple activity of daily living using deep learning in patients with stroke	-
dc.title.alternative	뇌졸중 환자에서 간단한 일상생활 동작 데이터로부터 딥러닝에 의한 상지 기능 예측	-
dc.type	Thesis	-
dc.contributor.college	College of Medicine (의과대학)	-
dc.contributor.department	Others (기타)	-
dc.description.degree	Doctoral (박사)	-
dc.contributor.alternativeName	Shim, Dain	-
dc.type.local	Dissertation	-
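
The abstract reports accuracy, precision, recall, and F1-score for the 3-group FMA-UE and BBT classification, and a correlation coefficient r between predicted and true MT, IC, NMU, and APS. The record does not say how the per-class values were averaged or which correlation coefficient was used, so the following is only a minimal sketch under two assumptions: macro averaging over the 3 severity groups, and Pearson's r. The function names `macro_metrics` and `pearson_r` are hypothetical and are not taken from the thesis.

```python
from math import sqrt


def macro_metrics(y_true, y_pred, n_classes=3):
    """Accuracy and macro-averaged precision, recall, F1, all in percent.

    Assumption: classes are integer labels 0..n_classes-1 (e.g. three
    severity groups) and per-class scores are averaged with equal weight.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in range(n_classes):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return {
        "accuracy": 100 * accuracy,
        "precision": 100 * sum(precisions) / n_classes,
        "recall": 100 * sum(recalls) / n_classes,
        "f1": 100 * sum(f1s) / n_classes,
    }


def pearson_r(x, y):
    """Pearson correlation between predicted and true continuous values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

For example, `macro_metrics([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])` scores a toy set of six labels, and `pearson_r(predicted, true)` would be called once per regression target (MT, IC, NMU, APS).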

Appears in Collections: 1. College of Medicine (의과대학) > Others (기타) > 3. Dissertation
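
The abstract describes the 2D input as 14 coordinate series per video (X and Y for 7 upper limb keypoints) and a 60%/20%/20% training/validation/test split. As a rough illustration only, the sketch below arranges per-frame keypoints into 14 channels and splits a list of items 60/20/20. The keypoint names, the dict-based frame format, and the functions `to_channels` and `split_60_20_20` are all hypothetical. The record does not state whether the split was made per patient or per clip, so the sketch simply splits whatever items it is given.

```python
import random

# Hypothetical keypoint names: the abstract lists Trunk, Shoulder, Elbow,
# Wrist (2), Finger (2) but does not name the individual points.
KEYPOINTS = [
    "trunk", "shoulder", "elbow",
    "wrist_1", "wrist_2", "finger_1", "finger_2",
]


def to_channels(frames):
    """Turn per-frame keypoints into 14 coordinate series.

    frames: list of dicts mapping keypoint name -> (x, y), one dict per frame.
    Returns a list of 14 lists (x then y for each keypoint), each as long
    as the number of frames.
    """
    channels = []
    for name in KEYPOINTS:
        channels.append([frame[name][0] for frame in frames])
        channels.append([frame[name][1] for frame in frames])
    return channels


def split_60_20_20(items, seed=0):
    """Shuffle items reproducibly and split them 60% / 20% / 20%."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 60 // 100  # integer arithmetic avoids float rounding
    n_val = n * 20 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

With 330 items, as in the 2D video dataset, `split_60_20_20(range(330))` yields 198 training, 66 validation, and 66 test items. If several clips come from the same patient, splitting by patient rather than by clip would keep one person's clips out of both training and test sets; which approach the thesis used is not stated in this record.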
