Appearance-based gaze estimation has emerged as a promising alternative to traditional model-based methods, addressing their limitations in flexibility, cost, and adaptability to unconstrained environments. In this study, the Digital Therapeutics Research Team at Bundang CHA Medical Center developed a novel appearance-based gaze estimation algorithm, CHA-Gaze, by integrating head-pose information into the adaptive feature fusion network (AFF-Net) architecture, a widely recognized baseline in the field. To evaluate the effectiveness of CHA-Gaze, we conducted a unified validation on the MPIIFaceGaze dataset, which comprises 37,590 images from 15 participants acquired under semi-natural conditions. CHA-Gaze achieved a significantly lower mean Euclidean error than AFF-Net (1.88 cm vs. 2.59 cm, p < 0.001). These findings indicate that CHA-Gaze offers superior accuracy and improved robustness across diverse appearances and environmental conditions. This study confirms the value of architectural refinement within appearance-based gaze estimation frameworks and highlights the potential of CHA-Gaze for real-world deployment in applications such as digital therapeutics, telehealth, and accessibility technologies. The proposed model provides a scalable, non-intrusive solution using standard webcams, making it suitable for widespread use in both clinical and consumer-grade settings.
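For clarity, the reported 1.88 cm figure is a mean Euclidean (point-of-gaze) error on the screen plane. A minimal NumPy sketch of this metric follows; it is an illustration of the metric only, not the authors' evaluation code, and the function name and toy coordinates are hypothetical.

```python
import numpy as np

def mean_euclidean_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth
    2-D on-screen gaze points (same units as the inputs, e.g. cm)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Per-sample Euclidean distance, then average over samples.
    return float(np.linalg.norm(pred - gt, axis=1).mean())

# Toy example with hypothetical gaze points in cm:
pred = [[1.0, 2.0], [3.0, 5.0]]
gt = [[1.0, 2.0], [0.0, 1.0]]
err = mean_euclidean_error(pred, gt)  # (0 + 5) / 2 = 2.5
```

In the study's setting, `pred` would hold the model's estimated on-screen gaze positions and `gt` the calibrated ground-truth targets, averaged over all test images.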