Hankuk University of Foreign Studies

Professor Jeong-Sik Park (박정식)

Degree: Ph.D. in Computer Science, Korea Advanced Institute of Science and Technology, Republic of Korea

Research areas: Spoken language processing, machine learning

Phone: 02-2173-8814

Email: parkjs@hufs.ac.kr

Office: Faculty Hall, Room 314

Details

Speech processing technology is a multidisciplinary research area that combines several academic fields, including computer science, linguistics, and mathematics. Since studying computer science in my master's and doctoral programs, I have taken a strong interest in speech technologies such as speech recognition, speech emotion recognition, and speech synthesis. My research goal is to build a convenient and reliable speech-driven user interface that can be deployed on a wide range of smart devices and humanoid robots.

Highest degree

Ph.D. in Computer Science, Korea Advanced Institute of Science and Technology, Republic of Korea

Fields of specialization

Speech Processing Technology
Machine Learning
Artificial Intelligence

Major research projects

2020-2025: Personalized virtual assistant prototype based on user characterized learning of deep neural network. Supported by the National Research Foundation (NRF) of Korea.
2014-2019: Personalized speech emotion recognition for human-robot affective interaction. Supported by the National Research Foundation (NRF) of Korea.
2016-2018: Study for delivering UAV mission command using natural language. Supported by the Agency for Defense Development (ADD).
2012-2014: Real-time emotion recognition from naturally verbalized emotional speech. Supported by the Ministry of Education, Science and Technology.
2012-2013: Identification of abnormal speech for surveillance system. Supported by the Korea Research Institute of Standards and Science (KRISS).

Major courses

Spoken Language Processing
Speech Signal Processing
Machine Learning

Major papers and books

[Speech recognition]

(2023) End-to-end emotional speech recognition using acoustic model adaptation based on knowledge distillation. Multimedia Tools and Applications, 1-18.

(2021) Accented speech recognition based on end-to-end domain adversarial training of neural networks. Applied Sciences, 11(18): 1-13.

(2020) Front-end of vehicle-embedded speech recognition for voice-driven multi-UAVs control. Applied Sciences, 10(19): 1-27.

(2020) Automatic language identification using speech rhythm features for multi-lingual speech recognition. Applied Sciences, 10(7): 1-18.

(2020) Noise cancellation based on voice activity detection using spectral variation for speech recognition in smart home devices. Intelligent Automation and Soft Computing, 26(1): 149-159.

(2019) Sound learning–based event detection for acoustic surveillance sensors. Multimedia Tools and Applications, 79: 16127-16139.

(2017) Noise reduction based on robust speech and non-speech detection in vehicular environments. IJMPERD, 7(3): 105-112.

(2016) Unsupervised noise reduction scheme for voice-based information retrieval in mobile environments. Multimedia Tools and Applications. 75(9): 4981-4996.

(2015) Unsupervised rapid speaker adaptation based on selective eigenvoice merging for user-specific voice interaction. Engineering Applications of Artificial Intelligence. 40: 95–102.

(2013) Acoustic interference cancellation for a voice-driven interface in smart TVs. IEEE Transactions on Consumer Electronics. 59(1): 244-249.

[Speech emotion recognition]

(2016) Multistage data selection-based unsupervised speaker adaptation for personalized speech emotion recognition. Engineering Applications of Artificial Intelligence. 52: 126–134.

(2015) Emotional information processing based on feature vector enhancement and selection for human-computer interaction via speech. Telecommunication Systems. 60(2): 201-213.

(2012) Speaker-characterized emotion recognition using online and iterative speaker adaptation. Cognitive Computation. 4(4): 398-408.

[Spoken content retrieval]

(2012) Online speaker diarization for multimedia data retrieval on mobile devices. International Journal of Pattern Recognition and Artificial Intelligence. 26(8).

(2012) Multistage utterance verification for keyword recognition-based online spoken content retrieval. IEEE Transactions on Consumer Electronics. 58(3): 1000-1005.

(2010) GMM adaptation based online speaker segmentation for spoken document retrieval. IEEE Transactions on Consumer Electronics. 56(2): 1123-1129.

Research homepage (Google Scholar): http://scholar.google.com/citations?hl=en&user=KBFDG64AAAAJ