Publications

A collection of my research work.

VIS-IR Image Translation

VITSG: Visible-to-Infrared Image Translation via Semantic Guidance (Accepted)

Zhiqing Zhao, Yong Ma, Jun Huang*, Kangle Wu, Lihan Wang, Fan Fan

IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), Accepted (In Press), 2026

In the field of visible-to-infrared (VIS-IR) image translation, the key challenge is to overcome the scarcity of infrared data and generate images with physical consistency and high-fidelity details. Existing methods often treat images as uniform pixel matrices, ignoring the physical essence of thermal radiation in different semantic regions, leading to errors in thermal signal allocation and detail artifacts in complex scenes. To address this limitation, this paper proposes a novel Visible-to-Infrared image Translation network via Semantic Guidance (VITSG). This network prioritizes semantic guidance for physical consistency, while incorporating high-frequency information as a complementary tool for detail restoration. The network employs an encoder-decoder framework that decouples the translation task into low-frequency thermal radiation structure generation and high-frequency detail recovery. Specifically, in the semantic embedding phase, we design a progressive mechanism that gradually integrates pre-extracted semantic prior information into multi-scale features through Semantic-Guided Attention Blocks (SGAB). By establishing an explicit mapping between the semantic space and the target thermal radiation characteristics, this mechanism ensures the accurate transmission of key objects’ thermodynamic properties. Subsequently, in the high-frequency detail restoration phase, we introduce a cross-modal high-frequency injection mechanism to effectively recover fine-grained textures. Here, a channel-spatial attention module adaptively filters and converts visible image high-frequency components for precise, high-fidelity detail reconstruction. 
Systematic experiments show that VITSG significantly outperforms existing baseline models in both quantitative and qualitative results. It effectively balances physical consistency against detail fidelity, demonstrating the practical potential of this cross-modal translation framework for generating high-quality, physically plausible infrared images.
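The split into a low-frequency structure and a high-frequency detail component can be illustrated with a simple box-blur decomposition. This is only a sketch: the abstract does not say how VITSG separates the two bands, so the box filter, the radius, and the names `box_blur` and `split_frequencies` are assumptions made for illustration.

```python
def box_blur(image, radius=1):
    """Mean filter with edge clamping; the result approximates the low-frequency band."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total, count = 0.0, 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
                    count += 1
            out[y][x] = total / count
    return out


def split_frequencies(image, radius=1):
    """Return (low, high) such that low + high reproduces the input image."""
    low = box_blur(image, radius)
    high = [
        [image[y][x] - low[y][x] for x in range(len(image[0]))]
        for y in range(len(image))
    ]
    return low, high
```

On a flat image the high-frequency component is zero everywhere, while an isolated bright pixel shows up almost entirely in `high`.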

@article{zhao2026vitsg,
  title={VITSG: Visible-to-Infrared Image Translation via Semantic Guidance},
  author={Zhao, Zhiqing and Ma, Yong and Huang, Jun and Wu, Kangle and Wang, Lihan and Fan, Fan},
  journal={IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)},
  year={2026},
  note={Accepted (In Press)},
  publisher={IEEE}
}
                                    
Infrared and Visible Image Fusion

DynaFuse: A Dynamic Perception-Driven Dual-Branch Network with Adaptive Routing for Infrared and Visible Image Fusion (Under Review)

Lihan Wang, Zhiqing Zhao, Jun Huang*, Kangle Wu, Yong Ma, Fan Fan

Under Review, 2026

Infrared and visible image fusion aims to integrate complementary information from multiple sensors to generate a more comprehensive scene representation. Although deep learning-based methods have made significant progress, current approaches still face several challenges. First, current feature extraction methods for infrared and visible image fusion use homogeneous network structures that fail to account for the fundamental differences in the physical characteristics of the two modalities. This makes it particularly difficult to derive clear feature representations from noisy source images, which in turn undermines the model’s robustness in challenging or degraded scenarios. Second, most existing fusion strategies rely on fixed-weight schemes, which struggle to dynamically adjust the contributions of different modalities according to varying scene content, thus limiting scene adaptability. To address these issues, we propose DynaFuse, a dynamic perception-driven dual-branch fusion network. Its core design comprises a Dual-Branch Feature Extraction Module (DB-FEM) and a Dynamic Routing Fusion Module (DRFM). The DB-FEM performs differentiated encoding based on the physical properties of each modality: the infrared branch suppresses high-frequency noise via frequency-domain truncation and incorporates Vision Mamba blocks to model global thermal radiation dependencies; the visible branch employs lightweight Swin fusion blocks enhanced with deformable convolution and cross-window attention to improve multi-scale texture and edge feature extraction. In the fusion stage, the DRFM incorporates a Window Attention Routing (WAR) mechanism for dynamic weighting. This mechanism adaptively calibrates the fusion weights of different modalities based on the semantic importance of local content, enabling more rational feature integration in complex scenes.
Furthermore, we introduce a multi-task loss function that jointly optimizes the network from the perspectives of physical alignment of features, multi-scale structural similarity, and gradient consistency. Experiments on the MSRS, LLVIP, and TNO benchmarks demonstrate that DynaFuse outperforms existing state-of-the-art methods in both subjective visual quality and objective evaluation metrics, with fewer parameters. Ablation studies further validate the effectiveness of the dual-branch design and the dynamic routing mechanism.
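The idea of suppressing high-frequency noise by frequency-domain truncation can be sketched in one dimension with a plain discrete Fourier transform. The abstract does not give the cutoff, the dimensionality, or the transform used in the infrared branch, so the 1-D signal, the `keep` parameter, and the function names below are assumptions, not the DynaFuse implementation.

```python
import cmath


def dft(signal):
    """Naive O(n^2) discrete Fourier transform."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft()."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def truncate_high_frequencies(signal, keep):
    """Zero every DFT bin whose frequency index exceeds `keep`, then invert."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        frequency = min(k, n - k)  # bins k and n - k carry the same frequency
        if frequency > keep:
            spectrum[k] = 0
    return [value.real for value in idft(spectrum)]
```

For example, a constant level of 2 with an alternating ±1 disturbance, `[3.0, 1.0] * 4`, is flattened back to the constant level when `keep=2`, because the disturbance sits entirely in the highest frequency bin.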

@unpublished{wang2026dynafuse,
  title={DynaFuse: A Dynamic Perception-Driven Dual-Branch Network with Adaptive Routing for Infrared and Visible Image Fusion},
  author={Wang, Lihan and Zhao, Zhiqing and Huang, Jun and Wu, Kangle and Ma, Yong and Fan, Fan},
  note={Under review},
  year={2026}
}
                                    

OPENMV Image Processing

Study on Two-dimensional Pan-tilt-Camera Platform Image Processing Based on OPENMV (Published)

Lihan Wang; Xingrong Zhong; Yuhao Hu; Yiwen Tang; Weiqing Wang; Junjie Wang

2024 7th International Conference on Electronics, Communications, and Control Engineering (ICECC), 2024

An image processing method for a two-dimensional pan-tilt camera platform has been developed for the control and automatic tracking of moving targets. The method uses an OPENMV control unit and an MT9V034 camera to collect environmental images. After preprocessing, a quadrilateral detection method extracts the contour of the environment edge. An improved algorithm based on spline interpolation is proposed for servo motor movement. By optimizing the path with a circular arc algorithm and predicting feedback from the control strategy, the adaptability of the algorithm is significantly enhanced, enabling it to compensate for hardware deficiencies. For target tracking, an improved algorithm based on Kalman filtering is introduced, which incorporates model prediction to remain adaptive under certain conditions. Multi-object recognition is used to make the algorithm more robust when tracking laser targets. In experiments under various lighting conditions, the algorithm performs well under indoor illumination below 500 lux, with an accuracy of 95%.
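As background for the tracking step, the following is a minimal sketch of a standard constant-velocity Kalman filter for one image coordinate. It is the textbook filter, not the improved variant the paper proposes; the noise values `q` and `r`, the initial variance, and the class name are assumptions chosen for illustration.

```python
class ConstantVelocityKalman:
    """1-D Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, q=1e-3, r=1.0, initial_variance=100.0):
        self.pos, self.vel = 0.0, 0.0
        # Covariance matrix P stored element by element.
        self.p00, self.p01 = initial_variance, 0.0
        self.p10, self.p11 = 0.0, initial_variance
        self.q, self.r = q, r  # process and measurement noise variances (assumed)

    def predict(self, dt=1.0):
        """Propagate the state with x = F x and P = F P F^T + q I, where F = [[1, dt], [0, 1]]."""
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return self.pos

    def update(self, z):
        """Correct the prediction with a position measurement z (H = [1, 0])."""
        innovation = z - self.pos
        s = self.p00 + self.r
        k0, k1 = self.p00 / s, self.p10 / s  # Kalman gain
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
```

In a tracking loop, `predict()` would be called once per frame to estimate where the target will appear, and `update()` whenever a detection is available; two independent filters cover the x and y coordinates.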

@INPROCEEDINGS{10624630,
  author={Wang, Lihan and Zhong, Xingrong and Hu, Yuhao and Tang, Yiwen and Wang, Weiqing and Wang, Junjie},
  booktitle={2024 7th International Conference on Electronics, Communications, and Control Engineering (ICECC)}, 
  title={Study on Two-dimensional Pan-tilt-Camera Platform Image Processing Based on OPENMV}, 
  year={2024},
  volume={},
  number={},
  pages={12-18},
  keywords={Interpolation;Target tracking;Filtering;Target recognition;Image processing;Lighting;Prediction algorithms;OPENMV;Spline Interpolation;Kalman filtering},
  doi={10.1109/ICECC63398.2024.00010}
}
                                    
Multi-scene intelligent environmental monitoring system

Multi-scene intelligent environmental monitoring system (Published)

Haorun Lv; Juanjuan Li; Yi Xu; Xinyi Tao; Lihan Wang

2024 IEEE 17th International Conference on Signal Processing (ICSP), 2024

With rapid economic development and accelerating modernization since the start of the 21st century, environmental problems have become increasingly prominent, making the monitoring and evaluation of environmental systems particularly important. A multi-scene intelligent environmental monitoring system based on STM32 is proposed. The system consists of an MQ2 module, an MQ3 module, an MQ7 module, a DHT11 temperature and humidity sensor, a photoresistor, an active buzzer, an OLED screen, and a Guanghetong L610 communication system. It monitors temperature, humidity, CO concentration, alcohol gas concentration, and smoke concentration in real time and shows these parameters on the OLED screen. The system also connects to a mobile phone app through the Guanghetong interface, so that users can observe and adjust the environmental parameters at any time. The system provides a concentration threshold function: once a parameter exceeds its preset threshold, the system automatically places a phone call and sends a WeChat official-account message and an SMS notification. Users can also set and adjust the threshold of each parameter at any time. The system can also display personal information. Experimental results show that the system offers high precision, high sensitivity, and fast response, enabling real-time environmental monitoring and analysis.
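The threshold function described above amounts to comparing each reading with a user-adjustable limit. A minimal sketch follows; the parameter names and limit values are hypothetical and are not taken from the paper, and the actual firmware runs on an STM32 rather than in Python.

```python
def exceeded_parameters(readings, thresholds):
    """Return the sorted names of readings that are above their configured threshold."""
    return sorted(
        name
        for name, value in readings.items()
        if name in thresholds and value > thresholds[name]
    )


# Hypothetical limits; in the described system the user can change them at any time.
thresholds = {"temperature_c": 40.0, "co_ppm": 50.0, "smoke_ppm": 300.0}
readings = {"temperature_c": 26.5, "co_ppm": 72.0, "smoke_ppm": 120.0, "humidity_pct": 55.0}
alerts = exceeded_parameters(readings, thresholds)  # only CO is above its limit here
```

A non-empty result would be the trigger for the call, message, and SMS notifications.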

@INPROCEEDINGS{10846170,
  author={Lv, Haorun and Li, Juanjuan and Xu, Yi and Tao, Xinyi and Wang, Lihan},
  booktitle={2024 IEEE 17th International Conference on Signal Processing (ICSP)}, 
  title={Multi-scene intelligent environmental monitoring system}, 
  year={2024},
  volume={},
  number={},
  pages={737-742},
  keywords={Temperature sensors;Temperature measurement;Social networking (online);Cellular phones;Humidity measurement;Humidity;Signal processing;Message services;Real-time systems;Environmental monitoring;Environmental monitoring;IoT communication;STM32F103C8T6;GWT L610},
  doi={10.1109/ICSP62129.2024.10846170}
}
                                    
ECG Signal Denoising

Research on ECG Signal Denoising Based on EM-UKF Algorithm (Published)

Shengyang Tong, Dongping Yu, Xiang Li, Lihan Wang, Lirong Wang

2024 9th International Conference on Multimedia Systems and Signal Processing (ICMSSP), 2024

The electrocardiogram (ECG) is an essential diagnostic tool for identifying heart diseases. However, during signal collection, the ECG is highly susceptible to interference from various types of noise, such as baseline wander, powerline interference, electrode motion artifacts, and muscle artifacts. Although filters based on Bayesian models have been widely applied in ECG signal processing, they have limitations when dealing with abnormal ECG morphologies, because they require predefined models to identify and process noise, and these models may not accurately reflect all types of ECG signals and noise. To address these issues, this paper presents a more flexible and precise denoising method that combines the unscented Kalman filter (UKF) with the expectation-maximization (EM) algorithm. The method does not require predefined models and adjusts dynamically to the characteristics of the ECG signal and the type of noise, adapting to variable noise environments. Through R-peak detection and dynamic segmentation of heartbeats, the method adjusts the UKF parameters to match specific ECG signals and noise characteristics. The EM algorithm then iteratively optimizes these parameters, improving denoising efficiency and accuracy. Experimental results show that, compared with current mainstream ECG filtering algorithms, the proposed EM-UKF algorithm achieves a higher signal-to-noise ratio (SNR) and a lower mean square error (MSE), especially for ECG signals in complex noise environments.
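The building block that distinguishes the UKF from a linear Kalman filter is the unscented transform, which pushes a small set of sigma points through a nonlinear function to estimate the transformed mean and variance. The sketch below covers the scalar case only; the scaling parameters `alpha`, `beta`, and `kappa` are assumed defaults, and nothing here reproduces the paper's ECG model or its EM step.

```python
import math


def unscented_transform(mean, variance, f, alpha=1.0, beta=0.0, kappa=2.0):
    """Estimate the mean and variance of f(x) for scalar x with the given mean and variance."""
    n = 1  # state dimension
    lam = alpha ** 2 * (n + kappa) - n
    spread = math.sqrt((n + lam) * variance)
    sigma_points = [mean, mean + spread, mean - spread]
    # Weights for the mean (wm) and for the covariance (wc).
    wm = [lam / (n + lam), 0.5 / (n + lam), 0.5 / (n + lam)]
    wc = [wm[0] + (1 - alpha ** 2 + beta), wm[1], wm[2]]
    ys = [f(x) for x in sigma_points]
    y_mean = sum(w * y for w, y in zip(wm, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(wc, ys))
    return y_mean, y_var
```

For a linear function the transform is exact: with mean 1 and variance 4, `f(x) = 2x + 1` yields mean 3 and variance 16.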

@inproceedings{10.1145/3690063.3690067,
author = {Tong, Shengyang and Yu, Dongping and Li, Xiang and Wang, Lihan and Wang, Lirong},
title = {Research on ECG Signal Denoising Based on EM-UKF Algorithm},
year = {2024},
isbn = {9798400716911},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3690063.3690067},
doi = {10.1145/3690063.3690067},
booktitle = {Proceedings of the 2024 9th International Conference on Multimedia Systems and Signal Processing (ICMSSP)},
pages = {18--23},
numpages = {6},
keywords = {AKF, ECG signal denoising, EM, UKF},
location = {},
series = {ICMSSP '24}
}
                                    
Garbage Sorting Cart

Design of an Early Learning Car for Garbage Sorting Based on Object Detection (Published)

Yuhao Hu; Lihan Wang; Zhenxiao Jiang; Jiawei Ji; Yuxuan Yan; Junjie Wang

2023 International Conference on Intelligent Computing, Communication & Convergence (ICI3C), 2023

Since 2019, China has rolled out household waste classification in all cities at the prefecture level and above, and waste classification has begun to be integrated into daily life. However, because awareness of waste classification was lacking for a long time, not sorting waste has become a habit among adults, so waste classification education for young children is particularly important. In this paper, an intelligent trolley control system is designed in the context of the waste classification policy, aiming to improve the future efficiency and accuracy of waste classification through education of young children. The system is built on STM32, K210, and LD3320 controllers. First, a lightweight YOLOv2 object detection model detects the gesture card teaching aids, and Kalman filtering is then applied to the detection results to track the detected target in the image and improve detection stability. Next, a PID control algorithm drives the trolley so that it advances to the vicinity of the gesture card as expected. Finally, the LD3320 recognizes the child's voice and opens the bin of the corresponding category to complete the sorting. Experimental results show that the intelligent trolley system achieves good accuracy and stability in target detection and dynamic tracking and high accuracy in speech recognition. It shows good application prospects in waste classification education for young children and promises to improve both sorting efficiency and children's participation, supporting the implementation of the waste classification policy.
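The motion control step relies on a PID loop. A minimal discrete PID controller is sketched below as an illustration; the gains, the time step, and the choice of error signal (for instance, the horizontal offset of the detected card from the image center) are assumptions, since the abstract does not specify them.

```python
class PID:
    """Discrete PID controller: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control output for the current error and time step dt (seconds)."""
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0  # no history on the first call
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

With only a proportional gain, `PID(2.0, 0.0, 0.0).update(3.0, 0.1)` returns 6.0; the integral and derivative terms come into play over successive calls.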

@INPROCEEDINGS{10729795,
  author={Hu, Yuhao and Dai, Zhihao and Jiang, Zhenxiao and Ji, Jiawei and Yan, Yuxuan and Wang, Junjie and Li, Juanjuan},
  booktitle={2023 International Conference on Intelligent Computing, Communication \& Convergence (ICI3C)},
  title={Design of an Early Learning Car for Garbage Sorting Based on Object Detection}, 
  year={2023},
  volume={},
  number={},
  pages={273-278},
  keywords={Accuracy;Target tracking;Target recognition;Face recognition;Education;Speech recognition;Object detection;Stability analysis;Character recognition;Sorting;Early childhood education;garbage classification;object detection;PID control;speech recognition},
  doi={10.1109/ICI3C60830.2023.00059}
}
                                    
Diode Envelope Detector

Design and Analysis of Diode Envelope Detector Based on Multisim (Published)

Jiale Xu, Juanjuan Li, Yi Zhong, Tao Wang, Lihan Wang

2023 6th International Conference on Electronics, Communications and Control Engineering (ICECC), 2023

The diode envelope detector circuit is simple and easy to implement, and it is important for detecting ordinary amplitude-modulated waves. In engineering practice, the detector must restore the original signal faithfully, so its working principles and distortions are worth studying. This paper compares the output voltage waveforms of different types of detectors, analyzes four common distortion cases, and gives the parameter ranges that avoid distortion. Since most textbooks focus on theoretical analysis such as the derivation of equations but do not directly present the circuit output, this paper emphasizes simulation in Multisim, which visualizes the detection process. By sweeping the parameter values, the effect of the relevant parameters on the distortion of the detector output voltage is investigated to further verify the theoretical derivation. Based on these analyses, diode envelope detectors can be designed and reasonable device parameters selected with ease.
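Two of the standard textbook conditions for a distortion-free envelope detector can be checked numerically. The formulas below are the commonly taught ones, RC ≤ sqrt(1 − m²) / (m·Ω) to avoid diagonal clipping and m ≤ R_ac / R_dc to avoid negative-peak clipping; they are not quoted from the paper, and the function and parameter names are illustrative assumptions.

```python
import math


def max_rc_without_diagonal_clipping(m, f_mod_hz):
    """Largest load time constant RC (seconds) that still follows the envelope.

    m is the modulation index (0 < m <= 1) and f_mod_hz the highest modulating frequency.
    """
    if not 0 < m <= 1:
        raise ValueError("modulation index must be in (0, 1]")
    omega = 2 * math.pi * f_mod_hz
    return math.sqrt(1 - m * m) / (m * omega)


def max_modulation_index_without_negative_peak_clipping(r_load, r_next_stage):
    """Largest modulation index for a DC load r_load that is AC-coupled to r_next_stage."""
    r_ac = r_load * r_next_stage / (r_load + r_next_stage)  # parallel combination
    return r_ac / r_load
```

For example, with m = 0.6 and a 1 kHz modulating tone the time constant must stay below roughly 0.21 ms, and a 5 kΩ load coupled to a 5 kΩ next stage tolerates a modulation index of at most 0.5.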

@inproceedings{10.1145/3592307.3592347,
author = {Xu, Jiale and Li, Juanjuan and Zhong, Yi and Wang, Tao and Wang, Lihan},
title = {Design and Analysis of Diode Envelope Detector Based on Multisim},
year = {2023},
isbn = {9798400700002},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3592307.3592347},
doi = {10.1145/3592307.3592347},
booktitle = {Proceedings of the 2023 6th International Conference on Electronics, Communications and Control Engineering},
pages = {249--256},
numpages = {8},
keywords = {Diode Envelope Detector, Distortion, Multisim, Parameter Scanning},
location = {Fukuoka, Japan},
series = {ICECC '23}
}
                                    

Intelligent Shooting Trolley System Software V1.0 (Registered)

Lihan Wang, Juanjuan Li

Software Copyright Registration No. 2024SR0319129, 2024

Description

This software is the core control system for the intelligent shooting trolley. Built on the STM32G474 main control chip and the Raspberry Pi platform, it provides autonomous motion control, high-definition image acquisition, and real-time transmission. The software integrates a PID motion control algorithm, a visual target recognition module, and a Wi-Fi communication protocol. It supports remote control of trolley movement, adjustment of the shooting angle from a host computer, and real-time preprocessing and feature extraction of the collected images. The modular design allows functions such as obstacle avoidance and line following to be added flexibly, and the system suits indoor inspection, environmental monitoring, educational demonstration, and similar scenarios. By combining embedded hardware with edge computing, the software achieves efficient image processing and motion control under resource constraints, providing a complete solution for mobile vision systems.

Moving Target Control and Automatic Tracking System Software V1.0 (Registered)

Lihan Wang, Juanjuan Li

Software Copyright Registration No. 2024SR0101586, 2024

Description

This software is developed for the automatic recognition and tracking of moving targets. It integrates computer vision and closed-loop control to detect targets in complex backgrounds in real time and drive actuators for dynamic tracking. The software uses an improved Kalman filter algorithm to predict the target's motion trajectory, works with the OPENMV vision module to achieve high-precision target positioning, and controls the pan-tilt platform or trolley through serial communication to complete attitude adjustment. The system supports multi-target recognition and priority switching, automatically adjusts the tracking strategy according to target characteristics, and includes abnormal-state detection and emergency response mechanisms. The software can be applied in security monitoring, industrial inspection, educational robots, and other fields, where it can improve the efficiency and reliability of target monitoring in dynamic scenarios.

Convenient Garbage Classification Software V1.0 Based on Yolo v5 (Registered)

Shuai Yuan, Lihan Wang, Juanjuan Li

Software Copyright Registration No. 2024SR0512959, 2024

Description

This software is based on the lightweight Yolo v5 algorithm and aims at rapid classification and recognition of domestic waste. Trained on an image dataset of common garbage items, the model identifies recyclables, kitchen waste, hazardous waste, and other waste in real time on mobile devices or embedded platforms, and provides voice prompts and classification guidelines. The software supports an offline mode that does not rely on cloud computing, and includes data statistics and classification accuracy analysis, so it can serve as an auxiliary tool for garbage classification in homes, communities, schools, and similar settings. As the second inventor, I was mainly responsible for the lightweight optimization and embedded deployment of the model, compressing the model to 40% of its original size through pruning and quantization and achieving efficient inference on the K210 chip, so that the software runs smoothly on resource-constrained devices.
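Pruning and quantization can be illustrated on a flat list of weights. The sketch below shows unstructured magnitude pruning followed by symmetric 8-bit quantization; the description does not say which pruning or quantization scheme was used, so both choices and all function names are assumptions, and the example does not reproduce the 40% figure.

```python
def magnitude_prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest absolute value."""
    n_prune = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:n_prune]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


def quantize_int8(weights):
    """Symmetric quantization to integers in [-127, 127]; returns (values, scale)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    values = [max(-127, min(127, round(w / scale))) for w in weights]
    return values, scale


def dequantize(values, scale):
    """Map quantized integers back to approximate floating-point weights."""
    return [v * scale for v in values]
```

Pruning `[0.5, -0.1, 0.05, -2.0]` at 50% sparsity keeps only 0.5 and -2.0; after quantization each weight is stored in one byte plus a shared scale, and the round-trip error per weight is at most half the scale.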


An Intelligent Garbage Classification Trolley (Granted)

Lihan Wang, Juanjuan Li

Utility Model Patent No. ZL 2024 2 0369088.3, 2024

Abstract

The utility model discloses an intelligent garbage classification trolley, relating to the technical field of intelligent robots and garbage classification. The trolley includes a mobile chassis, a garbage throwing device, a visual recognition module, a main control unit, and a classification storage bin. The visual recognition module uses a lightweight Yolo v5 model to identify the thrown garbage in real time, and the main control unit controls the mechanical arm to place the garbage into the storage bin of the corresponding category according to the recognition result. The trolley is equipped with ultrasonic obstacle avoidance sensors and infrared tracking modules, so it can navigate autonomously in indoor environments and avoid obstacles, and it supports connection to mobile terminals over Bluetooth or Wi-Fi for remote control and data upload. The utility model has a compact structure and is simple to operate, and can be widely used in homes, offices, and educational institutions. It can improve users' awareness of garbage classification in an entertaining way and has good social promotion value. As the first inventor, I led the overall structural design and algorithm development to ensure the stability and classification accuracy of the trolley in practical applications.