Real-time Object Positioning in Vibrating Environments Using DeepLabV3+ and ResNet50-based Semantic Segmentation
Abstract
In environments where both the object and the imaging system experience mechanical vibration, accurate position measurement is a significant challenge. Conventional techniques, whether laser-based, contact-based, or image-based, often fail under such conditions, particularly when motion artifacts eliminate stable reference points within the image. This work presents a generalized and robust method for object localization under controlled vibration. It improves on previous approaches by using a single semantic segmentation network (DeepLabV3+ with a ResNet50 backbone) to segment the object and a static reference simultaneously. This unified architecture eliminates the need for separate models or manual handling of regions of interest. The method retains a local coordinate system anchored at the reference centroid for vibration-resilient position estimation, and extends it to a wider variety of object shapes and configurations. Validation with ten distinct objects under induced vibrations (5–10 Hz) showed submillimeter localization accuracy (MAE < 0.23 mm, RMSE < 0.29 mm) and strong correlation with ground truth (PCC > 0.99). The system also ran in real time at 94 fps, which supports scaling to dynamic applications. These findings demonstrate that the proposed framework enables fast, precise, and vibration-robust object tracking for automated manufacturing, robotic systems, and industrial quality assurance, where vibration has traditionally limited the effectiveness of image-based techniques.
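
As an illustration of the reference-anchored coordinate system and the reported error metrics, the following is a minimal sketch rather than the implementation used in this work. It assumes the network outputs a per-pixel label map; the class ids (`REFERENCE_CLASS`, `OBJECT_CLASS`), the `mm_per_pixel` calibration factor, and all function names are hypothetical. Because the object position is expressed relative to the reference centroid, a rigid image shift caused by camera vibration cancels out.

```python
import numpy as np

# Hypothetical label ids; the actual class mapping is an assumption.
REFERENCE_CLASS = 1
OBJECT_CLASS = 2


def centroid(mask):
    """Return the (x, y) centroid in pixels of a boolean mask."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("class not present in the segmentation mask")
    return np.array([cols.mean(), rows.mean()])


def relative_position_mm(label_map, mm_per_pixel):
    """Object centroid in a local frame anchored at the reference centroid."""
    ref = centroid(label_map == REFERENCE_CLASS)
    obj = centroid(label_map == OBJECT_CLASS)
    # mm_per_pixel is an assumed, pre-calibrated scale factor.
    return (obj - ref) * mm_per_pixel


def error_metrics(predicted, truth):
    """Return (MAE, RMSE, Pearson correlation) for two 1-D position series."""
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    err = predicted - truth
    mae = np.mean(np.abs(err))
    rmse = np.sqrt(np.mean(err ** 2))
    pcc = np.corrcoef(predicted, truth)[0, 1]
    return mae, rmse, pcc


# Synthetic label map: a 3x3 reference patch and a 3x3 object patch.
label_map = np.zeros((20, 20), dtype=np.uint8)
label_map[2:5, 2:5] = REFERENCE_CLASS
label_map[10:13, 8:11] = OBJECT_CLASS

# Simulate a rigid image shift (camera vibration) of 3 rows and 2 columns.
shifted = np.roll(label_map, shift=(3, 2), axis=(0, 1))

still = relative_position_mm(label_map, mm_per_pixel=0.5)
vibrating = relative_position_mm(shifted, mm_per_pixel=0.5)
```

In this synthetic case `still` and `vibrating` are identical, since the shift moves both centroids by the same amount. Relative motion between the object and the reference, as opposed to motion of the camera, would still appear in the estimate.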
