Fault diagnosis of reducers based on digital twins and deep learning | Scientific Reports

Scientific Reports volume 14, Article number: 24406 (2024)

A new method was proposed to address fault diagnosis by combining the high-fidelity behavior of the digital twin (DT) with the data mining capabilities of deep learning (DL). Subsequently, the proposed fault distribution GAN (FDGAN) was built to map virtual and physical entities for the data from the established test platform. Finally, MobileViG was employed to validate the model and diagnose faults. The accuracies of the proposed method with 600 and 800 training samples were 88.4% and 99.5%, respectively. The 99.5% result surpasses those of other methods based on CycleGAN (98.86%), CACGAN (94.92%), ACGAN (86.45%), ML1D-GAN (82.33%), and transfer learning (99.38%). Therefore, with the integration of global connectivity, an innovative network structure, and training methods, FDGAN can effectively address challenges such as network degradation, limited feature extraction in small windows, and insufficient model robustness.

Reducers are extensively employed in various construction machinery, and research on digital twins (DTs) and deep learning (DL) has significantly contributed to reducer fault diagnosis. With the progress of DTs, replication has shown potential and is widely used in fault diagnosis. Among the milestones in this field, the concept of DT was initially introduced for creating a mirrored model of a physical entity (Grieves1). In addition, a conceptual model of DTs for predicting aircraft structural life was developed by combining calculations of structural defects and temperature2. Then, the U.S. Department of Defense incorporated it into equipment health maintenance and defined it as a simulation process integrating multiple physical quantities, scales, and probabilities. In the same year, a virtual aircraft model was established through historical and real-time data to map the physical entity and life-cycle processes of objects3. Furthermore, DTs can also be utilized to support the design, verification, operation, and manufacture of system characteristics4,5. For example, the DT construction method was applied to detect and predict the damage of uncertain crack growth6. A DT system for structural damage detection was developed based on models and machine learning to provide intelligent decision-making solutions7. When building a DT system, it is essential to consider both the relationship between functions and applications and the methods, techniques, and model construction of DTs, along with the various driving technologies involved8,9,10.

In DL, numerous pioneers have provided fascinating fault diagnosis ideas. Sparse deep learning was proposed to reduce the risk of overfitting deep networks with many layers and neurons11. A fault diagnosis framework based on deep transfer learning was developed to achieve faster model convergence and higher fault diagnosis accuracy (Shao et al.12). A feature learning method based on a 1D residual convolutional autoencoder was used to extract vibration signal features under noisy conditions13. Nevertheless, these methods require considerable training data to support their performance. Under different working conditions, obtaining data becomes more challenging due to the complex structure and external environment. As a result, the available fault categories are relatively limited. To expand the range of fault types, several scholars have developed dynamic models14,15,16,17. However, many dynamic models are idealized and cannot accurately capture real-world motion. This limitation also leads to a significant difference between the virtual and real signals. The variable working conditions and complex environment of the reducer cannot be precisely expressed by mathematical formulas. Therefore, bridging the gap between virtual and real signals is a crucial challenge in developing DT technology.

Generative adversarial networks (GANs) have emerged as a novel approach to virtual-real mapping, generating significant interest in the research community due to their unsupervised learning mechanism and impressive results in data generation, image super-resolution and style transfer18, and several GAN variants have since been proposed to address specific challenges. The conditional GAN was proposed to include supervised learning in GANs by adding constraints to model training, allowing for a smooth transition from unsupervised to supervised learning (Mirza et al.19). The Wasserstein GAN was established to address the issue of unstable training and collapse in GAN models (Arjovsky et al.20). The deep convolutional GAN integrates supervised and unsupervised learning, improving the GAN architecture for high-quality image generation with better stability and control (Radford et al.21). An information GAN that combines information theory with a GAN was applied to enable the model to learn effective semantic features (Xi et al., 2016). In the sequence GAN, the concept of reward in reinforcement learning was utilized to generate discrete sequences, with promising results in speech, poetry, and music generation (Yu et al.23). Significant progress has also been made in image style transfer. The CycleGAN (Zhu et al.24) focuses on image style transfer and mapping while improving adaptability by adding a discriminator. Similarly, StarGAN applied a generator to achieve image transformation across multiple domains (Choi et al.25). Overall, GANs have revolutionized virtual-real mapping, showing great potential across various applications such as data generation, image processing, and style transfer.

For the past few years, most fault diagnosis methods based on DTs have built virtual models by dynamics, which can shorten the early modeling time and achieve good results. The virtual model and the physical entity can be integrated, where the collected data facilitate simulation, health monitoring, diagnosis, and maintenance. A mechanical fault diagnosis framework was explored using DT and deep transfer learning techniques; the fault data limitation was consequently solved under variable working conditions or system characteristics30. The DT model for autoclaving was developed to enhance the diagnostic capability by using simulated data in place of scarce actual fault data31. A new method for developing a 3D-printed robotic arm and creating a DT model in Unity3D was reported (Matuils et al.32). DTs have also been applied to intelligent detection robots via multisensor data fusion technology33. These contributions have advanced DTs by establishing effective models across various fields. However, the DT mapping component, which connects the virtual and real worlds, has received limited attention. The CycleGAN method was employed for virtual-real mapping (Song et al.34): the concept of adversarial training from GANs and dual learning were utilized to transform the training into a cyclic approach. By augmenting the loss function with cyclic consistency constraints, structural similarity between generated and real images is ensured. This enables CycleGAN to generate mappings between source and target domains without requiring paired datasets, but further improvements are necessary to achieve the desired results.

To address this challenge, the fault distribution GAN (FDGAN) was introduced as a new approach for virtual-real mapping and to enhance the data generation effect by combining DTs. For the degradation issue in traditional networks, a novel residual structure was proposed to seamlessly integrate the global connectivity of self-attention with the degradation protection of residual networks. Thus, the limitation in small window feature extraction could be solved. Moreover, a novel generator architecture was proposed; it adopts DG_Blocks and a skip connection structure to enhance the ability to extract global features under multiscale invariance. Finally, three training methods were introduced to improve the robustness of the FDGAN. Then, the classification with MobileViG (Munir et al.35) further enhances the diagnostic capabilities of the proposed framework with fewer samples for higher efficiency. These contributions offer a promising solution for enhancing fault diagnosis through DTs.

The objectives of this work are as follows:

The high-fidelity characteristics of finite element analysis could be utilized in virtual modeling to replace traditional dynamic modeling analysis for more accurate simulation signals.

The proposed FDGAN might improve the efficiency and effectiveness of virtual-real mapping. The simulated signal mapped to the physical domain can exhibit a distribution and fault characteristics similar to those of the real signal.

The results were validated by MobileViG for high accuracy with fewer training samples.

The proposed method follows the logic framework “Data acquisition→FDGAN mapping data→MobileViG classifies”. The framework is detailed as follows. First, data acquisition involves analyzing and configuring operational conditions, including fault-free and faulty scenarios at constant and variable speeds (Cases 1, 2, 3, and 4). The collected signals were filtered and treated to serve as real signal samples. Additionally, DT signals are obtained primarily through finite element analysis of the JZQ200 reducer, involving 3D modeling, mesh division, boundary condition setting, and transient dynamic analysis. These operations yield raw data categorized as RealA and SimulateB.

FDGAN is employed for implementing virtual-real mapping in the second part. Both real and simulated signals were collected and categorized into R and S domains. In addition, mapping R and S is implemented by FDGAN, that is, the virtual data are mapped to the real data. RealA and SimulateB are mapped using generators G_R(S) and G_S(R), respectively, generating images and assessing image generation losses Ga and Gb. D_S and D_R act as discriminators to evaluate whether the images from G_R(S) and G_S(R) meet the standards, and the discriminator losses DaLoss and DbLoss are output. Successful outputs include FakeS and FakeR; otherwise, the generator is recalibrated. To assess mapping similarity further, FakeS and FakeR are fed back through G_R(S) and G_S(R) to generate RecR and RecS, respectively, and CycleLoss is introduced to evaluate model performance. Following mapping, MobileViG conducts fault diagnosis, assessing classification accuracy to verify the performance of FDGAN in virtual-real mapping within the DT framework. The overall framework and fault diagnosis process of this study are shown in Fig. 1.
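The cycle step described above (FakeS and FakeR passed back through the generators to obtain RecR and RecS) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the L1 distance is assumed from the usual CycleGAN formulation, and the two toy generators are hypothetical affine maps standing in for G_R(S) and G_S(R).

```python
import numpy as np

def l1(a, b):
    """Mean absolute difference between two arrays."""
    return float(np.mean(np.abs(a - b)))

def cycle_loss(real_a, simulate_b, g_r, g_s):
    """Cycle-consistency loss for one batch.

    g_r maps S -> R (role of G_R(S)); g_s maps R -> S (role of G_S(R)).
    The L1 distance is an assumption, not taken from the paper.
    """
    fake_r = g_r(simulate_b)  # simulated signal mapped into the real domain
    fake_s = g_s(real_a)      # real signal mapped into the simulated domain
    rec_s = g_s(fake_r)       # FakeR mapped back to S -> RecS
    rec_r = g_r(fake_s)       # FakeS mapped back to R -> RecR
    return l1(rec_r, real_a) + l1(rec_s, simulate_b)

# Hypothetical toy generators: g_s is the exact inverse of g_r,
# g_s_biased adds a constant offset so the cycle no longer closes.
def g_r(s):
    return 2.0 * s + 1.0

def g_s(r):
    return (r - 1.0) / 2.0

def g_s_biased(r):
    return (r - 1.0) / 2.0 + 0.1

rng = np.random.default_rng(0)
real_a = rng.normal(size=(4, 64))
simulate_b = rng.normal(size=(4, 64))

loss_exact = cycle_loss(real_a, simulate_b, g_r, g_s)
loss_biased = cycle_loss(real_a, simulate_b, g_r, g_s_biased)
print(loss_exact, loss_biased)
```

With perfectly inverse generators the loss vanishes up to rounding, while any mismatch between the two mappings shows up as a positive penalty, which is the property the cycle term exploits.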

Fig. 1. Diagnostic process

The twin model integrates multiple disciplines and data and consists of the geometric model of its exact structure, the reference of its working environment and historical data. The working environment factors are reflected in the resonance and noise caused by the operation of other equipment. Additionally, historical data provide the basis for the state detection of the reducer through experience diagnosis. The model is outlined in Formula (1).

where MDT represents the DT model, MG expresses the geometric model, E accounts for the impact of environmental factors, and Hdata incorporates historical data and related databases for reference.

The vibration patterns of the integral reducer under various working conditions are analyzed. These vibrations are expressed as a superposition of sinusoidal signals of gear meshing frequency, bearing rotation frequency, frequency doubling, and other excitations created by faults, which can be regarded as frequency and amplitude modulation of the carrier signal. The mathematical model for the gear vibration signal is presented in Formula (2), where only the first-order components are considered.

where k, l, fm, b, and θ represent the order, amplitude modulation component, meshing frequency, frequency modulation and phase, respectively.

Then, K, M and C were assembled into a global matrix. It is crucial for solving the dynamic equation and describing the mechanical equilibrium state of the reducer. This equation is represented by Formula (3).
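The equation itself is missing from this copy. Assuming the standard second-order equation of motion used in transient finite element analysis, and using the symbols defined in the next sentence, it would read:

\[ M\,\mathrm{acc} + C\,v + K\,u = F \qquad (3) \]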

where K, M, and C express the stiffness, mass, and damping matrices, respectively. F represents the external load vector, and v, acc, and u represent the node velocity, acceleration, and displacement, respectively.

The next step is the timing set-up. There are two time integration methods. In the explicit Euler method, the displacements and velocities are updated from the state at the current time step, while in the implicit Euler method, they are updated from the state at the next time step. Although the implicit method requires more computational time, it offers improved numerical stability. The time integral of this method is expressed in Equations (4) and (5).

where u(n+1), v(n+1) and f(t(n+1)) represent the displacement, the velocity and the external load at the n+1 time step, respectively. dt and M−1 represent the time interval and the inverse matrix of the mass, respectively.
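To make the stability trade-off concrete, the sketch below applies both schemes to the dynamic equation with stiffness, mass, and damping matrices. It assumes the textbook backward Euler form, v(n+1) = v(n) + dt·M⁻¹·(f(t(n+1)) − C·v(n+1) − K·u(n+1)) and u(n+1) = u(n) + dt·v(n+1), which may differ in detail from Equations (4) and (5); the single-degree-of-freedom test values are hypothetical.

```python
import numpy as np

def explicit_euler_step(M, C, K, u, v, f_now, dt):
    """Forward Euler: the update uses only the state at step n."""
    acc = np.linalg.solve(M, f_now - C @ v - K @ u)
    return u + dt * v, v + dt * acc

def implicit_euler_step(M, C, K, u, v, f_next, dt):
    """Backward Euler: the update uses the unknown state at step n+1.

    Substituting u(n+1) = u(n) + dt * v(n+1) into the velocity update gives
    (M + dt*C + dt^2*K) v(n+1) = M v(n) + dt * (f(n+1) - K u(n)).
    """
    A = M + dt * C + dt**2 * K
    rhs = M @ v + dt * (f_next - K @ u)
    v_next = np.linalg.solve(A, rhs)
    u_next = u + dt * v_next
    return u_next, v_next

# Hypothetical undamped oscillator with natural frequency 100 rad/s
# and a deliberately coarse time step (omega * dt = 5).
M = np.array([[1.0]])
C = np.array([[0.0]])
K = np.array([[1.0e4]])
omega = 100.0
dt = 0.05
f = np.zeros(1)

u_exp, v_exp = np.array([1.0]), np.array([0.0])
u_imp, v_imp = np.array([1.0]), np.array([0.0])
for _ in range(50):
    u_exp, v_exp = explicit_euler_step(M, C, K, u_exp, v_exp, f, dt)
    u_imp, v_imp = implicit_euler_step(M, C, K, u_imp, v_imp, f, dt)

# Oscillation amplitude sqrt(u^2 + (v/omega)^2) after 50 steps
amp_explicit = float(np.hypot(u_exp[0], v_exp[0] / omega))
amp_implicit = float(np.hypot(u_imp[0], v_imp[0] / omega))
print(amp_explicit, amp_implicit)
```

The explicit amplitude grows without bound at this step size, whereas the implicit one stays bounded (and is numerically damped), at the cost of one linear solve with the combined matrix per step.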

When contact or friction exists, a nonlinear contact analysis method can be used to model the contact state and friction characteristics of the structure. These conditions can be expressed as parameters such as the contact state, contact stiffness and friction coefficient. In Formula (6), the determined friction attenuation coefficient can replicate the vibration characteristics of the gear under various working conditions. An elastic Coulomb model is utilized to capture adhesion and slip accurately. Equation (6) defines the static friction coefficient and the initial adhesion force through the insertion command flow.

where μ1 and D represent the dynamic friction coefficient and measurement data points, respectively.

For the elastic material contact problem, the classical Lagrange multiplier method is applied to establish the nonlinear contact equation. There was a relative displacement between the two teeth, and a contact force was employed. In this case, the contact could be regarded as a rigid constraint. Assuming that the displacement of the tooth is x1 and x2 and that the contact force is p, the following function is established:

where Π represents the Lagrangian function, x1’ and x2’ are the initial positions of A and B, ΠA, ΠB, and ΠC are the functions of the displacement and contact force, ΠD is the equation-constrained function, and λ is the Lagrange multiplier.

Based on the above formulas, a complete finite element analysis model can be established, and the virtual signal can be obtained via transient dynamic analysis.

The CycleGAN is widely used for image style conversion. It consists of two generators (GX,GY) and discriminators (DX,DY), as depicted in Fig. 2. The data created by the generator should closely resemble the target, while the discriminator aims to discern between the generated and real data. The training process is a constant interplay between G and D; once the balance is achieved, the training is considered to be completed. Notably, the transformation from domain S to R is accomplished without source image pairs due to the introduction of a novel cyclic consistency concept.

Fig. 2. Principle of CycleGAN

Building upon CycleGAN, this section introduces a novel FDGAN. To establish global associations, self-attention36 is incorporated. The algorithm and calculation process are illustrated in Fig. 3. Q, K, and V are derived from linear transformations of the input matrix X. Q and K are responsible for semantic associations and the distance matrix through the self-attention algorithm. Associations between pixels are computed by Formula (8). The parameter \(\sqrt{{d}_{k}}\) regulates the magnitude of the inner product between Q and K, ensuring that correlations are not adversely affected by excessively large values. Each row i of the distance matrix, processed through softmax, encapsulates the correlation coefficients of pixel i with respect to all other pixels. Ultimately, the distance matrix is multiplied by V to produce an output matrix that comprehensively captures global correlations. The DG_Blocks proposed by this study encapsulate the characteristics of global connectivity and prevent network degradation. Unlike the original residual network, which is limited to small window feature extraction, this work expands the global perspective of feature extraction and incorporates multiscale invariant features.

Fig. 3. Self-attention algorithm

The output is calculated from Q, K, and V by Formula (8).
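The formula is missing from this copy. Assuming the standard scaled dot-product form, which matches the description of the softmax-normalized distance matrix and the \(\sqrt{{d}_{k}}\) scaling, it would read:

\[ \mathrm{Attention}(Q,K,V)=\mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{{d}_{k}}}\right)V \qquad (8) \]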

where dk is the number of columns of Q and K.

The DG_Blocks consist of 3×3 convolution layers (Conv) and a self-attention layer (SEAT). Q, K and V are obtained by transforming the feature map, and the transposed Q is multiplied by K to obtain the correlation matrix β between pixels. The correlation matrix is then normalized and multiplied by V to obtain the global connection feature map. The residual structure of the FDGAN combines self-attention with inherent global connectivity, addressing degradation and enhancing feature extraction capabilities beyond small windows. This advancement broadens the perspective of the network on feature extraction, promoting multiscale feature invariance. The feature map X (X ϵ [C, W, H]) undergoes convolutional operations to derive the Q, K, and V tensors. Notably, Q and K undergo channel reduction by half, while V remains unchanged (Q, K ϵ [0.5C, W, H], V ϵ [C, W, H]). The tensors are then reshaped into matrices, which optimizes computational efficiency and facilitates robust feature mapping across scales (Q, K ϵ [0.5C, N], V ϵ [C, N], N = W * H). The expression for Y and the specific process are shown in Formula (9) and Fig. 4.

Fig. 4. DG_Blocks structure

Y = X+Conv(SEAT(X))+Conv(Conv(X)) (9)
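A minimal NumPy sketch of Formula (9) follows, mainly to make the tensor shapes explicit. Several details are assumptions rather than the authors' design: Q, K and V are produced by 1×1 convolutions, the attention weights use the scaled softmax form, and normalization and activation layers are omitted. All function and parameter names are hypothetical.

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise convolution. x: [C_in, W, H], w: [C_out, C_in]."""
    return np.einsum("oc,cwh->owh", w, x)

def conv3x3(x, w):
    """Same-padded 3x3 convolution. x: [C_in, W, H], w: [C_out, C_in, 3, 3]."""
    _, W, H = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((w.shape[0], W, H))
    for i in range(3):
        for j in range(3):
            out += np.einsum("oc,cwh->owh", w[:, :, i, j], xp[:, i:i + W, j:j + H])
    return out

def softmax(z):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def seat(x, wq, wk, wv):
    """Self-attention over all N = W*H pixels of a [C, W, H] feature map."""
    C, W, H = x.shape
    n = W * H
    q = conv1x1(x, wq).reshape(-1, n)      # [C/2, N]
    k = conv1x1(x, wk).reshape(-1, n)      # [C/2, N]
    v = conv1x1(x, wv).reshape(C, n)       # [C, N]
    beta = softmax(q.T @ k / np.sqrt(q.shape[0]))  # [N, N], rows sum to 1
    out = v @ beta.T                       # out[:, i] = sum_j beta[i, j] * v[:, j]
    return out.reshape(C, W, H), beta

def dg_block(x, p):
    """Y = X + Conv(SEAT(X)) + Conv(Conv(X)), as in Formula (9)."""
    att, _ = seat(x, p["wq"], p["wk"], p["wv"])
    return x + conv3x3(att, p["w_att"]) + conv3x3(conv3x3(x, p["w1"]), p["w2"])

def init_params(c, rng, scale=0.1):
    """Random weights for illustration only."""
    half = c // 2
    shapes = {"wq": (half, c), "wk": (half, c), "wv": (c, c),
              "w_att": (c, c, 3, 3), "w1": (c, c, 3, 3), "w2": (c, c, 3, 3)}
    return {name: rng.normal(scale=scale, size=s) for name, s in shapes.items()}

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 5, 6))             # C=4, W=5, H=6
params = init_params(4, rng)
y = dg_block(x, params)
_, beta = seat(x, params["wq"], params["wk"], params["wv"])

# With all weights at zero, both branches vanish and the block is the identity.
zero_params = {name: np.zeros_like(w) for name, w in params.items()}
y_identity = dg_block(x, zero_params)
print(y.shape, beta.shape)
```

The identity behavior with zero weights illustrates why the residual path protects against degradation: the block can always fall back to passing X through unchanged.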

The generator model adopts the autoencoder network structure37 combined with skip connections38 to compensate for the deficiency in local feature extraction. This approach enables the extraction of global features with multiscale invariance, enhancing generated image quality. The structure of the generator network is depicted in Fig. 5. The red module represents the image-filling layer, which fills the image content using four pixel matrices surrounding the image. Ci (W*H*C) denotes the current convolution layer, where the output feature map has a width of W, a height of H, and C channels. The orange module corresponds to a convolution operation with a kernel size of 3 and a stride of 2. The brown module relates to a deconvolution operation with a kernel size of 3 and a stride of 1/2, enlarging the convolved feature map. The pink blocks represent four consecutive DG_Blocks structures that provide a global perspective on the extracted features. The green module is the instance normalization (IN) layer, which standardizes the feature map, preventing overfitting. The connection line indicates the fusion of low-level features obtained after convolution in the model with high-level features at the same resolution, enhancing the utilization of feature information in each layer.

Fig. 5. Generator network structure

In addition, three model training enhancements were implemented to bolster the robustness of the FDGAN. First, label smoothing is incorporated: when assigning labels to real and fake samples (Real=1 and Fake=0), each label is replaced with a randomly chosen value between 0.7 and 1.2 for real samples and between 0.0 and 0.3 for fake samples. Second, noise is added to both real and generated images before they are fed to the discriminator, improving its ability to discern images effectively. Finally, the discriminator is optimized more often than the generator. Unlike the 1:1 training ratio for discriminator and generator per epoch in CycleGAN, this approach trains the generator once per epoch and subsequently trains the discriminator five times using the fake images it generates. This strategy allows the generator to produce high-quality outputs efficiently and optimizes training time and resources by stimulating effective discriminator training through multiple iterations.
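The three enhancements can be sketched as small helper functions. The label ranges and the 1:5 update ratio come from the text; the Gaussian noise model, its standard deviation, and all function names are assumptions made for illustration.

```python
import numpy as np

def smooth_labels(n, real, rng):
    """Random soft labels: 0.7-1.2 for real samples, 0.0-0.3 for fake ones."""
    low, high = (0.7, 1.2) if real else (0.0, 0.3)
    return rng.uniform(low, high, size=n)

def add_noise(images, rng, sigma=0.05):
    """Add zero-mean Gaussian noise before the discriminator sees the images.

    The noise type and sigma are assumptions; the paper does not state them.
    """
    return images + rng.normal(0.0, sigma, size=images.shape)

def update_schedule(n_rounds, d_steps=5):
    """One generator update followed by d_steps discriminator updates per round."""
    order = []
    for _ in range(n_rounds):
        order.append("G")
        order.extend(["D"] * d_steps)
    return order

rng = np.random.default_rng(0)
real_labels = smooth_labels(1000, real=True, rng=rng)
fake_labels = smooth_labels(1000, real=False, rng=rng)
batch = rng.normal(size=(8, 1, 32, 32))   # hypothetical image batch
noisy_batch = add_noise(batch, rng)
schedule = update_schedule(2)
print(schedule)
```

In a full training loop, each "G" entry would correspond to one generator optimizer step and each "D" entry to one discriminator step on noisy real and generated images with the smoothed labels.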

The resurgence of neural networks has resulted in significant advancements in artificial intelligence (Mantaras et al.39) and machine learning (Shwartz et al.40), particularly in the areas of convolutional neural networks (CNNs) and vision transformers (Dosovitskiy et al.41, Liu et al.42). In this work, the MobileViG model based on sparse vision graph attention (SVGA) (Kang et al.43) is employed to classify images. SVGA incorporates prior knowledge of the graph structure into reasoning without reshaping the data. As a hybrid CNN-GNN44 architecture, MobileViG utilizes inverted residual blocks, max-relative graph convolution, and feedforward network layers to achieve superior image classification results, providing higher accuracy and lower latency for image classification tasks.

In this study, MobileViG was treated as the validation method, leveraging its knowledge distillation to expedite the training of the FDGAN. Employing a CNN-GNN structure, MobileViG excels in time domain image classification, achieving high accuracy within shorter durations. Through knowledge distillation, MobileViG efficiently transfers the essence of complex models to lightweight counterparts, preserving high accuracy while markedly enhancing training efficiency. By combining the spatial feature extraction ability of CNNs with the capacity of GNNs for processing time domain images, the efficiency and precision of time domain image processing are improved. Notably, MobileViG conserves computational resources while maintaining robust model performance, offering a practical and efficient solution for image classification tasks.

The JZQ200 reducer is a two-stage gear drive mechanism consisting of an input shaft, intermediate shaft, output shaft, bearings and a gearbox. A visual representation of the virtual model and the structure of the gear transmission section are provided in Fig. 6.

Fig. 6. Schematic diagram of gear mechanism

The obtained specific values can be found in Table 1.

Additionally, the relevant parameters associated with the reducer are presented in Table 2.

The test bench is composed of a motor (YE2-90L-4), reducer (JZQ200), magnetic powder brake (FZY400J), PLC, controller (HD800), frequency converter (FS1000-2R2G/4P) and cooling pump (DB-12A-40W), as shown in Fig. 7. The data were collected with B&K equipment, the sampling frequency was 25.6 kHz, and the sampling time was 1 s.

Fig. 7. Test bench

In this experiment, four working conditions were used to test the performance of the reducer. The first is a fault-free constant-speed test, where the reducer ran smoothly at 120 rpm. The second is a fault-free variable-speed test, where the input speed gradually increased over time. The third is a constant-speed test at 120 rpm with a missing tooth on the high-speed shaft. The fourth is a variable-speed test with the missing-tooth fault, where the speed increased linearly over time. Although the reducer works under many working conditions, these four were used in this experiment to represent the issues likely to be encountered in daily operations. The faulty and normal states were also distinguished in the experiment. The experimental scheme is shown in Fig. 8.

Fig. 8. Working condition design

This section presents simulated and real domain signals generated under four different working conditions. A high degree of similarity between the signals is illustrated in Fig. 9. Furthermore, a comparison of the frequency signals is shown in Fig. 10. The fault can be accurately judged by analyzing the components of the frequency domain and their waveforms.

Fig. 9. Comparison of the simulated and real time domain signals

Fig. 10. Comparison of the simulated and real frequency domain signals

A comparison of the numerical and experimental vibration time domain signals is shown in Fig. 9. Figure 9a shows a comparison of the faultless constant-speed conditions, demonstrating the periodic vibration changes and the state during meshing. The meshing process was characterized by high amplitude. The captured vibration data span a duration of 0.2 s at a speed of 120 rpm. Within this interval, during which the input shaft rotated through 144 degrees, the amplitude generated by the gear was inconspicuous, approximately 2 m/s². Moreover, there is a high degree of similarity between the simulated and real signals. Figure 9b presents a comparison of the faultless variable speed conditions. After 0.2 s, the speed linearly increased from 0 to 120 rpm. The amplitude was small as a result of the low initial speed. Figure 9c displays a comparison of the fault constant-speed conditions. Under identical conditions, the signal distribution remained consistent with that in Fig. 9a. However, the presence of the fault amplified the amplitude, increasing it to 7.5 m/s². Figure 9d provides a comparison of the fault variable speeds. Similar to Fig. 9b, the feature distribution remained consistent. From the changes in amplitude, the presence of a fault can be determined.
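The 144-degree rotation quoted for Fig. 9a follows directly from the stated speed and window length:

\[ 120\ \tfrac{\text{r}}{\text{min}} \times \frac{360^{\circ}}{1\ \text{r}} \times \frac{1\ \text{min}}{60\ \text{s}} \times 0.2\ \text{s} = 144^{\circ} \]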

The time domain can identify initial faults, while the spectrum can be used to determine the location and composition of faults. The time domain signal was transformed into a spectrum through fast Fourier transform (FFT), and the frequency spectrum was also analyzed in Section "Test bench" with Table 2 to validate the DT model accuracy. First, the faultless and fault constant speeds are compared in Fig. 10a. The 1550–1570 Hz (fm - fe, fm + fe) spectrum diagram can be observed on both sides of the 1562 Hz point. The spectrum reveals that the frequency conversion band has a low amplitude and a relatively flat distribution. It is suggested that the reducer has a concentrated defect. Introducing the missing tooth fault intensifies the amplitude of the reducer at the same frequency. A comparison between the faultless and fault variable speed conditions is shown in Fig. 10c. The amplitude under the fault condition is significantly greater. Finally, a comparison of the simulated and real signals under variable speeds is displayed in Figs. 10b and d. The analysis focuses on the midfrequency band, where the meshing frequency and the frequency doubling are clearly visible. By comparing the time and frequency domains of the simulated and real signals, it can be preliminarily shown that applying finite element analysis to DT virtual modeling can yield better results.
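The sideband pattern described above can be reproduced on a synthetic signal. The sketch below uses the sampling frequency and sampling time reported for the test bench (25.6 kHz, 1 s); the meshing frequency of 1560 Hz, the modulating frequency of 10 Hz, and the modulation depth are hypothetical round values chosen for illustration, not measurements from the paper.

```python
import numpy as np

fs = 25_600        # sampling frequency of the test bench, Hz
duration = 1.0     # sampling time, s
f_m = 1560.0       # hypothetical meshing (carrier) frequency, Hz
f_e = 10.0         # hypothetical modulating frequency, Hz

n = int(fs * duration)
t = np.arange(n) / fs

# Amplitude-modulated carrier: sidebands appear at f_m - f_e and f_m + f_e
signal = (1.0 + 0.5 * np.cos(2 * np.pi * f_e * t)) * np.cos(2 * np.pi * f_m * t)

# Single-sided amplitude spectrum via the real-input FFT
spectrum = np.abs(np.fft.rfft(signal)) * 2 / n
freqs = np.fft.rfftfreq(n, d=1 / fs)

# The three strongest spectral lines, sorted by frequency
top3 = np.sort(freqs[np.argsort(spectrum)[-3:]])
print(top3)
```

A 1 s record at this sampling rate gives a frequency resolution of 1 Hz, which is fine enough to separate a meshing line from sidebands spaced about 10 Hz away.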

Mapping is extremely important in DTs because it enables interactions between virtual systems and physical entities. In this section, the CycleGAN framework is further investigated and improved to achieve more favorable outcomes with reduced losses. Table 3 shows the loss values obtained with the FDGAN and CycleGAN, compared under the same set of tests.

The loss function consisted of six parts, including two generator losses, G_R(S) and G_S(R), and two discriminator losses, D_R and D_S, which map between the S and R domains. By comparing the generator loss of the two methods, the loss of FDGAN is less than 0.5, indicating better characteristics compared to CycleGAN. The network structure of the generator is improved by adding self-attention with global connection characteristics to enhance mapping efficiency. This both improved generation accuracy and reduced the discriminator workload, thereby reducing discriminator loss. To verify the characteristics of the mapped image, CycleR(S) and CycleS(R) were generated by mapping the image back to the original domain through the generator and comparing it with the original image. A lower loss usually indicates a better effect.

To further compare and analyze the advantages of the improved generator, the mapping effects of the FDGAN and CycleGAN are compared. This comparison is divided into two parts, where the input time domain signals Simulate_B and Real_A are mapped. It is evident that the proposed method can carry out feature extraction at a deeper level during the mapping process. This allows for the retention of original features and the preservation of the crucial signal-to-noise ratio of the time domain. The amplitude change in the time domain plays a vital role in the diagnosis results. The proposed method can enable more direct observation of the mapped results and facilitate further analysis. This approach broadens the methods for obtaining data. Fig. 11 shows the effectiveness of these methods in generating and mapping data by comparing these methods under the same set of tests.

Fig. 11. CycleGAN and FDGAN mapping results. The black, red, and blue lines represent the original signals, FDGAN-generated signals, and CycleGAN-generated signals, respectively.

Then, the FDGAN and CycleGAN mapping results were employed to construct a dataset in the CIFAR-10 format. Following a 1:1 ratio of real to mapped data, the dataset was divided into four scenarios for ablation experiments. The first scenario utilized the original CycleGAN model. The second scenario, FD1, involved improved model training methods. The third scenario, FD2, featured a generator with a new architecture. The final scenario incorporated DG_Blocks into the FDGAN model, as illustrated in Fig. 12. The results reveal several key insights. The CycleGAN-based mapping method proved challenging to train, leading to difficulties in achieving convergence and acceptable accuracy. With an enhanced training method, the accuracy of the FD1 model increased to 84%. FD2 further improved the classification accuracy and convergence rate. Ultimately, the FDGAN approach achieved 94% accuracy on the mapped dataset. These findings demonstrate the improvements in training efficiency resulting from the FDGAN.

Fig. 12. Ablation experiment

This section assesses the viability of FDGAN for DT mapping. Dataset 1 contained 300 training samples (200 mapped and 100 original images), and Dataset 2 contained 400 training samples (300 mapped and 100 original images). Datasets 3, 4, and 5 contained 800, 2000, and 4400 training samples, respectively, with the same number of test samples (400). From the composition outlined in Table 4, it can be observed that higher mapping accuracy was attained with a smaller proportion of original data.

First, the performance of MobileViG was compared with several established methods (Hugo Touvron et al.48, Sachin50, Liu et al.51) to validate its feasibility in classification. The comparison under Dataset 1 is shown in Table 5.

Then, the MobileViG classifier was trained with Datasets 1–5 of varying sample sizes and compared with CNN-ResNet. After training for 100 generations, this work achieved a training accuracy of 78.8%, while training for 300 generations resulted in an accuracy of 88.4%. When the training sample size increased to 800 and the test sample size was set to 400, the accuracy reached 99.5%. These results demonstrate that FDGAN can generate higher-precision maps and achieve higher classification accuracy than traditional methods. Thus, the simulation data generated by the DT model exhibit the same distribution and fault characteristics as the measured data. These findings may highlight the importance and potential of DTs in enhancing the accuracy and effectiveness of DL algorithms. The comparison results are shown in Table 6.

Finally, to further illustrate the efficiency of the proposed method, the results are also compared with those of ML1D-GAN, ACGAN, CACGAN, CNN-ResNet and transfer learning (refs. 45, 46, 34; Shao et al.47). As shown in Table 7, the proposed method achieves a classification accuracy of 99.5%, which surpasses the accuracies of the other methods. When using 800 sets of measurements, ML1D-GAN demonstrates a fault diagnosis accuracy of 97.79%, and transfer learning also exhibits high accuracy (99.37%); however, it requires a substantial quantity of pretraining data. In conclusion, the method proposed in this paper offers the advantage of obtaining higher accuracy with a smaller quantity of data. This might highlight its potential for practical application in fault diagnosis, where limited data availability is often a challenge.

By adding DG_Blocks, improving the generator network architecture, and updating the FDGAN training method, this paper fused DT and DL technology to overcome the limitations of traditional systems, such as the low efficiency of virtual-real mapping and the large quantity of training data required.

A DT fault diagnosis system incorporating FDGAN was then established for the reducer experiment, and the feasibility and accuracy of fault diagnosis were verified experimentally. Finally, the DT model was tested using the MobileViG classifier, and the initial goal of high precision with few samples was achieved. Overall, this research demonstrates the advantages of combining DTs and DL in fault diagnosis, overcomes limitations of traditional systems, and encourages further research on related applications. The main conclusions are as follows:

Finite element analysis can be used as a substitute for dynamic modeling to obtain virtual signals with high similarity to real-world system behavior.

The proposed FDGAN model exhibits excellent performance and competitiveness in virtual-real mapping, with an error within 1% and an accuracy of 99.5%, laying a reliable foundation for its application to fault diagnosis.

Validation with the MobileViG classifier also confirmed the efficiency of the proposed model with fewer samples: the proposed model reached 99.5% accuracy with 800 samples, whereas the traditional model reached 98.3% with 4800 samples. These results show that FDGAN is well suited to DT virtual-real mapping.
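The sample-efficiency claim in the last conclusion can be checked with simple arithmetic. The sketch below uses only the two figures quoted there; the variable names are illustrative assumptions.

```python
# Figures quoted in the conclusion above.
proposed = {"samples": 800, "accuracy_pct": 99.5}
traditional = {"samples": 4800, "accuracy_pct": 98.3}

# How many times more samples the traditional model used.
sample_ratio = traditional["samples"] / proposed["samples"]

# Accuracy difference in percentage points, rounded to one decimal place.
accuracy_gain = round(proposed["accuracy_pct"] - traditional["accuracy_pct"], 1)

print(f"sample ratio: {sample_ratio:.0f}x, accuracy gain: {accuracy_gain} points")
```

On these figures the proposed model uses one sixth of the samples while reporting an accuracy 1.2 percentage points higher.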

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Number of teeth

Normal module

Tooth profile angle

Addendum coefficient

Radial clearance coefficient

Center to center spacing

Tooth width

Gear accuracy class

Order

Amplitude modulation component

Meshing frequency

Frequency modulation

Phase

The gear shaft speed

Displacement

Dynamic friction coefficient

Measurement data points

Lagrange multiplier

Lagrange daily quantity

The number of columns

Historical data and related databases

Stiffness matrices

Mass matrices

Damping matrices

The DT model

The geometric model

The impact of environmental factors

Acceleration

Grieves, M. & Vickers, J. Digital twin: Mitigating unpredictable, undesirable emergent behavior in complex systems. In Transdisciplinary Perspectives on Complex Systems: New Findings and Approaches; Springer: Cham, Switzerland, 2017; pp. 85–113.

Tuegel, E. J., Ingraffea, A. R., Eason, T. G. & Spottswood, S. M. Reengineering aircraft structural life prediction using a digital twin. Int. J. Aerosp. Eng. 2011, 1–14 (2011).

Tao, F. & Zhang, M. Digital twin shop-floor: A new shop-floor paradigm towards smart manufacturing. IEEE Access 5, 20418–20427 (2017).

Schleich, B., Anwer, N., Mathieu, L. & Wartzack, S. Shaping the digital twin for design and production engineering. CIRP Ann. 66(1), 141–144 (2017).

Tao, F. et al. Digital twin-driven product design framework. Int. J. Prod. Res. 56, 1–19 (2018).

Karve, P. M., Guo, Y., Kapusuzoglu, B., Mahadevan, S. & Haile, M. A. DT approach for damage-tolerant mission planning under uncertainty. Eng. Fract. Mech. 225, 106766 (2020).

Ritto, T. & Rochinha, F. Digital twin, physics-based model, and machine learning applied to damage detection in structures. Mech. Syst. Signal Process. 155, 107614 (2021).

Lechler, T., Fuchs, J., & Sjarov, M., et al. Introduction of a comprehensive structure model for the digital twin in manufacturing. In Proceedings of the 2020 IEEE 25th International Conference on Emerging Technology Fact Automation, 1773–80 (2020).

Rasheed, A., San, O. & Kvamsdal, T. Digital twin: Values, challenges and enablers from a modeling perspective. IEEE Access 8, 21980–22012 (2020).

Bordeleau, F., Combemale, B., Eramo, R., et al. Towards model-driven digital twin engineering: current opportunities and future challenges. In Proceedings of the 2020 International Conference on System Modelling Management, 43–54 (2020).

Sun, C., Ma, M., Zhao, Z. & Chen, X. Sparse deep stacking network for fault diagnosis of motor. IEEE Trans. Ind. Inform. 14(7), 3261–3270 (2018).

Shao, S., McAleer, S., Yan, R. & Baldi, P. Highly accurate machine fault diagnosis using deep transfer learning. IEEE Trans. Ind. Inform. 15(4), 2446–2455 (2019).

Yu, J. & Zhou, X. One-dimensional residual convolutional autoencoder based feature learning for gearbox fault diagnosis. IEEE Trans. Ind. Inform. 16(10), 6347–6358 (2020).

Qiao, X. et al. Study on transient contact performance of meshing transmission of cycloid gear and needle wheel in RV reducer. J. Eng. 14, 1001–1004 (2020).

Wang, H., Shi, Z.-Y., Yu, B. & Xu, H. Transmission performance analysis of RV reducers influenced by profile modification and load. Appl. Sci. 9(19), 4099 (2019).

Xie, Y. H., Xu, L. X. & Deng, Y. Q. A dynamic approach for evaluating the moment rigidity and rotation precision of a bearing-planetary frame rotor system used in RV reducer. Mech. Mach. Theory 173, 104851 (2022).

Xu, L. X., Chen, B. K. & Li, C. Y. Dynamic modelling and contact analysis of bearing-cycloid-pinwheel transmission mechanisms used in joint rotate vector reducers. Mech. Mach. Theory 137, 432–458 (2019).

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. Generative adversarial nets. In Proceedings of the 2014 Conference on Advances in Neural Information Processing Systems 27. Montreal, Canada: Curran Associates, Inc., 2672–2680 (2014).

Mirza, M. & Osindero, S. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784 (2014).

Arjovsky, M., Chintala, S., & Bottou, L. Wasserstein GAN. arXiv preprint arXiv:1701.07875 (2017).

Radford, A., Metz, L., & Chintala, S. Unsupervised representation learning with deep convolutional generative adversarial networks. Comput. Sci. (2015).

Chen, X., Duan, Y., Houthooft, R., Schulman, J., Sutskever, I., & Abbeel, P. InfoGAN: Interpretable representation learning by information maximizing generative adversarial nets. In Proceedings of the 2016 Neural Information Processing Systems. Barcelona, Spain: Department of Information Technology IMEC, 2172–2180 (2016).

Yu, L.T., Zhang, W.N., Wang, J., & Yu, Y. SeqGAN: Sequence generative adversarial nets with policy gradient. arXiv preprint arXiv:1609.05473 (2016).

Zhu, J.Y., Park, T., Isola, P., et al. Unpaired image-to-image translation using cycle-consistent adversarial networks. IEEE, (2017). https://doi.org/10.1109/ICCV.2017.244.

Choi, Y., Choi, M., Kim, M., et al. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE (2018). https://doi.org/10.1109/CVPR.2018.00916.

Jia, W., Wang, W. & Zhang, Z. From simple digital twin to complex digital twin Part I: A novel modeling method for multi-scale and multi-scenario digital twin. Adv. Eng. Inform. 53, 101706 (2022).

Zhang, H., Qi, Q., Ji, W. & Tao, F. An update method for digital twin multi-dimension models. Robot. Comput.-Integr. Manuf. 80, 102481 (2023).

Zhang, Q., Wei, Y., Liu, Z., Duan, J. & Qin, J. A framework for service-oriented digital twin systems for discrete workshops and its practical case study. Systems 11, 156 (2023).

Sharma, A., Kosasih, E., Zhang, J., Brintrup, A. & Calinescu, A. Digital twins: State of the art theory and practice, challenges, and open research questions. J. Ind. Inf. Integr. 30, 100383 (2022).

Xia, M. et al. Intelligent fault diagnosis of machinery using digital twin-assisted deep transfer learning. Reliab. Eng. Syst. Saf. 215, 107938 (2021).

Wang, Y., Tao, F., Zhang, M., Wang, L. & Zuo, Y. Digital twin enhanced fault prediction for the autoclave with insufficient data. J. Manuf. Syst. 60, 350–359 (2021).

Matulis, M. & Harvey, C. A robot arm digital twin utilizing reinforcement learning. Comput. Gr. 95, 106–114 (2021).

He, B., Cao, X. & Hua, Y. Data fusion-based sustainable digital twin system of intelligent detection robotics. J. Clean. Prod. 280, 124181 (2021).

Song, Z., Shi, H., Bai, X., et al. Digital twin-assisted fault diagnosis system for robot joints with insufficient data. J. Field Robot. (2023).

Munir, M., Avery, W., & Marculescu, R. MobileViG: Graph-based sparse attention for mobile vision applications. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 2211–2219 (2023). https://doi.org/10.1109/CVPRW59228.2023.00215.

Ren, Y., Liang, K., Shang, Y., et al. MulOER-SAN: 2-layer multi-objective framework for exercise recommendation with self-attention networks. Knowl. Based Syst. (2023).

Yang, S., Kong, X., Wang, Q., et al. Deep multiple autoencoder with attention mechanism network: A dynamic domain adaptation method for rotary machine fault diagnosis under different working conditions. Knowl. Based Syst. 249 (2022).

Alaraimi, S. et al. Transfer learning networks with skip connections for classification of brain tumors. Int. J. Imag. Syst. Technol. https://doi.org/10.1002/ima.22546 (2021).

De Mantaras, R. L. & Poole, D. Proceedings of the tenth conference on uncertainty in artificial intelligence. Indian J. Dermatol. Venereol. Leprol. https://doi.org/10.1057/ejis.1994.27 (2013).

Shalev-Shwartz, S., & Ben-David, S. Understanding machine learning. 2014. https://doi.org/10.1017/CBO9781107298019.025.

Dosovitskiy, A. et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929 (2020).

Liu, Z., Mao, H., Wu, C. Y., Feichtenhofer, C., Darrell, T., & Xie, S. A convnet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11976–11986 (2022).

Kang, G.C., Park, J., Lee, H., et al. DialGraph: Sparse graph learning networks for visual dialog. https://doi.org/10.48550/arXiv.2004.06698 (2020).

Mcdonnell, K., Abram, F. & Howley, E. Application of a novel hybrid CNN-GNN for peptide ion encoding. J. Proteome Res.22, 323–333. https://doi.org/10.1021/acs.jproteome.2c00234 (2022).

Dixit, S., Verma, N. K. & Ghosh, A. K. Intelligent fault diagnosis of rotary machines: Conditional auxiliary classifier GAN coupled with meta learning using limited data. IEEE Trans. Instrum. Meas. 70, 1–11 (2021).

Guo, Q., Li, Y., Song, Y., Wang, D. & Chen, W. Intelligent fault diagnosis method based on full 1-D convolutional generative adversarial network. IEEE Trans. Ind. Inform. 16(3), 2044–2053 (2020).

Shao, S., Wang, P. & Yan, R. Generative adversarial networks for data augmentation in machine fault diagnosis. Comput. Ind. 106, 85–93 (2019).

Touvron, H., Cord, M., Douze, M., Massa, F., Sablayrolles, A., & Jégou, H. Training data-efficient image transformers & distillation through attention. In International Conference on Machine Learning 10347–10357 (2021).

Mehta, S., & Rastegari, M. Separable self-attention for mobile vision transformers. arXiv preprint arXiv:2206.02680 (2022).

Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., & Guo, B. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 10012–10022 (2021).

This work is financially supported by the S&T Program of Hebei (22282203Z), the Natural Science Foundation of Hebei Province (E2022209086),

College of Mechanical Engineering, North China University of Science and Technology, Tangshan, 063210, Hebei, China

Weimin Liu, Bin Han, Aiyun Zheng, Zhi Zheng & Shikui Jia

CRRC Tangshan Co., Ltd, Tangshan, 064000, Hebei, China

Shujun Chen

Author Contributions: Conceptualization, W.L. and B.H.; methodology, B.H. and A.Z.; software, W.L. and B.H.; formal analysis, W.L.; writing (original draft preparation), W.L., B.H., Z.Z., and S.C.; writing (review and editing), W.L., Z.Z., and A.Z.; supervision, W.L. and A.Z.; project administration, W.L. and Z.Z.; funding acquisition, W.L., A.Z., and S.J. All authors have read and agreed to the published version of the manuscript.

Correspondence to Aiyun Zheng.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Liu, W., Han, B., Zheng, A. et al. Fault diagnosis of reducers based on digital twins and deep learning. Sci Rep 14, 24406 (2024). https://doi.org/10.1038/s41598-024-75112-x

Received: 03 March 2024

Accepted: 01 October 2024

Published: 17 October 2024

DOI: https://doi.org/10.1038/s41598-024-75112-x
