To address inaccurate fault diagnosis in rotating-machinery bearings, which stems from strong noise interference in vibration signals and from feature extraction that depends on manual design, this paper proposes a physics-informed attention Transformer model that integrates bearing dynamics mechanisms with the Kolmogorov-Arnold Network (KAN). First, based on Hertz contact theory, the characteristic fault frequency equations for the bearing inner ring, outer ring, and rolling element are derived, and a frequency-domain mask-guided attention mechanism is constructed to focus the model on fault-sensitive frequency bands. Second, a KAN-Transformer architecture is designed that adaptively analyzes the time-frequency characteristics of vibration signals through the multi-scale decomposition ability of KAN and models long-range dependencies through the Transformer's global attention. Finally, the proposed model is evaluated on the Case Western Reserve University (CWRU) bearing dataset. Experiments show that the model achieves an accuracy of 99.75%, significantly better than traditional models. The approach provides a high-precision, robust, and physically interpretable solution for bearing fault diagnosis in rotating machinery.
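
To make the idea of a frequency-domain mask built from characteristic fault frequencies concrete, the following is a minimal sketch, not the paper's implementation. It uses the classical kinematic expressions for the outer-ring (BPFO), inner-ring (BPFI), and rolling-element (BSF) frequencies and marks FFT bins near their harmonics. The function names, the bearing geometry, the shaft speed, the sampling rate, the number of harmonics, and the band half-width are all illustrative assumptions that do not come from the paper.

```python
import numpy as np


def bearing_fault_frequencies(shaft_hz, n_balls, ball_d, pitch_d, contact_angle_rad=0.0):
    """Classical kinematic fault frequencies (Hz) of a rolling bearing.

    Assumes a stationary outer ring and pure rolling (no slip).
    """
    ratio = (ball_d / pitch_d) * np.cos(contact_angle_rad)
    return {
        "BPFO": 0.5 * n_balls * shaft_hz * (1.0 - ratio),   # outer-ring defect
        "BPFI": 0.5 * n_balls * shaft_hz * (1.0 + ratio),   # inner-ring defect
        "BSF": 0.5 * (pitch_d / ball_d) * shaft_hz * (1.0 - ratio ** 2),  # ball spin
    }


def frequency_mask(freqs_hz, fault_hz, n_harmonics=3, half_width_hz=5.0):
    """Binary mask that is 1.0 for bins within half_width_hz of any fault harmonic."""
    mask = np.zeros_like(freqs_hz, dtype=float)
    for f0 in fault_hz:
        for k in range(1, n_harmonics + 1):
            mask[np.abs(freqs_hz - k * f0) <= half_width_hz] = 1.0
    return mask


# Hypothetical example values (assumptions, not taken from the paper).
shaft_hz = 1797.0 / 60.0                      # shaft speed in Hz
faults = bearing_fault_frequencies(
    shaft_hz, n_balls=9, ball_d=0.3126, pitch_d=1.537
)
freqs_hz = np.fft.rfftfreq(2048, d=1.0 / 12000.0)  # bin centers for a 2048-point FFT
mask = frequency_mask(freqs_hz, faults.values())
```

In a model, such a mask could, for example, weight spectral features or be added as a bias to attention scores so that fault-sensitive bands receive more attention; how the paper's mechanism actually applies the mask is not specified in this abstract.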