It is well known that the perceived positions of audio and video stimuli are not independent. In general, the video stimulus dominates the perceived position when the position offset between audio and video is small. Most previous work has focused on natural listening conditions and on position offsets between audio and video in the horizontal plane. There is little research concerning offsets in the vertical direction or artificial, auralized sound environments. Among the different approaches to the auralization of spatial audio, binaural reproduction is especially interesting, as it offers proper perception of the direction, distance, and elevation of sound sources at moderate cost. This article addresses the question of whether the thresholds of perceptual fusion of audio and video stimuli are the same in binaural reproduction systems as in natural listening conditions. To estimate the influence of audio-visual discrepancy on vertical sound source localization, two experiments were designed. The test methods were optimized to improve usability and minimize rating errors. Both experiments yielded psychometric functions of intersensory bias for competing audio and visual stimuli. For binaural reproduction, the results showed an effect of similar magnitude in both the vertical and horizontal planes, which is in good agreement with results from other studies in natural environments.