This paper deeply discusses the end-to-end optimized video codec technology,which can realize efficient compression and high-quality transmission of video data by integrating video capture,encoding, transmission and decoding. With the development of deep learning technology,end-to-end video codec methods based on deep learning have gradually emerged. Relying on the powerful learning ability of neural network models, they have surpassed traditional video codec algorithms. This paper first introduces the background and significance of end-to-end video codec technology,and then elaborates its key technologies,including video compression algorithm based on deep learning,motion compensation of multi-scale features,and auxiliary prediction frame generation of multi-reference frames. Through comparative experiments,this paper demonstrates the significant advantages of end-to-end optimized video codec technology in improving compression efficiency,reducing transmission delay,and improving video quality. At the same time,the current research results are summarized, and the future research direction is prospeced.