Papers·어제

Lite3R: Sparse Linear Attention + FP8 QAT 로 3D Transformer 추론 속도 1.7~2.0x, 메모리 1.9~2.4x 개선

Peking University 팀이 3D reconstruction Transformer 의 효율성을 높이는 Lite3R 을 제안했습니다. Dense attention 을 Sparse Linear Attention 으로 대체해 token-mixing 비용을 줄이고, FP8-aware QAT 로 backbone 은 고정한 채 linear projection 만 학습해 저정밀도에서도 geometry consistency 를 유지합니다. VGGT, DA3-Large 백본 기준 BlendedMVS, DTU64 에서 latency 1.7~2.0x, memory 1.9~2.4x 감소하면서 재구성 품질은 거의 유지했네요. 코드와 웹사이트도 공개 중입니다.

#3d-reconstruction
#transformer
#efficiency
#quantization
#peking-university

Peking University

원문 보기 →

Lite3R: Sparse Linear Attention + FP8 QAT 로 3D Transformer 추론 속도 1.7~2.0x, 메모리 1.9~2.4x 개선

Comments