This project investigates how image enhancement techniques affect a text-to-video retrieval task. Using the MSR-VTT dataset and the CLIP model in a zero-shot setting, we examine whether enhanced images make retrieval more accurate, with the aim of clarifying the relationship between image quality and retrieval accuracy.
The main objective is to measure the influence of different image enhancement algorithms on text-to-video retrieval. We enhance image quality before the retrieval step and compare the results against a non-enhanced baseline. The challenge lies in quantifying the change in retrieval accuracy that each enhancement causes and in interpreting what those changes imply.
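
The comparison described above can be sketched as follows. This is a minimal illustration, not the project's actual evaluation code: it assumes retrieval accuracy is reported as Recall@K (the project text does not name a metric), that a text-to-video similarity matrix has already been computed for each setting, and that the ground-truth video for query `i` sits at index `i`. The function names `recall_at_k` and `recall_delta` are hypothetical.

```python
import numpy as np


def recall_at_k(similarity: np.ndarray, k: int) -> float:
    """Fraction of text queries whose matching video ranks in the top k.

    Assumption: similarity[i, j] scores text query i against video j,
    and the correct video for query i is video i (the diagonal).
    """
    # Score of the correct video for each query, shaped for broadcasting.
    correct = np.diag(similarity)[:, None]
    # Zero-based rank: how many videos score strictly higher than the correct one.
    ranks = (similarity > correct).sum(axis=1)
    return float((ranks < k).mean())


def recall_delta(sim_baseline: np.ndarray, sim_enhanced: np.ndarray, ks=(1, 5, 10)) -> dict:
    """Change in Recall@K (enhanced minus baseline) for each K in ks."""
    return {
        k: recall_at_k(sim_enhanced, k) - recall_at_k(sim_baseline, k)
        for k in ks
    }
```

A positive value in the returned dictionary would suggest that the enhancement helped at that K, and a negative value that it hurt. Whether a given difference is meaningful still depends on the size of the test set.
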
Name |
---|
Parsa Haghighi |
Zahra Dehghanian |
Elham Abolhassani |
M. Taha Teimuri |