Idan Schwartz
L. Wolf
Latest
Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Zero-Shot Video Captioning with Evolving Pseudo-Tokens
AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation
Discriminative Class Tokens for Text-to-Image Diffusion Models
Describing Sets of Images with Textual-PCA
Optimizing Relevance Maps of Vision Transformers Improves Robustness
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
Video and Text Matching with Conditioned Embeddings