Learning Spatiotemporal Frequency-Transformer for Compressed Video Super-Resolution | 0 | 0.34 | 2022 |
Learning Trajectory-Aware Transformer for Video Super-Resolution | 0 | 0.34 | 2022 |
TinyViT: Fast Pretraining Distillation for Small Vision Transformers | 0 | 0.34 | 2022 |
Advancing High-Resolution Video-Language Representation with Large-Scale Video Transcriptions | 0 | 0.34 | 2022 |
AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation | 0 | 0.34 | 2022 |
GRIT-VLP: Grouped Mini-batch Sampling for Efficient Vision and Language Pre-training | 0 | 0.34 | 2022 |
Expanding Language-Image Pretrained Models for General Video Recognition | 0 | 0.34 | 2022 |
MiniViT: Compressing Vision Transformers with Weight Multiplexing | 0 | 0.34 | 2022 |
Probing Inter-modality: Visual Parsing with Self-Attention for Vision-and-Language Pre-training | 0 | 0.34 | 2021 |
Learning Fine-Grained Motion Embedding for Landscape Animation | 0 | 0.34 | 2021 |
Learning Conditional Knowledge Distillation for Degraded-Reference Image Quality Assessment | 0 | 0.34 | 2021 |
Food and Ingredient Joint Learning for Fine-Grained Recognition | 1 | 0.39 | 2021 |
LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search | 0 | 0.34 | 2021 |
MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding | 0 | 0.34 | 2021 |
Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning | 0 | 0.34 | 2021 |
A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation | 0 | 0.34 | 2021 |
Improving Visual Quality of Image Synthesis by A Token-based Generator with Transformers | 0 | 0.34 | 2021 |
Learning Rich Part Hierarchies with Progressive Attention Networks for Fine-Grained Image Recognition | 10 | 0.52 | 2020 |
DGCN: Dynamic Graph Convolutional Network For Efficient Multi-Person Pose Estimation | 0 | 0.34 | 2020 |
Learning Semantic-aware Normalization for Generative Adversarial Networks | 0 | 0.34 | 2020 |
NTIRE 2020 Challenge on Perceptual Extreme Super-Resolution: Methods and Results | 1 | 0.36 | 2020 |
Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search | 0 | 0.34 | 2020 |
360-Indoor: Towards Learning Real-World Objects in 360° Indoor Equirectangular Images | 0 | 0.34 | 2020 |
Aesthetic-Aware Image Style Transfer | 0 | 0.34 | 2020 |
Learning Texture Transformer Network For Image Super-Resolution | 6 | 0.45 | 2020 |
Looking For The Devil In The Details: Learning Trilinear Attention Sampling Network For Fine-Grained Image Recognition | 14 | 0.48 | 2019 |
Emotion Reinforced Visual Storytelling | 2 | 0.41 | 2019 |
Learning Deep Bilinear Transformation for Fine-grained Image Representation | 1 | 0.35 | 2019 |
Exploiting hierarchical visual features for visual question answering | 1 | 0.40 | 2019 |
Multi-source Multi-level Attention Networks for Visual Question Answering | 1 | 0.35 | 2019 |
Neural Storyboard Artist: Visualizing Stories with Coherent Image Sequences | 1 | 0.35 | 2019 |
From Words to Sentences: A Progressive Learning Approach for Zero-resource Machine Translation with Visual Pivots | 0 | 0.34 | 2019 |
Show, Reward, and Tell: Adversarial Visual Story Generation | 1 | 0.37 | 2019 |
AI Coach: Deep Human Pose Estimation and Analysis for Personalized Athletic Training Assistance | 0 | 0.34 | 2019 |
Learning Pyramid-Context Encoder Network For High-Quality Image Inpainting | 8 | 0.48 | 2019 |
Beyond Narrative Description: Generating Poetry from Images by Multi-Adversarial Training | 4 | 0.39 | 2018 |
What Dress Fits Me Best?: Fashion Recommendation on the Clothing Style for Personal Body Shape | 8 | 0.43 | 2018 |
DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Networks | 2 | 0.39 | 2018 |
Show, Reward and Tell: Automatic Generation of Narrative Paragraph From Photo Stream by Adversarial Training | 4 | 0.42 | 2018 |
Image Inspired Poetry Generation in XiaoIce | 4 | 0.45 | 2018 |
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions | 3 | 0.39 | 2018 |
Self-view Grounding Given a Narrated 360° Video | 1 | 0.35 | 2018 |
3D Human Body Reshaping with Anthropometric Modeling | 0 | 0.34 | 2017 |
Searching Personal Photos on the Phone with Instant Visual Query Suggestion and Joint Text-Image Hashing | 0 | 0.34 | 2017 |
Show, Adapt and Tell: Adversarial Training of Cross-Domain Image Captioner | 19 | 0.66 | 2017 |
Let Your Photos Talk: Generating Narrative Paragraph for Photo Stream via Bidirectional Attention Recurrent Neural Networks | 13 | 0.48 | 2017 |
Storytelling of Photo Stream with Bidirectional Multi-thread Recurrent Neural Network | 2 | 0.40 | 2016 |
Beyond Object Recognition: Visual Sentiment Analysis with Deep Coupled Adjective and Noun Neural Networks | 17 | 0.58 | 2016 |
Relaxing From Vocabulary: Robust Weakly-Supervised Deep Learning for Vocabulary-Free Image Tagging | 15 | 0.56 | 2015 |
Tagging Personal Photos with Transfer Deep Learning | 9 | 0.50 | 2015 |