1 | 1 | Crossed-Time Delay Neural Network for Speaker Recognition
5 | 1 | An Asymmetric Two-sided Penalty Term for CT-GAN
6 | 1 | Fast Discrete Matrix Factorization Hashing for Large-Scale Cross-Modal Retrieval
7 | 1 | Fast Optimal Transport Artistic Style Transfer
8 | 1 | Stacked Sparse Autoencoder for Audio Object Coding
9 | 1 | A collaborative multi-modal fusion method based on random variational information bottleneck for gesture recognition
17 | 1 | Frame Aggregation and Multi-Modal Fusion Framework for Video-Based Person Recognition
21 | 1 | An Adaptive Face-Iris Multimodal Identification System based on Quality Assessment Network
24 | 1 | Thermal Face Recognition based on Multi-Scale Image Synthesis
26 | 1 | Contrastive Learning in Frequency Domain for Non-I.I.D. Image Classification
29 | 1 | Group Activity Recognition by Exploiting Position Distribution and Appearance Relation
31 | 1 | Multi-branch and Multi-scale Attention Learning for Fine-Grained Visual Categorization
35 | 1 | Dense Attention-guided Network for Boundary-aware Salient Object Detection
36 | 1 | MSCANet: Adaptive Multi-scale Context Aggregation Network for Congested Crowd Counting
38 | 1 | Generative Image Inpainting By Hybrid Contextual Attention Network
41 | 1 | Atypical Lyrics Completion Considering Musical Audio Signals
45 | 1 | Satellite cloud images based tropical cyclones tracking: database and comprehensive study
47 | 1 | Improving Supervised Cross-modal Retrieval with Semantic Graph Embedding
51 | 1 | Confidence-based Global Attention Guided Network for Image Inpainting
55 | 1 | Multi-Task Deep Learning for No-Reference Screen Content Image Quality Assessment
56 | 1 | Image Registration Improved by Generative Adversarial Networks
57 | 1 | Language Person Search with Pair-based Weighting Loss
62 | 1 | Deep 3D Modeling of Human Bodies from Freehand Sketching
65 | 1 | DeepFusion: Deep Ensembles for Domain Independent System Fusion
66 | 1 | Illuminate Low-light Image Via Coarse-to-fine Multi-level Network
69 | 1 | MM-Net: Learning Adaptive Meta-Metric for Few-Shot Biometric Recognition
74 | 1 | A Sentiment Similarity-oriented Attention Model with Multi-task Learning for Text-based Emotion Recognition
76 | 1 | Locating Visual Explanations for Video Question Answering
79 | 1 | Global Cognition and Local Perception Network for Blind Image Deblurring
80 | 1 | Multi-grained Fusion for Conditional Image Retrieval
81 | 1 | A Hybrid Music Recommendation Algorithm Based on Attention Mechanism
85 | 1 | Few-shot Learning With Unlabeled Outlier Exposure
86 | 1 | Fine-Grained Video Deblurring with Event Camera
88 | 1 | Discriminative and Selective Pseudo-Labeling for Domain Adaptation
92 | 1 | Multi-level Gate Feature Aggregation with Spatially Adaptive Batch-Instance Normalization for Semantic Image Synthesis
94 | 1 | Robust Multispectral Pedestrian Detection via Uncertain-Aware Cross-Modal Learning
96 | 1 | Time-dependent Body Gesture Representation for Video Emotion Recognition
97 | 1 | Two-stage Real-time Multi-object Tracking with Candidate Selection
99 | 1 | MusiCoder: A Universal Music-Acoustic Encoder Based on Transformers
102 | 1 | DANet: Deformable Alignment Network for Video Inpainting
103 | 1 | Deep Centralized Cross-modal Retrieval
104 | 1 | Shot Boundary Detection through Multi-stage Deep Convolution Neural Network
106 | 1 | Towards Optimal Multirate Encoding for HTTP Adaptive Streaming
108 | 1 | Tell as You Imagine: Sentence Imageability-aware Image Captioning
117 | 1 | Fast Mode Decision Algorithm for Intra Prediction of the 3rd Generation Audio Video Coding Standard
120 | 1 | Graph Structure Reasoning Network for Face Alignment and Reconstruction
126 | 1 | Game Input with Delay – A Model for the Time to Select a Moving Target with a Mouse
127 | 1 | Unsupervised Temporal Attention Summarization Model for User Created Videos
130 | 1 | Learning from the Negativity: Deep Negative Correlation Meta-Learning for Adversarial Image Classification
135 | 1 | Deep Face Swapping via Cross-Identity Adversarial Training
138 | 1 | Res2-Net: an Enhanced Network for Generalized Nuclear Segmentation in Pathological Images
143 | 1 | Automatic Diagnosis of Glaucoma on Color Fundus Images Using Adaptive Mask Deep Network
144 | 1 | Learning 3D-Craft Generation with Predictive Action Neural Network
145 | 1 | Unsupervised Multi-shot Person Re-identification via Dynamic Bi-directional Normalized Sparse Representation
146 | 1 | Classifier Belief Optimization For Visual Categorization
147 | 1 | Initialize with Mask: For More Efficient Federated Learning
154 | 1 | Fine-grained Generation for Zero-Shot Learning
155 | 1 | Unsupervised Gaze: Exploration of Geometric Constraints for 3D Gaze Estimation
166 | 1 | Fine-grained Image-Text Retrieval via Complementary Feature Learning
184 | 1 | Considering Human Perception and Memory in Interactive Multimedia Retrieval Evaluations
192 | 1 | Median Pooling Grad-CAM: An Efficient Inference Level Visual Explanation for CNN Networks in Remote Sensing Image Classification
195 | 1 | Multi-granularity Recurrent Attention Graph Neural Network for Few-shot learning
197 | 1 | Learning Multi-level Interaction Relations and Feature Representations for Group Activity Recognition
201 | 1 | A Structured Feature Learning Model for Clothing Keypoints Localization
206 | 1 | Automatic Pose Quality Assessment for Adaptive Human Pose Refinement
211 | 1 | Deep Attributed Network Embedding with Community Information
214 | 1 | An acceleration framework for super-resolution network via region difficulty self-adaption
215 | 1 | Spatial Gradient Guided Learning and Semantic Relation Transfer for Facial Landmark Detection
216 | 1 | EEG Emotion Recognition Based on Channel Attention for E-Healthcare Applications
222 | 1 | The MovieWall: A New Interface for Browsing Large Video Collections
224 | 1 | DVRCNN: Dark Video Post-Processing Method for VVC
226 | 1 | An Efficient Image Transmission Pipeline for Multimedia Services
229 | 1 | Gaussian Mixture Model Based Semi-supervised Sparse Representation for Face Recognition
124 | 2 | Using Keystroke Dynamics as Part of Lifelogging
148 | 2 | HTAD: A Home-Tasks Activities Dataset with Wrist-accelerometer and Audio Features
164 | 2 | MNR-Air: An Economic and Dynamic Crowdsourcing Mechanism to Collect Personal Lifelog and Surrounding Environment Dataset. A Case Study in Ho Chi Minh City, Vietnam
177 | 2 | Kvasir-Instrument: Diagnostic and therapeutic tool segmentation dataset in gastrointestinal endoscopy
209 | 2 | CatMeows: A Publicly-Available Dataset of Cat Vocalizations
50 | 3 | Search and Explore Strategies for Interactive Analysis of Real-Life Image Collections with Unknown and Unique Categories
174 | 3 | Graph-based Indexing and Retrieval of Lifelog Data
193 | 3 | On Fusion of Learned and Designed Features for Video Data Analytics
196 | 3 | SMILe: Interactive Learning on Mobile Phones
70 | 4 | A multimodal tensor-based late fusion approach for satellite image search in Sentinel 2 images
107 | 4 | Canopy height estimation from spaceborne imagery using convolutional encoder-decoder
156 | 4 | Implementation of a Random Forest classifier for testing wildfire predictive modelling in Greece using diachronically collected fire occurrence and fire mapping data
48 | 5 | Mobile eHealth Platform for Home Monitoring of Bipolar Disorder
82 | 5 | Multimodal Sensor Data Analysis for Detection of Risk Situations of Fragile People in @home Environments
109 | 5 | Towards the Development of a Trustworthy Chatbot for Mental Health Applications
122 | 5 | Fusion of multimodal sensor data for effective human action recognition in the service of a medicine-oriented platform
137 | 6 | SpotifyGraph: Visualisation of User’s Preferences in Music
191 | 6 | A System for Interactive Multimedia Retrieval Evaluations
13 | 7 | SQL-like interpretable interactive video search
234 | 7 | VERGE in VBS 2021
235 | 7 | NoShot Video Browser at VBS2021
236 | 7 | Exquisitor at the Video Browser Showdown 2021: Relationships Between Semantic Classifiers
237 | 7 | VideoGraph – Towards using Knowledge Graphs for Interactive Video Retrieval
238 | 7 | IVIST: Interactive Video Search Tool in VBS 2021
239 | 7 | Video Search With Collage Queries
240 | 7 | Towards Explainable Interactive Multi-Modal Video Retrieval with vitrivr
241 | 7 | Competitive Interactive Video Retrieval in Virtual Reality with vitrivr-VR
242 | 7 | An Interactive Video Search Tool: A Case Study using the V3C1 Dataset
243 | 7 | Less is More – diveXplore 5.0 at VBS 2021
244 | 7 | SOMHunter V2 at Video Browser Showdown 2021
245 | 7 | W2VV++ BERT Model at VBS 2021
246 | 7 | VISIONE at Video Browser Showdown 2021
247 | 7 | IVOS – The ITEC Interactive Video Object Search System at VBS2021
248 | 7 | Video search with sub-image keyword transfer using existing image archives
249 | 7 | A VR Interface for Browsing Visual Spaces at VBS2021