RegionViT: Regional-to-Local Attention for Vision Transformers | 0 | 0.34 | 2022 |
Distributed adversarial training to robustify deep neural networks at scale | 0 | 0.34 | 2022 |
Can an Image Classifier Suffice For Action Recognition? | 0 | 0.34 | 2022 |
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification | 8 | 0.66 | 2021 |
Deep Analysis of CNN-based Spatio-temporal Representations for Action Recognition | 2 | 0.36 | 2021 |
Generating Adversarial Computer Programs using Optimized Obfuscations | 0 | 0.34 | 2021 |
Relationship Matters - Relation Guided Knowledge Transfer for Incremental Learning of Object Detectors | 0 | 0.34 | 2020 |
Moments in Time Dataset: one million videos for event understanding | 28 | 0.80 | 2020 |
Interpreting Adversarial Examples by Activation Promotion and Suppression | 1 | 0.35 | 2019 |
Structured Adversarial Attack: Towards General Implementation and Better Interpretability | 0 | 0.34 | 2019 |
More Is Less: Learning Efficient Video Representations by Big-Little Network and Depthwise Temporal Aggregation | 0 | 0.34 | 2019 |
Reasoning About Human-Object Interactions Through Dual Attention Networks | 1 | 0.36 | 2019 |
Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition | 2 | 0.36 | 2018 |
SC-Conv: Sparse-Complementary Convolution for Efficient Model Utilization on CNNs | 0 | 0.34 | 2018 |
Efficient Fusion of Sparse and Complementary Convolutions for Object Recognition and Detection | 0 | 0.34 | 2018 |
Semantically Guided Visual Question Answering | 1 | 0.35 | 2018 |
Object-Centric Spatio-Temporal Activity Detection and Recognition | 0 | 0.34 | 2018 |
Sparse Deep Feature Representation for Object Detection from Wearable Cameras | 0 | 0.34 | 2017 |
A Unified Multi-Scale Deep Convolutional Neural Network For Fast Object Detection | 206 | 4.83 | 2016 |
People Detection In Crowded Scenes By Context-Driven Label Propagation | 0 | 0.34 | 2016 |
Self-calibration from vehicle information | 2 | 0.36 | 2015 |
Temporal Sequence Modeling for Video Event Detection | 32 | 0.82 | 2014 |
Long-term object tracking for parked vehicle detection | 2 | 0.38 | 2014 |
IBM-Northwestern@TRECVID 2014 - Surveillance Event Detection | 0 | 0.34 | 2014 |
Random Laplace Feature Maps for Semigroup Kernels on Histograms | 16 | 0.81 | 2014 |
Riskwheel: Interactive Visual Analytics For Surveillance Event Detection | 2 | 0.39 | 2014 |
Relative Attributes for Large-Scale Abandoned Object Detection | 16 | 0.65 | 2013 |
Spatio-temporal Fisher vector coding for surveillance event detection | 11 | 0.53 | 2013 |
IBM Research and Columbia University TRECVID-2013 Multimedia Event Detection (MED), Multimedia Event Recounting (MER), Surveillance Event Detection (SED), and Semantic Indexing (SIN) Systems | 0 | 0.34 | 2013 |
Hand tracking by binary quadratic programming and its application to retail activity recognition | 1 | 0.35 | 2012 |
Multimodal ranking for non-compliance detection in retail surveillance | 1 | 0.40 | 2012 |
Robust Foreground and Abandonment Analysis for Large-Scale Abandoned Object Detection in Complex Surveillance Videos | 6 | 0.44 | 2012 |
Practical computer vision: example techniques and challenges | 5 | 0.79 | 2011 |
A pattern discovery approach to retail fraud detection | 3 | 0.44 | 2011 |
Soft margin keyframe comparison: Enhancing precision of fraud detection in retail surveillance | 1 | 0.37 | 2011 |
Modeling of temporarily static objects for robust abandoned object detection in urban surveillance | 14 | 0.63 | 2011 |
Robust spatiotemporal matching of electronic slides to presentation videos | 7 | 0.56 | 2011 |
Robust Abandoned Object Detection Using Region-Level Analysis | 16 | 0.67 | 2011 |
Graph based event detection from realistic videos using weak feature correspondence | 2 | 0.43 | 2010 |
An integer programming approach to visual compliance | 0 | 0.34 | 2010 |
Fast detection of retail fraud using polar touch buttons | 2 | 0.38 | 2009 |
Recognition Of Repetitive Sequential Human Activity | 16 | 1.27 | 2009 |
Detecting sweethearting in retail surveillance videos | 7 | 0.63 | 2009 |
Accurate alignment of presentation slides with educational video | 2 | 0.45 | 2009 |
Evaluation of Localized Semantics: Data, Methodology, and Experiments | 21 | 3.34 | 2008 |
Reducing Correspondence Ambiguity In Loosely Labeled Training Data | 6 | 0.98 | 2007 |
Temporal Modeling of Slide Change in Presentation Videos | 10 | 0.76 | 2007 |
Curve Matching, Time Warping, and Light Fields: New Algorithms for Computing Similarity between Curves | 35 | 1.35 | 2007 |
Matching slides to presentation videos using SIFT and scene background matching | 19 | 1.37 | 2006 |