State-of-the-art 3D detection methods rely on supervised learning and large
labelled datasets. However, annotating lidar data is resource-intensive, and
depending only on supervised learning limits th
We present a framework that can transfer the audio effects and production style
from one recording to another by example, with the goal of simplifying the audio
production process. We train a deep neural
We propose a learnable illumination enhancement model for high-level vision.
Inspired by real camera response functions, we assume that the illumination enhancement function should be a concave curve, and propose to enforce this concavity through a discrete integral.
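The discrete-integral construction can be illustrated with a small sketch: if the curve's discrete derivative is kept non-negative and non-increasing, its cumulative sum is automatically monotone and concave. All names below are illustrative, not taken from the paper.

```python
import numpy as np

def concave_curve(raw_slopes):
    """Build a monotone, concave enhancement curve from unconstrained
    parameters (illustrative sketch, not the paper's exact model).

    Taking absolute values enforces non-negative slopes; sorting them in
    descending order gives a non-increasing derivative, so the discrete
    integral (cumulative sum) of those slopes is a concave curve.
    """
    slopes = np.abs(np.asarray(raw_slopes, dtype=float))  # non-negative
    slopes = np.sort(slopes)[::-1]                        # non-increasing
    curve = np.concatenate([[0.0], np.cumsum(slopes)])    # discrete integral
    return curve / curve[-1]                              # normalize to [0, 1]

curve = concave_curve(np.random.randn(16))
# second-order differences are <= 0 everywhere, i.e. the curve is concave
assert np.all(np.diff(curve, 2) <= 1e-9)
```

A learnable version would produce `raw_slopes` from a small network, keeping the same non-negativity and ordering constraints so concavity holds by construction.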
Event-based cameras can overcome the limitations of frame-based cameras for
important tasks such as high-speed motion detection during self-driving car
navigation in low-illumination conditions. The event c
Despite the recent attention to DeepFakes and other forms of image
manipulations, one of the most prevalent ways to mislead audiences is the use
of unaltered images in a new but false context. To addr
We propose a simple and effective self-supervised pre-training strategy, named pairwise half-graph discrimination (PHD), that explicitly pre-trains a graph neural network at the graph level.
PHD is designed as a simple binary classification task to discriminate whether two half-graphs come from the same source.
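As a rough illustration of such a pairwise task, labelled half-graph pairs can be constructed as below. All names are hypothetical and the actual method feeds these pairs to a GNN encoder; this sketch only shows the data construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_graph(n, p=0.3):
    """Symmetric adjacency matrix of a random undirected graph."""
    upper = np.triu((rng.random((n, n)) < p).astype(float), 1)
    return upper + upper.T

def split_halves(adj):
    """Randomly partition the nodes and return the two induced half-graphs."""
    n = adj.shape[0]
    perm = rng.permutation(n)
    left, right = perm[: n // 2], perm[n // 2:]
    return adj[np.ix_(left, left)], adj[np.ix_(right, right)]

def make_pair(adj_a, adj_b, positive):
    """Binary example: label 1 iff both halves come from the same graph."""
    h1, h2 = split_halves(adj_a)
    if not positive:
        _, h2 = split_halves(adj_b)  # negative: second half from another graph
    return (h1, h2), int(positive)

g1, g2 = random_graph(10), random_graph(12)
pair_pos, y_pos = make_pair(g1, g2, positive=True)
pair_neg, y_neg = make_pair(g1, g2, positive=False)
```

A GNN would then embed each half-graph and a classifier head would predict the same-source label from the pair of embeddings.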
Deep neural networks are known to be vulnerable to adversarial examples, where a perturbation in the input space leads to an amplified shift in the latent network representation.
Our approach leverages the label-independent nature of self-supervised signals and counters the adversarial perturbation with respect to the self-supervised tasks.
Training speaker-discriminative and robust speaker verification systems
without speaker labels is still challenging and worthwhile to explore. In this
study, we propose an effective self-supervised le
We investigate reducing the training time of recent self-supervised methods by various model-agnostic strategies that have not been used for this problem.
In particular, we study three strategies: an extendable cyclic learning rate schedule, a matching progressive schedule for augmentation magnitude and image resolution, and a hard positive mining strategy based on augmentation difficulty.
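For reference, a basic triangular cyclic learning-rate schedule looks like the following. This is a simplified stand-in with hypothetical names; the "extendable" variant studied in the work is not specified here.

```python
def cyclic_lr(step, base_lr=1e-4, max_lr=1e-1, cycle_len=1000):
    """Triangular cyclic learning rate: rises from base_lr to max_lr and
    back down within each cycle of cycle_len steps (illustrative sketch)."""
    pos = (step % cycle_len) / cycle_len   # position within the cycle, [0, 1)
    tri = 1.0 - abs(2.0 * pos - 1.0)       # triangle wave: 0 -> 1 -> 0
    return base_lr + (max_lr - base_lr) * tri

# sample the schedule over two cycles
lrs = [cyclic_lr(s) for s in range(0, 2000, 250)]
```

An extendable variant could, for example, grow `cycle_len` when training is prolonged, so earlier checkpoints remain valid restart points.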
In this work, we present a fully self-supervised framework for semantic
segmentation (FS^4), a fully bootstrapped strategy for semantic segmentation,
which saves effort on the huge amount of annotati
Self-supervised learning has attracted many researchers due to its soaring performance on representation learning over the last several years.
In this survey, we review new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning.
We present a robust estimator for fitting multiple parametric models of the
same form to noisy measurements. Applications include finding multiple
vanishing points in man-made scenes, fitting planes t
Visual domain adaptation (DA) seeks to transfer trained models to unseen, unlabeled domains across distribution shift, but approaches typically focus on adapting convolutional neural network architectures initialized with supervised ImageNet representations.
In this work, we shift focus to adapting modern architectures for object recognition, the increasingly popular vision transformer (ViT), and modern pretraining based on self-supervised learning (SSL).
Self-training has achieved enormous success in various semi-supervised and
weakly-supervised learning tasks. The method can be interpreted as a
teacher-student framework, where the teacher generates pseud
The Masked Language Model (MLM) framework has been widely adopted for
self-supervised language pre-training. In this paper, we argue that randomly
sampled masks in MLM would lead to undesirably large grad
Existing video self-supervised learning methods mainly rely on trimmed videos for model training. However, trimmed datasets are manually annotated from untrimmed videos. In this sense, these methods a
Self-supervision has proven to be an effective learning strategy when
training target tasks on small annotated datasets. While current research
focuses on creating novel pretext tasks to learn
Fluorescence microscopy is a key driver of discoveries in biomedical
research. However, owing to the limitations of microscope hardware and the
characteristics of the observed samples, the fluorescence
Learning representations with self-supervision for convolutional networks
(CNN) has proven effective for vision tasks. As an alternative to CNNs, vision
transformers (ViTs) emerge as strong representatio