We present EDA: easy data augmentation techniques for boosting performance on
text classification tasks. EDA consists of four simple but powerful operations:
synonym replacement, random insertion, random swap, and random deletion.
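The four operations can be sketched as follows; the tiny `SYNONYMS` table is a stand-in for a full WordNet lookup, used only so the example runs without external resources:

```python
import random

# Minimal, self-contained sketch of the four EDA operations.
SYNONYMS = {"quick": ["fast", "rapid"], "happy": ["glad", "joyful"]}

def synonym_replacement(words, n, rng):
    """Replace up to n words that have an entry in the synonym table."""
    out = words[:]
    candidates = [i for i, w in enumerate(out) if w in SYNONYMS]
    rng.shuffle(candidates)
    for i in candidates[:n]:
        out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def random_insertion(words, n, rng):
    """Insert a synonym of a random word at a random position, n times."""
    out = words[:]
    for _ in range(n):
        pool = [w for w in out if w in SYNONYMS]
        if not pool:
            break
        syn = rng.choice(SYNONYMS[rng.choice(pool)])
        out.insert(rng.randrange(len(out) + 1), syn)
    return out

def random_swap(words, n, rng):
    """Swap the positions of two randomly chosen words, n times."""
    out = words[:]
    for _ in range(n):
        i, j = rng.randrange(len(out)), rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out

def random_deletion(words, p, rng):
    """Drop each word with probability p, keeping at least one word."""
    out = [w for w in words if rng.random() > p]
    return out if out else [rng.choice(words)]
```

Each operation returns a perturbed copy of the token list and leaves the input untouched, so the same sentence can be augmented many times with different seeds.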
Blind music source separation has been a popular and active subject of research in both the music information retrieval and signal processing communities.
A data augmentation method that creates artificial mixtures by combining tracks from different songs has been shown to be useful in recent work.
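As a hedged illustration (not the exact procedure of any cited work), such an artificial mixture can be formed by pairing the vocal stem of one song with the accompaniment stem of another:

```python
import numpy as np

def make_artificial_mixture(vocals_a, accomp_b, gain_db=0.0):
    """Sum stems from two different songs into one training mixture.

    vocals_a, accomp_b: 1-D waveform arrays from different songs.
    gain_db: optional level perturbation applied to the accompaniment.
    """
    n = min(len(vocals_a), len(accomp_b))      # align lengths by truncation
    v = vocals_a[:n].astype(np.float64)
    a = (10.0 ** (gain_db / 20.0)) * accomp_b[:n].astype(np.float64)
    mix = v + a
    peak = np.max(np.abs(mix))
    if peak > 1.0:                             # rescale all three consistently
        mix, v, a = mix / peak, v / peak, a / peak
    return mix, v, a                           # mixture plus aligned targets
```

Because the stems come from different songs, the separator never sees the same vocal/accompaniment pairing twice, which is the point of the augmentation.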
We consider data augmentation techniques for improving speech recognition for languages where both annotated and raw speech data are sparse.
We consider three different types of data augmentation: unsupervised training, multilingual training, and the use of synthesized speech as training data.
Graph Neural Network (GNN)-based methods have recently become a popular tool
to deal with graph data because of their ability to incorporate structural
information. The only hurdle in the performance
We introduce two data augmentation techniques, which, used with a
Resnet-BiLSTM-CTC network, significantly reduce Word Error Rate (WER) and
Character Error Rate (CER) beyond best-reported results on h
Data augmentation techniques have become standard practice in deep learning,
as they have been shown to greatly improve the generalisation abilities of models.
These techniques rely on different ideas su
We explore different data augmentation techniques on context-gloss pairs to improve the performance of word sense disambiguation (WSD).
We compare and analyze different context-gloss augmentation techniques, and the results show that applying back translation to glosses performs best.
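A minimal sketch of gloss back translation; here `translate` is a hypothetical stand-in for a real machine translation model (e.g. a pretrained seq2seq system) and is stubbed so the example runs end to end:

```python
def translate(text, src, tgt):
    # Placeholder: a real implementation would call an MT model here.
    return f"[{src}->{tgt}] {text}"

def back_translate_gloss(gloss, pivot="de"):
    """Paraphrase a gloss by round-tripping through a pivot language."""
    forward = translate(gloss, "en", pivot)
    return translate(forward, pivot, "en")

def augment_pair(context, gloss, label):
    """Keep the context fixed and pair it with a paraphrased gloss."""
    return (context, back_translate_gloss(gloss), label)
```

The context sentence and the label are preserved; only the gloss is paraphrased, which yields new context-gloss training pairs for the same sense.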
Successful training of convolutional neural networks (CNNs) requires a
substantial amount of data. With small datasets, networks generalize poorly.
Data Augmentation techniques improve the generalizabi
Despite recent success, most contrastive self-supervised learning methods are
domain-specific, relying heavily on data augmentation techniques that require
knowledge about a particular domain, such as
Intensive care units (ICU) are increasingly looking towards machine learning
for methods to provide online monitoring of critically ill patients. In machine
learning, online monitoring is often formul
Micro Abstract: A recent study from GLOBOCAN disclosed that in 2018 two
million women worldwide had been diagnosed with breast cancer. This study
presents a computer-aided diagnosis system based o
In this work, we present an extensive analysis of how different data augmentation techniques improve the training of encoder-decoder neural networks on this problem.
Twenty different data augmentation techniques were evaluated on five different datasets.
Task-agnostic forms of data augmentation have proven widely effective in
computer vision, even on pretrained models. In NLP similar results are reported
most commonly for low data regimes, non-pretrai
In this paper, we propose a novel graph-based data augmentation method that
can generally be applied to medical waveform data with graph structures. In the
process of recording medical waveform data,
Detection of some types of toxic language is hampered by extreme scarcity of
labeled training data. Data augmentation - generating new synthetic data from a
labeled seed dataset - can help. The effica
This paper presents our strategy to address the SemEval-2022 Task 3 PreTENS: Presupposed Taxonomies Evaluating Neural-network Semantics.
The goal of the task is to identify whether a sentence is deemed acceptable or not, depending on the taxonomic relationship that holds between a noun pair contained in the sentence.
In this paper, we present a comparative study on the robustness of two
different online streaming speech recognition models: Monotonic Chunkwise
Attention (MoChA) and Recurrent Neural Network-Transduc
We propose a neural data augmentation method for natural language processing tasks, which is a combination of a conditional variational autoencoder and an encoder-decoder transformer model.
While encoding and decoding the input sentence, our model captures the syntactic and semantic representation of the input language with its class condition.