The NLP community has seen substantial recent interest in grounding to
facilitate interaction between language technologies and the world. However, as
a community, we use the term broadly to reference
While most previous work has focused on different pretraining objectives and architectures for transfer learning, we ask how to best adapt the pretrained model to a given target task. We focus on the
Research bridging Human-Computer Interaction and Natural Language
Processing has been developing quickly in recent years. However, there is still a lack of
formative guidelines to understand the hum
Natural Language Processing offers new insights into language data across
almost all disciplines and domains, and allows us to corroborate and/or
challenge existing knowledge. The primary hurdles to w
Preregistration refers to the practice of specifying what you are going to do, and what you expect to find in your study, before carrying out the study.
This paper explores preregistration in more detail, discusses how researchers could preregister their work, and presents several preregistration questions for different kinds of studies.
Objective: The evaluation of natural language processing (NLP) models for
clinical text de-identification relies on the availability of clinical notes,
which is often restricted due to privacy concerns
Modern natural language processing (NLP) methods employ self-supervised
pretraining objectives such as masked language modeling to boost the
performance of various application tasks. These pretraining
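The masked language modeling objective mentioned here can be illustrated with a minimal corruption function: a fraction of the input tokens is hidden behind a mask symbol, and the model is trained to recover them. A toy sketch (the 15% masking rate follows BERT's convention; the whitespace tokenization and `[MASK]` symbol are simplifications):

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, rng=None):
    """Replace roughly mask_prob of the tokens with [MASK]; return the
    corrupted sequence plus the positions the model must predict."""
    rng = rng or random.Random(0)
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            corrupted.append(MASK)
            targets[i] = tok  # gold token to recover at position i
        else:
            corrupted.append(tok)
    return corrupted, targets

tokens = "the model learns to fill in missing words".split()
corrupted, targets = mask_tokens(tokens, rng=random.Random(1))
```

The model's loss is then computed only at the masked positions recorded in `targets`.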
Adversarial example generation methods in NLP rely on models like language
models or sentence encoders to determine if potential adversarial examples are
valid. In these methods, a valid adversarial e
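The validity check described above is commonly implemented by thresholding a similarity score between the original sentence and the perturbed candidate. A minimal sketch, with a toy bag-of-words encoder standing in for a real sentence encoder and an illustrative threshold:

```python
import math
from collections import Counter

def toy_encode(sentence):
    """Stand-in for a real sentence encoder: bag-of-words counts."""
    return Counter(sentence.lower().split())

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def is_valid_adversarial(original, candidate, threshold=0.75):
    """Accept the candidate only if it stays semantically close
    to the original under the (toy) encoder."""
    return cosine(toy_encode(original), toy_encode(candidate)) >= threshold

orig = "the movie was absolutely wonderful"
good = "the movie was absolutely marvelous"   # one-word substitution
bad = "stock prices fell sharply on tuesday"  # unrelated sentence
```

With a real encoder the score would reflect meaning rather than word overlap, but the accept/reject logic has the same shape.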
In this paper, we introduce an iterative and incremental process model for developing natural language processing (NLP) applications.
Our process model allows prototypes to be created efficiently by utilizing templates, conventions, and implementations, enabling developers and data scientists to focus on the business goals.
We argue that extrapolation to examples outside the training space will often
be easier for models that capture global structures, rather than just maximise
their local fit to the training data. We sh
Many natural language processing (NLP) tasks are naturally imbalanced, as some target categories occur much more frequently than others in the real world.
Current NLP models still tend to perform poorly on less frequent classes.
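One standard mitigation for such imbalance (a common baseline, not specific to this paper) is to reweight the training loss inversely to class frequency, so that rare classes contribute more per example. A minimal sketch:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rarer classes receive proportionally larger loss weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

labels = ["pos"] * 90 + ["neg"] * 10  # a 9:1 imbalanced dataset
weights = inverse_frequency_weights(labels)
```

These per-class weights would then multiply each example's loss term during training.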
Transition to adulthood is an essential life stage for many families. Prior
research has shown that young people with intellectual or developmental
disabilities (IDD) have more challenges than thei
We present an ontology-driven self-supervised approach (deriving concept embeddings from baseline natural language processing results using an auto-encoder) for producing a publicly available resource that would support large-scale machine learning (e.g., training transformer-based large language models) on social media corpora.
The proposed approach aims to help the community train transferable NLP models for effectively surfacing adverse childhood experiences (ACEs) in low-resource scenarios such as NLP on clinical notes within electronic health records.
Data augmentation methods have been explored as a means of improving data efficiency in natural language processing (NLP).
However, there has been no systematic empirical overview of data augmentation for NLP in the limited labeled data setting, making it difficult to understand which methods work in which settings.
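Typical token-level augmentation operations covered by such surveys include random deletion and random swap; a minimal sketch (the perturbation rates are illustrative):

```python
import random

def random_deletion(tokens, p=0.2, rng=None):
    """Drop each token independently with probability p;
    always keep at least one token."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

def random_swap(tokens, n_swaps=1, rng=None):
    """Swap n_swaps random pairs of positions (may pick i == j)."""
    rng = rng or random.Random(0)
    out = list(tokens)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(out)), rng.randrange(len(out))
        out[i], out[j] = out[j], out[i]
    return out

sent = "data augmentation can improve low resource models".split()
aug = random_deletion(sent, p=0.3, rng=random.Random(0))
```

Each augmented variant keeps the original label, cheaply multiplying the labeled data.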
Recent work has demonstrated that pre-training in-domain language models can
boost performance when adapting to a new domain. However, the costs associated
with pre-training raise an important questio
The prototypical natural language processing (NLP) experiment trains a standard architecture on labeled data and optimizes for accuracy, without accounting for other dimensions such as fairness, interpretability, or computational efficiency.
We refer to this as the square one experimental setup and show, through a manual classification of recent NLP research papers, that this is indeed the case.
Several years of research have shown that machine-learning systems are
vulnerable to adversarial examples, both in theory and in practice. Until now,
such attacks have primarily targeted visual models
We provide a thorough analysis of different privacy-preserving strategies on seven downstream datasets in five different `typical' NLP tasks with varying complexity using modern neural models.
We show that, unlike standard non-private approaches to solving NLP tasks, where bigger is usually better, privacy-preserving strategies do not exhibit a winning pattern, and each task and privacy regime requires special treatment to achieve adequate performance.
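One widely studied privacy-preserving strategy of the kind analyzed in such work is DP-SGD, which clips each per-example gradient and adds Gaussian noise to the average. A simplified sketch (the noise scaling here is illustrative, not a calibrated differential-privacy mechanism):

```python
import math
import random

def clip_and_noise(grads, clip_norm=1.0, noise_mult=1.1, rng=None):
    """DP-SGD-style gradient step ingredient: clip each per-example
    gradient to L2 norm clip_norm, average, then add Gaussian noise
    proportional to noise_mult * clip_norm."""
    rng = rng or random.Random(0)
    clipped = []
    for g in grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    n, d = len(clipped), len(clipped[0])
    avg = [sum(g[i] for g in clipped) / n for i in range(d)]
    sigma = noise_mult * clip_norm / n
    return [a + rng.gauss(0.0, sigma) for a in avg]
```

Clipping bounds any single example's influence on the update, and the added noise masks what remains, which is the source of the utility cost the analysis above examines.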