{"input": "language models is used for OtherScientificTerm| context: temporary syntactic ambiguities arise when the beginning of a sentence is compatible with multiple syntactic analyses . we inspect to which extent neural language models ( lms ) exhibit uncertainty over such analyses when processing temporarily ambiguous inputs , and how that uncertainty is modulated by disambiguating cues .", "entity": "language models", "output": "temporary ambiguity", "neg_sample": ["language models is used for OtherScientificTerm", "temporary syntactic ambiguities arise when the beginning of a sentence is compatible with multiple syntactic analyses .", "we inspect to which extent neural language models ( lms ) exhibit uncertainty over such analyses when processing temporarily ambiguous inputs , and how that uncertainty is modulated by disambiguating cues ."], "relation": "used for", "id": "2021.blackboxnlp-1.4", "year": 2021, "rel_sent": "We apply this method to study the behavior of two LMs ( GPT2 and an LSTM ) on three types of temporary ambiguity , using materials from human sentence processing experiments .", "forward": true, "src_ids": "2021.blackboxnlp-1.4_6370"}
{"input": "task - oriented dialog generation is done by using Method| context: large pre - trained language generation models such as gpt-2 have demonstrated their effectiveness as language priors by reaching state - of - the - art results in various language generation tasks . however , the performance of pre - trained models on task - oriented dialog tasks is still under - explored .", "entity": "task - oriented dialog generation", "output": "tailored pre - training model", "neg_sample": ["task - oriented dialog generation is done by using Method", "large pre - trained language generation models such as gpt-2 have demonstrated their effectiveness as language priors by reaching state - of - the - art results in various language generation tasks .", "however , the performance of pre - trained models on task - oriented dialog tasks is still under - explored ."], "relation": "used for", "id": "2021.acl-short.40", "year": 2021, "rel_sent": "PRAL : A Tailored Pre - Training Model for Task - Oriented Dialog Generation.", "forward": false, "src_ids": "2021.acl-short.40_5999"}
{"input": "dvae - gnn is used for OtherScientificTerm| context: however , this problem is less studied in open - domain dialogue .", "entity": "dvae - gnn", "output": "dialog structure graph", "neg_sample": ["dvae - gnn is used for OtherScientificTerm", "however , this problem is less studied in open - domain dialogue ."], "relation": "used for", "id": "2021.acl-long.136", "year": 2021, "rel_sent": "Experimental results on two benchmark corpora confirm that DVAE - GNN can discover meaningful dialog structure graph , and the use of dialog structure as background knowledge can significantly improve multi - turn coherence .", "forward": true, "src_ids": "2021.acl-long.136_4607"}
{"input": "heuristics is used for OtherScientificTerm| context: when a model attribution technique highlights a particular part of the input , a user might understand this highlight as making a statement about counterfactuals ( miller , 2019 ): if that part of the input were to change , the model 's prediction might change as well .", "entity": "heuristics", "output": "high - level model behavior", "neg_sample": ["heuristics is used for OtherScientificTerm", "when a model attribution technique highlights a particular part of the input , a user might understand this highlight as making a statement about counterfactuals ( miller , 2019 ): if that part of the input were to change , the model 's prediction might change as well ."], "relation": "used for", "id": "2021.emnlp-main.447", "year": 2021, "rel_sent": "We construct counterfactual sets for three different RC settings , and through heuristics that can connect attribution methods ' outputs to high - level model behavior , we can evaluate how useful different attribution methods and even different formats are for understanding counterfactuals .", "forward": true, "src_ids": "2021.emnlp-main.447_5132"}
{"input": "non - autoregressive generation is done by using Method| context: the multimodality problem has become a major challenge of existing non - autoregressive generation ( nag ) systems . a common solution often resorts to sequence - level knowledge distillation by rebuilding the training dataset through autoregressive generation ( hereinafter known as ' teacher ag ' ) . the success of such methods may largely depend on a latent assumption , i.e. , the teacher ag is superior to the nag model .", "entity": "non - autoregressive generation", "output": "pos - constrained parallel decoding", "neg_sample": ["non - autoregressive generation is done by using Method", "the multimodality problem has become a major challenge of existing non - autoregressive generation ( nag ) systems .", "a common solution often resorts to sequence - level knowledge distillation by rebuilding the training dataset through autoregressive generation ( hereinafter known as ' teacher ag ' ) .", "the success of such methods may largely depend on a latent assumption , i.e.", ", the teacher ag is superior to the nag model ."], "relation": "used for", "id": "2021.acl-long.467", "year": 2021, "rel_sent": "POS - Constrained Parallel Decoding for Non - autoregressive Generation.", "forward": false, "src_ids": "2021.acl-long.467_1707"}
{"input": "low - resource neural machine translation is done by using Method| context: the data scarcity in low - resource languages has become a bottleneck to building robust neural machine translation systems . finetuning a multilingual pre - trained model ( e.g. , mbart ( liu et al . , 2020a ) ) on the translation task is a good approach for low - resource languages ; however , its performance will be greatly limited when there are unseen languages in the translation pairs .", "entity": "low - resource neural machine translation", "output": "continual mixed - language pre - training", "neg_sample": ["low - resource neural machine translation is done by using Method", "the data scarcity in low - resource languages has become a bottleneck to building robust neural machine translation systems .", "finetuning a multilingual pre - trained model ( e.g.", ", mbart ( liu et al .", ", 2020a ) ) on the translation task is a good approach for low - resource languages ; however , its performance will be greatly limited when there are unseen languages in the translation pairs ."], "relation": "used for", "id": "2021.findings-acl.239", "year": 2021, "rel_sent": "Continual Mixed - Language Pre - Training for Extremely Low - Resource Neural Machine Translation.", "forward": false, "src_ids": "2021.findings-acl.239_14009"}
{"input": "generalization is done by using Method| context: after a neural sequence model encounters an unexpected token , can its behavior be predicted ? we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts .", "entity": "generalization", "output": "neural language models", "neg_sample": ["generalization is done by using Method", "after a neural sequence model encounters an unexpected token , can its behavior be predicted ?", "we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts ."], "relation": "used for", "id": "2021.emnlp-main.448", "year": 2021, "rel_sent": "In experiments in English , Finnish , Mandarin , and random regular languages , we demonstrate that neural language models interpolate between these twoforms of generalization : their predictions are well - approximated by a log - linear combination of lexical and syntactic predictive distributions .", "forward": false, "src_ids": "2021.emnlp-main.448_16093"}
{"input": "fine - tuning is done by using OtherScientificTerm| context: pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks . an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models .", "entity": "fine - tuning", "output": "task - specific information", "neg_sample": ["fine - tuning is done by using OtherScientificTerm", "pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks .", "an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models ."], "relation": "used for", "id": "2021.emnlp-main.645", "year": 2021, "rel_sent": "In this paper , we hypothesize that salient information extracted a priori from the training data can complement the task - specific information learned by the model during fine - tuning on a downstream task .", "forward": false, "src_ids": "2021.emnlp-main.645_8886"}
{"input": "clustering method is used for OtherScientificTerm| context: our systems rely on data representations learned through fine - tuned neural language models .", "entity": "clustering method", "output": "prototypes", "neg_sample": ["clustering method is used for OtherScientificTerm", "our systems rely on data representations learned through fine - tuned neural language models ."], "relation": "used for", "id": "2021.semeval-1.37", "year": 2021, "rel_sent": "The clustering method is applied to build prototypes of both classes which are used for training and classifying new messages .", "forward": true, "src_ids": "2021.semeval-1.37_3256"}
{"input": "limited context is used for Task| context: incorporating syntax into neural approaches in nlp has a multitude of practical and scientific benefits . for instance , a language model that is syntax - aware is likely to be able to produce better samples ; even a discriminative model like bert with a syntax module could be used for core nlp tasks like unsupervised syntactic parsing . rapid progress in recent years was arguably spurred on by the empirical success of the parsing - reading - predict architecture of ( shen et al . , 2018a ) , later simplified by the order neuron lstm of ( shen et al . , 2019 ) . most notably , this is the first time neural approaches were able to successfully perform unsupervised syntactic parsing ( evaluated by various metrics like f-1 score ) . however , even heuristic ( much less fully mathematical ) understanding of why and when these architectures work is lagging severely behind .", "entity": "limited context", "output": "constituency parsing", "neg_sample": ["limited context is used for Task", "incorporating syntax into neural approaches in nlp has a multitude of practical and scientific benefits .", "for instance , a language model that is syntax - aware is likely to be able to produce better samples ; even a discriminative model like bert with a syntax module could be used for core nlp tasks like unsupervised syntactic parsing .", "rapid progress in recent years was arguably spurred on by the empirical success of the parsing - reading - predict architecture of ( shen et al .", ", 2018a ) , later simplified by the order neuron lstm of ( shen et al .", ", 2019 ) .", "most notably , this is the first time neural approaches were able to successfully perform unsupervised syntactic parsing ( evaluated by various metrics like f-1 score ) .", "however , even heuristic ( much less fully mathematical ) understanding of why and when these architectures work is lagging severely behind ."], "relation": "used for", "id": "2021.acl-long.208", "year": 2021, "rel_sent": "The Limitations of Limited Context for Constituency Parsing.", "forward": true, "src_ids": "2021.acl-long.208_7664"}
{"input": "natural language explanations is used for Task| context: although neural models have shown strong performance in datasets such as snli , they lack the ability to generalize out - of - distribution ( ood ) .", "entity": "natural language explanations", "output": "out - of - distribution generalization", "neg_sample": ["natural language explanations is used for Task", "although neural models have shown strong performance in datasets such as snli , they lack the ability to generalize out - of - distribution ( ood ) ."], "relation": "used for", "id": "2021.insights-1.17", "year": 2021, "rel_sent": "Investigating the Effect of Natural Language Explanations on Out - of - Distribution Generalization in Few - shot NLI.", "forward": true, "src_ids": "2021.insights-1.17_13547"}
{"input": "statistical machine translation ( smt ) is used for OtherScientificTerm| context: because sign language is a visual language , the translation of it into spoken language is typically performed through an intermediate representation called gloss notation . in sign language , function words , such as particles and determiners , are not explicitly expressed , and there is little or no concept of morphological inflection in sign language . therefore , gloss notation does not include such linguistic constructs . because of these factors , we argue that sign language translation is effectively processed by taking advantage of the similarities and differences between sign language and its spoken counterpart .", "entity": "statistical machine translation ( smt )", "output": "glosses", "neg_sample": ["statistical machine translation ( smt ) is used for OtherScientificTerm", "because sign language is a visual language , the translation of it into spoken language is typically performed through an intermediate representation called gloss notation .", "in sign language , function words , such as particles and determiners , are not explicitly expressed , and there is little or no concept of morphological inflection in sign language .", "therefore , gloss notation does not include such linguistic constructs .", "because of these factors , we argue that sign language translation is effectively processed by taking advantage of the similarities and differences between sign language and its spoken counterpart ."], "relation": "used for", "id": "2021.findings-acl.178", "year": 2021, "rel_sent": "Specifically , our method first uses statistical machine translation ( SMT ) to map glosses to corresponding spoken language words .", "forward": true, "src_ids": "2021.findings-acl.178_3081"}
{"input": "data augmentation method is used for Task| context: with the widespread commercialization of smart devices , research on environmental sound classification has gained more and more attention in recent years .", "entity": "data augmentation method", "output": "semi - supervised model training", "neg_sample": ["data augmentation method is used for Task", "with the widespread commercialization of smart devices , research on environmental sound classification has gained more and more attention in recent years ."], "relation": "used for", "id": "2021.rocling-1.14", "year": 2021, "rel_sent": "Further , to simulate a low - resource sound classification setting where only limited supervised examples are made available , we instantiate the notion of transfer learning with a recently proposed training algorithm ( namely , FixMatch ) and a data augmentation method ( namely , SpecAugment ) to achieve the goal of semi - supervised model training .", "forward": true, "src_ids": "2021.rocling-1.14_11582"}
{"input": "binarybert is done by using Method| context: the rapid development of large pre - trained language models has greatly increased the demand for model compression techniques , among which quantization is a popular solution . we find that a binary bert is hard to be trained directly than a ternary counterpart due to its complex and irregular loss landscape .", "entity": "binarybert", "output": "ternary weight splitting", "neg_sample": ["binarybert is done by using Method", "the rapid development of large pre - trained language models has greatly increased the demand for model compression techniques , among which quantization is a popular solution .", "we find that a binary bert is hard to be trained directly than a ternary counterpart due to its complex and irregular loss landscape ."], "relation": "used for", "id": "2021.acl-long.334", "year": 2021, "rel_sent": "Therefore , we propose ternary weight splitting , which initializes BinaryBERT by equivalently splitting from a half - sized ternary network .", "forward": false, "src_ids": "2021.acl-long.334_12105"}
{"input": "multi - corpus machine translation is done by using Material| context: learning multilingual and multi - domain translation model is challenging as the heterogeneous and imbalanced data make the model converge inconsistently over different corpora in real world . one common practice is to adjust the share of each corpus in the training , so that the learning process is balanced and low - resource cases can benefit from the high resource ones . however , automatic balancing methods usually depend on the intra- and inter - dataset characteristics , which is usually agnostic or requires human priors .", "entity": "multi - corpus machine translation", "output": "trusted clean data", "neg_sample": ["multi - corpus machine translation is done by using Material", "learning multilingual and multi - domain translation model is challenging as the heterogeneous and imbalanced data make the model converge inconsistently over different corpora in real world .", "one common practice is to adjust the share of each corpus in the training , so that the learning process is balanced and low - resource cases can benefit from the high resource ones .", "however , automatic balancing methods usually depend on the intra- and inter - dataset characteristics , which is usually agnostic or requires human priors ."], "relation": "used for", "id": "2021.emnlp-main.580", "year": 2021, "rel_sent": "In this work , we propose an approach , MultiUAT , that dynamically adjusts the training data usage based on the model 's uncertainty on a small set of trusted clean data for multi - corpus machine translation .", "forward": false, "src_ids": "2021.emnlp-main.580_5723"}
{"input": "hashtag contexts is done by using Method| context: millions of hashtags are created on social media every day to cross - refer messages concerning similar topics .", "entity": "hashtag contexts", "output": "personalized topic attention", "neg_sample": ["hashtag contexts is done by using Method", "millions of hashtags are created on social media every day to cross - refer messages concerning similar topics ."], "relation": "used for", "id": "2021.emnlp-main.616", "year": 2021, "rel_sent": "Furthermore , we propose a novel personalized topic attention to capture salient contents to personalize hashtag contexts .", "forward": false, "src_ids": "2021.emnlp-main.616_2375"}
{"input": "amr - to - text generation is done by using Method| context: due to the scarcity of annotated data , abstract meaning representation ( amr ) research is relatively limited and challenging for languages other than english .", "entity": "amr - to - text generation", "output": "cross - lingual pre - training approach", "neg_sample": ["amr - to - text generation is done by using Method", "due to the scarcity of annotated data , abstract meaning representation ( amr ) research is relatively limited and challenging for languages other than english ."], "relation": "used for", "id": "2021.acl-long.73", "year": 2021, "rel_sent": "Upon the availability of English AMR dataset and English - to- X parallel datasets , in this paper we propose a novel cross - lingual pre - training approach via multi - task learning ( MTL ) for both zeroshot AMR parsing and AMR - to - text generation .", "forward": false, "src_ids": "2021.acl-long.73_11855"}
{"input": "autoencoding topic model is used for Task| context: natural language processing ( nlp ) often faces the problem of data diversity such as different domains , themes , styles , and so on . therefore , a single language model ( lm ) is insufficient to learn all knowledge from diverse samples .", "entity": "autoencoding topic model", "output": "clustering", "neg_sample": ["autoencoding topic model is used for Task", "natural language processing ( nlp ) often faces the problem of data diversity such as different domains , themes , styles , and so on .", "therefore , a single language model ( lm ) is insufficient to learn all knowledge from diverse samples ."], "relation": "used for", "id": "2021.acl-long.230", "year": 2021, "rel_sent": "To solve this problem , we firstly propose an autoencoding topic model with a mixture prior ( mATM ) to perform clustering for the data , where the clusters defined in semantic space describes the data diversity .", "forward": true, "src_ids": "2021.acl-long.230_1411"}
{"input": "dynamic contextualized word embeddings is done by using Method| context: static word embeddings that represent words by a single vector can not capture the variability of word meaning in different linguistic and extralinguistic contexts .", "entity": "dynamic contextualized word embeddings", "output": "pretrained language model ( plm )", "neg_sample": ["dynamic contextualized word embeddings is done by using Method", "static word embeddings that represent words by a single vector can not capture the variability of word meaning in different linguistic and extralinguistic contexts ."], "relation": "used for", "id": "2021.acl-long.542", "year": 2021, "rel_sent": "Based on a pretrained language model ( PLM ) , dynamic contextualized word embeddings model time and social space jointly , which makes them attractive for a range of NLP tasks involving semantic variability .", "forward": false, "src_ids": "2021.acl-long.542_10640"}
{"input": "rapport - building is done by using OtherScientificTerm| context: in conversational analyses , humans manually weave multimodal information into the transcripts , which is significantly time - consuming .", "entity": "rapport - building", "output": "multimodal features", "neg_sample": ["rapport - building is done by using OtherScientificTerm", "in conversational analyses , humans manually weave multimodal information into the transcripts , which is significantly time - consuming ."], "relation": "used for", "id": "2021.eacl-main.37", "year": 2021, "rel_sent": "Our feature engineering contributions are two - fold : firstly , we identify the range of multimodal features relevant to detect rapport - building ; secondly , we expand the range of multimodal annotations and show that the expansion leads to statistically significant improvements in detecting rapport - building .", "forward": false, "src_ids": "2021.eacl-main.37_5072"}
{"input": "medical reports is done by using Method| context: existing medical report generation efforts emphasize producing human - readable reports , yet the generated text may not be well aligned to the clinical facts . our generated medical reports , on the other hand , are fluent and , more importantly , clinically accurate .", "entity": "medical reports", "output": "transformer - based generator", "neg_sample": ["medical reports is done by using Method", "existing medical report generation efforts emphasize producing human - readable reports , yet the generated text may not be well aligned to the clinical facts .", "our generated medical reports , on the other hand , are fluent and , more importantly , clinically accurate ."], "relation": "used for", "id": "2021.emnlp-main.288", "year": 2021, "rel_sent": "This is achieved by our fully differentiable and end - to - end paradigm that contains three complementary modules : taking the chest X - ray images and clinical history document of patients as inputs , our classification module produces an internal checklist of disease - related topics , referred to as enriched disease embedding ; the embedding representation is then passed to our transformer - based generator , to produce the medical report ; meanwhile , our generator also creates a weighted embedding representation , which is fed to our interpreter to ensure consistency with respect to disease - related topics .", "forward": false, "src_ids": "2021.emnlp-main.288_2702"}
{"input": "alignments is used for OtherScientificTerm| context: knowledge distillation ( kd ) is commonly used to construct synthetic data for training non - autoregressive translation ( nat ) models . however , there exists a discrepancy on low - frequency words between the distilled and the original data , leading to more errors on predicting low - frequency words .", "entity": "alignments", "output": "low - frequency target words", "neg_sample": ["alignments is used for OtherScientificTerm", "knowledge distillation ( kd ) is commonly used to construct synthetic data for training non - autoregressive translation ( nat ) models .", "however , there exists a discrepancy on low - frequency words between the distilled and the original data , leading to more errors on predicting low - frequency words ."], "relation": "used for", "id": "2021.acl-long.266", "year": 2021, "rel_sent": "Accordingly , we propose reverse KD to rejuvenate more alignments for low - frequency target words .", "forward": true, "src_ids": "2021.acl-long.266_3944"}
{"input": "weakly supervised named entity recognition is done by using Method| context: instead of using expensive manual annotations , researchers have proposed to train named entity recognition ( ner ) systems using heuristic labeling rules . however , devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise .", "entity": "weakly supervised named entity recognition", "output": "graph - based labeling rule augmentation framework", "neg_sample": ["weakly supervised named entity recognition is done by using Method", "instead of using expensive manual annotations , researchers have proposed to train named entity recognition ( ner ) systems using heuristic labeling rules .", "however , devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise ."], "relation": "used for", "id": "2021.eacl-main.318", "year": 2021, "rel_sent": "GLaRA : Graph - based Labeling Rule Augmentation for Weakly Supervised Named Entity Recognition.", "forward": false, "src_ids": "2021.eacl-main.318_11791"}
{"input": "model - agnostic approach is used for OtherScientificTerm| context: existing work shows the benefits of integrating kbs with textual evidence for qa only on questions that are answerable by kbs alone ( sun et al . , 2019 ) . in contrast , real world qa systems often have to deal with questions that might not be directly answerable by kbs .", "entity": "model - agnostic approach", "output": "kb paths", "neg_sample": ["model - agnostic approach is used for OtherScientificTerm", "existing work shows the benefits of integrating kbs with textual evidence for qa only on questions that are answerable by kbs alone ( sun et al .", ", 2019 ) .", "in contrast , real world qa systems often have to deal with questions that might not be directly answerable by kbs ."], "relation": "used for", "id": "2021.deelio-1.3", "year": 2021, "rel_sent": "We propose and analyze a simple , model - agnostic approach for incorporating KB paths into text - based QA systems and establish a strong upper bound on FQ for our method using an oracle retriever .", "forward": true, "src_ids": "2021.deelio-1.3_12511"}
{"input": "evolutionary algorithm is used for OtherScientificTerm| context: modern pre - trained language models are mostly built upon backbones stacking self - attention and feed - forward layers in an interleaved order .", "entity": "evolutionary algorithm", "output": "optimal architecture", "neg_sample": ["evolutionary algorithm is used for OtherScientificTerm", "modern pre - trained language models are mostly built upon backbones stacking self - attention and feed - forward layers in an interleaved order ."], "relation": "used for", "id": "2021.findings-acl.2", "year": 2021, "rel_sent": "To solve this problem , we first pre - train a supernet from which the weights of all candidate models can be inherited , and then adopt an evolutionary algorithm guided by pre - training accuracy tofind the optimal architecture .", "forward": true, "src_ids": "2021.findings-acl.2_937"}
{"input": "matching - guided embedding ( mge ) mechanism is used for OtherScientificTerm| context: learning sentence embeddings from dialogues has drawn increasing attention due to its low annotation cost and high domain adaptability . conventional approaches employ the siamese - network for this task , which obtains the sentence embeddings through modeling the context - response semantic relevance by applying a feed - forward network on top of the sentence encoders . however , as the semantic textual similarity is commonly measured through the element - wise distance metrics ( e.g. cosine and l2 distance ) , such architecture yields a large gap between training and evaluating .", "entity": "matching - guided embedding ( mge ) mechanism", "output": "context - aware embedding", "neg_sample": ["matching - guided embedding ( mge ) mechanism is used for OtherScientificTerm", "learning sentence embeddings from dialogues has drawn increasing attention due to its low annotation cost and high domain adaptability .", "conventional approaches employ the siamese - network for this task , which obtains the sentence embeddings through modeling the context - response semantic relevance by applying a feed - forward network on top of the sentence encoders .", "however , as the semantic textual similarity is commonly measured through the element - wise distance metrics ( e.g.", "cosine and l2 distance ) , such architecture yields a large gap between training and evaluating ."], "relation": "used for", "id": "2021.emnlp-main.185", "year": 2021, "rel_sent": "DialogueCSE first introduces a novel matching - guided embedding ( MGE ) mechanism , which generates a context - aware embedding for each candidate response embedding ( i.e.", "forward": true, "src_ids": "2021.emnlp-main.185_8875"}
{"input": "verdict inference is used for Task| context: automatic fact verification has attracted recent research attention as the increasing dissemination of disinformation on social media platforms .", "entity": "verdict inference", "output": "feverous shared task", "neg_sample": ["verdict inference is used for Task", "automatic fact verification has attracted recent research attention as the increasing dissemination of disinformation on social media platforms ."], "relation": "used for", "id": "2021.fever-1.7", "year": 2021, "rel_sent": "In this paper , we propose our 3rd place three - stage system consisting of document retrieval , element retrieval , and verdict inference for the FEVEROUS shared task .", "forward": true, "src_ids": "2021.fever-1.7_8423"}
{"input": "linguistic analysis of text - based communication is used for OtherScientificTerm| context: the complexity loss paradox , which posits that individuals suffering from disease exhibit surprisingly predictable behavioral dynamics , has been observed in a variety of both human and animal physiological systems . the recent advent of online text - based therapy presents a new opportunity to analyze the complexity loss paradox in a novel operationalization : linguistic complexity loss in text - based therapy conversations .", "entity": "linguistic analysis of text - based communication", "output": "anxiety", "neg_sample": ["linguistic analysis of text - based communication is used for OtherScientificTerm", "the complexity loss paradox , which posits that individuals suffering from disease exhibit surprisingly predictable behavioral dynamics , has been observed in a variety of both human and animal physiological systems .", "the recent advent of online text - based therapy presents a new opportunity to analyze the complexity loss paradox in a novel operationalization : linguistic complexity loss in text - based therapy conversations ."], "relation": "used for", "id": "2021.naacl-main.352", "year": 2021, "rel_sent": "These results demonstrate how linguistic analysis of text - based communication can be leveraged as a marker for anxiety , an exciting prospect in a time of both increased online communication and increased mental health issues .", "forward": true, "src_ids": "2021.naacl-main.352_6550"}
{"input": "multi - turn dialogue is done by using Method| context: retrieval - based dialogue systems display an outstanding performance when pre - trained language models are used , which includes bidirectional encoder representations from transformers ( bert ) . during the multi - turn response selection , bert focuses on training the relationship between the context with multiple utterances and the response . however , this method of training is insufficient when considering the relations between each utterance in the context . this leads to a problem of not completely understanding the context flow that is required to select a response .", "entity": "multi - turn dialogue", "output": "fine - grained post - training method", "neg_sample": ["multi - turn dialogue is done by using Method", "retrieval - based dialogue systems display an outstanding performance when pre - trained language models are used , which includes bidirectional encoder representations from transformers ( bert ) .", "during the multi - turn response selection , bert focuses on training the relationship between the context with multiple utterances and the response .", "however , this method of training is insufficient when considering the relations between each utterance in the context .", "this leads to a problem of not completely understanding the context flow that is required to select a response ."], "relation": "used for", "id": "2021.naacl-main.122", "year": 2021, "rel_sent": "To address this issue , we propose a new fine - grained post - training method that reflects the characteristics of the multi - turn dialogue .", "forward": false, "src_ids": "2021.naacl-main.122_1477"}
{"input": "neural models is used for OtherScientificTerm| context: individuals with autism spectrum disorder ( asd ) experience difficulties in social aspects of communication , but the linguistic characteristics associated with deficits in discourse and pragmatic expression are often difficult to precisely identify and quantify . we are currently collecting a corpus of transcribed natural conversations produced in an experimental setting in which participants with and without asd complete a number of collaborative tasks with their neurotypical peers .", "entity": "neural models", "output": "features", "neg_sample": ["neural models is used for OtherScientificTerm", "individuals with autism spectrum disorder ( asd ) experience difficulties in social aspects of communication , but the linguistic characteristics associated with deficits in discourse and pragmatic expression are often difficult to precisely identify and quantify .", "we are currently collecting a corpus of transcribed natural conversations produced in an experimental setting in which participants with and without asd complete a number of collaborative tasks with their neurotypical peers ."], "relation": "used for", "id": "2021.acl-srw.29", "year": 2021, "rel_sent": "We then introduce ongoing work in developing and training neural models to automatically predict these features , with the goal of identifying the same between - groups differences that are observed using manual annotations .", "forward": true, "src_ids": "2021.acl-srw.29_14081"}
{"input": "kbqa module is used for Task| context: conversational kbqa is about answering a sequence of questions related to a kb . follow - up questions in conversational kbqa often have missing information referring to entities from the conversation history .", "entity": "kbqa module", "output": "answer ranking", "neg_sample": ["kbqa module is used for Task", "conversational kbqa is about answering a sequence of questions related to a kb .", "follow - up questions in conversational kbqa often have missing information referring to entities from the conversation history ."], "relation": "used for", "id": "2021.acl-long.255", "year": 2021, "rel_sent": "We propose a novel graph - based model to capture the transitions of focal entities and apply a graph neural network to derive a probability distribution of focal entities for each question , which is then combined with a standard KBQA module to perform answer ranking .", "forward": true, "src_ids": "2021.acl-long.255_6058"}
{"input": "few - shot learning is used for Method| context: interleaved texts , where posts belonging to different threads occur in a sequence , commonly occur in online chat posts , so that it can be time - consuming to quickly obtain an overview of the discussions . existing systems first disentangle the posts by threads and then extract summaries from those threads . a major issue with such systems is error propagation from the disentanglement component . while end - to - end trainable summarization system could obviate explicit disentanglement , such systems require a large amount of labeled data .", "entity": "few - shot learning", "output": "interleaved text summarization model", "neg_sample": ["few - shot learning is used for Method", "interleaved texts , where posts belonging to different threads occur in a sequence , commonly occur in online chat posts , so that it can be time - consuming to quickly obtain an overview of the discussions .", "existing systems first disentangle the posts by threads and then extract summaries from those threads .", "a major issue with such systems is error propagation from the disentanglement component .", "while end - to - end trainable summarization system could obviate explicit disentanglement , such systems require a large amount of labeled data ."], "relation": "used for", "id": "2021.adaptnlp-1.24", "year": 2021, "rel_sent": "Few - Shot Learning of an Interleaved Text Summarization Model by Pretraining with Synthetic Data.", "forward": true, "src_ids": "2021.adaptnlp-1.24_9380"}
{"input": "contrastive evaluation suite is used for Material| context: minimal sentence pairs are frequently used to analyze the behavior of language models . it is often assumed that model behavior on contrastive pairs is predictive of model behavior at large .", "entity": "contrastive evaluation suite", "output": "english - german mt", "neg_sample": ["contrastive evaluation suite is used for Material", "minimal sentence pairs are frequently used to analyze the behavior of language models .", "it is often assumed that model behavior on contrastive pairs is predictive of model behavior at large ."], "relation": "used for", "id": "2021.blackboxnlp-1.5", "year": 2021, "rel_sent": "We present a contrastive evaluation suite for English - German MT that implements this recommendation .", "forward": true, "src_ids": "2021.blackboxnlp-1.5_5093"}
{"input": "livestream transcripts is done by using Method| context: with the explosive growth of livestream broadcasting , there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge . however , the problem is nontrivial due to the informal nature of spoken language . further , there has been a shortage of annotated datasets that are necessary for transcript summarization .", "entity": "livestream transcripts", "output": "streamhover", "neg_sample": ["livestream transcripts is done by using Method", "with the explosive growth of livestream broadcasting , there is an urgent need for new summarization technology that enables us to create a preview of streamed content and tap into this wealth of knowledge .", "however , the problem is nontrivial due to the informal nature of spoken language .", "further , there has been a shortage of annotated datasets that are necessary for transcript summarization ."], "relation": "used for", "id": "2021.emnlp-main.520", "year": 2021, "rel_sent": "In this paper , we present StreamHover , a framework for annotating and summarizing livestream transcripts .", "forward": false, "src_ids": "2021.emnlp-main.520_5084"}
{"input": "model parameters is done by using Method| context: in this paper , we use domain generalization to improve the performance of the cross - device speaker verification system .", "entity": "model parameters", "output": "domain generalization algorithms", "neg_sample": ["model parameters is done by using Method", "in this paper , we use domain generalization to improve the performance of the cross - device speaker verification system ."], "relation": "used for", "id": "2021.rocling-1.12", "year": 2021, "rel_sent": "Based on a trainable speaker verification system , we use domain generalization algorithms tofine - tune the model parameters .", "forward": false, "src_ids": "2021.rocling-1.12_361"}
{"input": "auxiliary loss is used for OtherScientificTerm| context: identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation .", "entity": "auxiliary loss", "output": "dialogue - document connections", "neg_sample": ["auxiliary loss is used for OtherScientificTerm", "identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation ."], "relation": "used for", "id": "2021.emnlp-main.140", "year": 2021, "rel_sent": "An auxiliary loss captures the history of dialogue - document connections .", "forward": true, "src_ids": "2021.emnlp-main.140_5519"}
{"input": "clustering is used for Method| context: we review twofeatures of mixture of experts ( moe ) models which we call averaging and clustering effects in the context of graph - based dependency parsers learned in a supervised probabilistic framework . averaging corresponds to the ensemble combination of parsers and is responsible for variance reduction which helps stabilizing and improving parsing accuracy . although promising , this is difficult to achieve , especially without additional data .", "entity": "clustering", "output": "moe models", "neg_sample": ["clustering is used for Method", "we review twofeatures of mixture of experts ( moe ) models which we call averaging and clustering effects in the context of graph - based dependency parsers learned in a supervised probabilistic framework .", "averaging corresponds to the ensemble combination of parsers and is responsible for variance reduction which helps stabilizing and improving parsing accuracy .", "although promising , this is difficult to achieve , especially without additional data ."], "relation": "used for", "id": "2021.iwpt-1.11", "year": 2021, "rel_sent": "Clustering describes the capacity of MoE models to give more credit to experts believed to be more accurate given an input .", "forward": true, "src_ids": "2021.iwpt-1.11_11378"}
{"input": "referent predictability is used for OtherScientificTerm| context: it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words . studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results .", "entity": "referent predictability", "output": "referential form", "neg_sample": ["referent predictability is used for OtherScientificTerm", "it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words .", "studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results ."], "relation": "used for", "id": "2021.conll-1.36", "year": 2021, "rel_sent": "Does referent predictability affect the choice of referential form ? A computational approach using masked coreference resolution.", "forward": true, "src_ids": "2021.conll-1.36_7906"}
{"input": "salience detection is done by using Method| context: measuring event salience is essential in the understanding of stories .", "entity": "salience detection", "output": "unsupervised method", "neg_sample": ["salience detection is done by using Method", "measuring event salience is essential in the understanding of stories ."], "relation": "used for", "id": "2021.emnlp-main.65", "year": 2021, "rel_sent": "This paper takes a recent unsupervised method for salience detection derived from Barthes Cardinal Functions and theories of surprise and applies it to longer narrative forms .", "forward": false, "src_ids": "2021.emnlp-main.65_3934"}
{"input": "fine - tuning scheme is used for Method| context: the rise of pre - trained language models has yielded substantial progress in the vast majority of natural language processing ( nlp ) tasks . however , a generic approach towards the pre - training procedure can naturally be sub - optimal in some cases . particularly , fine - tuning a pre - trained language model on a source domain and then applying it to a different target domain , results in a sharp performance decline of the eventual classifier for many source - target domain pairs . moreover , in some nlp tasks , the output categories substantially differ between domains , making adaptation even more challenging . this , for example , happens in the task of aspect extraction , where the aspects of interest of reviews of , e.g. , restaurants or electronic devices may be very different .", "entity": "fine - tuning scheme", "output": "bert", "neg_sample": ["fine - tuning scheme is used for Method", "the rise of pre - trained language models has yielded substantial progress in the vast majority of natural language processing ( nlp ) tasks .", "however , a generic approach towards the pre - training procedure can naturally be sub - optimal in some cases .", "particularly , fine - tuning a pre - trained language model on a source domain and then applying it to a different target domain , results in a sharp performance decline of the eventual classifier for many source - target domain pairs .", "moreover , in some nlp tasks , the output categories substantially differ between domains , making adaptation even more challenging .", "this , for example , happens in the task of aspect extraction , where the aspects of interest of reviews of , e.g.", ", restaurants or electronic devices may be very different ."], "relation": "used for", "id": "2021.emnlp-main.20", "year": 2021, "rel_sent": "This paper presents a new fine - tuning scheme for BERT , which aims to address the above challenges .", "forward": true, "src_ids": "2021.emnlp-main.20_9599"}
{"input": "at knowledge is done by using Method| context: non - autoregressive machine translation ( nat ) models have demonstrated significant inference speedup but suffer from inferior translation accuracy . the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g. , with knowledge distillation . in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences .", "entity": "at knowledge", "output": "multi - task learning", "neg_sample": ["at knowledge is done by using Method", "non - autoregressive machine translation ( nat ) models have demonstrated significant inference speedup but suffer from inferior translation accuracy .", "the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g.", ", with knowledge distillation .", "in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences ."], "relation": "used for", "id": "2021.naacl-main.313", "year": 2021, "rel_sent": "Therefore , we propose to adopt multi - task learning to transfer the AT knowledge to NAT models through encoder sharing .", "forward": false, "src_ids": "2021.naacl-main.313_7140"}
{"input": "pretrained model is done by using OtherScientificTerm| context: moreover , when combined with regular learning from examples , this idea yields impressive few - shot results for a wide range of text classification tasks .", "entity": "pretrained model", "output": "task descriptions", "neg_sample": ["pretrained model is done by using OtherScientificTerm", "moreover , when combined with regular learning from examples , this idea yields impressive few - shot results for a wide range of text classification tasks ."], "relation": "used for", "id": "2021.emnlp-main.32", "year": 2021, "rel_sent": "In particular , it is crucial tofind task descriptions that are easy to understand for the pretrained model and to ensure that it actually makes good use of them ; furthermore , effective measures against overfitting have to be implemented .", "forward": false, "src_ids": "2021.emnlp-main.32_12908"}
{"input": "multi - document summarization evaluation is used for Material| context: allowing users to interact with multi - document summarizers is a promising direction towards improving and customizing summary results . different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable .", "entity": "multi - document summarization evaluation", "output": "interactive setting", "neg_sample": ["multi - document summarization evaluation is used for Material", "allowing users to interact with multi - document summarizers is a promising direction towards improving and customizing summary results .", "different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable ."], "relation": "used for", "id": "2021.naacl-main.54", "year": 2021, "rel_sent": "Extending Multi - Document Summarization Evaluation to the Interactive Setting.", "forward": true, "src_ids": "2021.naacl-main.54_8813"}
{"input": "detection of hyperbole is done by using OtherScientificTerm| context: the detection of hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance .", "entity": "detection of hyperbole", "output": "privileged information", "neg_sample": ["detection of hyperbole is done by using OtherScientificTerm", "the detection of hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance ."], "relation": "used for", "id": "2021.alta-1.6", "year": 2021, "rel_sent": "Harnessing Privileged Information for Hyperbole Detection.", "forward": false, "src_ids": "2021.alta-1.6_11497"}
{"input": "bert - persner is used for Task| context: named entity recognition ( ner ) is one of the major tasks in natural language processing . a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization .", "entity": "bert - persner", "output": "persian ner", "neg_sample": ["bert - persner is used for Task", "named entity recognition ( ner ) is one of the major tasks in natural language processing .", "a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization ."], "relation": "used for", "id": "2021.ranlp-1.73", "year": 2021, "rel_sent": "BERT - PersNER has outperformed two available studies in Persian NER , in most cases of our experiments using the supervised learning approach on two Persian datasets called Arman and Peyma .", "forward": true, "src_ids": "2021.ranlp-1.73_12368"}
{"input": "domain - specific vocabulary expansion is used for Task| context: large pretrained models have achieved great success in many natural language processing tasks . however , when they are applied in specific domains , these models suffer from domain shift and bring challenges in fine - tuning and online serving for latency and capacity constraints .", "entity": "domain - specific vocabulary expansion", "output": "adaptation stage", "neg_sample": ["domain - specific vocabulary expansion is used for Task", "large pretrained models have achieved great success in many natural language processing tasks .", "however , when they are applied in specific domains , these models suffer from domain shift and bring challenges in fine - tuning and online serving for latency and capacity constraints ."], "relation": "used for", "id": "2021.findings-acl.40", "year": 2021, "rel_sent": "Specifically , we propose domain - specific vocabulary expansion in the adaptation stage and employ corpus level occurrence probability to choose the size of incremental vocabulary automatically .", "forward": true, "src_ids": "2021.findings-acl.40_13364"}
{"input": "alignment is used for Method| context: zero - shot translations is a fascinating feature of multilingual neural machine translation ( mnmt ) systems . these mnmt models are usually trained on english - centric data , i.e. english either as the source or target language , and with a language label prepended to the input indicating the target language . however , recent work has highlighted several flaws of these models in zero - shot scenarios where language labels are ignored and the wrong language is generated or different runs show highly unstable results .", "entity": "alignment", "output": "transformer - based mnmt models", "neg_sample": ["alignment is used for Method", "zero - shot translations is a fascinating feature of multilingual neural machine translation ( mnmt ) systems .", "these mnmt models are usually trained on english - centric data , i.e.", "english either as the source or target language , and with a language label prepended to the input indicating the target language .", "however , recent work has highlighted several flaws of these models in zero - shot scenarios where language labels are ignored and the wrong language is generated or different runs show highly unstable results ."], "relation": "used for", "id": "2021.emnlp-main.664", "year": 2021, "rel_sent": "In this paper , we investigate the benefits of an explicit alignment to language labels in Transformer - based MNMT models in the zero - shot context , by jointly training one cross attention head with word alignment supervision to stress the focus on the target language label .", "forward": true, "src_ids": "2021.emnlp-main.664_8038"}
{"input": "multimodal jensen - shannon divergence loss is used for Task| context: effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning . prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement .", "entity": "multimodal jensen - shannon divergence loss", "output": "crossmodal refinement", "neg_sample": ["multimodal jensen - shannon divergence loss is used for Task", "effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning .", "prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement ."], "relation": "used for", "id": "2021.emnlp-main.720", "year": 2021, "rel_sent": "Subsequently , those unimodal representations are projected into a common latent space , regularized by a multimodal Jensen - Shannon divergence loss for better crossmodal refinement .", "forward": true, "src_ids": "2021.emnlp-main.720_1052"}
{"input": "downstream systems is done by using Task| context: learning discrete dialog structure graph from human - human dialogs yields basic insights into the structure of conversation , and also provides background knowledge tofacilitate dialog generation . however , this problem is less studied in open - domain dialogue .", "entity": "downstream systems", "output": "coherent dialog generation", "neg_sample": ["downstream systems is done by using Task", "learning discrete dialog structure graph from human - human dialogs yields basic insights into the structure of conversation , and also provides background knowledge tofacilitate dialog generation .", "however , this problem is less studied in open - domain dialogue ."], "relation": "used for", "id": "2021.acl-long.136", "year": 2021, "rel_sent": "In this paper , we conduct unsupervised discovery of discrete dialog structure from chitchat corpora , and then leverage it tofacilitate coherent dialog generation in downstream systems .", "forward": false, "src_ids": "2021.acl-long.136_4608"}
{"input": "vision is done by using Method| context: optimization - based approaches become difficult in the language domain due to the discrete nature of text .", "entity": "vision", "output": "white - box approaches", "neg_sample": ["vision is done by using Method", "optimization - based approaches become difficult in the language domain due to the discrete nature of text ."], "relation": "used for", "id": "2021.emnlp-main.452", "year": 2021, "rel_sent": "White - box approaches have been successfully applied to similar problems in vision where one can directly optimize the continuous input .", "forward": false, "src_ids": "2021.emnlp-main.452_9978"}
{"input": "fine - grained semantic divergences is used for Method| context: while it has been shown that neural machine translation ( nmt ) is highly sensitive to noisy parallel training samples , prior work treats all types of mismatches between source and target as noise . as a result , it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact nmt training .", "entity": "fine - grained semantic divergences", "output": "transformer models", "neg_sample": ["fine - grained semantic divergences is used for Method", "while it has been shown that neural machine translation ( nmt ) is highly sensitive to noisy parallel training samples , prior work treats all types of mismatches between source and target as noise .", "as a result , it remains unclear how samples that are mostly equivalent but contain a small number of semantically divergent tokens impact nmt training ."], "relation": "used for", "id": "2021.acl-long.562", "year": 2021, "rel_sent": "To close this gap , we analyze the impact of different types of fine - grained semantic divergences on Transformer models .", "forward": true, "src_ids": "2021.acl-long.562_2304"}
{"input": "tamil is done by using Method| context: in this work , we explore generating morphologically enhanced word embeddings for tamil , a highly agglutinative south indian language with rich morphology that remains low - resource with regards to nlp tasks .", "entity": "tamil", "output": "morphology - aware meta - embeddings", "neg_sample": ["tamil is done by using Method", "in this work , we explore generating morphologically enhanced word embeddings for tamil , a highly agglutinative south indian language with rich morphology that remains low - resource with regards to nlp tasks ."], "relation": "used for", "id": "2021.naacl-srw.13", "year": 2021, "rel_sent": "Morphology - Aware Meta - Embeddings for Tamil.", "forward": false, "src_ids": "2021.naacl-srw.13_9795"}
{"input": "human - human dialogue dataset is used for Task| context: photochat contains 12k dialogues , each of which is paired with a user photo that is shared during the conversation .", "entity": "human - human dialogue dataset", "output": "joint image - text modeling", "neg_sample": ["human - human dialogue dataset is used for Task", "photochat contains 12k dialogues , each of which is paired with a user photo that is shared during the conversation ."], "relation": "used for", "id": "2021.acl-long.479", "year": 2021, "rel_sent": "PhotoChat : A Human - Human Dialogue Dataset With Photo Sharing Behavior For Joint Image - Text Modeling.", "forward": true, "src_ids": "2021.acl-long.479_9189"}
{"input": "graph is done by using Method| context: schema translation is the task of automatically translating headers of tabular data from one language to another . high - quality schema translation plays an important role in cross - lingual table searching , understanding and analysis . despite its importance , schema translation is not well studied in the community , and state - of - the - art neural machine translation models can not work well on this task because of two intrinsic differences between plain text and tabular data : morphological difference and context difference .", "entity": "graph", "output": "cast", "neg_sample": ["graph is done by using Method", "schema translation is the task of automatically translating headers of tabular data from one language to another .", "high - quality schema translation plays an important role in cross - lingual table searching , understanding and analysis .", "despite its importance , schema translation is not well studied in the community , and state - of - the - art neural machine translation models can not work well on this task because of two intrinsic differences between plain text and tabular data : morphological difference and context difference ."], "relation": "used for", "id": "2021.emnlp-main.5", "year": 2021, "rel_sent": "Then CAST encodes the graph with a relational - aware transformer and uses another transformer to decode the header in the target language .", "forward": false, "src_ids": "2021.emnlp-main.5_8720"}
{"input": "sentiment analysis task is done by using Method| context: despite their success , modern language models are fragile . even small changes in their training pipeline can lead to unexpected results .", "entity": "sentiment analysis task", "output": "stochastic weight averaging", "neg_sample": ["sentiment analysis task is done by using Method", "despite their success , modern language models are fragile .", "even small changes in their training pipeline can lead to unexpected results ."], "relation": "used for", "id": "2021.eval4nlp-1.3", "year": 2021, "rel_sent": "How Emotionally Stable is ALBERT ? Testing Robustness with Stochastic Weight Averaging on a Sentiment Analysis Task.", "forward": false, "src_ids": "2021.eval4nlp-1.3_10291"}
{"input": "efl writing is done by using OtherScientificTerm| context: quantitative research on learner writing has traditionally focused on lexical and syntactic features , but there has been increasing interest in incorporating discourse - level properties .", "entity": "efl writing", "output": "dependency distance", "neg_sample": ["efl writing is done by using OtherScientificTerm", "quantitative research on learner writing has traditionally focused on lexical and syntactic features , but there has been increasing interest in incorporating discourse - level properties ."], "relation": "used for", "id": "2021.tlt-1.10", "year": 2021, "rel_sent": "Discourse Tree Structure and Dependency Distance in EFL Writing.", "forward": false, "src_ids": "2021.tlt-1.10_2047"}
{"input": "mention - based reasoning is used for Method| context: document - level relation extraction aims to detect the relations within one document , which is challenging since it requires complex reasoning using mentions , entities , local and global contexts . few previous studies have distinguished local and global reasoning explicitly , which may be problematic because they play different roles in intraand inter - sentence relations . moreover , the interactions between local and global contexts should be considered since they could help relation reasoning based on our observation .", "entity": "mention - based reasoning", "output": "co - predictor module", "neg_sample": ["mention - based reasoning is used for Method", "document - level relation extraction aims to detect the relations within one document , which is challenging since it requires complex reasoning using mentions , entities , local and global contexts .", "few previous studies have distinguished local and global reasoning explicitly , which may be problematic because they play different roles in intraand inter - sentence relations .", "moreover , the interactions between local and global contexts should be considered since they could help relation reasoning based on our observation ."], "relation": "used for", "id": "2021.findings-acl.117", "year": 2021, "rel_sent": "Based on MRN , we design a co - predictor module to predict entity relations based on local and global entity and relation representations jointly .", "forward": true, "src_ids": "2021.findings-acl.117_4103"}
{"input": "slot transferability is used for Task| context: it is of great significance for transferring a dialogue system into new domains . most of the existing work focused on building a cross - domain transfer model .", "entity": "slot transferability", "output": "cross - domain slot filling", "neg_sample": ["slot transferability is used for Task", "it is of great significance for transferring a dialogue system into new domains .", "most of the existing work focused on building a cross - domain transfer model ."], "relation": "used for", "id": "2021.findings-acl.440", "year": 2021, "rel_sent": "Slot Transferability for Cross - domain Slot Filling.", "forward": true, "src_ids": "2021.findings-acl.440_14045"}
{"input": "l0drop layer is used for Method| context: sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation . in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding .", "entity": "l0drop layer", "output": "transformer", "neg_sample": ["l0drop layer is used for Method", "sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation .", "in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding ."], "relation": "used for", "id": "2021.findings-acl.255", "year": 2021, "rel_sent": "In other words , via joint training , the L0DROP layer forces Transformer to route information through a subset of its encoder states .", "forward": true, "src_ids": "2021.findings-acl.255_12009"}
{"input": "semi - supervised language models is used for Task| context: first - hand experience related to any changes of one 's health condition and understanding such experience can play an important role in advancing medical science and healthcare . monitoring the safe use of medication drugs is an important task of pharmacovigilance , and first - hand experience of effects about consumers ' medication intake can be valuable to gain insight into how our human body reacts to medications . social media have been considered as a possible alternative data source for gathering personal experience with medications posted by users . identifying personal experience tweets is a challenging classification task , and efforts have made to tackle the challenges using supervised approaches requiring annotated data .", "entity": "semi - supervised language models", "output": "identification of personal health experiential", "neg_sample": ["semi - supervised language models is used for Task", "first - hand experience related to any changes of one 's health condition and understanding such experience can play an important role in advancing medical science and healthcare .", "monitoring the safe use of medication drugs is an important task of pharmacovigilance , and first - hand experience of effects about consumers ' medication intake can be valuable to gain insight into how our human body reacts to medications .", "social media have been considered as a possible alternative data source for gathering personal experience with medications posted by users .", "identifying personal experience tweets is a challenging classification task , and efforts have made to tackle the challenges using supervised approaches requiring annotated data ."], "relation": "used for", "id": "2021.bionlp-1.25", "year": 2021, "rel_sent": "Semi - Supervised Language Models for Identification of Personal Health Experiential from Twitter Data : A Case for Medication Effects.", "forward": true, "src_ids": "2021.bionlp-1.25_7739"}
{"input": "offensive words is done by using OtherScientificTerm| context: this paper describes the development of an online lexical resource to help detection systems regulate and curb the use of offensive words online . with the growing prevalence of social media platforms , many conversations are now conducted on- line . the increase of online conversations for leisure , work and socializing has led to an increase in harassment .", "entity": "offensive words", "output": "vocabulary", "neg_sample": ["offensive words is done by using OtherScientificTerm", "this paper describes the development of an online lexical resource to help detection systems regulate and curb the use of offensive words online .", "with the growing prevalence of social media platforms , many conversations are now conducted on- line .", "the increase of online conversations for leisure , work and socializing has led to an increase in harassment ."], "relation": "used for", "id": "2021.gwc-1.5", "year": 2021, "rel_sent": "This paper then discusses the evaluation of the vocabulary as a resource for representing and classifying offensive words and as a possible resource for offensive word use detection in social media .", "forward": false, "src_ids": "2021.gwc-1.5_6272"}
{"input": "neural aspect extraction models is used for Task| context: aspect extraction is not a well - explored topic in hindi , with only one corpus having been developed for the task .", "entity": "neural aspect extraction models", "output": "monolingual and multilingual settings", "neg_sample": ["neural aspect extraction models is used for Task", "aspect extraction is not a well - explored topic in hindi , with only one corpus having been developed for the task ."], "relation": "used for", "id": "2021.ecnlp-1.17", "year": 2021, "rel_sent": "We also evaluate our dataset using state - of - the - art neural aspect extraction models in both monolingual and multilingual settings and show that the models perform far better on our corpus than on the existing Hindi dataset .", "forward": true, "src_ids": "2021.ecnlp-1.17_7512"}
{"input": "argument pair extraction ( ape ) is used for OtherScientificTerm| context: previous work studied this task in the context of peer review and rebuttal , and decomposed it into a sequence labeling task and a sentence relation classification task . however , despite the promising performance , such an approach obtains the argument pairs implicitly by the two decomposed tasks , lacking explicitly modeling of the argument - level interactions between argument pairs .", "entity": "argument pair extraction ( ape )", "output": "interactive argument pairs", "neg_sample": ["argument pair extraction ( ape ) is used for OtherScientificTerm", "previous work studied this task in the context of peer review and rebuttal , and decomposed it into a sequence labeling task and a sentence relation classification task .", "however , despite the promising performance , such an approach obtains the argument pairs implicitly by the two decomposed tasks , lacking explicitly modeling of the argument - level interactions between argument pairs ."], "relation": "used for", "id": "2021.emnlp-main.319", "year": 2021, "rel_sent": "Argument pair extraction ( APE ) aims to extract interactive argument pairs from two passages of a discussion .", "forward": true, "src_ids": "2021.emnlp-main.319_9615"}
{"input": "box - to - box transformations is used for OtherScientificTerm| context: learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations . box embeddings ( vilnis et al . , 2018 ; li et al . , 2019 ; dasgupta et al . , 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure .", "entity": "box - to - box transformations", "output": "joint hierarchies", "neg_sample": ["box - to - box transformations is used for OtherScientificTerm", "learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations .", "box embeddings ( vilnis et al .", ", 2018 ; li et al .", ", 2019 ; dasgupta et al .", ", 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure ."], "relation": "used for", "id": "2021.repl4nlp-1.28", "year": 2021, "rel_sent": "Box - To - Box Transformations for Modeling Joint Hierarchies.", "forward": true, "src_ids": "2021.repl4nlp-1.28_14379"}
{"input": "natural language understanding tasks is done by using Method| context: automatic detection of toxic language plays an essential role in protecting social media users , especially minority groups , from verbal abuse . however , biases toward some attributes , including gender , race , and dialect , exist in most training datasets for toxicity detection . the biases make the learned models unfair and can even exacerbate the marginalization of people .", "entity": "natural language understanding tasks", "output": "debiasing methods", "neg_sample": ["natural language understanding tasks is done by using Method", "automatic detection of toxic language plays an essential role in protecting social media users , especially minority groups , from verbal abuse .", "however , biases toward some attributes , including gender , race , and dialect , exist in most training datasets for toxicity detection .", "the biases make the learned models unfair and can even exacerbate the marginalization of people ."], "relation": "used for", "id": "2021.woah-1.12", "year": 2021, "rel_sent": "Considering that current debiasing methods for general natural language understanding tasks can not effectively mitigate the biases in the toxicity detectors , we propose to use invariant rationalization ( InvRat ) , a game - theoretic framework consisting of a rationale generator and a predictor , to rule out the spurious correlation of certain syntactic patterns ( e.g. , identity mentions , dialect ) to toxicity labels .", "forward": false, "src_ids": "2021.woah-1.12_8313"}
{"input": "robustly fine - tuning plms is done by using Method| context: recent works have shown that powerful pre - trained language models ( plm ) can be fooled by small perturbations or intentional attacks . to solve this issue , various data augmentation techniques are proposed to improve the robustness of plms . however , it is still challenging to augment semantically relevant examples with sufficient diversity .", "entity": "robustly fine - tuning plms", "output": "virtual data augmentation ( vda )", "neg_sample": ["robustly fine - tuning plms is done by using Method", "recent works have shown that powerful pre - trained language models ( plm ) can be fooled by small perturbations or intentional attacks .", "to solve this issue , various data augmentation techniques are proposed to improve the robustness of plms .", "however , it is still challenging to augment semantically relevant examples with sufficient diversity ."], "relation": "used for", "id": "2021.emnlp-main.315", "year": 2021, "rel_sent": "In this work , we present Virtual Data Augmentation ( VDA ) , a general framework for robustly fine - tuning PLMs .", "forward": false, "src_ids": "2021.emnlp-main.315_7295"}
{"input": "ssncse nlp is used for Task| context: social media are interactive platforms that facilitate the creation or sharing of information , ideas or other forms of expression among people . this exchange is not free from offensive , trolling or malicious contents targeting users or communities . one way of trolling is by making memes . a meme is an image or video that represents the thoughts and feelings of a specific audience . the challenge of dealing with memes is that they are region - specific and their meaning is often obscured in humour or sarcasm . a meme is a form of media that spreads an idea or emotion across the internet . the multi modal nature of memes , postings of hateful memes or related events like trolling , cyberbullying are increasing day by day . memes make it even more challenging since they express humour and sarcasm in an implicit way , because of which the meme may not be offensive if we only consider the text or the image .", "entity": "ssncse nlp", "output": "meme classification", "neg_sample": ["ssncse nlp is used for Task", "social media are interactive platforms that facilitate the creation or sharing of information , ideas or other forms of expression among people .", "this exchange is not free from offensive , trolling or malicious contents targeting users or communities .", "one way of trolling is by making memes .", "a meme is an image or video that represents the thoughts and feelings of a specific audience .", "the challenge of dealing with memes is that they are region - specific and their meaning is often obscured in humour or sarcasm .", "a meme is a form of media that spreads an idea or emotion across the internet .", "the multi modal nature of memes , postings of hateful memes or related events like trolling , cyberbullying are increasing day by day .", "memes make it even more challenging since they express humour and sarcasm in an implicit way , because of which the meme may not be offensive if we only consider the text or the image ."], "relation": "used for", "id": "2021.dravidianlangtech-1.49", "year": 2021, "rel_sent": "This work explains the submissions made by SSNCSE NLP in DravidanLangTechEACL2021 task for meme classification in Tamil language .", "forward": true, "src_ids": "2021.dravidianlangtech-1.49_552"}
{"input": "neural models is done by using Material| context: in an election campaign , political parties pledge to implement various projects - should they be elected . but do they follow through ?", "entity": "neural models", "output": "election manifestos", "neg_sample": ["neural models is done by using Material", "in an election campaign , political parties pledge to implement various projects - should they be elected .", "but do they follow through ?"], "relation": "used for", "id": "2021.findings-acl.301", "year": 2021, "rel_sent": "In this paper , we use election manifestos of Swedish and Indian political parties to learn neural models that distinguish actual pledges from generic political positions .", "forward": false, "src_ids": "2021.findings-acl.301_464"}
{"input": "cross - lingual nlp applications is done by using Material| context: we describe the process of conversion between the pos tagging schemes of two languages , the icelandic mim - gold tagging scheme and the faroese sosialurin tagging scheme . these tagging schemes are functionally similar but use separate ways to encode fine - grained morphological information on tokenised text .", "entity": "cross - lingual nlp applications", "output": "icelandic corpora", "neg_sample": ["cross - lingual nlp applications is done by using Material", "we describe the process of conversion between the pos tagging schemes of two languages , the icelandic mim - gold tagging scheme and the faroese sosialurin tagging scheme .", "these tagging schemes are functionally similar but use separate ways to encode fine - grained morphological information on tokenised text ."], "relation": "used for", "id": "2021.nodalida-main.33", "year": 2021, "rel_sent": "As a product of our work , we present a provisional version of Icelandic corpora , prepared in the Faroese PoS tagging scheme , ready for use in cross - lingual NLP applications .", "forward": false, "src_ids": "2021.nodalida-main.33_8748"}
{"input": "role - selected sharing network is used for Task| context: chatbot is increasingly thriving in different domains , however , because of unexpected discourse complexity and training data sparseness , its potential distrust hatches vital apprehension .", "entity": "role - selected sharing network", "output": "machine - human chatting handoff ( mhch )", "neg_sample": ["role - selected sharing network is used for Task", "chatbot is increasingly thriving in different domains , however , because of unexpected discourse complexity and training data sparseness , its potential distrust hatches vital apprehension ."], "relation": "used for", "id": "2021.emnlp-main.767", "year": 2021, "rel_sent": "A Role - Selected Sharing Network for Joint Machine - Human Chatting Handoff and Service Satisfaction Analysis.", "forward": true, "src_ids": "2021.emnlp-main.767_9689"}
{"input": "statistical sequence models is used for Task| context: cmas are more fine - grained sub - utterance acts compared to traditional dialogue act mark - up .", "entity": "statistical sequence models", "output": "tutor content related decisions", "neg_sample": ["statistical sequence models is used for Task", "cmas are more fine - grained sub - utterance acts compared to traditional dialogue act mark - up ."], "relation": "used for", "id": "2021.reinact-1.4", "year": 2021, "rel_sent": "We annotate a corpus of analogical episodes with the schema and develop statistical sequence models from the corpus which predict tutor content related decisions , in terms of the selection of the analogical component ( AC ) and tutor conversational management act ( TCMA ) to deploy at the current utterance , given the student 's behaviour .", "forward": true, "src_ids": "2021.reinact-1.4_15835"}
{"input": "mtl research is done by using Method| context: a key problem in multi - task learning ( mtl ) research is how to select high - quality auxiliary tasks automatically .", "entity": "mtl research", "output": "gradts", "neg_sample": ["mtl research is done by using Method", "a key problem in multi - task learning ( mtl ) research is how to select high - quality auxiliary tasks automatically ."], "relation": "used for", "id": "2021.emnlp-main.455", "year": 2021, "rel_sent": "The efficiency and efficacy of GradTS in these case studies illustrate its general applicability in MTL research without requiring manual task filtering or costly parameter tuning .", "forward": false, "src_ids": "2021.emnlp-main.455_4674"}
{"input": "diversion rectification is done by using Method| context: topic diversion occurs frequently with engaging open - domain dialogue systems like virtual assistants . the balance between staying on topic and rectifying the topic drift is important for a good collaborative system .", "entity": "diversion rectification", "output": "system initiative", "neg_sample": ["diversion rectification is done by using Method", "topic diversion occurs frequently with engaging open - domain dialogue systems like virtual assistants .", "the balance between staying on topic and rectifying the topic drift is important for a good collaborative system ."], "relation": "used for", "id": "2021.sigdial-1.17", "year": 2021, "rel_sent": "We propose a preliminary study , classifying utterances into major , minor and off - topics , which further extends into a system initiative for diversion rectification .", "forward": false, "src_ids": "2021.sigdial-1.17_5987"}
{"input": "hierarchical attention network is used for OtherScientificTerm| context: previous studies only use features from pretrained action detection models as motion representations of the video to solve the verb sense ambiguity , leaving the noun sense ambiguity a problem .", "entity": "hierarchical attention network", "output": "spatial information", "neg_sample": ["hierarchical attention network is used for OtherScientificTerm", "previous studies only use features from pretrained action detection models as motion representations of the video to solve the verb sense ambiguity , leaving the noun sense ambiguity a problem ."], "relation": "used for", "id": "2021.acl-srw.9", "year": 2021, "rel_sent": "For spatial features , we propose a hierarchical attention network to model the spatial information from object - level to video - level .", "forward": true, "src_ids": "2021.acl-srw.9_15525"}
{"input": "salient opinions is done by using OtherScientificTerm| context: summarisation of reviews aims at compressing opinions expressed in multiple review documents into a concise form while still covering the key opinions . despite the advancement in summarisation models , evaluation metrics for opinionated text summaries lag behind and still rely on lexical - matching metrics such as rouge .", "entity": "salient opinions", "output": "qa pairs", "neg_sample": ["salient opinions is done by using OtherScientificTerm", "summarisation of reviews aims at compressing opinions expressed in multiple review documents into a concise form while still covering the key opinions .", "despite the advancement in summarisation models , evaluation metrics for opinionated text summaries lag behind and still rely on lexical - matching metrics such as rouge ."], "relation": "used for", "id": "2021.alta-1.9", "year": 2021, "rel_sent": "We propose to identify opinion - bearing text spans in the reference summary to generate QA pairs so as to capture salient opinions .", "forward": false, "src_ids": "2021.alta-1.9_12409"}
{"input": "multi - step retrieval approach ( beamdr ) is used for OtherScientificTerm| context: complex question answering often requires finding a reasoning chain that consists of multiple evidence pieces . current approaches incorporate the strengths of structured knowledge and unstructured text , assuming text corpora is semi - structured .", "entity": "multi - step retrieval approach ( beamdr )", "output": "evidence chain", "neg_sample": ["multi - step retrieval approach ( beamdr ) is used for OtherScientificTerm", "complex question answering often requires finding a reasoning chain that consists of multiple evidence pieces .", "current approaches incorporate the strengths of structured knowledge and unstructured text , assuming text corpora is semi - structured ."], "relation": "used for", "id": "2021.naacl-main.368", "year": 2021, "rel_sent": "Building on dense retrieval methods , we propose a new multi - step retrieval approach ( BeamDR ) that iteratively forms an evidence chain through beam search in dense representations .", "forward": true, "src_ids": "2021.naacl-main.368_8083"}
{"input": "redaction is used for Task| context: we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts . previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption . in a production environment , end users require low - latency models which can be readily integrated into existing pipelines .", "entity": "redaction", "output": "privacy law compliance", "neg_sample": ["redaction is used for Task", "we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts .", "previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption .", "in a production environment , end users require low - latency models which can be readily integrated into existing pipelines ."], "relation": "used for", "id": "2021.wnut-1.40", "year": 2021, "rel_sent": "We show that this model , while not as accurate as its Transformer - based counterpart , is highly effective in identifying items which require redaction for privacy law compliance .", "forward": true, "src_ids": "2021.wnut-1.40_6146"}
{"input": "few - shot semantic parsers is done by using Method| context: we explore the use of large pretrained language models as few - shot semantic parsers .", "entity": "few - shot semantic parsers", "output": "constrained language models", "neg_sample": ["few - shot semantic parsers is done by using Method", "we explore the use of large pretrained language models as few - shot semantic parsers ."], "relation": "used for", "id": "2021.emnlp-main.608", "year": 2021, "rel_sent": "Constrained Language Models Yield Few - Shot Semantic Parsers.", "forward": false, "src_ids": "2021.emnlp-main.608_5831"}
{"input": "end - to - end pipeline is used for OtherScientificTerm| context: people rely on digital task management tools , such as email or to - do apps , to manage their tasks . some of these tasks are large and complex , leading to action paralysis and feelings of being overwhelmed on the part of the user . the micro - productivity literature has shown that such tasks could benefit from being decomposed and organized , in order to reduce user cognitive load .", "entity": "end - to - end pipeline", "output": "dependency graph", "neg_sample": ["end - to - end pipeline is used for OtherScientificTerm", "people rely on digital task management tools , such as email or to - do apps , to manage their tasks .", "some of these tasks are large and complex , leading to action paralysis and feelings of being overwhelmed on the part of the user .", "the micro - productivity literature has shown that such tasks could benefit from being decomposed and organized , in order to reduce user cognitive load ."], "relation": "used for", "id": "2021.naacl-main.217", "year": 2021, "rel_sent": "Thus in this paper , we propose a novel end - to - end pipeline that consumes a complex task and induces a dependency graph from unstructured text to represent sub - tasks and their relationships .", "forward": true, "src_ids": "2021.naacl-main.217_9674"}
{"input": "features is done by using Method| context: successful machine translation ( mt ) deployment requires understanding not only the intrinsic qualities of mt output , such as fluency and adequacy , but also user perceptions . users who do not understand the source language respond to mt output based on their perception of the likelihood that the meaning of the mt output matches the meaning of the source text . we refer to this as believability . output that is not believable may be off - putting to users , but believable mt output with incorrect meaning may mislead them .", "entity": "features", "output": "mt direct assessment protocols", "neg_sample": ["features is done by using Method", "successful machine translation ( mt ) deployment requires understanding not only the intrinsic qualities of mt output , such as fluency and adequacy , but also user perceptions .", "users who do not understand the source language respond to mt output based on their perception of the likelihood that the meaning of the mt output matches the meaning of the source text .", "we refer to this as believability .", "output that is not believable may be off - putting to users , but believable mt output with incorrect meaning may mislead them ."], "relation": "used for", "id": "2021.hcinlp-1.14", "year": 2021, "rel_sent": "In this work , we study the relationship of believability tofluency and adequacy by applying traditional MT direct assessment protocols to annotate all three features on the output of neural MT systems .", "forward": false, "src_ids": "2021.hcinlp-1.14_4075"}
{"input": "graph - based encoder - decoder model is used for Task| context: abstractive summarization for long - document or multi - document remains challenging for the seq2seq architecture , as seq2seq is not good at analyzing long - distance relations in text .", "entity": "graph - based encoder - decoder model", "output": "summary generation process", "neg_sample": ["graph - based encoder - decoder model is used for Task", "abstractive summarization for long - document or multi - document remains challenging for the seq2seq architecture , as seq2seq is not good at analyzing long - distance relations in text ."], "relation": "used for", "id": "2021.acl-long.472", "year": 2021, "rel_sent": "Further , a graph - based encoder - decoder model is proposed to improve both the document representation and summary generation process by leveraging the graph structure .", "forward": true, "src_ids": "2021.acl-long.472_8135"}
{"input": "dutnlp machine translation system is used for Task| context: this paper describes dut - nlp lab 's submission to the wmt-21 triangular machine translation shared task . the participants are not allowed to use other data and the translation direction of this task is russian - to - chinese .", "entity": "dutnlp machine translation system", "output": "wmt21 triangular translation task", "neg_sample": ["dutnlp machine translation system is used for Task", "this paper describes dut - nlp lab 's submission to the wmt-21 triangular machine translation shared task .", "the participants are not allowed to use other data and the translation direction of this task is russian - to - chinese ."], "relation": "used for", "id": "2021.wmt-1.38", "year": 2021, "rel_sent": "DUTNLP Machine Translation System for WMT21 Triangular Translation Task.", "forward": true, "src_ids": "2021.wmt-1.38_614"}
{"input": "label propagation is done by using OtherScientificTerm| context: short text classification is a fundamental task in natural language processing . it is hard due to the lack of context information and labeled data in practice .", "entity": "label propagation", "output": "short document graph", "neg_sample": ["label propagation is done by using OtherScientificTerm", "short text classification is a fundamental task in natural language processing .", "it is hard due to the lack of context information and labeled data in practice ."], "relation": "used for", "id": "2021.emnlp-main.247", "year": 2021, "rel_sent": "Then , we dynamically learn a short document graph that facilitates effective label propagation among similar short texts .", "forward": false, "src_ids": "2021.emnlp-main.247_974"}
{"input": "inverse dynamics decoder is used for Task| context: text - based games simulate worlds and interact with players using natural language . recent work has used them as a testbed for autonomous language - understanding agents , with the motivation being that understanding the meanings of words or semantics is a key component of how humans understand , reason , and act in these worlds . however , it remains unclear to what extent artificial agents utilize semantic understanding of the text .", "entity": "inverse dynamics decoder", "output": "exploration", "neg_sample": ["inverse dynamics decoder is used for Task", "text - based games simulate worlds and interact with players using natural language .", "recent work has used them as a testbed for autonomous language - understanding agents , with the motivation being that understanding the meanings of words or semantics is a key component of how humans understand , reason , and act in these worlds .", "however , it remains unclear to what extent artificial agents utilize semantic understanding of the text ."], "relation": "used for", "id": "2021.naacl-main.247", "year": 2021, "rel_sent": "To remedy this deficiency , we propose an inverse dynamics decoder to regularize the representation space and encourage exploration , which shows improved performance on several games including Zork I.", "forward": true, "src_ids": "2021.naacl-main.247_9902"}
{"input": "coordination structure identification is done by using Method| context: bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly .", "entity": "coordination structure identification", "output": "transition - based bubble parsing", "neg_sample": ["coordination structure identification is done by using Method", "bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly ."], "relation": "used for", "id": "2021.acl-long.557", "year": 2021, "rel_sent": "We propose a transition - based bubble parser to perform coordination structure identification and dependency - based syntactic analysis simultaneously .", "forward": false, "src_ids": "2021.acl-long.557_13688"}
{"input": "annotated rewrite cases is used for Method| context: recently , text - to - sql for multi - turn dialogue has attracted great interest . here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history . current approaches mostly employ end - to - end models and consequently face two challenges . first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement . second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity .", "entity": "annotated rewrite cases", "output": "decoupled method", "neg_sample": ["annotated rewrite cases is used for Method", "recently , text - to - sql for multi - turn dialogue has attracted great interest .", "here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history .", "current approaches mostly employ end - to - end models and consequently face two challenges .", "first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement .", "second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity ."], "relation": "used for", "id": "2021.findings-acl.270", "year": 2021, "rel_sent": "With just a few annotated rewrite cases , the decoupled method outperforms the released state - of - the - art endto - end models on both SParC and CoSQL datasets .", "forward": true, "src_ids": "2021.findings-acl.270_4340"}
{"input": "encoderdecoder model is used for Task| context: simultaneous span detection and classification is a task not currently addressed in standard nlp frameworks .", "entity": "encoderdecoder model", "output": "semeval-2021 task 6", "neg_sample": ["encoderdecoder model is used for Task", "simultaneous span detection and classification is a task not currently addressed in standard nlp frameworks ."], "relation": "used for", "id": "2021.semeval-1.32", "year": 2021, "rel_sent": "The present paper describes why and how an EncoderDecoder model was used to combine span detection and classification to address subtask 2 of SemEval-2021 Task 6 .", "forward": true, "src_ids": "2021.semeval-1.32_561"}
{"input": "feature is done by using OtherScientificTerm| context: misleading information spreads on the internet at an incredible speed , which can lead to irreparable consequences in some cases . therefore , it is becoming essential to develop fake news detection technologies . while substantial work has been done in this direction , one of the limitations of the current approaches is that these models are focused only on one language and do not use multilingual information .", "entity": "feature", "output": "cross - lingual evidence ( ce )", "neg_sample": ["feature is done by using OtherScientificTerm", "misleading information spreads on the internet at an incredible speed , which can lead to irreparable consequences in some cases .", "therefore , it is becoming essential to develop fake news detection technologies .", "while substantial work has been done in this direction , one of the limitations of the current approaches is that these models are focused only on one language and do not use multilingual information ."], "relation": "used for", "id": "2021.acl-srw.32", "year": 2021, "rel_sent": "The hypothesis of the usage of cross - lingual evidence as a feature for fake news detection is confirmed , firstly , by manual experiment based on a set of known true and fake news .", "forward": false, "src_ids": "2021.acl-srw.32_8765"}
{"input": "cider - r is done by using Method| context: this paper shows that cider - d , a traditional evaluation metric for image description , does not work properly on datasets where the number of words in the sentence is significantly greater than those in the ms coco captions dataset . we also show that cider - d has performance hampered by the lack of multiple reference sentences and high variance of sentence length .", "entity": "cider - r", "output": "self - critical sequence training", "neg_sample": ["cider - r is done by using Method", "this paper shows that cider - d , a traditional evaluation metric for image description , does not work properly on datasets where the number of words in the sentence is significantly greater than those in the ms coco captions dataset .", "we also show that cider - d has performance hampered by the lack of multiple reference sentences and high variance of sentence length ."], "relation": "used for", "id": "2021.wnut-1.39", "year": 2021, "rel_sent": "Our results reveal that using Self - Critical Sequence Training to optimize CIDEr - R generates descriptive captions .", "forward": false, "src_ids": "2021.wnut-1.39_13504"}
{"input": "relational triplet extraction is done by using Method| context: in this work , we show that all of the relational fact extraction models can be organized according to a graph - oriented analytical perspective .", "entity": "relational triplet extraction", "output": "( sota ) models", "neg_sample": ["relational triplet extraction is done by using Method", "in this work , we show that all of the relational fact extraction models can be organized according to a graph - oriented analytical perspective ."], "relation": "used for", "id": "2021.findings-acl.271", "year": 2021, "rel_sent": "Extensive experiments are conducted on two benchmark datasets , and results prove that the proposed model outperforms a series of state - of - the - art ( SoTA ) models for relational triplet extraction .", "forward": false, "src_ids": "2021.findings-acl.271_3458"}
{"input": "data augmentation approaches is used for Task| context: despite this recent upsurge , this area is still relatively underexplored , perhaps due to the challenges posed by the discrete nature of language data .", "entity": "data augmentation approaches", "output": "nlp", "neg_sample": ["data augmentation approaches is used for Task", "despite this recent upsurge , this area is still relatively underexplored , perhaps due to the challenges posed by the discrete nature of language data ."], "relation": "used for", "id": "2021.findings-acl.84", "year": 2021, "rel_sent": "Overall , our paper aims to clarify the landscape of existing literature in data augmentation for NLP and motivate additional work in this area .", "forward": true, "src_ids": "2021.findings-acl.84_13633"}
{"input": "open domain question answering is done by using Method| context: the current state - of - the - art generative models for open - domain question answering ( odqa ) have focused on generating direct answers from unstructured textual information . however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql . furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability .", "entity": "open domain question answering", "output": "dual reader - parser", "neg_sample": ["open domain question answering is done by using Method", "the current state - of - the - art generative models for open - domain question answering ( odqa ) have focused on generating direct answers from unstructured textual information .", "however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql .", "furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability ."], "relation": "used for", "id": "2021.acl-long.315", "year": 2021, "rel_sent": "Dual Reader - Parser on Hybrid Textual and Tabular Evidence for Open Domain Question Answering.", "forward": false, "src_ids": "2021.acl-long.315_14875"}
{"input": "chinese dataset is used for Task| context: great research interests have been attracted to devise ai services that are able to provide mental health support . however , the lack of corpora is a main obstacle to this research , particularly in chinese language .", "entity": "chinese dataset", "output": "generating long counseling text", "neg_sample": ["chinese dataset is used for Task", "great research interests have been attracted to devise ai services that are able to provide mental health support .", "however , the lack of corpora is a main obstacle to this research , particularly in chinese language ."], "relation": "used for", "id": "2021.findings-acl.130", "year": 2021, "rel_sent": "PsyQA : A Chinese Dataset for Generating Long Counseling Text for Mental Health Support.", "forward": true, "src_ids": "2021.findings-acl.130_4436"}
{"input": "length configuration is done by using Method| context: despite transformers ' impressive accuracy , their computational cost is often prohibitive to use with limited computational resources . most previous approaches to improve inference efficiency require a separate model for each possible computational budget .", "entity": "length configuration", "output": "multi - objective evolutionary search", "neg_sample": ["length configuration is done by using Method", "despite transformers ' impressive accuracy , their computational cost is often prohibitive to use with limited computational resources .", "most previous approaches to improve inference efficiency require a separate model for each possible computational budget ."], "relation": "used for", "id": "2021.acl-long.508", "year": 2021, "rel_sent": "We then conduct a multi - objective evolutionary search tofind a length configuration that maximizes the accuracy and minimizes the efficiency metric under any given computational budget .", "forward": false, "src_ids": "2021.acl-long.508_14485"}
{"input": "label - semantic augmented meta - learner is used for Task| context: most of the current studies focus on building a meta - learner from the information of input texts but ignore abundant semantic information beneath class labels .", "entity": "label - semantic augmented meta - learner", "output": "few - shot text classification problems", "neg_sample": ["label - semantic augmented meta - learner is used for Task", "most of the current studies focus on building a meta - learner from the information of input texts but ignore abundant semantic information beneath class labels ."], "relation": "used for", "id": "2021.findings-acl.245", "year": 2021, "rel_sent": "Do n't Miss the Labels : Label - semantic Augmented Meta - Learner for Few - Shot Text Classification.", "forward": true, "src_ids": "2021.findings-acl.245_4183"}
{"input": "transparent neural - symbolic reasoning framework is used for Task| context: ' the predominant approach of visual question answering ( vqa ) relies on encoding the imageand question with a ' black box ' neural encoder and decoding a single token into answers suchas ' yes ' or ' no ' . despite this approach 's strong quantitative results it struggles to come up withhuman - readable forms of justification for the prediction process .", "entity": "transparent neural - symbolic reasoning framework", "output": "real - world visual question answering", "neg_sample": ["transparent neural - symbolic reasoning framework is used for Task", "' the predominant approach of visual question answering ( vqa ) relies on encoding the imageand question with a ' black box ' neural encoder and decoding a single token into answers suchas ' yes ' or ' no ' .", "despite this approach 's strong quantitative results it struggles to come up withhuman - readable forms of justification for the prediction process ."], "relation": "used for", "id": "2021.ccl-1.92", "year": 2021, "rel_sent": "LRRA : A Transparent Neural - Symbolic Reasoning Framework for Real - World Visual Question Answering.", "forward": true, "src_ids": "2021.ccl-1.92_3449"}
{"input": "mandarin chinese language models is done by using OtherScientificTerm| context: prior work has shown that structural supervision helps english language models learn generalizations about syntactic phenomena such as subject - verb agreement . however , it remains unclear if such an inductive bias would also improve language models ' ability to learn grammatical dependencies in typologically different languages .", "entity": "mandarin chinese language models", "output": "grammatical knowledge", "neg_sample": ["mandarin chinese language models is done by using OtherScientificTerm", "prior work has shown that structural supervision helps english language models learn generalizations about syntactic phenomena such as subject - verb agreement .", "however , it remains unclear if such an inductive bias would also improve language models ' ability to learn grammatical dependencies in typologically different languages ."], "relation": "used for", "id": "2021.emnlp-main.454", "year": 2021, "rel_sent": "Controlled Evaluation of Grammatical Knowledge in Mandarin Chinese Language Models.", "forward": false, "src_ids": "2021.emnlp-main.454_7377"}
{"input": "aspect terms is done by using Task| context: while self - training is potentially an effective method to address this issue , the pseudo - labels it yields on unlabeled data could induce noise .", "entity": "aspect terms", "output": "aspect term extraction", "neg_sample": ["aspect terms is done by using Task", "while self - training is potentially an effective method to address this issue , the pseudo - labels it yields on unlabeled data could induce noise ."], "relation": "used for", "id": "2021.emnlp-main.23", "year": 2021, "rel_sent": "Aspect term extraction aims to extract aspect terms from a review sentence that users have expressed opinions on .", "forward": false, "src_ids": "2021.emnlp-main.23_10152"}
{"input": "sample selection is used for OtherScientificTerm| context: while pre - trained language models have obtained state - of - the - art performance for several natural language understanding tasks , they are quite opaque in terms of their decision - making process . while some recent works focus on rationalizing neural predictions by highlighting salient concepts in the text as justifications or rationales , they rely on thousands of labeled training examples for both task labels as well as annotated rationales for every instance . such extensive large - scale annotations are infeasible to obtain for many tasks .", "entity": "sample selection", "output": "informative pseudo - labeled examples", "neg_sample": ["sample selection is used for OtherScientificTerm", "while pre - trained language models have obtained state - of - the - art performance for several natural language understanding tasks , they are quite opaque in terms of their decision - making process .", "while some recent works focus on rationalizing neural predictions by highlighting salient concepts in the text as justifications or rationales , they rely on thousands of labeled training examples for both task labels as well as annotated rationales for every instance .", "such extensive large - scale annotations are infeasible to obtain for many tasks ."], "relation": "used for", "id": "2021.emnlp-main.836", "year": 2021, "rel_sent": "To this end , we develop a multi - task teacher - student framework based on self - training pre - trained language models with limited task - specific labels and rationales and judicious sample selection to learn from informative pseudo - labeled examples .", "forward": true, "src_ids": "2021.emnlp-main.836_12973"}
{"input": "workflow manager is done by using Method| context: biomaterials are synthetic or natural materials used for constructing artificial organs , fabricating prostheses , or replacing tissues . the last century saw the development of thousands of novel biomaterials and , as a result , an exponential increase in scientific publications in the field . large - scale analysis of biomaterials and their performance could enable data - driven material selection and implant design . however , such analysis requires identification and organization of concepts , such as materials and structures , from published texts .", "entity": "workflow manager", "output": "nextflow", "neg_sample": ["workflow manager is done by using Method", "biomaterials are synthetic or natural materials used for constructing artificial organs , fabricating prostheses , or replacing tissues .", "the last century saw the development of thousands of novel biomaterials and , as a result , an exponential increase in scientific publications in the field .", "large - scale analysis of biomaterials and their performance could enable data - driven material selection and implant design .", "however , such analysis requires identification and organization of concepts , such as materials and structures , from published texts ."], "relation": "used for", "id": "2021.sdp-1.5", "year": 2021, "rel_sent": "The Biomaterials Annotator has been implemented following a modular organization using software containers for the different components and orchestrated using Nextflow as workflow manager .", "forward": false, "src_ids": "2021.sdp-1.5_6011"}
{"input": "hierarchy is done by using Method| context: learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations . box embeddings ( vilnis et al . , 2018 ; li et al . , 2019 ; dasgupta et al . , 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure . while it is possible to represent joint hierarchies with this method , the parameters for each hierarchy are decoupled , making generalization between hierarchies infeasible .", "entity": "hierarchy", "output": "box - to - box transformations", "neg_sample": ["hierarchy is done by using Method", "learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations .", "box embeddings ( vilnis et al .", ", 2018 ; li et al .", ", 2019 ; dasgupta et al .", ", 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure .", "while it is possible to represent joint hierarchies with this method , the parameters for each hierarchy are decoupled , making generalization between hierarchies infeasible ."], "relation": "used for", "id": "2021.repl4nlp-1.28", "year": 2021, "rel_sent": "In this work , we introduce a learned box - to - box transformation that respects the structure of each hierarchy .", "forward": false, "src_ids": "2021.repl4nlp-1.28_14378"}
{"input": "bert ( leebert ) is done by using OtherScientificTerm| context: pre - trained language models like bert are performant in a wide range of natural language tasks . however , they are resource exhaustive and computationally expensive for industrial scenarios . thus , early exits are adopted at each layer of bert to perform adaptive computation by predicting easier samples with the first few layers to speed up the inference .", "entity": "bert ( leebert )", "output": "learned early exit", "neg_sample": ["bert ( leebert ) is done by using OtherScientificTerm", "pre - trained language models like bert are performant in a wide range of natural language tasks .", "however , they are resource exhaustive and computationally expensive for industrial scenarios .", "thus , early exits are adopted at each layer of bert to perform adaptive computation by predicting easier samples with the first few layers to speed up the inference ."], "relation": "used for", "id": "2021.acl-long.231", "year": 2021, "rel_sent": "In this work , to improve efficiency without performance drop , we propose a novel training scheme called Learned Early Exit for BERT ( LeeBERT ) .", "forward": false, "src_ids": "2021.acl-long.231_15126"}
{"input": "prototypes is done by using Method| context: in this paper , we formulate acd in the few - shot learning scenario . however , existing few - shot learning approaches mainly focus on single - label predictions . these methods can not work well for the acd task since a sentence may contain multiple aspect categories .", "entity": "prototypes", "output": "support - set attention", "neg_sample": ["prototypes is done by using Method", "in this paper , we formulate acd in the few - shot learning scenario .", "however , existing few - shot learning approaches mainly focus on single - label predictions .", "these methods can not work well for the acd task since a sentence may contain multiple aspect categories ."], "relation": "used for", "id": "2021.acl-long.495", "year": 2021, "rel_sent": "The support - set attention aims to extract better prototypes by removing irrelevant aspects .", "forward": false, "src_ids": "2021.acl-long.495_4602"}
{"input": "hebrew speakers is used for Task| context: it has been well - documented for several languages that human interlocutors tend to adapt their linguistic productions to become more similar to each other . this behavior , known as entrainment , affects lexical choice as well , both with regard to specific words , such as referring expressions , and overall style .", "entity": "hebrew speakers", "output": "map task", "neg_sample": ["hebrew speakers is used for Task", "it has been well - documented for several languages that human interlocutors tend to adapt their linguistic productions to become more similar to each other .", "this behavior , known as entrainment , affects lexical choice as well , both with regard to specific words , such as referring expressions , and overall style ."], "relation": "used for", "id": "2021.eacl-main.23", "year": 2021, "rel_sent": "Using two existing measures , we analyze Hebrew speakers interacting in a Map Task , a popular experimental setup , and find rich evidence of lexical entrainment .", "forward": true, "src_ids": "2021.eacl-main.23_1862"}
{"input": "transfer learning is used for Task| context: nowadays , named entity recognition ( ner ) achieved excellent results on the standard corpora . however , big issues are emerging with a need for an application in a specific domain , because it requires a suitable annotated corpus with adapted ne tag - set . this is particularly evident in the historical document processing field .", "entity": "transfer learning", "output": "czech historical named entity recognition", "neg_sample": ["transfer learning is used for Task", "nowadays , named entity recognition ( ner ) achieved excellent results on the standard corpora .", "however , big issues are emerging with a need for an application in a specific domain , because it requires a suitable annotated corpus with adapted ne tag - set .", "this is particularly evident in the historical document processing field ."], "relation": "used for", "id": "2021.ranlp-1.65", "year": 2021, "rel_sent": "Transfer Learning for Czech Historical Named Entity Recognition.", "forward": true, "src_ids": "2021.ranlp-1.65_6854"}
{"input": "coreference resolution is done by using Method| context: large annotated corpora for coreference resolution are available for few languages . for machine translation , however , strong black - box systems exist for many languages .", "entity": "coreference resolution", "output": "machine translation tools", "neg_sample": ["coreference resolution is done by using Method", "large annotated corpora for coreference resolution are available for few languages .", "for machine translation , however , strong black - box systems exist for many languages ."], "relation": "used for", "id": "2021.crac-1.6", "year": 2021, "rel_sent": "We empirically explore the appealing idea of leveraging such translation tools for bootstrapping coreference resolution in languages with limited resources .", "forward": false, "src_ids": "2021.crac-1.6_2151"}
{"input": "style transfer is done by using Method| context: we present a novel approach to the problem of text style transfer .", "entity": "style transfer", "output": "decoder", "neg_sample": ["style transfer is done by using Method", "we present a novel approach to the problem of text style transfer ."], "relation": "used for", "id": "2021.acl-long.293", "year": 2021, "rel_sent": "We adapt T5 ( Raffel et al . , 2020 ) , a strong pretrained text - to - text model , to extract a style vector from text and use it to condition the decoder to perform style transfer .", "forward": false, "src_ids": "2021.acl-long.293_13035"}
{"input": "reinforcement learning is used for Task| context: automatic construction of relevant knowledge bases ( kbs ) from text , and generation of semantically meaningful text from kbs are both long - standing goals in machine learning .", "entity": "reinforcement learning", "output": "text and knowledge base generation", "neg_sample": ["reinforcement learning is used for Task", "automatic construction of relevant knowledge bases ( kbs ) from text , and generation of semantically meaningful text from kbs are both long - standing goals in machine learning ."], "relation": "used for", "id": "2021.emnlp-main.83", "year": 2021, "rel_sent": "ReGen : Reinforcement Learning for Text and Knowledge Base Generation using Pretrained Language Models.", "forward": true, "src_ids": "2021.emnlp-main.83_1135"}
{"input": "fermi problems is used for Material| context: many real - world problems require the combined application of multiple reasoning abilities - employing suitable abstractions , commonsense knowledge , and creative synthesis of problem - solving strategies .", "entity": "fermi problems", "output": "quizzes", "neg_sample": ["fermi problems is used for Material", "many real - world problems require the combined application of multiple reasoning abilities - employing suitable abstractions , commonsense knowledge , and creative synthesis of problem - solving strategies ."], "relation": "used for", "id": "2021.emnlp-main.582", "year": 2021, "rel_sent": "FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans .", "forward": true, "src_ids": "2021.emnlp-main.582_5357"}
{"input": "differential diagnosis is done by using Method| context: we present a text representation approach that can combine different views ( representations ) of the same input through effective data fusion and attention strategies for ranking purposes .", "entity": "differential diagnosis", "output": "attentive multiview text representation", "neg_sample": ["differential diagnosis is done by using Method", "we present a text representation approach that can combine different views ( representations ) of the same input through effective data fusion and attention strategies for ranking purposes ."], "relation": "used for", "id": "2021.acl-short.128", "year": 2021, "rel_sent": "Attentive Multiview Text Representation for Differential Diagnosis.", "forward": false, "src_ids": "2021.acl-short.128_7073"}
{"input": "control mechanism is done by using OtherScientificTerm| context: in practical applications of semantic parsing , we often want to rapidly change the behavior of the parser , such as enabling it to handle queries in a new domain , or changing its predictions on certain targeted queries . while we can introduce new training examples exhibiting the target behavior , a mechanism for enacting such behavior changes without expensive model re - training would be preferable .", "entity": "control mechanism", "output": "exemplars", "neg_sample": ["control mechanism is done by using OtherScientificTerm", "in practical applications of semantic parsing , we often want to rapidly change the behavior of the parser , such as enabling it to handle queries in a new domain , or changing its predictions on certain targeted queries .", "while we can introduce new training examples exhibiting the target behavior , a mechanism for enacting such behavior changes without expensive model re - training would be preferable ."], "relation": "used for", "id": "2021.emnlp-main.607", "year": 2021, "rel_sent": "The exemplars act as a control mechanism over the generic generative model : by manipulating the retrieval index or how the augmented query is constructed , we can manipulate the behavior of the parser .", "forward": false, "src_ids": "2021.emnlp-main.607_3537"}
{"input": "open information extraction is done by using Method| context: integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering . however , the predictions are made independently , which can be mutually inconsistent .", "entity": "open information extraction", "output": "collective relation integration", "neg_sample": ["open information extraction is done by using Method", "integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering .", "however , the predictions are made independently , which can be mutually inconsistent ."], "relation": "used for", "id": "2021.acl-long.363", "year": 2021, "rel_sent": "CoRI : Collective Relation Integration with Data Augmentation for Open Information Extraction.", "forward": false, "src_ids": "2021.acl-long.363_2566"}
{"input": "contrastive learning method is used for Task| context: finding codes given natural language query is beneficial to the productivity of software developers . future progress towards better semantic matching between query and code requires richer supervised training resources .", "entity": "contrastive learning method", "output": "text - code matching", "neg_sample": ["contrastive learning method is used for Task", "finding codes given natural language query is beneficial to the productivity of software developers .", "future progress towards better semantic matching between query and code requires richer supervised training resources ."], "relation": "used for", "id": "2021.acl-long.442", "year": 2021, "rel_sent": "We further introduce a contrastive learning method dubbed CoCLR to enhance text - code matching , which works as a data augmenter to bring more artificially generated training instances .", "forward": true, "src_ids": "2021.acl-long.442_11514"}
{"input": "knowledge - aware graph model is used for Task| context: we observe that a widely - used ece dataset exhibits a bias that the majority of annotated cause clauses are either directly before their associated emotion clauses or are the emotion clauses themselves .", "entity": "knowledge - aware graph model", "output": "emotion cause extraction", "neg_sample": ["knowledge - aware graph model is used for Task", "we observe that a widely - used ece dataset exhibits a bias that the majority of annotated cause clauses are either directly before their associated emotion clauses or are the emotion clauses themselves ."], "relation": "used for", "id": "2021.acl-long.261", "year": 2021, "rel_sent": "Position Bias Mitigation : A Knowledge - Aware Graph Model for Emotion Cause Extraction.", "forward": true, "src_ids": "2021.acl-long.261_10420"}
{"input": "hate speech domain is done by using Method| context: mainstream research on hate speech focused sofar predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse - grained hate speech categories . this may be sufficient if the goal is to detect and delete abusive language posts . however , removal is not always possible due to the legislation of a country . also , there is evidence that hate speech can not be successfully combated by merely removing hate speech posts ; they should be countered by education and counter - narratives .", "entity": "hate speech domain", "output": "concept extraction model", "neg_sample": ["hate speech domain is done by using Method", "mainstream research on hate speech focused sofar predominantly on the task of classifying mainly social media posts with respect to predefined typologies of rather coarse - grained hate speech categories .", "this may be sufficient if the goal is to detect and delete abusive language posts .", "however , removal is not always possible due to the legislation of a country .", "also , there is evidence that hate speech can not be successfully combated by merely removing hate speech posts ; they should be countered by education and counter - narratives ."], "relation": "used for", "id": "2021.woah-1.19", "year": 2021, "rel_sent": "As the first approximation , we propose to adapt a generic state - of - the - art concept extraction model to the hate speech domain .", "forward": false, "src_ids": "2021.woah-1.19_6995"}
{"input": "bertscore is used for OtherScientificTerm| context: neural machine translation models are often biased toward the limited translation references seen during training . bertscore is a scoring function based on contextual embeddings that overcomes the typical limitations of n - gram - based metrics ( e.g.", "entity": "bertscore", "output": "training objective", "neg_sample": ["bertscore is used for OtherScientificTerm", "neural machine translation models are often biased toward the limited translation references seen during training .", "bertscore is a scoring function based on contextual embeddings that overcomes the typical limitations of n - gram - based metrics ( e.g."], "relation": "used for", "id": "2021.acl-short.115", "year": 2021, "rel_sent": "To be able to use BERTScore as a training objective , we propose three approaches for generating soft predictions , allowing the network to remain completely differentiable end - to - end .", "forward": true, "src_ids": "2021.acl-short.115_14519"}
{"input": "multimodal cues is used for Task| context: metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging .", "entity": "multimodal cues", "output": "metaphor processing and understanding", "neg_sample": ["multimodal cues is used for Task", "metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging ."], "relation": "used for", "id": "2021.acl-long.249", "year": 2021, "rel_sent": "Moreover , we propose a range of strong baselines and show the importance of combining multimodal cues for metaphor understanding .", "forward": true, "src_ids": "2021.acl-long.249_3518"}
{"input": "embeddings is used for Method| context: the success of language models based on the transformer architecture appears to be inconsistent with observed anisotropic properties of representations learned by such models . we resolve this by showing , contrary to previous studies , that the representations do not occupy a narrow cone , but rather drift in common directions .", "entity": "embeddings", "output": "transformer language models", "neg_sample": ["embeddings is used for Method", "the success of language models based on the transformer architecture appears to be inconsistent with observed anisotropic properties of representations learned by such models .", "we resolve this by showing , contrary to previous studies , that the representations do not occupy a narrow cone , but rather drift in common directions ."], "relation": "used for", "id": "2021.naacl-main.403", "year": 2021, "rel_sent": "Too Much in Common : Shifting of Embeddings in Transformer Language Models and its Implications.", "forward": true, "src_ids": "2021.naacl-main.403_9650"}
{"input": "scibert classifier is done by using Material| context: formulaic expressions ( fes ) , such as ' in this paper , we propose ' are frequently used in scientific papers . fes convey a communicative function ( cf ) , i.e. ' showing the aim of the paper ' in the above - mentioned example . although cf - labelled fes are helpful in assisting academic writing , the construction of fe databases requires manual labour for assigning cf labels .", "entity": "scibert classifier", "output": "cf - labelled sentence dataset", "neg_sample": ["scibert classifier is done by using Material", "formulaic expressions ( fes ) , such as ' in this paper , we propose ' are frequently used in scientific papers .", "fes convey a communicative function ( cf ) , i.e. '", "showing the aim of the paper ' in the above - mentioned example .", "although cf - labelled fes are helpful in assisting academic writing , the construction of fe databases requires manual labour for assigning cf labels ."], "relation": "used for", "id": "2021.eacl-main.304", "year": 2021, "rel_sent": "For the CF - label assignment , we created a CF - labelled sentence dataset , on which we trained a SciBERT classifier .", "forward": false, "src_ids": "2021.eacl-main.304_10696"}
{"input": "retrieval - based question answering systems is done by using Task| context: recent advancements in transformer - based models have greatly improved the ability of question answering ( qa ) systems to provide correct answers ; in particular , answer sentence selection ( as2 ) models , core components of retrieval - based systems , have achieved impressive results . while generally effective , these models fail to provide a satisfying answer when all retrieved candidates are of poor quality , even if they contain correct information .", "entity": "retrieval - based question answering systems", "output": "answer generation", "neg_sample": ["retrieval - based question answering systems is done by using Task", "recent advancements in transformer - based models have greatly improved the ability of question answering ( qa ) systems to provide correct answers ; in particular , answer sentence selection ( as2 ) models , core components of retrieval - based systems , have achieved impressive results .", "while generally effective , these models fail to provide a satisfying answer when all retrieved candidates are of poor quality , even if they contain correct information ."], "relation": "used for", "id": "2021.findings-acl.374", "year": 2021, "rel_sent": "Answer Generation for Retrieval - based Question Answering Systems.", "forward": false, "src_ids": "2021.findings-acl.374_9885"}
{"input": "offensive language identification is done by using Method| context: the intensity of online abuse has increased in recent years . automated tools are being developed to prevent the use of hate speech and offensive content . most of the technologies use natural language and machine learning tools to identify offensive text . in a multilingual society , where code - mixing is a norm , the hate content would be delivered in a code - mixed form in social media , which makes the offensive content identification , further challenging .", "entity": "offensive language identification", "output": "transformers", "neg_sample": ["offensive language identification is done by using Method", "the intensity of online abuse has increased in recent years .", "automated tools are being developed to prevent the use of hate speech and offensive content .", "most of the technologies use natural language and machine learning tools to identify offensive text .", "in a multilingual society , where code - mixing is a norm , the hate content would be delivered in a code - mixed form in social media , which makes the offensive content identification , further challenging ."], "relation": "used for", "id": "2021.dravidianlangtech-1.19", "year": 2021, "rel_sent": "OFFLangOne@DravidianLangTech - EACL2021 : Transformers with the Class Balanced Loss for Offensive Language Identification in Dravidian Code - Mixed text ..", "forward": false, "src_ids": "2021.dravidianlangtech-1.19_6985"}
{"input": "representation reconstruction module ( rrm ) is done by using OtherScientificTerm| context: change captioning is to use a natural language sentence to describe the fine - grained disagreement between two similar images . viewpoint change is the most typical distractor in this task , because it changes the scale and location of the objects and overwhelms the representation of real change .", "entity": "representation reconstruction module ( rrm )", "output": "semantic similarities", "neg_sample": ["representation reconstruction module ( rrm ) is done by using OtherScientificTerm", "change captioning is to use a natural language sentence to describe the fine - grained disagreement between two similar images .", "viewpoint change is the most typical distractor in this task , because it changes the scale and location of the objects and overwhelms the representation of real change ."], "relation": "used for", "id": "2021.emnlp-main.735", "year": 2021, "rel_sent": "Then , based on the semantic similarities of corresponding locations in the two images , a representation reconstruction module ( RRM ) is designed to learn the reconstruction representation and further model the difference representation .", "forward": false, "src_ids": "2021.emnlp-main.735_630"}
{"input": "detecting sentiment critical errors is done by using Metric| context: social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity . since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages . however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved . in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly .", "entity": "detecting sentiment critical errors", "output": "automatic metrics", "neg_sample": ["detecting sentiment critical errors is done by using Metric", "social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity .", "since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages .", "however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved .", "in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly ."], "relation": "used for", "id": "2021.triton-1.6", "year": 2021, "rel_sent": "We demonstrate the need for the fine - tuning of automatic metrics to make them more robust in detecting sentiment critical errors .", "forward": false, "src_ids": "2021.triton-1.6_15806"}
{"input": "lms is done by using Task| context: recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model . this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls . however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts .", "entity": "lms", "output": "transliteration", "neg_sample": ["lms is done by using Task", "recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model .", "this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls .", "however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts ."], "relation": "used for", "id": "2021.acl-long.105", "year": 2021, "rel_sent": "Experiments on multiple real - world benchmark datasets provide validation to our hypothesis that using a related language as pivot , along with transliteration and pseudo translation based data augmentation , can be an effective way to adapt LMs for LRLs , rather than direct training or pivoting through English .", "forward": false, "src_ids": "2021.acl-long.105_9070"}
{"input": "e - commerce functions is done by using Generic| context: identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations . therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer . what makes this problem challenging is the diversity of product types and their attributes and values . existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes . this has limited their scalability and generalization for large scale real world e - commerce applications . previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task .", "entity": "e - commerce functions", "output": "general model", "neg_sample": ["e - commerce functions is done by using Generic", "identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations .", "therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer .", "what makes this problem challenging is the diversity of product types and their attributes and values .", "existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes .", "this has limited their scalability and generalization for large scale real world e - commerce applications .", "previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task ."], "relation": "used for", "id": "2021.ecnlp-1.2", "year": 2021, "rel_sent": "We show that a single general model is very effective for this task over a broad set of product attribute values with the open world assumption .", "forward": false, "src_ids": "2021.ecnlp-1.2_1572"}
{"input": "teet ! tunisian dataset is used for Task| context: the complete freedom of expression in social media has its costs especially in spreading harmful and abusive content that may induce people to act accordingly . therefore , the need of detecting automatically such a content becomes an urgent task that will help and enhance the efficiency in limiting this toxic spread . compared to other arabic dialects which are mostly based on msa , the tunisian dialect is a combination of many other languages like msa , tamazight , italian and french . because of its rich language , dealing with nlp problems can be challenging due to the lack of large annotated datasets .", "entity": "teet ! tunisian dataset", "output": "toxic speech detection", "neg_sample": ["teet ! tunisian dataset is used for Task", "the complete freedom of expression in social media has its costs especially in spreading harmful and abusive content that may induce people to act accordingly .", "therefore , the need of detecting automatically such a content becomes an urgent task that will help and enhance the efficiency in limiting this toxic spread .", "compared to other arabic dialects which are mostly based on msa , the tunisian dialect is a combination of many other languages like msa , tamazight , italian and french .", "because of its rich language , dealing with nlp problems can be challenging due to the lack of large annotated datasets ."], "relation": "used for", "id": "2021.winlp-1.2", "year": 2021, "rel_sent": "TEET ! Tunisian Dataset for Toxic Speech Detection.", "forward": true, "src_ids": "2021.winlp-1.2_14988"}
{"input": "aspect - based sentiment analysis is done by using Method| context: there exist seven subtasks in absa . most studies only focus on the subsets of these subtasks , which leads to various complicated absa models while hard to solve these subtasks in a unified framework .", "entity": "aspect - based sentiment analysis", "output": "unified generative framework", "neg_sample": ["aspect - based sentiment analysis is done by using Method", "there exist seven subtasks in absa .", "most studies only focus on the subsets of these subtasks , which leads to various complicated absa models while hard to solve these subtasks in a unified framework ."], "relation": "used for", "id": "2021.acl-long.188", "year": 2021, "rel_sent": "A Unified Generative Framework for Aspect - based Sentiment Analysis.", "forward": false, "src_ids": "2021.acl-long.188_13506"}
{"input": "cast is used for OtherScientificTerm| context: schema translation is the task of automatically translating headers of tabular data from one language to another . high - quality schema translation plays an important role in cross - lingual table searching , understanding and analysis . despite its importance , schema translation is not well studied in the community , and state - of - the - art neural machine translation models can not work well on this task because of two intrinsic differences between plain text and tabular data : morphological difference and context difference .", "entity": "cast", "output": "graph", "neg_sample": ["cast is used for OtherScientificTerm", "schema translation is the task of automatically translating headers of tabular data from one language to another .", "high - quality schema translation plays an important role in cross - lingual table searching , understanding and analysis .", "despite its importance , schema translation is not well studied in the community , and state - of - the - art neural machine translation models can not work well on this task because of two intrinsic differences between plain text and tabular data : morphological difference and context difference ."], "relation": "used for", "id": "2021.emnlp-main.5", "year": 2021, "rel_sent": "Then CAST encodes the graph with a relational - aware transformer and uses another transformer to decode the header in the target language .", "forward": true, "src_ids": "2021.emnlp-main.5_8721"}
{"input": "discovering unknown slot types is done by using Task| context: existing slot filling models can only recognize pre - defined in - domain slot types from a limited slot set . in the practical application , a reliable dialogue system should know what it does not know .", "entity": "discovering unknown slot types", "output": "slot detection", "neg_sample": ["discovering unknown slot types is done by using Task", "existing slot filling models can only recognize pre - defined in - domain slot types from a limited slot set .", "in the practical application , a reliable dialogue system should know what it does not know ."], "relation": "used for", "id": "2021.acl-long.270", "year": 2021, "rel_sent": "Novel Slot Detection : A Benchmark for Discovering Unknown Slot Types in the Task - Oriented Dialogue System.", "forward": false, "src_ids": "2021.acl-long.270_4247"}
{"input": "lexical resource is used for Material| context: framing involves the positive or negative presentation of an argument or issue depending on the audience and goal of the speaker . differences in lexical framing , the focus of our work , can have large effects on peoples ' opinions and beliefs .", "entity": "lexical resource", "output": "parallel corpus", "neg_sample": ["lexical resource is used for Material", "framing involves the positive or negative presentation of an argument or issue depending on the audience and goal of the speaker .", "differences in lexical framing , the focus of our work , can have large effects on peoples ' opinions and beliefs ."], "relation": "used for", "id": "2021.naacl-main.394", "year": 2021, "rel_sent": "We use a lexical resource for ' connotations ' to create a parallel corpus and propose a method for argument reframing that combines controllable text generation ( positive connotation ) with a post - decoding entailment component ( same denotation ) .", "forward": true, "src_ids": "2021.naacl-main.394_8345"}
{"input": "data scarcity is done by using Method| context: sequence labeling aims to predict a fine - grained sequence of labels for the text . however , such formulation hinders the effectiveness of supervised methods due to the lack of token - level annotated data . this is exacerbated when we meet a diverse range of languages .", "entity": "data scarcity", "output": "meta learning method", "neg_sample": ["data scarcity is done by using Method", "sequence labeling aims to predict a fine - grained sequence of labels for the text .", "however , such formulation hinders the effectiveness of supervised methods due to the lack of token - level annotated data .", "this is exacerbated when we meet a diverse range of languages ."], "relation": "used for", "id": "2021.emnlp-main.255", "year": 2021, "rel_sent": "Specifically , we propose a Meta Teacher - Student ( MetaTS ) Network , a novel meta learning method to alleviate data scarcity by leveraging large multilingual unlabeled data .", "forward": false, "src_ids": "2021.emnlp-main.255_14823"}
{"input": "absa problem is done by using OtherScientificTerm| context: many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task . while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem .", "entity": "absa problem", "output": "language - specific knowledge", "neg_sample": ["absa problem is done by using OtherScientificTerm", "many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task .", "while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem ."], "relation": "used for", "id": "2021.emnlp-main.727", "year": 2021, "rel_sent": "Tofurther investigate the importance of language - specific knowledge in solving the ABSA problem , we distill the above model on the unlabeled target language data which improves the performance to the same level of the supervised method .", "forward": false, "src_ids": "2021.emnlp-main.727_190"}
{"input": "joint word segmentation ( ws ) is done by using Method| context: the most straightforward approach to joint word segmentation ( ws ) , part - of - speech ( pos ) tagging , and constituent parsing is converting a word - level tree into a char - level tree , which , however , leads to two severe challenges . first , a larger label set ( e.g. , > = 600 ) and longer inputs both increase computational costs . second , it is difficult to rule out illegal trees containing conflicting production rules , which is important for reliable model evaluation . if a pos tag ( like vv ) is above a phrase tag ( like vp ) in the output tree , it becomes quite complex to decide word boundaries .", "entity": "joint word segmentation ( ws )", "output": "coarse - to - fine labeling framework", "neg_sample": ["joint word segmentation ( ws ) is done by using Method", "the most straightforward approach to joint word segmentation ( ws ) , part - of - speech ( pos ) tagging , and constituent parsing is converting a word - level tree into a char - level tree , which , however , leads to two severe challenges .", "first , a larger label set ( e.g.", ", > = 600 ) and longer inputs both increase computational costs .", "second , it is difficult to rule out illegal trees containing conflicting production rules , which is important for reliable model evaluation .", "if a pos tag ( like vv ) is above a phrase tag ( like vp ) in the output tree , it becomes quite complex to decide word boundaries ."], "relation": "used for", "id": "2021.conll-1.23", "year": 2021, "rel_sent": "A Coarse - to - Fine Labeling Framework for Joint Word Segmentation , POS Tagging , and Constituent Parsing.", "forward": false, "src_ids": "2021.conll-1.23_2749"}
{"input": "partially non - autoregressive objective is used for Method| context: multilingual t5 pretrains a sequence - to - sequence model on massive monolingual texts , which has shown promising results on many cross - lingual tasks .", "entity": "partially non - autoregressive objective", "output": "text - to - text pre - training", "neg_sample": ["partially non - autoregressive objective is used for Method", "multilingual t5 pretrains a sequence - to - sequence model on massive monolingual texts , which has shown promising results on many cross - lingual tasks ."], "relation": "used for", "id": "2021.emnlp-main.125", "year": 2021, "rel_sent": "In addition , we propose a partially non - autoregressive objective for text - to - text pre - training .", "forward": true, "src_ids": "2021.emnlp-main.125_9620"}
{"input": "pretraining and transfer learning is used for Method| context: multi - task benchmarks such as glue and superglue have driven great progress of pretraining and transfer learning in natural language processing ( nlp ) . these benchmarks mostly focus on a range of natural language understanding ( nlu ) tasks , without considering the natural language generation ( nlg ) models .", "entity": "pretraining and transfer learning", "output": "nlg models", "neg_sample": ["pretraining and transfer learning is used for Method", "multi - task benchmarks such as glue and superglue have driven great progress of pretraining and transfer learning in natural language processing ( nlp ) .", "these benchmarks mostly focus on a range of natural language understanding ( nlu ) tasks , without considering the natural language generation ( nlg ) models ."], "relation": "used for", "id": "2021.findings-acl.36", "year": 2021, "rel_sent": "To encourage research on pretraining and transfer learning on NLG models , we make GLGE publicly available and build a leaderboard with strong baselines including MASS , BART , and ProphetNet .", "forward": true, "src_ids": "2021.findings-acl.36_10700"}
{"input": "query rewriting is done by using Method| context: anaphora and ellipses are two common phenomena in dialogues . without resolving referring expressions and information omission , dialogue systems may fail to generate consistent and coherent responses . traditionally , anaphora is resolved by coreference resolution and ellipses by query rewrite .", "entity": "query rewriting", "output": "joint learning framework", "neg_sample": ["query rewriting is done by using Method", "anaphora and ellipses are two common phenomena in dialogues .", "without resolving referring expressions and information omission , dialogue systems may fail to generate consistent and coherent responses .", "traditionally , anaphora is resolved by coreference resolution and ellipses by query rewrite ."], "relation": "used for", "id": "2021.naacl-main.265", "year": 2021, "rel_sent": "In this work , we propose a novel joint learning framework of modeling coreference resolution and query rewriting for complex , multi - turn dialogue understanding .", "forward": false, "src_ids": "2021.naacl-main.265_2441"}
{"input": "ensemble language model is used for OtherScientificTerm| context: therefore , a single language model ( lm ) is insufficient to learn all knowledge from diverse samples .", "entity": "ensemble language model", "output": "data diversity", "neg_sample": ["ensemble language model is used for OtherScientificTerm", "therefore , a single language model ( lm ) is insufficient to learn all knowledge from diverse samples ."], "relation": "used for", "id": "2021.acl-long.230", "year": 2021, "rel_sent": "EnsLM : Ensemble Language Model for Data Diversity by Semantic Clustering.", "forward": true, "src_ids": "2021.acl-long.230_1408"}
{"input": "probabilistic finite state transducers is used for OtherScientificTerm| context: in this work , we draw parallels between automatically responding to emails for combating social - engineering attacks and document - grounded response generation and lay out the blueprint of our approach . phishing emails are longer than dialogue utterances and often contain multiple intents . hence , we need to make decisions similar to those for document - grounded responses in deciding what parts of long text to use and how to address each intent to generate a knowledgeable multi - component response that pushes scammers towards agendas that aid in attribution and linking attacks .", "entity": "probabilistic finite state transducers", "output": "pushing agendas", "neg_sample": ["probabilistic finite state transducers is used for OtherScientificTerm", "in this work , we draw parallels between automatically responding to emails for combating social - engineering attacks and document - grounded response generation and lay out the blueprint of our approach .", "phishing emails are longer than dialogue utterances and often contain multiple intents .", "hence , we need to make decisions similar to those for document - grounded responses in deciding what parts of long text to use and how to address each intent to generate a knowledgeable multi - component response that pushes scammers towards agendas that aid in attribution and linking attacks ."], "relation": "used for", "id": "2021.dialdoc-1.15", "year": 2021, "rel_sent": "We propose , a hybrid system that uses customizable probabilistic finite state transducers to orchestrate pushing agendas coupled with neural dialogue systems that generate responses to unexpected prompts , as a promising solution to this end .", "forward": true, "src_ids": "2021.dialdoc-1.15_11715"}
{"input": "semi - automatic situation extraction is done by using OtherScientificTerm| context: currently , text chatting is one of the primary means of communication . however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it . since the task is novel , neither training nor gold - standard datasets for it have been created yet .", "entity": "semi - automatic situation extraction", "output": "query language", "neg_sample": ["semi - automatic situation extraction is done by using OtherScientificTerm", "currently , text chatting is one of the primary means of communication .", "however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it .", "since the task is novel , neither training nor gold - standard datasets for it have been created yet ."], "relation": "used for", "id": "2021.acl-srw.14", "year": 2021, "rel_sent": "We also introduce a custom query language for semi - automatic situation extraction .", "forward": false, "src_ids": "2021.acl-srw.14_374"}
{"input": "public datasets is used for Task| context: detection of toxic spans - detecting toxicity of contents in the granularity of tokens - is crucial for effective moderation of online discussions . one of the limitations of such a baseline approach is the scarcity of labeled data .", "entity": "public datasets", "output": "comment / sentence classification", "neg_sample": ["public datasets is used for Task", "detection of toxic spans - detecting toxicity of contents in the granularity of tokens - is crucial for effective moderation of online discussions .", "one of the limitations of such a baseline approach is the scarcity of labeled data ."], "relation": "used for", "id": "2021.semeval-1.127", "year": 2021, "rel_sent": "To improve the results , We studied leveraging existing public datasets for a related but different task of entire comment / sentence classification .", "forward": true, "src_ids": "2021.semeval-1.127_8450"}
{"input": "language transformers is used for Task| context: weakly supervised data is regarded as the main source for knowledge acquisition .", "entity": "language transformers", "output": "multi - label classification of chinese medical questions", "neg_sample": ["language transformers is used for Task", "weakly supervised data is regarded as the main source for knowledge acquisition ."], "relation": "used for", "id": "2021.rocling-1.34", "year": 2021, "rel_sent": "Incorporating Domain Knowledge into Language Transformers for Multi - Label Classification of Chinese Medical Questions.", "forward": true, "src_ids": "2021.rocling-1.34_3660"}
{"input": "linguistic features is done by using Method| context: we implemented a neural machine translation system that uses automatic sequence tagging to improve the quality of translation .", "entity": "linguistic features", "output": "pre - trained tagging systems", "neg_sample": ["linguistic features is done by using Method", "we implemented a neural machine translation system that uses automatic sequence tagging to improve the quality of translation ."], "relation": "used for", "id": "2021.iwslt-1.30", "year": 2021, "rel_sent": "Instead of operating on unannotated sentence pairs , our system uses pre - trained tagging systems to add linguistic features to source and target sentences .", "forward": false, "src_ids": "2021.iwslt-1.30_5596"}
{"input": "automatic rumor detection is done by using OtherScientificTerm| context: social networks face a major challenge in the form of rumors and fake news , due to their intrinsic nature of connecting users to millions of others , and of giving any individual the power to post anything . given the rapid , widespread dissemination of information in social networks , manually detecting suspicious news is sub - optimal . thus , research on automatic rumor detection has become a necessity . previous works in the domain have utilized the reply relations between posts , as well as the semantic similarity between the main post and its context , consisting of replies , in order to obtain state - of - the - art performance .", "entity": "automatic rumor detection", "output": "semantic oppositeness", "neg_sample": ["automatic rumor detection is done by using OtherScientificTerm", "social networks face a major challenge in the form of rumors and fake news , due to their intrinsic nature of connecting users to millions of others , and of giving any individual the power to post anything .", "given the rapid , widespread dissemination of information in social networks , manually detecting suspicious news is sub - optimal .", "thus , research on automatic rumor detection has become a necessity .", "previous works in the domain have utilized the reply relations between posts , as well as the semantic similarity between the main post and its context , consisting of replies , in order to obtain state - of - the - art performance ."], "relation": "used for", "id": "2021.eacl-main.31", "year": 2021, "rel_sent": "Semantic Oppositeness Assisted Deep Contextual Modeling for Automatic Rumor Detection in Social Networks.", "forward": false, "src_ids": "2021.eacl-main.31_13902"}
{"input": "end - to - end training is done by using Metric| context: state - of - the - art deep neural networks require large - scale labeled training data that is often expensive to obtain or not available for many tasks . weak supervision in the form of domain - specific rules has been shown to be useful in such settings to automatically generate weakly labeled training data . however , learning with weak rules is challenging due to their inherent heuristic and noisy nature . an additional challenge is rule coverage and overlap , where prior work on weak supervision only considers instances that are covered by weak rules , thus leaving valuable unlabeled data behind .", "entity": "end - to - end training", "output": "semi - supervised learning objective", "neg_sample": ["end - to - end training is done by using Metric", "state - of - the - art deep neural networks require large - scale labeled training data that is often expensive to obtain or not available for many tasks .", "weak supervision in the form of domain - specific rules has been shown to be useful in such settings to automatically generate weakly labeled training data .", "however , learning with weak rules is challenging due to their inherent heuristic and noisy nature .", "an additional challenge is rule coverage and overlap , where prior work on weak supervision only considers instances that are covered by weak rules , thus leaving valuable unlabeled data behind ."], "relation": "used for", "id": "2021.naacl-main.66", "year": 2021, "rel_sent": "Finally , we construct a semi - supervised learning objective for end - to - end training with unlabeled data , domain - specific rules , and a small amount of labeled data .", "forward": false, "src_ids": "2021.naacl-main.66_2835"}
{"input": "dependency patterns is used for Task| context: abstract meaning representation ( amr ) is a sentence - level meaning representation based on predicate argument structure . knowing the core part of the sentence structure in advance may be beneficial in such a task .", "entity": "dependency patterns", "output": "amr parsing", "neg_sample": ["dependency patterns is used for Task", "abstract meaning representation ( amr ) is a sentence - level meaning representation based on predicate argument structure .", "knowing the core part of the sentence structure in advance may be beneficial in such a task ."], "relation": "used for", "id": "2021.starsem-1.20", "year": 2021, "rel_sent": "In this paper , we present a list of dependency patterns for English complex sentence constructions designed for AMR parsing .", "forward": true, "src_ids": "2021.starsem-1.20_10470"}
{"input": "cognate alignment is done by using Method| context: most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges : ( 1 ) the scripts are not fully segmented into words ; ( 2 ) the closest known language is not determined .", "entity": "cognate alignment", "output": "generative framework", "neg_sample": ["cognate alignment is done by using Method", "most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges : ( 1 ) the scripts are not fully segmented into words ; ( 2 ) the closest known language is not determined ."], "relation": "used for", "id": "2021.tacl-1.5", "year": 2021, "rel_sent": "The resulting generative framework jointly models word segmentation and cognate alignment , informed by phonological constraints .", "forward": false, "src_ids": "2021.tacl-1.5_13767"}
{"input": "text - based conversation is done by using Method| context: recent work in open - domain conversational agents has demonstrated that significant improvements in humanness and user preference can be achieved via massive scaling in both pre - training data and model size ( adiwardana et al . , 2020 ; roller et al . , 2020 ) . however , if we want to build agents with human - like abilities , we must expand beyond handling just text . a particularly important topic is the ability to see images and communicate about what is perceived .", "entity": "text - based conversation", "output": "( text - only ) blenderbot", "neg_sample": ["text - based conversation is done by using Method", "recent work in open - domain conversational agents has demonstrated that significant improvements in humanness and user preference can be achieved via massive scaling in both pre - training data and model size ( adiwardana et al .", ", 2020 ; roller et al .", ", 2020 ) .", "however , if we want to build agents with human - like abilities , we must expand beyond handling just text .", "a particularly important topic is the ability to see images and communicate about what is perceived ."], "relation": "used for", "id": "2021.emnlp-main.398", "year": 2021, "rel_sent": "We study incorporating different image fusion schemes and domain - adaptive pre - training and fine - tuning strategies , and show that our best resulting model outperforms strong existing models in multi - modal dialogue while simultaneously performing as well as its predecessor ( text - only ) BlenderBot ( Roller et al . , 2020 ) in text - based conversation .", "forward": false, "src_ids": "2021.emnlp-main.398_6032"}
{"input": "semantic and multimodal context is used for Task| context: accurate detection and classification of online hate is a difficult task . implicit hate is particularly challenging as such content tends to have unusual syntax , polysemic words , and fewer markers of prejudice ( e.g. , slurs ) . this problem is heightened with multimodal content , such as memes ( combinations of text and images ) , as they are often harder to decipher than unimodal content ( e.g. , text alone ) .", "entity": "semantic and multimodal context", "output": "detecting implicit and explicit hate", "neg_sample": ["semantic and multimodal context is used for Task", "accurate detection and classification of online hate is a difficult task .", "implicit hate is particularly challenging as such content tends to have unusual syntax , polysemic words , and fewer markers of prejudice ( e.g.", ", slurs ) .", "this problem is heightened with multimodal content , such as memes ( combinations of text and images ) , as they are often harder to decipher than unimodal content ( e.g.", ", text alone ) ."], "relation": "used for", "id": "2021.findings-acl.166", "year": 2021, "rel_sent": "This paper evaluates the role of semantic and multimodal context for detecting implicit and explicit hate .", "forward": true, "src_ids": "2021.findings-acl.166_3610"}
{"input": "cross - lingual sentiment analysis is done by using Method| context: much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages . we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis .", "entity": "cross - lingual sentiment analysis", "output": "pragmatically - driven transfer", "neg_sample": ["cross - lingual sentiment analysis is done by using Method", "much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages .", "we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis ."], "relation": "used for", "id": "2021.eacl-main.204", "year": 2021, "rel_sent": "We further corroborate the effectiveness of pragmatically - driven transfer in the downstream task of choosing transfer languages for cross - lingual sentiment analysis .", "forward": false, "src_ids": "2021.eacl-main.204_10486"}
{"input": "text retrieval is used for OtherScientificTerm| context: a text retrieval system for language learning returns reading materials at the appropriate difficulty level for the user . the system typically maintains a learner model on the user 's vocabulary knowledge , and identifies texts that best fit the model . as the user 's language proficiency increases , model updates are necessary to retrieve texts with the corresponding lexical complexity .", "entity": "text retrieval", "output": "language learners", "neg_sample": ["text retrieval is used for OtherScientificTerm", "a text retrieval system for language learning returns reading materials at the appropriate difficulty level for the user .", "the system typically maintains a learner model on the user 's vocabulary knowledge , and identifies texts that best fit the model .", "as the user 's language proficiency increases , model updates are necessary to retrieve texts with the corresponding lexical complexity ."], "relation": "used for", "id": "2021.ranlp-1.91", "year": 2021, "rel_sent": "Text Retrieval for Language Learners : Graded Vocabulary vs. Open Learner Model.", "forward": true, "src_ids": "2021.ranlp-1.91_845"}
{"input": "video understanding is done by using Method| context: video editing tools are widely used nowadays for digital design . although the demand for these tools is high , the prior knowledge required makes it difficult for novices to get started . systems that could follow natural language instructions to perform automatic editing would significantly improve accessibility .", "entity": "video understanding", "output": "m^3l - transformer", "neg_sample": ["video understanding is done by using Method", "video editing tools are widely used nowadays for digital design .", "although the demand for these tools is high , the prior knowledge required makes it difficult for novices to get started .", "systems that could follow natural language instructions to perform automatic editing would significantly improve accessibility ."], "relation": "used for", "id": "2021.alvr-1.8", "year": 2021, "rel_sent": "The M^3L - Transformer dynamically learns the correspondence between video perception and language semantic at different levels , which benefits both the video understanding and videoframe synthesis .", "forward": false, "src_ids": "2021.alvr-1.8_8415"}
{"input": "annotations of generalising statements is done by using Material| context: literary texts feature a rich variety in expressing quantification , including a broad range of lexemes to express quantifiers and complex sentence structures to express the restrictor and the nuclear scope of a quantification .", "entity": "annotations of generalising statements", "output": "german corpus", "neg_sample": ["annotations of generalising statements is done by using Material", "literary texts feature a rich variety in expressing quantification , including a broad range of lexemes to express quantifiers and complex sentence structures to express the restrictor and the nuclear scope of a quantification ."], "relation": "used for", "id": "2021.isa-1.3", "year": 2021, "rel_sent": "In the second part of the paper , we introduce our German corpus with annotations of generalising statements , which form a proper subset of quantified statements .", "forward": false, "src_ids": "2021.isa-1.3_8114"}
{"input": "sentiment classification is done by using Method| context: models pre - trained on large - scale regular text corpora often do not work well for user - generated data where the language styles differ significantly from the mainstream text .", "entity": "sentiment classification", "output": "context - aware rule injection ( cari )", "neg_sample": ["sentiment classification is done by using Method", "models pre - trained on large - scale regular text corpora often do not work well for user - generated data where the language styles differ significantly from the mainstream text ."], "relation": "used for", "id": "2021.acl-long.124", "year": 2021, "rel_sent": "We show that CARI outperformed existing rule - based FST approaches for sentiment classification .", "forward": false, "src_ids": "2021.acl-long.124_4693"}
{"input": "residual networks is used for Method| context: multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels .", "entity": "residual networks", "output": "document representations", "neg_sample": ["residual networks is used for Method", "multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels ."], "relation": "used for", "id": "2021.emnlp-main.481", "year": 2021, "rel_sent": "Our innovations are three - fold : ( 1 ) we utilize a deep convolution - based encoder with the squeeze - and - excitation networks and residual networks to aggregate the information across the document and learn meaningful document representations that cover different ranges of texts ; ( 2 ) we explore multi - layer and sum - pooling attention to extract the most informative features from these multi - scale representations ; ( 3 ) we combine binary cross entropy loss and focal loss to improve performance for rare labels .", "forward": true, "src_ids": "2021.emnlp-main.481_899"}
{"input": "gpt-2 model is used for Material| context: we ask subjects whether they perceive as human - produced a bunch of texts , some of which are actually human - written , while others are automatically generated .", "entity": "gpt-2 model", "output": "human - like texts", "neg_sample": ["gpt-2 model is used for Material", "we ask subjects whether they perceive as human - produced a bunch of texts , some of which are actually human - written , while others are automatically generated ."], "relation": "used for", "id": "2021.gem-1.2", "year": 2021, "rel_sent": "We use this data tofine - tune a GPT-2 model to push it to generate more human - like texts , and observe that this fine - tuned model produces texts that are indeed perceived more human - like than the original model .", "forward": true, "src_ids": "2021.gem-1.2_10255"}
{"input": "modal dependency structure construction is done by using Method| context: as the sources of information that we consume everyday rapidly diversify , it is becoming increasingly important to develop nlp tools that help to evaluate the credibility of the information we receive .", "entity": "modal dependency structure construction", "output": "joint model", "neg_sample": ["modal dependency structure construction is done by using Method", "as the sources of information that we consume everyday rapidly diversify , it is becoming increasingly important to develop nlp tools that help to evaluate the credibility of the information we receive ."], "relation": "used for", "id": "2021.acl-long.122", "year": 2021, "rel_sent": "We evaluate the joint model against a pipeline model and demonstrate the advantage of the joint model in conceiver extraction and modal dependency structure construction when events and conceivers are automatically extracted .", "forward": false, "src_ids": "2021.acl-long.122_15488"}
{"input": "long - tail label distribution is done by using Method| context: state - of - the - art approaches to spelling error correction problem include transformer - based seq2seq models , which require large training sets and suffer from slow inference time ; and sequence labeling models based on transformer encoders like bert , which involve token - level label space and therefore a large pre - defined vocabulary dictionary .", "entity": "long - tail label distribution", "output": "hierarchical multi - task approach", "neg_sample": ["long - tail label distribution is done by using Method", "state - of - the - art approaches to spelling error correction problem include transformer - based seq2seq models , which require large training sets and suffer from slow inference time ; and sequence labeling models based on transformer encoders like bert , which involve token - level label space and therefore a large pre - defined vocabulary dictionary ."], "relation": "used for", "id": "2021.wnut-1.13", "year": 2021, "rel_sent": "For decoding , we propose a hierarchical multi - task approach to alleviate the issue of long - tail label distribution without introducing extra model parameters .", "forward": false, "src_ids": "2021.wnut-1.13_11901"}
{"input": "persian ner is done by using Method| context: named entity recognition ( ner ) is one of the major tasks in natural language processing . a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization .", "entity": "persian ner", "output": "bert - persner", "neg_sample": ["persian ner is done by using Method", "named entity recognition ( ner ) is one of the major tasks in natural language processing .", "a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization ."], "relation": "used for", "id": "2021.ranlp-1.73", "year": 2021, "rel_sent": "BERT - PersNER has outperformed two available studies in Persian NER , in most cases of our experiments using the supervised learning approach on two Persian datasets called Arman and Peyma .", "forward": false, "src_ids": "2021.ranlp-1.73_12361"}
{"input": "context - aware interaction network ( coin ) is used for OtherScientificTerm| context: impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations . however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information .", "entity": "context - aware interaction network ( coin )", "output": "semantic relationship", "neg_sample": ["context - aware interaction network ( coin ) is used for OtherScientificTerm", "impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations .", "however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information ."], "relation": "used for", "id": "2021.emnlp-main.312", "year": 2021, "rel_sent": "We propose a context - aware interaction network ( COIN ) to properly align two sequences and infer their semantic relationship .", "forward": true, "src_ids": "2021.emnlp-main.312_11568"}
{"input": "neural models is used for OtherScientificTerm| context: bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly .", "entity": "neural models", "output": "bubble - enhanced structures", "neg_sample": ["neural models is used for OtherScientificTerm", "bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly ."], "relation": "used for", "id": "2021.acl-long.557", "year": 2021, "rel_sent": "In this paper , we introduce a transition system and neural models for parsing these bubble - enhanced structures .", "forward": true, "src_ids": "2021.acl-long.557_13695"}
{"input": "bert token representations is used for Task| context: several studies have been carried out on revealing linguistic features captured by bert . this is usually achieved by training a diagnostic classifier on the representations obtained from different layers of bert . the subsequent classification accuracy is then interpreted as the ability of the model in encoding the corresponding linguistic property . despite providing insights , these studies have left out the potential role of token representations .", "entity": "bert token representations", "output": "sentence probing", "neg_sample": ["bert token representations is used for Task", "several studies have been carried out on revealing linguistic features captured by bert .", "this is usually achieved by training a diagnostic classifier on the representations obtained from different layers of bert .", "the subsequent classification accuracy is then interpreted as the ability of the model in encoding the corresponding linguistic property .", "despite providing insights , these studies have left out the potential role of token representations ."], "relation": "used for", "id": "2021.emnlp-main.61", "year": 2021, "rel_sent": "Exploring the Role of BERT Token Representations to Explain Sentence Probing Results.", "forward": true, "src_ids": "2021.emnlp-main.61_4347"}
{"input": "downstream tasks is done by using Method| context: differential privacy provides a formal approach to privacy of individuals . applications of differential privacy in various scenarios , such as protecting users ' original utterances , must satisfy certain mathematical properties .", "entity": "downstream tasks", "output": "adept", "neg_sample": ["downstream tasks is done by using Method", "differential privacy provides a formal approach to privacy of individuals .", "applications of differential privacy in various scenarios , such as protecting users ' original utterances , must satisfy certain mathematical properties ."], "relation": "used for", "id": "2021.emnlp-main.114", "year": 2021, "rel_sent": "ADePT achieves promising results on downstream tasks while providing tight privacy guarantees .", "forward": false, "src_ids": "2021.emnlp-main.114_13351"}
{"input": "multi - sentence resampling is used for OtherScientificTerm| context: neural machine translation ( nmt ) is known to suffer from a beam - search problem : after a certain point , increasing beam size causes an overall drop in translation quality . this effect is especially pronounced for long sentences . while much work was done analyzing this phenomenon , primarily for autoregressive nmt models , there is still no consensus on its underlying cause .", "entity": "multi - sentence resampling", "output": "dataset length bias", "neg_sample": ["multi - sentence resampling is used for OtherScientificTerm", "neural machine translation ( nmt ) is known to suffer from a beam - search problem : after a certain point , increasing beam size causes an overall drop in translation quality .", "this effect is especially pronounced for long sentences .", "while much work was done analyzing this phenomenon , primarily for autoregressive nmt models , there is still no consensus on its underlying cause ."], "relation": "used for", "id": "2021.emnlp-main.677", "year": 2021, "rel_sent": "Multi - Sentence Resampling : A Simple Approach to Alleviate Dataset Length Bias and Beam - Search Degradation.", "forward": true, "src_ids": "2021.emnlp-main.677_3453"}
{"input": "ai clerk platform is used for Method| context: information extraction is a core technology of natural language processing , which extracts some meaningful phrases / clauses from unstructured or semistructured content to a particular topic . it can be said to be the core technology of many language technologies and applications .", "entity": "ai clerk platform", "output": "natural language processing technologies", "neg_sample": ["ai clerk platform is used for Method", "information extraction is a core technology of natural language processing , which extracts some meaningful phrases / clauses from unstructured or semistructured content to a particular topic .", "it can be said to be the core technology of many language technologies and applications ."], "relation": "used for", "id": "2021.rocling-1.4", "year": 2021, "rel_sent": "AI Clerk Platform further assists in the development of other natural language processing technologies and the derivation of application services .", "forward": true, "src_ids": "2021.rocling-1.4_11419"}
{"input": "conversation update system is used for Method| context: the rational speech acts ( rsa ) framework has been applied to an increasing number of linguistic phenomena . despite its promise as a model of conversational reasoning , it has rarely been used to model more than a single conversation turn .", "entity": "conversation update system", "output": "rational speech acts framework", "neg_sample": ["conversation update system is used for Method", "the rational speech acts ( rsa ) framework has been applied to an increasing number of linguistic phenomena .", "despite its promise as a model of conversational reasoning , it has rarely been used to model more than a single conversation turn ."], "relation": "used for", "id": "2021.scil-1.22", "year": 2021, "rel_sent": "Tell Me Everything You Know : A Conversation Update System for the Rational Speech Acts Framework.", "forward": true, "src_ids": "2021.scil-1.22_14053"}
{"input": "pragmatic reasoning is used for Task| context: the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments . we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation .", "entity": "pragmatic reasoning", "output": "decoding", "neg_sample": ["pragmatic reasoning is used for Task", "the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments .", "we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation ."], "relation": "used for", "id": "2021.sigdial-1.43", "year": 2021, "rel_sent": "We connect those lines of work and analyze how pragmatic reasoning during decoding affects the diversity of generated image captions .", "forward": true, "src_ids": "2021.sigdial-1.43_622"}
{"input": "task distributions is used for Method| context: meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks .", "entity": "task distributions", "output": "meta - learned models", "neg_sample": ["task distributions is used for Method", "meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks ."], "relation": "used for", "id": "2021.emnlp-main.469", "year": 2021, "rel_sent": "In this work , we aim to provide task distributions for meta - learning by considering self - supervised tasks automatically proposed from unlabeled text , to enable large - scale meta - learning in NLP .", "forward": true, "src_ids": "2021.emnlp-main.469_8228"}
{"input": "multihop qa data is done by using Generic| context: multi - hop question generation requires complex reasoning and coherent language realization . learning a generation model for the problem requires extensive multi - hop question answering ( qa ) data , which are limited due to the manual collection effort . learning this generating and then composing twophase model , however , requires manually labeled question decomposition data , which is labor intensive .", "entity": "multihop qa data", "output": "two - phase strategy", "neg_sample": ["multihop qa data is done by using Generic", "multi - hop question generation requires complex reasoning and coherent language realization .", "learning a generation model for the problem requires extensive multi - hop question answering ( qa ) data , which are limited due to the manual collection effort .", "learning this generating and then composing twophase model , however , requires manually labeled question decomposition data , which is labor intensive ."], "relation": "used for", "id": "2021.findings-acl.265", "year": 2021, "rel_sent": "A two - phase strategy addresses the insufficiency of multihop QA data by first generating and then composing single - hop sub - questions .", "forward": false, "src_ids": "2021.findings-acl.265_5939"}
{"input": "ner and er models is done by using OtherScientificTerm| context: in recent years , incorporating external knowledge for response generation in open - domain conversation systems has attracted great interest . different from formal documents , such as news , conversational utterances are informal and multi - turn , which makes it more challenging to disambiguate the entities .", "entity": "ner and er models", "output": "context information", "neg_sample": ["ner and er models is done by using OtherScientificTerm", "in recent years , incorporating external knowledge for response generation in open - domain conversation systems has attracted great interest .", "different from formal documents , such as news , conversational utterances are informal and multi - turn , which makes it more challenging to disambiguate the entities ."], "relation": "used for", "id": "2021.naacl-industry.4", "year": 2021, "rel_sent": "We conduct NEL experiments on three open - domain conversation datasets and validate that incorporating context information improves the performance of NER and ER models .", "forward": false, "src_ids": "2021.naacl-industry.4_14955"}
{"input": "cross - modal prediction is used for Task| context: previous works usually treat all three modal features equally and implicitly explore the interactions between different modalities .", "entity": "cross - modal prediction", "output": "multimodal sentiment analysis", "neg_sample": ["cross - modal prediction is used for Task", "previous works usually treat all three modal features equally and implicitly explore the interactions between different modalities ."], "relation": "used for", "id": "2021.findings-acl.417", "year": 2021, "rel_sent": "A Text - Centered Shared - Private Framework via Cross - Modal Prediction for Multimodal Sentiment Analysis.", "forward": true, "src_ids": "2021.findings-acl.417_14128"}
{"input": "zero - shot transfer learning is used for Task| context: natural language understanding is an important task in modern dialogue systems . it becomes more important with the rapid extension of the dialogue systems ' functionality .", "entity": "zero - shot transfer learning", "output": "intent classification", "neg_sample": ["zero - shot transfer learning is used for Task", "natural language understanding is an important task in modern dialogue systems .", "it becomes more important with the rapid extension of the dialogue systems ' functionality ."], "relation": "used for", "id": "2021.ranlp-1.25", "year": 2021, "rel_sent": "In this work , we present an approach to zero - shot transfer learning for the tasks of intent classification and slot - filling based on pre - trained language models .", "forward": true, "src_ids": "2021.ranlp-1.25_12967"}
{"input": "generation is done by using Method| context: text - to - sql is the problem of converting a user question into an sql query , when the question and database are given .", "entity": "generation", "output": "input manipulation methods", "neg_sample": ["generation is done by using Method", "text - to - sql is the problem of converting a user question into an sql query , when the question and database are given ."], "relation": "used for", "id": "2021.cl-2.12", "year": 2021, "rel_sent": "Additionally , two input manipulation methods are presented to improve generation performance further .", "forward": false, "src_ids": "2021.cl-2.12_2945"}
{"input": "multi - lingual question generation research is done by using Method| context: question generation is the task of generating coherent and relevant question given context paragraph . recently , with the development of large - scale question answering datasets such as squad , the english question generation has been rapidly developed . however , for other languages such as chinese , the available training data is limited , which hinders the development of question generation in the corresponding language .", "entity": "multi - lingual question generation research", "output": "language - agnostic language model", "neg_sample": ["multi - lingual question generation research is done by using Method", "question generation is the task of generating coherent and relevant question given context paragraph .", "recently , with the development of large - scale question answering datasets such as squad , the english question generation has been rapidly developed .", "however , for other languages such as chinese , the available training data is limited , which hinders the development of question generation in the corresponding language ."], "relation": "used for", "id": "2021.findings-acl.199", "year": 2021, "rel_sent": "With the languageagnostic language model , we achieve significant improvement in multi - lingual question generation over five languages .", "forward": false, "src_ids": "2021.findings-acl.199_11685"}
{"input": "link prediction is done by using Method| context: knowledge bases ( kbs ) are easy to query , verifiable , and interpretable . they however scale with man - hours and high - quality data . masked language models ( mlms ) , such as bert , scale with computing power as well as unstructured raw text data . the knowledge contained within those models is however not directly interpretable .", "entity": "link prediction", "output": "mean likelihood masked language model", "neg_sample": ["link prediction is done by using Method", "knowledge bases ( kbs ) are easy to query , verifiable , and interpretable .", "they however scale with man - hours and high - quality data .", "masked language models ( mlms ) , such as bert , scale with computing power as well as unstructured raw text data .", "the knowledge contained within those models is however not directly interpretable ."], "relation": "used for", "id": "2021.findings-acl.378", "year": 2021, "rel_sent": "To do that we introduce MLMLM , Mean Likelihood Masked Language Model , an approach comparing the mean likelihood of generating the different entities to perform link prediction in a tractable manner .", "forward": false, "src_ids": "2021.findings-acl.378_11453"}
{"input": "graph nodes is done by using OtherScientificTerm| context: most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data - driven manner , without the exploitation of psycholinguistic knowledge that may unveil the connections between one 's language use and his psychological traits .", "entity": "graph nodes", "output": "initial embeddings", "neg_sample": ["graph nodes is done by using OtherScientificTerm", "most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data - driven manner , without the exploitation of psycholinguistic knowledge that may unveil the connections between one 's language use and his psychological traits ."], "relation": "used for", "id": "2021.acl-long.326", "year": 2021, "rel_sent": "The initializer is employed to provide initial embeddings for the graph nodes .", "forward": false, "src_ids": "2021.acl-long.326_12554"}
{"input": "transformer encoder models is done by using OtherScientificTerm| context: transformer encoder models exhibit strong performance in single - domain applications . however , in a cross - domain situation , using a sub - word vocabulary model results in sub - word overlap . this is an issue when there is an overlap between sub - words that share no semantic similarity between domains .", "entity": "transformer encoder models", "output": "vocabulary size", "neg_sample": ["transformer encoder models is done by using OtherScientificTerm", "transformer encoder models exhibit strong performance in single - domain applications .", "however , in a cross - domain situation , using a sub - word vocabulary model results in sub - word overlap .", "this is an issue when there is an overlap between sub - words that share no semantic similarity between domains ."], "relation": "used for", "id": "2021.alta-1.22", "year": 2021, "rel_sent": "We present a study on reducing sub - word overlap by scaling the vocabulary size in a Transformer encoder model while pretraining with multiple domains .", "forward": false, "src_ids": "2021.alta-1.22_10683"}
{"input": "embeddings is used for Task| context: we train word embeddings for kannada , a dravidian language spoken by the people of karnataka , a southern state in india .", "entity": "embeddings", "output": "downstream tasks", "neg_sample": ["embeddings is used for Task", "we train word embeddings for kannada , a dravidian language spoken by the people of karnataka , a southern state in india ."], "relation": "used for", "id": "2021.icnlsp-1.33", "year": 2021, "rel_sent": "We hope that by publicly releasing our trained models , we will help in accelerating research and easing the effort involved in training embeddings for downstream tasks .", "forward": true, "src_ids": "2021.icnlsp-1.33_6850"}
{"input": "varied rewriting styles is done by using Method| context: text simplification improves the readability of sentences through several rewriting transformations , such as lexical paraphrasing , deletion , and splitting . current simplification systems are predominantly sequence - to - sequence models that are trained end - to - end to perform all these operations simultaneously . however , such systems limit themselves to mostly deleting words and can not easily adapt to the requirements of different target audiences .", "entity": "varied rewriting styles", "output": "hybrid approach", "neg_sample": ["varied rewriting styles is done by using Method", "text simplification improves the readability of sentences through several rewriting transformations , such as lexical paraphrasing , deletion , and splitting .", "current simplification systems are predominantly sequence - to - sequence models that are trained end - to - end to perform all these operations simultaneously .", "however , such systems limit themselves to mostly deleting words and can not easily adapt to the requirements of different target audiences ."], "relation": "used for", "id": "2021.naacl-main.277", "year": 2021, "rel_sent": "In this paper , we propose a novel hybrid approach that leverages linguistically - motivated rules for splitting and deletion , and couples them with a neural paraphrasing model to produce varied rewriting styles .", "forward": false, "src_ids": "2021.naacl-main.277_11423"}
{"input": "non - logical knowledge is used for Method| context: solving natural language inference ( nli ) with formal semantics and automated theorem proving has the merit of high precision and interpretability . however , they suffer from non - logical inference such as lexical inference .", "entity": "non - logical knowledge", "output": "theorem prover", "neg_sample": ["non - logical knowledge is used for Method", "solving natural language inference ( nli ) with formal semantics and automated theorem proving has the merit of high precision and interpretability .", "however , they suffer from non - logical inference such as lexical inference ."], "relation": "used for", "id": "2021.paclic-1.44", "year": 2021, "rel_sent": "To overcome this weakness , we propose a human - in - the - loop mechanism in which the user provides non - logical knowledge to the theorem prover .", "forward": true, "src_ids": "2021.paclic-1.44_13435"}
{"input": "rare words is done by using OtherScientificTerm| context: word embedding techniques depend heavily on the frequencies of words in the corpus , and are negatively impacted by failures in providing reliable representations for low - frequency words or unseen words during training .", "entity": "rare words", "output": "embeddings", "neg_sample": ["rare words is done by using OtherScientificTerm", "word embedding techniques depend heavily on the frequencies of words in the corpus , and are negatively impacted by failures in providing reliable representations for low - frequency words or unseen words during training ."], "relation": "used for", "id": "2021.starsem-1.26", "year": 2021, "rel_sent": "To address this problem , we propose an algorithm to learn embeddings for rare words based on an Internet search engine and the spatial location relationships .", "forward": false, "src_ids": "2021.starsem-1.26_3191"}
{"input": "semantic representations is done by using Method| context: multilingual neural machine translation ( mnmt ) has aroused widespread interest due to its efficiency . an exciting advantage of mnmt models is that they could also translate between unsupervised ( zero - shot ) language directions . language tag ( lt ) strategies are often adopted to indicate the translation directions in mnmt .", "entity": "semantic representations", "output": "lt strategies", "neg_sample": ["semantic representations is done by using Method", "multilingual neural machine translation ( mnmt ) has aroused widespread interest due to its efficiency .", "an exciting advantage of mnmt models is that they could also translate between unsupervised ( zero - shot ) language directions .", "language tag ( lt ) strategies are often adopted to indicate the translation directions in mnmt ."], "relation": "used for", "id": "2021.findings-acl.264", "year": 2021, "rel_sent": "We demonstrate that a proper LT strategy could enhance the consistency of semantic representations and alleviate the off - target issue in zero - shot directions .", "forward": false, "src_ids": "2021.findings-acl.264_1550"}
{"input": "commonsense relational knowledge is used for Task| context: in this paper , we consider a realistic scenario on stance detection with more application potential , i.e. , zero - shot and few - shot stance detection , which identifies stances for a wide range of topics with no or very few training examples . conventional data - driven approaches are not applicable to the above zero - shot and few - shot scenarios .", "entity": "commonsense relational knowledge", "output": "reasoning", "neg_sample": ["commonsense relational knowledge is used for Task", "in this paper , we consider a realistic scenario on stance detection with more application potential , i.e.", ", zero - shot and few - shot stance detection , which identifies stances for a wide range of topics with no or very few training examples .", "conventional data - driven approaches are not applicable to the above zero - shot and few - shot scenarios ."], "relation": "used for", "id": "2021.findings-acl.278", "year": 2021, "rel_sent": "In the absence of annotated data and cryptic expression of users ' stance , we believe that introducing commonsense relational knowledge as support for reasoning can further improve the generalization and reasoning ability of the model in the zero - shot and few - shot scenarios .", "forward": true, "src_ids": "2021.findings-acl.278_3055"}
{"input": "ood intents is done by using OtherScientificTerm| context: detecting out - of - domain ( ood ) intents is crucial for the deployed task - oriented dialogue system . previous unsupervised ood detection methods only extract discriminative features of different in - domain intents while supervised counterparts can directly distinguish ood and in - domain intents but require extensive labeled ood data .", "entity": "ood intents", "output": "discriminative semantic features", "neg_sample": ["ood intents is done by using OtherScientificTerm", "detecting out - of - domain ( ood ) intents is crucial for the deployed task - oriented dialogue system .", "previous unsupervised ood detection methods only extract discriminative features of different in - domain intents while supervised counterparts can directly distinguish ood and in - domain intents but require extensive labeled ood data ."], "relation": "used for", "id": "2021.naacl-main.447", "year": 2021, "rel_sent": "To combine the benefits of both types , we propose a self - supervised contrastive learning framework to model discriminative semantic features of both in - domain intents and OOD intents from unlabeled data .", "forward": false, "src_ids": "2021.naacl-main.447_444"}
{"input": "qa models is done by using Method| context: one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis . however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues .", "entity": "qa models", "output": "explicit guidance on how to resolve conversational dependency", "neg_sample": ["qa models is done by using Method", "one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis .", "however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues ."], "relation": "used for", "id": "2021.acl-long.478", "year": 2021, "rel_sent": "In our experiments , we demonstrate that ExCorD significantly improves the QA models ' performance by up to 1.2 F1 on QuAC , and 5.2 F1 on CANARD , while addressing the limitations of the existing approaches .", "forward": false, "src_ids": "2021.acl-long.478_666"}
{"input": "sociolinguistic aspects of c - s is done by using Method| context: the analysis of data in which multiple languages are represented has gained popularity among computational linguists in recent years . sofar , much of this research focuses mainly on the improvement of computational methods and largely ignores linguistic and social aspects of c - s discussed across a wide range of languages within the long - established literature in linguistics .", "entity": "sociolinguistic aspects of c - s", "output": "end - to- end systems", "neg_sample": ["sociolinguistic aspects of c - s is done by using Method", "the analysis of data in which multiple languages are represented has gained popularity among computational linguists in recent years .", "sofar , much of this research focuses mainly on the improvement of computational methods and largely ignores linguistic and social aspects of c - s discussed across a wide range of languages within the long - established literature in linguistics ."], "relation": "used for", "id": "2021.acl-long.131", "year": 2021, "rel_sent": "From the language technologies perspective , we discuss how massive language models fail to represent diverse C - S types due to lack of appropriate training data , lack of robust evaluation benchmarks for C - S ( across multilingual situations and types of C - S ) and lack of end - to- end systems that cover sociolinguistic aspects of C - S as well .", "forward": false, "src_ids": "2021.acl-long.131_267"}
{"input": "post - editing is done by using Method| context: despite the increasingly good quality of machine translation ( mt ) systems , mt outputs require corrections . automatic post - editing ( ape ) models have been introduced to perform these corrections without human intervention . however , no system has been able tofully automate the post - editing ( pe ) process . moreover , while numerous translation tools , such as translation memories ( tms ) , largely benefit from translators ' input , human - computer interaction ( hci ) remains limited when it comes to pe .", "entity": "post - editing", "output": "interactive models", "neg_sample": ["post - editing is done by using Method", "despite the increasingly good quality of machine translation ( mt ) systems , mt outputs require corrections .", "automatic post - editing ( ape ) models have been introduced to perform these corrections without human intervention .", "however , no system has been able tofully automate the post - editing ( pe ) process .", "moreover , while numerous translation tools , such as translation memories ( tms ) , largely benefit from translators ' input , human - computer interaction ( hci ) remains limited when it comes to pe ."], "relation": "used for", "id": "2021.triton-1.19", "year": 2021, "rel_sent": "Interactive Models for Post - Editing.", "forward": false, "src_ids": "2021.triton-1.19_7875"}
{"input": "style vs. topic - related features is done by using Method| context: hyperpartisan news show an extreme manipulation of reality based on an underlying and extreme ideological orientation . because of its harmful effects at reinforcing one 's bias and the posterior behavior of people , hyperpartisan news detection has become an important task for computational linguists .", "entity": "style vs. topic - related features", "output": "text masking technique", "neg_sample": ["style vs. topic - related features is done by using Method", "hyperpartisan news show an extreme manipulation of reality based on an underlying and extreme ideological orientation .", "because of its harmful effects at reinforcing one 's bias and the posterior behavior of people , hyperpartisan news detection has become an important task for computational linguists ."], "relation": "used for", "id": "2021.ranlp-1.140", "year": 2021, "rel_sent": "First , a text masking technique that allows us to compare style vs. topic - related features in a different perspective from previous work .", "forward": false, "src_ids": "2021.ranlp-1.140_817"}
{"input": "quality of noisy weak labels is done by using OtherScientificTerm| context: weakly supervised methods estimate the labels for a dataset using the predictions of several noisy supervision sources . many machine learning practitioners have begun using weak supervision to more quickly and cheaply annotate data compared to traditional manual labeling .", "entity": "quality of noisy weak labels", "output": "contextual embeddings", "neg_sample": ["quality of noisy weak labels is done by using OtherScientificTerm", "weakly supervised methods estimate the labels for a dataset using the predictions of several noisy supervision sources .", "many machine learning practitioners have begun using weak supervision to more quickly and cheaply annotate data compared to traditional manual labeling ."], "relation": "used for", "id": "2021.findings-acl.335", "year": 2021, "rel_sent": "State - of - the - art contextual embeddings are used tofurther discriminate the quality of noisy weak labels in various contexts .", "forward": false, "src_ids": "2021.findings-acl.335_11999"}
{"input": "transfer learning approach is used for Task| context: the quest for seeking health information has swamped the web with consumers ' healthrelated questions , which makes the need for efficient and reliable question answering systems more pressing . the consumers ' questions , however , are very descriptive and contain several peripheral information ( like patient 's medical history , demographic information , etc . ) , that are often not required for answering the question . furthermore , it contributes to the challenges of understanding natural language questions for automatic answer retrieval . also , it is crucial to provide the consumers with the exact and relevant answers , rather than the entire pool of answer documents to their question . one of the cardinal tasks in achieving robust consumer health question answering systems is the question summarization and multi - document answer summarization .", "entity": "transfer learning approach", "output": "abstractive question summarization", "neg_sample": ["transfer learning approach is used for Task", "the quest for seeking health information has swamped the web with consumers ' healthrelated questions , which makes the need for efficient and reliable question answering systems more pressing .", "the consumers ' questions , however , are very descriptive and contain several peripheral information ( like patient 's medical history , demographic information , etc . )", ", that are often not required for answering the question .", "furthermore , it contributes to the challenges of understanding natural language questions for automatic answer retrieval .", "also , it is crucial to provide the consumers with the exact and relevant answers , rather than the entire pool of answer documents to their question .", "one of the cardinal tasks in achieving robust consumer health question answering systems is the question summarization and multi - document answer summarization ."], "relation": "used for", "id": "2021.bionlp-1.34", "year": 2021, "rel_sent": "In this work , we exploited the capabilities of pre - trained transformer models and introduced a transfer learning approach for the abstractive Question Summarization and extractive Multi - Answer Summarization tasks by first pre - training our model on a task - specific summarization dataset followed by fine - tuning it for both the tasks via incorporating medical entities .", "forward": true, "src_ids": "2021.bionlp-1.34_5538"}
{"input": "multi - label text classification is done by using Method| context: multi - label text classification is one of the fundamental tasks in natural language processing . previous studies have difficulties to distinguish similar labels well because they learn the same document representations for different labels , that is they do not explicitly extract label - specific semantic components from documents . moreover , they do not fully explore the high - order interactions among these semantic components , which is very helpful to predict tail labels .", "entity": "multi - label text classification", "output": "label - specific dual graph neural network", "neg_sample": ["multi - label text classification is done by using Method", "multi - label text classification is one of the fundamental tasks in natural language processing .", "previous studies have difficulties to distinguish similar labels well because they learn the same document representations for different labels , that is they do not explicitly extract label - specific semantic components from documents .", "moreover , they do not fully explore the high - order interactions among these semantic components , which is very helpful to predict tail labels ."], "relation": "used for", "id": "2021.acl-long.298", "year": 2021, "rel_sent": "Label - Specific Dual Graph Neural Network for Multi - Label Text Classification.", "forward": false, "src_ids": "2021.acl-long.298_14587"}
{"input": "neural methods is used for Task| context: advances in transfer learning and domain adaptation have raised hopes that once - challenging nlp tasks are ready to be put to use for sophisticated information extraction needs .", "entity": "neural methods", "output": "negation detection", "neg_sample": ["neural methods is used for Task", "advances in transfer learning and domain adaptation have raised hopes that once - challenging nlp tasks are ready to be put to use for sophisticated information extraction needs ."], "relation": "used for", "id": "2021.adaptnlp-1.11", "year": 2021, "rel_sent": "In this work , we describe an effort to do just that - combining state - of - the - art neural methods for negation detection , document time relation extraction , and aspectual link prediction , with the eventual goal of extracting drug timelines from electronic health record text .", "forward": true, "src_ids": "2021.adaptnlp-1.11_9816"}
{"input": "machine translation is done by using Method| context: sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation . in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding .", "entity": "machine translation", "output": "sparsification", "neg_sample": ["machine translation is done by using Method", "sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation .", "in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding ."], "relation": "used for", "id": "2021.findings-acl.255", "year": 2021, "rel_sent": "We investigate the effects of this sparsification on two machine translation and two summarization tasks .", "forward": false, "src_ids": "2021.findings-acl.255_12013"}
{"input": "medical task prediction is done by using Method| context: healthcare is becoming a more and more important research topic recently . with the growing data in the healthcare domain , it offers a great opportunity for deep learning to improve the quality of service and reduce costs . however , the complexity of electronic health records ( ehr ) data is a challenge for the application of deep learning . specifically , the data produced in the hospital admissions are monitored by the ehr system , which includes structured data like daily body temperature and unstructured data like free text and laboratory measurements . although there are some preprocessing frameworks proposed for specific ehr data , the clinical notes that contain significant clinical value are beyond the realm of their consideration . besides , whether these different data from various views are all beneficial to the medical tasks and how to best utilize these data remain unclear .", "entity": "medical task prediction", "output": "data leverage methods", "neg_sample": ["medical task prediction is done by using Method", "healthcare is becoming a more and more important research topic recently .", "with the growing data in the healthcare domain , it offers a great opportunity for deep learning to improve the quality of service and reduce costs .", "however , the complexity of electronic health records ( ehr ) data is a challenge for the application of deep learning .", "specifically , the data produced in the hospital admissions are monitored by the ehr system , which includes structured data like daily body temperature and unstructured data like free text and laboratory measurements .", "although there are some preprocessing frameworks proposed for specific ehr data , the clinical notes that contain significant clinical value are beyond the realm of their consideration .", "besides , whether these different data from various views are all beneficial to the medical tasks and how to best utilize these data remain unclear ."], "relation": "used for", "id": "2021.emnlp-main.329", "year": 2021, "rel_sent": "Therefore , in this paper , we first extract the accompanying clinical notes from EHR and propose a method to integrate these data , we also comprehensively study the different models and the data leverage methods for better medical task prediction performance .", "forward": false, "src_ids": "2021.emnlp-main.329_1894"}
{"input": "structural guidance is used for Task| context: transformer - based language models pre - trained on large amounts of text data have proven remarkably successful in learning generic transferable linguistic representations . we explore two general ideas .", "entity": "structural guidance", "output": "human - like systematic linguistic generalization", "neg_sample": ["structural guidance is used for Task", "transformer - based language models pre - trained on large amounts of text data have proven remarkably successful in learning generic transferable linguistic representations .", "we explore two general ideas ."], "relation": "used for", "id": "2021.acl-long.289", "year": 2021, "rel_sent": "Here we study whether structural guidance leads to more human - like systematic linguistic generalization in Transformer language models without resorting to pre - training on very large amounts of data .", "forward": true, "src_ids": "2021.acl-long.289_8472"}
{"input": "diverse set generation strategies is used for Task| context: beam search is a go - to strategy for decoding neural sequence models . the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates . empirically , this leads to sets often exhibiting high overlap , e.g. , strings may differ by only a single word . yet in use - cases that call for multiple solutions , a diverse or representative set is often desired .", "entity": "diverse set generation strategies", "output": "language generation", "neg_sample": ["diverse set generation strategies is used for Task", "beam search is a go - to strategy for decoding neural sequence models .", "the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates .", "empirically , this leads to sets often exhibiting high overlap , e.g.", ", strings may differ by only a single word .", "yet in use - cases that call for multiple solutions , a diverse or representative set is often desired ."], "relation": "used for", "id": "2021.acl-long.512", "year": 2021, "rel_sent": "We observe that our algorithm offers competitive performance against other diverse set generation strategies in the context of language generation , while providing a more general approach to optimizing for diversity .", "forward": true, "src_ids": "2021.acl-long.512_11821"}
{"input": "dangling entity detection is done by using Method| context: this paper studies a new problem setting of entity alignment for knowledge graphs ( kgs ) . since kgs possess different sets of entities , there could be entities that can not find alignment across them , leading to the problem of dangling entities .", "entity": "dangling entity detection", "output": "multi - task learning framework", "neg_sample": ["dangling entity detection is done by using Method", "this paper studies a new problem setting of entity alignment for knowledge graphs ( kgs ) .", "since kgs possess different sets of entities , there could be entities that can not find alignment across them , leading to the problem of dangling entities ."], "relation": "used for", "id": "2021.acl-long.278", "year": 2021, "rel_sent": "As the first attempt to this problem , we construct a new dataset and design a multi - task learning framework for both entity alignment and dangling entity detection .", "forward": false, "src_ids": "2021.acl-long.278_4656"}
{"input": "automatic post - editing is used for Task| context: accurate translation requires document - level information , which is ignored by sentence - level machine translation . recent work has demonstrated that document - level consistency can be improved with automatic post - editing ( ape ) using only target - language ( tl ) information .", "entity": "automatic post - editing", "output": "context - aware machine translation", "neg_sample": ["automatic post - editing is used for Task", "accurate translation requires document - level information , which is ignored by sentence - level machine translation .", "recent work has demonstrated that document - level consistency can be improved with automatic post - editing ( ape ) using only target - language ( tl ) information ."], "relation": "used for", "id": "2021.nodalida-main.34", "year": 2021, "rel_sent": "Exploring the Importance of Source Text in Automatic Post - Editing for Context - Aware Machine Translation.", "forward": true, "src_ids": "2021.nodalida-main.34_1088"}
{"input": "probability distribution over trees is done by using Method| context: in this abstract we outline some theoretical work on the probabilistic learning of a representative mildly context - sensitive grammar formalism from positive examples only . in a recent paper , clark and fijalkow ( 2020 ) ( cf from now on ) present a consistent unsupervised learning algorithm for probabilistic context - free grammars ( pcfgs ) satisfying certain structural conditions : it converges to the correct grammar and parameter values , taking as input only a sample of strings generated by the pcfg . the derivation trees of these grammars are richer than those of context - free grammars and naturally account for limited forms of syntactic movement ( rogers , 2003 ) , and so the issue of how these structural descriptions can be learned is of great theoretical importance .", "entity": "probability distribution over trees", "output": "probabilistic tree adjoining grammars", "neg_sample": ["probability distribution over trees is done by using Method", "in this abstract we outline some theoretical work on the probabilistic learning of a representative mildly context - sensitive grammar formalism from positive examples only .", "in a recent paper , clark and fijalkow ( 2020 ) ( cf from now on ) present a consistent unsupervised learning algorithm for probabilistic context - free grammars ( pcfgs ) satisfying certain structural conditions : it converges to the correct grammar and parameter values , taking as input only a sample of strings generated by the pcfg .", "the derivation trees of these grammars are richer than those of context - free grammars and naturally account for limited forms of syntactic movement ( rogers , 2003 ) , and so the issue of how these structural descriptions can be learned is of great theoretical importance ."], "relation": "used for", "id": "2021.scil-1.47", "year": 2021, "rel_sent": "In this learning model , we have a probabilistic tree grammar which generates a probability distribution over trees ; given a sample of these trees , the learner must converge to a grammar that has the same structure as the original grammar and the same parameters .", "forward": false, "src_ids": "2021.scil-1.47_14914"}
{"input": "automatic mind - map generation method is used for OtherScientificTerm| context: a mind - map is a diagram that represents the central concept and key ideas in a hierarchical way . converting plain text into a mind - map will reveal its key semantic structure and be easier to understand . the computation complexity increases exponentially with the length of the document . moreover , it is difficult to capture the overall semantics .", "entity": "automatic mind - map generation method", "output": "directed semantic graph", "neg_sample": ["automatic mind - map generation method is used for OtherScientificTerm", "a mind - map is a diagram that represents the central concept and key ideas in a hierarchical way .", "converting plain text into a mind - map will reveal its key semantic structure and be easier to understand .", "the computation complexity increases exponentially with the length of the document .", "moreover , it is difficult to capture the overall semantics ."], "relation": "used for", "id": "2021.emnlp-main.641", "year": 2021, "rel_sent": "Given a document , the existing automatic mind - map generation method extracts the relationships of every sentence pair to generate the directed semantic graph for this document .", "forward": true, "src_ids": "2021.emnlp-main.641_13324"}
{"input": "darknet forum migrant analysis is done by using Method| context: darknet market forums are frequently used to exchange illegal goods and services between parties who use encryption to conceal their identities . the tor network is used to host these markets , which guarantees additional anonymization from ip and location tracking , making it challenging to link across malicious users using multiple accounts ( sybils ) . additionally , users migrate to new forums when one is closed further increasing the difficulty of linking users across multiple forums .", "entity": "darknet forum migrant analysis", "output": "multitask learning", "neg_sample": ["darknet forum migrant analysis is done by using Method", "darknet market forums are frequently used to exchange illegal goods and services between parties who use encryption to conceal their identities .", "the tor network is used to host these markets , which guarantees additional anonymization from ip and location tracking , making it challenging to link across malicious users using multiple accounts ( sybils ) .", "additionally , users migrate to new forums when one is closed further increasing the difficulty of linking users across multiple forums ."], "relation": "used for", "id": "2021.emnlp-main.548", "year": 2021, "rel_sent": "SYSML : StYlometry with Structure and Multitask Learning : Implications for Darknet Forum Migrant Analysis.", "forward": false, "src_ids": "2021.emnlp-main.548_10935"}
{"input": "implementation tricks is used for Task| context: filtering target - irrelevant information through hierarchically refining hidden states has been demonstrated to be effective for obtaining informative representations . however , previous work simply relies on locally normalized attention without considering possible labels at other time steps , the capacity for modeling long - term dependency relations is thus limited .", "entity": "implementation tricks", "output": "crf computation", "neg_sample": ["implementation tricks is used for Task", "filtering target - irrelevant information through hierarchically refining hidden states has been demonstrated to be effective for obtaining informative representations .", "however , previous work simply relies on locally normalized attention without considering possible labels at other time steps , the capacity for modeling long - term dependency relations is thus limited ."], "relation": "used for", "id": "2021.findings-acl.164", "year": 2021, "rel_sent": "We also propose two implementation tricks to accelerate CRF computation and an initialization trick for Chinese character embeddings tofurther improve performance .", "forward": true, "src_ids": "2021.findings-acl.164_11143"}
{"input": "low - resource domain adaptation is done by using Method| context: fine - tuning is known to improve nlp models by adapting an initial model trained on more plentiful but less domain - salient examples to data in a target domain . such domain adaptation is typically done using one stage of fine - tuning .", "entity": "low - resource domain adaptation", "output": "gradually fine - tuning", "neg_sample": ["low - resource domain adaptation is done by using Method", "fine - tuning is known to improve nlp models by adapting an initial model trained on more plentiful but less domain - salient examples to data in a target domain .", "such domain adaptation is typically done using one stage of fine - tuning ."], "relation": "used for", "id": "2021.adaptnlp-1.22", "year": 2021, "rel_sent": "Gradual Fine - Tuning for Low - Resource Domain Adaptation.", "forward": false, "src_ids": "2021.adaptnlp-1.22_2111"}
{"input": "semi - supervised multi - task learning approach is used for OtherScientificTerm| context: in the area of customer support , understanding customers ' intents is a crucial step . machine learning plays a vital role in this type of intent classification . in reality , it is typical to collect confirmation from customer support representatives ( csrs ) regarding the intent prediction , though it can unnecessarily incur prohibitive cost to ask csrs to assign existing or new intents to the mis - classified cases . apart from the confirmed cases with and without intent labels , there can be a number of cases with no human curation . this data composition ( positives + unlabeled + multiclass negatives ) creates unique challenges for model development .", "entity": "semi - supervised multi - task learning approach", "output": "customer contact intents", "neg_sample": ["semi - supervised multi - task learning approach is used for OtherScientificTerm", "in the area of customer support , understanding customers ' intents is a crucial step .", "machine learning plays a vital role in this type of intent classification .", "in reality , it is typical to collect confirmation from customer support representatives ( csrs ) regarding the intent prediction , though it can unnecessarily incur prohibitive cost to ask csrs to assign existing or new intents to the mis - classified cases .", "apart from the confirmed cases with and without intent labels , there can be a number of cases with no human curation .", "this data composition ( positives + unlabeled + multiclass negatives ) creates unique challenges for model development ."], "relation": "used for", "id": "2021.ecnlp-1.7", "year": 2021, "rel_sent": "A Semi - supervised Multi - task Learning Approach to Classify Customer Contact Intents.", "forward": true, "src_ids": "2021.ecnlp-1.7_12175"}
{"input": "pre - trained models is done by using Method| context: modern pre - trained language models are mostly built upon backbones stacking self - attention and feed - forward layers in an interleaved order .", "entity": "pre - trained models", "output": "convolution", "neg_sample": ["pre - trained models is done by using Method", "modern pre - trained language models are mostly built upon backbones stacking self - attention and feed - forward layers in an interleaved order ."], "relation": "used for", "id": "2021.findings-acl.2", "year": 2021, "rel_sent": "Specifically , besides the original self - attention and feed - forward layers , we introduce convolution into the layer type set , which is experimentally found beneficial to pre - trained models .", "forward": false, "src_ids": "2021.findings-acl.2_935"}
{"input": "adaptive its is used for Material| context: integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena .", "entity": "adaptive its", "output": "english", "neg_sample": ["adaptive its is used for Material", "integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena ."], "relation": "used for", "id": "2021.nlp4call-1.2", "year": 2021, "rel_sent": "In the context of developing an adaptive ITS for English as a Foreign Language , we propose a method to automatically derive rich activity models from ordinary exercise specifications .", "forward": true, "src_ids": "2021.nlp4call-1.2_8606"}
{"input": "product labels is done by using OtherScientificTerm| context: item categorization is an important application of text classification in e - commerce due to its impact on the online shopping experience of users . one class of text classification techniques that has gained attention recently is using the semantic information of the labels to guide the classification task .", "entity": "product labels", "output": "hyperbolic space", "neg_sample": ["product labels is done by using OtherScientificTerm", "item categorization is an important application of text classification in e - commerce due to its impact on the online shopping experience of users .", "one class of text classification techniques that has gained attention recently is using the semantic information of the labels to guide the classification task ."], "relation": "used for", "id": "2021.naacl-industry.37", "year": 2021, "rel_sent": "Furthermore , using a hyperbolic space to embed product labels that are organized in a hierarchical structure led to better performance compared to using a conventional Euclidean space embedding .", "forward": false, "src_ids": "2021.naacl-industry.37_10656"}
{"input": "event structures is done by using Method| context: event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning . however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data .", "entity": "event structures", "output": "graph encoder", "neg_sample": ["event structures is done by using Method", "event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning .", "however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data ."], "relation": "used for", "id": "2021.acl-long.491", "year": 2021, "rel_sent": "CLEVE contains a text encoder to learn event semantics and a graph encoder to learn event structures respectively .", "forward": false, "src_ids": "2021.acl-long.491_6668"}
{"input": "lexical hyperlinks is used for Method| context: we describe a new addition to the webvectors toolkit which is used to serve word embedding models over the web . the new elmoviz module adds support for contextualized embedding architectures , in particular for elmo models .", "entity": "lexical hyperlinks", "output": "word representations", "neg_sample": ["lexical hyperlinks is used for Method", "we describe a new addition to the webvectors toolkit which is used to serve word embedding models over the web .", "the new elmoviz module adds support for contextualized embedding architectures , in particular for elmo models ."], "relation": "used for", "id": "2021.eacl-demos.18", "year": 2021, "rel_sent": "The module is well integrated into the rest of the WebVectors toolkit , providing lexical hyperlinks to word representations in static embedding models .", "forward": true, "src_ids": "2021.eacl-demos.18_15728"}
{"input": "multi - task learning approach is used for Task| context: aspect term extraction ( ate ) , opinion term extraction ( ote ) and aspect sentiment classification ( asc ) are the essential building blocks of aspect - based sentiment analysis ( absa ) . they are typically treated as separate tasks and are individually studied by previous work . recent studies intend to incorporate multiple sub - tasks into a unified framework , but suffer from the following major disadvantages : ( 1 ) absa models are extremely fragile when some sub - tasks are absent ; ( 2 ) the interactive relations among subtasks are not adequate .", "entity": "multi - task learning approach", "output": "unified absa", "neg_sample": ["multi - task learning approach is used for Task", "aspect term extraction ( ate ) , opinion term extraction ( ote ) and aspect sentiment classification ( asc ) are the essential building blocks of aspect - based sentiment analysis ( absa ) .", "they are typically treated as separate tasks and are individually studied by previous work .", "recent studies intend to incorporate multiple sub - tasks into a unified framework , but suffer from the following major disadvantages : ( 1 ) absa models are extremely fragile when some sub - tasks are absent ; ( 2 ) the interactive relations among subtasks are not adequate ."], "relation": "used for", "id": "2021.findings-acl.238", "year": 2021, "rel_sent": "To this end , we propose a multi - task learning approach named MIN ( Multiplex Interaction Network ) to make flexible use of sub - tasks for a unified ABSA .", "forward": true, "src_ids": "2021.findings-acl.238_3702"}
{"input": "unified listwise training approach is used for Method| context: in various natural language processing tasks , passage retrieval and passage re - ranking are two key procedures in finding and ranking relevant information . since both the two procedures contribute to the final performance , it is important to jointly optimize them in order to achieve mutual improvement .", "entity": "unified listwise training approach", "output": "retriever", "neg_sample": ["unified listwise training approach is used for Method", "in various natural language processing tasks , passage retrieval and passage re - ranking are two key procedures in finding and ranking relevant information .", "since both the two procedures contribute to the final performance , it is important to jointly optimize them in order to achieve mutual improvement ."], "relation": "used for", "id": "2021.emnlp-main.224", "year": 2021, "rel_sent": "A major contribution is that we introduce the dynamic listwise distillation , where we design a unified listwise training approach for both the retriever and the re - ranker .", "forward": true, "src_ids": "2021.emnlp-main.224_856"}
{"input": "sub - word representations is used for OtherScientificTerm| context: definition modelling is the task of automatically generating a dictionary - style definition given a target word .", "entity": "sub - word representations", "output": "morphologically - complex wolastoqey words", "neg_sample": ["sub - word representations is used for OtherScientificTerm", "definition modelling is the task of automatically generating a dictionary - style definition given a target word ."], "relation": "used for", "id": "2021.ranlp-1.17", "year": 2021, "rel_sent": "We hypothesize that sub - word representations based on byte pair encoding ( Sennrich et al . , 2016 ) can be leveraged to represent morphologically - complex Wolastoqey words and overcome the challenge of not having large corpora available for training .", "forward": true, "src_ids": "2021.ranlp-1.17_2591"}
{"input": "multi - modal intent classification is used for Task| context: one of the natural ways of communicating with assistive technologies is through verbal instructions . the meaning of natural language commands depends on the current configuration of the surrounding environment and needs to be interpreted in this multi - modal context , as accurate interpretation of the command is essential for a successful execution of the users intent by an assistive device .", "entity": "multi - modal intent classification", "output": "assistive robots", "neg_sample": ["multi - modal intent classification is used for Task", "one of the natural ways of communicating with assistive technologies is through verbal instructions .", "the meaning of natural language commands depends on the current configuration of the surrounding environment and needs to be interpreted in this multi - modal context , as accurate interpretation of the command is essential for a successful execution of the users intent by an assistive device ."], "relation": "used for", "id": "2021.alta-1.5", "year": 2021, "rel_sent": "Multi - modal Intent Classification for Assistive Robots with Large - scale Naturalistic Datasets.", "forward": true, "src_ids": "2021.alta-1.5_7557"}
{"input": "structure prediction task is done by using Generic| context: conversations are often held in laboratories and companies . a summary is vital to grasp the content of a discussion for people who did not attend the discussion . if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately .", "entity": "structure prediction task", "output": "two - step methods", "neg_sample": ["structure prediction task is done by using Generic", "conversations are often held in laboratories and companies .", "a summary is vital to grasp the content of a discussion for people who did not attend the discussion .", "if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately ."], "relation": "used for", "id": "2021.ranlp-1.61", "year": 2021, "rel_sent": "To solve this problem , we introduce a two - step method to the structure prediction task .", "forward": false, "src_ids": "2021.ranlp-1.61_152"}
{"input": "learning - based models is done by using Generic| context: toxic spans detection is an emerging challenge that aims tofind toxic spans within a toxic text .", "entity": "learning - based models", "output": "last", "neg_sample": ["learning - based models is done by using Generic", "toxic spans detection is an emerging challenge that aims tofind toxic spans within a toxic text ."], "relation": "used for", "id": "2021.semeval-1.116", "year": 2021, "rel_sent": "This last is used to interpret predictions of learning - based models .", "forward": false, "src_ids": "2021.semeval-1.116_14873"}
{"input": "label confusion strategy is used for OtherScientificTerm| context: zero - shot cross - domain slot filling alleviates the data dependence in the case of data scarcity in the target domain , which has aroused extensive research . however , as most of the existing methods do not achieve effective knowledge transfer to the target domain , they just fit the distribution of the seen slot and show poor performance on unseen slot in the target domain .", "entity": "label confusion strategy", "output": "label dependence", "neg_sample": ["label confusion strategy is used for OtherScientificTerm", "zero - shot cross - domain slot filling alleviates the data dependence in the case of data scarcity in the target domain , which has aroused extensive research .", "however , as most of the existing methods do not achieve effective knowledge transfer to the target domain , they just fit the distribution of the seen slot and show poor performance on unseen slot in the target domain ."], "relation": "used for", "id": "2021.emnlp-main.746", "year": 2021, "rel_sent": "The prototypical contrastive learning aims to reconstruct the semantic constraints of labels , and we introduce the label confusion strategy to establish the label dependence between the source domains and the target domain on - the - fly .", "forward": true, "src_ids": "2021.emnlp-main.746_12377"}
{"input": "masking scheme is used for Task| context: a principal barrier to training temporal relation extraction models in new domains is the lack of varied , high quality examples and the challenge of collecting more .", "entity": "masking scheme", "output": "generalization", "neg_sample": ["masking scheme is used for Task", "a principal barrier to training temporal relation extraction models in new domains is the lack of varied , high quality examples and the challenge of collecting more ."], "relation": "used for", "id": "2021.adaptnlp-1.20", "year": 2021, "rel_sent": "We demonstrate that a pre - trained Transformer model is able to transfer from the weakly labeled examples to human - annotated benchmarks in both zero - shot and few - shot settings , and that the masking scheme is important in improving generalization .", "forward": true, "src_ids": "2021.adaptnlp-1.20_6447"}
{"input": "transformer - based sentence classifiers is done by using Task| context: we investigate how sentence - level transformers can be modified into effective sequence labelers at the token level without any direct supervision . as transformers contain multiple layers of multi - head self - attention , information in the sentence gets distributed between many tokens , negatively affecting zero - shot token - level performance .", "entity": "transformer - based sentence classifiers", "output": "zero - shot sequence labeling", "neg_sample": ["transformer - based sentence classifiers is done by using Task", "we investigate how sentence - level transformers can be modified into effective sequence labelers at the token level without any direct supervision .", "as transformers contain multiple layers of multi - head self - attention , information in the sentence gets distributed between many tokens , negatively affecting zero - shot token - level performance ."], "relation": "used for", "id": "2021.repl4nlp-1.20", "year": 2021, "rel_sent": "Zero - shot Sequence Labeling for Transformer - based Sentence Classifiers.", "forward": false, "src_ids": "2021.repl4nlp-1.20_15904"}
{"input": "unsupervised learning is used for Method| context: evaluating the quality of responses generated by open - domain conversation systems is a challenging task . this is partly because there can be multiple appropriate responses to a given dialogue history . reference - based metrics that rely on comparisons to a set of known correct responses often fail to account for this variety , and consequently correlate poorly with human judgment .", "entity": "unsupervised learning", "output": "response evaluation model", "neg_sample": ["unsupervised learning is used for Method", "evaluating the quality of responses generated by open - domain conversation systems is a challenging task .", "this is partly because there can be multiple appropriate responses to a given dialogue history .", "reference - based metrics that rely on comparisons to a set of known correct responses often fail to account for this variety , and consequently correlate poorly with human judgment ."], "relation": "used for", "id": "2021.naacl-main.120", "year": 2021, "rel_sent": "Generating Negative Samples by Manipulating Golden Responses for Unsupervised Learning of a Response Evaluation Model.", "forward": true, "src_ids": "2021.naacl-main.120_2386"}
{"input": "disagreement space is done by using OtherScientificTerm| context: detecting arguments in online interactions is useful to understand how conflicts arise and get resolved .", "entity": "disagreement space", "output": "sarcasm", "neg_sample": ["disagreement space is done by using OtherScientificTerm", "detecting arguments in online interactions is useful to understand how conflicts arise and get resolved ."], "relation": "used for", "id": "2021.eacl-main.171", "year": 2021, "rel_sent": "Tofurther our understanding of the role of sarcasm in shaping the disagreement space , we present a thorough experimental setup using a corpus annotated with both argumentative moves ( agree / disagree ) and sarcasm .", "forward": false, "src_ids": "2021.eacl-main.171_1592"}
{"input": "word representation is done by using Method| context: to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential . keyphrases provide a concise representation by identifying salient concepts in a document . various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts . moreover , keyphrases , which are usually the gist of a document , need to be the central theme .", "entity": "word representation", "output": "extraction model", "neg_sample": ["word representation is done by using Method", "to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential .", "keyphrases provide a concise representation by identifying salient concepts in a document .", "various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts .", "moreover , keyphrases , which are usually the gist of a document , need to be the central theme ."], "relation": "used for", "id": "2021.bionlp-1.17", "year": 2021, "rel_sent": "We propose a new extraction model that introduces a centrality constraint to enrich the word representation of a Bidirectional long short - term memory .", "forward": false, "src_ids": "2021.bionlp-1.17_4786"}
{"input": "general - purpose knowledge bases ( kbs ) is used for Method| context: the limits of applicability of vision - and language models are defined by the coverage of their training data . tasks like vision question answering ( vqa ) often require commonsense and factual information beyond what can be learned from task - specific datasets .", "entity": "general - purpose knowledge bases ( kbs )", "output": "vision - and - language transformers", "neg_sample": ["general - purpose knowledge bases ( kbs ) is used for Method", "the limits of applicability of vision - and language models are defined by the coverage of their training data .", "tasks like vision question answering ( vqa ) often require commonsense and factual information beyond what can be learned from task - specific datasets ."], "relation": "used for", "id": "2021.lantern-1.1", "year": 2021, "rel_sent": "This paper investigates the injection of knowledge from general - purpose knowledge bases ( KBs ) into vision - and - language transformers .", "forward": true, "src_ids": "2021.lantern-1.1_5167"}
{"input": "data augmentation strategy is used for Task| context: due to recent pretrained multilingual representation models , it has become feasible to exploit labeled data from one language to train a cross - lingual model that can then be applied to multiple new languages . in practice , however , we still face the problem of scarce labeled data , leading to subpar results .", "entity": "data augmentation strategy", "output": "cross - lingual natural language inference", "neg_sample": ["data augmentation strategy is used for Task", "due to recent pretrained multilingual representation models , it has become feasible to exploit labeled data from one language to train a cross - lingual model that can then be applied to multiple new languages .", "in practice , however , we still face the problem of scarce labeled data , leading to subpar results ."], "relation": "used for", "id": "2021.acl-long.401", "year": 2021, "rel_sent": "In this paper , we propose a novel data augmentation strategy for better cross - lingual natural language inference by enriching the data to reflect more diversity in a semantically faithful way .", "forward": true, "src_ids": "2021.acl-long.401_15547"}
{"input": "pretrained language models is used for OtherScientificTerm| context: translation between natural language and source code can help software development by enabling developers to comprehend , ideate , search , and write computer programs in natural language . despite growing interest from the industry and the research community , this task is often difficult due to the lack of large standard datasets suitable for training deep neural models , standard noise removal methods , and evaluation benchmarks . this leaves researchers to collect new small - scale datasets , resulting in inconsistencies across published works .", "entity": "pretrained language models", "output": "java", "neg_sample": ["pretrained language models is used for OtherScientificTerm", "translation between natural language and source code can help software development by enabling developers to comprehend , ideate , search , and write computer programs in natural language .", "despite growing interest from the industry and the research community , this task is often difficult due to the lack of large standard datasets suitable for training deep neural models , standard noise removal methods , and evaluation benchmarks .", "this leaves researchers to collect new small - scale datasets , resulting in inconsistencies across published works ."], "relation": "used for", "id": "2021.findings-acl.18", "year": 2021, "rel_sent": "Furthermore , we show CoDesc 's effectiveness in pre - training - finetuning setup , opening possibilities in building pretrained language models for Java .", "forward": true, "src_ids": "2021.findings-acl.18_10834"}
{"input": "descriptive and informative captions is done by using Task| context: unlike conventional image captions that simply describe the content of the image in general terms , news image captions follow journalistic guidelines and rely heavily on named entities to describe the image content , often drawing context from the whole article they are associated with .", "entity": "descriptive and informative captions", "output": "news article image captioning", "neg_sample": ["descriptive and informative captions is done by using Task", "unlike conventional image captions that simply describe the content of the image in general terms , news image captions follow journalistic guidelines and rely heavily on named entities to describe the image content , often drawing context from the whole article they are associated with ."], "relation": "used for", "id": "2021.emnlp-main.419", "year": 2021, "rel_sent": "The task of news article image captioning aims to generate descriptive and informative captions for news article images .", "forward": false, "src_ids": "2021.emnlp-main.419_1074"}
{"input": "bert is done by using Method| context: the rise of pre - trained language models has yielded substantial progress in the vast majority of natural language processing ( nlp ) tasks . however , a generic approach towards the pre - training procedure can naturally be sub - optimal in some cases . particularly , fine - tuning a pre - trained language model on a source domain and then applying it to a different target domain , results in a sharp performance decline of the eventual classifier for many source - target domain pairs . moreover , in some nlp tasks , the output categories substantially differ between domains , making adaptation even more challenging . this , for example , happens in the task of aspect extraction , where the aspects of interest of reviews of , e.g. , restaurants or electronic devices may be very different .", "entity": "bert", "output": "fine - tuning scheme", "neg_sample": ["bert is done by using Method", "the rise of pre - trained language models has yielded substantial progress in the vast majority of natural language processing ( nlp ) tasks .", "however , a generic approach towards the pre - training procedure can naturally be sub - optimal in some cases .", "particularly , fine - tuning a pre - trained language model on a source domain and then applying it to a different target domain , results in a sharp performance decline of the eventual classifier for many source - target domain pairs .", "moreover , in some nlp tasks , the output categories substantially differ between domains , making adaptation even more challenging .", "this , for example , happens in the task of aspect extraction , where the aspects of interest of reviews of , e.g.", ", restaurants or electronic devices may be very different ."], "relation": "used for", "id": "2021.emnlp-main.20", "year": 2021, "rel_sent": "This paper presents a new fine - tuning scheme for BERT , which aims to address the above challenges .", "forward": false, "src_ids": "2021.emnlp-main.20_9598"}
{"input": "subword regularization methods is used for Task| context: multilingual pretrained representations generally rely on subword segmentation algorithms to create a shared multilingual vocabulary . however , standard heuristic algorithms often lead to sub - optimal segmentation , especially for languages with limited amounts of data .", "entity": "subword regularization methods", "output": "cross - lingual transfer", "neg_sample": ["subword regularization methods is used for Task", "multilingual pretrained representations generally rely on subword segmentation algorithms to create a shared multilingual vocabulary .", "however , standard heuristic algorithms often lead to sub - optimal segmentation , especially for languages with limited amounts of data ."], "relation": "used for", "id": "2021.naacl-main.40", "year": 2021, "rel_sent": "First , we demonstrate empirically that applying existing subword regularization methods ( Kudo , 2018 ; Provilkov et al . , 2020 ) during fine - tuning of pre - trained multilingual representations improves the effectiveness of cross - lingual transfer .", "forward": true, "src_ids": "2021.naacl-main.40_3250"}
{"input": "biomedical entity linking is done by using Method| context: due to large number of entities in biomedical knowledge bases , only a small fraction of entities have corresponding labelled training data . this necessitates entity linking models which are able to link mentions of unseen entities using learned representations of entities . previous approaches link each mention independently , ignoring the relationships within and across documents between the entity mentions . these relations can be very useful for linking mentions in biomedical text where linking decisions are often difficult due mentions having a generic or a highly specialized form .", "entity": "biomedical entity linking", "output": "clustering - based inference", "neg_sample": ["biomedical entity linking is done by using Method", "due to large number of entities in biomedical knowledge bases , only a small fraction of entities have corresponding labelled training data .", "this necessitates entity linking models which are able to link mentions of unseen entities using learned representations of entities .", "previous approaches link each mention independently , ignoring the relationships within and across documents between the entity mentions .", "these relations can be very useful for linking mentions in biomedical text where linking decisions are often difficult due mentions having a generic or a highly specialized form ."], "relation": "used for", "id": "2021.naacl-main.205", "year": 2021, "rel_sent": "Clustering - based Inference for Biomedical Entity Linking.", "forward": false, "src_ids": "2021.naacl-main.205_7037"}
{"input": "paraphrasing is done by using Method| context: publicly available , large pretrained language models ( lms ) generate text with remarkable quality , but only sequentially from left to right . as a result , they are not immediately applicable to generation tasks that break the unidirectional assumption , such as paraphrasing or text - infilling , necessitating task - specific supervision .", "entity": "paraphrasing", "output": "reflective decoding", "neg_sample": ["paraphrasing is done by using Method", "publicly available , large pretrained language models ( lms ) generate text with remarkable quality , but only sequentially from left to right .", "as a result , they are not immediately applicable to generation tasks that break the unidirectional assumption , such as paraphrasing or text - infilling , necessitating task - specific supervision ."], "relation": "used for", "id": "2021.acl-long.114", "year": 2021, "rel_sent": "Comprehensive empirical results demonstrate that Reflective Decoding outperforms strong unsupervised baselines on both paraphrasing and abductive text infilling , significantly narrowing the gap between unsupervised and supervised methods .", "forward": false, "src_ids": "2021.acl-long.114_15864"}
{"input": "t5 text - to - text transformer is done by using Method| context: identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations . therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer . what makes this problem challenging is the diversity of product types and their attributes and values . existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes . this has limited their scalability and generalization for large scale real world e - commerce applications . previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task .", "entity": "t5 text - to - text transformer", "output": "large - scale pretraining", "neg_sample": ["t5 text - to - text transformer is done by using Method", "identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations .", "therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer .", "what makes this problem challenging is the diversity of product types and their attributes and values .", "existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes .", "this has limited their scalability and generalization for large scale real world e - commerce applications .", "previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task ."], "relation": "used for", "id": "2021.ecnlp-1.2", "year": 2021, "rel_sent": "We leverage the large - scale pretraining of the GPT-2 and the T5 text - to - text transformer to create fine - tuned models that can effectively perform this task .", "forward": false, "src_ids": "2021.ecnlp-1.2_1569"}
{"input": "cross - cultural similarities is done by using OtherScientificTerm| context: much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages . we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis .", "entity": "cross - cultural similarities", "output": "pragmatic features", "neg_sample": ["cross - cultural similarities is done by using OtherScientificTerm", "much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages .", "we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis ."], "relation": "used for", "id": "2021.eacl-main.204", "year": 2021, "rel_sent": "Our analyses show that the proposed pragmatic features do capture cross - cultural similarities and align well with existing work in sociolinguistics and linguistic anthropology .", "forward": false, "src_ids": "2021.eacl-main.204_10484"}
{"input": "intent classification and slot filling tasks is done by using Method| context: few - shot learning arises in important practical scenarios , such as when a natural language understanding system needs to learn new semantic labels for an emerging , resource - scarce domain .", "entity": "intent classification and slot filling tasks", "output": "retrieval - based methods", "neg_sample": ["intent classification and slot filling tasks is done by using Method", "few - shot learning arises in important practical scenarios , such as when a natural language understanding system needs to learn new semantic labels for an emerging , resource - scarce domain ."], "relation": "used for", "id": "2021.naacl-main.59", "year": 2021, "rel_sent": "In this paper , we explore retrieval - based methods for intent classification and slot filling tasks in few - shot settings .", "forward": false, "src_ids": "2021.naacl-main.59_9024"}
{"input": "multilingual grapheme - to - phoneme conversion is done by using Task| context: grapheme - to - phoneme conversion is an important component in many speech technologies , but until recently there were no multilingual benchmarks for this task .", "entity": "multilingual grapheme - to - phoneme conversion", "output": "sigmorphon shared task", "neg_sample": ["multilingual grapheme - to - phoneme conversion is done by using Task", "grapheme - to - phoneme conversion is an important component in many speech technologies , but until recently there were no multilingual benchmarks for this task ."], "relation": "used for", "id": "2021.sigmorphon-1.13", "year": 2021, "rel_sent": "Results of the Second SIGMORPHON Shared Task on Multilingual Grapheme - to - Phoneme Conversion.", "forward": false, "src_ids": "2021.sigmorphon-1.13_8002"}
{"input": "semantic processing is done by using Task| context: due to the increasing concerns for data privacy , source - free unsupervised domain adaptation attracts more and more research attention , where only a trained source model is assumed to be available , while the labeled source data remain private . to get promising adaptation results , we need tofind effective ways to transfer knowledge learned in source domain and leverage useful domain specific information from target domain at the same time .", "entity": "semantic processing", "output": "source - free domain adaptation", "neg_sample": ["semantic processing is done by using Task", "due to the increasing concerns for data privacy , source - free unsupervised domain adaptation attracts more and more research attention , where only a trained source model is assumed to be available , while the labeled source data remain private .", "to get promising adaptation results , we need tofind effective ways to transfer knowledge learned in source domain and leverage useful domain specific information from target domain at the same time ."], "relation": "used for", "id": "2021.semeval-1.183", "year": 2021, "rel_sent": "This paper describes our winning contribution to SemEval 2021 Task 10 : Source - Free Domain Adaptation for Semantic Processing .", "forward": false, "src_ids": "2021.semeval-1.183_1349"}
{"input": "multimodal data is used for Task| context: the additional modality is typically in the form of images . despite proven advantages , it is indeed difficult to develop an mmt system for various languages primarily due to the lack of a suitable multimodal dataset .", "entity": "multimodal data", "output": "translation", "neg_sample": ["multimodal data is used for Task", "the additional modality is typically in the form of images .", "despite proven advantages , it is indeed difficult to develop an mmt system for various languages primarily due to the lack of a suitable multimodal dataset ."], "relation": "used for", "id": "2021.mmtlrl-1.6", "year": 2021, "rel_sent": "Through a comparative study of the developed MMT system vis - a - vis a Text - to - text translation , we demonstrate that the use of multimodal data not only improves the translation performance improvement in BLEU score of +1.3 on the development set , +3.9 on the evaluation test , and +0.9 on the challenge test set but also helps to resolve ambiguities in the pure text description .", "forward": true, "src_ids": "2021.mmtlrl-1.6_14720"}
{"input": "explicit guidance on how to resolve conversational dependency is used for Method| context: one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis .", "entity": "explicit guidance on how to resolve conversational dependency", "output": "qa models", "neg_sample": ["explicit guidance on how to resolve conversational dependency is used for Method", "one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis ."], "relation": "used for", "id": "2021.acl-long.478", "year": 2021, "rel_sent": "In this paper , we propose a novel framework , ExCorD ( Explicit guidance on how to resolve Conversational Dependency ) to enhance the abilities of QA models in comprehending conversational context .", "forward": true, "src_ids": "2021.acl-long.478_660"}
{"input": "human judgments is done by using Metric| context: for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key . we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts .", "entity": "human judgments", "output": "embedding - based metrics", "neg_sample": ["human judgments is done by using Metric", "for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key .", "we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts ."], "relation": "used for", "id": "2021.newsum-1.9", "year": 2021, "rel_sent": "We crowdsourced annotations for opinion sentence pairs and our main findings are : ( 1 ) annotators tend to agree on whether or not opinion sentences are similar or different ; and ( 2 ) embedding - based metrics capture human judgments of ' opinion similarity ' but not ' opinion difference ' .", "forward": false, "src_ids": "2021.newsum-1.9_3667"}
{"input": "statistical biases is done by using Method| context: pre - trained language models have achieved human - level performance on many machine reading comprehension ( mrc ) tasks , but it remains unclear whether these models truly understand language or answer questions by exploiting statistical biases in datasets .", "entity": "statistical biases", "output": "augmented training method", "neg_sample": ["statistical biases is done by using Method", "pre - trained language models have achieved human - level performance on many machine reading comprehension ( mrc ) tasks , but it remains unclear whether these models truly understand language or answer questions by exploiting statistical biases in datasets ."], "relation": "used for", "id": "2021.acl-short.43", "year": 2021, "rel_sent": "Finally , we propose an augmented training method that can greatly reduce models ' statistical biases .", "forward": false, "src_ids": "2021.acl-short.43_4404"}
{"input": "covr is done by using Method| context: while interest in models that generalize at test time to new compositions has risen in recent years , benchmarks in the visually - grounded domain have thus far been restricted to synthetic images .", "entity": "covr", "output": "automatic generation process", "neg_sample": ["covr is done by using Method", "while interest in models that generalize at test time to new compositions has risen in recent years , benchmarks in the visually - grounded domain have thus far been restricted to synthetic images ."], "relation": "used for", "id": "2021.emnlp-main.774", "year": 2021, "rel_sent": "Due to the automatic generation process , COVR facilitates the creation of compositional splits , where models at test time need to generalize to new concepts and compositions in a zero- or few - shot setting .", "forward": false, "src_ids": "2021.emnlp-main.774_10197"}
{"input": "code completion and bug fixing tasks is done by using Method| context: source code processing heavily relies on the methods widely used in natural language processing ( nlp ) , but involves specifics that need to be taken into account to achieve higher quality . an example of this specificity is that the semantics of a variable is defined not only by its name but also by the contexts in which the variable occurs .", "entity": "code completion and bug fixing tasks", "output": "recurrent neural network", "neg_sample": ["code completion and bug fixing tasks is done by using Method", "source code processing heavily relies on the methods widely used in natural language processing ( nlp ) , but involves specifics that need to be taken into account to achieve higher quality .", "an example of this specificity is that the semantics of a variable is defined not only by its name but also by the contexts in which the variable occurs ."], "relation": "used for", "id": "2021.naacl-main.213", "year": 2021, "rel_sent": "We show that using the proposed dynamic embeddings significantly improves the performance of the recurrent neural network , in code completion and bug fixing tasks .", "forward": false, "src_ids": "2021.naacl-main.213_1656"}
{"input": "zero - shot korean ner models is done by using Method| context: this paper presents a english - korean parallel dataset that collects 381 k news articles where 1,400 of them , comprising 10 k sentences , are manually labeled for crosslingual named entity recognition ( ner ) .", "entity": "zero - shot korean ner models", "output": "crosslingual learning approaches", "neg_sample": ["zero - shot korean ner models is done by using Method", "this paper presents a english - korean parallel dataset that collects 381 k news articles where 1,400 of them , comprising 10 k sentences , are manually labeled for crosslingual named entity recognition ( ner ) ."], "relation": "used for", "id": "2021.mrl-1.19", "year": 2021, "rel_sent": "Three types of crosslingual learning approaches , direct model transfer , embedding projection , and annotation projection , are used to develop zero - shot Korean NER models .", "forward": false, "src_ids": "2021.mrl-1.19_13196"}
{"input": "domain - specific knowledge is used for Task| context: the literature shows that the use of articles as input features helps improve the classification performance .", "entity": "domain - specific knowledge", "output": "judgment prediction", "neg_sample": ["domain - specific knowledge is used for Task", "the literature shows that the use of articles as input features helps improve the classification performance ."], "relation": "used for", "id": "2021.ranlp-1.139", "year": 2021, "rel_sent": "Exploiting Domain - Specific Knowledge for Judgment Prediction Is No Panacea.", "forward": true, "src_ids": "2021.ranlp-1.139_16163"}
{"input": "open - domain question - answering is used for Material| context: as with other emergent domains , the discussion surrounding the topic has been rapidly changing , leading to the spread of misinformation . this has created the need for a public space for users to ask questions and receive credible , scientific answers .", "entity": "open - domain question - answering", "output": "covid-19", "neg_sample": ["open - domain question - answering is used for Material", "as with other emergent domains , the discussion surrounding the topic has been rapidly changing , leading to the spread of misinformation .", "this has created the need for a public space for users to ask questions and receive credible , scientific answers ."], "relation": "used for", "id": "2021.emnlp-demo.30", "year": 2021, "rel_sent": "Open - Domain Question - Answering for COVID-19 and Other Emergent Domains.", "forward": true, "src_ids": "2021.emnlp-demo.30_3392"}
{"input": "claim and premise tokens is done by using Method| context: argument mining is often addressed by a pipeline method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task .", "entity": "claim and premise tokens", "output": "token - level classification", "neg_sample": ["claim and premise tokens is done by using Method", "argument mining is often addressed by a pipeline method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task ."], "relation": "used for", "id": "2021.bea-1.22", "year": 2021, "rel_sent": "In this research , we apply a token - level classification to identify claim and premise tokens from a new corpus of argumentative essays written by middle school students .", "forward": false, "src_ids": "2021.bea-1.22_11883"}
{"input": "similar representations is used for OtherScientificTerm| context: with the modeling of bidirectional contexts , recently prevalent language modeling approaches such as xlm achieve better performance than traditional methods based on embedding alignment , which strives to assign similar vector representations to semantic - equivalent units . however , such approaches like xlm capture cross - lingual information based solely on shared bpe vocabulary , resulting in the absence of fine - grained supervision induced by embedding alignment .", "entity": "similar representations", "output": "finegrained and explicit cross - lingual information", "neg_sample": ["similar representations is used for OtherScientificTerm", "with the modeling of bidirectional contexts , recently prevalent language modeling approaches such as xlm achieve better performance than traditional methods based on embedding alignment , which strives to assign similar vector representations to semantic - equivalent units .", "however , such approaches like xlm capture cross - lingual information based solely on shared bpe vocabulary , resulting in the absence of fine - grained supervision induced by embedding alignment ."], "relation": "used for", "id": "2021.findings-acl.149", "year": 2021, "rel_sent": "While predicting the masked words based on bidirectional contexts , the proposal also encodes semantic equivalents from different languages into similar representations to introduce more finegrained and explicit cross - lingual information .", "forward": true, "src_ids": "2021.findings-acl.149_11029"}
{"input": "breakdown detection task is used for Task| context: however , the field suffers from a paucity of available negotiation corpora , which hinders further development and makes it difficult to test new methodologies in novel negotiation settings .", "entity": "breakdown detection task", "output": "human - human negotiation support", "neg_sample": ["breakdown detection task is used for Task", "however , the field suffers from a paucity of available negotiation corpora , which hinders further development and makes it difficult to test new methodologies in novel negotiation settings ."], "relation": "used for", "id": "2021.eacl-main.63", "year": 2021, "rel_sent": "We test the proposed corpus using a breakdown detection task for human - human negotiation support .", "forward": true, "src_ids": "2021.eacl-main.63_1037"}
{"input": "icd coding is done by using Method| context: automatic icd coding is the task of assigning codes from the international classification of diseases ( icd ) to medical notes . these codes describe the state of the patient and have multiple applications , e.g. , computer - assisted diagnosis or epidemiological studies . icd coding is a challenging task due to the complexity and length of medical notes . unlike the general trend in language processing , no transformer model has been reported to reach high performance on this task .", "entity": "icd coding", "output": "bert - based models", "neg_sample": ["icd coding is done by using Method", "automatic icd coding is the task of assigning codes from the international classification of diseases ( icd ) to medical notes .", "these codes describe the state of the patient and have multiple applications , e.g.", ", computer - assisted diagnosis or epidemiological studies .", "icd coding is a challenging task due to the complexity and length of medical notes .", "unlike the general trend in language processing , no transformer model has been reported to reach high performance on this task ."], "relation": "used for", "id": "2021.bionlp-1.6", "year": 2021, "rel_sent": "We find that the difficulty of fine - tuning the model on long pieces of text is the main limitation for BERT - based models on ICD coding .", "forward": false, "src_ids": "2021.bionlp-1.6_9680"}
{"input": "text generation is done by using Task| context: the adoption of natural language generation ( nlg ) models can leave individuals vulnerable to the generation of harmful information memorized by the models , such as conspiracy theories . while previous studies examine conspiracy theories in the context of social media , they have not evaluated their presence in the new space of generative language models .", "entity": "text generation", "output": "memorization and elicitation of conspiracy theories", "neg_sample": ["text generation is done by using Task", "the adoption of natural language generation ( nlg ) models can leave individuals vulnerable to the generation of harmful information memorized by the models , such as conspiracy theories .", "while previous studies examine conspiracy theories in the context of social media , they have not evaluated their presence in the new space of generative language models ."], "relation": "used for", "id": "2021.findings-acl.416", "year": 2021, "rel_sent": "Investigating Memorization of Conspiracy Theories in Text Generation.", "forward": false, "src_ids": "2021.findings-acl.416_3267"}
{"input": "language models is used for Material| context: i participated in the wmt shared news translation task and focus on one high resource language pair : english and chinese ( two directions , chinese to english and english to chinese ) . the submitted systems ( zenghuimt ) focus on data cleaning , data selection , back translation and model ensemble .", "entity": "language models", "output": "monolingual data", "neg_sample": ["language models is used for Material", "i participated in the wmt shared news translation task and focus on one high resource language pair : english and chinese ( two directions , chinese to english and english to chinese ) .", "the submitted systems ( zenghuimt ) focus on data cleaning , data selection , back translation and model ensemble ."], "relation": "used for", "id": "2021.wmt-1.24", "year": 2021, "rel_sent": "I used a base translation model trained on initial corpus to obtain the target versions of the WMT21 test sets , then I used language models tofind out the monolingual data that is most similar to the target version of test set , such monolingual data was then used to do back translation .", "forward": true, "src_ids": "2021.wmt-1.24_13026"}
{"input": "translator - based transfer - learning strategy is used for Method| context: we investigate how to solve the cross - corpus news recommendation for unseen users in the future . this is a problem where traditional content - based recommendation techniques often fail . luckily , in real - world recommendation services , some publisher ( e.g. , daily news ) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher ( e.g. , political news ) .", "entity": "translator - based transfer - learning strategy", "output": "representation mapping", "neg_sample": ["translator - based transfer - learning strategy is used for Method", "we investigate how to solve the cross - corpus news recommendation for unseen users in the future .", "this is a problem where traditional content - based recommendation techniques often fail .", "luckily , in real - world recommendation services , some publisher ( e.g.", ", daily news ) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher ( e.g.", ", political news ) ."], "relation": "used for", "id": "2021.eacl-main.62", "year": 2021, "rel_sent": "To tackle the heterogeneity of different user interests and of different word distributions across corpora , we design a translator - based transfer - learning strategy to learn a representation mapping between source and target corpora .", "forward": true, "src_ids": "2021.eacl-main.62_8987"}
{"input": "pseudo translation based data augmentation is used for Method| context: recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model . this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls . however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts .", "entity": "pseudo translation based data augmentation", "output": "lms", "neg_sample": ["pseudo translation based data augmentation is used for Method", "recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model .", "this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls .", "however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts ."], "relation": "used for", "id": "2021.acl-long.105", "year": 2021, "rel_sent": "Experiments on multiple real - world benchmark datasets provide validation to our hypothesis that using a related language as pivot , along with transliteration and pseudo translation based data augmentation , can be an effective way to adapt LMs for LRLs , rather than direct training or pivoting through English .", "forward": true, "src_ids": "2021.acl-long.105_9076"}
{"input": "automatic algorithm is used for OtherScientificTerm| context: the acquisition of a dialogue corpus is a key step in the process of training a dialogue model . in this context , corpora acquisitions have been designed either for open - domain information retrieval or slot - filling ( e.g. restaurant booking ) tasks . however , there has been scarce research in the problem of collecting personal conversations with users over a long period of time .", "entity": "automatic algorithm", "output": "textual stimuli", "neg_sample": ["automatic algorithm is used for OtherScientificTerm", "the acquisition of a dialogue corpus is a key step in the process of training a dialogue model .", "in this context , corpora acquisitions have been designed either for open - domain information retrieval or slot - filling ( e.g.", "restaurant booking ) tasks .", "however , there has been scarce research in the problem of collecting personal conversations with users over a long period of time ."], "relation": "used for", "id": "2021.nlpmc-1.1", "year": 2021, "rel_sent": "We propose an automatic algorithm that generates textual stimuli from personal narratives collected during psychotherapy interventions .", "forward": true, "src_ids": "2021.nlpmc-1.1_6760"}
{"input": "human - like character embeddings is done by using Task| context: in contrast to their word- or sentence - level counterparts , character embeddings are still poorly understood .", "entity": "human - like character embeddings", "output": "grapheme - to - phoneme conversion", "neg_sample": ["human - like character embeddings is done by using Task", "in contrast to their word- or sentence - level counterparts , character embeddings are still poorly understood ."], "relation": "used for", "id": "2021.eacl-main.230", "year": 2021, "rel_sent": "Comparing across tasks , grapheme - to - phoneme conversion results in the most human - like character embeddings .", "forward": false, "src_ids": "2021.eacl-main.230_12677"}
{"input": "paraphrased sentence is done by using Method| context: paraphrase generation is an important and challenging nlg problem .", "entity": "paraphrased sentence", "output": "decoder", "neg_sample": ["paraphrased sentence is done by using Method", "paraphrase generation is an important and challenging nlg problem ."], "relation": "used for", "id": "2021.findings-acl.50", "year": 2021, "rel_sent": "In the aggregation step , these groups are separately encoded , before being aggregated by a custom designed decoder , which autoregressively generates the paraphrased sentence .", "forward": false, "src_ids": "2021.findings-acl.50_600"}
{"input": "feature importance analysis is used for Method| context: traditional hand - crafted linguistically - informed features have often been used for distinguishing between translated and original non - translated texts . by contrast , to date , neural architectures without manual feature engineering have been less explored for this task .", "entity": "feature importance analysis", "output": "neural and classical architectures", "neg_sample": ["feature importance analysis is used for Method", "traditional hand - crafted linguistically - informed features have often been used for distinguishing between translated and original non - translated texts .", "by contrast , to date , neural architectures without manual feature engineering have been less explored for this task ."], "relation": "used for", "id": "2021.emnlp-main.676", "year": 2021, "rel_sent": "We show that ( i ) neural architectures outperform other approaches by more than 20 accuracy points , with the BERT - based model performing the best in both the monolingual and multilingual settings ; ( ii ) while many individual hand - crafted translationese features correlate with neural model predictions , feature importance analysis shows that the most important features for neural and classical architectures differ ; and ( iii ) our multilingual experiments provide empirical evidence for translationese universals across languages .", "forward": true, "src_ids": "2021.emnlp-main.676_4038"}
{"input": "discrete wavelet transform ( dwt ) is used for Material| context: the masking - based speech enhancement method pursues a multiplicative mask that applies to the spectrogram of input noise - corrupted utterance , and a deep neural network ( dnn ) is often used to learn the mask . in particular , the features commonly used for automatic speech recognition can serve as the input of the dnn to learn the well - behaved mask that significantly reduce the noise distortion of processed utterances .", "entity": "discrete wavelet transform ( dwt )", "output": "temporal speech feature sequence", "neg_sample": ["discrete wavelet transform ( dwt ) is used for Material", "the masking - based speech enhancement method pursues a multiplicative mask that applies to the spectrogram of input noise - corrupted utterance , and a deep neural network ( dnn ) is often used to learn the mask .", "in particular , the features commonly used for automatic speech recognition can serve as the input of the dnn to learn the well - behaved mask that significantly reduce the noise distortion of processed utterances ."], "relation": "used for", "id": "2021.ijclclp-2.3", "year": 2021, "rel_sent": "In particular , we employ the discrete wavelet transform ( DWT ) to decompose the temporal speech feature sequence and scale down the detail coefficients , which correspond to the high - pass portion of the sequence .", "forward": true, "src_ids": "2021.ijclclp-2.3_5956"}
{"input": "data augmentation pipeline is used for Task| context: as a result , a vqa model trained solely on human - annotated examples could easily over - fit specific question styles or image contents that are being asked , leaving the model largely ignorant about the sheer diversity of questions . we found that many of the ' unknowns ' to the learned vqa model are indeed ' known ' in the dataset implicitly .", "entity": "data augmentation pipeline", "output": "visual question answering", "neg_sample": ["data augmentation pipeline is used for Task", "as a result , a vqa model trained solely on human - annotated examples could easily over - fit specific question styles or image contents that are being asked , leaving the model largely ignorant about the sheer diversity of questions .", "we found that many of the ' unknowns ' to the learned vqa model are indeed ' known ' in the dataset implicitly ."], "relation": "used for", "id": "2021.emnlp-main.512", "year": 2021, "rel_sent": "Building upon these insights , we present a simple data augmentation pipeline SimpleAug to turn this ' known ' knowledge into training examples for VQA .", "forward": true, "src_ids": "2021.emnlp-main.512_8144"}
{"input": "neural machine translation is done by using Method| context: a good translation should not only translate the original content semantically , but also incarnate personal traits of the original text . for a real - world neural machine translation ( nmt ) system , these user traits ( e.g. , topic preference , stylistic characteristics and expression habits ) can be preserved in user behavior ( e.g. , historical inputs ) . however , current nmt systems marginally consider the user behavior due to : 1 ) the difficulty of modeling user portraits in zero - shot scenarios , and 2 ) the lack of user - behavior annotated parallel dataset .", "entity": "neural machine translation", "output": "cache - based module", "neg_sample": ["neural machine translation is done by using Method", "a good translation should not only translate the original content semantically , but also incarnate personal traits of the original text .", "for a real - world neural machine translation ( nmt ) system , these user traits ( e.g.", ", topic preference , stylistic characteristics and expression habits ) can be preserved in user behavior ( e.g.", ", historical inputs ) .", "however , current nmt systems marginally consider the user behavior due to : 1 ) the difficulty of modeling user portraits in zero - shot scenarios , and 2 ) the lack of user - behavior annotated parallel dataset ."], "relation": "used for", "id": "2021.acl-long.310", "year": 2021, "rel_sent": "Specifically , a cache - based module and a user - driven contrastive learning method are proposed to offer NMT the ability to capture potential user traits from their historical inputs under a zero - shot learning fashion .", "forward": false, "src_ids": "2021.acl-long.310_8198"}
{"input": "random walks is used for OtherScientificTerm| context: knowledge graphs ( kgs ) are widely used to store and access information about entities and their relationships . given a query , the task of entity retrieval from a kg aims at presenting a ranked list of entities relevant to the query . lately , an increasing number of models for entity retrieval have shown a significant improvement over traditional methods . these models , however , were developed for english kgs .", "entity": "random walks", "output": "entity embeddings", "neg_sample": ["random walks is used for OtherScientificTerm", "knowledge graphs ( kgs ) are widely used to store and access information about entities and their relationships .", "given a query , the task of entity retrieval from a kg aims at presenting a ranked list of entities relevant to the query .", "lately , an increasing number of models for entity retrieval have shown a significant improvement over traditional methods .", "these models , however , were developed for english kgs ."], "relation": "used for", "id": "2021.wanlp-1.24", "year": 2021, "rel_sent": "Like KEWER , SERAG uses random walks to generate entity embeddings .", "forward": true, "src_ids": "2021.wanlp-1.24_9324"}
{"input": "temporal commonsense reasoning is done by using Task| context: temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text .", "entity": "temporal commonsense reasoning", "output": "temporal masked language model task", "neg_sample": ["temporal commonsense reasoning is done by using Task", "temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text ."], "relation": "used for", "id": "2021.ranlp-srw.12", "year": 2021, "rel_sent": "Our model relies on pre - trained contextual representations from transformer - based language models ( i.e. , BERT ) , and on a variety of training methods for enhancing model generalization : 1 ) multi - step fine - tuning using carefully selected auxiliary tasks and datasets , and 2 ) a specifically designed temporal masked language model task aimed to capture temporal commonsense knowledge .", "forward": false, "src_ids": "2021.ranlp-srw.12_2760"}
{"input": "safeguards is used for OtherScientificTerm| context: a growing amount of psychiatric research incorporates machine learning and natural language processing methods , however findings have yet to be translated into actual clinical decision support systems . many of these studies are based on relatively small datasets in homogeneous populations , which has the associated risk that the models may not perform adequately on new data in real clinical practice . the nature of serious mental illness is that it is hard to define , hard to capture , and requires frequent monitoring , which leads to imperfect data where attribute and class noise are common .", "entity": "safeguards", "output": "spurious predictions", "neg_sample": ["safeguards is used for OtherScientificTerm", "a growing amount of psychiatric research incorporates machine learning and natural language processing methods , however findings have yet to be translated into actual clinical decision support systems .", "many of these studies are based on relatively small datasets in homogeneous populations , which has the associated risk that the models may not perform adequately on new data in real clinical practice .", "the nature of serious mental illness is that it is hard to define , hard to capture , and requires frequent monitoring , which leads to imperfect data where attribute and class noise are common ."], "relation": "used for", "id": "2021.clpsych-1.20", "year": 2021, "rel_sent": "With the integration of human - in - the - loop machine learning in the clinical implementation process , incorporating safeguards such as these into the models will offer patients increased protection from spurious predictions .", "forward": true, "src_ids": "2021.clpsych-1.20_32"}
{"input": "representation learning of the words is done by using OtherScientificTerm| context: a missing part in the current deep learning models for finetemprel is their failure to exploit the syntactic structures of the input sentences to enrich the representation vectors .", "entity": "representation learning of the words", "output": "syntax - based importance scores", "neg_sample": ["representation learning of the words is done by using OtherScientificTerm", "a missing part in the current deep learning models for finetemprel is their failure to exploit the syntactic structures of the input sentences to enrich the representation vectors ."], "relation": "used for", "id": "2021.wnut-1.5", "year": 2021, "rel_sent": "The proposed model focuses on two types of syntactic information from the dependency trees , i.e. , the syntax - based importance scores for representation learning of the words and the syntactic connections to identify important context words for the event mentions .", "forward": false, "src_ids": "2021.wnut-1.5_3372"}
{"input": "french fralbert is done by using Method| context: for many tasks , state - of - the - art results have been achieved with transformer - based architectures , resulting in a paradigmatic shift in practices from the use of task - specific architectures to the fine - tuning of pre - trained language models . the ongoing trend consists in training models with an ever - increasing amount of data and parameters , which requires considerable resources . it leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for english . this raises questions about their usability when applied to small - scale learning problems , for which a limited amount of training data is available , especially for under - resourced languages tasks . the lack of appropriately sized corpora is a hindrance to applying data - driven and transfer learning - based approaches with strong instability cases .", "entity": "french fralbert", "output": "compact model", "neg_sample": ["french fralbert is done by using Method", "for many tasks , state - of - the - art results have been achieved with transformer - based architectures , resulting in a paradigmatic shift in practices from the use of task - specific architectures to the fine - tuning of pre - trained language models .", "the ongoing trend consists in training models with an ever - increasing amount of data and parameters , which requires considerable resources .", "it leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for english .", "this raises questions about their usability when applied to small - scale learning problems , for which a limited amount of training data is available , especially for under - resourced languages tasks .", "the lack of appropriately sized corpora is a hindrance to applying data - driven and transfer learning - based approaches with strong instability cases ."], "relation": "used for", "id": "2021.ranlp-1.29", "year": 2021, "rel_sent": "We also introduce a new compact model for French FrALBERT which proves to be competitive in low - resource settings .", "forward": false, "src_ids": "2021.ranlp-1.29_5803"}
{"input": "neural tagging models is used for Task| context: though the proposed annotation scheme is conceptually promising , the feasibility is only examined in four indo - european languages .", "entity": "neural tagging models", "output": "semantic tagging", "neg_sample": ["neural tagging models is used for Task", "though the proposed annotation scheme is conceptually promising , the feasibility is only examined in four indo - european languages ."], "relation": "used for", "id": "2021.naacl-main.440", "year": 2021, "rel_sent": "By means of the new annotations , we also evaluate a series of neural tagging models to gauge how successful semantic tagging can be : accuracies of 92.7 % and 94.6 % are obtained for Chinese and English respectively .", "forward": true, "src_ids": "2021.naacl-main.440_5402"}
{"input": "vector representations is done by using Method| context: automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions . it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions .", "entity": "vector representations", "output": "relation network", "neg_sample": ["vector representations is done by using Method", "automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions .", "it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions ."], "relation": "used for", "id": "2021.emnlp-main.487", "year": 2021, "rel_sent": "A relation network learns vector representations for the elements of QRA triples , then combines the learned representations using learned semantic feature - wise transformations .", "forward": false, "src_ids": "2021.emnlp-main.487_13264"}
{"input": "rewards is used for Task| context: the growth of online consumer health questions has led to the necessity for reliable and accurate question answering systems . a recent study showed that manual summarization of consumer health questions brings significant improvement in retrieving relevant answers . however , the automatic summarization of long questions is a challenging task due to the lack of training data and the complexity of the related subtasks , such as the question focus and type recognition .", "entity": "rewards", "output": "generation of semantically valid questions", "neg_sample": ["rewards is used for Task", "the growth of online consumer health questions has led to the necessity for reliable and accurate question answering systems .", "a recent study showed that manual summarization of consumer health questions brings significant improvement in retrieving relevant answers .", "however , the automatic summarization of long questions is a challenging task due to the lack of training data and the complexity of the related subtasks , such as the question focus and type recognition ."], "relation": "used for", "id": "2021.acl-short.33", "year": 2021, "rel_sent": "These rewards ensure the generation of semantically valid questions and encourage the inclusion of key medical entities / foci in the question summary .", "forward": true, "src_ids": "2021.acl-short.33_6882"}
{"input": "background knowledge is done by using OtherScientificTerm| context: however , this problem is less studied in open - domain dialogue .", "entity": "background knowledge", "output": "dialog structure", "neg_sample": ["background knowledge is done by using OtherScientificTerm", "however , this problem is less studied in open - domain dialogue ."], "relation": "used for", "id": "2021.acl-long.136", "year": 2021, "rel_sent": "Experimental results on two benchmark corpora confirm that DVAE - GNN can discover meaningful dialog structure graph , and the use of dialog structure as background knowledge can significantly improve multi - turn coherence .", "forward": false, "src_ids": "2021.acl-long.136_4619"}
{"input": "logistic regression classifier is done by using OtherScientificTerm| context: automatic detection of stylistic devices is an important tool for literary studies , e.g. , for stylometric analysis or argument mining . a particularly striking device is the rhetorical figure called chiasmus , which involves the inversion of semantically or syntactically related words . existing works focus on a special case of chiasmi that involve identical words in an a b b a pattern , so - called antimetaboles .", "entity": "logistic regression classifier", "output": "features", "neg_sample": ["logistic regression classifier is done by using OtherScientificTerm", "automatic detection of stylistic devices is an important tool for literary studies , e.g.", ", for stylometric analysis or argument mining .", "a particularly striking device is the rhetorical figure called chiasmus , which involves the inversion of semantically or syntactically related words .", "existing works focus on a special case of chiasmi that involve identical words in an a b b a pattern , so - called antimetaboles ."], "relation": "used for", "id": "2021.latechclfl-1.11", "year": 2021, "rel_sent": "These features serve as input for a logistic regression classifier , which learns to distinguish between rhetorical chiasmi and coincidental chiastic word orders without special meaning .", "forward": false, "src_ids": "2021.latechclfl-1.11_1901"}
{"input": "object bounding box annotations is used for Method| context: methodologies for training visual question answering ( vqa ) models assume the availability of datasets with human - annotated imagequestion - answer ( i - q - a ) triplets . this has led to heavy reliance on datasets and a lack of generalization to new types of questions and scenes .", "entity": "object bounding box annotations", "output": "vqa models", "neg_sample": ["object bounding box annotations is used for Method", "methodologies for training visual question answering ( vqa ) models assume the availability of datasets with human - annotated imagequestion - answer ( i - q - a ) triplets .", "this has led to heavy reliance on datasets and a lack of generalization to new types of questions and scenes ."], "relation": "used for", "id": "2021.findings-acl.302", "year": 2021, "rel_sent": "Additionally , we demonstrate the efficacy of spatial - pyramid image patches as a simple but effective alternative to dense and costly object bounding box annotations used in existing VQA models .", "forward": true, "src_ids": "2021.findings-acl.302_1642"}
{"input": "vector representation is done by using Method| context: humour detection is an interesting but difficult task in nlp . because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand .", "entity": "vector representation", "output": "pre - trained models", "neg_sample": ["vector representation is done by using Method", "humour detection is an interesting but difficult task in nlp .", "because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand ."], "relation": "used for", "id": "2021.semeval-1.166", "year": 2021, "rel_sent": "Models like Logistic Regression , LSTM , MLP , CNN were used , and pre - trained models like DistilBert were introduced to generate accurate vector representation for textual data .", "forward": false, "src_ids": "2021.semeval-1.166_14856"}
{"input": "constituency parsing is done by using OtherScientificTerm| context: incorporating syntax into neural approaches in nlp has a multitude of practical and scientific benefits . for instance , a language model that is syntax - aware is likely to be able to produce better samples ; even a discriminative model like bert with a syntax module could be used for core nlp tasks like unsupervised syntactic parsing . rapid progress in recent years was arguably spurred on by the empirical success of the parsing - reading - predict architecture of ( shen et al . , 2018a ) , later simplified by the order neuron lstm of ( shen et al . , 2019 ) . most notably , this is the first time neural approaches were able to successfully perform unsupervised syntactic parsing ( evaluated by various metrics like f-1 score ) . however , even heuristic ( much less fully mathematical ) understanding of why and when these architectures work is lagging severely behind .", "entity": "constituency parsing", "output": "limited context", "neg_sample": ["constituency parsing is done by using OtherScientificTerm", "incorporating syntax into neural approaches in nlp has a multitude of practical and scientific benefits .", "for instance , a language model that is syntax - aware is likely to be able to produce better samples ; even a discriminative model like bert with a syntax module could be used for core nlp tasks like unsupervised syntactic parsing .", "rapid progress in recent years was arguably spurred on by the empirical success of the parsing - reading - predict architecture of ( shen et al .", ", 2018a ) , later simplified by the order neuron lstm of ( shen et al .", ", 2019 ) .", "most notably , this is the first time neural approaches were able to successfully perform unsupervised syntactic parsing ( evaluated by various metrics like f-1 score ) .", "however , even heuristic ( much less fully mathematical ) understanding of why and when these architectures work is lagging severely behind ."], "relation": "used for", "id": "2021.acl-long.208", "year": 2021, "rel_sent": "The Limitations of Limited Context for Constituency Parsing.", "forward": false, "src_ids": "2021.acl-long.208_7662"}
{"input": "chinese dataset is used for Task| context: ca nt is important for understanding advertising , comedies and dog - whistle politics . however , computational research on ca nt is hindered by a lack of available datasets .", "entity": "chinese dataset", "output": "ca nt understanding", "neg_sample": ["chinese dataset is used for Task", "ca nt is important for understanding advertising , comedies and dog - whistle politics .", "however , computational research on ca nt is hindered by a lack of available datasets ."], "relation": "used for", "id": "2021.naacl-main.172", "year": 2021, "rel_sent": "Blow the Dog Whistle : A Chinese Dataset for Ca nt Understanding with Common Sense and World Knowledge.", "forward": true, "src_ids": "2021.naacl-main.172_5612"}
{"input": "fine - tuning classification models is done by using Material| context: supervised models can achieve very high accuracy for fine - grained text classification . in practice , however , training data may be abundant for some types but scarce or even non - existent for others .", "entity": "fine - tuning classification models", "output": "labeled data", "neg_sample": ["fine - tuning classification models is done by using Material", "supervised models can achieve very high accuracy for fine - grained text classification .", "in practice , however , training data may be abundant for some types but scarce or even non - existent for others ."], "relation": "used for", "id": "2021.case-1.24", "year": 2021, "rel_sent": "We propose a hybrid architecture that uses as much labeled data as available for fine - tuning classification models , while also allowing for types with little ( few - shot ) or no ( zero - shot ) labeled data .", "forward": false, "src_ids": "2021.case-1.24_13867"}
{"input": "counterfactual inference is used for Task| context: today 's text classifiers inevitably suffer from unintended dataset biases , especially the document - level label bias and word - level keyword bias , which may hurt models ' generalization . many previous studies employed data - level manipulations or model - level balancing mechanisms to recover unbiased distributions and thus prevent models from capturing the two types of biases . unfortunately , they either suffer from the extra cost of data collection / selection / annotation or need an elaborate design of balancing strategies . different from traditional factual inference in which debiasing occurs before or during training , counterfactual inference mitigates the influence brought by unintended confounders after training , which can make unbiased decisions with biased observations .", "entity": "counterfactual inference", "output": "text classification debiasing", "neg_sample": ["counterfactual inference is used for Task", "today 's text classifiers inevitably suffer from unintended dataset biases , especially the document - level label bias and word - level keyword bias , which may hurt models ' generalization .", "many previous studies employed data - level manipulations or model - level balancing mechanisms to recover unbiased distributions and thus prevent models from capturing the two types of biases .", "unfortunately , they either suffer from the extra cost of data collection / selection / annotation or need an elaborate design of balancing strategies .", "different from traditional factual inference in which debiasing occurs before or during training , counterfactual inference mitigates the influence brought by unintended confounders after training , which can make unbiased decisions with biased observations ."], "relation": "used for", "id": "2021.acl-long.422", "year": 2021, "rel_sent": "Counterfactual Inference for Text Classification Debiasing.", "forward": true, "src_ids": "2021.acl-long.422_6899"}
{"input": "generalizable commonsense inference is done by using Method| context: however , there is a lack of understanding on their generalization to multiple ckgs , unseen relations , and novel entities .", "entity": "generalizable commonsense inference", "output": "language models", "neg_sample": ["generalizable commonsense inference is done by using Method", "however , there is a lack of understanding on their generalization to multiple ckgs , unseen relations , and novel entities ."], "relation": "used for", "id": "2021.findings-acl.322", "year": 2021, "rel_sent": "Do Language Models Perform Generalizable Commonsense Inference ?.", "forward": false, "src_ids": "2021.findings-acl.322_12233"}
{"input": "network - based tool is used for Task| context: scot represents the meanings of a word as clusters of similar words . it visualises their formation , change , and demise .", "entity": "network - based tool", "output": "analysis of lexical change", "neg_sample": ["network - based tool is used for Task", "scot represents the meanings of a word as clusters of similar words .", "it visualises their formation , change , and demise ."], "relation": "used for", "id": "2021.eacl-demos.23", "year": 2021, "rel_sent": "We present Sense Clustering over Time ( SCoT ) , a novel network - based tool for analysing lexical change .", "forward": true, "src_ids": "2021.eacl-demos.23_14152"}
{"input": "natural language understanding is done by using Method| context: fine - tuning large pre - trained models with task - specific data has achieved great success in nlp . however , it has been demonstrated that the majority of information within the self - attention networks is redundant and not utilized effectively during the fine - tuning stage . this leads to inferior results when generalizing the obtained models to out - of - domain distributions .", "entity": "natural language understanding", "output": "data augmentation", "neg_sample": ["natural language understanding is done by using Method", "fine - tuning large pre - trained models with task - specific data has achieved great success in nlp .", "however , it has been demonstrated that the majority of information within the self - attention networks is redundant and not utilized effectively during the fine - tuning stage .", "this leads to inferior results when generalizing the obtained models to out - of - domain distributions ."], "relation": "used for", "id": "2021.acl-long.338", "year": 2021, "rel_sent": "HiddenCut : Simple Data Augmentation for Natural Language Understanding with Better Generalizability.", "forward": false, "src_ids": "2021.acl-long.338_13798"}
{"input": "cross - domain text mining is done by using Method| context: transfer learning ( tl ) seeks to improve the learning of a data - scarce target domain by using information from source domains . however , the source and target domains usually have different data distributions , which may lead to negative transfer .", "entity": "cross - domain text mining", "output": "wasserstein selective transfer learning", "neg_sample": ["cross - domain text mining is done by using Method", "transfer learning ( tl ) seeks to improve the learning of a data - scarce target domain by using information from source domains .", "however , the source and target domains usually have different data distributions , which may lead to negative transfer ."], "relation": "used for", "id": "2021.emnlp-main.770", "year": 2021, "rel_sent": "Wasserstein Selective Transfer Learning for Cross - domain Text Mining.", "forward": false, "src_ids": "2021.emnlp-main.770_1926"}
{"input": "corpus level occurrence probability is used for OtherScientificTerm| context: large pretrained models have achieved great success in many natural language processing tasks . however , when they are applied in specific domains , these models suffer from domain shift and bring challenges in fine - tuning and online serving for latency and capacity constraints .", "entity": "corpus level occurrence probability", "output": "incremental vocabulary", "neg_sample": ["corpus level occurrence probability is used for OtherScientificTerm", "large pretrained models have achieved great success in many natural language processing tasks .", "however , when they are applied in specific domains , these models suffer from domain shift and bring challenges in fine - tuning and online serving for latency and capacity constraints ."], "relation": "used for", "id": "2021.findings-acl.40", "year": 2021, "rel_sent": "Specifically , we propose domain - specific vocabulary expansion in the adaptation stage and employ corpus level occurrence probability to choose the size of incremental vocabulary automatically .", "forward": true, "src_ids": "2021.findings-acl.40_13366"}
{"input": "graph - of - words is done by using OtherScientificTerm| context: recent , well - known graph - based approaches typically employ the knowledge from word vector representations during the ranking process via popular centrality measures ( e.g. , pagerank ) without giving the primary role to vectors ' distribution .", "entity": "graph - of - words", "output": "adjacency matrix", "neg_sample": ["graph - of - words is done by using OtherScientificTerm", "recent , well - known graph - based approaches typically employ the knowledge from word vector representations during the ranking process via popular centrality measures ( e.g.", ", pagerank ) without giving the primary role to vectors ' distribution ."], "relation": "used for", "id": "11.textgraphs-1.9", "year": 2021, "rel_sent": "We consider the adjacency matrix that corresponds to the graph - of - words of a target text document as the vector representation of its vocabulary .", "forward": false, "src_ids": "11.textgraphs-1.9_10648"}
{"input": "gradient imitation reinforcement learning method is used for Task| context: existing works either utilize self - training scheme to generate pseudo labels that will cause the gradual drift problem , or leverage meta - learning scheme which does not solicit feedback explicitly .", "entity": "gradient imitation reinforcement learning method", "output": "low - resource relation extraction", "neg_sample": ["gradient imitation reinforcement learning method is used for Task", "existing works either utilize self - training scheme to generate pseudo labels that will cause the gradual drift problem , or leverage meta - learning scheme which does not solicit feedback explicitly ."], "relation": "used for", "id": "2021.emnlp-main.216", "year": 2021, "rel_sent": "Gradient Imitation Reinforcement Learning for Low Resource Relation Extraction.", "forward": true, "src_ids": "2021.emnlp-main.216_5807"}
{"input": "social and linguistic information is used for Method| context: understanding the political perspective shaping the way events are discussed in the media is increasingly important due to the dramatic change in news distribution . with the advance in text classification models , the performance of political perspective detection is also improving rapidly . however , current deep learning based text models often require a large amount of supervised data for training , which can be very expensive to obtain for this task . meanwhile , models pre - trained on the general source and task ( e.g. bert ) lack the ability tofocus on bias - related text span .", "entity": "social and linguistic information", "output": "pretrained representations", "neg_sample": ["social and linguistic information is used for Method", "understanding the political perspective shaping the way events are discussed in the media is increasingly important due to the dramatic change in news distribution .", "with the advance in text classification models , the performance of political perspective detection is also improving rapidly .", "however , current deep learning based text models often require a large amount of supervised data for training , which can be very expensive to obtain for this task .", "meanwhile , models pre - trained on the general source and task ( e.g.", "bert ) lack the ability tofocus on bias - related text span ."], "relation": "used for", "id": "2021.findings-acl.401", "year": 2021, "rel_sent": "Using Social and Linguistic Information to Adapt Pretrained Representations for Political Perspective Identification.", "forward": true, "src_ids": "2021.findings-acl.401_1190"}
{"input": "aggregating passage - level clues is used for Task| context: recent qa with logical reasoning questions requires passage - level relations among the sentences . however , current approaches still focus on sentence - level relations interacting among tokens .", "entity": "aggregating passage - level clues", "output": "logical reasoning qa", "neg_sample": ["aggregating passage - level clues is used for Task", "recent qa with logical reasoning questions requires passage - level relations among the sentences .", "however , current approaches still focus on sentence - level relations interacting among tokens ."], "relation": "used for", "id": "2021.naacl-main.467", "year": 2021, "rel_sent": "In this work , we explore aggregating passage - level clues for solving logical reasoning QA by using discourse - based information .", "forward": true, "src_ids": "2021.naacl-main.467_8427"}
{"input": "endangered language documentation is done by using Material| context: documentation of endangered languages ( els ) has become increasingly urgent as thousands of languages are on the verge of disappearing by the end of the 21st century . one challenging aspect of documentation is to develop machine learning tools to automate the processing of el audio via automatic speech recognition ( asr ) , machine translation ( mt ) , or speech translation ( st ) .", "entity": "endangered language documentation", "output": "highland puebla nahuatl speech translation corpus", "neg_sample": ["endangered language documentation is done by using Material", "documentation of endangered languages ( els ) has become increasingly urgent as thousands of languages are on the verge of disappearing by the end of the 21st century .", "one challenging aspect of documentation is to develop machine learning tools to automate the processing of el audio via automatic speech recognition ( asr ) , machine translation ( mt ) , or speech translation ( st ) ."], "relation": "used for", "id": "2021.americasnlp-1.7", "year": 2021, "rel_sent": "Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation.", "forward": false, "src_ids": "2021.americasnlp-1.7_15496"}
{"input": "automatic scene segmentation is used for Material| context: a scene here is a segment of the text where time and discourse time are more or less equal , the narration focuses on one action and location and character constellations stay the same .", "entity": "automatic scene segmentation", "output": "narrative texts", "neg_sample": ["automatic scene segmentation is used for Material", "a scene here is a segment of the text where time and discourse time are more or less equal , the narration focuses on one action and location and character constellations stay the same ."], "relation": "used for", "id": "2021.eacl-main.276", "year": 2021, "rel_sent": "An automatic scene segmentation paves the way towards processing longer narrative texts like tales or novels by breaking them down into smaller , coherent and meaningful parts , which is an important stepping stone towards the reconstruction of plot in Computational Literary Studies but also can serve to improve tasks like coreference resolution .", "forward": true, "src_ids": "2021.eacl-main.276_8834"}
{"input": "online abuse is done by using Task| context: we present a data set consisting of german news articles labeled for political bias on a five - point scale in a semi - supervised way .", "entity": "online abuse", "output": "understanding political bias", "neg_sample": ["online abuse is done by using Task", "we present a data set consisting of german news articles labeled for political bias on a five - point scale in a semi - supervised way ."], "relation": "used for", "id": "2021.woah-1.13", "year": 2021, "rel_sent": "Understanding political bias helps in accurately detecting hate speech and online abuse .", "forward": false, "src_ids": "2021.woah-1.13_2651"}
{"input": "concatenated n - best asr alternatives is used for Method| context: spoken language understanding ( slu ) systems parse speech into semantic structures like dialog acts and slots . this involves the use of an automatic speech recognizer ( asr ) to transcribe speech into multiple text alternatives ( hypotheses ) . transcription errors , ordinary in asrs , impact downstream slu performance negatively . common approaches to mitigate such errors involve using richer information from the asr , either in form of n - best hypotheses or word - lattices .", "entity": "concatenated n - best asr alternatives", "output": "transformer encoder models", "neg_sample": ["concatenated n - best asr alternatives is used for Method", "spoken language understanding ( slu ) systems parse speech into semantic structures like dialog acts and slots .", "this involves the use of an automatic speech recognizer ( asr ) to transcribe speech into multiple text alternatives ( hypotheses ) .", "transcription errors , ordinary in asrs , impact downstream slu performance negatively .", "common approaches to mitigate such errors involve using richer information from the asr , either in form of n - best hypotheses or word - lattices ."], "relation": "used for", "id": "2021.acl-short.14", "year": 2021, "rel_sent": "In our work , we test our hypothesis by using the concatenated N - best ASR alternatives as the input to the transformer encoder models , namely BERT and XLM - RoBERTa , and achieve equivalent performance to the prior state - of - the - art model on DSTC2 dataset .", "forward": true, "src_ids": "2021.acl-short.14_8330"}
{"input": "multi - task loss function is used for Method| context: multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but alsofor catering to a broad range of downstream cross - lingual nlp tasks . state - of - the - art multilingual sentence embedding models require large parallel corpora to learn efficiently , which confines the scope of these models .", "entity": "multi - task loss function", "output": "dual encoder model", "neg_sample": ["multi - task loss function is used for Method", "multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but alsofor catering to a broad range of downstream cross - lingual nlp tasks .", "state - of - the - art multilingual sentence embedding models require large parallel corpora to learn efficiently , which confines the scope of these models ."], "relation": "used for", "id": "2021.emnlp-main.716", "year": 2021, "rel_sent": "We capture semantic similarity and relatedness between sentences using a multi - task loss function for training a dual encoder model mapping different languages onto the same vector space .", "forward": true, "src_ids": "2021.emnlp-main.716_2524"}
{"input": "sparse pruning technique is used for Method| context: transformer - based pre - trained language models have significantly improved the performance of various natural language processing ( nlp ) tasks in the recent years . while effective and prevalent , these models are usually prohibitively large for resource - limited deployment scenarios . a thread of research has thus been working on applying network pruning techniques under the pretrain - then - finetune paradigm widely adopted in nlp . however , the existing pruning results on benchmark transformers , such as bert , are not as remarkable as the pruning results in the literature of convolutional neural networks ( cnns ) . in particular , common wisdom in pruning cnn states that sparse pruning technique compresses a model more than that obtained by reducing number of channels and layers , while existing works on sparse pruning of bert yields inferior results than its small - dense counterparts such as tinybert .", "entity": "sparse pruning technique", "output": "bert model", "neg_sample": ["sparse pruning technique is used for Method", "transformer - based pre - trained language models have significantly improved the performance of various natural language processing ( nlp ) tasks in the recent years .", "while effective and prevalent , these models are usually prohibitively large for resource - limited deployment scenarios .", "a thread of research has thus been working on applying network pruning techniques under the pretrain - then - finetune paradigm widely adopted in nlp .", "however , the existing pruning results on benchmark transformers , such as bert , are not as remarkable as the pruning results in the literature of convolutional neural networks ( cnns ) .", "in particular , common wisdom in pruning cnn states that sparse pruning technique compresses a model more than that obtained by reducing number of channels and layers , while existing works on sparse pruning of bert yields inferior results than its small - dense counterparts such as tinybert ."], "relation": "used for", "id": "2021.naacl-main.188", "year": 2021, "rel_sent": "We show for the first time that sparse pruning compresses a BERT model significantly more than reducing its number of channels and layers .", "forward": true, "src_ids": "2021.naacl-main.188_12025"}
{"input": "interactive transcription is done by using Method| context: transcribing low resource languages can be challenging in the absence of a good lexicon and trained transcribers . accordingly , we seek a way to enable interactive transcription whereby the machine amplifies human efforts .", "entity": "interactive transcription", "output": "computational model", "neg_sample": ["interactive transcription is done by using Method", "transcribing low resource languages can be challenging in the absence of a good lexicon and trained transcribers .", "accordingly , we seek a way to enable interactive transcription whereby the machine amplifies human efforts ."], "relation": "used for", "id": "2021.dash-1.16", "year": 2021, "rel_sent": "A Computational Model for Interactive Transcription.", "forward": false, "src_ids": "2021.dash-1.16_15470"}
{"input": "word representation is done by using OtherScientificTerm| context: to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential . keyphrases provide a concise representation by identifying salient concepts in a document . various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts . moreover , keyphrases , which are usually the gist of a document , need to be the central theme .", "entity": "word representation", "output": "centrality constraint", "neg_sample": ["word representation is done by using OtherScientificTerm", "to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential .", "keyphrases provide a concise representation by identifying salient concepts in a document .", "various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts .", "moreover , keyphrases , which are usually the gist of a document , need to be the central theme ."], "relation": "used for", "id": "2021.bionlp-1.17", "year": 2021, "rel_sent": "We propose a new extraction model that introduces a centrality constraint to enrich the word representation of a Bidirectional long short - term memory .", "forward": false, "src_ids": "2021.bionlp-1.17_4787"}
{"input": "error identification is used for Task| context: however , the lack of explanations along with decisions made by end - to - end neural models makes the results difficult to interpret . furthermore , word - level annotated datasets are rare due to the prohibitive effort required to perform this task , while they could provide interpretable signals in addition to sentence - level qe outputs .", "entity": "error identification", "output": "machine translation", "neg_sample": ["error identification is used for Task", "however , the lack of explanations along with decisions made by end - to - end neural models makes the results difficult to interpret .", "furthermore , word - level annotated datasets are rare due to the prohibitive effort required to perform this task , while they could provide interpretable signals in addition to sentence - level qe outputs ."], "relation": "used for", "id": "2021.eval4nlp-1.15", "year": 2021, "rel_sent": "Error Identification for Machine Translation with Metric Embedding and Attention.", "forward": true, "src_ids": "2021.eval4nlp-1.15_11280"}
{"input": "zero - shot slot filling is done by using Method| context: zero - shot cross - domain slot filling alleviates the data dependence in the case of data scarcity in the target domain , which has aroused extensive research . however , as most of the existing methods do not achieve effective knowledge transfer to the target domain , they just fit the distribution of the seen slot and show poor performance on unseen slot in the target domain .", "entity": "zero - shot slot filling", "output": "dynamic label confusion strategy", "neg_sample": ["zero - shot slot filling is done by using Method", "zero - shot cross - domain slot filling alleviates the data dependence in the case of data scarcity in the target domain , which has aroused extensive research .", "however , as most of the existing methods do not achieve effective knowledge transfer to the target domain , they just fit the distribution of the seen slot and show poor performance on unseen slot in the target domain ."], "relation": "used for", "id": "2021.emnlp-main.746", "year": 2021, "rel_sent": "To solve this , we propose a novel approach based on prototypical contrastive learning with a dynamic label confusion strategy for zero - shot slot filling .", "forward": false, "src_ids": "2021.emnlp-main.746_12374"}
{"input": "selecting unbiased annotators is used for Task| context: implicit bias embedded in the annotated data is by far the greatest impediment in the effectual use of supervised machine learning models in tasks involving race , ethics , and geopolitical polarization . for societal good and demonstrable positive impact on wider society , it is paramount to carefully select data annotators and rigorously validate the annotation process . current approaches to selecting annotators are not sufficiently grounded in scientific principles and are limited at the policy - guidance level , thereby rendering them unusable for machine learning practitioners .", "entity": "selecting unbiased annotators", "output": "machine learning problem", "neg_sample": ["selecting unbiased annotators is used for Task", "implicit bias embedded in the annotated data is by far the greatest impediment in the effectual use of supervised machine learning models in tasks involving race , ethics , and geopolitical polarization .", "for societal good and demonstrable positive impact on wider society , it is paramount to carefully select data annotators and rigorously validate the annotation process .", "current approaches to selecting annotators are not sufficiently grounded in scientific principles and are limited at the policy - guidance level , thereby rendering them unusable for machine learning practitioners ."], "relation": "used for", "id": "2021.findings-acl.169", "year": 2021, "rel_sent": "This work proposes a new approach based on the mixed - methods design that is functional , adaptable , and simpler to implement in selecting unbiased annotators for any machine learning problem .", "forward": true, "src_ids": "2021.findings-acl.169_10386"}
{"input": "text similarity tasks is done by using Method| context: representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building nlp systems . this approach stands in contrast to autoencoders , also trained on raw text , but with the objective of learning to encode each input as a vector that allows full reconstruction . autoencoders are attractive because of their latent space structure and generative properties .", "entity": "text similarity tasks", "output": "pretrained transformers", "neg_sample": ["text similarity tasks is done by using Method", "representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building nlp systems .", "this approach stands in contrast to autoencoders , also trained on raw text , but with the objective of learning to encode each input as a vector that allows full reconstruction .", "autoencoders are attractive because of their latent space structure and generative properties ."], "relation": "used for", "id": "2021.emnlp-main.137", "year": 2021, "rel_sent": "We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks , style transfer ( an example of controlled generation ) , and single - sentence classification tasks in the GLUE benchmark , while using fewer parameters than large pretrained models .", "forward": false, "src_ids": "2021.emnlp-main.137_14097"}
{"input": "paraphrases is done by using Method| context: we release our synthetic parallel paraphrase corpus across 17 languages : arabic , catalan , czech , german , english , spanish , estonian , french , hindi , indonesian , italian , dutch , romanian , russian , swedish , vietnamese , and chinese .", "entity": "paraphrases", "output": "neural machine translation system", "neg_sample": ["paraphrases is done by using Method", "we release our synthetic parallel paraphrase corpus across 17 languages : arabic , catalan , czech , german , english , spanish , estonian , french , hindi , indonesian , italian , dutch , romanian , russian , swedish , vietnamese , and chinese ."], "relation": "used for", "id": "2021.paclic-1.6", "year": 2021, "rel_sent": "Our method relies only on monolingual data and a neural machine translation system to generate paraphrases , hence simple to apply .", "forward": false, "src_ids": "2021.paclic-1.6_2682"}
{"input": "rational lamol is used for Method| context: lifelong learning ( ll ) aims to train a neural network on a stream of tasks while retaining knowledge from previous tasks . however , many prior attempts in nlp still suffer from the catastrophic forgetting issue , where the model completely forgets what it just learned in the previous tasks .", "entity": "rational lamol", "output": "lamol", "neg_sample": ["rational lamol is used for Method", "lifelong learning ( ll ) aims to train a neural network on a stream of tasks while retaining knowledge from previous tasks .", "however , many prior attempts in nlp still suffer from the catastrophic forgetting issue , where the model completely forgets what it just learned in the previous tasks ."], "relation": "used for", "id": "2021.acl-long.229", "year": 2021, "rel_sent": "In order to alleviate catastrophic forgetting , Rational LAMOL enhances LAMOL , a recent LL model , by applying critical freezing guided by human rationales .", "forward": true, "src_ids": "2021.acl-long.229_8498"}
{"input": "pre - modern languages is done by using Method| context: the vast majority of nlp , its algorithms and software , is created with assumptions particular to living languages , thus neglecting certain important characteristics of largely non - spoken historical languages . further , scholars of pre - modern languages often have different goals than those of living - language researchers .", "entity": "pre - modern languages", "output": "nlp frameworks", "neg_sample": ["pre - modern languages is done by using Method", "the vast majority of nlp , its algorithms and software , is created with assumptions particular to living languages , thus neglecting certain important characteristics of largely non - spoken historical languages .", "further , scholars of pre - modern languages often have different goals than those of living - language researchers ."], "relation": "used for", "id": "2021.acl-demo.3", "year": 2021, "rel_sent": "The Classical Language Toolkit : An NLP Framework for Pre - Modern Languages.", "forward": false, "src_ids": "2021.acl-demo.3_10165"}
{"input": "dictionaries is done by using Method| context: however , due to a number of limitations of tlex , it was necessary to transition to a more flexible and more powerful format .", "entity": "dictionaries", "output": "tshwanelex ( tlex ) platform", "neg_sample": ["dictionaries is done by using Method", "however , due to a number of limitations of tlex , it was necessary to transition to a more flexible and more powerful format ."], "relation": "used for", "id": "2021.iwclul-1.7", "year": 2021, "rel_sent": "Volume 1 , whose prelimiary version was completed in 2020 , used the TshwaneLex ( TLex ) platform , which is perfectly adequate for dictionaries with a low to medium level of complexity , and which allows for almost WYSIWYG formatting and simple export into a publishable format .", "forward": false, "src_ids": "2021.iwclul-1.7_4149"}
{"input": "stretch - vst is used for Task| context: in visual storytelling , a short story is generated based on a given image sequence . despite years of work , most visual storytelling models remain limited in terms of the generated stories ' fixed length : most models produce stories with exactly five sentences because five - sentence stories dominate the training data . the fix - length stories carry limited details and provide ambiguous textual information to the readers .", "entity": "stretch - vst", "output": "generation of prolonged stories", "neg_sample": ["stretch - vst is used for Task", "in visual storytelling , a short story is generated based on a given image sequence .", "despite years of work , most visual storytelling models remain limited in terms of the generated stories ' fixed length : most models produce stories with exactly five sentences because five - sentence stories dominate the training data .", "the fix - length stories carry limited details and provide ambiguous textual information to the readers ."], "relation": "used for", "id": "2021.acl-demo.42", "year": 2021, "rel_sent": "This paper presents Stretch - VST , a visual storytelling framework that enables the generation of prolonged stories by adding appropriate knowledge , which is selected by the proposed scoring function .", "forward": true, "src_ids": "2021.acl-demo.42_8554"}
{"input": "aligned representations is done by using Method| context: impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations . however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information .", "entity": "aligned representations", "output": "gate fusion layer", "neg_sample": ["aligned representations is done by using Method", "impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations .", "however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information ."], "relation": "used for", "id": "2021.emnlp-main.312", "year": 2021, "rel_sent": "Specifically , each interaction block includes ( 1 ) a context - aware cross - attention mechanism to effectively integrate contextual information when aligning two sequences , and ( 2 ) a gate fusion layer toflexibly interpolate aligned representations .", "forward": false, "src_ids": "2021.emnlp-main.312_11570"}
{"input": "automatic methods is used for OtherScientificTerm| context: benchmark datasets , however , are intimately tied to the corpus used for their creation questioning their reliability as well as the robustness of automatic methods .", "entity": "automatic methods", "output": "lexical semantic change", "neg_sample": ["automatic methods is used for OtherScientificTerm", "benchmark datasets , however , are intimately tied to the corpus used for their creation questioning their reliability as well as the robustness of automatic methods ."], "relation": "used for", "id": "2021.lchange-1.3", "year": 2021, "rel_sent": "We also identify a set of additional issues ( OCR quality , named entities ) that impact the performance of the automatic methods , especially when used to discover LSC .", "forward": true, "src_ids": "2021.lchange-1.3_1419"}
{"input": "semantic representation is used for Task| context: although neural models have achieved competitive results in dialogue systems , they have shown limited ability in representing core semantics , such as ignoring important entities .", "entity": "semantic representation", "output": "dialogue modeling", "neg_sample": ["semantic representation is used for Task", "although neural models have achieved competitive results in dialogue systems , they have shown limited ability in representing core semantics , such as ignoring important entities ."], "relation": "used for", "id": "2021.acl-long.342", "year": 2021, "rel_sent": "Semantic Representation for Dialogue Modeling.", "forward": true, "src_ids": "2021.acl-long.342_4311"}
{"input": "pre - trained language model compression is done by using Method| context: pre - trained language models ( plms ) achieve great success in nlp . however , their huge model sizes hinder their applications in many practical systems . knowledge distillation is a popular technique to compress plms , which learns a small student model from a large teacher plm . however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model .", "entity": "pre - trained language model compression", "output": "multi - teacher knowledge distillation framework", "neg_sample": ["pre - trained language model compression is done by using Method", "pre - trained language models ( plms ) achieve great success in nlp .", "however , their huge model sizes hinder their applications in many practical systems .", "knowledge distillation is a popular technique to compress plms , which learns a small student model from a large teacher plm .", "however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model ."], "relation": "used for", "id": "2021.findings-acl.387", "year": 2021, "rel_sent": "In this paper , we propose a multi - teacher knowledge distillation framework named MTBERT for pre - trained language model compression , which can train high - quality student model from multiple teacher PLMs .", "forward": false, "src_ids": "2021.findings-acl.387_10299"}
{"input": "extractive multi - document summarization is done by using Method| context: recent researches have demonstrated that bert shows potential in a wide range of natural language processing tasks . it is adopted as an encoder for many stateof - the - art automatic summarizing systems , which achieve excellent performance . however , sofar , there is not much work done for vietnamese .", "entity": "extractive multi - document summarization", "output": "monolingual vs multilingual bertology", "neg_sample": ["extractive multi - document summarization is done by using Method", "recent researches have demonstrated that bert shows potential in a wide range of natural language processing tasks .", "it is adopted as an encoder for many stateof - the - art automatic summarizing systems , which achieve excellent performance .", "however , sofar , there is not much work done for vietnamese ."], "relation": "used for", "id": "2021.paclic-1.59", "year": 2021, "rel_sent": "Monolingual vs multilingual BERTology for Vietnamese extractive multi - document summarization.", "forward": false, "src_ids": "2021.paclic-1.59_5695"}
{"input": "covid-19 is used for Task| context: social media can be leveraged to understand public sentiment and feelings in real - time , and target public health messages based on user interests and emotions .", "entity": "covid-19", "output": "triggering intense anxiety", "neg_sample": ["covid-19 is used for Task", "social media can be leveraged to understand public sentiment and feelings in real - time , and target public health messages based on user interests and emotions ."], "relation": "used for", "id": "2021.jeptalnrecital-taln.21", "year": 2021, "rel_sent": "Sifting French Tweets to Investigate the Impact of Covid-19 in Triggering Intense Anxiety.", "forward": true, "src_ids": "2021.jeptalnrecital-taln.21_6400"}
{"input": "euclidean distance is used for Metric| context: sign language lexica are a useful resource for researchers and people learning sign languages . current implementations allow a user to search a sign either by its gloss or by selecting its primary features such as handshape and location .", "entity": "euclidean distance", "output": "distance metrics", "neg_sample": ["euclidean distance is used for Metric", "sign language lexica are a useful resource for researchers and people learning sign languages .", "current implementations allow a user to search a sign either by its gloss or by selecting its primary features such as handshape and location ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.3", "year": 2021, "rel_sent": "By extracting different body joints combinations ( upper body , dominant hand 's arm and wrist ) using the pose estimation framework OpenPose , we compare four techniques ( PCA , UMAP , DTW and Euclidean distance ) as distance metrics between 20 query signs , each performed by eight participants on a 1200 sign lexicon .", "forward": true, "src_ids": "2021.mtsummit-at4ssl.3_2271"}
{"input": "conversational recommendation is done by using Method| context: growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation . however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation .", "entity": "conversational recommendation", "output": "cr - walker", "neg_sample": ["conversational recommendation is done by using Method", "growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation .", "however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation ."], "relation": "used for", "id": "2021.emnlp-main.139", "year": 2021, "rel_sent": "CR - Walker : Tree - Structured Graph Reasoning and Dialog Acts for Conversational Recommendation.", "forward": false, "src_ids": "2021.emnlp-main.139_2623"}
{"input": "bert and bart - based rankers is done by using Method| context: the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain .", "entity": "bert and bart - based rankers", "output": "contrastive fine - tuning method", "neg_sample": ["bert and bart - based rankers is done by using Method", "the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain ."], "relation": "used for", "id": "2021.findings-acl.51", "year": 2021, "rel_sent": "In experiments with four passage ranking datasets , the proposed contrastive fine - tuning method obtains improvements on robustness to query reformulations , noise perturbations , and zeroshot transfer for both BERT and BART - based rankers .", "forward": false, "src_ids": "2021.findings-acl.51_12700"}
{"input": "database schema is used for Task| context: under the setup of cross - domain , traditional semantic parsing models struggle to adapt to unseen database schemas .", "entity": "database schema", "output": "text - to - sql", "neg_sample": ["database schema is used for Task", "under the setup of cross - domain , traditional semantic parsing models struggle to adapt to unseen database schemas ."], "relation": "used for", "id": "2021.naacl-main.441", "year": 2021, "rel_sent": "Given a database schema , Text - to - SQL aims to translate a natural language question into the corresponding SQL query .", "forward": true, "src_ids": "2021.naacl-main.441_6975"}
{"input": "predicting reddit popularity is done by using Method| context: past work investigating what makes a reddit post popular has indicated that style is a far better predictor than content , where posts conforming to a subreddit 's community style are better received . however , what about a diglossia , when there are two community styles ? in singapore , the basilect ( ' singlish ' ) co - exists with an acrolect ( standard english ) , each with contrasting advantages of community identity and prestige respectively .", "entity": "predicting reddit popularity", "output": "stylistic approaches", "neg_sample": ["predicting reddit popularity is done by using Method", "past work investigating what makes a reddit post popular has indicated that style is a far better predictor than content , where posts conforming to a subreddit 's community style are better received .", "however , what about a diglossia , when there are two community styles ?", "in singapore , the basilect ( ' singlish ' ) co - exists with an acrolect ( standard english ) , each with contrasting advantages of community identity and prestige respectively ."], "relation": "used for", "id": "2021.acl-srw.10", "year": 2021, "rel_sent": "Stylistic approaches to predicting Reddit popularity in diglossia.", "forward": false, "src_ids": "2021.acl-srw.10_12286"}
{"input": "joint model is used for OtherScientificTerm| context: the most straightforward approach to joint word segmentation ( ws ) , part - of - speech ( pos ) tagging , and constituent parsing is converting a word - level tree into a char - level tree , which , however , leads to two severe challenges . first , a larger label set ( e.g. , > = 600 ) and longer inputs both increase computational costs . second , it is difficult to rule out illegal trees containing conflicting production rules , which is important for reliable model evaluation . if a pos tag ( like vv ) is above a phrase tag ( like vp ) in the output tree , it becomes quite complex to decide word boundaries .", "entity": "joint model", "output": "bracketed tree", "neg_sample": ["joint model is used for OtherScientificTerm", "the most straightforward approach to joint word segmentation ( ws ) , part - of - speech ( pos ) tagging , and constituent parsing is converting a word - level tree into a char - level tree , which , however , leads to two severe challenges .", "first , a larger label set ( e.g.", ", > = 600 ) and longer inputs both increase computational costs .", "second , it is difficult to rule out illegal trees containing conflicting production rules , which is important for reliable model evaluation .", "if a pos tag ( like vv ) is above a phrase tag ( like vp ) in the output tree , it becomes quite complex to decide word boundaries ."], "relation": "used for", "id": "2021.conll-1.23", "year": 2021, "rel_sent": "In the coarse labeling stage , the joint model outputs a bracketed tree , in which each node corresponds to one of four labels ( i.e. , phrase , subphrase , word , subword ) .", "forward": true, "src_ids": "2021.conll-1.23_2755"}
{"input": "gaze behaviour data is used for OtherScientificTerm| context: automatic detection of cognates helps downstream nlp tasks of machine translation , cross - lingual information retrieval , computational phylogenetics and cross - lingual named entity recognition . previous approaches for the task of cognate detection use orthographic , phonetic and semantic similarity based features sets .", "entity": "gaze behaviour data", "output": "cognitive features", "neg_sample": ["gaze behaviour data is used for OtherScientificTerm", "automatic detection of cognates helps downstream nlp tasks of machine translation , cross - lingual information retrieval , computational phylogenetics and cross - lingual named entity recognition .", "previous approaches for the task of cognate detection use orthographic , phonetic and semantic similarity based features sets ."], "relation": "used for", "id": "2021.eacl-main.288", "year": 2021, "rel_sent": "We use the collected gaze behaviour data to predict cognitive features for a larger sample and show that predicted cognitive features , also , significantly improve the task performance .", "forward": true, "src_ids": "2021.eacl-main.288_9990"}
{"input": "socaog is used for Task| context: inferring social relations from dialogues is vital for building emotionally intelligent robots to interpret human language better and act accordingly .", "entity": "socaog", "output": "dynamic inference", "neg_sample": ["socaog is used for Task", "inferring social relations from dialogues is vital for building emotionally intelligent robots to interpret human language better and act accordingly ."], "relation": "used for", "id": "2021.acl-long.54", "year": 2021, "rel_sent": "Moreover , we formulate a sequential structure prediction task , and propose an alpha - beta - gamma strategy to incrementally parse SocAoG for the dynamic inference upon any incoming utterance : ( i ) an alpha process predicting attributes and relations conditioned on the semantics of dialogues , ( ii ) a beta process updating the social relations based on related attributes , and ( iii ) a gamma process updating individual 's attributes based on interpersonal social relations .", "forward": true, "src_ids": "2021.acl-long.54_61"}
{"input": "unimodal and crossmodal refinement network ( ucrn ) is used for Task| context: effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning . prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement .", "entity": "unimodal and crossmodal refinement network ( ucrn )", "output": "multimodal sequence fusion", "neg_sample": ["unimodal and crossmodal refinement network ( ucrn ) is used for Task", "effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning .", "prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement ."], "relation": "used for", "id": "2021.emnlp-main.720", "year": 2021, "rel_sent": "Unimodal and Crossmodal Refinement Network for Multimodal Sequence Fusion.", "forward": true, "src_ids": "2021.emnlp-main.720_1045"}
{"input": "unsupervised neural machine translation is done by using Method| context: unsupervised neural machine translation ( unmt ) is beneficial especially for under - resourced languages such as from the dravidian family . they learn to translate between the source and target , relying solely on only monolingual corpora . however , unmt systems fail in scenarios that occur often when dealing with low resource languages . recent works have achieved state - of - the - art results by adding auxiliary parallel data with similar languages .", "entity": "unsupervised neural machine translation", "output": "model architectures", "neg_sample": ["unsupervised neural machine translation is done by using Method", "unsupervised neural machine translation ( unmt ) is beneficial especially for under - resourced languages such as from the dravidian family .", "they learn to translate between the source and target , relying solely on only monolingual corpora .", "however , unmt systems fail in scenarios that occur often when dealing with low resource languages .", "recent works have achieved state - of - the - art results by adding auxiliary parallel data with similar languages ."], "relation": "used for", "id": "2021.dravidianlangtech-1.7", "year": 2021, "rel_sent": "We explore several model architectures that use the auxiliary data in order to maximize knowledge sharing and enable UNMT for dissimilar language pairs .", "forward": false, "src_ids": "2021.dravidianlangtech-1.7_1886"}
{"input": "labeled social knowledge categories dataset is used for Method| context: large pre - trained language models ( plms ) have led to great success on various commonsense question answering ( qa ) tasks in an end - to - end fashion . however , little attention has been paid to what commonsense knowledge is needed to deeply characterize these qa tasks .", "entity": "labeled social knowledge categories dataset", "output": "neural qa models", "neg_sample": ["labeled social knowledge categories dataset is used for Method", "large pre - trained language models ( plms ) have led to great success on various commonsense question answering ( qa ) tasks in an end - to - end fashion .", "however , little attention has been paid to what commonsense knowledge is needed to deeply characterize these qa tasks ."], "relation": "used for", "id": "2021.sustainlp-1.10", "year": 2021, "rel_sent": "Building upon our labeled social knowledge categories dataset on top of SocialIQA , we further train neural QA models to incorporate such social knowledge categories and relation information from a knowledge base .", "forward": true, "src_ids": "2021.sustainlp-1.10_9239"}
{"input": "oq tasks is done by using Metric| context: ordinal classification ( oc ) is an important classification task where the classes are ordinal . for example , an oc task for sentiment analysis could have the following classes : highly positive , positive , neutral , negative , highly negative .", "entity": "oq tasks", "output": "evaluating evaluation measures", "neg_sample": ["oq tasks is done by using Metric", "ordinal classification ( oc ) is an important classification task where the classes are ordinal .", "for example , an oc task for sentiment analysis could have the following classes : highly positive , positive , neutral , negative , highly negative ."], "relation": "used for", "id": "2021.acl-long.214", "year": 2021, "rel_sent": "In the present study , we utilise data from the SemEval and NTCIR communities to clarify the properties of nine evaluation measures in the context of OC tasks , and six measures in the context of OQ tasks .", "forward": false, "src_ids": "2021.acl-long.214_9553"}
{"input": "text classification is done by using Method| context: in recent years , memes combining image and text have been widely used in social media , and memes are one of the most popular types of content used in online disinformation campaigns .", "entity": "text classification", "output": "combination model", "neg_sample": ["text classification is done by using Method", "in recent years , memes combining image and text have been widely used in social media , and memes are one of the most popular types of content used in online disinformation campaigns ."], "relation": "used for", "id": "2021.semeval-1.144", "year": 2021, "rel_sent": "For propaganda technology detection in text , we propose a combination model of both ALBERT and Text CNN for text classification , as well as a BERT - based multi - task sequence labeling model for propaganda technology coverage span detection .", "forward": false, "src_ids": "2021.semeval-1.144_749"}
{"input": "human evaluations is used for Task| context: recent years have brought about an interest in the challenging task of summarizing conversation threads ( meetings , online discussions , etc . ) . such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency .", "entity": "human evaluations", "output": "short and long summary generation tasks", "neg_sample": ["human evaluations is used for Task", "recent years have brought about an interest in the challenging task of summarizing conversation threads ( meetings , online discussions , etc . )", ".", "such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency ."], "relation": "used for", "id": "2021.acl-long.537", "year": 2021, "rel_sent": "We perform a comprehensive empirical study to explore different summarization techniques ( including extractive and abstractive methods , single - document and hierarchical models , as well as transfer and semisupervised learning ) and conduct human evaluations on both short and long summary generation tasks .", "forward": true, "src_ids": "2021.acl-long.537_5488"}
{"input": "temporal semantics is used for OtherScientificTerm| context: the analysis supports a wide range of phenomena including : temporal references , temporal adverbs , aspectual classes and progressives .", "entity": "temporal semantics", "output": "syntax trees", "neg_sample": ["temporal semantics is used for OtherScientificTerm", "the analysis supports a wide range of phenomena including : temporal references , temporal adverbs , aspectual classes and progressives ."], "relation": "used for", "id": "2021.iwcs-1.2", "year": 2021, "rel_sent": "In this paper , we propose an implementation of temporal semantics that translates syntax trees to logical formulas , suitable for consumption by the Coq proof assistant .", "forward": true, "src_ids": "2021.iwcs-1.2_2391"}
{"input": "multimodal modeling is used for Task| context: visual grounding is a promising path toward more robust and accurate natural language processing ( nlp ) models . many multimodal extensions of bert ( e.g. , videobert , lxmert , vl - bert ) allow a joint modeling of texts and images that lead to state - of - the - art results on multimodal tasks such as visual question answering .", "entity": "multimodal modeling", "output": "text - only tasks", "neg_sample": ["multimodal modeling is used for Task", "visual grounding is a promising path toward more robust and accurate natural language processing ( nlp ) models .", "many multimodal extensions of bert ( e.g.", ", videobert , lxmert , vl - bert ) allow a joint modeling of texts and images that lead to state - of - the - art results on multimodal tasks such as visual question answering ."], "relation": "used for", "id": "2021.lantern-1.2", "year": 2021, "rel_sent": "A first type of strategy , referred to as transferred grounding consists in applying multimodal models to text - only tasks using a placeholder to replace image input .", "forward": true, "src_ids": "2021.lantern-1.2_7241"}
{"input": "repair process is done by using Method| context: automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality . conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation . however , they tend to ignore the code structure information and suffer from grammar and syntax errors .", "entity": "repair process", "output": "grammar - based ruleto - rule model", "neg_sample": ["repair process is done by using Method", "automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality .", "conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation .", "however , they tend to ignore the code structure information and suffer from grammar and syntax errors ."], "relation": "used for", "id": "2021.findings-acl.111", "year": 2021, "rel_sent": "To consider the grammar and syntax information , in this paper , we propose a grammar - based ruleto - rule model , which regards the repair process as the transformation of grammar rules , and leverages two encoders modeling both the original token sequence and the grammar rules , enhanced with a new tree - based self - attention .", "forward": false, "src_ids": "2021.findings-acl.111_9096"}
{"input": "causal analysis of syntactic agreement mechanisms is used for Method| context: targeted syntactic evaluations have demonstrated the ability of language models to perform subject - verb agreement given difficult contexts .", "entity": "causal analysis of syntactic agreement mechanisms", "output": "neural language models", "neg_sample": ["causal analysis of syntactic agreement mechanisms is used for Method", "targeted syntactic evaluations have demonstrated the ability of language models to perform subject - verb agreement given difficult contexts ."], "relation": "used for", "id": "2021.acl-long.144", "year": 2021, "rel_sent": "Causal Analysis of Syntactic Agreement Mechanisms in Neural Language Models.", "forward": true, "src_ids": "2021.acl-long.144_13192"}
{"input": "sanitized texts is used for Method| context: texts convey sophisticated knowledge . however , texts also convey sensitive information . despite the success of general - purpose language models and domain - specific mechanisms with differential privacy ( dp ) , existing text sanitization mechanisms still provide low utility , as cursed by the high - dimensional text representation . the companion issue of utilizing sanitized texts for downstream analytics is also under - explored .", "entity": "sanitized texts", "output": "sanitization - aware pretraining", "neg_sample": ["sanitized texts is used for Method", "texts convey sophisticated knowledge .", "however , texts also convey sensitive information .", "despite the success of general - purpose language models and domain - specific mechanisms with differential privacy ( dp ) , existing text sanitization mechanisms still provide low utility , as cursed by the high - dimensional text representation .", "the companion issue of utilizing sanitized texts for downstream analytics is also under - explored ."], "relation": "used for", "id": "2021.findings-acl.337", "year": 2021, "rel_sent": "The sanitized texts also contribute to our sanitization - aware pretraining and fine - tuning , enabling privacypreserving natural language processing over the BERT language model with promising utility .", "forward": true, "src_ids": "2021.findings-acl.337_8402"}
{"input": "machine learning approach is used for Task| context: machine learning approaches to detect depression from language have been demonstrated in the united states and other countries .", "entity": "machine learning approach", "output": "detection of depression", "neg_sample": ["machine learning approach is used for Task", "machine learning approaches to detect depression from language have been demonstrated in the united states and other countries ."], "relation": "used for", "id": "2021.paclic-1.37", "year": 2021, "rel_sent": "Machine Learning Approach for Depression Detection in Japanese.", "forward": true, "src_ids": "2021.paclic-1.37_5814"}
{"input": "graph - based causal inference is used for OtherScientificTerm| context: causal inference is the process of capturing cause - effect relationship among variables . most existing works focus on dealing with structured data , while mining causal relationship among factors from unstructured data , like text , has been less examined , but is of great importance , especially in the legal domain .", "entity": "graph - based causal inference", "output": "explainable discrimination", "neg_sample": ["graph - based causal inference is used for OtherScientificTerm", "causal inference is the process of capturing cause - effect relationship among variables .", "most existing works focus on dealing with structured data , while mining causal relationship among factors from unstructured data , like text , has been less examined , but is of great importance , especially in the legal domain ."], "relation": "used for", "id": "2021.naacl-main.155", "year": 2021, "rel_sent": "Experimental results show that GCI can capture the nuance from fact descriptions among multiple confusing charges and provide explainable discrimination , especially in few - shot settings .", "forward": true, "src_ids": "2021.naacl-main.155_12301"}
{"input": "biomedical domain is done by using Task| context: in recent years this interest has grown enormously leading to the development of a number of substantial datasets , of domain - specific contextual language models , and of several architectures .", "entity": "biomedical domain", "output": "coreference resolution", "neg_sample": ["biomedical domain is done by using Task", "in recent years this interest has grown enormously leading to the development of a number of substantial datasets , of domain - specific contextual language models , and of several architectures ."], "relation": "used for", "id": "2021.crac-1.2", "year": 2021, "rel_sent": "Coreference Resolution for the Biomedical Domain : A Survey.", "forward": false, "src_ids": "2021.crac-1.2_7822"}
{"input": "certified word substitution robustness methods is used for Task| context: existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training .", "entity": "certified word substitution robustness methods", "output": "text classification tasks", "neg_sample": ["certified word substitution robustness methods is used for Task", "existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training ."], "relation": "used for", "id": "2021.findings-acl.294", "year": 2021, "rel_sent": "In this paper , we investigate the utility of certified word substitution robustness methods to improve equality of odds and equality of opportunity on multiple text classification tasks .", "forward": true, "src_ids": "2021.findings-acl.294_2360"}
{"input": "translation is used for Task| context: we present a method for determining intended sense definitions of a given academic word in an academic keyword list .", "entity": "translation", "output": "pedagogy purposes", "neg_sample": ["translation is used for Task", "we present a method for determining intended sense definitions of a given academic word in an academic keyword list ."], "relation": "used for", "id": "2021.rocling-1.44", "year": 2021, "rel_sent": "We present a prototype system for the Academic Keyword List to generate definitions and translation for pedagogy purposes .", "forward": true, "src_ids": "2021.rocling-1.44_7882"}
{"input": "pretrained multilingual language models is used for Material| context: pretrained multilingual language models have been shown to work well on many languages for a variety of downstream nlp tasks . however , these models are known to require a lot of training data . this consequently leaves out a huge percentage of the world 's languages as they are under - resourced .", "entity": "pretrained multilingual language models", "output": "lower - resource languages", "neg_sample": ["pretrained multilingual language models is used for Material", "pretrained multilingual language models have been shown to work well on many languages for a variety of downstream nlp tasks .", "however , these models are known to require a lot of training data .", "this consequently leaves out a huge percentage of the world 's languages as they are under - resourced ."], "relation": "used for", "id": "2021.mrl-1.11", "year": 2021, "rel_sent": "Small Data ? No Problem ! Exploring the Viability of Pretrained Multilingual Language Models for Low - resourced Languages.", "forward": true, "src_ids": "2021.mrl-1.11_8713"}
{"input": "retrieval models is used for Task| context: recent state - of - the - art approaches in open - domain dialogue include training end - to - end deep - learning models to learn various conversational features like emotional content of response , symbolic transitions of dialogue contexts in a knowledge graph and persona of the agent and the user , among others . while neural models have shown reasonable results , modelling the cognitive processes that humans use when conversing with each other may improve the agent 's quality of responses .", "entity": "retrieval models", "output": "open - domain conversational agents", "neg_sample": ["retrieval models is used for Task", "recent state - of - the - art approaches in open - domain dialogue include training end - to - end deep - learning models to learn various conversational features like emotional content of response , symbolic transitions of dialogue contexts in a knowledge graph and persona of the agent and the user , among others .", "while neural models have shown reasonable results , modelling the cognitive processes that humans use when conversing with each other may improve the agent 's quality of responses ."], "relation": "used for", "id": "2021.emnlp-main.403", "year": 2021, "rel_sent": "Using human and automatic evaluations , we show that our augmentation approach significantly improves the performance of existing state - of - the - art retrieval models for open - domain conversational agents .", "forward": true, "src_ids": "2021.emnlp-main.403_12633"}
{"input": "demographic - aware framework is used for Task| context: affect preferences vary with user demographics , and tapping into demographic information provides important cues about the users ' language preferences .", "entity": "demographic - aware framework", "output": "empathy prediction", "neg_sample": ["demographic - aware framework is used for Task", "affect preferences vary with user demographics , and tapping into demographic information provides important cues about the users ' language preferences ."], "relation": "used for", "id": "2021.eacl-main.268", "year": 2021, "rel_sent": "In this paper , we utilize the user demographics and propose EmpathBERT , a demographic - aware framework for empathy prediction based on BERT .", "forward": true, "src_ids": "2021.eacl-main.268_5192"}
{"input": "shining - through effects is done by using OtherScientificTerm| context: the paper reports the results of a translationese study of literary texts based on translated and non - translated russian . we expect that literary translations from typologically distant languages should exhibit more translationese , and the fingerprints of individual source languages ( and their families ) are traceable in translations .", "entity": "shining - through effects", "output": "features", "neg_sample": ["shining - through effects is done by using OtherScientificTerm", "the paper reports the results of a translationese study of literary texts based on translated and non - translated russian .", "we expect that literary translations from typologically distant languages should exhibit more translationese , and the fingerprints of individual source languages ( and their families ) are traceable in translations ."], "relation": "used for", "id": "2021.latechclfl-1.12", "year": 2021, "rel_sent": "We identified features that point to linguistic specificity of Russian non - translated literature and to shining - through effects .", "forward": false, "src_ids": "2021.latechclfl-1.12_12516"}
{"input": "deep learning predictions is used for OtherScientificTerm| context: nowadays , most research conducted in the field of abstractive text summarization focuses on neural - based models alone , without considering their combination with knowledge - based approaches that could further enhance their efficiency .", "entity": "deep learning predictions", "output": "abstractive summaries", "neg_sample": ["deep learning predictions is used for OtherScientificTerm", "nowadays , most research conducted in the field of abstractive text summarization focuses on neural - based models alone , without considering their combination with knowledge - based approaches that could further enhance their efficiency ."], "relation": "used for", "id": "2021.cl-4.27", "year": 2021, "rel_sent": "The overall methodology is based on a well - defined theoretical model of knowledge - based content generalization and deep learning predictions for generating abstractive summaries .", "forward": true, "src_ids": "2021.cl-4.27_15045"}
{"input": "relevant information is done by using Method| context: end - to - end task - oriented dialogue systems generate responses based on dialog history and an accompanying knowledge base ( kb ) . inferring those kb entities that are most relevant for an utterance is crucial for response generation . existing state of the art scales to large kbs by softly filtering over irrelevant kb information .", "entity": "relevant information", "output": "pairwise similarity based filter", "neg_sample": ["relevant information is done by using Method", "end - to - end task - oriented dialogue systems generate responses based on dialog history and an accompanying knowledge base ( kb ) .", "inferring those kb entities that are most relevant for an utterance is crucial for response generation .", "existing state of the art scales to large kbs by softly filtering over irrelevant kb information ."], "relation": "used for", "id": "2021.findings-acl.448", "year": 2021, "rel_sent": "In this paper , we propose a novel filtering technique that consists of ( 1 ) a pairwise similarity based filter that identifies relevant information by respecting the n - ary structure in a KB record .", "forward": false, "src_ids": "2021.findings-acl.448_15410"}
{"input": "preview module is used for Method| context: existing dialog state tracking ( dst ) models are trained with dialog data in a random order , neglecting rich structural information in a dataset .", "entity": "preview module", "output": "dst model", "neg_sample": ["preview module is used for Method", "existing dialog state tracking ( dst ) models are trained with dialog data in a random order , neglecting rich structural information in a dataset ."], "relation": "used for", "id": "2021.acl-short.111", "year": 2021, "rel_sent": "Specifically , we propose a model - agnostic framework called Schema - aware Curriculum Learning for Dialog State Tracking ( SaCLog ) , which consists of a preview module that pre - trains a DST model with schema information , a curriculum module that optimizes the model with CL , and a review module that augments mispredicted data to reinforce the CL training .", "forward": true, "src_ids": "2021.acl-short.111_3993"}
{"input": "cross lingual transfer is done by using OtherScientificTerm| context: recent multilingual pre - trained language models have achieved remarkable zero - shot performance , where the model is only finetuned on one source language and directly evaluated on target languages .", "entity": "cross lingual transfer", "output": "uncertainties", "neg_sample": ["cross lingual transfer is done by using OtherScientificTerm", "recent multilingual pre - trained language models have achieved remarkable zero - shot performance , where the model is only finetuned on one source language and directly evaluated on target languages ."], "relation": "used for", "id": "2021.emnlp-main.538", "year": 2021, "rel_sent": "Three different uncertainties are adapted and analyzed specifically for the cross lingual transfer : Language Heteroscedastic / Homoscedastic Uncertainty ( LEU / LOU ) , Evidential Uncertainty ( EVI ) .", "forward": false, "src_ids": "2021.emnlp-main.538_3804"}
{"input": "inter - country social media dynamics is done by using Generic| context: the ongoing covid-19 pandemic resulted in significant ramifications for international relations ranging from travel restrictions , global ceasefires , and international vaccine production and sharing agreements . amidst a wave of infections in india that resulted in a systemic breakdown of healthcare infrastructure , a social welfare organization based in pakistan offered to procure medical - grade oxygen to assist india - a nation which was involved in four wars with pakistan in the past few decades . while # indianeedsoxygen and # pakistanstandswithindia featured among the top - trending hashtags in pakistan , divisive hashtags such as # endiasaysorrytokashmir simultaneously started trending . against the backdrop of a contentious history including four wars , divisive content of this nature , especially when a country is facing an unprecedented healthcare crisis , fuels further deterioration of relations .", "entity": "inter - country social media dynamics", "output": "resource transfer", "neg_sample": ["inter - country social media dynamics is done by using Generic", "the ongoing covid-19 pandemic resulted in significant ramifications for international relations ranging from travel restrictions , global ceasefires , and international vaccine production and sharing agreements .", "amidst a wave of infections in india that resulted in a systemic breakdown of healthcare infrastructure , a social welfare organization based in pakistan offered to procure medical - grade oxygen to assist india - a nation which was involved in four wars with pakistan in the past few decades .", "while # indianeedsoxygen and # pakistanstandswithindia featured among the top - trending hashtags in pakistan , divisive hashtags such as # endiasaysorrytokashmir simultaneously started trending .", "against the backdrop of a contentious history including four wars , divisive content of this nature , especially when a country is facing an unprecedented healthcare crisis , fuels further deterioration of relations ."], "relation": "used for", "id": "2021.nlp4posimpact-1.14", "year": 2021, "rel_sent": "Empathy and Hope : Resource Transfer to Model Inter - country Social Media Dynamics.", "forward": false, "src_ids": "2021.nlp4posimpact-1.14_13225"}
{"input": "student factorization is used for OtherScientificTerm| context: knowledge distillation is a critical technique to transfer knowledge between models , typically from a large model ( the teacher ) to a more fine - grained one ( the student ) . the objective function of knowledge distillation is typically the cross - entropy between the teacher and the student 's output distributions . however , for structured prediction problems , the output space is exponential in size ; therefore , the cross - entropy objective becomes intractable to compute and optimize directly .", "entity": "student factorization", "output": "fine - grained substructures", "neg_sample": ["student factorization is used for OtherScientificTerm", "knowledge distillation is a critical technique to transfer knowledge between models , typically from a large model ( the teacher ) to a more fine - grained one ( the student ) .", "the objective function of knowledge distillation is typically the cross - entropy between the teacher and the student 's output distributions .", "however , for structured prediction problems , the output space is exponential in size ; therefore , the cross - entropy objective becomes intractable to compute and optimize directly ."], "relation": "used for", "id": "2021.acl-long.46", "year": 2021, "rel_sent": "In particular , we show the tractability and empirical effectiveness of structural knowledge distillation between sequence labeling and dependency parsing models under four different scenarios : 1 ) the teacher and student share the same factorization form of the output structure scoring function ; 2 ) the student factorization produces more fine - grained substructures than the teacher factorization ; 3 ) the teacher factorization produces more fine - grained substructures than the student factorization ; 4 ) the factorization forms from the teacher and the student are incompatible .", "forward": true, "src_ids": "2021.acl-long.46_2547"}
{"input": "lcp task is done by using Method| context: lexical complexity prediction ( lcp ) involves assigning a difficulty score to a particular word or expression , in a text intended for a target audience .", "entity": "lcp task", "output": "fine - tuning pre - trained transformers", "neg_sample": ["lcp task is done by using Method", "lexical complexity prediction ( lcp ) involves assigning a difficulty score to a particular word or expression , in a text intended for a target audience ."], "relation": "used for", "id": "2021.semeval-1.73", "year": 2021, "rel_sent": "The obtained results are very promising and show the effectiveness of fine - tuning pre - trained transformers for LCP task .", "forward": false, "src_ids": "2021.semeval-1.73_6891"}
{"input": "system initiative is used for Task| context: topic diversion occurs frequently with engaging open - domain dialogue systems like virtual assistants . the balance between staying on topic and rectifying the topic drift is important for a good collaborative system .", "entity": "system initiative", "output": "diversion rectification", "neg_sample": ["system initiative is used for Task", "topic diversion occurs frequently with engaging open - domain dialogue systems like virtual assistants .", "the balance between staying on topic and rectifying the topic drift is important for a good collaborative system ."], "relation": "used for", "id": "2021.sigdial-1.17", "year": 2021, "rel_sent": "We propose a preliminary study , classifying utterances into major , minor and off - topics , which further extends into a system initiative for diversion rectification .", "forward": true, "src_ids": "2021.sigdial-1.17_5988"}
{"input": "named entity recognition is done by using Method| context: this paper presents our findings from participating in the smm4h shared task 2021 .", "entity": "named entity recognition", "output": "stacked heterogeneous embeddings", "neg_sample": ["named entity recognition is done by using Method", "this paper presents our findings from participating in the smm4h shared task 2021 ."], "relation": "used for", "id": "2021.smm4h-1.14", "year": 2021, "rel_sent": "Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021.", "forward": false, "src_ids": "2021.smm4h-1.14_3381"}
{"input": "model interpretation and correcting model errors is done by using OtherScientificTerm| context: despite the popularity , their computational cost does not scale well with model and training data size .", "entity": "model interpretation and correcting model errors", "output": "influence functions", "neg_sample": ["model interpretation and correcting model errors is done by using OtherScientificTerm", "despite the popularity , their computational cost does not scale well with model and training data size ."], "relation": "used for", "id": "2021.emnlp-main.808", "year": 2021, "rel_sent": "Overall , our fast influence functions can be efficiently applied to large models and datasets , and our experiments demonstrate the potential of influence functions in model interpretation and correcting model errors .", "forward": false, "src_ids": "2021.emnlp-main.808_10929"}
{"input": "modeling context is used for Task| context: answer sentence selection ( as2 ) is an efficient approach for the design of open - domain question answering ( qa ) systems . in order to achieve low latency , traditional as2 models score question - answer pairs individually , ignoring any information from the document each potential answer was extracted from . in contrast , more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input , which often results in better accuracy .", "entity": "modeling context", "output": "answer sentence selection systems", "neg_sample": ["modeling context is used for Task", "answer sentence selection ( as2 ) is an efficient approach for the design of open - domain question answering ( qa ) systems .", "in order to achieve low latency , traditional as2 models score question - answer pairs individually , ignoring any information from the document each potential answer was extracted from .", "in contrast , more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input , which often results in better accuracy ."], "relation": "used for", "id": "2021.eacl-main.261", "year": 2021, "rel_sent": "Modeling Context in Answer Sentence Selection Systems on a Latency Budget.", "forward": true, "src_ids": "2021.eacl-main.261_14534"}
{"input": "learning relation alignment is used for Task| context: despite the achievements of large - scale multimodal pre - training approaches , cross - modal retrieval , e.g. , image - text retrieval , remains a challenging task . to bridge the semantic gap between the two modalities , previous studies mainly focus on word - region alignment at the object level , lacking the matching between the linguistic relation among the words and the visual relation among the regions . the neglect of such relation consistency impairs the contextualized representation of image - text pairs and hinders the model performance and the interpretability .", "entity": "learning relation alignment", "output": "calibrated cross - modal retrieval", "neg_sample": ["learning relation alignment is used for Task", "despite the achievements of large - scale multimodal pre - training approaches , cross - modal retrieval , e.g.", ", image - text retrieval , remains a challenging task .", "to bridge the semantic gap between the two modalities , previous studies mainly focus on word - region alignment at the object level , lacking the matching between the linguistic relation among the words and the visual relation among the regions .", "the neglect of such relation consistency impairs the contextualized representation of image - text pairs and hinders the model performance and the interpretability ."], "relation": "used for", "id": "2021.acl-long.43", "year": 2021, "rel_sent": "Learning Relation Alignment for Calibrated Cross - modal Retrieval.", "forward": true, "src_ids": "2021.acl-long.43_9571"}
{"input": "multitask semi - supervised learning is used for Task| context: as labeling schemas evolve over time , small differences can render datasets following older schemas unusable . this prevents researchers from building on top of previous annotation work and results in the existence , in discourse learning in particular , of many small class - imbalanced datasets .", "entity": "multitask semi - supervised learning", "output": "class - imbalanced discourse classification", "neg_sample": ["multitask semi - supervised learning is used for Task", "as labeling schemas evolve over time , small differences can render datasets following older schemas unusable .", "this prevents researchers from building on top of previous annotation work and results in the existence , in discourse learning in particular , of many small class - imbalanced datasets ."], "relation": "used for", "id": "2021.emnlp-main.40", "year": 2021, "rel_sent": "Multitask Semi - Supervised Learning for Class - Imbalanced Discourse Classification.", "forward": true, "src_ids": "2021.emnlp-main.40_12922"}
{"input": "neural machine translation is done by using Method| context: we present a simple method for extending transformers to source - side trees .", "entity": "neural machine translation", "output": "syntax - based attention masking", "neg_sample": ["neural machine translation is done by using Method", "we present a simple method for extending transformers to source - side trees ."], "relation": "used for", "id": "2021.naacl-srw.7", "year": 2021, "rel_sent": "Syntax - Based Attention Masking for Neural Machine Translation.", "forward": false, "src_ids": "2021.naacl-srw.7_6651"}
{"input": "polysemiotic corpora is used for Task| context: translation studies and more specifically , its subfield descriptive translation studies [ holmes 1988/2000 ] is , according to many scholars [ gambier , 2009 ; nenopoulou , 2007 ; munday , 2001/2008 ; hermans , 1999 ; snell - hornby et al . , 1994 e.t.c ] , a highly interdisciplinary field of study .", "entity": "polysemiotic corpora", "output": "qualitative analysis of website localization", "neg_sample": ["polysemiotic corpora is used for Task", "translation studies and more specifically , its subfield descriptive translation studies [ holmes 1988/2000 ] is , according to many scholars [ gambier , 2009 ; nenopoulou , 2007 ; munday , 2001/2008 ; hermans , 1999 ; snell - hornby et al .", ", 1994 e.t.c ] , a highly interdisciplinary field of study ."], "relation": "used for", "id": "2021.triton-1.25", "year": 2021, "rel_sent": "Up to now research findings have shown that polysemiotic corpora can be a valuable tool not only of quantitative but also of qualitative analysis of website localization both for scholars and translation professionals working with multimodal genres .", "forward": true, "src_ids": "2021.triton-1.25_15683"}
{"input": "syntactically controlled paraphrase generator ( synpg ) is done by using Task| context: paraphrase generation plays an essential role in natural language process ( nlp ) , and it has many downstream applications . however , training supervised paraphrase models requires many annotated paraphrase pairs , which are usually costly to obtain . on the other hand , the paraphrases generated by existing unsupervised approaches are usually syntactically similar to the source sentences and are limited in diversity .", "entity": "syntactically controlled paraphrase generator ( synpg )", "output": "disentanglement", "neg_sample": ["syntactically controlled paraphrase generator ( synpg ) is done by using Task", "paraphrase generation plays an essential role in natural language process ( nlp ) , and it has many downstream applications .", "however , training supervised paraphrase models requires many annotated paraphrase pairs , which are usually costly to obtain .", "on the other hand , the paraphrases generated by existing unsupervised approaches are usually syntactically similar to the source sentences and are limited in diversity ."], "relation": "used for", "id": "2021.eacl-main.88", "year": 2021, "rel_sent": "The disentanglement enables SynPG to control the syntax of output paraphrases by manipulating the embedding in the syntactic space .", "forward": false, "src_ids": "2021.eacl-main.88_4472"}
{"input": "scalable data generation method is used for OtherScientificTerm| context: in multi - hop qa , answering complex questions entails iterative document retrieval for finding the missing entity of the question . the main steps of this process are sub - question detection , document retrieval for the subquestion , and generation of a new query for the final document retrieval . however , building a dataset that contains complex questions with sub - questions and their corresponding documents requires costly human annotation .", "entity": "scalable data generation method", "output": "nested structure of question", "neg_sample": ["scalable data generation method is used for OtherScientificTerm", "in multi - hop qa , answering complex questions entails iterative document retrieval for finding the missing entity of the question .", "the main steps of this process are sub - question detection , document retrieval for the subquestion , and generation of a new query for the final document retrieval .", "however , building a dataset that contains complex questions with sub - questions and their corresponding documents requires costly human annotation ."], "relation": "used for", "id": "2021.findings-acl.62", "year": 2021, "rel_sent": "Our method includes 1 ) a pre - training task for generating vector representations of complex questions , 2 ) a scalable data generation method that produces the nested structure of question and subquestion as weak supervision for pre - training , and 3 ) a pre - training model structure based on dense encoders .", "forward": true, "src_ids": "2021.findings-acl.62_10077"}
{"input": "arbitrary text sequences is used for OtherScientificTerm| context: in the recent advances of natural language processing , the scale of the state - of - the - art models and datasets is usually extensive , which challenges the application of sample - based explanation methods in many aspects , such as explanation interpretability , efficiency , and faithfulness .", "entity": "arbitrary text sequences", "output": "explanation unit", "neg_sample": ["arbitrary text sequences is used for OtherScientificTerm", "in the recent advances of natural language processing , the scale of the state - of - the - art models and datasets is usually extensive , which challenges the application of sample - based explanation methods in many aspects , such as explanation interpretability , efficiency , and faithfulness ."], "relation": "used for", "id": "2021.acl-long.419", "year": 2021, "rel_sent": "In this work , for the first time , we can improve the interpretability of explanations by allowing arbitrary text sequences as the explanation unit .", "forward": true, "src_ids": "2021.acl-long.419_4878"}
{"input": "named entity recognition is used for Material| context: we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts . previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption . in a production environment , end users require low - latency models which can be readily integrated into existing pipelines .", "entity": "named entity recognition", "output": "noisy call center transcripts", "neg_sample": ["named entity recognition is used for Material", "we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts .", "previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption .", "in a production environment , end users require low - latency models which can be readily integrated into existing pipelines ."], "relation": "used for", "id": "2021.wnut-1.40", "year": 2021, "rel_sent": "Improved Named Entity Recognition for Noisy Call Center Transcripts.", "forward": true, "src_ids": "2021.wnut-1.40_6140"}
{"input": "n - gram coverage is done by using Method| context: beam search is a go - to strategy for decoding neural sequence models . the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates . empirically , this leads to sets often exhibiting high overlap , e.g. , strings may differ by only a single word . yet in use - cases that call for multiple solutions , a diverse or representative set is often desired .", "entity": "n - gram coverage", "output": "string subsequence kernel", "neg_sample": ["n - gram coverage is done by using Method", "beam search is a go - to strategy for decoding neural sequence models .", "the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates .", "empirically , this leads to sets often exhibiting high overlap , e.g.", ", strings may differ by only a single word .", "yet in use - cases that call for multiple solutions , a diverse or representative set is often desired ."], "relation": "used for", "id": "2021.acl-long.512", "year": 2021, "rel_sent": "In a case study , we use the string subsequence kernel to explicitly encourage n - gram coverage in text generated from a sequence model .", "forward": false, "src_ids": "2021.acl-long.512_11818"}
{"input": "masked language models is used for Material| context: previous literatures show that pre - trained masked language models ( mlms ) such as bert can achieve competitive factual knowledge extraction performance on some datasets , indicating that mlms can potentially be a reliable knowledge source .", "entity": "masked language models", "output": "factual knowledge bases", "neg_sample": ["masked language models is used for Material", "previous literatures show that pre - trained masked language models ( mlms ) such as bert can achieve competitive factual knowledge extraction performance on some datasets , indicating that mlms can potentially be a reliable knowledge source ."], "relation": "used for", "id": "2021.acl-long.146", "year": 2021, "rel_sent": "Our findings shed light on the underlying predicting mechanisms of MLMs , and strongly question the previous conclusion that current MLMs can potentially serve as reliable factual knowledge bases .", "forward": true, "src_ids": "2021.acl-long.146_9446"}
{"input": "cmta is used for OtherScientificTerm| context: the internet has actually come to be an essential resource of health knowledge for individuals around the world in the present situation of the coronavirus condition pandemic(covid-19 ) . during pandemic situations , myths , sensationalism , rumours and misinformation , generated intentionally or unintentionally , spread rapidly through social networks . twitter is one of these popular social networks people use to share covid-19 related news , information , and thoughts that reflect their perception and opinion about the pandemic . evaluation of tweets for recognizing misinformation can create beneficial understanding to review the top quality and also the readability of online information concerning the covid-19 .", "entity": "cmta", "output": "features", "neg_sample": ["cmta is used for OtherScientificTerm", "the internet has actually come to be an essential resource of health knowledge for individuals around the world in the present situation of the coronavirus condition pandemic(covid-19 ) .", "during pandemic situations , myths , sensationalism , rumours and misinformation , generated intentionally or unintentionally , spread rapidly through social networks .", "twitter is one of these popular social networks people use to share covid-19 related news , information , and thoughts that reflect their perception and opinion about the pandemic .", "evaluation of tweets for recognizing misinformation can create beneficial understanding to review the top quality and also the readability of online information concerning the covid-19 ."], "relation": "used for", "id": "2021.acl-srw.28", "year": 2021, "rel_sent": "CMTA extracts features from multilingual textual data , which is then categorized into specific information classes .", "forward": true, "src_ids": "2021.acl-srw.28_2854"}
{"input": "cript is used for Method| context: this paper takes a first step towards a critical thinking curriculum for neural auto - regressive language models .", "entity": "cript", "output": "core argument schemes", "neg_sample": ["cript is used for Method", "this paper takes a first step towards a critical thinking curriculum for neural auto - regressive language models ."], "relation": "used for", "id": "2021.iwcs-1.7", "year": 2021, "rel_sent": "CRiPT generalizes the core argument schemes in a correct way .", "forward": true, "src_ids": "2021.iwcs-1.7_13980"}
{"input": "unsupervised label refinement is used for Task| context: while promising , it crucially relies on accurate descriptions of the label set for each downstream task . this reliance causes dataless classifiers to be highly sensitive to the choice of label descriptions and hinders the broader application of dataless classification in practice .", "entity": "unsupervised label refinement", "output": "dataless text classification", "neg_sample": ["unsupervised label refinement is used for Task", "while promising , it crucially relies on accurate descriptions of the label set for each downstream task .", "this reliance causes dataless classifiers to be highly sensitive to the choice of label descriptions and hinders the broader application of dataless classification in practice ."], "relation": "used for", "id": "2021.findings-acl.365", "year": 2021, "rel_sent": "Unsupervised Label Refinement Improves Dataless Text Classification.", "forward": true, "src_ids": "2021.findings-acl.365_13618"}
{"input": "estimation approaches is used for Material| context: finding the year of writing for a historical text is of crucial importance to historical research . however , the year of original creation is rarely explicitly stated and must be inferred from the text content , historical records , and codicological clues . given a transcribed text , machine learning has successfully been used to estimate the year of production .", "entity": "estimation approaches", "output": "historical text archives", "neg_sample": ["estimation approaches is used for Material", "finding the year of writing for a historical text is of crucial importance to historical research .", "however , the year of original creation is rarely explicitly stated and must be inferred from the text content , historical records , and codicological clues .", "given a transcribed text , machine learning has successfully been used to estimate the year of production ."], "relation": "used for", "id": "2021.nodalida-main.15", "year": 2021, "rel_sent": "In this paper , we present an overview of several estimation approaches for historical text archives spanning from the 12th century until today .", "forward": true, "src_ids": "2021.nodalida-main.15_13220"}
{"input": "mlms is used for Task| context: a possible explanation for the impressive performance of masked language model ( mlm ) pre - training is that such models have learned to represent the syntactic structures prevalent in classical nlp pipelines .", "entity": "mlms", "output": "downstream tasks", "neg_sample": ["mlms is used for Task", "a possible explanation for the impressive performance of masked language model ( mlm ) pre - training is that such models have learned to represent the syntactic structures prevalent in classical nlp pipelines ."], "relation": "used for", "id": "2021.emnlp-main.230", "year": 2021, "rel_sent": "In this paper , we propose a different explanation : MLMs succeed on downstream tasks almost entirely due to their ability to model higher - order word co - occurrence statistics .", "forward": true, "src_ids": "2021.emnlp-main.230_15120"}
{"input": "text - based qa systems is done by using Method| context: existing work shows the benefits of integrating kbs with textual evidence for qa only on questions that are answerable by kbs alone ( sun et al . , 2019 ) . in contrast , real world qa systems often have to deal with questions that might not be directly answerable by kbs .", "entity": "text - based qa systems", "output": "model - agnostic approach", "neg_sample": ["text - based qa systems is done by using Method", "existing work shows the benefits of integrating kbs with textual evidence for qa only on questions that are answerable by kbs alone ( sun et al .", ", 2019 ) .", "in contrast , real world qa systems often have to deal with questions that might not be directly answerable by kbs ."], "relation": "used for", "id": "2021.deelio-1.3", "year": 2021, "rel_sent": "We propose and analyze a simple , model - agnostic approach for incorporating KB paths into text - based QA systems and establish a strong upper bound on FQ for our method using an oracle retriever .", "forward": false, "src_ids": "2021.deelio-1.3_12510"}
{"input": "drs trees is done by using Method| context: text - level discourse rhetorical structure ( drs ) parsing is known to be challenging due to the notorious lack of training data . although recent top - down drs parsers can better leverage global document context and have achieved certain success , the performance is still far from perfect . to our knowledge , all previous drs parsers make local decisions for either bottom - up node composition or top - down split point ranking at each time step , and largely ignore drs parsing from the global view point . obviously , it is not sufficient to build an entire drs tree only through these local decisions .", "entity": "drs trees", "output": "adversarial bot", "neg_sample": ["drs trees is done by using Method", "text - level discourse rhetorical structure ( drs ) parsing is known to be challenging due to the notorious lack of training data .", "although recent top - down drs parsers can better leverage global document context and have achieved certain success , the performance is still far from perfect .", "to our knowledge , all previous drs parsers make local decisions for either bottom - up node composition or top - down split point ranking at each time step , and largely ignore drs parsing from the global view point .", "obviously , it is not sufficient to build an entire drs tree only through these local decisions ."], "relation": "used for", "id": "2021.acl-long.305", "year": 2021, "rel_sent": "After that , we learn an adversarial bot between gold and fake tree diagrams to estimate the generated DRS trees from a global perspective .", "forward": false, "src_ids": "2021.acl-long.305_1463"}
{"input": "deep learning architectures is used for Method| context: argument mining is often addressed by a pipeline method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task .", "entity": "deep learning architectures", "output": "argument components", "neg_sample": ["deep learning architectures is used for Method", "argument mining is often addressed by a pipeline method where segmentation of text into argumentative units is conducted first and proceeded by an argument component identification task ."], "relation": "used for", "id": "2021.bea-1.22", "year": 2021, "rel_sent": "To this end , we compare a variety of state - of - the - art models such as discrete features and deep learning architectures ( e.g. , BiLSTM networks and BERT - based architectures ) to identify the argument components .", "forward": true, "src_ids": "2021.bea-1.22_11882"}
{"input": "distributed meaning representations is done by using OtherScientificTerm| context: word concreteness and imageability have proven crucial in understanding how humans process and represent language in the brain . while word - embeddings do not explicitly incorporate the concreteness of words into their computations , they have been shown to accurately predict human judgments of concreteness and imageability .", "entity": "distributed meaning representations", "output": "neural activity patterns", "neg_sample": ["distributed meaning representations is done by using OtherScientificTerm", "word concreteness and imageability have proven crucial in understanding how humans process and represent language in the brain .", "while word - embeddings do not explicitly incorporate the concreteness of words into their computations , they have been shown to accurately predict human judgments of concreteness and imageability ."], "relation": "used for", "id": "2021.cmcl-1.1", "year": 2021, "rel_sent": "Inspired by the recent interest in using neural activity patterns to analyze distributed meaning representations , we first show that brain responses acquired while human subjects passively comprehend natural stories can significantly distinguish the concreteness levels of the words encountered .", "forward": false, "src_ids": "2021.cmcl-1.1_4503"}
{"input": "bilingual parallel corpus is used for OtherScientificTerm| context: we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word .", "entity": "bilingual parallel corpus", "output": "bilingual gp pairs", "neg_sample": ["bilingual parallel corpus is used for OtherScientificTerm", "we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word ."], "relation": "used for", "id": "2021.rocling-1.39", "year": 2021, "rel_sent": "In our approach , bilingual parallel corpus is transformed into bilingual GP pairs aimed at providing native language support for learning word usage through GPs .", "forward": true, "src_ids": "2021.rocling-1.39_2676"}
{"input": "joint model is used for Task| context: journalists usually organize and present the contents of a news article following a welldefined structure .", "entity": "joint model", "output": "structure - based news genre classification", "neg_sample": ["joint model is used for Task", "journalists usually organize and present the contents of a news article following a welldefined structure ."], "relation": "used for", "id": "2021.findings-acl.295", "year": 2021, "rel_sent": "A Joint Model for Structure - based News Genre Classification with Application to Text Summarization.", "forward": true, "src_ids": "2021.findings-acl.295_2812"}
{"input": "pre - trained language model is used for Method| context: discourse segmentation and sentence - level discourse parsing play important roles for various nlp tasks to consider textual coherence . despite recent achievements in both tasks , there is still room for improvement due to the scarcity of labeled data .", "entity": "pre - trained language model", "output": "language model - based generative classifier", "neg_sample": ["pre - trained language model is used for Method", "discourse segmentation and sentence - level discourse parsing play important roles for various nlp tasks to consider textual coherence .", "despite recent achievements in both tasks , there is still room for improvement due to the scarcity of labeled data ."], "relation": "used for", "id": "2021.emnlp-main.188", "year": 2021, "rel_sent": "Moreover , since this enables LMGC to make ready the representations for labels , unseen in the pre - training step , we can effectively use a pre - trained language model in LMGC .", "forward": true, "src_ids": "2021.emnlp-main.188_524"}
{"input": "contrastive fine - tuning objective is used for OtherScientificTerm| context: phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness .", "entity": "contrastive fine - tuning objective", "output": "phrase embeddings", "neg_sample": ["contrastive fine - tuning objective is used for OtherScientificTerm", "phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness ."], "relation": "used for", "id": "2021.emnlp-main.846", "year": 2021, "rel_sent": "In this paper , we propose a contrastive fine - tuning objective that enables BERT to produce more powerful phrase embeddings .", "forward": true, "src_ids": "2021.emnlp-main.846_14976"}
{"input": "corporate events detection is used for Task| context: unlike existing models that utilize textual features ( e.g. , bag - of - words ) and sentiments to directly make stock predictions , we consider corporate events as the driving force behind stock movements and aim to profit from the temporary stock mispricing that may occur when corporate events take place .", "entity": "corporate events detection", "output": "news - based event - driven trading", "neg_sample": ["corporate events detection is used for Task", "unlike existing models that utilize textual features ( e.g.", ", bag - of - words ) and sentiments to directly make stock predictions , we consider corporate events as the driving force behind stock movements and aim to profit from the temporary stock mispricing that may occur when corporate events take place ."], "relation": "used for", "id": "2021.findings-acl.186", "year": 2021, "rel_sent": "Trade the Event : Corporate Events Detection for News - Based Event - Driven Trading.", "forward": true, "src_ids": "2021.findings-acl.186_1431"}
{"input": "linguistic features is used for OtherScientificTerm| context: much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages . we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis .", "entity": "linguistic features", "output": "cross - cultural similarities", "neg_sample": ["linguistic features is used for OtherScientificTerm", "much work in cross - lingual transfer learning explored how to select better transfer languages for multilingual tasks , primarily focusing on typological and genealogical similarities between languages .", "we hypothesize that these measures of linguistic proximity are not enough when working with pragmatically - motivated tasks , such as sentiment analysis ."], "relation": "used for", "id": "2021.eacl-main.204", "year": 2021, "rel_sent": "As an alternative , we introduce three linguistic features that capture cross - cultural similarities that manifest in linguistic patterns and quantify distinct aspects of language pragmatics : language context - level , figurative language , and the lexification of emotion concepts .", "forward": true, "src_ids": "2021.eacl-main.204_10482"}
{"input": "computer vision is used for OtherScientificTerm| context: technical logbooks are a challenging and under - explored text type in automated event identification . these texts are typically short and written in non - standard yet technical language , posing challenges to off - the - shelf nlp pipelines . the granularity of issue types described in these datasets additionally leads to class imbalance , making it challenging for models to accurately predict which issue each logbook entry describes .", "entity": "computer vision", "output": "extreme class imbalance", "neg_sample": ["computer vision is used for OtherScientificTerm", "technical logbooks are a challenging and under - explored text type in automated event identification .", "these texts are typically short and written in non - standard yet technical language , posing challenges to off - the - shelf nlp pipelines .", "the granularity of issue types described in these datasets additionally leads to class imbalance , making it challenging for models to accurately predict which issue each logbook entry describes ."], "relation": "used for", "id": "2021.acl-long.312", "year": 2021, "rel_sent": "We adapt a feedback strategy from computer vision for handling extreme class imbalance , which resamples the training data based on its error in the prediction process .", "forward": true, "src_ids": "2021.acl-long.312_1199"}
{"input": "recommender dialogue system is done by using OtherScientificTerm| context: though recent end - to - end neural models have shown promising progress on this task , two key challenges still remain . first , the recommended items can not be always incorporated into the generated response precisely and appropriately . second , only the items mentioned in the training corpus have a chance to be recommended in the conversation .", "entity": "recommender dialogue system", "output": "neural templates", "neg_sample": ["recommender dialogue system is done by using OtherScientificTerm", "though recent end - to - end neural models have shown promising progress on this task , two key challenges still remain .", "first , the recommended items can not be always incorporated into the generated response precisely and appropriately .", "second , only the items mentioned in the training corpus have a chance to be recommended in the conversation ."], "relation": "used for", "id": "2021.emnlp-main.617", "year": 2021, "rel_sent": "Learning Neural Templates for Recommender Dialogue System.", "forward": false, "src_ids": "2021.emnlp-main.617_1615"}
{"input": "token - wise transition and emission probabilities is done by using Method| context: we study the problem of learning a named entity recognition ( ner ) tagger using noisy labels from multiple weak supervision sources . though cheap to obtain , the labels from weak supervision sources are often incomplete , inaccurate , and contradictory , making it difficult to learn an accurate ner model .", "entity": "token - wise transition and emission probabilities", "output": "conditional hidden markov model", "neg_sample": ["token - wise transition and emission probabilities is done by using Method", "we study the problem of learning a named entity recognition ( ner ) tagger using noisy labels from multiple weak supervision sources .", "though cheap to obtain , the labels from weak supervision sources are often incomplete , inaccurate , and contradictory , making it difficult to learn an accurate ner model ."], "relation": "used for", "id": "2021.acl-long.482", "year": 2021, "rel_sent": "Specifically , CHMM learns token - wise transition and emission probabilities from the BERT embeddings of the input tokens to infer the latent true labels from noisy observations .", "forward": false, "src_ids": "2021.acl-long.482_2183"}
{"input": "argumentation knowledge graph is used for Task| context: existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities . however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information .", "entity": "argumentation knowledge graph", "output": "identification of interactive argument pairs", "neg_sample": ["argumentation knowledge graph is used for Task", "existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities .", "however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information ."], "relation": "used for", "id": "2021.findings-acl.203", "year": 2021, "rel_sent": "Leveraging Argumentation Knowledge Graph for Interactive Argument Pair Identification.", "forward": true, "src_ids": "2021.findings-acl.203_3282"}
{"input": "time - aware methods is used for OtherScientificTerm| context: as languages evolve historically , making computational approaches sensitive to time can improve performance on specific tasks .", "entity": "time - aware methods", "output": "sense of polysemous words", "neg_sample": ["time - aware methods is used for OtherScientificTerm", "as languages evolve historically , making computational approaches sensitive to time can improve performance on specific tasks ."], "relation": "used for", "id": "2021.findings-acl.243", "year": 2021, "rel_sent": "In this work , we assess whether applying historical language models and time - aware methods help with determining the correct sense of polysemous words .", "forward": true, "src_ids": "2021.findings-acl.243_8965"}
{"input": "gpt-2 is done by using Method| context: fine - tuning is the de facto way of leveraging large pretrained language models for downstream tasks . however , fine - tuning modifies all the language model parameters and therefore necessitates storing a full copy for each task .", "entity": "gpt-2", "output": "prefix - tuning", "neg_sample": ["gpt-2 is done by using Method", "fine - tuning is the de facto way of leveraging large pretrained language models for downstream tasks .", "however , fine - tuning modifies all the language model parameters and therefore necessitates storing a full copy for each task ."], "relation": "used for", "id": "2021.acl-long.353", "year": 2021, "rel_sent": "We apply prefix - tuning to GPT-2 for table - to - text generation and to BART for summarization .", "forward": false, "src_ids": "2021.acl-long.353_11053"}
{"input": "( lower - order ) relations is done by using Method| context: relation extraction is a key task in knowledge extraction , and is commonly defined as the task of identifying relations that hold between entities in text .", "entity": "( lower - order ) relations", "output": "annotation schemes", "neg_sample": ["( lower - order ) relations is done by using Method", "relation extraction is a key task in knowledge extraction , and is commonly defined as the task of identifying relations that hold between entities in text ."], "relation": "used for", "id": "2021.eacl-srw.18", "year": 2021, "rel_sent": "More specifically , we aim to develop theoretical underpinnings and practical solutions for the challenges of ( 1 ) incorporating meta - relations into conceptualisations and annotation schemes for ( lower - order ) relations and named entities , ( 2 ) obtaining annotations for them with tolerable cognitive load on annotators , ( 3 ) creating models capable of reliably extracting meta - relations , and related to that ( 4 ) addressing the limited - data problem exacerbated by the introduction of meta - relations into the learning task .", "forward": false, "src_ids": "2021.eacl-srw.18_6805"}
{"input": "detection of hope speech is used for Task| context: the proliferation of hate speech and misinformation in social media is fast becoming a menace to society . in compliment , the dissemination of hate - diffusing , promising and anti - oppressive messages become a unique alternative . unfortunately , due to its complex nature as well as the relatively limited manifestation in comparison to hostile and neutral content , the identification of hope speech becomes a challenge .", "entity": "detection of hope speech", "output": "detection of hope speech", "neg_sample": ["detection of hope speech is used for Task", "the proliferation of hate speech and misinformation in social media is fast becoming a menace to society .", "in compliment , the dissemination of hate - diffusing , promising and anti - oppressive messages become a unique alternative .", "unfortunately , due to its complex nature as well as the relatively limited manifestation in comparison to hostile and neutral content , the identification of hope speech becomes a challenge ."], "relation": "used for", "id": "2021.ltedi-1.24", "year": 2021, "rel_sent": "This work revolves around the detection of Hope Speech in Youtube comments , for the Shared Task on Hope Speech Detection for Equality , Diversity , and Inclusion .", "forward": true, "src_ids": "2021.ltedi-1.24_2644"}
{"input": "two - stream transformer is done by using Method| context: recently , word enhancement has become very popular for chinese named entity recognition ( ner ) , reducing segmentation errors and increasing the semantic and boundary information of chinese words . however , these methods tend to ignore the information of the chinese character structure after integrating the lexical information . chinese characters have evolved from pictographs since ancient times , and their structure often reflects more information about the characters .", "entity": "two - stream transformer", "output": "multi - metadata embedding", "neg_sample": ["two - stream transformer is done by using Method", "recently , word enhancement has become very popular for chinese named entity recognition ( ner ) , reducing segmentation errors and increasing the semantic and boundary information of chinese words .", "however , these methods tend to ignore the information of the chinese character structure after integrating the lexical information .", "chinese characters have evolved from pictographs since ancient times , and their structure often reflects more information about the characters ."], "relation": "used for", "id": "2021.acl-long.121", "year": 2021, "rel_sent": "Specifically , we use multi - metadata embedding in a two - stream Transformer to integrate Chinese character features with the radical - level embedding .", "forward": false, "src_ids": "2021.acl-long.121_16068"}
{"input": "unified span - based formalism is used for Task| context: fine - grained opinion mining ( om ) has achieved increasing attraction in the natural language processing ( nlp ) community , which aims tofind the opinion structures of ' who expressed what opinions towards what ' in one sentence .", "entity": "unified span - based formalism", "output": "constituent parsing", "neg_sample": ["unified span - based formalism is used for Task", "fine - grained opinion mining ( om ) has achieved increasing attraction in the natural language processing ( nlp ) community , which aims tofind the opinion structures of ' who expressed what opinions towards what ' in one sentence ."], "relation": "used for", "id": "2021.naacl-main.144", "year": 2021, "rel_sent": "Furthermore , inspired by the unified span - based formalism of OM and constituent parsing , we explore two different methods ( multi - task learning and graph convolutional neural network ) to integrate syntactic constituents into the proposed model to help OM .", "forward": true, "src_ids": "2021.naacl-main.144_2410"}
{"input": "layer - wise probing is done by using Method| context: most of the recent works on probing representations have focused on bert , with the presumption that the findings might be similar to the other models . in this work , we extend the probing studies to two other models in the family , namely electra and xlnet , showing that variations in the pre - training objectives or architectural choices can result in different behaviors in encoding linguistic information in the representations .", "entity": "layer - wise probing", "output": "weight mixing evaluation strategy", "neg_sample": ["layer - wise probing is done by using Method", "most of the recent works on probing representations have focused on bert , with the presumption that the findings might be similar to the other models .", "in this work , we extend the probing studies to two other models in the family , namely electra and xlnet , showing that variations in the pre - training objectives or architectural choices can result in different behaviors in encoding linguistic information in the representations ."], "relation": "used for", "id": "2021.blackboxnlp-1.29", "year": 2021, "rel_sent": "Moreover , we show that drawing conclusions based on the weight mixing evaluation strategy - which is widely used in the context of layer - wise probing - can be misleading given the norm disparity of the representations across different layers .", "forward": false, "src_ids": "2021.blackboxnlp-1.29_12765"}
{"input": "parsing latin is done by using OtherScientificTerm| context: in much previous research on dependency parsing , related languages have successfully been used . however , when parsing latin , it has been suggested that languages such as ancient greek could be helpful .", "entity": "parsing latin", "output": "transfer languages", "neg_sample": ["parsing latin is done by using OtherScientificTerm", "in much previous research on dependency parsing , related languages have successfully been used .", "however , when parsing latin , it has been suggested that languages such as ancient greek could be helpful ."], "relation": "used for", "id": "2021.nodalida-main.32", "year": 2021, "rel_sent": "Investigation of Transfer Languages for Parsing Latin : Italic Branch vs. Hellenic Branch.", "forward": false, "src_ids": "2021.nodalida-main.32_13637"}
{"input": "self - attention patterns is used for Method| context: one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora . recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german . however , from the perspective of neural machine translation , this is still a tiny dataset .", "entity": "self - attention patterns", "output": "encoder and decoder of sign language translation models", "neg_sample": ["self - attention patterns is used for Method", "one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora .", "recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german .", "however , from the perspective of neural machine translation , this is still a tiny dataset ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.10", "year": 2021, "rel_sent": "Our results show that pretrained language models can be used to improve sign language translation performance and that the self - attention patterns in BERT transfer in zero - shot to the encoder and decoder of sign language translation models .", "forward": true, "src_ids": "2021.mtsummit-at4ssl.10_14433"}
{"input": "semantic textual similarity task is done by using Method| context: bert , which has been successfully applied to many types of natural language processing ( nlp ) tasks , is also effective with various information retrieval ( ir ) tasks . however , it is not easy to obtain appropriate data for fine - tuning a bert model .", "entity": "semantic textual similarity task", "output": "pretrained model", "neg_sample": ["semantic textual similarity task is done by using Method", "bert , which has been successfully applied to many types of natural language processing ( nlp ) tasks , is also effective with various information retrieval ( ir ) tasks .", "however , it is not easy to obtain appropriate data for fine - tuning a bert model ."], "relation": "used for", "id": "2021.paclic-1.11", "year": 2021, "rel_sent": "LS calculates the similarity of contextualized representations of the common words , encoded using a pretrained model for the semantic textual similarity task .", "forward": false, "src_ids": "2021.paclic-1.11_8545"}
{"input": "attentive soft - pooling approach is used for OtherScientificTerm| context: although several studies are proposed to solve this challenging task , none distinguishes the importance of different phrases with a word window . intuitively , informative phrases should be more useful for the prediction .", "entity": "attentive soft - pooling approach", "output": "sparse and redundant word representations", "neg_sample": ["attentive soft - pooling approach is used for OtherScientificTerm", "although several studies are proposed to solve this challenging task , none distinguishes the importance of different phrases with a word window .", "intuitively , informative phrases should be more useful for the prediction ."], "relation": "used for", "id": "2021.findings-acl.184", "year": 2021, "rel_sent": "In particular , we propose an attentive soft - pooling approach to compress the sparse and redundant word representations into informative and dense ones as local features .", "forward": true, "src_ids": "2021.findings-acl.184_8460"}
{"input": "discrete cosine transform is used for Method| context: modern sentence encoders are used to generate dense vector representations that capture the underlying linguistic characteristics for a sequence of words , including phrases , sentences , or paragraphs . these kinds of representations are ideal for training a classifier for an end task such as sentiment analysis , question answering and text classification . different models have been proposed to efficiently generate general purpose sentence representations to be used in pretraining protocols .", "entity": "discrete cosine transform", "output": "universal sentence encoder", "neg_sample": ["discrete cosine transform is used for Method", "modern sentence encoders are used to generate dense vector representations that capture the underlying linguistic characteristics for a sequence of words , including phrases , sentences , or paragraphs .", "these kinds of representations are ideal for training a classifier for an end task such as sentiment analysis , question answering and text classification .", "different models have been proposed to efficiently generate general purpose sentence representations to be used in pretraining protocols ."], "relation": "used for", "id": "2021.acl-short.53", "year": 2021, "rel_sent": "Discrete Cosine Transform as Universal Sentence Encoder.", "forward": true, "src_ids": "2021.acl-short.53_179"}
{"input": "hierarchical pointer network is used for Task| context: most of previous systems process these subtasks independently and ignore their interactions .", "entity": "hierarchical pointer network", "output": "argument identification", "neg_sample": ["hierarchical pointer network is used for Task", "most of previous systems process these subtasks independently and ignore their interactions ."], "relation": "used for", "id": "2021.findings-acl.227", "year": 2021, "rel_sent": "Moreover , we design a hierarchical pointer network for argument identification which reduces the computational complexity .", "forward": true, "src_ids": "2021.findings-acl.227_7637"}
{"input": "adapter tuning is used for Task| context: adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp . adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters .", "entity": "adapter tuning", "output": "multilingual neural machine translation", "neg_sample": ["adapter tuning is used for Task", "adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp .", "adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters ."], "relation": "used for", "id": "2021.acl-short.103", "year": 2021, "rel_sent": "While adapter tuning was investigated for multilingual neural machine translation , this paper proposes a comprehensive analysis of adapters for multilingual speech translation ( ST ) .", "forward": true, "src_ids": "2021.acl-short.103_12587"}
{"input": "intent detection task is done by using Method| context: recent research considers few - shot intent detection as a meta - learning problem : the model is learning to learn from a consecutive set of small tasks named episodes .", "entity": "intent detection task", "output": "meta - learning algorithm", "neg_sample": ["intent detection task is done by using Method", "recent research considers few - shot intent detection as a meta - learning problem : the model is learning to learn from a consecutive set of small tasks named episodes ."], "relation": "used for", "id": "2021.acl-long.191", "year": 2021, "rel_sent": "In this work , we propose ProtAugment , a meta - learning algorithm for short texts classification ( the intent detection task ) .", "forward": false, "src_ids": "2021.acl-long.191_7620"}
{"input": "out - of - domain generalization is done by using Method| context: data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures .", "entity": "out - of - domain generalization", "output": "dart", "neg_sample": ["out - of - domain generalization is done by using Method", "data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures ."], "relation": "used for", "id": "2021.naacl-main.37", "year": 2021, "rel_sent": "We present systematic evaluation on DART as well as new state - of - the - art results on WebNLG 2017 to show that DART ( 1 ) poses new challenges to existing data - to - text datasets and ( 2 ) facilitates out - of - domain generalization .", "forward": false, "src_ids": "2021.naacl-main.37_669"}
{"input": "early exit methods is used for Method| context: pre - trained language models like bert are performant in a wide range of natural language tasks . however , they are resource exhaustive and computationally expensive for industrial scenarios . thus , early exits are adopted at each layer of bert to perform adaptive computation by predicting easier samples with the first few layers to speed up the inference .", "entity": "early exit methods", "output": "pre - trained models", "neg_sample": ["early exit methods is used for Method", "pre - trained language models like bert are performant in a wide range of natural language tasks .", "however , they are resource exhaustive and computationally expensive for industrial scenarios .", "thus , early exits are adopted at each layer of bert to perform adaptive computation by predicting easier samples with the first few layers to speed up the inference ."], "relation": "used for", "id": "2021.acl-long.231", "year": 2021, "rel_sent": "Experiments on the GLUE benchmark show that our proposed methods improve the performance of the state - of - the - art ( SOTA ) early exit methods for pre - trained models .", "forward": true, "src_ids": "2021.acl-long.231_15131"}
{"input": "cross - corpora abusive language detection is done by using Task| context: the state - of - the - art abusive language detection models report great in - corpus performance , but underperform when evaluated on abusive comments that differ from the training scenario . as human annotation involves substantial time and effort , models that can adapt to newly collected comments can prove to be useful .", "entity": "cross - corpora abusive language detection", "output": "unsupervised domain adaptation", "neg_sample": ["cross - corpora abusive language detection is done by using Task", "the state - of - the - art abusive language detection models report great in - corpus performance , but underperform when evaluated on abusive comments that differ from the training scenario .", "as human annotation involves substantial time and effort , models that can adapt to newly collected comments can prove to be useful ."], "relation": "used for", "id": "2021.socialnlp-1.10", "year": 2021, "rel_sent": "Unsupervised Domain Adaptation in Cross - corpora Abusive Language Detection.", "forward": false, "src_ids": "2021.socialnlp-1.10_11123"}
{"input": "stance - aware aggregation model is used for Task| context: existing approaches typically ( i ) explore the semantic interaction between the claim and evidence at different granularity levels but fail to capture their topical consistency during the reasoning process , which we believe is crucial for verification ; ( ii ) aggregate multiple pieces of evidence equally without considering their implicit stances to the claim , thereby introducing spurious information .", "entity": "stance - aware aggregation model", "output": "fact verification", "neg_sample": ["stance - aware aggregation model is used for Task", "existing approaches typically ( i ) explore the semantic interaction between the claim and evidence at different granularity levels but fail to capture their topical consistency during the reasoning process , which we believe is crucial for verification ; ( ii ) aggregate multiple pieces of evidence equally without considering their implicit stances to the claim , thereby introducing spurious information ."], "relation": "used for", "id": "2021.acl-long.128", "year": 2021, "rel_sent": "Topic - Aware Evidence Reasoning and Stance - Aware Aggregation for Fact Verification.", "forward": true, "src_ids": "2021.acl-long.128_11367"}
{"input": "event relation extraction is done by using Method| context: current event - centric knowledge graphs highly rely on explicit connectives to mine relations between events . unfortunately , due to the sparsity of connectives , these methods severely undermine the coverage of eventkgs . the lack of high - quality labelled corpora further exacerbates that problem .", "entity": "event relation extraction", "output": "knowledge projection paradigm", "neg_sample": ["event relation extraction is done by using Method", "current event - centric knowledge graphs highly rely on explicit connectives to mine relations between events .", "unfortunately , due to the sparsity of connectives , these methods severely undermine the coverage of eventkgs .", "the lack of high - quality labelled corpora further exacerbates that problem ."], "relation": "used for", "id": "2021.acl-long.60", "year": 2021, "rel_sent": "In this paper , we propose a knowledge projection paradigm for event relation extraction : projecting discourse knowledge to narratives by exploiting the commonalities between them .", "forward": false, "src_ids": "2021.acl-long.60_676"}
{"input": "neural architecture is used for OtherScientificTerm| context: capturing interactions among event arguments is an essential step towards robust event argument extraction ( eae ) .", "entity": "neural architecture", "output": "argument roles", "neg_sample": ["neural architecture is used for OtherScientificTerm", "capturing interactions among event arguments is an essential step towards robust event argument extraction ( eae ) ."], "relation": "used for", "id": "2021.acl-long.18", "year": 2021, "rel_sent": "A neural architecture with a novel Bi - directional Entity - level Recurrent Decoder ( BERD ) is proposed to generate argument roles by incorporating contextual entities ' argument role predictions , like a word - by - word text generation process , thereby distinguishing implicit argument distribution patterns within an event more accurately .", "forward": true, "src_ids": "2021.acl-long.18_9621"}
{"input": "fuzzy logic is used for OtherScientificTerm| context: domain - specific conceptual bases use key concepts to capture domain scope and relevant information . conceptual bases serve as a foundation for various downstream tasks , including ontology construction , information mapping , and analysis . however , building conceptual bases necessitates domain awareness and takes time . wikipedia navigational templates offer multiple articles on the same / similar domain . it is possible to use the templates to recognize fundamental concepts that shape the domain .", "entity": "fuzzy logic", "output": "crude conceptual base", "neg_sample": ["fuzzy logic is used for OtherScientificTerm", "domain - specific conceptual bases use key concepts to capture domain scope and relevant information .", "conceptual bases serve as a foundation for various downstream tasks , including ontology construction , information mapping , and analysis .", "however , building conceptual bases necessitates domain awareness and takes time .", "wikipedia navigational templates offer multiple articles on the same / similar domain .", "it is possible to use the templates to recognize fundamental concepts that shape the domain ."], "relation": "used for", "id": "2021.dash-1.1", "year": 2021, "rel_sent": "We filter important concepts using fuzzy logic on network metrics to create a crude conceptual base .", "forward": true, "src_ids": "2021.dash-1.1_13876"}
{"input": "pretrained language models is used for Task| context: in a world abounding in constant protests resulting from events like a global pandemic , climate change , religious or political conflicts , there has always been a need to detect events / protests before getting amplified by news media or social media .", "entity": "pretrained language models", "output": "multilingual protest detection", "neg_sample": ["pretrained language models is used for Task", "in a world abounding in constant protests resulting from events like a global pandemic , climate change , religious or political conflicts , there has always been a need to detect events / protests before getting amplified by news media or social media ."], "relation": "used for", "id": "2021.case-1.13", "year": 2021, "rel_sent": "IIITT at CASE 2021 Task 1 : Leveraging Pretrained Language Models for Multilingual Protest Detection.", "forward": true, "src_ids": "2021.case-1.13_1811"}
{"input": "triggers is done by using Method| context: recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks , which are input - agnostic sequences of words added to text processed by classifiers . despite being successful , the word sequences produced in such attacks are often ungrammatical and can be easily distinguished from natural text . we develop adversarial attacks that appear closer to natural english phrases and yet confuse classification systems when added to benign inputs .", "entity": "triggers", "output": "adversarially regularized autoencoder ( arae )", "neg_sample": ["triggers is done by using Method", "recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks , which are input - agnostic sequences of words added to text processed by classifiers .", "despite being successful , the word sequences produced in such attacks are often ungrammatical and can be easily distinguished from natural text .", "we develop adversarial attacks that appear closer to natural english phrases and yet confuse classification systems when added to benign inputs ."], "relation": "used for", "id": "2021.naacl-main.291", "year": 2021, "rel_sent": "We leverage an adversarially regularized autoencoder ( ARAE ) to generate triggers and propose a gradient - based search that aims to maximize the downstream classifier 's prediction loss .", "forward": false, "src_ids": "2021.naacl-main.291_9157"}
{"input": "mbart is done by using Method| context: current benchmark tasks for natural language processing contain text that is qualitatively different from the text used in informal day to day digital communication . this discrepancy has led to severe performance degradation of state - of - the - art nlp models when fine - tuned on real - world data . one way to resolve this issue is through lexical normalization , which is the process of transforming non - standard text , usually from social media , into a more standardized form .", "entity": "mbart", "output": "multi - lingual pre - training", "neg_sample": ["mbart is done by using Method", "current benchmark tasks for natural language processing contain text that is qualitatively different from the text used in informal day to day digital communication .", "this discrepancy has led to severe performance degradation of state - of - the - art nlp models when fine - tuned on real - world data .", "one way to resolve this issue is through lexical normalization , which is the process of transforming non - standard text , usually from social media , into a more standardized form ."], "relation": "used for", "id": "2021.wnut-1.53", "year": 2021, "rel_sent": "As the noisy text is a pervasive problem across languages , not just English , we leverage the multi - lingual pre - training of mBART tofine - tune it to our data .", "forward": false, "src_ids": "2021.wnut-1.53_1371"}
{"input": "transfer learning is used for Task| context: the explosive growth of music libraries has made music information retrieval and recommendation a critical issue . recommendation systems based on music emotion recognition are gradually gaining attention . most of the studies focus on audio data rather than lyrics to build models of music emotion classification . in addition , because of the richness of english language resources , most of the existing studies are focused on english lyrics but rarely on chinese .", "entity": "transfer learning", "output": "emotion classification task", "neg_sample": ["transfer learning is used for Task", "the explosive growth of music libraries has made music information retrieval and recommendation a critical issue .", "recommendation systems based on music emotion recognition are gradually gaining attention .", "most of the studies focus on audio data rather than lyrics to build models of music emotion classification .", "in addition , because of the richness of english language resources , most of the existing studies are focused on english lyrics but rarely on chinese ."], "relation": "used for", "id": "2021.rocling-1.2", "year": 2021, "rel_sent": "For this reason , We propose an approach that uses the BERT pretraining model and Transfer learning to improve the emotion classification task of Chinese lyrics .", "forward": true, "src_ids": "2021.rocling-1.2_15885"}
{"input": "machine - learnt language understanding tasks is done by using OtherScientificTerm| context: entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants . however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent . previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs .", "entity": "machine - learnt language understanding tasks", "output": "linguistic observation", "neg_sample": ["machine - learnt language understanding tasks is done by using OtherScientificTerm", "entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants .", "however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent .", "previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs ."], "relation": "used for", "id": "2021.nlp4convai-1.22", "year": 2021, "rel_sent": "We demonstrate that the linguistic observation on pauses can be used to improve accuracy in machine - learnt language understanding tasks .", "forward": false, "src_ids": "2021.nlp4convai-1.22_714"}
{"input": "personalized transformer model is done by using Method| context: adapting a model to a handful of personalized data is challenging , especially when it has gigantic parameters , such as a transformerbased pretrained model . the standard way of fine - tuning all the parameters necessitates storing a huge model for each user .", "entity": "personalized transformer model", "output": "useradapter", "neg_sample": ["personalized transformer model is done by using Method", "adapting a model to a handful of personalized data is challenging , especially when it has gigantic parameters , such as a transformerbased pretrained model .", "the standard way of fine - tuning all the parameters necessitates storing a huge model for each user ."], "relation": "used for", "id": "2021.findings-acl.129", "year": 2021, "rel_sent": "More importantly , UserAdapter offers an efficient way to produce a personalized Transformer model with less than 0.5 % parameters added for each user .", "forward": false, "src_ids": "2021.findings-acl.129_11727"}
{"input": "recovery of knowledge is done by using OtherScientificTerm| context: recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains . however , at the same time , this can require a large amount of annotated target examples .", "entity": "recovery of knowledge", "output": "scaffolding loss", "neg_sample": ["recovery of knowledge is done by using OtherScientificTerm", "recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains .", "however , at the same time , this can require a large amount of annotated target examples ."], "relation": "used for", "id": "2021.crac-1.13", "year": 2021, "rel_sent": "We develop methods to improve the span representations via ( 1 ) a retrofitting loss to incentivize span representations to satisfy a knowledge - based distance function and ( 2 ) a scaffolding loss to guide the recovery of knowledge from the span representation .", "forward": false, "src_ids": "2021.crac-1.13_12332"}
{"input": "segmentation is done by using OtherScientificTerm| context: dravidian languages , such as kannada and tamil , are notoriously difficult to translate by state - of - the - art neural models . this stems from the fact that these languages are morphologically very rich as well as being low - resourced .", "entity": "segmentation", "output": "sentencepiece ( sp )", "neg_sample": ["segmentation is done by using OtherScientificTerm", "dravidian languages , such as kannada and tamil , are notoriously difficult to translate by state - of - the - art neural models .", "this stems from the fact that these languages are morphologically very rich as well as being low - resourced ."], "relation": "used for", "id": "2021.wat-1.21", "year": 2021, "rel_sent": "We find that SP is the overall best choice for segmentation , and that larger dictionary sizes lead to higher translation quality .", "forward": false, "src_ids": "2021.wat-1.21_15206"}
{"input": "neural network models is done by using OtherScientificTerm| context: recent studies strive to incorporate various human rationales into neural networks to improve model performance , but few pay attention to the quality of the rationales . most existing methods distribute their models ' focus to distantly - labeled rationale words entirely and equally , while ignoring the potential important non - rationale words and not distinguishing the importance of different rationale words .", "entity": "neural network models", "output": "distantly - labeled rationales", "neg_sample": ["neural network models is done by using OtherScientificTerm", "recent studies strive to incorporate various human rationales into neural networks to improve model performance , but few pay attention to the quality of the rationales .", "most existing methods distribute their models ' focus to distantly - labeled rationale words entirely and equally , while ignoring the potential important non - rationale words and not distinguishing the importance of different rationale words ."], "relation": "used for", "id": "2021.acl-long.433", "year": 2021, "rel_sent": "Exploring Distantly - Labeled Rationales in Neural Network Models.", "forward": false, "src_ids": "2021.acl-long.433_2889"}
{"input": "data augmentation is used for Task| context: users of medical question answering systems often submit long and detailed questions , making it hard to achieve high recall in answer retrieval .", "entity": "data augmentation", "output": "medical question understanding", "neg_sample": ["data augmentation is used for Task", "users of medical question answering systems often submit long and detailed questions , making it hard to achieve high recall in answer retrieval ."], "relation": "used for", "id": "2021.acl-long.119", "year": 2021, "rel_sent": "To alleviate this problem , we propose a novel Multi - Task Learning ( MTL ) method with data augmentation for medical question understanding .", "forward": true, "src_ids": "2021.acl-long.119_1215"}
{"input": "translation system is used for OtherScientificTerm| context: many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task . while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem .", "entity": "translation system", "output": "task - specific knowledge", "neg_sample": ["translation system is used for OtherScientificTerm", "many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task .", "while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem ."], "relation": "used for", "id": "2021.emnlp-main.727", "year": 2021, "rel_sent": "To this end , we propose an alignment - free label projection method to obtain high - quality pseudo - labeled data of the target language with the help of the translation system , which could preserve more accurate task - specific knowledge in the target language .", "forward": true, "src_ids": "2021.emnlp-main.727_189"}
{"input": "neural speech recognition language models is done by using Method| context: language modeling ( lm ) for automatic speech recognition ( asr ) does not usually incorporate utterance level contextual information . for some domains like voice assistants , however , additional context , such as time at which an utterance was spoken , provides a rich input signal .", "entity": "neural speech recognition language models", "output": "attention mechanism", "neg_sample": ["neural speech recognition language models is done by using Method", "language modeling ( lm ) for automatic speech recognition ( asr ) does not usually incorporate utterance level contextual information .", "for some domains like voice assistants , however , additional context , such as time at which an utterance was spoken , provides a rich input signal ."], "relation": "used for", "id": "2021.findings-acl.175", "year": 2021, "rel_sent": "We introduce an attention mechanism for training neural speech recognition language models on both text and nonlinguistic contextual data 1 .", "forward": false, "src_ids": "2021.findings-acl.175_12696"}
{"input": "sociolinguistics is used for Method| context: the field of nlp has made substantial progress in building meaning representations . however , an important aspect of linguistic meaning , social meaning , has been largely overlooked .", "entity": "sociolinguistics", "output": "representation learning", "neg_sample": ["sociolinguistics is used for Method", "the field of nlp has made substantial progress in building meaning representations .", "however , an important aspect of linguistic meaning , social meaning , has been largely overlooked ."], "relation": "used for", "id": "2021.naacl-main.50", "year": 2021, "rel_sent": "We introduce the concept of social meaning to NLP and discuss how insights from sociolinguistics can inform work on representation learning in NLP .", "forward": true, "src_ids": "2021.naacl-main.50_11090"}
{"input": "propagation layout is done by using Method| context: existing rumor detection strategies typically provide detection labels while ignoring their explanation . nonetheless , providing pieces of evidence to explain why a suspicious tweet is rumor is essential .", "entity": "propagation layout", "output": "heterogeneous graph objects", "neg_sample": ["propagation layout is done by using Method", "existing rumor detection strategies typically provide detection labels while ignoring their explanation .", "nonetheless , providing pieces of evidence to explain why a suspicious tweet is rumor is essential ."], "relation": "used for", "id": "2021.findings-acl.63", "year": 2021, "rel_sent": "LOSIRD then automatically constructs two heterogeneous graph objects to simulate the propagation layout of the tweets and code the relationship of evidence .", "forward": false, "src_ids": "2021.findings-acl.63_6543"}
{"input": "pair comparisons is done by using Method| context: recognising if a relation holds between two entities in a text plays a vital role in information extraction . to address this problem , multiple models have been proposed based on fixed or contextualised word representations .", "entity": "pair comparisons", "output": "neural architectures", "neg_sample": ["pair comparisons is done by using Method", "recognising if a relation holds between two entities in a text plays a vital role in information extraction .", "to address this problem , multiple models have been proposed based on fixed or contextualised word representations ."], "relation": "used for", "id": "2021.semdeep-1.4", "year": 2021, "rel_sent": "We grounded our strategy in recent neural architectures that allow single sentence classification as well as pair comparisons .", "forward": false, "src_ids": "2021.semdeep-1.4_13585"}
{"input": "mrc models is used for OtherScientificTerm| context: recent studies report that many machine reading comprehension ( mrc ) models can perform closely to or even better than humans on benchmark datasets . however , existing works indicate that many mrc models may learn shortcuts to outwit these benchmarks , but the performance is unsatisfactory in real - world applications .", "entity": "mrc models", "output": "shortcut questions", "neg_sample": ["mrc models is used for OtherScientificTerm", "recent studies report that many machine reading comprehension ( mrc ) models can perform closely to or even better than humans on benchmark datasets .", "however , existing works indicate that many mrc models may learn shortcuts to outwit these benchmarks , but the performance is unsatisfactory in real - world applications ."], "relation": "used for", "id": "2021.findings-acl.85", "year": 2021, "rel_sent": "A thorough empirical analysis shows that MRC models tend to learn shortcut questions earlier than challenging questions , and the high proportions of shortcut questions in training sets hinder models from exploring the sophisticated reasoning skills in the later stage of training .", "forward": true, "src_ids": "2021.findings-acl.85_9176"}
{"input": "facetsum is used for Method| context: faceted summarization provides briefings of a document from different perspectives . readers can quickly comprehend the main points of a long document with the help of a structured outline . however , little research has been conducted on this subject , partially due to the lack of large - scale faceted summarization datasets .", "entity": "facetsum", "output": "nlp systems", "neg_sample": ["facetsum is used for Method", "faceted summarization provides briefings of a document from different perspectives .", "readers can quickly comprehend the main points of a long document with the help of a structured outline .", "however , little research has been conducted on this subject , partially due to the lack of large - scale faceted summarization datasets ."], "relation": "used for", "id": "2021.acl-short.137", "year": 2021, "rel_sent": "We believe FacetSum will spur further advances in summarization research and foster the development of NLP systems that can leverage the structured information in both long texts and summaries .", "forward": true, "src_ids": "2021.acl-short.137_3889"}
{"input": "selecting erroneous samples is done by using Method| context: customers of machine learning systems demand accountability from the companies employing these algorithms for various prediction tasks . accountability requires understanding of system limit and condition of erroneous predictions , as customers are often interested in understanding the incorrect predictions , and model developers are absorbed in finding methods that can be used to get incremental improvements to an existing system .", "entity": "selecting erroneous samples", "output": "aec", "neg_sample": ["selecting erroneous samples is done by using Method", "customers of machine learning systems demand accountability from the companies employing these algorithms for various prediction tasks .", "accountability requires understanding of system limit and condition of erroneous predictions , as customers are often interested in understanding the incorrect predictions , and model developers are absorbed in finding methods that can be used to get incremental improvements to an existing system ."], "relation": "used for", "id": "2021.trustnlp-1.4", "year": 2021, "rel_sent": "Our results on the sample sentiment task show that AEC is able to characterize erroneous predictions into human understandable categories and also achieves promising results on selecting erroneous samples when compared with the uncertainty - based sampling .", "forward": false, "src_ids": "2021.trustnlp-1.4_15754"}
{"input": "statistical methods is used for OtherScientificTerm| context: aimed at generating a seed lexicon for use in downstream natural language tasks and unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently . while interesting and fully unsupervised settings are unrealistic ; small amounts of bilingual data are usually available due to the existence of massively multilingual parallel corpora and or linguists can create small amounts of parallel data .", "entity": "statistical methods", "output": "translation pairs", "neg_sample": ["statistical methods is used for OtherScientificTerm", "aimed at generating a seed lexicon for use in downstream natural language tasks and unsupervised methods for bilingual lexicon induction have received much attention in the academic literature recently .", "while interesting and fully unsupervised settings are unrealistic ; small amounts of bilingual data are usually available due to the existence of massively multilingual parallel corpora and or linguists can create small amounts of parallel data ."], "relation": "used for", "id": "2021.mtsummit-research.24", "year": 2021, "rel_sent": "Whereas statistical methods are highly effective at inducing correct translation pairs for words frequently occurring in a parallel corpus and monolingual embedding spaces have the advantage of having been trained on large amounts of data and and therefore may induce accurate translations for words absent from the small corpus .", "forward": true, "src_ids": "2021.mtsummit-research.24_6967"}
{"input": "self - supervised fine - tuning is used for Method| context: a common approach in many machine learning algorithms involves self - supervised learning on large unlabeled data before fine - tuning on downstream tasks tofurther improve performance . a new approach for language modelling , called dynamic evaluation , further fine - tunes a trained model during inference using trivially - present ground - truth labels , giving a large improvement in performance . however , this approach does not easily extend to classification tasks , where ground - truth labels are absent during inference .", "entity": "self - supervised fine - tuning", "output": "classifier model", "neg_sample": ["self - supervised fine - tuning is used for Method", "a common approach in many machine learning algorithms involves self - supervised learning on large unlabeled data before fine - tuning on downstream tasks tofurther improve performance .", "a new approach for language modelling , called dynamic evaluation , further fine - tunes a trained model during inference using trivially - present ground - truth labels , giving a large improvement in performance .", "however , this approach does not easily extend to classification tasks , where ground - truth labels are absent during inference ."], "relation": "used for", "id": "2021.eacl-main.6", "year": 2021, "rel_sent": "Our proposed method outperforms previous approaches , enables self - supervised fine - tuning during inference of any classifier model to better adapt to target domains , can be easily adapted to any model , and is also effective in online and transfer - learning settings .", "forward": true, "src_ids": "2021.eacl-main.6_3575"}
{"input": "clustering is used for OtherScientificTerm| context: the words in a single morphological paradigm are different inflectional variants of an underlying lemma , meaning that the words share a common core meaning . they also - usually - show a high degree of orthographical similarity .", "entity": "clustering", "output": "morphological mechanisms", "neg_sample": ["clustering is used for OtherScientificTerm", "the words in a single morphological paradigm are different inflectional variants of an underlying lemma , meaning that the words share a common core meaning .", "they also - usually - show a high degree of orthographical similarity ."], "relation": "used for", "id": "2021.sigmorphon-1.10", "year": 2021, "rel_sent": "For all development languages , the character - based embeddings perform similarly to the baseline , and the semantic embeddings perform well below the baseline . Analysis of the systems ' errors suggests that clustering based on orthographic representations is suitable for a wide range of morphological mechanisms , particularly as part of a larger system .", "forward": true, "src_ids": "2021.sigmorphon-1.10_5868"}
{"input": "soft constraints is used for OtherScientificTerm| context: visual dialog is a vision - language task where an agent needs to answer a series of questions grounded in an image based on the understanding of the dialog history and the image . the occurrences of coreference relations in the dialog makes it a more challenging task than visual question - answering .", "entity": "soft constraints", "output": "coreferences", "neg_sample": ["soft constraints is used for OtherScientificTerm", "visual dialog is a vision - language task where an agent needs to answer a series of questions grounded in an image based on the understanding of the dialog history and the image .", "the occurrences of coreference relations in the dialog makes it a more challenging task than visual question - answering ."], "relation": "used for", "id": "2021.eacl-main.290", "year": 2021, "rel_sent": "In this paper , based on linguistic knowledge and discourse features of human dialog we propose two soft constraints that can improve the model 's ability of resolving coreferences in dialog in an unsupervised way .", "forward": true, "src_ids": "2021.eacl-main.290_14013"}
{"input": "statistical analysis is used for OtherScientificTerm| context: this paper introduces mediasum , a large - scale media interview dataset consisting of 463.6 k transcripts with abstractive summaries .", "entity": "statistical analysis", "output": "positional bias", "neg_sample": ["statistical analysis is used for OtherScientificTerm", "this paper introduces mediasum , a large - scale media interview dataset consisting of 463.6 k transcripts with abstractive summaries ."], "relation": "used for", "id": "2021.naacl-main.474", "year": 2021, "rel_sent": "We conduct statistical analysis to demonstrate the unique positional bias exhibited in the transcripts of televised and radioed interviews .", "forward": true, "src_ids": "2021.naacl-main.474_9301"}
{"input": "syntactic and a semantic graph structure is done by using Method| context: nlp has a rich history of representing our prior understanding of language in the form of graphs . recent work on analyzing contextualized text representations has focused on hand - designed probe models to understand how and to what extent do these representations encode a particular linguistic phenomenon . however , due to the inter - dependence of various phenomena and randomness of training probe models , detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem .", "entity": "syntactic and a semantic graph structure", "output": "bert models", "neg_sample": ["syntactic and a semantic graph structure is done by using Method", "nlp has a rich history of representing our prior understanding of language in the form of graphs .", "recent work on analyzing contextualized text representations has focused on hand - designed probe models to understand how and to what extent do these representations encode a particular linguistic phenomenon .", "however , due to the inter - dependence of various phenomena and randomness of training probe models , detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem ."], "relation": "used for", "id": "2021.acl-long.145", "year": 2021, "rel_sent": "Using these probes , we analyze the BERT models on its ability to encode a syntactic and a semantic graph structure , and find that these models encode to some degree both syntactic as well as semantic information ; albeit syntactic information to a greater extent .", "forward": false, "src_ids": "2021.acl-long.145_1330"}
{"input": "qa models is used for OtherScientificTerm| context: one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis . however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues .", "entity": "qa models", "output": "conversational context", "neg_sample": ["qa models is used for OtherScientificTerm", "one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis .", "however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues ."], "relation": "used for", "id": "2021.acl-long.478", "year": 2021, "rel_sent": "In this paper , we propose a novel framework , ExCorD ( Explicit guidance on how to resolve Conversational Dependency ) to enhance the abilities of QA models in comprehending conversational context .", "forward": true, "src_ids": "2021.acl-long.478_667"}
{"input": "intensity of persona characteristics is done by using Metric| context: to maintain utterance quality of a persona - aware dialog system , inappropriate utterances for the persona should be thoroughly filtered . in addition , practical utterance filtering requires the ability to select utterances based on the intensity of persona characteristics .", "entity": "intensity of persona characteristics", "output": "evaluation metrics", "neg_sample": ["intensity of persona characteristics is done by using Metric", "to maintain utterance quality of a persona - aware dialog system , inappropriate utterances for the persona should be thoroughly filtered .", "in addition , practical utterance filtering requires the ability to select utterances based on the intensity of persona characteristics ."], "relation": "used for", "id": "2021.sigdial-1.19", "year": 2021, "rel_sent": "Therefore , we are developing metrics that can be used to capture the intensity of persona characteristics and can be computed without references tailored to the evaluation targets .", "forward": false, "src_ids": "2021.sigdial-1.19_11347"}
{"input": "fake news detection is done by using Task| context: fake news with textual and visual contents has a better story - telling ability than text - only contents , and can be spread quickly with social media . people can be easily deceived by such fake news , and traditional expert identification is labor - intensive . therefore , automatic detection of multimodal fake news has become a new hot - spot issue . a shortcoming of existing approaches is their inability tofuse multimodality features effectively . they simply concatenate unimodal features without considering inter - modality relations .", "entity": "fake news detection", "output": "multimodal fusion", "neg_sample": ["fake news detection is done by using Task", "fake news with textual and visual contents has a better story - telling ability than text - only contents , and can be spread quickly with social media .", "people can be easily deceived by such fake news , and traditional expert identification is labor - intensive .", "therefore , automatic detection of multimodal fake news has become a new hot - spot issue .", "a shortcoming of existing approaches is their inability tofuse multimodality features effectively .", "they simply concatenate unimodal features without considering inter - modality relations ."], "relation": "used for", "id": "2021.findings-acl.226", "year": 2021, "rel_sent": "Multimodal Fusion with Co - Attention Networks for Fake News Detection.", "forward": false, "src_ids": "2021.findings-acl.226_2599"}
{"input": "amr parses is done by using Method| context: knowledge base question answering ( kbqa ) is an important task in natural language processing . existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets .", "entity": "amr parses", "output": "graph transformation approach", "neg_sample": ["amr parses is done by using Method", "knowledge base question answering ( kbqa ) is an important task in natural language processing .", "existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets ."], "relation": "used for", "id": "2021.findings-acl.339", "year": 2021, "rel_sent": "In this work , we propose Neuro - Symbolic Question Answering ( NSQA ) , a modular KBQA system , that leverages ( 1 ) Abstract Meaning Representation ( AMR ) parses for task - independent question understanding ; ( 2 ) a simple yet effective graph transformation approach to convert AMR parses into candidate logical queries that are aligned to the KB ; ( 3 ) a pipeline - based approach which integrates multiple , reusable modules that are trained specifically for their individual tasks ( semantic parser , entity and relationship linkers , and neuro - symbolic reasoner ) and do not require end - to - end training data .", "forward": false, "src_ids": "2021.findings-acl.339_10055"}
{"input": "cross - target generalization is done by using Material| context: cross - target generalization is a known problem in stance detection ( sd ) , where systems tend to perform poorly when exposed to targets unseen during training . given that data annotation is expensive and time - consuming , finding ways to leverage abundant unlabeled in - domain data can offer great benefits .", "entity": "cross - target generalization", "output": "synthetic examples", "neg_sample": ["cross - target generalization is done by using Material", "cross - target generalization is a known problem in stance detection ( sd ) , where systems tend to perform poorly when exposed to targets unseen during training .", "given that data annotation is expensive and time - consuming , finding ways to leverage abundant unlabeled in - domain data can offer great benefits ."], "relation": "used for", "id": "2021.wassa-1.19", "year": 2021, "rel_sent": "Synthetic Examples Improve Cross - Target Generalization : A Study on Stance Detection on a Twitter corpus ..", "forward": false, "src_ids": "2021.wassa-1.19_14543"}
{"input": "tkgs is done by using Method| context: static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years . recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged .", "entity": "tkgs", "output": "skge models", "neg_sample": ["tkgs is done by using Method", "static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years .", "recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged ."], "relation": "used for", "id": "2021.naacl-main.451", "year": 2021, "rel_sent": "In this paper , we propose a Recursive Temporal Fact Embedding ( RTFE ) framework to transplant SKGE models to TKGs and to enhance the performance of existing TKGE models for TKG completion .", "forward": false, "src_ids": "2021.naacl-main.451_11845"}
{"input": "asr models is used for Material| context: to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al . , 2020 ; xu et al . , 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al . , 2018 ) under different training settings .", "entity": "asr models", "output": "non - native english speakers", "neg_sample": ["asr models is used for Material", "to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al .", ", 2020 ; xu et al .", ", 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al .", ", 2018 ) under different training settings ."], "relation": "used for", "id": "2021.icnlsp-1.2", "year": 2021, "rel_sent": "Our experiments demonstrate the promise of developing ASR models for non - native English speakers , even with small amounts of L2 training data and even without a language model .", "forward": true, "src_ids": "2021.icnlsp-1.2_11364"}
{"input": "automatic classification of misogynistic content is done by using Material| context: online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women .", "entity": "automatic classification of misogynistic content", "output": "expert labelled dataset", "neg_sample": ["automatic classification of misogynistic content is done by using Material", "online misogyny is a pernicious social problem that risks making online platforms toxic and unwelcoming to women ."], "relation": "used for", "id": "2021.eacl-main.114", "year": 2021, "rel_sent": "We present a new hierarchical taxonomy for online misogyny , as well as an expert labelled dataset to enable automatic classification of misogynistic content .", "forward": false, "src_ids": "2021.eacl-main.114_13913"}
{"input": "domain knowledge is done by using Method| context: transformer - based neural language models have led to breakthroughs for a variety of natural language processing ( nlp ) tasks . however , most models are pretrained on general domain data .", "entity": "domain knowledge", "output": "entity - centric masking strategy", "neg_sample": ["domain knowledge is done by using Method", "transformer - based neural language models have led to breakthroughs for a variety of natural language processing ( nlp ) tasks .", "however , most models are pretrained on general domain data ."], "relation": "used for", "id": "2021.bionlp-1.21", "year": 2021, "rel_sent": "We propose a methodology to produce a model focused on the clinical domain : continued pretraining of a model with a broad representation of biomedical terminology ( PubMedBERT ) on a clinical corpus along with a novel entity - centric masking strategy to infuse domain knowledge in the learning process .", "forward": false, "src_ids": "2021.bionlp-1.21_15083"}
{"input": "state - of - theart model is used for Method| context: language modeling ( lm ) for automatic speech recognition ( asr ) does not usually incorporate utterance level contextual information . for some domains like voice assistants , however , additional context , such as time at which an utterance was spoken , provides a rich input signal .", "entity": "state - of - theart model", "output": "contextual lm", "neg_sample": ["state - of - theart model is used for Method", "language modeling ( lm ) for automatic speech recognition ( asr ) does not usually incorporate utterance level contextual information .", "for some domains like voice assistants , however , additional context , such as time at which an utterance was spoken , provides a rich input signal ."], "relation": "used for", "id": "2021.findings-acl.175", "year": 2021, "rel_sent": "When evaluated on utterances extracted from the long tail of the dataset , our method improves perplexity by 9.0 % relative over a standard LM and by over 2.8 % relative when compared to a state - of - theart model for contextual LM .", "forward": true, "src_ids": "2021.findings-acl.175_12698"}
{"input": "entailment is done by using Task| context: large scale language models encode rich commonsense knowledge acquired through exposure to massive data during pre - training , but their understanding of entities and their semantic properties is unclear .", "entity": "entailment", "output": "fine - tuning setting", "neg_sample": ["entailment is done by using Task", "large scale language models encode rich commonsense knowledge acquired through exposure to massive data during pre - training , but their understanding of entities and their semantic properties is unclear ."], "relation": "used for", "id": "2021.blackboxnlp-1.7", "year": 2021, "rel_sent": "Finally , we show that when tested in a fine - tuning setting addressing entailment , BERT successfully leverages the information needed for reasoning about the meaning of adjective - noun constructions outperforming previous methods .", "forward": false, "src_ids": "2021.blackboxnlp-1.7_3486"}
{"input": "contextualized knowledge structures is used for Task| context: however , these methods rely on quality and contextualized knowledge structures ( i.e. , fact triples ) that are retrieved at the pre - processing stage but overlook challenges caused by incompleteness of a kg , limited expressiveness of its relations , and retrieved facts irrelevant to the reasoning context .", "entity": "contextualized knowledge structures", "output": "commonsense reasoning", "neg_sample": ["contextualized knowledge structures is used for Task", "however , these methods rely on quality and contextualized knowledge structures ( i.e.", ", fact triples ) that are retrieved at the pre - processing stage but overlook challenges caused by incompleteness of a kg , limited expressiveness of its relations , and retrieved facts irrelevant to the reasoning context ."], "relation": "used for", "id": "2021.findings-acl.354", "year": 2021, "rel_sent": "Learning Contextualized Knowledge Structures for Commonsense Reasoning.", "forward": true, "src_ids": "2021.findings-acl.354_11118"}
{"input": "social impact tools is done by using Task| context: the ongoing covid-19 pandemic resulted in significant ramifications for international relations ranging from travel restrictions , global ceasefires , and international vaccine production and sharing agreements . amidst a wave of infections in india that resulted in a systemic breakdown of healthcare infrastructure , a social welfare organization based in pakistan offered to procure medical - grade oxygen to assist india - a nation which was involved in four wars with pakistan in the past few decades . while # indianeedsoxygen and # pakistanstandswithindia featured among the top - trending hashtags in pakistan , divisive hashtags such as # endiasaysorrytokashmir simultaneously started trending . against the backdrop of a contentious history including four wars , divisive content of this nature , especially when a country is facing an unprecedented healthcare crisis , fuels further deterioration of relations .", "entity": "social impact tools", "output": "nlp", "neg_sample": ["social impact tools is done by using Task", "the ongoing covid-19 pandemic resulted in significant ramifications for international relations ranging from travel restrictions , global ceasefires , and international vaccine production and sharing agreements .", "amidst a wave of infections in india that resulted in a systemic breakdown of healthcare infrastructure , a social welfare organization based in pakistan offered to procure medical - grade oxygen to assist india - a nation which was involved in four wars with pakistan in the past few decades .", "while # indianeedsoxygen and # pakistanstandswithindia featured among the top - trending hashtags in pakistan , divisive hashtags such as # endiasaysorrytokashmir simultaneously started trending .", "against the backdrop of a contentious history including four wars , divisive content of this nature , especially when a country is facing an unprecedented healthcare crisis , fuels further deterioration of relations ."], "relation": "used for", "id": "2021.nlp4posimpact-1.14", "year": 2021, "rel_sent": "In this paper , we define a new task of detecting supportive content and demonstrate that existing NLP for social impact tools can be effectively harnessed for such tasks within a quick turnaround time .", "forward": false, "src_ids": "2021.nlp4posimpact-1.14_13227"}
{"input": "english and dutch models is done by using Method| context: the field of explainable ai has recently seen an explosion in the number of explanation methods for highly non - linear deep neural networks . the extent to which such methods - that are often proposed and tested in the domain of computer vision - are appropriate to address the explainability challenges in nlp is yet relatively unexplored .", "entity": "english and dutch models", "output": "contextual decomposition", "neg_sample": ["english and dutch models is done by using Method", "the field of explainable ai has recently seen an explosion in the number of explanation methods for highly non - linear deep neural networks .", "the extent to which such methods - that are often proposed and tested in the domain of computer vision - are appropriate to address the explainability challenges in nlp is yet relatively unexplored ."], "relation": "used for", "id": "2021.deelio-1.13", "year": 2021, "rel_sent": "In particular , using CD , we show that the English and Dutch models demonstrate similar processing behaviour , but that under the hood there are consistent differences between our attention and non - attention models .", "forward": false, "src_ids": "2021.deelio-1.13_1011"}
{"input": "three - module pipeline approach is used for OtherScientificTerm| context: warning : this paper contains content that may be offensive or upsetting . countermeasures to effectively fight the ever increasing hate speech online without blocking freedom of speech is of great social interest . natural language generation ( nlg ) , is uniquely capable of developing scalable solutions . however , off - the - shelf nlg methods are primarily sequence - to - sequence neural models and they are limited in that they generate commonplace , repetitive and safe responses regardless of the hate speech ( e.g. , ' please refrain from using such language . ' ) or irrelevant responses , making them ineffective for de - escalating hateful conversations .", "entity": "three - module pipeline approach", "output": "diversity", "neg_sample": ["three - module pipeline approach is used for OtherScientificTerm", "warning : this paper contains content that may be offensive or upsetting .", "countermeasures to effectively fight the ever increasing hate speech online without blocking freedom of speech is of great social interest .", "natural language generation ( nlg ) , is uniquely capable of developing scalable solutions .", "however , off - the - shelf nlg methods are primarily sequence - to - sequence neural models and they are limited in that they generate commonplace , repetitive and safe responses regardless of the hate speech ( e.g.", ", ' please refrain from using such language . '", ") or irrelevant responses , making them ineffective for de - escalating hateful conversations ."], "relation": "used for", "id": "2021.findings-acl.12", "year": 2021, "rel_sent": "In this paper , we design a three - module pipeline approach to effectively improve the diversity and relevance .", "forward": true, "src_ids": "2021.findings-acl.12_9740"}
{"input": "open - domain chatbot is done by using Material| context: building open - domain chatbots is a challenging area for machine learning research . while prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results , we highlight other ingredients . good conversation requires blended skills : providing engaging talking points , and displaying knowledge , empathy and personality appropriately , while maintaining a consistent persona .", "entity": "open - domain chatbot", "output": "recipes", "neg_sample": ["open - domain chatbot is done by using Material", "building open - domain chatbots is a challenging area for machine learning research .", "while prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results , we highlight other ingredients .", "good conversation requires blended skills : providing engaging talking points , and displaying knowledge , empathy and personality appropriately , while maintaining a consistent persona ."], "relation": "used for", "id": "2021.eacl-main.24", "year": 2021, "rel_sent": "Recipes for Building an Open - Domain Chatbot.", "forward": false, "src_ids": "2021.eacl-main.24_8808"}
{"input": "discourse - aware information is used for Method| context: despite the recent advances in applying pre - trained language models to generate high - quality texts , generating long passages that maintain long - range coherence is yet challenging for these models .", "entity": "discourse - aware information", "output": "discrete latent representations", "neg_sample": ["discourse - aware information is used for Method", "despite the recent advances in applying pre - trained language models to generate high - quality texts , generating long passages that maintain long - range coherence is yet challenging for these models ."], "relation": "used for", "id": "2021.emnlp-main.347", "year": 2021, "rel_sent": "Tofurther embed discourse - aware information into the discrete latent representations , we introduce an auxiliary objective to model the discourse relations within the text .", "forward": true, "src_ids": "2021.emnlp-main.347_3179"}
{"input": "minimum spanning tree is used for OtherScientificTerm| context: we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains . recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks . while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone .", "entity": "minimum spanning tree", "output": "statistical dependence structure", "neg_sample": ["minimum spanning tree is used for OtherScientificTerm", "we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains .", "recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks .", "while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone ."], "relation": "used for", "id": "2021.naacl-main.404", "year": 2021, "rel_sent": "In an unsupervised parsing evaluation , simply forming a minimum spanning tree on the implied statistical dependence structure outperforms a classic method for unsupervised parsing ( 58.74 vs. 55.91 UUAS ) .", "forward": true, "src_ids": "2021.naacl-main.404_6945"}
{"input": "pragmatic reasoning is used for OtherScientificTerm| context: the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments . we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation . different lines of work in neural language generation investigated decoding methods for generating more diverse utterances , or increasing the informativity through pragmatic reasoning .", "entity": "pragmatic reasoning", "output": "lexical diversity", "neg_sample": ["pragmatic reasoning is used for OtherScientificTerm", "the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments .", "we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation .", "different lines of work in neural language generation investigated decoding methods for generating more diverse utterances , or increasing the informativity through pragmatic reasoning ."], "relation": "used for", "id": "2021.sigdial-1.43", "year": 2021, "rel_sent": "We find that boosting diversity itself does not result in more pragmatically informative captions , but pragmatic reasoning does increase lexical diversity .", "forward": true, "src_ids": "2021.sigdial-1.43_624"}
{"input": "sensitive information is done by using Method| context: large transformers pretrained over clinical notes from electronic health records ( ehr ) have afforded substantial gains in performance on predictive clinical tasks . would it be safe to release the weights of such a model if they did ?", "entity": "sensitive information", "output": "bert", "neg_sample": ["sensitive information is done by using Method", "large transformers pretrained over clinical notes from electronic health records ( ehr ) have afforded substantial gains in performance on predictive clinical tasks .", "would it be safe to release the weights of such a model if they did ?"], "relation": "used for", "id": "2021.naacl-main.73", "year": 2021, "rel_sent": "Does BERT Pretrained on Clinical Notes Reveal Sensitive Data ?.", "forward": false, "src_ids": "2021.naacl-main.73_13603"}
{"input": "common functions is done by using OtherScientificTerm| context: humans create things for a reason . ancient people created spears for hunting , knives for cutting meat , pots for preparing food , etc . the prototypical function of a physical artifact is a kind of commonsense knowledge that we rely on to understand natural language . for example , if someone says ' she borrowed the book ' then you would assume that she intends to read the book , or if someone asks ' can i use your knife ? ' then you would assume that they need to cut something .", "entity": "common functions", "output": "frames", "neg_sample": ["common functions is done by using OtherScientificTerm", "humans create things for a reason .", "ancient people created spears for hunting , knives for cutting meat , pots for preparing food , etc .", "the prototypical function of a physical artifact is a kind of commonsense knowledge that we rely on to understand natural language .", "for example , if someone says ' she borrowed the book ' then you would assume that she intends to read the book , or if someone asks ' can i use your knife ? '", "then you would assume that they need to cut something ."], "relation": "used for", "id": "2021.acl-long.540", "year": 2021, "rel_sent": "We use frames from FrameNet to represent a set of common functions for objects , and describe a manually annotated data set of physical objects labeled with their prototypical function .", "forward": false, "src_ids": "2021.acl-long.540_13431"}
{"input": "neural sequence model is used for Task| context: presentation slides generated from original research papers provide an efficient form to present research innovations . manually generating presentation slides is labor - intensive .", "entity": "neural sequence model", "output": "extractive summarization", "neg_sample": ["neural sequence model is used for Task", "presentation slides generated from original research papers provide an efficient form to present research innovations .", "manually generating presentation slides is labor - intensive ."], "relation": "used for", "id": "2021.sdp-1.11", "year": 2021, "rel_sent": "The sentence labeling module of our method is based on SummaRuNNer , a neural sequence model for extractive summarization .", "forward": true, "src_ids": "2021.sdp-1.11_6048"}
{"input": "arguments is done by using Task| context: twitter is a popular platform to share opinions and claims , which may be accompanied by the underlying rationale . such information can be invaluable to policy makers , marketers and social scientists , to name a few . however , the effort to mine arguments on twitter has been limited , mainly because a tweet is typically too short to contain an argument - both a claim and a premise .", "entity": "arguments", "output": "problem formulation", "neg_sample": ["arguments is done by using Task", "twitter is a popular platform to share opinions and claims , which may be accompanied by the underlying rationale .", "such information can be invaluable to policy makers , marketers and social scientists , to name a few .", "however , the effort to mine arguments on twitter has been limited , mainly because a tweet is typically too short to contain an argument - both a claim and a premise ."], "relation": "used for", "id": "2021.argmining-1.1", "year": 2021, "rel_sent": "In this paper , we propose a novel problem formulation to mine arguments from Twitter : We formulate argument mining on Twitter as a text classification task to identify tweets that serve as premises for a hashtag that represents a claim of interest .", "forward": false, "src_ids": "2021.argmining-1.1_596"}
{"input": "inverted projection is used for Task| context: traditional translation systems trained on written documents perform well for text - based translation but not as well for speech - based applications .", "entity": "inverted projection", "output": "speech translation", "neg_sample": ["inverted projection is used for Task", "traditional translation systems trained on written documents perform well for text - based translation but not as well for speech - based applications ."], "relation": "used for", "id": "2021.iwslt-1.28", "year": 2021, "rel_sent": "Inverted Projection for Robust Speech Translation.", "forward": true, "src_ids": "2021.iwslt-1.28_13761"}
{"input": "toxic spans is done by using Method| context: the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever . detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms .", "entity": "toxic spans", "output": "sequence labeling models", "neg_sample": ["toxic spans is done by using Method", "the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever .", "detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms ."], "relation": "used for", "id": "2021.semeval-1.135", "year": 2021, "rel_sent": "We explore an ensemble of sequence labeling models including the BiLSTM - CRF , spaCy NER model with custom toxic tags , and fine - tuned BERT model to identify the toxic spans .", "forward": false, "src_ids": "2021.semeval-1.135_4200"}
{"input": "explicit and explainable numerical reasoning is done by using OtherScientificTerm| context: automatic math problem solving has recently attracted increasing attention as a longstanding ai benchmark . however , the existing methods were highly dependent on handcraft rules and were merely evaluated on small - scale datasets .", "entity": "explicit and explainable numerical reasoning", "output": "program annotations", "neg_sample": ["explicit and explainable numerical reasoning is done by using OtherScientificTerm", "automatic math problem solving has recently attracted increasing attention as a longstanding ai benchmark .", "however , the existing methods were highly dependent on handcraft rules and were merely evaluated on small - scale datasets ."], "relation": "used for", "id": "2021.findings-acl.46", "year": 2021, "rel_sent": "Compared with another publicly available dataset GeoS , GeoQA is 25 times larger , in which the program annotations can provide a practical testbed for future research on explicit and explainable numerical reasoning .", "forward": false, "src_ids": "2021.findings-acl.46_6604"}
{"input": "deep learning on graph techniques is used for Task| context: due to its great power in modeling non - euclidean data like graphs or manifolds , deep learning on graph techniques ( i.e. , graph neural networks ( gnns ) ) have opened a new door to solving challenging graph - related nlp problems . there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization . despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges .", "entity": "deep learning on graph techniques", "output": "natural language processing", "neg_sample": ["deep learning on graph techniques is used for Task", "due to its great power in modeling non - euclidean data like graphs or manifolds , deep learning on graph techniques ( i.e.", ", graph neural networks ( gnns ) ) have opened a new door to solving challenging graph - related nlp problems .", "there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization .", "despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges ."], "relation": "used for", "id": "2021.naacl-tutorials.3", "year": 2021, "rel_sent": "Deep Learning on Graphs for Natural Language Processing.", "forward": true, "src_ids": "2021.naacl-tutorials.3_8178"}
{"input": "event mentions is done by using OtherScientificTerm| context: a missing part in the current deep learning models for finetemprel is their failure to exploit the syntactic structures of the input sentences to enrich the representation vectors .", "entity": "event mentions", "output": "context words", "neg_sample": ["event mentions is done by using OtherScientificTerm", "a missing part in the current deep learning models for finetemprel is their failure to exploit the syntactic structures of the input sentences to enrich the representation vectors ."], "relation": "used for", "id": "2021.wnut-1.5", "year": 2021, "rel_sent": "The proposed model focuses on two types of syntactic information from the dependency trees , i.e. , the syntax - based importance scores for representation learning of the words and the syntactic connections to identify important context words for the event mentions .", "forward": false, "src_ids": "2021.wnut-1.5_3375"}
{"input": "salient contents is done by using Method| context: millions of hashtags are created on social media every day to cross - refer messages concerning similar topics .", "entity": "salient contents", "output": "personalized topic attention", "neg_sample": ["salient contents is done by using Method", "millions of hashtags are created on social media every day to cross - refer messages concerning similar topics ."], "relation": "used for", "id": "2021.emnlp-main.616", "year": 2021, "rel_sent": "Furthermore , we propose a novel personalized topic attention to capture salient contents to personalize hashtag contexts .", "forward": false, "src_ids": "2021.emnlp-main.616_2374"}
{"input": "meta - learning framework is used for Task| context: meta - learning has emerged as a trending technique to tackle few - shot text classification and achieved state - of - the - art performance . however , existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data , while neglecting to strengthen the model 's ability to adapt to new tasks .", "entity": "meta - learning framework", "output": "text embedding", "neg_sample": ["meta - learning framework is used for Task", "meta - learning has emerged as a trending technique to tackle few - shot text classification and achieved state - of - the - art performance .", "however , existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data , while neglecting to strengthen the model 's ability to adapt to new tasks ."], "relation": "used for", "id": "2021.findings-acl.145", "year": 2021, "rel_sent": "In this paper , we propose a novel meta - learning framework integrated with an adversarial domain adaptation network , aiming to improve the adaptive ability of the model and generate high - quality text embedding for new classes .", "forward": true, "src_ids": "2021.findings-acl.145_10010"}
{"input": "empathetic response generation is done by using Method| context: the capacity of empathy is crucial to the success of open - domain dialog systems . due to its nature of multi - dimensionality , there are various factors that relate to empathy expression , such as communication mechanism , dialog act and emotion . however , existing methods for empathetic response generation usually either consider only one empathy factor or ignore the hierarchical relationships between different factors , leading to a weak ability of empathy modeling .", "entity": "empathetic response generation", "output": "multi - factor hierarchical framework", "neg_sample": ["empathetic response generation is done by using Method", "the capacity of empathy is crucial to the success of open - domain dialog systems .", "due to its nature of multi - dimensionality , there are various factors that relate to empathy expression , such as communication mechanism , dialog act and emotion .", "however , existing methods for empathetic response generation usually either consider only one empathy factor or ignore the hierarchical relationships between different factors , leading to a weak ability of empathy modeling ."], "relation": "used for", "id": "2021.findings-acl.72", "year": 2021, "rel_sent": "CoMAE : A Multi - factor Hierarchical Framework for Empathetic Response Generation.", "forward": false, "src_ids": "2021.findings-acl.72_14616"}
{"input": "dialogue scenarios is done by using Method| context: one of the difficulties in training dialogue systems is the lack of training data .", "entity": "dialogue scenarios", "output": "modelling framework", "neg_sample": ["dialogue scenarios is done by using Method", "one of the difficulties in training dialogue systems is the lack of training data ."], "relation": "used for", "id": "2021.acl-long.13", "year": 2021, "rel_sent": "Our goal is to develop a modelling framework that can incorporate new dialogue scenarios through self - play between the two agents .", "forward": false, "src_ids": "2021.acl-long.13_10120"}
{"input": "multi - task learning framework is used for Task| context: rhetorical implicit emotion identification is one of important and challenging tasks in natural language processing . we observe that each rhetoric may express certain evidence of semantic and syntactic patterns .", "entity": "multi - task learning framework", "output": "rhetoric and emotion identification problem", "neg_sample": ["multi - task learning framework is used for Task", "rhetorical implicit emotion identification is one of important and challenging tasks in natural language processing .", "we observe that each rhetoric may express certain evidence of semantic and syntactic patterns ."], "relation": "used for", "id": "2021.findings-acl.123", "year": 2021, "rel_sent": "We thus propose a new multi - task learning framework that can encode the categorical correlation between tasks to improve the performance of rhetoric and emotion identification problem .", "forward": true, "src_ids": "2021.findings-acl.123_10559"}
{"input": "argument mining is used for Task| context: this survey builds an interdisciplinary picture of argument mining ( am ) , with a strong focus on its potential to address issues related to social and political science . more specifically , we focus on am challenges related to its applications to social media and in the multilingual domain , and then proceed to the widely debated notion of argument quality .", "entity": "argument mining", "output": "social good", "neg_sample": ["argument mining is used for Task", "this survey builds an interdisciplinary picture of argument mining ( am ) , with a strong focus on its potential to address issues related to social and political science .", "more specifically , we focus on am challenges related to its applications to social media and in the multilingual domain , and then proceed to the widely debated notion of argument quality ."], "relation": "used for", "id": "2021.acl-long.107", "year": 2021, "rel_sent": "Towards Argument Mining for Social Good : A Survey.", "forward": true, "src_ids": "2021.acl-long.107_2274"}
{"input": "medical code prediction is done by using OtherScientificTerm| context: given the clinical notes written in electronic health records ( ehrs ) , it is challenging to predict the diagnostic codes which is formulated as a multi - label classification task . the large set of labels , the hierarchical dependency , and the imbalanced data make this prediction task extremely hard . most existing work built a binary prediction for each label independently , ignoring the dependencies between labels .", "entity": "medical code prediction", "output": "predictors", "neg_sample": ["medical code prediction is done by using OtherScientificTerm", "given the clinical notes written in electronic health records ( ehrs ) , it is challenging to predict the diagnostic codes which is formulated as a multi - label classification task .", "the large set of labels , the hierarchical dependency , and the imbalanced data make this prediction task extremely hard .", "most existing work built a binary prediction for each label independently , ignoring the dependencies between labels ."], "relation": "used for", "id": "2021.naacl-main.318", "year": 2021, "rel_sent": "In the experiments , our proposed framework is able to improve upon best - performing predictors for medical code prediction on the benchmark MIMIC datasets .", "forward": false, "src_ids": "2021.naacl-main.318_8388"}
{"input": "event extraction is done by using Method| context: unlike the previous causality detection task , we do not assign target events in the text , but only provide structural event descriptions , and such settings accord more with practice scenarios .", "entity": "event extraction", "output": "canonical methods", "neg_sample": ["event extraction is done by using Method", "unlike the previous causality detection task , we do not assign target events in the text , but only provide structural event descriptions , and such settings accord more with practice scenarios ."], "relation": "used for", "id": "2021.eacl-main.175", "year": 2021, "rel_sent": "We also provide the performance of existing canonical methods in event extraction and machine reading comprehension on this task .", "forward": false, "src_ids": "2021.eacl-main.175_639"}
{"input": "transformer is done by using Method| context: transformer architecture achieves great success in abundant natural language processing tasks . the over - parameterization of the transformer model has motivated plenty of works to alleviate its overfitting for superior performances . with some explorations , we find simple techniques such as dropout , can greatly boost model performance with a careful design .", "entity": "transformer", "output": "unidrop", "neg_sample": ["transformer is done by using Method", "transformer architecture achieves great success in abundant natural language processing tasks .", "the over - parameterization of the transformer model has motivated plenty of works to alleviate its overfitting for superior performances .", "with some explorations , we find simple techniques such as dropout , can greatly boost model performance with a careful design ."], "relation": "used for", "id": "2021.naacl-main.302", "year": 2021, "rel_sent": "UniDrop : A Simple yet Effective Technique to Improve Transformer without Extra Cost.", "forward": false, "src_ids": "2021.naacl-main.302_1021"}
{"input": "multi - task structure is used for Task| context: opinion target extraction and opinion term extraction are twofundamental tasks in aspect based sentiment analysis ( absa ) . many recent works on absa focus on target - oriented opinion words ( or terms ) extraction ( towe ) , which aims at extracting the corresponding opinion words for a given opinion target .", "entity": "multi - task structure", "output": "aspect - opinion pair extraction ( aope )", "neg_sample": ["multi - task structure is used for Task", "opinion target extraction and opinion term extraction are twofundamental tasks in aspect based sentiment analysis ( absa ) .", "many recent works on absa focus on target - oriented opinion words ( or terms ) extraction ( towe ) , which aims at extracting the corresponding opinion words for a given opinion target ."], "relation": "used for", "id": "2021.naacl-main.145", "year": 2021, "rel_sent": "As a case study , we also develop a Multi - Task structure named MT - TSMSA for AOPE by combining our TSMSA with an aspect and opinion term extraction module .", "forward": true, "src_ids": "2021.naacl-main.145_11149"}
{"input": "zero - shot spoken language understanding is done by using Task| context: the lack of publicly available evaluation data for low - resource languages limits progress in spoken language understanding ( slu ) . as key tasks like intent classification and slot filling require abundant training data , it is desirable to reuse existing data in high - resource languages to develop models for low - resource scenarios .", "entity": "zero - shot spoken language understanding", "output": "non - english auxiliary tasks", "neg_sample": ["zero - shot spoken language understanding is done by using Task", "the lack of publicly available evaluation data for low - resource languages limits progress in spoken language understanding ( slu ) .", "as key tasks like intent classification and slot filling require abundant training data , it is desirable to reuse existing data in high - resource languages to develop models for low - resource scenarios ."], "relation": "used for", "id": "2021.naacl-main.197", "year": 2021, "rel_sent": "From Masked Language Modeling to Translation : Non - English Auxiliary Tasks Improve Zero - shot Spoken Language Understanding.", "forward": false, "src_ids": "2021.naacl-main.197_3852"}
{"input": "english is used for Method| context: in this paper we explore pos tagging for the scots language . scots is spoken in scotland and northern ireland , and is closely related to english . as no linguistically annotated scots data were available , we manually pos tagged a small set that is used for evaluation and training .", "entity": "english", "output": "transfer learning methods", "neg_sample": ["english is used for Method", "in this paper we explore pos tagging for the scots language .", "scots is spoken in scotland and northern ireland , and is closely related to english .", "as no linguistically annotated scots data were available , we manually pos tagged a small set that is used for evaluation and training ."], "relation": "used for", "id": "2021.vardial-1.5", "year": 2021, "rel_sent": "We use English as a transfer language to examine zero - shot transfer and transfer learning methods .", "forward": true, "src_ids": "2021.vardial-1.5_4798"}
{"input": "contextual information is used for OtherScientificTerm| context: a health outcome is a measurement or an observation used to capture and assess the effect of a treatment . automatic detection of health outcomes from text would undoubtedly speed up access to evidence necessary in healthcare decision making . prior work on outcome detection has modelled this task as either ( a ) a sequence labelling task , where the goal is to detect which text spans describe health outcomes , or ( b ) a classification task , where the goal is to classify a text into a predefined set of categories depending on an outcome that is mentioned somewhere in that text . however , this decoupling of span detection and classification is problematic from a modelling perspective and ignores global structural correspondences between sentence - level and word - level information present in a given text .", "entity": "contextual information", "output": "hidden vectors", "neg_sample": ["contextual information is used for OtherScientificTerm", "a health outcome is a measurement or an observation used to capture and assess the effect of a treatment .", "automatic detection of health outcomes from text would undoubtedly speed up access to evidence necessary in healthcare decision making .", "prior work on outcome detection has modelled this task as either ( a ) a sequence labelling task , where the goal is to detect which text spans describe health outcomes , or ( b ) a classification task , where the goal is to classify a text into a predefined set of categories depending on an outcome that is mentioned somewhere in that text .", "however , this decoupling of span detection and classification is problematic from a modelling perspective and ignores global structural correspondences between sentence - level and word - level information present in a given text ."], "relation": "used for", "id": "2021.emnlp-main.686", "year": 2021, "rel_sent": "In addition to injecting contextual information to hidden vectors , we use label attention to appropriately weight both word and sentence level information .", "forward": true, "src_ids": "2021.emnlp-main.686_14870"}
{"input": "downstream tasks is done by using Method| context: performing event and entity coreference resolution across documents vastly increases the number of candidate mentions , making it intractable to do the full n^2 pairwise comparisons . existing approaches simplify by considering coreference only within document clusters , but this fails to handle inter - cluster coreference , common in many applications . as a result cross - document coreference algorithms are rarely applied to downstream tasks .", "entity": "downstream tasks", "output": "coreference resolution model", "neg_sample": ["downstream tasks is done by using Method", "performing event and entity coreference resolution across documents vastly increases the number of candidate mentions , making it intractable to do the full n^2 pairwise comparisons .", "existing approaches simplify by considering coreference only within document clusters , but this fails to handle inter - cluster coreference , common in many applications .", "as a result cross - document coreference algorithms are rarely applied to downstream tasks ."], "relation": "used for", "id": "2021.emnlp-main.106", "year": 2021, "rel_sent": "Furthermore , training on multiple corpora improves average performance across all datasets by 17.2 F1 points , leading to a robust coreference resolution model that is now feasible to apply to downstream tasks .", "forward": false, "src_ids": "2021.emnlp-main.106_14419"}
{"input": "temporary ambiguity is done by using Method| context: temporary syntactic ambiguities arise when the beginning of a sentence is compatible with multiple syntactic analyses .", "entity": "temporary ambiguity", "output": "language models", "neg_sample": ["temporary ambiguity is done by using Method", "temporary syntactic ambiguities arise when the beginning of a sentence is compatible with multiple syntactic analyses ."], "relation": "used for", "id": "2021.blackboxnlp-1.4", "year": 2021, "rel_sent": "We apply this method to study the behavior of two LMs ( GPT2 and an LSTM ) on three types of temporary ambiguity , using materials from human sentence processing experiments .", "forward": false, "src_ids": "2021.blackboxnlp-1.4_6364"}
{"input": "encoder is used for Method| context: dense retrieval requires high - quality text sequence embeddings to support effective search in the representation space . autoencoder - based language models are appealing in dense retrieval as they train the encoder to output high - quality embedding that can reconstruct the input texts . however , in this paper , we provide theoretical analyses and show empirically that an autoencoder language model with a low reconstruction loss may not provide good sequence representations because the decoder may take shortcuts by exploiting language patterns .", "entity": "encoder", "output": "text representations", "neg_sample": ["encoder is used for Method", "dense retrieval requires high - quality text sequence embeddings to support effective search in the representation space .", "autoencoder - based language models are appealing in dense retrieval as they train the encoder to output high - quality embedding that can reconstruct the input texts .", "however , in this paper , we provide theoretical analyses and show empirically that an autoencoder language model with a low reconstruction loss may not provide good sequence representations because the decoder may take shortcuts by exploiting language patterns ."], "relation": "used for", "id": "2021.emnlp-main.220", "year": 2021, "rel_sent": "To address this , we propose a new self - learning method that pre - trains the autoencoder using a weak decoder , with restricted capacity and attention flexibility to push the encoder to provide better text representations .", "forward": true, "src_ids": "2021.emnlp-main.220_12930"}
{"input": "dependency parsers is used for Material| context: we present an empirical study that compares mention heads as annotated manually in four coreference datasets ( for dutch , english , polish , and russian ) on one hand , with heads induced from dependency trees parsed automatically , on the other hand .", "entity": "dependency parsers", "output": "coreference corpora", "neg_sample": ["dependency parsers is used for Material", "we present an empirical study that compares mention heads as annotated manually in four coreference datasets ( for dutch , english , polish , and russian ) on one hand , with heads induced from dependency trees parsed automatically , on the other hand ."], "relation": "used for", "id": "2021.depling-1.10", "year": 2021, "rel_sent": "This can be achieved with sufficient accuracy using modern dependency parsers even for coreference corpora that lack manual head annotation .", "forward": true, "src_ids": "2021.depling-1.10_2550"}
{"input": "visual information is used for Task| context: change captioning is to describe the difference in a pair of images with a natural language sentence . in this task , the distractors , such as the illumination or viewpoint change , bring the huge challenges about learning the difference representation .", "entity": "visual information", "output": "caption generation", "neg_sample": ["visual information is used for Task", "change captioning is to describe the difference in a pair of images with a natural language sentence .", "in this task , the distractors , such as the illumination or viewpoint change , bring the huge challenges about learning the difference representation ."], "relation": "used for", "id": "2021.findings-acl.6", "year": 2021, "rel_sent": "Besides , relying on the POS of words , we devise an attentionbased visual switch to dynamically use visual information for caption generation .", "forward": true, "src_ids": "2021.findings-acl.6_4487"}
{"input": "fine - grained chemistry ner is done by using Method| context: scientific literature analysis needs fine - grained named entity recognition ( ner ) to provide a wide range of information for scientific discovery . for example , chemistry research needs to study dozens to hundreds of distinct , fine - grained entity types , making consistent and accurate annotation difficult even for crowds of domain experts . on the other hand , domain - specific ontologies and knowledge bases ( kbs ) can be easily accessed , constructed , or integrated , which makes distant supervision realistic for fine - grained chemistry ner . in distant supervision , training labels are generated by matching mentions in a document with the concepts in the knowledge bases ( kbs ) . however , this kind of kb - matching suffers from two major challenges : incomplete annotation and noisy annotation .", "entity": "fine - grained chemistry ner", "output": "chemner", "neg_sample": ["fine - grained chemistry ner is done by using Method", "scientific literature analysis needs fine - grained named entity recognition ( ner ) to provide a wide range of information for scientific discovery .", "for example , chemistry research needs to study dozens to hundreds of distinct , fine - grained entity types , making consistent and accurate annotation difficult even for crowds of domain experts .", "on the other hand , domain - specific ontologies and knowledge bases ( kbs ) can be easily accessed , constructed , or integrated , which makes distant supervision realistic for fine - grained chemistry ner .", "in distant supervision , training labels are generated by matching mentions in a document with the concepts in the knowledge bases ( kbs ) .", "however , this kind of kb - matching suffers from two major challenges : incomplete annotation and noisy annotation ."], "relation": "used for", "id": "2021.emnlp-main.424", "year": 2021, "rel_sent": "We propose ChemNER , an ontology - guided , distantly - supervised method for fine - grained chemistry NER to tackle these challenges .", "forward": false, "src_ids": "2021.emnlp-main.424_5685"}
{"input": "diversity - aware batch active learning is used for Task| context: while the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert - annotated treebank data , not all annotations contribute equally to the training of the parsers .", "entity": "diversity - aware batch active learning", "output": "dependency parsing", "neg_sample": ["diversity - aware batch active learning is used for Task", "while the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert - annotated treebank data , not all annotations contribute equally to the training of the parsers ."], "relation": "used for", "id": "2021.naacl-main.207", "year": 2021, "rel_sent": "Diversity - Aware Batch Active Learning for Dependency Parsing.", "forward": true, "src_ids": "2021.naacl-main.207_167"}
{"input": "information flows is done by using Method| context: event coreference resolution is an important research problem with many applications . despite the recent remarkable success of pre - trained language models , we argue that it is still highly beneficial to utilize symbolic features for the task . however , as the input for coreference resolution typically comes from upstream components in the information extraction pipeline , the automatically extracted symbolic features can be noisy and contain errors . also , depending on the specific context , some features can be more informative than others .", "entity": "information flows", "output": "context - dependent gated module", "neg_sample": ["information flows is done by using Method", "event coreference resolution is an important research problem with many applications .", "despite the recent remarkable success of pre - trained language models , we argue that it is still highly beneficial to utilize symbolic features for the task .", "however , as the input for coreference resolution typically comes from upstream components in the information extraction pipeline , the automatically extracted symbolic features can be noisy and contain errors .", "also , depending on the specific context , some features can be more informative than others ."], "relation": "used for", "id": "2021.naacl-main.274", "year": 2021, "rel_sent": "Motivated by these observations , we propose a novel context - dependent gated module to adaptively control the information flows from the input symbolic features .", "forward": false, "src_ids": "2021.naacl-main.274_3124"}
{"input": "dyad classification is used for Task| context: understanding the origins of militarized conflict is a complex , yet important undertaking .", "entity": "dyad classification", "output": "militarized conflict analysis", "neg_sample": ["dyad classification is used for Task", "understanding the origins of militarized conflict is a complex , yet important undertaking ."], "relation": "used for", "id": "2021.emnlp-main.613", "year": 2021, "rel_sent": "Classifying Dyads for Militarized Conflict Analysis.", "forward": true, "src_ids": "2021.emnlp-main.613_8516"}
{"input": "dependency graph is used for Task| context: existing works on information extraction ( ie ) have mainly solved the four main tasks separately ( entity mention recognition , relation extraction , event trigger detection , and argument extraction ) , thus failing to benefit from inter - dependencies between tasks .", "entity": "dependency graph", "output": "ie tasks", "neg_sample": ["dependency graph is used for Task", "existing works on information extraction ( ie ) have mainly solved the four main tasks separately ( entity mention recognition , relation extraction , event trigger detection , and argument extraction ) , thus failing to benefit from inter - dependencies between tasks ."], "relation": "used for", "id": "2021.naacl-main.3", "year": 2021, "rel_sent": "Second , at the label level , we propose a dependency graph for the information types in the four IE tasks that captures the connections between the types expressed in an input sentence .", "forward": true, "src_ids": "2021.naacl-main.3_5663"}
{"input": "extracting clinical relations is done by using OtherScientificTerm| context: in recent years pre - trained language models ( plm ) such as bert have proven to be very effective in diverse nlp tasks such as information extraction , sentiment analysis and question answering . trained with massive general - domain text , these pre - trained language models capture rich syntactic , semantic and discourse information in the text . however , due to the differences between general and specific domain text ( e.g. , wikipedia versus clinic notes ) , these models may not be ideal for domain - specific tasks ( e.g. , extracting clinical relations ) .", "entity": "extracting clinical relations", "output": "medical knowledge", "neg_sample": ["extracting clinical relations is done by using OtherScientificTerm", "in recent years pre - trained language models ( plm ) such as bert have proven to be very effective in diverse nlp tasks such as information extraction , sentiment analysis and question answering .", "trained with massive general - domain text , these pre - trained language models capture rich syntactic , semantic and discourse information in the text .", "however , due to the differences between general and specific domain text ( e.g.", ", wikipedia versus clinic notes ) , these models may not be ideal for domain - specific tasks ( e.g.", ", extracting clinical relations ) ."], "relation": "used for", "id": "2021.emnlp-main.435", "year": 2021, "rel_sent": "Incorporating medical knowledge in BERT for clinical relation extraction.", "forward": false, "src_ids": "2021.emnlp-main.435_8868"}
{"input": "capacity models is used for Task| context: recent work has demonstrated the effectiveness of cross - lingual language model pretraining for cross - lingual understanding . in this study , we present the results of two larger multilingual masked language models , with 3.5b and 10.7b parameters .", "entity": "capacity models", "output": "language understanding", "neg_sample": ["capacity models is used for Task", "recent work has demonstrated the effectiveness of cross - lingual language model pretraining for cross - lingual understanding .", "in this study , we present the results of two larger multilingual masked language models , with 3.5b and 10.7b parameters ."], "relation": "used for", "id": "2021.repl4nlp-1.4", "year": 2021, "rel_sent": "This suggests larger capacity models for language understanding may obtain strong performance on high - resource languages while greatly improving low - resource languages .", "forward": true, "src_ids": "2021.repl4nlp-1.4_9891"}
{"input": "training strategies is used for OtherScientificTerm| context: variational autoencoders ( vaes ) are widely used for latent variable modeling of text . we focus on variations that learn expressive prior distributions over the latent variable .", "entity": "training strategies", "output": "rich priors", "neg_sample": ["training strategies is used for OtherScientificTerm", "variational autoencoders ( vaes ) are widely used for latent variable modeling of text .", "we focus on variations that learn expressive prior distributions over the latent variable ."], "relation": "used for", "id": "2021.naacl-main.259", "year": 2021, "rel_sent": "We find that existing training strategies are not effective for learning rich priors , so we propose adding the importance - sampled log marginal likelihood as a second term to the standard VAE objective to help when learning the prior .", "forward": true, "src_ids": "2021.naacl-main.259_9627"}
{"input": "shapley values is used for OtherScientificTerm| context: this paper discusses different approaches to the toxic spans detection task . as opposed to binary classification of entire texts , word - level assessment could be of great use during comment moderation , also allowing for a more in - depth comprehension of the model 's predictions .", "entity": "shapley values", "output": "bert predictions", "neg_sample": ["shapley values is used for OtherScientificTerm", "this paper discusses different approaches to the toxic spans detection task .", "as opposed to binary classification of entire texts , word - level assessment could be of great use during comment moderation , also allowing for a more in - depth comprehension of the model 's predictions ."], "relation": "used for", "id": "2021.semeval-1.114", "year": 2021, "rel_sent": "The work consists of two xAI approaches that automatically provide the explanation for models trained for binary classification of toxic documents : an LSTM model with attention as a model - specific approach and the Shapley values for interpreting BERT predictions as a model - agnostic method .", "forward": true, "src_ids": "2021.semeval-1.114_13650"}
{"input": "structured data augmentation is used for Task| context: we study a family of data augmentation methods , substructure substitution ( sub ) , that generalizes prior methods . sub generates new examples by substituting substructures ( e.g. , subtrees or subsequences ) with others having the same label .", "entity": "structured data augmentation", "output": "nlp", "neg_sample": ["structured data augmentation is used for Task", "we study a family of data augmentation methods , substructure substitution ( sub ) , that generalizes prior methods .", "sub generates new examples by substituting substructures ( e.g.", ", subtrees or subsequences ) with others having the same label ."], "relation": "used for", "id": "2021.findings-acl.307", "year": 2021, "rel_sent": "Substructure Substitution : Structured Data Augmentation for NLP.", "forward": true, "src_ids": "2021.findings-acl.307_7477"}
{"input": "election manifestos is used for Method| context: in an election campaign , political parties pledge to implement various projects - should they be elected . but do they follow through ? to track election pledges from parties ' election manifestos , we need to distinguish between pledges and general statements .", "entity": "election manifestos", "output": "neural models", "neg_sample": ["election manifestos is used for Method", "in an election campaign , political parties pledge to implement various projects - should they be elected .", "but do they follow through ?", "to track election pledges from parties ' election manifestos , we need to distinguish between pledges and general statements ."], "relation": "used for", "id": "2021.findings-acl.301", "year": 2021, "rel_sent": "In this paper , we use election manifestos of Swedish and Indian political parties to learn neural models that distinguish actual pledges from generic political positions .", "forward": true, "src_ids": "2021.findings-acl.301_463"}
{"input": "rationales is used for Method| context: pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks . an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models .", "entity": "rationales", "output": "vanilla bert", "neg_sample": ["rationales is used for Method", "pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks .", "an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models ."], "relation": "used for", "id": "2021.emnlp-main.645", "year": 2021, "rel_sent": "Using the rationales extracted from vanilla BERT and SaLoss models to train inherently faithful classifiers , we further show that the latter result in higher predictive performance in downstream tasks .", "forward": true, "src_ids": "2021.emnlp-main.645_8892"}
{"input": "feature sequence is used for Task| context: human language encompasses more than just text ; it also conveys emotions through tone and gestures .", "entity": "feature sequence", "output": "predicting sentiment", "neg_sample": ["feature sequence is used for Task", "human language encompasses more than just text ; it also conveys emotions through tone and gestures ."], "relation": "used for", "id": "2021.wassa-1.14", "year": 2021, "rel_sent": "The Late Fusion model merges unimodal features to create a multimodal feature sequence , the Round Robin model iteratively combines bimodal features using cross - modal attention , and the Hybrid Fusion model combines trimodal and unimodal features together toform a final feature sequence for predicting sentiment .", "forward": true, "src_ids": "2021.wassa-1.14_7652"}
{"input": "sinusoidal encodings is used for OtherScientificTerm| context: in order to preserve word - order information in a non - autoregressive setting , transformer architectures tend to include positional knowledge , by ( for instance ) adding positional encodings to token embeddings . several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture ; these include , for instance , separating position encodings and token embeddings , or directly modifying attention weights based on the distance between word pairs .", "entity": "sinusoidal encodings", "output": "compositionality", "neg_sample": ["sinusoidal encodings is used for OtherScientificTerm", "in order to preserve word - order information in a non - autoregressive setting , transformer architectures tend to include positional knowledge , by ( for instance ) adding positional encodings to token embeddings .", "several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture ; these include , for instance , separating position encodings and token embeddings , or directly modifying attention weights based on the distance between word pairs ."], "relation": "used for", "id": "2021.emnlp-main.59", "year": 2021, "rel_sent": "We then answer why that is : sinusoidal encodings were explicitly designed tofacilitate compositionality by allowing linear projections over arbitrary time steps .", "forward": true, "src_ids": "2021.emnlp-main.59_11594"}
{"input": "attention head masking technique is used for OtherScientificTerm| context: how can we effectively inform content selection in transformer - based abstractive summarization models ?", "entity": "attention head masking technique", "output": "encoder - decoder attentions", "neg_sample": ["attention head masking technique is used for OtherScientificTerm", "how can we effectively inform content selection in transformer - based abstractive summarization models ?"], "relation": "used for", "id": "2021.naacl-main.397", "year": 2021, "rel_sent": "In this work , we present a simple - yet - effective attention head masking technique , which is applied on encoder - decoder attentions to pinpoint salient content at inference time .", "forward": true, "src_ids": "2021.naacl-main.397_4211"}
{"input": "domain transfer is done by using Method| context: while neural networks produce state - of - the- art performance in several nlp tasks , they generally depend heavily on lexicalized information , which transfer poorly between domains . previous works have proposed delexicalization as a form of knowledge distillation to reduce the dependency on such lexical artifacts . however , a critical unsolved issue that remains is how much delexicalization to apply : a little helps reduce overfitting , but too much discards useful information .", "entity": "domain transfer", "output": "collective knowledge distillation", "neg_sample": ["domain transfer is done by using Method", "while neural networks produce state - of - the- art performance in several nlp tasks , they generally depend heavily on lexicalized information , which transfer poorly between domains .", "previous works have proposed delexicalization as a form of knowledge distillation to reduce the dependency on such lexical artifacts .", "however , a critical unsolved issue that remains is how much delexicalization to apply : a little helps reduce overfitting , but too much discards useful information ."], "relation": "used for", "id": "2021.emnlp-main.558", "year": 2021, "rel_sent": "Students Who Study Together Learn Better : On the Importance of Collective Knowledge Distillation for Domain Transfer in Fact Verification.", "forward": false, "src_ids": "2021.emnlp-main.558_7355"}
{"input": "information bottleneck principle is used for OtherScientificTerm| context: current abstractive summarization systems outperform their extractive counterparts , but their widespread adoption is inhibited by the inherent lack of interpretability . extractive summarization systems , though interpretable , suffer from redundancy and possible lack of coherence .", "entity": "information bottleneck principle", "output": "abstraction", "neg_sample": ["information bottleneck principle is used for OtherScientificTerm", "current abstractive summarization systems outperform their extractive counterparts , but their widespread adoption is inhibited by the inherent lack of interpretability .", "extractive summarization systems , though interpretable , suffer from redundancy and possible lack of coherence ."], "relation": "used for", "id": "2021.newsum-1.10", "year": 2021, "rel_sent": "We use the Information Bottleneck principle to jointly train the extraction and abstraction in an end - to - end fashion .", "forward": true, "src_ids": "2021.newsum-1.10_8944"}
{"input": "mention - neighbour hybrid attention is used for OtherScientificTerm| context: recently , the performance of pre - trained language models ( plms ) has been significantly improved by injecting knowledge facts to enhance their abilities of language understanding . for medical domains , the background knowledge sources are especially useful , due to the massive medical terms and their complicated relations are difficult to understand in text .", "entity": "mention - neighbour hybrid attention", "output": "heterogeneous - entity information", "neg_sample": ["mention - neighbour hybrid attention is used for OtherScientificTerm", "recently , the performance of pre - trained language models ( plms ) has been significantly improved by injecting knowledge facts to enhance their abilities of language understanding .", "for medical domains , the background knowledge sources are especially useful , due to the massive medical terms and their complicated relations are difficult to understand in text ."], "relation": "used for", "id": "2021.acl-long.457", "year": 2021, "rel_sent": "In SMedBERT , the mention - neighbour hybrid attention is proposed to learn heterogeneous - entity information , which infuses the semantic representations of entity types into the homogeneous neighbouring entity structure .", "forward": true, "src_ids": "2021.acl-long.457_15019"}
{"input": "maximal cliques is done by using Method| context: named entity recognition ( ner ) remains challenging when entity mentions can be discontinuous . existing methods break the recognition process into several sequential steps . in training , they predict conditioned on the golden intermediate results , while at inference relying on the model output of the previous steps , which introduces exposure bias .", "entity": "maximal cliques", "output": "non - parametric process", "neg_sample": ["maximal cliques is done by using Method", "named entity recognition ( ner ) remains challenging when entity mentions can be discontinuous .", "existing methods break the recognition process into several sequential steps .", "in training , they predict conditioned on the golden intermediate results , while at inference relying on the model output of the previous steps , which introduces exposure bias ."], "relation": "used for", "id": "2021.acl-long.63", "year": 2021, "rel_sent": "Then discontinuous NER can be reformulated as a non - parametric process of discovering maximal cliques in the graph and concatenating the spans in each clique .", "forward": false, "src_ids": "2021.acl-long.63_3468"}
{"input": "caption quality is done by using Task| context: automatic image captioning has improved significantly over the last few years , but the problem is far from being solved , with state of the art models still often producing low quality captions when used in the wild .", "entity": "caption quality", "output": "quality estimation ( qe )", "neg_sample": ["caption quality is done by using Task", "automatic image captioning has improved significantly over the last few years , but the problem is far from being solved , with state of the art models still often producing low quality captions when used in the wild ."], "relation": "used for", "id": "2021.naacl-main.253", "year": 2021, "rel_sent": "In this paper , we focus on the task of Quality Estimation ( QE ) for image captions , which attempts to model the caption quality from a human perspective and * without * access to ground - truth references , so that it can be applied at prediction time to detect low - quality captions produced on * previously unseen images * .", "forward": false, "src_ids": "2021.naacl-main.253_4906"}
{"input": "cleaner dataset is done by using Material| context: distant supervision for relation extraction provides uniform bag labels for each sentence inside the bag , while accurate sentence labels are important for downstream applications that need the exact relation type . directly using bag labels for sentence - level training will introduce much noise , thus severely degrading performance .", "entity": "cleaner dataset", "output": "noisy data", "neg_sample": ["cleaner dataset is done by using Material", "distant supervision for relation extraction provides uniform bag labels for each sentence inside the bag , while accurate sentence labels are important for downstream applications that need the exact relation type .", "directly using bag labels for sentence - level training will introduce much noise , thus severely degrading performance ."], "relation": "used for", "id": "2021.acl-long.484", "year": 2021, "rel_sent": "SENT not only filters the noisy data to construct a cleaner dataset , but also performs a re - labeling process to transform the noisy data into useful training data , thus further benefiting the model 's performance .", "forward": false, "src_ids": "2021.acl-long.484_6188"}
{"input": "openkg link prediction is done by using Method| context: open knowledge graphs ( openkg ) refer to a set of ( head noun phrase , relation phrase , tail noun phrase ) triples such as ( tesla , return to , new york ) extracted from a corpus using openie tools . while openkgs are easy to bootstrap for a domain , they are very sparse and far from being directly usable in an end task . therefore , the task of predicting new facts , i.e. , link prediction , becomes an important step while using these graphs in downstream tasks such as text comprehension , question answering , and web search query recommendation . learning embeddings for openkgs is one approach for link prediction that has received some attention lately .", "entity": "openkg link prediction", "output": "okgit", "neg_sample": ["openkg link prediction is done by using Method", "open knowledge graphs ( openkg ) refer to a set of ( head noun phrase , relation phrase , tail noun phrase ) triples such as ( tesla , return to , new york ) extracted from a corpus using openie tools .", "while openkgs are easy to bootstrap for a domain , they are very sparse and far from being directly usable in an end task .", "therefore , the task of predicting new facts , i.e.", ", link prediction , becomes an important step while using these graphs in downstream tasks such as text comprehension , question answering , and web search query recommendation .", "learning embeddings for openkgs is one approach for link prediction that has received some attention lately ."], "relation": "used for", "id": "2021.findings-acl.225", "year": 2021, "rel_sent": "We address this problem in this work and propose OKGIT that improves OpenKG link prediction using novel type compatibility score and type regularization .", "forward": false, "src_ids": "2021.findings-acl.225_11323"}
{"input": "topics is done by using Method| context: phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness .", "entity": "topics", "output": "phrase - based neural topic model", "neg_sample": ["topics is done by using Method", "phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness ."], "relation": "used for", "id": "2021.emnlp-main.846", "year": 2021, "rel_sent": "Finally , as a case study , we show that Phrase - BERT embeddings can be easily integrated with a simple autoencoder to build a phrase - based neural topic model that interprets topics as mixtures of words and phrases by performing a nearest neighbor search in the embedding space .", "forward": false, "src_ids": "2021.emnlp-main.846_14983"}
{"input": "small neural network is done by using OtherScientificTerm| context: natural language understanding is an important task in modern dialogue systems . it becomes more important with the rapid extension of the dialogue systems ' functionality .", "entity": "small neural network", "output": "embeddings", "neg_sample": ["small neural network is done by using OtherScientificTerm", "natural language understanding is an important task in modern dialogue systems .", "it becomes more important with the rapid extension of the dialogue systems ' functionality ."], "relation": "used for", "id": "2021.ranlp-1.25", "year": 2021, "rel_sent": "These embeddings then used by a small neural network to produce predictions for intent and slot probabilities .", "forward": false, "src_ids": "2021.ranlp-1.25_12968"}
{"input": "fine - grained semantic similarity is used for Task| context: automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions .", "entity": "fine - grained semantic similarity", "output": "candidate refinement", "neg_sample": ["fine - grained semantic similarity is used for Task", "automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions ."], "relation": "used for", "id": "2021.sigdial-1.44", "year": 2021, "rel_sent": "We propose two decoupled deep learning architectures trained for ( i ) candidate generation via text classification for a user question , and ( ii ) learning fine - grained semantic similarity between user questions and the FAQ repository for candidate refinement .", "forward": true, "src_ids": "2021.sigdial-1.44_14106"}
{"input": "bert is used for Method| context: unfortunately , how they make their predictions remains vastly unexplained , especially at the end - to - end , input - to - output level . little known is how tokens , layers , and passages precisely contribute to the final prediction .", "entity": "bert", "output": "reranking", "neg_sample": ["bert is used for Method", "unfortunately , how they make their predictions remains vastly unexplained , especially at the end - to - end , input - to - output level .", "little known is how tokens , layers , and passages precisely contribute to the final prediction ."], "relation": "used for", "id": "2021.blackboxnlp-1.39", "year": 2021, "rel_sent": "Overall , we find that BERT still cares about exact token matching for reranking ; the [ CLS ] token mainly gathers information for predictions at the last layer ; top - ranked passages are robust to token removal ; and BERT fine - tuned on MSMARCO has positional bias towards the start of the passage .", "forward": true, "src_ids": "2021.blackboxnlp-1.39_14058"}
{"input": "supervision signals is done by using OtherScientificTerm| context: most existing hmtc methods train classifiers using massive human - labeled documents , which are often too costly to obtain in real - world applications .", "entity": "supervision signals", "output": "class surface names", "neg_sample": ["supervision signals is done by using OtherScientificTerm", "most existing hmtc methods train classifiers using massive human - labeled documents , which are often too costly to obtain in real - world applications ."], "relation": "used for", "id": "2021.naacl-main.335", "year": 2021, "rel_sent": "In this paper , we explore to conduct HMTC based on only class surface names as supervision signals .", "forward": false, "src_ids": "2021.naacl-main.335_7305"}
{"input": "cognitively motivated method is used for Task| context: child language acquisition is famously accurate despite the sparsity of linguistic input .", "entity": "cognitively motivated method", "output": "morphological acquisition", "neg_sample": ["cognitively motivated method is used for Task", "child language acquisition is famously accurate despite the sparsity of linguistic input ."], "relation": "used for", "id": "2021.scil-1.17", "year": 2021, "rel_sent": "In this paper , we introduce a cognitively motivated method for morphological acquisition with a special focus on verbal inflections .", "forward": true, "src_ids": "2021.scil-1.17_12351"}
{"input": "sanskrit asr is used for Method| context: automatic speech recognition ( asr ) in sanskrit is interesting , owing to the various linguistic peculiarities present in the language . the sanskrit language is lexically productive , undergoes euphonic assimilation of phones at the word boundaries and exhibits variations in spelling conventions and in pronunciations .", "entity": "sanskrit asr", "output": "asr systems", "neg_sample": ["sanskrit asr is used for Method", "automatic speech recognition ( asr ) in sanskrit is interesting , owing to the various linguistic peculiarities present in the language .", "the sanskrit language is lexically productive , undergoes euphonic assimilation of phones at the word boundaries and exhibits variations in spelling conventions and in pronunciations ."], "relation": "used for", "id": "2021.findings-acl.447", "year": 2021, "rel_sent": "Finally , we extend these insights from Sanskrit ASR for building ASR systems in two other Indic languages , Gujarati and Telugu .", "forward": true, "src_ids": "2021.findings-acl.447_10844"}
{"input": "persuasive techniques is done by using Task| context: memes are one of the most popular types of content used to spread information online . they can influence a large number of people through rhetorical and psychological techniques .", "entity": "persuasive techniques", "output": "detection of persuasion techniques", "neg_sample": ["persuasive techniques is done by using Task", "memes are one of the most popular types of content used to spread information online .", "they can influence a large number of people through rhetorical and psychological techniques ."], "relation": "used for", "id": "2021.semeval-1.149", "year": 2021, "rel_sent": "The task , Detection of Persuasion Techniques in Texts and Images , is to detect these persuasive techniques in memes .", "forward": false, "src_ids": "2021.semeval-1.149_947"}
{"input": "moderation is done by using Metric| context: human moderation is commonly employed in deliberative contexts ( argumentation and discussion targeting a shared decision on an issue relevant to a group , e.g. , citizens arguing on how to employ a shared budget ) . as the scale of discussion enlarges in online settings , the overall discussion quality risks to drop and moderation becomes more important to assist participants in having a cooperative and productive interaction . to prioritize when moderation is most needed .", "entity": "moderation", "output": "argument quality", "neg_sample": ["moderation is done by using Metric", "human moderation is commonly employed in deliberative contexts ( argumentation and discussion targeting a shared decision on an issue relevant to a group , e.g.", ", citizens arguing on how to employ a shared budget ) .", "as the scale of discussion enlarges in online settings , the overall discussion quality risks to drop and moderation becomes more important to assist participants in having a cooperative and productive interaction .", "to prioritize when moderation is most needed ."], "relation": "used for", "id": "2021.argmining-1.13", "year": 2021, "rel_sent": "We further investigate whether argument quality is a key indicator of the need for moderation , showing that surprisingly , high quality arguments also trigger moderation .", "forward": false, "src_ids": "2021.argmining-1.13_9381"}
{"input": "entangled semantics is done by using Method| context: difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class .", "entity": "entangled semantics", "output": "semantic fusion module", "neg_sample": ["entangled semantics is done by using Method", "difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class ."], "relation": "used for", "id": "2021.emnlp-main.252", "year": 2021, "rel_sent": "MISO consists of ( 1 ) a semantic fusion module that learns entangled semantics among difficult and majority samples with an adaptive multi - head attention mechanism , ( 2 ) a mutual information loss that forces our model to learn new representations of entangled semantics in the non - overlapping region of the minority class , and ( 3 ) a coupled adversarial encoder - decoder that fine - tunes disentangled semantic representations to remain their correlations with the minority class , and then using these disentangled semantic representations to generate anchor instances for each difficult sample .", "forward": false, "src_ids": "2021.emnlp-main.252_9845"}
{"input": "dense retriever is done by using Method| context: passage retrieval and ranking is a key task in open - domain question answering and information retrieval . current effective approaches mostly rely on pre - trained deep language model - based retrievers and rankers . these methods have been shown to effectively model the semantic matching between queries and passages , also in presence of keyword mismatch , i.e. passages that are relevant to a query but do not contain important query keywords .", "entity": "dense retriever", "output": "typos - aware training framework", "neg_sample": ["dense retriever is done by using Method", "passage retrieval and ranking is a key task in open - domain question answering and information retrieval .", "current effective approaches mostly rely on pre - trained deep language model - based retrievers and rankers .", "these methods have been shown to effectively model the semantic matching between queries and passages , also in presence of keyword mismatch , i.e.", "passages that are relevant to a query but do not contain important query keywords ."], "relation": "used for", "id": "2021.emnlp-main.225", "year": 2021, "rel_sent": "We then propose a simple typos - aware training framework for DR and BERT re - ranker to address this issue .", "forward": false, "src_ids": "2021.emnlp-main.225_14301"}
{"input": "representing syntactic state is done by using Method| context: however , it remains unclear if such an inductive bias would also improve language models ' ability to learn grammatical dependencies in typologically different languages .", "entity": "representing syntactic state", "output": "structural supervision", "neg_sample": ["representing syntactic state is done by using Method", "however , it remains unclear if such an inductive bias would also improve language models ' ability to learn grammatical dependencies in typologically different languages ."], "relation": "used for", "id": "2021.emnlp-main.454", "year": 2021, "rel_sent": "We find suggestive evidence that structural supervision helps with representing syntactic state across intervening content and improves performance in low - data settings , suggesting that the benefits of hierarchical inductive biases in acquiring dependency relationships may extend beyond English .", "forward": false, "src_ids": "2021.emnlp-main.454_7379"}
{"input": "multilingual neural machine translation is done by using OtherScientificTerm| context: low - resource multilingual neural machine translation ( mnmt ) is typically tasked with improving the translation performance on one or more language pairs with the aid of high - resource language pairs .", "entity": "multilingual neural machine translation", "output": "learning curricula", "neg_sample": ["multilingual neural machine translation is done by using OtherScientificTerm", "low - resource multilingual neural machine translation ( mnmt ) is typically tasked with improving the translation performance on one or more language pairs with the aid of high - resource language pairs ."], "relation": "used for", "id": "2021.mtsummit-research.1", "year": 2021, "rel_sent": "Learning Curricula for Multilingual Neural Machine Translation Training.", "forward": false, "src_ids": "2021.mtsummit-research.1_377"}
{"input": "detecting and classifying modal expressions is used for Task| context: modality is the linguistic ability to describe vents with added information such as how desirable , plausible , or feasible they are . modality is important for many nlp downstream tasks such as the detection of hedging , uncertainty , speculation , and more . previous studies that address modality detection in nlp often restrict modal expressions to a closed syntactic class , and the modal sense labels are vastly different across different studies , lacking an accepted standard . furthermore , these senses are often analyzed independently of the events that they modify .", "entity": "detecting and classifying modal expressions", "output": "detection of modal events", "neg_sample": ["detecting and classifying modal expressions is used for Task", "modality is the linguistic ability to describe vents with added information such as how desirable , plausible , or feasible they are .", "modality is important for many nlp downstream tasks such as the detection of hedging , uncertainty , speculation , and more .", "previous studies that address modality detection in nlp often restrict modal expressions to a closed syntactic class , and the modal sense labels are vastly different across different studies , lacking an accepted standard .", "furthermore , these senses are often analyzed independently of the events that they modify ."], "relation": "used for", "id": "2021.acl-long.77", "year": 2021, "rel_sent": "We show that detecting and classifying modal expressions is not only feasible , it also improves the detection of modal events in their own right .", "forward": true, "src_ids": "2021.acl-long.77_2135"}
{"input": "resolution of anaphoric reference is done by using Method| context: the anaphora resolution in dialogue shared - task consists of three subtasks ; subtask1 , resolution of anaphoric identity and non - referring expression identification , subtask2 , resolution of bridging references , and subtask3 , resolution of discourse deixis / abstract anaphora .", "entity": "resolution of anaphoric reference", "output": "pipeline model", "neg_sample": ["resolution of anaphoric reference is done by using Method", "the anaphora resolution in dialogue shared - task consists of three subtasks ; subtask1 , resolution of anaphoric identity and non - referring expression identification , subtask2 , resolution of bridging references , and subtask3 , resolution of discourse deixis / abstract anaphora ."], "relation": "used for", "id": "2021.codi-sharedtask.4", "year": 2021, "rel_sent": "The Pipeline Model for Resolution of Anaphoric Reference and Resolution of Entity Reference.", "forward": false, "src_ids": "2021.codi-sharedtask.4_15552"}
{"input": "knowledge paths is done by using Method| context: in this work we leverage commonsense knowledge in form of knowledge paths to establish connections between sentences , as a form of explicitation of implicit knowledge . such connections can be direct ( singlehop paths ) or require intermediate concepts ( multihop paths ) .", "entity": "knowledge paths", "output": "language models", "neg_sample": ["knowledge paths is done by using Method", "in this work we leverage commonsense knowledge in form of knowledge paths to establish connections between sentences , as a form of explicitation of implicit knowledge .", "such connections can be direct ( singlehop paths ) or require intermediate concepts ( multihop paths ) ."], "relation": "used for", "id": "2021.iwcs-1.3", "year": 2021, "rel_sent": "Unlike prior work that relies exclusively on static knowledge sources , we leverage language models finetuned on knowledge stored in ConceptNet , to dynamically generate knowledge paths , as explanations of implicit knowledge that connects sentences in texts .", "forward": false, "src_ids": "2021.iwcs-1.3_15155"}
{"input": "multi - granularity boundary - aware network is used for OtherScientificTerm| context: to alleviate label scarcity in named entity recognition ( ner ) task , distantly supervised ner methods are widely applied to automatically label data and identify entities . although the human effort is reduced , the generated incomplete and noisy annotations pose new challenges for learning effective neural models .", "entity": "multi - granularity boundary - aware network", "output": "entity boundaries", "neg_sample": ["multi - granularity boundary - aware network is used for OtherScientificTerm", "to alleviate label scarcity in named entity recognition ( ner ) task , distantly supervised ner methods are widely applied to automatically label data and identify entities .", "although the human effort is reduced , the generated incomplete and noisy annotations pose new challenges for learning effective neural models ."], "relation": "used for", "id": "2021.emnlp-main.18", "year": 2021, "rel_sent": "Moreover , we design a multi - granularity boundary - aware network which detects entity boundaries from both local and global perspectives .", "forward": true, "src_ids": "2021.emnlp-main.18_283"}
{"input": "social bias attributes is done by using Method| context: representation learning is widely used in nlp for a vast range of tasks . however , representations derived from text corpora often reflect social biases . this phenomenon is pervasive and consistent across different neural models , causing serious concern . previous methods mostly rely on a pre - specified , user - provided direction or suffer from unstable training .", "entity": "social bias attributes", "output": "adversarial disentangled debiasing model", "neg_sample": ["social bias attributes is done by using Method", "representation learning is widely used in nlp for a vast range of tasks .", "however , representations derived from text corpora often reflect social biases .", "this phenomenon is pervasive and consistent across different neural models , causing serious concern .", "previous methods mostly rely on a pre - specified , user - provided direction or suffer from unstable training ."], "relation": "used for", "id": "2021.naacl-main.293", "year": 2021, "rel_sent": "In this paper , we propose an adversarial disentangled debiasing model to dynamically decouple social bias attributes from the intermediate representations trained on the main task .", "forward": false, "src_ids": "2021.naacl-main.293_5101"}
{"input": "mbart pre - trained model is used for Task| context: adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp . adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters .", "entity": "mbart pre - trained model", "output": "multilingual st task", "neg_sample": ["mbart pre - trained model is used for Task", "adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp .", "adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters ."], "relation": "used for", "id": "2021.acl-short.103", "year": 2021, "rel_sent": "Starting from different pre - trained models ( a multilingual ST trained on parallel data or a multilingual BART ( mBART ) trained on non parallel multilingual data ) , we show that adapters can be used to : ( a ) efficiently specialize ST to specific language pairs with a low extra cost in terms of parameters , and ( b ) transfer from an automatic speech recognition ( ASR ) task and an mBART pre - trained model to a multilingual ST task .", "forward": true, "src_ids": "2021.acl-short.103_12592"}
{"input": "bert is done by using Method| context: learning high - quality sentence representations benefits a wide range of natural language processing tasks . though bert - based pre - trained language models achieve high performance on many downstream tasks , the native derived sentence representations are proved to be collapsed and thus produce a poor performance on the semantic textual similarity ( sts ) tasks .", "entity": "bert", "output": "contrastive learning", "neg_sample": ["bert is done by using Method", "learning high - quality sentence representations benefits a wide range of natural language processing tasks .", "though bert - based pre - trained language models achieve high performance on many downstream tasks , the native derived sentence representations are proved to be collapsed and thus produce a poor performance on the semantic textual similarity ( sts ) tasks ."], "relation": "used for", "id": "2021.acl-long.393", "year": 2021, "rel_sent": "In this paper , we present ConSERT , a Contrastive Framework for Self - Supervised SEntence Representation Transfer , that adopts contrastive learning tofine - tune BERT in an unsupervised and effective way .", "forward": false, "src_ids": "2021.acl-long.393_3906"}
{"input": "entity information is used for Method| context: implicit discourse relation classification is a challenging task , in particular when the text domain is different from the standard penn discourse treebank ( pdtb ; prasad et al . , 2008 ) training corpus domain ( wall street journal in 1990s ) .", "entity": "entity information", "output": "discourse relational argument representation", "neg_sample": ["entity information is used for Method", "implicit discourse relation classification is a challenging task , in particular when the text domain is different from the standard penn discourse treebank ( pdtb ; prasad et al .", ", 2008 ) training corpus domain ( wall street journal in 1990s ) ."], "relation": "used for", "id": "2021.acl-short.116", "year": 2021, "rel_sent": "We show that entity information can be used to improve discourse relational argument representation .", "forward": true, "src_ids": "2021.acl-short.116_1556"}
{"input": "embodied language is used for Task| context: to effectively apply robots in working environments and assist humans , it is essential to develop and evaluate how visual grounding ( vg ) can affect machine performance on occluded objects . however , current vg works are limited in working environments , such as offices and warehouses , where objects are usually occluded due to space utilization issues .", "entity": "embodied language", "output": "clutter scene grounding", "neg_sample": ["embodied language is used for Task", "to effectively apply robots in working environments and assist humans , it is essential to develop and evaluate how visual grounding ( vg ) can affect machine performance on occluded objects .", "however , current vg works are limited in working environments , such as offices and warehouses , where objects are usually occluded due to space utilization issues ."], "relation": "used for", "id": "2021.naacl-main.419", "year": 2021, "rel_sent": "OCID - Ref : A 3D Robotic Dataset With Embodied Language For Clutter Scene Grounding.", "forward": true, "src_ids": "2021.naacl-main.419_1775"}
{"input": "pairwise orderings is done by using Method| context: dominant sentence ordering models can be classified into pairwise ordering models and set - to - sequence models . however , there is little attempt to combine these two types of models , which inituitively possess complementary advantages .", "entity": "pairwise orderings", "output": "classifiers", "neg_sample": ["pairwise orderings is done by using Method", "dominant sentence ordering models can be classified into pairwise ordering models and set - to - sequence models .", "however , there is little attempt to combine these two types of models , which inituitively possess complementary advantages ."], "relation": "used for", "id": "2021.emnlp-main.186", "year": 2021, "rel_sent": "In this paper , we propose a novel sentence ordering framework which introduces two classifiers to make better use of pairwise orderings for graph - based sentence ordering ( Yin et al .", "forward": false, "src_ids": "2021.emnlp-main.186_5030"}
{"input": "tokenization schemes is used for Method| context: in this paper , we describe our submissions for the similar language translation shared task 2021 .", "entity": "tokenization schemes", "output": "statistical models", "neg_sample": ["tokenization schemes is used for Method", "in this paper , we describe our submissions for the similar language translation shared task 2021 ."], "relation": "used for", "id": "2021.wmt-1.33", "year": 2021, "rel_sent": "This paper outlines experiments with various tokenization schemes to train statistical models .", "forward": true, "src_ids": "2021.wmt-1.33_5320"}
{"input": "factual corrector model fc is used for OtherScientificTerm| context: automatic abstractive summaries are found to often distort or fabricate facts in the article . this inconsistency between summary and original text has seriously impacted its applicability .", "entity": "factual corrector model fc", "output": "factual errors", "neg_sample": ["factual corrector model fc is used for OtherScientificTerm", "automatic abstractive summaries are found to often distort or fabricate facts in the article .", "this inconsistency between summary and original text has seriously impacted its applicability ."], "relation": "used for", "id": "2021.naacl-main.58", "year": 2021, "rel_sent": "We then design a factual corrector model FC to automatically correct factual errors from summaries generated by existing systems .", "forward": true, "src_ids": "2021.naacl-main.58_12166"}
{"input": "label - guided learning is used for Method| context: item categorization is an important application of text classification in e - commerce due to its impact on the online shopping experience of users . one class of text classification techniques that has gained attention recently is using the semantic information of the labels to guide the classification task .", "entity": "label - guided learning", "output": "item categorization systems", "neg_sample": ["label - guided learning is used for Method", "item categorization is an important application of text classification in e - commerce due to its impact on the online shopping experience of users .", "one class of text classification techniques that has gained attention recently is using the semantic information of the labels to guide the classification task ."], "relation": "used for", "id": "2021.naacl-industry.37", "year": 2021, "rel_sent": "These findings demonstrate how label - guided learning can improve item categorization systems in the e - commerce domain .", "forward": true, "src_ids": "2021.naacl-industry.37_10658"}
{"input": "embeddings is done by using OtherScientificTerm| context: neural topic models ( ntms ) apply deep neural networks to topic modelling . despite their success , ntms generally ignore two important aspects : ( 1 ) only document - level word count information is utilized for the training , while more fine - grained sentence - level information is ignored , and ( 2 ) external semantic knowledge regarding documents , sentences and words are not exploited for the training .", "entity": "embeddings", "output": "hierarchical kl divergence", "neg_sample": ["embeddings is done by using OtherScientificTerm", "neural topic models ( ntms ) apply deep neural networks to topic modelling .", "despite their success , ntms generally ignore two important aspects : ( 1 ) only document - level word count information is utilized for the training , while more fine - grained sentence - level information is ignored , and ( 2 ) external semantic knowledge regarding documents , sentences and words are not exploited for the training ."], "relation": "used for", "id": "2021.emnlp-main.80", "year": 2021, "rel_sent": "Our model alsofeatures hierarchical KL divergence to leverage embeddings of each document to regularize those of their sentences , paying more attention to semantically relevant sentences .", "forward": false, "src_ids": "2021.emnlp-main.80_3058"}
{"input": "syntax trees is done by using OtherScientificTerm| context: the analysis supports a wide range of phenomena including : temporal references , temporal adverbs , aspectual classes and progressives .", "entity": "syntax trees", "output": "temporal semantics", "neg_sample": ["syntax trees is done by using OtherScientificTerm", "the analysis supports a wide range of phenomena including : temporal references , temporal adverbs , aspectual classes and progressives ."], "relation": "used for", "id": "2021.iwcs-1.2", "year": 2021, "rel_sent": "In this paper , we propose an implementation of temporal semantics that translates syntax trees to logical formulas , suitable for consumption by the Coq proof assistant .", "forward": false, "src_ids": "2021.iwcs-1.2_2390"}
{"input": "nlu tasks is done by using Method| context: nlp is currently dominated by language models like roberta which are pretrained on billions of words . but what exact knowledge or skills do transformer lms learn from large - scale pretraining that they can not learn from less data ?", "entity": "nlu tasks", "output": "fine - tuning", "neg_sample": ["nlu tasks is done by using Method", "nlp is currently dominated by language models like roberta which are pretrained on billions of words .", "but what exact knowledge or skills do transformer lms learn from large - scale pretraining that they can not learn from less data ?"], "relation": "used for", "id": "2021.acl-long.90", "year": 2021, "rel_sent": "To explore this question , we adopt five styles of evaluation : classifier probing , information - theoretic probing , unsupervised relative acceptability judgments , unsupervised language model knowledge probing , and fine - tuning on NLU tasks .", "forward": false, "src_ids": "2021.acl-long.90_6305"}
{"input": "spurious correlations is done by using Method| context: distant supervision tackles the data bottleneck in ner by automatically generating training instances via dictionary matching . unfortunately , the learning of ds - ner is severely dictionary - biased , which suffers from spurious correlations and therefore undermines the effectiveness and the robustness of the learned models .", "entity": "spurious correlations", "output": "backdoor adjustment", "neg_sample": ["spurious correlations is done by using Method", "distant supervision tackles the data bottleneck in ner by automatically generating training instances via dictionary matching .", "unfortunately , the learning of ds - ner is severely dictionary - biased , which suffers from spurious correlations and therefore undermines the effectiveness and the robustness of the learned models ."], "relation": "used for", "id": "2021.acl-long.371", "year": 2021, "rel_sent": "For intra - dictionary bias , we conduct backdoor adjustment to remove the spurious correlations introduced by the dictionary confounder .", "forward": false, "src_ids": "2021.acl-long.371_9861"}
{"input": "universal adversarial texts is done by using Method| context: universal adversarial texts ( uats ) refer to short pieces of text units that can largely affect the predictions of nlp models . recent studies on universal adversarial attacks assume the accessibility of datasets for the task , which is not realistic .", "entity": "universal adversarial texts", "output": "cnn based models", "neg_sample": ["universal adversarial texts is done by using Method", "universal adversarial texts ( uats ) refer to short pieces of text units that can largely affect the predictions of nlp models .", "recent studies on universal adversarial attacks assume the accessibility of datasets for the task , which is not realistic ."], "relation": "used for", "id": "2021.alta-1.14", "year": 2021, "rel_sent": "Our empirical studies on three text classification datasets reveal that : 1 ) CNN based models are more extremely vulnerable to UATs while self - attention models show the most robustness , 2 ) the vulnerability of CNN and LSTM models and robustness of self - attention models could be attributed to whether they rely on training data artifacts for their predictions , and 3 ) the pre - trained embeddings could expose vulnerability to both universal adversarial attack and the UAT transfer attack .", "forward": false, "src_ids": "2021.alta-1.14_164"}
{"input": "unsupervised domain adaptation is done by using Method| context: contextual embedding models such as bert can be easily fine - tuned on labeled samples to create a state - of - the - art model for many downstream tasks . however , the fine - tuned bert model suffers considerably from unlabeled data when applied to a different domain .", "entity": "unsupervised domain adaptation", "output": "pseudo - label guided method", "neg_sample": ["unsupervised domain adaptation is done by using Method", "contextual embedding models such as bert can be easily fine - tuned on labeled samples to create a state - of - the - art model for many downstream tasks .", "however , the fine - tuned bert model suffers considerably from unlabeled data when applied to a different domain ."], "relation": "used for", "id": "2021.adaptnlp-1.2", "year": 2021, "rel_sent": "In this paper , we propose a pseudo - label guided method for unsupervised domain adaptation .", "forward": false, "src_ids": "2021.adaptnlp-1.2_14602"}
{"input": "probabilistic symbolic rules is done by using Method| context: hesip is a hybrid explanation system for image predictions that combines sub - symbolic and symbolic machine learning techniques to explain the predictions of image classification tasks .", "entity": "probabilistic symbolic rules", "output": "symbolic component", "neg_sample": ["probabilistic symbolic rules is done by using Method", "hesip is a hybrid explanation system for image predictions that combines sub - symbolic and symbolic machine learning techniques to explain the predictions of image classification tasks ."], "relation": "used for", "id": "2021.alta-1.15", "year": 2021, "rel_sent": "The sub - symbolic component makes a prediction for an image and the symbolic component learns probabilistic symbolic rules in order to explain that prediction .", "forward": false, "src_ids": "2021.alta-1.15_6166"}
{"input": "embedding space is done by using OtherScientificTerm| context: good quality monolingual word embeddings ( mwes ) can be built for languages which have large amounts of unlabeled text . mwes can be aligned to bilingual spaces using only a few thousand word translation pairs . for low resource languages training mwes monolingually results in mwes of poor quality , and thus poor bilingual word embeddings ( bwes ) as well .", "entity": "embedding space", "output": "vector spaces", "neg_sample": ["embedding space is done by using OtherScientificTerm", "good quality monolingual word embeddings ( mwes ) can be built for languages which have large amounts of unlabeled text .", "mwes can be aligned to bilingual spaces using only a few thousand word translation pairs .", "for low resource languages training mwes monolingually results in mwes of poor quality , and thus poor bilingual word embeddings ( bwes ) as well ."], "relation": "used for", "id": "2021.acl-short.30", "year": 2021, "rel_sent": "This paper proposes a new approach for building BWEs in which the vector space of the high resource source language is used as a starting point for training an embedding space for the low resource target language .", "forward": false, "src_ids": "2021.acl-short.30_8839"}
{"input": "knowledge graph embeddings is done by using Method| context: knowledge graph embeddings ( kges ) have been intensively explored in recent years due to their promise for a wide range of applications . however , existing studies focus on improving the final model performance without acknowledging the computational cost of the proposed approaches , in terms of execution time and environmental impact .", "entity": "knowledge graph embeddings", "output": "closed - form orthogonal procrustes analysis", "neg_sample": ["knowledge graph embeddings is done by using Method", "knowledge graph embeddings ( kges ) have been intensively explored in recent years due to their promise for a wide range of applications .", "however , existing studies focus on improving the final model performance without acknowledging the computational cost of the proposed approaches , in terms of execution time and environmental impact ."], "relation": "used for", "id": "2021.naacl-main.187", "year": 2021, "rel_sent": "We highlight three technical innovations : full batch learning via relational matrices , closed - form Orthogonal Procrustes Analysis for KGEs , and non - negative - sampling training .", "forward": false, "src_ids": "2021.naacl-main.187_3502"}
{"input": "table schemas is done by using OtherScientificTerm| context: recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries . despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism .", "entity": "table schemas", "output": "synonym annotations", "neg_sample": ["table schemas is done by using OtherScientificTerm", "recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries .", "despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism ."], "relation": "used for", "id": "2021.acl-long.195", "year": 2021, "rel_sent": "The first category of approaches utilizes additional synonym annotations for table schemas by modifying the model input , while the second category is based on adversarial training .", "forward": false, "src_ids": "2021.acl-long.195_15363"}
{"input": "qa models is done by using Method| context: one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis . however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues .", "entity": "qa models", "output": "explicit guidance on how to resolve conversational dependency", "neg_sample": ["qa models is done by using Method", "one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis .", "however , existing approaches do not explicitly train qa models on how to resolve the dependency , and thus these models are limited in understanding human dialogues ."], "relation": "used for", "id": "2021.acl-long.478", "year": 2021, "rel_sent": "ExCorD first generates self - contained questions that can be understood without the conversation history , then trains a QA model with the pairs of original and self - contained questions using a consistency - based regularizer .", "forward": false, "src_ids": "2021.acl-long.478_664"}
{"input": "emotion classification is done by using Method| context: ' emotion classification of covid-19 chinese microblogs helps analyze the public opinion triggered by covid-19 . existing methods only consider the features of the microblog itself with - out combining the semantics of emotion categories for modeling . emotion classification of mi - croblogs is a process of reading the content of microblogs and combining the semantics of emo - tion categories to understand whether it contains a certain emotion .", "entity": "emotion classification", "output": "bert model", "neg_sample": ["emotion classification is done by using Method", "' emotion classification of covid-19 chinese microblogs helps analyze the public opinion triggered by covid-19 .", "existing methods only consider the features of the microblog itself with - out combining the semantics of emotion categories for modeling .", "emotion classification of mi - croblogs is a process of reading the content of microblogs and combining the semantics of emo - tion categories to understand whether it contains a certain emotion ."], "relation": "used for", "id": "2021.ccl-1.82", "year": 2021, "rel_sent": "Finally we construct a question - and - answer pair and use it as the input of the BERT model to complete emotion classification .", "forward": false, "src_ids": "2021.ccl-1.82_7960"}
{"input": "tagging is used for OtherScientificTerm| context: alzheimer 's disease ( ad ) is associated with many characteristic changes , not only in an individual 's language but also in the interactive patterns observed in dialogue . the most indicative changes of this latter kind tend to be associated with relatively rare dialogue acts ( das ) , such as those involved in clarification exchanges and responses to particular kinds of questions . however , most existing work in da tagging focuses on improving average performance , effectively prioritizing more frequent classes ; it thus gives a poor performance on these rarer classes and is not suited for application to ad analysis .", "entity": "tagging", "output": "rare class das", "neg_sample": ["tagging is used for OtherScientificTerm", "alzheimer 's disease ( ad ) is associated with many characteristic changes , not only in an individual 's language but also in the interactive patterns observed in dialogue .", "the most indicative changes of this latter kind tend to be associated with relatively rare dialogue acts ( das ) , such as those involved in clarification exchanges and responses to particular kinds of questions .", "however , most existing work in da tagging focuses on improving average performance , effectively prioritizing more frequent classes ; it thus gives a poor performance on these rarer classes and is not suited for application to ad analysis ."], "relation": "used for", "id": "2021.sigdial-1.32", "year": 2021, "rel_sent": "In this paper , we investigate tagging specifically for rare class DAs , using a hierarchical BiLSTM model with various ways of incorporating information from previous utterances and DA tags in context .", "forward": true, "src_ids": "2021.sigdial-1.32_12762"}
{"input": "tensorflow text is used for Task| context: tokenization is a fundamental preprocessing step for almost all nlp tasks .", "entity": "tensorflow text", "output": "general text ( e.g. , sentence ) tokenization", "neg_sample": ["tensorflow text is used for Task", "tokenization is a fundamental preprocessing step for almost all nlp tasks ."], "relation": "used for", "id": "2021.emnlp-main.160", "year": 2021, "rel_sent": "Experimental results show that our method is 8.2x faster than HuggingFace Tokenizers and 5.1x faster than TensorFlow Text on average for general text tokenization .", "forward": true, "src_ids": "2021.emnlp-main.160_13947"}
{"input": "emotional support is done by using Method| context: emotional support is a crucial ability for many conversation scenarios , including social interactions , mental health support , and customer service chats . following reasonable procedures and using various support skills can help to effectively provide support . however , due to the lack of a well - designed task and corpora of effective emotional support conversations , research on building emotional support into dialog systems remains lacking .", "entity": "emotional support", "output": "dialog models", "neg_sample": ["emotional support is done by using Method", "emotional support is a crucial ability for many conversation scenarios , including social interactions , mental health support , and customer service chats .", "following reasonable procedures and using various support skills can help to effectively provide support .", "however , due to the lack of a well - designed task and corpora of effective emotional support conversations , research on building emotional support into dialog systems remains lacking ."], "relation": "used for", "id": "2021.acl-long.269", "year": 2021, "rel_sent": "Finally , we evaluate state - of - the - art dialog models with respect to the ability to provide emotional support .", "forward": false, "src_ids": "2021.acl-long.269_6353"}
{"input": "parsers is used for Task| context: bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly .", "entity": "parsers", "output": "coordination structure prediction", "neg_sample": ["parsers is used for Task", "bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly ."], "relation": "used for", "id": "2021.acl-long.557", "year": 2021, "rel_sent": "Experimental results on the English Penn Treebank and the English GENIA corpus show that our parsers beat previous state - of - the - art approaches on the task of coordination structure prediction , especially for the subset of sentences with complex coordination structures .", "forward": true, "src_ids": "2021.acl-long.557_13691"}
{"input": "data augmentation is used for Task| context: data augmentation is an effective way to improve the performance of many neural text generation models . however , current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples .", "entity": "data augmentation", "output": "text generation tasks", "neg_sample": ["data augmentation is used for Task", "data augmentation is an effective way to improve the performance of many neural text generation models .", "however , current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples ."], "relation": "used for", "id": "2021.acl-long.173", "year": 2021, "rel_sent": "In this work , we derive an objective toformulate the problem of data augmentation on text generation tasks without any use of augmented data constructed by specific mapping functions .", "forward": true, "src_ids": "2021.acl-long.173_13140"}
{"input": "commit classification is done by using Method| context: commits in version control systems ( e.g. git ) track changes in a software project . commits comprise noisy user - generated natural language and code patches . automatic commit classification ( cc ) has been used to determine the type of code maintenance activities performed , as well as to detect bug fixes in code repositories . much prior work occurs in the fully - supervised setting - a setting that can be a stretch in resource - scarce situations presenting difficulties in labeling commits .", "entity": "commit classification", "output": "co - training", "neg_sample": ["commit classification is done by using Method", "commits in version control systems ( e.g.", "git ) track changes in a software project .", "commits comprise noisy user - generated natural language and code patches .", "automatic commit classification ( cc ) has been used to determine the type of code maintenance activities performed , as well as to detect bug fixes in code repositories .", "much prior work occurs in the fully - supervised setting - a setting that can be a stretch in resource - scarce situations presenting difficulties in labeling commits ."], "relation": "used for", "id": "2021.wnut-1.43", "year": 2021, "rel_sent": "Co - training for Commit Classification.", "forward": false, "src_ids": "2021.wnut-1.43_6512"}
{"input": "deception detection is done by using Method| context: the deception in the text can be of different forms in different domains , including fake news , rumor tweets , and spam emails . irrespective of the domain , the main intent of the deceptive text is to deceit the reader . although domain - specific deception detection exists , domain - independent deception detection can provide a holistic picture , which can be crucial to understand how deception occurs in the text .", "entity": "deception detection", "output": "domain - independent holistic approach", "neg_sample": ["deception detection is done by using Method", "the deception in the text can be of different forms in different domains , including fake news , rumor tweets , and spam emails .", "irrespective of the domain , the main intent of the deceptive text is to deceit the reader .", "although domain - specific deception detection exists , domain - independent deception detection can provide a holistic picture , which can be crucial to understand how deception occurs in the text ."], "relation": "used for", "id": "2021.ranlp-1.147", "year": 2021, "rel_sent": "A Domain - Independent Holistic Approach to Deception Detection.", "forward": false, "src_ids": "2021.ranlp-1.147_14211"}
{"input": "loss functions is used for OtherScientificTerm| context: data augmentation is an effective way to improve the performance of many neural text generation models . however , current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples .", "entity": "loss functions", "output": "loss functions", "neg_sample": ["loss functions is used for OtherScientificTerm", "data augmentation is an effective way to improve the performance of many neural text generation models .", "however , current data augmentation methods need to define or choose proper data mapping functions that map the original samples into the augmented samples ."], "relation": "used for", "id": "2021.acl-long.173", "year": 2021, "rel_sent": "Our proposed objective can be efficiently optimized and applied to popular loss functions on text generation tasks with a convergence rate guarantee .", "forward": true, "src_ids": "2021.acl-long.173_13139"}
{"input": "parsimonious parser transfer is used for Task| context: cross - lingual transfer is a leading technique for parsing low - resource languages in the absence of explicit supervision . simple ' direct transfer ' of a learned model based on a multilingual input encoding has provided a strong benchmark .", "entity": "parsimonious parser transfer", "output": "unsupervised cross - lingual adaptation", "neg_sample": ["parsimonious parser transfer is used for Task", "cross - lingual transfer is a leading technique for parsing low - resource languages in the absence of explicit supervision .", "simple ' direct transfer ' of a learned model based on a multilingual input encoding has provided a strong benchmark ."], "relation": "used for", "id": "2021.eacl-main.254", "year": 2021, "rel_sent": "PPT : Parsimonious Parser Transfer for Unsupervised Cross - Lingual Adaptation.", "forward": true, "src_ids": "2021.eacl-main.254_15004"}
{"input": "bilingual conversational characteristics is done by using Method| context: despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency .", "entity": "bilingual conversational characteristics", "output": "latent variational modules", "neg_sample": ["bilingual conversational characteristics is done by using Method", "despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency ."], "relation": "used for", "id": "2021.acl-long.444", "year": 2021, "rel_sent": "Specifically , we design three latent variational modules to learn the distributions of bilingual conversational characteristics .", "forward": false, "src_ids": "2021.acl-long.444_15144"}
{"input": "phrase - based neural topic model is done by using Method| context: phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness .", "entity": "phrase - based neural topic model", "output": "phrase - bert embeddings", "neg_sample": ["phrase - based neural topic model is done by using Method", "phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness ."], "relation": "used for", "id": "2021.emnlp-main.846", "year": 2021, "rel_sent": "Finally , as a case study , we show that Phrase - BERT embeddings can be easily integrated with a simple autoencoder to build a phrase - based neural topic model that interprets topics as mixtures of words and phrases by performing a nearest neighbor search in the embedding space .", "forward": false, "src_ids": "2021.emnlp-main.846_14982"}
{"input": "transition - based bubble parsing is used for Method| context: bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly .", "entity": "transition - based bubble parsing", "output": "dependency - based syntactic analysis", "neg_sample": ["transition - based bubble parsing is used for Method", "bubble representations were proposed in the formal linguistics literature decades ago ; they enhance dependency trees by encoding coordination boundaries and internal relationships within coordination structures explicitly ."], "relation": "used for", "id": "2021.acl-long.557", "year": 2021, "rel_sent": "We propose a transition - based bubble parser to perform coordination structure identification and dependency - based syntactic analysis simultaneously .", "forward": true, "src_ids": "2021.acl-long.557_13693"}
{"input": "paraphrase identification task is done by using Material| context: in the domain of natural language augmentation , the eligibility of generated samples remains not well understood .", "entity": "paraphrase identification task", "output": "backtranslated samples", "neg_sample": ["paraphrase identification task is done by using Material", "in the domain of natural language augmentation , the eligibility of generated samples remains not well understood ."], "relation": "used for", "id": "2021.ranlp-1.35", "year": 2021, "rel_sent": "Assessing the Eligibility of Backtranslated Samples Based on Semantic Similarity for the Paraphrase Identification Task.", "forward": false, "src_ids": "2021.ranlp-1.35_2467"}
{"input": "dynamic token reduction approach is used for Task| context: existing pre - trained language models ( plms ) are often computationally expensive in inference , making them impractical in various resource - limited real - world applications .", "entity": "dynamic token reduction approach", "output": "plms ' inference", "neg_sample": ["dynamic token reduction approach is used for Task", "existing pre - trained language models ( plms ) are often computationally expensive in inference , making them impractical in various resource - limited real - world applications ."], "relation": "used for", "id": "2021.naacl-main.463", "year": 2021, "rel_sent": "To address this issue , we propose a dynamic token reduction approach to accelerate PLMs ' inference , named TR - BERT , which could flexibly adapt the layer number of each token in inference to avoid redundant calculation .", "forward": true, "src_ids": "2021.naacl-main.463_2771"}
{"input": "indic languages is done by using Method| context: multilingual neural machine translation has achieved remarkable performance by training a single translation model for multiple languages .", "entity": "indic languages", "output": "transliteration ( script conversion )", "neg_sample": ["indic languages is done by using Method", "multilingual neural machine translation has achieved remarkable performance by training a single translation model for multiple languages ."], "relation": "used for", "id": "2021.wat-1.26", "year": 2021, "rel_sent": "Furthermore , we demonstrate the use of transliteration ( script conversion ) for Indic languages in reducing the lexical gap for training a multilingual NMT system .", "forward": false, "src_ids": "2021.wat-1.26_3015"}
{"input": "keyphrase extraction approaches is used for Task| context: from the view of human understanding documents , we typically measure the importance of phrase according to its syntactic accuracy , information saliency , and concept consistency simultaneously . however , most existing keyphrase extraction approaches only focus on the part of them , which leads to biased results .", "entity": "keyphrase extraction approaches", "output": "candidate keyphrase extraction", "neg_sample": ["keyphrase extraction approaches is used for Task", "from the view of human understanding documents , we typically measure the importance of phrase according to its syntactic accuracy , information saliency , and concept consistency simultaneously .", "however , most existing keyphrase extraction approaches only focus on the part of them , which leads to biased results ."], "relation": "used for", "id": "2021.emnlp-main.215", "year": 2021, "rel_sent": "In this paper , we propose a new approach to estimate the importance of keyphrase from multiple perspectives ( called as KIEMP ) and further improve the performance of keyphrase extraction .", "forward": true, "src_ids": "2021.emnlp-main.215_6167"}
{"input": "re - embedding position is done by using Method| context: difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class .", "entity": "re - embedding position", "output": "backbone network", "neg_sample": ["re - embedding position is done by using Method", "difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class ."], "relation": "used for", "id": "2021.emnlp-main.252", "year": 2021, "rel_sent": "In this paper , we propose a Mutual Information constrained Semantically Oversampling framework ( MISO ) that can generate anchor instances to help the backbone network determine the re - embedding position of a non - overlapping representation for each difficult sample .", "forward": false, "src_ids": "2021.emnlp-main.252_9843"}
{"input": "analysis and review of legal cases is done by using Task| context: information extraction and question answering have the potential to introduce a new paradigm for how machine learning is applied to criminal law . existing approaches generally use tabular data for predictive metrics . an alternative approach is needed for matters of equitable justice , where individuals are judged on a case - by - case basis , in a process involving verbal or written discussion and interpretation of case factors . such discussions are individualized , but they nonetheless rely on underlying facts . information extraction can play an important role in surfacing these facts , which are still important to understand .", "entity": "analysis and review of legal cases", "output": "nlp", "neg_sample": ["analysis and review of legal cases is done by using Task", "information extraction and question answering have the potential to introduce a new paradigm for how machine learning is applied to criminal law .", "existing approaches generally use tabular data for predictive metrics .", "an alternative approach is needed for matters of equitable justice , where individuals are judged on a case - by - case basis , in a process involving verbal or written discussion and interpretation of case factors .", "such discussions are individualized , but they nonetheless rely on underlying facts .", "information extraction can play an important role in surfacing these facts , which are still important to understand ."], "relation": "used for", "id": "2021.nlp4posimpact-1.8", "year": 2021, "rel_sent": "We encourage new developments in NLP to enable analysis and review of legal cases to be done in a post - hoc , not predictive , manner .", "forward": false, "src_ids": "2021.nlp4posimpact-1.8_16113"}
{"input": "deep nlp models is done by using Method| context: several researchers have shown that deep nlp models learn non - trivial amount of linguistic knowledge , captured at different layers of the model .", "entity": "deep nlp models", "output": "transfer learning", "neg_sample": ["deep nlp models is done by using Method", "several researchers have shown that deep nlp models learn non - trivial amount of linguistic knowledge , captured at different layers of the model ."], "relation": "used for", "id": "2021.findings-acl.438", "year": 2021, "rel_sent": "How transfer learning impacts linguistic knowledge in deep NLP models ?.", "forward": false, "src_ids": "2021.findings-acl.438_2981"}
{"input": "unified span - based approach is used for Task| context: fine - grained opinion mining ( om ) has achieved increasing attraction in the natural language processing ( nlp ) community , which aims to find the opinion structures of ' who expressed what opinions towards what ' in one sentence .", "entity": "unified span - based approach", "output": "end - to - end om setting", "neg_sample": ["unified span - based approach is used for Task", "fine - grained opinion mining ( om ) has achieved increasing attraction in the natural language processing ( nlp ) community , which aims to find the opinion structures of ' who expressed what opinions towards what ' in one sentence ."], "relation": "used for", "id": "2021.naacl-main.144", "year": 2021, "rel_sent": "In this work , motivated by its span - based representations of opinion expressions and roles , we propose a unified span - based approach for the end - to - end OM setting .", "forward": true, "src_ids": "2021.naacl-main.144_2407"}
{"input": "dual - encoder model is done by using Method| context: dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering . under this paradigm , a dual - encoder model is learned to encode questions and passages separately into vector representations , and all the passage vectors are then pre - computed and indexed , which can be efficiently retrieved by vector space search during inference time .", "entity": "dual - encoder model", "output": "contrastive learning method", "neg_sample": ["dual - encoder model is done by using Method", "dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering .", "under this paradigm , a dual - encoder model is learned to encode questions and passages separately into vector representations , and all the passage vectors are then pre - computed and indexed , which can be efficiently retrieved by vector space search during inference time ."], "relation": "used for", "id": "2021.acl-long.477", "year": 2021, "rel_sent": "In this paper , we propose a new contrastive learning method called Cross Momentum Contrastive learning ( xMoCo ) , for learning a dual - encoder model for question - passage matching .", "forward": false, "src_ids": "2021.acl-long.477_13235"}
{"input": "text summarizing models is used for Material| context: recent researches have demonstrated that bert shows potential in a wide range of natural language processing tasks . it is adopted as an encoder for many state - of - the - art automatic summarizing systems , which achieve excellent performance .", "entity": "text summarizing models", "output": "vietnamese", "neg_sample": ["text summarizing models is used for Material", "recent researches have demonstrated that bert shows potential in a wide range of natural language processing tasks .", "it is adopted as an encoder for many state - of - the - art automatic summarizing systems , which achieve excellent performance ."], "relation": "used for", "id": "2021.paclic-1.59", "year": 2021, "rel_sent": "The experiment results indicate that monolingual models produce promising results compared to other multilingual models and previous text summarizing models for Vietnamese .", "forward": true, "src_ids": "2021.paclic-1.59_5698"}
{"input": "lstms is used for OtherScientificTerm| context: we propose an alternate approach to quantifying how well language models learn natural language : we ask how well they match the statistical tendencies of natural language .", "entity": "lstms", "output": "natural language distributions", "neg_sample": ["lstms is used for OtherScientificTerm", "we propose an alternate approach to quantifying how well language models learn natural language : we ask how well they match the statistical tendencies of natural language ."], "relation": "used for", "id": "2021.acl-long.414", "year": 2021, "rel_sent": "As concrete examples , text generated under the nucleus sampling scheme adheres more closely to the type - token relationship of natural language than text produced using standard ancestral sampling ; text from LSTMs reflects the natural language distributions over length , stopwords , and symbols surprisingly well .", "forward": true, "src_ids": "2021.acl-long.414_3714"}
{"input": "architecture selection method is used for OtherScientificTerm| context: this paper addresses the efficiency challenge of neural architecture search ( nas ) by formulating the task as a ranking problem . previous methods require numerous training examples to estimate the accurate performance of architectures , although the actual goal is to find the distinction between ' good ' and ' bad ' candidates . here we do not resort to performance predictors .", "entity": "architecture selection method", "output": "search space", "neg_sample": ["architecture selection method is used for OtherScientificTerm", "this paper addresses the efficiency challenge of neural architecture search ( nas ) by formulating the task as a ranking problem .", "previous methods require numerous training examples to estimate the accurate performance of architectures , although the actual goal is to find the distinction between ' good ' and ' bad ' candidates .", "here we do not resort to performance predictors ."], "relation": "used for", "id": "2021.emnlp-main.191", "year": 2021, "rel_sent": "Moreover , we develop an architecture selection method to prune the search space and concentrate on more promising candidates .", "forward": true, "src_ids": "2021.emnlp-main.191_9049"}
{"input": "endangered language ( el ) documentation is done by using Method| context: ' transcription bottlenecks ' , created by a shortage of effective human transcribers ( i.e. , transcriber shortage ) , are one of the main challenges to endangered language ( el ) documentation . automatic speech recognition ( asr ) has been suggested as a tool to overcome such bottlenecks .", "entity": "endangered language ( el ) documentation", "output": "end - to - end asr system", "neg_sample": ["endangered language ( el ) documentation is done by using Method", "' transcription bottlenecks ' , created by a shortage of effective human transcribers ( i.e. , transcriber shortage ) , are one of the main challenges to endangered language ( el ) documentation .", "automatic speech recognition ( asr ) has been suggested as a tool to overcome such bottlenecks ."], "relation": "used for", "id": "2021.eacl-main.96", "year": 2021, "rel_sent": "Leveraging End - to - End ASR for Endangered Language Documentation : An Empirical Study on Yoloxochitl Mixtec.", "forward": false, "src_ids": "2021.eacl-main.96_4920"}
{"input": "dependency parser is done by using Material| context: while the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert - annotated treebank data , not all annotations contribute equally to the training of the parsers .", "entity": "dependency parser", "output": "labeled examples", "neg_sample": ["dependency parser is done by using Material", "while the predictive performance of modern statistical dependency parsers relies heavily on the availability of expensive expert - annotated treebank data , not all annotations contribute equally to the training of the parsers ."], "relation": "used for", "id": "2021.naacl-main.207", "year": 2021, "rel_sent": "In this paper , we attempt to reduce the number of labeled examples needed to train a strong dependency parser using batch active learning ( AL ) .", "forward": false, "src_ids": "2021.naacl-main.207_168"}
{"input": "real - world ad - hoc retrieval applications is done by using Method| context: transformer - based ' behemoths ' have grown in popularity , as well as structurally , shattering multiple nlp benchmarks along the way . however , their real - world usability remains a question .", "entity": "real - world ad - hoc retrieval applications", "output": "transformer - based models", "neg_sample": ["real - world ad - hoc retrieval applications is done by using Method", "transformer - based ' behemoths ' have grown in popularity , as well as structurally , shattering multiple nlp benchmarks along the way .", "however , their real - world usability remains a question ."], "relation": "used for", "id": "2021.eacl-main.293", "year": 2021, "rel_sent": "In this work , we empirically assess the feasibility of applying transformer - based models in real - world ad - hoc retrieval applications by comparison to a ' greener and more sustainable ' alternative , comprising only 620 trainable parameters .", "forward": false, "src_ids": "2021.eacl-main.293_4430"}
{"input": "pvf is used for Task| context: effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage . verbal fluency tasks are among the most often applied tests for early dementia detection due to their efficiency and ease of use . in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) . even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf .", "entity": "pvf", "output": "in - depth assessment of cognition", "neg_sample": ["pvf is used for Task", "effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage .", "verbal fluency tasks are among the most often applied tests for early dementia detection due to their efficiency and ease of use .", "in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) .", "even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf ."], "relation": "used for", "id": "2021.clpsych-1.4", "year": 2021, "rel_sent": "As such , these results point towards the yet underexplored utility of the PVF for in - depth assessment of cognition in MCI .", "forward": true, "src_ids": "2021.clpsych-1.4_2466"}
{"input": "attention is used for Task| context: large language models have become increasingly difficult to train because of the growing computation time and cost .", "entity": "attention", "output": "sequence modeling", "neg_sample": ["attention is used for Task", "large language models have become increasingly difficult to train because of the growing computation time and cost ."], "relation": "used for", "id": "2021.emnlp-main.602", "year": 2021, "rel_sent": "In this work , we present SRU++ , a highly - efficient architecture that combines fast recurrence and attention for sequence modeling .", "forward": true, "src_ids": "2021.emnlp-main.602_7514"}
{"input": "user engagement detector is done by using Material| context: user engagement is one of the most important metrics for evaluating open - domain dialog systems , and could also be used as real - time feedback to benefit dialog policy learning . existing work on detecting user disengagement typically requires hand - labeling many dialog samples .", "entity": "user engagement detector", "output": "denoised data", "neg_sample": ["user engagement detector is done by using Material", "user engagement is one of the most important metrics for evaluating open - domain dialog systems , and could also be used as real - time feedback to benefit dialog policy learning .", "existing work on detecting user disengagement typically requires hand - labeling many dialog samples ."], "relation": "used for", "id": "2021.acl-long.283", "year": 2021, "rel_sent": "Finally , we use the denoised data to train a user engagement detector .", "forward": false, "src_ids": "2021.acl-long.283_9692"}
{"input": "sequence - to - sequence transformers is used for Task| context: predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks . these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments .", "entity": "sequence - to - sequence transformers", "output": "transition - based amr parsing", "neg_sample": ["sequence - to - sequence transformers is used for Task", "predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks .", "these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments ."], "relation": "used for", "id": "2021.emnlp-main.507", "year": 2021, "rel_sent": "Structure - aware Fine - tuning of Sequence - to - sequence Transformers for Transition - based AMR Parsing.", "forward": true, "src_ids": "2021.emnlp-main.507_12733"}
{"input": "rnn - based methods is used for OtherScientificTerm| context: this paper explores a variant of automatic headline generation methods , where a generated headline is required to include a given phrase such as a company or a product name . previous methods using transformer - based models generate a headline including a given phrase by providing the encoder with additional information corresponding to the given phrase . however , these methods can not always include the phrase in the generated headline .", "entity": "rnn - based methods", "output": "token sequences", "neg_sample": ["rnn - based methods is used for OtherScientificTerm", "this paper explores a variant of automatic headline generation methods , where a generated headline is required to include a given phrase such as a company or a product name .", "previous methods using transformer - based models generate a headline including a given phrase by providing the encoder with additional information corresponding to the given phrase .", "however , these methods can not always include the phrase in the generated headline ."], "relation": "used for", "id": "2021.emnlp-main.335", "year": 2021, "rel_sent": "Inspired by previous RNN - based methods generating token sequences in backward and forward directions from the given phrase , we propose a simple Transformer - based method that guarantees to include the given phrase in the high - quality generated headline .", "forward": true, "src_ids": "2021.emnlp-main.335_88"}
{"input": "few - shot model is done by using Method| context: few - shot relation classification has attracted great attention recently and is regarded as an effective way to tackle the long - tail problem in relation classification . most previous works on few - shot relation classification are based on learning - to - match paradigms which focus on learning an effective universal matcher between the query and one target class prototype based on inner - class support sets . however the learning - to - match paradigm focuses on capturing the similarity knowledge between query and class prototype while fails to consider discriminative information between different candidate classes . such information is critical especially when target classes are highly confusing and domain shifting exists between training and testing phases .", "entity": "few - shot model", "output": "global transformed prototypical networks ( gtpn )", "neg_sample": ["few - shot model is done by using Method", "few - shot relation classification has attracted great attention recently and is regarded as an effective way to tackle the long - tail problem in relation classification .", "most previous works on few - shot relation classification are based on learning - to - match paradigms which focus on learning an effective universal matcher between the query and one target class prototype based on inner - class support sets .", "however the learning - to - match paradigm focuses on capturing the similarity knowledge between query and class prototype while fails to consider discriminative information between different candidate classes .", "such information is critical especially when target classes are highly confusing and domain shifting exists between training and testing phases ."], "relation": "used for", "id": "2021.ccl-1.90", "year": 2021, "rel_sent": "In this paper we propose the Global Transformed Prototypical Networks ( GTPN ) which learns to build a few - shot model to directly discriminate between the query and all target classes with both inner - class local information and inter - class global information .", "forward": false, "src_ids": "2021.ccl-1.90_13685"}
{"input": "multi - teacher knowledge distillation framework is used for Method| context: pre - trained language models ( plms ) achieve great success in nlp . however , their huge model sizes hinder their applications in many practical systems .", "entity": "multi - teacher knowledge distillation framework", "output": "student model", "neg_sample": ["multi - teacher knowledge distillation framework is used for Method", "pre - trained language models ( plms ) achieve great success in nlp .", "however , their huge model sizes hinder their applications in many practical systems ."], "relation": "used for", "id": "2021.findings-acl.387", "year": 2021, "rel_sent": "In this paper , we propose a multi - teacher knowledge distillation framework named MTBERT for pre - trained language model compression , which can train high - quality student model from multiple teacher PLMs .", "forward": true, "src_ids": "2021.findings-acl.387_10298"}
{"input": "learnable knowledge - guided data augmentation is used for Task| context: unfortunately , the existing nlp - related augmentation methods can not directly produce available data required for this task .", "entity": "learnable knowledge - guided data augmentation", "output": "event causality identification ( eci )", "neg_sample": ["learnable knowledge - guided data augmentation is used for Task", "unfortunately , the existing nlp - related augmentation methods can not directly produce available data required for this task ."], "relation": "used for", "id": "2021.acl-long.276", "year": 2021, "rel_sent": "LearnDA : Learnable Knowledge - Guided Data Augmentation for Event Causality Identification.", "forward": true, "src_ids": "2021.acl-long.276_15349"}
{"input": "pronunciation modeling is used for Task| context: there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice . due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr .", "entity": "pronunciation modeling", "output": "english mispronunciation detection", "neg_sample": ["pronunciation modeling is used for Task", "there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice .", "due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr ."], "relation": "used for", "id": "2021.rocling-1.17", "year": 2021, "rel_sent": "Exploring the Integration of E2E ASR and Pronunciation Modeling for English Mispronunciation Detection.", "forward": true, "src_ids": "2021.rocling-1.17_2496"}
{"input": "fst morphological analyzer is used for Material| context: the analyzer draws from a 1250 - token eastern dialect wordlist .", "entity": "fst morphological analyzer", "output": "gitksan language", "neg_sample": ["fst morphological analyzer is used for Material", "the analyzer draws from a 1250 - token eastern dialect wordlist ."], "relation": "used for", "id": "2021.sigmorphon-1.21", "year": 2021, "rel_sent": "An FST morphological analyzer for the Gitksan language.", "forward": true, "src_ids": "2021.sigmorphon-1.21_7465"}
{"input": "comparison features is used for Method| context: neural em models learn vector representation of entity descriptions and match entities end - to - end . though robust , these methods require many annotated resources for training , and lack of interpretability .", "entity": "comparison features", "output": "kat induction", "neg_sample": ["comparison features is used for Method", "neural em models learn vector representation of entity descriptions and match entities end - to - end .", "though robust , these methods require many annotated resources for training , and lack of interpretability ."], "relation": "used for", "id": "2021.acl-long.215", "year": 2021, "rel_sent": "Using a set of comparison features and a limited amount of annotated data , KAT Induction learns an efficient decision tree that can be interpreted by generating entity matching rules whose structure is advocated by domain experts .", "forward": true, "src_ids": "2021.acl-long.215_6129"}
{"input": "in - domain and out - of - domain cws is done by using Method| context: pre - trained language models ( e.g. , bert ) significantly alleviate two traditional challenging problems for chinese word segmentation ( cws ): segmentation ambiguity and out - of - vocabulary ( oov ) words . however , such improvements are usually achieved on traditional benchmark datasets and not close to an important goal of cws : practicability ( i.e. , low complexity as a standalone task and high beneficiality to downstream tasks ) .", "entity": "in - domain and out - of - domain cws", "output": "neural method", "neg_sample": ["in - domain and out - of - domain cws is done by using Method", "pre - trained language models ( e.g. , bert ) significantly alleviate two traditional challenging problems for chinese word segmentation ( cws ): segmentation ambiguity and out - of - vocabulary ( oov ) words .", "however , such improvements are usually achieved on traditional benchmark datasets and not close to an important goal of cws : practicability ( i.e. , low complexity as a standalone task and high beneficiality to downstream tasks ) ."], "relation": "used for", "id": "2021.findings-acl.383", "year": 2021, "rel_sent": "The neural method consists of a teacher model and a student model , which distills knowledge from unlabeled data to the student model so as to improve both in - domain and out - of - domain CWS .", "forward": false, "src_ids": "2021.findings-acl.383_4753"}
{"input": "relation - aware graph neural network is used for OtherScientificTerm| context: previous studies utilize pre - trained models on large - scale corpora such as bert , or perform reasoning on knowledge graphs . however , these methods do not explicitly model the relations that connect entities , which are informational and can be used to enhance reasoning .", "entity": "relation - aware graph neural network", "output": "rich contextual information", "neg_sample": ["relation - aware graph neural network is used for OtherScientificTerm", "previous studies utilize pre - trained models on large - scale corpora such as bert , or perform reasoning on knowledge graphs .", "however , these methods do not explicitly model the relations that connect entities , which are informational and can be used to enhance reasoning ."], "relation": "used for", "id": "2021.conll-1.35", "year": 2021, "rel_sent": "Our method uses a relation - aware graph neural network to capture the rich contextual information from both entities and relations .", "forward": true, "src_ids": "2021.conll-1.35_7064"}
{"input": "quantified statements is done by using Task| context: literary texts feature a rich variety in expressing quantification , including a broad range of lexemes to express quantifiers and complex sentence structures to express the restrictor and the nuclear scope of a quantification .", "entity": "quantified statements", "output": "annotation of quantification", "neg_sample": ["quantified statements is done by using Task", "literary texts feature a rich variety in expressing quantification , including a broad range of lexemes to express quantifiers and complex sentence structures to express the restrictor and the nuclear scope of a quantification ."], "relation": "used for", "id": "2021.isa-1.3", "year": 2021, "rel_sent": "We present a tagset for the annotation of quantification which we currently use to annotate certain quantified statements in fictional works of literature .", "forward": false, "src_ids": "2021.isa-1.3_8112"}
{"input": "locality assumption is used for OtherScientificTerm| context: document - level mt models are still far from satisfactory . existing work extend translation unit from single sentence to multiple sentences . however , study shows that when we further enlarge the translation unit to a whole document , supervised training of transformer can fail . in this paper , we find such failure is not caused by overfitting , but by sticking around local minima during training .", "entity": "locality assumption", "output": "inductive bias", "neg_sample": ["locality assumption is used for OtherScientificTerm", "document - level mt models are still far from satisfactory .", "existing work extend translation unit from single sentence to multiple sentences .", "however , study shows that when we further enlarge the translation unit to a whole document , supervised training of transformer can fail .", "in this paper , we find such failure is not caused by overfitting , but by sticking around local minima during training ."], "relation": "used for", "id": "2021.acl-long.267", "year": 2021, "rel_sent": "As a solution , we propose G - Transformer , introducing locality assumption as an inductive bias into Transformer , reducing the hypothesis space of the attention from target to source .", "forward": true, "src_ids": "2021.acl-long.267_5684"}
{"input": "humour detection is done by using Method| context: humour detection is an interesting but difficult task in nlp . because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand .", "entity": "humour detection", "output": "pre - trained distilbert model", "neg_sample": ["humour detection is done by using Method", "humour detection is an interesting but difficult task in nlp .", "because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand ."], "relation": "used for", "id": "2021.semeval-1.166", "year": 2021, "rel_sent": "UoR at SemEval-2021 Task 7 : Utilizing Pre - trained DistilBERT Model and Multi - scale CNN for Humor Detection.", "forward": false, "src_ids": "2021.semeval-1.166_14851"}
{"input": "supervised topic modeling approach is used for Metric| context: estimating the effects of monetary policy is one of the fundamental research questions in monetary economics . many economies are facing ultra - low interest rate environments ever since the global financial crisis of 2007 - 9 . the covid pandemic recently reinforced this situation . in the us and europe , interest rates are close to ( or even below ) zero , which limits the scope of traditional monetary policy measures for central banks . dedicated central bank communication has hence become an increasingly important tool to steer and control market expectations these days . however , incorporating central bank language directly as features into economic models is still a very nascent research area . in particular , the content and effect of central bank speeches has been mostly neglected from monetary policy modelling so far .", "entity": "supervised topic modeling approach", "output": "monetary policy signal dispersion index", "neg_sample": ["supervised topic modeling approach is used for Metric", "estimating the effects of monetary policy is one of the fundamental research questions in monetary economics .", "many economies are facing ultra - low interest rate environments ever since the global financial crisis of 2007 - 9 . the covid pandemic recently reinforced this situation .", "in the us and europe , interest rates are close to ( or even below ) zero , which limits the scope of traditional monetary policy measures for central banks .", "dedicated central bank communication has hence become an increasingly important tool to steer and control market expectations these days .", "however , incorporating central bank language directly as features into economic models is still a very nascent research area .", "in particular , the content and effect of central bank speeches has been mostly neglected from monetary policy modelling so far ."], "relation": "used for", "id": "2021.econlp-1.12", "year": 2021, "rel_sent": "We use a supervised topic modeling approach that can deal with text as well as numeric covariates to estimate a monetary policy signal dispersion index along three key economic dimensions : GDP , CPI and unemployment .", "forward": true, "src_ids": "2021.econlp-1.12_6623"}
{"input": "decoding phase is done by using OtherScientificTerm| context: recent successes in deep generative modeling have led to significant advances in natural language generation ( nlg ) . incorporating entities into neural generation models has demonstrated great improvements by assisting to infer the summary topic and to generate coherent content .", "entity": "decoding phase", "output": "entity types", "neg_sample": ["decoding phase is done by using OtherScientificTerm", "recent successes in deep generative modeling have led to significant advances in natural language generation ( nlg ) .", "incorporating entities into neural generation models has demonstrated great improvements by assisting to infer the summary topic and to generate coherent content ."], "relation": "used for", "id": "2021.emnlp-main.56", "year": 2021, "rel_sent": "To enhance the role of entity in NLG , in this paper , we aim to model the entity type in the decoding phase to generate contextual words accurately .", "forward": false, "src_ids": "2021.emnlp-main.56_12948"}
{"input": "style - augmented translation models is used for OtherScientificTerm| context: one key ingredient of neural machine translation is the use of large datasets from different domains and resources ( e.g. europarl , ted talks ) . these datasets contain documents translated by professional translators using different but consistent translation styles . despite that , the model is usually trained in a way that neither explicitly captures the variety of translation styles present in the data nor translates new data in different and controllable styles .", "entity": "style - augmented translation models", "output": "style variations of translators", "neg_sample": ["style - augmented translation models is used for OtherScientificTerm", "one key ingredient of neural machine translation is the use of large datasets from different domains and resources ( e.g.", "europarl , ted talks ) .", "these datasets contain documents translated by professional translators using different but consistent translation styles .", "despite that , the model is usually trained in a way that neither explicitly captures the variety of translation styles present in the data nor translates new data in different and controllable styles ."], "relation": "used for", "id": "2021.naacl-main.94", "year": 2021, "rel_sent": "We show that our style - augmented translation models are able to capture the style variations of translators and to generate translations with different styles on new data .", "forward": true, "src_ids": "2021.naacl-main.94_2017"}
{"input": "task generation scheme is used for Task| context: meta - learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples . however , applications to structured prediction and textual tasks pose challenges for meta - learning algorithms .", "entity": "task generation scheme", "output": "training", "neg_sample": ["task generation scheme is used for Task", "meta - learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples .", "however , applications to structured prediction and textual tasks pose challenges for meta - learning algorithms ."], "relation": "used for", "id": "2021.metanlp-1.6", "year": 2021, "rel_sent": "We propose a task generation scheme for converting classical NER datasets into the few - shot setting , for both training and evaluation .", "forward": true, "src_ids": "2021.metanlp-1.6_309"}
{"input": "masked language modeling is done by using Method| context: masked language modeling ( mlm ) , a self - supervised pretraining objective , is widely used in natural language processing for learning text representations . mlm trains a model to predict a random sample of input tokens that have been replaced by a [ mask ] placeholder in a multi - class setting over the entire vocabulary . when pretraining , it is common to use alongside mlm other auxiliary objectives on the token or sequence level to improve downstream performance ( e.g. next sentence prediction ) . however , no previous work so far has attempted in examining whether other simpler linguistically intuitive or not objectives can be used standalone as main pretraining objectives .", "entity": "masked language modeling", "output": "pretraining alternatives", "neg_sample": ["masked language modeling is done by using Method", "masked language modeling ( mlm ) , a self - supervised pretraining objective , is widely used in natural language processing for learning text representations .", "mlm trains a model to predict a random sample of input tokens that have been replaced by a [ mask ] placeholder in a multi - class setting over the entire vocabulary .", "when pretraining , it is common to use alongside mlm other auxiliary objectives on the token or sequence level to improve downstream performance ( e.g. next sentence prediction ) .", "however , no previous work so far has attempted in examining whether other simpler linguistically intuitive or not objectives can be used standalone as main pretraining objectives ."], "relation": "used for", "id": "2021.emnlp-main.249", "year": 2021, "rel_sent": "Frustratingly Simple Pretraining Alternatives to Masked Language Modeling.", "forward": false, "src_ids": "2021.emnlp-main.249_12359"}
{"input": "small neural networks is used for OtherScientificTerm| context: in this paper , we explore the task of automatically generating natural language descriptions of salient patterns in a time series , such as stock prices of a company over a week . a model for this task should be able to extract high - level patterns such as presence of a peak or a dip . while typical contemporary neural models with attention mechanisms can generate fluent output descriptions for this task , they often generate factually incorrect descriptions .", "entity": "small neural networks", "output": "numerical patterns", "neg_sample": ["small neural networks is used for OtherScientificTerm", "in this paper , we explore the task of automatically generating natural language descriptions of salient patterns in a time series , such as stock prices of a company over a week .", "a model for this task should be able to extract high - level patterns such as presence of a peak or a dip .", "while typical contemporary neural models with attention mechanisms can generate fluent output descriptions for this task , they often generate factually incorrect descriptions ."], "relation": "used for", "id": "2021.emnlp-main.55", "year": 2021, "rel_sent": "A program in our model is constructed from modules , which are small neural networks that are designed to capture numerical patterns and temporal information .", "forward": true, "src_ids": "2021.emnlp-main.55_8623"}
{"input": "attention is used for Method| context: it is popular that neural graph - based models are applied in existing aspect - based sentiment analysis ( absa ) studies for utilizing word relations through dependency parses to facilitate the task with better semantic guidance for analyzing context and aspect words . however , most of these studies only leverage dependency relations without considering their dependency types , and are limited in lacking efficient mechanisms to distinguish the important relations as well as learn from different layers of graph based models .", "entity": "attention", "output": "t - gcn", "neg_sample": ["attention is used for Method", "it is popular that neural graph - based models are applied in existing aspect - based sentiment analysis ( absa ) studies for utilizing word relations through dependency parses to facilitate the task with better semantic guidance for analyzing context and aspect words .", "however , most of these studies only leverage dependency relations without considering their dependency types , and are limited in lacking efficient mechanisms to distinguish the important relations as well as learn from different layers of graph based models ."], "relation": "used for", "id": "2021.naacl-main.231", "year": 2021, "rel_sent": "To address such limitations , in this paper , we propose an approach to explicitly utilize dependency types for ABSA with type - aware graph convolutional networks ( T - GCN ) , where attention is used in T - GCN to distinguish different edges ( relations ) in the graph and attentive layer ensemble is proposed to comprehensively learn from different layers of T - GCN .", "forward": true, "src_ids": "2021.naacl-main.231_6297"}
{"input": "off - policy maximum entropy model is used for Task| context: reinforcement learning ( rl ) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time . when applied to neural machine translation ( mt ) , it minimises the mismatch between the cross - entropy loss and non - differentiable evaluation metrics like bleu . however , the suitability of these metrics as reward function at training time is questionable : they tend to be sparse and biased towards the specific words used in the reference texts .", "entity": "off - policy maximum entropy model", "output": "language generation applications", "neg_sample": ["off - policy maximum entropy model is used for Task", "reinforcement learning ( rl ) is a powerful framework to address the discrepancy between loss functions used during training and the final evaluation metrics to be used at test time .", "when applied to neural machine translation ( mt ) , it minimises the mismatch between the cross - entropy loss and non - differentiable evaluation metrics like bleu .", "however , the suitability of these metrics as reward function at training time is questionable : they tend to be sparse and biased towards the specific words used in the reference texts ."], "relation": "used for", "id": "2021.eacl-main.164", "year": 2021, "rel_sent": "We base our proposals on the Soft Actor - Critic ( SAC ) framework , adapting the off - policy maximum entropy model for language generation applications such as MT .", "forward": true, "src_ids": "2021.eacl-main.164_4359"}
{"input": "logical table - to - text generation is done by using Method| context: the task remains challenging where deep learning models often generated linguistically fluent but logically inconsistent text . the underlying reason may be that deep learning models often capture surface - level spurious correlations rather than the causal relationships between the table x and the sentence y. specifically , in the training stage , a model can get a low empirical loss without understanding x and use spurious statistical cues instead .", "entity": "logical table - to - text generation", "output": "de - confounded variational encoder - decoder", "neg_sample": ["logical table - to - text generation is done by using Method", "the task remains challenging where deep learning models often generated linguistically fluent but logically inconsistent text .", "the underlying reason may be that deep learning models often capture surface - level spurious correlations rather than the causal relationships between the table x and the sentence y. specifically , in the training stage , a model can get a low empirical loss without understanding x and use spurious statistical cues instead ."], "relation": "used for", "id": "2021.acl-long.430", "year": 2021, "rel_sent": "De - Confounded Variational Encoder - Decoder for Logical Table - to - Text Generation.", "forward": false, "src_ids": "2021.acl-long.430_10389"}
{"input": "scaffolded input is used for OtherScientificTerm| context: the recurrent neural network ( rnn ) language model is a powerful tool for learning arbitrary sequential dependencies in language data . despite its enormous success in representing lexical sequences , little is known about the quality of the lexical representations that it acquires . in this work , we conjecture that it is straightforward to extract lexical representations ( i.e. static word embeddings ) from an rnn , but that the amount of semantic information that is encoded is limited when lexical items in the training data provide redundant semantic information .", "entity": "scaffolded input", "output": "atomic organization", "neg_sample": ["scaffolded input is used for OtherScientificTerm", "the recurrent neural network ( rnn ) language model is a powerful tool for learning arbitrary sequential dependencies in language data .", "despite its enormous success in representing lexical sequences , little is known about the quality of the lexical representations that it acquires .", "in this work , we conjecture that it is straightforward to extract lexical representations ( i.e.", "static word embeddings ) from an rnn , but that the amount of semantic information that is encoded is limited when lexical items in the training data provide redundant semantic information ."], "relation": "used for", "id": "2021.conll-1.32", "year": 2021, "rel_sent": "Scaffolded input promotes atomic organization in the recurrent neural network language model.", "forward": true, "src_ids": "2021.conll-1.32_10125"}
{"input": "dependency parsing model is used for Task| context: previous works on key information extraction from visually rich documents ( vrds ) mainly focus on labeling the text within each bounding box ( i.e. , semantic entity ) , while the relations in - between are largely unexplored .", "entity": "dependency parsing model", "output": "entity relation extraction task", "neg_sample": ["dependency parsing model is used for Task", "previous works on key information extraction from visually rich documents ( vrds ) mainly focus on labeling the text within each bounding box ( i.e. , semantic entity ) , while the relations in - between are largely unexplored ."], "relation": "used for", "id": "2021.emnlp-main.218", "year": 2021, "rel_sent": "In this paper , we adapt the popular dependency parsing model , the biaffine parser , to this entity relation extraction task .", "forward": true, "src_ids": "2021.emnlp-main.218_3910"}
{"input": "discontinuous trees is done by using Method| context: the combined use of neural scoring systems and bert fine - tuning has led to very high results in many natural language processing ( nlp ) tasks . these high results raise two important questions about the contribution and the limitations of pretrained language models : ( i ) what are the remaining errors in the best - performing systems ? ( ii ) what are the types of test examples where pretrained language models help the most ?", "entity": "discontinuous trees", "output": "berkeley parser analyser", "neg_sample": ["discontinuous trees is done by using Method", "the combined use of neural scoring systems and bert fine - tuning has led to very high results in many natural language processing ( nlp ) tasks .", "these high results raise two important questions about the contribution and the limitations of pretrained language models : ( i ) what are the remaining errors in the best - performing systems ?", "( ii ) what are the types of test examples where pretrained language models help the most ?"], "relation": "used for", "id": "2021.findings-acl.288", "year": 2021, "rel_sent": "Second , we extend the Berkeley Parser Analyser - a tool that classifies parsing errors according to predefined structural patterns - , to discontinuous trees .", "forward": false, "src_ids": "2021.findings-acl.288_1541"}
{"input": "finegrained and explicit cross - lingual information is done by using Generic| context: with the modeling of bidirectional contexts , recently prevalent language modeling approaches such as xlm achieve better performance than traditional methods based on embedding alignment , which strives to assign similar vector representations to semantic - equivalent units . however , such approaches like xlm capture cross - lingual information based solely on shared bpe vocabulary , resulting in the absence of fine - grained supervision induced by embedding alignment .", "entity": "finegrained and explicit cross - lingual information", "output": "similar representations", "neg_sample": ["finegrained and explicit cross - lingual information is done by using Generic", "with the modeling of bidirectional contexts , recently prevalent language modeling approaches such as xlm achieve better performance than traditional methods based on embedding alignment , which strives to assign similar vector representations to semantic - equivalent units .", "however , such approaches like xlm capture cross - lingual information based solely on shared bpe vocabulary , resulting in the absence of fine - grained supervision induced by embedding alignment ."], "relation": "used for", "id": "2021.findings-acl.149", "year": 2021, "rel_sent": "While predicting the masked words based on bidirectional contexts , the proposal also encodes semantic equivalents from different languages into similar representations to introduce more finegrained and explicit cross - lingual information .", "forward": false, "src_ids": "2021.findings-acl.149_11028"}
{"input": "limited training data is done by using Task| context: automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions . it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions .", "entity": "limited training data", "output": "translation - based data augmentation", "neg_sample": ["limited training data is done by using Task", "automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions .", "it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions ."], "relation": "used for", "id": "2021.emnlp-main.487", "year": 2021, "rel_sent": "We apply translation - based data augmentation to address the two problems of limited training data , and high data skew for multi - class ASAG tasks .", "forward": false, "src_ids": "2021.emnlp-main.487_13268"}
{"input": "over - transfer and under - transfer problems is done by using OtherScientificTerm| context: unsupervised text style transfer aims to alter the underlying style of the text to a desired value while keeping its style - independent semantics , without the support of parallel training corpora . existing methods struggle to achieve both high style conversion rate and low content loss , exhibiting the over - transfer and under - transfer problems . we attribute these problems to the conflicting driving forces of the style conversion goal and content preservation goal .", "entity": "over - transfer and under - transfer problems", "output": "mutual knowledge distillation", "neg_sample": ["over - transfer and under - transfer problems is done by using OtherScientificTerm", "unsupervised text style transfer aims to alter the underlying style of the text to a desired value while keeping its style - independent semantics , without the support of parallel training corpora .", "existing methods struggle to achieve both high style conversion rate and low content loss , exhibiting the over - transfer and under - transfer problems .", "we attribute these problems to the conflicting driving forces of the style conversion goal and content preservation goal ."], "relation": "used for", "id": "2021.emnlp-main.729", "year": 2021, "rel_sent": "As a result , mutual knowledge distillation drives both decoders to a better optimum and alleviates the over - transfer and under - transfer problems .", "forward": false, "src_ids": "2021.emnlp-main.729_4136"}
{"input": "hierarchical retrieval mechanism is used for OtherScientificTerm| context: medical report generation is one of the most challenging tasks in medical image analysis . although existing approaches have achieved promising results , they either require a predefined template database in order to retrieve sentences or ignore the hierarchical nature of medical report generation .", "entity": "hierarchical retrieval mechanism", "output": "report and sentence - level templates", "neg_sample": ["hierarchical retrieval mechanism is used for OtherScientificTerm", "medical report generation is one of the most challenging tasks in medical image analysis .", "although existing approaches have achieved promising results , they either require a predefined template database in order to retrieve sentences or ignore the hierarchical nature of medical report generation ."], "relation": "used for", "id": "2021.acl-long.387", "year": 2021, "rel_sent": "To address these issues , we propose MedWriter that incorporates a novel hierarchical retrieval mechanism to automatically extract both report and sentence - level templates for clinically accurate report generation .", "forward": true, "src_ids": "2021.acl-long.387_15073"}
{"input": "creation of semantic frames is done by using OtherScientificTerm| context: automatic understanding of specifications containing flexible word order and expressiveness close to natural language is a challenging task .", "entity": "creation of semantic frames", "output": "rules", "neg_sample": ["creation of semantic frames is done by using OtherScientificTerm", "automatic understanding of specifications containing flexible word order and expressiveness close to natural language is a challenging task ."], "relation": "used for", "id": "2021.depling-1.7", "year": 2021, "rel_sent": "It also demonstrated the ease of creating rules to generate the same semantic frame for specifications with the same meaning but different word order .", "forward": false, "src_ids": "2021.depling-1.7_10107"}
{"input": "neighborhood information is done by using Method| context: the tasks of rich semantic parsing , such as abstract meaning representation ( amr ) , share similar goals with information extraction ( ie ) to convert natural language texts into structured semantic representations .", "entity": "neighborhood information", "output": "amr based semantic graph aggregator", "neg_sample": ["neighborhood information is done by using Method", "the tasks of rich semantic parsing , such as abstract meaning representation ( amr ) , share similar goals with information extraction ( ie ) to convert natural language texts into structured semantic representations ."], "relation": "used for", "id": "2021.naacl-main.4", "year": 2021, "rel_sent": "Our framework consists of two novel components : 1 ) an AMR based semantic graph aggregator to let the candidate entity and event trigger nodes collect neighborhood information from AMR graph for passing message among related knowledge elements ; 2 ) an AMR guided graph decoder to extract knowledge elements based on the order decided by the hierarchical structures in AMR .", "forward": false, "src_ids": "2021.naacl-main.4_10606"}
{"input": "children 's early acquisition of semantic knowledge is done by using Task| context: children learn the meaning of words and sentences in their native language at an impressive speed and from highly ambiguous input . to account for this learning , previous computational modeling has focused mainly on the study of perception - based mechanisms like cross - situational learning . however , children do not learn only by exposure to the input . as soon as they start to talk , they practice their knowledge in social interactions and they receive feedback from their caregivers .", "entity": "children 's early acquisition of semantic knowledge", "output": "perception - based and production - based learning", "neg_sample": ["children 's early acquisition of semantic knowledge is done by using Task", "children learn the meaning of words and sentences in their native language at an impressive speed and from highly ambiguous input .", "to account for this learning , previous computational modeling has focused mainly on the study of perception - based mechanisms like cross - situational learning .", "however , children do not learn only by exposure to the input .", "as soon as they start to talk , they practice their knowledge in social interactions and they receive feedback from their caregivers ."], "relation": "used for", "id": "2021.conll-1.31", "year": 2021, "rel_sent": "Modeling the Interaction Between Perception - Based and Production - Based Learning in Children 's Early Acquisition of Semantic Knowledge.", "forward": false, "src_ids": "2021.conll-1.31_13201"}
{"input": "human - written adversarial dataset is used for Material| context: despite recent progress , state - of - the - art question answering models remain vulnerable to a variety of adversarial attacks . while dynamic adversarial data collection , in which a human annotator tries to write examples that fool a model - in - the - loop , can improve model robustness , this process is expensive which limits the scale of the collected data .", "entity": "human - written adversarial dataset", "output": "synthetic question - answer pairs", "neg_sample": ["human - written adversarial dataset is used for Material", "despite recent progress , state - of - the - art question answering models remain vulnerable to a variety of adversarial attacks .", "while dynamic adversarial data collection , in which a human annotator tries to write examples that fool a model - in - the - loop , can improve model robustness , this process is expensive which limits the scale of the collected data ."], "relation": "used for", "id": "2021.emnlp-main.696", "year": 2021, "rel_sent": "Using this approach , we amplify a smaller human - written adversarial dataset to a much larger set of synthetic question - answer pairs .", "forward": true, "src_ids": "2021.emnlp-main.696_1132"}
{"input": "transferring toxic sets is used for OtherScientificTerm| context: we explore different dimensions which affect the model 's performance .", "entity": "transferring toxic sets", "output": "toxic spans", "neg_sample": ["transferring toxic sets is used for OtherScientificTerm", "we explore different dimensions which affect the model 's performance ."], "relation": "used for", "id": "2021.semeval-1.119", "year": 2021, "rel_sent": "NLP_UIOWA at Semeval-2021 Task 5 : Transferring Toxic Sets to Tag Toxic Spans.", "forward": true, "src_ids": "2021.semeval-1.119_301"}
{"input": "resolution of natural gender phenomena is done by using Method| context: languages differ in terms of the absence or presence of gender features , the number of gender classes and whether and where gender features are explicitly marked . these cross - linguistic differences can lead to ambiguities that are difficult to resolve , especially for sentence - level mt systems . the identification of ambiguity and its subsequent resolution is a challenging task for which currently there are n't any specific resources or challenge sets available .", "entity": "resolution of natural gender phenomena", "output": "gender - it", "neg_sample": ["resolution of natural gender phenomena is done by using Method", "languages differ in terms of the absence or presence of gender features , the number of gender classes and whether and where gender features are explicitly marked .", "these cross - linguistic differences can lead to ambiguities that are difficult to resolve , especially for sentence - level mt systems .", "the identification of ambiguity and its subsequent resolution is a challenging task for which currently there are n't any specific resources or challenge sets available ."], "relation": "used for", "id": "2021.gebnlp-1.1", "year": 2021, "rel_sent": "In this paper , we introduce gENder - IT , an English - Italian challenge set focusing on the resolution of natural gender phenomena by providing word - level gender tags on the English source side and multiple gender alternative translations , where needed , on the Italian target side .", "forward": false, "src_ids": "2021.gebnlp-1.1_2206"}
{"input": "candidate refinement is done by using OtherScientificTerm| context: automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions .", "entity": "candidate refinement", "output": "fine - grained semantic similarity", "neg_sample": ["candidate refinement is done by using OtherScientificTerm", "automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions ."], "relation": "used for", "id": "2021.sigdial-1.44", "year": 2021, "rel_sent": "We propose two decoupled deep learning architectures trained for ( i ) candidate generation via text classification for a user question , and ( ii ) learning fine - grained semantic similarity between user questions and the FAQ repository for candidate refinement .", "forward": false, "src_ids": "2021.sigdial-1.44_14105"}
{"input": "prediction process is done by using Method| context: many open - domain question answering problems can be cast as a textual entailment task , where a question and candidate answers are concatenated toform hypotheses . a qa system then determines if the supporting knowledge bases , regarded as potential premises , entail the hypotheses .", "entity": "prediction process", "output": "natural logic inference process", "neg_sample": ["prediction process is done by using Method", "many open - domain question answering problems can be cast as a textual entailment task , where a question and candidate answers are concatenated toform hypotheses .", "a qa system then determines if the supporting knowledge bases , regarded as potential premises , entail the hypotheses ."], "relation": "used for", "id": "2021.emnlp-main.298", "year": 2021, "rel_sent": "The natural logic inference process inherently provides evidence to help explain the prediction process .", "forward": false, "src_ids": "2021.emnlp-main.298_12153"}
{"input": "syntactic is done by using Method| context: most recent methods adopt syntax - based graph neural networks to extract the syntactic information from the dependency graph , thinking that would be beneficial for establishing relations between aspect and opinion words . however , these methods may ignore that some sentences have no remarkable syntactic structure , which causes the opposite judgement in sentiment analysis .", "entity": "syntactic", "output": "dm - gcn", "neg_sample": ["syntactic is done by using Method", "most recent methods adopt syntax - based graph neural networks to extract the syntactic information from the dependency graph , thinking that would be beneficial for establishing relations between aspect and opinion words .", "however , these methods may ignore that some sentences have no remarkable syntactic structure , which causes the opposite judgement in sentiment analysis ."], "relation": "used for", "id": "2021.findings-acl.232", "year": 2021, "rel_sent": "Our extensive experiments on SemEval 2014 and Twitter datasets confirm that DM - GCN fuses syntactic , semantic and their combinations optimally and outperforms all state - of - the - art alternatives with a large margin .", "forward": false, "src_ids": "2021.findings-acl.232_13623"}
{"input": "data mining pipeline is used for Material| context: rap generation , which aims to produce lyrics and corresponding singing beats , needs to model both rhymes and rhythms . previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance .", "entity": "data mining pipeline", "output": "large - scale rap dataset", "neg_sample": ["data mining pipeline is used for Material", "rap generation , which aims to produce lyrics and corresponding singing beats , needs to model both rhymes and rhythms .", "previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance ."], "relation": "used for", "id": "2021.acl-long.6", "year": 2021, "rel_sent": "Since there is no available rap datasets with rhythmic beats , we develop a data mining pipeline to collect a large - scale rap dataset , which includes a large number of rap songs with aligned lyrics and rhythmic beats .", "forward": true, "src_ids": "2021.acl-long.6_4595"}
{"input": "phrase - based neural topic model is used for OtherScientificTerm| context: phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness .", "entity": "phrase - based neural topic model", "output": "topics", "neg_sample": ["phrase - based neural topic model is used for OtherScientificTerm", "phrase representations derived from bert often do not exhibit complex phrasal compositionality , as the model relies instead on lexical similarity to determine semantic relatedness ."], "relation": "used for", "id": "2021.emnlp-main.846", "year": 2021, "rel_sent": "Finally , as a case study , we show that Phrase - BERT embeddings can be easily integrated with a simple autoencoder to build a phrase - based neural topic model that interprets topics as mixtures of words and phrases by performing a nearest neighbor search in the embedding space .", "forward": true, "src_ids": "2021.emnlp-main.846_14986"}
{"input": "text classification is done by using Task| context: healthcare predictive analytics aids medical decision - making , diagnosis prediction and drug review analysis . therefore , prediction accuracy is an important criteria which also necessitates robust predictive language models . however , the models using deep learning have been proven vulnerable towards insignificantly perturbed input instances which are less likely to be misclassified by humans . recent efforts of generating adversaries using rule - based synonyms and bert - mlms have been witnessed in general domain , but the ever - increasing biomedical literature poses unique challenges .", "entity": "text classification", "output": "bert - based biomedical adversarial example generation", "neg_sample": ["text classification is done by using Task", "healthcare predictive analytics aids medical decision - making , diagnosis prediction and drug review analysis .", "therefore , prediction accuracy is an important criteria which also necessitates robust predictive language models .", "however , the models using deep learning have been proven vulnerable towards insignificantly perturbed input instances which are less likely to be misclassified by humans .", "recent efforts of generating adversaries using rule - based synonyms and bert - mlms have been witnessed in general domain , but the ever - increasing biomedical literature poses unique challenges ."], "relation": "used for", "id": "2021.naacl-main.423", "year": 2021, "rel_sent": "BBAEG : Towards BERT - based Biomedical Adversarial Example Generation for Text Classification.", "forward": false, "src_ids": "2021.naacl-main.423_14108"}
{"input": "core argument schemes is done by using Method| context: this paper takes a first step towards a critical thinking curriculum for neural auto - regressive language models .", "entity": "core argument schemes", "output": "cript", "neg_sample": ["core argument schemes is done by using Method", "this paper takes a first step towards a critical thinking curriculum for neural auto - regressive language models ."], "relation": "used for", "id": "2021.iwcs-1.7", "year": 2021, "rel_sent": "CRiPT generalizes the core argument schemes in a correct way .", "forward": false, "src_ids": "2021.iwcs-1.7_13976"}
{"input": "deep differential amplifier framework is used for Task| context: the imbalanced classification of summarization is inherent , which ca n't be addressed by common algorithms easily .", "entity": "deep differential amplifier framework", "output": "extractive summarization", "neg_sample": ["deep differential amplifier framework is used for Task", "the imbalanced classification of summarization is inherent , which ca n't be addressed by common algorithms easily ."], "relation": "used for", "id": "2021.acl-long.31", "year": 2021, "rel_sent": "Deep Differential Amplifier for Extractive Summarization.", "forward": true, "src_ids": "2021.acl-long.31_15493"}
{"input": "sequence - pair classification is done by using Task| context: fine - tuning pre - trained language models for downstream tasks has become a norm for nlp . recently it is found that intermediate training can improve performance for fine - tuning language models for target tasks , high - level inference tasks such as question answering ( qa ) tend to work best as intermediate tasks . however it is not clear if intermediate training generally benefits various language models .", "entity": "sequence - pair classification", "output": "text classification", "neg_sample": ["sequence - pair classification is done by using Task", "fine - tuning pre - trained language models for downstream tasks has become a norm for nlp .", "recently it is found that intermediate training can improve performance for fine - tuning language models for target tasks , high - level inference tasks such as question answering ( qa ) tend to work best as intermediate tasks .", "however it is not clear if intermediate training generally benefits various language models ."], "relation": "used for", "id": "2021.alta-1.16", "year": 2021, "rel_sent": "In this paper , using the SQuAD-2.0 QA task for intermediate training for target text classification tasks , we experimented on eight tasks for single - sequence classification and eight tasks for sequence - pair classification using two base and two compact language models .", "forward": false, "src_ids": "2021.alta-1.16_6"}
{"input": "knowledge model is used for OtherScientificTerm| context: script reasoning infers subsequent events from a given event chain , which involves the ability to understand relations between events . a human - labeled script reasoning dataset is usually of small size with limited event relations , which highlights the necessity to leverage external eventuality knowledge graphs ( kg ) consisting of numerous triple facts to describe the inferential relation between events . existing methods adopt a retrieval and integration paradigm tofocus merely on the graph triples that have event overlap with a script , but ignore much more supportive triples in the kg with similar inferential patterns , leading to underexploiting .", "entity": "knowledge model", "output": "inferential relations", "neg_sample": ["knowledge model is used for OtherScientificTerm", "script reasoning infers subsequent events from a given event chain , which involves the ability to understand relations between events .", "a human - labeled script reasoning dataset is usually of small size with limited event relations , which highlights the necessity to leverage external eventuality knowledge graphs ( kg ) consisting of numerous triple facts to describe the inferential relation between events .", "existing methods adopt a retrieval and integration paradigm tofocus merely on the graph triples that have event overlap with a script , but ignore much more supportive triples in the kg with similar inferential patterns , leading to underexploiting ."], "relation": "used for", "id": "2021.findings-acl.403", "year": 2021, "rel_sent": "Tofully exploit the KG , we propose a knowledge model to learn the inferential relations between events from the whole eventuality KG and then support downstream models by directly capturing the relation between events in a script .", "forward": true, "src_ids": "2021.findings-acl.403_4062"}
{"input": "dialogues is used for Task| context: the acquisition of a dialogue corpus is a key step in the process of training a dialogue model . in this context , corpora acquisitions have been designed either for open - domain information retrieval or slot - filling ( e.g. restaurant booking ) tasks . however , there has been scarce research in the problem of collecting personal conversations with users over a long period of time .", "entity": "dialogues", "output": "mental health applications", "neg_sample": ["dialogues is used for Task", "the acquisition of a dialogue corpus is a key step in the process of training a dialogue model .", "in this context , corpora acquisitions have been designed either for open - domain information retrieval or slot - filling ( e.g.", "restaurant booking ) tasks .", "however , there has been scarce research in the problem of collecting personal conversations with users over a long period of time ."], "relation": "used for", "id": "2021.nlpmc-1.1", "year": 2021, "rel_sent": "In this paper we focus on the types of dialogues that are required for mental health applications .", "forward": true, "src_ids": "2021.nlpmc-1.1_6757"}
{"input": "fact verification is done by using Method| context: fact verification is a challenging task that requires simultaneously reasoning and aggregating over multiple retrieved pieces of evidence to evaluate the truthfulness of a claim . existing approaches typically ( i ) explore the semantic interaction between the claim and evidence at different granularity levels but fail to capture their topical consistency during the reasoning process , which we believe is crucial for verification ; ( ii ) aggregate multiple pieces of evidence equally without considering their implicit stances to the claim , thereby introducing spurious information .", "entity": "fact verification", "output": "topic - aware evidence reasoning", "neg_sample": ["fact verification is done by using Method", "fact verification is a challenging task that requires simultaneously reasoning and aggregating over multiple retrieved pieces of evidence to evaluate the truthfulness of a claim .", "existing approaches typically ( i ) explore the semantic interaction between the claim and evidence at different granularity levels but fail to capture their topical consistency during the reasoning process , which we believe is crucial for verification ; ( ii ) aggregate multiple pieces of evidence equally without considering their implicit stances to the claim , thereby introducing spurious information ."], "relation": "used for", "id": "2021.acl-long.128", "year": 2021, "rel_sent": "To alleviate the above issues , we propose a novel topic - aware evidence reasoning and stance - aware aggregation model for more accurate fact verification , with the following four key properties : 1 ) checking topical consistency between the claim and evidence ; 2 ) maintaining topical coherence among multiple pieces of evidence ; 3 ) ensuring semantic similarity between the global topic information and the semantic representation of evidence ; 4 ) aggregating evidence based on their implicit stances to the claim .", "forward": false, "src_ids": "2021.acl-long.128_11366"}
{"input": "prod2bert is used for Method| context: word embeddings ( e.g. , word2vec ) have been applied successfully to ecommerce products through prod2vec .", "entity": "prod2bert", "output": "representations of products", "neg_sample": ["prod2bert is used for Method", "word embeddings ( e.g.", ", word2vec ) have been applied successfully to ecommerce products through prod2vec ."], "relation": "used for", "id": "2021.ecnlp-1.1", "year": 2021, "rel_sent": "Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings , we propose to transfer BERT - like architectures to eCommerce : our model - Prod2BERT - is trained to generate representations of products through masked session modeling .", "forward": true, "src_ids": "2021.ecnlp-1.1_1942"}
{"input": "relation - aware sentence embeddings is used for Task| context: though language model text embeddings have revolutionized nlp research , their ability to capture high - level semantic information , such as relations between entities in text , is limited .", "entity": "relation - aware sentence embeddings", "output": "relation extraction task", "neg_sample": ["relation - aware sentence embeddings is used for Task", "though language model text embeddings have revolutionized nlp research , their ability to capture high - level semantic information , such as relations between entities in text , is limited ."], "relation": "used for", "id": "2021.conll-1.27", "year": 2021, "rel_sent": "The resulting relation - aware sentence embeddings achieve state - of - the - art results on the relation extraction task using only a simple KNN classifier , thereby demonstrating the success of the proposed method .", "forward": true, "src_ids": "2021.conll-1.27_13818"}
{"input": "hybrid framework is used for OtherScientificTerm| context: however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql . furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability .", "entity": "hybrid framework", "output": "direct answers", "neg_sample": ["hybrid framework is used for OtherScientificTerm", "however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql .", "furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability ."], "relation": "used for", "id": "2021.acl-long.315", "year": 2021, "rel_sent": "In this paper , we propose a hybrid framework that takes both textual and tabular evidences as input and generates either direct answers or SQL queries depending on which form could better answer the question .", "forward": true, "src_ids": "2021.acl-long.315_14879"}
{"input": "entity model is used for Method| context: most recent work models these two subtasks jointly , either by casting them in one structured prediction framework , or performing multi - task learning through shared representations .", "entity": "entity model", "output": "relation model", "neg_sample": ["entity model is used for Method", "most recent work models these two subtasks jointly , either by casting them in one structured prediction framework , or performing multi - task learning through shared representations ."], "relation": "used for", "id": "2021.naacl-main.5", "year": 2021, "rel_sent": "Our approach essentially builds on two independent encoders and merely uses the entity model to construct the input for the relation model .", "forward": true, "src_ids": "2021.naacl-main.5_11863"}
{"input": "nlp models is used for OtherScientificTerm| context: seq2seq models have demonstrated their incredible effectiveness in a large variety of applications . these outputs may potentially hurt the usability of seq2seq models and make the end - users feel offended .", "entity": "nlp models", "output": "profanity", "neg_sample": ["nlp models is used for OtherScientificTerm", "seq2seq models have demonstrated their incredible effectiveness in a large variety of applications .", "these outputs may potentially hurt the usability of seq2seq models and make the end - users feel offended ."], "relation": "used for", "id": "2021.emnlp-main.418", "year": 2021, "rel_sent": "Extensive experimental results show that the proposed training framework can successfully prevent the NLP models from generating profanity .", "forward": true, "src_ids": "2021.emnlp-main.418_8040"}
{"input": "enhancing content preservation is used for Task| context: a common approach is to map a given sentence to content representation that is free of style , and the content representation is fed to a decoder with a target style . previous methods in filtering style completely remove tokens with style at the token level , which incurs the loss of content information .", "entity": "enhancing content preservation", "output": "text style transfer", "neg_sample": ["enhancing content preservation is used for Task", "a common approach is to map a given sentence to content representation that is free of style , and the content representation is fed to a decoder with a target style .", "previous methods in filtering style completely remove tokens with style at the token level , which incurs the loss of content information ."], "relation": "used for", "id": "2021.acl-long.8", "year": 2021, "rel_sent": "Enhancing Content Preservation in Text Style Transfer Using Reverse Attention and Conditional Layer Normalization.", "forward": true, "src_ids": "2021.acl-long.8_1866"}
{"input": "signal is used for Task| context: understanding how news media frame political issues is important due to its impact on public attitudes , yet hard to automate . computational approaches have largely focused on classifying the frame of a full news article while framing signals are often subtle and local . furthermore , automatic news analysis is a sensitive domain , and existing classifiers lack transparency in their predictions .", "entity": "signal", "output": "document - level frame classification", "neg_sample": ["signal is used for Task", "understanding how news media frame political issues is important due to its impact on public attitudes , yet hard to automate .", "computational approaches have largely focused on classifying the frame of a full news article while framing signals are often subtle and local .", "furthermore , automatic news analysis is a sensitive domain , and existing classifiers lack transparency in their predictions ."], "relation": "used for", "id": "2021.naacl-main.174", "year": 2021, "rel_sent": "This paper addresses both issues with a novel semi - supervised model , which jointly learns to embed local information about the events and related actors in a news article through an auto - encoding framework , and to leverage this signal for document - level frame classification .", "forward": true, "src_ids": "2021.naacl-main.174_10353"}
{"input": "orthogonal regularizer is used for Method| context: aspect - based sentiment analysis is a fine - grained sentiment classification task . recently , graph neural networks over dependency trees have been explored to explicitly model connections between aspects and opinion words . however , the improvement is limited due to the inaccuracy of the dependency parsing results and the informal expressions and complexity of online reviews .", "entity": "orthogonal regularizer", "output": "semgcn", "neg_sample": ["orthogonal regularizer is used for Method", "aspect - based sentiment analysis is a fine - grained sentiment classification task .", "recently , graph neural networks over dependency trees have been explored to explicitly model connections between aspects and opinion words .", "however , the improvement is limited due to the inaccuracy of the dependency parsing results and the informal expressions and complexity of online reviews ."], "relation": "used for", "id": "2021.acl-long.494", "year": 2021, "rel_sent": "The orthogonal regularizer encourages the SemGCN to learn semantically correlated words with less overlap for each word .", "forward": true, "src_ids": "2021.acl-long.494_5766"}
{"input": "multiprecedence is done by using Method| context: raimy ( 1999 ; 2000a ; 2000b ) proposed a graphical formalism for modeling reduplication , originallymostly focused on phonological overapplication in a derivational framework . this framework is now known as precedence - based phonology or multiprecedence phonology . raimy 's idea is that the segments at the input to the phonology are not totally ordered by precedence .", "entity": "multiprecedence", "output": "match - extend serialization algorithm", "neg_sample": ["multiprecedence is done by using Method", "raimy ( 1999 ; 2000a ; 2000b ) proposed a graphical formalism for modeling reduplication , originallymostly focused on phonological overapplication in a derivational framework .", "this framework is now known as precedence - based phonology or multiprecedence phonology .", "raimy 's idea is that the segments at the input to the phonology are not totally ordered by precedence ."], "relation": "used for", "id": "2021.sigmorphon-1.3", "year": 2021, "rel_sent": "The Match - Extend serialization algorithm in Multiprecedence.", "forward": false, "src_ids": "2021.sigmorphon-1.3_6577"}
{"input": "less - resourced language pair is done by using Material| context: our monolingual corpora comprise 1.88 million manipuri sentences and 1.45 million english sentences , and our parallel corpus comprises 124,975 manipuri - english sentence pairs .", "entity": "less - resourced language pair", "output": "sentence - level comparable text corpus", "neg_sample": ["less - resourced language pair is done by using Material", "our monolingual corpora comprise 1.88 million manipuri sentences and 1.45 million english sentences , and our parallel corpus comprises 124,975 manipuri - english sentence pairs ."], "relation": "used for", "id": "2021.bucc-1.8", "year": 2021, "rel_sent": "In this paper , we introduce a sentence - level comparable text corpus crawled and created for the less - resourced language pair , Manipuri(mni ) and English ( eng ) .", "forward": false, "src_ids": "2021.bucc-1.8_9785"}
{"input": "joint embedding loss is used for OtherScientificTerm| context: hierarchical text classification is an important yet challenging task due to the complex structure of the label hierarchy . existing methods ignore the semantic relationship between text and labels , so they can not make full use of the hierarchical information .", "entity": "joint embedding loss", "output": "matching relationship", "neg_sample": ["joint embedding loss is used for OtherScientificTerm", "hierarchical text classification is an important yet challenging task due to the complex structure of the label hierarchy .", "existing methods ignore the semantic relationship between text and labels , so they can not make full use of the hierarchical information ."], "relation": "used for", "id": "2021.acl-long.337", "year": 2021, "rel_sent": "We then introduce a joint embedding loss and a matching learning loss to model the matching relationship between the text semantics and the label semantics .", "forward": true, "src_ids": "2021.acl-long.337_10136"}
{"input": "lexicon enhanced bert ( lebert ) is used for Task| context: however , existing methods solely fuse lexicon features via a shallow and random initialized sequence layer and do not integrate them into the bottom layers of bert .", "entity": "lexicon enhanced bert ( lebert )", "output": "chinese sequence labeling", "neg_sample": ["lexicon enhanced bert ( lebert ) is used for Task", "however , existing methods solely fuse lexicon features via a shallow and random initialized sequence layer and do not integrate them into the bottom layers of bert ."], "relation": "used for", "id": "2021.acl-long.454", "year": 2021, "rel_sent": "In this paper , we propose Lexicon Enhanced BERT ( LEBERT ) for Chinese sequence labeling , which integrates external lexicon knowledge into BERT layers directly by a Lexicon Adapter layer .", "forward": true, "src_ids": "2021.acl-long.454_5943"}
{"input": "structured commonsense reasoning is done by using Task| context: recent commonsense - reasoning tasks are typically discriminative in nature , where a model answers a multiple - choice question for a certain context . discriminative tasks are limiting because they fail to adequately evaluate the model 's ability to reason and explain predictions with underlying commonsense knowledge . they also allow such models to use reasoning shortcuts and not be ' right for the right reasons ' .", "entity": "structured commonsense reasoning", "output": "explanation graph generation task", "neg_sample": ["structured commonsense reasoning is done by using Task", "recent commonsense - reasoning tasks are typically discriminative in nature , where a model answers a multiple - choice question for a certain context .", "discriminative tasks are limiting because they fail to adequately evaluate the model 's ability to reason and explain predictions with underlying commonsense knowledge .", "they also allow such models to use reasoning shortcuts and not be ' right for the right reasons ' ."], "relation": "used for", "id": "2021.emnlp-main.609", "year": 2021, "rel_sent": "ExplaGraphs : An Explanation Graph Generation Task for Structured Commonsense Reasoning.", "forward": false, "src_ids": "2021.emnlp-main.609_15330"}
{"input": "bone rotation data is done by using OtherScientificTerm| context: the presentation is accompanied by the source code .", "entity": "bone rotation data", "output": "skeleton", "neg_sample": ["bone rotation data is done by using OtherScientificTerm", "the presentation is accompanied by the source code ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.8", "year": 2021, "rel_sent": "From the anatomically independent landmarks , we create another skeleton based on the avatar 's skeletal bone architecture to calculate the bone rotation data .", "forward": false, "src_ids": "2021.mtsummit-at4ssl.8_8666"}
{"input": "resisting strategies is done by using Method| context: modelling persuasion strategies as predictors of task outcome has several real - world applications and has received considerable attention from the computational linguistics community . however , previous research has failed to account for the resisting strategies employed by an individual tofoil such persuasion attempts . grounded in prior literature in cognitive and social psychology , we propose a generalised framework for identifying resisting strategies in persuasive conversations .", "entity": "resisting strategies", "output": "hierarchical sequence - labelling neural architecture", "neg_sample": ["resisting strategies is done by using Method", "modelling persuasion strategies as predictors of task outcome has several real - world applications and has received considerable attention from the computational linguistics community .", "however , previous research has failed to account for the resisting strategies employed by an individual tofoil such persuasion attempts .", "grounded in prior literature in cognitive and social psychology , we propose a generalised framework for identifying resisting strategies in persuasive conversations ."], "relation": "used for", "id": "2021.eacl-main.7", "year": 2021, "rel_sent": "We also leverage a hierarchical sequence - labelling neural architecture to infer the aforementioned resisting strategies automatically .", "forward": false, "src_ids": "2021.eacl-main.7_9549"}
{"input": "generating feedback comments is used for Task| context: the task of generating explanatory notes for language learners is known as feedback comment generation . although various generation techniques are available , little is known about which methods are appropriate for this task . nagata ( 2019 ) demonstrates the effectiveness of neural - retrieval - based methods in generating feedback comments for preposition use . retrieval - based methods have limitations in that they can only output feedback comments existing in a given training data . furthermore , feedback comments can be made on other grammatical and writing items than preposition use , which is still unaddressed .", "entity": "generating feedback comments", "output": "writing learning", "neg_sample": ["generating feedback comments is used for Task", "the task of generating explanatory notes for language learners is known as feedback comment generation .", "although various generation techniques are available , little is known about which methods are appropriate for this task .", "nagata ( 2019 ) demonstrates the effectiveness of neural - retrieval - based methods in generating feedback comments for preposition use .", "retrieval - based methods have limitations in that they can only output feedback comments existing in a given training data .", "furthermore , feedback comments can be made on other grammatical and writing items than preposition use , which is still unaddressed ."], "relation": "used for", "id": "2021.emnlp-main.766", "year": 2021, "rel_sent": "Exploring Methods for Generating Feedback Comments for Writing Learning.", "forward": true, "src_ids": "2021.emnlp-main.766_5022"}
{"input": "probing tasks is used for Material| context: the success of pre - trained transformer language models has brought a great deal of interest on how these models work , and what they learn about language . however , prior research in the field is mainly devoted to english , and little is known regarding other languages .", "entity": "probing tasks", "output": "russian", "neg_sample": ["probing tasks is used for Material", "the success of pre - trained transformer language models has brought a great deal of interest on how these models work , and what they learn about language .", "however , prior research in the field is mainly devoted to english , and little is known regarding other languages ."], "relation": "used for", "id": "2021.bsnlp-1.6", "year": 2021, "rel_sent": "To this end , we introduce RuSentEval , an enhanced set of 14 probing tasks for Russian , including ones that have not been explored yet .", "forward": true, "src_ids": "2021.bsnlp-1.6_15235"}
{"input": "qmsum is used for Task| context: meetings are a key component of human collaboration . as increasing numbers of meetings are recorded and transcribed , meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed . however , it is hard to create a single short summary that covers all the content of a long meeting involving multiple people and topics .", "entity": "qmsum", "output": "long meeting summarization", "neg_sample": ["qmsum is used for Task", "meetings are a key component of human collaboration .", "as increasing numbers of meetings are recorded and transcribed , meeting summaries have become essential to remind those who may or may not have attended the meetings about the key decisions made and the tasks to be completed .", "however , it is hard to create a single short summary that covers all the content of a long meeting involving multiple people and topics ."], "relation": "used for", "id": "2021.naacl-main.472", "year": 2021, "rel_sent": "Experimental results and manual analysis reveal that QMSum presents significant challenges in long meeting summarization for future research .", "forward": true, "src_ids": "2021.naacl-main.472_10189"}
{"input": "distilmbert is used for OtherScientificTerm| context: while there has been significant progress towards developing nlu resources for indic languages , syntactic evaluation has been relatively less explored .", "entity": "distilmbert", "output": "syntax", "neg_sample": ["distilmbert is used for OtherScientificTerm", "while there has been significant progress towards developing nlu resources for indic languages , syntactic evaluation has been relatively less explored ."], "relation": "used for", "id": "2021.mrl-1.14", "year": 2021, "rel_sent": "Further , our layer - wise probing experiments reveal that while mBERT , DistilmBERT , and XLM - R localize the syntax in middle layers , the Indic language models do not show such syntactic localization .", "forward": true, "src_ids": "2021.mrl-1.14_9811"}
{"input": "synonym annotations is used for OtherScientificTerm| context: recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries .", "entity": "synonym annotations", "output": "table schemas", "neg_sample": ["synonym annotations is used for OtherScientificTerm", "recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries ."], "relation": "used for", "id": "2021.acl-long.195", "year": 2021, "rel_sent": "The first category of approaches utilizes additional synonym annotations for table schemas by modifying the model input , while the second category is based on adversarial training .", "forward": true, "src_ids": "2021.acl-long.195_15358"}
{"input": "uncertain missing modality problem is done by using Method| context: multimodal fusion has been proved to improve emotion recognition performance in previous works . however , in real - world applications , we often encounter the problem of missing modality , and which modalities will be missing is uncertain . it makes the fixed multimodal fusion fail in such cases .", "entity": "uncertain missing modality problem", "output": "missing modality imagination network ( mmin )", "neg_sample": ["uncertain missing modality problem is done by using Method", "multimodal fusion has been proved to improve emotion recognition performance in previous works .", "however , in real - world applications , we often encounter the problem of missing modality , and which modalities will be missing is uncertain .", "it makes the fixed multimodal fusion fail in such cases ."], "relation": "used for", "id": "2021.acl-long.203", "year": 2021, "rel_sent": "In this work , we propose a unified model , Missing Modality Imagination Network ( MMIN ) , to deal with the uncertain missing modality problem .", "forward": false, "src_ids": "2021.acl-long.203_7833"}
{"input": "meta - learning approaches is used for Task| context: however , deep learning models are notorious for being data and computation hungry . these downsides limit the application of such models from deployment to different domains , languages , countries , or styles , since collecting in - genre data and model training from scratch are costly . the long - tail nature of human language makes challenges even more significant . meta - learning has been shown to allow faster fine - tuning , converge to better performance , and achieve amazing results for few - shot learning in many applications . meta - learning is one of the most important new techniques in machine learning in recent years . there is a related tutorial in icml 2019 and a related course at stanford , but most of the example applications given in these materials are about image processing .", "entity": "meta - learning approaches", "output": "natural language processing", "neg_sample": ["meta - learning approaches is used for Task", "however , deep learning models are notorious for being data and computation hungry .", "these downsides limit the application of such models from deployment to different domains , languages , countries , or styles , since collecting in - genre data and model training from scratch are costly .", "the long - tail nature of human language makes challenges even more significant .", "meta - learning has been shown to allow faster fine - tuning , converge to better performance , and achieve amazing results for few - shot learning in many applications .", "meta - learning is one of the most important new techniques in machine learning in recent years .", "there is a related tutorial in icml 2019 and a related course at stanford , but most of the example applications given in these materials are about image processing ."], "relation": "used for", "id": "2021.acl-tutorials.3", "year": 2021, "rel_sent": "It is believed that meta - learning has great potential to be applied in NLP , and some works have been proposed with notable achievements in several relevant problems , e.g. , relation extraction , machine translation , and dialogue generation and state tracking .", "forward": true, "src_ids": "2021.acl-tutorials.3_11277"}
{"input": "retriever is done by using Method| context: recent advances in open - domain qa have led to strong models based on dense retrieval , but only focused on retrieving textual passages .", "entity": "retriever", "output": "pre - training procedure", "neg_sample": ["retriever is done by using Method", "recent advances in open - domain qa have led to strong models based on dense retrieval , but only focused on retrieving textual passages ."], "relation": "used for", "id": "2021.naacl-main.43", "year": 2021, "rel_sent": "We present an effective pre - training procedure for our retriever and improve retrieval quality with mined hard negatives .", "forward": false, "src_ids": "2021.naacl-main.43_4374"}
{"input": "zero - shot sequence labeling is done by using Task| context: one straightforward approach is utilizing existing systems ( source models ) to generate pseudo - labeled datasets and train a target sequence labeler accordingly . however , due to the gap between the source and the target languages / domains , this approach may fail to recover the true labels .", "entity": "zero - shot sequence labeling", "output": "risk minimization", "neg_sample": ["zero - shot sequence labeling is done by using Task", "one straightforward approach is utilizing existing systems ( source models ) to generate pseudo - labeled datasets and train a target sequence labeler accordingly .", "however , due to the gap between the source and the target languages / domains , this approach may fail to recover the true labels ."], "relation": "used for", "id": "2021.acl-long.380", "year": 2021, "rel_sent": "Risk Minimization for Zero - shot Sequence Labeling.", "forward": false, "src_ids": "2021.acl-long.380_4056"}
{"input": "hierarchical compare - aggregate mechanism is used for OtherScientificTerm| context: on many natural language processing tasks , large pre - trained language models ( plms ) have shown overwhelming performances compared with traditional neural network methods . nevertheless , their huge model size and low inference speed have hindered the deployment on resource - limited devices in practice .", "entity": "hierarchical compare - aggregate mechanism", "output": "hierarchical relationships", "neg_sample": ["hierarchical compare - aggregate mechanism is used for OtherScientificTerm", "on many natural language processing tasks , large pre - trained language models ( plms ) have shown overwhelming performances compared with traditional neural network methods .", "nevertheless , their huge model size and low inference speed have hindered the deployment on resource - limited devices in practice ."], "relation": "used for", "id": "2021.emnlp-main.250", "year": 2021, "rel_sent": "And to dynamically select the most representative prototypes for each domain , we propose a hierarchical compare - aggregate mechanism to capture hierarchical relationships .", "forward": true, "src_ids": "2021.emnlp-main.250_1636"}
{"input": "icelandic corpora is used for Task| context: we describe the process of conversion between the pos tagging schemes of two languages , the icelandic mim - gold tagging scheme and the faroese sosialurin tagging scheme . these tagging schemes are functionally similar but use separate ways to encode fine - grained morphological information on tokenised text .", "entity": "icelandic corpora", "output": "cross - lingual nlp applications", "neg_sample": ["icelandic corpora is used for Task", "we describe the process of conversion between the pos tagging schemes of two languages , the icelandic mim - gold tagging scheme and the faroese sosialurin tagging scheme .", "these tagging schemes are functionally similar but use separate ways to encode fine - grained morphological information on tokenised text ."], "relation": "used for", "id": "2021.nodalida-main.33", "year": 2021, "rel_sent": "As a product of our work , we present a provisional version of Icelandic corpora , prepared in the Faroese PoS tagging scheme , ready for use in cross - lingual NLP applications .", "forward": true, "src_ids": "2021.nodalida-main.33_8749"}
{"input": "minimal prediction preserving inputs is used for Task| context: recent work ( feng et al . , 2018 ) establishes the presence of short , uninterpretable input fragments that yield high confidence and accuracy in neural models . we refer to these as minimal prediction preserving inputs ( mppis ) .", "entity": "minimal prediction preserving inputs", "output": "question answering", "neg_sample": ["minimal prediction preserving inputs is used for Task", "recent work ( feng et al .", ", 2018 ) establishes the presence of short , uninterpretable input fragments that yield high confidence and accuracy in neural models .", "we refer to these as minimal prediction preserving inputs ( mppis ) ."], "relation": "used for", "id": "2021.naacl-main.101", "year": 2021, "rel_sent": "On the Transferability of Minimal Prediction Preserving Inputs in Question Answering.", "forward": true, "src_ids": "2021.naacl-main.101_1864"}
{"input": "data annotation tool is used for Material| context: currently , text chatting is one of the primary means of communication . however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it . since the task is novel , neither training nor gold - standard datasets for it have been created yet .", "entity": "data annotation tool", "output": "training and gold - standard data", "neg_sample": ["data annotation tool is used for Material", "currently , text chatting is one of the primary means of communication .", "however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it .", "since the task is novel , neither training nor gold - standard datasets for it have been created yet ."], "relation": "used for", "id": "2021.acl-srw.14", "year": 2021, "rel_sent": "In order to mitigate these inconveniences , we formulate the problem of situation - based summarization and propose a special data annotation tool intended for developing training and gold - standard data .", "forward": true, "src_ids": "2021.acl-srw.14_371"}
{"input": "distributed representations is used for OtherScientificTerm| context: the existing studies working on emotion detection usually focus on how to improve the performance of model prediction , in which emotions are represented with one - hot vectors . however , emotion relations are ignored in one - hot representations .", "entity": "distributed representations", "output": "emotion categories", "neg_sample": ["distributed representations is used for OtherScientificTerm", "the existing studies working on emotion detection usually focus on how to improve the performance of model prediction , in which emotions are represented with one - hot vectors .", "however , emotion relations are ignored in one - hot representations ."], "relation": "used for", "id": "2021.acl-long.184", "year": 2021, "rel_sent": "In this article , we first propose a general framework to learn the distributed representations for emotion categories in emotion space from a given emotion classification dataset .", "forward": true, "src_ids": "2021.acl-long.184_9703"}
{"input": "keyphrases is done by using Method| context: keyword or keyphrase extraction is to identify words or phrases presenting the main topics of a document .", "entity": "keyphrases", "output": "attentionrank", "neg_sample": ["keyphrases is done by using Method", "keyword or keyphrase extraction is to identify words or phrases presenting the main topics of a document ."], "relation": "used for", "id": "2021.emnlp-main.146", "year": 2021, "rel_sent": "This paper proposes the AttentionRank , a hybrid attention model , to identify keyphrases from a document in an unsupervised manner .", "forward": false, "src_ids": "2021.emnlp-main.146_2416"}
{"input": "collective relation integration is used for Task| context: integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering . however , the predictions are made independently , which can be mutually inconsistent .", "entity": "collective relation integration", "output": "open information extraction", "neg_sample": ["collective relation integration is used for Task", "integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering .", "however , the predictions are made independently , which can be mutually inconsistent ."], "relation": "used for", "id": "2021.acl-long.363", "year": 2021, "rel_sent": "CoRI : Collective Relation Integration with Data Augmentation for Open Information Extraction.", "forward": true, "src_ids": "2021.acl-long.363_2568"}
{"input": "pre - training is used for Method| context: this tutorial provides a comprehensive guide to make the most of pre - training for neural machine translation .", "entity": "pre - training", "output": "nmt", "neg_sample": ["pre - training is used for Method", "this tutorial provides a comprehensive guide to make the most of pre - training for neural machine translation ."], "relation": "used for", "id": "2021.acl-tutorials.4", "year": 2021, "rel_sent": "Firstly , we will briefly introduce the background of NMT , pre - training methodology , and point out the main challenges when applying pre - training for NMT .", "forward": true, "src_ids": "2021.acl-tutorials.4_909"}
{"input": "uncertainty - based sampling strategy is used for Material| context: self - training has proven effective for improving nmt performance by augmenting model training with synthetic parallel data .", "entity": "uncertainty - based sampling strategy", "output": "monolingual data", "neg_sample": ["uncertainty - based sampling strategy is used for Material", "self - training has proven effective for improving nmt performance by augmenting model training with synthetic parallel data ."], "relation": "used for", "id": "2021.acl-long.221", "year": 2021, "rel_sent": "Accordingly , we design an uncertainty - based sampling strategy to efficiently exploit the monolingual data for self - training , in which monolingual sentences with higher uncertainty would be sampled with higher probability .", "forward": true, "src_ids": "2021.acl-long.221_10637"}
{"input": "informative and engaging responses is done by using Method| context: growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation . however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation .", "entity": "informative and engaging responses", "output": "cr - walker", "neg_sample": ["informative and engaging responses is done by using Method", "growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation .", "however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation ."], "relation": "used for", "id": "2021.emnlp-main.139", "year": 2021, "rel_sent": "Automatic and human evaluations show that CR - Walker can arrive at more accurate recommendation , and generate more informative and engaging responses .", "forward": false, "src_ids": "2021.emnlp-main.139_2624"}
{"input": "computer - aided translation ( cat ) tools is done by using Method| context: in english , inflectional affixes in verbs include suffixes only ; unlike english , verbs in arabic derive voice , mood , tense , number and person through various inflectional affixes e.g. pre or post a verb root . if it is dealt with as a character intervention , are the types of intervention penalized equally or differently ?", "entity": "computer - aided translation ( cat ) tools", "output": "similarity measurement approach", "neg_sample": ["computer - aided translation ( cat ) tools is done by using Method", "in english , inflectional affixes in verbs include suffixes only ; unlike english , verbs in arabic derive voice , mood , tense , number and person through various inflectional affixes e.g.", "pre or post a verb root .", "if it is dealt with as a character intervention , are the types of intervention penalized equally or differently ?"], "relation": "used for", "id": "2021.triton-1.14", "year": 2021, "rel_sent": "The aim of this paper is to investigate the similarity measurement approach of translation memory ( TM ) in five representative computer - aided translation ( CAT ) tools when retrieving inflectional verb - variation sentences in Arabic to English translation .", "forward": false, "src_ids": "2021.triton-1.14_2096"}
{"input": "e - wer is done by using Method| context: automatic speech recognition ( asr ) systems are evaluated using word error rate ( wer ) , which is calculated by comparing the number of errors between the ground truth and the transcription of the asr system . this calculation , however , requires manual transcription of the speech signal to obtain the ground truth . however , while converting to a classification setting , these approaches suffer from heavy class imbalance .", "entity": "e - wer", "output": "bert based architecture", "neg_sample": ["e - wer is done by using Method", "automatic speech recognition ( asr ) systems are evaluated using word error rate ( wer ) , which is calculated by comparing the number of errors between the ground truth and the transcription of the asr system .", "this calculation , however , requires manual transcription of the speech signal to obtain the ground truth .", "however , while converting to a classification setting , these approaches suffer from heavy class imbalance ."], "relation": "used for", "id": "2021.eacl-main.320", "year": 2021, "rel_sent": "Within this paradigm , we also propose WER - BERT , a BERT based architecture with speech features for e - WER .", "forward": false, "src_ids": "2021.eacl-main.320_1120"}
{"input": "diverse response generation is done by using Method| context: there are two stages involved in the learning process .", "entity": "diverse response generation", "output": "fine - grained generation model", "neg_sample": ["diverse response generation is done by using Method", "there are two stages involved in the learning process ."], "relation": "used for", "id": "2021.findings-acl.222", "year": 2021, "rel_sent": "In the second stage , a fine - grained generation model and an evaluation model are further trained to learn diverse response generation and response coherence estimation , respectively .", "forward": false, "src_ids": "2021.findings-acl.222_1171"}
{"input": "domain generalization is done by using Method| context: the importance of building semantic parsers which can be applied to new domains and generate programs unseen at training has long been acknowledged , and datasets testing out - of - domain performance are becoming increasingly available . however , little or no attention has been devoted to learning algorithms or objectives which promote domain generalization , with virtually all existing approaches relying on standard supervised learning .", "entity": "domain generalization", "output": "meta - learning", "neg_sample": ["domain generalization is done by using Method", "the importance of building semantic parsers which can be applied to new domains and generate programs unseen at training has long been acknowledged , and datasets testing out - of - domain performance are becoming increasingly available .", "however , little or no attention has been devoted to learning algorithms or objectives which promote domain generalization , with virtually all existing approaches relying on standard supervised learning ."], "relation": "used for", "id": "2021.naacl-main.33", "year": 2021, "rel_sent": "Meta - Learning for Domain Generalization in Semantic Parsing.", "forward": false, "src_ids": "2021.naacl-main.33_13473"}
{"input": "sarcasm detection is done by using Method| context: detecting arguments in online interactions is useful to understand how conflicts arise and get resolved . users often use figurative language , such as sarcasm , either as persuasive devices or to attack the opponent by an ad hominem argument .", "entity": "sarcasm detection", "output": "multitask learning", "neg_sample": ["sarcasm detection is done by using Method", "detecting arguments in online interactions is useful to understand how conflicts arise and get resolved .", "users often use figurative language , such as sarcasm , either as persuasive devices or to attack the opponent by an ad hominem argument ."], "relation": "used for", "id": "2021.eacl-main.171", "year": 2021, "rel_sent": "We exploit joint modeling in terms of ( a ) applying discrete features that are useful in detecting sarcasm to the task of argumentative relation classification ( agree / disagree / none ) , and ( b ) multitask learning for argumentative relation classification and sarcasm detection using deep learning architectures ( e.g. , dual Long Short - Term Memory ( LSTM ) with hierarchical attention and Transformer - based architectures ) .", "forward": false, "src_ids": "2021.eacl-main.171_1602"}
{"input": "neural machine translation is done by using Method| context: large - scale transformers have been shown the state - of - the - art on neural machine translation . however , training these increasingly wider and deeper models could be tremendously memory intensive .", "entity": "neural machine translation", "output": "multi - split based reversible transformers", "neg_sample": ["neural machine translation is done by using Method", "large - scale transformers have been shown the state - of - the - art on neural machine translation .", "however , training these increasingly wider and deeper models could be tremendously memory intensive ."], "relation": "used for", "id": "2021.eacl-main.19", "year": 2021, "rel_sent": "Multi - split Reversible Transformers Can Enhance Neural Machine Translation.", "forward": false, "src_ids": "2021.eacl-main.19_16115"}
{"input": "gcdf - rpe is used for Method| context: relative position embedding ( rpe ) is a successful method to explicitly and efficaciously encode position information into transformer models .", "entity": "gcdf - rpe", "output": "prior encoding mechanism", "neg_sample": ["gcdf - rpe is used for Method", "relative position embedding ( rpe ) is a successful method to explicitly and efficaciously encode position information into transformer models ."], "relation": "used for", "id": "2021.emnlp-main.237", "year": 2021, "rel_sent": "GCDF - RPE utilizes the excellent properties of the Gaussian function to amend the prior encoding mechanism in XL - RPE .", "forward": true, "src_ids": "2021.emnlp-main.237_10049"}
{"input": "discourse relations is done by using OtherScientificTerm| context: abstractive conversation summarization has received growing attention while most current state - of - the - art summarization models heavily rely on human - annotated summaries .", "entity": "discourse relations", "output": "random swapping / deletion", "neg_sample": ["discourse relations is done by using OtherScientificTerm", "abstractive conversation summarization has received growing attention while most current state - of - the - art summarization models heavily rely on human - annotated summaries ."], "relation": "used for", "id": "2021.emnlp-main.530", "year": 2021, "rel_sent": "To reduce the dependence on labeled summaries , in this work , we present a simple yet effective set of Conversational Data Augmentation ( CODA ) methods for semi - supervised abstractive conversation summarization , such as random swapping / deletion to perturb the discourse relations inside conversations , dialogue - acts - guided insertion to interrupt the development of conversations , and conditional - generation - based substitution to substitute utterances with their paraphrases generated based on the conversation context .", "forward": false, "src_ids": "2021.emnlp-main.530_10827"}
{"input": "transformerbased encoder is used for OtherScientificTerm| context: interactive argument pair identification is essential in the context of dialogical argumentation mining . existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities . however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information .", "entity": "transformerbased encoder", "output": "representation of paths", "neg_sample": ["transformerbased encoder is used for OtherScientificTerm", "interactive argument pair identification is essential in the context of dialogical argumentation mining .", "existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities .", "however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information ."], "relation": "used for", "id": "2021.findings-acl.203", "year": 2021, "rel_sent": "In practice , we utilize Graph Convolutional Network ( GCN ) to learn the concept representation in the knowledge graph and use a Transformerbased encoder to learn the representation of paths .", "forward": true, "src_ids": "2021.findings-acl.203_3288"}
{"input": "infused knowledge is done by using Method| context: we study the problem of injecting knowledge into large pre - trained models like bert and roberta . existing methods typically update the original parameters of pre - trained models when injecting knowledge . however , when multiple kinds of knowledge are injected , they may suffer from catastrophic forgetting .", "entity": "infused knowledge", "output": "neural adapter", "neg_sample": ["infused knowledge is done by using Method", "we study the problem of injecting knowledge into large pre - trained models like bert and roberta .", "existing methods typically update the original parameters of pre - trained models when injecting knowledge .", "however , when multiple kinds of knowledge are injected , they may suffer from catastrophic forgetting ."], "relation": "used for", "id": "2021.findings-acl.121", "year": 2021, "rel_sent": "Taking RoBERTa as the pre - trained model , K - Adapter has a neural adapter for each kind of infused knowledge , like a plug - in connected to RoBERTa .", "forward": false, "src_ids": "2021.findings-acl.121_5062"}
{"input": "metadata is done by using OtherScientificTerm| context: multi - label document classification , associating one document instance with a set of relevant labels , is attracting more and more research attention . existing methods explore the incorporation of information beyond text , such as document metadata or label structure . these approaches however either simply utilize the semantic information of metadata or employ the predefined parent - child label hierarchy , ignoring the heterogeneous graphical structures of metadata and labels , which we believe are crucial for accurate multi - label document classification .", "entity": "metadata", "output": "metadata heterogeneous graph", "neg_sample": ["metadata is done by using OtherScientificTerm", "multi - label document classification , associating one document instance with a set of relevant labels , is attracting more and more research attention .", "existing methods explore the incorporation of information beyond text , such as document metadata or label structure .", "these approaches however either simply utilize the semantic information of metadata or employ the predefined parent - child label hierarchy , ignoring the heterogeneous graphical structures of metadata and labels , which we believe are crucial for accurate multi - label document classification ."], "relation": "used for", "id": "2021.emnlp-main.253", "year": 2021, "rel_sent": "One is metadata heterogeneous graph , which models various types of metadata and their topological relations .", "forward": false, "src_ids": "2021.emnlp-main.253_4899"}
{"input": "paraphrase generation is done by using OtherScientificTerm| context: paraphrases refer to texts that convey the same meaning with different expression forms . pivot - based methods , also known as the round - trip translation , have shown promising results in generating high - quality paraphrases . however , existing pivot - based methods all rely on language as the pivot , where large - scale , high - quality parallel bilingual texts are required .", "entity": "paraphrase generation", "output": "parallel data of paraphrases", "neg_sample": ["paraphrase generation is done by using OtherScientificTerm", "paraphrases refer to texts that convey the same meaning with different expression forms .", "pivot - based methods , also known as the round - trip translation , have shown promising results in generating high - quality paraphrases .", "however , existing pivot - based methods all rely on language as the pivot , where large - scale , high - quality parallel bilingual texts are required ."], "relation": "used for", "id": "2021.emnlp-main.350", "year": 2021, "rel_sent": "Besides , several unsupervised pivot - based methods can generate paraphrases with similar quality as the supervised sequence - to - sequence model , which indicates that parallel data of paraphrases may not be necessary for paraphrase generation .", "forward": false, "src_ids": "2021.emnlp-main.350_11320"}
{"input": "local language varieties is done by using Material| context: this paper measures similarity both within and between 84 language varieties across nine languages .", "entity": "local language varieties", "output": "digital geo - referenced corpora", "neg_sample": ["local language varieties is done by using Material", "this paper measures similarity both within and between 84 language varieties across nine languages ."], "relation": "used for", "id": "2021.vardial-1.4", "year": 2021, "rel_sent": "This provides further evidence that digital geo - referenced corpora consistently represent local language varieties .", "forward": false, "src_ids": "2021.vardial-1.4_328"}
{"input": "one - model - for - all solution is done by using Method| context: despite the widespread success of self - supervised learning via masked language models ( mlm ) , accurately capturing fine - grained semantic relationships in the biomedical domain remains a challenge . this is of paramount importance for entity - level tasks such as entity linking where the ability to model entity relations ( especially synonymy ) is pivotal .", "entity": "one - model - for - all solution", "output": "sapbert", "neg_sample": ["one - model - for - all solution is done by using Method", "despite the widespread success of self - supervised learning via masked language models ( mlm ) , accurately capturing fine - grained semantic relationships in the biomedical domain remains a challenge .", "this is of paramount importance for entity - level tasks such as entity linking where the ability to model entity relations ( especially synonymy ) is pivotal ."], "relation": "used for", "id": "2021.naacl-main.334", "year": 2021, "rel_sent": "In contrast with previous pipeline - based hybrid systems , SapBERT offers an elegant one - model - for - all solution to the problem of medical entity linking ( MEL ) , achieving a new state - of - the - art ( SOTA ) on six MEL benchmarking datasets .", "forward": false, "src_ids": "2021.naacl-main.334_2294"}
{"input": "translation is done by using Method| context: we present mioeind 's submission for the english->icelandic and icelandic->english subsets of the 2021 wmt news translation task .", "entity": "translation", "output": "mbart-25 model", "neg_sample": ["translation is done by using Method", "we present mioeind 's submission for the english->icelandic and icelandic->english subsets of the 2021 wmt news translation task ."], "relation": "used for", "id": "2021.wmt-1.9", "year": 2021, "rel_sent": "A pretrained mBART-25 model is then adapted for translation using parallel data as well as the last backtranslation iteration .", "forward": false, "src_ids": "2021.wmt-1.9_2326"}
{"input": "parallel and monolingual data is done by using Task| context: we submitted two uni - directional models , one for english->icelandic direction and other for icelandic->english direction .", "entity": "parallel and monolingual data", "output": "forward translation", "neg_sample": ["parallel and monolingual data is done by using Task", "we submitted two uni - directional models , one for english->icelandic direction and other for icelandic->english direction ."], "relation": "used for", "id": "2021.wmt-1.10", "year": 2021, "rel_sent": "Our news translation system is based on the transformer - big architecture , it makes use of corpora filtering , back - translation and forward translation applied to parallel and monolingual data alike", "forward": false, "src_ids": "2021.wmt-1.10_6472"}
{"input": "generation process is used for OtherScientificTerm| context: modern models for event causality identification ( eci ) are mainly based on supervised learning , which are prone to the data lacking problem . unfortunately , the existing nlp - related augmentation methods can not directly produce available data required for this task .", "entity": "generation process", "output": "task - related sentences", "neg_sample": ["generation process is used for OtherScientificTerm", "modern models for event causality identification ( eci ) are mainly based on supervised learning , which are prone to the data lacking problem .", "unfortunately , the existing nlp - related augmentation methods can not directly produce available data required for this task ."], "relation": "used for", "id": "2021.acl-long.276", "year": 2021, "rel_sent": "On the other hand , our approach employs a dual mechanism , which is a learnable augmentation framework , and can interactively adjust the generation process to generate task - related sentences .", "forward": true, "src_ids": "2021.acl-long.276_15354"}
{"input": "rule - based method is used for Task| context: although parsing to abstract meaning representation ( amr ) has become very popular and amr has been shown effective on the many sentence - level downstream tasks , little work has studied how to generate amrs that can represent multi - sentence information .", "entity": "rule - based method", "output": "text summarization", "neg_sample": ["rule - based method is used for Task", "although parsing to abstract meaning representation ( amr ) has become very popular and amr has been shown effective on the many sentence - level downstream tasks , little work has studied how to generate amrs that can represent multi - sentence information ."], "relation": "used for", "id": "2021.acl-long.324", "year": 2021, "rel_sent": "Besides , the document - level AMRs obtained by our model can significantly improve over the AMRs generated by a rule - based method ( Liu et al . , 2015 ) on text summarization .", "forward": true, "src_ids": "2021.acl-long.324_7168"}
{"input": "balancing methods is used for Task| context: it becomes even more challenging when class distribution is long - tailed . resampling and re - weighting are common approaches used for addressing the class imbalance problem , however , they are not effective when there is label dependency besides class imbalance because they result in oversampling of common labels .", "entity": "balancing methods", "output": "multi - label text classification", "neg_sample": ["balancing methods is used for Task", "it becomes even more challenging when class distribution is long - tailed .", "resampling and re - weighting are common approaches used for addressing the class imbalance problem , however , they are not effective when there is label dependency besides class imbalance because they result in oversampling of common labels ."], "relation": "used for", "id": "2021.emnlp-main.643", "year": 2021, "rel_sent": "Balancing Methods for Multi - label Text Classification with Long - Tailed Class Distribution.", "forward": true, "src_ids": "2021.emnlp-main.643_9268"}
{"input": "neural network is used for Task| context: neural networks have recently successfully learned to predict some pragmatic inferences ( e.g. , jeretic et al . ( 2020 ) ; jiang and de marneffe ( 2019 ) ) . for instance , schuster et al . ( 2020 ) trained a neural network to predict human ratings of scalar inference strength from ' some ' to the negation of a stronger alternative with ' all ' .", "entity": "neural network", "output": "scalar inferences", "neg_sample": ["neural network is used for Task", "neural networks have recently successfully learned to predict some pragmatic inferences ( e.g.", ", jeretic et al .", "( 2020 ) ; jiang and de marneffe ( 2019 ) ) .", "for instance , schuster et al .", "( 2020 ) trained a neural network to predict human ratings of scalar inference strength from ' some ' to the negation of a stronger alternative with ' all ' ."], "relation": "used for", "id": "2021.scil-1.54", "year": 2021, "rel_sent": "We thus explore to what extent a neural network can learn to predict a different widely studied scalar inference : that from ' or ' to the negation of a stronger alternative with ' and ' , as in ( 1 ) .", "forward": true, "src_ids": "2021.scil-1.54_12415"}
{"input": "apache spark framework is used for Task| context: bilingual dictionaries are essential resources in many areas of natural language processing tasks , but resource - scarce and less popular language pairs rarely have such . efficient automatic methods for inducting bilingual dictionaries are needed as manual resources and efforts are scarce for low - resourced languages . in this paper , we induce word translations using bilingual embedding .", "entity": "apache spark framework", "output": "parallel computation", "neg_sample": ["apache spark framework is used for Task", "bilingual dictionaries are essential resources in many areas of natural language processing tasks , but resource - scarce and less popular language pairs rarely have such .", "efficient automatic methods for inducting bilingual dictionaries are needed as manual resources and efforts are scarce for low - resourced languages .", "in this paper , we induce word translations using bilingual embedding ."], "relation": "used for", "id": "2021.bucc-1.2", "year": 2021, "rel_sent": "We use the Apache Spark framework for parallel computation .", "forward": true, "src_ids": "2021.bucc-1.2_16079"}
{"input": "lemmatizing transformation is used for Task| context: lemmatization is often used with morphologically rich languages to address issues caused by morphological complexity , performed by grammar - based lemmatizers .", "entity": "lemmatizing transformation", "output": "downstream processing", "neg_sample": ["lemmatizing transformation is used for Task", "lemmatization is often used with morphologically rich languages to address issues caused by morphological complexity , performed by grammar - based lemmatizers ."], "relation": "used for", "id": "2021.nodalida-main.25", "year": 2021, "rel_sent": "This facilitates an alternative processing pipeline that replaces traditional lemmatization with the lemmatizing transformation in downstream processing for any application .", "forward": true, "src_ids": "2021.nodalida-main.25_4283"}
{"input": "utterance and slot information is done by using Method| context: task - oriented dialog systems help a user achieve a particular goal by parsing user requests to execute a particular action . these systems typically require copious amounts of training data to effectively understand the user intent and its corresponding slots . acquiring large training corpora requires significant manual effort in annotation , rendering its construction infeasible for low - resource languages .", "entity": "utterance and slot information", "output": "machine translation ( mt ) system", "neg_sample": ["utterance and slot information is done by using Method", "task - oriented dialog systems help a user achieve a particular goal by parsing user requests to execute a particular action .", "these systems typically require copious amounts of training data to effectively understand the user intent and its corresponding slots .", "acquiring large training corpora requires significant manual effort in annotation , rendering its construction infeasible for low - resource languages ."], "relation": "used for", "id": "2021.dravidianlangtech-1.11", "year": 2021, "rel_sent": "First , we use a machine translation ( MT ) system to translate the utterance and slot information to the target language .", "forward": false, "src_ids": "2021.dravidianlangtech-1.11_9257"}
{"input": "backdoor adjustment is used for OtherScientificTerm| context: distant supervision tackles the data bottleneck in ner by automatically generating training instances via dictionary matching .", "entity": "backdoor adjustment", "output": "spurious correlations", "neg_sample": ["backdoor adjustment is used for OtherScientificTerm", "distant supervision tackles the data bottleneck in ner by automatically generating training instances via dictionary matching ."], "relation": "used for", "id": "2021.acl-long.371", "year": 2021, "rel_sent": "For intra - dictionary bias , we conduct backdoor adjustment to remove the spurious correlations introduced by the dictionary confounder .", "forward": true, "src_ids": "2021.acl-long.371_9858"}
{"input": "tolerance principle is used for Task| context: child language acquisition is famously accurate despite the sparsity of linguistic input .", "entity": "tolerance principle", "output": "segmentation", "neg_sample": ["tolerance principle is used for Task", "child language acquisition is famously accurate despite the sparsity of linguistic input ."], "relation": "used for", "id": "2021.scil-1.17", "year": 2021, "rel_sent": "Using UniMorph annotations as an approximation of children 's semantic representation of verbal inflection , we use the Tolerance Principle to explicitly identify the formal processes of segmentation and mutation that productively encode the semantic relations ( e.g. , past tense ) between stems and inflected forms .", "forward": true, "src_ids": "2021.scil-1.17_12356"}
{"input": "deep learning classifier is used for OtherScientificTerm| context: many automatic semantic relation extraction tools extract subject - predicate - object triples from unstructured text . however , a large quantity of these triples merely represent background knowledge .", "entity": "deep learning classifier", "output": "important triples", "neg_sample": ["deep learning classifier is used for OtherScientificTerm", "many automatic semantic relation extraction tools extract subject - predicate - object triples from unstructured text .", "however , a large quantity of these triples merely represent background knowledge ."], "relation": "used for", "id": "2021.ranlp-1.126", "year": 2021, "rel_sent": "This corpus is used to train a deep learning classifier to identify important triples , and we suggest that an importance ranking for semantic triples could also be generated .", "forward": true, "src_ids": "2021.ranlp-1.126_13580"}
{"input": "grammatical irregularities is done by using Method| context: we present experiments on assessing the grammatical correctness of learners ' answers in a language - learning system ( references to the system , and the links to the released data and code are withheld for anonymity ) .", "entity": "grammatical irregularities", "output": "pre - trained bert", "neg_sample": ["grammatical irregularities is done by using Method", "we present experiments on assessing the grammatical correctness of learners ' answers in a language - learning system ( references to the system , and the links to the released data and code are withheld for anonymity ) ."], "relation": "used for", "id": "2021.bea-1.15", "year": 2021, "rel_sent": "Our experiments show a. that pre - trained BERT performs worse at detecting grammatical irregularities for Russian than for English ; b. that fine - tuned BERT yields promising results on assessing the correctness of grammatical exercises ; and c. establish a new benchmark for Russian .", "forward": false, "src_ids": "2021.bea-1.15_7200"}
{"input": "apis is used for Material| context: freedom of the press and media is of vital importance for democratically organised states and open societies .", "entity": "apis", "output": "news and twitter content", "neg_sample": ["apis is used for Material", "freedom of the press and media is of vital importance for democratically organised states and open societies ."], "relation": "used for", "id": "2021.emnlp-demo.18", "year": 2021, "rel_sent": "This paper presents our work on the tool , starting with the training phase , which comprises defining the topic - related keywords to be used for querying APIs for news and Twitter content and evaluating different machine learning models based on a training dataset specifically created for our use case .", "forward": true, "src_ids": "2021.emnlp-demo.18_811"}
{"input": "neural path hunter is used for Task| context: dialogue systems powered by large pre - trained language models exhibit an innate ability to deliver fluent and natural - sounding responses . despite their impressive performance , these models are fitful and can often generate factually incorrect statements impeding their widespread adoption .", "entity": "neural path hunter", "output": "reducing hallucination", "neg_sample": ["neural path hunter is used for Task", "dialogue systems powered by large pre - trained language models exhibit an innate ability to deliver fluent and natural - sounding responses .", "despite their impressive performance , these models are fitful and can often generate factually incorrect statements impeding their widespread adoption ."], "relation": "used for", "id": "2021.emnlp-main.168", "year": 2021, "rel_sent": "Neural Path Hunter : Reducing Hallucination in Dialogue Systems via Path Grounding.", "forward": true, "src_ids": "2021.emnlp-main.168_4392"}
{"input": "end - to - end ee model is done by using Method| context: existing end - to - end ee research usually adopts the role - averaged evaluation that produces evaluation measures by averaging evaluation statistics of each event role . however , although this averaged metric can indicate the model performance to some extent , we find that such metric can be pretty misleading to downstream applications that utilize an event instance as a whole , where one wrongly identified event argument can substantially alter the whole meaning of an event instance .", "entity": "end - to - end ee model", "output": "training paradigm", "neg_sample": ["end - to - end ee model is done by using Method", "existing end - to - end ee research usually adopts the role - averaged evaluation that produces evaluation measures by averaging evaluation statistics of each event role .", "however , although this averaged metric can indicate the model performance to some extent , we find that such metric can be pretty misleading to downstream applications that utilize an event instance as a whole , where one wrongly identified event argument can substantially alter the whole meaning of an event instance ."], "relation": "used for", "id": "2021.findings-acl.405", "year": 2021, "rel_sent": "Moreover , to support diverse preferences of evaluation metrics motivated by different scenarios , we propose a new training paradigm based on reinforcement learning for a typical end - to - end EE model , i.e. , Doc2EDAG .", "forward": false, "src_ids": "2021.findings-acl.405_10919"}
{"input": "downstream tasks is done by using Method| context: pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks . an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models .", "entity": "downstream tasks", "output": "fine - tuning", "neg_sample": ["downstream tasks is done by using Method", "pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks .", "an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models ."], "relation": "used for", "id": "2021.emnlp-main.645", "year": 2021, "rel_sent": "In this paper , we hypothesize that salient information extracted a priori from the training data can complement the task - specific information learned by the model during fine - tuning on a downstream task .", "forward": false, "src_ids": "2021.emnlp-main.645_8887"}
{"input": "dialectal atlas is done by using Material| context: in the first decade of the 21th century , an atlas of udmurt dialects was prepared for publication . although hundreds of maps and legends were completed , due to no hope for publication , the project was never finished .", "entity": "dialectal atlas", "output": "collection of exercise books", "neg_sample": ["dialectal atlas is done by using Material", "in the first decade of the 21th century , an atlas of udmurt dialects was prepared for publication .", "although hundreds of maps and legends were completed , due to no hope for publication , the project was never finished ."], "relation": "used for", "id": "2021.iwclul-1.3", "year": 2021, "rel_sent": "The paper describes the material the atlas was based on , how the collection of exercise books was digitized and prepared for the purpose of a dialectal atlas , and how the atlas was generated from the data .", "forward": false, "src_ids": "2021.iwclul-1.3_8617"}
{"input": "relation words is used for Material| context: video captioning combines video understanding and language generation . different from image captioning that describes a static image with details of almost every object , video captioning usually considers a sequence of frames and biases towards focused objects , e.g. , the objects that stay in focus regardless of the changing background . therefore , detecting and properly accommodating focused objects is critical in video captioning .", "entity": "relation words", "output": "draft caption", "neg_sample": ["relation words is used for Material", "video captioning combines video understanding and language generation .", "different from image captioning that describes a static image with details of almost every object , video captioning usually considers a sequence of frames and biases towards focused objects , e.g.", ", the objects that stay in focus regardless of the changing background .", "therefore , detecting and properly accommodating focused objects is critical in video captioning ."], "relation": "used for", "id": "2021.findings-acl.24", "year": 2021, "rel_sent": "To enforce the description of focused objects and achieve controllable video captioning , we propose an Object - Oriented Non - Autoregressive approach ( O2NA ) , which performs caption generation in three steps : 1 ) identify the focused objects and predict their locations in the target caption ; 2 ) generate the related attribute words and relation words of these focused objects toform a draft caption ; and 3 ) combine video information to refine the draft caption to a fluent final caption .", "forward": true, "src_ids": "2021.findings-acl.24_15818"}
{"input": "multi - task attribute - value extraction is done by using Task| context: automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies . the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another . this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs .", "entity": "multi - task attribute - value extraction", "output": "learning cross - task attribute - attribute similarity", "neg_sample": ["multi - task attribute - value extraction is done by using Task", "automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies .", "the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another .", "this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs ."], "relation": "used for", "id": "2021.ecnlp-1.10", "year": 2021, "rel_sent": "Learning Cross - Task Attribute - Attribute Similarity for Multi - task Attribute - Value Extraction.", "forward": false, "src_ids": "2021.ecnlp-1.10_6466"}
{"input": "conversational search systems is done by using Method| context: voice assistants , e.g. , alexa or google assistant , have dramatically improved in recent years . supporting voice - based search , exploration , and refinement are fundamental tasks for voice assistants , and remain an open challenge . for example , when using voice to search an online shopping site , a user often needs to refine their search by some aspect or facet . this common user intent is usually available through a ' filter - by ' interface on online shopping websites , but is challenging to support naturally via voice , as the intent of refinements must be interpreted in the context of the original search , the initial results , and the available product catalogue facets . to our knowledge , no benchmark dataset exists for training or validating such contextual search understanding models .", "entity": "conversational search systems", "output": "voiser", "neg_sample": ["conversational search systems is done by using Method", "voice assistants , e.g.", ", alexa or google assistant , have dramatically improved in recent years .", "supporting voice - based search , exploration , and refinement are fundamental tasks for voice assistants , and remain an open challenge .", "for example , when using voice to search an online shopping site , a user often needs to refine their search by some aspect or facet .", "this common user intent is usually available through a ' filter - by ' interface on online shopping websites , but is challenging to support naturally via voice , as the intent of refinements must be interpreted in the context of the original search , the initial results , and the available product catalogue facets .", "to our knowledge , no benchmark dataset exists for training or validating such contextual search understanding models ."], "relation": "used for", "id": "2021.eacl-main.197", "year": 2021, "rel_sent": "As we show , VoiSeR can support research in conversational query understanding , contextual user intent prediction , and other conversational search topics tofacilitate the development of conversational search systems .", "forward": false, "src_ids": "2021.eacl-main.197_12640"}
{"input": "ensemble is done by using Method| context: neural network algorithms such as those based on transformers and attention models have excelled on automatic text classification ( atc ) tasks . however , such enhanced performance comes at high computational costs . ensembles of simpler classifiers ( i.e. , stacking ) that exploit algorithmic and representational complementarities have also been shown to produce top - notch performance in atc , enjoying high effectiveness and potentially lower computational costs .", "entity": "ensemble", "output": "low - cost oracle - based method", "neg_sample": ["ensemble is done by using Method", "neural network algorithms such as those based on transformers and attention models have excelled on automatic text classification ( atc ) tasks .", "however , such enhanced performance comes at high computational costs .", "ensembles of simpler classifiers ( i.e.", ", stacking ) that exploit algorithmic and representational complementarities have also been shown to produce top - notch performance in atc , enjoying high effectiveness and potentially lower computational costs ."], "relation": "used for", "id": "2021.findings-acl.350", "year": 2021, "rel_sent": "Besides answering such questions , another main contribution of this paper is the proposal of a low - cost oracle - based method that can predict the best ensemble in each scenario ( with and without computational cost limitations ) using only a fraction of the available training data .", "forward": false, "src_ids": "2021.findings-acl.350_2968"}
{"input": "dialog state tracking is done by using Task| context: in real - world settings with constantly changing services , dst systems must generalize to new domains and unseen slot types . existing methods for dst do not generalize well to new slot names and many require known ontologies of slot types and values for inference .", "entity": "dialog state tracking", "output": "zero - shot generalization", "neg_sample": ["dialog state tracking is done by using Task", "in real - world settings with constantly changing services , dst systems must generalize to new domains and unseen slot types .", "existing methods for dst do not generalize well to new slot names and many require known ontologies of slot types and values for inference ."], "relation": "used for", "id": "2021.eacl-main.91", "year": 2021, "rel_sent": "Zero - shot Generalization in Dialog State Tracking through Generative Question Answering.", "forward": false, "src_ids": "2021.eacl-main.91_2081"}
{"input": "uncertain local - to - global networks is used for Task| context: event factuality indicates the degree of certainty about whether an event occurs in the real world . existing studies mainly focus on identifying event factuality at sentence level , which easily leads to conflicts between different mentions of the same event .", "entity": "uncertain local - to - global networks", "output": "document - level event factuality identification", "neg_sample": ["uncertain local - to - global networks is used for Task", "event factuality indicates the degree of certainty about whether an event occurs in the real world .", "existing studies mainly focus on identifying event factuality at sentence level , which easily leads to conflicts between different mentions of the same event ."], "relation": "used for", "id": "2021.emnlp-main.207", "year": 2021, "rel_sent": "Uncertain Local - to - Global Networks for Document - Level Event Factuality Identification.", "forward": true, "src_ids": "2021.emnlp-main.207_9409"}
{"input": "interactive workshop is used for Material| context: although natural language processing is at the core of many tools young people use in their everyday life , high school curricula ( in italy ) do not include any computational linguistics education . this lack of exposure makes the use of such tools less responsible than it could be , and makes choosing computational linguistics as a university degree unlikely .", "entity": "interactive workshop", "output": "italian students", "neg_sample": ["interactive workshop is used for Material", "although natural language processing is at the core of many tools young people use in their everyday life , high school curricula ( in italy ) do not include any computational linguistics education .", "this lack of exposure makes the use of such tools less responsible than it could be , and makes choosing computational linguistics as a university degree unlikely ."], "relation": "used for", "id": "2021.teachingnlp-1.26", "year": 2021, "rel_sent": "Teaching NLP with Bracelets and Restaurant Menus : An Interactive Workshop for Italian Students.", "forward": true, "src_ids": "2021.teachingnlp-1.26_5576"}
{"input": "transductive learning approach is used for Task| context: unsupervised style transfer models are mainly based on an inductive learning approach , which represents the style as embeddings , decoder parameters , or discriminator parameters and directly applies these general rules to the test cases . however , the lacking of parallel corpus hinders the ability of these inductive learning methods on this task . as a result , it is likely to cause severe inconsistent style expressions , like ' the salad is rude ' .", "entity": "transductive learning approach", "output": "unsupervised text style transfer", "neg_sample": ["transductive learning approach is used for Task", "unsupervised style transfer models are mainly based on an inductive learning approach , which represents the style as embeddings , decoder parameters , or discriminator parameters and directly applies these general rules to the test cases .", "however , the lacking of parallel corpus hinders the ability of these inductive learning methods on this task .", "as a result , it is likely to cause severe inconsistent style expressions , like ' the salad is rude ' ."], "relation": "used for", "id": "2021.emnlp-main.195", "year": 2021, "rel_sent": "Transductive Learning for Unsupervised Text Style Transfer.", "forward": true, "src_ids": "2021.emnlp-main.195_9955"}
{"input": "representative tweet embeddings is done by using Method| context: detecting sarcasm has never been easy for machines to process .", "entity": "representative tweet embeddings", "output": "sentence - bert model", "neg_sample": ["representative tweet embeddings is done by using Method", "detecting sarcasm has never been easy for machines to process ."], "relation": "used for", "id": "2021.wanlp-1.40", "year": 2021, "rel_sent": "Then , we used the Sentence - BERT model trained with contrastive learning to extract representative tweet embeddings .", "forward": false, "src_ids": "2021.wanlp-1.40_13464"}
{"input": "structured fine - tuning is done by using Method| context: predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks . these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments .", "entity": "structured fine - tuning", "output": "pre - trained language models", "neg_sample": ["structured fine - tuning is done by using Method", "predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks .", "these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments ."], "relation": "used for", "id": "2021.emnlp-main.507", "year": 2021, "rel_sent": "We depart from a pointer - based transition system and propose a simplified transition set , designed to better exploit pre - trained language models for structured fine - tuning .", "forward": false, "src_ids": "2021.emnlp-main.507_12736"}
{"input": "ud2rrg transformation is used for Task| context: we describe ud2rrg , a rule - based approach for converting ud trees to role and reference grammar ( rrg ) structures .", "entity": "ud2rrg transformation", "output": "annotation of multilingual rrg treebanks", "neg_sample": ["ud2rrg transformation is used for Task", "we describe ud2rrg , a rule - based approach for converting ud trees to role and reference grammar ( rrg ) structures ."], "relation": "used for", "id": "2021.udw-1.3", "year": 2021, "rel_sent": "Our evaluation , based on English , German , French , Russian , and Farsi , shows that the ud2rrg transformation of UD - parsed data constitutes a highly useful starting point for multilingual RRG treebanking .", "forward": true, "src_ids": "2021.udw-1.3_12686"}
{"input": "vanilla attention is done by using OtherScientificTerm| context: following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length . while these variants are memory and compute efficient , it is not possible to directly use them with popular pre - trained language models trained using vanilla attention , without an expensive corrective pre - training stage .", "entity": "vanilla attention", "output": "drop - in replacement", "neg_sample": ["vanilla attention is done by using OtherScientificTerm", "following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length .", "while these variants are memory and compute efficient , it is not possible to directly use them with popular pre - trained language models trained using vanilla attention , without an expensive corrective pre - training stage ."], "relation": "used for", "id": "2021.sustainlp-1.5", "year": 2021, "rel_sent": "Our approach offers several advantages : ( a ) its memory usage is linear in the input size , similar to linear attention variants , such as Performer and RFA ( b ) it is a drop - in replacement for vanilla attention that does not require any corrective pre - training , and ( c ) it can also lead to significant memory savings in the feed - forward layers after casting them into the familiar query - key - value framework .", "forward": false, "src_ids": "2021.sustainlp-1.5_14478"}
{"input": "self - supervised method is used for Task| context: training qe models require massive parallel data with hand - crafted quality annotations , which are time - consuming and labor - intensive to obtain . to address the issue of the absence of annotated training data , previous studies attempt to develop unsupervised qe methods .", "entity": "self - supervised method", "output": "sentence- and word - level qe", "neg_sample": ["self - supervised method is used for Task", "training qe models require massive parallel data with hand - crafted quality annotations , which are time - consuming and labor - intensive to obtain .", "to address the issue of the absence of annotated training data , previous studies attempt to develop unsupervised qe methods ."], "relation": "used for", "id": "2021.emnlp-main.267", "year": 2021, "rel_sent": "To reduce the negative impact of noises , we propose a self - supervised method for both sentence- and word - level QE , which performs quality estimation by recovering the masked target words .", "forward": true, "src_ids": "2021.emnlp-main.267_14541"}
{"input": "neural models is used for Method| context: interactive machine reading comprehension ( imrc ) is machine comprehension tasks where knowledge sources are partially observable . an agent must interact with an environment sequentially to gather necessary knowledge in order to answer a question .", "entity": "neural models", "output": "graph representations", "neg_sample": ["neural models is used for Method", "interactive machine reading comprehension ( imrc ) is machine comprehension tasks where knowledge sources are partially observable .", "an agent must interact with an environment sequentially to gather necessary knowledge in order to answer a question ."], "relation": "used for", "id": "2021.emnlp-main.540", "year": 2021, "rel_sent": "We describe methods that dynamically build and update these graphs during information gathering , as well as neural models to encode graph representations in RL agents .", "forward": true, "src_ids": "2021.emnlp-main.540_9306"}
{"input": "objective function is done by using OtherScientificTerm| context: generating diverse texts is an important factor for unsupervised text generation . one approach is to produce the diversity of texts conditioned by the sampled latent code . although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained .", "entity": "objective function", "output": "reconstruction loss", "neg_sample": ["objective function is done by using OtherScientificTerm", "generating diverse texts is an important factor for unsupervised text generation .", "one approach is to produce the diversity of texts conditioned by the sampled latent code .", "although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained ."], "relation": "used for", "id": "2021.eacl-srw.23", "year": 2021, "rel_sent": "To ensure that the text is generated conditioned upon the sampled latent code , reconstruction loss is introduced in our objective function .", "forward": false, "src_ids": "2021.eacl-srw.23_11816"}
{"input": "toxic spans detection is done by using Method| context: the real - world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way . semeval-2021 , task 5 - toxic spans detection is based on a novel annotation of a subset of the jigsaw unintended bias dataset and is the first language toxicity detection task dedicated to identifying the toxicity - level spans .", "entity": "toxic spans detection", "output": "virtual adversarial training", "neg_sample": ["toxic spans detection is done by using Method", "the real - world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way .", "semeval-2021 , task 5 - toxic spans detection is based on a novel annotation of a subset of the jigsaw unintended bias dataset and is the first language toxicity detection task dedicated to identifying the toxicity - level spans ."], "relation": "used for", "id": "2021.semeval-1.26", "year": 2021, "rel_sent": "UPB at SemEval-2021 Task 5 : Virtual Adversarial Training for Toxic Spans Detection.", "forward": false, "src_ids": "2021.semeval-1.26_8048"}
{"input": "nlp - based prediction model is used for Task| context: health and medical researchers often give clinical and policy recommendations to inform health practice and public health policy . however , no current health information system supports the direct retrieval of health advice .", "entity": "nlp - based prediction model", "output": "identifying health advice", "neg_sample": ["nlp - based prediction model is used for Task", "health and medical researchers often give clinical and policy recommendations to inform health practice and public health policy .", "however , no current health information system supports the direct retrieval of health advice ."], "relation": "used for", "id": "2021.emnlp-main.486", "year": 2021, "rel_sent": "This study fills the gap by developing and validating an NLP - based prediction model for identifying health advice in research publications .", "forward": true, "src_ids": "2021.emnlp-main.486_15169"}
{"input": "data augmentation is used for Task| context: integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering . however , the predictions are made independently , which can be mutually inconsistent .", "entity": "data augmentation", "output": "open information extraction", "neg_sample": ["data augmentation is used for Task", "integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering .", "however , the predictions are made independently , which can be mutually inconsistent ."], "relation": "used for", "id": "2021.acl-long.363", "year": 2021, "rel_sent": "CoRI : Collective Relation Integration with Data Augmentation for Open Information Extraction.", "forward": true, "src_ids": "2021.acl-long.363_2567"}
{"input": "sentiments analysis is done by using Method| context: substantial amount of text data which is increasingly being generated and shared on the internet and social media every second affect the society positively or negatively almost in any aspect of online world and also business and industries . sentiments / opinions / reviews ' of users posted on social media are the valuable information that have motivated researchers to analyze them to get better insight and feedbacks about any product such as a video in instagram , a movie in netflix , or even new brand car introduced by bmw . sentiments are usually written using a combination of languages such as english which is resource rich and regional languages such as tamil , kannada , malayalam , etc . which are resource poor . however , due to technical constraints , many users prefer to pen their opinions in roman script . these kinds of texts written in two or more languages using a common language script or different language scripts are called code - mixing texts . code - mixed texts are increasing day - by - day with the increase in the number of users depending on various online platforms . analyzing such texts pose a real challenge for the researchers .", "entity": "sentiments analysis", "output": "learning approaches", "neg_sample": ["sentiments analysis is done by using Method", "substantial amount of text data which is increasingly being generated and shared on the internet and social media every second affect the society positively or negatively almost in any aspect of online world and also business and industries .", "sentiments / opinions / reviews ' of users posted on social media are the valuable information that have motivated researchers to analyze them to get better insight and feedbacks about any product such as a video in instagram , a movie in netflix , or even new brand car introduced by bmw .", "sentiments are usually written using a combination of languages such as english which is resource rich and regional languages such as tamil , kannada , malayalam , etc .", "which are resource poor .", "however , due to technical constraints , many users prefer to pen their opinions in roman script .", "these kinds of texts written in two or more languages using a common language script or different language scripts are called code - mixing texts .", "code - mixed texts are increasing day - by - day with the increase in the number of users depending on various online platforms .", "analyzing such texts pose a real challenge for the researchers ."], "relation": "used for", "id": "2021.dravidianlangtech-1.14", "year": 2021, "rel_sent": "LA - SACo : A Study of Learning Approaches for Sentiments Analysis inCode - Mixing Texts.", "forward": false, "src_ids": "2021.dravidianlangtech-1.14_8843"}
{"input": "paraphrase identification models is used for OtherScientificTerm| context: if two sentences have the same meaning , it should follow that they are equivalent in their inferential properties , i.e. , each sentence should textually entail the other . however , many paraphrase datasets currently in widespread use rely on a sense of paraphrase based on word overlap and syntax . can we teach them instead to identify paraphrases in a way that draws on the inferential properties of the sentences , and is not over - reliant on lexical and syntactic similarities of a sentence pair ?", "entity": "paraphrase identification models", "output": "sentence - level meaning equivalence", "neg_sample": ["paraphrase identification models is used for OtherScientificTerm", "if two sentences have the same meaning , it should follow that they are equivalent in their inferential properties , i.e.", ", each sentence should textually entail the other .", "however , many paraphrase datasets currently in widespread use rely on a sense of paraphrase based on word overlap and syntax .", "can we teach them instead to identify paraphrases in a way that draws on the inferential properties of the sentences , and is not over - reliant on lexical and syntactic similarities of a sentence pair ?"], "relation": "used for", "id": "2021.acl-long.552", "year": 2021, "rel_sent": "We discuss implications for paraphrase detection and release our dataset in the hope of making paraphrase detection models better able to detect sentence - level meaning equivalence .", "forward": true, "src_ids": "2021.acl-long.552_15311"}
{"input": "text - only encoding layer is used for Method| context: pretrained language models have shown success in many natural language processing tasks . many works explore to incorporate the knowledge into the language models . in the biomedical domain , experts have taken decades of effort on building large - scale knowledge bases . for example , umls contains millions of entities with their synonyms and defines hundreds of relations among entities . leveraging this knowledge can benefit a variety of downstream tasks such as named entity recognition and relation extraction .", "entity": "text - only encoding layer", "output": "entity representation", "neg_sample": ["text - only encoding layer is used for Method", "pretrained language models have shown success in many natural language processing tasks .", "many works explore to incorporate the knowledge into the language models .", "in the biomedical domain , experts have taken decades of effort on building large - scale knowledge bases .", "for example , umls contains millions of entities with their synonyms and defines hundreds of relations among entities .", "leveraging this knowledge can benefit a variety of downstream tasks such as named entity recognition and relation extraction ."], "relation": "used for", "id": "2021.bionlp-1.20", "year": 2021, "rel_sent": "We then train a knowledge - aware language model that firstly applies a text - only encoding layer to learn entity representation and then applies a text - entity fusion encoding to aggregate entity representation .", "forward": true, "src_ids": "2021.bionlp-1.20_10243"}
{"input": "discourse is done by using Method| context: existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks .", "entity": "discourse", "output": "bart", "neg_sample": ["discourse is done by using Method", "existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks ."], "relation": "used for", "id": "2021.naacl-main.301", "year": 2021, "rel_sent": "We experiment with 7 pretrained LMs , 4 languages , and 7 discourse probing tasks , and find BART to be overall the best model at capturing discourse - but only in its encoder , with BERT performing surprisingly well as the baseline model .", "forward": false, "src_ids": "2021.naacl-main.301_15798"}
{"input": "multimet is used for Task| context: metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging . as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor . however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor .", "entity": "multimet", "output": "automatic metaphor understanding", "neg_sample": ["multimet is used for Task", "metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging .", "as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor .", "however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor ."], "relation": "used for", "id": "2021.acl-long.249", "year": 2021, "rel_sent": "MultiMET opens the door to automatic metaphor understanding by investigating multimodal cues and their interplay .", "forward": true, "src_ids": "2021.acl-long.249_3522"}
{"input": "bilingual dictionaries is used for OtherScientificTerm| context: multilingual models have demonstrated impressive cross - lingual transfer performance . however , test sets like xnli are monolingual at the example level . in multilingual communities , it is common for polyglots to code - mix when conversing with each other . this paper will be published in the proceedings of naacl - hlt 2021 .", "entity": "bilingual dictionaries", "output": "perturbations", "neg_sample": ["bilingual dictionaries is used for OtherScientificTerm", "multilingual models have demonstrated impressive cross - lingual transfer performance .", "however , test sets like xnli are monolingual at the example level .", "in multilingual communities , it is common for polyglots to code - mix when conversing with each other .", "this paper will be published in the proceedings of naacl - hlt 2021 ."], "relation": "used for", "id": "2021.calcs-1.19", "year": 2021, "rel_sent": "The former ( PolyGloss ) uses bilingual dictionaries to propose perturbations and translations of the clean example for sense disambiguation .", "forward": true, "src_ids": "2021.calcs-1.19_4305"}
{"input": "top - down parsing is used for Method| context: in contrast , a standard top - down parser is not efficient since the looping problem occurs during both the left and right recursion of standard tag derivation .", "entity": "top - down parsing", "output": "tree adjoining grammar ( tag )", "neg_sample": ["top - down parsing is used for Method", "in contrast , a standard top - down parser is not efficient since the looping problem occurs during both the left and right recursion of standard tag derivation ."], "relation": "used for", "id": "2021.scil-1.38", "year": 2021, "rel_sent": "Efficiency of Top - Down Parsing of Recursive Adjunction for Tree Adjoining Grammar.", "forward": true, "src_ids": "2021.scil-1.38_9133"}
{"input": "coreference corpora is done by using Method| context: we present an empirical study that compares mention heads as annotated manually in four coreference datasets ( for dutch , english , polish , and russian ) on one hand , with heads induced from dependency trees parsed automatically , on the other hand .", "entity": "coreference corpora", "output": "dependency parsers", "neg_sample": ["coreference corpora is done by using Method", "we present an empirical study that compares mention heads as annotated manually in four coreference datasets ( for dutch , english , polish , and russian ) on one hand , with heads induced from dependency trees parsed automatically , on the other hand ."], "relation": "used for", "id": "2021.depling-1.10", "year": 2021, "rel_sent": "This can be achieved with sufficient accuracy using modern dependency parsers even for coreference corpora that lack manual head annotation .", "forward": false, "src_ids": "2021.depling-1.10_2549"}
{"input": "equative constructions is done by using Method| context: by presenting a case study on rigvedic equative and similative constructions , this paper demonstrates that treebanks constitute an important support for research in historical linguistics for two main reasons . first , by providing quantitative evidence on linguistic phenomena , they can confirm or dismiss hypotheses formulated on the base of qualitative data .", "entity": "equative constructions", "output": "ud scheme", "neg_sample": ["equative constructions is done by using Method", "by presenting a case study on rigvedic equative and similative constructions , this paper demonstrates that treebanks constitute an important support for research in historical linguistics for two main reasons .", "first , by providing quantitative evidence on linguistic phenomena , they can confirm or dismiss hypotheses formulated on the base of qualitative data ."], "relation": "used for", "id": "2021.tlt-1.2", "year": 2021, "rel_sent": "Since an analysis of Rigvedic equative constructions calls for a granular and informative annotation scheme , the Vedic Treebank implements the UD scheme for equative constructions with subrelations ; while some such extensions were specifically designed for a study on Rigvedic similes , others might be adopted by every treebank developer interested in representing equative strategies .", "forward": false, "src_ids": "2021.tlt-1.2_15262"}
{"input": "information retrieval problem is done by using Task| context: given the more widespread nature of natural language interfaces , it is increasingly important to understand who are accessing those interfaces , and how those interfaces are being used .", "entity": "information retrieval problem", "output": "human - computer interaction", "neg_sample": ["information retrieval problem is done by using Task", "given the more widespread nature of natural language interfaces , it is increasingly important to understand who are accessing those interfaces , and how those interfaces are being used ."], "relation": "used for", "id": "2021.hcinlp-1.2", "year": 2021, "rel_sent": "We then use spellcheckers as a case study to highlight the need for an interdisciplinary approach that brings together natural language processing , education , human - computer interaction to address a known information retrieval problem : query misspelling .", "forward": false, "src_ids": "2021.hcinlp-1.2_6278"}
{"input": "recurrent attention is used for Method| context: recent research questions the importance of the dot - product self - attention in transformer models and shows that most attention heads learn simple positional patterns .", "entity": "recurrent attention", "output": "decoder of transformer", "neg_sample": ["recurrent attention is used for Method", "recent research questions the importance of the dot - product self - attention in transformer models and shows that most attention heads learn simple positional patterns ."], "relation": "used for", "id": "2021.emnlp-main.258", "year": 2021, "rel_sent": "Particularly , when apply RAN to the decoder of Transformer , there brings consistent improvements by about +0.5 BLEU on 6 translation tasks and +1.0 BLEU on Turkish - English translation task .", "forward": true, "src_ids": "2021.emnlp-main.258_9881"}
{"input": "question generation is used for Method| context: amidst rising mental health needs in society , virtual agents are increasingly deployed in counselling . in order to give pertinent advice , counsellors must first gain an understanding of the issues at hand by eliciting sharing from the counsellee .", "entity": "question generation", "output": "counsellor chatbot", "neg_sample": ["question generation is used for Method", "amidst rising mental health needs in society , virtual agents are increasingly deployed in counselling .", "in order to give pertinent advice , counsellors must first gain an understanding of the issues at hand by eliciting sharing from the counsellee ."], "relation": "used for", "id": "2021.nlp4posimpact-1.1", "year": 2021, "rel_sent": "Restatement and Question Generation for Counsellor Chatbot.", "forward": true, "src_ids": "2021.nlp4posimpact-1.1_3445"}
{"input": "natural language processing is done by using Generic| context: natural language processing ( nlp ) research combines the study of universal principles , through basic science , with applied science targeting specific use cases and settings . however , the process of exchange between basic nlp and applications is often assumed to emerge naturally , resulting in many innovations going unapplied and many important questions left unstudied .", "entity": "natural language processing", "output": "general principles", "neg_sample": ["natural language processing is done by using Generic", "natural language processing ( nlp ) research combines the study of universal principles , through basic science , with applied science targeting specific use cases and settings .", "however , the process of exchange between basic nlp and applications is often assumed to emerge naturally , resulting in many innovations going unapplied and many important questions left unstudied ."], "relation": "used for", "id": "2021.naacl-main.325", "year": 2021, "rel_sent": "Translational NLP : A New Paradigm and General Principles for Natural Language Processing Research.", "forward": false, "src_ids": "2021.naacl-main.325_13582"}
{"input": "enhancing transformers is used for Task| context: transfer learning has become the dominant paradigm for many natural language processing tasks . in addition to models being pretrained on large datasets , they can be further trained on intermediate ( supervised ) tasks that are similar to the target task . for small natural language inference ( nli ) datasets , language modelling is typically followed by pretraining on a large ( labelled ) nli dataset before fine - tuning with each nli subtask .", "entity": "enhancing transformers", "output": "nli fine - tuning", "neg_sample": ["enhancing transformers is used for Task", "transfer learning has become the dominant paradigm for many natural language processing tasks .", "in addition to models being pretrained on large datasets , they can be further trained on intermediate ( supervised ) tasks that are similar to the target task .", "for small natural language inference ( nli ) datasets , language modelling is typically followed by pretraining on a large ( labelled ) nli dataset before fine - tuning with each nli subtask ."], "relation": "used for", "id": "2021.findings-acl.26", "year": 2021, "rel_sent": "Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine - Tuning.", "forward": true, "src_ids": "2021.findings-acl.26_7354"}
{"input": "aspect - based sentiment analysis is done by using OtherScientificTerm| context: it is popular that neural graph - based models are applied in existing aspect - based sentiment analysis ( absa ) studies for utilizing word relations through dependency parses tofacilitate the task with better semantic guidance for analyzing context and aspect words .", "entity": "aspect - based sentiment analysis", "output": "dependency types", "neg_sample": ["aspect - based sentiment analysis is done by using OtherScientificTerm", "it is popular that neural graph - based models are applied in existing aspect - based sentiment analysis ( absa ) studies for utilizing word relations through dependency parses tofacilitate the task with better semantic guidance for analyzing context and aspect words ."], "relation": "used for", "id": "2021.naacl-main.231", "year": 2021, "rel_sent": "To address such limitations , in this paper , we propose an approach to explicitly utilize dependency types for ABSA with type - aware graph convolutional networks ( T - GCN ) , where attention is used in T - GCN to distinguish different edges ( relations ) in the graph and attentive layer ensemble is proposed to comprehensively learn from different layers of T - GCN .", "forward": false, "src_ids": "2021.naacl-main.231_6294"}
{"input": "understanding metaphorical information is done by using Material| context: metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging . as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor . however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor .", "entity": "understanding metaphorical information", "output": "multimodal metaphor dataset", "neg_sample": ["understanding metaphorical information is done by using Material", "metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging .", "as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor .", "however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor ."], "relation": "used for", "id": "2021.acl-long.249", "year": 2021, "rel_sent": "In this paper , we introduce MultiMET , a novel multimodal metaphor dataset tofacilitate understanding metaphorical information from multimodal text and image .", "forward": false, "src_ids": "2021.acl-long.249_3519"}
{"input": "bert is used for Task| context: with pre - trained models , such as bert , gaining more and more attention , plenty of research has been done tofurther promote their capabilities , from enhancing the experimental procedures ( sun et al . , 2019 ) to improving the mathematical principles .", "entity": "bert", "output": "text classification", "neg_sample": ["bert is used for Task", "with pre - trained models , such as bert , gaining more and more attention , plenty of research has been done tofurther promote their capabilities , from enhancing the experimental procedures ( sun et al .", ", 2019 ) to improving the mathematical principles ."], "relation": "used for", "id": "2021.findings-acl.152", "year": 2021, "rel_sent": "In this paper , we propose a concise method for improving BERT 's performance in text classification by utilizing a label embedding technique while keeping almost the same computational cost .", "forward": true, "src_ids": "2021.findings-acl.152_3625"}
{"input": "needle is used for OtherScientificTerm| context: weak supervision has shown promising results in many natural language processing tasks , such as named entity recognition ( ner ) . existing work mainly focuses on learning deep ner models only with weak supervision , i.e. , without any human annotation , and shows that by merely using weakly labeled data , one can achieve good performance , though still underperforms fully supervised ner with manually / strongly labeled data . in this paper , we consider a more practical scenario , where we have both a small amount of strongly labeled data and a large amount of weakly labeled data . unfortunately , we observe that weakly labeled data does not necessarily improve , or even deteriorate the model performance ( due to the extensive noise in the weak labels ) when we train deep ner models over a simple or weighted combination of the strongly labeled and weakly labeled data .", "entity": "needle", "output": "noise of the weak labels", "neg_sample": ["needle is used for OtherScientificTerm", "weak supervision has shown promising results in many natural language processing tasks , such as named entity recognition ( ner ) .", "existing work mainly focuses on learning deep ner models only with weak supervision , i.e.", ", without any human annotation , and shows that by merely using weakly labeled data , one can achieve good performance , though still underperforms fully supervised ner with manually / strongly labeled data .", "in this paper , we consider a more practical scenario , where we have both a small amount of strongly labeled data and a large amount of weakly labeled data .", "unfortunately , we observe that weakly labeled data does not necessarily improve , or even deteriorate the model performance ( due to the extensive noise in the weak labels ) when we train deep ner models over a simple or weighted combination of the strongly labeled and weakly labeled data ."], "relation": "used for", "id": "2021.acl-long.140", "year": 2021, "rel_sent": "Through experiments on E - commerce query NER and Biomedical NER , we demonstrate that NEEDLE can effectively suppress the noise of the weak labels and outperforms existing methods .", "forward": true, "src_ids": "2021.acl-long.140_7754"}
{"input": "distillation method is used for Task| context: the advent of contextual word embeddings - representations of words which incorporate semantic and syntactic information from their context - has led to tremendous improvements on a wide variety of nlp tasks . however , recent contextual models have prohibitively high computational cost in many use - cases and are often hard to interpret .", "entity": "distillation method", "output": "nlp applications", "neg_sample": ["distillation method is used for Task", "the advent of contextual word embeddings - representations of words which incorporate semantic and syntactic information from their context - has led to tremendous improvements on a wide variety of nlp tasks .", "however , recent contextual models have prohibitively high computational cost in many use - cases and are often hard to interpret ."], "relation": "used for", "id": "2021.acl-long.408", "year": 2021, "rel_sent": "In this work , we demonstrate that our proposed distillation method , which is a simple extension of CBOW - based training , allows to significantly improve computational efficiency of NLP applications , while outperforming the quality of existing static embeddings trained from scratch as well as those distilled from previously proposed methods .", "forward": true, "src_ids": "2021.acl-long.408_15875"}
{"input": "text - based adversarial training algorithm is used for Method| context: the advent of large pre - trained language models has given rise to rapid progress in the field of natural language processing ( nlp ) .", "entity": "text - based adversarial training algorithm", "output": "knowledge distillation", "neg_sample": ["text - based adversarial training algorithm is used for Method", "the advent of large pre - trained language models has given rise to rapid progress in the field of natural language processing ( nlp ) ."], "relation": "used for", "id": "2021.acl-long.86", "year": 2021, "rel_sent": "We present MATE - KD , a novel text - based adversarial training algorithm which improves the performance of knowledge distillation .", "forward": true, "src_ids": "2021.acl-long.86_13457"}
{"input": "bi - lstm - crf is used for Task| context: recurrent neural networks ( rnn ) have been widely used in various natural language processing ( nlp ) tasks such as text classification , sequence tagging , and machine translation . long short term memory ( lstm ) , a special unit of rnn , has the benefit of memorizing past and even future information in a sentence ( especially for bidirectional lstm ) .", "entity": "bi - lstm - crf", "output": "toxic span detection", "neg_sample": ["bi - lstm - crf is used for Task", "recurrent neural networks ( rnn ) have been widely used in various natural language processing ( nlp ) tasks such as text classification , sequence tagging , and machine translation .", "long short term memory ( lstm ) , a special unit of rnn , has the benefit of memorizing past and even future information in a sentence ( especially for bidirectional lstm ) ."], "relation": "used for", "id": "2021.semeval-1.138", "year": 2021, "rel_sent": "LZ1904 at SemEval-2021 Task 5 : Bi - LSTM - CRF for Toxic Span Detection using Pretrained Word Embedding.", "forward": true, "src_ids": "2021.semeval-1.138_838"}
{"input": "profane lexicon is done by using Method| context: bengali is a low - resource language that lacks tools and resources for profane and obscene textual content detection . until now , no lexicon exists for detecting obscenity in bengali social media text .", "entity": "profane lexicon", "output": "semi - automatic methodology", "neg_sample": ["profane lexicon is done by using Method", "bengali is a low - resource language that lacks tools and resources for profane and obscene textual content detection .", "until now , no lexicon exists for detecting obscenity in bengali social media text ."], "relation": "used for", "id": "2021.ranlp-1.145", "year": 2021, "rel_sent": "A semi - automatic methodology is presented for developing the profane lexicon that leverages an obscene corpus , word embedding , and part - of - speech ( POS ) taggers .", "forward": false, "src_ids": "2021.ranlp-1.145_11398"}
{"input": "post - editing subtitles is done by using OtherScientificTerm| context: language technologies , such as machine translation ( mt ) , but also the application of artificial intelligence in general and an abundance of cat tools and platforms have an increasing influence on the translation market . moreover , it has implications for translator training . one of the tasks that emerged with language technologies is post - editing ( pe ) where a human translator corrects raw machine translated output according to given guidelines and quality criteria ( o'brien , 2011 : 197 - 198 ) . already widely used in several traditional translation settings , its use has come intofocus in more creative processes such as literary translation and audiovisual translation ( avt ) as well .", "entity": "post - editing subtitles", "output": "job profiles", "neg_sample": ["post - editing subtitles is done by using OtherScientificTerm", "language technologies , such as machine translation ( mt ) , but also the application of artificial intelligence in general and an abundance of cat tools and platforms have an increasing influence on the translation market .", "moreover , it has implications for translator training .", "one of the tasks that emerged with language technologies is post - editing ( pe ) where a human translator corrects raw machine translated output according to given guidelines and quality criteria ( o'brien , 2011 : 197 - 198 ) .", "already widely used in several traditional translation settings , its use has come intofocus in more creative processes such as literary translation and audiovisual translation ( avt ) as well ."], "relation": "used for", "id": "2021.mtsummit-asltrw.2", "year": 2021, "rel_sent": "In this paper , we want to describe the different potential job profiles and respective competences needed when post - editing subtitles .", "forward": false, "src_ids": "2021.mtsummit-asltrw.2_4780"}
{"input": "automatic method is used for Task| context: ' machine reading comprehension ( mrc ) is a typical natural language processing ( nlp)task and has developed rapidly in the last few years . various reading comprehension datasets have been built to support mrc studies . however large - scale and high - quality datasets are rare due to the high complexity and huge workforce cost of making sucha dataset .", "entity": "automatic method", "output": "mrcdataset generation", "neg_sample": ["automatic method is used for Task", "' machine reading comprehension ( mrc ) is a typical natural language processing ( nlp)task and has developed rapidly in the last few years .", "various reading comprehension datasets have been built to support mrc studies .", "however large - scale and high - quality datasets are rare due to the high complexity and huge workforce cost of making sucha dataset ."], "relation": "used for", "id": "2021.ccl-1.95", "year": 2021, "rel_sent": "Besides most reading comprehension datasets are in English and Chinesedatasets are insufficient . In this paper we propose an automatic method for MRCdataset generation and build the largest Chinese medical reading comprehension dataset presently named CMedRC .", "forward": true, "src_ids": "2021.ccl-1.95_12982"}
{"input": "placeholder translation systems is used for Task| context: the system is trained to output special placeholder tokens and the user - specified term is injected into the output through the context - free replacement of the placeholder token . however and this approach could result in ungrammatical sentences because it is often the case that the specified term needs to be inflected according to the context of the output and which is unknown before the translation .", "entity": "placeholder translation systems", "output": "placeholder translation", "neg_sample": ["placeholder translation systems is used for Task", "the system is trained to output special placeholder tokens and the user - specified term is injected into the output through the context - free replacement of the placeholder token .", "however and this approach could result in ungrammatical sentences because it is often the case that the specified term needs to be inflected according to the context of the output and which is unknown before the translation ."], "relation": "used for", "id": "2021.mtsummit-research.19", "year": 2021, "rel_sent": "To address this problem and we propose a novel method of placeholder translation that can inflect specified terms according to the grammatical construction of the output sentence .", "forward": true, "src_ids": "2021.mtsummit-research.19_8836"}
{"input": "few - shot domain is done by using OtherScientificTerm| context: in this paper , we investigate few - shot joint learning for dialogue language understanding . most existing few - shot models learn a single task each time with only a few examples . however , dialogue language understanding contains two closely related tasks , i.e. , intent detection and slot filling , and often benefits from jointly learning the two tasks . this calls for new few - shot learning techniques that are able to capture task relations from only a few examples and jointly learn multiple tasks .", "entity": "few - shot domain", "output": "bridged metric space", "neg_sample": ["few - shot domain is done by using OtherScientificTerm", "in this paper , we investigate few - shot joint learning for dialogue language understanding .", "most existing few - shot models learn a single task each time with only a few examples .", "however , dialogue language understanding contains two closely related tasks , i.e.", ", intent detection and slot filling , and often benefits from jointly learning the two tasks .", "this calls for new few - shot learning techniques that are able to capture task relations from only a few examples and jointly learn multiple tasks ."], "relation": "used for", "id": "2021.findings-acl.282", "year": 2021, "rel_sent": "To achieve this , we propose a similarity - based few - shot learning scheme , named Contrastive Prototype Merging network ( ConProm ) , that learns to bridge metric spaces of intent and slot on data - rich domains , and then adapt the bridged metric space to specific few - shot domain .", "forward": false, "src_ids": "2021.findings-acl.282_7853"}
{"input": "graph convolutional network ( gcn ) is used for Method| context: target - oriented opinion words extraction ( towe ) ( fan et al . , 2019b ) is a new subtask of target - oriented sentiment analysis that aims to extract opinion words for a given aspect in text . current state - of - the - art methods leverage position embeddings to capture the relative position of a word to the target .", "entity": "graph convolutional network ( gcn )", "output": "word representations", "neg_sample": ["graph convolutional network ( gcn ) is used for Method", "target - oriented opinion words extraction ( towe ) ( fan et al .", ", 2019b ) is a new subtask of target - oriented sentiment analysis that aims to extract opinion words for a given aspect in text .", "current state - of - the - art methods leverage position embeddings to capture the relative position of a word to the target ."], "relation": "used for", "id": "2021.emnlp-main.722", "year": 2021, "rel_sent": "We also adapt a graph convolutional network ( GCN ) to enhance word representations by incorporating syntactic information .", "forward": true, "src_ids": "2021.emnlp-main.722_13102"}
{"input": "considerate service is done by using OtherScientificTerm| context: e - commerce has grown substantially over the last several years , and chatbots for intelligent customer service are concurrently drawing attention .", "entity": "considerate service", "output": "emotional comfort", "neg_sample": ["considerate service is done by using OtherScientificTerm", "e - commerce has grown substantially over the last several years , and chatbots for intelligent customer service are concurrently drawing attention ."], "relation": "used for", "id": "2021.naacl-industry.17", "year": 2021, "rel_sent": "According to the survey of user studies and the real online testing , emotional comfort of customers ' negative emotions , which make up more than 5 % of whole number of customer visits on AliMe , is a key point for providing considerate service .", "forward": false, "src_ids": "2021.naacl-industry.17_8733"}
{"input": "english mispronunciation detection is done by using Method| context: there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice . due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr .", "entity": "english mispronunciation detection", "output": "e2e asr", "neg_sample": ["english mispronunciation detection is done by using Method", "there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice .", "due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr ."], "relation": "used for", "id": "2021.rocling-1.17", "year": 2021, "rel_sent": "Exploring the Integration of E2E ASR and Pronunciation Modeling for English Mispronunciation Detection.", "forward": false, "src_ids": "2021.rocling-1.17_2493"}
{"input": "mitigation strategy is used for Task| context: knowledge - dependent tasks typically use two sources of knowledge : parametric , learned at training time , and contextual , given as a passage at inference time .", "entity": "mitigation strategy", "output": "generalization", "neg_sample": ["mitigation strategy is used for Task", "knowledge - dependent tasks typically use two sources of knowledge : parametric , learned at training time , and contextual , given as a passage at inference time ."], "relation": "used for", "id": "2021.emnlp-main.565", "year": 2021, "rel_sent": "Our findings demonstrate the importance for practitioners to evaluate model tendency to hallucinate rather than read , and show that our mitigation strategy encourages generalization to evolving information ( i.e.", "forward": true, "src_ids": "2021.emnlp-main.565_6872"}
{"input": "ssl training is done by using OtherScientificTerm| context: this paper presents a production semi - supervised learning ( ssl ) pipeline based on the student - teacher framework , which leverages millions of unlabeled examples to improve natural language understanding ( nlu ) tasks .", "entity": "ssl training", "output": "huge unlabeled data pool", "neg_sample": ["ssl training is done by using OtherScientificTerm", "this paper presents a production semi - supervised learning ( ssl ) pipeline based on the student - teacher framework , which leverages millions of unlabeled examples to improve natural language understanding ( nlu ) tasks ."], "relation": "used for", "id": "2021.naacl-industry.39", "year": 2021, "rel_sent": "We investigate two questions related to the use of unlabeled data in production SSL context : 1 ) how to select samples from a huge unlabeled data pool that are beneficial for SSL training , and 2 ) how does the selected data affect the performance of different state - of - the - art SSL techniques .", "forward": false, "src_ids": "2021.naacl-industry.39_5044"}
{"input": "downstream tasks is done by using Method| context: large pre - trained language models achieve state - of - the - art results when fine - tuned on downstream nlp tasks . however , they almost exclusively focus on text - only representation , while neglecting cell - level layout information that is important for form image understanding .", "entity": "downstream tasks", "output": "structurallm", "neg_sample": ["downstream tasks is done by using Method", "large pre - trained language models achieve state - of - the - art results when fine - tuned on downstream nlp tasks .", "however , they almost exclusively focus on text - only representation , while neglecting cell - level layout information that is important for form image understanding ."], "relation": "used for", "id": "2021.acl-long.493", "year": 2021, "rel_sent": "The pre - trained StructuralLM achieves new state - of - the - art results in different types of downstream tasks , including form understanding ( from 78.95 to 85.14 ) , document visual question answering ( from 72.59 to 83.94 ) and document image classification ( from 94.43 to 96.08 ) .", "forward": false, "src_ids": "2021.acl-long.493_1222"}
{"input": "nct model is used for Material| context: despite the promising performance of sentence - level and context - aware neural machine translation models , there still remain limitations in current nct models because the inherent dialogue characteristics of chat , such as dialogue coherence and speaker personality , are neglected .", "entity": "nct model", "output": "coherent and speaker - relevant translations", "neg_sample": ["nct model is used for Material", "despite the promising performance of sentence - level and context - aware neural machine translation models , there still remain limitations in current nct models because the inherent dialogue characteristics of chat , such as dialogue coherence and speaker personality , are neglected ."], "relation": "used for", "id": "2021.emnlp-main.6", "year": 2021, "rel_sent": "By this means , the NCT model can be enhanced by capturing the inherent dialogue characteristics , thus generating more coherent and speaker - relevant translations .", "forward": true, "src_ids": "2021.emnlp-main.6_13146"}
{"input": "bert model is used for Task| context: contrastive learning has been used to learn a high - quality representation of the image in computer vision . however , contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data .", "entity": "bert model", "output": "relation extraction", "neg_sample": ["bert model is used for Task", "contrastive learning has been used to learn a high - quality representation of the image in computer vision .", "however , contrastive learning is not widely utilized in natural language processing due to the lack of a general method of data augmentation for text data ."], "relation": "used for", "id": "2021.bionlp-1.1", "year": 2021, "rel_sent": "In this work , we explore the method of employing contrastive learning to improve the text representation from the BERT model for relation extraction .", "forward": true, "src_ids": "2021.bionlp-1.1_1802"}
{"input": "distilbert is used for Method| context: humour detection is an interesting but difficult task in nlp . because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand .", "entity": "distilbert", "output": "vector representation", "neg_sample": ["distilbert is used for Method", "humour detection is an interesting but difficult task in nlp .", "because humorous might not be obvious in text , it can be embedded into context , hide behind the literal meaning and require prior knowledge to understand ."], "relation": "used for", "id": "2021.semeval-1.166", "year": 2021, "rel_sent": "Models like Logistic Regression , LSTM , MLP , CNN were used , and pre - trained models like DistilBert were introduced to generate accurate vector representation for textual data .", "forward": true, "src_ids": "2021.semeval-1.166_14858"}
{"input": "dataless classifiers is used for Task| context: dataless text classification is capable of classifying documents into previously unseen labels by assigning a score to any document paired with a label description . while promising , it crucially relies on accurate descriptions of the label set for each downstream task .", "entity": "dataless classifiers", "output": "dataless classification", "neg_sample": ["dataless classifiers is used for Task", "dataless text classification is capable of classifying documents into previously unseen labels by assigning a score to any document paired with a label description .", "while promising , it crucially relies on accurate descriptions of the label set for each downstream task ."], "relation": "used for", "id": "2021.findings-acl.365", "year": 2021, "rel_sent": "Experiments show that our approach consistently improves dataless classification across different datasets and makes the classifier more robust to the choice of label descriptions .", "forward": true, "src_ids": "2021.findings-acl.365_13620"}
{"input": "multimodal feature inputs is done by using OtherScientificTerm| context: emotion recognition in conversation has received considerable attention recently because of its practical industrial applications . existing methods tend to overlook the immediate mutual interaction between different speakers in the speaker - utterance level , or apply single speaker - agnostic rnn for utterances from different speakers .", "entity": "multimodal feature inputs", "output": "minor perturbations", "neg_sample": ["multimodal feature inputs is done by using OtherScientificTerm", "emotion recognition in conversation has received considerable attention recently because of its practical industrial applications .", "existing methods tend to overlook the immediate mutual interaction between different speakers in the speaker - utterance level , or apply single speaker - agnostic rnn for utterances from different speakers ."], "relation": "used for", "id": "2021.maiworkshop-1.3", "year": 2021, "rel_sent": "To improve the robustness and generalization during training , we generate adversarial examples by applying the minor perturbations on multimodal feature inputs , unveiling the benefits of adversarial examples for emotion detection .", "forward": false, "src_ids": "2021.maiworkshop-1.3_1099"}
{"input": "edits is used for OtherScientificTerm| context: we study semantic parsing in an interactive setting in which users correct errors with natural language feedback .", "entity": "edits", "output": "parse", "neg_sample": ["edits is used for OtherScientificTerm", "we study semantic parsing in an interactive setting in which users correct errors with natural language feedback ."], "relation": "used for", "id": "2021.naacl-main.444", "year": 2021, "rel_sent": "We present NL - EDIT , a model for interpreting natural language feedback in the interaction context to generate a sequence of edits that can be applied to the initial parse to correct its errors .", "forward": true, "src_ids": "2021.naacl-main.444_9032"}
{"input": "data - questeval metric is done by using Method| context: questeval is a reference - less metric used in text - to - text tasks , that compares the generated summaries directly to the source text , by automatically asking and answering questions . its adaptation to data - to - text tasks is not straightforward , as it requires multimodal question generation and answering systems on the considered tasks , which are seldom available .", "entity": "data - questeval metric", "output": "multimodal components", "neg_sample": ["data - questeval metric is done by using Method", "questeval is a reference - less metric used in text - to - text tasks , that compares the generated summaries directly to the source text , by automatically asking and answering questions .", "its adaptation to data - to - text tasks is not straightforward , as it requires multimodal question generation and answering systems on the considered tasks , which are seldom available ."], "relation": "used for", "id": "2021.emnlp-main.633", "year": 2021, "rel_sent": "To this purpose , we propose a method to build synthetic multimodal corpora enabling to train multimodal components for a data - QuestEval metric .", "forward": false, "src_ids": "2021.emnlp-main.633_15483"}
{"input": "news headlines is done by using Method| context: part of speech ( pos ) tagging is a familiar nlp task . state of the art taggers routinely achieve token - level accuracies of over 97 % on news body text , evidence that the problem is well understood . however , the register of english news headlines , ' headlinese ' , is very different from the register of long - form text , causing pos tagging models to underperform on headlines .", "entity": "news headlines", "output": "nlp models", "neg_sample": ["news headlines is done by using Method", "part of speech ( pos ) tagging is a familiar nlp task .", "state of the art taggers routinely achieve token - level accuracies of over 97 % on news body text , evidence that the problem is well understood .", "however , the register of english news headlines , ' headlinese ' , is very different from the register of long - form text , causing pos tagging models to underperform on headlines ."], "relation": "used for", "id": "2021.emnlp-main.521", "year": 2021, "rel_sent": "We make POSH , the POS - tagged Headline corpus , available to encourage research in improved NLP models for news headlines .", "forward": false, "src_ids": "2021.emnlp-main.521_15584"}
{"input": "responsible natural language processing is done by using Method| context: computation - intensive pretrained models have been taking the lead of many natural language processing benchmarks such as glue . however , energy efficiency in the process of model training and inference becomes a critical bottleneck .", "entity": "responsible natural language processing", "output": "multi - task energy efficiency benchmarking platform", "neg_sample": ["responsible natural language processing is done by using Method", "computation - intensive pretrained models have been taking the lead of many natural language processing benchmarks such as glue .", "however , energy efficiency in the process of model training and inference becomes a critical bottleneck ."], "relation": "used for", "id": "2021.eacl-demos.39", "year": 2021, "rel_sent": "We introduce HULK , a multi - task energy efficiency benchmarking platform for responsible natural language processing .", "forward": false, "src_ids": "2021.eacl-demos.39_13797"}
{"input": "transformer architectures is used for Task| context: pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks . in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard . however , a systematic comparison between these models has not yet been done .", "entity": "transformer architectures", "output": "adverse drug event detection", "neg_sample": ["transformer architectures is used for Task", "pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks .", "in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard .", "however , a systematic comparison between these models has not yet been done ."], "relation": "used for", "id": "2021.eacl-main.149", "year": 2021, "rel_sent": "BERT Prescriptions to Avoid Unwanted Headaches : A Comparison of Transformer Architectures for Adverse Drug Event Detection.", "forward": true, "src_ids": "2021.eacl-main.149_2928"}
{"input": "linguistic observation is used for Task| context: entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants . however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent . previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs .", "entity": "linguistic observation", "output": "machine - learnt language understanding tasks", "neg_sample": ["linguistic observation is used for Task", "entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants .", "however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent .", "previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs ."], "relation": "used for", "id": "2021.nlp4convai-1.22", "year": 2021, "rel_sent": "We demonstrate that the linguistic observation on pauses can be used to improve accuracy in machine - learnt language understanding tasks .", "forward": true, "src_ids": "2021.nlp4convai-1.22_719"}
{"input": "dialogue - contextualized passage encodings is done by using Method| context: identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation .", "entity": "dialogue - contextualized passage encodings", "output": "knowledge identification model", "neg_sample": ["dialogue - contextualized passage encodings is done by using Method", "identifying relevant knowledge to be used in conversational systems that are grounded in long documents is critical to effective response generation ."], "relation": "used for", "id": "2021.emnlp-main.140", "year": 2021, "rel_sent": "We introduce a knowledge identification model that leverages the document structure to provide dialogue - contextualized passage encodings and better locate knowledge relevant to the conversation .", "forward": false, "src_ids": "2021.emnlp-main.140_5514"}
{"input": "complex domain - specific reasoning is done by using Task| context: we explore whether state - of - the - art bert models encode sufficient domain knowledge to correctly perform domain - specific inference . although bert implementations such as biobert are better at domain - based reasoning than those trained on general - domain corpora , there is still a wide margin compared to human performance on these tasks .", "entity": "complex domain - specific reasoning", "output": "knowledge retrieval", "neg_sample": ["complex domain - specific reasoning is done by using Task", "we explore whether state - of - the - art bert models encode sufficient domain knowledge to correctly perform domain - specific inference .", "although bert implementations such as biobert are better at domain - based reasoning than those trained on general - domain corpora , there is still a wide margin compared to human performance on these tasks ."], "relation": "used for", "id": "2021.bionlp-1.5", "year": 2021, "rel_sent": "By examining the retrieval output , we show that the methods fail due to unreliable knowledge retrieval for complex domain - specific reasoning .", "forward": false, "src_ids": "2021.bionlp-1.5_12038"}
{"input": "pseudo - log - likelihood is done by using OtherScientificTerm| context: we introduce self - critic pretraining transformers ( script ) for representation learning of text .", "entity": "pseudo - log - likelihood", "output": "self - critic scores", "neg_sample": ["pseudo - log - likelihood is done by using OtherScientificTerm", "we introduce self - critic pretraining transformers ( script ) for representation learning of text ."], "relation": "used for", "id": "2021.naacl-main.409", "year": 2021, "rel_sent": "Also , the self - critic scores can be directly used as pseudo - log - likelihood for efficient scoring .", "forward": false, "src_ids": "2021.naacl-main.409_14291"}
{"input": "cerebra 's context - based representations is used for Task| context: how do people understand the meaning of the word ' small ' when used to describe a mosquito , a church , or a planet ? while humans have a remarkable ability toform meanings by combining existing concepts , modeling this process is challenging .", "entity": "cerebra 's context - based representations", "output": "nlp applications", "neg_sample": ["cerebra 's context - based representations is used for Task", "how do people understand the meaning of the word ' small ' when used to describe a mosquito , a church , or a planet ?", "while humans have a remarkable ability toform meanings by combining existing concepts , modeling this process is challenging ."], "relation": "used for", "id": "2021.semspace-1.1", "year": 2021, "rel_sent": "CEREBRA 's context - based representations can potentially be used to make NLP applications more human - like .", "forward": true, "src_ids": "2021.semspace-1.1_10202"}
{"input": "nlp task is used for OtherScientificTerm| context: humans create things for a reason . ancient people created spears for hunting , knives for cutting meat , pots for preparing food , etc . the prototypical function of a physical artifact is a kind of commonsense knowledge that we rely on to understand natural language . for example , if someone says ' she borrowed the book ' then you would assume that she intends to read the book , or if someone asks ' can i use your knife ? ' then you would assume that they need to cut something .", "entity": "nlp task", "output": "prototypical uses", "neg_sample": ["nlp task is used for OtherScientificTerm", "humans create things for a reason .", "ancient people created spears for hunting , knives for cutting meat , pots for preparing food , etc .", "the prototypical function of a physical artifact is a kind of commonsense knowledge that we rely on to understand natural language .", "for example , if someone says ' she borrowed the book ' then you would assume that she intends to read the book , or if someone asks ' can i use your knife ? '", "then you would assume that they need to cut something ."], "relation": "used for", "id": "2021.acl-long.540", "year": 2021, "rel_sent": "In this paper , we introduce a new NLP task of learning the prototypical uses for human - made physical objects .", "forward": true, "src_ids": "2021.acl-long.540_13429"}
{"input": "hierarchy is done by using Method| context: we consider the problem of multi - label classification , where the labels lie on a hierarchy . however , unlike most existing works in hierarchical multi - label classification , we do not assume that the label - hierarchy is known .", "entity": "hierarchy", "output": "hyperbolic embeddings", "neg_sample": ["hierarchy is done by using Method", "we consider the problem of multi - label classification , where the labels lie on a hierarchy .", "however , unlike most existing works in hierarchical multi - label classification , we do not assume that the label - hierarchy is known ."], "relation": "used for", "id": "2021.eacl-main.247", "year": 2021, "rel_sent": "We also present evaluation of the hyperbolic embeddings obtained by joint learning and show that they represent the hierarchy more accurately than the other alternatives .", "forward": false, "src_ids": "2021.eacl-main.247_3683"}
{"input": "multilingual pre - training and data augmentation is used for Task| context: binary sequence classification is a standard nlp task with known state - of - the - art methods .", "entity": "multilingual pre - training and data augmentation", "output": "classification of toxicity", "neg_sample": ["multilingual pre - training and data augmentation is used for Task", "binary sequence classification is a standard nlp task with known state - of - the - art methods ."], "relation": "used for", "id": "2021.germeval-1.4", "year": 2021, "rel_sent": "DFKI SLT at GermEval 2021 : Multilingual Pre - training and Data Augmentation for the Classification of Toxicity in Social Media Comments.", "forward": true, "src_ids": "2021.germeval-1.4_12946"}
{"input": "translation is done by using Task| context: existing approaches for machine translation ( mt ) mostly translate given text in the source language into the target language and without explicitly referring to information indispensable for producing proper translation . this includes not only information in other textual elements and modalities than texts in the same document and but also extra - document and non - linguistic information and such as norms and skopos . to design better translation production work - flows and we need to distinguish translation issues that could be resolved by the existing text - to - text approaches and those beyond them .", "entity": "translation", "output": "post - editing", "neg_sample": ["translation is done by using Task", "existing approaches for machine translation ( mt ) mostly translate given text in the source language into the target language and without explicitly referring to information indispensable for producing proper translation .", "this includes not only information in other textual elements and modalities than texts in the same document and but also extra - document and non - linguistic information and such as norms and skopos .", "to design better translation production work - flows and we need to distinguish translation issues that could be resolved by the existing text - to - text approaches and those beyond them ."], "relation": "used for", "id": "2021.mtsummit-research.18", "year": 2021, "rel_sent": "First and examples of translation issues and their revisions were collected by a two - stage post - editing ( PE ) method : performing minimal PE to obtain translation attainable based on the given textual information and further performing full PE to obtain truly acceptable translation referring to any information if necessary .", "forward": false, "src_ids": "2021.mtsummit-research.18_11910"}
{"input": "semantic alignment is done by using OtherScientificTerm| context: analysis of teacher evaluations is crucial to the development of robust educational programs , particularly through the validation of desirable qualities being reflected on in the text .", "entity": "semantic alignment", "output": "word embedding similarities", "neg_sample": ["semantic alignment is done by using OtherScientificTerm", "analysis of teacher evaluations is crucial to the development of robust educational programs , particularly through the validation of desirable qualities being reflected on in the text ."], "relation": "used for", "id": "2021.findings-acl.33", "year": 2021, "rel_sent": "Inspired by the use of word embedding similarities to capture semantic alignment , we utilize GloVe embeddings to determine to what extent these evaluations reflect concepts critical to measuring the competency of Teacher Fellows and upholding the organization 's Vision and Mission .", "forward": false, "src_ids": "2021.findings-acl.33_407"}
{"input": "ensemble of trained models is used for Task| context: multiple instance learning ( mil ) has become the standard learning paradigm for distantly supervised relation extraction ( dsre ) . however , due to relation extraction being performed at bag level , mil has significant hardware requirements for training when coupled with large sentence encoders such as deep transformer neural networks .", "entity": "ensemble of trained models", "output": "prediction", "neg_sample": ["ensemble of trained models is used for Task", "multiple instance learning ( mil ) has become the standard learning paradigm for distantly supervised relation extraction ( dsre ) .", "however , due to relation extraction being performed at bag level , mil has significant hardware requirements for training when coupled with large sentence encoders such as deep transformer neural networks ."], "relation": "used for", "id": "2021.icnlsp-1.8", "year": 2021, "rel_sent": "To alleviate the issues caused by random sampling , we use an ensemble of trained models for prediction .", "forward": true, "src_ids": "2021.icnlsp-1.8_7008"}
{"input": "category and comparison information of numerals is done by using Method| context: in recent years , math word problem solving has received considerable attention and achieved promising results , but previous methods rarely take numerical values into consideration . most methods treat the numerical values in the problems as number symbols , and ignore the prominent role of the numerical values in solving the problem .", "entity": "category and comparison information of numerals", "output": "numerical properties prediction mechanism", "neg_sample": ["category and comparison information of numerals is done by using Method", "in recent years , math word problem solving has received considerable attention and achieved promising results , but previous methods rarely take numerical values into consideration .", "most methods treat the numerical values in the problems as number symbols , and ignore the prominent role of the numerical values in solving the problem ."], "relation": "used for", "id": "2021.acl-long.455", "year": 2021, "rel_sent": "In addition , a numerical properties prediction mechanism is used to capture the category and comparison information of numerals and measure their importance in global expressions .", "forward": false, "src_ids": "2021.acl-long.455_1704"}
{"input": "collaborative learning of bidirectional decoders is used for Task| context: existing methods struggle to achieve both high style conversion rate and low content loss , exhibiting the over - transfer and under - transfer problems . we attribute these problems to the conflicting driving forces of the style conversion goal and content preservation goal .", "entity": "collaborative learning of bidirectional decoders", "output": "unsupervised text style transfer", "neg_sample": ["collaborative learning of bidirectional decoders is used for Task", "existing methods struggle to achieve both high style conversion rate and low content loss , exhibiting the over - transfer and under - transfer problems .", "we attribute these problems to the conflicting driving forces of the style conversion goal and content preservation goal ."], "relation": "used for", "id": "2021.emnlp-main.729", "year": 2021, "rel_sent": "Collaborative Learning of Bidirectional Decoders for Unsupervised Text Style Transfer.", "forward": true, "src_ids": "2021.emnlp-main.729_4131"}
{"input": "machine translation system is used for Material| context: code - mixed languages are very popular in multilingual societies around the world , yet the resources lag behind to enable robust systems on such languages . a major contributing factor is the informal nature of these languages which makes it difficult to collect code - mixed data .", "entity": "machine translation system", "output": "english", "neg_sample": ["machine translation system is used for Material", "code - mixed languages are very popular in multilingual societies around the world , yet the resources lag behind to enable robust systems on such languages .", "a major contributing factor is the informal nature of these languages which makes it difficult to collect code - mixed data ."], "relation": "used for", "id": "2021.calcs-1.7", "year": 2021, "rel_sent": "In this paper , we propose our system for Task 1 of CACLS 2021 to generate a machine translation system for English to Hinglish in a supervised setting .", "forward": true, "src_ids": "2021.calcs-1.7_15393"}
{"input": "human dialog is done by using Task| context: in the visual dialog task guesswhat ? ! two players maintain a dialog in order to identify a secret object in an image . this raises a question : what 's the risk of having an imperfect oracle model ? .", "entity": "human dialog", "output": "guessing task", "neg_sample": ["human dialog is done by using Task", "in the visual dialog task guesswhat ? !", "two players maintain a dialog in order to identify a secret object in an image .", "this raises a question : what 's the risk of having an imperfect oracle model ?", "."], "relation": "used for", "id": "2021.reinact-1.2", "year": 2021, "rel_sent": "We show that having access to better quality answers has a direct impact on the guessing task for human dialog and argue that better answers could help train better question generation models .", "forward": false, "src_ids": "2021.reinact-1.2_10993"}
{"input": "pretrained bert - base and mbart-50 models is used for Material| context: one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora . recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german . however , from the perspective of neural machine translation , this is still a tiny dataset .", "entity": "pretrained bert - base and mbart-50 models", "output": "sign language video", "neg_sample": ["pretrained bert - base and mbart-50 models is used for Material", "one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora .", "recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german .", "however , from the perspective of neural machine translation , this is still a tiny dataset ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.10", "year": 2021, "rel_sent": "We use pretrained BERT - base and mBART-50 models to initialize our sign language video to spoken language text translation model .", "forward": true, "src_ids": "2021.mtsummit-at4ssl.10_14429"}
{"input": "neural and non - neural methods is used for Task| context: however , such enhanced performance comes at high computational costs . ensembles of simpler classifiers ( i.e. , stacking ) that exploit algorithmic and representational complementarities have also been shown to produce top - notch performance in atc , enjoying high effectiveness and potentially lower computational costs .", "entity": "neural and non - neural methods", "output": "text classification", "neg_sample": ["neural and non - neural methods is used for Task", "however , such enhanced performance comes at high computational costs .", "ensembles of simpler classifiers ( i.e.", ", stacking ) that exploit algorithmic and representational complementarities have also been shown to produce top - notch performance in atc , enjoying high effectiveness and potentially lower computational costs ."], "relation": "used for", "id": "2021.findings-acl.350", "year": 2021, "rel_sent": "On the Cost - Effectiveness of Stacking of Neural and Non - Neural Methods for Text Classification : Scenarios and Performance Prediction.", "forward": true, "src_ids": "2021.findings-acl.350_2964"}
{"input": "baseline bert based rc models is done by using OtherScientificTerm| context: although bert based relation classification ( rc ) models have achieved significant improvements over the traditional deep learning models , it seems that no consensus can be reached on what is the optimal architecture , since there are many design choices available .", "entity": "baseline bert based rc models", "output": "search space", "neg_sample": ["baseline bert based rc models is done by using OtherScientificTerm", "although bert based relation classification ( rc ) models have achieved significant improvements over the traditional deep learning models , it seems that no consensus can be reached on what is the optimal architecture , since there are many design choices available ."], "relation": "used for", "id": "2021.acl-srw.4", "year": 2021, "rel_sent": "In this work , we design a comprehensive search space for BERT based RC models and employ a modified version of efficient neural architecture search ( ENAS ) method to automatically discover the design choices mentioned above .", "forward": false, "src_ids": "2021.acl-srw.4_7814"}
{"input": "learning algorithms is done by using Method| context: deep learning based natural language processing ( nlp ) has become the mainstream of research in recent years and significantly outperforms conventional methods . however , deep learning models are notorious for being data and computation hungry . these downsides limit the application of such models from deployment to different domains , languages , countries , or styles , since collecting in - genre data and model training from scratch are costly . the long - tail nature of human language makes challenges even more significant . there is a related tutorial in icml 2019 and a related course at stanford , but most of the example applications given in these materials are about image processing .", "entity": "learning algorithms", "output": "meta - learning approaches", "neg_sample": ["learning algorithms is done by using Method", "deep learning based natural language processing ( nlp ) has become the mainstream of research in recent years and significantly outperforms conventional methods .", "however , deep learning models are notorious for being data and computation hungry .", "these downsides limit the application of such models from deployment to different domains , languages , countries , or styles , since collecting in - genre data and model training from scratch are costly .", "the long - tail nature of human language makes challenges even more significant .", "there is a related tutorial in icml 2019 and a related course at stanford , but most of the example applications given in these materials are about image processing ."], "relation": "used for", "id": "2021.acl-tutorials.3", "year": 2021, "rel_sent": "Meta - learning , or ' Learning to Learn ' , aims to learn better learning algorithms , including better parameter initialization , optimization strategy , network architecture , distance metrics , and beyond .", "forward": false, "src_ids": "2021.acl-tutorials.3_11274"}
{"input": "fine - tuning neural machine translation models is done by using Metric| context: in most cases , the lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task .", "entity": "fine - tuning neural machine translation models", "output": "semantic similarity metrics", "neg_sample": ["fine - tuning neural machine translation models is done by using Metric", "in most cases , the lack of parallel corpora makes it impossible to directly train supervised models for the text style transfer task ."], "relation": "used for", "id": "2021.naacl-main.337", "year": 2021, "rel_sent": "In particular , we leverage semantic similarity metrics originally used for fine - tuning neural machine translation models to explicitly assess the preservation of content between system outputs and input texts .", "forward": false, "src_ids": "2021.naacl-main.337_12071"}
{"input": "consistency filters is used for Material| context: while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs .", "entity": "consistency filters", "output": "low - quality synthetic data", "neg_sample": ["consistency filters is used for Material", "while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs ."], "relation": "used for", "id": "2021.emnlp-main.566", "year": 2021, "rel_sent": "We further propose consistency filters to remove low - quality synthetic data before training .", "forward": true, "src_ids": "2021.emnlp-main.566_4460"}
{"input": "visual grounding is used for OtherScientificTerm| context: the general approach is to embed both textual and visual information into a common space -the grounded space- confined by an explicit relationship .", "entity": "visual grounding", "output": "concrete and abstract words", "neg_sample": ["visual grounding is used for OtherScientificTerm", "the general approach is to embed both textual and visual information into a common space -the grounded space- confined by an explicit relationship ."], "relation": "used for", "id": "2021.conll-1.12", "year": 2021, "rel_sent": "Intrinsic and extrinsic evaluations show that our way of visual grounding is highly beneficial for both abstract and concrete words .", "forward": true, "src_ids": "2021.conll-1.12_14460"}
{"input": "chatbot is done by using Task| context: a good open - domain chatbot should avoid presenting contradictory responses about facts or opinions in a conversational session , known as its consistency capacity . however , evaluating the consistency capacity of a chatbot is still challenging . employing human judges to interact with chatbots on purpose to check their capacities is costly and low - efficient , and difficult to get rid of subjective bias .", "entity": "chatbot", "output": "addressing inquiries about history", "neg_sample": ["chatbot is done by using Task", "a good open - domain chatbot should avoid presenting contradictory responses about facts or opinions in a conversational session , known as its consistency capacity .", "however , evaluating the consistency capacity of a chatbot is still challenging .", "employing human judges to interact with chatbots on purpose to check their capacities is costly and low - efficient , and difficult to get rid of subjective bias ."], "relation": "used for", "id": "2021.findings-acl.91", "year": 2021, "rel_sent": "At the conversation stage , AIH attempts to address appropriate inquiries about the dialogue history to induce the chatbot to redeclare the historical facts or opinions .", "forward": false, "src_ids": "2021.findings-acl.91_9903"}
{"input": "semi - automatic methodology is used for OtherScientificTerm| context: bengali is a low - resource language that lacks tools and resources for profane and obscene textual content detection . until now , no lexicon exists for detecting obscenity in bengali social media text .", "entity": "semi - automatic methodology", "output": "profane lexicon", "neg_sample": ["semi - automatic methodology is used for OtherScientificTerm", "bengali is a low - resource language that lacks tools and resources for profane and obscene textual content detection .", "until now , no lexicon exists for detecting obscenity in bengali social media text ."], "relation": "used for", "id": "2021.ranlp-1.145", "year": 2021, "rel_sent": "A semi - automatic methodology is presented for developing the profane lexicon that leverages an obscene corpus , word embedding , and part - of - speech ( POS ) taggers .", "forward": true, "src_ids": "2021.ranlp-1.145_11399"}
{"input": "bidirectional self - attention is used for OtherScientificTerm| context: social media such as twitter provide valuable information to crisis managers and affected people during natural disasters . machine learning can help structure and extract information from the large volume of messages shared during a crisis ; however , the constantly evolving nature of crises makes effective domain adaptation essential . supervised classification is limited by unchangeable class labels that may not be relevant to new events , and unsupervised topic modelling by insufficient prior knowledge .", "entity": "bidirectional self - attention", "output": "topic keywords", "neg_sample": ["bidirectional self - attention is used for OtherScientificTerm", "social media such as twitter provide valuable information to crisis managers and affected people during natural disasters .", "machine learning can help structure and extract information from the large volume of messages shared during a crisis ; however , the constantly evolving nature of crises makes effective domain adaptation essential .", "supervised classification is limited by unchangeable class labels that may not be relevant to new events , and unsupervised topic modelling by insufficient prior knowledge ."], "relation": "used for", "id": "2021.adaptnlp-1.5", "year": 2021, "rel_sent": "In this paper , we bridge the gap between the two and show that BERT embeddings finetuned on crisis - related tweet classification can effectively be used to adapt to a new crisis , discovering novel topics while preserving relevant classes from supervised training , and leveraging bidirectional self - attention to extract topic keywords .", "forward": true, "src_ids": "2021.adaptnlp-1.5_14635"}
{"input": "named entity recognition and classification is done by using Method| context: to be able to share the valuable information in electronic patient records ( epr ) they first need to be de - identified in order to protect the privacy of their subjects . named entity recognition and classification ( nerc ) is an important part of this process . in recent years , general - purpose language models pre - trained on large amounts of data , in particular bert , have achieved state of the art results in nerc , among other nlp tasks . sofar , however , no attempts have been made at applying bert for nerc on swedish epr data .", "entity": "named entity recognition and classification", "output": "bert - models", "neg_sample": ["named entity recognition and classification is done by using Method", "to be able to share the valuable information in electronic patient records ( epr ) they first need to be de - identified in order to protect the privacy of their subjects .", "named entity recognition and classification ( nerc ) is an important part of this process .", "in recent years , general - purpose language models pre - trained on large amounts of data , in particular bert , have achieved state of the art results in nerc , among other nlp tasks .", "sofar , however , no attempts have been made at applying bert for nerc on swedish epr data ."], "relation": "used for", "id": "2021.nodalida-main.23", "year": 2021, "rel_sent": "Applying and Sharing pre - trained BERT - models for Named Entity Recognition and Classification in Swedish Electronic Patient Records.", "forward": false, "src_ids": "2021.nodalida-main.23_2334"}
{"input": "diachronic trajectories is done by using OtherScientificTerm| context: we present a manually annotated lexical semantic change dataset for russian : rushifteval . its novelty is ensured by a single set of target words annotated for their diachronic semantic shifts across three time periods , while the previous work either used only two time periods , or different sets of target words .", "entity": "diachronic trajectories", "output": "ternary nature", "neg_sample": ["diachronic trajectories is done by using OtherScientificTerm", "we present a manually annotated lexical semantic change dataset for russian : rushifteval .", "its novelty is ensured by a single set of target words annotated for their diachronic semantic shifts across three time periods , while the previous work either used only two time periods , or different sets of target words ."], "relation": "used for", "id": "2021.lchange-1.2", "year": 2021, "rel_sent": "In addition , it is shown how the ternary nature of RuShiftEval allows to trace specific diachronic trajectories : ' changed at a particular time period and stable afterwards ' or ' was changing throughout all time periods ' .", "forward": false, "src_ids": "2021.lchange-1.2_13443"}
{"input": "local approximation is done by using Method| context: model - agnostic meta - learning ( maml ) has been recently put forth as a strategy to learn resource - poor languages in a sample - efficient fashion . nevertheless , the properties of these languages are often not well represented by those available during training . hence , we argue that the i.i.d . assumption ingrained in maml makes it ill - suited for cross - lingual nlp .", "entity": "local approximation", "output": "adaptive optimiser solving", "neg_sample": ["local approximation is done by using Method", "model - agnostic meta - learning ( maml ) has been recently put forth as a strategy to learn resource - poor languages in a sample - efficient fashion .", "nevertheless , the properties of these languages are often not well represented by those available during training .", "hence , we argue that the i.i.d .", "assumption ingrained in maml makes it ill - suited for cross - lingual nlp ."], "relation": "used for", "id": "2021.findings-acl.106", "year": 2021, "rel_sent": "In light of this , we propose a new adaptive optimiser solving for a local approximation to their Nash equilibrium .", "forward": false, "src_ids": "2021.findings-acl.106_1816"}
{"input": "machine - driven approaches is used for Task| context: it has long been recognized that suffixing is more common than prefixing in the languages of the world . more detailed statistics on this tendency are needed to sharpen proposed explanations for this tendency . the classic approach to gathering data on the prefix / suffix preference is for a human to read grammatical descriptions ( 948 languages ) , which is time - consuming and involves discretization judgments .", "entity": "machine - driven approaches", "output": "prefix and suffix statistics", "neg_sample": ["machine - driven approaches is used for Task", "it has long been recognized that suffixing is more common than prefixing in the languages of the world .", "more detailed statistics on this tendency are needed to sharpen proposed explanations for this tendency .", "the classic approach to gathering data on the prefix / suffix preference is for a human to read grammatical descriptions ( 948 languages ) , which is time - consuming and involves discretization judgments ."], "relation": "used for", "id": "2021.sigtyp-1.8", "year": 2021, "rel_sent": "In this paper we explore two machine - driven approaches for prefix and suffix statistics which are crude approximations , but have advantages in terms of time and replicability .", "forward": true, "src_ids": "2021.sigtyp-1.8_11640"}
{"input": "adversarial self - supervised learning is used for Task| context: detecting out - of - domain ( ood ) intents is crucial for the deployed task - oriented dialogue system . previous unsupervised ood detection methods only extract discriminative features of different in - domain intents while supervised counterparts can directly distinguish ood and in - domain intents but require extensive labeled ood data .", "entity": "adversarial self - supervised learning", "output": "out - of - domain detection", "neg_sample": ["adversarial self - supervised learning is used for Task", "detecting out - of - domain ( ood ) intents is crucial for the deployed task - oriented dialogue system .", "previous unsupervised ood detection methods only extract discriminative features of different in - domain intents while supervised counterparts can directly distinguish ood and in - domain intents but require extensive labeled ood data ."], "relation": "used for", "id": "2021.naacl-main.447", "year": 2021, "rel_sent": "Adversarial Self - Supervised Learning for Out - of - Domain Detection.", "forward": true, "src_ids": "2021.naacl-main.447_442"}
{"input": "remote patient care is done by using Method| context: in recent years , remote digital healthcare using online chats has gained momentum , especially in the global south . though prior work has studied interaction patterns in online ( health ) forums , such as talklife , reddit and facebook , there has been limited work in understanding interactions in small , close - knit community of instant messengers .", "entity": "remote patient care", "output": "nlp solutions", "neg_sample": ["remote patient care is done by using Method", "in recent years , remote digital healthcare using online chats has gained momentum , especially in the global south .", "though prior work has studied interaction patterns in online ( health ) forums , such as talklife , reddit and facebook , there has been limited work in understanding interactions in small , close - knit community of instant messengers ."], "relation": "used for", "id": "2021.law-1.7", "year": 2021, "rel_sent": "The primary aim of the framework is to understand interpersonal relationships among peer supporters in order to help develop NLP solutions for remote patient care and reduce burden of overworked healthcare providers .", "forward": false, "src_ids": "2021.law-1.7_15703"}
{"input": "gaussian distribution is used for OtherScientificTerm| context: event detection , a fundamental task of information extraction , tends to struggle when it needs to recognize novel event types with a few samples , i.e. few - shot event detection ( fsed ) . previous identify - then - classify paradigm attempts to solve this problem in the pipeline manner but ignores the trigger discrepancy between event types , thus suffering from the error propagation .", "entity": "gaussian distribution", "output": "transition scores", "neg_sample": ["gaussian distribution is used for OtherScientificTerm", "event detection , a fundamental task of information extraction , tends to struggle when it needs to recognize novel event types with a few samples , i.e.", "few - shot event detection ( fsed ) .", "previous identify - then - classify paradigm attempts to solve this problem in the pipeline manner but ignores the trigger discrepancy between event types , thus suffering from the error propagation ."], "relation": "used for", "id": "2021.findings-acl.3", "year": 2021, "rel_sent": "Then Gaussian distribution is introduced for the modeling of the transition scores in PA - CRF to alleviate the uncertain estimation resulting from insufficient data .", "forward": true, "src_ids": "2021.findings-acl.3_13840"}
{"input": "semantics is done by using OtherScientificTerm| context: lexical inference in context ( liic ) is the task of recognizing textual entailment between two very similar sentences , i.e. , sentences that only differ in one expression . it can therefore be seen as a variant of the natural language inference task that is focused on lexical semantics .", "entity": "semantics", "output": "handcrafted patterns", "neg_sample": ["semantics is done by using OtherScientificTerm", "lexical inference in context ( liic ) is the task of recognizing textual entailment between two very similar sentences , i.e.", ", sentences that only differ in one expression .", "it can therefore be seen as a variant of the natural language inference task that is focused on lexical semantics ."], "relation": "used for", "id": "2021.eacl-main.108", "year": 2021, "rel_sent": "We formulate and evaluate the first approaches based on pretrained language models ( LMs ) for this task : ( i ) a few - shot NLI classifier , ( ii ) a relation induction approach based on handcrafted patterns expressing the semantics of lexical inference , and ( iii ) a variant of ( ii ) with patterns that were automatically extracted from a corpus .", "forward": false, "src_ids": "2021.eacl-main.108_13206"}
{"input": "fake news detection is done by using Method| context: fake news with textual and visual contents has a better story - telling ability than text - only contents , and can be spread quickly with social media . people can be easily deceived by such fake news , and traditional expert identification is labor - intensive . therefore , automatic detection of multimodal fake news has become a new hot - spot issue . a shortcoming of existing approaches is their inability tofuse multimodality features effectively . they simply concatenate unimodal features without considering inter - modality relations .", "entity": "fake news detection", "output": "multimodal co - attention networks ( mcan )", "neg_sample": ["fake news detection is done by using Method", "fake news with textual and visual contents has a better story - telling ability than text - only contents , and can be spread quickly with social media .", "people can be easily deceived by such fake news , and traditional expert identification is labor - intensive .", "therefore , automatic detection of multimodal fake news has become a new hot - spot issue .", "a shortcoming of existing approaches is their inability tofuse multimodality features effectively .", "they simply concatenate unimodal features without considering inter - modality relations ."], "relation": "used for", "id": "2021.findings-acl.226", "year": 2021, "rel_sent": "Inspired by the way people read news with image and text , we propose a novel Multimodal Co - Attention Networks ( MCAN ) to better fuse textual and visual features for fake news detection .", "forward": false, "src_ids": "2021.findings-acl.226_2604"}
{"input": "lightweight models is done by using OtherScientificTerm| context: in natural language processing ( nlp ) , state - of - the - art ( sota ) semi - supervised learning ( ssl ) frameworks have shown great performance on deep pre - trained language models such as bert , and are expected to significantly reduce the demand for manual labeling . however , our empirical studies indicate that these frameworks are not suitable for lightweight models such as textcnn , lstm and etc .", "entity": "lightweight models", "output": "generalized regular constraint", "neg_sample": ["lightweight models is done by using OtherScientificTerm", "in natural language processing ( nlp ) , state - of - the - art ( sota ) semi - supervised learning ( ssl ) frameworks have shown great performance on deep pre - trained language models such as bert , and are expected to significantly reduce the demand for manual labeling .", "however , our empirical studies indicate that these frameworks are not suitable for lightweight models such as textcnn , lstm and etc ."], "relation": "used for", "id": "2021.emnlp-main.192", "year": 2021, "rel_sent": "FLiText introduces an inspirer network together with the consistency regularization framework , which leverages a generalized regular constraint on the lightweight models for efficient SSL .", "forward": false, "src_ids": "2021.emnlp-main.192_6715"}
{"input": "backpropagation is done by using Method| context: contrastive learning has been applied successfully to learn vector representations of text . previous research demonstrated that learning high - quality representations benefits from batch - wise contrastive loss with a large number of negatives . in practice , the technique of in - batch negative is used , where for each example in a batch , other batch examples ' positives will be taken as its negatives , avoiding encoding extra negatives . this , however , still conditions each example 's loss on all batch examples and requires fitting the entire large batch into gpu memory .", "entity": "backpropagation", "output": "gradient caching technique", "neg_sample": ["backpropagation is done by using Method", "contrastive learning has been applied successfully to learn vector representations of text .", "previous research demonstrated that learning high - quality representations benefits from batch - wise contrastive loss with a large number of negatives .", "in practice , the technique of in - batch negative is used , where for each example in a batch , other batch examples ' positives will be taken as its negatives , avoiding encoding extra negatives .", "this , however , still conditions each example 's loss on all batch examples and requires fitting the entire large batch into gpu memory ."], "relation": "used for", "id": "2021.repl4nlp-1.31", "year": 2021, "rel_sent": "This paper introduces a gradient caching technique that decouples backpropagation between contrastive loss and the encoder , removing encoder backward pass data dependency along the batch dimension .", "forward": false, "src_ids": "2021.repl4nlp-1.31_7841"}
{"input": "multimodal text style transfer is used for Task| context: one of the most challenging topics in natural language processing ( nlp ) is visually - grounded language understanding and reasoning . with the lack of human - annotated instructions that illustrate the intricate urban scenes , outdoor vln remains a challenging task to solve .", "entity": "multimodal text style transfer", "output": "outdoor vision - and - language navigation", "neg_sample": ["multimodal text style transfer is used for Task", "one of the most challenging topics in natural language processing ( nlp ) is visually - grounded language understanding and reasoning .", "with the lack of human - annotated instructions that illustrate the intricate urban scenes , outdoor vln remains a challenging task to solve ."], "relation": "used for", "id": "2021.eacl-main.103", "year": 2021, "rel_sent": "Multimodal Text Style Transfer for Outdoor Vision - and - Language Navigation.", "forward": true, "src_ids": "2021.eacl-main.103_10982"}
{"input": "next - word prediction is done by using Method| context: after a neural sequence model encounters an unexpected token , can its behavior be predicted ? we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts .", "entity": "next - word prediction", "output": "idealized models of generalization", "neg_sample": ["next - word prediction is done by using Method", "after a neural sequence model encounters an unexpected token , can its behavior be predicted ?", "we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts ."], "relation": "used for", "id": "2021.emnlp-main.448", "year": 2021, "rel_sent": "We begin by introducing two idealized models of generalization in next - word prediction : a lexical context model in which generalization is consistent with the last word observed , and a syntactic context model in which generalization is consistent with the global structure of the input .", "forward": false, "src_ids": "2021.emnlp-main.448_16089"}
{"input": "graph fusion operation is used for OtherScientificTerm| context: human communication is multimodal in nature ; it is through multiple modalities such as language , voice , and facial expressions , that opinions and emotions are expressed . data in this domain exhibits complex multi - relational and temporal interactions . learning from this data is a fundamentally challenging research problem .", "entity": "graph fusion operation", "output": "modal - temporal graph", "neg_sample": ["graph fusion operation is used for OtherScientificTerm", "human communication is multimodal in nature ; it is through multiple modalities such as language , voice , and facial expressions , that opinions and emotions are expressed .", "data in this domain exhibits complex multi - relational and temporal interactions .", "learning from this data is a fundamentally challenging research problem ."], "relation": "used for", "id": "2021.naacl-main.79", "year": 2021, "rel_sent": "Then , a novel graph fusion operation , called MTAG fusion , along with a dynamic pruning and read - out technique , is designed to efficiently process this modal - temporal graph and capture various interactions .", "forward": true, "src_ids": "2021.naacl-main.79_12616"}
{"input": "act and intent classification is done by using OtherScientificTerm| context: the user inten - tion and background .", "entity": "act and intent classification", "output": "context", "neg_sample": ["act and intent classification is done by using OtherScientificTerm", "the user inten - tion and background ."], "relation": "used for", "id": "2021.findings-acl.124", "year": 2021, "rel_sent": "Exploring the Role of Context in Utterance - level Emotion , Act and Intent Classification in Conversations : An Empirical Study.", "forward": false, "src_ids": "2021.findings-acl.124_15711"}
{"input": "parabart is used for Method| context: pre - trained language models have achieved huge success on a wide range of nlp tasks . however , contextual representations from pre - trained models contain entangled semantic and syntactic information , and therefore can not be directly used to derive useful semantic sentence embeddings for some tasks . paraphrase pairs offer an effective way of learning the distinction between semantics and syntax , as they naturally share semantics and often vary in syntax .", "entity": "parabart", "output": "disentangled semantic and syntactic representations", "neg_sample": ["parabart is used for Method", "pre - trained language models have achieved huge success on a wide range of nlp tasks .", "however , contextual representations from pre - trained models contain entangled semantic and syntactic information , and therefore can not be directly used to derive useful semantic sentence embeddings for some tasks .", "paraphrase pairs offer an effective way of learning the distinction between semantics and syntax , as they naturally share semantics and often vary in syntax ."], "relation": "used for", "id": "2021.naacl-main.108", "year": 2021, "rel_sent": "In this way , ParaBART learns disentangled semantic and syntactic representations from their respective inputs with separate encoders .", "forward": true, "src_ids": "2021.naacl-main.108_10970"}
{"input": "task - oriented dialogue systems is done by using Method| context: dialogue policy optimisation via reinforcement learning requires a large number of training interactions , which makes learning with real users time consuming and expensive . many set - ups therefore rely on a user simulator instead of humans . these user simulators have their own problems . while hand - coded , rule - based user simulators have been shown to be sufficient in small , simple domains , for complex domains the number of rules quickly becomes intractable . state - of - the - art data - driven user simulators , on the other hand , are still domain - dependent . this means that adaptation to each new domain requires redesigning and retraining .", "entity": "task - oriented dialogue systems", "output": "domain - independent user simulation", "neg_sample": ["task - oriented dialogue systems is done by using Method", "dialogue policy optimisation via reinforcement learning requires a large number of training interactions , which makes learning with real users time consuming and expensive .", "many set - ups therefore rely on a user simulator instead of humans .", "these user simulators have their own problems .", "while hand - coded , rule - based user simulators have been shown to be sufficient in small , simple domains , for complex domains the number of rules quickly becomes intractable .", "state - of - the - art data - driven user simulators , on the other hand , are still domain - dependent .", "this means that adaptation to each new domain requires redesigning and retraining ."], "relation": "used for", "id": "2021.sigdial-1.47", "year": 2021, "rel_sent": "Domain - independent User Simulation with Transformers for Task - oriented Dialogue Systems.", "forward": false, "src_ids": "2021.sigdial-1.47_13070"}
{"input": "multi - layer and sum - pooling attention is used for OtherScientificTerm| context: multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels .", "entity": "multi - layer and sum - pooling attention", "output": "informative features", "neg_sample": ["multi - layer and sum - pooling attention is used for OtherScientificTerm", "multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels ."], "relation": "used for", "id": "2021.emnlp-main.481", "year": 2021, "rel_sent": "Our innovations are three - fold : ( 1 ) we utilize a deep convolution - based encoder with the squeeze - and - excitation networks and residual networks to aggregate the information across the document and learn meaningful document representations that cover different ranges of texts ; ( 2 ) we explore multi - layer and sum - pooling attention to extract the most informative features from these multi - scale representations ; ( 3 ) we combine binary cross entropy loss and focal loss to improve performance for rare labels .", "forward": true, "src_ids": "2021.emnlp-main.481_902"}
{"input": "uyghur metaphor detection tasks is done by using Method| context: ' metaphor detection plays an important role in tasks such as machine translation and human - machine dialogue . as more users express their opinions on products or other topics on socialmedia through metaphorical expressions this task is particularly especially topical . most of the research in this field focuses on english and there are few studies on minority languages thatlack language resources and tools . moreover metaphorical expressions have different meaningsin different language environments .", "entity": "uyghur metaphor detection tasks", "output": "deep neural network ( dnn)framework", "neg_sample": ["uyghur metaphor detection tasks is done by using Method", "' metaphor detection plays an important role in tasks such as machine translation and human - machine dialogue .", "as more users express their opinions on products or other topics on socialmedia through metaphorical expressions this task is particularly especially topical .", "most of the research in this field focuses on english and there are few studies on minority languages thatlack language resources and tools .", "moreover metaphorical expressions have different meaningsin different language environments ."], "relation": "used for", "id": "2021.ccl-1.80", "year": 2021, "rel_sent": "We therefore established a deep neural network ( DNN)framework for Uyghur metaphor detection tasks .", "forward": false, "src_ids": "2021.ccl-1.80_5236"}
{"input": "masked language models is done by using Method| context: masked language models ( mlms ) have contributed to drastic performance improvements with regard to zero anaphora resolution ( zar ) . tofurther improve this approach , in this study , we made two proposals .", "entity": "masked language models", "output": "pretraining task", "neg_sample": ["masked language models is done by using Method", "masked language models ( mlms ) have contributed to drastic performance improvements with regard to zero anaphora resolution ( zar ) .", "tofurther improve this approach , in this study , we made two proposals ."], "relation": "used for", "id": "2021.emnlp-main.308", "year": 2021, "rel_sent": "The first is a new pretraining task that trains MLMs on anaphoric relations with explicit supervision , and the second proposal is a new finetuning method that remedies a notorious issue , the pretrain - finetune discrepancy .", "forward": false, "src_ids": "2021.emnlp-main.308_11481"}
{"input": "hahackathon 2021 is done by using Method| context: in writing , humor is mainly based on figurative language in which words and expressions change their conventional meaning to refer to something without saying it directly . this flip in the meaning of the words prevents natural language processing from revealing the real intention of a communication and , therefore , reduces the effectiveness of tasks such as sentiment analysis or emotion detection .", "entity": "hahackathon 2021", "output": "umuteam", "neg_sample": ["hahackathon 2021 is done by using Method", "in writing , humor is mainly based on figurative language in which words and expressions change their conventional meaning to refer to something without saying it directly .", "this flip in the meaning of the words prevents natural language processing from revealing the real intention of a communication and , therefore , reduces the effectiveness of tasks such as sentiment analysis or emotion detection ."], "relation": "used for", "id": "2021.semeval-1.152", "year": 2021, "rel_sent": "In this manuscript we describe the participation of the UMUTeam in HaHackathon 2021 , whose objective is to detect and rate humorous and controversial content .", "forward": false, "src_ids": "2021.semeval-1.152_4007"}
{"input": "unsupervised knowledge - grounded conversation is done by using Method| context: large - scale conversation models are turning to leveraging external knowledge to improve the factual accuracy in response generation . considering the infeasibility to annotate the external knowledge for large - scale dialogue corpora , it is desirable to learn the knowledge selection and response generation in an unsupervised manner .", "entity": "unsupervised knowledge - grounded conversation", "output": "plato - kag", "neg_sample": ["unsupervised knowledge - grounded conversation is done by using Method", "large - scale conversation models are turning to leveraging external knowledge to improve the factual accuracy in response generation .", "considering the infeasibility to annotate the external knowledge for large - scale dialogue corpora , it is desirable to learn the knowledge selection and response generation in an unsupervised manner ."], "relation": "used for", "id": "2021.nlp4convai-1.14", "year": 2021, "rel_sent": "PLATO - KAG : Unsupervised Knowledge - Grounded Conversation via Joint Modeling.", "forward": false, "src_ids": "2021.nlp4convai-1.14_906"}
{"input": "limited data is used for Method| context: public trust in science depends on honest and factual communication of scientific papers . however , recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings . while there are an abundance of scientific papers and popular media articles written about them , very rarely do the articles include a direct link to the original paper , making data collection challenging , and necessitating the need for few - shot learning .", "entity": "limited data", "output": "mt - pet", "neg_sample": ["limited data is used for Method", "public trust in science depends on honest and factual communication of scientific papers .", "however , recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings .", "while there are an abundance of scientific papers and popular media articles written about them , very rarely do the articles include a direct link to the original paper , making data collection challenging , and necessitating the need for few - shot learning ."], "relation": "used for", "id": "2021.emnlp-main.845", "year": 2021, "rel_sent": "Using limited data from this and previous studies on exaggeration detection in science , we introduce MT - PET , a multi - task version of Pattern Exploiting Training ( PET ) , which leverages knowledge from complementary cloze - style QA tasks to improve few - shot learning .", "forward": true, "src_ids": "2021.emnlp-main.845_7164"}
{"input": "nmt models is done by using Method| context: recent research questions the importance of the dot - product self - attention in transformer models and shows that most attention heads learn simple positional patterns .", "entity": "nmt models", "output": "recurrent attention", "neg_sample": ["nmt models is done by using Method", "recent research questions the importance of the dot - product self - attention in transformer models and shows that most attention heads learn simple positional patterns ."], "relation": "used for", "id": "2021.emnlp-main.258", "year": 2021, "rel_sent": "Our RAN is a promising alternative to build more effective and efficient NMT models .", "forward": false, "src_ids": "2021.emnlp-main.258_9878"}
{"input": "synonymous expressions is done by using OtherScientificTerm| context: ' this paper tackles a new task for event entity recognition ( eer ) . different from named entity recognizing ( ner ) task it only identifies the named entities which are related to a specific event type . currently there is no specific model to directly deal with the eer task . however these technical alternatives heavily rely on the efficiency of the event trigger detection which have to require the tedious yet expensive human la - beling of the event triggers especially for languages where triggers contain multiple tokens andhave numerous synonymous expressions ( such as chinese ) .", "entity": "synonymous expressions", "output": "task learning framework", "neg_sample": ["synonymous expressions is done by using OtherScientificTerm", "' this paper tackles a new task for event entity recognition ( eer ) .", "different from named entity recognizing ( ner ) task it only identifies the named entities which are related to a specific event type .", "currently there is no specific model to directly deal with the eer task .", "however these technical alternatives heavily rely on the efficiency of the event trigger detection which have to require the tedious yet expensive human la - beling of the event triggers especially for languages where triggers contain multiple tokens andhave numerous synonymous expressions ( such as chinese ) ."], "relation": "used for", "id": "2021.ccl-1.100", "year": 2021, "rel_sent": "Besides TAM can accurately identifythe synonymous expressions that are not included in the trigger dictionary .", "forward": false, "src_ids": "2021.ccl-1.100_1977"}
{"input": "incremental consolidation of new knowledge is done by using Method| context: the ability of humans to symbolically represent social events and situations is crucial for various interactions in everyday life . several studies in cognitive psychology have established the role of mental state attributions in effectively representing variable aspects of these social events . in the past , nlp research on learning event representations often focuses on construing syntactic and semantic information from language . however , they fail to consider the importance of pragmatic aspects and the need to consistently update new social situational information without forgetting the accumulated experiences .", "entity": "incremental consolidation of new knowledge", "output": "continual learning strategies", "neg_sample": ["incremental consolidation of new knowledge is done by using Method", "the ability of humans to symbolically represent social events and situations is crucial for various interactions in everyday life .", "several studies in cognitive psychology have established the role of mental state attributions in effectively representing variable aspects of these social events .", "in the past , nlp research on learning event representations often focuses on construing syntactic and semantic information from language .", "however , they fail to consider the importance of pragmatic aspects and the need to consistently update new social situational information without forgetting the accumulated experiences ."], "relation": "used for", "id": "2021.eacl-main.317", "year": 2021, "rel_sent": "Next , we introduce continual learning strategies that allow for incremental consolidation of new knowledge while retaining and promoting efficient usage of prior knowledge .", "forward": false, "src_ids": "2021.eacl-main.317_404"}
{"input": "generalized features is done by using Method| context: speech act classification determining the communicative intent of an utterance has been investigated widely over the years as a standalone task . this holds true for discussion in any fora including social media platform such as twitter . but the emotional state of the tweeter which has a considerable effect on the communication has not received the attention it deserves . closely related to emotion is sentiment , and understanding of one helps understand the other .", "entity": "generalized features", "output": "dyadic attention mechanism", "neg_sample": ["generalized features is done by using Method", "speech act classification determining the communicative intent of an utterance has been investigated widely over the years as a standalone task .", "this holds true for discussion in any fora including social media platform such as twitter .", "but the emotional state of the tweeter which has a considerable effect on the communication has not received the attention it deserves .", "closely related to emotion is sentiment , and understanding of one helps understand the other ."], "relation": "used for", "id": "2021.naacl-main.456", "year": 2021, "rel_sent": "DAM incorporates intra - modal and inter - modal attention tofuse multiple modalities and learns generalized features across all the tasks .", "forward": false, "src_ids": "2021.naacl-main.456_4212"}
{"input": "text style transfer is done by using Task| context: a common approach is to map a given sentence to content representation that is free of style , and the content representation is fed to a decoder with a target style . previous methods in filtering style completely remove tokens with style at the token level , which incurs the loss of content information .", "entity": "text style transfer", "output": "enhancing content preservation", "neg_sample": ["text style transfer is done by using Task", "a common approach is to map a given sentence to content representation that is free of style , and the content representation is fed to a decoder with a target style .", "previous methods in filtering style completely remove tokens with style at the token level , which incurs the loss of content information ."], "relation": "used for", "id": "2021.acl-long.8", "year": 2021, "rel_sent": "Enhancing Content Preservation in Text Style Transfer Using Reverse Attention and Conditional Layer Normalization.", "forward": false, "src_ids": "2021.acl-long.8_1865"}
{"input": "distant labels is done by using OtherScientificTerm| context: scientific literature analysis needs fine - grained named entity recognition ( ner ) to provide a wide range of information for scientific discovery . for example , chemistry research needs to study dozens to hundreds of distinct , fine - grained entity types , making consistent and accurate annotation difficult even for crowds of domain experts . on the other hand , domain - specific ontologies and knowledge bases ( kbs ) can be easily accessed , constructed , or integrated , which makes distant supervision realistic for fine - grained chemistry ner . in distant supervision , training labels are generated by matching mentions in a document with the concepts in the knowledge bases ( kbs ) . however , this kind of kb - matching suffers from two major challenges : incomplete annotation and noisy annotation .", "entity": "distant labels", "output": "chemistry type ontology structure", "neg_sample": ["distant labels is done by using OtherScientificTerm", "scientific literature analysis needs fine - grained named entity recognition ( ner ) to provide a wide range of information for scientific discovery .", "for example , chemistry research needs to study dozens to hundreds of distinct , fine - grained entity types , making consistent and accurate annotation difficult even for crowds of domain experts .", "on the other hand , domain - specific ontologies and knowledge bases ( kbs ) can be easily accessed , constructed , or integrated , which makes distant supervision realistic for fine - grained chemistry ner .", "in distant supervision , training labels are generated by matching mentions in a document with the concepts in the knowledge bases ( kbs ) .", "however , this kind of kb - matching suffers from two major challenges : incomplete annotation and noisy annotation ."], "relation": "used for", "id": "2021.emnlp-main.424", "year": 2021, "rel_sent": "It leverages the chemistry type ontology structure to generate distant labels with novel methods of flexible KB - matching and ontology - guided multi - type disambiguation .", "forward": false, "src_ids": "2021.emnlp-main.424_5688"}
{"input": "discriminative classifier is done by using OtherScientificTerm| context: out - of - scope intent detection is of practical importance in task - oriented dialogue systems . since the distribution of outlier utterances is arbitrary and unknown in the training stage , existing methods commonly rely on strong assumptions on data distribution such as mixture of gaussians to make inference , resulting in either complex multi - step training procedures or hand - crafted rules such as confidence threshold selection for outlier detection .", "entity": "discriminative classifier", "output": "pseudo outliers", "neg_sample": ["discriminative classifier is done by using OtherScientificTerm", "out - of - scope intent detection is of practical importance in task - oriented dialogue systems .", "since the distribution of outlier utterances is arbitrary and unknown in the training stage , existing methods commonly rely on strong assumptions on data distribution such as mixture of gaussians to make inference , resulting in either complex multi - step training procedures or hand - crafted rules such as confidence threshold selection for outlier detection ."], "relation": "used for", "id": "2021.acl-long.273", "year": 2021, "rel_sent": "The pseudo outliers are used to train a discriminative classifier that can be directly applied to and generalize well on the test task .", "forward": false, "src_ids": "2021.acl-long.273_13587"}
{"input": "keyword extraction is done by using Method| context: recent , well - known graph - based approaches typically employ the knowledge from word vector representations during the ranking process via popular centrality measures ( e.g. , pagerank ) without giving the primary role to vectors ' distribution .", "entity": "keyword extraction", "output": "graph - based ranking approaches", "neg_sample": ["keyword extraction is done by using Method", "recent , well - known graph - based approaches typically employ the knowledge from word vector representations during the ranking process via popular centrality measures ( e.g.", ", pagerank ) without giving the primary role to vectors ' distribution ."], "relation": "used for", "id": "11.textgraphs-1.9", "year": 2021, "rel_sent": "This work revisits the information given by the graph - of - words and its typical utilization through graph - based ranking approaches in the context of keyword extraction .", "forward": false, "src_ids": "11.textgraphs-1.9_10647"}
{"input": "query language is used for Task| context: currently , text chatting is one of the primary means of communication . however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it . since the task is novel , neither training nor gold - standard datasets for it have been created yet .", "entity": "query language", "output": "semi - automatic situation extraction", "neg_sample": ["query language is used for Task", "currently , text chatting is one of the primary means of communication .", "however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it .", "since the task is novel , neither training nor gold - standard datasets for it have been created yet ."], "relation": "used for", "id": "2021.acl-srw.14", "year": 2021, "rel_sent": "We also introduce a custom query language for semi - automatic situation extraction .", "forward": true, "src_ids": "2021.acl-srw.14_375"}
{"input": "children 's semantic representation of verbal inflection is done by using Method| context: child language acquisition is famously accurate despite the sparsity of linguistic input .", "entity": "children 's semantic representation of verbal inflection", "output": "unimorph annotations", "neg_sample": ["children 's semantic representation of verbal inflection is done by using Method", "child language acquisition is famously accurate despite the sparsity of linguistic input ."], "relation": "used for", "id": "2021.scil-1.17", "year": 2021, "rel_sent": "Using UniMorph annotations as an approximation of children 's semantic representation of verbal inflection , we use the Tolerance Principle to explicitly identify the formal processes of segmentation and mutation that productively encode the semantic relations ( e.g. , past tense ) between stems and inflected forms .", "forward": false, "src_ids": "2021.scil-1.17_12353"}
{"input": "noise is used for Task| context: after a neural sequence model encounters an unexpected token , can its behavior be predicted ? we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts .", "entity": "noise", "output": "syntactic generalization", "neg_sample": ["noise is used for Task", "after a neural sequence model encounters an unexpected token , can its behavior be predicted ?", "we show that rnn and transformer language models exhibit structured , consistent generalization in out - of - distribution contexts ."], "relation": "used for", "id": "2021.emnlp-main.448", "year": 2021, "rel_sent": "We then show that , in some languages , noise mediates the twoforms of generalization : noise applied to input tokens encourages syntactic generalization , while noise in history representations encourages lexical generalization .", "forward": true, "src_ids": "2021.emnlp-main.448_16097"}
{"input": "unsupervised topic modelling is used for Task| context: social media such as twitter provide valuable information to crisis managers and affected people during natural disasters . machine learning can help structure and extract information from the large volume of messages shared during a crisis ; however , the constantly evolving nature of crises makes effective domain adaptation essential . supervised classification is limited by unchangeable class labels that may not be relevant to new events , and unsupervised topic modelling by insufficient prior knowledge .", "entity": "unsupervised topic modelling", "output": "social - media assisted crisis management", "neg_sample": ["unsupervised topic modelling is used for Task", "social media such as twitter provide valuable information to crisis managers and affected people during natural disasters .", "machine learning can help structure and extract information from the large volume of messages shared during a crisis ; however , the constantly evolving nature of crises makes effective domain adaptation essential .", "supervised classification is limited by unchangeable class labels that may not be relevant to new events , and unsupervised topic modelling by insufficient prior knowledge ."], "relation": "used for", "id": "2021.adaptnlp-1.5", "year": 2021, "rel_sent": "Bridging the gap between supervised classification and unsupervised topic modelling for social - media assisted crisis management.", "forward": true, "src_ids": "2021.adaptnlp-1.5_14632"}
{"input": "language models is done by using Task| context: language models ( lms ) must be both safe and equitable to be responsibly deployed in practice . 2020 ; krause et al . 2020 ) have been proposed to mitigate toxic lm generations .", "entity": "language models", "output": "detoxification", "neg_sample": ["language models is done by using Task", "language models ( lms ) must be both safe and equitable to be responsibly deployed in practice .", "2020 ; krause et al .", "2020 ) have been proposed to mitigate toxic lm generations ."], "relation": "used for", "id": "2021.naacl-main.190", "year": 2021, "rel_sent": "We find that detoxification makes LMs more brittle to distribution shift , especially on language used by marginalized groups .", "forward": false, "src_ids": "2021.naacl-main.190_7212"}
{"input": "cfd dataset is used for Method| context: counterfactual statements describe events that did not or can not take place . we consider the problem of counterfactual detection ( cfd ) in product reviews .", "entity": "cfd dataset", "output": "cfd models", "neg_sample": ["cfd dataset is used for Method", "counterfactual statements describe events that did not or can not take place .", "we consider the problem of counterfactual detection ( cfd ) in product reviews ."], "relation": "used for", "id": "2021.emnlp-main.568", "year": 2021, "rel_sent": "Moreover , our CFD dataset is compatible with prior datasets and can be merged to learn accurate CFD models .", "forward": true, "src_ids": "2021.emnlp-main.568_10088"}
{"input": "pos tagging is done by using Method| context: combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information . it has been shown that even models using embeddings from transformers still benefit from the inclusion of standard word embeddings . however , the combination of embeddings of different types and dimensions is challenging .", "entity": "pos tagging", "output": "feature - based adversarial meta - embeddings", "neg_sample": ["pos tagging is done by using Method", "combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information .", "it has been shown that even models using embeddings from transformers still benefit from the inclusion of standard word embeddings .", "however , the combination of embeddings of different types and dimensions is challenging ."], "relation": "used for", "id": "2021.emnlp-main.660", "year": 2021, "rel_sent": "FAME sets the new state of the art for POS tagging in 27 languages , various NER settings and question classification in different domains .", "forward": false, "src_ids": "2021.emnlp-main.660_15395"}
{"input": "knowledge base is done by using Method| context: in this paper , we present an automatic knowledge base construction system from large scale enterprise documents with minimal efforts of human intervention . in the design and deployment of such a knowledge mining system for enterprise , we faced several challenges including data distributional shift , performance evaluation , compliance requirements and other practical issues .", "entity": "knowledge base", "output": "machine learning techniques", "neg_sample": ["knowledge base is done by using Method", "in this paper , we present an automatic knowledge base construction system from large scale enterprise documents with minimal efforts of human intervention .", "in the design and deployment of such a knowledge mining system for enterprise , we faced several challenges including data distributional shift , performance evaluation , compliance requirements and other practical issues ."], "relation": "used for", "id": "2021.emnlp-demo.2", "year": 2021, "rel_sent": "We leveraged state - of - the - art deep learning models to extract information ( named entities and definitions ) at per document level , then further applied classical machine learning techniques to process global statistical information to improve the knowledge base .", "forward": false, "src_ids": "2021.emnlp-demo.2_12975"}
{"input": "event detection is done by using Method| context: event detection ( ed ) aims to identify event trigger words from a given text and classify it into an event type . most current methods to ed rely heavily on training instances , and almost ignore the correlation of event types . hence , they tend to suffer from data scarcity and fail to handle new unseen event types .", "entity": "event detection", "output": "ontoed", "neg_sample": ["event detection is done by using Method", "event detection ( ed ) aims to identify event trigger words from a given text and classify it into an event type .", "most current methods to ed rely heavily on training instances , and almost ignore the correlation of event types .", "hence , they tend to suffer from data scarcity and fail to handle new unseen event types ."], "relation": "used for", "id": "2021.acl-long.220", "year": 2021, "rel_sent": "Experiments indicate that OntoED is more predominant and robust than previous approaches to ED , especially in data - scarce scenarios .", "forward": false, "src_ids": "2021.acl-long.220_11783"}
{"input": "dual encoder model is done by using OtherScientificTerm| context: multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but alsofor catering to a broad range of downstream cross - lingual nlp tasks . state - of - the - art multilingual sentence embedding models require large parallel corpora to learn efficiently , which confines the scope of these models .", "entity": "dual encoder model", "output": "multi - task loss function", "neg_sample": ["dual encoder model is done by using OtherScientificTerm", "multilingual sentence embeddings capture rich semantic information not only for measuring similarity between texts but alsofor catering to a broad range of downstream cross - lingual nlp tasks .", "state - of - the - art multilingual sentence embedding models require large parallel corpora to learn efficiently , which confines the scope of these models ."], "relation": "used for", "id": "2021.emnlp-main.716", "year": 2021, "rel_sent": "We capture semantic similarity and relatedness between sentences using a multi - task loss function for training a dual encoder model mapping different languages onto the same vector space .", "forward": false, "src_ids": "2021.emnlp-main.716_2523"}
{"input": "neural machine translation is done by using Method| context: recently , token - level adaptive training has achieved promising improvement in machine translation , where the cross - entropy loss function is adjusted by assigning different training weights to different tokens , in order to alleviate the token imbalance problem . however , previous approaches only use static word frequency information in the target language without considering the source language , which is insufficient for bilingual tasks like machine translation .", "entity": "neural machine translation", "output": "bilingual mutual information based adaptive training", "neg_sample": ["neural machine translation is done by using Method", "recently , token - level adaptive training has achieved promising improvement in machine translation , where the cross - entropy loss function is adjusted by assigning different training weights to different tokens , in order to alleviate the token imbalance problem .", "however , previous approaches only use static word frequency information in the target language without considering the source language , which is insufficient for bilingual tasks like machine translation ."], "relation": "used for", "id": "2021.acl-short.65", "year": 2021, "rel_sent": "Bilingual Mutual Information Based Adaptive Training for Neural Machine Translation.", "forward": false, "src_ids": "2021.acl-short.65_12799"}
{"input": "monolingual data is done by using Method| context: self - training has proven effective for improving nmt performance by augmenting model training with synthetic parallel data . the common practice is to construct synthetic data based on a randomly sampled subset of large - scale monolingual data , which we empirically show is sub - optimal .", "entity": "monolingual data", "output": "uncertainty - based sampling strategy", "neg_sample": ["monolingual data is done by using Method", "self - training has proven effective for improving nmt performance by augmenting model training with synthetic parallel data .", "the common practice is to construct synthetic data based on a randomly sampled subset of large - scale monolingual data , which we empirically show is sub - optimal ."], "relation": "used for", "id": "2021.acl-long.221", "year": 2021, "rel_sent": "Accordingly , we design an uncertainty - based sampling strategy to efficiently exploit the monolingual data for self - training , in which monolingual sentences with higher uncertainty would be sampled with higher probability .", "forward": false, "src_ids": "2021.acl-long.221_10635"}
{"input": "neural machine translation ( nmt ) system is done by using Material| context: we consider a low - resource translation task from finnish into northern sami . collecting all available parallel data between the languages , we obtain around 30,000 sentence pairs . however , there exists a significantly larger monolingual northern sami corpus , as well as a rule - based machine translation ( rbmt ) system between the languages .", "entity": "neural machine translation ( nmt ) system", "output": "monolingual data", "neg_sample": ["neural machine translation ( nmt ) system is done by using Material", "we consider a low - resource translation task from finnish into northern sami .", "collecting all available parallel data between the languages , we obtain around 30,000 sentence pairs .", "however , there exists a significantly larger monolingual northern sami corpus , as well as a rule - based machine translation ( rbmt ) system between the languages ."], "relation": "used for", "id": "2021.nodalida-main.37", "year": 2021, "rel_sent": "To make the best use of the monolingual data in a neural machine translation ( NMT ) system , we use the backtranslation approach to create synthetic parallel data from it using both NMT and RBMT systems .", "forward": false, "src_ids": "2021.nodalida-main.37_14024"}
{"input": "latent variational modules is used for OtherScientificTerm| context: despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency .", "entity": "latent variational modules", "output": "bilingual conversational characteristics", "neg_sample": ["latent variational modules is used for OtherScientificTerm", "despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency ."], "relation": "used for", "id": "2021.acl-long.444", "year": 2021, "rel_sent": "Specifically , we design three latent variational modules to learn the distributions of bilingual conversational characteristics .", "forward": true, "src_ids": "2021.acl-long.444_15141"}
{"input": "contrastive learning method is used for Method| context: dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering .", "entity": "contrastive learning method", "output": "dual - encoder model", "neg_sample": ["contrastive learning method is used for Method", "dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering ."], "relation": "used for", "id": "2021.acl-long.477", "year": 2021, "rel_sent": "In this paper , we propose a new contrastive learning method called Cross Momentum Contrastive learning ( xMoCo ) , for learning a dual - encoder model for question - passage matching .", "forward": true, "src_ids": "2021.acl-long.477_13234"}
{"input": "4 - tuple temporal representation is used for OtherScientificTerm| context: grounding events into a precise timeline is important for natural language understanding but has received limited attention in recent work . this problem is challenging due to the inherent ambiguity of language and the requirement for information propagation over inter - related events .", "entity": "4 - tuple temporal representation", "output": "fuzzy time spans", "neg_sample": ["4 - tuple temporal representation is used for OtherScientificTerm", "grounding events into a precise timeline is important for natural language understanding but has received limited attention in recent work .", "this problem is challenging due to the inherent ambiguity of language and the requirement for information propagation over inter - related events ."], "relation": "used for", "id": "2021.naacl-main.6", "year": 2021, "rel_sent": "This paper first formulates this problem based on a 4 - tuple temporal representation used in entity slot filling , which allows us to represent fuzzy time spans more conveniently .", "forward": true, "src_ids": "2021.naacl-main.6_4586"}
{"input": "fine - grained knowledge elements is done by using Method| context: to defend against machine - generated fake news , an effective mechanism is urgently needed .", "entity": "fine - grained knowledge elements", "output": "cross - media consistency checking", "neg_sample": ["fine - grained knowledge elements is done by using Method", "to defend against machine - generated fake news , an effective mechanism is urgently needed ."], "relation": "used for", "id": "2021.acl-long.133", "year": 2021, "rel_sent": "We contribute a novel benchmark for fake news detection at the knowledge element level , as well as a solution for this task which incorporates cross - media consistency checking to detect the fine - grained knowledge elements making news articles misinformative .", "forward": false, "src_ids": "2021.acl-long.133_7516"}
{"input": "low web - resource language model adaptation is done by using OtherScientificTerm| context: recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model . this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls . however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts .", "entity": "low web - resource language model adaptation", "output": "language relatedness", "neg_sample": ["low web - resource language model adaptation is done by using OtherScientificTerm", "recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model .", "this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls .", "however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts ."], "relation": "used for", "id": "2021.acl-long.105", "year": 2021, "rel_sent": "Exploiting Language Relatedness for Low Web - Resource Language Model Adaptation : An Indic Languages Study.", "forward": false, "src_ids": "2021.acl-long.105_9065"}
{"input": "entity alignment is done by using Method| context: this paper studies a new problem setting of entity alignment for knowledge graphs ( kgs ) . since kgs possess different sets of entities , there could be entities that can not find alignment across them , leading to the problem of dangling entities .", "entity": "entity alignment", "output": "multi - task learning framework", "neg_sample": ["entity alignment is done by using Method", "this paper studies a new problem setting of entity alignment for knowledge graphs ( kgs ) .", "since kgs possess different sets of entities , there could be entities that can not find alignment across them , leading to the problem of dangling entities ."], "relation": "used for", "id": "2021.acl-long.278", "year": 2021, "rel_sent": "As the first attempt to this problem , we construct a new dataset and design a multi - task learning framework for both entity alignment and dangling entity detection .", "forward": false, "src_ids": "2021.acl-long.278_4657"}
{"input": "sparsity of training data is done by using Method| context: accurate recovery of predicate - argument structure from a universal dependency ( ud ) parse is central to downstream tasks such as extraction of semantic roles or event representations .", "entity": "sparsity of training data", "output": "compchains", "neg_sample": ["sparsity of training data is done by using Method", "accurate recovery of predicate - argument structure from a universal dependency ( ud ) parse is central to downstream tasks such as extraction of semantic roles or event representations ."], "relation": "used for", "id": "2021.starsem-1.11", "year": 2021, "rel_sent": "We conclude by discussing how compchains provide a new perspective on the sparsity of training data for UD parsers , as well as the accuracy of the resulting UD parses .", "forward": false, "src_ids": "2021.starsem-1.11_6678"}
{"input": "speaker embedding methods is used for Metric| context: in recent years , speech synthesis system can generate speech with high speech quality . however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker .", "entity": "speaker embedding methods", "output": "speaker similarity", "neg_sample": ["speaker embedding methods is used for Metric", "in recent years , speech synthesis system can generate speech with high speech quality .", "however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker ."], "relation": "used for", "id": "2021.ijclclp-2.4", "year": 2021, "rel_sent": "Incorporating Speaker Embedding and Post - Filter Network for Improving Speaker Similarity of Personalized Speech Synthesis System.", "forward": true, "src_ids": "2021.ijclclp-2.4_7081"}
{"input": "model weaknesses is done by using Method| context: robustness and counterfactual bias are usually evaluated on a test dataset . however , are these evaluations robust ? if the test dataset is perturbed slightly , will the evaluation results keep the same ?", "entity": "model weaknesses", "output": "' double perturbation ' framework", "neg_sample": ["model weaknesses is done by using Method", "robustness and counterfactual bias are usually evaluated on a test dataset .", "however , are these evaluations robust ?", "if the test dataset is perturbed slightly , will the evaluation results keep the same ?"], "relation": "used for", "id": "2021.naacl-main.305", "year": 2021, "rel_sent": "In this paper , we propose a ' double perturbation ' framework to uncover model weaknesses beyond the test dataset .", "forward": false, "src_ids": "2021.naacl-main.305_7734"}
{"input": "positional encodings is used for Task| context: in order to preserve word - order information in a non - autoregressive setting , transformer architectures tend to include positional knowledge , by ( for instance ) adding positional encodings to token embeddings . several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture ; these include , for instance , separating position encodings and token embeddings , or directly modifying attention weights based on the distance between word pairs .", "entity": "positional encodings", "output": "multilingual compression", "neg_sample": ["positional encodings is used for Task", "in order to preserve word - order information in a non - autoregressive setting , transformer architectures tend to include positional knowledge , by ( for instance ) adding positional encodings to token embeddings .", "several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture ; these include , for instance , separating position encodings and token embeddings , or directly modifying attention weights based on the distance between word pairs ."], "relation": "used for", "id": "2021.emnlp-main.59", "year": 2021, "rel_sent": "The Impact of Positional Encodings on Multilingual Compression.", "forward": true, "src_ids": "2021.emnlp-main.59_11589"}
{"input": "mtbert is used for Method| context: however , their huge model sizes hinder their applications in many practical systems . however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model .", "entity": "mtbert", "output": "pre - trained language models", "neg_sample": ["mtbert is used for Method", "however , their huge model sizes hinder their applications in many practical systems .", "however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model ."], "relation": "used for", "id": "2021.findings-acl.387", "year": 2021, "rel_sent": "Experiments on three benchmark datasets validate the effectiveness of MTBERT in compressing PLMs .", "forward": true, "src_ids": "2021.findings-acl.387_10297"}
{"input": "highland puebla nahuatl speech translation corpus is used for Task| context: documentation of endangered languages ( els ) has become increasingly urgent as thousands of languages are on the verge of disappearing by the end of the 21st century . one challenging aspect of documentation is to develop machine learning tools to automate the processing of el audio via automatic speech recognition ( asr ) , machine translation ( mt ) , or speech translation ( st ) .", "entity": "highland puebla nahuatl speech translation corpus", "output": "endangered language documentation", "neg_sample": ["highland puebla nahuatl speech translation corpus is used for Task", "documentation of endangered languages ( els ) has become increasingly urgent as thousands of languages are on the verge of disappearing by the end of the 21st century .", "one challenging aspect of documentation is to develop machine learning tools to automate the processing of el audio via automatic speech recognition ( asr ) , machine translation ( mt ) , or speech translation ( st ) ."], "relation": "used for", "id": "2021.americasnlp-1.7", "year": 2021, "rel_sent": "Highland Puebla Nahuatl Speech Translation Corpus for Endangered Language Documentation.", "forward": true, "src_ids": "2021.americasnlp-1.7_15497"}
{"input": "pre - trained language models is done by using Method| context: while pre - trained language models ( ptlms ) have achieved noticeable success on many nlp tasks , they still struggle for tasks that require event temporal reasoning , which is essential for event - centric applications .", "entity": "pre - trained language models", "output": "continual pre - training approach", "neg_sample": ["pre - trained language models is done by using Method", "while pre - trained language models ( ptlms ) have achieved noticeable success on many nlp tasks , they still struggle for tasks that require event temporal reasoning , which is essential for event - centric applications ."], "relation": "used for", "id": "2021.emnlp-main.436", "year": 2021, "rel_sent": "We present a continual pre - training approach that equips PTLMs with targeted knowledge about event temporal relations .", "forward": false, "src_ids": "2021.emnlp-main.436_16024"}
{"input": "transformed contextual representations is used for OtherScientificTerm| context: the application of transformer - based contextual representations has became a de facto solution for solving complex nlp tasks . despite their successes , such representations are arguably opaque as their latent dimensions are not directly interpretable .", "entity": "transformed contextual representations", "output": "supersense category", "neg_sample": ["transformed contextual representations is used for OtherScientificTerm", "the application of transformer - based contextual representations has became a de facto solution for solving complex nlp tasks .", "despite their successes , such representations are arguably opaque as their latent dimensions are not directly interpretable ."], "relation": "used for", "id": "2021.acl-srw.25", "year": 2021, "rel_sent": "Our experiments reveal that the interpretable nature of transformed contextual representations makes it possible to accurately predict the supersense category of a word by simply looking for its transformed coordinate with the largest coefficient .", "forward": true, "src_ids": "2021.acl-srw.25_15304"}
{"input": "wmt21 is done by using Method| context: this paper describes mininglamp neural machine translation systems of the wmt2021 news translation tasks . we have participated in eight directions translation tasks for news text including chinese to / from english , hausa to / from english , german to / from english and french to / from german .", "entity": "wmt21", "output": "mininglamp machine translation system", "neg_sample": ["wmt21 is done by using Method", "this paper describes mininglamp neural machine translation systems of the wmt2021 news translation tasks .", "we have participated in eight directions translation tasks for news text including chinese to / from english , hausa to / from english , german to / from english and french to / from german ."], "relation": "used for", "id": "2021.wmt-1.25", "year": 2021, "rel_sent": "The Mininglamp Machine Translation System for WMT21.", "forward": false, "src_ids": "2021.wmt-1.25_14217"}
{"input": "learning event graph knowledge is used for Task| context: however , abundant event commonsense knowledge is not well exploited for this task .", "entity": "learning event graph knowledge", "output": "abductive reasoning", "neg_sample": ["learning event graph knowledge is used for Task", "however , abundant event commonsense knowledge is not well exploited for this task ."], "relation": "used for", "id": "2021.acl-long.403", "year": 2021, "rel_sent": "Learning Event Graph Knowledge for Abductive Reasoning.", "forward": true, "src_ids": "2021.acl-long.403_15716"}
{"input": "estimating mutual intelligibility is done by using Method| context: we describe experiments with character - based language modeling for written variants of nahuatl .", "entity": "estimating mutual intelligibility", "output": "character language models", "neg_sample": ["estimating mutual intelligibility is done by using Method", "we describe experiments with character - based language modeling for written variants of nahuatl ."], "relation": "used for", "id": "2021.americasnlp-1.3", "year": 2021, "rel_sent": "Using a standard LSTM model and publicly available Bible translations , we explore how character language models can be applied to the tasks of estimating mutual intelligibility , identifying genetic similarity , and distinguishing written variants .", "forward": false, "src_ids": "2021.americasnlp-1.3_13680"}
{"input": "unified generative framework is used for Task| context: most studies only focus on the subsets of these subtasks , which leads to various complicated absa models while hard to solve these subtasks in a unified framework .", "entity": "unified generative framework", "output": "aspect - based sentiment analysis", "neg_sample": ["unified generative framework is used for Task", "most studies only focus on the subsets of these subtasks , which leads to various complicated absa models while hard to solve these subtasks in a unified framework ."], "relation": "used for", "id": "2021.acl-long.188", "year": 2021, "rel_sent": "A Unified Generative Framework for Aspect - based Sentiment Analysis.", "forward": true, "src_ids": "2021.acl-long.188_13507"}
{"input": "hit is used for Task| context: understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing . although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages .", "entity": "hit", "output": "robust code - mixed language representation", "neg_sample": ["hit is used for Task", "understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing .", "although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages ."], "relation": "used for", "id": "2021.findings-acl.407", "year": 2021, "rel_sent": "HIT - A Hierarchically Fused Deep Attention Network for Robust Code - mixed Language Representation.", "forward": true, "src_ids": "2021.findings-acl.407_7794"}
{"input": "ws models is done by using OtherScientificTerm| context: the winograd schema ( ws ) has been proposed as a test for measuring commonsense capabilities of models . recently , pre - trained language model - based approaches have boosted performance on some ws benchmarks but the source of improvement is still not clear . this paper suggests that the apparent progress on ws may not necessarily reflect progress in commonsense reasoning .", "entity": "ws models", "output": "supervision", "neg_sample": ["ws models is done by using OtherScientificTerm", "the winograd schema ( ws ) has been proposed as a test for measuring commonsense capabilities of models .", "recently , pre - trained language model - based approaches have boosted performance on some ws benchmarks but the source of improvement is still not clear .", "this paper suggests that the apparent progress on ws may not necessarily reflect progress in commonsense reasoning ."], "relation": "used for", "id": "2021.emnlp-main.819", "year": 2021, "rel_sent": "We conclude that the observed progress is mostly due to the use of supervision in training WS models , which is not likely to successfully support all the required commonsense reasoning skills and knowledge .", "forward": false, "src_ids": "2021.emnlp-main.819_13919"}
{"input": "neural rankers is done by using Method| context: the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain .", "entity": "neural rankers", "output": "contrastive fine - tuning method", "neg_sample": ["neural rankers is done by using Method", "the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain ."], "relation": "used for", "id": "2021.findings-acl.51", "year": 2021, "rel_sent": "Contrastive Fine - tuning Improves Robustness for Neural Rankers.", "forward": false, "src_ids": "2021.findings-acl.51_12699"}
{"input": "abstract meaning representation is used for Task| context: existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets .", "entity": "abstract meaning representation", "output": "knowledge base question answering", "neg_sample": ["abstract meaning representation is used for Task", "existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets ."], "relation": "used for", "id": "2021.findings-acl.339", "year": 2021, "rel_sent": "Leveraging Abstract Meaning Representation for Knowledge Base Question Answering.", "forward": true, "src_ids": "2021.findings-acl.339_10052"}
{"input": "frame prediction is done by using Method| context: when journalists cover a news story , they can cover the story from multiple angles or perspectives . these perspectives are called ' frames , ' and usage of one frame or another may influence public perception and opinion of the issue at hand .", "entity": "frame prediction", "output": "multilingual language model", "neg_sample": ["frame prediction is done by using Method", "when journalists cover a news story , they can cover the story from multiple angles or perspectives .", "these perspectives are called ' frames , ' and usage of one frame or another may influence public perception and opinion of the issue at hand ."], "relation": "used for", "id": "2021.emnlp-demo.28", "year": 2021, "rel_sent": "The framework combines unsupervised and supervised machine learning and leverages a state - of - the - art ( SoTA ) multilingual language model , which can significantly enhance frame prediction performance while requiring a considerably small sample of manual annotations .", "forward": false, "src_ids": "2021.emnlp-demo.28_2783"}
{"input": "emergent domains is done by using Task| context: since late 2019 , covid-19 has quickly emerged as the newest biomedical domain , resulting in a surge of new information . as with other emergent domains , the discussion surrounding the topic has been rapidly changing , leading to the spread of misinformation . this has created the need for a public space for users to ask questions and receive credible , scientific answers .", "entity": "emergent domains", "output": "open - domain question - answering", "neg_sample": ["emergent domains is done by using Task", "since late 2019 , covid-19 has quickly emerged as the newest biomedical domain , resulting in a surge of new information .", "as with other emergent domains , the discussion surrounding the topic has been rapidly changing , leading to the spread of misinformation .", "this has created the need for a public space for users to ask questions and receive credible , scientific answers ."], "relation": "used for", "id": "2021.emnlp-demo.30", "year": 2021, "rel_sent": "Open - Domain Question - Answering for COVID-19 and Other Emergent Domains.", "forward": false, "src_ids": "2021.emnlp-demo.30_3390"}
{"input": "temporal knowledge graph is done by using Method| context: static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years . recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged .", "entity": "temporal knowledge graph", "output": "recursive temporal fact embedding", "neg_sample": ["temporal knowledge graph is done by using Method", "static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years .", "recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged ."], "relation": "used for", "id": "2021.naacl-main.451", "year": 2021, "rel_sent": "RTFE takes the SKGE to initialize the embeddings of TKG .", "forward": false, "src_ids": "2021.naacl-main.451_11841"}
{"input": "nmt models is used for Task| context: modern neural machine translation ( nmt ) models have achieved competitive performance in standard benchmarks such as wmt . however , there still exist significant issues such as robustness , domain generalization , etc .", "entity": "nmt models", "output": "compositional generalization", "neg_sample": ["nmt models is used for Task", "modern neural machine translation ( nmt ) models have achieved competitive performance in standard benchmarks such as wmt .", "however , there still exist significant issues such as robustness , domain generalization , etc ."], "relation": "used for", "id": "2021.acl-long.368", "year": 2021, "rel_sent": "We quantitatively analyze effects of various factors using compound translation error rate , then demonstrate that the NMT model fails badly on compositional generalization , although it performs remarkably well under traditional metrics .", "forward": true, "src_ids": "2021.acl-long.368_10805"}
{"input": "missing links is done by using Method| context: rapp estimated correlations , rho , between corpus statistics and pyscholinguistic norms . rho improves with quantity ( corpus size ) and quality ( balance ) .", "entity": "missing links", "output": "knowledge graph completion ( kgc )", "neg_sample": ["missing links is done by using Method", "rapp estimated correlations , rho , between corpus statistics and pyscholinguistic norms .", "rho improves with quantity ( corpus size ) and quality ( balance ) ."], "relation": "used for", "id": "2021.emnlp-main.501", "year": 2021, "rel_sent": "Knowledge Graph Completion ( KGC ) attempts to learn missing links from subsets .", "forward": false, "src_ids": "2021.emnlp-main.501_7774"}
{"input": "definite horn rule reasoning is used for Task| context: it has been addressed by two lines of research , i.e. , the more traditional logical rule reasoning and the more recent knowledge graph embedding ( kge ) . even worse , both approaches need to sample ground rules to tackle the scalability issue , as the total number of ground rules is intractable in practice , making them less effective in handling logical rules .", "entity": "definite horn rule reasoning", "output": "knowledge graph inference", "neg_sample": ["definite horn rule reasoning is used for Task", "it has been addressed by two lines of research , i.e.", ", the more traditional logical rule reasoning and the more recent knowledge graph embedding ( kge ) .", "even worse , both approaches need to sample ground rules to tackle the scalability issue , as the total number of ground rules is intractable in practice , making them less effective in handling logical rules ."], "relation": "used for", "id": "2021.emnlp-main.769", "year": 2021, "rel_sent": "UniKER : A Unified Framework for Combining Embedding and Definite Horn Rule Reasoning for Knowledge Graph Inference.", "forward": true, "src_ids": "2021.emnlp-main.769_13039"}
{"input": "digital humanities ( dh ) project is done by using Task| context: this paper focuses on data cleaning as part of a preprocessing procedure applied to text data retrieved from the web . although the importance of this early stage in a project using nlp methods is often highlighted by researchers , the details , general principles and techniques are usually left out due to consideration of space . at best , they are dismissed with a comment ' the usual data cleaning and preprocessing procedures were applied ' . more coverage is usually given to automatic text annotation such as lemmatisation , part - of - speech tagging and parsing , which is often included in preprocessing . in the literature , the term ' preprocessing ' is used to refer to a wide range of procedures , from filtering and cleaning to data transformation such as stemming and numeric representation , which might create confusion .", "entity": "digital humanities ( dh ) project", "output": "text preprocessing", "neg_sample": ["digital humanities ( dh ) project is done by using Task", "this paper focuses on data cleaning as part of a preprocessing procedure applied to text data retrieved from the web .", "although the importance of this early stage in a project using nlp methods is often highlighted by researchers , the details , general principles and techniques are usually left out due to consideration of space .", "at best , they are dismissed with a comment ' the usual data cleaning and preprocessing procedures were applied ' .", "more coverage is usually given to automatic text annotation such as lemmatisation , part - of - speech tagging and parsing , which is often included in preprocessing .", "in the literature , the term ' preprocessing ' is used to refer to a wide range of procedures , from filtering and cleaning to data transformation such as stemming and numeric representation , which might create confusion ."], "relation": "used for", "id": "2021.ranlp-srw.13", "year": 2021, "rel_sent": "Text Preprocessing and its Implications in a Digital Humanities Project.", "forward": false, "src_ids": "2021.ranlp-srw.13_870"}
{"input": "hypernym discovery is done by using Method| context: the high performance of large pretrained language models ( llms ) such as bert on nlp tasks has prompted questions about bert 's linguistic capabilities , and how they differ from humans ' .", "entity": "hypernym discovery", "output": "unsupervised models", "neg_sample": ["hypernym discovery is done by using Method", "the high performance of large pretrained language models ( llms ) such as bert on nlp tasks has prompted questions about bert 's linguistic capabilities , and how they differ from humans ' ."], "relation": "used for", "id": "2021.blackboxnlp-1.20", "year": 2021, "rel_sent": "Moreover , BERT with prompting outperforms other unsupervised models for hypernym discovery even in an unconstrained scenario .", "forward": false, "src_ids": "2021.blackboxnlp-1.20_8360"}
{"input": "consert is used for Task| context: learning high - quality sentence representations benefits a wide range of natural language processing tasks .", "entity": "consert", "output": "downstream tasks", "neg_sample": ["consert is used for Task", "learning high - quality sentence representations benefits a wide range of natural language processing tasks ."], "relation": "used for", "id": "2021.acl-long.393", "year": 2021, "rel_sent": "By making use of unlabeled texts , ConSERT solves the collapse issue of BERT - derived sentence representations and make them more applicable for downstream tasks .", "forward": true, "src_ids": "2021.acl-long.393_3905"}
{"input": "learning domain - specialised representations is used for Task| context: injecting external domain - specific knowledge ( e.g. , umls ) into pretrained language models ( lms ) advances their capability to handle specialised in - domain tasks such as biomedical entity linking ( bel ) . however , such abundant expert knowledge is available only for a handful of languages ( e.g. , english ) .", "entity": "learning domain - specialised representations", "output": "cross - lingual biomedical entity linking", "neg_sample": ["learning domain - specialised representations is used for Task", "injecting external domain - specific knowledge ( e.g.", ", umls ) into pretrained language models ( lms ) advances their capability to handle specialised in - domain tasks such as biomedical entity linking ( bel ) .", "however , such abundant expert knowledge is available only for a handful of languages ( e.g.", ", english ) ."], "relation": "used for", "id": "2021.acl-short.72", "year": 2021, "rel_sent": "Learning Domain - Specialised Representations for Cross - Lingual Biomedical Entity Linking.", "forward": true, "src_ids": "2021.acl-short.72_4161"}
{"input": "regularization is used for Task| context: in selective prediction , a classifier is allowed to abstain from making predictions on low - confidence examples . though this setting is interesting and important , selective prediction has rarely been examined in natural language processing ( nlp ) tasks . tofill this void in the literature , we study in this paper selective prediction for nlp , comparing different models and confidence estimators .", "entity": "regularization", "output": "confidence estimation", "neg_sample": ["regularization is used for Task", "in selective prediction , a classifier is allowed to abstain from making predictions on low - confidence examples .", "though this setting is interesting and important , selective prediction has rarely been examined in natural language processing ( nlp ) tasks .", "tofill this void in the literature , we study in this paper selective prediction for nlp , comparing different models and confidence estimators ."], "relation": "used for", "id": "2021.acl-long.84", "year": 2021, "rel_sent": "We alsofind that our proposed regularization improves confidence estimation and can be applied to other relevant scenarios , such as using classifier cascades for accuracy - efficiency trade - offs .", "forward": true, "src_ids": "2021.acl-long.84_3350"}
{"input": "hierarchically fused deep attention network is used for Task| context: understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing . although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages .", "entity": "hierarchically fused deep attention network", "output": "robust code - mixed language representation", "neg_sample": ["hierarchically fused deep attention network is used for Task", "understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing .", "although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages ."], "relation": "used for", "id": "2021.findings-acl.407", "year": 2021, "rel_sent": "HIT - A Hierarchically Fused Deep Attention Network for Robust Code - mixed Language Representation.", "forward": true, "src_ids": "2021.findings-acl.407_7793"}
{"input": "paraphrase generation is done by using Method| context: different from machine translation , paraphrase generation allows a certain level of discrepancy in semantics between source and target , which results in diverse transformations from lexical substitution to reordering of clauses . hence , the difficulty of transformations requires considering both source and target contexts .", "entity": "paraphrase generation", "output": "curriculum learning", "neg_sample": ["paraphrase generation is done by using Method", "different from machine translation , paraphrase generation allows a certain level of discrepancy in semantics between source and target , which results in diverse transformations from lexical substitution to reordering of clauses .", "hence , the difficulty of transformations requires considering both source and target contexts ."], "relation": "used for", "id": "2021.acl-srw.24", "year": 2021, "rel_sent": "In this study , we apply curriculum learning to paraphrase generation for the first time .", "forward": false, "src_ids": "2021.acl-srw.24_13448"}
{"input": "ensembling is used for Task| context: we describe three baseline beating systems for the high - resource english - only sub - task of the sigmorphon 2021 shared task 1 : a small ensemble that dialpad 's speech recognition team uses internally , a well - known off - the - shelf model , and a larger ensemble model comprising these and others .", "entity": "ensembling", "output": "grapheme - to - phoneme prediction", "neg_sample": ["ensembling is used for Task", "we describe three baseline beating systems for the high - resource english - only sub - task of the sigmorphon 2021 shared task 1 : a small ensemble that dialpad 's speech recognition team uses internally , a well - known off - the - shelf model , and a larger ensemble model comprising these and others ."], "relation": "used for", "id": "2021.sigmorphon-1.16", "year": 2021, "rel_sent": "Avengers , Ensemble ! Benefits of ensembling in grapheme - to - phoneme prediction.", "forward": true, "src_ids": "2021.sigmorphon-1.16_1514"}
{"input": "knowledge - empowered representation learning is used for Task| context: it has been widely studied recently , especially in open domains . however , few efforts have been made on closed - domain mrc , mainly due to the lack of large - scale training data .", "entity": "knowledge - empowered representation learning", "output": "chinese medical reading comprehension", "neg_sample": ["knowledge - empowered representation learning is used for Task", "it has been widely studied recently , especially in open domains .", "however , few efforts have been made on closed - domain mrc , mainly due to the lack of large - scale training data ."], "relation": "used for", "id": "2021.findings-acl.197", "year": 2021, "rel_sent": "Knowledge - Empowered Representation Learning for Chinese Medical Reading Comprehension : Task , Model and Resources.", "forward": true, "src_ids": "2021.findings-acl.197_706"}
{"input": "automatic factuality evaluation system is used for Task| context: beyond traditional token - overlap evaluation metrics ( bleu or meteor ) , a key concern faced by recent generators is to control the factuality of the generated text with respect to the input data specification .", "entity": "automatic factuality evaluation system", "output": "data - to - text generation", "neg_sample": ["automatic factuality evaluation system is used for Task", "beyond traditional token - overlap evaluation metrics ( bleu or meteor ) , a key concern faced by recent generators is to control the factuality of the generated text with respect to the input data specification ."], "relation": "used for", "id": "2021.unimplicit-1.3", "year": 2021, "rel_sent": "We report on our experience when developing an automatic factuality evaluation system for data - to - text generation that we are testing on WebNLG and E2E data .", "forward": true, "src_ids": "2021.unimplicit-1.3_1307"}
{"input": "manually selected synonyms is used for OtherScientificTerm| context: recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries . despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism .", "entity": "manually selected synonyms", "output": "real - world question paraphrases", "neg_sample": ["manually selected synonyms is used for OtherScientificTerm", "recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries .", "despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism ."], "relation": "used for", "id": "2021.acl-long.195", "year": 2021, "rel_sent": "NL questions in Spider - Syn are modified from Spider , by replacing their schema - related words with manually selected synonyms that reflect real - world question paraphrases .", "forward": true, "src_ids": "2021.acl-long.195_15362"}
{"input": "direct transfer is done by using Method| context: event extraction has long been a challenging task , addressed mostly with supervised methods that require expensive annotation and are not extensible to new event ontologies . in this work , we explore the possibility of zero - shot event extraction by formulating it as a set of textual entailment ( te ) and/or question answering ( qa ) queries ( e.g.", "entity": "direct transfer", "output": "te / qa models", "neg_sample": ["direct transfer is done by using Method", "event extraction has long been a challenging task , addressed mostly with supervised methods that require expensive annotation and are not extensible to new event ontologies .", "in this work , we explore the possibility of zero - shot event extraction by formulating it as a set of textual entailment ( te ) and/or question answering ( qa ) queries ( e.g."], "relation": "used for", "id": "2021.acl-short.42", "year": 2021, "rel_sent": "' A city was attacked ' entails ' There is an attack ' ) , exploiting pretrained TE / QA models for direct transfer .", "forward": false, "src_ids": "2021.acl-short.42_8745"}
{"input": "cause and effect spans is done by using Method| context: in this paper , we report on our system for fincausal 2021 financial document causality detection task .", "entity": "cause and effect spans", "output": "fine - tuned t5 - model", "neg_sample": ["cause and effect spans is done by using Method", "in this paper , we report on our system for fincausal 2021 financial document causality detection task ."], "relation": "used for", "id": "2021.fnp-1.9", "year": 2021, "rel_sent": "We tried two types of approaches : 1 ) a fine - tuned T5 - model that generated cause and effect spans 2 ) and a sequence - to - sequence model based on XLNet that solved the task as token classification .", "forward": false, "src_ids": "2021.fnp-1.9_5594"}
{"input": "context is done by using Method| context: answer sentence selection ( as2 ) is an efficient approach for the design of open - domain question answering ( qa ) systems . in order to achieve low latency , traditional as2 models score question - answer pairs individually , ignoring any information from the document each potential answer was extracted from . in contrast , more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input , which often results in better accuracy .", "entity": "context", "output": "multi - way attention architecture", "neg_sample": ["context is done by using Method", "answer sentence selection ( as2 ) is an efficient approach for the design of open - domain question answering ( qa ) systems .", "in order to achieve low latency , traditional as2 models score question - answer pairs individually , ignoring any information from the document each potential answer was extracted from .", "in contrast , more computationally expensive models designed for machine reading comprehension tasks typically receive one or more passages as input , which often results in better accuracy ."], "relation": "used for", "id": "2021.eacl-main.261", "year": 2021, "rel_sent": "Our best approach , which leverages a multi - way attention architecture to efficiently encode context , improves 6 % to 11 % over non - contextual state of the art in AS2 with minimal impact on system latency .", "forward": false, "src_ids": "2021.eacl-main.261_14535"}
{"input": "mahalanobis distance is used for OtherScientificTerm| context: pretrained transformers achieve remarkable performance when training and test data are from the same distribution . however , in real - world scenarios , the model often faces out - of - distribution ( ood ) instances that can cause severe semantic shift problems at inference time . therefore , in practice , a reliable model should identify such instances , and then either reject them during inference or pass them over to models that handle another distribution .", "entity": "mahalanobis distance", "output": "model 's penultimate layer", "neg_sample": ["mahalanobis distance is used for OtherScientificTerm", "pretrained transformers achieve remarkable performance when training and test data are from the same distribution .", "however , in real - world scenarios , the model often faces out - of - distribution ( ood ) instances that can cause severe semantic shift problems at inference time .", "therefore , in practice , a reliable model should identify such instances , and then either reject them during inference or pass them over to models that handle another distribution ."], "relation": "used for", "id": "2021.emnlp-main.84", "year": 2021, "rel_sent": "These OOD instances can then be accurately detected using the Mahalanobis distance in the model 's penultimate layer .", "forward": true, "src_ids": "2021.emnlp-main.84_4969"}
{"input": "multilingual scenario is done by using OtherScientificTerm| context: india is one of the richest language hubs on the earth and is very diverse and multilingual . but apart from a few indian languages , most of them are still considered to be resource poor . since most of the nlp techniques either require linguistic knowledge that can only be developed by experts and native speakers of that language or they require a lot of labelled data which is again expensive to generate , the task of text classification becomes challenging for most of the indian languages .", "entity": "multilingual scenario", "output": "lexical similarity", "neg_sample": ["multilingual scenario is done by using OtherScientificTerm", "india is one of the richest language hubs on the earth and is very diverse and multilingual .", "but apart from a few indian languages , most of them are still considered to be resource poor .", "since most of the nlp techniques either require linguistic knowledge that can only be developed by experts and native speakers of that language or they require a lot of labelled data which is again expensive to generate , the task of text classification becomes challenging for most of the indian languages ."], "relation": "used for", "id": "2021.ranlp-1.3", "year": 2021, "rel_sent": "The main objective of this paper is to see how one can benefit from the lexical similarity found in Indian languages in a multilingual scenario .", "forward": false, "src_ids": "2021.ranlp-1.3_12148"}
{"input": "polysemous verbs is done by using Task| context: despite recent advances in semantic role labeling propelled by pre - trained text encoders like bert , performance lags behind when applied to predicates observed infrequently during training or to sentences in new domains .", "entity": "polysemous verbs", "output": "predicate disambiguation of verbnet classes", "neg_sample": ["polysemous verbs is done by using Task", "despite recent advances in semantic role labeling propelled by pre - trained text encoders like bert , performance lags behind when applied to predicates observed infrequently during training or to sentences in new domains ."], "relation": "used for", "id": "2021.iwcs-1.6", "year": 2021, "rel_sent": "We alsofind that joint training of VerbNet role labeling and predicate disambiguation of VerbNet classes for polysemous verbs leads to improvements in both tasks , naturally supporting the extraction of VerbNet 's semantic representations .", "forward": false, "src_ids": "2021.iwcs-1.6_6794"}
{"input": "mrasp2 is used for Method| context: existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind .", "entity": "mrasp2", "output": "single unified multilingual translation model", "neg_sample": ["mrasp2 is used for Method", "existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind ."], "relation": "used for", "id": "2021.acl-long.21", "year": 2021, "rel_sent": "To this end , we propose mRASP2 , a training method to obtain a single unified multilingual translation model .", "forward": true, "src_ids": "2021.acl-long.21_13567"}
{"input": "indt5 is used for Task| context: transformer language models have become fundamental components of nlp based pipelines . although several transformer have been introduced to serve many languages , there is a shortage of models pre - trained for low - resource and indigenous languages in particular .", "entity": "indt5", "output": "machine translation", "neg_sample": ["indt5 is used for Task", "transformer language models have become fundamental components of nlp based pipelines .", "although several transformer have been introduced to serve many languages , there is a shortage of models pre - trained for low - resource and indigenous languages in particular ."], "relation": "used for", "id": "2021.americasnlp-1.30", "year": 2021, "rel_sent": "We also present the application of IndT5 to machine translation by investigating different approaches to translate between Spanish and the Indigenous languages as part of our contribution to the AmericasNLP 2021 Shared Task on Open Machine Translation .", "forward": true, "src_ids": "2021.americasnlp-1.30_12419"}
{"input": "mt systems is used for Task| context: language technologies , such as machine translation ( mt ) , but also the application of artificial intelligence in general and an abundance of cat tools and platforms have an increasing influence on the translation market . human interaction with these technologies becomes ever more important as they impact translators ' workflows , work environments , and job profiles . moreover , it has implications for translator training . one of the tasks that emerged with language technologies is post - editing ( pe ) where a human translator corrects raw machine translated output according to given guidelines and quality criteria ( o'brien , 2011 : 197 - 198 ) . already widely used in several traditional translation settings , its use has come intofocus in more creative processes such as literary translation and audiovisual translation ( avt ) as well .", "entity": "mt systems", "output": "translation process", "neg_sample": ["mt systems is used for Task", "language technologies , such as machine translation ( mt ) , but also the application of artificial intelligence in general and an abundance of cat tools and platforms have an increasing influence on the translation market .", "human interaction with these technologies becomes ever more important as they impact translators ' workflows , work environments , and job profiles .", "moreover , it has implications for translator training .", "one of the tasks that emerged with language technologies is post - editing ( pe ) where a human translator corrects raw machine translated output according to given guidelines and quality criteria ( o'brien , 2011 : 197 - 198 ) .", "already widely used in several traditional translation settings , its use has come intofocus in more creative processes such as literary translation and audiovisual translation ( avt ) as well ."], "relation": "used for", "id": "2021.mtsummit-asltrw.2", "year": 2021, "rel_sent": "With the integration of MT systems , the translation process should become more efficient .", "forward": true, "src_ids": "2021.mtsummit-asltrw.2_4783"}
{"input": "natural language processing ( nlp ) models is done by using Method| context: backdoor attacks , which maliciously control a well - trained model 's outputs of the instances with specific triggers , are recently shown to be serious threats to the safety of reusing deep neural networks ( dnns ) .", "entity": "natural language processing ( nlp ) models", "output": "word - based robustness - aware perturbation", "neg_sample": ["natural language processing ( nlp ) models is done by using Method", "backdoor attacks , which maliciously control a well - trained model 's outputs of the instances with specific triggers , are recently shown to be serious threats to the safety of reusing deep neural networks ( dnns ) ."], "relation": "used for", "id": "2021.emnlp-main.659", "year": 2021, "rel_sent": "Motivated by this observation , we construct a word - based robustness - aware perturbation to distinguish poisoned samples from clean samples to defend against the backdoor attacks on natural language processing ( NLP ) models .", "forward": false, "src_ids": "2021.emnlp-main.659_789"}
{"input": "hi - dst is done by using OtherScientificTerm| context: dialogue state tracking ( dst ) is a sub - task of task - based dialogue systems where the user intention is tracked through a set of ( domain , slot , slot - value ) triplets .", "entity": "hi - dst", "output": "model parameters", "neg_sample": ["hi - dst is done by using OtherScientificTerm", "dialogue state tracking ( dst ) is a sub - task of task - based dialogue systems where the user intention is tracked through a set of ( domain , slot , slot - value ) triplets ."], "relation": "used for", "id": "2021.sigdial-1.23", "year": 2021, "rel_sent": "The model parameters of Hi - DST are independent of the number of domains / slots .", "forward": false, "src_ids": "2021.sigdial-1.23_12030"}
{"input": "multilingual models is done by using Method| context: recent work demonstrates the potential of training one model for multilingual machine translation . in parallel , denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains . however , little has been explored on the potential to combine denoising pretraining with multilingual machine translation in a single model .", "entity": "multilingual models", "output": "pretraining and finetuning paradigm", "neg_sample": ["multilingual models is done by using Method", "recent work demonstrates the potential of training one model for multilingual machine translation .", "in parallel , denoising pretraining using unlabeled monolingual data as a starting point for finetuning bitext machine translation systems has demonstrated strong performance gains .", "however , little has been explored on the potential to combine denoising pretraining with multilingual machine translation in a single model ."], "relation": "used for", "id": "2021.findings-acl.304", "year": 2021, "rel_sent": "Finally , we discuss that the pretraining and finetuning paradigm alone is not enough to address the challenges of multilingual models for to - Many directions performance .", "forward": false, "src_ids": "2021.findings-acl.304_7566"}
{"input": "canonical metrics is used for Material| context: social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity . since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages . however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved . in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly .", "entity": "canonical metrics", "output": "meaningless translations", "neg_sample": ["canonical metrics is used for Material", "social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity .", "since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages .", "however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved .", "in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly ."], "relation": "used for", "id": "2021.triton-1.6", "year": 2021, "rel_sent": "We compare the performance of three canonical metrics on meaningless translations as compared to meaningful translations with a critical error that distorts the overall sentiment of the source text .", "forward": true, "src_ids": "2021.triton-1.6_15805"}
{"input": "fact verification is done by using OtherScientificTerm| context: fact verification is a challenging task of identifying the truthfulness of given claims based on the retrieval of relevant evidence texts . many claims require understanding and reasoning over external entity information for precise verification .", "entity": "fact verification", "output": "entity knowledge", "neg_sample": ["fact verification is done by using OtherScientificTerm", "fact verification is a challenging task of identifying the truthfulness of given claims based on the retrieval of relevant evidence texts .", "many claims require understanding and reasoning over external entity information for precise verification ."], "relation": "used for", "id": "2021.fever-1.6", "year": 2021, "rel_sent": "Modeling Entity Knowledge for Fact Verification.", "forward": false, "src_ids": "2021.fever-1.6_9866"}
{"input": "predicate - argument relations is done by using OtherScientificTerm| context: multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts . current methods confronting consolidation struggle tofuse overlapping information .", "entity": "predicate - argument relations", "output": "question - answer pairs", "neg_sample": ["predicate - argument relations is done by using OtherScientificTerm", "multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts .", "current methods confronting consolidation struggle tofuse overlapping information ."], "relation": "used for", "id": "2021.emnlp-main.778", "year": 2021, "rel_sent": "Our setting exploits QA - SRL , utilizing question - answer pairs to capture predicate - argument relations , facilitating laymen annotation of cross - text alignments .", "forward": false, "src_ids": "2021.emnlp-main.778_14514"}
{"input": "large neural models is used for OtherScientificTerm| context: humor and offense are highly subjective due to multiple word senses , cultural knowledge , and pragmatic competence . hence , accurately detecting humorous and offensive texts has several compelling use cases in recommendation systems and personalized content moderation . however , due to the lack of an extensive labeled dataset , most prior works in this domain have n't explored large neural models for subjective humor understanding .", "entity": "large neural models", "output": "rating", "neg_sample": ["large neural models is used for OtherScientificTerm", "humor and offense are highly subjective due to multiple word senses , cultural knowledge , and pragmatic competence .", "hence , accurately detecting humorous and offensive texts has several compelling use cases in recommendation systems and personalized content moderation .", "however , due to the lack of an extensive labeled dataset , most prior works in this domain have n't explored large neural models for subjective humor understanding ."], "relation": "used for", "id": "2021.semeval-1.36", "year": 2021, "rel_sent": "This paper explores whether large neural models and their ensembles can capture the intricacies associated with humor / offense detection and rating .", "forward": true, "src_ids": "2021.semeval-1.36_13500"}
{"input": "memory and knowledge augmented language models is used for Task| context: measuring event salience is essential in the understanding of stories .", "entity": "memory and knowledge augmented language models", "output": "inferring salience", "neg_sample": ["memory and knowledge augmented language models is used for Task", "measuring event salience is essential in the understanding of stories ."], "relation": "used for", "id": "2021.emnlp-main.65", "year": 2021, "rel_sent": "Memory and Knowledge Augmented Language Models for Inferring Salience in Long - Form Stories.", "forward": true, "src_ids": "2021.emnlp-main.65_3933"}
{"input": "sentiment analysis is done by using Method| context: this is usually referred to as the domain or task adaptation step . however , unlike the initial pre - training , this step is performed for each domain or task individually and is still rather slow , requiring several gpu days compared to several gpu hours required for the final task fine - tuning .", "entity": "sentiment analysis", "output": "masked language models ( mlm )", "neg_sample": ["sentiment analysis is done by using Method", "this is usually referred to as the domain or task adaptation step .", "however , unlike the initial pre - training , this step is performed for each domain or task individually and is still rather slow , requiring several gpu days compared to several gpu hours required for the final task fine - tuning ."], "relation": "used for", "id": "2021.emnlp-main.717", "year": 2021, "rel_sent": "NB - MLM : Efficient Domain Adaptation of Masked Language Models for Sentiment Analysis.", "forward": false, "src_ids": "2021.emnlp-main.717_13536"}
{"input": "asr algorithms is used for OtherScientificTerm| context: linguistic tone is transcribed for input into asr systems in numerous ways . this paper shows a systematic test of several transcription styles , using as an example the chibchan language bribri , an extremely low - resource language from costa rica .", "entity": "asr algorithms", "output": "tone patterns", "neg_sample": ["asr algorithms is used for OtherScientificTerm", "linguistic tone is transcribed for input into asr systems in numerous ways .", "this paper shows a systematic test of several transcription styles , using as an example the chibchan language bribri , an extremely low - resource language from costa rica ."], "relation": "used for", "id": "2021.americasnlp-1.20", "year": 2021, "rel_sent": "The most successful models separate the tone from the vowel , so that the ASR algorithms learn tone patterns independently .", "forward": true, "src_ids": "2021.americasnlp-1.20_6451"}
{"input": "transformer - based architecture is done by using Method| context: text - pair classification is the task of determining the class relationship between two sentences . it is embedded in several tasks such as paraphrase identification and duplicate question detection . contemporary methods use fine - tuned transformer encoder semantic representations of the classification token in the text - pair sequence from the transformer 's final layer for class prediction . however , research has shown that earlier parts of the network learn shallow features , such as syntax and structure , which existing methods do not directly exploit .", "entity": "transformer - based architecture", "output": "convolution - based decoder", "neg_sample": ["transformer - based architecture is done by using Method", "text - pair classification is the task of determining the class relationship between two sentences .", "it is embedded in several tasks such as paraphrase identification and duplicate question detection .", "contemporary methods use fine - tuned transformer encoder semantic representations of the classification token in the text - pair sequence from the transformer 's final layer for class prediction .", "however , research has shown that earlier parts of the network learn shallow features , such as syntax and structure , which existing methods do not directly exploit ."], "relation": "used for", "id": "2021.alta-1.7", "year": 2021, "rel_sent": "We propose a novel convolution - based decoder for transformer - based architecture that maximizes the use of encoder hidden features for text - pair classification .", "forward": false, "src_ids": "2021.alta-1.7_10341"}
{"input": "dialogue modeling is done by using Method| context: although neural models have achieved competitive results in dialogue systems , they have shown limited ability in representing core semantics , such as ignoring important entities .", "entity": "dialogue modeling", "output": "abstract meaning representation ( amr )", "neg_sample": ["dialogue modeling is done by using Method", "although neural models have achieved competitive results in dialogue systems , they have shown limited ability in representing core semantics , such as ignoring important entities ."], "relation": "used for", "id": "2021.acl-long.342", "year": 2021, "rel_sent": "To this end , we exploit Abstract Meaning Representation ( AMR ) to help dialogue modeling .", "forward": false, "src_ids": "2021.acl-long.342_4313"}
{"input": "knowledge distillation ( kd ) is used for Task| context: pretrained transformer - based encoders such as bert have been demonstrated to achieve state - of - the - art performance on numerous nlp tasks . despite their success , bert style encoders are large in size and have high latency during inference ( especially on cpu machines ) which make them unappealing for many online applications . recently introduced compression and distillation methods have provided effective ways to alleviate this shortcoming . however , the focus of these works has been mainly on monolingual encoders .", "entity": "knowledge distillation ( kd )", "output": "fine - tuning stage", "neg_sample": ["knowledge distillation ( kd ) is used for Task", "pretrained transformer - based encoders such as bert have been demonstrated to achieve state - of - the - art performance on numerous nlp tasks .", "despite their success , bert style encoders are large in size and have high latency during inference ( especially on cpu machines ) which make them unappealing for many online applications .", "recently introduced compression and distillation methods have provided effective ways to alleviate this shortcoming .", "however , the focus of these works has been mainly on monolingual encoders ."], "relation": "used for", "id": "2021.sustainlp-1.3", "year": 2021, "rel_sent": "Motivated by recent successes in zero - shot cross - lingual transfer learning using multilingual pretrained encoders such as mBERT , we evaluate the effectiveness of Knowledge Distillation ( KD ) both during pretraining stage and during fine - tuning stage on multilingual BERT models .", "forward": true, "src_ids": "2021.sustainlp-1.3_11741"}
{"input": "tpp pretraining is used for Task| context: we evaluate a simple approach to improving zero - shot multilingual transfer of mbert on social media corpus by adding a pretraining task called translation pair prediction ( tpp ) , which predicts whether a pair of cross - lingual texts are a valid translation .", "entity": "tpp pretraining", "output": "zero - shot transfer", "neg_sample": ["tpp pretraining is used for Task", "we evaluate a simple approach to improving zero - shot multilingual transfer of mbert on social media corpus by adding a pretraining task called translation pair prediction ( tpp ) , which predicts whether a pair of cross - lingual texts are a valid translation ."], "relation": "used for", "id": "2021.wnut-1.42", "year": 2021, "rel_sent": "We show improvements from TPP pretraining over mBERT alone in zero - shot transfer from English to Hindi , Arabic , and Japanese on two social media tasks : NER ( a 37 % average relative improvement in F1 across target languages ) and sentiment classification ( 12 % relative improvement in F1 ) on social media text , while also benchmarking on a non - social media task of Universal Dependency POS tagging ( 6.7 % relative improvement in accuracy ) .", "forward": true, "src_ids": "2021.wnut-1.42_10380"}
{"input": "otomi is done by using Task| context: in linguistics , interlinear glossing is an essential procedure for analyzing the morphology of languages . this type of annotation is useful for language documentation , and it can also provide valuable data for nlp applications .", "entity": "otomi", "output": "automatic glossing", "neg_sample": ["otomi is done by using Task", "in linguistics , interlinear glossing is an essential procedure for analyzing the morphology of languages .", "this type of annotation is useful for language documentation , and it can also provide valuable data for nlp applications ."], "relation": "used for", "id": "2021.americasnlp-1.5", "year": 2021, "rel_sent": "We perform automatic glossing for Otomi , an under - resourced language .", "forward": false, "src_ids": "2021.americasnlp-1.5_11984"}
{"input": "zero- and few - shot learning models is used for Task| context: state - of - the - art models are usually learned using annotated corpora or rely on hand - crafted affective lexicons .", "entity": "zero- and few - shot learning models", "output": "emotion classification", "neg_sample": ["zero- and few - shot learning models is used for Task", "state - of - the - art models are usually learned using annotated corpora or rely on hand - crafted affective lexicons ."], "relation": "used for", "id": "2021.ranlp-1.16", "year": 2021, "rel_sent": "Probabilistic Ensembles of Zero- and Few - Shot Learning Models for Emotion Classification.", "forward": true, "src_ids": "2021.ranlp-1.16_15110"}
{"input": "answer generation modules is used for Method| context: only a few of them adopt several answer generation modules for providing different mechanisms ; however , they either lack an aggregation mechanism to merge the answers from various modules , or are too complicated to be implemented with neural networks .", "entity": "answer generation modules", "output": "inference mechanisms", "neg_sample": ["answer generation modules is used for Method", "only a few of them adopt several answer generation modules for providing different mechanisms ; however , they either lack an aggregation mechanism to merge the answers from various modules , or are too complicated to be implemented with neural networks ."], "relation": "used for", "id": "2021.rocling-1.5", "year": 2021, "rel_sent": "The answer generation modules are designed to provide different inference mechanisms , the dispatch module is used to select a few appropriate answer generation modules to generate answer candidates , and the aggregation module is employed to select the final answer .", "forward": true, "src_ids": "2021.rocling-1.5_5677"}
{"input": "factuality evaluation metrics is done by using OtherScientificTerm| context: text generation models can generate factually inconsistent text containing distorted or fabricated facts about the source text . recent work has focused on building evaluation models to verify the factual correctness of semantically constrained text generation tasks such as document summarization . while the field of factuality evaluation is growing fast , we do n't have well - defined criteria for measuring the effectiveness , generalizability , reliability , or sensitivity of the factuality metrics .", "entity": "factuality evaluation metrics", "output": "common - sense conditions", "neg_sample": ["factuality evaluation metrics is done by using OtherScientificTerm", "text generation models can generate factually inconsistent text containing distorted or fabricated facts about the source text .", "recent work has focused on building evaluation models to verify the factual correctness of semantically constrained text generation tasks such as document summarization .", "while the field of factuality evaluation is growing fast , we do n't have well - defined criteria for measuring the effectiveness , generalizability , reliability , or sensitivity of the factuality metrics ."], "relation": "used for", "id": "2021.findings-acl.42", "year": 2021, "rel_sent": "We introduce five necessary , common - sense conditions for effective factuality metrics and experiment with nine recent factuality metrics using synthetic and human - labeled factuality data from short news , long news and dialogue summarization domains .", "forward": false, "src_ids": "2021.findings-acl.42_6397"}
{"input": "discrete word embedding space is done by using Method| context: as a prominent attribution - based explanation algorithm , integrated gradients ( ig ) is widely adopted due to its desirable explanation axioms and the ease of gradient computation . it measures feature importance by averaging the model 's output gradient interpolated along a straight - line path in the input data space . however , such straight - line interpolated points are not representative of text data due to the inherent discreteness of the word embedding space . this questions the faithfulness of the gradients computed at the interpolated points and consequently , the quality of the generated explanations .", "entity": "discrete word embedding space", "output": "interpolation strategies", "neg_sample": ["discrete word embedding space is done by using Method", "as a prominent attribution - based explanation algorithm , integrated gradients ( ig ) is widely adopted due to its desirable explanation axioms and the ease of gradient computation .", "it measures feature importance by averaging the model 's output gradient interpolated along a straight - line path in the input data space .", "however , such straight - line interpolated points are not representative of text data due to the inherent discreteness of the word embedding space .", "this questions the faithfulness of the gradients computed at the interpolated points and consequently , the quality of the generated explanations ."], "relation": "used for", "id": "2021.emnlp-main.805", "year": 2021, "rel_sent": "We develop two interpolation strategies for the discrete word embedding space that generates interpolation points that lie close to actual words in the embedding space , yielding more faithful gradient computation .", "forward": false, "src_ids": "2021.emnlp-main.805_690"}
{"input": "self - organizing maps ( soms ) is used for Task| context: romanian is one of the understudied languages in computational linguistics , with few resources available for the development of natural language processing tools .", "entity": "self - organizing maps ( soms )", "output": "clustering of word embeddings", "neg_sample": ["self - organizing maps ( soms ) is used for Task", "romanian is one of the understudied languages in computational linguistics , with few resources available for the development of natural language processing tools ."], "relation": "used for", "id": "2021.eacl-main.81", "year": 2021, "rel_sent": "We also demonstrate the generalization capacity of using SOMs for the clustering of word embeddings on another recently - introduced Romanian data set , for text categorization by topic .", "forward": true, "src_ids": "2021.eacl-main.81_2937"}
{"input": "relation patterns is done by using Method| context: distance based knowledge graph embedding methods show promising results on link prediction task , on which two topics have been widely studied : one is the ability to handle complex relations , such as n - to-1 , 1 - to - n and n - to - n , the other is to encode various relation patterns , such as symmetry / antisymmetry . however , the existing methods fail to solve these two problems at the same time , which leads to unsatisfactory results .", "entity": "relation patterns", "output": "pairre", "neg_sample": ["relation patterns is done by using Method", "distance based knowledge graph embedding methods show promising results on link prediction task , on which two topics have been widely studied : one is the ability to handle complex relations , such as n - to-1 , 1 - to - n and n - to - n , the other is to encode various relation patterns , such as symmetry / antisymmetry .", "however , the existing methods fail to solve these two problems at the same time , which leads to unsatisfactory results ."], "relation": "used for", "id": "2021.acl-long.336", "year": 2021, "rel_sent": "Besides , PairRE is capable of encoding three important relation patterns , symmetry / antisymmetry , inverse and composition .", "forward": false, "src_ids": "2021.acl-long.336_12292"}
{"input": "dutch dataset is used for Task| context: multi - label toxicity detection is highly prominent , with many research groups , companies , and individuals engaging with it through shared tasks and dedicated venues .", "entity": "dutch dataset", "output": "cross - lingual multilabel toxicity detection", "neg_sample": ["dutch dataset is used for Task", "multi - label toxicity detection is highly prominent , with many research groups , companies , and individuals engaging with it through shared tasks and dedicated venues ."], "relation": "used for", "id": "2021.bucc-1.10", "year": 2021, "rel_sent": "A Dutch Dataset for Cross - lingual Multilabel Toxicity Detection.", "forward": true, "src_ids": "2021.bucc-1.10_15688"}
{"input": "supervised ee is done by using Generic| context: event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning . however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data .", "entity": "supervised ee", "output": "complementary representations", "neg_sample": ["supervised ee is done by using Generic", "event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning .", "however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data ."], "relation": "used for", "id": "2021.acl-long.491", "year": 2021, "rel_sent": "The two complementary representations then work together to improve both the conventional supervised EE and the unsupervised ' liberal ' EE , which requires jointly extracting events and discovering event schemata without any annotated data .", "forward": false, "src_ids": "2021.acl-long.491_6674"}
{"input": "core computational resources is used for Material| context: identifying intertextual relationships between authors is of central importance to the study of literature .", "entity": "core computational resources", "output": "premodern language", "neg_sample": ["core computational resources is used for Material", "identifying intertextual relationships between authors is of central importance to the study of literature ."], "relation": "used for", "id": "2021.naacl-main.389", "year": 2021, "rel_sent": "Our results advance the development of core computational resources for a major premodern language and highlight a productive avenue for cross - disciplinary collaboration between the study of literature and NLP .", "forward": true, "src_ids": "2021.naacl-main.389_15987"}
{"input": "image retrieval is used for Method| context: visual grounding is a promising path toward more robust and accurate natural language processing ( nlp ) models . many multimodal extensions of bert ( e.g. , videobert , lxmert , vl - bert ) allow a joint modeling of texts and images that lead to state - of - the - art results on multimodal tasks such as visual question answering .", "entity": "image retrieval", "output": "pretraining", "neg_sample": ["image retrieval is used for Method", "visual grounding is a promising path toward more robust and accurate natural language processing ( nlp ) models .", "many multimodal extensions of bert ( e.g.", ", videobert , lxmert , vl - bert ) allow a joint modeling of texts and images that lead to state - of - the - art results on multimodal tasks such as visual question answering ."], "relation": "used for", "id": "2021.lantern-1.2", "year": 2021, "rel_sent": "The second one , which we call associative grounding , harnesses image retrieval to match texts with related images during both pretraining and text - only downstream tasks .", "forward": true, "src_ids": "2021.lantern-1.2_7244"}
{"input": "hierarchy level is used for Material| context: recent work on multilingual dependency parsing focused on developing highly multilingual parsers that can be applied to a wide range of low - resource languages .", "entity": "hierarchy level", "output": "treebanks", "neg_sample": ["hierarchy level is used for Material", "recent work on multilingual dependency parsing focused on developing highly multilingual parsers that can be applied to a wide range of low - resource languages ."], "relation": "used for", "id": "2021.findings-acl.431", "year": 2021, "rel_sent": "For each low - resource target language , we then climb this language hierarchy starting from the leaf node of that language and heuristically choose the hierarchy level at which to collect training treebanks .", "forward": true, "src_ids": "2021.findings-acl.431_10043"}
{"input": "biaffine parser is used for Task| context: previous works on key information extraction from visually rich documents ( vrds ) mainly focus on labeling the text within each bounding box ( i.e. ,semantic entity ) , while the relations in - between are largely unexplored .", "entity": "biaffine parser", "output": "entity relation extraction task", "neg_sample": ["biaffine parser is used for Task", "previous works on key information extraction from visually rich documents ( vrds ) mainly focus on labeling the text within each bounding box ( i.e.", ",semantic entity ) , while the relations in - between are largely unexplored ."], "relation": "used for", "id": "2021.emnlp-main.218", "year": 2021, "rel_sent": "In this paper , we adapt the popular dependency parsing model , the biaffine parser , to this entity relation extraction task .", "forward": true, "src_ids": "2021.emnlp-main.218_3911"}
{"input": "fine - tuning language models is used for Task| context: fine - tuning pre - trained language models for downstream tasks has become a norm for nlp . recently it is found that intermediate training can improve performance for fine - tuning language models for target tasks , high - level inference tasks such as question answering ( qa ) tend to work best as intermediate tasks . however it is not clear if intermediate training generally benefits various language models .", "entity": "fine - tuning language models", "output": "text classification", "neg_sample": ["fine - tuning language models is used for Task", "fine - tuning pre - trained language models for downstream tasks has become a norm for nlp .", "recently it is found that intermediate training can improve performance for fine - tuning language models for target tasks , high - level inference tasks such as question answering ( qa ) tend to work best as intermediate tasks .", "however it is not clear if intermediate training generally benefits various language models ."], "relation": "used for", "id": "2021.alta-1.16", "year": 2021, "rel_sent": "Does QA - based intermediate training help fine - tuning language models for text classification ?.", "forward": true, "src_ids": "2021.alta-1.16_7"}
{"input": "features is done by using Metric| context: answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage . this task can naturally be reduced to the semantic text similarity problem between question and answer candidate .", "entity": "features", "output": "similarity measures", "neg_sample": ["features is done by using Metric", "answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage .", "this task can naturally be reduced to the semantic text similarity problem between question and answer candidate ."], "relation": "used for", "id": "2021.paclic-1.29", "year": 2021, "rel_sent": "Study of Similarity Measures as Features in Classification for Answer Sentence Selection Task in Hindi Question Answering : Language - Specific v / s Other Measures.", "forward": false, "src_ids": "2021.paclic-1.29_5275"}
{"input": "fine - grained consistency reasoning is used for OtherScientificTerm| context: factual inconsistencies existed in the output of abstractive summarization models with original documents are frequently presented . fact consistency assessment requires the reasoning capability tofind subtle clues to identify whether a model - generated summary is consistent with the original document .", "entity": "fine - grained consistency reasoning", "output": "sentence level", "neg_sample": ["fine - grained consistency reasoning is used for OtherScientificTerm", "factual inconsistencies existed in the output of abstractive summarization models with original documents are frequently presented .", "fact consistency assessment requires the reasoning capability tofind subtle clues to identify whether a model - generated summary is consistent with the original document ."], "relation": "used for", "id": "2021.emnlp-main.9", "year": 2021, "rel_sent": "In the second stage , the model performs fine - grained consistency reasoning at the sentence level , and then aggregates all sentences ' consistency scores to obtain the final assessment result .", "forward": true, "src_ids": "2021.emnlp-main.9_6061"}
{"input": "multi - dialect arabic twitter corpus is used for OtherScientificTerm| context: social media ( sm ) platforms such as twitter provide large quantities of real - time data that can be leveraged during mass emergencies . developing tools to support crisis - affected communities requires available datasets , which often do not exist for low resource languages .", "entity": "multi - dialect arabic twitter corpus", "output": "crisis events", "neg_sample": ["multi - dialect arabic twitter corpus is used for OtherScientificTerm", "social media ( sm ) platforms such as twitter provide large quantities of real - time data that can be leveraged during mass emergencies .", "developing tools to support crisis - affected communities requires available datasets , which often do not exist for low resource languages ."], "relation": "used for", "id": "2021.wanlp-1.5", "year": 2021, "rel_sent": "This paper introduces Kawarith a multi - dialect Arabic Twitter corpus for crisis events , comprising more than a million Arabic tweets collected during 22 crises that occurred between 2018 and 2020 and involved several types of hazard .", "forward": true, "src_ids": "2021.wanlp-1.5_2923"}
{"input": "translation is done by using Method| context: neural machine translation ( nmt ) has shown a strong ability to utilize local context to disambiguate the meaning of words . however , it remains a challenge for nmt to leverage broader context information like topics .", "entity": "translation", "output": "nmt models", "neg_sample": ["translation is done by using Method", "neural machine translation ( nmt ) has shown a strong ability to utilize local context to disambiguate the meaning of words .", "however , it remains a challenge for nmt to leverage broader context information like topics ."], "relation": "used for", "id": "2021.emnlp-main.256", "year": 2021, "rel_sent": "In this paper , we propose heterogeneous ways of embedding topic information at the sentence level into an NMT model to improve translation performance .", "forward": false, "src_ids": "2021.emnlp-main.256_13899"}
{"input": "choral is used for OtherScientificTerm| context: humor detection has gained attention in recent years due to the desire to understand user - generated content with figurative language . however , substantial individual and cultural differences in humor perception make it very difficult to collect a large - scale humor dataset with reliable humor labels .", "entity": "choral", "output": "perceived humor labels", "neg_sample": ["choral is used for OtherScientificTerm", "humor detection has gained attention in recent years due to the desire to understand user - generated content with figurative language .", "however , substantial individual and cultural differences in humor perception make it very difficult to collect a large - scale humor dataset with reliable humor labels ."], "relation": "used for", "id": "2021.emnlp-main.364", "year": 2021, "rel_sent": "We propose CHoRaL , a framework to generate perceived humor labels on Facebook posts , using the naturally available user reactions to these posts with no manual annotation needed .", "forward": true, "src_ids": "2021.emnlp-main.364_13764"}
{"input": "unsupervised method is used for Task| context: historical corpora are known to contain errors introduced by ocr ( optical character recognition ) methods used in the digitization process , often said to be degrading the performance of nlp systems . correcting these errors manually is a time - consuming process and a great part of the automatic approaches have been relying on rules or supervised machine learning .", "entity": "unsupervised method", "output": "ocr post - correction", "neg_sample": ["unsupervised method is used for Task", "historical corpora are known to contain errors introduced by ocr ( optical character recognition ) methods used in the digitization process , often said to be degrading the performance of nlp systems .", "correcting these errors manually is a time - consuming process and a great part of the automatic approaches have been relying on rules or supervised machine learning ."], "relation": "used for", "id": "2021.nodalida-main.24", "year": 2021, "rel_sent": "An Unsupervised method for OCR Post - Correction and Spelling Normalisation for Finnish.", "forward": true, "src_ids": "2021.nodalida-main.24_13636"}
{"input": "reranker is used for Material| context: false claims that have been previously fact - checked can still spread on social media . to mitigate their continual spread , detecting previously fact - checked claims is indispensable . models that ignore the two aspects only leverage semantic relevance and may be misled by sentences that describe similar but irrelevant events .", "entity": "reranker", "output": "fc - articles", "neg_sample": ["reranker is used for Material", "false claims that have been previously fact - checked can still spread on social media .", "to mitigate their continual spread , detecting previously fact - checked claims is indispensable .", "models that ignore the two aspects only leverage semantic relevance and may be misled by sentences that describe similar but irrelevant events ."], "relation": "used for", "id": "2021.acl-long.425", "year": 2021, "rel_sent": "In this paper , we propose a novel reranker , MTM ( Memory - enhanced Transformers for Matching ) to rank FC - articles using key sentences selected with event ( lexical and semantic ) and pattern information .", "forward": true, "src_ids": "2021.acl-long.425_14384"}
{"input": "qualitative analysis of website localization is done by using Material| context: translation studies and more specifically , its subfield descriptive translation studies [ holmes 1988/2000 ] is , according to many scholars [ gambier , 2009 ; nenopoulou , 2007 ; munday , 2001/2008 ; hermans , 1999 ; snell - hornby et al . , 1994 e.t.c ] , a highly interdisciplinary field of study .", "entity": "qualitative analysis of website localization", "output": "polysemiotic corpora", "neg_sample": ["qualitative analysis of website localization is done by using Material", "translation studies and more specifically , its subfield descriptive translation studies [ holmes 1988/2000 ] is , according to many scholars [ gambier , 2009 ; nenopoulou , 2007 ; munday , 2001/2008 ; hermans , 1999 ; snell - hornby et al .", ", 1994 e.t.c ] , a highly interdisciplinary field of study ."], "relation": "used for", "id": "2021.triton-1.25", "year": 2021, "rel_sent": "Up to now research findings have shown that polysemiotic corpora can be a valuable tool not only of quantitative but also of qualitative analysis of website localization both for scholars and translation professionals working with multimodal genres .", "forward": false, "src_ids": "2021.triton-1.25_15685"}
{"input": "lifelong collaborative model ( lcm ) is used for OtherScientificTerm| context: lifelong topic models mainly focus on indomain text streams in which each chunk only contains documents from a single domain . to overcome data diversity of the in - domain corpus , most of the existing methods exploit the information from limited sources in a separate and heuristic manner .", "entity": "lifelong collaborative model ( lcm )", "output": "domain - specific word embeddings", "neg_sample": ["lifelong collaborative model ( lcm ) is used for OtherScientificTerm", "lifelong topic models mainly focus on indomain text streams in which each chunk only contains documents from a single domain .", "to overcome data diversity of the in - domain corpus , most of the existing methods exploit the information from limited sources in a separate and heuristic manner ."], "relation": "used for", "id": "2021.findings-acl.202", "year": 2021, "rel_sent": "In this study , we develop a lifelong collaborative model ( LCM ) based on non - negative matrix factorization to accurately learn topics and domain - specific word embeddings .", "forward": true, "src_ids": "2021.findings-acl.202_8683"}
{"input": "audio and text features is used for Method| context: having numerous potential applications and great impact , end - to - end speech translation ( st ) has long been treated as an independent task , failing tofully draw strength from the rapid advances of its sibling text machine translation ( mt ) . with text and audio inputs represented differently , the modality gap has rendered mt data and its end - to - end models incompatible with their st counterparts .", "entity": "audio and text features", "output": "common semantic representation", "neg_sample": ["audio and text features is used for Method", "having numerous potential applications and great impact , end - to - end speech translation ( st ) has long been treated as an independent task , failing tofully draw strength from the rapid advances of its sibling text machine translation ( mt ) .", "with text and audio inputs represented differently , the modality gap has rendered mt data and its end - to - end models incompatible with their st counterparts ."], "relation": "used for", "id": "2021.findings-acl.195", "year": 2021, "rel_sent": "By projecting audio and text features to a common semantic representation , Chimera unifies MT and ST tasks and boosts the performance on ST benchmarks , MuST - C and Augmented Librispeech , to a new state - of - the - art .", "forward": true, "src_ids": "2021.findings-acl.195_8526"}
{"input": "mrc model is used for OtherScientificTerm| context: most existing approaches propose to concat question and option together toform a context - aware model .", "entity": "mrc model", "output": "fine - grained context", "neg_sample": ["mrc model is used for OtherScientificTerm", "most existing approaches propose to concat question and option together toform a context - aware model ."], "relation": "used for", "id": "2021.semeval-1.110", "year": 2021, "rel_sent": "In this paper , we propose a novel MRC model by filling options into the question to produce a fine - grained context ( defined as summary ) which can better reveal the relationship between option and question .", "forward": true, "src_ids": "2021.semeval-1.110_13741"}
{"input": "paraphrasing models is used for Task| context: we present two novel unsupervised methods for eliminating toxicity in text .", "entity": "paraphrasing models", "output": "style transfer", "neg_sample": ["paraphrasing models is used for Task", "we present two novel unsupervised methods for eliminating toxicity in text ."], "relation": "used for", "id": "2021.emnlp-main.629", "year": 2021, "rel_sent": "Our first method combines two recent ideas : ( 1 ) guidance of the generation process with small style - conditional language models and ( 2 ) use of paraphrasing models to perform style transfer .", "forward": true, "src_ids": "2021.emnlp-main.629_12201"}
{"input": "syntax - sensitive dependencies is done by using OtherScientificTerm| context: the main subject and the associated verb in english must agree in grammatical number as per the subject - verb agreement ( sva ) phenomenon . it has been found that the presence of a noun between the verb and the main subject , whose grammatical number is opposite to that of the main subject , can cause speakers to produce a verb that agrees with the intervening noun rather than the main noun ; the former thus acts as an agreement attractor . previous work suggests that syntactic cues in the input can aid such models to choose hierarchical rules over linear rules for number agreement .", "entity": "syntax - sensitive dependencies", "output": "hierarchical bias", "neg_sample": ["syntax - sensitive dependencies is done by using OtherScientificTerm", "the main subject and the associated verb in english must agree in grammatical number as per the subject - verb agreement ( sva ) phenomenon .", "it has been found that the presence of a noun between the verb and the main subject , whose grammatical number is opposite to that of the main subject , can cause speakers to produce a verb that agrees with the intervening noun rather than the main noun ; the former thus acts as an agreement attractor .", "previous work suggests that syntactic cues in the input can aid such models to choose hierarchical rules over linear rules for number agreement ."], "relation": "used for", "id": "2021.scil-1.37", "year": 2021, "rel_sent": "Even in the presence of this biased training set , implicit hierarchical bias in the architecture ( as in the Ordered Neurons LSTM ) is not enough to capture syntax - sensitive dependencies .", "forward": false, "src_ids": "2021.scil-1.37_2786"}
{"input": "feature integration is used for Task| context: we believe this is because both types of features - the contextual information captured by the linear sequences and the structured information captured by the dependency trees may complement each other . however , existing approaches largely focused on stacking the lstm and graph neural networks such as graph convolutional networks ( gcns ) for building improved ner models , where the exact interaction mechanism between the two types of features is not very clear , and the performance gain does not appear to be significant .", "entity": "feature integration", "output": "named entity recognition", "neg_sample": ["feature integration is used for Task", "we believe this is because both types of features - the contextual information captured by the linear sequences and the structured information captured by the dependency trees may complement each other .", "however , existing approaches largely focused on stacking the lstm and graph neural networks such as graph convolutional networks ( gcns ) for building improved ner models , where the exact interaction mechanism between the two types of features is not very clear , and the performance gain does not appear to be significant ."], "relation": "used for", "id": "2021.naacl-main.271", "year": 2021, "rel_sent": "Better Feature Integration for Named Entity Recognition.", "forward": true, "src_ids": "2021.naacl-main.271_12263"}
{"input": "expert mixture is done by using Method| context: discrimination between antonyms and synonyms is an important and challenging nlp task . antonyms and synonyms often share the same or similar contexts and thus are hard to make a distinction .", "entity": "expert mixture", "output": "gating mechanism", "neg_sample": ["expert mixture is done by using Method", "discrimination between antonyms and synonyms is an important and challenging nlp task .", "antonyms and synonyms often share the same or similar contexts and thus are hard to make a distinction ."], "relation": "used for", "id": "2021.acl-short.71", "year": 2021, "rel_sent": "It works on the basis of a divide - and - conquer strategy , where a number of localized experts focus on their own domains ( or subspaces ) to learn their specialties , and a gating mechanism determines the space partitioning and the expert mixture .", "forward": false, "src_ids": "2021.acl-short.71_14086"}
{"input": "fact checkers is done by using Method| context: the last several years have seen a massive increase in the quantity and influence of disinformation being spread online . various approaches have been developed to target the process at different stages from identifying sources to tracking distribution in social media to providing follow up debunks to people who have encountered the disinformation . one common conclusion in each of these approaches is that disinformation is too nuanced and subjective a topic for fully automated solutions to work but the quantity of data to process and cross - reference is too high for humans to handle unassisted . ultimately , the problem calls for a hybrid approach of human experts with technological assistance .", "entity": "fact checkers", "output": "nlp techniques", "neg_sample": ["fact checkers is done by using Method", "the last several years have seen a massive increase in the quantity and influence of disinformation being spread online .", "various approaches have been developed to target the process at different stages from identifying sources to tracking distribution in social media to providing follow up debunks to people who have encountered the disinformation .", "one common conclusion in each of these approaches is that disinformation is too nuanced and subjective a topic for fully automated solutions to work but the quantity of data to process and cross - reference is too high for humans to handle unassisted .", "ultimately , the problem calls for a hybrid approach of human experts with technological assistance ."], "relation": "used for", "id": "2021.ranlp-1.154", "year": 2021, "rel_sent": "In this paper we will demonstrate the application of certain state - of - the - art NLP techniques in assisting expert debunkers and fact checkers as well as the role of these NLP algorithms within a more holistic approach to analyzing and countering the spread of disinformation .", "forward": false, "src_ids": "2021.ranlp-1.154_358"}
{"input": "multimodal multi - speaker input is used for OtherScientificTerm| context: risk prediction is an essential task in financial markets . merger and acquisition ( m&a ) calls provide key insights into the claims made by company executives about the restructuring of the financial firms . extracting vocal and textual cues from m&a calls can help model the risk associated with such financial activities .", "entity": "multimodal multi - speaker input", "output": "financial risk", "neg_sample": ["multimodal multi - speaker input is used for OtherScientificTerm", "risk prediction is an essential task in financial markets .", "merger and acquisition ( m&a ) calls provide key insights into the claims made by company executives about the restructuring of the financial firms .", "extracting vocal and textual cues from m&a calls can help model the risk associated with such financial activities ."], "relation": "used for", "id": "2021.acl-long.526", "year": 2021, "rel_sent": "We introduce M3ANet , a baseline architecture that takes advantage of the multimodal multi - speaker input toforecast the financial risk associated with the M&A calls .", "forward": true, "src_ids": "2021.acl-long.526_7068"}
{"input": "dense - captioning models is used for OtherScientificTerm| context: while much research has been done in text - to - image synthesis , little work has been done to explore the usage of linguistic structure of the input text . such information is even more important for story visualization since its inputs have an explicit narrative structure that needs to be translated into an image sequence ( or visual story ) . prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality , consistency and relevance .", "entity": "dense - captioning models", "output": "spatial structure", "neg_sample": ["dense - captioning models is used for OtherScientificTerm", "while much research has been done in text - to - image synthesis , little work has been done to explore the usage of linguistic structure of the input text .", "such information is even more important for story visualization since its inputs have an explicit narrative structure that needs to be translated into an image sequence ( or visual story ) .", "prior work in this domain has shown that there is ample room for improvement in the generated image sequence in terms of visual quality , consistency and relevance ."], "relation": "used for", "id": "2021.emnlp-main.543", "year": 2021, "rel_sent": "We show that off - the - shelf dense - captioning models trained on Visual Genome can improve the spatial structure of images from a different target domain without needing fine - tuning .", "forward": true, "src_ids": "2021.emnlp-main.543_4143"}
{"input": "annotation of dialogues is done by using Method| context: dialogue systems are becoming ubiquitous in various forms and shapes - virtual assistants(siri , alexa , etc .", "entity": "annotation of dialogues", "output": "matilda", "neg_sample": ["annotation of dialogues is done by using Method", "dialogue systems are becoming ubiquitous in various forms and shapes - virtual assistants(siri , alexa , etc ."], "relation": "used for", "id": "2021.eacl-demos.5", "year": 2021, "rel_sent": "MATILDA allows the creation of corpora , the management of users , the annotation of dialogues , the quick adaptation of the user interface to any language and the resolution of inter - annotator disagreement .", "forward": false, "src_ids": "2021.eacl-demos.5_8237"}
{"input": "instant dot - product matching is used for Task| context: multimodal pre - training has propelled great advancement in vision - and - language research . these large - scale pre - trained models , although successful , fatefully suffer from slow inference speed due to enormous computational cost mainly from cross - modal attention in transformer architecture . when applied to real - life applications , such latency and computation demand severely deter the practical use of pre - trained models .", "entity": "instant dot - product matching", "output": "retrieval process", "neg_sample": ["instant dot - product matching is used for Task", "multimodal pre - training has propelled great advancement in vision - and - language research .", "these large - scale pre - trained models , although successful , fatefully suffer from slow inference speed due to enormous computational cost mainly from cross - modal attention in transformer architecture .", "when applied to real - life applications , such latency and computation demand severely deter the practical use of pre - trained models ."], "relation": "used for", "id": "2021.naacl-main.77", "year": 2021, "rel_sent": "LightningDOT removes the time - consuming cross - modal attention by extracting pre - cached feature indexes offline , and employing instant dot - product matching online , which significantly speeds up retrieval process .", "forward": true, "src_ids": "2021.naacl-main.77_3170"}
{"input": "recursive neural network is used for OtherScientificTerm| context: recent studies show that syntactic and structural information extracted from abstract syntax trees ( asts ) is conducive to summary generation . however , existing approaches fail tofully capture the rich information in asts because of the large size / depth of asts .", "entity": "recursive neural network", "output": "subtrees", "neg_sample": ["recursive neural network is used for OtherScientificTerm", "recent studies show that syntactic and structural information extracted from abstract syntax trees ( asts ) is conducive to summary generation .", "however , existing approaches fail tofully capture the rich information in asts because of the large size / depth of asts ."], "relation": "used for", "id": "2021.emnlp-main.332", "year": 2021, "rel_sent": "First , we hierarchically split a large AST into a set of subtrees and utilize a recursive neural network to encode the subtrees .", "forward": true, "src_ids": "2021.emnlp-main.332_3653"}
{"input": "heteronym phenomenon is done by using OtherScientificTerm| context: recent pretraining models in chinese neglect two important aspects specific to the chinese language : glyph and pinyin , which carry significant syntax and semantic information for language understanding .", "entity": "heteronym phenomenon", "output": "pinyin embedding", "neg_sample": ["heteronym phenomenon is done by using OtherScientificTerm", "recent pretraining models in chinese neglect two important aspects specific to the chinese language : glyph and pinyin , which carry significant syntax and semantic information for language understanding ."], "relation": "used for", "id": "2021.acl-long.161", "year": 2021, "rel_sent": "The glyph embedding is obtained based on different fonts of a Chinese character , being able to capture character semantics from the visual features , and the pinyin embedding characterizes the pronunciation of Chinese characters , which handles the highly prevalent heteronym phenomenon in Chinese ( the same character has different pronunciations with different meanings ) .", "forward": false, "src_ids": "2021.acl-long.161_10439"}
{"input": "unconstrained text generation is used for Task| context: we investigate post - ocr correction in a setting where we have access to different ocr views of the same document .", "entity": "unconstrained text generation", "output": "error correction", "neg_sample": ["unconstrained text generation is used for Task", "we investigate post - ocr correction in a setting where we have access to different ocr views of the same document ."], "relation": "used for", "id": "2021.emnlp-main.680", "year": 2021, "rel_sent": "This approach is motivated by scenarios in which unconstrained text generation for error correction is too risky .", "forward": true, "src_ids": "2021.emnlp-main.680_1961"}
{"input": "weakly supervised sentiment analysis is done by using Method| context: sentiment analysis is an important task in natural language processing ( nlp ) . most of existing state - of - the - art methods are under the supervised learning paradigm . however , human annotations can be scarce . thus , we should leverage more weak supervision for sentiment analysis .", "entity": "weakly supervised sentiment analysis", "output": "variational approach", "neg_sample": ["weakly supervised sentiment analysis is done by using Method", "sentiment analysis is an important task in natural language processing ( nlp ) .", "most of existing state - of - the - art methods are under the supervised learning paradigm .", "however , human annotations can be scarce .", "thus , we should leverage more weak supervision for sentiment analysis ."], "relation": "used for", "id": "2021.eacl-main.285", "year": 2021, "rel_sent": "Our experimental results show that the posterior regularization can improve the original variational approach to the weakly supervised sentiment analysis and the performance is more stable with smaller prediction variance .", "forward": false, "src_ids": "2021.eacl-main.285_8213"}
{"input": "human understandable categories is done by using Method| context: customers of machine learning systems demand accountability from the companies employing these algorithms for various prediction tasks . accountability requires understanding of system limit and condition of erroneous predictions , as customers are often interested in understanding the incorrect predictions , and model developers are absorbed in finding methods that can be used to get incremental improvements to an existing system .", "entity": "human understandable categories", "output": "aec", "neg_sample": ["human understandable categories is done by using Method", "customers of machine learning systems demand accountability from the companies employing these algorithms for various prediction tasks .", "accountability requires understanding of system limit and condition of erroneous predictions , as customers are often interested in understanding the incorrect predictions , and model developers are absorbed in finding methods that can be used to get incremental improvements to an existing system ."], "relation": "used for", "id": "2021.trustnlp-1.4", "year": 2021, "rel_sent": "Our results on the sample sentiment task show that AEC is able to characterize erroneous predictions into human understandable categories and also achieves promising results on selecting erroneous samples when compared with the uncertainty - based sampling .", "forward": false, "src_ids": "2021.trustnlp-1.4_15755"}
{"input": "argumentation mining is done by using Method| context: most existing methods determine argumentative relations by exhaustively enumerating all possible pairs of argument components , which suffer from low efficiency and class imbalance . moreover , due to the complex nature of argumentation , there is , sofar , no universal method that can address both tree and non - tree structured argumentation .", "entity": "argumentation mining", "output": "neural transition - based model", "neg_sample": ["argumentation mining is done by using Method", "most existing methods determine argumentative relations by exhaustively enumerating all possible pairs of argument components , which suffer from low efficiency and class imbalance .", "moreover , due to the complex nature of argumentation , there is , sofar , no universal method that can address both tree and non - tree structured argumentation ."], "relation": "used for", "id": "2021.acl-long.497", "year": 2021, "rel_sent": "Towards these issues , we propose a neural transition - based model for argumentation mining , which incrementally builds an argumentation graph by generating a sequence of actions , avoiding inefficient enumeration operations .", "forward": false, "src_ids": "2021.acl-long.497_17"}
{"input": "character - aware neural language models is done by using Method| context: however , these models are usually biased towards information from surface forms .", "entity": "character - aware neural language models", "output": "character - aware neural language models", "neg_sample": ["character - aware neural language models is done by using Method", "however , these models are usually biased towards information from surface forms ."], "relation": "used for", "id": "2021.ranlp-1.48", "year": 2021, "rel_sent": "To alleviate this problem , we propose a simple and effective method to improve a character - aware neural language model by forcing a character encoder to produce word - based embeddings under Skip - gram architecture in a warm - up step without extra training data .", "forward": false, "src_ids": "2021.ranlp-1.48_9775"}
{"input": "graph neural network is used for OtherScientificTerm| context: chinese spelling check ( csc ) is to detect and correct chinese spelling errors . many models utilize a predefined confusion set to learn a mapping between correct characters and its visually similar or phonetically similar misuses but the mapping may be out - of - domain .", "entity": "graph neural network", "output": "radical and pinyin information", "neg_sample": ["graph neural network is used for OtherScientificTerm", "chinese spelling check ( csc ) is to detect and correct chinese spelling errors .", "many models utilize a predefined confusion set to learn a mapping between correct characters and its visually similar or phonetically similar misuses but the mapping may be out - of - domain ."], "relation": "used for", "id": "2021.emnlp-main.287", "year": 2021, "rel_sent": "To explicitly capture the two erroneous patterns , we employ a graph neural network to introduce radical and pinyin information as visual and phonetic features .", "forward": true, "src_ids": "2021.emnlp-main.287_2531"}
{"input": "multi - step decoder is used for Task| context: recent successes in deep generative modeling have led to significant advances in natural language generation ( nlg ) . incorporating entities into neural generation models has demonstrated great improvements by assisting to infer the summary topic and to generate coherent content .", "entity": "multi - step decoder", "output": "entity mention generation", "neg_sample": ["multi - step decoder is used for Task", "recent successes in deep generative modeling have led to significant advances in natural language generation ( nlg ) .", "incorporating entities into neural generation models has demonstrated great improvements by assisting to infer the summary topic and to generate coherent content ."], "relation": "used for", "id": "2021.emnlp-main.56", "year": 2021, "rel_sent": "Our model has a multi - step decoder that injects the entity types into the process of entity mention generation .", "forward": true, "src_ids": "2021.emnlp-main.56_12956"}
{"input": "nmt system is done by using OtherScientificTerm| context: neural machine translation ( nmt ) is a predominant machine translation technology nowadays because of its end - to - end trainable flexibility . however , nmt still struggles to translate properly in low - resource settings specifically on distant language pairs . one way to overcome this is to use the information from other modalities if available .", "entity": "nmt system", "output": "multimodal information", "neg_sample": ["nmt system is done by using OtherScientificTerm", "neural machine translation ( nmt ) is a predominant machine translation technology nowadays because of its end - to - end trainable flexibility .", "however , nmt still struggles to translate properly in low - resource settings specifically on distant language pairs .", "one way to overcome this is to use the information from other modalities if available ."], "relation": "used for", "id": "2021.wat-1.18", "year": 2021, "rel_sent": "Multimodal information can help the NMT system to improve the translation by removing ambiguity on some phrases or words .", "forward": false, "src_ids": "2021.wat-1.18_3588"}
{"input": "personalized transformer is used for Task| context: in these tasks , user and item ids are important identifiers for personalization . transformer , which is demonstrated with strong language modeling capability , however , is not personalized and fails to make use of the user and item ids since the id tokens are not even in the same semantic space as the words .", "entity": "personalized transformer", "output": "explainable recommendation ( peter )", "neg_sample": ["personalized transformer is used for Task", "in these tasks , user and item ids are important identifiers for personalization .", "transformer , which is demonstrated with strong language modeling capability , however , is not personalized and fails to make use of the user and item ids since the id tokens are not even in the same semantic space as the words ."], "relation": "used for", "id": "2021.acl-long.383", "year": 2021, "rel_sent": "To address this problem , we present a PErsonalized Transformer for Explainable Recommendation ( PETER ) , on which we design a simple and effective learning objective that utilizes the IDs to predict the words in the target explanation , so as to endow the IDs with linguistic meanings and to achieve personalized Transformer .", "forward": true, "src_ids": "2021.acl-long.383_8370"}
{"input": "machine translation is used for Material| context: counterfactual statements describe events that did not or can not take place . we consider the problem of counterfactual detection ( cfd ) in product reviews .", "entity": "machine translation", "output": "multilingual data", "neg_sample": ["machine translation is used for Material", "counterfactual statements describe events that did not or can not take place .", "we consider the problem of counterfactual detection ( cfd ) in product reviews ."], "relation": "used for", "id": "2021.emnlp-main.568", "year": 2021, "rel_sent": "Applying machine translation on English counterfactual examples to create multilingual data performs poorly , demonstrating the language - specificity of this problem , which has been ignored sofar .", "forward": true, "src_ids": "2021.emnlp-main.568_10091"}
{"input": "bert - based models is used for Task| context: neural language models exhibit impressive performance on a variety of tasks , but their internal reasoning may be difficult to understand . prior art aims to uncover meaningful properties within model representations via probes , but it is unclear how faithfully such probes portray information that the models actually use .", "entity": "bert - based models", "output": "downstream prediction tasks", "neg_sample": ["bert - based models is used for Task", "neural language models exhibit impressive performance on a variety of tasks , but their internal reasoning may be difficult to understand .", "prior art aims to uncover meaningful properties within model representations via probes , but it is unclear how faithfully such probes portray information that the models actually use ."], "relation": "used for", "id": "2021.findings-acl.76", "year": 2021, "rel_sent": "In experiments testing our technique , we produce evidence that suggests some BERT - based models use a tree - distancelike representation of syntax in downstream prediction tasks .", "forward": true, "src_ids": "2021.findings-acl.76_10404"}
{"input": "zero - shot dst task is done by using Material| context: zero - shot transfer learning for dialogue state tracking ( dst ) enables us to handle a variety of task - oriented dialogue domains without the expense of collecting in - domain data .", "entity": "zero - shot dst task", "output": "general question answering ( qa ) corpora", "neg_sample": ["zero - shot dst task is done by using Material", "zero - shot transfer learning for dialogue state tracking ( dst ) enables us to handle a variety of task - oriented dialogue domains without the expense of collecting in - domain data ."], "relation": "used for", "id": "2021.emnlp-main.622", "year": 2021, "rel_sent": "In this work , we propose to transfer the cross - task knowledge from general question answering ( QA ) corpora for the zero - shot DST task .", "forward": false, "src_ids": "2021.emnlp-main.622_6742"}
{"input": "neural models is used for Task| context: yet , on word - level tasks , exact inference of these models reveals the empty string is often the global optimum .", "entity": "neural models", "output": "language generation", "neg_sample": ["neural models is used for Task", "yet , on word - level tasks , exact inference of these models reveals the empty string is often the global optimum ."], "relation": "used for", "id": "2021.eacl-main.118", "year": 2021, "rel_sent": "These observations suggest that the poor calibration of many neural models may stem from characteristics of a specific subset of tasks rather than general ill - suitedness of such models for language generation .", "forward": true, "src_ids": "2021.eacl-main.118_967"}
{"input": "ecpe task is done by using Method| context: unlike the more well - studied task of emotion cause extraction ( ece ) , ecpe does not require the emotion clauses to be provided as annotations . previous works on ecpe have either followed a multi - stage approach where emotion extraction , cause extraction , and pairing are done independently or use complex architectures to resolve its limitations .", "entity": "ecpe task", "output": "end - to - end model", "neg_sample": ["ecpe task is done by using Method", "unlike the more well - studied task of emotion cause extraction ( ece ) , ecpe does not require the emotion clauses to be provided as annotations .", "previous works on ecpe have either followed a multi - stage approach where emotion extraction , cause extraction , and pairing are done independently or use complex architectures to resolve its limitations ."], "relation": "used for", "id": "2021.wassa-1.9", "year": 2021, "rel_sent": "In this paper , we propose an end - to - end model for the ECPE task .", "forward": false, "src_ids": "2021.wassa-1.9_9533"}
{"input": "intent detection is done by using Method| context: transformer - based language models ( lms ) pretrained on large text collections are proven to store a wealth of semantic knowledge . however , 1 ) they are not effective as sentence encoders when used off - the - shelf , and 2 ) thus typically lag behind conversationally pretrained ( e.g. , via response selection ) encoders on conversational tasks such as intent detection ( id ) .", "entity": "intent detection", "output": "specialised sentence encoders", "neg_sample": ["intent detection is done by using Method", "transformer - based language models ( lms ) pretrained on large text collections are proven to store a wealth of semantic knowledge .", "however , 1 ) they are not effective as sentence encoders when used off - the - shelf , and 2 ) thus typically lag behind conversationally pretrained ( e.g.", ", via response selection ) encoders on conversational tasks such as intent detection ( id ) ."], "relation": "used for", "id": "2021.emnlp-main.88", "year": 2021, "rel_sent": "Consequently , such specialised sentence encoders allow for treating ID as a simple semantic similarity task based on interpretable nearest neighbours retrieval .", "forward": false, "src_ids": "2021.emnlp-main.88_8337"}
{"input": "word - aligned parallel corpus is done by using Generic| context: with more than 7000 languages worldwide , multilingual natural language processing ( nlp ) is essential both from an academic and commercial perspective . researching typological properties of languages is fundamental for progress in multilingual nlp . examples include assessing language similarity for effective transfer learning , injecting inductive biases into machine learning models or creating resources such as dictionaries and inflection tables .", "entity": "word - aligned parallel corpus", "output": "online tool", "neg_sample": ["word - aligned parallel corpus is done by using Generic", "with more than 7000 languages worldwide , multilingual natural language processing ( nlp ) is essential both from an academic and commercial perspective .", "researching typological properties of languages is fundamental for progress in multilingual nlp .", "examples include assessing language similarity for effective transfer learning , injecting inductive biases into machine learning models or creating resources such as dictionaries and inflection tables ."], "relation": "used for", "id": "2021.acl-demo.8", "year": 2021, "rel_sent": "We provide ParCourE , an online tool that allows to browse a word - aligned parallel corpus , covering 1334 languages .", "forward": false, "src_ids": "2021.acl-demo.8_3689"}
{"input": "triples is done by using Method| context: joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) . a common method is to decode triples by predicting entity pairs to obtain the corresponding relation . however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem .", "entity": "triples", "output": "translating decoding schema", "neg_sample": ["triples is done by using Method", "joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) .", "a common method is to decode triples by predicting entity pairs to obtain the corresponding relation .", "however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem ."], "relation": "used for", "id": "2021.emnlp-main.635", "year": 2021, "rel_sent": "TDEER can naturally handle the overlapping triple problem , because the translating decoding schema can recognize all possible triples , including overlapping and non - overlapping triples .", "forward": false, "src_ids": "2021.emnlp-main.635_14576"}
{"input": "large - scale meta - learning is done by using Task| context: meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks . however , the efficacy of meta - learning crucially depends on the distribution of tasks available for training , and this is often assumed to be known a priori or constructed from limited supervised datasets .", "entity": "large - scale meta - learning", "output": "self - supervised tasks", "neg_sample": ["large - scale meta - learning is done by using Task", "meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks .", "however , the efficacy of meta - learning crucially depends on the distribution of tasks available for training , and this is often assumed to be known a priori or constructed from limited supervised datasets ."], "relation": "used for", "id": "2021.emnlp-main.469", "year": 2021, "rel_sent": "In this work , we aim to provide task distributions for meta - learning by considering self - supervised tasks automatically proposed from unlabeled text , to enable large - scale meta - learning in NLP .", "forward": false, "src_ids": "2021.emnlp-main.469_8231"}
{"input": "word sense disambiguation is done by using Material| context: most of the previous work on event detection ( ed ) has only considered the datasets with a small number of event types ( i.e. , up to 38 types ) .", "entity": "word sense disambiguation", "output": "semcor dataset", "neg_sample": ["word sense disambiguation is done by using Material", "most of the previous work on event detection ( ed ) has only considered the datasets with a small number of event types ( i.e.", ", up to 38 types ) ."], "relation": "used for", "id": "2021.eacl-main.237", "year": 2021, "rel_sent": "We propose a novel method to transform the Semcor dataset for Word Sense Disambiguation into a large and high - quality dataset for FED .", "forward": false, "src_ids": "2021.eacl-main.237_5208"}
{"input": "reptile algorithm is used for Method| context: for manufacturers of home appliances , the studying discussion of products on social media can help manufacturers improve their products . opinions provided through online reviews can immediately reflect whether the product is accepted by people , and which aspect of the product are most discussed .", "entity": "reptile algorithm", "output": "meta learning", "neg_sample": ["reptile algorithm is used for Method", "for manufacturers of home appliances , the studying discussion of products on social media can help manufacturers improve their products .", "opinions provided through online reviews can immediately reflect whether the product is accepted by people , and which aspect of the product are most discussed ."], "relation": "used for", "id": "2021.rocling-1.24", "year": 2021, "rel_sent": "To improve the performance of ACSC , we combine the Reptile algorithm in meta learning with the concept of domain adversarial training toform the concept of the Adversarial Reptile algorithm .", "forward": true, "src_ids": "2021.rocling-1.24_8395"}
{"input": "context - independent question prototype is used for Generic| context: asking questions about a situation is an inherent step towards understanding it .", "entity": "context - independent question prototype", "output": "role", "neg_sample": ["context - independent question prototype is used for Generic", "asking questions about a situation is an inherent step towards understanding it ."], "relation": "used for", "id": "2021.emnlp-main.108", "year": 2021, "rel_sent": "We develop a two - stage model for this task , which first produces a context - independent question prototype for each role and then revises it to be contextually appropriate for the passage .", "forward": true, "src_ids": "2021.emnlp-main.108_13859"}
{"input": "model size is used for Task| context: recent developments in machine translation and multilingual text generation have led researchers to adopt trained metrics such as comet or bleurt , which treat evaluation as a regression problem and use representations from multilingual pre - trained models such as xlm - roberta or mbert . yet studies on related tasks suggest that these models are most efficient when they are large , which is costly and impractical for evaluation .", "entity": "model size", "output": "cross - lingual transfer", "neg_sample": ["model size is used for Task", "recent developments in machine translation and multilingual text generation have led researchers to adopt trained metrics such as comet or bleurt , which treat evaluation as a regression problem and use representations from multilingual pre - trained models such as xlm - roberta or mbert .", "yet studies on related tasks suggest that these models are most efficient when they are large , which is costly and impractical for evaluation ."], "relation": "used for", "id": "2021.emnlp-main.58", "year": 2021, "rel_sent": "We present a series of experiments which show that model size is indeed a bottleneck for cross - lingual transfer , then demonstrate how distillation can help addressing this bottleneck , by leveraging synthetic data generation and transferring knowledge from one teacher to multiple students trained on related languages .", "forward": true, "src_ids": "2021.emnlp-main.58_7432"}
{"input": "generator 's ranking is done by using OtherScientificTerm| context: generative approaches have been recently shown to be effective for both entity disambiguation and entity linking ( i.e. , joint mention detection and disambiguation ) . however , the previously proposed autoregressive formulation for el suffers from i ) high computational cost due to a complex ( deep ) decoder , ii ) non - parallelizable decoding that scales with the source sequence length , and iii ) the need for training on a large amount of data .", "entity": "generator 's ranking", "output": "correction term", "neg_sample": ["generator 's ranking is done by using OtherScientificTerm", "generative approaches have been recently shown to be effective for both entity disambiguation and entity linking ( i.e.", ", joint mention detection and disambiguation ) .", "however , the previously proposed autoregressive formulation for el suffers from i ) high computational cost due to a complex ( deep ) decoder , ii ) non - parallelizable decoding that scales with the source sequence length , and iii ) the need for training on a large amount of data ."], "relation": "used for", "id": "2021.emnlp-main.604", "year": 2021, "rel_sent": "Moreover , we augment the generative objective with an extra discriminative component , i.e. , a correction term which lets us directly optimize the generator 's ranking .", "forward": false, "src_ids": "2021.emnlp-main.604_12536"}
{"input": "document - level relation extraction is done by using Method| context: prior efforts to capture long - range dependencies have relied heavily on implicitly powerful representations learned through ( graph ) neural networks , which makes the model less transparent .", "entity": "document - level relation extraction", "output": "probabilistic model", "neg_sample": ["document - level relation extraction is done by using Method", "prior efforts to capture long - range dependencies have relied heavily on implicitly powerful representations learned through ( graph ) neural networks , which makes the model less transparent ."], "relation": "used for", "id": "2021.emnlp-main.95", "year": 2021, "rel_sent": "To tackle this challenge , in this paper , we propose LogiRE , a novel probabilistic model for document - level relation extraction by learning logic rules .", "forward": false, "src_ids": "2021.emnlp-main.95_15916"}
{"input": "syntactic parse trees is done by using OtherScientificTerm| context: syntactic structure is an important component of natural language text . recent top - performing models in answer sentence selection ( as2 ) use self - attention and transfer learning , but not syntactic structure . tree structures have shown strong performance in tasks with sentence pair input like semantic relatedness . we investigate whether tree structures can boost performance in as2 .", "entity": "syntactic parse trees", "output": "recursive nature", "neg_sample": ["syntactic parse trees is done by using OtherScientificTerm", "syntactic structure is an important component of natural language text .", "recent top - performing models in answer sentence selection ( as2 ) use self - attention and transfer learning , but not syntactic structure .", "tree structures have shown strong performance in tasks with sentence pair input like semantic relatedness .", "we investigate whether tree structures can boost performance in as2 ."], "relation": "used for", "id": "2021.acl-long.358", "year": 2021, "rel_sent": "The recursive nature of our model is able to represent all levels of syntactic parse trees with only one additional self - attention layer .", "forward": false, "src_ids": "2021.acl-long.358_13302"}
{"input": "word embedding is done by using Method| context: word embeddings trained on large corpora have shown to encode high levels of unfair discriminatory gender , racial , religious and ethnic biases . in contrast , human - written dictionaries describe the meanings of words in a concise , objective and an unbiased manner .", "entity": "word embedding", "output": "encoder", "neg_sample": ["word embedding is done by using Method", "word embeddings trained on large corpora have shown to encode high levels of unfair discriminatory gender , racial , religious and ethnic biases .", "in contrast , human - written dictionaries describe the meanings of words in a concise , objective and an unbiased manner ."], "relation": "used for", "id": "2021.eacl-main.16", "year": 2021, "rel_sent": "Specifically , we learn an encoder to generate a debiased version of an input word embedding such that it ( a ) retains the semantics of the pre - trained word embedding , ( b ) agrees with the unbiased definition of the word according to the dictionary , and ( c ) remains orthogonal to the vector space spanned by any biased basis vectors in the pre - trained word embedding space .", "forward": false, "src_ids": "2021.eacl-main.16_16106"}
{"input": "text representation is done by using Method| context: hope speech detection is a new task for finding and highlighting positive comments or supporting content from user - generated social media comments .", "entity": "text representation", "output": "recurrent neural network ( rnn )", "neg_sample": ["text representation is done by using Method", "hope speech detection is a new task for finding and highlighting positive comments or supporting content from user - generated social media comments ."], "relation": "used for", "id": "2021.ltedi-1.10", "year": 2021, "rel_sent": "In this paper , we present deep learning techniques using context - aware string embeddings for word representations and Recurrent Neural Network ( RNN ) and pooled document embeddings for text representation .", "forward": false, "src_ids": "2021.ltedi-1.10_4953"}
{"input": "pq and qa pairs is done by using OtherScientificTerm| context: generating some appealing questions in open - domain conversations is an effective way to improve human - machine interactions and lead the topic to a broader or deeper direction . to avoid dull or deviated questions , some researchers tried to utilize answer , the ' future ' information , to guide question generation . however , they separate a post - question - answer ( pqa ) triple into two parts : post - question ( pq ) and question - answer ( qa ) pairs , which may hurt the overall coherence . besides , the qa relationship is modeled as a one - to - one mapping that is not reasonable in open - domain conversations .", "entity": "pq and qa pairs", "output": "one - to - many semantic mappings", "neg_sample": ["pq and qa pairs is done by using OtherScientificTerm", "generating some appealing questions in open - domain conversations is an effective way to improve human - machine interactions and lead the topic to a broader or deeper direction .", "to avoid dull or deviated questions , some researchers tried to utilize answer , the ' future ' information , to guide question generation .", "however , they separate a post - question - answer ( pqa ) triple into two parts : post - question ( pq ) and question - answer ( qa ) pairs , which may hurt the overall coherence .", "besides , the qa relationship is modeled as a one - to - one mapping that is not reasonable in open - domain conversations ."], "relation": "used for", "id": "2021.acl-long.271", "year": 2021, "rel_sent": "Latent variables in three hierarchies are used to represent the shared background of a triple and one - to - many semantic mappings in both PQ and QA pairs .", "forward": false, "src_ids": "2021.acl-long.271_12531"}
{"input": "uda training is done by using OtherScientificTerm| context: in this work we explore unsupervised domain adaptation ( uda ) of pretrained language models for downstream tasks .", "entity": "uda training", "output": "stopping criterion", "neg_sample": ["uda training is done by using OtherScientificTerm", "in this work we explore unsupervised domain adaptation ( uda ) of pretrained language models for downstream tasks ."], "relation": "used for", "id": "2021.naacl-main.203", "year": 2021, "rel_sent": "Our experiments show that performance of models trained with the mixed loss scales with the amount of available target data and the mixed loss can be effectively used as a stopping criterion during UDA training .", "forward": false, "src_ids": "2021.naacl-main.203_3061"}
{"input": "graph - based context modeling is used for Task| context: existing models fail tofully utilize the contextual information which plays an important role in interpreting each local sentence .", "entity": "graph - based context modeling", "output": "implicit discourse relation recognition", "neg_sample": ["graph - based context modeling is used for Task", "existing models fail tofully utilize the contextual information which plays an important role in interpreting each local sentence ."], "relation": "used for", "id": "2021.naacl-main.126", "year": 2021, "rel_sent": "Context Tracking Network : Graph - based Context Modeling for Implicit Discourse Relation Recognition.", "forward": true, "src_ids": "2021.naacl-main.126_12185"}
{"input": "implications is done by using Method| context: transformers have been shown to emulate logical deduction over natural language theories ( logical rules expressed in natural language ) , reliably assigning true / false labels to candidate implications . however , their ability to generate implications of a theory has not yet been demonstrated , and methods for reconstructing proofs of answers are imperfect .", "entity": "implications", "output": "generative model", "neg_sample": ["implications is done by using Method", "transformers have been shown to emulate logical deduction over natural language theories ( logical rules expressed in natural language ) , reliably assigning true / false labels to candidate implications .", "however , their ability to generate implications of a theory has not yet been demonstrated , and methods for reconstructing proofs of answers are imperfect ."], "relation": "used for", "id": "2021.findings-acl.317", "year": 2021, "rel_sent": "In this work we show that a generative model , called ProofWriter , can reliably generate both implications of a theory and the natural language proofs that support them .", "forward": false, "src_ids": "2021.findings-acl.317_13170"}
{"input": "text generation is done by using Method| context: we study the task of long - form opinion text generation , which faces at least two distinct challenges . first , existing neural generation models fall short of coherence , thus requiring efficient content planning . second , diverse types of information are needed to guide the generator to cover both subjective and objective content .", "entity": "text generation", "output": "mixed language models", "neg_sample": ["text generation is done by using Method", "we study the task of long - form opinion text generation , which faces at least two distinct challenges .", "first , existing neural generation models fall short of coherence , thus requiring efficient content planning .", "second , diverse types of information are needed to guide the generator to cover both subjective and objective content ."], "relation": "used for", "id": "2021.acl-long.501", "year": 2021, "rel_sent": "DYPLOC : Dynamic Planning of Content Using Mixed Language Models for Text Generation.", "forward": false, "src_ids": "2021.acl-long.501_3258"}
{"input": "abstract meaning representation is used for Method| context: knowledge base question answering ( kbqa ) is an important task in natural language processing . existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets .", "entity": "abstract meaning representation", "output": "kbqa systems", "neg_sample": ["abstract meaning representation is used for Method", "knowledge base question answering ( kbqa ) is an important task in natural language processing .", "existing approaches face significant challenges including complex question understanding , necessity for reasoning , and lack of large end - to - end training datasets ."], "relation": "used for", "id": "2021.findings-acl.339", "year": 2021, "rel_sent": "Furthermore , our analysis emphasizes that AMR is a powerful tool for KBQA systems .", "forward": true, "src_ids": "2021.findings-acl.339_10059"}
{"input": "phrase structure is used for Task| context: it is reported that grammatical information is useful for machine translation ( mt ) task . however , the annotation of grammatical information requires the highly human resources . furthermore , it is not trivial to adapt grammatical information to mt since grammatical annotation usually adapts tokenization standards which might not be suitable to capture the relation of two languages , and the use of sub - word tokenization , e.g. , byte - pair - encoding , to alleviate out - of - vocabulary problem might not be compatible with those annotations .", "entity": "phrase structure", "output": "constituency parsing", "neg_sample": ["phrase structure is used for Task", "it is reported that grammatical information is useful for machine translation ( mt ) task .", "however , the annotation of grammatical information requires the highly human resources .", "furthermore , it is not trivial to adapt grammatical information to mt since grammatical annotation usually adapts tokenization standards which might not be suitable to capture the relation of two languages , and the use of sub - word tokenization , e.g.", ", byte - pair - encoding , to alleviate out - of - vocabulary problem might not be compatible with those annotations ."], "relation": "used for", "id": "2021.acl-srw.33", "year": 2021, "rel_sent": "Although we could not obtain the high quality phrase structure in constituency parsing when evaluated monolingually , we find that the induced phrase structures enhance the explainability of translation through the synchronization constraint .", "forward": true, "src_ids": "2021.acl-srw.33_5481"}
{"input": "bert sentence representations is done by using Method| context: although bert and its variants have reshaped the nlp landscape , it still remains unclear how best to derive sentence embeddings from such pre - trained transformers .", "entity": "bert sentence representations", "output": "contrastive learning method", "neg_sample": ["bert sentence representations is done by using Method", "although bert and its variants have reshaped the nlp landscape , it still remains unclear how best to derive sentence embeddings from such pre - trained transformers ."], "relation": "used for", "id": "2021.acl-long.197", "year": 2021, "rel_sent": "In this work , we propose a contrastive learning method that utilizes self - guidance for improving the quality of BERT sentence representations .", "forward": false, "src_ids": "2021.acl-long.197_8830"}
{"input": "intent classification is done by using Method| context: natural language understanding is an important task in modern dialogue systems . it becomes more important with the rapid extension of the dialogue systems ' functionality .", "entity": "intent classification", "output": "zero - shot transfer learning", "neg_sample": ["intent classification is done by using Method", "natural language understanding is an important task in modern dialogue systems .", "it becomes more important with the rapid extension of the dialogue systems ' functionality ."], "relation": "used for", "id": "2021.ranlp-1.25", "year": 2021, "rel_sent": "In this work , we present an approach to zero - shot transfer learning for the tasks of intent classification and slot - filling based on pre - trained language models .", "forward": false, "src_ids": "2021.ranlp-1.25_12965"}
{"input": "bert - like models is used for Generic| context: a key challenge of dialog systems research is to effectively and efficiently adapt to new domains . a scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few - shot settings .", "entity": "bert - like models", "output": "diluted representations", "neg_sample": ["bert - like models is used for Generic", "a key challenge of dialog systems research is to effectively and efficiently adapt to new domains .", "a scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few - shot settings ."], "relation": "used for", "id": "2021.naacl-main.237", "year": 2021, "rel_sent": "Prior work has shown that BERT - like models tend to attribute a significant amount of attention to the [ CLS ] token , which we hypothesize results in diluted representations .", "forward": true, "src_ids": "2021.naacl-main.237_3197"}
{"input": "multi - modalities is done by using OtherScientificTerm| context: significant progress in sentiment analysis and emotion recognition has been made recently , and there has also been an increase in the requirements for solving real - world problems . however , human language is complicated and multimodal , making it difficult for computers or artificial intelligence systems to understand .", "entity": "multi - modalities", "output": "learning of representations", "neg_sample": ["multi - modalities is done by using OtherScientificTerm", "significant progress in sentiment analysis and emotion recognition has been made recently , and there has also been an increase in the requirements for solving real - world problems .", "however , human language is complicated and multimodal , making it difficult for computers or artificial intelligence systems to understand ."], "relation": "used for", "id": "2021.paclic-1.65", "year": 2021, "rel_sent": "In this study , we adopt several self - supervised learning models to strengthen the learning of representations for multi - modalities ( i.e. , language , acoustic and visual modalities ) to improve the performance of sentiment analysis systems .", "forward": false, "src_ids": "2021.paclic-1.65_325"}
{"input": "initial embeddings is used for OtherScientificTerm| context: most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data - driven manner , without the exploitation of psycholinguistic knowledge that may unveil the connections between one 's language use and his psychological traits .", "entity": "initial embeddings", "output": "graph nodes", "neg_sample": ["initial embeddings is used for OtherScientificTerm", "most of the recent work on personality detection from online posts adopts multifarious deep neural networks to represent the posts and builds predictive models in a data - driven manner , without the exploitation of psycholinguistic knowledge that may unveil the connections between one 's language use and his psychological traits ."], "relation": "used for", "id": "2021.acl-long.326", "year": 2021, "rel_sent": "The initializer is employed to provide initial embeddings for the graph nodes .", "forward": true, "src_ids": "2021.acl-long.326_12556"}
{"input": "end - to - end neural qg model is done by using Material| context: question generation ( qg ) is the task of generating a plausible question for a given pair . template - based qg uses linguistically - informed heuristics to transform declarative sentences into interrogatives , whereas supervised qg uses existing question answering ( qa ) datasets to train a system to generate a question given a passage and an answer . a disadvantage of the heuristic approach is that the generated questions are heavily tied to their declarative counterparts . a disadvantage of the supervised approach is that they are heavily tied to the domain / language of the qa dataset used as training data .", "entity": "end - to - end neural qg model", "output": "news articles", "neg_sample": ["end - to - end neural qg model is done by using Material", "question generation ( qg ) is the task of generating a plausible question for a given pair .", "template - based qg uses linguistically - informed heuristics to transform declarative sentences into interrogatives , whereas supervised qg uses existing question answering ( qa ) datasets to train a system to generate a question given a passage and an answer .", "a disadvantage of the heuristic approach is that the generated questions are heavily tied to their declarative counterparts .", "a disadvantage of the supervised approach is that they are heavily tied to the domain / language of the qa dataset used as training data ."], "relation": "used for", "id": "2021.emnlp-main.340", "year": 2021, "rel_sent": "The resulting questions are then combined with the original news articles to train an end - to - end neural QG model .", "forward": false, "src_ids": "2021.emnlp-main.340_5891"}
{"input": "grammatical generalization is done by using Method| context: there has been an increased interest in data generation approaches to grammatical error correction ( gec ) using pseudo data . however , these approaches suffer from several issues that make them inconvenient for realworld deployment including a demand for large amounts of training data . on the other hand , some errors based on grammatical rules may not necessarily require a large amount of data if gec models can realize grammatical generalization .", "entity": "grammatical generalization", "output": "transformer - based gec model", "neg_sample": ["grammatical generalization is done by using Method", "there has been an increased interest in data generation approaches to grammatical error correction ( gec ) using pseudo data .", "however , these approaches suffer from several issues that make them inconvenient for realworld deployment including a demand for large amounts of training data .", "on the other hand , some errors based on grammatical rules may not necessarily require a large amount of data if gec models can realize grammatical generalization ."], "relation": "used for", "id": "2021.findings-acl.399", "year": 2021, "rel_sent": "We found that a current standard Transformer - based GEC model fails to realize grammatical generalization even in simple settings with limited vocabulary and syntax , suggesting that it lacks the generalization ability required to correct errors from provided training examples .", "forward": false, "src_ids": "2021.findings-acl.399_7407"}
{"input": "focus attention is used for Method| context: professional summaries are written with document - level information , such as the theme of the document , in mind .", "entity": "focus attention", "output": "decoders", "neg_sample": ["focus attention is used for Method", "professional summaries are written with document - level information , such as the theme of the document , in mind ."], "relation": "used for", "id": "2021.acl-long.474", "year": 2021, "rel_sent": "With the motivation to narrow this gap , we introduce Focus Attention Mechanism , a simple yet effective method to encourage decoders to proactively generate tokens that are similar or topical to the input document .", "forward": true, "src_ids": "2021.acl-long.474_8297"}
{"input": "bilingual representations is done by using Task| context: we use network pruning techniques and observe that pruning 50 - 70 % of the parameters from a trained mnmt model results only in a 0.29 - 1.98 drop in the bleu score . suggesting that there exist large redundancies in mnmt models .", "entity": "bilingual representations", "output": "multilingual neural machine translation", "neg_sample": ["bilingual representations is done by using Task", "we use network pruning techniques and observe that pruning 50 - 70 % of the parameters from a trained mnmt model results only in a 0.29 - 1.98 drop in the bleu score .", "suggesting that there exist large redundancies in mnmt models ."], "relation": "used for", "id": "2021.findings-acl.9", "year": 2021, "rel_sent": "We propose a novel adaptation strategy , where we iteratively prune and retrain the redundant parameters of an MNMT to improve bilingual representations while retaining the multilinguality .", "forward": false, "src_ids": "2021.findings-acl.9_2579"}
{"input": "search spaces is used for Generic| context: existing black box search methods have achieved high success rate in generating adversarial attacks against nlp models . however , such search methods are inefficient as they do not consider the amount of queries required to generate adversarial attacks .", "entity": "search spaces", "output": "prior attacks", "neg_sample": ["search spaces is used for Generic", "existing black box search methods have achieved high success rate in generating adversarial attacks against nlp models .", "however , such search methods are inefficient as they do not consider the amount of queries required to generate adversarial attacks ."], "relation": "used for", "id": "2021.emnlp-main.661", "year": 2021, "rel_sent": "Further , we benchmark our results across the same search space used in prior attacks .", "forward": true, "src_ids": "2021.emnlp-main.661_12744"}
{"input": "diversity is done by using Method| context: in recent years , neural paraphrase generation based on seq2seq has achieved superior performance , however , the generated paraphrase still has the problem of lack of diversity .", "entity": "diversity", "output": "backtranslation guided multi - round paraphrase generation", "neg_sample": ["diversity is done by using Method", "in recent years , neural paraphrase generation based on seq2seq has achieved superior performance , however , the generated paraphrase still has the problem of lack of diversity ."], "relation": "used for", "id": "2021.findings-acl.135", "year": 2021, "rel_sent": "We propose BTmPG ( BackTranslation guided multi - round Paraphrase Generation ) , which leverages multi - round paraphrase generation to improve diversity and employs back - translation to preserve semantic information .", "forward": false, "src_ids": "2021.findings-acl.135_13155"}
{"input": "orchestral parts is done by using Method| context: fully automatic opera tracking is challenging because of the acoustic complexity of the genre , combining musical and linguistic information ( singing , speech ) in complex ways .", "entity": "orchestral parts", "output": "music tracker", "neg_sample": ["orchestral parts is done by using Method", "fully automatic opera tracking is challenging because of the acoustic complexity of the genre , combining musical and linguistic information ( singing , speech ) in complex ways ."], "relation": "used for", "id": "2021.nlp4musa-1.1", "year": 2021, "rel_sent": "A music tracker that has proven to be effective at tracking orchestral parts , will lead the tracking process .", "forward": false, "src_ids": "2021.nlp4musa-1.1_8222"}
{"input": "few - shot setting is done by using Method| context: meta - learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples . however , applications to structured prediction and textual tasks pose challenges for meta - learning algorithms .", "entity": "few - shot setting", "output": "task generation scheme", "neg_sample": ["few - shot setting is done by using Method", "meta - learning has recently been proposed to learn models and algorithms that can generalize from a handful of examples .", "however , applications to structured prediction and textual tasks pose challenges for meta - learning algorithms ."], "relation": "used for", "id": "2021.metanlp-1.6", "year": 2021, "rel_sent": "We propose a task generation scheme for converting classical NER datasets into the few - shot setting , for both training and evaluation .", "forward": false, "src_ids": "2021.metanlp-1.6_306"}
{"input": "modular self - supervision is used for Generic| context: extracting relations across large text spans has been relatively underexplored in nlp , but it is particularly important for high - value domains such as biomedicine , where obtaining high recall of the latest findings is crucial for practical applications . compared to conventional information extraction confined to short text spans , document - level relation extraction faces additional challenges in both inference and learning . given longer text spans , state - of - the - art neural architectures are less effective and task - specific self - supervision such as distant supervision becomes very noisy .", "entity": "modular self - supervision", "output": "sub - problem", "neg_sample": ["modular self - supervision is used for Generic", "extracting relations across large text spans has been relatively underexplored in nlp , but it is particularly important for high - value domains such as biomedicine , where obtaining high recall of the latest findings is crucial for practical applications .", "compared to conventional information extraction confined to short text spans , document - level relation extraction faces additional challenges in both inference and learning .", "given longer text spans , state - of - the - art neural architectures are less effective and task - specific self - supervision such as distant supervision becomes very noisy ."], "relation": "used for", "id": "2021.emnlp-main.429", "year": 2021, "rel_sent": "This enables us to incorporate explicit discourse modeling and leverage modular self - supervision for each sub - problem , which is less noise - prone and can be further refined end - to - end via variational EM .", "forward": true, "src_ids": "2021.emnlp-main.429_6931"}
{"input": "bert sentence representations is done by using Method| context: although bert and its variants have reshaped the nlp landscape , it still remains unclear how best to derive sentence embeddings from such pre - trained transformers .", "entity": "bert sentence representations", "output": "self - guided contrastive learning", "neg_sample": ["bert sentence representations is done by using Method", "although bert and its variants have reshaped the nlp landscape , it still remains unclear how best to derive sentence embeddings from such pre - trained transformers ."], "relation": "used for", "id": "2021.acl-long.197", "year": 2021, "rel_sent": "Self - Guided Contrastive Learning for BERT Sentence Representations.", "forward": false, "src_ids": "2021.acl-long.197_8827"}
{"input": "neural machine translation is done by using Method| context: bert has been studied as a promising technique to improve nmt .", "entity": "neural machine translation", "output": "pre - training", "neg_sample": ["neural machine translation is done by using Method", "bert has been studied as a promising technique to improve nmt ."], "relation": "used for", "id": "2021.findings-acl.150", "year": 2021, "rel_sent": "A Comparison between Pre - training and Large - scale Back - translation for Neural Machine Translation.", "forward": false, "src_ids": "2021.findings-acl.150_6625"}
{"input": "learning methods is used for OtherScientificTerm| context: weakly supervised question answering usually has only the final answers as supervision signals while the correct solutions to derive the answers are not provided . for example , for discrete reasoning tasks as on drop , there may exist many equations to derive a numeric answer , and typically only one of them is correct .", "entity": "learning methods", "output": "spurious solutions", "neg_sample": ["learning methods is used for OtherScientificTerm", "weakly supervised question answering usually has only the final answers as supervision signals while the correct solutions to derive the answers are not provided .", "for example , for discrete reasoning tasks as on drop , there may exist many equations to derive a numeric answer , and typically only one of them is correct ."], "relation": "used for", "id": "2021.acl-long.318", "year": 2021, "rel_sent": "Previous learning methods mostly filter out spurious solutions with heuristics or using model confidence , but do not explicitly exploit the semantic correlations between a question and its solution .", "forward": true, "src_ids": "2021.acl-long.318_14186"}
{"input": "gradient estimation is done by using OtherScientificTerm| context: first - order meta - learning algorithms have been widely used in practice to learn initial model parameters that can be quickly adapted to new tasks due to their efficiency and effectiveness . however , existing studies find that meta - learner can overfit to some specific adaptation when we have heterogeneous tasks , leading to significantly degraded performance . in natural language processing ( nlp ) applications , datasets are often diverse and each task has its unique characteristics .", "entity": "gradient estimation", "output": "variance reduction term", "neg_sample": ["gradient estimation is done by using OtherScientificTerm", "first - order meta - learning algorithms have been widely used in practice to learn initial model parameters that can be quickly adapted to new tasks due to their efficiency and effectiveness .", "however , existing studies find that meta - learner can overfit to some specific adaptation when we have heterogeneous tasks , leading to significantly degraded performance .", "in natural language processing ( nlp ) applications , datasets are often diverse and each task has its unique characteristics ."], "relation": "used for", "id": "2021.naacl-main.206", "year": 2021, "rel_sent": "The core of our algorithm is to introduce a novel variance reduction term to the gradient estimation when performing the task adaptation .", "forward": false, "src_ids": "2021.naacl-main.206_2398"}
{"input": "thought disorder is done by using Method| context: thought disorder - linguistic disturbances including incoherence and derailment of topic - is seen in individuals both with and at risk for psychosis . methods from computational linguistics have increasingly sought to quantify thought disorder to detect group differences between clinical populations and healthy controls . while previous work has been quite successful at these classification tasks , the lack of interpretability of the computational metrics has made it unclear whether they are in fact measuring thought disorder .", "entity": "thought disorder", "output": "automated coherence measures", "neg_sample": ["thought disorder is done by using Method", "thought disorder - linguistic disturbances including incoherence and derailment of topic - is seen in individuals both with and at risk for psychosis .", "methods from computational linguistics have increasingly sought to quantify thought disorder to detect group differences between clinical populations and healthy controls .", "while previous work has been quite successful at these classification tasks , the lack of interpretability of the computational metrics has made it unclear whether they are in fact measuring thought disorder ."], "relation": "used for", "id": "2021.clpsych-1.16", "year": 2021, "rel_sent": "Automated coherence measures fail to index thought disorder in individuals at risk for psychosis.", "forward": false, "src_ids": "2021.clpsych-1.16_9522"}
{"input": "cognitive language processing signals is done by using Method| context: most previous studies integrate cognitive language processing signals ( e.g. , eye - tracking or eeg data ) into neural models of natural language processing ( nlp ) just by directly concatenating word embeddings with cognitive features , ignoring the gap between the two modalities ( i.e. , textual vs. cognitive ) and noise in cognitive features .", "entity": "cognitive language processing signals", "output": "textual neural representations", "neg_sample": ["cognitive language processing signals is done by using Method", "most previous studies integrate cognitive language processing signals ( e.g.", ", eye - tracking or eeg data ) into neural models of natural language processing ( nlp ) just by directly concatenating word embeddings with cognitive features , ignoring the gap between the two modalities ( i.e.", ", textual vs. cognitive ) and noise in cognitive features ."], "relation": "used for", "id": "2021.acl-long.291", "year": 2021, "rel_sent": "CogAlign : Learning to Align Textual Neural Representations to Cognitive Language Processing Signals.", "forward": false, "src_ids": "2021.acl-long.291_129"}
{"input": "graph - aware terminology definition generation is done by using Method| context: precisely defining the terminology is the first step in scientific communication . developing neural text generation models for definition generation can circumvent the labor - intensity curation , further accelerating scientific discovery . unfortunately , the lack of large - scale terminology definition dataset hinders the process toward definition generation .", "entity": "graph - aware terminology definition generation", "output": "graphine", "neg_sample": ["graph - aware terminology definition generation is done by using Method", "precisely defining the terminology is the first step in scientific communication .", "developing neural text generation models for definition generation can circumvent the labor - intensity curation , further accelerating scientific discovery .", "unfortunately , the lack of large - scale terminology definition dataset hinders the process toward definition generation ."], "relation": "used for", "id": "2021.emnlp-main.278", "year": 2021, "rel_sent": "Graphine : A Dataset for Graph - aware Terminology Definition Generation.", "forward": false, "src_ids": "2021.emnlp-main.278_14989"}
{"input": "wh - in - situ languages is used for OtherScientificTerm| context: this paper examines island effects in vietnamese relativization using methods in experimental syntax developed by sprouse ( 2007 ) . typologically a wh - in - situ language , it is debatable whether vietnamese employs whmovement in the formation of relative clauses . if vietnamese relativization is the result of wh - movement process , relativizing certain elements from inside island structures would result in ill - formedness .", "entity": "wh - in - situ languages", "output": "island violations", "neg_sample": ["wh - in - situ languages is used for OtherScientificTerm", "this paper examines island effects in vietnamese relativization using methods in experimental syntax developed by sprouse ( 2007 ) .", "typologically a wh - in - situ language , it is debatable whether vietnamese employs whmovement in the formation of relative clauses .", "if vietnamese relativization is the result of wh - movement process , relativizing certain elements from inside island structures would result in ill - formedness ."], "relation": "used for", "id": "2021.paclic-1.54", "year": 2021, "rel_sent": "both wh - movement and wh - in - situ languages are sensitive to island violations .", "forward": true, "src_ids": "2021.paclic-1.54_8865"}
{"input": "neural parameterization is used for OtherScientificTerm| context: probabilistic context - free grammars ( pcfgs ) with neural parameterization have been shown to be effective in unsupervised phrase - structure grammar induction . however , due to the cubic computational complexity of pcfg representation and parsing , previous approaches can not scale up to a relatively large number of ( nonterminal and preterminal ) symbols .", "entity": "neural parameterization", "output": "form", "neg_sample": ["neural parameterization is used for OtherScientificTerm", "probabilistic context - free grammars ( pcfgs ) with neural parameterization have been shown to be effective in unsupervised phrase - structure grammar induction .", "however , due to the cubic computational complexity of pcfg representation and parsing , previous approaches can not scale up to a relatively large number of ( nonterminal and preterminal ) symbols ."], "relation": "used for", "id": "2021.naacl-main.117", "year": 2021, "rel_sent": "We further use neural parameterization for the new form to improve unsupervised parsing performance .", "forward": true, "src_ids": "2021.naacl-main.117_13574"}
{"input": "gazetteer knowledge integration is done by using Method| context: named entity recognition ( ner ) remains difficult in real - world settings ; current challenges include short texts ( low context ) , emerging entities , and complex entities ( e.g. movie names ) . gazetteer features can help , but results have been mixed due to challenges with adding extra features , and a lack of realistic evaluation data . it has been shown that including gazetteer features can cause models to overuse or underuse them , leading to poor generalization .", "entity": "gazetteer knowledge integration", "output": "gemnet", "neg_sample": ["gazetteer knowledge integration is done by using Method", "named entity recognition ( ner ) remains difficult in real - world settings ; current challenges include short texts ( low context ) , emerging entities , and complex entities ( e.g.", "movie names ) .", "gazetteer features can help , but results have been mixed due to challenges with adding extra features , and a lack of realistic evaluation data .", "it has been shown that including gazetteer features can cause models to overuse or underuse them , leading to poor generalization ."], "relation": "used for", "id": "2021.naacl-main.118", "year": 2021, "rel_sent": "We propose GEMNET , a novel approach for gazetteer knowledge integration , including ( 1 ) a flexible Contextual Gazetteer Representation ( CGR ) encoder that can be fused with any word - level model ; and ( 2 ) a Mixture - of- Experts gating network that overcomes the feature overuse issue by learning to conditionally combine the context and gazetteer features , instead of assigning them fixed weights .", "forward": false, "src_ids": "2021.naacl-main.118_6619"}
{"input": "non - autoregressive framework is used for Task| context: in essence , the facts contained in plain text are unordered . however , the popular openie systems usually output facts sequentially in the way of predicting the next fact conditioned on the previous decoded ones , which enforce an unnecessary order on the facts and involve the error accumulation between autoregressive steps .", "entity": "non - autoregressive framework", "output": "open information extraction", "neg_sample": ["non - autoregressive framework is used for Task", "in essence , the facts contained in plain text are unordered .", "however , the popular openie systems usually output facts sequentially in the way of predicting the next fact conditioned on the previous decoded ones , which enforce an unnecessary order on the facts and involve the error accumulation between autoregressive steps ."], "relation": "used for", "id": "2021.emnlp-main.764", "year": 2021, "rel_sent": "To break this bottleneck , we propose MacroIE , a novel non - autoregressive framework for OpenIE .", "forward": true, "src_ids": "2021.emnlp-main.764_12170"}
{"input": "complex interactions between frames is done by using Method| context: story visualization is an underexplored task that falls at the intersection of many important research directions in both computer vision and natural language processing . in this task , given a series of natural language captions which compose a story , an agent must generate a sequence of images that correspond to the captions . prior work has introduced recurrent generative models which outperform text - to - image synthesis models on this task . however , there is room for improvement of generated images in terms of visual quality , coherence and relevance .", "entity": "complex interactions between frames", "output": "mart - based transformers", "neg_sample": ["complex interactions between frames is done by using Method", "story visualization is an underexplored task that falls at the intersection of many important research directions in both computer vision and natural language processing .", "in this task , given a series of natural language captions which compose a story , an agent must generate a sequence of images that correspond to the captions .", "prior work has introduced recurrent generative models which outperform text - to - image synthesis models on this task .", "however , there is room for improvement of generated images in terms of visual quality , coherence and relevance ."], "relation": "used for", "id": "2021.naacl-main.194", "year": 2021, "rel_sent": "We present a number of improvements to prior modeling approaches , including ( 1 ) the addition of a dual learning framework that utilizes video captioning to reinforce the semantic alignment between the story and generated images , ( 2 ) a copy - transform mechanism for sequentially - consistent story visualization , and ( 3 ) MART - based transformers to model complex interactions between frames .", "forward": false, "src_ids": "2021.naacl-main.194_3824"}
{"input": "corpus of high - quality conversations is used for OtherScientificTerm| context: following reasonable procedures and using various support skills can help to effectively provide support .", "entity": "corpus of high - quality conversations", "output": "emotional support", "neg_sample": ["corpus of high - quality conversations is used for OtherScientificTerm", "following reasonable procedures and using various support skills can help to effectively provide support ."], "relation": "used for", "id": "2021.acl-long.269", "year": 2021, "rel_sent": "To ensure a corpus of high - quality conversations that provide examples of effective emotional support , we take extensive effort to design training tutorials for supporters and several mechanisms for quality control during data collection .", "forward": true, "src_ids": "2021.acl-long.269_6349"}
{"input": "abduction is done by using Method| context: transformers have been shown to emulate logical deduction over natural language theories ( logical rules expressed in natural language ) , reliably assigning true / false labels to candidate implications . however , their ability to generate implications of a theory has not yet been demonstrated , and methods for reconstructing proofs of answers are imperfect .", "entity": "abduction", "output": "generative techniques", "neg_sample": ["abduction is done by using Method", "transformers have been shown to emulate logical deduction over natural language theories ( logical rules expressed in natural language ) , reliably assigning true / false labels to candidate implications .", "however , their ability to generate implications of a theory has not yet been demonstrated , and methods for reconstructing proofs of answers are imperfect ."], "relation": "used for", "id": "2021.findings-acl.317", "year": 2021, "rel_sent": "We also show that generative techniques can perform a type of abduction with high precision : Given a theory and an unprovable conclusion , identify a missing fact that allows the conclusion to be proved , along with a proof .", "forward": false, "src_ids": "2021.findings-acl.317_13171"}
{"input": "transformer - based models is done by using Method| context: generative models for dialog systems have gained much interest because of the recent success of rnn and transformer based models in tasks like question answering and summarization . although the task of dialog response generation is generally seen as a sequence to sequence ( seq2seq ) problem , researchers in the past have found it challenging to train dialog systems using the standard seq2seq models . therefore , to help the model learn meaningful utterance and conversation level features , sordoni et al . ( 2015b ) , serban et al . with the transformer - based models dominating the seq2seq problems lately , the natural question to ask is the applicability of the notion of hierarchy in transformer - based dialog systems .", "entity": "transformer - based models", "output": "hierarchical encoding", "neg_sample": ["transformer - based models is done by using Method", "generative models for dialog systems have gained much interest because of the recent success of rnn and transformer based models in tasks like question answering and summarization .", "although the task of dialog response generation is generally seen as a sequence to sequence ( seq2seq ) problem , researchers in the past have found it challenging to train dialog systems using the standard seq2seq models .", "therefore , to help the model learn meaningful utterance and conversation level features , sordoni et al .", "( 2015b ) , serban et al .", "with the transformer - based models dominating the seq2seq problems lately , the natural question to ask is the applicability of the notion of hierarchy in transformer - based dialog systems ."], "relation": "used for", "id": "2021.naacl-main.449", "year": 2021, "rel_sent": "We demonstrate that Hierarchical Encoding helps achieve better natural language understanding of the contexts in transformer - based models for task - oriented dialog systems through a wide range of experiments .", "forward": false, "src_ids": "2021.naacl-main.449_15031"}
{"input": "author characteristics is used for Material| context: deceptive news posts shared in online communities can be detected with nlp models , and much recent research has focused on the development of such models .", "entity": "author characteristics", "output": "deceptive content", "neg_sample": ["author characteristics is used for Material", "deceptive news posts shared in online communities can be detected with nlp models , and much recent research has focused on the development of such models ."], "relation": "used for", "id": "2021.nlp4if-1.5", "year": 2021, "rel_sent": "We find that while author characteristics are better predictors of deceptive content than community characteristics , both characteristics are strongly correlated with model performance .", "forward": true, "src_ids": "2021.nlp4if-1.5_827"}
{"input": "argumentation mining is used for OtherScientificTerm| context: most existing methods determine argumentative relations by exhaustively enumerating all possible pairs of argument components , which suffer from low efficiency and class imbalance . moreover , due to the complex nature of argumentation , there is , sofar , no universal method that can address both tree and non - tree structured argumentation .", "entity": "argumentation mining", "output": "argumentation structures", "neg_sample": ["argumentation mining is used for OtherScientificTerm", "most existing methods determine argumentative relations by exhaustively enumerating all possible pairs of argument components , which suffer from low efficiency and class imbalance .", "moreover , due to the complex nature of argumentation , there is , sofar , no universal method that can address both tree and non - tree structured argumentation ."], "relation": "used for", "id": "2021.acl-long.497", "year": 2021, "rel_sent": "The goal of argumentation mining is to automatically extract argumentation structures from argumentative texts .", "forward": true, "src_ids": "2021.acl-long.497_21"}
{"input": "distant supervision is used for OtherScientificTerm| context: we consider the problem of using observational data to estimate the causal effects of linguistic properties . for example , does writing a complaint politely lead to a faster response time ? how much will a positive product review increase sales ?", "entity": "distant supervision", "output": "noisy proxies", "neg_sample": ["distant supervision is used for OtherScientificTerm", "we consider the problem of using observational data to estimate the causal effects of linguistic properties .", "for example , does writing a complaint politely lead to a faster response time ?", "how much will a positive product review increase sales ?"], "relation": "used for", "id": "2021.naacl-main.323", "year": 2021, "rel_sent": "The method leverages ( 1 ) distant supervision to improve the quality of noisy proxies , and ( 2 ) a pre - trained language model ( BERT ) to adjust for the text .", "forward": true, "src_ids": "2021.naacl-main.323_2860"}
{"input": "context encoding is used for Task| context: this often comes in the form of knowledge graphs , and the integration is done by creating pseudo utterances through paraphrasing knowledge triples , added into the accumulated dialogue context . however , the context length is fixed in these architectures , which restricts how much background or dialogue context can be kept .", "entity": "context encoding", "output": "non - task - oriented dialogue generation", "neg_sample": ["context encoding is used for Task", "this often comes in the form of knowledge graphs , and the integration is done by creating pseudo utterances through paraphrasing knowledge triples , added into the accumulated dialogue context .", "however , the context length is fixed in these architectures , which restricts how much background or dialogue context can be kept ."], "relation": "used for", "id": "2021.acl-long.546", "year": 2021, "rel_sent": "Space Efficient Context Encoding for Non - Task - Oriented Dialogue Generation with Graph Attention Transformer.", "forward": true, "src_ids": "2021.acl-long.546_8785"}
{"input": "ranking is done by using Task| context: claim verification is challenging because it requires first tofind textual evidence and then apply claim - evidence entailment to verify a claim .", "entity": "ranking", "output": "entailment prediction", "neg_sample": ["ranking is done by using Task", "claim verification is challenging because it requires first tofind textual evidence and then apply claim - evidence entailment to verify a claim ."], "relation": "used for", "id": "2021.ranlp-1.174", "year": 2021, "rel_sent": "Our experiments verify that leveraging entailment prediction improves ranking multiple pieces of evidence .", "forward": false, "src_ids": "2021.ranlp-1.174_3330"}
{"input": "internal representations is used for OtherScientificTerm| context: transformers that are pre - trained on multilingual corpora , such as , mbert and xlm - roberta , have achieved impressive cross - lingual transfer capabilities . in the zero - shot transfer setting , only english training data is used , and the fine - tuned model is evaluated on another target language . while this works surprisingly well , substantial variance has been observed in target language performance between different fine - tuning runs , and in the zero - shot setup , no target - language development data is available to select among multiple fine - tuned models . prior work has relied on english dev data to select among models that are fine - tuned with different learning rates , number of steps and other hyperparameters , often resulting in suboptimal choices .", "entity": "internal representations", "output": "cross - lingual capabilities", "neg_sample": ["internal representations is used for OtherScientificTerm", "transformers that are pre - trained on multilingual corpora , such as , mbert and xlm - roberta , have achieved impressive cross - lingual transfer capabilities .", "in the zero - shot transfer setting , only english training data is used , and the fine - tuned model is evaluated on another target language .", "while this works surprisingly well , substantial variance has been observed in target language performance between different fine - tuning runs , and in the zero - shot setup , no target - language development data is available to select among multiple fine - tuned models .", "prior work has relied on english dev data to select among models that are fine - tuned with different learning rates , number of steps and other hyperparameters , often resulting in suboptimal choices ."], "relation": "used for", "id": "2021.emnlp-main.459", "year": 2021, "rel_sent": "We propose a machine learning approach to model selection that uses the fine - tuned model 's own internal representations to predict its cross - lingual capabilities .", "forward": true, "src_ids": "2021.emnlp-main.459_10958"}
{"input": "stochastic rankers is done by using Task| context: according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval . the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty . we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers .", "entity": "stochastic rankers", "output": "conversational search", "neg_sample": ["stochastic rankers is done by using Task", "according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval .", "the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty .", "we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers ."], "relation": "used for", "id": "2021.eacl-main.12", "year": 2021, "rel_sent": "Then , motivated by our findings we use two techniques to model the uncertainty of neural rankers leading to the proposed stochastic rankers , which output a predictive distribution of relevance as opposed to point estimates .", "forward": false, "src_ids": "2021.eacl-main.12_8014"}
{"input": "leam is used for Task| context: from tweets to product reviews , text is ubiquitous on the web and often contains valuable information for both enterprises and consumers . however , the online text is generally noisy and incomplete , requiring users to process and analyze the data to extract insights . while there are systems effective for different stages of text analysis , users lack extensible platforms to support interactive text analysis workflows end - to - end .", "entity": "leam", "output": "interactive analysis", "neg_sample": ["leam is used for Task", "from tweets to product reviews , text is ubiquitous on the web and often contains valuable information for both enterprises and consumers .", "however , the online text is generally noisy and incomplete , requiring users to process and analyze the data to extract insights .", "while there are systems effective for different stages of text analysis , users lack extensible platforms to support interactive text analysis workflows end - to - end ."], "relation": "used for", "id": "2021.dash-1.9", "year": 2021, "rel_sent": "LEAM supports interactive analysis via GUI - based interactions and provides a declarative specification language , implemented based on a visual text algebra , to enable user - guided analysis .", "forward": true, "src_ids": "2021.dash-1.9_5338"}
{"input": "bengali text is done by using Method| context: although research on emotion classification has significantly progressed in high - resource languages , it is still infancy for resource - constrained languages like bengali . however , unavailability of necessary language processing tools and deficiency of benchmark corpora makes the emotion classification task in bengali more challenging and complicated .", "entity": "bengali text", "output": "transformer - based technique", "neg_sample": ["bengali text is done by using Method", "although research on emotion classification has significantly progressed in high - resource languages , it is still infancy for resource - constrained languages like bengali .", "however , unavailability of necessary language processing tools and deficiency of benchmark corpora makes the emotion classification task in bengali more challenging and complicated ."], "relation": "used for", "id": "2021.naacl-srw.19", "year": 2021, "rel_sent": "This work proposes a transformer - based technique to classify the Bengali text into one of the six basic emotions : anger , fear , disgust , sadness , joy , and surprise .", "forward": false, "src_ids": "2021.naacl-srw.19_10808"}
{"input": "generative estimator is done by using OtherScientificTerm| context: empathy is a complex cognitive ability based on the reasoning of others ' affective states . in order to better understand others and express stronger empathy in dialogues , we argue that two issues must be tackled at the same time : ( i ) identifying which word is the cause for the other 's emotion from his or her utterance and ( ii ) reflecting those specific words in the response generation . however , previous approaches for recognizing emotion cause words in text require sub - utterance level annotations , which can be demanding .", "entity": "generative estimator", "output": "social cognition", "neg_sample": ["generative estimator is done by using OtherScientificTerm", "empathy is a complex cognitive ability based on the reasoning of others ' affective states .", "in order to better understand others and express stronger empathy in dialogues , we argue that two issues must be tackled at the same time : ( i ) identifying which word is the cause for the other 's emotion from his or her utterance and ( ii ) reflecting those specific words in the response generation .", "however , previous approaches for recognizing emotion cause words in text require sub - utterance level annotations , which can be demanding ."], "relation": "used for", "id": "2021.emnlp-main.170", "year": 2021, "rel_sent": "Taking inspiration from social cognition , we leverage a generative estimator to infer emotion cause words from utterances with no word - level label .", "forward": false, "src_ids": "2021.emnlp-main.170_11553"}
{"input": "syntactic structures is done by using Method| context: controlling the generation of image captions attracts lots of attention recently .", "entity": "syntactic structures", "output": "syntactic dependency structure aware model ( sdsam )", "neg_sample": ["syntactic structures is done by using Method", "controlling the generation of image captions attracts lots of attention recently ."], "relation": "used for", "id": "2021.alvr-1.3", "year": 2021, "rel_sent": "To achieve this purpose , we propose a Syntactic Dependency Structure Aware Model ( SDSAM ) , which explicitly learns to generate the syntactic structures of image captions to include given partial dependency trees .", "forward": false, "src_ids": "2021.alvr-1.3_6553"}
{"input": "identification of interactive argument pairs is done by using Method| context: interactive argument pair identification is essential in the context of dialogical argumentation mining . existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities . however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information .", "entity": "identification of interactive argument pairs", "output": "argumentation knowledge graph", "neg_sample": ["identification of interactive argument pairs is done by using Method", "interactive argument pair identification is essential in the context of dialogical argumentation mining .", "existing research treats it as a problem of sentence matching and largely relies on textual information to compute the similarities .", "however , the interaction of opinions usually involves the background of the topic and requires reasoning of knowledge , which is beyond textual information ."], "relation": "used for", "id": "2021.findings-acl.203", "year": 2021, "rel_sent": "Leveraging Argumentation Knowledge Graph for Interactive Argument Pair Identification.", "forward": false, "src_ids": "2021.findings-acl.203_3281"}
{"input": "similarity metric is used for OtherScientificTerm| context: for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key . we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts .", "entity": "similarity metric", "output": "opinion similarity", "neg_sample": ["similarity metric is used for OtherScientificTerm", "for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key .", "we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts ."], "relation": "used for", "id": "2021.newsum-1.9", "year": 2021, "rel_sent": "We further propose to learn a similarity metric for opinion similarity via fine - tuning the Sentence - BERT sentence - embedding network based on review text and weak supervision by review ratings .", "forward": true, "src_ids": "2021.newsum-1.9_3669"}
{"input": "transformer based language model is used for OtherScientificTerm| context: internet advancements have made a huge impact on the communication pattern of people and their life style . people express their opinion on products , politics , movies etc . in social media . even though , english is predominantly used , nowadays many people prefer to tweet in their native language and some- times by combining it with english .", "entity": "transformer based language model", "output": "sentiment", "neg_sample": ["transformer based language model is used for OtherScientificTerm", "internet advancements have made a huge impact on the communication pattern of people and their life style .", "people express their opinion on products , politics , movies etc .", "in social media .", "even though , english is predominantly used , nowadays many people prefer to tweet in their native language and some- times by combining it with english ."], "relation": "used for", "id": "2021.dravidianlangtech-1.53", "year": 2021, "rel_sent": "In this paper , the transformer based language model is applied to analyse the sentiment on Tanglish tweets , which is a combination of Tamil and English .", "forward": true, "src_ids": "2021.dravidianlangtech-1.53_5305"}
{"input": "multilingual architecture is done by using OtherScientificTerm| context: the choice of parameter sharing strategy in multilingual machine translation models determines how optimally parameter space is used and hence , directly influences ultimate translation quality . inspired by linguistic trees that show the degree of relatedness between different languages , the new general approach to parameter sharing in multilingual machine translation was suggested recently .", "entity": "multilingual architecture", "output": "expert language hierarchies", "neg_sample": ["multilingual architecture is done by using OtherScientificTerm", "the choice of parameter sharing strategy in multilingual machine translation models determines how optimally parameter space is used and hence , directly influences ultimate translation quality .", "inspired by linguistic trees that show the degree of relatedness between different languages , the new general approach to parameter sharing in multilingual machine translation was suggested recently ."], "relation": "used for", "id": "2021.vardial-1.2", "year": 2021, "rel_sent": "The main idea is to use these expert language hierarchies as a basis for multilingual architecture : the closer two languages are , the more parameters they share .", "forward": false, "src_ids": "2021.vardial-1.2_5614"}
{"input": "language learners is done by using Task| context: it is a task where given a text and a span , a system generates , for the span , an explanatory note that helps the writer ( language learner ) improve their writing skills .", "entity": "language learners", "output": "feedback comment generation", "neg_sample": ["language learners is done by using Task", "it is a task where given a text and a span , a system generates , for the span , an explanatory note that helps the writer ( language learner ) improve their writing skills ."], "relation": "used for", "id": "2021.inlg-1.35", "year": 2021, "rel_sent": "The motivations for this challenge are : ( i ) practically , it will be beneficial for both language learners and teachers if a computer - assisted language learning system can provide feedback comments just as human teachers do ; ( ii ) theoretically , feedback comment generation for language learners has a mixed aspect of other generation tasks together with its unique features and it will be interesting to explore what kind of generation technique is effective against what kind of writing rule .", "forward": false, "src_ids": "2021.inlg-1.35_9435"}
{"input": "generative framework is used for Task| context: most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges : ( 1 ) the scripts are not fully segmented into words ; ( 2 ) the closest known language is not determined .", "entity": "generative framework", "output": "word segmentation", "neg_sample": ["generative framework is used for Task", "most undeciphered lost languages exhibit two characteristics that pose significant decipherment challenges : ( 1 ) the scripts are not fully segmented into words ; ( 2 ) the closest known language is not determined ."], "relation": "used for", "id": "2021.tacl-1.5", "year": 2021, "rel_sent": "The resulting generative framework jointly models word segmentation and cognate alignment , informed by phonological constraints .", "forward": true, "src_ids": "2021.tacl-1.5_13769"}
{"input": "pos - constrained parallel decoding is used for Task| context: a common solution often resorts to sequence - level knowledge distillation by rebuilding the training dataset through autoregressive generation ( hereinafter known as ' teacher ag ' ) . the success of such methods may largely depend on a latent assumption , i.e. , the teacher ag is superior to the nag model .", "entity": "pos - constrained parallel decoding", "output": "non - autoregressive generation", "neg_sample": ["pos - constrained parallel decoding is used for Task", "a common solution often resorts to sequence - level knowledge distillation by rebuilding the training dataset through autoregressive generation ( hereinafter known as ' teacher ag ' ) .", "the success of such methods may largely depend on a latent assumption , i.e.", ", the teacher ag is superior to the nag model ."], "relation": "used for", "id": "2021.acl-long.467", "year": 2021, "rel_sent": "POS - Constrained Parallel Decoding for Non - autoregressive Generation.", "forward": true, "src_ids": "2021.acl-long.467_1708"}
{"input": "adjustable joint learning approach is used for Task| context: building automatic technical support system is an important yet challenge task . conceptually , to answer a user question on a technical forum , a human expert has tofirst retrieve relevant documents , and then read them carefully to identify the answer snippet . despite huge success the researchers have achieved in coping with general domain question answering ( qa ) , much less attentions have been paid for investigating technical qa . specifically , existing methods suffer from several unique challenges ( i ) the question and answer rarely overlaps substantially and ( ii ) very limited data size .", "entity": "adjustable joint learning approach", "output": "document retrieval", "neg_sample": ["adjustable joint learning approach is used for Task", "building automatic technical support system is an important yet challenge task .", "conceptually , to answer a user question on a technical forum , a human expert has tofirst retrieve relevant documents , and then read them carefully to identify the answer snippet .", "despite huge success the researchers have achieved in coping with general domain question answering ( qa ) , much less attentions have been paid for investigating technical qa .", "specifically , existing methods suffer from several unique challenges ( i ) the question and answer rarely overlaps substantially and ( ii ) very limited data size ."], "relation": "used for", "id": "2021.naacl-industry.23", "year": 2021, "rel_sent": "To this end , we present an adjustable joint learning approach for document retrieval and reading comprehension tasks .", "forward": true, "src_ids": "2021.naacl-industry.23_5225"}
{"input": "sentiment analysis is done by using Method| context: sentiment analysis systems have been shown to exhibit sensitivity to protected attributes .", "entity": "sentiment analysis", "output": "round - trip translation", "neg_sample": ["sentiment analysis is done by using Method", "sentiment analysis systems have been shown to exhibit sensitivity to protected attributes ."], "relation": "used for", "id": "2021.emnlp-main.363", "year": 2021, "rel_sent": "The Effect of Round - Trip Translation on Fairness in Sentiment Analysis.", "forward": false, "src_ids": "2021.emnlp-main.363_12779"}
{"input": "survival analysis is done by using Method| context: utilizing clinical texts in survival analysis is difficult because they are largely unstructured . current automatic extraction models fail to capture textual information comprehensively since their labels are limited in scope . furthermore , they typically require a large amount of data and high - quality expert annotations for training .", "entity": "survival analysis", "output": "deep representations of radiology reports", "neg_sample": ["survival analysis is done by using Method", "utilizing clinical texts in survival analysis is difficult because they are largely unstructured .", "current automatic extraction models fail to capture textual information comprehensively since their labels are limited in scope .", "furthermore , they typically require a large amount of data and high - quality expert annotations for training ."], "relation": "used for", "id": "2021.naacl-main.358", "year": 2021, "rel_sent": "Leveraging Deep Representations of Radiology Reports in Survival Analysis for Predicting Heart Failure Patient Mortality.", "forward": false, "src_ids": "2021.naacl-main.358_1743"}
{"input": "entity - based narrative graph ( eng ) is used for OtherScientificTerm| context: understanding narrative text requires capturing characters ' motivations , goals , and mental states .", "entity": "entity - based narrative graph ( eng )", "output": "internal- states of characters", "neg_sample": ["entity - based narrative graph ( eng ) is used for OtherScientificTerm", "understanding narrative text requires capturing characters ' motivations , goals , and mental states ."], "relation": "used for", "id": "2021.naacl-main.391", "year": 2021, "rel_sent": "This paper proposes an Entity - based Narrative Graph ( ENG ) to model the internal- states of characters in a story .", "forward": true, "src_ids": "2021.naacl-main.391_483"}
{"input": "evaluating evaluation measures is used for Task| context: ordinal classification ( oc ) is an important classification task where the classes are ordinal . for example , an oc task for sentiment analysis could have the following classes : highly positive , positive , neutral , negative , highly negative . clearly , evaluation measures for an oc task should penalise misclassifications by considering the ordinal nature of the classes . however , for both oc and oq , there are only a small number of known evaluation measures that meet this basic requirement .", "entity": "evaluating evaluation measures", "output": "oq tasks", "neg_sample": ["evaluating evaluation measures is used for Task", "ordinal classification ( oc ) is an important classification task where the classes are ordinal .", "for example , an oc task for sentiment analysis could have the following classes : highly positive , positive , neutral , negative , highly negative .", "clearly , evaluation measures for an oc task should penalise misclassifications by considering the ordinal nature of the classes .", "however , for both oc and oq , there are only a small number of known evaluation measures that meet this basic requirement ."], "relation": "used for", "id": "2021.acl-long.214", "year": 2021, "rel_sent": "Evaluation measures for an OQ task should also take the ordinal nature of the classes into account .", "forward": true, "src_ids": "2021.acl-long.214_9557"}
{"input": "single unified multilingual translation model is done by using Method| context: existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind .", "entity": "single unified multilingual translation model", "output": "mrasp2", "neg_sample": ["single unified multilingual translation model is done by using Method", "existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind ."], "relation": "used for", "id": "2021.acl-long.21", "year": 2021, "rel_sent": "To this end , we propose mRASP2 , a training method to obtain a single unified multilingual translation model .", "forward": false, "src_ids": "2021.acl-long.21_13565"}
{"input": "redditbias is used for Task| context: text representation models are prone to exhibit a range of societal biases , reflecting the non - controlled and biased nature of the underlying pretraining data , which consequently leads to severe ethical issues and even bias amplification . recent work has predominantly focused on measuring and mitigating bias in pretrained language models .", "entity": "redditbias", "output": "bias measurements and mitigation resources", "neg_sample": ["redditbias is used for Task", "text representation models are prone to exhibit a range of societal biases , reflecting the non - controlled and biased nature of the underlying pretraining data , which consequently leads to severe ethical issues and even bias amplification .", "recent work has predominantly focused on measuring and mitigating bias in pretrained language models ."], "relation": "used for", "id": "2021.acl-long.151", "year": 2021, "rel_sent": "In this work , we present REDDITBIAS , the first conversational data set grounded in the actual human conversations from Reddit , allowing for bias measurement and mitigation across four important bias dimensions : gender , race , religion , and queerness .", "forward": true, "src_ids": "2021.acl-long.151_15288"}
{"input": "tamil language is done by using Task| context: speech carries not only the semantic content but also the paralinguistic information which captures the speaking style . speaker traits and emotional states affect how words are being spoken . the research on paralinguistic information is an emerging field in speech and language processing and it has many potential applications including speech recognition , speaker identification and verification , emotion recognition and accent recognition . among them , there is a significant interest in emotion recognition from speech .", "entity": "tamil language", "output": "speech emotion", "neg_sample": ["tamil language is done by using Task", "speech carries not only the semantic content but also the paralinguistic information which captures the speaking style .", "speaker traits and emotional states affect how words are being spoken .", "the research on paralinguistic information is an emerging field in speech and language processing and it has many potential applications including speech recognition , speaker identification and verification , emotion recognition and accent recognition .", "among them , there is a significant interest in emotion recognition from speech ."], "relation": "used for", "id": "2021.dravidianlangtech-1.12", "year": 2021, "rel_sent": "A detailed study of paralinguistic information present in speech signal and an overview of research work related to speech emotion for Tamil Language is presented in this paper .", "forward": false, "src_ids": "2021.dravidianlangtech-1.12_14125"}
{"input": "suicide ideation detection is done by using Method| context: recent psychological studies indicate that individuals exhibiting suicidal ideation increasingly turn to social media rather than mental health practitioners . contextualizing the build - up of such ideation is critical for the identification of users at risk .", "entity": "suicide ideation detection", "output": "emotional phase - aware representations", "neg_sample": ["suicide ideation detection is done by using Method", "recent psychological studies indicate that individuals exhibiting suicidal ideation increasingly turn to social media rather than mental health practitioners .", "contextualizing the build - up of such ideation is critical for the identification of users at risk ."], "relation": "used for", "id": "2021.eacl-main.205", "year": 2021, "rel_sent": "PHASE : Learning Emotional Phase - aware Representations for Suicide Ideation Detection on Social Media.", "forward": false, "src_ids": "2021.eacl-main.205_8909"}
{"input": "open - domain question - answering is used for Material| context: since late 2019 , covid-19 has quickly emerged as the newest biomedical domain , resulting in a surge of new information . this has created the need for a public space for users to ask questions and receive credible , scientific answers .", "entity": "open - domain question - answering", "output": "emergent domains", "neg_sample": ["open - domain question - answering is used for Material", "since late 2019 , covid-19 has quickly emerged as the newest biomedical domain , resulting in a surge of new information .", "this has created the need for a public space for users to ask questions and receive credible , scientific answers ."], "relation": "used for", "id": "2021.emnlp-demo.30", "year": 2021, "rel_sent": "Open - Domain Question - Answering for COVID-19 and Other Emergent Domains.", "forward": true, "src_ids": "2021.emnlp-demo.30_3393"}
{"input": "sequence labeling is done by using Task| context: however , predicting useful transfer sources is a challenging problem , as even the most similar sources might lead to unexpected negative transfer results . thus , ranking methods based on task and text similarity - as suggested in prior work - may not be sufficient to identify promising sources .", "entity": "sequence labeling", "output": "model transfer", "neg_sample": ["sequence labeling is done by using Task", "however , predicting useful transfer sources is a challenging problem , as even the most similar sources might lead to unexpected negative transfer results .", "thus , ranking methods based on task and text similarity - as suggested in prior work - may not be sufficient to identify promising sources ."], "relation": "used for", "id": "2021.emnlp-main.689", "year": 2021, "rel_sent": "For this , we study the effects of model transfer on sequence labeling across various domains and tasks and show that our methods based on model similarity and support vector machines are able to predict promising sources , resulting in performance increases of up to 24 F1 points .", "forward": false, "src_ids": "2021.emnlp-main.689_16171"}
{"input": "adverse drug event detection is done by using Method| context: pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks . in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard . however , a systematic comparison between these models has not yet been done .", "entity": "adverse drug event detection", "output": "transformer architectures", "neg_sample": ["adverse drug event detection is done by using Method", "pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks .", "in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard .", "however , a systematic comparison between these models has not yet been done ."], "relation": "used for", "id": "2021.eacl-main.149", "year": 2021, "rel_sent": "BERT Prescriptions to Avoid Unwanted Headaches : A Comparison of Transformer Architectures for Adverse Drug Event Detection.", "forward": false, "src_ids": "2021.eacl-main.149_2927"}
{"input": "orthogonal structural probes is used for Method| context: state - of - the - art contextual embeddings are obtained from large language models available only for a few languages . for others , we need to learn representations using a multilingual model . there is an ongoing debate on whether multilingual embeddings can be aligned in a space shared across many languages .", "entity": "orthogonal structural probes", "output": "projection", "neg_sample": ["orthogonal structural probes is used for Method", "state - of - the - art contextual embeddings are obtained from large language models available only for a few languages .", "for others , we need to learn representations using a multilingual model .", "there is an ongoing debate on whether multilingual embeddings can be aligned in a space shared across many languages ."], "relation": "used for", "id": "2021.emnlp-main.376", "year": 2021, "rel_sent": "The novel Orthogonal Structural Probe ( Limisiewicz and Marecek , 2021 ) allows us to answer this question for specific linguistic features and learn a projection based only on mono - lingual annotated datasets .", "forward": true, "src_ids": "2021.emnlp-main.376_7310"}
{"input": "text representations is done by using OtherScientificTerm| context: neural language models have contributed to state - of - the - art results in a number of downstream applications including sentiment analysis , intent classification and others . however , obtaining text representations or embeddings using these models risks encoding personally identifiable information learned from language and context cues that may lead to privacy leaks .", "entity": "text representations", "output": "calibrated noise", "neg_sample": ["text representations is done by using OtherScientificTerm", "neural language models have contributed to state - of - the - art results in a number of downstream applications including sentiment analysis , intent classification and others .", "however , obtaining text representations or embeddings using these models risks encoding personally identifiable information learned from language and context cues that may lead to privacy leaks ."], "relation": "used for", "id": "2021.emnlp-main.628", "year": 2021, "rel_sent": "Specifically , CAPE firstly applies calibrated noise through differential privacy to maintain the privacy of text representations by preserving the encoded semantic links while obscuring sensitive information .", "forward": false, "src_ids": "2021.emnlp-main.628_4871"}
{"input": "open domain question answering is done by using Method| context: to date , most of recent work under the retrieval - reader framework for open - domain qa focuses on either extractive or generative reader exclusively .", "entity": "open domain question answering", "output": "hybrid approach", "neg_sample": ["open domain question answering is done by using Method", "to date , most of recent work under the retrieval - reader framework for open - domain qa focuses on either extractive or generative reader exclusively ."], "relation": "used for", "id": "2021.acl-long.240", "year": 2021, "rel_sent": "UnitedQA : A Hybrid Approach for Open Domain Question Answering.", "forward": false, "src_ids": "2021.acl-long.240_10448"}
{"input": "hyperbole detection models is done by using OtherScientificTerm| context: the detection of hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance .", "entity": "hyperbole detection models", "output": "behavioural tests", "neg_sample": ["hyperbole detection models is done by using OtherScientificTerm", "the detection of hyperbole is an important stepping stone to understanding the intentions of a hyperbolic utterance ."], "relation": "used for", "id": "2021.alta-1.6", "year": 2021, "rel_sent": "We also introduce a suite of behavioural tests to probe the capabilities of hyperbole detection models across a range of hyperbole types .", "forward": false, "src_ids": "2021.alta-1.6_11499"}
{"input": "multi - source heterogeneous knowledge is used for Task| context: despite achieving remarkable performance , previous knowledge - enhanced works usually only use a single - source homogeneous knowledge base of limited knowledge coverage . thus , they often degenerate into traditional methods because not all dialogues can be linked with knowledge entries .", "entity": "multi - source heterogeneous knowledge", "output": "open - domain knowledge - enhanced dialogue generation", "neg_sample": ["multi - source heterogeneous knowledge is used for Task", "despite achieving remarkable performance , previous knowledge - enhanced works usually only use a single - source homogeneous knowledge base of limited knowledge coverage .", "thus , they often degenerate into traditional methods because not all dialogues can be linked with knowledge entries ."], "relation": "used for", "id": "2021.emnlp-main.175", "year": 2021, "rel_sent": "To our best knowledge , this work is the first to use the multi - source heterogeneous knowledge in the open - domain knowledge - enhanced dialogue generation .", "forward": true, "src_ids": "2021.emnlp-main.175_15014"}
{"input": "neural narrative generation is done by using Method| context: narrative generation is an open - ended nlp task in which a model generates a story given a prompt . the task is similar to neural response generation for chatbots ; however , innovations in response generation are often not applied to narrative generation , despite the similarity between these tasks .", "entity": "neural narrative generation", "output": "decoding methods", "neg_sample": ["neural narrative generation is done by using Method", "narrative generation is an open - ended nlp task in which a model generates a story given a prompt .", "the task is similar to neural response generation for chatbots ; however , innovations in response generation are often not applied to narrative generation , despite the similarity between these tasks ."], "relation": "used for", "id": "2021.gem-1.16", "year": 2021, "rel_sent": "We aim to bridge this gap by applying and evaluating advances in decoding methods for neural response generation to neural narrative generation .", "forward": false, "src_ids": "2021.gem-1.16_12808"}
{"input": "metric scores is done by using Method| context: evaluation metrics are a key ingredient for progress of text generation systems . in recent years , several bert - based evaluation metrics have been proposed ( including bertscore , moverscore , bleurt , etc . ) which correlate much better with human assessment of text generation quality than bleu or rouge , invented two decades ago . however , little is known what these metrics , which are based on black - box language model representations , actually capture ( it is typically assumed they model semantic similarity ) .", "entity": "metric scores", "output": "regression based global explainability technique", "neg_sample": ["metric scores is done by using Method", "evaluation metrics are a key ingredient for progress of text generation systems .", "in recent years , several bert - based evaluation metrics have been proposed ( including bertscore , moverscore , bleurt , etc . )", "which correlate much better with human assessment of text generation quality than bleu or rouge , invented two decades ago .", "however , little is known what these metrics , which are based on black - box language model representations , actually capture ( it is typically assumed they model semantic similarity ) ."], "relation": "used for", "id": "2021.emnlp-main.701", "year": 2021, "rel_sent": "In this work , we use a simple regression based global explainability technique to disentangle metric scores along linguistic factors , including semantics , syntax , morphology , and lexical overlap .", "forward": false, "src_ids": "2021.emnlp-main.701_4180"}
{"input": "as2 is done by using Method| context: syntactic structure is an important component of natural language text . recent top - performing models in answer sentence selection ( as2 ) use self - attention and transfer learning , but not syntactic structure . tree structures have shown strong performance in tasks with sentence pair input like semantic relatedness . we investigate whether tree structures can boost performance in as2 .", "entity": "as2", "output": "tree - structured models", "neg_sample": ["as2 is done by using Method", "syntactic structure is an important component of natural language text .", "recent top - performing models in answer sentence selection ( as2 ) use self - attention and transfer learning , but not syntactic structure .", "tree structures have shown strong performance in tasks with sentence pair input like semantic relatedness .", "we investigate whether tree structures can boost performance in as2 ."], "relation": "used for", "id": "2021.acl-long.358", "year": 2021, "rel_sent": "Our findings show that the ability of tree - structured models to successfully absorb syntactic information is strongly correlated with a higher performance in AS2 .", "forward": false, "src_ids": "2021.acl-long.358_13305"}
{"input": "textual and visual information is done by using Method| context: existed pre - training methods either focus on single - modal tasks or multi - modal tasks , and can not effectively adapt to each other . they can only utilize single - modal data ( i.e. , text or image ) or limited multi - modal data ( i.e. , image - text pairs ) .", "entity": "textual and visual information", "output": "cross - modal contrastive learning ( cmcl )", "neg_sample": ["textual and visual information is done by using Method", "existed pre - training methods either focus on single - modal tasks or multi - modal tasks , and can not effectively adapt to each other .", "they can only utilize single - modal data ( i.e.", ", text or image ) or limited multi - modal data ( i.e.", ", image - text pairs ) ."], "relation": "used for", "id": "2021.acl-long.202", "year": 2021, "rel_sent": "Large scale of free text corpus and image collections are utilized to improve the capability of visual and textual understanding , and cross - modal contrastive learning ( CMCL ) is leveraged to align the textual and visual information into a unified semantic space , over a corpus of image - text pairs augmented with related images and texts .", "forward": false, "src_ids": "2021.acl-long.202_3598"}
{"input": "word embeddings is done by using OtherScientificTerm| context: word embeddings are widely used in natural language processing ( nlp ) for a vast range of applications . however , it has been consistently proven that these embeddings reflect the same human biases that exist in the data used to train them . most of the introduced bias indicators to reveal word embeddings ' bias are average - based indicators based on the cosine similarity measure .", "entity": "word embeddings", "output": "euclidean distance", "neg_sample": ["word embeddings is done by using OtherScientificTerm", "word embeddings are widely used in natural language processing ( nlp ) for a vast range of applications .", "however , it has been consistently proven that these embeddings reflect the same human biases that exist in the data used to train them .", "most of the introduced bias indicators to reveal word embeddings ' bias are average - based indicators based on the cosine similarity measure ."], "relation": "used for", "id": "2021.trustnlp-1.2", "year": 2021, "rel_sent": "We found that over the ten categories of word embedding association tests , Mahalanobis distance reveals the smallest bias , and Euclidean distance reveals the largest bias in word embeddings .", "forward": false, "src_ids": "2021.trustnlp-1.2_1296"}
{"input": "detection of blc is done by using Generic| context: basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al . ( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life . rosch 's original algorithm for detecting blc ( called cue - validity ) is based on the availability of semantic features such as ' has tail ' for ' cat ' , and has remained untested at large . an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis .", "entity": "detection of blc", "output": "soa", "neg_sample": ["detection of blc is done by using Generic", "basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al .", "( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life .", "rosch 's original algorithm for detecting blc ( called cue - validity ) is based on the availability of semantic features such as ' has tail ' for ' cat ' , and has remained untested at large .", "an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis ."], "relation": "used for", "id": "2021.emnlp-main.654", "year": 2021, "rel_sent": "The best of our methods outperforms the current SoA in BLC detection , with an accuracy of English BLC detection of 75.0 % , and of Mandarin BLC detection 80.7 % on a test set .", "forward": false, "src_ids": "2021.emnlp-main.654_11110"}
{"input": "features is done by using Method| context: individuals with autism spectrum disorder ( asd ) experience difficulties in social aspects of communication , but the linguistic characteristics associated with deficits in discourse and pragmatic expression are often difficult to precisely identify and quantify . we are currently collecting a corpus of transcribed natural conversations produced in an experimental setting in which participants with and without asd complete a number of collaborative tasks with their neurotypical peers .", "entity": "features", "output": "neural models", "neg_sample": ["features is done by using Method", "individuals with autism spectrum disorder ( asd ) experience difficulties in social aspects of communication , but the linguistic characteristics associated with deficits in discourse and pragmatic expression are often difficult to precisely identify and quantify .", "we are currently collecting a corpus of transcribed natural conversations produced in an experimental setting in which participants with and without asd complete a number of collaborative tasks with their neurotypical peers ."], "relation": "used for", "id": "2021.acl-srw.29", "year": 2021, "rel_sent": "We find the best performing model for all three features is a feed - forward neural network trained with BERT embeddings .", "forward": false, "src_ids": "2021.acl-srw.29_14080"}
{"input": "scoring functions is used for Task| context: previous work on probing word representations for linguistic knowledge has focused on interpolation tasks . in this paper , we instead analyse probes in an extrapolation setting , where the inputs at test time are deliberately chosen to be ' harder ' than the training examples . we argue that such an analysis can shed further light on the open question whether probes actually decode linguistic knowledge , or merely learn the diagnostic task from shallow features .", "entity": "scoring functions", "output": "nlp tasks", "neg_sample": ["scoring functions is used for Task", "previous work on probing word representations for linguistic knowledge has focused on interpolation tasks .", "in this paper , we instead analyse probes in an extrapolation setting , where the inputs at test time are deliberately chosen to be ' harder ' than the training examples .", "we argue that such an analysis can shed further light on the open question whether probes actually decode linguistic knowledge , or merely learn the diagnostic task from shallow features ."], "relation": "used for", "id": "2021.blackboxnlp-1.2", "year": 2021, "rel_sent": "To quantify the hardness of an example , we consider scoring functions based on linguistic , statistical , and learning - related criteria , all of which are applicable to a broad range of NLP tasks .", "forward": true, "src_ids": "2021.blackboxnlp-1.2_12348"}
{"input": "universal dependencies is used for OtherScientificTerm| context: technical documents present distinct challenges when used in natural language processing tasks such as part - of - speech tagging or syntactic parsing . this is mainly due to the nature of their content , which may differ greatly from more studied texts like news articles , encyclopedic extracts or social media entries .", "entity": "universal dependencies", "output": "software requirements", "neg_sample": ["universal dependencies is used for OtherScientificTerm", "technical documents present distinct challenges when used in natural language processing tasks such as part - of - speech tagging or syntactic parsing .", "this is mainly due to the nature of their content , which may differ greatly from more studied texts like news articles , encyclopedic extracts or social media entries ."], "relation": "used for", "id": "2021.udw-1.5", "year": 2021, "rel_sent": "UD on Software Requirements : Application and Challenges.", "forward": true, "src_ids": "2021.udw-1.5_12140"}
{"input": "higher - order inference is used for OtherScientificTerm| context: relating entities and events in text is a key component of natural language understanding . cross - document coreference resolution , in particular , is important for the growing interest in multi - document analysis tasks .", "entity": "higher - order inference", "output": "cross - document settings", "neg_sample": ["higher - order inference is used for OtherScientificTerm", "relating entities and events in text is a key component of natural language understanding .", "cross - document coreference resolution , in particular , is important for the growing interest in multi - document analysis tasks ."], "relation": "used for", "id": "2021.emnlp-main.382", "year": 2021, "rel_sent": "In this work we propose a new model that extends the efficient sequential prediction paradigm for coreference resolution to cross - document settings and achieves competitive results for both entity and event coreference while providing strong evidence of the efficacy of both sequential models and higher - order inference in cross - document settings .", "forward": true, "src_ids": "2021.emnlp-main.382_12854"}
{"input": "fc - articles is done by using Method| context: false claims that have been previously fact - checked can still spread on social media . to mitigate their continual spread , detecting previously fact - checked claims is indispensable . given a claim , existing works focus on providing evidence for detection by reranking candidate fact - checking articles ( fc - articles ) retrieved by bm25 . however , these performances may be limited because they ignore the following characteristics of fc - articles : ( 1 ) claims are often quoted to describe the checked events , providing lexical information besides semantics ; ( 2 ) sentence templates to introduce or debunk claims are common across articles , providing pattern information . models that ignore the two aspects only leverage semantic relevance and may be misled by sentences that describe similar but irrelevant events .", "entity": "fc - articles", "output": "reranker", "neg_sample": ["fc - articles is done by using Method", "false claims that have been previously fact - checked can still spread on social media .", "to mitigate their continual spread , detecting previously fact - checked claims is indispensable .", "given a claim , existing works focus on providing evidence for detection by reranking candidate fact - checking articles ( fc - articles ) retrieved by bm25 .", "however , these performances may be limited because they ignore the following characteristics of fc - articles : ( 1 ) claims are often quoted to describe the checked events , providing lexical information besides semantics ; ( 2 ) sentence templates to introduce or debunk claims are common across articles , providing pattern information .", "models that ignore the two aspects only leverage semantic relevance and may be misled by sentences that describe similar but irrelevant events ."], "relation": "used for", "id": "2021.acl-long.425", "year": 2021, "rel_sent": "In this paper , we propose a novel reranker , MTM ( Memory - enhanced Transformers for Matching ) to rank FC - articles using key sentences selected with event ( lexical and semantic ) and pattern information .", "forward": false, "src_ids": "2021.acl-long.425_14383"}
{"input": "tkge models is used for Task| context: static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years . recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged .", "entity": "tkge models", "output": "tkg completion", "neg_sample": ["tkge models is used for Task", "static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years .", "recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged ."], "relation": "used for", "id": "2021.naacl-main.451", "year": 2021, "rel_sent": "In this paper , we propose a Recursive Temporal Fact Embedding ( RTFE ) framework to transplant SKGE models to TKGs and to enhance the performance of existing TKGE models for TKG completion .", "forward": true, "src_ids": "2021.naacl-main.451_11849"}
{"input": "cluster - based approach is used for OtherScientificTerm| context: the representation degeneration problem in contextual word representations ( cwrs ) hurts the expressiveness of the embedding space by forming an anisotropic cone where even unrelated words have excessively positive correlations .", "entity": "cluster - based approach", "output": "isotropy", "neg_sample": ["cluster - based approach is used for OtherScientificTerm", "the representation degeneration problem in contextual word representations ( cwrs ) hurts the expressiveness of the embedding space by forming an anisotropic cone where even unrelated words have excessively positive correlations ."], "relation": "used for", "id": "2021.acl-short.73", "year": 2021, "rel_sent": "A Cluster - based Approach for Improving Isotropy in Contextual Embedding Space.", "forward": true, "src_ids": "2021.acl-short.73_9249"}
{"input": "classification is done by using Method| context: over the past few months , there were huge numbers of circulating tweets and discussions about coronavirus ( covid-19 ) in the arab region . it is important for policy makers and many people to identify types of shared tweets to better understand public behavior , topics of interest , requests from governments , sources of tweets , etc . it is also crucial to prevent spreading of rumors and misinformation about the virus or bad cures .", "entity": "classification", "output": "machine learning and transformer based models", "neg_sample": ["classification is done by using Method", "over the past few months , there were huge numbers of circulating tweets and discussions about coronavirus ( covid-19 ) in the arab region .", "it is important for policy makers and many people to identify types of shared tweets to better understand public behavior , topics of interest , requests from governments , sources of tweets , etc .", "it is also crucial to prevent spreading of rumors and misinformation about the virus or bad cures ."], "relation": "used for", "id": "2021.louhi-1.1", "year": 2021, "rel_sent": "We describe annotation guidelines , analyze our dataset and build effective machine learning and transformer based models for classification .", "forward": false, "src_ids": "2021.louhi-1.1_9520"}
{"input": "system actions is done by using Method| context: dialogue policy learning , a subtask that determines the content of system response generation and then the degree of task completion , is essential for task - oriented dialogue systems . however , the unbalanced distribution of system actions in dialogue datasets often causes difficulty in learning to generate desired actions and responses .", "entity": "system actions", "output": "memoryaugmented multi - decoder network", "neg_sample": ["system actions is done by using Method", "dialogue policy learning , a subtask that determines the content of system response generation and then the degree of task completion , is essential for task - oriented dialogue systems .", "however , the unbalanced distribution of system actions in dialogue datasets often causes difficulty in learning to generate desired actions and responses ."], "relation": "used for", "id": "2021.findings-acl.39", "year": 2021, "rel_sent": "Then , we propose a memoryaugmented multi - decoder network to generate the system actions conditioned on the candidate actions , which allows the network to adaptively select key information in the candidate actions and ignore noises .", "forward": false, "src_ids": "2021.findings-acl.39_979"}
{"input": "inductive bias is used for Method| context: the uniform information density ( uid ) hypothesis , which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal , has gained traction in psycholinguistics as an explanation for certain syntactic , morphological , and prosodic choices .", "entity": "inductive bias", "output": "statistical language modeling", "neg_sample": ["inductive bias is used for Method", "the uniform information density ( uid ) hypothesis , which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal , has gained traction in psycholinguistics as an explanation for certain syntactic , morphological , and prosodic choices ."], "relation": "used for", "id": "2021.acl-long.404", "year": 2021, "rel_sent": "In this work , we explore whether the UID hypothesis can be operationalized as an inductive bias for statistical language modeling .", "forward": true, "src_ids": "2021.acl-long.404_1178"}
{"input": "bilingual dictionaries is used for Material| context: recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model . this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls . however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts .", "entity": "bilingual dictionaries", "output": "rpl text", "neg_sample": ["bilingual dictionaries is used for Material", "recent research in multilingual language models ( lm ) has demonstrated their ability to effectively handle multiple languages in a single model .", "this holds promise for low web - resource languages ( lrl ) as multilingual models can enable transfer of supervision from high resource languages to lrls .", "however , incorporating a new language in an lm still remains a challenge , particularly for languages with limited corpora and in unseen scripts ."], "relation": "used for", "id": "2021.acl-long.105", "year": 2021, "rel_sent": "While exploiting similar sentence structures , RelateLM utilizes readily available bilingual dictionaries to pseudo translate RPL text into LRL corpora .", "forward": true, "src_ids": "2021.acl-long.105_9073"}
{"input": "multiple interpretable unsupervised reward components is used for OtherScientificTerm| context: existing approaches for the table - to - text task suffer from issues such as missing information , hallucination and repetition . many approaches to this problem use reinforcement learning ( rl ) , which maximizes a single manually defined reward , such as bleu . in this work , we instead pose the table - to - text task as inverse reinforcement learning ( irl ) problem .", "entity": "multiple interpretable unsupervised reward components", "output": "composite reward function", "neg_sample": ["multiple interpretable unsupervised reward components is used for OtherScientificTerm", "existing approaches for the table - to - text task suffer from issues such as missing information , hallucination and repetition .", "many approaches to this problem use reinforcement learning ( rl ) , which maximizes a single manually defined reward , such as bleu .", "in this work , we instead pose the table - to - text task as inverse reinforcement learning ( irl ) problem ."], "relation": "used for", "id": "2021.acl-short.11", "year": 2021, "rel_sent": "We explore using multiple interpretable unsupervised reward components that are combined linearly toform a composite reward function .", "forward": true, "src_ids": "2021.acl-short.11_14601"}
{"input": "scientific document understanding tasks is done by using Method| context: scientific document understanding is challenging as the data is highly domain specific and diverse . however , datasets for tasks with scientific text require expensive manual annotation and tend to be small and limited to only one or a few fields . at the same time , scientific documents contain many potential training signals , such as citations , which can be used to build large labelled datasets .", "entity": "scientific document understanding tasks", "output": "language model fine - tuning", "neg_sample": ["scientific document understanding tasks is done by using Method", "scientific document understanding is challenging as the data is highly domain specific and diverse .", "however , datasets for tasks with scientific text require expensive manual annotation and tend to be small and limited to only one or a few fields .", "at the same time , scientific documents contain many potential training signals , such as citations , which can be used to build large labelled datasets ."], "relation": "used for", "id": "2021.findings-acl.157", "year": 2021, "rel_sent": "Finally , we demonstrate that language model fine - tuning with cite - worthiness as a secondary task leads to improved performance on downstream scientific document understanding tasks .", "forward": false, "src_ids": "2021.findings-acl.157_3508"}
{"input": "e - wer is done by using Method| context: automatic speech recognition ( asr ) systems are evaluated using word error rate ( wer ) , which is calculated by comparing the number of errors between the ground truth and the transcription of the asr system . this calculation , however , requires manual transcription of the speech signal to obtain the ground truth . however , while converting to a classification setting , these approaches suffer from heavy class imbalance .", "entity": "e - wer", "output": "balanced paradigm", "neg_sample": ["e - wer is done by using Method", "automatic speech recognition ( asr ) systems are evaluated using word error rate ( wer ) , which is calculated by comparing the number of errors between the ground truth and the transcription of the asr system .", "this calculation , however , requires manual transcription of the speech signal to obtain the ground truth .", "however , while converting to a classification setting , these approaches suffer from heavy class imbalance ."], "relation": "used for", "id": "2021.eacl-main.320", "year": 2021, "rel_sent": "In this paper , we propose a new balanced paradigm for e - WER in a classification setting .", "forward": false, "src_ids": "2021.eacl-main.320_1118"}
{"input": "intent features is used for Task| context: complex natural language understanding modules in dialog systems have a richer understanding of user utterances , and thus are critical in providing a better user experience . however , these models are often created from scratch , for specific clients and use cases and require the annotation of large datasets . this encourages the sharing of annotated data across multiple clients .", "entity": "intent features", "output": "rich natural language understanding", "neg_sample": ["intent features is used for Task", "complex natural language understanding modules in dialog systems have a richer understanding of user utterances , and thus are critical in providing a better user experience .", "however , these models are often created from scratch , for specific clients and use cases and require the annotation of large datasets .", "this encourages the sharing of annotated data across multiple clients ."], "relation": "used for", "id": "2021.naacl-industry.27", "year": 2021, "rel_sent": "Intent Features for Rich Natural Language Understanding.", "forward": true, "src_ids": "2021.naacl-industry.27_14281"}
{"input": "document retrieval is done by using Method| context: with the need of fast retrieval speed and small memory footprint , document hashing has been playing a crucial role in large - scale information retrieval . to generate high - quality hashing code , both semantics and neighborhood information are crucial . however , most existing methods leverage only one of them or simply combine them via some intuitive criteria , lacking a theoretical principle to guide the integration process .", "entity": "document retrieval", "output": "graph - driven generative models", "neg_sample": ["document retrieval is done by using Method", "with the need of fast retrieval speed and small memory footprint , document hashing has been playing a crucial role in large - scale information retrieval .", "to generate high - quality hashing code , both semantics and neighborhood information are crucial .", "however , most existing methods leverage only one of them or simply combine them via some intuitive criteria , lacking a theoretical principle to guide the integration process ."], "relation": "used for", "id": "2021.acl-long.174", "year": 2021, "rel_sent": "Integrating Semantics and Neighborhood Information with Graph - Driven Generative Models for Document Retrieval.", "forward": false, "src_ids": "2021.acl-long.174_13557"}
{"input": "source - based essay scoring is done by using OtherScientificTerm| context: human essay grading is a laborious task that can consume much time and effort . automated essay scoring ( aes ) has thus been proposed as a fast and effective solution to the problem of grading student writing at scale . however , because aes typically uses supervised machine learning , a human - graded essay corpus is still required to train the aes model . unfortunately , such a graded corpus often does not exist , so creating a corpus for machine learning can also be a laborious task .", "entity": "source - based essay scoring", "output": "weak supervision", "neg_sample": ["source - based essay scoring is done by using OtherScientificTerm", "human essay grading is a laborious task that can consume much time and effort .", "automated essay scoring ( aes ) has thus been proposed as a fast and effective solution to the problem of grading student writing at scale .", "however , because aes typically uses supervised machine learning , a human - graded essay corpus is still required to train the aes model .", "unfortunately , such a graded corpus often does not exist , so creating a corpus for machine learning can also be a laborious task ."], "relation": "used for", "id": "2021.bea-1.9", "year": 2021, "rel_sent": "Essay Quality Signals as Weak Supervision for Source - based Essay Scoring.", "forward": false, "src_ids": "2021.bea-1.9_1839"}
{"input": "neuro - cognitive mechanisms is used for Task| context: context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort . however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context .", "entity": "neuro - cognitive mechanisms", "output": "sentence processing", "neg_sample": ["neuro - cognitive mechanisms is used for Task", "context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort .", "however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context ."], "relation": "used for", "id": "2021.findings-acl.332", "year": 2021, "rel_sent": "More generally , our approach adds to a growing literature using methods from computational linguistics to operationalize and test hypotheses about neuro - cognitive mechanisms in sentence processing .", "forward": true, "src_ids": "2021.findings-acl.332_5745"}
{"input": "discourse - based argument structures is used for OtherScientificTerm| context: the paper presents a novel discourse - based approach to argument quality assessment defined as a graph classification task , where the depth of reasoning ( argumentation ) is evident from the number and type of detected discourse units and relations between them .", "entity": "discourse - based argument structures", "output": "qualitative properties of natural language arguments", "neg_sample": ["discourse - based argument structures is used for OtherScientificTerm", "the paper presents a novel discourse - based approach to argument quality assessment defined as a graph classification task , where the depth of reasoning ( argumentation ) is evident from the number and type of detected discourse units and relations between them ."], "relation": "used for", "id": "2021.ranlp-1.143", "year": 2021, "rel_sent": "The obtained accuracy ranges from 74.5 % to 85.0 % and indicates that discourse - based argument structures reflect qualitative properties of natural language arguments .", "forward": true, "src_ids": "2021.ranlp-1.143_12259"}
{"input": "ideal ratio mask is done by using OtherScientificTerm| context: the masking - based speech enhancement method pursues a multiplicative mask that applies to the spectrogram of input noise - corrupted utterance , and a deep neural network ( dnn ) is often used to learn the mask . in particular , the features commonly used for automatic speech recognition can serve as the input of the dnn to learn the well - behaved mask that significantly reduce the noise distortion of processed utterances .", "entity": "ideal ratio mask", "output": "low - pass filtered temporal speech features", "neg_sample": ["ideal ratio mask is done by using OtherScientificTerm", "the masking - based speech enhancement method pursues a multiplicative mask that applies to the spectrogram of input noise - corrupted utterance , and a deep neural network ( dnn ) is often used to learn the mask .", "in particular , the features commonly used for automatic speech recognition can serve as the input of the dnn to learn the well - behaved mask that significantly reduce the noise distortion of processed utterances ."], "relation": "used for", "id": "2021.ijclclp-2.3", "year": 2021, "rel_sent": "Employing Low - Pass Filtered Temporal Speech Features for the Training of Ideal Ratio Mask in Speech Enhancement.", "forward": false, "src_ids": "2021.ijclclp-2.3_5946"}
{"input": "kernel functions is used for OtherScientificTerm| context: following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length . however , all approximations thus far have ignored the contribution of the * value vectors * to the quality of approximation . in this work , we argue that research efforts should be directed towards approximating the true output of the attention sub - layer , which includes the value vectors .", "entity": "kernel functions", "output": "attention similarity", "neg_sample": ["kernel functions is used for OtherScientificTerm", "following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length .", "however , all approximations thus far have ignored the contribution of the * value vectors * to the quality of approximation .", "in this work , we argue that research efforts should be directed towards approximating the true output of the attention sub - layer , which includes the value vectors ."], "relation": "used for", "id": "2021.emnlp-main.753", "year": 2021, "rel_sent": "Moreover , we show that the choice of kernel function for computing attention similarity can substantially affect the quality of sparse approximations , where kernel functions that are less skewed are more affected by the value vectors .", "forward": true, "src_ids": "2021.emnlp-main.753_6315"}
{"input": "certified word substitution robustness methods is used for OtherScientificTerm| context: existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training .", "entity": "certified word substitution robustness methods", "output": "equality of odds", "neg_sample": ["certified word substitution robustness methods is used for OtherScientificTerm", "existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training ."], "relation": "used for", "id": "2021.findings-acl.294", "year": 2021, "rel_sent": "In this paper , we investigate the utility of certified word substitution robustness methods to improve equality of odds and equality of opportunity on multiple text classification tasks .", "forward": true, "src_ids": "2021.findings-acl.294_2359"}
{"input": "syntax - augmented mbert is used for Task| context: nevertheless , language syntax , e.g. , syntactic dependencies , can bridge the typological gap .", "entity": "syntax - augmented mbert", "output": "cross - lingual transfer", "neg_sample": ["syntax - augmented mbert is used for Task", "nevertheless , language syntax , e.g.", ", syntactic dependencies , can bridge the typological gap ."], "relation": "used for", "id": "2021.acl-long.350", "year": 2021, "rel_sent": "The experiment results show that syntax - augmented mBERT improves cross - lingual transfer on popular benchmarks , such as PAWS - X and MLQA , by 1.4 and 1.6 points on average across all languages .", "forward": true, "src_ids": "2021.acl-long.350_770"}
{"input": "lrn is used for OtherScientificTerm| context: conventional entity typing approaches are based on independent classification paradigms , which make them difficult to recognize inter - dependent , long - tailed and fine - grained entity types . in this paper , we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges .", "entity": "lrn", "output": "complex label dependencies", "neg_sample": ["lrn is used for OtherScientificTerm", "conventional entity typing approaches are based on independent classification paradigms , which make them difficult to recognize inter - dependent , long - tailed and fine - grained entity types .", "in this paper , we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges ."], "relation": "used for", "id": "2021.emnlp-main.378", "year": 2021, "rel_sent": "Specifically , LRN utilizes an auto - regressive network to conduct deductive reasoning and a bipartite attribute graph to conduct inductive reasoning between labels , which can effectively model , learn and reason complex label dependencies in a sequence - to - set , end - to - end manner .", "forward": true, "src_ids": "2021.emnlp-main.378_5733"}
{"input": "unsupervised baselines is used for Task| context: publicly available , large pretrained language models ( lms ) generate text with remarkable quality , but only sequentially from left to right . as a result , they are not immediately applicable to generation tasks that break the unidirectional assumption , such as paraphrasing or text - infilling , necessitating task - specific supervision .", "entity": "unsupervised baselines", "output": "abductive text infilling", "neg_sample": ["unsupervised baselines is used for Task", "publicly available , large pretrained language models ( lms ) generate text with remarkable quality , but only sequentially from left to right .", "as a result , they are not immediately applicable to generation tasks that break the unidirectional assumption , such as paraphrasing or text - infilling , necessitating task - specific supervision ."], "relation": "used for", "id": "2021.acl-long.114", "year": 2021, "rel_sent": "Comprehensive empirical results demonstrate that Reflective Decoding outperforms strong unsupervised baselines on both paraphrasing and abductive text infilling , significantly narrowing the gap between unsupervised and supervised methods .", "forward": true, "src_ids": "2021.acl-long.114_15872"}
{"input": "food is done by using Material| context: the accelerating growth of big data in the biomedical domain , with an endless amount of electronic health records and more than 30 million citations and abstracts in pubmed , introduces the need for automatic structuring of textual biomedical data .", "entity": "food", "output": "annotated data", "neg_sample": ["food is done by using Material", "the accelerating growth of big data in the biomedical domain , with an endless amount of electronic health records and more than 30 million citations and abstracts in pubmed , introduces the need for automatic structuring of textual biomedical data ."], "relation": "used for", "id": "2021.bionlp-1.4", "year": 2021, "rel_sent": "Due to the lack of annotated data on food with respect to health , we explore the feasibility of transfer learning by training BERT - based models on existing datasets annotated for the presence of cause and treat relations among different types of biomedical entities , and using them to recognize the same relations between food and disease entities in a dataset created for the purposes of this study .", "forward": false, "src_ids": "2021.bionlp-1.4_9483"}
{"input": "context - aware interaction network is used for Task| context: impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations . however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information .", "entity": "context - aware interaction network", "output": "question matching", "neg_sample": ["context - aware interaction network is used for Task", "impressive milestones have been achieved in text matching by adopting a cross - attention mechanism to capture pertinent semantic connections between two sentence representations .", "however , regular cross - attention focuses on word - level links between the two input sequences , neglecting the importance of contextual information ."], "relation": "used for", "id": "2021.emnlp-main.312", "year": 2021, "rel_sent": "Context - Aware Interaction Network for Question Matching.", "forward": true, "src_ids": "2021.emnlp-main.312_11565"}
{"input": "embodied navigation tasks is done by using Method| context: people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals .", "entity": "embodied navigation tasks", "output": "pathdreamer", "neg_sample": ["embodied navigation tasks is done by using Method", "people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals ."], "relation": "used for", "id": "2021.alvr-1.9", "year": 2021, "rel_sent": "We hope that Pathdreamer will help unlock model - based approaches to challenging embodied navigation tasks such as navigating to specified objects and VLN .", "forward": false, "src_ids": "2021.alvr-1.9_4226"}
{"input": "idc system is used for Task| context: however , solving these two problems in arabic and on the basis of social network data ( i.e. , twitter ) is still of lower interest .", "entity": "idc system", "output": "sentiment classification problem", "neg_sample": ["idc system is used for Task", "however , solving these two problems in arabic and on the basis of social network data ( i.e.", ", twitter ) is still of lower interest ."], "relation": "used for", "id": "2021.wanlp-1.48", "year": 2021, "rel_sent": "The IDC System for Sentiment Classification and Sarcasm Detection in Arabic.", "forward": true, "src_ids": "2021.wanlp-1.48_2801"}
{"input": "generating long counseling text is used for Task| context: however , the lack of corpora is a main obstacle to this research , particularly in chinese language .", "entity": "generating long counseling text", "output": "psychological health support", "neg_sample": ["generating long counseling text is used for Task", "however , the lack of corpora is a main obstacle to this research , particularly in chinese language ."], "relation": "used for", "id": "2021.findings-acl.130", "year": 2021, "rel_sent": "PsyQA : A Chinese Dataset for Generating Long Counseling Text for Mental Health Support.", "forward": true, "src_ids": "2021.findings-acl.130_4438"}
{"input": "aspect - level polarity is done by using Method| context: both the issues of data deficiencies and semantic consistency are important for data augmentation . most of previous methods address the first issue , but ignore the second one . in the cases of aspect - based sentiment analysis , violation of the above issues may change the aspect and sentiment polarity .", "entity": "aspect - level polarity", "output": "replacement strategies", "neg_sample": ["aspect - level polarity is done by using Method", "both the issues of data deficiencies and semantic consistency are important for data augmentation .", "most of previous methods address the first issue , but ignore the second one .", "in the cases of aspect - based sentiment analysis , violation of the above issues may change the aspect and sentiment polarity ."], "relation": "used for", "id": "2021.emnlp-main.362", "year": 2021, "rel_sent": "We then substitute the unimportant tokens with two replacement strategies without altering the aspect - level polarity .", "forward": false, "src_ids": "2021.emnlp-main.362_12251"}
{"input": "transformer networks is used for Method| context: concept normalization , the task of linking textual mentions of concepts to concepts in an ontology , is critical for mining and analyzing biomedical texts .", "entity": "transformer networks", "output": "pre - trained models", "neg_sample": ["transformer networks is used for Method", "concept normalization , the task of linking textual mentions of concepts to concepts in an ontology , is critical for mining and analyzing biomedical texts ."], "relation": "used for", "id": "2021.bionlp-1.2", "year": 2021, "rel_sent": "The transformer networks refine existing pre - trained models , and the online triplet mining makes training efficient even with hundreds of thousands of concepts by sampling training triples within each mini - batch .", "forward": true, "src_ids": "2021.bionlp-1.2_8902"}
{"input": "ontoed is done by using OtherScientificTerm| context: event detection ( ed ) aims to identify event trigger words from a given text and classify it into an event type . most current methods to ed rely heavily on training instances , and almost ignore the correlation of event types . hence , they tend to suffer from data scarcity and fail to handle new unseen event types .", "entity": "ontoed", "output": "event ontology", "neg_sample": ["ontoed is done by using OtherScientificTerm", "event detection ( ed ) aims to identify event trigger words from a given text and classify it into an event type .", "most current methods to ed rely heavily on training instances , and almost ignore the correlation of event types .", "hence , they tend to suffer from data scarcity and fail to handle new unseen event types ."], "relation": "used for", "id": "2021.acl-long.220", "year": 2021, "rel_sent": "Based on the event ontology , OntoED can leverage and propagate correlation knowledge , particularly from data - rich to data - poor event types .", "forward": false, "src_ids": "2021.acl-long.220_11786"}
{"input": "multi - label document classification is done by using OtherScientificTerm| context: multi - label document classification , associating one document instance with a set of relevant labels , is attracting more and more research attention . these approaches however either simply utilize the semantic information of metadata or employ the predefined parent - child label hierarchy , ignoring the heterogeneous graphical structures of metadata and labels , which we believe are crucial for accurate multi - label document classification .", "entity": "multi - label document classification", "output": "label structure", "neg_sample": ["multi - label document classification is done by using OtherScientificTerm", "multi - label document classification , associating one document instance with a set of relevant labels , is attracting more and more research attention .", "these approaches however either simply utilize the semantic information of metadata or employ the predefined parent - child label hierarchy , ignoring the heterogeneous graphical structures of metadata and labels , which we believe are crucial for accurate multi - label document classification ."], "relation": "used for", "id": "2021.emnlp-main.253", "year": 2021, "rel_sent": "Beyond Text : Incorporating Metadata and Label Structure for Multi - Label Document Classification using Heterogeneous Graphs.", "forward": false, "src_ids": "2021.emnlp-main.253_4895"}
{"input": "dual encoder models is used for Task| context: the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation . the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation .", "entity": "dual encoder models", "output": "retrieval", "neg_sample": ["dual encoder models is used for Task", "the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation .", "the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation ."], "relation": "used for", "id": "2021.emnlp-main.443", "year": 2021, "rel_sent": "Dual encoder models are typically used for retrieval and deep re - ranking , while cross - attention models are typically used for shallow re - ranking .", "forward": true, "src_ids": "2021.emnlp-main.443_242"}
{"input": "multilingual neural machine translation is done by using OtherScientificTerm| context: multilingual neural machine translation ( mnmt ) learns to translate multiple language pairs with a single model , potentially improving both the accuracy and the memory - efficiency of deployed models . however , the heavy data imbalance between languages hinders the model from performing uniformly across language pairs .", "entity": "multilingual neural machine translation", "output": "learning objective", "neg_sample": ["multilingual neural machine translation is done by using OtherScientificTerm", "multilingual neural machine translation ( mnmt ) learns to translate multiple language pairs with a single model , potentially improving both the accuracy and the memory - efficiency of deployed models .", "however , the heavy data imbalance between languages hinders the model from performing uniformly across language pairs ."], "relation": "used for", "id": "2021.emnlp-main.458", "year": 2021, "rel_sent": "In this paper , we propose a new learning objective for MNMT based on distributionally robust optimization , which minimizes the worst - case expected loss over the set of language pairs .", "forward": false, "src_ids": "2021.emnlp-main.458_8097"}
{"input": "text simplification is done by using Material| context: text simplification is a growing field with many potential useful applications . training text simplification algorithms generally requires a lot of annotated data , however there are not many corpora suitable for this task .", "entity": "text simplification", "output": "monolingual parallel corpus", "neg_sample": ["text simplification is done by using Material", "text simplification is a growing field with many potential useful applications .", "training text simplification algorithms generally requires a lot of annotated data , however there are not many corpora suitable for this task ."], "relation": "used for", "id": "2021.naacl-srw.6", "year": 2021, "rel_sent": "Parallel Text Alignment and Monolingual Parallel Corpus Creation from Philosophical Texts for Text Simplification.", "forward": false, "src_ids": "2021.naacl-srw.6_14439"}
{"input": "proper nouns is done by using Method| context: user - generated texts include various types of stylistic properties , or noises . such texts are not properly processed by existing morpheme analyzers or language models based on formal texts such as encyclopedias or news articles .", "entity": "proper nouns", "output": "k - mt", "neg_sample": ["proper nouns is done by using Method", "user - generated texts include various types of stylistic properties , or noises .", "such texts are not properly processed by existing morpheme analyzers or language models based on formal texts such as encyclopedias or news articles ."], "relation": "used for", "id": "2021.wnut-1.45", "year": 2021, "rel_sent": "Through our tests , we found that K - MT is better fit to process internet slangs , proper nouns , and coinages , compared to a morpheme analyzer and a character - level WordPiece tokenizer .", "forward": false, "src_ids": "2021.wnut-1.45_1645"}
{"input": "contrastive masked language modeling ( cmlm ) is used for Method| context: the major paradigm of applying a pre - trained language model to downstream tasks is tofinetune it on labeled task data , which often suffers instability and low performance when the labeled examples are scarce .", "entity": "contrastive masked language modeling ( cmlm )", "output": "post - training", "neg_sample": ["contrastive masked language modeling ( cmlm ) is used for Method", "the major paradigm of applying a pre - trained language model to downstream tasks is tofinetune it on labeled task data , which often suffers instability and low performance when the labeled examples are scarce ."], "relation": "used for", "id": "2021.findings-acl.151", "year": 2021, "rel_sent": "Therefore , we propose complementary random masking ( CRM ) to generate a pair of masked sequences from an input sequence for sequence - level contrastive learning and then develop contrastive masked language modeling ( CMLM ) for post - training to integrate both token - level and sequence - level contrastive learnings .", "forward": true, "src_ids": "2021.findings-acl.151_10112"}
{"input": "training variance is done by using Method| context: electra ( clark et al . , 2020a ) pretrains a discriminator to detect replaced tokens , where the replacements are sampled from a generator trained with masked language modeling . despite the compelling performance , electra suffers from the following two issues . second , the generator 's prediction tends to be over - confident along with training , making replacements biased to correct tokens .", "entity": "training variance", "output": "sampling", "neg_sample": ["training variance is done by using Method", "electra ( clark et al .", ", 2020a ) pretrains a discriminator to detect replaced tokens , where the replacements are sampled from a generator trained with masked language modeling .", "despite the compelling performance , electra suffers from the following two issues .", "second , the generator 's prediction tends to be over - confident along with training , making replacements biased to correct tokens ."], "relation": "used for", "id": "2021.findings-acl.394", "year": 2021, "rel_sent": "We also prove that the efficient sampling reduces the training variance of the discriminator .", "forward": false, "src_ids": "2021.findings-acl.394_1521"}
{"input": "static word embeddings is done by using Method| context: word embeddings learn implicit biases from linguistic regularities captured by word co - occurrence statistics .", "entity": "static word embeddings", "output": "valnorm", "neg_sample": ["static word embeddings is done by using Method", "word embeddings learn implicit biases from linguistic regularities captured by word co - occurrence statistics ."], "relation": "used for", "id": "2021.emnlp-main.574", "year": 2021, "rel_sent": "We apply ValNorm on static word embeddings from seven languages ( Chinese , English , German , Polish , Portuguese , Spanish , and Turkish ) and from historical English text spanning 200 years .", "forward": false, "src_ids": "2021.emnlp-main.574_6113"}
{"input": "machine reading comprehension is done by using Task| context: an in - depth analysis of the level of language understanding required by existing machine reading comprehension ( mrc ) benchmarks can provide insight into the reading capabilities of machines .", "entity": "machine reading comprehension", "output": "understanding of explicit discourse relations", "neg_sample": ["machine reading comprehension is done by using Task", "an in - depth analysis of the level of language understanding required by existing machine reading comprehension ( mrc ) benchmarks can provide insight into the reading capabilities of machines ."], "relation": "used for", "id": "2021.eacl-main.311", "year": 2021, "rel_sent": "Is the Understanding of Explicit Discourse Relations Required in Machine Reading Comprehension ?.", "forward": false, "src_ids": "2021.eacl-main.311_5174"}
{"input": "cause - effect span detection is done by using Method| context: automatic identification of cause - effect spans in financial documents is important for causality modelling and understanding reasons that lead tofinancial events .", "entity": "cause - effect span detection", "output": "graph neural network", "neg_sample": ["cause - effect span detection is done by using Method", "automatic identification of cause - effect spans in financial documents is important for causality modelling and understanding reasons that lead tofinancial events ."], "relation": "used for", "id": "2021.fnp-1.6", "year": 2021, "rel_sent": "NUS - IDS at FinCausal 2021 : Dependency Tree in Graph Neural Network for Better Cause - Effect Span Detection.", "forward": false, "src_ids": "2021.fnp-1.6_5484"}
{"input": "risk minimization is used for Task| context: one straightforward approach is utilizing existing systems ( source models ) to generate pseudo - labeled datasets and train a target sequence labeler accordingly . however , due to the gap between the source and the target languages / domains , this approach may fail to recover the true labels .", "entity": "risk minimization", "output": "zero - shot sequence labeling", "neg_sample": ["risk minimization is used for Task", "one straightforward approach is utilizing existing systems ( source models ) to generate pseudo - labeled datasets and train a target sequence labeler accordingly .", "however , due to the gap between the source and the target languages / domains , this approach may fail to recover the true labels ."], "relation": "used for", "id": "2021.acl-long.380", "year": 2021, "rel_sent": "Risk Minimization for Zero - shot Sequence Labeling.", "forward": true, "src_ids": "2021.acl-long.380_4057"}
{"input": "gossip learning framework is done by using Method| context: advanced nlp models require huge amounts of data from various domains to produce high - quality representations . it is useful then for a few large public and private organizations to join their corpora during training . however , factors such as legislation and user emphasis on data privacy may prevent centralized orchestration and data sharing among these organizations .", "entity": "gossip learning framework", "output": "word2vec", "neg_sample": ["gossip learning framework is done by using Method", "advanced nlp models require huge amounts of data from various domains to produce high - quality representations .", "it is useful then for a few large public and private organizations to join their corpora during training .", "however , factors such as legislation and user emphasis on data privacy may prevent centralized orchestration and data sharing among these organizations ."], "relation": "used for", "id": "2021.nodalida-main.40", "year": 2021, "rel_sent": "We find that the application of Word2Vec in a gossip learning framework is viable .", "forward": false, "src_ids": "2021.nodalida-main.40_9852"}
{"input": "skeleton is done by using OtherScientificTerm| context: the presentation is accompanied by the source code .", "entity": "skeleton", "output": "anatomically independent landmarks", "neg_sample": ["skeleton is done by using OtherScientificTerm", "the presentation is accompanied by the source code ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.8", "year": 2021, "rel_sent": "From the anatomically independent landmarks , we create another skeleton based on the avatar 's skeletal bone architecture to calculate the bone rotation data .", "forward": false, "src_ids": "2021.mtsummit-at4ssl.8_8669"}
{"input": "hierarchical explorations is used for OtherScientificTerm| context: effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning . prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement .", "entity": "hierarchical explorations", "output": "unimodal , bimodal , and trimodal interactions", "neg_sample": ["hierarchical explorations is used for OtherScientificTerm", "effective unimodal representation and complementary crossmodal representation fusion are both important in multimodal representation learning .", "prior works often modulate one modal feature to another straightforwardly and thus , underutilizing both unimodal and crossmodal representation refinements , which incurs a bottleneck of performance improvement ."], "relation": "used for", "id": "2021.emnlp-main.720", "year": 2021, "rel_sent": "By hierarchical explorations on unimodal , bimodal , and trimodal interactions , UCRN is highly robust against missing modality and noisy data .", "forward": true, "src_ids": "2021.emnlp-main.720_1055"}
{"input": "edges is used for OtherScientificTerm| context: approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection .", "entity": "edges", "output": "pairwise relationships", "neg_sample": ["edges is used for OtherScientificTerm", "approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection ."], "relation": "used for", "id": "2021.acl-long.126", "year": 2021, "rel_sent": "A syntopical graph is a typed multi - graph where nodes represent claims and edges represent different possible pairwise relationships , such as entailment , paraphrase , or support .", "forward": true, "src_ids": "2021.acl-long.126_15052"}
{"input": "intrinsic dimensionality is done by using Method| context: although pretrained language models can be fine - tuned to produce state - of - the - art results for a very wide range of language understanding tasks , the dynamics of this process are not well understood , especially in the low data regime . why can we use relatively vanilla gradient descent algorithms ( e.g. , without strong regularization ) to tune a model with hundreds of millions of parameters on datasets with only hundreds or thousands of labeled examples ?", "entity": "intrinsic dimensionality", "output": "pre - training", "neg_sample": ["intrinsic dimensionality is done by using Method", "although pretrained language models can be fine - tuned to produce state - of - the - art results for a very wide range of language understanding tasks , the dynamics of this process are not well understood , especially in the low data regime .", "why can we use relatively vanilla gradient descent algorithms ( e.g.", ", without strong regularization ) to tune a model with hundreds of millions of parameters on datasets with only hundreds or thousands of labeled examples ?"], "relation": "used for", "id": "2021.acl-long.568", "year": 2021, "rel_sent": "Furthermore , we empirically show that pre - training implicitly minimizes intrinsic dimension and , perhaps surprisingly , larger models tend to have lower intrinsic dimension after a fixed number of pre - training updates , at least in part explaining their extreme effectiveness .", "forward": false, "src_ids": "2021.acl-long.568_6204"}
{"input": "interpretability toolkit is used for Task| context: despite the recent advancements of attention - based deep learning architectures across a majority of natural language processing tasks , their application remains limited in a low - resource setting because of a lack of pre - trained models for such languages .", "entity": "interpretability toolkit", "output": "low - resource nlp", "neg_sample": ["interpretability toolkit is used for Task", "despite the recent advancements of attention - based deep learning architectures across a majority of natural language processing tasks , their application remains limited in a low - resource setting because of a lack of pre - trained models for such languages ."], "relation": "used for", "id": "2021.acl-srw.5", "year": 2021, "rel_sent": "We introduce InterpretLR , an interpretability toolkit for low - resource NLP and use it alongside human evaluations to gauge the trained models .", "forward": true, "src_ids": "2021.acl-srw.5_7449"}
{"input": "inference heuristics is used for Task| context: recent prompt - based approaches allow pretrained language models to achieve strong performances on few - shot finetuning by reformulating downstream tasks as a language modeling problem . in this work , we demonstrate that , despite its advantages on low data regimes , finetuned prompt - based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap , e.g. , models incorrectly assuming a sentence pair is of the same meaning because they consist of the same set of words .", "entity": "inference heuristics", "output": "few - shot prompt - based finetuning", "neg_sample": ["inference heuristics is used for Task", "recent prompt - based approaches allow pretrained language models to achieve strong performances on few - shot finetuning by reformulating downstream tasks as a language modeling problem .", "in this work , we demonstrate that , despite its advantages on low data regimes , finetuned prompt - based models for sentence pair classification tasks still suffer from a common pitfall of adopting inference heuristics based on lexical overlap , e.g.", ", models incorrectly assuming a sentence pair is of the same meaning because they consist of the same set of words ."], "relation": "used for", "id": "2021.emnlp-main.713", "year": 2021, "rel_sent": "Avoiding Inference Heuristics in Few - shot Prompt - based Finetuning.", "forward": true, "src_ids": "2021.emnlp-main.713_14283"}
{"input": "hulk is used for Method| context: however , energy efficiency in the process of model training and inference becomes a critical bottleneck .", "entity": "hulk", "output": "pretrained models", "neg_sample": ["hulk is used for Method", "however , energy efficiency in the process of model training and inference becomes a critical bottleneck ."], "relation": "used for", "id": "2021.eacl-demos.39", "year": 2021, "rel_sent": "With HULK , we compare pretrained models ' energy efficiency from the perspectives of time and cost .", "forward": true, "src_ids": "2021.eacl-demos.39_13796"}
{"input": "cross - task generalization is done by using Method| context: language modeling with bert consists of two phases of ( i ) unsupervised pre - training on unlabeled text , and ( ii ) fine - tuning for a specific supervised task .", "entity": "cross - task generalization", "output": "maximal multiverse learning", "neg_sample": ["cross - task generalization is done by using Method", "language modeling with bert consists of two phases of ( i ) unsupervised pre - training on unlabeled text , and ( ii ) fine - tuning for a specific supervised task ."], "relation": "used for", "id": "2021.eacl-main.14", "year": 2021, "rel_sent": "Maximal Multiverse Learning for Promoting Cross - Task Generalization of Fine - Tuned Language Models.", "forward": false, "src_ids": "2021.eacl-main.14_16050"}
{"input": "toxic spans detection is done by using Method| context: toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text .", "entity": "toxic spans detection", "output": "transformer - based model", "neg_sample": ["toxic spans detection is done by using Method", "toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text ."], "relation": "used for", "id": "2021.semeval-1.112", "year": 2021, "rel_sent": "YNU - HPCC at SemEval-2021 Task 5 : Using a Transformer - based Model with Auxiliary Information for Toxic Span Detection.", "forward": false, "src_ids": "2021.semeval-1.112_10203"}
{"input": "dense retrieval is done by using Method| context: pre - trained transformer language models ( lm ) have become go - to text representation encoders . prior research fine - tunes deep lms to encode text sequences such as sentences and passages into single dense vector representations for efficient text comparison and retrieval . however , dense encoders require a lot of data and sophisticated techniques to effectively train and suffer in low data situations . this paper finds a key reason is that standard lms ' internal attention structure is not ready - to - use for dense encoders , which needs to aggregate text information into the dense representation .", "entity": "dense retrieval", "output": "pre - training architecture", "neg_sample": ["dense retrieval is done by using Method", "pre - trained transformer language models ( lm ) have become go - to text representation encoders .", "prior research fine - tunes deep lms to encode text sequences such as sentences and passages into single dense vector representations for efficient text comparison and retrieval .", "however , dense encoders require a lot of data and sophisticated techniques to effectively train and suffer in low data situations .", "this paper finds a key reason is that standard lms ' internal attention structure is not ready - to - use for dense encoders , which needs to aggregate text information into the dense representation ."], "relation": "used for", "id": "2021.emnlp-main.75", "year": 2021, "rel_sent": "Condenser : a Pre - training Architecture for Dense Retrieval.", "forward": false, "src_ids": "2021.emnlp-main.75_14733"}
{"input": "fine - tuning is used for Task| context: nlp is currently dominated by language models like roberta which are pretrained on billions of words . but what exact knowledge or skills do transformer lms learn from large - scale pretraining that they can not learn from less data ?", "entity": "fine - tuning", "output": "nlu tasks", "neg_sample": ["fine - tuning is used for Task", "nlp is currently dominated by language models like roberta which are pretrained on billions of words .", "but what exact knowledge or skills do transformer lms learn from large - scale pretraining that they can not learn from less data ?"], "relation": "used for", "id": "2021.acl-long.90", "year": 2021, "rel_sent": "To explore this question , we adopt five styles of evaluation : classifier probing , information - theoretic probing , unsupervised relative acceptability judgments , unsupervised language model knowledge probing , and fine - tuning on NLU tasks .", "forward": true, "src_ids": "2021.acl-long.90_6306"}
{"input": "error accumulation is done by using OtherScientificTerm| context: joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) . a common method is to decode triples by predicting entity pairs to obtain the corresponding relation . however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem .", "entity": "error accumulation", "output": "negative samples", "neg_sample": ["error accumulation is done by using OtherScientificTerm", "joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) .", "a common method is to decode triples by predicting entity pairs to obtain the corresponding relation .", "however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem ."], "relation": "used for", "id": "2021.emnlp-main.635", "year": 2021, "rel_sent": "To enhance model robustness , we introduce negative samples to alleviate error accumulation at different stages .", "forward": false, "src_ids": "2021.emnlp-main.635_14580"}
{"input": "copying behaviors is done by using OtherScientificTerm| context: previous studies have shown that initializing neural machine translation ( nmt ) models with the pre - trained language models ( lm ) can speed up the model training and boost the model performance . in this work , we identify a critical side - effect of pre - training for nmt , which is due to the discrepancy between the training objectives of lm - based pre - training and nmt .", "entity": "copying behaviors", "output": "pre - training initialization", "neg_sample": ["copying behaviors is done by using OtherScientificTerm", "previous studies have shown that initializing neural machine translation ( nmt ) models with the pre - trained language models ( lm ) can speed up the model training and boost the model performance .", "in this work , we identify a critical side - effect of pre - training for nmt , which is due to the discrepancy between the training objectives of lm - based pre - training and nmt ."], "relation": "used for", "id": "2021.findings-acl.373", "year": 2021, "rel_sent": "Since the LM objective learns to reconstruct a few source tokens and copy most of them , the pre - training initialization would affect the copying behaviors of NMT models .", "forward": false, "src_ids": "2021.findings-acl.373_9538"}
{"input": "sentence fusion is done by using OtherScientificTerm| context: sentence fusion is a conditional generation task that merges several related sentences into a coherent one , which can be deemed as a summary sentence . the importance of sentence fusion has long been recognized by communities in natural language generation , especially in text summarization . it remains challenging for a state - of - the - art neural abstractive summarization model to generate a well - integrated summary sentence .", "entity": "sentence fusion", "output": "event graph", "neg_sample": ["sentence fusion is done by using OtherScientificTerm", "sentence fusion is a conditional generation task that merges several related sentences into a coherent one , which can be deemed as a summary sentence .", "the importance of sentence fusion has long been recognized by communities in natural language generation , especially in text summarization .", "it remains challenging for a state - of - the - art neural abstractive summarization model to generate a well - integrated summary sentence ."], "relation": "used for", "id": "2021.emnlp-main.334", "year": 2021, "rel_sent": "We propose to build an event graph from the input sentences to effectively capture and organize related events in a structured way and use the constructed event graph to guide sentence fusion .", "forward": false, "src_ids": "2021.emnlp-main.334_15426"}
{"input": "annotated corpus is used for Task| context: the data set comprises a total of 100 contracts , obtained from 25 documents annotated in four different languages : english , german , italian , and polish .", "entity": "annotated corpus", "output": "multilingual analysis of potentially unfair clauses", "neg_sample": ["annotated corpus is used for Task", "the data set comprises a total of 100 contracts , obtained from 25 documents annotated in four different languages : english , german , italian , and polish ."], "relation": "used for", "id": "2021.nllp-1.1", "year": 2021, "rel_sent": "We present the first annotated corpus for multilingual analysis of potentially unfair clauses in online Terms of Service .", "forward": true, "src_ids": "2021.nllp-1.1_10850"}
{"input": "coreference reasoning is used for Material| context: coreference resolution is essential for natural language understanding and has been long studied in nlp . in recent years , as the format of question answering ( qa ) became a standard for machine reading comprehension ( mrc ) , there have been data collection efforts , e.g. , dasigi et al . ( 2019 ) , that attempt to evaluate the ability of mrc models to reason about coreference . however , as we show , coreference reasoning in mrc is a greater challenge than earlier thought ; mrc datasets do not reflect the natural distribution and , consequently , the challenges of coreference reasoning . specifically , success on these datasets does not reflect a model 's proficiency in coreference reasoning .", "entity": "coreference reasoning", "output": "sample evaluation set", "neg_sample": ["coreference reasoning is used for Material", "coreference resolution is essential for natural language understanding and has been long studied in nlp .", "in recent years , as the format of question answering ( qa ) became a standard for machine reading comprehension ( mrc ) , there have been data collection efforts , e.g.", ", dasigi et al .", "( 2019 ) , that attempt to evaluate the ability of mrc models to reason about coreference .", "however , as we show , coreference reasoning in mrc is a greater challenge than earlier thought ; mrc datasets do not reflect the natural distribution and , consequently , the challenges of coreference reasoning .", "specifically , success on these datasets does not reflect a model 's proficiency in coreference reasoning ."], "relation": "used for", "id": "2021.acl-long.448", "year": 2021, "rel_sent": "We propose a methodology for creating MRC datasets that better reflect the challenges of coreference reasoning and use it to create a sample evaluation set .", "forward": true, "src_ids": "2021.acl-long.448_5720"}
{"input": "alignment knowledge is done by using Method| context: a key solution to temporal sentence grounding ( tsg ) exists in how to learn effective alignment between vision and language features extracted from an untrimmed video and a sentence description . existing methods mainly leverage vanilla soft attention to perform the alignment in a single - step process . however , such single - step attention is insufficient in practice , since complicated relations between inter- and intra - modality are usually obtained through multi - step reasoning .", "entity": "alignment knowledge", "output": "calibration module", "neg_sample": ["alignment knowledge is done by using Method", "a key solution to temporal sentence grounding ( tsg ) exists in how to learn effective alignment between vision and language features extracted from an untrimmed video and a sentence description .", "existing methods mainly leverage vanilla soft attention to perform the alignment in a single - step process .", "however , such single - step attention is insufficient in practice , since complicated relations between inter- and intra - modality are usually obtained through multi - step reasoning ."], "relation": "used for", "id": "2021.emnlp-main.733", "year": 2021, "rel_sent": "Tofurther calibrate the misaligned attention caused by each reasoning step , we also devise a calibration module following each attention module to refine the alignment knowledge .", "forward": false, "src_ids": "2021.emnlp-main.733_5158"}
{"input": "contrastive learning is used for OtherScientificTerm| context: though language model text embeddings have revolutionized nlp research , their ability to capture high - level semantic information , such as relations between entities in text , is limited .", "entity": "contrastive learning", "output": "relation - related structure", "neg_sample": ["contrastive learning is used for OtherScientificTerm", "though language model text embeddings have revolutionized nlp research , their ability to capture high - level semantic information , such as relations between entities in text , is limited ."], "relation": "used for", "id": "2021.conll-1.27", "year": 2021, "rel_sent": "Given a sentence ( unstructured text ) and its graph , we use contrastive learning to impose relation - related structure on the token level representations of the sentence obtained with a CharacterBERT ( El Boukkouri et al . , 2020 ) model .", "forward": true, "src_ids": "2021.conll-1.27_13815"}
{"input": "augvic is used for Task| context: the success of neural machine translation ( nmt ) largely depends on the availability of large bitext training corpora . due to the lack of such large corpora in low - resource language pairs , nmt systems often exhibit poor performance . extra relevant monolingual data often helps , but acquiring it could be quite expensive , especially for low - resource languages . moreover , domain mismatch between bitext ( train / test ) and monolingual data might degrade the performance .", "entity": "augvic", "output": "low - resource nmt", "neg_sample": ["augvic is used for Task", "the success of neural machine translation ( nmt ) largely depends on the availability of large bitext training corpora .", "due to the lack of such large corpora in low - resource language pairs , nmt systems often exhibit poor performance .", "extra relevant monolingual data often helps , but acquiring it could be quite expensive , especially for low - resource languages .", "moreover , domain mismatch between bitext ( train / test ) and monolingual data might degrade the performance ."], "relation": "used for", "id": "2021.findings-acl.267", "year": 2021, "rel_sent": "AugVic : Exploiting BiText Vicinity for Low - Resource NMT.", "forward": true, "src_ids": "2021.findings-acl.267_3306"}
{"input": "few - shot learning scenarios is done by using Method| context: as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data . recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod.", "entity": "few - shot learning scenarios", "output": "pre - trained models", "neg_sample": ["few - shot learning scenarios is done by using Method", "as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data .", "recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod."], "relation": "used for", "id": "2021.emnlp-main.142", "year": 2021, "rel_sent": "In this paper , we devise a self - training approach to utilize the abundant unlabeled dialog data tofurther improve state - of - the - art pre - trained models in few - shot learning scenarios for ToD systems .", "forward": false, "src_ids": "2021.emnlp-main.142_11444"}
{"input": "french legal text is done by using Method| context: language models have proven to be very useful when adapted to specific domains . nonetheless , little research has been done on the adaptation of domain - specific bert models in the french language . we conclude that some specific tasks do not benefit from generic language models pre - trained on large amounts of data .", "entity": "french legal text", "output": "masked - language model adaptation", "neg_sample": ["french legal text is done by using Method", "language models have proven to be very useful when adapted to specific domains .", "nonetheless , little research has been done on the adaptation of domain - specific bert models in the french language .", "we conclude that some specific tasks do not benefit from generic language models pre - trained on large amounts of data ."], "relation": "used for", "id": "2021.nllp-1.9", "year": 2021, "rel_sent": "JuriBERT : A Masked - Language Model Adaptation for French Legal Text.", "forward": false, "src_ids": "2021.nllp-1.9_14482"}
{"input": "non - factual tokens is done by using Method| context: recent pre - trained abstractive summarization systems have started to achieve credible performance , but a major barrier to their use in practice is their propensity to output summaries that are not faithful to the input and that contain factual errors . while a number of annotated datasets and statistical models for assessing factuality have been explored , there is no clear picture of what errors are most important to target or where current techniques are succeeding and failing .", "entity": "non - factual tokens", "output": "factuality detection model", "neg_sample": ["non - factual tokens is done by using Method", "recent pre - trained abstractive summarization systems have started to achieve credible performance , but a major barrier to their use in practice is their propensity to output summaries that are not faithful to the input and that contain factual errors .", "while a number of annotated datasets and statistical models for assessing factuality have been explored , there is no clear picture of what errors are most important to target or where current techniques are succeeding and failing ."], "relation": "used for", "id": "2021.naacl-main.114", "year": 2021, "rel_sent": "Finally , we show that our best factuality detection model enables training of more factual XSum summarization models by allowing us to identify non - factual tokens in the training data .", "forward": false, "src_ids": "2021.naacl-main.114_752"}
{"input": "annotations is used for Task| context: statutory reasoning is the task of determining whether a legal statute , stated in natural language , applies to the text description of a case . prior work introduced a resource that approached statutory reasoning as a monolithic textual entailment problem , with neural baselines performing nearly at - chance .", "entity": "annotations", "output": "language understanding", "neg_sample": ["annotations is used for Task", "statutory reasoning is the task of determining whether a legal statute , stated in natural language , applies to the text description of a case .", "prior work introduced a resource that approached statutory reasoning as a monolithic textual entailment problem , with neural baselines performing nearly at - chance ."], "relation": "used for", "id": "2021.acl-long.213", "year": 2021, "rel_sent": "Augmenting an existing benchmark , we provide annotations for the four tasks , and baselines for three of them .", "forward": true, "src_ids": "2021.acl-long.213_6226"}
{"input": "intermediate fine - tuning is used for OtherScientificTerm| context: natural language ( nl ) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black - box pre - trained models , for tasks such as question answering ( qa ) and fact verification . recently , pre - trained sequence to sequence ( seq2seq ) models have proven to be very effective in jointly making predictions , as well as generating nl explanations . however , these models have many shortcomings ; they can fabricate explanations even for incorrect predictions , they are difficult to adapt to long input documents , and their training requires a large amount of labeled data .", "entity": "intermediate fine - tuning", "output": "few - shot", "neg_sample": ["intermediate fine - tuning is used for OtherScientificTerm", "natural language ( nl ) explanations of model predictions are gaining popularity as a means to understand and verify decisions made by large black - box pre - trained models , for tasks such as question answering ( qa ) and fact verification .", "recently , pre - trained sequence to sequence ( seq2seq ) models have proven to be very effective in jointly making predictions , as well as generating nl explanations .", "however , these models have many shortcomings ; they can fabricate explanations even for incorrect predictions , they are difficult to adapt to long input documents , and their training requires a large amount of labeled data ."], "relation": "used for", "id": "2021.emnlp-main.301", "year": 2021, "rel_sent": "In this paper , we develop FiD - Ex , which addresses these shortcomings for seq2seq models by : 1 ) introducing sentence markers to eliminate explanation fabrication by encouraging extractive generation , 2 ) using the fusion - in - decoder architecture to handle long input contexts , and 3 ) intermediate fine - tuning on re - structured open domain QA datasets to improve few - shot performance .", "forward": true, "src_ids": "2021.emnlp-main.301_11263"}
{"input": "fine - tuned bert model is used for OtherScientificTerm| context: the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever . detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms .", "entity": "fine - tuned bert model", "output": "toxic spans", "neg_sample": ["fine - tuned bert model is used for OtherScientificTerm", "the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever .", "detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms ."], "relation": "used for", "id": "2021.semeval-1.135", "year": 2021, "rel_sent": "We explore an ensemble of sequence labeling models including the BiLSTM - CRF , spaCy NER model with custom toxic tags , and fine - tuned BERT model to identify the toxic spans .", "forward": true, "src_ids": "2021.semeval-1.135_4201"}
{"input": "interpretability performance validation methods is used for Metric| context: when developing topic models , a critical question that should be asked is : how well will this model work in an applied setting ? because standard performance evaluation of topic interpretability uses automated measures modeled on human evaluation tests that are dissimilar to applied usage , these models ' generalizability remains in question .", "entity": "interpretability performance validation methods", "output": "model quality", "neg_sample": ["interpretability performance validation methods is used for Metric", "when developing topic models , a critical question that should be asked is : how well will this model work in an applied setting ?", "because standard performance evaluation of topic interpretability uses automated measures modeled on human evaluation tests that are dissimilar to applied usage , these models ' generalizability remains in question ."], "relation": "used for", "id": "2021.naacl-main.300", "year": 2021, "rel_sent": "These evaluations show that for some specialized collections , standard coherence measures may not inform the most appropriate topic model or the optimal number of topics , and current interpretability performance validation methods are challenged as a means to confirm model quality in the absence of ground truth data .", "forward": true, "src_ids": "2021.naacl-main.300_7498"}
{"input": "read operation is used for OtherScientificTerm| context: various machine learning tasks can benefit from access to external information of different modalities , such as text and images . recent work has focused on learning architectures with large memories capable of storing this knowledge .", "entity": "read operation", "output": "fixed external knowledge", "neg_sample": ["read operation is used for OtherScientificTerm", "various machine learning tasks can benefit from access to external information of different modalities , such as text and images .", "recent work has focused on learning architectures with large memories capable of storing this knowledge ."], "relation": "used for", "id": "2021.tacl-1.6", "year": 2021, "rel_sent": "Each KIF module learns a read operation to access fixed external knowledge .", "forward": true, "src_ids": "2021.tacl-1.6_9754"}
{"input": "sign language video is done by using Method| context: one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora . recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german . however , from the perspective of neural machine translation , this is still a tiny dataset .", "entity": "sign language video", "output": "pretrained bert - base and mbart-50 models", "neg_sample": ["sign language video is done by using Method", "one of the major challenges in sign language translation from a sign language to a spoken language is the lack of parallel corpora .", "recent works have achieved promising results on the rwth - phoenix - weather 2014 t dataset , which consists of over eight thousand parallel sentences between german sign language and german .", "however , from the perspective of neural machine translation , this is still a tiny dataset ."], "relation": "used for", "id": "2021.mtsummit-at4ssl.10", "year": 2021, "rel_sent": "We use pretrained BERT - base and mBART-50 models to initialize our sign language video to spoken language text translation model .", "forward": false, "src_ids": "2021.mtsummit-at4ssl.10_14426"}
{"input": "deep attention diffusion graph neural networks is used for Task| context: recently , graph neural networks ( gnns ) have attracted much attention due to their powerful representation ability . moreover , these models suffer from over - smoothing issues if many graph layers are stacked .", "entity": "deep attention diffusion graph neural networks", "output": "text classification", "neg_sample": ["deep attention diffusion graph neural networks is used for Task", "recently , graph neural networks ( gnns ) have attracted much attention due to their powerful representation ability .", "moreover , these models suffer from over - smoothing issues if many graph layers are stacked ."], "relation": "used for", "id": "2021.emnlp-main.642", "year": 2021, "rel_sent": "Deep Attention Diffusion Graph Neural Networks for Text Classification.", "forward": true, "src_ids": "2021.emnlp-main.642_9999"}
{"input": "word embeddings is used for Material| context: we evaluate the use of direct intrinsic word embedding evaluation tasks for specialized language . uniquely for our task , experts must rely on explicit knowledge and can not use their linguistic intuition , which may differ from that of the philosopher .", "entity": "word embeddings", "output": "specialized domains", "neg_sample": ["word embeddings is used for Material", "we evaluate the use of direct intrinsic word embedding evaluation tasks for specialized language .", "uniquely for our task , experts must rely on explicit knowledge and can not use their linguistic intuition , which may differ from that of the philosopher ."], "relation": "used for", "id": "2021.humeval-1.12", "year": 2021, "rel_sent": "Eliciting Explicit Knowledge From Domain Experts in Direct Intrinsic Evaluation of Word Embeddings for Specialized Domains.", "forward": true, "src_ids": "2021.humeval-1.12_8194"}
{"input": "nlp is done by using Method| context: due to its great power in modeling non - euclidean data like graphs or manifolds , deep learning on graph techniques ( i.e. , graph neural networks ( gnns ) ) have opened a new door to solving challenging graph - related nlp problems . there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization . despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges .", "entity": "nlp", "output": "graph representation learning", "neg_sample": ["nlp is done by using Method", "due to its great power in modeling non - euclidean data like graphs or manifolds , deep learning on graph techniques ( i.e.", ", graph neural networks ( gnns ) ) have opened a new door to solving challenging graph - related nlp problems .", "there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization .", "despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges ."], "relation": "used for", "id": "2021.naacl-tutorials.3", "year": 2021, "rel_sent": "This tutorial will cover relevant and interesting topics on applying deep learning on graph techniques to NLP , including automatic graph construction for NLP , graph representation learning for NLP , advanced GNN based models ( e.g. , graph2seq , graph2tree , and graph2graph ) for NLP , and the applications of GNNs in various NLP tasks ( e.g. , machine translation , natural language generation , information extraction and semantic parsing ) .", "forward": false, "src_ids": "2021.naacl-tutorials.3_8186"}
{"input": "language models is used for Task| context: lexical collocations are idiosyncratic combinations of two syntactically bound lexical items ( e.g. , ' heavy rain ' or ' take a step ' ) . understanding their degree of compositionality and idiosyncrasy , as well their underlying semantics , is crucial for language learners , lexicographers and downstream nlp applications .", "entity": "language models", "output": "collocation understanding", "neg_sample": ["language models is used for Task", "lexical collocations are idiosyncratic combinations of two syntactically bound lexical items ( e.g.", ", ' heavy rain ' or ' take a step ' ) .", "understanding their degree of compositionality and idiosyncrasy , as well their underlying semantics , is crucial for language learners , lexicographers and downstream nlp applications ."], "relation": "used for", "id": "2021.eacl-main.120", "year": 2021, "rel_sent": "In this paper , we perform an exhaustive analysis of current language models for collocation understanding .", "forward": true, "src_ids": "2021.eacl-main.120_6052"}
{"input": "syntactically controlled paraphrase generator ( synpg ) is used for Method| context: paraphrase generation plays an essential role in natural language process ( nlp ) , and it has many downstream applications . however , training supervised paraphrase models requires many annotated paraphrase pairs , which are usually costly to obtain . on the other hand , the paraphrases generated by existing unsupervised approaches are usually syntactically similar to the source sentences and are limited in diversity .", "entity": "syntactically controlled paraphrase generator ( synpg )", "output": "data augmentation", "neg_sample": ["syntactically controlled paraphrase generator ( synpg ) is used for Method", "paraphrase generation plays an essential role in natural language process ( nlp ) , and it has many downstream applications .", "however , training supervised paraphrase models requires many annotated paraphrase pairs , which are usually costly to obtain .", "on the other hand , the paraphrases generated by existing unsupervised approaches are usually syntactically similar to the source sentences and are limited in diversity ."], "relation": "used for", "id": "2021.eacl-main.88", "year": 2021, "rel_sent": "Finally , we show that the syntactically controlled paraphrases generated by SynPG can be utilized for data augmentation to improve the robustness of NLP models .", "forward": true, "src_ids": "2021.eacl-main.88_4475"}
{"input": "genetic algorithm is used for Method| context: deep neural networks are vulnerable to adversarial attacks , where a small perturbation to an input alters the model prediction . in many cases , malicious inputs intentionally crafted for one model can fool another model .", "entity": "genetic algorithm", "output": "ensemble of models", "neg_sample": ["genetic algorithm is used for Method", "deep neural networks are vulnerable to adversarial attacks , where a small perturbation to an input alters the model prediction .", "in many cases , malicious inputs intentionally crafted for one model can fool another model ."], "relation": "used for", "id": "2021.emnlp-main.121", "year": 2021, "rel_sent": "Based on these studies , we propose a genetic algorithm tofind an ensemble of models that can be used to induce adversarial examples tofool almost all existing models .", "forward": true, "src_ids": "2021.emnlp-main.121_15780"}
{"input": "calibrated noise is used for Method| context: neural language models have contributed to state - of - the - art results in a number of downstream applications including sentiment analysis , intent classification and others .", "entity": "calibrated noise", "output": "text representations", "neg_sample": ["calibrated noise is used for Method", "neural language models have contributed to state - of - the - art results in a number of downstream applications including sentiment analysis , intent classification and others ."], "relation": "used for", "id": "2021.emnlp-main.628", "year": 2021, "rel_sent": "Specifically , CAPE firstly applies calibrated noise through differential privacy to maintain the privacy of text representations by preserving the encoded semantic links while obscuring sensitive information .", "forward": true, "src_ids": "2021.emnlp-main.628_4872"}
{"input": "verbal inflections is done by using Generic| context: child language acquisition is famously accurate despite the sparsity of linguistic input .", "entity": "verbal inflections", "output": "cognitively motivated method", "neg_sample": ["verbal inflections is done by using Generic", "child language acquisition is famously accurate despite the sparsity of linguistic input ."], "relation": "used for", "id": "2021.scil-1.17", "year": 2021, "rel_sent": "In this paper , we introduce a cognitively motivated method for morphological acquisition with a special focus on verbal inflections .", "forward": false, "src_ids": "2021.scil-1.17_12350"}
{"input": "pipeline approach is used for Task| context: distantly supervision automatically generates plenty of training samples for relation extraction . however , it also incurs two major problems : noisy labels and imbalanced training data . previous works focus more on reducing wrongly labeled relations ( false positives ) while few explore the missing relations that are caused by incompleteness of knowledge base ( false negatives ) . furthermore , the quantity of negative labels overwhelmingly surpasses the positive ones in previous problem formulations .", "entity": "pipeline approach", "output": "sentence classification", "neg_sample": ["pipeline approach is used for Task", "distantly supervision automatically generates plenty of training samples for relation extraction .", "however , it also incurs two major problems : noisy labels and imbalanced training data .", "previous works focus more on reducing wrongly labeled relations ( false positives ) while few explore the missing relations that are caused by incompleteness of knowledge base ( false negatives ) .", "furthermore , the quantity of negative labels overwhelmingly surpasses the positive ones in previous problem formulations ."], "relation": "used for", "id": "2021.acl-long.277", "year": 2021, "rel_sent": "Thirdly , we propose a pipeline approach , dubbed ReRe , that first performs sentence classification with relational labels and then extracts the subjects / objects .", "forward": true, "src_ids": "2021.acl-long.277_8158"}
{"input": "legal text is done by using Task| context: however , researchers seem to struggle when it comes to identifying ethical limits to using nlp systems for acquiring genuine insights both about the law and the systems ' predictive capacity .", "entity": "legal text", "output": "natural language processing", "neg_sample": ["legal text is done by using Task", "however , researchers seem to struggle when it comes to identifying ethical limits to using nlp systems for acquiring genuine insights both about the law and the systems ' predictive capacity ."], "relation": "used for", "id": "2021.findings-acl.314", "year": 2021, "rel_sent": "On the Ethical Limits of Natural Language Processing on Legal Text.", "forward": false, "src_ids": "2021.findings-acl.314_6661"}
{"input": "jointly modeling tasks is used for OtherScientificTerm| context: the # metoo movement on social media platforms initiated discussions over several facets of sexual harassment in our society . however , emotional attributes associated with textual conversations related to the # metoo social movement are complexly intertwined with such narratives .", "entity": "jointly modeling tasks", "output": "sexual abuse disclosures", "neg_sample": ["jointly modeling tasks is used for OtherScientificTerm", "the # metoo movement on social media platforms initiated discussions over several facets of sexual harassment in our society .", "however , emotional attributes associated with textual conversations related to the # metoo social movement are complexly intertwined with such narratives ."], "relation": "used for", "id": "2021.naacl-main.387", "year": 2021, "rel_sent": "Our results demonstrate that positive knowledge transfer via context - specific shared representations of a flexible cross - stitched parameter sharing model helps establish the inherent benefit of jointly modeling tasks related to sexual abuse disclosures with emotion classification from the text in homogeneous and heterogeneous settings .", "forward": true, "src_ids": "2021.naacl-main.387_14788"}
{"input": "lmtc models is used for OtherScientificTerm| context: performance of models in prior art is evaluated with standard precision , recall , and f1 measures without regard for the rich hierarchical structure .", "entity": "lmtc models", "output": "icd-9 coding", "neg_sample": ["lmtc models is used for OtherScientificTerm", "performance of models in prior art is evaluated with standard precision , recall , and f1 measures without regard for the rich hierarchical structure ."], "relation": "used for", "id": "2021.emnlp-main.69", "year": 2021, "rel_sent": "We compare the evaluation scores from the proposed metrics with previously used metrics on prior art LMTC models for ICD-9 coding in MIMIC - III .", "forward": true, "src_ids": "2021.emnlp-main.69_15068"}
{"input": "time - aware predictions is done by using Material| context: designing profitable trading strategies is complex as stock movements are highly stochastic ; the market is influenced by large volumes of noisy data across diverse information sources like news and social media . prior work mostly treats stock movement prediction as a regression or classification task and is not directly optimized towards profit - making . further , they do not model the fine - grain temporal irregularities in the release of vast volumes of text that the market responds to quickly .", "entity": "time - aware predictions", "output": "textual data", "neg_sample": ["time - aware predictions is done by using Material", "designing profitable trading strategies is complex as stock movements are highly stochastic ; the market is influenced by large volumes of noisy data across diverse information sources like news and social media .", "prior work mostly treats stock movement prediction as a regression or classification task and is not directly optimized towards profit - making .", "further , they do not model the fine - grain temporal irregularities in the release of vast volumes of text that the market responds to quickly ."], "relation": "used for", "id": "2021.eacl-main.185", "year": 2021, "rel_sent": "Building on these limitations , we propose a novel hierarchical , learning to rank approach that uses textual data to make time - aware predictions for ranking stocks based on expected profit .", "forward": false, "src_ids": "2021.eacl-main.185_5965"}
{"input": "word - level alignments is used for OtherScientificTerm| context: we describe a span - level supervised attention loss that improves compositional generalization in semantic parsers .", "entity": "word - level alignments", "output": "spans", "neg_sample": ["word - level alignments is used for OtherScientificTerm", "we describe a span - level supervised attention loss that improves compositional generalization in semantic parsers ."], "relation": "used for", "id": "2021.naacl-main.225", "year": 2021, "rel_sent": "Where past work has used word - level alignments , we focus on spans ; borrowing ideas from phrase - based machine translation , we align subtrees in semantic parses to spans of input sentences , and encourage neural attention mechanisms to mimic these alignments .", "forward": true, "src_ids": "2021.naacl-main.225_8378"}
{"input": "large - scale commonsense knowledge is used for Task| context: ' abstractive dialogue summarization is the task of capturing the highlights of a dialogue andrewriting them into a concise version .", "entity": "large - scale commonsense knowledge", "output": "dialogue un - derstanding", "neg_sample": ["large - scale commonsense knowledge is used for Task", "' abstractive dialogue summarization is the task of capturing the highlights of a dialogue andrewriting them into a concise version ."], "relation": "used for", "id": "2021.ccl-1.86", "year": 2021, "rel_sent": "In this paper we present a novel multi - speaker dialogue summarizer to demonstrate how large - scale commonsense knowledge can facilitate dialogue un - derstanding and summary generation .", "forward": true, "src_ids": "2021.ccl-1.86_6078"}
{"input": "abstractive summarization is done by using Task| context: repetition in natural language generation reduces the informativeness of text and makes it less appealing . various techniques have been proposed to alleviate it .", "entity": "abstractive summarization", "output": "language modeling", "neg_sample": ["abstractive summarization is done by using Task", "repetition in natural language generation reduces the informativeness of text and makes it less appealing .", "various techniques have been proposed to alleviate it ."], "relation": "used for", "id": "2021.ranlp-srw.18", "year": 2021, "rel_sent": "First , we explore the application of unlikelihood training and embedding matrix regularizers from previous work on language modeling to abstractive summarization .", "forward": false, "src_ids": "2021.ranlp-srw.18_40"}
{"input": "graph convolutional networks is used for OtherScientificTerm| context: existing approaches for table annotation with entities and types either capture the structure of table using graphical models , or learn embeddings of table entries without accounting for the complete syntactic structure .", "entity": "graph convolutional networks", "output": "knowledge graph", "neg_sample": ["graph convolutional networks is used for OtherScientificTerm", "existing approaches for table annotation with entities and types either capture the structure of table using graphical models , or learn embeddings of table entries without accounting for the complete syntactic structure ."], "relation": "used for", "id": "2021.eacl-main.102", "year": 2021, "rel_sent": "We propose TabGCN , that uses Graph Convolutional Networks to capture the complete structure of tables , knowledge graph and the training annotations , and jointly learns embeddings for table elements as well as the entities and types .", "forward": true, "src_ids": "2021.eacl-main.102_1953"}
{"input": "supervised domain adaptation is used for Material| context: recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains . however , at the same time , this can require a large amount of annotated target examples .", "entity": "supervised domain adaptation", "output": "clinical notes", "neg_sample": ["supervised domain adaptation is used for Material", "recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains .", "however , at the same time , this can require a large amount of annotated target examples ."], "relation": "used for", "id": "2021.crac-1.13", "year": 2021, "rel_sent": "In this work , we focus on supervised domain adaptation for clinical notes , proposing the use of concept knowledge to more efficiently adapt coreference models to a new domain .", "forward": true, "src_ids": "2021.crac-1.13_12328"}
{"input": "translations is used for Task| context: multilingual models have demonstrated impressive cross - lingual transfer performance . however , test sets like xnli are monolingual at the example level . in multilingual communities , it is common for polyglots to code - mix when conversing with each other . this paper will be published in the proceedings of naacl - hlt 2021 .", "entity": "translations", "output": "sense disambiguation", "neg_sample": ["translations is used for Task", "multilingual models have demonstrated impressive cross - lingual transfer performance .", "however , test sets like xnli are monolingual at the example level .", "in multilingual communities , it is common for polyglots to code - mix when conversing with each other .", "this paper will be published in the proceedings of naacl - hlt 2021 ."], "relation": "used for", "id": "2021.calcs-1.19", "year": 2021, "rel_sent": "The former ( PolyGloss ) uses bilingual dictionaries to propose perturbations and translations of the clean example for sense disambiguation .", "forward": true, "src_ids": "2021.calcs-1.19_4307"}
{"input": "offence detection is done by using Method| context: with the advent of social media , we have seen a proliferation of data and public discourse . unfortunately , this includes offensive content as well . the problem is exacerbated due to the sheer number of languages spoken on these platforms and the multiple other modalities used for sharing offensive content ( images , gifs , videos and more ) .", "entity": "offence detection", "output": "ensemble of multilingual language models", "neg_sample": ["offence detection is done by using Method", "with the advent of social media , we have seen a proliferation of data and public discourse .", "unfortunately , this includes offensive content as well .", "the problem is exacerbated due to the sheer number of languages spoken on these platforms and the multiple other modalities used for sharing offensive content ( images , gifs , videos and more ) ."], "relation": "used for", "id": "2021.dravidianlangtech-1.42", "year": 2021, "rel_sent": "Bitions@DravidianLangTech - EACL2021 : Ensemble of Multilingual Language Models with Pseudo Labeling for offence Detection in Dravidian Languages.", "forward": false, "src_ids": "2021.dravidianlangtech-1.42_13659"}
{"input": "human effort is done by using Method| context: conversational dialogue systems ( cdss ) are hard to evaluate due to the complexity of natural language . automatic evaluation of dialogues often shows insufficient correlation with human judgements . human evaluation is reliable but labor - intensive .", "entity": "human effort", "output": "human effort estimation module", "neg_sample": ["human effort is done by using Method", "conversational dialogue systems ( cdss ) are hard to evaluate due to the complexity of natural language .", "automatic evaluation of dialogues often shows insufficient correlation with human judgements .", "human evaluation is reliable but labor - intensive ."], "relation": "used for", "id": "2021.acl-long.436", "year": 2021, "rel_sent": "HMCEval includes a model confidence estimation module to estimate the confidence of the predicted sample assignment , and a human effort estimation module to estimate the human effort should the sample be assigned to human evaluation , as well as a sample assignment execution module that finds the optimum assignment solution based on the estimated confidence and effort .", "forward": false, "src_ids": "2021.acl-long.436_5550"}
{"input": "multilingual model is done by using Method| context: we study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks .", "entity": "multilingual model", "output": "monolingual tokenizer", "neg_sample": ["multilingual model is done by using Method", "we study a set of nine typologically diverse languages with readily available pretrained monolingual models on a set of five diverse monolingual downstream tasks ."], "relation": "used for", "id": "2021.acl-long.243", "year": 2021, "rel_sent": "We further find that replacing the original multilingual tokenizer with the specialized monolingual tokenizer improves the downstream performance of the multilingual model for almost every task and language .", "forward": false, "src_ids": "2021.acl-long.243_1297"}
{"input": "assessing toxicity and game - specific aspects is done by using Task| context: traditional toxicity detection models have focused on the single utterance level without deeper understanding of context .", "entity": "assessing toxicity and game - specific aspects", "output": "toxicity detection tasks", "neg_sample": ["assessing toxicity and game - specific aspects is done by using Task", "traditional toxicity detection models have focused on the single utterance level without deeper understanding of context ."], "relation": "used for", "id": "2021.findings-acl.213", "year": 2021, "rel_sent": "Inspired by NLU , we also apply its metrics to the toxicity detection tasks for assessing toxicity and game - specific aspects .", "forward": false, "src_ids": "2021.findings-acl.213_3697"}
{"input": "machine learning models is used for Task| context: understanding language requires grasping not only the overtly stated content , but also making inferences about things that were left unsaid . these inferences include presuppositions , a phenomenon by which a listener learns about new information through reasoning about what a speaker takes as given . presuppositions require complex understanding of the lexical and syntactic properties that trigger them as well as the broader conversational context .", "entity": "machine learning models", "output": "human inferences", "neg_sample": ["machine learning models is used for Task", "understanding language requires grasping not only the overtly stated content , but also making inferences about things that were left unsaid .", "these inferences include presuppositions , a phenomenon by which a listener learns about new information through reasoning about what a speaker takes as given .", "presuppositions require complex understanding of the lexical and syntactic properties that trigger them as well as the broader conversational context ."], "relation": "used for", "id": "2021.conll-1.28", "year": 2021, "rel_sent": "In this work , we introduce the Naturally - Occurring Presuppositions in English ( NOPE ) Corpus to investigate the context - sensitivity of 10 different types of presupposition triggers and to evaluate machine learning models ' ability to predict human inferences .", "forward": true, "src_ids": "2021.conll-1.28_15810"}
{"input": "maximum matching paths is done by using Method| context: generating long text conditionally depending on the short input text has recently attracted more and more research efforts . most existing approaches focus more on introducing extra knowledge to supplement the short input text , but ignore the coherence issue of the generated texts .", "entity": "maximum matching paths", "output": "subgraph alignment methods", "neg_sample": ["maximum matching paths is done by using Method", "generating long text conditionally depending on the short input text has recently attracted more and more research efforts .", "most existing approaches focus more on introducing extra knowledge to supplement the short input text , but ignore the coherence issue of the generated texts ."], "relation": "used for", "id": "2021.emnlp-main.200", "year": 2021, "rel_sent": "Then , three subgraph alignment methods are proposed to extract the maximum matching paths or subgraphs .", "forward": false, "src_ids": "2021.emnlp-main.200_14559"}
{"input": "multi - teacher co - finetuning method is used for Method| context: pre - trained language models ( plms ) achieve great success in nlp . however , their huge model sizes hinder their applications in many practical systems . knowledge distillation is a popular technique to compress plms , which learns a small student model from a large teacher plm . however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model .", "entity": "multi - teacher co - finetuning method", "output": "multiple teacher plms", "neg_sample": ["multi - teacher co - finetuning method is used for Method", "pre - trained language models ( plms ) achieve great success in nlp .", "however , their huge model sizes hinder their applications in many practical systems .", "knowledge distillation is a popular technique to compress plms , which learns a small student model from a large teacher plm .", "however , the knowledge learned from a single teacher may be limited and even biased , resulting in low - quality student model ."], "relation": "used for", "id": "2021.findings-acl.387", "year": 2021, "rel_sent": "In MTBERT we design a multi - teacher co - finetuning method to jointly finetune multiple teacher PLMs in downstream tasks with shared pooling and prediction layers to align their output space for better collaborative teaching .", "forward": true, "src_ids": "2021.findings-acl.387_10305"}
{"input": "frame semantic parsing is done by using Method| context: current researches on frame semantic parsing include three subtasks , namely frame identification , argument identification and role classification . most of previous systems process these subtasks independently and ignore their interactions .", "entity": "frame semantic parsing", "output": "multi - decoder framework", "neg_sample": ["frame semantic parsing is done by using Method", "current researches on frame semantic parsing include three subtasks , namely frame identification , argument identification and role classification .", "most of previous systems process these subtasks independently and ignore their interactions ."], "relation": "used for", "id": "2021.findings-acl.227", "year": 2021, "rel_sent": "Joint Multi - Decoder Framework with Hierarchical Pointer Network for Frame Semantic Parsing.", "forward": false, "src_ids": "2021.findings-acl.227_7633"}
{"input": "multi - vector attention models is used for Task| context: large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum . the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation . the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation .", "entity": "multi - vector attention models", "output": "deep re - ranking", "neg_sample": ["multi - vector attention models is used for Task", "large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum .", "the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation .", "the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation ."], "relation": "used for", "id": "2021.emnlp-main.443", "year": 2021, "rel_sent": "Multi - Vector Attention Models for Deep Re - ranking.", "forward": true, "src_ids": "2021.emnlp-main.443_237"}
{"input": "hashing based efficient inference is used for Task| context: recent works mainly focus on exploring the interactions between images and sentences to improve the performance without considering inference efficiency . specifically , for the large scale databases , it is unacceptable to perform such time - consuming mechanisms between a query ( text / image ) and each candidate datapoint ( image / text ) in the whole retrieval set during inference .", "entity": "hashing based efficient inference", "output": "image - text matching", "neg_sample": ["hashing based efficient inference is used for Task", "recent works mainly focus on exploring the interactions between images and sentences to improve the performance without considering inference efficiency .", "specifically , for the large scale databases , it is unacceptable to perform such time - consuming mechanisms between a query ( text / image ) and each candidate datapoint ( image / text ) in the whole retrieval set during inference ."], "relation": "used for", "id": "2021.findings-acl.66", "year": 2021, "rel_sent": "Hashing based Efficient Inference for Image - Text Matching.", "forward": true, "src_ids": "2021.findings-acl.66_10673"}
{"input": "ctc - based compression is used for Task| context: previous studies demonstrated that a dynamic phone - informed compression of the input audio is beneficial for speech translation ( st ) . however , they required a dedicated model for phone recognition and did not test this solution for direct st , in which a single model translates the input audio into the target language without intermediate representations .", "entity": "ctc - based compression", "output": "direct speech translation", "neg_sample": ["ctc - based compression is used for Task", "previous studies demonstrated that a dynamic phone - informed compression of the input audio is beneficial for speech translation ( st ) .", "however , they required a dedicated model for phone recognition and did not test this solution for direct st , in which a single model translates the input audio into the target language without intermediate representations ."], "relation": "used for", "id": "2021.eacl-main.57", "year": 2021, "rel_sent": "CTC - based Compression for Direct Speech Translation.", "forward": true, "src_ids": "2021.eacl-main.57_6561"}
{"input": "discourse context is used for Task| context: existing models fail tofully utilize the contextual information which plays an important role in interpreting each local sentence .", "entity": "discourse context", "output": "implicit discourse relation recognition", "neg_sample": ["discourse context is used for Task", "existing models fail tofully utilize the contextual information which plays an important role in interpreting each local sentence ."], "relation": "used for", "id": "2021.naacl-main.126", "year": 2021, "rel_sent": "In this paper , we thus propose a novel graph - based Context Tracking Network ( CT - Net ) to model the discourse context for IDRR .", "forward": true, "src_ids": "2021.naacl-main.126_12186"}
{"input": "time - aware graph neural network is used for Task| context: recently , the availability of temporal kgs ( tkgs ) that contain time information created the need for reasoning over time in such tkgs .", "entity": "time - aware graph neural network", "output": "entity alignment", "neg_sample": ["time - aware graph neural network is used for Task", "recently , the availability of temporal kgs ( tkgs ) that contain time information created the need for reasoning over time in such tkgs ."], "relation": "used for", "id": "2021.emnlp-main.709", "year": 2021, "rel_sent": "Time - aware Graph Neural Network for Entity Alignment between Temporal Knowledge Graphs.", "forward": true, "src_ids": "2021.emnlp-main.709_85"}
{"input": "context - sensitivity estimation is used for Task| context: hence , toxicity detectors trained on current datasets will also disregard context , making the detection of context - sensitive toxicity a lot harder when it occurs .", "entity": "context - sensitivity estimation", "output": "toxicity detection", "neg_sample": ["context - sensitivity estimation is used for Task", "hence , toxicity detectors trained on current datasets will also disregard context , making the detection of context - sensitive toxicity a lot harder when it occurs ."], "relation": "used for", "id": "2021.woah-1.15", "year": 2021, "rel_sent": "Context Sensitivity Estimation in Toxicity Detection.", "forward": true, "src_ids": "2021.woah-1.15_13028"}
{"input": "topic understanding is done by using Task| context: approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection .", "entity": "topic understanding", "output": "reading process", "neg_sample": ["topic understanding is done by using Task", "approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection ."], "relation": "used for", "id": "2021.acl-long.126", "year": 2021, "rel_sent": "We introduce a general approach to these tasks motivated by syntopical reading , a reading process that emphasizes comparing and contrasting viewpoints in order to improve topic understanding .", "forward": false, "src_ids": "2021.acl-long.126_15049"}
{"input": "aspect - based sentiment analysis is done by using Method| context: aspect - based sentiment analysis ( absa ) predicts the sentiment polarity towards a particular aspect term in a sentence , which is an important task in real - world applications . to perform absa , the trained model is required to have a good understanding of the contextual information , especially the particular patterns that suggest the sentiment polarity . however , these patterns typically vary in different sentences , especially when the sentences come from different sources ( domains ) , which makes absa still very challenging . although combining labeled data across different sources ( domains ) is a promising solution to address the challenge , in practical applications , these labeled data are usually stored at different locations and might be inaccessible to each other due to privacy or legal concerns ( e.g. , the data are owned by different companies ) .", "entity": "aspect - based sentiment analysis", "output": "federated learning", "neg_sample": ["aspect - based sentiment analysis is done by using Method", "aspect - based sentiment analysis ( absa ) predicts the sentiment polarity towards a particular aspect term in a sentence , which is an important task in real - world applications .", "to perform absa , the trained model is required to have a good understanding of the contextual information , especially the particular patterns that suggest the sentiment polarity .", "however , these patterns typically vary in different sentences , especially when the sentences come from different sources ( domains ) , which makes absa still very challenging .", "although combining labeled data across different sources ( domains ) is a promising solution to address the challenge , in practical applications , these labeled data are usually stored at different locations and might be inaccessible to each other due to privacy or legal concerns ( e.g.", ", the data are owned by different companies ) ."], "relation": "used for", "id": "2021.emnlp-main.321", "year": 2021, "rel_sent": "Improving Federated Learning for Aspect - based Sentiment Analysis via Topic Memories.", "forward": false, "src_ids": "2021.emnlp-main.321_10265"}
{"input": "multilingual training data is done by using Task| context: multilingual semantic parsing is a cost - effective method that allows a single model to understand different languages . however , researchers face a great imbalance of availability of training data , with english being resource rich , and other languages having much less data .", "entity": "multilingual training data", "output": "machine translation", "neg_sample": ["multilingual training data is done by using Task", "multilingual semantic parsing is a cost - effective method that allows a single model to understand different languages .", "however , researchers face a great imbalance of availability of training data , with english being resource rich , and other languages having much less data ."], "relation": "used for", "id": "2021.starsem-1.17", "year": 2021, "rel_sent": "To tackle the data limitation problem , we propose using machine translation to bootstrap multilingual training data from the more abundant English data .", "forward": false, "src_ids": "2021.starsem-1.17_169"}
{"input": "fine - tuning step is used for OtherScientificTerm| context: text - pair classification is the task of determining the class relationship between two sentences . it is embedded in several tasks such as paraphrase identification and duplicate question detection . contemporary methods use fine - tuned transformer encoder semantic representations of the classification token in the text - pair sequence from the transformer 's final layer for class prediction .", "entity": "fine - tuning step", "output": "shallow features", "neg_sample": ["fine - tuning step is used for OtherScientificTerm", "text - pair classification is the task of determining the class relationship between two sentences .", "it is embedded in several tasks such as paraphrase identification and duplicate question detection .", "contemporary methods use fine - tuned transformer encoder semantic representations of the classification token in the text - pair sequence from the transformer 's final layer for class prediction ."], "relation": "used for", "id": "2021.alta-1.7", "year": 2021, "rel_sent": "Our work shows that transformer - based models can improve text - pair classification by modifying the fine - tuning step to exploit shallow features while improving model generalization , with only a slight reduction in efficiency .", "forward": true, "src_ids": "2021.alta-1.7_10340"}
{"input": "gpt-2 is used for Task| context: fine - tuning is the de facto way of leveraging large pretrained language models for downstream tasks . however , fine - tuning modifies all the language model parameters and therefore necessitates storing a full copy for each task .", "entity": "gpt-2", "output": "table - to - text generation", "neg_sample": ["gpt-2 is used for Task", "fine - tuning is the de facto way of leveraging large pretrained language models for downstream tasks .", "however , fine - tuning modifies all the language model parameters and therefore necessitates storing a full copy for each task ."], "relation": "used for", "id": "2021.acl-long.353", "year": 2021, "rel_sent": "We apply prefix - tuning to GPT-2 for table - to - text generation and to BART for summarization .", "forward": true, "src_ids": "2021.acl-long.353_11057"}
{"input": "transformer is done by using OtherScientificTerm| context: sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation . in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding .", "entity": "transformer", "output": "l0drop layer", "neg_sample": ["transformer is done by using OtherScientificTerm", "sequence - to - sequence models usually transfer all encoder outputs to the decoder for generation .", "in this work , by contrast , we hypothesize that these encoder outputs can be compressed to shorten the sequence delivered for decoding ."], "relation": "used for", "id": "2021.findings-acl.255", "year": 2021, "rel_sent": "In other words , via joint training , the L0DROP layer forces Transformer to route information through a subset of its encoder states .", "forward": false, "src_ids": "2021.findings-acl.255_12011"}
{"input": "inter sentence generation is used for OtherScientificTerm| context: document machine translation aims to translate the source sentence into the target language in the presence of additional contextual information . however , it typically suffers from a lack of doc - level bilingual data .", "entity": "inter sentence generation", "output": "cross - sentence dependency", "neg_sample": ["inter sentence generation is used for OtherScientificTerm", "document machine translation aims to translate the source sentence into the target language in the presence of additional contextual information .", "however , it typically suffers from a lack of doc - level bilingual data ."], "relation": "used for", "id": "2021.naacl-main.281", "year": 2021, "rel_sent": "The proposed model performs inter sentence generation to capture the cross - sentence dependency within the target document , and cross sentence translation to make better use of valuable contextual information .", "forward": true, "src_ids": "2021.naacl-main.281_1605"}
{"input": "grammar rules is done by using Method| context: automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality . conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation . however , they tend to ignore the code structure information and suffer from grammar and syntax errors .", "entity": "grammar rules", "output": "encoders", "neg_sample": ["grammar rules is done by using Method", "automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality .", "conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation .", "however , they tend to ignore the code structure information and suffer from grammar and syntax errors ."], "relation": "used for", "id": "2021.findings-acl.111", "year": 2021, "rel_sent": "To consider the grammar and syntax information , in this paper , we propose a grammar - based ruleto - rule model , which regards the repair process as the transformation of grammar rules , and leverages two encoders modeling both the original token sequence and the grammar rules , enhanced with a new tree - based self - attention .", "forward": false, "src_ids": "2021.findings-acl.111_9099"}
{"input": "shared understandings is done by using Method| context: discrepancies exist among different cultures or languages . a lack of mutual understanding among different colingual groups about the perspectives on specific values or events may lead to uninformed decisions or biased opinions . thus , automatically understanding the group perspectives can provide essential back - ground for many natural language processing tasks .", "entity": "shared understandings", "output": "computational approach", "neg_sample": ["shared understandings is done by using Method", "discrepancies exist among different cultures or languages .", "a lack of mutual understanding among different colingual groups about the perspectives on specific values or events may lead to uninformed decisions or biased opinions .", "thus , automatically understanding the group perspectives can provide essential back - ground for many natural language processing tasks ."], "relation": "used for", "id": "2021.socialnlp-1.16", "year": 2021, "rel_sent": "We present a novel computational approach to learn shared understandings , and benchmark our method by building culturally - aware models for the English , Chinese , and Japanese languages .", "forward": false, "src_ids": "2021.socialnlp-1.16_14911"}
{"input": "hard cases is done by using Method| context: relation extraction ( re ) is an essential topic in natural language processing and has attracted extensive attention . current re approaches achieve fantastic results on common datasets , while they still struggle on practical applications . in this paper , we analyze the above performance gap , the underlying reason of which is that practical applications intrinsically have more hard cases .", "entity": "hard cases", "output": "re models", "neg_sample": ["hard cases is done by using Method", "relation extraction ( re ) is an essential topic in natural language processing and has attracted extensive attention .", "current re approaches achieve fantastic results on common datasets , while they still struggle on practical applications .", "in this paper , we analyze the above performance gap , the underlying reason of which is that practical applications intrinsically have more hard cases ."], "relation": "used for", "id": "2021.findings-acl.249", "year": 2021, "rel_sent": "To make RE models more robust on such practical hard cases , we propose a case - oriented construction framework to build a Hard Case Relation Extraction Dataset ( HacRED ) .", "forward": false, "src_ids": "2021.findings-acl.249_6833"}
{"input": "simplified transition set is used for Method| context: predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks . these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments .", "entity": "simplified transition set", "output": "pre - trained language models", "neg_sample": ["simplified transition set is used for Method", "predicting linearized abstract meaning representation ( amr ) graphs using pre - trained sequence - to - sequence transformer models has recently led to large improvements on amr parsing benchmarks .", "these parsers are simple and avoid explicit modeling of structure but lack desirable properties such as graph well - formedness guarantees or built - in graph - sentence alignments ."], "relation": "used for", "id": "2021.emnlp-main.507", "year": 2021, "rel_sent": "We depart from a pointer - based transition system and propose a simplified transition set , designed to better exploit pre - trained language models for structured fine - tuning .", "forward": true, "src_ids": "2021.emnlp-main.507_12737"}
{"input": "speech technology is used for Task| context: to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al . , 2020 ; xu et al . , 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al . , 2018 ) under different training settings .", "entity": "speech technology", "output": "automatic speech recognition", "neg_sample": ["speech technology is used for Task", "to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al .", ", 2020 ; xu et al .", ", 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al .", ", 2018 ) under different training settings ."], "relation": "used for", "id": "2021.icnlsp-1.2", "year": 2021, "rel_sent": "Speech Technology for Everyone : Automatic Speech Recognition for Non - Native English.", "forward": true, "src_ids": "2021.icnlsp-1.2_11363"}
{"input": "pause duration is used for OtherScientificTerm| context: entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants . however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent . previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs .", "entity": "pause duration", "output": "contextual embeddings", "neg_sample": ["pause duration is used for OtherScientificTerm", "entity tags in human - machine dialog are integral to natural language understanding ( nlu ) tasks in conversational assistants .", "however , current systems struggle to accurately parse spoken queries with the typical use of text input alone , and often fail to understand the user intent .", "previous work in linguistics has identified a cross - language tendency for longer speech pauses surrounding nouns as compared to verbs ."], "relation": "used for", "id": "2021.nlp4convai-1.22", "year": 2021, "rel_sent": "Additionally , in contrast to text - based NLU , we apply pause duration to enrich contextual embeddings to improve shallow parsing of entities .", "forward": true, "src_ids": "2021.nlp4convai-1.22_722"}
{"input": "dependency graphs is used for Task| context: graph - based aspect - based sentiment classification ( absc ) approaches have yielded state - of - the - art results , expecially when equipped with contextual word embedding from pre - training language models ( plms ) . however , they ignore sequential features of the context and have not yet made the best of plms .", "entity": "dependency graphs", "output": "downstream classification", "neg_sample": ["dependency graphs is used for Task", "graph - based aspect - based sentiment classification ( absc ) approaches have yielded state - of - the - art results , expecially when equipped with contextual word embedding from pre - training language models ( plms ) .", "however , they ignore sequential features of the context and have not yet made the best of plms ."], "relation": "used for", "id": "2021.emnlp-main.724", "year": 2021, "rel_sent": "BERT4GCN utilizes outputs from intermediate layers of BERT and positional information between words to augment GCN ( Graph Convolutional Network ) to better encode the dependency graphs for the downstream classification .", "forward": true, "src_ids": "2021.emnlp-main.724_12657"}
{"input": "disagreement resolution is done by using Generic| context: the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction .", "entity": "disagreement resolution", "output": "systematic procedure", "neg_sample": ["disagreement resolution is done by using Generic", "the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction ."], "relation": "used for", "id": "2021.humeval-1.15", "year": 2021, "rel_sent": "Consensus among annotators , we maintain , should be striven for , through a systematic procedure for disagreement resolution such as the one we describe .", "forward": false, "src_ids": "2021.humeval-1.15_8899"}
{"input": "trusted clean data is used for Task| context: learning multilingual and multi - domain translation model is challenging as the heterogeneous and imbalanced data make the model converge inconsistently over different corpora in real world . one common practice is to adjust the share of each corpus in the training , so that the learning process is balanced and low - resource cases can benefit from the high resource ones . however , automatic balancing methods usually depend on the intra- and inter - dataset characteristics , which is usually agnostic or requires human priors .", "entity": "trusted clean data", "output": "multi - corpus machine translation", "neg_sample": ["trusted clean data is used for Task", "learning multilingual and multi - domain translation model is challenging as the heterogeneous and imbalanced data make the model converge inconsistently over different corpora in real world .", "one common practice is to adjust the share of each corpus in the training , so that the learning process is balanced and low - resource cases can benefit from the high resource ones .", "however , automatic balancing methods usually depend on the intra- and inter - dataset characteristics , which is usually agnostic or requires human priors ."], "relation": "used for", "id": "2021.emnlp-main.580", "year": 2021, "rel_sent": "In this work , we propose an approach , MultiUAT , that dynamically adjusts the training data usage based on the model 's uncertainty on a small set of trusted clean data for multi - corpus machine translation .", "forward": true, "src_ids": "2021.emnlp-main.580_5724"}
{"input": "joint models is used for Task| context: pipelines are conceptually simple , but errors propagate from one component to the next , without later components being able to revise earlier decisions .", "entity": "joint models", "output": "question answering", "neg_sample": ["joint models is used for Task", "pipelines are conceptually simple , but errors propagate from one component to the next , without later components being able to revise earlier decisions ."], "relation": "used for", "id": "2021.acl-long.301", "year": 2021, "rel_sent": "Experiments on biomedical data from BIOASQ show that our joint models vastly outperform the pipelines in snippet retrieval , the main goal for QA , with fewer trainable parameters , also remaining competitive in document retrieval .", "forward": true, "src_ids": "2021.acl-long.301_15663"}
{"input": "transfer learning is used for Method| context: transfer learning from pre - trained neural language models towards downstream tasks has been a predominant theme in nlp recently .", "entity": "transfer learning", "output": "deep nlp models", "neg_sample": ["transfer learning is used for Method", "transfer learning from pre - trained neural language models towards downstream tasks has been a predominant theme in nlp recently ."], "relation": "used for", "id": "2021.findings-acl.438", "year": 2021, "rel_sent": "How transfer learning impacts linguistic knowledge in deep NLP models ?.", "forward": true, "src_ids": "2021.findings-acl.438_2985"}
{"input": "unsupervised machine translation is used for OtherScientificTerm| context: unsupervised translation has reached impressive performance on resource - rich language pairs such as english - french and english - german . in this work , we show that multilinguality is critical to making unsupervised systems practical for low - resource settings .", "entity": "unsupervised machine translation", "output": "rare languages", "neg_sample": ["unsupervised machine translation is used for OtherScientificTerm", "unsupervised translation has reached impressive performance on resource - rich language pairs such as english - french and english - german .", "in this work , we show that multilinguality is critical to making unsupervised systems practical for low - resource settings ."], "relation": "used for", "id": "2021.naacl-main.89", "year": 2021, "rel_sent": "Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages.", "forward": true, "src_ids": "2021.naacl-main.89_6089"}
{"input": "gan objective function is done by using Method| context: generating diverse texts is an important factor for unsupervised text generation . one approach is to produce the diversity of texts conditioned by the sampled latent code . although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained .", "entity": "gan objective function", "output": "gan model", "neg_sample": ["gan objective function is done by using Method", "generating diverse texts is an important factor for unsupervised text generation .", "one approach is to produce the diversity of texts conditioned by the sampled latent code .", "although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained ."], "relation": "used for", "id": "2021.eacl-srw.23", "year": 2021, "rel_sent": "In this paper , we propose a GAN model that aims to improve the approach to generating diverse texts conditioned by the latent space .", "forward": false, "src_ids": "2021.eacl-srw.23_11813"}
{"input": "masked language models ( bert ) is used for Task| context: we address the task of antonym prediction in a context , which is a fill - in - the - blanks problem . this task setting is unique and practical because it requires contrastiveness to the other word and naturalness as a text in filling a blank .", "entity": "masked language models ( bert )", "output": "context - aware antonym prediction", "neg_sample": ["masked language models ( bert ) is used for Task", "we address the task of antonym prediction in a context , which is a fill - in - the - blanks problem .", "this task setting is unique and practical because it requires contrastiveness to the other word and naturalness as a text in filling a blank ."], "relation": "used for", "id": "2021.inlg-1.6", "year": 2021, "rel_sent": "We propose methods for fine - tuning pre - trained masked language models ( BERT ) for context - aware antonym prediction .", "forward": true, "src_ids": "2021.inlg-1.6_1034"}
{"input": "convolutional neural networks is used for Task| context: hope is an essential aspect of mental health stability and recovery in every individual in this fast - changing world . any tools and methods developed for detection , analysis , and generation of hope speech will be beneficial .", "entity": "convolutional neural networks", "output": "hope - speech detection", "neg_sample": ["convolutional neural networks is used for Task", "hope is an essential aspect of mental health stability and recovery in every individual in this fast - changing world .", "any tools and methods developed for detection , analysis , and generation of hope speech will be beneficial ."], "relation": "used for", "id": "2021.ltedi-1.11", "year": 2021, "rel_sent": "EDIOne@LT - EDI - EACL2021 : Pre - trained Transformers with Convolutional Neural Networks for Hope Speech Detection ..", "forward": true, "src_ids": "2021.ltedi-1.11_10131"}
{"input": "language applications is done by using Method| context: beam search is the go - to method for decoding auto - regressive machine translation models . while it yields consistent improvements in terms of bleu , it is only concerned with finding outputs with high model likelihood , and is thus agnostic to whatever end metric or score practitioners care about .", "entity": "language applications", "output": "monte - carlo tree search", "neg_sample": ["language applications is done by using Method", "beam search is the go - to method for decoding auto - regressive machine translation models .", "while it yields consistent improvements in terms of bleu , it is only concerned with finding outputs with high model likelihood , and is thus agnostic to whatever end metric or score practitioners care about ."], "relation": "used for", "id": "2021.emnlp-main.662", "year": 2021, "rel_sent": "We provide a blueprint for how to use MCTS fruitfully in language applications , which opens promising future directions .", "forward": false, "src_ids": "2021.emnlp-main.662_9931"}
{"input": "raw data is used for Task| context: however , there exists a discrepancy on low - frequency words between the distilled and the original data , leading to more errors on predicting low - frequency words .", "entity": "raw data", "output": "non - autoregressive translation", "neg_sample": ["raw data is used for Task", "however , there exists a discrepancy on low - frequency words between the distilled and the original data , leading to more errors on predicting low - frequency words ."], "relation": "used for", "id": "2021.acl-long.266", "year": 2021, "rel_sent": "To alleviate the problem , we directly expose the raw data into NAT by leveraging pretraining .", "forward": true, "src_ids": "2021.acl-long.266_3939"}
{"input": "domain knowledge is done by using Method| context: building a machine learning model in a sophisticated domain is a time - consuming process , partially due to the steep learning curve of domain knowledge for data scientists .", "entity": "domain knowledge", "output": "ziva", "neg_sample": ["domain knowledge is done by using Method", "building a machine learning model in a sophisticated domain is a time - consuming process , partially due to the steep learning curve of domain knowledge for data scientists ."], "relation": "used for", "id": "2021.dash-1.7", "year": 2021, "rel_sent": "We introduce Ziva , an interface for supporting domain knowledge from domain experts to data scientists in two ways : ( 1 ) a concept creation interface where domain experts extract important concept of the domain and ( 2 ) five kinds of justification elicitation interfaces that solicit elicitation how the domain concept are expressed in data instances .", "forward": false, "src_ids": "2021.dash-1.7_15947"}
{"input": "domain classification is done by using Method| context: domain classification is the fundamental task in natural language understanding ( nlu ) , which often requires fast accommodation to new emerging domains . this constraint makes it impossible to retrain all previous domains , even if they are accessible to the new model . most existing continual learning approaches suffer from low accuracy and performance fluctuation , especially when the distributions of old and new data are significantly different . in fact , the key real - world problem is not the absence of old data , but the inefficiency to retrain the model with the whole old dataset . is it potential to utilize some old data to yield high accuracy and maintain stable performance , while at the same time , without introducing extra hyperparameters ?", "entity": "domain classification", "output": "hyperparameter - free continuous learning", "neg_sample": ["domain classification is done by using Method", "domain classification is the fundamental task in natural language understanding ( nlu ) , which often requires fast accommodation to new emerging domains .", "this constraint makes it impossible to retrain all previous domains , even if they are accessible to the new model .", "most existing continual learning approaches suffer from low accuracy and performance fluctuation , especially when the distributions of old and new data are significantly different .", "in fact , the key real - world problem is not the absence of old data , but the inefficiency to retrain the model with the whole old dataset .", "is it potential to utilize some old data to yield high accuracy and maintain stable performance , while at the same time , without introducing extra hyperparameters ?"], "relation": "used for", "id": "2021.naacl-main.212", "year": 2021, "rel_sent": "Hyperparameter - free Continuous Learning for Domain Classification in Natural Language Understanding.", "forward": false, "src_ids": "2021.naacl-main.212_14065"}
{"input": "multi - lingual pretraining techniques is used for Task| context: semi - supervised learning through deep generative models and multi - lingual pretraining techniques have orchestrated tremendous success across different areas of nlp . nonetheless , their development has happened in isolation , while the combination of both could potentially be effective for tackling task - specific labelled data shortage .", "entity": "multi - lingual pretraining techniques", "output": "semi - supervised document classification", "neg_sample": ["multi - lingual pretraining techniques is used for Task", "semi - supervised learning through deep generative models and multi - lingual pretraining techniques have orchestrated tremendous success across different areas of nlp .", "nonetheless , their development has happened in isolation , while the combination of both could potentially be effective for tackling task - specific labelled data shortage ."], "relation": "used for", "id": "2021.eacl-main.76", "year": 2021, "rel_sent": "Combining Deep Generative Models and Multi - lingual Pretraining for Semi - supervised Document Classification.", "forward": true, "src_ids": "2021.eacl-main.76_6433"}
{"input": "neural machine translation is done by using Task| context: the choice of token vocabulary affects the performance of machine translation .", "entity": "neural machine translation", "output": "vocabulary learning", "neg_sample": ["neural machine translation is done by using Task", "the choice of token vocabulary affects the performance of machine translation ."], "relation": "used for", "id": "2021.acl-long.571", "year": 2021, "rel_sent": "Vocabulary Learning via Optimal Transport for Neural Machine Translation.", "forward": false, "src_ids": "2021.acl-long.571_10459"}
{"input": "few - shot learners is used for Task| context: humans can learn a new language task efficiently with only few examples , by leveraging their knowledge obtained when learning prior tasks .", "entity": "few - shot learners", "output": "diverse nlp tasks", "neg_sample": ["few - shot learners is used for Task", "humans can learn a new language task efficiently with only few examples , by leveraging their knowledge obtained when learning prior tasks ."], "relation": "used for", "id": "2021.emnlp-main.572", "year": 2021, "rel_sent": "In this paper , we explore whether and how such cross - task generalization ability can be acquired , and further applied to build better few - shot learners across diverse NLP tasks .", "forward": true, "src_ids": "2021.emnlp-main.572_14447"}
{"input": "safe conversational agents is done by using Method| context: conversational agents trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein , which include offensive or otherwise toxic behavior .", "entity": "safe conversational agents", "output": "bot - adversarial dialogue", "neg_sample": ["safe conversational agents is done by using Method", "conversational agents trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein , which include offensive or otherwise toxic behavior ."], "relation": "used for", "id": "2021.naacl-main.235", "year": 2021, "rel_sent": "Bot - Adversarial Dialogue for Safe Conversational Agents.", "forward": false, "src_ids": "2021.naacl-main.235_960"}
{"input": "lemma constraints is done by using Method| context: current approaches to incorporating terminology constraints in machine translation ( mt ) typically assume that the constraint terms are provided in their correct morphological forms . this limits their application to real - world scenarios where constraint terms are provided as lemmas .", "entity": "lemma constraints", "output": "cross - lingual inflection module", "neg_sample": ["lemma constraints is done by using Method", "current approaches to incorporating terminology constraints in machine translation ( mt ) typically assume that the constraint terms are provided in their correct morphological forms .", "this limits their application to real - world scenarios where constraint terms are provided as lemmas ."], "relation": "used for", "id": "2021.emnlp-main.477", "year": 2021, "rel_sent": "It is based on a novel cross - lingual inflection module that inflects the target lemma constraints based on the source context .", "forward": false, "src_ids": "2021.emnlp-main.477_2716"}
{"input": "adversarial training is used for OtherScientificTerm| context: combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information . it has been shown that even models using embeddings from transformers still benefit from the inclusion of standard word embeddings . however , the combination of embeddings of different types and dimensions is challenging .", "entity": "adversarial training", "output": "mappings of differently - sized embeddings", "neg_sample": ["adversarial training is used for OtherScientificTerm", "combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information .", "it has been shown that even models using embeddings from transformers still benefit from the inclusion of standard word embeddings .", "however , the combination of embeddings of different types and dimensions is challenging ."], "relation": "used for", "id": "2021.emnlp-main.660", "year": 2021, "rel_sent": "In addition , FAME uses adversarial training to optimize the mappings of differently - sized embeddings to the same space .", "forward": true, "src_ids": "2021.emnlp-main.660_15397"}
{"input": "neural - based model is used for OtherScientificTerm| context: sequence - to - sequence based models have recently shown promising results in generating high - quality questions . however , these models are also known to have main drawbacks such as lack of diversity and bad sentence structures .", "entity": "neural - based model", "output": "diverse expressions of questions", "neg_sample": ["neural - based model is used for OtherScientificTerm", "sequence - to - sequence based models have recently shown promising results in generating high - quality questions .", "however , these models are also known to have main drawbacks such as lack of diversity and bad sentence structures ."], "relation": "used for", "id": "2021.eacl-main.279", "year": 2021, "rel_sent": "In this paper , we focus on question generation over SQL database and propose a novel framework by expanding , retrieving , and infilling that first incorporates flexible templates with a neural - based model to generate diverse expressions of questions with sentence structure guidance .", "forward": true, "src_ids": "2021.eacl-main.279_7333"}
{"input": "neural model is used for OtherScientificTerm| context: atomic clauses are fundamental text units for understanding complex sentences . identifying the atomic sentences within complex sentences is important for applications such as summarization , argument mining , discourse analysis , discourse parsing , and question answering . previous work mainly relies on rule - based methods dependent on parsing .", "entity": "neural model", "output": "graph", "neg_sample": ["neural model is used for OtherScientificTerm", "atomic clauses are fundamental text units for understanding complex sentences .", "identifying the atomic sentences within complex sentences is important for applications such as summarization , argument mining , discourse analysis , discourse parsing , and question answering .", "previous work mainly relies on rule - based methods dependent on parsing ."], "relation": "used for", "id": "2021.acl-long.303", "year": 2021, "rel_sent": "Our neural model learns to Accept , Break , Copy or Drop elements of a graph that combines word adjacency and grammatical dependencies .", "forward": true, "src_ids": "2021.acl-long.303_2684"}
{"input": "language learners is done by using Task| context: it is a task where given a text and a span , a system generates , for the span , an explanatory note that helps the writer ( language learner ) improve their writing skills .", "entity": "language learners", "output": "feedback comment generation", "neg_sample": ["language learners is done by using Task", "it is a task where given a text and a span , a system generates , for the span , an explanatory note that helps the writer ( language learner ) improve their writing skills ."], "relation": "used for", "id": "2021.inlg-1.35", "year": 2021, "rel_sent": "Shared Task on Feedback Comment Generation for Language Learners.", "forward": false, "src_ids": "2021.inlg-1.35_9434"}
{"input": "reasons annotation is used for OtherScientificTerm| context: unlike the previous causality detection task , we do not assign target events in the text , but only provide structural event descriptions , and such settings accord more with practice scenarios .", "entity": "reasons annotation", "output": "financial events", "neg_sample": ["reasons annotation is used for OtherScientificTerm", "unlike the previous causality detection task , we do not assign target events in the text , but only provide structural event descriptions , and such settings accord more with practice scenarios ."], "relation": "used for", "id": "2021.eacl-main.175", "year": 2021, "rel_sent": "Moreover , we annotate a large dataset FinReason for evaluation , which provides Reasons annotation for Financial events in company announcements .", "forward": true, "src_ids": "2021.eacl-main.175_638"}
{"input": "few - shot transfer is done by using OtherScientificTerm| context: models pretrained with self - supervised objectives on large text corpora achieve state - of - the - art performance on english text summarization tasks . however , these models are typically fine - tuned on hundreds of thousands of data points , an infeasible requirement when applying summarization to new , niche domains .", "entity": "few - shot transfer", "output": "regularization term", "neg_sample": ["few - shot transfer is done by using OtherScientificTerm", "models pretrained with self - supervised objectives on large text corpora achieve state - of - the - art performance on english text summarization tasks .", "however , these models are typically fine - tuned on hundreds of thousands of data points , an infeasible requirement when applying summarization to new , niche domains ."], "relation": "used for", "id": "2021.naacl-main.57", "year": 2021, "rel_sent": "Tofurther boost performance , we employ data augmentation via round - trip translation as well as introduce a regularization term for improved few - shot transfer .", "forward": false, "src_ids": "2021.naacl-main.57_6537"}
{"input": "information retrieval is done by using OtherScientificTerm| context: with the growing availability of full - text articles , integrating abstracts and full texts of documents into a unified representation is essential for comprehensive search of scientific literature . however , previous studies have shown that naively merging abstracts with full texts of articles does not consistently yield better performance . balancing the contribution of query terms appearing in the abstract and in sections of different importance in full text articles remains a challenge both with traditional bag - of - words ir approaches and for neural retrieval methods .", "entity": "information retrieval", "output": "full text sections", "neg_sample": ["information retrieval is done by using OtherScientificTerm", "with the growing availability of full - text articles , integrating abstracts and full texts of documents into a unified representation is essential for comprehensive search of scientific literature .", "however , previous studies have shown that naively merging abstracts with full texts of articles does not consistently yield better performance .", "balancing the contribution of query terms appearing in the abstract and in sections of different importance in full text articles remains a challenge both with traditional bag - of - words ir approaches and for neural retrieval methods ."], "relation": "used for", "id": "2021.bionlp-1.27", "year": 2021, "rel_sent": "Measuring the relative importance of full text sections for information retrieval from scientific literature ..", "forward": false, "src_ids": "2021.bionlp-1.27_52"}
{"input": "topic attention module is used for Method| context: topic models have been widely used to learn text representations and gain insight into document corpora . however , leveraging topic - word distribution for learning better features during document encoding has not been explored much .", "entity": "topic attention module", "output": "variational inference", "neg_sample": ["topic attention module is used for Method", "topic models have been widely used to learn text representations and gain insight into document corpora .", "however , leveraging topic - word distribution for learning better features during document encoding has not been explored much ."], "relation": "used for", "id": "2021.acl-long.299", "year": 2021, "rel_sent": "The output of topic attention module is then used to carry out variational inference .", "forward": true, "src_ids": "2021.acl-long.299_9395"}
{"input": "explainable medical record based diagnosis is done by using Method| context: providing a reliable explanation for clinical diagnosis based on the electronic medical record ( emr ) is fundamental to the application of artificial intelligence in the medical field . current methods mostly treat the emr as a text sequence and provide explanations based on a precise medical knowledge base , which is disease - specific and difficult to obtain for experts in reality .", "entity": "explainable medical record based diagnosis", "output": "counterfactual supporting facts extraction", "neg_sample": ["explainable medical record based diagnosis is done by using Method", "providing a reliable explanation for clinical diagnosis based on the electronic medical record ( emr ) is fundamental to the application of artificial intelligence in the medical field .", "current methods mostly treat the emr as a text sequence and provide explanations based on a precise medical knowledge base , which is disease - specific and difficult to obtain for experts in reality ."], "relation": "used for", "id": "2021.naacl-main.156", "year": 2021, "rel_sent": "Counterfactual Supporting Facts Extraction for Explainable Medical Record Based Diagnosis with Graph Network.", "forward": false, "src_ids": "2021.naacl-main.156_14801"}
{"input": "conditioned dialogue generation is done by using Method| context: conditioned dialogue generation suffers from the scarcity of labeled responses . in this work , we exploit labeled non - dialogue text data related to the condition , which are much easier to collect .", "entity": "conditioned dialogue generation", "output": "multi - task learning approach", "neg_sample": ["conditioned dialogue generation is done by using Method", "conditioned dialogue generation suffers from the scarcity of labeled responses .", "in this work , we exploit labeled non - dialogue text data related to the condition , which are much easier to collect ."], "relation": "used for", "id": "2021.naacl-main.392", "year": 2021, "rel_sent": "A Simple and Efficient Multi - Task Learning Approach for Conditioned Dialogue Generation.", "forward": false, "src_ids": "2021.naacl-main.392_5985"}
{"input": "bert re - ranker is done by using Method| context: passage retrieval and ranking is a key task in open - domain question answering and information retrieval . current effective approaches mostly rely on pre - trained deep language model - based retrievers and rankers . these methods have been shown to effectively model the semantic matching between queries and passages , also in presence of keyword mismatch , i.e. passages that are relevant to a query but do not contain important query keywords .", "entity": "bert re - ranker", "output": "typos - aware training framework", "neg_sample": ["bert re - ranker is done by using Method", "passage retrieval and ranking is a key task in open - domain question answering and information retrieval .", "current effective approaches mostly rely on pre - trained deep language model - based retrievers and rankers .", "these methods have been shown to effectively model the semantic matching between queries and passages , also in presence of keyword mismatch , i.e.", "passages that are relevant to a query but do not contain important query keywords ."], "relation": "used for", "id": "2021.emnlp-main.225", "year": 2021, "rel_sent": "Our experimental results on the MS MARCO passage ranking dataset show that , with our proposed typos - aware training , DR and BERT re - ranker can become robust to typos in queries , resulting in significantly improved effectiveness compared to models trained without appropriately accounting for typos .", "forward": false, "src_ids": "2021.emnlp-main.225_14303"}
{"input": "argument mining is used for Task| context: this survey builds an interdisciplinary picture of argument mining ( am ) , with a strong focus on its potential to address issues related to social and political science . more specifically , we focus on am challenges related to its applications to social media and in the multilingual domain , and then proceed to the widely debated notion of argument quality .", "entity": "argument mining", "output": "social good", "neg_sample": ["argument mining is used for Task", "this survey builds an interdisciplinary picture of argument mining ( am ) , with a strong focus on its potential to address issues related to social and political science .", "more specifically , we focus on am challenges related to its applications to social media and in the multilingual domain , and then proceed to the widely debated notion of argument quality ."], "relation": "used for", "id": "2021.acl-long.107", "year": 2021, "rel_sent": "We finally define an application of AM for Social Good : ( semi-)automatic moderation , a highly integrative application which ( a ) represents a challenging testbed for the integrated notion of quality we advocate , ( b ) allows the empirical quantification of argument / deliberative quality to benefit from the developments in other NLP fields ( i.e.", "forward": true, "src_ids": "2021.acl-long.107_2275"}
{"input": "cross - lingual transferring ability is done by using OtherScientificTerm| context: udify is the state - of - the - art language - agnostic dependency parser which is trained on a polyglot corpus of 75 languages . this multilingual modeling enables the model to generalize over unknown / lesser - known languages , thus leading to improved performance on low - resource languages .", "entity": "cross - lingual transferring ability", "output": "linguistic typology knowledge", "neg_sample": ["cross - lingual transferring ability is done by using OtherScientificTerm", "udify is the state - of - the - art language - agnostic dependency parser which is trained on a polyglot corpus of 75 languages .", "this multilingual modeling enables the model to generalize over unknown / lesser - known languages , thus leading to improved performance on low - resource languages ."], "relation": "used for", "id": "2021.sigtyp-1.5", "year": 2021, "rel_sent": "In this work we used linguistic typology knowledge available in URIEL database , to improve the cross - lingual transferring ability of UDify even further .", "forward": false, "src_ids": "2021.sigtyp-1.5_15667"}
{"input": "learning word usage is done by using Task| context: we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word .", "entity": "learning word usage", "output": "native language support", "neg_sample": ["learning word usage is done by using Task", "we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word ."], "relation": "used for", "id": "2021.rocling-1.39", "year": 2021, "rel_sent": "In our approach , bilingual parallel corpus is transformed into bilingual GP pairs aimed at providing native language support for learning word usage through GPs .", "forward": false, "src_ids": "2021.rocling-1.39_2677"}
{"input": "time series data is done by using Task| context: in this paper , we explore the task of automatically generating natural language descriptions of salient patterns in a time series , such as stock prices of a company over a week . a model for this task should be able to extract high - level patterns such as presence of a peak or a dip . while typical contemporary neural models with attention mechanisms can generate fluent output descriptions for this task , they often generate factually incorrect descriptions .", "entity": "time series data", "output": "truth - conditional captions", "neg_sample": ["time series data is done by using Task", "in this paper , we explore the task of automatically generating natural language descriptions of salient patterns in a time series , such as stock prices of a company over a week .", "a model for this task should be able to extract high - level patterns such as presence of a peak or a dip .", "while typical contemporary neural models with attention mechanisms can generate fluent output descriptions for this task , they often generate factually incorrect descriptions ."], "relation": "used for", "id": "2021.emnlp-main.55", "year": 2021, "rel_sent": "Truth - Conditional Captions for Time Series Data.", "forward": false, "src_ids": "2021.emnlp-main.55_8619"}
{"input": "morpho - phonological systems is done by using OtherScientificTerm| context: neural models for the various flavours of morphological reinflection tasks have proven to be extremely accurate given ample labeled data , yet labeled data may be slow and costly to obtain .", "entity": "morpho - phonological systems", "output": "orthographic and semantic regularities", "neg_sample": ["morpho - phonological systems is done by using OtherScientificTerm", "neural models for the various flavours of morphological reinflection tasks have proven to be extremely accurate given ample labeled data , yet labeled data may be slow and costly to obtain ."], "relation": "used for", "id": "2021.emnlp-main.159", "year": 2021, "rel_sent": "Combined orthographic and semantic regularities alleviate difficulties with particularly complex morpho - phonological systems .", "forward": false, "src_ids": "2021.emnlp-main.159_1147"}
{"input": "open relation extraction ( openre ) is done by using Method| context: the clustering - based unsupervised relation discovery method has gradually become one of the important methods of open relation extraction ( openre ) . however , high - dimensional vectors can encode complex linguistic information which leads to the problem that the derived clusters can not explicitly align with the relational semantic classes .", "entity": "open relation extraction ( openre )", "output": "relation - oriented clustering method", "neg_sample": ["open relation extraction ( openre ) is done by using Method", "the clustering - based unsupervised relation discovery method has gradually become one of the important methods of open relation extraction ( openre ) .", "however , high - dimensional vectors can encode complex linguistic information which leads to the problem that the derived clusters can not explicitly align with the relational semantic classes ."], "relation": "used for", "id": "2021.emnlp-main.765", "year": 2021, "rel_sent": "A Relation - Oriented Clustering Method for Open Relation Extraction.", "forward": false, "src_ids": "2021.emnlp-main.765_9579"}
{"input": "post - training procedure is used for Method| context: as high - quality labeled data is scarce , unsupervised sentence representation learning has attracted much attention .", "entity": "post - training procedure", "output": "supervised methods", "neg_sample": ["post - training procedure is used for Method", "as high - quality labeled data is scarce , unsupervised sentence representation learning has attracted much attention ."], "relation": "used for", "id": "2021.acl-long.402", "year": 2021, "rel_sent": "It can be adopted as a post - training procedure to boost the performance of the supervised methods .", "forward": true, "src_ids": "2021.acl-long.402_7178"}
{"input": "helpfulness ranking is done by using OtherScientificTerm| context: online reviews are an essential aspect of online shopping for both customers and retailers . however , many reviews found on the internet lack in quality , informativeness or helpfulness . in many cases , they lead the customers towards positive or negative opinions without providing any concrete details ( e.g. , very poor product , i would not recommend it ) .", "entity": "helpfulness ranking", "output": "rankings", "neg_sample": ["helpfulness ranking is done by using OtherScientificTerm", "online reviews are an essential aspect of online shopping for both customers and retailers .", "however , many reviews found on the internet lack in quality , informativeness or helpfulness .", "in many cases , they lead the customers towards positive or negative opinions without providing any concrete details ( e.g.", ", very poor product , i would not recommend it ) ."], "relation": "used for", "id": "2021.ranlp-1.109", "year": 2021, "rel_sent": "We perform three rankings ( one for each feature above ) , which are then combined to obtain a final helpfulness ranking .", "forward": false, "src_ids": "2021.ranlp-1.109_14653"}
{"input": "discourse information is done by using OtherScientificTerm| context: existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks .", "entity": "discourse information", "output": "layers", "neg_sample": ["discourse information is done by using OtherScientificTerm", "existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks ."], "relation": "used for", "id": "2021.naacl-main.301", "year": 2021, "rel_sent": "Across the different models , there are substantial differences in which layers best capture discourse information , and large disparities between models .", "forward": false, "src_ids": "2021.naacl-main.301_15800"}
{"input": "event detection systems is used for OtherScientificTerm| context: evaluating the state - of - the - art event detection systems on determining spatio - temporal distribution of the events on the ground is performed unfrequently . but , the ability to both ( 1 ) extract events ' in the wild ' from text and ( 2 ) properly evaluate event detection systems has potential to support a wide variety of tasks such as monitoring the activity of socio - political movements , examining media coverage and public support of these movements , and informing policy decisions . the murder of george floyd , an unarmed black man , at the hands of police officers received global attention throughout the second half of 2020 . protests against police violence emerged worldwide and the blm movement , which was once mostly regulated to the united states , was now seeing activity globally .", "entity": "event detection systems", "output": "black lives matter ( blm ) events", "neg_sample": ["event detection systems is used for OtherScientificTerm", "evaluating the state - of - the - art event detection systems on determining spatio - temporal distribution of the events on the ground is performed unfrequently .", "but , the ability to both ( 1 ) extract events ' in the wild ' from text and ( 2 ) properly evaluate event detection systems has potential to support a wide variety of tasks such as monitoring the activity of socio - political movements , examining media coverage and public support of these movements , and informing policy decisions .", "the murder of george floyd , an unarmed black man , at the hands of police officers received global attention throughout the second half of 2020 . protests against police violence emerged worldwide and the blm movement , which was once mostly regulated to the united states , was now seeing activity globally ."], "relation": "used for", "id": "2021.case-1.27", "year": 2021, "rel_sent": "Therefore , we study performance of the best event detection systems on detecting Black Lives Matter ( BLM ) events from tweets and news articles .", "forward": true, "src_ids": "2021.case-1.27_2842"}
{"input": "inference scenarios is done by using Method| context: despite transformers ' impressive accuracy , their computational cost is often prohibitive to use with limited computational resources . most previous approaches to improve inference efficiency require a separate model for each possible computational budget .", "entity": "inference scenarios", "output": "length - adaptive transformer", "neg_sample": ["inference scenarios is done by using Method", "despite transformers ' impressive accuracy , their computational cost is often prohibitive to use with limited computational resources .", "most previous approaches to improve inference efficiency require a separate model for each possible computational budget ."], "relation": "used for", "id": "2021.acl-long.508", "year": 2021, "rel_sent": "In this paper , we extend PoWER - BERT ( Goyal et al . , 2020 ) and propose Length - Adaptive Transformer that can be used for various inference scenarios after one - shot training .", "forward": false, "src_ids": "2021.acl-long.508_14484"}
{"input": "bart - large is used for Method| context: pre - trained text - to - text transformers such as bart have achieved impressive performance across a range of nlp tasks . recent study further shows that they can learn to generalize to novel tasks , by including task descriptions as part of the source sequence and training the model with ( source , target ) examples . at test time , these fine - tuned models can make inferences on new tasks using the new task descriptions as part of the input . however , this approach has potential limitations , as the model learns to solve individual ( source , target ) examples ( i.e. , at the instance level ) , instead of learning to solve tasks by taking all examples within a task as a whole ( i.e. , at the task level ) .", "entity": "bart - large", "output": "hypter", "neg_sample": ["bart - large is used for Method", "pre - trained text - to - text transformers such as bart have achieved impressive performance across a range of nlp tasks .", "recent study further shows that they can learn to generalize to novel tasks , by including task descriptions as part of the source sequence and training the model with ( source , target ) examples .", "at test time , these fine - tuned models can make inferences on new tasks using the new task descriptions as part of the input .", "however , this approach has potential limitations , as the model learns to solve individual ( source , target ) examples ( i.e.", ", at the instance level ) , instead of learning to solve tasks by taking all examples within a task as a whole ( i.e.", ", at the task level ) ."], "relation": "used for", "id": "2021.acl-short.82", "year": 2021, "rel_sent": "Notably , when using BART - Large as the main network , Hypter brings 11.3 % comparative improvement on ZEST dataset .", "forward": true, "src_ids": "2021.acl-short.82_1739"}
{"input": "clutter scene grounding is done by using Material| context: to effectively apply robots in working environments and assist humans , it is essential to develop and evaluate how visual grounding ( vg ) can affect machine performance on occluded objects . however , current vg works are limited in working environments , such as offices and warehouses , where objects are usually occluded due to space utilization issues .", "entity": "clutter scene grounding", "output": "3d robotic dataset", "neg_sample": ["clutter scene grounding is done by using Material", "to effectively apply robots in working environments and assist humans , it is essential to develop and evaluate how visual grounding ( vg ) can affect machine performance on occluded objects .", "however , current vg works are limited in working environments , such as offices and warehouses , where objects are usually occluded due to space utilization issues ."], "relation": "used for", "id": "2021.naacl-main.419", "year": 2021, "rel_sent": "OCID - Ref : A 3D Robotic Dataset With Embodied Language For Clutter Scene Grounding.", "forward": false, "src_ids": "2021.naacl-main.419_1772"}
{"input": "annotated dataset is used for Material| context: this corpus covers the period from the 15th to the 21st century and is annotated with pos and morphosyntactic tags as well as century and region information .", "entity": "annotated dataset", "output": "low saxon", "neg_sample": ["annotated dataset is used for Material", "this corpus covers the period from the 15th to the 21st century and is annotated with pos and morphosyntactic tags as well as century and region information ."], "relation": "used for", "id": "2021.konvens-1.25", "year": 2021, "rel_sent": "We describe a new annotated dataset for Low Saxon with the intention to complement existing corpora .", "forward": true, "src_ids": "2021.konvens-1.25_8936"}
{"input": "temporal knowledge graph forecasting is done by using Method| context: temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years . most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts . compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ? ( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?", "entity": "temporal knowledge graph forecasting", "output": "reinforcement learning", "neg_sample": ["temporal knowledge graph forecasting is done by using Method", "temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years .", "most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts .", "compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ?", "( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?"], "relation": "used for", "id": "2021.emnlp-main.655", "year": 2021, "rel_sent": "TimeTraveler : Reinforcement Learning for Temporal Knowledge Graph Forecasting.", "forward": false, "src_ids": "2021.emnlp-main.655_2065"}
{"input": "supervised classification methods is used for Task| context: we explore boccaccio 's decameron to see how digital humanities tools can be used for tasks that have limited data in a language no longer in contemporary use : medieval italian .", "entity": "supervised classification methods", "output": "storytellers", "neg_sample": ["supervised classification methods is used for Task", "we explore boccaccio 's decameron to see how digital humanities tools can be used for tasks that have limited data in a language no longer in contemporary use : medieval italian ."], "relation": "used for", "id": "2021.latechclfl-1.17", "year": 2021, "rel_sent": "We use supervised classification methods to predict storytellers based on the stories they tell , confirming the difficulty of the task , and demonstrate that topic modeling can extract thematic storyteller ' profiles . '", "forward": true, "src_ids": "2021.latechclfl-1.17_1560"}
{"input": "dataaugmentation rules is used for Metric| context: coreference resolution is an important compo - nent in analyzing narrative text from admin - istrative data ( e.g. , clinical or police sources).however , existing coreference models trainedon general language corpora suffer from poortransferability due to domain gaps , especiallywhen they are applied to gender - inclusive datawith lesbian , gay , bisexual , and transgender(lgbt ) individuals . in this paper , we an - alyzed the challenges of coreference resolu - tion in an exemplary form of administrativetext written in english : violent death nar - ratives from the usa 's centers for diseasecontrol 's ( cdc ) national violent death re - porting system .", "entity": "dataaugmentation rules", "output": "model perfor - mance", "neg_sample": ["dataaugmentation rules is used for Metric", "coreference resolution is an important compo - nent in analyzing narrative text from admin - istrative data ( e.g.", ", clinical or police sources).however , existing coreference models trainedon general language corpora suffer from poortransferability due to domain gaps , especiallywhen they are applied to gender - inclusive datawith lesbian , gay , bisexual , and transgender(lgbt ) individuals .", "in this paper , we an - alyzed the challenges of coreference resolu - tion in an exemplary form of administrativetext written in english : violent death nar - ratives from the usa 's centers for diseasecontrol 's ( cdc ) national violent death re - porting system ."], "relation": "used for", "id": "2021.naacl-main.361", "year": 2021, "rel_sent": "We developed a set of dataaugmentation rules to improve model perfor - mance using a probabilistic data programmingframework .", "forward": true, "src_ids": "2021.naacl-main.361_2006"}
{"input": "hindi - english is used for Task| context: while recent benchmarks have spurred a lot of new work on improving the generalization of pretrained multilingual language models on multilingual tasks , techniques to improve code - switched natural language understanding tasks have been far less explored .", "entity": "hindi - english", "output": "natural language inference", "neg_sample": ["hindi - english is used for Task", "while recent benchmarks have spurred a lot of new work on improving the generalization of pretrained multilingual language models on multilingual tasks , techniques to improve code - switched natural language understanding tasks have been far less explored ."], "relation": "used for", "id": "2021.mrl-1.16", "year": 2021, "rel_sent": "We show consistent performance gains on four different code - switched language - pairs ( Hindi - English , Spanish - English , Tamil - English and Malayalam - English ) for SA and on Hindi - English for NLI and QA .", "forward": true, "src_ids": "2021.mrl-1.16_3034"}
{"input": "tf - idf weighting is used for OtherScientificTerm| context: financial documents , such as corporate annual reports , are usually very long and may consist of more than 100 pages . every report is divided into thematic sections or statements that have an inner structure and include special financial terms and numbers .", "entity": "tf - idf weighting", "output": "multi - word terms", "neg_sample": ["tf - idf weighting is used for OtherScientificTerm", "financial documents , such as corporate annual reports , are usually very long and may consist of more than 100 pages .", "every report is divided into thematic sections or statements that have an inner structure and include special financial terms and numbers ."], "relation": "used for", "id": "2021.fnp-1.14", "year": 2021, "rel_sent": "Summarization of financial documents with TF - IDF weighting of multi - word terms.", "forward": true, "src_ids": "2021.fnp-1.14_11998"}
{"input": "unsupervised topic modelling approach is used for OtherScientificTerm| context: organisations disclose their privacy practices by posting privacy policies on their websites . even though internet users often care about their digital privacy , they usually do not read privacy policies , since understanding them requires a significant investment of time and effort . natural language processing has been used to create experimental tools to interpret privacy policies , but there has been a lack of large privacy policy corpora tofacilitate the creation of large - scale semi - supervised and unsupervised models to interpret and simplify privacy policies . the number of unique websites represented in privaseer is about ten times larger than the next largest public collection of web privacy policies , and it surpasses the aggregate of unique websites represented in all other publicly available privacy policy corpora combined .", "entity": "unsupervised topic modelling approach", "output": "policy documents", "neg_sample": ["unsupervised topic modelling approach is used for OtherScientificTerm", "organisations disclose their privacy practices by posting privacy policies on their websites .", "even though internet users often care about their digital privacy , they usually do not read privacy policies , since understanding them requires a significant investment of time and effort .", "natural language processing has been used to create experimental tools to interpret privacy policies , but there has been a lack of large privacy policy corpora tofacilitate the creation of large - scale semi - supervised and unsupervised models to interpret and simplify privacy policies .", "the number of unique websites represented in privaseer is about ten times larger than the next largest public collection of web privacy policies , and it surpasses the aggregate of unique websites represented in all other publicly available privacy policy corpora combined ."], "relation": "used for", "id": "2021.acl-long.532", "year": 2021, "rel_sent": "We employ an unsupervised topic modelling approach to investigate the contents of policy documents in the corpus and discuss the distribution of topics in privacy policies at web scale .", "forward": true, "src_ids": "2021.acl-long.532_15214"}
{"input": "earnings announcements is done by using Task| context: in the midst of a global pandemic , understanding the public 's opinion of their government 's policy - level , non - pharmaceutical interventions ( npis ) is a crucial component of the health - policy - making process . prior work on covid-19 npi sentiment analysis by the epidemiological community has proceeded without a method for properly attributing sentiment changes to events , an ability to distinguish the influence of various events across time , a coherent model for predicting the public 's opinion of future events of the same sort , nor even a means of conducting significance tests .", "entity": "earnings announcements", "output": "event studies", "neg_sample": ["earnings announcements is done by using Task", "in the midst of a global pandemic , understanding the public 's opinion of their government 's policy - level , non - pharmaceutical interventions ( npis ) is a crucial component of the health - policy - making process .", "prior work on covid-19 npi sentiment analysis by the epidemiological community has proceeded without a method for properly attributing sentiment changes to events , an ability to distinguish the influence of various events across time , a coherent model for predicting the public 's opinion of future events of the same sort , nor even a means of conducting significance tests ."], "relation": "used for", "id": "2021.smm4h-1.1", "year": 2021, "rel_sent": "In the financial sector , event studies of the fluctuations in a publicly traded company 's stock price are commonplace for determining the effects of earnings announcements , product placements , etc .", "forward": false, "src_ids": "2021.smm4h-1.1_2022"}
{"input": "multi - level emotion cause analysis is used for Task| context: ' emotion cause analysis ( eca ) aims to identify the potential causes behind certain emotions intext . lots of eca models have been designed to extract the emotion cause at the clause level . however in many scenarios only extracting the cause clause is ambiguous .", "entity": "multi - level emotion cause analysis", "output": "identifying emotion cause clause ( ecc )", "neg_sample": ["multi - level emotion cause analysis is used for Task", "' emotion cause analysis ( eca ) aims to identify the potential causes behind certain emotions intext .", "lots of eca models have been designed to extract the emotion cause at the clause level .", "however in many scenarios only extracting the cause clause is ambiguous ."], "relation": "used for", "id": "2021.ccl-1.83", "year": 2021, "rel_sent": "To ease the problemin this paper we introduce multi - level emotion cause analysis which focuses on identifying emotion cause clause ( ECC ) and emotion cause keywords ( ECK ) simultaneously .", "forward": true, "src_ids": "2021.ccl-1.83_3411"}
{"input": "downstream cws tasks is done by using Method| context: recent researches show that pre - trained models ( ptms ) are beneficial to chinese word segmentation ( cws ) . however , ptms used in previous works usually adopt language modeling as pre - training tasks , lacking task - specific prior segmentation knowledge and ignoring the discrepancy between pre - training tasks and downstream cws tasks .", "entity": "downstream cws tasks", "output": "metaseg", "neg_sample": ["downstream cws tasks is done by using Method", "recent researches show that pre - trained models ( ptms ) are beneficial to chinese word segmentation ( cws ) .", "however , ptms used in previous works usually adopt language modeling as pre - training tasks , lacking task - specific prior segmentation knowledge and ignoring the discrepancy between pre - training tasks and downstream cws tasks ."], "relation": "used for", "id": "2021.naacl-main.436", "year": 2021, "rel_sent": "Empirical results show that MetaSeg could utilize common prior segmentation knowledge from different existing criteria and alleviate the discrepancy between pre - trained models and downstream CWS tasks .", "forward": false, "src_ids": "2021.naacl-main.436_800"}
{"input": "expressive prior is done by using Method| context: variational autoencoders ( vaes ) are widely used for latent variable modeling of text . we focus on variations that learn expressive prior distributions over the latent variable .", "entity": "expressive prior", "output": "flowprior", "neg_sample": ["expressive prior is done by using Method", "variational autoencoders ( vaes ) are widely used for latent variable modeling of text .", "we focus on variations that learn expressive prior distributions over the latent variable ."], "relation": "used for", "id": "2021.naacl-main.259", "year": 2021, "rel_sent": "We demonstrate that FlowPrior learns an expressive prior with analysis and several forms of evaluation involving generation .", "forward": false, "src_ids": "2021.naacl-main.259_9623"}
{"input": "substitutes is done by using Method| context: while its usage has increased in recent years , the paucity of annotated data prevents the finetuning of neural models on the task , hindering the full fruition of recently introduced powerful architectures such as language models . furthermore , lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary , making it impossible to credit appropriate , but out - of - vocabulary , substitutes .", "entity": "substitutes", "output": "seq2seq model", "neg_sample": ["substitutes is done by using Method", "while its usage has increased in recent years , the paucity of annotated data prevents the finetuning of neural models on the task , hindering the full fruition of recently introduced powerful architectures such as language models .", "furthermore , lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary , making it impossible to credit appropriate , but out - of - vocabulary , substitutes ."], "relation": "used for", "id": "2021.emnlp-main.844", "year": 2021, "rel_sent": "Thanks to a seq2seq model , we generate substitutes for a word according to the context it appears in , attaining state - of - the - art results on different benchmarks .", "forward": false, "src_ids": "2021.emnlp-main.844_14338"}
{"input": "multipronged strategy is used for Task| context: while the advance of pretrained multilingual encoders suggests an easy optimism of ' train on english , run on any language ' , we find through a thorough exploration and extension of techniques that a combination of approaches , both new and old , leads to better performance than any one cross - lingual strategy in particular .", "entity": "multipronged strategy", "output": "zero - shot cross - lingual information extraction", "neg_sample": ["multipronged strategy is used for Task", "while the advance of pretrained multilingual encoders suggests an easy optimism of ' train on english , run on any language ' , we find through a thorough exploration and extension of techniques that a combination of approaches , both new and old , leads to better performance than any one cross - lingual strategy in particular ."], "relation": "used for", "id": "2021.emnlp-main.149", "year": 2021, "rel_sent": "Everything Is All It Takes : A Multipronged Strategy for Zero - Shot Cross - Lingual Information Extraction.", "forward": true, "src_ids": "2021.emnlp-main.149_5077"}
{"input": "meta - learning adversarial domain adaptation network is used for Task| context: however , existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data , while neglecting to strengthen the model 's ability to adapt to new tasks .", "entity": "meta - learning adversarial domain adaptation network", "output": "few - shot text classification", "neg_sample": ["meta - learning adversarial domain adaptation network is used for Task", "however , existing solutions heavily rely on the exploitation of lexical features and their distributional signatures on training data , while neglecting to strengthen the model 's ability to adapt to new tasks ."], "relation": "used for", "id": "2021.findings-acl.145", "year": 2021, "rel_sent": "Meta - Learning Adversarial Domain Adaptation Network for Few - Shot Text Classification.", "forward": true, "src_ids": "2021.findings-acl.145_10008"}
{"input": "emotion detection is done by using OtherScientificTerm| context: emotion detection from social media posts has attracted noticeable attention from natural language processing ( nlp ) community in recent years .", "entity": "emotion detection", "output": "gold labels", "neg_sample": ["emotion detection is done by using OtherScientificTerm", "emotion detection from social media posts has attracted noticeable attention from natural language processing ( nlp ) community in recent years ."], "relation": "used for", "id": "2021.ranlp-1.151", "year": 2021, "rel_sent": "Exploring Reliability of Gold Labels for Emotion Detection in Twitter.", "forward": false, "src_ids": "2021.ranlp-1.151_4079"}
{"input": "reconstruction of many hops explanations is done by using Method| context: this paper presents a novel framework for reconstructing multi - hop explanations in science question answering ( qa ) .", "entity": "reconstruction of many hops explanations", "output": "unification - based mechanism", "neg_sample": ["reconstruction of many hops explanations is done by using Method", "this paper presents a novel framework for reconstructing multi - hop explanations in science question answering ( qa ) ."], "relation": "used for", "id": "2021.eacl-main.15", "year": 2021, "rel_sent": "We present the following conclusions : ( 1 ) The proposed method achieves results competitive with Transformers , yet being orders of magnitude faster , a feature that makes it scalable to large explanatory corpora ( 2 ) The unification - based mechanism has a key role in reducing semantic drift , contributing to the reconstruction of many hops explanations ( 6 or more facts ) and the ranking of complex inference facts ( +12.0 Mean Average Precision ) ( 3 ) Crucially , the constructed explanations can support downstream QA models , improving the accuracy of BERT by up to 10 % overall .", "forward": false, "src_ids": "2021.eacl-main.15_7755"}
{"input": "high - accuracy ner system is done by using Method| context: we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts . previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption . in a production environment , end users require low - latency models which can be readily integrated into existing pipelines .", "entity": "high - accuracy ner system", "output": "transformer language models ( roberta )", "neg_sample": ["high - accuracy ner system is done by using Method", "we explore the application of state - of - the - art ner algorithms to asr - generated call center transcripts .", "previous work in this domain focused on the use of a bilstm - crf model which relied on flair embeddings ; however , such a model is unwieldy in terms of latency and memory consumption .", "in a production environment , end users require low - latency models which can be readily integrated into existing pipelines ."], "relation": "used for", "id": "2021.wnut-1.40", "year": 2021, "rel_sent": "First , we propose a set of models which utilize state - of - the - art Transformer language models ( RoBERTa ) to develop a high - accuracy NER system trained on custom annotated set of call center transcripts .", "forward": false, "src_ids": "2021.wnut-1.40_6142"}
{"input": "multi - task learning is used for Task| context: a recent topic of research in natural language generation has been the development of automatic response generation modules that can automatically respond to a user 's utterance in an empathetic manner . previous research has tackled this task using neural generative methods by augmenting emotion classes with the input sequences . however , the outputs by these models may be inconsistent .", "entity": "multi - task learning", "output": "emotion controlled dialog generation", "neg_sample": ["multi - task learning is used for Task", "a recent topic of research in natural language generation has been the development of automatic response generation modules that can automatically respond to a user 's utterance in an empathetic manner .", "previous research has tackled this task using neural generative methods by augmenting emotion classes with the input sequences .", "however , the outputs by these models may be inconsistent ."], "relation": "used for", "id": "2021.eacl-main.255", "year": 2021, "rel_sent": "Modelling Context Emotions using Multi - task Learning for Emotion Controlled Dialog Generation.", "forward": true, "src_ids": "2021.eacl-main.255_2188"}
{"input": "passage retrieval is used for Task| context: however , such retrieval models often require large memory to run because of the massive size of their passage index .", "entity": "passage retrieval", "output": "open - domain question answering", "neg_sample": ["passage retrieval is used for Task", "however , such retrieval models often require large memory to run because of the massive size of their passage index ."], "relation": "used for", "id": "2021.acl-short.123", "year": 2021, "rel_sent": "Efficient Passage Retrieval with Hashing for Open - domain Question Answering.", "forward": true, "src_ids": "2021.acl-short.123_13542"}
{"input": "empirically - based selection of unbiased data annotators is done by using OtherScientificTerm| context: implicit bias embedded in the annotated data is by far the greatest impediment in the effectual use of supervised machine learning models in tasks involving race , ethics , and geopolitical polarization . for societal good and demonstrable positive impact on wider society , it is paramount to carefully select data annotators and rigorously validate the annotation process . current approaches to selecting annotators are not sufficiently grounded in scientific principles and are limited at the policy - guidance level , thereby rendering them unusable for machine learning practitioners .", "entity": "empirically - based selection of unbiased data annotators", "output": "inane profile characteristics", "neg_sample": ["empirically - based selection of unbiased data annotators is done by using OtherScientificTerm", "implicit bias embedded in the annotated data is by far the greatest impediment in the effectual use of supervised machine learning models in tasks involving race , ethics , and geopolitical polarization .", "for societal good and demonstrable positive impact on wider society , it is paramount to carefully select data annotators and rigorously validate the annotation process .", "current approaches to selecting annotators are not sufficiently grounded in scientific principles and are limited at the policy - guidance level , thereby rendering them unusable for machine learning practitioners ."], "relation": "used for", "id": "2021.findings-acl.169", "year": 2021, "rel_sent": "By demonstrating it on a realworld geopolitical problem , we also identified and ranked key inane profile characteristics towards an empirically - based selection of unbiased data annotators .", "forward": false, "src_ids": "2021.findings-acl.169_10387"}
{"input": "news model is used for Method| context: news recommendation is critical for personalized news access . most existing news recommendation methods rely on centralized storage of users ' historical news click behavior data , which may lead to privacy concerns and hazards . federated learning is a privacy - preserving framework for multiple clients to collaboratively train models without sharing their private data . however , the computation and communication cost of directly learning many existing news recommendation models in a federated way are unacceptable for user clients .", "entity": "news model", "output": "news representations", "neg_sample": ["news model is used for Method", "news recommendation is critical for personalized news access .", "most existing news recommendation methods rely on centralized storage of users ' historical news click behavior data , which may lead to privacy concerns and hazards .", "federated learning is a privacy - preserving framework for multiple clients to collaboratively train models without sharing their private data .", "however , the computation and communication cost of directly learning many existing news recommendation models in a federated way are unacceptable for user clients ."], "relation": "used for", "id": "2021.emnlp-main.223", "year": 2021, "rel_sent": "The server updates its global user model with the aggregated gradients , and further updates its news model to infer updated news representations .", "forward": true, "src_ids": "2021.emnlp-main.223_2877"}
{"input": "interactive setting is done by using Task| context: allowing users to interact with multi - document summarizers is a promising direction towards improving and customizing summary results . different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable .", "entity": "interactive setting", "output": "multi - document summarization evaluation", "neg_sample": ["interactive setting is done by using Task", "allowing users to interact with multi - document summarizers is a promising direction towards improving and customizing summary results .", "different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable ."], "relation": "used for", "id": "2021.naacl-main.54", "year": 2021, "rel_sent": "Extending Multi - Document Summarization Evaluation to the Interactive Setting.", "forward": false, "src_ids": "2021.naacl-main.54_8812"}
{"input": "youcook2 is used for Task| context: in this paper , we present gem1 as a general evaluation benchmark for multimodal tasks .", "entity": "youcook2", "output": "video - language tasks", "neg_sample": ["youcook2 is used for Task", "in this paper , we present gem1 as a general evaluation benchmark for multimodal tasks ."], "relation": "used for", "id": "2021.findings-acl.229", "year": 2021, "rel_sent": "Comparing with existing multimodal datasets such as MSCOCO ( Chen et al . , 2015 ) and Flicker30 K ( Vinyals et al . , 2015 ) for image - language tasks , YouCook2 ( Zhou et al . , 2018 ) and MSR - VTT ( Xu et al . , 2016 ) for video - language tasks , GEM is not only the largest vision - language dataset covering image - language tasks and video - language tasks at the same time , but also labeled in multiple languages .", "forward": true, "src_ids": "2021.findings-acl.229_13175"}
{"input": "short and long summary generation tasks is done by using Metric| context: recent years have brought about an interest in the challenging task of summarizing conversation threads ( meetings , online discussions , etc . ) . such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency .", "entity": "short and long summary generation tasks", "output": "human evaluations", "neg_sample": ["short and long summary generation tasks is done by using Metric", "recent years have brought about an interest in the challenging task of summarizing conversation threads ( meetings , online discussions , etc . )", ".", "such summaries help analysis of the long text to quickly catch up with the decisions made and thus improve our work or communication efficiency ."], "relation": "used for", "id": "2021.acl-long.537", "year": 2021, "rel_sent": "We perform a comprehensive empirical study to explore different summarization techniques ( including extractive and abstractive methods , single - document and hierarchical models , as well as transfer and semisupervised learning ) and conduct human evaluations on both short and long summary generation tasks .", "forward": false, "src_ids": "2021.acl-long.537_5487"}
{"input": "modeling bilingual conversational characteristics is used for Task| context: despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency .", "entity": "modeling bilingual conversational characteristics", "output": "neural chat translation", "neg_sample": ["modeling bilingual conversational characteristics is used for Task", "despite the impressive performance of sentence - level and context - aware neural machine translation ( nmt ) , there still remain challenges to translate bilingual conversational text due to its inherent characteristics such as role preference , dialogue coherence , and translation consistency ."], "relation": "used for", "id": "2021.acl-long.444", "year": 2021, "rel_sent": "Modeling Bilingual Conversational Characteristics for Neural Chat Translation.", "forward": true, "src_ids": "2021.acl-long.444_15142"}
{"input": "synthetic data generation is done by using Method| context: synthetic data generation is widely known to boost the accuracy of neural grammatical error correction ( gec ) systems , but existing methods often lack diversity or are too simplistic to generate the broad range of grammatical errors made by human writers .", "entity": "synthetic data generation", "output": "automatic annotation tools", "neg_sample": ["synthetic data generation is done by using Method", "synthetic data generation is widely known to boost the accuracy of neural grammatical error correction ( gec ) systems , but existing methods often lack diversity or are too simplistic to generate the broad range of grammatical errors made by human writers ."], "relation": "used for", "id": "2021.bea-1.4", "year": 2021, "rel_sent": "In this work , we use error type tags from automatic annotation tools such as ERRANT to guide synthetic data generation .", "forward": false, "src_ids": "2021.bea-1.4_11472"}
{"input": "product ranking is done by using OtherScientificTerm| context: the growing popularity of virtual assistants poses new challenges for entity resolution , the task of linking mentions in text to their referent entities in a knowledge base . specifically , in the shopping domain , customers tend to mention the entities implicitly ( e.g. , ' organic milk ' ) rather than use the entity names explicitly , leading to a large number of candidate products . meanwhile , for the same query , different customers may expect different results . for example , with ' add milk to my cart ' , a customer may refer to a certain product from his / her favorite brand , while some customers may want to re - order products they regularly purchase . moreover , new customers may lack persistent shopping history , which requires us to enrich the connections between customers through products and their attributes .", "entity": "product ranking", "output": "personalized features", "neg_sample": ["product ranking is done by using OtherScientificTerm", "the growing popularity of virtual assistants poses new challenges for entity resolution , the task of linking mentions in text to their referent entities in a knowledge base .", "specifically , in the shopping domain , customers tend to mention the entities implicitly ( e.g.", ", ' organic milk ' ) rather than use the entity names explicitly , leading to a large number of candidate products .", "meanwhile , for the same query , different customers may expect different results .", "for example , with ' add milk to my cart ' , a customer may refer to a certain product from his / her favorite brand , while some customers may want to re - order products they regularly purchase .", "moreover , new customers may lack persistent shopping history , which requires us to enrich the connections between customers through products and their attributes ."], "relation": "used for", "id": "2021.ecnlp-1.6", "year": 2021, "rel_sent": "To address these issues , we propose a new framework that leverages personalized features to improve the accuracy of product ranking .", "forward": false, "src_ids": "2021.ecnlp-1.6_1717"}
{"input": "historical emotional spectrum is done by using Method| context: recent psychological studies indicate that individuals exhibiting suicidal ideation increasingly turn to social media rather than mental health practitioners . personally contextualizing the buildup of such ideation is critical for accurate identification of users at risk .", "entity": "historical emotional spectrum", "output": "hyperbolic graph convolution networks", "neg_sample": ["historical emotional spectrum is done by using Method", "recent psychological studies indicate that individuals exhibiting suicidal ideation increasingly turn to social media rather than mental health practitioners .", "personally contextualizing the buildup of such ideation is critical for accurate identification of users at risk ."], "relation": "used for", "id": "2021.naacl-main.176", "year": 2021, "rel_sent": "Reflecting upon the scale - free nature of social network relationships , we propose the use of Hyperbolic Graph Convolution Networks , in combination with the Hawkes process to learn the historical emotional spectrum of a user in a time - sensitive manner .", "forward": false, "src_ids": "2021.naacl-main.176_11822"}
{"input": "geocoding text data is done by using Method| context: text data are an important source of detailed information about social and political events . automated systems parse large volumes of text data to infer or extract structured information that describes actors , actions , dates , times , and locations . one of these sub - tasks is geocoding : predicting the geographic coordinates associated with events or locations described by a given text .", "entity": "geocoding text data", "output": "end - to - end probabilistic model", "neg_sample": ["geocoding text data is done by using Method", "text data are an important source of detailed information about social and political events .", "automated systems parse large volumes of text data to infer or extract structured information that describes actors , actions , dates , times , and locations .", "one of these sub - tasks is geocoding : predicting the geographic coordinates associated with events or locations described by a given text ."], "relation": "used for", "id": "2021.case-1.8", "year": 2021, "rel_sent": "I present an end - to - end probabilistic model for geocoding text data .", "forward": false, "src_ids": "2021.case-1.8_3472"}
{"input": "neural language models is used for Task| context: after a neural sequence model encounters an unexpected token , can its behavior be predicted ?", "entity": "neural language models", "output": "generalization", "neg_sample": ["neural language models is used for Task", "after a neural sequence model encounters an unexpected token , can its behavior be predicted ?"], "relation": "used for", "id": "2021.emnlp-main.448", "year": 2021, "rel_sent": "In experiments in English , Finnish , Mandarin , and random regular languages , we demonstrate that neural language models interpolate between these twoforms of generalization : their predictions are well - approximated by a log - linear combination of lexical and syntactic predictive distributions .", "forward": true, "src_ids": "2021.emnlp-main.448_16091"}
{"input": "rhythms is done by using Method| context: rap generation , which aims to produce lyrics and corresponding singing beats , needs to model both rhymes and rhythms . previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance .", "entity": "rhythms", "output": "transformer - based rap generation system", "neg_sample": ["rhythms is done by using Method", "rap generation , which aims to produce lyrics and corresponding singing beats , needs to model both rhymes and rhythms .", "previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance ."], "relation": "used for", "id": "2021.acl-long.6", "year": 2021, "rel_sent": "In this paper , we develop DeepRapper , a Transformer - based rap generation system that can model both rhymes and rhythms .", "forward": false, "src_ids": "2021.acl-long.6_4592"}
{"input": "tellmewhy is done by using Method| context: answering questions about why characters perform certain actions is central to understanding and reasoning about narratives . de - spite recent progress in qa , it is not clear if existing models have the ability to answer ' why ' questions that may require commonsense knowledge external to the input narrative .", "entity": "tellmewhy", "output": "systematized human evaluation interface", "neg_sample": ["tellmewhy is done by using Method", "answering questions about why characters perform certain actions is central to understanding and reasoning about narratives .", "de - spite recent progress in qa , it is not clear if existing models have the ability to answer ' why ' questions that may require commonsense knowledge external to the input narrative ."], "relation": "used for", "id": "2021.findings-acl.53", "year": 2021, "rel_sent": "Given the limita - tions of automated evaluation for this task , we also present a systematized human evaluation interface for this dataset .", "forward": false, "src_ids": "2021.findings-acl.53_4112"}
{"input": "gradient information is used for Task| context: existing methods typically require to learn to adapt the target model by exploiting the source data and sharing the network architecture across domains . however , this pipeline makes the source data risky and is inflexible for deploying the target model .", "entity": "gradient information", "output": "transfer", "neg_sample": ["gradient information is used for Task", "existing methods typically require to learn to adapt the target model by exploiting the source data and sharing the network architecture across domains .", "however , this pipeline makes the source data risky and is inflexible for deploying the target model ."], "relation": "used for", "id": "2021.acl-long.421", "year": 2021, "rel_sent": "As a type of important knowledge in the source domain , for the first time , the gradient information is exploited to boost the transfer performance .", "forward": true, "src_ids": "2021.acl-long.421_6504"}
{"input": "bilingual gp pairs is used for Task| context: we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word .", "entity": "bilingual gp pairs", "output": "native language support", "neg_sample": ["bilingual gp pairs is used for Task", "we introduce a method for assisting english as second language ( esl ) learners by providing translations of collins cobuild grammar patterns(gp ) for a given word ."], "relation": "used for", "id": "2021.rocling-1.39", "year": 2021, "rel_sent": "In our approach , bilingual parallel corpus is transformed into bilingual GP pairs aimed at providing native language support for learning word usage through GPs .", "forward": true, "src_ids": "2021.rocling-1.39_2678"}
{"input": "downstream tasks is done by using OtherScientificTerm| context: large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade .", "entity": "downstream tasks", "output": "multi - modal encoder feature", "neg_sample": ["downstream tasks is done by using OtherScientificTerm", "large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade ."], "relation": "used for", "id": "2021.maiworkshop-1.5", "year": 2021, "rel_sent": "Besides , multi - modal encoder feature can be used to enrich the raw dataset , and improve the performance of downstream tasks ( such as classification task ) .", "forward": false, "src_ids": "2021.maiworkshop-1.5_11493"}
{"input": "joint learning is done by using OtherScientificTerm| context: unsupervised style transfer models are mainly based on an inductive learning approach , which represents the style as embeddings , decoder parameters , or discriminator parameters and directly applies these general rules to the test cases . however , the lacking of parallel corpus hinders the ability of these inductive learning methods on this task . as a result , it is likely to cause severe inconsistent style expressions , like ' the salad is rude ' .", "entity": "joint learning", "output": "objective functions", "neg_sample": ["joint learning is done by using OtherScientificTerm", "unsupervised style transfer models are mainly based on an inductive learning approach , which represents the style as embeddings , decoder parameters , or discriminator parameters and directly applies these general rules to the test cases .", "however , the lacking of parallel corpus hinders the ability of these inductive learning methods on this task .", "as a result , it is likely to cause severe inconsistent style expressions , like ' the salad is rude ' ."], "relation": "used for", "id": "2021.emnlp-main.195", "year": 2021, "rel_sent": "In this paper , both sparse ( BM25 ) and dense retrieval functions ( MIPS ) are used , and two objective functions are designed tofacilitate joint learning .", "forward": false, "src_ids": "2021.emnlp-main.195_9958"}
{"input": "informative questions is done by using OtherScientificTerm| context: however , natural language can be ambiguous or unclear . in cases of uncertainty , humans engage in an interactive process known as repair : asking questions and seeking clarification until their uncertainty is resolved .", "entity": "informative questions", "output": "expected information gain objective", "neg_sample": ["informative questions is done by using OtherScientificTerm", "however , natural language can be ambiguous or unclear .", "in cases of uncertainty , humans engage in an interactive process known as repair : asking questions and seeking clarification until their uncertainty is resolved ."], "relation": "used for", "id": "2021.emnlp-main.44", "year": 2021, "rel_sent": "Our model uses an expected information gain objective to derive informative questions from an off - the - shelf image captioner without requiring any supervised question - answer data .", "forward": false, "src_ids": "2021.emnlp-main.44_944"}
{"input": "graph linearization is used for Task| context: automatic construction of relevant knowledge bases ( kbs ) from text , and generation of semantically meaningful text from kbs are both long - standing goals in machine learning .", "entity": "graph linearization", "output": "sequence to sequence generation problem", "neg_sample": ["graph linearization is used for Task", "automatic construction of relevant knowledge bases ( kbs ) from text , and generation of semantically meaningful text from kbs are both long - standing goals in machine learning ."], "relation": "used for", "id": "2021.emnlp-main.83", "year": 2021, "rel_sent": "Graph linearization enables us to re - frame both tasks as a sequence to sequence generation problem regardless of the generative direction , which in turn allows the use of Reinforcement Learning for sequence training where the model itself is employed as its own critic leading to Self - Critical Sequence Training ( SCST ) .", "forward": true, "src_ids": "2021.emnlp-main.83_1137"}
{"input": "lightweight architecture is used for OtherScientificTerm| context: large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum . the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation . the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation .", "entity": "lightweight architecture", "output": "joint cost vs. accuracy trade - off", "neg_sample": ["lightweight architecture is used for OtherScientificTerm", "large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum .", "the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation .", "the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation ."], "relation": "used for", "id": "2021.emnlp-main.443", "year": 2021, "rel_sent": "In this paper , we present a lightweight architecture that explores this joint cost vs. accuracy trade - off based on multi - vector attention ( MVA ) .", "forward": true, "src_ids": "2021.emnlp-main.443_245"}
{"input": "ood detection is done by using Method| context: detecting out - of - domain ( ood ) or unknown intents from user queries is essential in a task - oriented dialog system . a key challenge of ood detection is to learn discriminative semantic features . traditional cross - entropy loss only focuses on whether a sample is correctly classified , and does not explicitly distinguish the margins between categories .", "entity": "ood detection", "output": "discriminative representations", "neg_sample": ["ood detection is done by using Method", "detecting out - of - domain ( ood ) or unknown intents from user queries is essential in a task - oriented dialog system .", "a key challenge of ood detection is to learn discriminative semantic features .", "traditional cross - entropy loss only focuses on whether a sample is correctly classified , and does not explicitly distinguish the margins between categories ."], "relation": "used for", "id": "2021.acl-short.110", "year": 2021, "rel_sent": "Experiments on two public datasets prove the effectiveness of our method capturing discriminative representations for OOD detection .", "forward": false, "src_ids": "2021.acl-short.110_2505"}
{"input": "automatic quality metrics is used for OtherScientificTerm| context: social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity . since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages . in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly .", "entity": "automatic quality metrics", "output": "machine translation ( mt ) mistakes", "neg_sample": ["automatic quality metrics is used for OtherScientificTerm", "social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity .", "since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages .", "in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly ."], "relation": "used for", "id": "2021.triton-1.6", "year": 2021, "rel_sent": "In this paper , we assess the ability of automatic quality metrics to detect critical machine translation errors which can cause serious misunderstanding of the affect message .", "forward": true, "src_ids": "2021.triton-1.6_15802"}
{"input": "complex multi - step numerical reasoning is done by using Method| context: the sheer volume of financial statements makes it difficult for humans to access and analyze a business 's financials . robust numerical reasoning likewise faces unique challenges in this domain . in contrast to existing tasks on general domain , the finance domain includes complex numerical reasoning and understanding of heterogeneous representations .", "entity": "complex multi - step numerical reasoning", "output": "pre - trained models", "neg_sample": ["complex multi - step numerical reasoning is done by using Method", "the sheer volume of financial statements makes it difficult for humans to access and analyze a business 's financials .", "robust numerical reasoning likewise faces unique challenges in this domain .", "in contrast to existing tasks on general domain , the finance domain includes complex numerical reasoning and understanding of heterogeneous representations ."], "relation": "used for", "id": "2021.emnlp-main.300", "year": 2021, "rel_sent": "The results demonstrate that popular , large , pre - trained models fall far short of expert humans in acquiring finance knowledge and in complex multi - step numerical reasoning on that knowledge .", "forward": false, "src_ids": "2021.emnlp-main.300_1509"}
{"input": "sample embeddings is done by using Method| context: continual learning has gained increasing attention in recent years , thanks to its biological interpretation and efficiency in many real - world applications . some previous works have proved that storing typical samples of old relations in memory can help the model keep a stable understanding of old relations and avoid forgetting them . however , most methods heavily depend on the memory size in that they simply replay these memorized samples in subsequent tasks .", "entity": "sample embeddings", "output": "memory network", "neg_sample": ["sample embeddings is done by using Method", "continual learning has gained increasing attention in recent years , thanks to its biological interpretation and efficiency in many real - world applications .", "some previous works have proved that storing typical samples of old relations in memory can help the model keep a stable understanding of old relations and avoid forgetting them .", "however , most methods heavily depend on the memory size in that they simply replay these memorized samples in subsequent tasks ."], "relation": "used for", "id": "2021.acl-long.20", "year": 2021, "rel_sent": "The prototypes of all observed relations at current learning stage are used to re - initialize a memory network to refine subsequent sample embeddings , which ensures the model 's stable understanding on all observed relations when learning a new task .", "forward": false, "src_ids": "2021.acl-long.20_7414"}
{"input": "category information is used for OtherScientificTerm| context: morphological analysis ( ma ) and lexical normalization ( ln ) are both important tasks for japanese user - generated text ( ugt ) .", "entity": "category information", "output": "frequent ugt - specific phenomena", "neg_sample": ["category information is used for OtherScientificTerm", "morphological analysis ( ma ) and lexical normalization ( ln ) are both important tasks for japanese user - generated text ( ugt ) ."], "relation": "used for", "id": "2021.naacl-main.438", "year": 2021, "rel_sent": "Our corpus comprises 929 sentences annotated with morphological and normalization information , along with category information we classified for frequent UGT - specific phenomena .", "forward": true, "src_ids": "2021.naacl-main.438_13600"}
{"input": "psychological health support is done by using Material| context: great research interests have been attracted to devise ai services that are able to provide mental health support . however , the lack of corpora is a main obstacle to this research , particularly in chinese language .", "entity": "psychological health support", "output": "chinese dataset", "neg_sample": ["psychological health support is done by using Material", "great research interests have been attracted to devise ai services that are able to provide mental health support .", "however , the lack of corpora is a main obstacle to this research , particularly in chinese language ."], "relation": "used for", "id": "2021.findings-acl.130", "year": 2021, "rel_sent": "In this paper , we propose PsyQA , a Chinese dataset of psychological health support in the form of question and answer pair .", "forward": false, "src_ids": "2021.findings-acl.130_4434"}
{"input": "spoken language understanding is done by using Method| context: spoken language understanding ( slu ) systems parse speech into semantic structures like dialog acts and slots . this involves the use of an automatic speech recognizer ( asr ) to transcribe speech into multiple text alternatives ( hypotheses ) . transcription errors , ordinary in asrs , impact downstream slu performance negatively . common approaches to mitigate such errors involve using richer information from the asr , either in form of n - best hypotheses or word - lattices .", "entity": "spoken language understanding", "output": "n - best asr transformer", "neg_sample": ["spoken language understanding is done by using Method", "spoken language understanding ( slu ) systems parse speech into semantic structures like dialog acts and slots .", "this involves the use of an automatic speech recognizer ( asr ) to transcribe speech into multiple text alternatives ( hypotheses ) .", "transcription errors , ordinary in asrs , impact downstream slu performance negatively .", "common approaches to mitigate such errors involve using richer information from the asr , either in form of n - best hypotheses or word - lattices ."], "relation": "used for", "id": "2021.acl-short.14", "year": 2021, "rel_sent": "N - Best ASR Transformer : Enhancing SLU Performance using Multiple ASR Hypotheses.", "forward": false, "src_ids": "2021.acl-short.14_8328"}
{"input": "transition - based parser is used for OtherScientificTerm| context: ' episodic logic : unscoped logical form ' ( el - ulf ) is a semantic representation capturing predicate - argument structure as well as more challenging aspects of language within the episodic logic formalism .", "entity": "transition - based parser", "output": "unscoped episodic logical forms", "neg_sample": ["transition - based parser is used for OtherScientificTerm", "' episodic logic : unscoped logical form ' ( el - ulf ) is a semantic representation capturing predicate - argument structure as well as more challenging aspects of language within the episodic logic formalism ."], "relation": "used for", "id": "2021.iwcs-1.18", "year": 2021, "rel_sent": "A Transition - based Parser for Unscoped Episodic Logical Forms.", "forward": true, "src_ids": "2021.iwcs-1.18_3427"}
{"input": "bert model is used for Task| context: existing methods only consider the features of the microblog itself with - out combining the semantics of emotion categories for modeling .", "entity": "bert model", "output": "emotion classification", "neg_sample": ["bert model is used for Task", "existing methods only consider the features of the microblog itself with - out combining the semantics of emotion categories for modeling ."], "relation": "used for", "id": "2021.ccl-1.82", "year": 2021, "rel_sent": "Finally we construct a question - and - answer pair and use it as the input of the BERT model to complete emotion classification .", "forward": true, "src_ids": "2021.ccl-1.82_7957"}
{"input": "pre - trained language models is done by using OtherScientificTerm| context: models pre - trained on large - scale regular text corpora often do not work well for user - generated data where the language styles differ significantly from the mainstream text .", "entity": "pre - trained language models", "output": "rules", "neg_sample": ["pre - trained language models is done by using OtherScientificTerm", "models pre - trained on large - scale regular text corpora often do not work well for user - generated data where the language styles differ significantly from the mainstream text ."], "relation": "used for", "id": "2021.acl-long.124", "year": 2021, "rel_sent": "Our contributions are as follows : 1.We propose a new method , CARI , to integrate rules for pre - trained language models .", "forward": false, "src_ids": "2021.acl-long.124_4694"}
{"input": "large - scale language models is done by using Method| context: heavily overparameterized language models such as bert , xlnet and t5 have achieved impressive success in many nlp tasks . however , their high model complexity requires enormous computation resources and extremely long training time for both pre - training and fine - tuning . many works have studied model compression on large nlp models , but only focusing on reducing inference time while still requiring an expensive training process . other works use extremely large batch sizes to shorten the pre - training time , at the expense of higher computational resource demands .", "entity": "large - scale language models", "output": "training algorithm", "neg_sample": ["large - scale language models is done by using Method", "heavily overparameterized language models such as bert , xlnet and t5 have achieved impressive success in many nlp tasks .", "however , their high model complexity requires enormous computation resources and extremely long training time for both pre - training and fine - tuning .", "many works have studied model compression on large nlp models , but only focusing on reducing inference time while still requiring an expensive training process .", "other works use extremely large batch sizes to shorten the pre - training time , at the expense of higher computational resource demands ."], "relation": "used for", "id": "2021.acl-long.171", "year": 2021, "rel_sent": "In this paper , inspired by the Early - Bird Lottery Tickets recently studied for computer vision tasks , we propose EarlyBERT , a general computationally - efficient training algorithm applicable to both pre - training and fine - tuning of large - scale language models .", "forward": false, "src_ids": "2021.acl-long.171_4815"}
{"input": "title and author information is done by using Method| context: applications based on scholarly data are of ever increasing importance . this results in disadvantages for areas where high - quality data and compatible systems are not available , such as non - english publications .", "entity": "title and author information", "output": "sequence labeling models", "neg_sample": ["title and author information is done by using Method", "applications based on scholarly data are of ever increasing importance .", "this results in disadvantages for areas where high - quality data and compatible systems are not available , such as non - english publications ."], "relation": "used for", "id": "2021.sdp-1.8", "year": 2021, "rel_sent": "We utilize our data for training and evaluating sequence labeling models to extract title and author information .", "forward": false, "src_ids": "2021.sdp-1.8_697"}
{"input": "knowledge selection is used for Task| context: however , the needed gold knowledge label is difficult to collect in reality .", "entity": "knowledge selection", "output": "dialogue generation", "neg_sample": ["knowledge selection is used for Task", "however , the needed gold knowledge label is difficult to collect in reality ."], "relation": "used for", "id": "2021.findings-acl.105", "year": 2021, "rel_sent": "In this paper , we study knowledge selection for dialogue generation in the unsupervised scenario and propose a novel Distilled Distant Supervision Loss ( DDSL ) to supervise knowledge selection when the gold knowledge label is unknown .", "forward": true, "src_ids": "2021.findings-acl.105_8753"}
{"input": "multi - teacher knowledge distillation method is used for OtherScientificTerm| context: encoder pre - training is promising in end - to - end speech translation ( st ) , given the fact that speech - to - translation data is scarce . but st encoders are not simple instances of automatic speech recognition ( asr ) or machine translation ( mt ) encoders . for example , we find that asr encoders lack the global context representation , which is necessary for translation , whereas mt encoders are not designed to deal with long but locally attentive acoustic sequences .", "entity": "multi - teacher knowledge distillation method", "output": "pre - training knowledge", "neg_sample": ["multi - teacher knowledge distillation method is used for OtherScientificTerm", "encoder pre - training is promising in end - to - end speech translation ( st ) , given the fact that speech - to - translation data is scarce .", "but st encoders are not simple instances of automatic speech recognition ( asr ) or machine translation ( mt ) encoders .", "for example , we find that asr encoders lack the global context representation , which is necessary for translation , whereas mt encoders are not designed to deal with long but locally attentive acoustic sequences ."], "relation": "used for", "id": "2021.acl-long.204", "year": 2021, "rel_sent": "Also , we develop an adaptor module to alleviate the representation inconsistency between the pre - trained ASR encoder and MT encoder , and develop a multi - teacher knowledge distillation method to preserve the pre - training knowledge .", "forward": true, "src_ids": "2021.acl-long.204_14020"}
{"input": "neural networks is used for OtherScientificTerm| context: to provide consistent emotional interaction with users , dialog systems should be capable to automatically select appropriate emotions for responses like humans . however , most existing works focus on rendering specified emotions in responses or empathetically respond to the emotion of users , yet the individual difference in emotion expression is overlooked . this may lead to inconsistent emotional expressions and disinterest users .", "entity": "neural networks", "output": "personality traits", "neg_sample": ["neural networks is used for OtherScientificTerm", "to provide consistent emotional interaction with users , dialog systems should be capable to automatically select appropriate emotions for responses like humans .", "however , most existing works focus on rendering specified emotions in responses or empathetically respond to the emotion of users , yet the individual difference in emotion expression is overlooked .", "this may lead to inconsistent emotional expressions and disinterest users ."], "relation": "used for", "id": "2021.findings-acl.444", "year": 2021, "rel_sent": "Then , we design neural networks to encode the preceding dialog context and the specified personality traits to compose the variation .", "forward": true, "src_ids": "2021.findings-acl.444_1255"}
{"input": "autoregressive models is used for Method| context: generating long and coherent text is an important and challenging task encompassing many application areas such as summarization , document level machine translation and story generation . despite the success in modeling intra - sentence coherence , existing long text generation models ( e.g. , bart and gpt-3 ) still struggle to maintain a coherent event sequence throughout the generated text . we conjecture that this is because of the difficulty for the model to revise , replace , revoke or delete any part that has been generated by the model .", "entity": "autoregressive models", "output": "low level program", "neg_sample": ["autoregressive models is used for Method", "generating long and coherent text is an important and challenging task encompassing many application areas such as summarization , document level machine translation and story generation .", "despite the success in modeling intra - sentence coherence , existing long text generation models ( e.g.", ", bart and gpt-3 ) still struggle to maintain a coherent event sequence throughout the generated text .", "we conjecture that this is because of the difficulty for the model to revise , replace , revoke or delete any part that has been generated by the model ."], "relation": "used for", "id": "2021.alta-1.13", "year": 2021, "rel_sent": "We suggest various remedies such as using distilled dataset , designing better attention mechanisms and using autoregressive models as a low level program .", "forward": true, "src_ids": "2021.alta-1.13_725"}
{"input": "word - based robustness - aware perturbation is used for Method| context: backdoor attacks , which maliciously control a well - trained model 's outputs of the instances with specific triggers , are recently shown to be serious threats to the safety of reusing deep neural networks ( dnns ) .", "entity": "word - based robustness - aware perturbation", "output": "natural language processing ( nlp ) models", "neg_sample": ["word - based robustness - aware perturbation is used for Method", "backdoor attacks , which maliciously control a well - trained model 's outputs of the instances with specific triggers , are recently shown to be serious threats to the safety of reusing deep neural networks ( dnns ) ."], "relation": "used for", "id": "2021.emnlp-main.659", "year": 2021, "rel_sent": "Motivated by this observation , we construct a word - based robustness - aware perturbation to distinguish poisoned samples from clean samples to defend against the backdoor attacks on natural language processing ( NLP ) models .", "forward": true, "src_ids": "2021.emnlp-main.659_791"}
{"input": "universal conversational encoder is done by using Generic| context: transformer - based language models ( lms ) pretrained on large text collections are proven to store a wealth of semantic knowledge . however , 1 ) they are not effective as sentence encoders when used off - the - shelf , and 2 ) thus typically lag behind conversationally pretrained ( e.g. , via response selection ) encoders on conversational tasks such as intent detection ( id ) .", "entity": "universal conversational encoder", "output": "two - stage procedure", "neg_sample": ["universal conversational encoder is done by using Generic", "transformer - based language models ( lms ) pretrained on large text collections are proven to store a wealth of semantic knowledge .", "however , 1 ) they are not effective as sentence encoders when used off - the - shelf , and 2 ) thus typically lag behind conversationally pretrained ( e.g.", ", via response selection ) encoders on conversational tasks such as intent detection ( id ) ."], "relation": "used for", "id": "2021.emnlp-main.88", "year": 2021, "rel_sent": "In this work , we propose ConvFiT , a simple and efficient two - stage procedure which turns any pretrained LM into a universal conversational encoder ( after Stage 1 ConvFiT - ing ) and task - specialised sentence encoder ( after Stage 2 ) .", "forward": false, "src_ids": "2021.emnlp-main.88_8333"}
{"input": "event forecasting is done by using Task| context: event forecasting is a challenging , yet important task , as humans seek to constantly plan for the future . existing automated forecasting studies rely mostly on structured data , such as time - series or event - based knowledge graphs , to help predict future events .", "entity": "event forecasting", "output": "question answering challenge", "neg_sample": ["event forecasting is done by using Task", "event forecasting is a challenging , yet important task , as humans seek to constantly plan for the future .", "existing automated forecasting studies rely mostly on structured data , such as time - series or event - based knowledge graphs , to help predict future events ."], "relation": "used for", "id": "2021.acl-long.357", "year": 2021, "rel_sent": "ForecastQA : A Question Answering Challenge for Event Forecasting with Temporal Text Data.", "forward": false, "src_ids": "2021.acl-long.357_13961"}
{"input": "lexical diversity is done by using Method| context: the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments . we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation .", "entity": "lexical diversity", "output": "pragmatic reasoning", "neg_sample": ["lexical diversity is done by using Method", "the ability for variation in language use is necessary for speakers to achieve their conversational goals , for instance when referring to objects in visual environments .", "we argue that diversity should not be modelled as an independent objective in dialogue , but should rather be a result or by - product of goal - oriented language generation ."], "relation": "used for", "id": "2021.sigdial-1.43", "year": 2021, "rel_sent": "We find that boosting diversity itself does not result in more pragmatically informative captions , but pragmatic reasoning does increase lexical diversity .", "forward": false, "src_ids": "2021.sigdial-1.43_621"}
{"input": "distributed representations is done by using OtherScientificTerm| context: lemmatization is often used with morphologically rich languages to address issues caused by morphological complexity , performed by grammar - based lemmatizers .", "entity": "distributed representations", "output": "word embeddings", "neg_sample": ["distributed representations is done by using OtherScientificTerm", "lemmatization is often used with morphologically rich languages to address issues caused by morphological complexity , performed by grammar - based lemmatizers ."], "relation": "used for", "id": "2021.nodalida-main.25", "year": 2021, "rel_sent": "Word embeddings as distributed representations natively encode some information about the relationship between base and inflected forms , and we show that it is possible to learn a transformation that approximately maps the embeddings of inflected forms to the embeddings of the corresponding lemmas .", "forward": false, "src_ids": "2021.nodalida-main.25_4279"}
{"input": "transformer architecture is done by using Method| context: however , since the introduction of the transformer models , its performance has been surpassed .", "entity": "transformer architecture", "output": "hidden markov model", "neg_sample": ["transformer architecture is done by using Method", "however , since the introduction of the transformer models , its performance has been surpassed ."], "relation": "used for", "id": "2021.acl-srw.3", "year": 2021, "rel_sent": "This work proposes to introduce the concept of the hidden Markov model to the transformer architecture , which outperforms the transformer baseline .", "forward": false, "src_ids": "2021.acl-srw.3_6575"}
{"input": "sexual predators is done by using Method| context: with the increasing importance of social media in everyone 's life , the risk of its misuse by criminals is also increasing . in particular children are at risk of becoming victims of online related crime , especially sexual abuse . for example , sexual predators use online grooming to gain the trust of children and young adults .", "entity": "sexual predators", "output": "cnn", "neg_sample": ["sexual predators is done by using Method", "with the increasing importance of social media in everyone 's life , the risk of its misuse by criminals is also increasing .", "in particular children are at risk of becoming victims of online related crime , especially sexual abuse .", "for example , sexual predators use online grooming to gain the trust of children and young adults ."], "relation": "used for", "id": "2021.konvens-1.12", "year": 2021, "rel_sent": "In this paper , a two - step approach using a CNN to identify sexual predators in social networks is proposed .", "forward": false, "src_ids": "2021.konvens-1.12_10033"}
{"input": "subjective information is used for Task| context: nonetheless , providing pieces of evidence to explain why a suspicious tweet is rumor is essential .", "entity": "subjective information", "output": "rumor detection", "neg_sample": ["subjective information is used for Task", "nonetheless , providing pieces of evidence to explain why a suspicious tweet is rumor is essential ."], "relation": "used for", "id": "2021.findings-acl.63", "year": 2021, "rel_sent": "Moreover , we confirmed that both objective information and subjective information are fundamental clues for rumor detection .", "forward": true, "src_ids": "2021.findings-acl.63_6547"}
{"input": "annotations is used for OtherScientificTerm| context: for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key . we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts .", "entity": "annotations", "output": "opinion sentence pairs", "neg_sample": ["annotations is used for OtherScientificTerm", "for many nlp applications of online reviews , comparison of two opinion - bearing sentences is key .", "we argue that , while general purpose text similarity metrics have been applied for this purpose , there has been limited exploration of their applicability to opinion texts ."], "relation": "used for", "id": "2021.newsum-1.9", "year": 2021, "rel_sent": "We crowdsourced annotations for opinion sentence pairs and our main findings are : ( 1 ) annotators tend to agree on whether or not opinion sentences are similar or different ; and ( 2 ) embedding - based metrics capture human judgments of ' opinion similarity ' but not ' opinion difference ' .", "forward": true, "src_ids": "2021.newsum-1.9_3666"}
{"input": "archived fact - checks is used for OtherScientificTerm| context: misinformation has recently become a well - documented matter of public concern . existing studies on this topic have hitherto adopted a coarse concept of misinformation , which incorporates a broad spectrum of story types ranging from political conspiracies to misinterpreted pranks .", "entity": "archived fact - checks", "output": "misinformation stories", "neg_sample": ["archived fact - checks is used for OtherScientificTerm", "misinformation has recently become a well - documented matter of public concern .", "existing studies on this topic have hitherto adopted a coarse concept of misinformation , which incorporates a broad spectrum of story types ranging from political conspiracies to misinterpreted pranks ."], "relation": "used for", "id": "2021.acl-long.51", "year": 2021, "rel_sent": "Using archived fact - checks from Snopes.com , we identify ten types of misinformation stories .", "forward": true, "src_ids": "2021.acl-long.51_7943"}
{"input": "private vectors is done by using Method| context: ensuring strong theoretical privacy guarantees on text data is a challenging problem which is usually attained at the expense of utility . however , to improve the practicality of privacy preserving text analyses , it is essential to design algorithms that better optimize this tradeoff .", "entity": "private vectors", "output": "release mechanism", "neg_sample": ["private vectors is done by using Method", "ensuring strong theoretical privacy guarantees on text data is a challenging problem which is usually attained at the expense of utility .", "however , to improve the practicality of privacy preserving text analyses , it is essential to design algorithms that better optimize this tradeoff ."], "relation": "used for", "id": "2021.trustnlp-1.3", "year": 2021, "rel_sent": "To address this challenge , we propose a release mechanism that takes any ( text ) embedding vector as input and releases a corresponding private vector .", "forward": false, "src_ids": "2021.trustnlp-1.3_14074"}
{"input": "grammatical error correction is done by using OtherScientificTerm| context: in practice , there are a great number of real error patterns in the manually annotated training data .", "entity": "grammatical error correction", "output": "linguistic knowledge", "neg_sample": ["grammatical error correction is done by using OtherScientificTerm", "in practice , there are a great number of real error patterns in the manually annotated training data ."], "relation": "used for", "id": "2021.conll-1.17", "year": 2021, "rel_sent": "Data Augmentation of Incorporating Real Error Patterns and Linguistic Knowledge for Grammatical Error Correction.", "forward": false, "src_ids": "2021.conll-1.17_9454"}
{"input": "classification tasks is done by using Method| context: exploiting label hierarchies has become a promising approach to tackling the zero - shot multi - label text classification ( zs - mtc ) problem .", "entity": "classification tasks", "output": "pretrained models", "neg_sample": ["classification tasks is done by using Method", "exploiting label hierarchies has become a promising approach to tackling the zero - shot multi - label text classification ( zs - mtc ) problem ."], "relation": "used for", "id": "2021.naacl-main.83", "year": 2021, "rel_sent": "More recently , pretrained models like BERT ( Devlin et al . , 2018 ) have been used to convert classification tasks into a textual entailment task ( Yin et al . , 2019 ) .", "forward": false, "src_ids": "2021.naacl-main.83_4882"}
{"input": "counterfactual supporting facts extraction is used for Task| context: providing a reliable explanation for clinical diagnosis based on the electronic medical record ( emr ) is fundamental to the application of artificial intelligence in the medical field . current methods mostly treat the emr as a text sequence and provide explanations based on a precise medical knowledge base , which is disease - specific and difficult to obtain for experts in reality .", "entity": "counterfactual supporting facts extraction", "output": "explainable medical record based diagnosis", "neg_sample": ["counterfactual supporting facts extraction is used for Task", "providing a reliable explanation for clinical diagnosis based on the electronic medical record ( emr ) is fundamental to the application of artificial intelligence in the medical field .", "current methods mostly treat the emr as a text sequence and provide explanations based on a precise medical knowledge base , which is disease - specific and difficult to obtain for experts in reality ."], "relation": "used for", "id": "2021.naacl-main.156", "year": 2021, "rel_sent": "Counterfactual Supporting Facts Extraction for Explainable Medical Record Based Diagnosis with Graph Network.", "forward": true, "src_ids": "2021.naacl-main.156_14802"}
{"input": "language models is used for Task| context: transformer - based language models have taken many fields in nlp by storm . bert and its derivatives dominate most of the existing evaluation benchmarks , including those for word sense disambiguation ( wsd ) , thanks to their ability in capturing context - sensitive semantic nuances . however , there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses . however , this scenario rarely occurs in real - world settings and , hence , many practical challenges remain even in the coarse - grained setting .", "entity": "language models", "output": "coarse - grained noun disambiguation", "neg_sample": ["language models is used for Task", "transformer - based language models have taken many fields in nlp by storm .", "bert and its derivatives dominate most of the existing evaluation benchmarks , including those for word sense disambiguation ( wsd ) , thanks to their ability in capturing context - sensitive semantic nuances .", "however , there is still little knowledge about their capabilities and potential limitations in encoding and recovering word senses .", "however , this scenario rarely occurs in real - world settings and , hence , many practical challenges remain even in the coarse - grained setting ."], "relation": "used for", "id": "2021.cl-2.14", "year": 2021, "rel_sent": "Our analysis also reveals that in some cases language models come close to solving coarse - grained noun disambiguation under ideal conditions in terms of availability of training data and computing resources .", "forward": true, "src_ids": "2021.cl-2.14_12667"}
{"input": "annotations is used for Task| context: the quality of the annotated data directly influences in the success of supervised nlp models . however , creating annotated datasets is often time - consuming and expensive . although the annotation tool takes an important role , we know little about how it influences annotation quality .", "entity": "annotations", "output": "chat - untangling", "neg_sample": ["annotations is used for Task", "the quality of the annotated data directly influences in the success of supervised nlp models .", "however , creating annotated datasets is often time - consuming and expensive .", "although the annotation tool takes an important role , we know little about how it influences annotation quality ."], "relation": "used for", "id": "2021.acl-srw.22", "year": 2021, "rel_sent": "We compare the quality of annotations for the task of chat - untangling made by non - experts annotators using two different tools .", "forward": true, "src_ids": "2021.acl-srw.22_10178"}
{"input": "first - order meta - learning algorithms is used for Task| context: first - order meta - learning algorithms have been widely used in practice to learn initial model parameters that can be quickly adapted to new tasks due to their efficiency and effectiveness . however , existing studies find that meta - learner can overfit to some specific adaptation when we have heterogeneous tasks , leading to significantly degraded performance . in natural language processing ( nlp ) applications , datasets are often diverse and each task has its unique characteristics .", "entity": "first - order meta - learning algorithms", "output": "nlp applications", "neg_sample": ["first - order meta - learning algorithms is used for Task", "first - order meta - learning algorithms have been widely used in practice to learn initial model parameters that can be quickly adapted to new tasks due to their efficiency and effectiveness .", "however , existing studies find that meta - learner can overfit to some specific adaptation when we have heterogeneous tasks , leading to significantly degraded performance .", "in natural language processing ( nlp ) applications , datasets are often diverse and each task has its unique characteristics ."], "relation": "used for", "id": "2021.naacl-main.206", "year": 2021, "rel_sent": "Therefore , to address the overfitting issue when applying first - order meta - learning to NLP applications , we propose to reduce the variance of the gradient estimator used in task adaptation .", "forward": true, "src_ids": "2021.naacl-main.206_2393"}
{"input": "word and grain embeddings is done by using Method| context: word representations empowered with additional linguistic information have been widely studied and proved to outperform traditional embeddings . current methods mainly focus on learning embeddings for words while embeddings of linguistic information ( referred to as grain embeddings ) are discarded after the learning .", "entity": "word and grain embeddings", "output": "framework field embedding", "neg_sample": ["word and grain embeddings is done by using Method", "word representations empowered with additional linguistic information have been widely studied and proved to outperform traditional embeddings .", "current methods mainly focus on learning embeddings for words while embeddings of linguistic information ( referred to as grain embeddings ) are discarded after the learning ."], "relation": "used for", "id": "2021.naacl-main.140", "year": 2021, "rel_sent": "This work proposes a framework field embedding to jointly learn both word and grain embeddings by incorporating morphological , phonetic , and syntactical linguistic fields .", "forward": false, "src_ids": "2021.naacl-main.140_1917"}
{"input": "self - training is used for Task| context: while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs .", "entity": "self - training", "output": "unsupervised domain adaptation ( uda )", "neg_sample": ["self - training is used for Task", "while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs ."], "relation": "used for", "id": "2021.emnlp-main.566", "year": 2021, "rel_sent": "In this work , we introduce back - training , an alternative to self - training for unsupervised domain adaptation ( UDA ) .", "forward": true, "src_ids": "2021.emnlp-main.566_4457"}
{"input": "knowledge distillation is used for Method| context: recent studies argue that knowledge distillation is promising for speech translation ( st ) using end - to - end models .", "entity": "knowledge distillation", "output": "cascade st", "neg_sample": ["knowledge distillation is used for Method", "recent studies argue that knowledge distillation is promising for speech translation ( st ) using end - to - end models ."], "relation": "used for", "id": "2021.iwslt-1.24", "year": 2021, "rel_sent": "Our experimental results demonstrated that knowledge distillation is beneficial for a cascade ST . Further investigation that combined knowledge distillation and fine - tuning revealed that the combination consistently improved two language pairs : English - Italian and Spanish - English .", "forward": true, "src_ids": "2021.iwslt-1.24_11332"}
{"input": "semantic textual similarity dataset is used for Material| context: rouge is a widely used evaluation metric in text summarization . however , it is not suitable for the evaluation of abstractive summarization systems as it relies on lexical overlap between the gold standard and the generated summaries . this limitation becomes more apparent for agglutinative languages with very large vocabularies and high type / token ratios .", "entity": "semantic textual similarity dataset", "output": "turkish", "neg_sample": ["semantic textual similarity dataset is used for Material", "rouge is a widely used evaluation metric in text summarization .", "however , it is not suitable for the evaluation of abstractive summarization systems as it relies on lexical overlap between the gold standard and the generated summaries .", "this limitation becomes more apparent for agglutinative languages with very large vocabularies and high type / token ratios ."], "relation": "used for", "id": "2021.gem-1.3", "year": 2021, "rel_sent": "To achieve this , we translated the English STSb dataset into Turkish and presented the first semantic textual similarity dataset for Turkish as well .", "forward": true, "src_ids": "2021.gem-1.3_15517"}
{"input": "model - based approaches is done by using Method| context: people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals .", "entity": "model - based approaches", "output": "pathdreamer", "neg_sample": ["model - based approaches is done by using Method", "people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals ."], "relation": "used for", "id": "2021.alvr-1.9", "year": 2021, "rel_sent": "We hope that Pathdreamer will help unlock model - based approaches to challenging embodied navigation tasks such as navigating to specified objects and VLN .", "forward": false, "src_ids": "2021.alvr-1.9_4225"}
{"input": "sentiment lexicon is done by using Method| context: conventional opinion polls were usually conducted via questionnaires or phone interviews , which are time - consuming and error - prone . with the advances in social networking platforms , it 's easier for the general public to express their opinions on popular topics . given the huge amount of user opinions , it would be useful if we can automatically collect and aggregate the overall topical stance for a specific topic .", "entity": "sentiment lexicon", "output": "machine learning methods", "neg_sample": ["sentiment lexicon is done by using Method", "conventional opinion polls were usually conducted via questionnaires or phone interviews , which are time - consuming and error - prone .", "with the advances in social networking platforms , it 's easier for the general public to express their opinions on popular topics .", "given the huge amount of user opinions , it would be useful if we can automatically collect and aggregate the overall topical stance for a specific topic ."], "relation": "used for", "id": "2021.rocling-1.29", "year": 2021, "rel_sent": "For sentiment classification and aggregation , machine learning methods are used to train sentiment lexicon with word embeddings .", "forward": false, "src_ids": "2021.rocling-1.29_10749"}
{"input": "natural language generation tasks is done by using Method| context: keyphrases , that concisely summarize the high - level topics discussed in a document , can be categorized into present keyphrase which explicitly appears in the source text and absent keyphrase which does not match any contiguous subsequence but is highly semantically related to the source . most existing keyphrase generation approaches synchronously generate present and absent keyphrases without explicitly distinguishing these two categories .", "entity": "natural language generation tasks", "output": "select - guide - generate", "neg_sample": ["natural language generation tasks is done by using Method", "keyphrases , that concisely summarize the high - level topics discussed in a document , can be categorized into present keyphrase which explicitly appears in the source text and absent keyphrase which does not match any contiguous subsequence but is highly semantically related to the source .", "most existing keyphrase generation approaches synchronously generate present and absent keyphrases without explicitly distinguishing these two categories ."], "relation": "used for", "id": "2021.naacl-main.455", "year": 2021, "rel_sent": "Furthermore , we extend SGG to a title generation task which indicates its extensibility in natural language generation tasks .", "forward": false, "src_ids": "2021.naacl-main.455_7031"}
{"input": "fine - grained labels is used for OtherScientificTerm| context: memes are the combinations of text and images that are often humorous in nature . but , that may not always be the case , and certain combinations of texts and images may depict hate , referred to as hateful memes . that has been attacked ; and ( 2 ) detect the type of attack ( e.g.", "entity": "fine - grained labels", "output": "protected category", "neg_sample": ["fine - grained labels is used for OtherScientificTerm", "memes are the combinations of text and images that are often humorous in nature .", "but , that may not always be the case , and certain combinations of texts and images may depict hate , referred to as hateful memes .", "that has been attacked ; and ( 2 ) detect the type of attack ( e.g."], "relation": "used for", "id": "2021.woah-1.23", "year": 2021, "rel_sent": "We employ our pipeline on the Hateful Memes Challenge dataset with additional newly created fine - grained labels for protected category and type of attack .", "forward": true, "src_ids": "2021.woah-1.23_5841"}
{"input": "h - fnd is used for OtherScientificTerm| context: although distant supervision automatically generates training data for relation extraction , it also introduces false - positive ( fp ) and false - negative ( fn ) training instances to the generated datasets . whereas both types of errors degrade the final model performance , previous work on distant supervision denoising focuses more on suppressing fp noise and less on resolving the fn problem .", "entity": "h - fnd", "output": "fn instances", "neg_sample": ["h - fnd is used for OtherScientificTerm", "although distant supervision automatically generates training data for relation extraction , it also introduces false - positive ( fp ) and false - negative ( fn ) training instances to the generated datasets .", "whereas both types of errors degrade the final model performance , previous work on distant supervision denoising focuses more on suppressing fp noise and less on resolving the fn problem ."], "relation": "used for", "id": "2021.findings-acl.228", "year": 2021, "rel_sent": "In this setting , H - FND can revise FN instances correctly and maintains high F1 scores even when 50 % of the instances have been turned into negatives .", "forward": true, "src_ids": "2021.findings-acl.228_4870"}
{"input": "keyword - keyword correlation is done by using Method| context: weakly - supervised text classification has received much attention in recent years for it can alleviate the heavy burden of annotating massive data . among them , keyword - driven methods are the mainstream where user - provided keywords are exploited to generate pseudo - labels for unlabeled texts . however , existing methods treat keywords independently , thus ignore the correlation among them , which should be useful if properly exploited .", "entity": "keyword - keyword correlation", "output": "classkg", "neg_sample": ["keyword - keyword correlation is done by using Method", "weakly - supervised text classification has received much attention in recent years for it can alleviate the heavy burden of annotating massive data .", "among them , keyword - driven methods are the mainstream where user - provided keywords are exploited to generate pseudo - labels for unlabeled texts .", "however , existing methods treat keywords independently , thus ignore the correlation among them , which should be useful if properly exploited ."], "relation": "used for", "id": "2021.emnlp-main.222", "year": 2021, "rel_sent": "In this paper , we propose a novel framework called ClassKG to explore keyword - keyword correlation on keyword graph by GNN .", "forward": false, "src_ids": "2021.emnlp-main.222_13002"}
{"input": "single unified multilingual mt model is done by using Method| context: multilingual neural machine translation aims at learning a single translation model for multiple languages . these jointly trained models often suffer from performance degradationon rich - resource language pairs . we attribute this degeneration to parameter interference .", "entity": "single unified multilingual mt model", "output": "language specific sub - network", "neg_sample": ["single unified multilingual mt model is done by using Method", "multilingual neural machine translation aims at learning a single translation model for multiple languages .", "these jointly trained models often suffer from performance degradationon rich - resource language pairs .", "we attribute this degeneration to parameter interference ."], "relation": "used for", "id": "2021.acl-long.25", "year": 2021, "rel_sent": "In this paper , we propose LaSS to jointly train a single unified multilingual MT model .", "forward": false, "src_ids": "2021.acl-long.25_2210"}
{"input": "document information extractors is done by using Generic| context: information extraction from documents has become great use of novel natural language processing areas . most of the entity extraction methodologies are variant in a context such as medical area , financial area , also come even limited to the given language . also , another issue in such research is structural analysis while keeping the hierarchical , semantic , and heuristic features . another problem identified is that usually , it requires a massive training corpus . therefore , this research focus on mitigating such barriers .", "entity": "document information extractors", "output": "information extraction mechanism", "neg_sample": ["document information extractors is done by using Generic", "information extraction from documents has become great use of novel natural language processing areas .", "most of the entity extraction methodologies are variant in a context such as medical area , financial area , also come even limited to the given language .", "also , another issue in such research is structural analysis while keeping the hierarchical , semantic , and heuristic features .", "another problem identified is that usually , it requires a massive training corpus .", "therefore , this research focus on mitigating such barriers ."], "relation": "used for", "id": "2021.ranlp-srw.24", "year": 2021, "rel_sent": "Several approaches have been identifying towards building document information extractors focusing on different disciplines .", "forward": false, "src_ids": "2021.ranlp-srw.24_5310"}
{"input": "masking policies is used for Method| context: current nlp models are predominantly trained through a two - stage ' pre - train then fine - tune ' pipeline .", "entity": "masking policies", "output": "intermediate pre - training stage", "neg_sample": ["masking policies is used for Method", "current nlp models are predominantly trained through a two - stage ' pre - train then fine - tune ' pipeline ."], "relation": "used for", "id": "2021.emnlp-main.573", "year": 2021, "rel_sent": "In this paper , we perform a large - scale empirical study to investigate the effect of various masking policies in intermediate pre - training with nine selected tasks across three categories .", "forward": true, "src_ids": "2021.emnlp-main.573_15594"}
{"input": "nuance is done by using Method| context: causal inference is the process of capturing cause - effect relationship among variables . most existing works focus on dealing with structured data , while mining causal relationship among factors from unstructured data , like text , has been less examined , but is of great importance , especially in the legal domain .", "entity": "nuance", "output": "graph - based causal inference", "neg_sample": ["nuance is done by using Method", "causal inference is the process of capturing cause - effect relationship among variables .", "most existing works focus on dealing with structured data , while mining causal relationship among factors from unstructured data , like text , has been less examined , but is of great importance , especially in the legal domain ."], "relation": "used for", "id": "2021.naacl-main.155", "year": 2021, "rel_sent": "Experimental results show that GCI can capture the nuance from fact descriptions among multiple confusing charges and provide explainable discrimination , especially in few - shot settings .", "forward": false, "src_ids": "2021.naacl-main.155_12298"}
{"input": "transformer - free model is used for Task| context: transformer - based ' behemoths ' have grown in popularity , as well as structurally , shattering multiple nlp benchmarks along the way . however , their real - world usability remains a question .", "entity": "transformer - free model", "output": "ad - hoc retrieval", "neg_sample": ["transformer - free model is used for Task", "transformer - based ' behemoths ' have grown in popularity , as well as structurally , shattering multiple nlp benchmarks along the way .", "however , their real - world usability remains a question ."], "relation": "used for", "id": "2021.eacl-main.293", "year": 2021, "rel_sent": "Benchmarking a transformer - FREE model for ad - hoc retrieval.", "forward": true, "src_ids": "2021.eacl-main.293_4429"}
{"input": "noise samples is used for Material| context: respiratory insufficiency is a symptom that requires hospitalization . this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients .", "entity": "noise samples", "output": "control group data", "neg_sample": ["noise samples is used for Material", "respiratory insufficiency is a symptom that requires hospitalization .", "this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients ."], "relation": "used for", "id": "2021.findings-acl.55", "year": 2021, "rel_sent": "Due to the difficulty in filtering noise without eliminating crucial information , noise samples were injected in the control group data to prevent bias .", "forward": true, "src_ids": "2021.findings-acl.55_3026"}
{"input": "aspect - based sentiment analysis is done by using Method| context: existing works for aspect - based sentiment analysis ( absa ) have adopted a unified approach , which allows the interactive relations among subtasks . however , we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level . in addition , identifying multiple aspect - opinion pairs with their polarities is much more challenging . therefore , a comprehensive understanding of contextual information w.r.t . the aspect and opinion are further required in absa .", "entity": "aspect - based sentiment analysis", "output": "self - supervised strategies", "neg_sample": ["aspect - based sentiment analysis is done by using Method", "existing works for aspect - based sentiment analysis ( absa ) have adopted a unified approach , which allows the interactive relations among subtasks .", "however , we observe that these methods tend to predict polarities based on the literal meaning of aspect and opinion terms and mainly consider relations implicitly among subtasks at the word level .", "in addition , identifying multiple aspect - opinion pairs with their polarities is much more challenging .", "therefore , a comprehensive understanding of contextual information w.r.t .", "the aspect and opinion are further required in absa ."], "relation": "used for", "id": "2021.acl-short.63", "year": 2021, "rel_sent": "Especially , we design novel self - supervised strategies for ABSA , which have strengths in dealing with multiple aspects .", "forward": false, "src_ids": "2021.acl-short.63_6648"}
{"input": "cross - lingual opinion mining is done by using Material| context: the capacity to accurately project annotations remains however an issue for sequence tagging tasks where annotation must be projected with correct spans . additionally , when the task implies noisy user - generated text , the quality of translation and annotation projection can be affected .", "entity": "cross - lingual opinion mining", "output": "translated data", "neg_sample": ["cross - lingual opinion mining is done by using Material", "the capacity to accurately project annotations remains however an issue for sequence tagging tasks where annotation must be projected with correct spans .", "additionally , when the task implies noisy user - generated text , the quality of translation and annotation projection can be affected ."], "relation": "used for", "id": "2021.wnut-1.27", "year": 2021, "rel_sent": "SpanAlign : Efficient Sequence Tagging Annotation Projection into Translated Data applied to Cross - Lingual Opinion Mining.", "forward": false, "src_ids": "2021.wnut-1.27_12846"}
{"input": "universal recurrent neural network grammars ( unirnng ) is used for Task| context: modern approaches to constituency parsing are mono - lingual supervised approaches which require large amount of labelled data to be trained on , thus limiting their utility to only a handful of high - resource languages .", "entity": "universal recurrent neural network grammars ( unirnng )", "output": "language - agnostic constituency parser", "neg_sample": ["universal recurrent neural network grammars ( unirnng ) is used for Task", "modern approaches to constituency parsing are mono - lingual supervised approaches which require large amount of labelled data to be trained on , thus limiting their utility to only a handful of high - resource languages ."], "relation": "used for", "id": "2021.rocling-1.1", "year": 2021, "rel_sent": "Once trained on sufficiently diverse polyglot corpus UniRNNG can be applied to any natural language thus making it Language - agnostic constituency parser .", "forward": true, "src_ids": "2021.rocling-1.1_13316"}
{"input": "model weights is done by using OtherScientificTerm| context: in dialog systems , the natural language understanding ( nlu ) component typically makes the interpretation decision ( including domain , intent and slots ) for an utterance before the mentioned entities are resolved . this may result in intent classification and slot tagging errors .", "entity": "model weights", "output": "loss term", "neg_sample": ["model weights is done by using OtherScientificTerm", "in dialog systems , the natural language understanding ( nlu ) component typically makes the interpretation decision ( including domain , intent and slots ) for an utterance before the mentioned entities are resolved .", "this may result in intent classification and slot tagging errors ."], "relation": "used for", "id": "2021.naacl-industry.3", "year": 2021, "rel_sent": "In this work , we propose to leverage Entity Resolution ( ER ) features in NLU reranking and introduce a novel loss term based on ER signals to better learn model weights in the reranking framework .", "forward": false, "src_ids": "2021.naacl-industry.3_13594"}
{"input": "interactive models is used for Task| context: despite the increasingly good quality of machine translation ( mt ) systems , mt outputs require corrections .", "entity": "interactive models", "output": "post - editing", "neg_sample": ["interactive models is used for Task", "despite the increasingly good quality of machine translation ( mt ) systems , mt outputs require corrections ."], "relation": "used for", "id": "2021.triton-1.19", "year": 2021, "rel_sent": "Interactive Models for Post - Editing.", "forward": true, "src_ids": "2021.triton-1.19_7876"}
{"input": "day trading behavior is done by using Material| context: with 56 million people actively trading and investing in cryptocurrency online and globally in 2020 , there is an increasing need for automatic social media analysis tools to help understand trading discourse and behavior .", "entity": "day trading behavior", "output": "tweets", "neg_sample": ["day trading behavior is done by using Material", "with 56 million people actively trading and investing in cryptocurrency online and globally in 2020 , there is an increasing need for automatic social media analysis tools to help understand trading discourse and behavior ."], "relation": "used for", "id": "2021.econlp-1.11", "year": 2021, "rel_sent": "This pipeline first predicts if tweets can be used to guide day trading behavior , specifically if a cryptocurrency investor should buy , sell , or hold their cryptocurrencies in order to make a profit .", "forward": false, "src_ids": "2021.econlp-1.11_14143"}
{"input": "sentiment tasks is done by using OtherScientificTerm| context: sentiment tasks such as hate speech detection and sentiment analysis , especially when performed on languages other than english , are often low - resource .", "entity": "sentiment tasks", "output": "emotional information", "neg_sample": ["sentiment tasks is done by using OtherScientificTerm", "sentiment tasks such as hate speech detection and sentiment analysis , especially when performed on languages other than english , are often low - resource ."], "relation": "used for", "id": "2021.eacl-srw.15", "year": 2021, "rel_sent": "In this study , we exploit the emotional information encoded in emojis to enhance the performance on a variety of sentiment tasks .", "forward": false, "src_ids": "2021.eacl-srw.15_13291"}
{"input": "annotated english - italian parallel challenge set is used for Task| context: languages differ in terms of the absence or presence of gender features , the number of gender classes and whether and where gender features are explicitly marked . these cross - linguistic differences can lead to ambiguities that are difficult to resolve , especially for sentence - level mt systems . the identification of ambiguity and its subsequent resolution is a challenging task for which currently there are n't any specific resources or challenge sets available .", "entity": "annotated english - italian parallel challenge set", "output": "cross - linguistic natural gender phenomena", "neg_sample": ["annotated english - italian parallel challenge set is used for Task", "languages differ in terms of the absence or presence of gender features , the number of gender classes and whether and where gender features are explicitly marked .", "these cross - linguistic differences can lead to ambiguities that are difficult to resolve , especially for sentence - level mt systems .", "the identification of ambiguity and its subsequent resolution is a challenging task for which currently there are n't any specific resources or challenge sets available ."], "relation": "used for", "id": "2021.gebnlp-1.1", "year": 2021, "rel_sent": "gENder - IT : An Annotated English - Italian Parallel Challenge Set for Cross - Linguistic Natural Gender Phenomena.", "forward": true, "src_ids": "2021.gebnlp-1.1_2208"}
{"input": "style transfer is done by using Method| context: we present two novel unsupervised methods for eliminating toxicity in text .", "entity": "style transfer", "output": "paraphrasing models", "neg_sample": ["style transfer is done by using Method", "we present two novel unsupervised methods for eliminating toxicity in text ."], "relation": "used for", "id": "2021.emnlp-main.629", "year": 2021, "rel_sent": "Our first method combines two recent ideas : ( 1 ) guidance of the generation process with small style - conditional language models and ( 2 ) use of paraphrasing models to perform style transfer .", "forward": false, "src_ids": "2021.emnlp-main.629_12200"}
{"input": "time - shaped reward is used for Method| context: temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years . most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts . compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ? ( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?", "entity": "time - shaped reward", "output": "model learning", "neg_sample": ["time - shaped reward is used for Method", "temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years .", "most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts .", "compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ?", "( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?"], "relation": "used for", "id": "2021.emnlp-main.655", "year": 2021, "rel_sent": "Our method defines a relative time encoding function to capture the timespan information , and we design a novel time - shaped reward based on Dirichlet distribution to guide the model learning .", "forward": true, "src_ids": "2021.emnlp-main.655_2072"}
{"input": "dense passage retrieval is done by using Method| context: in open - domain question answering , dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers . typically , the dual - encoder architecture is adopted to learn dense representations of questions and passages for semantic matching . however , it is difficult to effectively train a dual - encoder due to the challenges including the discrepancy between training and inference , the existence of unlabeled positives and limited training data .", "entity": "dense passage retrieval", "output": "optimized training approach", "neg_sample": ["dense passage retrieval is done by using Method", "in open - domain question answering , dense passage retrieval has become a new paradigm to retrieve relevant passages for finding answers .", "typically , the dual - encoder architecture is adopted to learn dense representations of questions and passages for semantic matching .", "however , it is difficult to effectively train a dual - encoder due to the challenges including the discrepancy between training and inference , the existence of unlabeled positives and limited training data ."], "relation": "used for", "id": "2021.naacl-main.466", "year": 2021, "rel_sent": "To address these challenges , we propose an optimized training approach , called RocketQA , to improving dense passage retrieval .", "forward": false, "src_ids": "2021.naacl-main.466_2534"}
{"input": "confidence - based filtering is used for Material| context: prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies with financial variables . however , in our study , we focus on improving the quality of input dataset . specifically , we employ bert model to perform sentiment analysis on md&a disclosures .", "entity": "confidence - based filtering", "output": "corporate disclosure data", "neg_sample": ["confidence - based filtering is used for Material", "prior literature on bankruptcy prediction mainly focuses on developing more sophisticated prediction methodologies with financial variables .", "however , in our study , we focus on improving the quality of input dataset .", "specifically , we employ bert model to perform sentiment analysis on md&a disclosures ."], "relation": "used for", "id": "2021.econlp-1.4", "year": 2021, "rel_sent": "Further , instead of pre - training the BERT model from scratch , we apply self - learning with confidence - based filtering to corporate disclosure data ( 10 - K ) .", "forward": true, "src_ids": "2021.econlp-1.4_728"}
{"input": "transfer learning is used for Task| context: a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization .", "entity": "transfer learning", "output": "named entity recognition", "neg_sample": ["transfer learning is used for Task", "a named entity is often a word or expression that bears a valuable piece of information , which can be effectively employed by some major nlp tasks such as machine translation , question answering , and text summarization ."], "relation": "used for", "id": "2021.ranlp-1.73", "year": 2021, "rel_sent": "In this paper , we introduce a new model called BERT - PersNER ( BERT based Persian Named Entity Recognizer ) , in which we have applied transfer learning and active learning approaches to NER in Persian , which is regarded as a low - resource language .", "forward": true, "src_ids": "2021.ranlp-1.73_12365"}
{"input": "unsupervised semantic parsing is done by using Method| context: semantic parsing is challenging due to the structure gap and the semantic gap between utterances and logical forms .", "entity": "unsupervised semantic parsing", "output": "synchronous semantic decoding ( ssd )", "neg_sample": ["unsupervised semantic parsing is done by using Method", "semantic parsing is challenging due to the structure gap and the semantic gap between utterances and logical forms ."], "relation": "used for", "id": "2021.acl-long.397", "year": 2021, "rel_sent": "Experimental results show that SSD is a promising approach and can achieve state - of - the - art unsupervised semantic parsing performance on multiple datasets .", "forward": false, "src_ids": "2021.acl-long.397_6085"}
{"input": "information retrieval is done by using OtherScientificTerm| context: recent neural ir models shifts towards soft matching all query document terms , but they lose the computation efficiency of exact match systems .", "entity": "information retrieval", "output": "exact lexical match", "neg_sample": ["information retrieval is done by using OtherScientificTerm", "recent neural ir models shifts towards soft matching all query document terms , but they lose the computation efficiency of exact match systems ."], "relation": "used for", "id": "2021.naacl-main.241", "year": 2021, "rel_sent": "COIL : Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List.", "forward": false, "src_ids": "2021.naacl-main.241_15635"}
{"input": "french question - answering task is done by using Method| context: for many tasks , state - of - the - art results have been achieved with transformer - based architectures , resulting in a paradigmatic shift in practices from the use of task - specific architectures to the fine - tuning of pre - trained language models . the ongoing trend consists in training models with an ever - increasing amount of data and parameters , which requires considerable resources . it leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for english . this raises questions about their usability when applied to small - scale learning problems , for which a limited amount of training data is available , especially for under - resourced languages tasks . the lack of appropriately sized corpora is a hindrance to applying data - driven and transfer learning - based approaches with strong instability cases .", "entity": "french question - answering task", "output": "transformers - based models", "neg_sample": ["french question - answering task is done by using Method", "for many tasks , state - of - the - art results have been achieved with transformer - based architectures , resulting in a paradigmatic shift in practices from the use of task - specific architectures to the fine - tuning of pre - trained language models .", "the ongoing trend consists in training models with an ever - increasing amount of data and parameters , which requires considerable resources .", "it leads to a strong search to improve resource efficiency based on algorithmic and hardware improvements evaluated only for english .", "this raises questions about their usability when applied to small - scale learning problems , for which a limited amount of training data is available , especially for under - resourced languages tasks .", "the lack of appropriately sized corpora is a hindrance to applying data - driven and transfer learning - based approaches with strong instability cases ."], "relation": "used for", "id": "2021.ranlp-1.29", "year": 2021, "rel_sent": "On the Usability of Transformers - based Models for a French Question - Answering Task.", "forward": false, "src_ids": "2021.ranlp-1.29_5801"}
{"input": "social - science - oriented resources is done by using Method| context: automated event extraction in social science applications often requires corpus - level evaluations : for example , aggregating text predictions across metadata and unbiased estimates of recall .", "entity": "social - science - oriented resources", "output": "annotation approach", "neg_sample": ["social - science - oriented resources is done by using Method", "automated event extraction in social science applications often requires corpus - level evaluations : for example , aggregating text predictions across metadata and unbiased estimates of recall ."], "relation": "used for", "id": "2021.findings-acl.371", "year": 2021, "rel_sent": "Our novel corpus - level evaluations and annotation approach can guide creation of similar social - science - oriented resources in the future .", "forward": false, "src_ids": "2021.findings-acl.371_8064"}
{"input": "russian is done by using Task| context: the success of pre - trained transformer language models has brought a great deal of interest on how these models work , and what they learn about language . however , prior research in the field is mainly devoted to english , and little is known regarding other languages .", "entity": "russian", "output": "probing tasks", "neg_sample": ["russian is done by using Task", "the success of pre - trained transformer language models has brought a great deal of interest on how these models work , and what they learn about language .", "however , prior research in the field is mainly devoted to english , and little is known regarding other languages ."], "relation": "used for", "id": "2021.bsnlp-1.6", "year": 2021, "rel_sent": "To this end , we introduce RuSentEval , an enhanced set of 14 probing tasks for Russian , including ones that have not been explored yet .", "forward": false, "src_ids": "2021.bsnlp-1.6_15234"}
{"input": "nested named entity recognition is done by using Method| context: named entity recognition ( ner ) is a well - studied task in natural language processing . traditional ner research only deals with flat entities and ignores nested entities . the span - based methods treat entity recognition as a span classification task . although these methods have the innate ability to handle nested ner , they suffer from high computational cost , ignorance of boundary information , under - utilization of the spans that partially match with entities , and difficulties in long entity recognition .", "entity": "nested named entity recognition", "output": "two - stage identifier", "neg_sample": ["nested named entity recognition is done by using Method", "named entity recognition ( ner ) is a well - studied task in natural language processing .", "traditional ner research only deals with flat entities and ignores nested entities .", "the span - based methods treat entity recognition as a span classification task .", "although these methods have the innate ability to handle nested ner , they suffer from high computational cost , ignorance of boundary information , under - utilization of the spans that partially match with entities , and difficulties in long entity recognition ."], "relation": "used for", "id": "2021.acl-long.216", "year": 2021, "rel_sent": "Locate and Label : A Two - stage Identifier for Nested Named Entity Recognition.", "forward": false, "src_ids": "2021.acl-long.216_6700"}
{"input": "algebraic recombination is done by using Method| context: neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks . compositional generalization requires algebraic recombination , i.e. , dynamically recombining structured expressions in a recursive manner . however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination .", "entity": "algebraic recombination", "output": "end - toend neural model", "neg_sample": ["algebraic recombination is done by using Method", "neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks .", "compositional generalization requires algebraic recombination , i.e.", ", dynamically recombining structured expressions in a recursive manner .", "however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination ."], "relation": "used for", "id": "2021.findings-acl.97", "year": 2021, "rel_sent": "In this paper , we propose LEAR , an end - toend neural model to learn algebraic recombination for compositional generalization .", "forward": false, "src_ids": "2021.findings-acl.97_12464"}
{"input": "one - to - one tutorial dialogue sessions is done by using Method| context: cmas are more fine - grained sub - utterance acts compared to traditional dialogue act mark - up .", "entity": "one - to - one tutorial dialogue sessions", "output": "conversational management act ( cma ) annotation schema", "neg_sample": ["one - to - one tutorial dialogue sessions is done by using Method", "cmas are more fine - grained sub - utterance acts compared to traditional dialogue act mark - up ."], "relation": "used for", "id": "2021.reinact-1.4", "year": 2021, "rel_sent": "We present a conversational management act ( CMA ) annotation schema for one - to - one tutorial dialogue sessions where a tutor uses an analogy to teach a student a concept .", "forward": false, "src_ids": "2021.reinact-1.4_15833"}
{"input": "consistency loss is used for OtherScientificTerm| context: a recent topic of research in natural language generation has been the development of automatic response generation modules that can automatically respond to a user 's utterance in an empathetic manner . previous research has tackled this task using neural generative methods by augmenting emotion classes with the input sequences . however , the outputs by these models may be inconsistent .", "entity": "consistency loss", "output": "coherent decoding", "neg_sample": ["consistency loss is used for OtherScientificTerm", "a recent topic of research in natural language generation has been the development of automatic response generation modules that can automatically respond to a user 's utterance in an empathetic manner .", "previous research has tackled this task using neural generative methods by augmenting emotion classes with the input sequences .", "however , the outputs by these models may be inconsistent ."], "relation": "used for", "id": "2021.eacl-main.255", "year": 2021, "rel_sent": "We use the focal loss to handle imbalanced data distribution , and utilize the consistency loss to allow coherent decoding by the decoders .", "forward": true, "src_ids": "2021.eacl-main.255_2192"}
{"input": "neural language model - based probabilistic metrics is used for OtherScientificTerm| context: unfortunately , it is a nebulous concept , difficult to both define and quantify .", "entity": "neural language model - based probabilistic metrics", "output": "disclosive transparency", "neg_sample": ["neural language model - based probabilistic metrics is used for OtherScientificTerm", "unfortunately , it is a nebulous concept , difficult to both define and quantify ."], "relation": "used for", "id": "2021.emnlp-main.153", "year": 2021, "rel_sent": "To improve this state of affairs , We introduce neural language model - based probabilistic metrics to directly model disclosive transparency , and demonstrate that they correlate with user and expert opinions of system transparency , making them a valid objective proxy .", "forward": true, "src_ids": "2021.emnlp-main.153_13178"}
{"input": "dialog structure is used for OtherScientificTerm| context: learning discrete dialog structure graph from human - human dialogs yields basic insights into the structure of conversation , and also provides background knowledge tofacilitate dialog generation . however , this problem is less studied in open - domain dialogue .", "entity": "dialog structure", "output": "multi - turn coherence", "neg_sample": ["dialog structure is used for OtherScientificTerm", "learning discrete dialog structure graph from human - human dialogs yields basic insights into the structure of conversation , and also provides background knowledge tofacilitate dialog generation .", "however , this problem is less studied in open - domain dialogue ."], "relation": "used for", "id": "2021.acl-long.136", "year": 2021, "rel_sent": "Experimental results on two benchmark corpora confirm that DVAE - GNN can discover meaningful dialog structure graph , and the use of dialog structure as background knowledge can significantly improve multi - turn coherence .", "forward": true, "src_ids": "2021.acl-long.136_4621"}
{"input": "bias is done by using Method| context: reporting and providing test sets for harmful bias in nlp applications is essential for building a robust understanding of the current problem . bias in downstream applications can stem from training data , word embeddings , or be amplified by the model in use . however , focusing on biased word embeddings is potentially the most impactful first step due to their universal nature .", "entity": "bias", "output": "post - processing methods", "neg_sample": ["bias is done by using Method", "reporting and providing test sets for harmful bias in nlp applications is essential for building a robust understanding of the current problem .", "bias in downstream applications can stem from training data , word embeddings , or be amplified by the model in use .", "however , focusing on biased word embeddings is potentially the most impactful first step due to their universal nature ."], "relation": "used for", "id": "2021.findings-acl.369", "year": 2021, "rel_sent": "Here we seek to understand how the intrinsic properties of word embeddings contribute to this observed marked attribute effect , and whether current post - processing methods address the bias successfully .", "forward": false, "src_ids": "2021.findings-acl.369_4974"}
{"input": "type - based embeddings is used for Task| context: the optimal fine - tuning of models including pre- and post - processing is largely unclear .", "entity": "type - based embeddings", "output": "lexical semantic change detection", "neg_sample": ["type - based embeddings is used for Task", "the optimal fine - tuning of models including pre- and post - processing is largely unclear ."], "relation": "used for", "id": "2021.eacl-main.10", "year": 2021, "rel_sent": "Effects of Pre- and Post - Processing on type - based Embeddings in Lexical Semantic Change Detection.", "forward": true, "src_ids": "2021.eacl-main.10_16018"}
{"input": "dialogue utterances is done by using OtherScientificTerm| context: automatic translation of dialogue texts is a much needed demand in many real life scenarios . however , the currently existing neural machine translation delivers unsatisfying results .", "entity": "dialogue utterances", "output": "context", "neg_sample": ["dialogue utterances is done by using OtherScientificTerm", "automatic translation of dialogue texts is a much needed demand in many real life scenarios .", "however , the currently existing neural machine translation delivers unsatisfying results ."], "relation": "used for", "id": "2021.naacl-industry.14", "year": 2021, "rel_sent": "In response to these challenges , we propose a joint learning method to identify omission and typo , and utilize context to translate dialogue utterances .", "forward": false, "src_ids": "2021.naacl-industry.14_12100"}
{"input": "soa is used for Task| context: basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al . ( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life . an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis .", "entity": "soa", "output": "detection of blc", "neg_sample": ["soa is used for Task", "basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al .", "( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life .", "an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis ."], "relation": "used for", "id": "2021.emnlp-main.654", "year": 2021, "rel_sent": "The best of our methods outperforms the current SoA in BLC detection , with an accuracy of English BLC detection of 75.0 % , and of Mandarin BLC detection 80.7 % on a test set .", "forward": true, "src_ids": "2021.emnlp-main.654_11109"}
{"input": "english complex sentence constructions is used for Task| context: abstract meaning representation ( amr ) is a sentence - level meaning representation based on predicate argument structure . knowing the core part of the sentence structure in advance may be beneficial in such a task .", "entity": "english complex sentence constructions", "output": "amr parsing", "neg_sample": ["english complex sentence constructions is used for Task", "abstract meaning representation ( amr ) is a sentence - level meaning representation based on predicate argument structure .", "knowing the core part of the sentence structure in advance may be beneficial in such a task ."], "relation": "used for", "id": "2021.starsem-1.20", "year": 2021, "rel_sent": "In this paper , we present a list of dependency patterns for English complex sentence constructions designed for AMR parsing .", "forward": true, "src_ids": "2021.starsem-1.20_10469"}
{"input": "hybrid temporal tagger is used for Material| context: reliable tagging of temporal expressions ( tes , e.g. , book a table at l'osteria for sunday evening ) is a central requirement for voice assistants ( vas ) . however , there is a dearth of resources and systems for the va domain , since publicly - available temporal taggers are trained only on substantially different domains , such as news and clinical text .", "entity": "hybrid temporal tagger", "output": "english va domain", "neg_sample": ["hybrid temporal tagger is used for Material", "reliable tagging of temporal expressions ( tes , e.g.", ", book a table at l'osteria for sunday evening ) is a central requirement for voice assistants ( vas ) .", "however , there is a dearth of resources and systems for the va domain , since publicly - available temporal taggers are trained only on substantially different domains , such as news and clinical text ."], "relation": "used for", "id": "2021.iwcs-1.14", "year": 2021, "rel_sent": "Since the cost of annotating large datasets is prohibitive , we investigate the trade - off between in - domain data and performance in DA - Time , a hybrid temporal tagger for the English VA domain which combines a neural architecture for robust TE recognition , with a parser - based TE normalizer .", "forward": true, "src_ids": "2021.iwcs-1.14_4635"}
{"input": "joint extraction of entities and relations is done by using Method| context: joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) . a common method is to decode triples by predicting entity pairs to obtain the corresponding relation . however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem .", "entity": "joint extraction of entities and relations", "output": "translating decoding schema", "neg_sample": ["joint extraction of entities and relations is done by using Method", "joint extraction of entities and relations from unstructured texts toform factual triples is a fundamental task of constructing a knowledge base ( kb ) .", "a common method is to decode triples by predicting entity pairs to obtain the corresponding relation .", "however , it is still challenging to handle this task efficiently , especially for the overlapping triple problem ."], "relation": "used for", "id": "2021.emnlp-main.635", "year": 2021, "rel_sent": "To address such a problem , this paper proposes a novel efficient entities and relations extraction model called TDEER , which stands for Translating Decoding Schema for Joint Extraction of Entities and Relations .", "forward": false, "src_ids": "2021.emnlp-main.635_14575"}
{"input": "reinforce algorithm is used for Task| context: the recent progress has featured advanced transformer - based language models ( e.g. , bert ) as a critical component in state - of - the - art models for ed . however , the length limit for input texts is a barrier for such ed models as they can not encode long - range document - level context that has been shown to be beneficial for ed .", "entity": "reinforce algorithm", "output": "relevant sentence selection", "neg_sample": ["reinforce algorithm is used for Task", "the recent progress has featured advanced transformer - based language models ( e.g.", ", bert ) as a critical component in state - of - the - art models for ed .", "however , the length limit for input texts is a barrier for such ed models as they can not encode long - range document - level context that has been shown to be beneficial for ed ."], "relation": "used for", "id": "2021.emnlp-main.439", "year": 2021, "rel_sent": "To this end , the REINFORCE algorithm is employed to train the relevant sentence selection for ED .", "forward": true, "src_ids": "2021.emnlp-main.439_8695"}
{"input": "pre - trained language models is used for Task| context: this paper presents the first study on using large - scale pre - trained language models for automated generation of an event - level temporal graph for a document . despite the huge success of neural pre - training methods in nlp tasks , its potential for temporal reasoning over event graphs has not been sufficiently explored . part of the reason is the difficulty in obtaining large training corpora with human - annotated events and temporal links .", "entity": "pre - trained language models", "output": "graph generation task", "neg_sample": ["pre - trained language models is used for Task", "this paper presents the first study on using large - scale pre - trained language models for automated generation of an event - level temporal graph for a document .", "despite the huge success of neural pre - training methods in nlp tasks , its potential for temporal reasoning over event graphs has not been sufficiently explored .", "part of the reason is the difficulty in obtaining large training corpora with human - annotated events and temporal links ."], "relation": "used for", "id": "2021.naacl-main.67", "year": 2021, "rel_sent": "These strategies enable us to leverage and fine - tune pre - trained language models on the system - induced training data for the graph generation task .", "forward": true, "src_ids": "2021.naacl-main.67_3831"}
{"input": "lexical substitution is done by using Method| context: while its usage has increased in recent years , the paucity of annotated data prevents the finetuning of neural models on the task , hindering the full fruition of recently introduced powerful architectures such as language models . furthermore , lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary , making it impossible to credit appropriate , but out - of - vocabulary , substitutes .", "entity": "lexical substitution", "output": "generative approach", "neg_sample": ["lexical substitution is done by using Method", "while its usage has increased in recent years , the paucity of annotated data prevents the finetuning of neural models on the task , hindering the full fruition of recently introduced powerful architectures such as language models .", "furthermore , lexical substitution is usually evaluated in a framework that is strictly bound to a limited vocabulary , making it impossible to credit appropriate , but out - of - vocabulary , substitutes ."], "relation": "used for", "id": "2021.emnlp-main.844", "year": 2021, "rel_sent": "To assess these issues , we proposed GeneSis ( Generating Substitutes in contexts ) , the first generative approach to lexical substitution .", "forward": false, "src_ids": "2021.emnlp-main.844_14336"}
{"input": "chinese - lattice - bert is done by using Method| context: chinese pre - trained language models usually process text as a sequence of characters , while ignoring more coarse granularity , e.g. , words .", "entity": "chinese - lattice - bert", "output": "pre - training paradigm", "neg_sample": ["chinese - lattice - bert is done by using Method", "chinese pre - trained language models usually process text as a sequence of characters , while ignoring more coarse granularity , e.g.", ", words ."], "relation": "used for", "id": "2021.naacl-main.137", "year": 2021, "rel_sent": "In this work , we propose a novel pre - training paradigm for Chinese - Lattice - BERT , which explicitly incorporates word representations along with characters , thus can model a sentence in a multi - granularity manner .", "forward": false, "src_ids": "2021.naacl-main.137_15929"}
{"input": "h - fnd is used for Task| context: although distant supervision automatically generates training data for relation extraction , it also introduces false - positive ( fp ) and false - negative ( fn ) training instances to the generated datasets . whereas both types of errors degrade the final model performance , previous work on distant supervision denoising focuses more on suppressing fp noise and less on resolving the fn problem .", "entity": "h - fnd", "output": "fn denoising solution", "neg_sample": ["h - fnd is used for Task", "although distant supervision automatically generates training data for relation extraction , it also introduces false - positive ( fp ) and false - negative ( fn ) training instances to the generated datasets .", "whereas both types of errors degrade the final model performance , previous work on distant supervision denoising focuses more on suppressing fp noise and less on resolving the fn problem ."], "relation": "used for", "id": "2021.findings-acl.228", "year": 2021, "rel_sent": "We here propose H - FND , a hierarchical false - negative denoising framework for robust distant supervision relation extraction , as an FN denoising solution .", "forward": true, "src_ids": "2021.findings-acl.228_4867"}
{"input": "dual slot selector is used for Task| context: existing approaches generally predict the dialogue state at every turn from scratch . however , the overwhelming majority of the slots in each turn should simply inherit the slot values from the previous turn . therefore , the mechanism of treating slots equally in each turn not only is inefficient but also may lead to additional errors because of the redundant slot value generation .", "entity": "dual slot selector", "output": "dialogue state tracking", "neg_sample": ["dual slot selector is used for Task", "existing approaches generally predict the dialogue state at every turn from scratch .", "however , the overwhelming majority of the slots in each turn should simply inherit the slot values from the previous turn .", "therefore , the mechanism of treating slots equally in each turn not only is inefficient but also may lead to additional errors because of the redundant slot value generation ."], "relation": "used for", "id": "2021.acl-long.12", "year": 2021, "rel_sent": "Dual Slot Selector via Local Reliability Verification for Dialogue State Tracking.", "forward": true, "src_ids": "2021.acl-long.12_1452"}
{"input": "morphosyntactic alignment system is used for OtherScientificTerm| context: we investigate how multilingual bert ( mbert ) encodes grammar by examining how the high - order grammatical feature of morphosyntactic alignment ( how different languages define what counts as a ' subject ' ) is manifested across the embedding spaces of different languages .", "entity": "morphosyntactic alignment system", "output": "intransitive subjects", "neg_sample": ["morphosyntactic alignment system is used for OtherScientificTerm", "we investigate how multilingual bert ( mbert ) encodes grammar by examining how the high - order grammatical feature of morphosyntactic alignment ( how different languages define what counts as a ' subject ' ) is manifested across the embedding spaces of different languages ."], "relation": "used for", "id": "2021.scil-1.50", "year": 2021, "rel_sent": "Whereas in English and many other languages , we think of intransitive subjects as grammatical subjects , ergative languages have a different morphosyntactic alignment system that aligns intransitive subjects", "forward": true, "src_ids": "2021.scil-1.50_653"}
{"input": "english definitions is used for Material| context: definition modelling is the task of automatically generating a dictionary - style definition given a target word .", "entity": "english definitions", "output": "wolastoqey ( malecite - passamaquoddy ) words", "neg_sample": ["english definitions is used for Material", "definition modelling is the task of automatically generating a dictionary - style definition given a target word ."], "relation": "used for", "id": "2021.ranlp-1.17", "year": 2021, "rel_sent": "Specifically , we generate English definitions for Wolastoqey ( Malecite - Passamaquoddy ) words .", "forward": true, "src_ids": "2021.ranlp-1.17_2589"}
{"input": "intermediate representations is done by using Method| context: to translate natural language questions into executable database queries , most approaches rely on a fully annotated training set . annotating a large dataset with queries is difficult as it requires query - language expertise .", "entity": "intermediate representations", "output": "neural semantic parser", "neg_sample": ["intermediate representations is done by using Method", "to translate natural language questions into executable database queries , most approaches rely on a fully annotated training set .", "annotating a large dataset with queries is difficult as it requires query - language expertise ."], "relation": "used for", "id": "2021.emnlp-main.708", "year": 2021, "rel_sent": "Our pipeline consists of two parts : a neural semantic parser that converts natural language questions into the intermediate representations and a non - trainable transpiler to the SPARQL query language ( a standard language for accessing knowledge graphs and semantic web ) .", "forward": false, "src_ids": "2021.emnlp-main.708_13211"}
{"input": "executive function prediction model is done by using Method| context: as the average life expectancy of chinese people rises , the health care problems of the elderly are becoming more diverse , and the demand for long - term care is also increasing . therefore , how to help the elderly have a good quality of life and maintain their dignity is what we need to think about .", "entity": "executive function prediction model", "output": "word vector model", "neg_sample": ["executive function prediction model is done by using Method", "as the average life expectancy of chinese people rises , the health care problems of the elderly are becoming more diverse , and the demand for long - term care is also increasing .", "therefore , how to help the elderly have a good quality of life and maintain their dignity is what we need to think about ."], "relation": "used for", "id": "2021.rocling-1.8", "year": 2021, "rel_sent": "Then , through the word vector model and regression model , an executive function prediction model based on dialogue data is established to help understand the degradation trajectory of executive function and establish an early warning .", "forward": false, "src_ids": "2021.rocling-1.8_8859"}
{"input": "transformer is used for Task| context: one of the major problems in applying deep learning to software engineering is that source code often contains a lot of rare identifiers , resulting in huge vocabularies .", "entity": "transformer", "output": "code processing tasks", "neg_sample": ["transformer is used for Task", "one of the major problems in applying deep learning to software engineering is that source code often contains a lot of rare identifiers , resulting in huge vocabularies ."], "relation": "used for", "id": "2021.naacl-main.26", "year": 2021, "rel_sent": "We show that the proposed OOV anonymization method significantly improves the performance of the Transformer in two code processing tasks : code completion and bug fixing .", "forward": true, "src_ids": "2021.naacl-main.26_9392"}
{"input": "aspect sentiment quad prediction is done by using Method| context: aspect - based sentiment analysis ( absa ) has been extensively studied in recent years , which typically involves four fundamental sentiment elements , including the aspect category , aspect term , opinion term , and sentiment polarity . existing studies usually consider the detection of partial sentiment elements , instead of predicting the four elements in one shot .", "entity": "aspect sentiment quad prediction", "output": "generation formulation", "neg_sample": ["aspect sentiment quad prediction is done by using Method", "aspect - based sentiment analysis ( absa ) has been extensively studied in recent years , which typically involves four fundamental sentiment elements , including the aspect category , aspect term , opinion term , and sentiment polarity .", "existing studies usually consider the detection of partial sentiment elements , instead of predicting the four elements in one shot ."], "relation": "used for", "id": "2021.emnlp-main.726", "year": 2021, "rel_sent": "On one hand , the generation formulation allows solving ASQP in an end - to - end manner , alleviating the potential error propagation in the pipeline solution .", "forward": false, "src_ids": "2021.emnlp-main.726_12993"}
{"input": "schema - aware curriculum learning is used for Task| context: existing dialog state tracking ( dst ) models are trained with dialog data in a random order , neglecting rich structural information in a dataset .", "entity": "schema - aware curriculum learning", "output": "multi - domain dialogue state tracking", "neg_sample": ["schema - aware curriculum learning is used for Task", "existing dialog state tracking ( dst ) models are trained with dialog data in a random order , neglecting rich structural information in a dataset ."], "relation": "used for", "id": "2021.acl-short.111", "year": 2021, "rel_sent": "Preview , Attend and Review : Schema - Aware Curriculum Learning for Multi - Domain Dialogue State Tracking.", "forward": true, "src_ids": "2021.acl-short.111_3985"}
{"input": "user answer is done by using Method| context: in semantic parsing of geographical queries against real - world databases such as openstreetmap ( osm ) , unique correct answers do not necessarily exist . instead , the truth might be lying in the eye of the user , who needs to enter an interactive setup where ambiguities can be resolved and parsing mistakes can be corrected .", "entity": "user answer", "output": "multi - source training", "neg_sample": ["user answer is done by using Method", "in semantic parsing of geographical queries against real - world databases such as openstreetmap ( osm ) , unique correct answers do not necessarily exist .", "instead , the truth might be lying in the eye of the user , who needs to enter an interactive setup where ambiguities can be resolved and parsing mistakes can be corrected ."], "relation": "used for", "id": "2021.splurobonlp-1.6", "year": 2021, "rel_sent": "Our experimental results show that a combination of entropy - based uncertainty detection and beam search , together with multi - source training on clarification question , initial parse , and user answer , results in improvements of 1.2 % F1 score on a parser that already performs at 90.26 % on the NLMaps dataset for OSM semantic parsing .", "forward": false, "src_ids": "2021.splurobonlp-1.6_8008"}
{"input": "hedges is used for Material| context: the use of hedges plays an important role in daily language use .", "entity": "hedges", "output": "interpersonal interaction", "neg_sample": ["hedges is used for Material", "the use of hedges plays an important role in daily language use ."], "relation": "used for", "id": "2021.paclic-1.9", "year": 2021, "rel_sent": "It is because that with multiple functions , hedges are beneficial for speakers tofacilitating interpersonal interaction .", "forward": true, "src_ids": "2021.paclic-1.9_13402"}
{"input": "adversarial training is used for OtherScientificTerm| context: arabic diacritization is a fundamental task for arabic language processing . previous studies have demonstrated that automatically generated knowledge can be helpful to this task . however , these studies regard the auto - generated knowledge instances as gold references , which limits their effectiveness since such knowledge is not always accurate and inferior instances can lead to incorrect predictions .", "entity": "adversarial training", "output": "noisy knowledge", "neg_sample": ["adversarial training is used for OtherScientificTerm", "arabic diacritization is a fundamental task for arabic language processing .", "previous studies have demonstrated that automatically generated knowledge can be helpful to this task .", "however , these studies regard the auto - generated knowledge instances as gold references , which limits their effectiveness since such knowledge is not always accurate and inferior instances can lead to incorrect predictions ."], "relation": "used for", "id": "2021.acl-short.68", "year": 2021, "rel_sent": "In this paper , we propose to use regularized decoding and adversarial training to appropriately learn from such noisy knowledge for diacritization .", "forward": true, "src_ids": "2021.acl-short.68_9871"}
{"input": "data model is used for Task| context: transcribing low resource languages can be challenging in the absence of a good lexicon and trained transcribers .", "entity": "data model", "output": "interactive transcription", "neg_sample": ["data model is used for Task", "transcribing low resource languages can be challenging in the absence of a good lexicon and trained transcribers ."], "relation": "used for", "id": "2021.dash-1.16", "year": 2021, "rel_sent": "This paper presents a data model and a system architecture for interactive transcription , supporting multiple modes of interactivity , increasing the likelihood of finding tasks that engage local participation in language work .", "forward": true, "src_ids": "2021.dash-1.16_15472"}
{"input": "automation is used for Task| context: human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs . automatic metrics are cheap and reproducible but sometimes poorly correlated with human judgment .", "entity": "automation", "output": "summary evaluation", "neg_sample": ["automation is used for Task", "human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs .", "automatic metrics are cheap and reproducible but sometimes poorly correlated with human judgment ."], "relation": "used for", "id": "2021.emnlp-main.531", "year": 2021, "rel_sent": "Finding a Balanced Degree of Automation for Summary Evaluation.", "forward": true, "src_ids": "2021.emnlp-main.531_2610"}
{"input": "data - deficient dialog generation task is done by using Method| context: under the pandemic of covid-19 , people experiencing covid19 - related symptoms have a pressing need to consult doctors . because of the shortage of medical professionals , many people can not receive online consultations timely . training complex dialog generation models on small datasets bears high risk of overfitting .", "entity": "data - deficient dialog generation task", "output": "multi - task learning approach", "neg_sample": ["data - deficient dialog generation task is done by using Method", "under the pandemic of covid-19 , people experiencing covid19 - related symptoms have a pressing need to consult doctors .", "because of the shortage of medical professionals , many people can not receive online consultations timely .", "training complex dialog generation models on small datasets bears high risk of overfitting ."], "relation": "used for", "id": "2021.acl-short.112", "year": 2021, "rel_sent": "To alleviate overfitting , we develop a multi - task learning approach , which regularizes the data - deficient dialog generation task with a masked token prediction task .", "forward": false, "src_ids": "2021.acl-short.112_6255"}
{"input": "learningbased interpretation method is used for OtherScientificTerm| context: it is evident that deep text classification models trained on human data could be biased . in particular , they produce biased outcomes for texts that explicitly include identity terms of certain demographic groups . we refer to this type of bias as explicit bias , which has been extensively studied . however , deep text classification models can also produce biased outcomes for texts written by authors of certain demographic groups .", "entity": "learningbased interpretation method", "output": "implicit bias", "neg_sample": ["learningbased interpretation method is used for OtherScientificTerm", "it is evident that deep text classification models trained on human data could be biased .", "in particular , they produce biased outcomes for texts that explicitly include identity terms of certain demographic groups .", "we refer to this type of bias as explicit bias , which has been extensively studied .", "however , deep text classification models can also produce biased outcomes for texts written by authors of certain demographic groups ."], "relation": "used for", "id": "2021.findings-acl.7", "year": 2021, "rel_sent": "Then , we build a learningbased interpretation method to deepen our knowledge of implicit bias .", "forward": true, "src_ids": "2021.findings-acl.7_11249"}
{"input": "stereotypical biases is done by using Task| context: a stereotype is an over - generalized belief about a particular group of people , e.g. , asians are good at math or african americans are athletic . such beliefs ( biases ) are known to hurt target groups . since pretrained language models are trained on large real - world data , they are known to capture stereotypical biases . it is important to quantify to what extent these biases are present in them . although this is a rapidly growing area of research , existing literature lacks in two important aspects : 1 ) they mainly evaluate bias of pretrained language models on a small set of artificial sentences , even though these models are trained on natural data 2 ) current evaluations focus on measuring bias without considering the language modeling ability of a model , which could lead to misleading trust on a model even if it is a poor language model . we address both these problems .", "entity": "stereotypical biases", "output": "stereoset", "neg_sample": ["stereotypical biases is done by using Task", "a stereotype is an over - generalized belief about a particular group of people , e.g.", ", asians are good at math or african americans are athletic .", "such beliefs ( biases ) are known to hurt target groups .", "since pretrained language models are trained on large real - world data , they are known to capture stereotypical biases .", "it is important to quantify to what extent these biases are present in them .", "although this is a rapidly growing area of research , existing literature lacks in two important aspects : 1 ) they mainly evaluate bias of pretrained language models on a small set of artificial sentences , even though these models are trained on natural data 2 ) current evaluations focus on measuring bias without considering the language modeling ability of a model , which could lead to misleading trust on a model even if it is a poor language model .", "we address both these problems ."], "relation": "used for", "id": "2021.acl-long.416", "year": 2021, "rel_sent": "We present StereoSet , a large - scale natural English dataset to measure stereotypical biases in four domains : gender , profession , race , and religion .", "forward": false, "src_ids": "2021.acl-long.416_7001"}
{"input": "structural similarity is done by using Material| context: unsupervised cross - lingual word embedding(clwe ) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora . this method relies on the assumption that the two embedding spaces are structurally similar , which does not necessarily hold true in general .", "entity": "structural similarity", "output": "pseudo - parallel corpus", "neg_sample": ["structural similarity is done by using Material", "unsupervised cross - lingual word embedding(clwe ) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora .", "this method relies on the assumption that the two embedding spaces are structurally similar , which does not necessarily hold true in general ."], "relation": "used for", "id": "2021.acl-srw.17", "year": 2021, "rel_sent": "In this paper , we argue that using a pseudo - parallel corpus generated by an unsupervised machine translation model facilitates the structural similarity of the two embedding spaces and improves the quality of CLWEs in the unsupervised mapping method .", "forward": false, "src_ids": "2021.acl-srw.17_11101"}
{"input": "fine - grained annotated dataset is used for Task| context: whatsapp messenger is one of the most popular channels for spreading information with a current reach of more than 180 countries and 2 billion people . its widespread usage has made it one of the most popular media for information propagation among the masses during any socially engaging event . in the recent past , several countries have witnessed its effectiveness and influence in political and social campaigns . we observe a high surge in information and propaganda flow during election campaigning .", "entity": "fine - grained annotated dataset", "output": "whatsapp political campaigning", "neg_sample": ["fine - grained annotated dataset is used for Task", "whatsapp messenger is one of the most popular channels for spreading information with a current reach of more than 180 countries and 2 billion people .", "its widespread usage has made it one of the most popular media for information propagation among the masses during any socially engaging event .", "in the recent past , several countries have witnessed its effectiveness and influence in political and social campaigns .", "we observe a high surge in information and propaganda flow during election campaigning ."], "relation": "used for", "id": "2021.wnut-1.15", "year": 2021, "rel_sent": "In addition to the raw noisy user - generated data , we present a fine - grained annotated dataset of 3,848 messages that will be useful to understand the various dimensions of WhatsApp political campaigning .", "forward": true, "src_ids": "2021.wnut-1.15_16081"}
{"input": "numeracy is used for Method| context: specialized number representations in nlp have shown improvements on numerical reasoning tasks like arithmetic word problems and masked number prediction . but humans also use numeracy to make better sense of world concepts , e.g. , you can seat 5 people in your ' room ' but not 500 . does a better grasp of numbers improve a model 's understanding of other concepts and words ?", "entity": "numeracy", "output": "language models", "neg_sample": ["numeracy is used for Method", "specialized number representations in nlp have shown improvements on numerical reasoning tasks like arithmetic word problems and masked number prediction .", "but humans also use numeracy to make better sense of world concepts , e.g.", ", you can seat 5 people in your ' room ' but not 500 . does a better grasp of numbers improve a model 's understanding of other concepts and words ?"], "relation": "used for", "id": "2021.emnlp-main.557", "year": 2021, "rel_sent": "Numeracy enhances the Literacy of Language Models.", "forward": true, "src_ids": "2021.emnlp-main.557_5095"}
{"input": "classification is done by using Method| context: toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text .", "entity": "classification", "output": "auxiliary information module", "neg_sample": ["classification is done by using Method", "toxic span detection requires the detection of spans that make a text toxic instead of simply classifying the text ."], "relation": "used for", "id": "2021.semeval-1.112", "year": 2021, "rel_sent": "It consists of three parts : a transformer - based model that can obtain the token representation , an auxiliary information module that combines features from different layers , and an output layer used for the classification .", "forward": false, "src_ids": "2021.semeval-1.112_10209"}
{"input": "logical ) model is done by using Method| context: recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answers the given question . typically , these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained .", "entity": "logical ) model", "output": "structured datalog program", "neg_sample": ["logical ) model is done by using Method", "recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answers the given question .", "typically , these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained ."], "relation": "used for", "id": "2021.starsem-1.10", "year": 2021, "rel_sent": "All texts are accompanied by a structured Datalog program that represents a ( logical ) model of its information .", "forward": false, "src_ids": "2021.starsem-1.10_10862"}
{"input": "human grammar acquisition is done by using Method| context: this article describes a simple pcfg induction model with a fixed category domain that predicts a large majority of attested constituent boundaries , and predicts labels consistent with nearly half of attested constituent labels on a standard evaluation data set of child - directed speech .", "entity": "human grammar acquisition", "output": "depth - bounded statistical pcfg induction", "neg_sample": ["human grammar acquisition is done by using Method", "this article describes a simple pcfg induction model with a fixed category domain that predicts a large majority of attested constituent boundaries , and predicts labels consistent with nearly half of attested constituent labels on a standard evaluation data set of child - directed speech ."], "relation": "used for", "id": "2021.cl-1.7", "year": 2021, "rel_sent": "Depth - Bounded Statistical PCFG Induction as a Model of Human Grammar Acquisition.", "forward": false, "src_ids": "2021.cl-1.7_7274"}
{"input": "natural triggers is used for Task| context: recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks , which are input - agnostic sequences of words added to text processed by classifiers . despite being successful , the word sequences produced in such attacks are often ungrammatical and can be easily distinguished from natural text . we develop adversarial attacks that appear closer to natural english phrases and yet confuse classification systems when added to benign inputs .", "entity": "natural triggers", "output": "text classification", "neg_sample": ["natural triggers is used for Task", "recent work has demonstrated the vulnerability of modern text classifiers to universal adversarial attacks , which are input - agnostic sequences of words added to text processed by classifiers .", "despite being successful , the word sequences produced in such attacks are often ungrammatical and can be easily distinguished from natural text .", "we develop adversarial attacks that appear closer to natural english phrases and yet confuse classification systems when added to benign inputs ."], "relation": "used for", "id": "2021.naacl-main.291", "year": 2021, "rel_sent": "Universal Adversarial Attacks with Natural Triggers for Text Classification.", "forward": true, "src_ids": "2021.naacl-main.291_9156"}
{"input": "monolingual data is used for Method| context: the common practice is to construct synthetic data based on a randomly sampled subset of large - scale monolingual data , which we empirically show is sub - optimal .", "entity": "monolingual data", "output": "self - training", "neg_sample": ["monolingual data is used for Method", "the common practice is to construct synthetic data based on a randomly sampled subset of large - scale monolingual data , which we empirically show is sub - optimal ."], "relation": "used for", "id": "2021.acl-long.221", "year": 2021, "rel_sent": "Accordingly , we design an uncertainty - based sampling strategy to efficiently exploit the monolingual data for self - training , in which monolingual sentences with higher uncertainty would be sampled with higher probability .", "forward": true, "src_ids": "2021.acl-long.221_10632"}
{"input": "feature extraction algorithm is done by using Method| context: as the system of confiscation becomes more and more perfect , grasping the distribution of the types of confiscations actually announced by the courts will enable you to understand changing of the trend . in addition to assisting legislators in formulating laws , it can also provide other people with an understanding of the actual operation of the confiscation system . in order to enable artificial intelligence technology to automatically identify the distribution of confiscation , and consumes a lot of manpower and time costs of manual judgment .", "entity": "feature extraction algorithm", "output": "word2vec algorithm", "neg_sample": ["feature extraction algorithm is done by using Method", "as the system of confiscation becomes more and more perfect , grasping the distribution of the types of confiscations actually announced by the courts will enable you to understand changing of the trend .", "in addition to assisting legislators in formulating laws , it can also provide other people with an understanding of the actual operation of the confiscation system .", "in order to enable artificial intelligence technology to automatically identify the distribution of confiscation , and consumes a lot of manpower and time costs of manual judgment ."], "relation": "used for", "id": "2021.rocling-1.41", "year": 2021, "rel_sent": "This research will use Term Frequency - Inverse Document Frequency ( TF - IDF ) and Word2Vec algorithm as the feature extraction algorithm , with random forest classifier , and CKIPlabBERT pretrained model for training and identification .", "forward": false, "src_ids": "2021.rocling-1.41_8353"}
{"input": "massive lexicon table is used for OtherScientificTerm| context: we translate a closed text that is known in advance into a severely low resource language by leveraging massive source parallelism . in other words , given a text in 124 source languages , we translate it into a severely low resource language using only ~1,000 lines of low resource data without any external help .", "entity": "massive lexicon table", "output": "bible named entities", "neg_sample": ["massive lexicon table is used for OtherScientificTerm", "we translate a closed text that is known in advance into a severely low resource language by leveraging massive source parallelism .", "in other words , given a text in 124 source languages , we translate it into a severely low resource language using only ~1,000 lines of low resource data without any external help ."], "relation": "used for", "id": "2021.sigtyp-1.7", "year": 2021, "rel_sent": "In order to translate named entities well , we build a massive lexicon table for 2,939 Bible named entities in 124 source languages , and include many that occur once and covers more than 66 severely low resource languages .", "forward": true, "src_ids": "2021.sigtyp-1.7_14008"}
{"input": "keyphrase extraction is done by using Method| context: to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential . keyphrases provide a concise representation by identifying salient concepts in a document . various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts . moreover , keyphrases , which are usually the gist of a document , need to be the central theme .", "entity": "keyphrase extraction", "output": "word centrality constrained representation", "neg_sample": ["keyphrase extraction is done by using Method", "to keep pace with the increased generation and digitization of documents , automated methods that can improve search , discovery and mining of the vast body of literature are essential .", "keyphrases provide a concise representation by identifying salient concepts in a document .", "various supervised approaches model keyphrase extraction using local context to predict the label for each token and perform much better than the unsupervised counterparts .", "moreover , keyphrases , which are usually the gist of a document , need to be the central theme ."], "relation": "used for", "id": "2021.bionlp-1.17", "year": 2021, "rel_sent": "Word centrality constrained representation for keyphrase extraction.", "forward": false, "src_ids": "2021.bionlp-1.17_4784"}
{"input": "encoder - decoders is used for Task| context: we often use perturbations to regularize neural models . for neural encoder - decoders , previous studies applied the scheduled sampling ( bengio et al . , 2015 ) and adversarial perturbations ( sato et al . , 2019 ) as perturbations but these methods require considerable computational time . thus , this study addresses the question of whether these approaches are efficient enough for training time .", "entity": "encoder - decoders", "output": "fast training", "neg_sample": ["encoder - decoders is used for Task", "we often use perturbations to regularize neural models .", "for neural encoder - decoders , previous studies applied the scheduled sampling ( bengio et al .", ", 2015 ) and adversarial perturbations ( sato et al .", ", 2019 ) as perturbations but these methods require considerable computational time .", "thus , this study addresses the question of whether these approaches are efficient enough for training time ."], "relation": "used for", "id": "2021.naacl-main.460", "year": 2021, "rel_sent": "Rethinking Perturbations in Encoder - Decoders for Fast Training.", "forward": true, "src_ids": "2021.naacl-main.460_10014"}
{"input": "language generation is done by using Method| context: beam search is a go - to strategy for decoding neural sequence models . the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates . empirically , this leads to sets often exhibiting high overlap , e.g. , strings may differ by only a single word . yet in use - cases that call for multiple solutions , a diverse or representative set is often desired .", "entity": "language generation", "output": "diverse set generation strategies", "neg_sample": ["language generation is done by using Method", "beam search is a go - to strategy for decoding neural sequence models .", "the algorithm can naturally be viewed as a subset optimization problem , albeit one where the corresponding set function does not reflect interactions between candidates .", "empirically , this leads to sets often exhibiting high overlap , e.g.", ", strings may differ by only a single word .", "yet in use - cases that call for multiple solutions , a diverse or representative set is often desired ."], "relation": "used for", "id": "2021.acl-long.512", "year": 2021, "rel_sent": "We observe that our algorithm offers competitive performance against other diverse set generation strategies in the context of language generation , while providing a more general approach to optimizing for diversity .", "forward": false, "src_ids": "2021.acl-long.512_11820"}
{"input": "link selection task is done by using Method| context: conversations are often held in laboratories and companies . a summary is vital to grasp the content of a discussion for people who did not attend the discussion . if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately .", "entity": "link selection task", "output": "score - based approach", "neg_sample": ["link selection task is done by using Method", "conversations are often held in laboratories and companies .", "a summary is vital to grasp the content of a discussion for people who did not attend the discussion .", "if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately ."], "relation": "used for", "id": "2021.ranlp-1.61", "year": 2021, "rel_sent": "Then , we apply a score - based approach as the second step : a link selection task .", "forward": false, "src_ids": "2021.ranlp-1.61_156"}
{"input": "fake news detection system is done by using Method| context: the exponential growth of the internet and social media in the past decade gave way to the increase in dissemination of false or misleading information . since the 2016 us presidential election , the term ' fake news ' became increasingly popular and this phenomenon has received more attention . in the past years several fact - checking agencies were created , but due to the great number of daily posts on social media , manual checking is insufficient . currently , there is a pressing need for automatic fake news detection tools , either to assist manual fact - checkers or to operate as standalone tools . there are several projects underway on this topic , but most of them focus on english .", "entity": "fake news detection system", "output": "deep learning models", "neg_sample": ["fake news detection system is done by using Method", "the exponential growth of the internet and social media in the past decade gave way to the increase in dissemination of false or misleading information .", "since the 2016 us presidential election , the term ' fake news ' became increasingly popular and this phenomenon has received more attention .", "in the past years several fact - checking agencies were created , but due to the great number of daily posts on social media , manual checking is insufficient .", "currently , there is a pressing need for automatic fake news detection tools , either to assist manual fact - checkers or to operate as standalone tools .", "there are several projects underway on this topic , but most of them focus on english ."], "relation": "used for", "id": "2021.triton-1.16", "year": 2021, "rel_sent": "Based on the preliminary results of these classifiers , we shall choose a deep learning model or combine several deep learning models which hold promise to enhance the performance of our fake news detection system .", "forward": false, "src_ids": "2021.triton-1.16_14655"}
{"input": "meta - learner is done by using Method| context: with the wide availability of pre - trained language models ( plms ) , multi - task fine - tuning across domains has been extensively applied . for tasks related to distant domains with different class label sets , plms may memorize non - transferable knowledge for the target domain and suffer from negative transfer .", "entity": "meta - learner", "output": "weighted maximum entropy regularizers", "neg_sample": ["meta - learner is done by using Method", "with the wide availability of pre - trained language models ( plms ) , multi - task fine - tuning across domains has been extensively applied .", "for tasks related to distant domains with different class label sets , plms may memorize non - transferable knowledge for the target domain and suffer from negative transfer ."], "relation": "used for", "id": "2021.emnlp-main.768", "year": 2021, "rel_sent": "The weighted maximum entropy regularizers are proposed to make meta - learner more task - agnostic and unbiased .", "forward": false, "src_ids": "2021.emnlp-main.768_7772"}
{"input": "skge models is done by using Method| context: static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years . recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged .", "entity": "skge models", "output": "recursive temporal fact embedding ( rtfe ) framework", "neg_sample": ["skge models is done by using Method", "static knowledge graph ( skg ) embedding ( skge ) has been studied intensively in the past years .", "recently , temporal knowledge graph ( tkg ) embedding ( tkge ) has emerged ."], "relation": "used for", "id": "2021.naacl-main.451", "year": 2021, "rel_sent": "In this paper , we propose a Recursive Temporal Fact Embedding ( RTFE ) framework to transplant SKGE models to TKGs and to enhance the performance of existing TKGE models for TKG completion .", "forward": false, "src_ids": "2021.naacl-main.451_11843"}
{"input": "non - native english is done by using Task| context: to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al . , 2020 ; xu et al . , 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al . , 2018 ) under different training settings .", "entity": "non - native english", "output": "automatic speech recognition", "neg_sample": ["non - native english is done by using Task", "to address the performance gap of english asr models on l2 english speakers , we evaluate fine - tuning of pretrained wav2vec 2.0 models ( baevski et al .", ", 2020 ; xu et al .", ", 2021 ) on l2 - arctic , a non - native english speech corpus ( zhao et al .", ", 2018 ) under different training settings ."], "relation": "used for", "id": "2021.icnlsp-1.2", "year": 2021, "rel_sent": "Speech Technology for Everyone : Automatic Speech Recognition for Non - Native English.", "forward": false, "src_ids": "2021.icnlsp-1.2_11362"}
{"input": "breadth first reasoning graph is used for Task| context: however , the unnecessary updations and simple edge constructions prevent an accurate answer span extraction in a more direct and interpretable way .", "entity": "breadth first reasoning graph", "output": "multi - hop question answering task", "neg_sample": ["breadth first reasoning graph is used for Task", "however , the unnecessary updations and simple edge constructions prevent an accurate answer span extraction in a more direct and interpretable way ."], "relation": "used for", "id": "2021.naacl-main.464", "year": 2021, "rel_sent": "Breadth First Reasoning Graph for Multi - hop Question Answering.", "forward": true, "src_ids": "2021.naacl-main.464_12062"}
{"input": "non - autoregressive machine translation is done by using Method| context: non - autoregressive machine translation ( nat ) models have demonstrated significant inference speedup but suffer from inferior translation accuracy . the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g. , with knowledge distillation . in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences .", "entity": "non - autoregressive machine translation", "output": "knowledge transfer method", "neg_sample": ["non - autoregressive machine translation is done by using Method", "non - autoregressive machine translation ( nat ) models have demonstrated significant inference speedup but suffer from inferior translation accuracy .", "the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g.", ", with knowledge distillation .", "in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences ."], "relation": "used for", "id": "2021.naacl-main.313", "year": 2021, "rel_sent": "In addition , experimental results demonstrate that our Multi - Task NAT is complementary to knowledge distillation , the standard knowledge transfer method for NAT .", "forward": false, "src_ids": "2021.naacl-main.313_7146"}
{"input": "drop - in replacement is used for Method| context: following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length .", "entity": "drop - in replacement", "output": "vanilla attention", "neg_sample": ["drop - in replacement is used for Method", "following the success of dot - product attention in transformers , numerous approximations have been recently proposed to address its quadratic complexity with respect to the input length ."], "relation": "used for", "id": "2021.sustainlp-1.5", "year": 2021, "rel_sent": "Our approach offers several advantages : ( a ) its memory usage is linear in the input size , similar to linear attention variants , such as Performer and RFA ( b ) it is a drop - in replacement for vanilla attention that does not require any corrective pre - training , and ( c ) it can also lead to significant memory savings in the feed - forward layers after casting them into the familiar query - key - value framework .", "forward": true, "src_ids": "2021.sustainlp-1.5_14477"}
{"input": "rhetoric and emotion identification problem is done by using Method| context: rhetorical implicit emotion identification is one of important and challenging tasks in natural language processing . we observe that each rhetoric may express certain evidence of semantic and syntactic patterns .", "entity": "rhetoric and emotion identification problem", "output": "multi - task learning framework", "neg_sample": ["rhetoric and emotion identification problem is done by using Method", "rhetorical implicit emotion identification is one of important and challenging tasks in natural language processing .", "we observe that each rhetoric may express certain evidence of semantic and syntactic patterns ."], "relation": "used for", "id": "2021.findings-acl.123", "year": 2021, "rel_sent": "We thus propose a new multi - task learning framework that can encode the categorical correlation between tasks to improve the performance of rhetoric and emotion identification problem .", "forward": false, "src_ids": "2021.findings-acl.123_10558"}
{"input": "adversarial learning framework is used for Task| context: contextual representations learned by language models can often encode undesirable attributes , like demographic associations of the users , while being trained for an unrelated target task .", "entity": "adversarial learning framework", "output": "debias contextual representations", "neg_sample": ["adversarial learning framework is used for Task", "contextual representations learned by language models can often encode undesirable attributes , like demographic associations of the users , while being trained for an unrelated target task ."], "relation": "used for", "id": "2021.emnlp-main.43", "year": 2021, "rel_sent": "In this paper , we present an adversarial learning framework ' Adversarial Scrubber ' ( AdS ) , to debias contextual representations .", "forward": true, "src_ids": "2021.emnlp-main.43_14315"}
{"input": "vietnamese is done by using Material| context: the current covid-19 pandemic has lead to the creation of many corpora that facilitate nlp research and downstream applications to help fight the pandemic . however , most of these corpora are exclusively for english . as the pandemic is a global problem , it is worth creating covid-19 related datasets for languages other than english .", "entity": "vietnamese", "output": "manually - annotated covid-19 domain - specific dataset", "neg_sample": ["vietnamese is done by using Material", "the current covid-19 pandemic has lead to the creation of many corpora that facilitate nlp research and downstream applications to help fight the pandemic .", "however , most of these corpora are exclusively for english .", "as the pandemic is a global problem , it is worth creating covid-19 related datasets for languages other than english ."], "relation": "used for", "id": "2021.naacl-main.173", "year": 2021, "rel_sent": "In this paper , we present the first manually - annotated COVID-19 domain - specific dataset for Vietnamese .", "forward": false, "src_ids": "2021.naacl-main.173_13745"}
{"input": "semantic shift is done by using Method| context: paraphrases refer to texts that convey the same meaning with different expression forms . pivot - based methods , also known as the round - trip translation , have shown promising results in generating high - quality paraphrases . however , existing pivot - based methods all rely on language as the pivot , where large - scale , high - quality parallel bilingual texts are required .", "entity": "semantic shift", "output": "end - to - end framework", "neg_sample": ["semantic shift is done by using Method", "paraphrases refer to texts that convey the same meaning with different expression forms .", "pivot - based methods , also known as the round - trip translation , have shown promising results in generating high - quality paraphrases .", "however , existing pivot - based methods all rely on language as the pivot , where large - scale , high - quality parallel bilingual texts are required ."], "relation": "used for", "id": "2021.emnlp-main.350", "year": 2021, "rel_sent": "The end - to - end framework can reduce semantic shift when language is used as the pivot .", "forward": false, "src_ids": "2021.emnlp-main.350_11315"}
{"input": "encoders is used for Task| context: the task of generating weather - forecast comments from meteorological simulations has the following requirements : ( i ) the changes in numerical values for various physical quantities need to be considered , ( ii ) the weather comments should be dependent on delivery time and area information , and ( iii ) the comments should provide useful information for users .", "entity": "encoders", "output": "numerical forecast maps", "neg_sample": ["encoders is used for Task", "the task of generating weather - forecast comments from meteorological simulations has the following requirements : ( i ) the changes in numerical values for various physical quantities need to be considered , ( ii ) the weather comments should be dependent on delivery time and area information , and ( iii ) the comments should provide useful information for users ."], "relation": "used for", "id": "2021.eacl-main.125", "year": 2021, "rel_sent": "To meet these requirements , we propose a data - to - text model that incorporates three types of encoders for numerical forecast maps , observation data , and meta - data .", "forward": true, "src_ids": "2021.eacl-main.125_4687"}
{"input": "answer ranking is done by using Method| context: conversational kbqa is about answering a sequence of questions related to a kb . follow - up questions in conversational kbqa often have missing information referring to entities from the conversation history .", "entity": "answer ranking", "output": "kbqa module", "neg_sample": ["answer ranking is done by using Method", "conversational kbqa is about answering a sequence of questions related to a kb .", "follow - up questions in conversational kbqa often have missing information referring to entities from the conversation history ."], "relation": "used for", "id": "2021.acl-long.255", "year": 2021, "rel_sent": "We propose a novel graph - based model to capture the transitions of focal entities and apply a graph neural network to derive a probability distribution of focal entities for each question , which is then combined with a standard KBQA module to perform answer ranking .", "forward": false, "src_ids": "2021.acl-long.255_6057"}
{"input": "rank models is used for Task| context: according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval . the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty . we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers .", "entity": "rank models", "output": "conversational search", "neg_sample": ["rank models is used for Task", "according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval .", "the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty .", "we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers ."], "relation": "used for", "id": "2021.eacl-main.12", "year": 2021, "rel_sent": "On the Calibration and Uncertainty of Neural Learning to Rank Models for Conversational Search.", "forward": true, "src_ids": "2021.eacl-main.12_8015"}
{"input": "specialized collections is done by using Metric| context: when developing topic models , a critical question that should be asked is : how well will this model work in an applied setting ? because standard performance evaluation of topic interpretability uses automated measures modeled on human evaluation tests that are dissimilar to applied usage , these models ' generalizability remains in question .", "entity": "specialized collections", "output": "informative coherence measures", "neg_sample": ["specialized collections is done by using Metric", "when developing topic models , a critical question that should be asked is : how well will this model work in an applied setting ?", "because standard performance evaluation of topic interpretability uses automated measures modeled on human evaluation tests that are dissimilar to applied usage , these models ' generalizability remains in question ."], "relation": "used for", "id": "2021.naacl-main.300", "year": 2021, "rel_sent": "In this paper , we probe the issue of validity in topic model evaluation and assess how informative coherence measures are for specialized collections used in an applied setting .", "forward": false, "src_ids": "2021.naacl-main.300_7494"}
{"input": "multiplex graph neural network is used for Task| context: to extract a good summary from a long text document , sentence embedding plays an important role . recent studies have leveraged graph neural networks to capture the inter - sentential relationship ( e.g. , the discourse graph ) within the documents to learn contextual sentence embedding . however , those approaches neither consider multiple types of inter - sentential relationships ( e.g. , semantic similarity and natural connection relationships ) , nor model intra - sentential relationships ( e.g , semantic similarity and syntactic relationship among words ) .", "entity": "multiplex graph neural network", "output": "extractive text summarization", "neg_sample": ["multiplex graph neural network is used for Task", "to extract a good summary from a long text document , sentence embedding plays an important role .", "recent studies have leveraged graph neural networks to capture the inter - sentential relationship ( e.g.", ", the discourse graph ) within the documents to learn contextual sentence embedding .", "however , those approaches neither consider multiple types of inter - sentential relationships ( e.g.", ", semantic similarity and natural connection relationships ) , nor model intra - sentential relationships ( e.g , semantic similarity and syntactic relationship among words ) ."], "relation": "used for", "id": "2021.emnlp-main.11", "year": 2021, "rel_sent": "Multiplex Graph Neural Network for Extractive Text Summarization.", "forward": true, "src_ids": "2021.emnlp-main.11_7403"}
{"input": "fine - tuning is done by using Method| context: pretrained transformer - based encoders such as bert have been demonstrated to achieve state - of - the - art performance on numerous nlp tasks . despite their success , bert style encoders are large in size and have high latency during inference ( especially on cpu machines ) which make them unappealing for many online applications . however , the focus of these works has been mainly on monolingual encoders .", "entity": "fine - tuning", "output": "distillation", "neg_sample": ["fine - tuning is done by using Method", "pretrained transformer - based encoders such as bert have been demonstrated to achieve state - of - the - art performance on numerous nlp tasks .", "despite their success , bert style encoders are large in size and have high latency during inference ( especially on cpu machines ) which make them unappealing for many online applications .", "however , the focus of these works has been mainly on monolingual encoders ."], "relation": "used for", "id": "2021.sustainlp-1.3", "year": 2021, "rel_sent": "We demonstrate that in contradiction to the previous observation in the case of monolingual distillation , in multilingual settings , distillation during pretraining is more effective than distillation during fine - tuning for zero - shot transfer learning .", "forward": false, "src_ids": "2021.sustainlp-1.3_11744"}
{"input": "cloze - questions is used for Task| context: some nlp tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with ' task descriptions ' in natural language ( e.g. , radford et al . , 2019 ) .", "entity": "cloze - questions", "output": "few - shot text classification", "neg_sample": ["cloze - questions is used for Task", "some nlp tasks can be solved in a fully unsupervised fashion by providing a pretrained language model with ' task descriptions ' in natural language ( e.g.", ", radford et al .", ", 2019 ) ."], "relation": "used for", "id": "2021.eacl-main.20", "year": 2021, "rel_sent": "Exploiting Cloze - Questions for Few - Shot Text Classification and Natural Language Inference.", "forward": true, "src_ids": "2021.eacl-main.20_16156"}
{"input": "dart is used for Task| context: we present dart , an open domain structured data record to text generation dataset with over 82k instances ( darts ) . data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures .", "entity": "dart", "output": "out - of - domain generalization", "neg_sample": ["dart is used for Task", "we present dart , an open domain structured data record to text generation dataset with over 82k instances ( darts ) .", "data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures ."], "relation": "used for", "id": "2021.naacl-main.37", "year": 2021, "rel_sent": "We present systematic evaluation on DART as well as new state - of - the - art results on WebNLG 2017 to show that DART ( 1 ) poses new challenges to existing data - to - text datasets and ( 2 ) facilitates out - of - domain generalization .", "forward": true, "src_ids": "2021.naacl-main.37_672"}
{"input": "two - step methods is used for Task| context: conversations are often held in laboratories and companies . a summary is vital to grasp the content of a discussion for people who did not attend the discussion . if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately .", "entity": "two - step methods", "output": "structure prediction task", "neg_sample": ["two - step methods is used for Task", "conversations are often held in laboratories and companies .", "a summary is vital to grasp the content of a discussion for people who did not attend the discussion .", "if the summary is illustrated as an argument structure , it is helpful to grasp the discussion 's essentials immediately ."], "relation": "used for", "id": "2021.ranlp-1.61", "year": 2021, "rel_sent": "To solve this problem , we introduce a two - step method to the structure prediction task .", "forward": true, "src_ids": "2021.ranlp-1.61_153"}
{"input": "linguistic annotation framework is used for Task| context: in recent years , remote digital healthcare using online chats has gained momentum , especially in the global south . though prior work has studied interaction patterns in online ( health ) forums , such as talklife , reddit and facebook , there has been limited work in understanding interactions in small , close - knit community of instant messengers .", "entity": "linguistic annotation framework", "output": "analysis of health - focused whatsapp groups", "neg_sample": ["linguistic annotation framework is used for Task", "in recent years , remote digital healthcare using online chats has gained momentum , especially in the global south .", "though prior work has studied interaction patterns in online ( health ) forums , such as talklife , reddit and facebook , there has been limited work in understanding interactions in small , close - knit community of instant messengers ."], "relation": "used for", "id": "2021.law-1.7", "year": 2021, "rel_sent": "In this paper , we propose a linguistic annotation framework tofacilitate analysis of health - focused WhatsApp groups .", "forward": true, "src_ids": "2021.law-1.7_15702"}
{"input": "voice assistant domain is done by using Method| context: reliable tagging of temporal expressions ( tes , e.g. , book a table at l'osteria for sunday evening ) is a central requirement for voice assistants ( vas ) .", "entity": "voice assistant domain", "output": "temporal taggers", "neg_sample": ["voice assistant domain is done by using Method", "reliable tagging of temporal expressions ( tes , e.g.", ", book a table at l'osteria for sunday evening ) is a central requirement for voice assistants ( vas ) ."], "relation": "used for", "id": "2021.iwcs-1.14", "year": 2021, "rel_sent": "New Domain , Major Effort ? How Much Data is Necessary to Adapt a Temporal Tagger to the Voice Assistant Domain.", "forward": false, "src_ids": "2021.iwcs-1.14_4632"}
{"input": "gradually fine - tuning is used for Task| context: fine - tuning is known to improve nlp models by adapting an initial model trained on more plentiful but less domain - salient examples to data in a target domain . such domain adaptation is typically done using one stage of fine - tuning .", "entity": "gradually fine - tuning", "output": "low - resource domain adaptation", "neg_sample": ["gradually fine - tuning is used for Task", "fine - tuning is known to improve nlp models by adapting an initial model trained on more plentiful but less domain - salient examples to data in a target domain .", "such domain adaptation is typically done using one stage of fine - tuning ."], "relation": "used for", "id": "2021.adaptnlp-1.22", "year": 2021, "rel_sent": "Gradual Fine - Tuning for Low - Resource Domain Adaptation.", "forward": true, "src_ids": "2021.adaptnlp-1.22_2112"}
{"input": "few shot models is done by using OtherScientificTerm| context: few shot learning with large language models has the potential to give individuals without formal machine learning training the access to a wide range of text to text models .", "entity": "few shot models", "output": "user interface", "neg_sample": ["few shot models is done by using OtherScientificTerm", "few shot learning with large language models has the potential to give individuals without formal machine learning training the access to a wide range of text to text models ."], "relation": "used for", "id": "2021.eacl-demos.29", "year": 2021, "rel_sent": "We consider how this applies to creative writers and present Story Centaur , a user interface for prototyping few shot models and a set of recombinable web components that deploy them .", "forward": false, "src_ids": "2021.eacl-demos.29_2063"}
{"input": "clustering of word embeddings is done by using Method| context: romanian is one of the understudied languages in computational linguistics , with few resources available for the development of natural language processing tools .", "entity": "clustering of word embeddings", "output": "self - organizing maps ( soms )", "neg_sample": ["clustering of word embeddings is done by using Method", "romanian is one of the understudied languages in computational linguistics , with few resources available for the development of natural language processing tools ."], "relation": "used for", "id": "2021.eacl-main.81", "year": 2021, "rel_sent": "We also demonstrate the generalization capacity of using SOMs for the clustering of word embeddings on another recently - introduced Romanian data set , for text categorization by topic .", "forward": false, "src_ids": "2021.eacl-main.81_2938"}
{"input": "relational learning model is used for OtherScientificTerm| context: extracting moral sentiment from text is a vital component in understanding public opinion , social movements , and policy decisions . the moral foundation theory identifies five moral foundations , each associated with a positive and negative polarity . however , moral sentiment is often motivated by its targets , which can correspond to individuals or collective entities .", "entity": "relational learning model", "output": "moral attitudes", "neg_sample": ["relational learning model is used for OtherScientificTerm", "extracting moral sentiment from text is a vital component in understanding public opinion , social movements , and policy decisions .", "the moral foundation theory identifies five moral foundations , each associated with a positive and negative polarity .", "however , moral sentiment is often motivated by its targets , which can correspond to individuals or collective entities ."], "relation": "used for", "id": "2021.emnlp-main.783", "year": 2021, "rel_sent": "Then , we propose a relational learning model to predict moral attitudes towards entities and moral foundations jointly .", "forward": true, "src_ids": "2021.emnlp-main.783_774"}
{"input": "end - to - end speech translation is used for Material| context: speech translation is the translation of speech in one language typically to text in another , traditionally accomplished through a combination of automatic speech recognition and machine translation . speech translation has attracted interest for many years , but the recent successful applications of deep learning to both individual tasks have enabled new opportunities through joint modeling , in what we today call ' end - to - end speech translation . '", "entity": "end - to - end speech translation", "output": "high- and low - resource languages", "neg_sample": ["end - to - end speech translation is used for Material", "speech translation is the translation of speech in one language typically to text in another , traditionally accomplished through a combination of automatic speech recognition and machine translation .", "speech translation has attracted interest for many years , but the recent successful applications of deep learning to both individual tasks have enabled new opportunities through joint modeling , in what we today call ' end - to - end speech translation . '"], "relation": "used for", "id": "2021.eacl-tutorials.3", "year": 2021, "rel_sent": "Starting from the traditional cascaded approach , we will given an overview on data sources and model architectures to achieve state - of - the art performance with end - to - end speech translation for both high- and low - resource languages .", "forward": true, "src_ids": "2021.eacl-tutorials.3_141"}
{"input": "context flow is done by using Method| context: nowadays , open - domain dialogue models can generate acceptable responses according to the historical context based on the large - scale pre - trained language models . however , they generally concatenate the dialogue history directly as the model input to predict the response , which we named as the flat pattern and ignores the dynamic information flow across dialogue utterances .", "entity": "context flow", "output": "dynamic flow mechanism", "neg_sample": ["context flow is done by using Method", "nowadays , open - domain dialogue models can generate acceptable responses according to the historical context based on the large - scale pre - trained language models .", "however , they generally concatenate the dialogue history directly as the model input to predict the response , which we named as the flat pattern and ignores the dynamic information flow across dialogue utterances ."], "relation": "used for", "id": "2021.acl-long.11", "year": 2021, "rel_sent": "In this work , we propose the DialoFlow model , in which we introduce a dynamic flow mechanism to model the context flow , and design three training objectives to capture the information dynamics across dialogue utterances by addressing the semantic influence brought about by each utterance in large - scale pre - training .", "forward": false, "src_ids": "2021.acl-long.11_10610"}
{"input": "sentiment analysis is done by using Material| context: in a multilingual society , people communicate in more than one language , leading to code - mixed data . sentimental analysis on code - mixed telugu - english text ( cmtet ) poses unique challenges . the unstructured nature of the code - mixed data is due to the informal language , informal transliterations , and spelling errors .", "entity": "sentiment analysis", "output": "annotated dataset", "neg_sample": ["sentiment analysis is done by using Material", "in a multilingual society , people communicate in more than one language , leading to code - mixed data .", "sentimental analysis on code - mixed telugu - english text ( cmtet ) poses unique challenges .", "the unstructured nature of the code - mixed data is due to the informal language , informal transliterations , and spelling errors ."], "relation": "used for", "id": "2021.ranlp-1.86", "year": 2021, "rel_sent": "In this paper , we introduce an annotated dataset for Sentiment Analysis in CMTET .", "forward": false, "src_ids": "2021.ranlp-1.86_6097"}
{"input": "end - to - end tts is done by using Task| context: while end-2 - end text - to - speech ( tts ) has made significant progresses over the past few years , these systems still lack intuitive user controls over prosody . for instance , generating speech with fine - grained prosody control ( prosodic prominence , contextually appropriate emotions ) is still an open challenge .", "entity": "end - to - end tts", "output": "controlling prosody", "neg_sample": ["end - to - end tts is done by using Task", "while end-2 - end text - to - speech ( tts ) has made significant progresses over the past few years , these systems still lack intuitive user controls over prosody .", "for instance , generating speech with fine - grained prosody control ( prosodic prominence , contextually appropriate emotions ) is still an open challenge ."], "relation": "used for", "id": "2021.conll-1.42", "year": 2021, "rel_sent": "Controlling Prosody in End - to - End TTS : A Case Study on Contrastive Focus Generation.", "forward": false, "src_ids": "2021.conll-1.42_2733"}
{"input": "spurious solutions is done by using Method| context: weakly supervised question answering usually has only the final answers as supervision signals while the correct solutions to derive the answers are not provided . this setting gives rise to the spurious solution problem : there may exist many spurious solutions that coincidentally derive the correct answer , but training on such solutions can hurt model performance ( e.g. , producing wrong solutions or answers ) . for example , for discrete reasoning tasks as on drop , there may exist many equations to derive a numeric answer , and typically only one of them is correct .", "entity": "spurious solutions", "output": "learning methods", "neg_sample": ["spurious solutions is done by using Method", "weakly supervised question answering usually has only the final answers as supervision signals while the correct solutions to derive the answers are not provided .", "this setting gives rise to the spurious solution problem : there may exist many spurious solutions that coincidentally derive the correct answer , but training on such solutions can hurt model performance ( e.g.", ", producing wrong solutions or answers ) .", "for example , for discrete reasoning tasks as on drop , there may exist many equations to derive a numeric answer , and typically only one of them is correct ."], "relation": "used for", "id": "2021.acl-long.318", "year": 2021, "rel_sent": "Previous learning methods mostly filter out spurious solutions with heuristics or using model confidence , but do not explicitly exploit the semantic correlations between a question and its solution .", "forward": false, "src_ids": "2021.acl-long.318_14187"}
{"input": "mldc problem is done by using Method| context: multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels .", "entity": "mldc problem", "output": "convolutional attention network", "neg_sample": ["mldc problem is done by using Method", "multi - label document classification ( mldc ) problems can be challenging , especially for long documents with a large label set and a long - tail distribution over labels ."], "relation": "used for", "id": "2021.emnlp-main.481", "year": 2021, "rel_sent": "In this paper , we present an effective convolutional attention network for the MLDC problem with a focus on medical code prediction from clinical documents .", "forward": false, "src_ids": "2021.emnlp-main.481_893"}
{"input": "pegasus is used for Method| context: sequence - to - sequence models have been applied to a wide variety of nlp tasks , but how to properly use them for dialogue state tracking has not been systematically investigated .", "entity": "pegasus", "output": "state tracking model", "neg_sample": ["pegasus is used for Method", "sequence - to - sequence models have been applied to a wide variety of nlp tasks , but how to properly use them for dialogue state tracking has not been systematically investigated ."], "relation": "used for", "id": "2021.emnlp-main.593", "year": 2021, "rel_sent": "We also explore using Pegasus , a span prediction - based pre - training objective for text summarization , for the state tracking model .", "forward": true, "src_ids": "2021.emnlp-main.593_4068"}
{"input": "wmt 2021 shared tasks is used for Task| context: within the task , the community studied very low resource translation between german and upper sorbian , unsupervised translation between german and lower sorbian and low resource translation between russian and chuvash , all minority languages with active language communities working on preserving the languages , who are partners in the evaluation .", "entity": "wmt 2021 shared tasks", "output": "unsupervised mt", "neg_sample": ["wmt 2021 shared tasks is used for Task", "within the task , the community studied very low resource translation between german and upper sorbian , unsupervised translation between german and lower sorbian and low resource translation between russian and chuvash , all minority languages with active language communities working on preserving the languages , who are partners in the evaluation ."], "relation": "used for", "id": "2021.wmt-1.72", "year": 2021, "rel_sent": "Findings of the WMT 2021 Shared Tasks in Unsupervised MT and Very Low Resource Supervised MT.", "forward": true, "src_ids": "2021.wmt-1.72_14397"}
{"input": "softmax trees is used for Task| context: classification problems having thousands or more classes naturally occur in nlp , for example language models or document classification .", "entity": "softmax trees", "output": "inference", "neg_sample": ["softmax trees is used for Task", "classification problems having thousands or more classes naturally occur in nlp , for example language models or document classification ."], "relation": "used for", "id": "2021.emnlp-main.838", "year": 2021, "rel_sent": "Compared to a softmax and other classifiers , the resulting softmax trees are both more accurate in prediction and faster in inference , as shown in NLP problems having from one thousand to one hundred thousand classes .", "forward": true, "src_ids": "2021.emnlp-main.838_13092"}
{"input": "mix - up method is used for Task| context: the mix - up method ( zhang et al . , 2017 ) , one of the methods for data augmentation , is known to be easy to implement and highly effective . although the mix - up method is intended for image identification , it can also be applied to natural language processing .", "entity": "mix - up method", "output": "document classification task", "neg_sample": ["mix - up method is used for Task", "the mix - up method ( zhang et al .", ", 2017 ) , one of the methods for data augmentation , is known to be easy to implement and highly effective .", "although the mix - up method is intended for image identification , it can also be applied to natural language processing ."], "relation": "used for", "id": "2021.ranlp-1.77", "year": 2021, "rel_sent": "Application of Mix - Up Method in Document Classification Task Using BERT.", "forward": true, "src_ids": "2021.ranlp-1.77_681"}
{"input": "hybrid framework is used for OtherScientificTerm| context: the current state - of - the - art generative models for open - domain question answering ( odqa ) have focused on generating direct answers from unstructured textual information . however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql . furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability .", "entity": "hybrid framework", "output": "sql queries", "neg_sample": ["hybrid framework is used for OtherScientificTerm", "the current state - of - the - art generative models for open - domain question answering ( odqa ) have focused on generating direct answers from unstructured textual information .", "however , a large amount of world 's knowledge is stored in structured databases , and need to be accessed using query languages such as sql .", "furthermore , query languages can answer questions that require complex reasoning , as well as offering full explainability ."], "relation": "used for", "id": "2021.acl-long.315", "year": 2021, "rel_sent": "In this paper , we propose a hybrid framework that takes both textual and tabular evidences as input and generates either direct answers or SQL queries depending on which form could better answer the question .", "forward": true, "src_ids": "2021.acl-long.315_14882"}
{"input": "summarization systems is used for Material| context: modern summarization models generate highly fluent but often factually unreliable outputs . this motivated a surge of metrics attempting to measure the factuality of automatically generated summaries . due to the lack of common benchmarks , these metrics can not be compared . moreover , all these methods treat factuality as a binary concept and fail to provide deeper insights on the kinds of inconsistencies made by different systems .", "entity": "summarization systems", "output": "cnn / dm and xsum datasets", "neg_sample": ["summarization systems is used for Material", "modern summarization models generate highly fluent but often factually unreliable outputs .", "this motivated a surge of metrics attempting to measure the factuality of automatically generated summaries .", "due to the lack of common benchmarks , these metrics can not be compared .", "moreover , all these methods treat factuality as a binary concept and fail to provide deeper insights on the kinds of inconsistencies made by different systems ."], "relation": "used for", "id": "2021.naacl-main.383", "year": 2021, "rel_sent": "To address these limitations , we devise a typology of factual errors and use it to collect human annotations of generated summaries from state - of - the - art summarization systems for the CNN / DM and XSum datasets .", "forward": true, "src_ids": "2021.naacl-main.383_11752"}
{"input": "multilingual dataset is done by using OtherScientificTerm| context: while one can hardly overestimate how much this benchmark contributed to progress in computer vision , it is mostly derived from lexical databases and image queries in english , resulting in source material with a north american or western european bias .", "entity": "multilingual dataset", "output": "concepts", "neg_sample": ["multilingual dataset is done by using OtherScientificTerm", "while one can hardly overestimate how much this benchmark contributed to progress in computer vision , it is mostly derived from lexical databases and image queries in english , resulting in source material with a north american or western european bias ."], "relation": "used for", "id": "2021.emnlp-main.818", "year": 2021, "rel_sent": "On top of the concepts and images obtained through this new protocol , we create a multilingual dataset for Multicultural Reasoning over Vision and Language ( MaRVL ) by eliciting statements from native speaker annotators about pairs of images .", "forward": false, "src_ids": "2021.emnlp-main.818_10199"}
{"input": "bert is done by using OtherScientificTerm| context: pre - trained language models like bert are performant in a wide range of natural language tasks . however , they are resource exhaustive and computationally expensive for industrial scenarios .", "entity": "bert", "output": "early exits", "neg_sample": ["bert is done by using OtherScientificTerm", "pre - trained language models like bert are performant in a wide range of natural language tasks .", "however , they are resource exhaustive and computationally expensive for industrial scenarios ."], "relation": "used for", "id": "2021.acl-long.231", "year": 2021, "rel_sent": "LeeBERT : Learned Early Exit for BERT with cross - level optimization.", "forward": false, "src_ids": "2021.acl-long.231_15124"}
{"input": "projected annotation is used for Method| context: named entity recognition ( ner ) in lowresource languages has been a long - standing challenge in nlp . recent work has shown great progress in two directions : developing cross - lingual features / models to transfer knowledge to low - resource languages , and translating source - language training data into low - resource target - language training data by projecting annotations with cheap resources . we focus on the second direction in this study . existing methods suffer from the low quality of the resulting annotated data in the target language ; for example , they can not handle word order and lexical ambiguity well .", "entity": "projected annotation", "output": "transformer language model", "neg_sample": ["projected annotation is used for Method", "named entity recognition ( ner ) in lowresource languages has been a long - standing challenge in nlp .", "recent work has shown great progress in two directions : developing cross - lingual features / models to transfer knowledge to low - resource languages , and translating source - language training data into low - resource target - language training data by projecting annotations with cheap resources .", "we focus on the second direction in this study .", "existing methods suffer from the low quality of the resulting annotated data in the target language ; for example , they can not handle word order and lexical ambiguity well ."], "relation": "used for", "id": "2021.findings-acl.396", "year": 2021, "rel_sent": "To handle these limitations we propose a novel approach that uses the projected annotation to generate pseudo supervised data with a transformer language model and a constrained beam search .", "forward": true, "src_ids": "2021.findings-acl.396_13088"}
{"input": "literal and idiomatic usages is done by using Method| context: understanding idioms is important in nlp .", "entity": "literal and idiomatic usages", "output": "bert indeed", "neg_sample": ["literal and idiomatic usages is done by using Method", "understanding idioms is important in nlp ."], "relation": "used for", "id": "2021.ranlp-1.156", "year": 2021, "rel_sent": "Our experiment results suggest that BERT indeed can separate the literal and idiomatic usages of a PIE with high accuracy .", "forward": false, "src_ids": "2021.ranlp-1.156_9162"}
{"input": "explicit abuse is done by using Method| context: abusive language detection has become an important tool for the cultivation of safe online platforms .", "entity": "explicit abuse", "output": "lexicon - based approaches", "neg_sample": ["explicit abuse is done by using Method", "abusive language detection has become an important tool for the cultivation of safe online platforms ."], "relation": "used for", "id": "2021.ranlp-1.99", "year": 2021, "rel_sent": "We also investigate different methods of distinguishing between explicit and implicit abuse and show lexicon - based approaches either over- or under - estimate the proportion of explicit abuse in data sets .", "forward": false, "src_ids": "2021.ranlp-1.99_13951"}
{"input": "bert model is done by using OtherScientificTerm| context: machine reading comprehension ( mrc ) is one of the most challenging tasks in natural language processing domain . recent state - of - the - art results for mrc have been achieved with the pre - trained language models , such as bert and its modifications . despite the high performance of these models , they still suffer from the inability to retrieve correct answers from the detailed and lengthy passages .", "entity": "bert model", "output": "linguistic enhancing", "neg_sample": ["bert model is done by using OtherScientificTerm", "machine reading comprehension ( mrc ) is one of the most challenging tasks in natural language processing domain .", "recent state - of - the - art results for mrc have been achieved with the pre - trained language models , such as bert and its modifications .", "despite the high performance of these models , they still suffer from the inability to retrieve correct answers from the detailed and lengthy passages ."], "relation": "used for", "id": "2021.ranlp-1.51", "year": 2021, "rel_sent": "Experiments performed on the SQuAD benchmark and more complex question answering datasets have shown that linguistic enhancing boosts the performance of the standard BERT model significantly .", "forward": false, "src_ids": "2021.ranlp-1.51_14159"}
{"input": "language model is used for OtherScientificTerm| context: any test that promises to assess human knowledge of language ( kol ) for any statistically - based language model ( lm ) must meet three requirements : ( 1 ) comprehensive coverage of linguistic phenomena ; ( 2 ) replicable and statistically - vetted human judgement data ; and ( 3 ) test the lm 's ability to track the gradience of sentence acceptability .", "entity": "language model", "output": "human acceptability judgements", "neg_sample": ["language model is used for OtherScientificTerm", "any test that promises to assess human knowledge of language ( kol ) for any statistically - based language model ( lm ) must meet three requirements : ( 1 ) comprehensive coverage of linguistic phenomena ; ( 2 ) replicable and statistically - vetted human judgement data ; and ( 3 ) test the lm 's ability to track the gradience of sentence acceptability ."], "relation": "used for", "id": "2021.blackboxnlp-1.38", "year": 2021, "rel_sent": "Finally , we posit the Acceptability Delta Criterion ( ADC ) , an evaluation metric that tests how well a LM can track changes in human acceptability judgements across minimal pairs instead of testing whether the LM assigned a greater likelihood to the expert - labeled acceptable sequence of a minimal pair ( S_1 > S_2 ) .", "forward": true, "src_ids": "2021.blackboxnlp-1.38_5350"}
{"input": "representation method is used for OtherScientificTerm| context: temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years . most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts . compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ? ( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?", "entity": "representation method", "output": "inductive inference ability", "neg_sample": ["representation method is used for OtherScientificTerm", "temporal knowledge graph ( tkg ) reasoning is a crucial task that has gained increasing research interest in recent years .", "most existing methods focus on reasoning at past timestamps to complete the missing facts , and there are only a few works of reasoning on known tkgs toforecast future facts .", "compared with the completion task , the forecasting task is more difficult that faces two main challenges : ( 1 ) how to effectively model the time information to handle future timestamps ?", "( 2 ) how to make inductive inference to handle previously unseen entities that emerge over time ?"], "relation": "used for", "id": "2021.emnlp-main.655", "year": 2021, "rel_sent": "Furthermore , we propose a novel representation method for unseen entities to improve the inductive inference ability of the model .", "forward": true, "src_ids": "2021.emnlp-main.655_2074"}
{"input": "knowledge transfer method is used for Task| context: the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g. , with knowledge distillation . in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences .", "entity": "knowledge transfer method", "output": "non - autoregressive machine translation", "neg_sample": ["knowledge transfer method is used for Task", "the common practice to tackle the problem is transferring the autoregressive machine translation ( at ) knowledge to nat models , e.g.", ", with knowledge distillation .", "in this work , we hypothesize and empirically verify that at and nat encoders capture different linguistic properties of source sentences ."], "relation": "used for", "id": "2021.naacl-main.313", "year": 2021, "rel_sent": "In addition , experimental results demonstrate that our Multi - Task NAT is complementary to knowledge distillation , the standard knowledge transfer method for NAT .", "forward": true, "src_ids": "2021.naacl-main.313_7147"}
{"input": "language models is used for OtherScientificTerm| context: analogies play a central role in human commonsense reasoning . the ability to recognize analogies such as ' eye is to seeing what ear is to hearing ' , sometimes referred to as analogical proportions , shape how we structure knowledge and understand language . surprisingly , however , the task of identifying such analogies has not yet received much attention in the language model era .", "entity": "language models", "output": "abstract and complex relations", "neg_sample": ["language models is used for OtherScientificTerm", "analogies play a central role in human commonsense reasoning .", "the ability to recognize analogies such as ' eye is to seeing what ear is to hearing ' , sometimes referred to as analogical proportions , shape how we structure knowledge and understand language .", "surprisingly , however , the task of identifying such analogies has not yet received much attention in the language model era ."], "relation": "used for", "id": "2021.acl-long.280", "year": 2021, "rel_sent": "We find that off - the - shelf language models can identify analogies to a certain extent , but struggle with abstract and complex relations , and results are highly sensitive to model architecture and hyperparameters .", "forward": true, "src_ids": "2021.acl-long.280_6636"}
{"input": "sense - making of large complex text corpora is done by using Method| context: information visualization is critical to analytical reasoning and knowledge discovery .", "entity": "sense - making of large complex text corpora", "output": "text analytics algorithms", "neg_sample": ["sense - making of large complex text corpora is done by using Method", "information visualization is critical to analytical reasoning and knowledge discovery ."], "relation": "used for", "id": "2021.dash-1.3", "year": 2021, "rel_sent": "We present an interactive studio that integrates perceptive visualization techniques with powerful text analytics algorithms to assist humans in sense - making of large complex text corpora .", "forward": false, "src_ids": "2021.dash-1.3_7386"}
{"input": "multiple parallel and monolingual data is done by using Method| context: existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind .", "entity": "multiple parallel and monolingual data", "output": "data augmentation", "neg_sample": ["multiple parallel and monolingual data is done by using Method", "existing multilingual machine translation approaches mainly focus on english - centric directions , while the non - english directions still lag behind ."], "relation": "used for", "id": "2021.acl-long.21", "year": 2021, "rel_sent": "mRASP2 is empowered by two techniques : a ) a contrastive learning scheme to close the gap among representations of different languages , and b ) data augmentation on both multiple parallel and monolingual data tofurther align token representations .", "forward": false, "src_ids": "2021.acl-long.21_13568"}
{"input": "cross - lingual tasks is done by using Method| context: existing work in multilingual pretraining has demonstrated the potential of cross - lingual transferability by training a unified transformer encoder for multiple languages . however , much of this work only relies on the shared vocabulary and bilingual contexts to encourage the correlation across languages , which is loose and implicit for aligning the contextual representations between languages .", "entity": "cross - lingual tasks", "output": "cross - attention module", "neg_sample": ["cross - lingual tasks is done by using Method", "existing work in multilingual pretraining has demonstrated the potential of cross - lingual transferability by training a unified transformer encoder for multiple languages .", "however , much of this work only relies on the shared vocabulary and bilingual contexts to encourage the correlation across languages , which is loose and implicit for aligning the contextual representations between languages ."], "relation": "used for", "id": "2021.acl-long.308", "year": 2021, "rel_sent": "More importantly , when fine - tuning on downstream tasks , the cross - attention module can be plugged in or out on - demand , thus naturally benefiting a wider range of cross - lingual tasks , from language understanding to generation .", "forward": false, "src_ids": "2021.acl-long.308_10062"}
{"input": "medical note encoding module is done by using Method| context: medical code assignment from clinical text is a fundamental task in clinical information system management . as medical notes are typically lengthy and the medical coding system 's code space is large , this task is a long - standing challenge . recent work applies deep neural network models to encode the medical notes and assign medical codes to clinical documents . however , these methods are still ineffective as they do not fully encode and capture the lengthy and rich semantic information of medical notes nor explicitly exploit the interactions between the notes and codes .", "entity": "medical note encoding module", "output": "gated information propagation", "neg_sample": ["medical note encoding module is done by using Method", "medical code assignment from clinical text is a fundamental task in clinical information system management .", "as medical notes are typically lengthy and the medical coding system 's code space is large , this task is a long - standing challenge .", "recent work applies deep neural network models to encode the medical notes and assign medical codes to clinical documents .", "however , these methods are still ineffective as they do not fully encode and capture the lengthy and rich semantic information of medical notes nor explicitly exploit the interactions between the notes and codes ."], "relation": "used for", "id": "2021.findings-acl.89", "year": 2021, "rel_sent": "Our methods capture the rich semantic information of the lengthy clinical text for better representation by utilizing embedding injection and gated information propagation in the medical note encoding module .", "forward": false, "src_ids": "2021.findings-acl.89_9855"}
{"input": "graph - to - sequence neural network graphmr is used for Task| context: previous natural language processing researches have proven the effectiveness of sequence - to - sequence ( seq2seq ) or related variants on mathematics solving . however , few works have been able to explore structural or syntactic information hidden in expressions ( e.g. , precedence and associativity ) .", "entity": "graph - to - sequence neural network graphmr", "output": "mathematics and speculate answers", "neg_sample": ["graph - to - sequence neural network graphmr is used for Task", "previous natural language processing researches have proven the effectiveness of sequence - to - sequence ( seq2seq ) or related variants on mathematics solving .", "however , few works have been able to explore structural or syntactic information hidden in expressions ( e.g.", ", precedence and associativity ) ."], "relation": "used for", "id": "2021.emnlp-main.273", "year": 2021, "rel_sent": "Having transformed to the new representations , we proposed a graph - to - sequence neural network GraphMR , which can effectively learn the hierarchical information of graphs inputs to solve mathematics and speculate answers .", "forward": true, "src_ids": "2021.emnlp-main.273_13671"}
{"input": "systematic procedure is used for Task| context: the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction .", "entity": "systematic procedure", "output": "disagreement resolution", "neg_sample": ["systematic procedure is used for Task", "the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction ."], "relation": "used for", "id": "2021.humeval-1.15", "year": 2021, "rel_sent": "Consensus among annotators , we maintain , should be striven for , through a systematic procedure for disagreement resolution such as the one we describe .", "forward": true, "src_ids": "2021.humeval-1.15_8901"}
{"input": "knowledge graph is used for Task| context: knowledge graph ( kg ) and attention mechanism have been demonstrated effective in introducing and selecting useful information for weakly supervised methods . however , only qualitative analysis and ablation study are provided as evidence .", "entity": "knowledge graph", "output": "bag - level relation extraction ( re )", "neg_sample": ["knowledge graph is used for Task", "knowledge graph ( kg ) and attention mechanism have been demonstrated effective in introducing and selecting useful information for weakly supervised methods .", "however , only qualitative analysis and ablation study are provided as evidence ."], "relation": "used for", "id": "2021.acl-long.359", "year": 2021, "rel_sent": "In this paper , we contribute a dataset and propose a paradigm to quantitatively evaluate the effect of attention and KG on bag - level relation extraction ( RE ) .", "forward": true, "src_ids": "2021.acl-long.359_1375"}
{"input": "hate speech detection models is done by using OtherScientificTerm| context: detecting online hate is a difficult task that even state - of - the - art models struggle with . typically , hate speech detection models are evaluated by measuring their performance on held - out test data using metrics such as accuracy and f1 score . however , this approach makes it difficult to identify specific model weak points . it also risks overestimating generalisable model performance due to increasingly well - evidenced systematic gaps and biases in hate speech datasets .", "entity": "hate speech detection models", "output": "functional tests", "neg_sample": ["hate speech detection models is done by using OtherScientificTerm", "detecting online hate is a difficult task that even state - of - the - art models struggle with .", "typically , hate speech detection models are evaluated by measuring their performance on held - out test data using metrics such as accuracy and f1 score .", "however , this approach makes it difficult to identify specific model weak points .", "it also risks overestimating generalisable model performance due to increasingly well - evidenced systematic gaps and biases in hate speech datasets ."], "relation": "used for", "id": "2021.acl-long.4", "year": 2021, "rel_sent": "To enable more targeted diagnostic insights , we introduce HateCheck , a suite of functional tests for hate speech detection models .", "forward": false, "src_ids": "2021.acl-long.4_3950"}
{"input": "post - editing is done by using OtherScientificTerm| context: to translate large volumes of text in a globally connected world , more and more translators are integrating machine translation ( mt ) and post - editing ( pe ) into their translation workflows to generate publishable quality translations . while this process has been shown to save time and reduce errors , the task of translation is changing from mostly text production from scratch tofixing errors within useful but partly incorrect mt output . this is affecting the interface design of translation tools , where better support for text editing tasks is required .", "entity": "post - editing", "output": "gestures", "neg_sample": ["post - editing is done by using OtherScientificTerm", "to translate large volumes of text in a globally connected world , more and more translators are integrating machine translation ( mt ) and post - editing ( pe ) into their translation workflows to generate publishable quality translations .", "while this process has been shown to save time and reduce errors , the task of translation is changing from mostly text production from scratch tofixing errors within useful but partly incorrect mt output .", "this is affecting the interface design of translation tools , where better support for text editing tasks is required ."], "relation": "used for", "id": "2021.acl-long.527", "year": 2021, "rel_sent": "These gestures combined with the keyboard facilitate all editing types required for PE .", "forward": false, "src_ids": "2021.acl-long.527_13734"}
{"input": "thai bert model is used for Task| context: our corpus is compiled by expert verified cases of depression in several online blogs .", "entity": "thai bert model", "output": "detecting depression", "neg_sample": ["thai bert model is used for Task", "our corpus is compiled by expert verified cases of depression in several online blogs ."], "relation": "used for", "id": "2021.wnut-1.3", "year": 2021, "rel_sent": "We achieve a 77.53 % accuracy with a Thai BERT model in detecting depression .", "forward": true, "src_ids": "2021.wnut-1.3_12769"}
{"input": "box - to - box transformations is used for OtherScientificTerm| context: learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations . box embeddings ( vilnis et al . , 2018 ; li et al . , 2019 ; dasgupta et al . , 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure .", "entity": "box - to - box transformations", "output": "hierarchy", "neg_sample": ["box - to - box transformations is used for OtherScientificTerm", "learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations .", "box embeddings ( vilnis et al .", ", 2018 ; li et al .", ", 2019 ; dasgupta et al .", ", 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure ."], "relation": "used for", "id": "2021.repl4nlp-1.28", "year": 2021, "rel_sent": "In this work , we introduce a learned box - to - box transformation that respects the structure of each hierarchy .", "forward": true, "src_ids": "2021.repl4nlp-1.28_14382"}
{"input": "semantic and phonemic search strategies is used for Task| context: effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage . in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) . even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf .", "entity": "semantic and phonemic search strategies", "output": "early dementia", "neg_sample": ["semantic and phonemic search strategies is used for Task", "effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage .", "in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) .", "even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf ."], "relation": "used for", "id": "2021.clpsych-1.4", "year": 2021, "rel_sent": "Dissociating Semantic and Phonemic Search Strategies in the Phonemic Verbal Fluency Task in early Dementia.", "forward": true, "src_ids": "2021.clpsych-1.4_2457"}
{"input": "program syntax is done by using Method| context: code summarization and generation empower conversion between programming language ( pl ) and natural language ( nl ) , while code translation avails the migration of legacy code from one pl to another .", "entity": "program syntax", "output": "plbart", "neg_sample": ["program syntax is done by using Method", "code summarization and generation empower conversion between programming language ( pl ) and natural language ( nl ) , while code translation avails the migration of legacy code from one pl to another ."], "relation": "used for", "id": "2021.naacl-main.211", "year": 2021, "rel_sent": "Furthermore , analysis reveals that PLBART learns program syntax , style ( e.g. , identifier naming convention ) , logical flow ( e.g. , ' if ' block inside an ' else ' block is equivalent to ' else if ' block ) that are crucial to program semantics and thus excels even with limited annotations .", "forward": false, "src_ids": "2021.naacl-main.211_511"}
{"input": "sentiment analysis task is done by using Method| context: working with a wide range of annotators with the same attributes is crucial , as in real - world applications . although such application cases often use crowd - sourcing mechanisms to gather a variety of annotators , most real - world users use mobile devices .", "entity": "sentiment analysis task", "output": "crowd - sourced annotation", "neg_sample": ["sentiment analysis task is done by using Method", "working with a wide range of annotators with the same attributes is crucial , as in real - world applications .", "although such application cases often use crowd - sourcing mechanisms to gather a variety of annotators , most real - world users use mobile devices ."], "relation": "used for", "id": "2021.emnlp-demo.41", "year": 2021, "rel_sent": "In our experiments , we conducted crowd - sourced annotation for a sentiment analysis task with several annotators and evaluated annotation metrics such as speed , quality , and ease of use from the tool 's logs and user surveys .", "forward": false, "src_ids": "2021.emnlp-demo.41_14032"}
{"input": "multimodal neural machine translation system is used for Material| context: multimodal machine translation ( mmt ) systems utilize additional information from other modalities beyond text to improve the quality of machine translation ( mt ) . the additional modality is typically in the form of images . despite proven advantages , it is indeed difficult to develop an mmt system for various languages primarily due to the lack of a suitable multimodal dataset .", "entity": "multimodal neural machine translation system", "output": "english", "neg_sample": ["multimodal neural machine translation system is used for Material", "multimodal machine translation ( mmt ) systems utilize additional information from other modalities beyond text to improve the quality of machine translation ( mt ) .", "the additional modality is typically in the form of images .", "despite proven advantages , it is indeed difficult to develop an mmt system for various languages primarily due to the lack of a suitable multimodal dataset ."], "relation": "used for", "id": "2021.mmtlrl-1.6", "year": 2021, "rel_sent": "Multimodal Neural Machine Translation System for English to Bengali.", "forward": true, "src_ids": "2021.mmtlrl-1.6_14717"}
{"input": "text classification is done by using OtherScientificTerm| context: neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations . attention has empirically been demonstrated to improve performance in various tasks , while its weights have been extensively used as explanations for model predictions . recent studies ( jain and wallace , 2019 ; serrano and smith , 2019 ; wiegreffe and pinter , 2019 ) have showed that it can not generally be considered as a faithful explanation ( jacovi and goldberg , 2020 ) across encoders and tasks .", "entity": "text classification", "output": "faithful attention - based explanations", "neg_sample": ["text classification is done by using OtherScientificTerm", "neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations .", "attention has empirically been demonstrated to improve performance in various tasks , while its weights have been extensively used as explanations for model predictions .", "recent studies ( jain and wallace , 2019 ; serrano and smith , 2019 ; wiegreffe and pinter , 2019 ) have showed that it can not generally be considered as a faithful explanation ( jacovi and goldberg , 2020 ) across encoders and tasks ."], "relation": "used for", "id": "2021.acl-long.40", "year": 2021, "rel_sent": "In this paper , we seek to improve the faithfulness of attention - based explanations for text classification .", "forward": false, "src_ids": "2021.acl-long.40_11301"}
{"input": "deep learning schema ( nodeele ) is used for Method| context: due to the wide - spread development of machine translation ( mt ) systems -especially neural machine translation ( nmt ) systems- mt evaluation , both automatic and human , has become more and more important as it helps us establish how mt systems perform . yet , automatic evaluation metrics have lagged behind , as the most popular choices ( e.g. , bleu , meteor and rouge ) may correlate poorly with human judgments .", "entity": "deep learning schema ( nodeele )", "output": "nmt systems", "neg_sample": ["deep learning schema ( nodeele ) is used for Method", "due to the wide - spread development of machine translation ( mt ) systems -especially neural machine translation ( nmt ) systems- mt evaluation , both automatic and human , has become more and more important as it helps us establish how mt systems perform .", "yet , automatic evaluation metrics have lagged behind , as the most popular choices ( e.g.", ", bleu , meteor and rouge ) may correlate poorly with human judgments ."], "relation": "used for", "id": "2021.triton-1.5", "year": 2021, "rel_sent": "This paper seeks to put to the test an evaluation model based on a novel deep learning schema ( NoDeeLe ) used to compare two NMT systems on four different text genres , i.e.", "forward": true, "src_ids": "2021.triton-1.5_9510"}
{"input": "latent variable model approach is used for Task| context: embedding entities and relations of a knowledge graph in a low - dimensional space has shown impressive performance in predicting missing links between entities . although progresses have been achieved , existing methods are heuristically motivated and theoretical understanding of such embeddings is comparatively underdeveloped .", "entity": "latent variable model approach", "output": "knowledge graph embedding", "neg_sample": ["latent variable model approach is used for Task", "embedding entities and relations of a knowledge graph in a low - dimensional space has shown impressive performance in predicting missing links between entities .", "although progresses have been achieved , existing methods are heuristically motivated and theoretical understanding of such embeddings is comparatively underdeveloped ."], "relation": "used for", "id": "2021.eacl-main.133", "year": 2021, "rel_sent": "RelWalk - A Latent Variable Model Approach to Knowledge Graph Embedding.", "forward": true, "src_ids": "2021.eacl-main.133_11462"}
{"input": "bert - based models is used for Task| context: due to the multi - dimensional variation of textual data , detection of event triggers from new domains can become a lot more challenging . recently , large transformer - based language models , e.g. however , their unwieldy nature also prevents effective adaptation across domains .", "entity": "bert - based models", "output": "event detection", "neg_sample": ["bert - based models is used for Task", "due to the multi - dimensional variation of textual data , detection of event triggers from new domains can become a lot more challenging .", "recently , large transformer - based language models , e.g.", "however , their unwieldy nature also prevents effective adaptation across domains ."], "relation": "used for", "id": "2021.findings-acl.351", "year": 2021, "rel_sent": "To this end , this work proposes a Domain - specific Adapter - based Adaptation ( DAA ) framework to improve the adaptability of BERT - based models for event detection across domains .", "forward": true, "src_ids": "2021.findings-acl.351_2037"}
{"input": "monolingual text is done by using Method| context: the data scarcity in low - resource languages has become a bottleneck to building robust neural machine translation systems .", "entity": "monolingual text", "output": "mbart", "neg_sample": ["monolingual text is done by using Method", "the data scarcity in low - resource languages has become a bottleneck to building robust neural machine translation systems ."], "relation": "used for", "id": "2021.findings-acl.239", "year": 2021, "rel_sent": "We first construct noisy mixed - language text from the monolingual corpus of the target language in the translation pair to cover both the source and target languages , and then , we continue pretraining mBART to reconstruct the original monolingual text .", "forward": false, "src_ids": "2021.findings-acl.239_14011"}
{"input": "plotcoder is done by using Material| context: creating effective visualization is an important part of data analytics . while there are many libraries for creating visualization , writing such code remains difficult given the myriad of parameters that users need to provide .", "entity": "plotcoder", "output": "jupyter notebooks", "neg_sample": ["plotcoder is done by using Material", "creating effective visualization is an important part of data analytics .", "while there are many libraries for creating visualization , writing such code remains difficult given the myriad of parameters that users need to provide ."], "relation": "used for", "id": "2021.acl-long.169", "year": 2021, "rel_sent": "We use Jupyter notebooks containing visualization programs crawled from GitHub to train PlotCoder .", "forward": false, "src_ids": "2021.acl-long.169_13783"}
{"input": "mask strategies is used for OtherScientificTerm| context: copy mechanisms explicitly obtain unchanged tokens from the source ( input ) sequence to generate the target ( output ) sequence under the neural seq2seq framework . however , most of the existing copy mechanisms only consider single word copying from the source sentences , which results in losing essential tokens while copying long spans .", "entity": "mask strategies", "output": "probability distributions", "neg_sample": ["mask strategies is used for OtherScientificTerm", "copy mechanisms explicitly obtain unchanged tokens from the source ( input ) sequence to generate the target ( output ) sequence under the neural seq2seq framework .", "however , most of the existing copy mechanisms only consider single word copying from the source sentences , which results in losing essential tokens while copying long spans ."], "relation": "used for", "id": "2021.sustainlp-1.6", "year": 2021, "rel_sent": "In the inference stage , the model will firstly predict the BIO tag at each time step , then conduct different mask strategies based on the predicted BIO label to diminish the scope of the probability distributions over the vocabulary list .", "forward": true, "src_ids": "2021.sustainlp-1.6_16028"}
{"input": "abstractive summaries is done by using Method| context: abstractive summarization models heavily rely on copy mechanisms , such as the pointer network or attention , to achieve good performance , measured by textual overlap with reference summaries . as a result , the generated summaries stay close to the formulations in the source document .", "entity": "abstractive summaries", "output": "* sentence planner * model", "neg_sample": ["abstractive summaries is done by using Method", "abstractive summarization models heavily rely on copy mechanisms , such as the pointer network or attention , to achieve good performance , measured by textual overlap with reference summaries .", "as a result , the generated summaries stay close to the formulations in the source document ."], "relation": "used for", "id": "2021.newsum-1.1", "year": 2021, "rel_sent": "We propose the * sentence planner * model to generate more abstractive summaries .", "forward": false, "src_ids": "2021.newsum-1.1_13393"}
{"input": "heuristic method is used for Method| context: despite the widespread use of knowledge graph embeddings ( kge ) , little is known about the security vulnerabilities that might disrupt their intended behaviour . we study data poisoning attacks against kge models for link prediction .", "entity": "heuristic method", "output": "adversarial additions", "neg_sample": ["heuristic method is used for Method", "despite the widespread use of knowledge graph embeddings ( kge ) , little is known about the security vulnerabilities that might disrupt their intended behaviour .", "we study data poisoning attacks against kge models for link prediction ."], "relation": "used for", "id": "2021.emnlp-main.648", "year": 2021, "rel_sent": "We further propose a heuristic method to replace one of the two entities in each influential triple to generate adversarial additions .", "forward": true, "src_ids": "2021.emnlp-main.648_7010"}
{"input": "pdaln is used for Material| context: due to limited labeled resources and domain shift , cross - domain ner is a challenging task .", "entity": "pdaln", "output": "high - resource domains", "neg_sample": ["pdaln is used for Material", "due to limited labeled resources and domain shift , cross - domain ner is a challenging task ."], "relation": "used for", "id": "2021.emnlp-main.442", "year": 2021, "rel_sent": "Extensive experiments on four benchmarks show that PDALN can effectively adapt high - resource domains to low - resource target domains , even if they are diverse in terms and writing styles .", "forward": true, "src_ids": "2021.emnlp-main.442_3746"}
{"input": "conversational structure aware graph network is used for Task| context: however , it remains a major challenge for existing csrl parser to handle conversational structural information .", "entity": "conversational structure aware graph network", "output": "conversational semantic role labeling ( csrl )", "neg_sample": ["conversational structure aware graph network is used for Task", "however , it remains a major challenge for existing csrl parser to handle conversational structural information ."], "relation": "used for", "id": "2021.emnlp-main.177", "year": 2021, "rel_sent": "CSAGN : Conversational Structure Aware Graph Network for Conversational Semantic Role Labeling.", "forward": true, "src_ids": "2021.emnlp-main.177_4286"}
{"input": "bert - based neural baselines is used for Task| context: science , technology and innovation ( sti ) policies have evolved in the past decade . we are now progressing towards policies that are more aligned with sustainable development through integrating social , economic and environmental dimensions . in this new policy environment , the need to keep track of innovation from its conception in science and research has emerged . argumentation mining , an interdisciplinary nlp field , gives rise to the required technologies .", "entity": "bert - based neural baselines", "output": "claim", "neg_sample": ["bert - based neural baselines is used for Task", "science , technology and innovation ( sti ) policies have evolved in the past decade .", "we are now progressing towards policies that are more aligned with sustainable development through integrating social , economic and environmental dimensions .", "in this new policy environment , the need to keep track of innovation from its conception in science and research has emerged .", "argumentation mining , an interdisciplinary nlp field , gives rise to the required technologies ."], "relation": "used for", "id": "2021.argmining-1.10", "year": 2021, "rel_sent": "We also present a set of strong , BERT - based neural baselines achieving an f1 - score of 70.0 for Claim and 62.4 for Evidence identification evaluated with 10 - fold cross - validation .", "forward": true, "src_ids": "2021.argmining-1.10_4721"}
{"input": "new domains is done by using Task| context: this paper explores the topic of transportability , as a sub - area of generalisability .", "entity": "new domains", "output": "estimation of nlp system", "neg_sample": ["new domains is done by using Task", "this paper explores the topic of transportability , as a sub - area of generalisability ."], "relation": "used for", "id": "2021.iwcs-1.1", "year": 2021, "rel_sent": "Defining a new measure for transportability may allow for better estimation of NLP system performance in new domains , and is crucial when assessing the performance of NLP systems in new tasks and domains .", "forward": false, "src_ids": "2021.iwcs-1.1_11213"}
{"input": "heterogeneous user - interest transfer learning is used for Task| context: this is a problem where traditional content - based recommendation techniques often fail . luckily , in real - world recommendation services , some publisher ( e.g. , daily news ) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher ( e.g. , political news ) .", "entity": "heterogeneous user - interest transfer learning", "output": "news recommendation", "neg_sample": ["heterogeneous user - interest transfer learning is used for Task", "this is a problem where traditional content - based recommendation techniques often fail .", "luckily , in real - world recommendation services , some publisher ( e.g.", ", daily news ) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher ( e.g.", ", political news ) ."], "relation": "used for", "id": "2021.eacl-main.62", "year": 2021, "rel_sent": "TrNews : Heterogeneous User - Interest Transfer Learning for News Recommendation.", "forward": true, "src_ids": "2021.eacl-main.62_8983"}
{"input": "features is done by using Method| context: large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade .", "entity": "features", "output": "attention mechanism", "neg_sample": ["features is done by using Method", "large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade ."], "relation": "used for", "id": "2021.maiworkshop-1.5", "year": 2021, "rel_sent": "As for attention - based multimodal modeling branch , we first employ attention mechanism to make the model focused on important features , then we use the multi - modal encoder feature to enrich the input information , achieve a better performance .", "forward": false, "src_ids": "2021.maiworkshop-1.5_11495"}
{"input": "pretraining alternatives is used for Method| context: next sentence prediction ) . however , no previous work sofar has attempted in examining whether other simpler linguistically intuitive or not objectives can be used standalone as main pretraining objectives .", "entity": "pretraining alternatives", "output": "masked language modeling", "neg_sample": ["pretraining alternatives is used for Method", "next sentence prediction ) .", "however , no previous work sofar has attempted in examining whether other simpler linguistically intuitive or not objectives can be used standalone as main pretraining objectives ."], "relation": "used for", "id": "2021.emnlp-main.249", "year": 2021, "rel_sent": "Frustratingly Simple Pretraining Alternatives to Masked Language Modeling.", "forward": true, "src_ids": "2021.emnlp-main.249_12360"}
{"input": "nlp models is done by using Material| context: part of speech ( pos ) tagging is a familiar nlp task . state of the art taggers routinely achieve token - level accuracies of over 97 % on news body text , evidence that the problem is well understood . however , the register of english news headlines , ' headlinese ' , is very different from the register of long - form text , causing pos tagging models to underperform on headlines .", "entity": "nlp models", "output": "posh", "neg_sample": ["nlp models is done by using Material", "part of speech ( pos ) tagging is a familiar nlp task .", "state of the art taggers routinely achieve token - level accuracies of over 97 % on news body text , evidence that the problem is well understood .", "however , the register of english news headlines , ' headlinese ' , is very different from the register of long - form text , causing pos tagging models to underperform on headlines ."], "relation": "used for", "id": "2021.emnlp-main.521", "year": 2021, "rel_sent": "We make POSH , the POS - tagged Headline corpus , available to encourage research in improved NLP models for news headlines .", "forward": false, "src_ids": "2021.emnlp-main.521_15583"}
{"input": "three - level relevance is used for Task| context: in our recommendation system , different people follow different hot keywords with interest . we need to attach documents to each keyword and then distribute the documents to people whofollow these keywords . the ideal documents should have the same topic with the keyword , which we call topic - aware relevance . in other words , topic - aware relevance documents are better than partially - relevance ones in this application . however , previous tasks never define topic - aware relevance clearly .", "entity": "three - level relevance", "output": "keyword - document matching task", "neg_sample": ["three - level relevance is used for Task", "in our recommendation system , different people follow different hot keywords with interest .", "we need to attach documents to each keyword and then distribute the documents to people whofollow these keywords .", "the ideal documents should have the same topic with the keyword , which we call topic - aware relevance .", "in other words , topic - aware relevance documents are better than partially - relevance ones in this application .", "however , previous tasks never define topic - aware relevance clearly ."], "relation": "used for", "id": "2021.naacl-main.428", "year": 2021, "rel_sent": "To tackle this problem , we define a three - level relevance in keyword - document matching task : topic - aware relevance , partially - relevance and irrelevance .", "forward": true, "src_ids": "2021.naacl-main.428_3001"}
{"input": "multi - emotion classification problem is done by using Method| context: song lyrics convey a multitude of emotions to the listener and powerfully portray the emotional state of the writer or singer .", "entity": "multi - emotion classification problem", "output": "modeling approaches", "neg_sample": ["multi - emotion classification problem is done by using Method", "song lyrics convey a multitude of emotions to the listener and powerfully portray the emotional state of the writer or singer ."], "relation": "used for", "id": "2021.wassa-1.24", "year": 2021, "rel_sent": "This paper examines a variety of modeling approaches to the multi - emotion classification problem for songs .", "forward": false, "src_ids": "2021.wassa-1.24_15474"}
{"input": "event coreference resolution is done by using Method| context: event coreference resolution is an important research problem with many applications . despite the recent remarkable success of pre - trained language models , we argue that it is still highly beneficial to utilize symbolic features for the task . however , as the input for coreference resolution typically comes from upstream components in the information extraction pipeline , the automatically extracted symbolic features can be noisy and contain errors . also , depending on the specific context , some features can be more informative than others .", "entity": "event coreference resolution", "output": "context - dependent gated module", "neg_sample": ["event coreference resolution is done by using Method", "event coreference resolution is an important research problem with many applications .", "despite the recent remarkable success of pre - trained language models , we argue that it is still highly beneficial to utilize symbolic features for the task .", "however , as the input for coreference resolution typically comes from upstream components in the information extraction pipeline , the automatically extracted symbolic features can be noisy and contain errors .", "also , depending on the specific context , some features can be more informative than others ."], "relation": "used for", "id": "2021.naacl-main.274", "year": 2021, "rel_sent": "A Context - Dependent Gated Module for Incorporating Symbolic Semantics into Event Coreference Resolution.", "forward": false, "src_ids": "2021.naacl-main.274_3122"}
{"input": "text classification is done by using Method| context: fine - tuning pre - trained language models for downstream tasks has become a norm for nlp . however it is not clear if intermediate training generally benefits various language models .", "entity": "text classification", "output": "fine - tuning language models", "neg_sample": ["text classification is done by using Method", "fine - tuning pre - trained language models for downstream tasks has become a norm for nlp .", "however it is not clear if intermediate training generally benefits various language models ."], "relation": "used for", "id": "2021.alta-1.16", "year": 2021, "rel_sent": "Does QA - based intermediate training help fine - tuning language models for text classification ?.", "forward": false, "src_ids": "2021.alta-1.16_4"}
{"input": "irene is used for Metric| context: existing software - based energy measurements of nlp models are not accurate because they do not consider the complex interactions between energy consumption and model execution .", "entity": "irene", "output": "inference energy consumption", "neg_sample": ["irene is used for Metric", "existing software - based energy measurements of nlp models are not accurate because they do not consider the complex interactions between energy consumption and model execution ."], "relation": "used for", "id": "2021.acl-long.167", "year": 2021, "rel_sent": "Experiments across multiple Transformer models show IrEne predicts inference energy consumption of transformer models with an error of under 7 % compared to the ground truth .", "forward": true, "src_ids": "2021.acl-long.167_10287"}
{"input": "temporal commonsense reasoning is done by using Method| context: temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text .", "entity": "temporal commonsense reasoning", "output": "ensemble model", "neg_sample": ["temporal commonsense reasoning is done by using Method", "temporal commonsense reasoning is a challenging task as it requires temporal knowledge usually not explicit in text ."], "relation": "used for", "id": "2021.ranlp-srw.12", "year": 2021, "rel_sent": "In this work , we propose an ensemble model for temporal commonsense reasoning .", "forward": false, "src_ids": "2021.ranlp-srw.12_2757"}
{"input": "youtuber embedding training is used for OtherScientificTerm| context: technology is changing the way we consume information and entertainment . youtube streaming video services provide a discussion function that allows video publishers to know what matters most to the people they want to love their brand . through comments , video publishers can better understand the audience 's thoughts and even help video publishers improve their video quality .", "entity": "youtuber embedding training", "output": "audience sentiment", "neg_sample": ["youtuber embedding training is used for OtherScientificTerm", "technology is changing the way we consume information and entertainment .", "youtube streaming video services provide a discussion function that allows video publishers to know what matters most to the people they want to love their brand .", "through comments , video publishers can better understand the audience 's thoughts and even help video publishers improve their video quality ."], "relation": "used for", "id": "2021.ijclclp-2.2", "year": 2021, "rel_sent": "The result validates that YouTuber embedding training is significantly helpful when detecting audience sentiment towards YouTubers .", "forward": true, "src_ids": "2021.ijclclp-2.2_14646"}
{"input": "dependency trees is done by using Method| context: probabilistic distributions over spanning trees in directed graphs are a fundamental model of dependency structure in natural language processing , syntactic dependency trees . in nlp , dependency trees often have an additional root constraint : only one edge may emanate from the root . however , no sampling algorithm has been presented in the literature to account for this additional constraint .", "entity": "dependency trees", "output": "spanning tree sampling algorithms", "neg_sample": ["dependency trees is done by using Method", "probabilistic distributions over spanning trees in directed graphs are a fundamental model of dependency structure in natural language processing , syntactic dependency trees .", "in nlp , dependency trees often have an additional root constraint : only one edge may emanate from the root .", "however , no sampling algorithm has been presented in the literature to account for this additional constraint ."], "relation": "used for", "id": "2021.emnlp-main.824", "year": 2021, "rel_sent": "In this paper , we adapt two spanning tree sampling algorithms tofaithfully sample dependency trees from a graph subject to the root constraint .", "forward": false, "src_ids": "2021.emnlp-main.824_4524"}
{"input": "filtering is used for Method| context: domain - specific neural machine translation ( nmt ) model can provide improved performance , however , it is difficult to always access a domain - specific parallel corpus . iterative back - translation can be used for fine - tuning an nmt model for a domain even if only a monolingual domain corpus is available . the quality of synthetic parallel corpora in terms of closeness to in - domain sentences can play an important role in the performance of the translation model .", "entity": "filtering", "output": "back translation", "neg_sample": ["filtering is used for Method", "domain - specific neural machine translation ( nmt ) model can provide improved performance , however , it is difficult to always access a domain - specific parallel corpus .", "iterative back - translation can be used for fine - tuning an nmt model for a domain even if only a monolingual domain corpus is available .", "the quality of synthetic parallel corpora in terms of closeness to in - domain sentences can play an important role in the performance of the translation model ."], "relation": "used for", "id": "2021.adaptnlp-1.26", "year": 2021, "rel_sent": "Recent works have shown that filtering at different stages of the back translation and weighting the sentences can provide state - of - the - art performance .", "forward": true, "src_ids": "2021.adaptnlp-1.26_1969"}
{"input": "flonet is used for Task| context: such dialogs are grounded in domain - specific flowcharts , which the agent is supposed tofollow during the conversation .", "entity": "flonet", "output": "zero - shot transfer", "neg_sample": ["flonet is used for Task", "such dialogs are grounded in domain - specific flowcharts , which the agent is supposed tofollow during the conversation ."], "relation": "used for", "id": "2021.emnlp-main.357", "year": 2021, "rel_sent": "Our experiments find that FLONET can do zero - shot transfer to unseen flowcharts , and sets a strong baseline for future research .", "forward": true, "src_ids": "2021.emnlp-main.357_4843"}
{"input": "constrained decoding techniques is used for OtherScientificTerm| context: ad hominem attacks are those that target some feature of a person 's character instead of the position the person is maintaining . these attacks are harmful because they propagate implicit biases and diminish a person 's credibility .", "entity": "constrained decoding techniques", "output": "ad hominems", "neg_sample": ["constrained decoding techniques is used for OtherScientificTerm", "ad hominem attacks are those that target some feature of a person 's character instead of the position the person is maintaining .", "these attacks are harmful because they propagate implicit biases and diminish a person 's credibility ."], "relation": "used for", "id": "2021.naacl-main.60", "year": 2021, "rel_sent": "Furthermore , we propose a constrained decoding technique that uses salient n - gram similarity as a soft constraint for top - k sampling to reduce the amount of ad hominems generated .", "forward": true, "src_ids": "2021.naacl-main.60_8478"}
{"input": "annotation framework is used for OtherScientificTerm| context: its effects can reach beyond the online context , contributing to mental or emotional stress on users . automatic tools for detecting abuse can alleviate the issue . however , there is currently a lack of standards for creating datasets in the field .", "entity": "annotation framework", "output": "abusive language", "neg_sample": ["annotation framework is used for OtherScientificTerm", "its effects can reach beyond the online context , contributing to mental or emotional stress on users .", "automatic tools for detecting abuse can alleviate the issue .", "however , there is currently a lack of standards for creating datasets in the field ."], "relation": "used for", "id": "2021.woah-1.20", "year": 2021, "rel_sent": "This paper introduces an annotation framework inspired by legal concepts to define abusive language in the context of online harassment .", "forward": true, "src_ids": "2021.woah-1.20_8613"}
{"input": "factual and coherent sentences is done by using Method| context: following each patient visit , physicians draft long semi - structured clinical summaries called soap notes . while invaluable to clinicians and researchers , creating digital soap notes is burdensome , contributing to physician burnout .", "entity": "factual and coherent sentences", "output": "cluster2sent", "neg_sample": ["factual and coherent sentences is done by using Method", "following each patient visit , physicians draft long semi - structured clinical summaries called soap notes .", "while invaluable to clinicians and researchers , creating digital soap notes is burdensome , contributing to physician burnout ."], "relation": "used for", "id": "2021.acl-long.384", "year": 2021, "rel_sent": "Cluster2Sent outperforms its purely abstractive counterpart by 8 ROUGE-1 points , and produces significantly more factual and coherent sentences as assessed by expert human evaluators .", "forward": false, "src_ids": "2021.acl-long.384_15750"}
{"input": "classification of toxicity is done by using Task| context: binary sequence classification is a standard nlp task with known state - of - the - art methods .", "entity": "classification of toxicity", "output": "multilingual pre - training and data augmentation", "neg_sample": ["classification of toxicity is done by using Task", "binary sequence classification is a standard nlp task with known state - of - the - art methods ."], "relation": "used for", "id": "2021.germeval-1.4", "year": 2021, "rel_sent": "DFKI SLT at GermEval 2021 : Multilingual Pre - training and Data Augmentation for the Classification of Toxicity in Social Media Comments.", "forward": false, "src_ids": "2021.germeval-1.4_12945"}
{"input": "full - scale dataset is used for Task| context: multimodal summarization becomes increasingly significant as it is the basis for question answering , web search , and many other downstream tasks . however , its learning materials have been lacking a holistic organization by integrating resources from various modalities , thereby lagging behind the research progress of this field .", "entity": "full - scale dataset", "output": "multi - modal summarization", "neg_sample": ["full - scale dataset is used for Task", "multimodal summarization becomes increasingly significant as it is the basis for question answering , web search , and many other downstream tasks .", "however , its learning materials have been lacking a holistic organization by integrating resources from various modalities , thereby lagging behind the research progress of this field ."], "relation": "used for", "id": "2021.naacl-main.473", "year": 2021, "rel_sent": "MM - AVS : A Full - Scale Dataset for Multi - modal Summarization.", "forward": true, "src_ids": "2021.naacl-main.473_10795"}
{"input": "pre - trained models is done by using Method| context: as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data . recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod.", "entity": "pre - trained models", "output": "self - training approach", "neg_sample": ["pre - trained models is done by using Method", "as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data .", "recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod."], "relation": "used for", "id": "2021.emnlp-main.142", "year": 2021, "rel_sent": "In this paper , we devise a self - training approach to utilize the abundant unlabeled dialog data tofurther improve state - of - the - art pre - trained models in few - shot learning scenarios for ToD systems .", "forward": false, "src_ids": "2021.emnlp-main.142_11436"}
{"input": "dependency graph is done by using Method| context: people rely on digital task management tools , such as email or to - do apps , to manage their tasks . some of these tasks are large and complex , leading to action paralysis and feelings of being overwhelmed on the part of the user . the micro - productivity literature has shown that such tasks could benefit from being decomposed and organized , in order to reduce user cognitive load .", "entity": "dependency graph", "output": "end - to - end pipeline", "neg_sample": ["dependency graph is done by using Method", "people rely on digital task management tools , such as email or to - do apps , to manage their tasks .", "some of these tasks are large and complex , leading to action paralysis and feelings of being overwhelmed on the part of the user .", "the micro - productivity literature has shown that such tasks could benefit from being decomposed and organized , in order to reduce user cognitive load ."], "relation": "used for", "id": "2021.naacl-main.217", "year": 2021, "rel_sent": "Thus in this paper , we propose a novel end - to - end pipeline that consumes a complex task and induces a dependency graph from unstructured text to represent sub - tasks and their relationships .", "forward": false, "src_ids": "2021.naacl-main.217_9672"}
{"input": "a multi - grained bert is done by using Material| context: pre - trained language models such as bert have exhibited remarkable performances in many tasks in natural language understanding ( nlu ) . in fact , both fine - grained and coarse - grained tokenizations have advantages and disadvantages for learning of pre - trained language models .", "entity": "a multi - grained bert", "output": "english", "neg_sample": ["a multi - grained bert is done by using Material", "pre - trained language models such as bert have exhibited remarkable performances in many tasks in natural language understanding ( nlu ) .", "in fact , both fine - grained and coarse - grained tokenizations have advantages and disadvantages for learning of pre - trained language models ."], "relation": "used for", "id": "2021.findings-acl.37", "year": 2021, "rel_sent": "For English , AMBERT takes both the sequence of words ( fine - grained tokens ) and the sequence of phrases ( coarse - grained tokens ) as input after tokenization , employs one encoder for processing the sequence of words and the other encoder for processing the sequence of the phrases , utilizes shared parameters between the two encoders , and finally creates a sequence of contextualized representations of the words and a sequence of contextualized representations of the phrases .", "forward": false, "src_ids": "2021.findings-acl.37_12472"}
{"input": "ensembles is done by using Method| context: current research on quality estimation of machine translation focuses on the sentence - level quality of the translations .", "entity": "ensembles", "output": "explainability techniques", "neg_sample": ["ensembles is done by using Method", "current research on quality estimation of machine translation focuses on the sentence - level quality of the translations ."], "relation": "used for", "id": "2021.eval4nlp-1.23", "year": 2021, "rel_sent": "Further , we combine explainability methods to ensembles to exploit the strengths of individual explainers to get better explanations .", "forward": false, "src_ids": "2021.eval4nlp-1.23_2645"}
{"input": "three - level hierarchy is used for OtherScientificTerm| context: existing news recommendation methods usually learn a single user embedding for each user from their previous behaviors to represent their overall interest .", "entity": "three - level hierarchy", "output": "user interest", "neg_sample": ["three - level hierarchy is used for OtherScientificTerm", "existing news recommendation methods usually learn a single user embedding for each user from their previous behaviors to represent their overall interest ."], "relation": "used for", "id": "2021.acl-long.423", "year": 2021, "rel_sent": "We use a three - level hierarchy to represent 1 ) overall user interest ; 2 ) user interest in coarse - grained topics like sports ; and 3 ) user interest in fine - grained topics like football .", "forward": true, "src_ids": "2021.acl-long.423_6209"}
{"input": "english is done by using Method| context: integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena .", "entity": "english", "output": "adaptive its", "neg_sample": ["english is done by using Method", "integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena ."], "relation": "used for", "id": "2021.nlp4call-1.2", "year": 2021, "rel_sent": "In the context of developing an adaptive ITS for English as a Foreign Language , we propose a method to automatically derive rich activity models from ordinary exercise specifications .", "forward": false, "src_ids": "2021.nlp4call-1.2_8605"}
{"input": "sufficient data is used for Method| context: multilingual grammatical framework ( gf ) domain grammars have been used in a variety of different applications , including question answering , where concrete syntaxes for parsing questions and generating answers are typically required for each supported language . in low - resourced settings , grammar engineering skills , appropriate knowledge of the use of supported languages in a domain , and appropriate domain data are scarce . this presents a challenge for developing domain specific concrete syntaxes for a gf application grammar , on the one hand , while on the other hand , machine learning techniques for performing questionanswering are hampered by a lack of sufficient data .", "entity": "sufficient data", "output": "neural network", "neg_sample": ["sufficient data is used for Method", "multilingual grammatical framework ( gf ) domain grammars have been used in a variety of different applications , including question answering , where concrete syntaxes for parsing questions and generating answers are typically required for each supported language .", "in low - resourced settings , grammar engineering skills , appropriate knowledge of the use of supported languages in a domain , and appropriate domain data are scarce .", "this presents a challenge for developing domain specific concrete syntaxes for a gf application grammar , on the one hand , while on the other hand , machine learning techniques for performing questionanswering are hampered by a lack of sufficient data ."], "relation": "used for", "id": "2021.cnl-1.4", "year": 2021, "rel_sent": "A Zulu resource grammar is leveraged to create sufficient data to train a neural network that approximates a Zulu concrete syntax for parsing questions in a proof - of - concept question - answering system .", "forward": true, "src_ids": "2021.cnl-1.4_9083"}
{"input": "automatic discriminator model is used for Material| context: recently , pre - trained transformer - based architectures have proven to be very efficient at language modeling and understanding , given that they are trained on a large enough corpus . applications in language generation for arabic are still lagging in comparison to other nlp advances primarily due to the lack of advanced arabic language generation models .", "entity": "automatic discriminator model", "output": "model - generated text", "neg_sample": ["automatic discriminator model is used for Material", "recently , pre - trained transformer - based architectures have proven to be very efficient at language modeling and understanding , given that they are trained on a large enough corpus .", "applications in language generation for arabic are still lagging in comparison to other nlp advances primarily due to the lack of advanced arabic language generation models ."], "relation": "used for", "id": "2021.wanlp-1.21", "year": 2021, "rel_sent": "We thus develop and release an automatic discriminator model with a 98 % percent accuracy in detecting model - generated text .", "forward": true, "src_ids": "2021.wanlp-1.21_9938"}
{"input": "indian languages is done by using Task| context: india is one of the richest language hubs on the earth and is very diverse and multilingual . but apart from a few indian languages , most of them are still considered to be resource poor . since most of the nlp techniques either require linguistic knowledge that can only be developed by experts and native speakers of that language or they require a lot of labelled data which is again expensive to generate , the task of text classification becomes challenging for most of the indian languages .", "entity": "indian languages", "output": "multilingual text classification", "neg_sample": ["indian languages is done by using Task", "india is one of the richest language hubs on the earth and is very diverse and multilingual .", "but apart from a few indian languages , most of them are still considered to be resource poor .", "since most of the nlp techniques either require linguistic knowledge that can only be developed by experts and native speakers of that language or they require a lot of labelled data which is again expensive to generate , the task of text classification becomes challenging for most of the indian languages ."], "relation": "used for", "id": "2021.ranlp-1.3", "year": 2021, "rel_sent": "Efficient Multilingual Text Classification for Indian Languages.", "forward": false, "src_ids": "2021.ranlp-1.3_12145"}
{"input": "automatic short answer grading is done by using Method| context: automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions . it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions .", "entity": "automatic short answer grading", "output": "semantic feature - wise transformation relation network", "neg_sample": ["automatic short answer grading is done by using Method", "automatic short answer grading ( asag ) is the task of assessing students ' short natural language responses to objective questions .", "it is a crucial component of new education platforms , and could support more wide - spread use of constructed response questions to replace cognitively less challenging multiple choice questions ."], "relation": "used for", "id": "2021.emnlp-main.487", "year": 2021, "rel_sent": "A Semantic Feature - Wise Transformation Relation Network for Automatic Short Answer Grading.", "forward": false, "src_ids": "2021.emnlp-main.487_13260"}
{"input": "sentiment is done by using Method| context: with the popularity of the current internet age , online social platforms have provided a bridge for communication between private companies , public organizations , and the public .", "entity": "sentiment", "output": "deep learning model", "neg_sample": ["sentiment is done by using Method", "with the popularity of the current internet age , online social platforms have provided a bridge for communication between private companies , public organizations , and the public ."], "relation": "used for", "id": "2021.rocling-1.27", "year": 2021, "rel_sent": "In addition to consider Valence and Arousal which is the smallest morpheme of emotional information , the dependence relationship between texts is also integrated into the deep learning model to analyze the sentiment .", "forward": false, "src_ids": "2021.rocling-1.27_5421"}
{"input": "long tail label problem is done by using Method| context: conventional entity typing approaches are based on independent classification paradigms , which make them difficult to recognize inter - dependent , long - tailed and fine - grained entity types . in this paper , we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges .", "entity": "long tail label problem", "output": "lrn", "neg_sample": ["long tail label problem is done by using Method", "conventional entity typing approaches are based on independent classification paradigms , which make them difficult to recognize inter - dependent , long - tailed and fine - grained entity types .", "in this paper , we argue that the implicitly entailed extrinsic and intrinsic dependencies between labels can provide critical knowledge to tackle the above challenges ."], "relation": "used for", "id": "2021.emnlp-main.378", "year": 2021, "rel_sent": "Experiments show that LRN achieves the state - of - the - art performance on standard ultra fine - grained entity typing benchmarks , and can also resolve the long tail label problem effectively .", "forward": false, "src_ids": "2021.emnlp-main.378_5727"}
{"input": "atomic organization is done by using OtherScientificTerm| context: the recurrent neural network ( rnn ) language model is a powerful tool for learning arbitrary sequential dependencies in language data . despite its enormous success in representing lexical sequences , little is known about the quality of the lexical representations that it acquires . in this work , we conjecture that it is straightforward to extract lexical representations ( i.e. static word embeddings ) from an rnn , but that the amount of semantic information that is encoded is limited when lexical items in the training data provide redundant semantic information .", "entity": "atomic organization", "output": "scaffolded input", "neg_sample": ["atomic organization is done by using OtherScientificTerm", "the recurrent neural network ( rnn ) language model is a powerful tool for learning arbitrary sequential dependencies in language data .", "despite its enormous success in representing lexical sequences , little is known about the quality of the lexical representations that it acquires .", "in this work , we conjecture that it is straightforward to extract lexical representations ( i.e.", "static word embeddings ) from an rnn , but that the amount of semantic information that is encoded is limited when lexical items in the training data provide redundant semantic information ."], "relation": "used for", "id": "2021.conll-1.32", "year": 2021, "rel_sent": "Scaffolded input promotes atomic organization in the recurrent neural network language model.", "forward": false, "src_ids": "2021.conll-1.32_10124"}
{"input": "attention weights is done by using Method| context: we investigate how sentence - level transformers can be modified into effective sequence labelers at the token level without any direct supervision . existing approaches to zero - shot sequence labeling do not perform well when applied on transformer - based architectures . as transformers contain multiple layers of multi - head self - attention , information in the sentence gets distributed between many tokens , negatively affecting zero - shot token - level performance .", "entity": "attention weights", "output": "soft attention module", "neg_sample": ["attention weights is done by using Method", "we investigate how sentence - level transformers can be modified into effective sequence labelers at the token level without any direct supervision .", "existing approaches to zero - shot sequence labeling do not perform well when applied on transformer - based architectures .", "as transformers contain multiple layers of multi - head self - attention , information in the sentence gets distributed between many tokens , negatively affecting zero - shot token - level performance ."], "relation": "used for", "id": "2021.repl4nlp-1.20", "year": 2021, "rel_sent": "We find that a soft attention module which explicitly encourages sharpness of attention weights can significantly outperform existing methods .", "forward": false, "src_ids": "2021.repl4nlp-1.20_15906"}
{"input": "noisy tokens is done by using Method| context: reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision . although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage . there is an exponential growth in the number of e - commerce users who are not proficient in english . hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india . this can be achieved by an in - domain robust machine translation ( mt ) system . however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc .", "entity": "noisy tokens", "output": "nmt model", "neg_sample": ["noisy tokens is done by using Method", "reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision .", "although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage .", "there is an exponential growth in the number of e - commerce users who are not proficient in english .", "hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india .", "this can be achieved by an in - domain robust machine translation ( mt ) system .", "however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc ."], "relation": "used for", "id": "2021.ecnlp-1.21", "year": 2021, "rel_sent": "In order to make our NMT model robust enough to handle the noisy tokens in the reviews , we integrate a character based language model to generate word vectors and map the noisy tokens with their correct forms .", "forward": false, "src_ids": "2021.ecnlp-1.21_10405"}
{"input": "cnl is used for Task| context: regelspraak is a cnl developed at the dutch tax administration ( dta ) over the last decade . keeping up with frequently changing tax rules poses a formidable challenge to the dta it department . regelspraak is a central asset in ongoing efforts of the dta to attune their tax it systems to automatic execution of tax law . regelspraak now is part of the operational process of rule specification and execution .", "entity": "cnl", "output": "executable tax rules specification", "neg_sample": ["cnl is used for Task", "regelspraak is a cnl developed at the dutch tax administration ( dta ) over the last decade .", "keeping up with frequently changing tax rules poses a formidable challenge to the dta it department .", "regelspraak is a central asset in ongoing efforts of the dta to attune their tax it systems to automatic execution of tax law .", "regelspraak now is part of the operational process of rule specification and execution ."], "relation": "used for", "id": "2021.cnl-1.6", "year": 2021, "rel_sent": "RegelSpraak : a CNL for Executable Tax Rules Specification.", "forward": true, "src_ids": "2021.cnl-1.6_878"}
{"input": "structure of tables is done by using Method| context: existing approaches for table annotation with entities and types either capture the structure of table using graphical models , or learn embeddings of table entries without accounting for the complete syntactic structure .", "entity": "structure of tables", "output": "graph convolutional networks", "neg_sample": ["structure of tables is done by using Method", "existing approaches for table annotation with entities and types either capture the structure of table using graphical models , or learn embeddings of table entries without accounting for the complete syntactic structure ."], "relation": "used for", "id": "2021.eacl-main.102", "year": 2021, "rel_sent": "We propose TabGCN , that uses Graph Convolutional Networks to capture the complete structure of tables , knowledge graph and the training annotations , and jointly learns embeddings for table elements as well as the entities and types .", "forward": false, "src_ids": "2021.eacl-main.102_1950"}
{"input": "multitrainmt is used for Task| context: machine translation is seen as a resource that can support citizens in their attempt to acquire and develop language skills if they are trained in an informed and critical way .", "entity": "multitrainmt", "output": "neural machine translation", "neg_sample": ["multitrainmt is used for Task", "machine translation is seen as a resource that can support citizens in their attempt to acquire and develop language skills if they are trained in an informed and critical way ."], "relation": "used for", "id": "2021.triton-1.21", "year": 2021, "rel_sent": "MultiTraiNMT : Training Materials to Approach Neural Machine Translation from Scratch.", "forward": true, "src_ids": "2021.triton-1.21_9329"}
{"input": "emotion recognition is done by using Material| context: emotion recognition in conversation has received considerable attention recently because of its practical industrial applications . existing methods tend to overlook the immediate mutual interaction between different speakers in the speaker - utterance level , or apply single speaker - agnostic rnn for utterances from different speakers .", "entity": "emotion recognition", "output": "adversarial examples", "neg_sample": ["emotion recognition is done by using Material", "emotion recognition in conversation has received considerable attention recently because of its practical industrial applications .", "existing methods tend to overlook the immediate mutual interaction between different speakers in the speaker - utterance level , or apply single speaker - agnostic rnn for utterances from different speakers ."], "relation": "used for", "id": "2021.maiworkshop-1.3", "year": 2021, "rel_sent": "To improve the robustness and generalization during training , we generate adversarial examples by applying the minor perturbations on multimodal feature inputs , unveiling the benefits of adversarial examples for emotion detection .", "forward": false, "src_ids": "2021.maiworkshop-1.3_1098"}
{"input": "richly contextualized deep representation learning is used for Task| context: evidence plays a crucial role in any biomedical research narrative , providing justification for some claims and refutation for others .", "entity": "richly contextualized deep representation learning", "output": "analysis of scientific discourse structures", "neg_sample": ["richly contextualized deep representation learning is used for Task", "evidence plays a crucial role in any biomedical research narrative , providing justification for some claims and refutation for others ."], "relation": "used for", "id": "2021.eacl-main.218", "year": 2021, "rel_sent": "We apply richly contextualized deep representation learning pre - trained on biomedical domain corpus to the analysis of scientific discourse structures and the extraction of ' evidence fragments ' ( i.e. , the text in the results section describing data presented in a specified subfigure ) from a set of biomedical experimental research articles .", "forward": true, "src_ids": "2021.eacl-main.218_13332"}
{"input": "controlling prosody is used for Method| context: while end-2 - end text - to - speech ( tts ) has made significant progresses over the past few years , these systems still lack intuitive user controls over prosody . for instance , generating speech with fine - grained prosody control ( prosodic prominence , contextually appropriate emotions ) is still an open challenge .", "entity": "controlling prosody", "output": "end - to - end tts", "neg_sample": ["controlling prosody is used for Method", "while end-2 - end text - to - speech ( tts ) has made significant progresses over the past few years , these systems still lack intuitive user controls over prosody .", "for instance , generating speech with fine - grained prosody control ( prosodic prominence , contextually appropriate emotions ) is still an open challenge ."], "relation": "used for", "id": "2021.conll-1.42", "year": 2021, "rel_sent": "Controlling Prosody in End - to - End TTS : A Case Study on Contrastive Focus Generation.", "forward": true, "src_ids": "2021.conll-1.42_2734"}
{"input": "task - specific models is done by using Method| context: the need to deploy large - scale pre - trained models on edge devices under limited computational resources has led to substantial research to compress these large models . however , less attention has been given to compress the task - specific models .", "entity": "task - specific models", "output": "local pruning", "neg_sample": ["task - specific models is done by using Method", "the need to deploy large - scale pre - trained models on edge devices under limited computational resources has led to substantial research to compress these large models .", "however , less attention has been given to compress the task - specific models ."], "relation": "used for", "id": "2021.ranlp-srw.17", "year": 2021, "rel_sent": "Does local pruning offer task - specific models to learn effectively ?.", "forward": false, "src_ids": "2021.ranlp-srw.17_12620"}
{"input": "large - scale pretraining is used for Method| context: identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations . therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer . what makes this problem challenging is the diversity of product types and their attributes and values . existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes . this has limited their scalability and generalization for large scale real world e - commerce applications . previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task .", "entity": "large - scale pretraining", "output": "gpt-2", "neg_sample": ["large - scale pretraining is used for Method", "identifying the value of product attribute is essential for many e - commerce functions such as product search and product recommendations .", "therefore , identifying attribute values from unstructured product descriptions is a critical undertaking for any e - commerce retailer .", "what makes this problem challenging is the diversity of product types and their attributes and values .", "existing methods have typically employed multiple types of machine learning models , each of which handles specific product types or attribute classes .", "this has limited their scalability and generalization for large scale real world e - commerce applications .", "previous approaches for this task have formulated the attribute value extraction as a named entity recognition ( ner ) task or a question answering ( qa ) task ."], "relation": "used for", "id": "2021.ecnlp-1.2", "year": 2021, "rel_sent": "We leverage the large - scale pretraining of the GPT-2 and the T5 text - to - text transformer to create fine - tuned models that can effectively perform this task .", "forward": true, "src_ids": "2021.ecnlp-1.2_1570"}
{"input": "n - best phone sequence hypotheses is done by using Method| context: there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice . due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr .", "entity": "n - best phone sequence hypotheses", "output": "end - to - end asr module", "neg_sample": ["n - best phone sequence hypotheses is done by using Method", "there has been increasing demand to develop effective computer - assisted language training ( capt ) systems , which can provide feedback on mispronunciations and facilitate second - language ( l2 ) learners to improve their speaking proficiency through repeated practice .", "due to the shortage of non - native speech for training the automatic speech recognition ( asr ) module of a capt system , the corresponding mispronunciation detection performance is often affected by imperfect asr ."], "relation": "used for", "id": "2021.rocling-1.17", "year": 2021, "rel_sent": "In the first stage , the speech uttered by an L2 learner is processed by an end - to - end ASR module to produce N - best phone sequence hypotheses .", "forward": false, "src_ids": "2021.rocling-1.17_2497"}
{"input": "diverse dialogue tasks is done by using Method| context: loading models pre - trained on the large - scale corpus in the general domain and fine - tuning them on specific downstream tasks is gradually becoming a paradigm in natural language processing . previous investigations prove that introducing a further pre - training phase between pre - training and fine - tuning phases to adapt the model on the domain - specific unlabeled data can bring positive effects . however , most of these further pre - training works just keep running the conventional pre - training task , e.g. , masked language model , which can be regarded as the domain adaptation to bridge the data distribution gap . after observing diverse downstream tasks , we suggest that different tasks may also need a further pre - training phase with appropriate training tasks to bridge the task formulation gap .", "entity": "diverse dialogue tasks", "output": "pre - training approaches", "neg_sample": ["diverse dialogue tasks is done by using Method", "loading models pre - trained on the large - scale corpus in the general domain and fine - tuning them on specific downstream tasks is gradually becoming a paradigm in natural language processing .", "previous investigations prove that introducing a further pre - training phase between pre - training and fine - tuning phases to adapt the model on the domain - specific unlabeled data can bring positive effects .", "however , most of these further pre - training works just keep running the conventional pre - training task , e.g.", ", masked language model , which can be regarded as the domain adaptation to bridge the data distribution gap .", "after observing diverse downstream tasks , we suggest that different tasks may also need a further pre - training phase with appropriate training tasks to bridge the task formulation gap ."], "relation": "used for", "id": "2021.emnlp-main.178", "year": 2021, "rel_sent": "Different Strokes for Different Folks : Investigating Appropriate Further Pre - training Approaches for Diverse Dialogue Tasks.", "forward": false, "src_ids": "2021.emnlp-main.178_5148"}
{"input": "cross - domain named entity recognition is done by using Method| context: however , most existing techniques focus on augmenting in - domain data in low - resource scenarios where annotated data is quite limited .", "entity": "cross - domain named entity recognition", "output": "data augmentation", "neg_sample": ["cross - domain named entity recognition is done by using Method", "however , most existing techniques focus on augmenting in - domain data in low - resource scenarios where annotated data is quite limited ."], "relation": "used for", "id": "2021.emnlp-main.434", "year": 2021, "rel_sent": "Data Augmentation for Cross - Domain Named Entity Recognition.", "forward": false, "src_ids": "2021.emnlp-main.434_5136"}
{"input": "language modeling is done by using Method| context: syntax is fundamental to our thinking about language . failing to capture the structure of input language could lead to generalization problems and over - parametrization .", "entity": "language modeling", "output": "syntactic ordered memory ( som )", "neg_sample": ["language modeling is done by using Method", "syntax is fundamental to our thinking about language .", "failing to capture the structure of input language could lead to generalization problems and over - parametrization ."], "relation": "used for", "id": "2021.naacl-main.132", "year": 2021, "rel_sent": "Experiments show that SOM can achieve strong results in language modeling , incremental parsing , and syntactic generalization tests while using fewer parameters than other models .", "forward": false, "src_ids": "2021.naacl-main.132_12824"}
{"input": "composer is used for OtherScientificTerm| context: neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks . compositional generalization requires algebraic recombination , i.e. , dynamically recombining structured expressions in a recursive manner . however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination .", "entity": "composer", "output": "latent syntax", "neg_sample": ["composer is used for OtherScientificTerm", "neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks .", "compositional generalization requires algebraic recombination , i.e.", ", dynamically recombining structured expressions in a recursive manner .", "however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination ."], "relation": "used for", "id": "2021.findings-acl.97", "year": 2021, "rel_sent": "Specifically , we learn two modules jointly : a Composer for producing latent syntax , and an Interpreter for assigning semantic operations .", "forward": true, "src_ids": "2021.findings-acl.97_12466"}
{"input": "gaussian linear transformation is used for OtherScientificTerm| context: semi - supervised text classification ( sstc ) mainly works under the spirit of self - training . they initialize the deep classifier by training over labeled texts ; and then alternatively predict unlabeled texts as their pseudo - labels and train the deep classifier over the mixture of labeled and pseudo - labeled texts . naturally , their performance is largely affected by the accuracy of pseudo - labels for unlabeled texts . unfortunately , they often suffer from low accuracy because of the margin bias problem caused by the large difference between representation distributions of labels in sstc .", "entity": "gaussian linear transformation", "output": "balanced label angle variances", "neg_sample": ["gaussian linear transformation is used for OtherScientificTerm", "semi - supervised text classification ( sstc ) mainly works under the spirit of self - training .", "they initialize the deep classifier by training over labeled texts ; and then alternatively predict unlabeled texts as their pseudo - labels and train the deep classifier over the mixture of labeled and pseudo - labeled texts .", "naturally , their performance is largely affected by the accuracy of pseudo - labels for unlabeled texts .", "unfortunately , they often suffer from low accuracy because of the margin bias problem caused by the large difference between representation distributions of labels in sstc ."], "relation": "used for", "id": "2021.acl-long.391", "year": 2021, "rel_sent": "To alleviate this problem , we apply the angular margin loss , and perform Gaussian linear transformation to achieve balanced label angle variances , i.e. , the variance of label angles of texts within the same label .", "forward": true, "src_ids": "2021.acl-long.391_7455"}
{"input": "multi - lingual question generation research is done by using Material| context: question generation is the task of generating coherent and relevant question given context paragraph . recently , with the development of large - scale question answering datasets such as squad , the english question generation has been rapidly developed . however , for other languages such as chinese , the available training data is limited , which hinders the development of question generation in the corresponding language .", "entity": "multi - lingual question generation research", "output": "large - scale chinese question generation dataset", "neg_sample": ["multi - lingual question generation research is done by using Material", "question generation is the task of generating coherent and relevant question given context paragraph .", "recently , with the development of large - scale question answering datasets such as squad , the english question generation has been rapidly developed .", "however , for other languages such as chinese , the available training data is limited , which hinders the development of question generation in the corresponding language ."], "relation": "used for", "id": "2021.findings-acl.199", "year": 2021, "rel_sent": "In addition , we propose a large - scale Chinese question generation dataset containing more than 220k human - generated questions to benefit the multi - lingual question generation research .", "forward": false, "src_ids": "2021.findings-acl.199_11688"}
{"input": "virtual p / q - embedding matrices is used for OtherScientificTerm| context: adversarial training ( at ) as a regularization method has proved its effectiveness on various tasks . though there are successful applications of at on some nlp tasks , the distinguishing characteristics of nlp tasks have not been exploited .", "entity": "virtual p / q - embedding matrices", "output": "global perturbations of words", "neg_sample": ["virtual p / q - embedding matrices is used for OtherScientificTerm", "adversarial training ( at ) as a regularization method has proved its effectiveness on various tasks .", "though there are successful applications of at on some nlp tasks , the distinguishing characteristics of nlp tasks have not been exploited ."], "relation": "used for", "id": "2021.starsem-1.30", "year": 2021, "rel_sent": "To differentiate the roles of passages and questions , PQAT uses additional virtual P / Q - embedding matrices to gather the global perturbations of words from passages and questions separately .", "forward": true, "src_ids": "2021.starsem-1.30_12086"}
{"input": "in - processing fair sampling method is used for Generic| context: internet search affects people 's cognition of the world , so mitigating biases in search results and learning fair models is imperative for social good . we study a unique gender bias in image search in this work : the search images are often gender - imbalanced for gender - neutral natural language queries .", "entity": "in - processing fair sampling method", "output": "training models", "neg_sample": ["in - processing fair sampling method is used for Generic", "internet search affects people 's cognition of the world , so mitigating biases in search results and learning fair models is imperative for social good .", "we study a unique gender bias in image search in this work : the search images are often gender - imbalanced for gender - neutral natural language queries ."], "relation": "used for", "id": "2021.emnlp-main.151", "year": 2021, "rel_sent": "Therefore , we introduce two novel debiasing approaches : an in - processing fair sampling method to address the gender imbalance issue for training models , and a post - processing feature clipping method base on mutual information to debias multimodal representations of pre - trained models .", "forward": true, "src_ids": "2021.emnlp-main.151_12715"}
{"input": "crfr is used for Task| context: although paths of user interests shift in knowledge graphs ( kgs ) can benefit conversational recommender systems ( crs ) , explicit reasoning on kgs has not been well considered in crs , due to the complex of high - order and incomplete paths .", "entity": "crfr", "output": "explicit multi - hop reasoning", "neg_sample": ["crfr is used for Task", "although paths of user interests shift in knowledge graphs ( kgs ) can benefit conversational recommender systems ( crs ) , explicit reasoning on kgs has not been well considered in crs , due to the complex of high - order and incomplete paths ."], "relation": "used for", "id": "2021.emnlp-main.355", "year": 2021, "rel_sent": "We propose CRFR , which effectively does explicit multi - hop reasoning on KGs with a conversational context - based reinforcement learning model .", "forward": true, "src_ids": "2021.emnlp-main.355_11990"}
{"input": "unconditioned generation tasks is done by using Method| context: however , their performance seems to still have a significant margin for improvement .", "entity": "unconditioned generation tasks", "output": "generative adversarial networks ( gans )", "neg_sample": ["unconditioned generation tasks is done by using Method", "however , their performance seems to still have a significant margin for improvement ."], "relation": "used for", "id": "2021.paclic-1.69", "year": 2021, "rel_sent": "For this reason , in this paper we propose a new adversarial training method that tackles some of the limitations of GAN training in unconditioned generation tasks .", "forward": false, "src_ids": "2021.paclic-1.69_1748"}
{"input": "word - level annotation is used for OtherScientificTerm| context: while the ud guidelines provided a general framework for our annotations , language - specific decisions were made necessary by the rich morphology of the polysynthetic language .", "entity": "word - level annotation", "output": "degenerate trees", "neg_sample": ["word - level annotation is used for OtherScientificTerm", "while the ud guidelines provided a general framework for our annotations , language - specific decisions were made necessary by the rich morphology of the polysynthetic language ."], "relation": "used for", "id": "2021.americasnlp-1.14", "year": 2021, "rel_sent": "Word - level annotation results in degenerate trees for some Yupik sentences and often fails to capture syntactic relations that can be manifested at the morpheme level .", "forward": true, "src_ids": "2021.americasnlp-1.14_15568"}
{"input": "seeds is used for Task| context: a common factor in bias measurement methods is the use of hand - curated seed lexicons , but there remains little guidance for their selection .", "entity": "seeds", "output": "sensitive measurements", "neg_sample": ["seeds is used for Task", "a common factor in bias measurement methods is the use of hand - curated seed lexicons , but there remains little guidance for their selection ."], "relation": "used for", "id": "2021.acl-long.148", "year": 2021, "rel_sent": "Seeds developed in one context are often re - used in other contexts , but documentation and evaluation remain necessary precursors to relying on seeds for sensitive measurements .", "forward": true, "src_ids": "2021.acl-long.148_14638"}
{"input": "translations is done by using Material| context: multilingual models have demonstrated impressive cross - lingual transfer performance . however , test sets like xnli are monolingual at the example level . in multilingual communities , it is common for polyglots to code - mix when conversing with each other . this paper will be published in the proceedings of naacl - hlt 2021 .", "entity": "translations", "output": "bilingual dictionaries", "neg_sample": ["translations is done by using Material", "multilingual models have demonstrated impressive cross - lingual transfer performance .", "however , test sets like xnli are monolingual at the example level .", "in multilingual communities , it is common for polyglots to code - mix when conversing with each other .", "this paper will be published in the proceedings of naacl - hlt 2021 ."], "relation": "used for", "id": "2021.calcs-1.19", "year": 2021, "rel_sent": "The former ( PolyGloss ) uses bilingual dictionaries to propose perturbations and translations of the clean example for sense disambiguation .", "forward": false, "src_ids": "2021.calcs-1.19_4304"}
{"input": "transformers is done by using Method| context: transformer - based architectures have become the de - facto standard models for a wide range of natural language processing tasks . however , their memory footprint and high latency are prohibitive for efficient deployment and inference on resource - limited devices .", "entity": "transformers", "output": "quantization", "neg_sample": ["transformers is done by using Method", "transformer - based architectures have become the de - facto standard models for a wide range of natural language processing tasks .", "however , their memory footprint and high latency are prohibitive for efficient deployment and inference on resource - limited devices ."], "relation": "used for", "id": "2021.emnlp-main.627", "year": 2021, "rel_sent": "In this work , we explore quantization for transformers .", "forward": false, "src_ids": "2021.emnlp-main.627_14123"}
{"input": "compoundgrow is used for Method| context: as the excessive pre - training cost arouses the need to improve efficiency , considerable efforts have been made to train bert progressively - start from an inferior but low - cost model and gradually increase the computational complexity .", "entity": "compoundgrow", "output": "bert pre - training", "neg_sample": ["compoundgrow is used for Method", "as the excessive pre - training cost arouses the need to improve efficiency , considerable efforts have been made to train bert progressively - start from an inferior but low - cost model and gradually increase the computational complexity ."], "relation": "used for", "id": "2021.naacl-main.406", "year": 2021, "rel_sent": "In light of our analyses , the proposed method CompoundGrow speeds up BERT pre - training by 73.6 % and 82.2 % for the base and large models respectively while achieving comparable performances .", "forward": true, "src_ids": "2021.naacl-main.406_4270"}
{"input": "summarization models is done by using Method| context: how can we effectively inform content selection in transformer - based abstractive summarization models ?", "entity": "summarization models", "output": "attention head masking technique", "neg_sample": ["summarization models is done by using Method", "how can we effectively inform content selection in transformer - based abstractive summarization models ?"], "relation": "used for", "id": "2021.naacl-main.397", "year": 2021, "rel_sent": "Using attention head masking , we are able to reveal the relation between encoder - decoder attentions and content selection behaviors of summarization models .", "forward": false, "src_ids": "2021.naacl-main.397_4209"}
{"input": "sibling treelstm model is used for OtherScientificTerm| context: drss are document - level representations which encode rich semantic detail pertaining to rhetorical relations , presupposition , and co - reference within and across sentences .", "entity": "sibling treelstm model", "output": "drs structures", "neg_sample": ["sibling treelstm model is used for OtherScientificTerm", "drss are document - level representations which encode rich semantic detail pertaining to rhetorical relations , presupposition , and co - reference within and across sentences ."], "relation": "used for", "id": "2021.naacl-main.35", "year": 2021, "rel_sent": "Our generator relies on a novel sibling treeLSTM model which is able to accurately represent DRS structures and is more generally suited to trees with wide branches .", "forward": true, "src_ids": "2021.naacl-main.35_606"}
{"input": "relational graph induction module is used for OtherScientificTerm| context: identifying causal relations of events is an important task in natural language processing area . however , the task is very challenging , because event causality is usually expressed in diverse forms that often lack explicit causal clues . existing methods can not handle well the problem , especially in the condition of lacking training data . nonetheless , humans can make a correct judgement based on their background knowledge , including descriptive knowledge and relational knowledge .", "entity": "relational graph induction module", "output": "reasoning structure", "neg_sample": ["relational graph induction module is used for OtherScientificTerm", "identifying causal relations of events is an important task in natural language processing area .", "however , the task is very challenging , because event causality is usually expressed in diverse forms that often lack explicit causal clues .", "existing methods can not handle well the problem , especially in the condition of lacking training data .", "nonetheless , humans can make a correct judgement based on their background knowledge , including descriptive knowledge and relational knowledge ."], "relation": "used for", "id": "2021.acl-long.376", "year": 2021, "rel_sent": "To leverage the relational knowledge , we propose a Relational Graph Induction module which is able to automatically learn a reasoning structure for event causality reasoning .", "forward": true, "src_ids": "2021.acl-long.376_6697"}
{"input": "referential form is done by using OtherScientificTerm| context: it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words . studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results .", "entity": "referential form", "output": "referent predictability", "neg_sample": ["referential form is done by using OtherScientificTerm", "it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words .", "studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results ."], "relation": "used for", "id": "2021.conll-1.36", "year": 2021, "rel_sent": "Does referent predictability affect the choice of referential form ? A computational approach using masked coreference resolution.", "forward": false, "src_ids": "2021.conll-1.36_7905"}
{"input": "formal cnls is used for Task| context: communication protocols allow to standardize communication . they are typically implemented to standardize the exchange of messages in the area of information systems .", "entity": "formal cnls", "output": "logistics", "neg_sample": ["formal cnls is used for Task", "communication protocols allow to standardize communication .", "they are typically implemented to standardize the exchange of messages in the area of information systems ."], "relation": "used for", "id": "2021.cnl-1.14", "year": 2021, "rel_sent": "Here an artifact that allows applying formal CNLs for communication in the domain of logistics is presented .", "forward": true, "src_ids": "2021.cnl-1.14_16105"}
{"input": "distillation method is used for OtherScientificTerm| context: pretrained language models like bert have achieved good results on nlp tasks , but are impractical on resource - limited devices due to memory footprint . a large fraction of this footprint comes from the input embeddings with large input vocabulary and embedding dimensions . existing knowledge distillation methods used for model compression can not be directly applied to train student models with reduced vocabulary sizes .", "entity": "distillation method", "output": "teacher and student embeddings", "neg_sample": ["distillation method is used for OtherScientificTerm", "pretrained language models like bert have achieved good results on nlp tasks , but are impractical on resource - limited devices due to memory footprint .", "a large fraction of this footprint comes from the input embeddings with large input vocabulary and embedding dimensions .", "existing knowledge distillation methods used for model compression can not be directly applied to train student models with reduced vocabulary sizes ."], "relation": "used for", "id": "2021.eacl-main.238", "year": 2021, "rel_sent": "To this end , we propose a distillation method to align the teacher and student embeddings via mixed - vocabulary training .", "forward": true, "src_ids": "2021.eacl-main.238_75"}
{"input": "synonym substitution is done by using Method| context: recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries .", "entity": "synonym substitution", "output": "text - to - sql models", "neg_sample": ["synonym substitution is done by using Method", "recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries ."], "relation": "used for", "id": "2021.acl-long.195", "year": 2021, "rel_sent": "In this work , we investigate the robustness of text - to - SQL models to synonym substitution .", "forward": false, "src_ids": "2021.acl-long.195_15356"}
{"input": "derivational morphology is used for Method| context: how does the input segmentation of pretrained language models ( plms ) affect their interpretations of complex words ?", "entity": "derivational morphology", "output": "bert", "neg_sample": ["derivational morphology is used for Method", "how does the input segmentation of pretrained language models ( plms ) affect their interpretations of complex words ?"], "relation": "used for", "id": "2021.acl-long.279", "year": 2021, "rel_sent": "Superbizarre Is Not Superb : Derivational Morphology Improves BERT 's Interpretation of Complex Words.", "forward": true, "src_ids": "2021.acl-long.279_12844"}
{"input": "cross - lingual natural language inference is done by using Method| context: due to recent pretrained multilingual representation models , it has become feasible to exploit labeled data from one language to train a cross - lingual model that can then be applied to multiple new languages . in practice , however , we still face the problem of scarce labeled data , leading to subpar results .", "entity": "cross - lingual natural language inference", "output": "data augmentation strategy", "neg_sample": ["cross - lingual natural language inference is done by using Method", "due to recent pretrained multilingual representation models , it has become feasible to exploit labeled data from one language to train a cross - lingual model that can then be applied to multiple new languages .", "in practice , however , we still face the problem of scarce labeled data , leading to subpar results ."], "relation": "used for", "id": "2021.acl-long.401", "year": 2021, "rel_sent": "In this paper , we propose a novel data augmentation strategy for better cross - lingual natural language inference by enriching the data to reflect more diversity in a semantically faithful way .", "forward": false, "src_ids": "2021.acl-long.401_15546"}
{"input": "dependency - free method is used for Material| context: we describe models focused at the understudied problem of translating between monolingual and code - mixed language pairs . more specifically , we offer a wide range of models that convert monolingual english text into hinglish ( code - mixed hindi and english ) .", "entity": "dependency - free method", "output": "code - mixed texts", "neg_sample": ["dependency - free method is used for Material", "we describe models focused at the understudied problem of translating between monolingual and code - mixed language pairs .", "more specifically , we offer a wide range of models that convert monolingual english text into hinglish ( code - mixed hindi and english ) ."], "relation": "used for", "id": "2021.calcs-1.6", "year": 2021, "rel_sent": "Given the paucity of training data for code - mixing , we also propose a dependency - free method for generating code - mixed texts from bilingual distributed representations that we exploit for improving language model performance .", "forward": true, "src_ids": "2021.calcs-1.6_346"}
{"input": "analogies is done by using Method| context: analogies play a central role in human commonsense reasoning . the ability to recognize analogies such as ' eye is to seeing what ear is to hearing ' , sometimes referred to as analogical proportions , shape how we structure knowledge and understand language . surprisingly , however , the task of identifying such analogies has not yet received much attention in the language model era .", "entity": "analogies", "output": "pre - trained language models", "neg_sample": ["analogies is done by using Method", "analogies play a central role in human commonsense reasoning .", "the ability to recognize analogies such as ' eye is to seeing what ear is to hearing ' , sometimes referred to as analogical proportions , shape how we structure knowledge and understand language .", "surprisingly , however , the task of identifying such analogies has not yet received much attention in the language model era ."], "relation": "used for", "id": "2021.acl-long.280", "year": 2021, "rel_sent": "BERT is to NLP what AlexNet is to CV : Can Pre - Trained Language Models Identify Analogies ?.", "forward": false, "src_ids": "2021.acl-long.280_6631"}
{"input": "computational models is used for Method| context: mental health conditions remain underdiagnosed even in countries with common access to advanced medical care . the ability to accurately and efficiently predict mood from easily collectible data has several important implications for the early detection , intervention , and treatment of mental health disorders . one promising data source to help monitor human behavior is daily smartphone usage . however , care must be taken to summarize behaviors without identifying the user through personal ( e.g. , personally identifiable information ) or protected ( e.g. , race , gender ) attributes .", "entity": "computational models", "output": "language and multimodal representations", "neg_sample": ["computational models is used for Method", "mental health conditions remain underdiagnosed even in countries with common access to advanced medical care .", "the ability to accurately and efficiently predict mood from easily collectible data has several important implications for the early detection , intervention , and treatment of mental health disorders .", "one promising data source to help monitor human behavior is daily smartphone usage .", "however , care must be taken to summarize behaviors without identifying the user through personal ( e.g.", ", personally identifiable information ) or protected ( e.g.", ", race , gender ) attributes ."], "relation": "used for", "id": "2021.acl-long.322", "year": 2021, "rel_sent": "Using computational models , we find that language and multimodal representations of mobile typed text ( spanning typed characters , words , keystroke timings , and app usage ) are predictive of daily mood .", "forward": true, "src_ids": "2021.acl-long.322_14710"}
{"input": "word embeddings is used for OtherScientificTerm| context: reporting and providing test sets for harmful bias in nlp applications is essential for building a robust understanding of the current problem . bias in downstream applications can stem from training data , word embeddings , or be amplified by the model in use . however , focusing on biased word embeddings is potentially the most impactful first step due to their universal nature .", "entity": "word embeddings", "output": "marked attribute effect", "neg_sample": ["word embeddings is used for OtherScientificTerm", "reporting and providing test sets for harmful bias in nlp applications is essential for building a robust understanding of the current problem .", "bias in downstream applications can stem from training data , word embeddings , or be amplified by the model in use .", "however , focusing on biased word embeddings is potentially the most impactful first step due to their universal nature ."], "relation": "used for", "id": "2021.findings-acl.369", "year": 2021, "rel_sent": "Here we seek to understand how the intrinsic properties of word embeddings contribute to this observed marked attribute effect , and whether current post - processing methods address the bias successfully .", "forward": true, "src_ids": "2021.findings-acl.369_4971"}
{"input": "audio - based system is used for Material| context: medical simulators provide a controlled environment for training and assessing clinical skills . however , as an assessment platform , it requires the presence of an experienced examiner to provide performance feedback , commonly preformed using a task specific checklist . this makes the assessment process inefficient and expensive . furthermore , this evaluation method does not provide medical practitioners the opportunity for independent training .", "entity": "audio - based system", "output": "simulation platforms", "neg_sample": ["audio - based system is used for Material", "medical simulators provide a controlled environment for training and assessing clinical skills .", "however , as an assessment platform , it requires the presence of an experienced examiner to provide performance feedback , commonly preformed using a task specific checklist .", "this makes the assessment process inefficient and expensive .", "furthermore , this evaluation method does not provide medical practitioners the opportunity for independent training ."], "relation": "used for", "id": "2021.nlpmc-1.4", "year": 2021, "rel_sent": "Developing an audio - based system will improve the experience of a wide range of simulation platforms .", "forward": true, "src_ids": "2021.nlpmc-1.4_13388"}
{"input": "layers is done by using OtherScientificTerm| context: we demonstrate that , hidden within one - layer randomly weighted neural networks , there exist subnetworks that can achieve impressive performance , without ever modifying the weight initializations , on machine translation tasks .", "entity": "layers", "output": "binary masks", "neg_sample": ["layers is done by using OtherScientificTerm", "we demonstrate that , hidden within one - layer randomly weighted neural networks , there exist subnetworks that can achieve impressive performance , without ever modifying the weight initializations , on machine translation tasks ."], "relation": "used for", "id": "2021.emnlp-main.231", "year": 2021, "rel_sent": "Tofind subnetworks for one - layer randomly weighted neural networks , we apply different binary masks to the same weight matrix to generate different layers .", "forward": false, "src_ids": "2021.emnlp-main.231_14673"}
{"input": "memoryaugmented multi - decoder network is used for OtherScientificTerm| context: dialogue policy learning , a subtask that determines the content of system response generation and then the degree of task completion , is essential for task - oriented dialogue systems .", "entity": "memoryaugmented multi - decoder network", "output": "system actions", "neg_sample": ["memoryaugmented multi - decoder network is used for OtherScientificTerm", "dialogue policy learning , a subtask that determines the content of system response generation and then the degree of task completion , is essential for task - oriented dialogue systems ."], "relation": "used for", "id": "2021.findings-acl.39", "year": 2021, "rel_sent": "Then , we propose a memoryaugmented multi - decoder network to generate the system actions conditioned on the candidate actions , which allows the network to adaptively select key information in the candidate actions and ignore noises .", "forward": true, "src_ids": "2021.findings-acl.39_980"}
{"input": "plug - in lexicon incorporation approach is used for Task| context: incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks . however , previous works commonly have difficulty dealing with large - scale dynamic lexicons which often cause excessive matching noise and problems of frequent updates .", "entity": "plug - in lexicon incorporation approach", "output": "bert based sequence labeling tasks", "neg_sample": ["plug - in lexicon incorporation approach is used for Task", "incorporating lexical knowledge into deep learning models has been proved to be very effective for sequence labeling tasks .", "however , previous works commonly have difficulty dealing with large - scale dynamic lexicons which often cause excessive matching noise and problems of frequent updates ."], "relation": "used for", "id": "2021.emnlp-main.211", "year": 2021, "rel_sent": "In this paper , we propose DyLex , a plug - in lexicon incorporation approach for BERT based sequence labeling tasks .", "forward": true, "src_ids": "2021.emnlp-main.211_2906"}
{"input": "stance - relevant knowledge probing is used for Method| context: stance detection ( sd ) entails classifying the sentiment of a text towards a given target , and is a relevant sub - task for opinion mining and social media analysis . recent works have explored knowledge infusion supplementing the linguistic competence and latent knowledge of large pre - trained language models with structured knowledge graphs ( kgs ) , yet few works have applied such methods to the sd task .", "entity": "stance - relevant knowledge probing", "output": "transformers - based pre - trained models", "neg_sample": ["stance - relevant knowledge probing is used for Method", "stance detection ( sd ) entails classifying the sentiment of a text towards a given target , and is a relevant sub - task for opinion mining and social media analysis .", "recent works have explored knowledge infusion supplementing the linguistic competence and latent knowledge of large pre - trained language models with structured knowledge graphs ( kgs ) , yet few works have applied such methods to the sd task ."], "relation": "used for", "id": "2021.wnut-1.34", "year": 2021, "rel_sent": "In this work , we first perform stance - relevant knowledge probing on Transformers - based pre - trained models in a zero - shot setting , showing these models ' latent real - world knowledge about SD targets and their sensitivity to context .", "forward": true, "src_ids": "2021.wnut-1.34_10018"}
{"input": "fine - tuning of models is used for OtherScientificTerm| context: while fine - tuning models on additional data has been used to mitigate them , a common issue is that of catastrophic forgetting of the original training dataset .", "entity": "fine - tuning of models", "output": "biases", "neg_sample": ["fine - tuning of models is used for OtherScientificTerm", "while fine - tuning models on additional data has been used to mitigate them , a common issue is that of catastrophic forgetting of the original training dataset ."], "relation": "used for", "id": "2021.eacl-main.82", "year": 2021, "rel_sent": "In this paper , we show that elastic weight consolidation ( EWC ) allows fine - tuning of models to mitigate biases while being less susceptible to catastrophic forgetting .", "forward": true, "src_ids": "2021.eacl-main.82_14036"}
{"input": "low - shot wsd is done by using Method| context: current models for word sense disambiguation ( wsd ) struggle to disambiguate rare senses , despite reaching human performance on global wsd metrics . this stems from a lack of data for both modeling and evaluating rare senses in existing wsd datasets .", "entity": "low - shot wsd", "output": "few - shot examples of word senses", "neg_sample": ["low - shot wsd is done by using Method", "current models for word sense disambiguation ( wsd ) struggle to disambiguate rare senses , despite reaching human performance on global wsd metrics .", "this stems from a lack of data for both modeling and evaluating rare senses in existing wsd datasets ."], "relation": "used for", "id": "2021.eacl-main.36", "year": 2021, "rel_sent": "Finally , we find humans outperform the best baseline models on FEWS , indicating that FEWS will support significant future work on low - shot WSD .", "forward": false, "src_ids": "2021.eacl-main.36_13521"}
{"input": "random forest classifier is used for Task| context: whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics , there is still ample scope for improvement regarding computational approaches .", "entity": "random forest classifier", "output": "automatic recognition of idioms", "neg_sample": ["random forest classifier is used for Task", "whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics , there is still ample scope for improvement regarding computational approaches ."], "relation": "used for", "id": "2021.mwe-1.3", "year": 2021, "rel_sent": "To this end , we apply a Random Forest classifier to analyze the individual contribution of features for automatically detecting idioms , and study the trade - off between recall and precision .", "forward": true, "src_ids": "2021.mwe-1.3_11734"}
{"input": "ease is used for Task| context: ease is based on the pattern of answers provided by multiple annotators to a given question .", "entity": "ease", "output": "training / fine - tuning", "neg_sample": ["ease is used for Task", "ease is based on the pattern of answers provided by multiple annotators to a given question ."], "relation": "used for", "id": "2021.naacl-main.192", "year": 2021, "rel_sent": "Second , we show that EASE can be successfully used to select the most - informative samples for training / fine - tuning .", "forward": true, "src_ids": "2021.naacl-main.192_15590"}
{"input": "zero - shot approach is used for Task| context: it becomes more important with the rapid extension of the dialogue systems ' functionality .", "entity": "zero - shot approach", "output": "natural language understanding", "neg_sample": ["zero - shot approach is used for Task", "it becomes more important with the rapid extension of the dialogue systems ' functionality ."], "relation": "used for", "id": "2021.ranlp-1.25", "year": 2021, "rel_sent": "InFoBERT : Zero - Shot Approach to Natural Language Understanding Using Contextualized Word Embedding.", "forward": true, "src_ids": "2021.ranlp-1.25_12964"}
{"input": "arg - ctrl is used for OtherScientificTerm| context: we rely on arguments in our daily lives to deliver our opinions and base them on evidence , making them more convincing in turn . however , finding and formulating arguments can be challenging .", "entity": "arg - ctrl", "output": "aspect - specific arguments", "neg_sample": ["arg - ctrl is used for OtherScientificTerm", "we rely on arguments in our daily lives to deliver our opinions and base them on evidence , making them more convincing in turn .", "however , finding and formulating arguments can be challenging ."], "relation": "used for", "id": "2021.naacl-main.34", "year": 2021, "rel_sent": "Our evaluation shows that the Arg - CTRL is able to generate high - quality , aspect - specific arguments , applicable to automatic counter - argument generation .", "forward": true, "src_ids": "2021.naacl-main.34_5855"}
{"input": "large - scale meta - learning is used for Task| context: meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks . however , the efficacy of meta - learning crucially depends on the distribution of tasks available for training , and this is often assumed to be known a priori or constructed from limited supervised datasets .", "entity": "large - scale meta - learning", "output": "nlp", "neg_sample": ["large - scale meta - learning is used for Task", "meta - learning considers the problem of learning an efficient learning process that can leverage its past experience to accurately solve new tasks .", "however , the efficacy of meta - learning crucially depends on the distribution of tasks available for training , and this is often assumed to be known a priori or constructed from limited supervised datasets ."], "relation": "used for", "id": "2021.emnlp-main.469", "year": 2021, "rel_sent": "In this work , we aim to provide task distributions for meta - learning by considering self - supervised tasks automatically proposed from unlabeled text , to enable large - scale meta - learning in NLP .", "forward": true, "src_ids": "2021.emnlp-main.469_8235"}
{"input": "word embedding models is used for Task| context: eye - tracking psycholinguistic studies have suggested that context - word semantic coherence and predictability influence language processing during the reading activity .", "entity": "word embedding models", "output": "eye - tracking features prediction", "neg_sample": ["word embedding models is used for Task", "eye - tracking psycholinguistic studies have suggested that context - word semantic coherence and predictability influence language processing during the reading activity ."], "relation": "used for", "id": "2021.iwcs-1.9", "year": 2021, "rel_sent": "Looking for a Role for Word Embeddings in Eye - Tracking Features Prediction : Does Semantic Similarity Help ?.", "forward": true, "src_ids": "2021.iwcs-1.9_5915"}
{"input": "entity segmentation is used for Task| context: these papers rely on different resources and include features not related to the use of gazetteers , rendering impossible the comparison of the relative effectiveness of the approaches .", "entity": "entity segmentation", "output": "named entity recognition", "neg_sample": ["entity segmentation is used for Task", "these papers rely on different resources and include features not related to the use of gazetteers , rendering impossible the comparison of the relative effectiveness of the approaches ."], "relation": "used for", "id": "2021.findings-acl.349", "year": 2021, "rel_sent": "The Utility and Interplay of Gazetteers and Entity Segmentation for Named Entity Recognition in English.", "forward": true, "src_ids": "2021.findings-acl.349_9422"}
{"input": "word2vec is used for Method| context: with the rapid growth in technology , social media activity has seen a boom across all age groups . it is humanly impossible to check all the tweets , comments and status manually whether they follow proper community guidelines . a lot of toxicity is regularly posted on these social media platforms .", "entity": "word2vec", "output": "flairnlp framework", "neg_sample": ["word2vec is used for Method", "with the rapid growth in technology , social media activity has seen a boom across all age groups .", "it is humanly impossible to check all the tweets , comments and status manually whether they follow proper community guidelines .", "a lot of toxicity is regularly posted on these social media platforms ."], "relation": "used for", "id": "2021.semeval-1.128", "year": 2021, "rel_sent": "To solve this challenging problem , authors have combined concepts of Linked List for pre - processing and then used the idea of stacked embeddings like BERT Embeddings , Flair Embeddings and Word2Vec on the flairNLP framework to get the desired results .", "forward": true, "src_ids": "2021.semeval-1.128_5026"}
{"input": "decoupled method is done by using OtherScientificTerm| context: recently , text - to - sql for multi - turn dialogue has attracted great interest . here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history . current approaches mostly employ end - to - end models and consequently face two challenges . first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement . second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity .", "entity": "decoupled method", "output": "annotated rewrite cases", "neg_sample": ["decoupled method is done by using OtherScientificTerm", "recently , text - to - sql for multi - turn dialogue has attracted great interest .", "here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history .", "current approaches mostly employ end - to - end models and consequently face two challenges .", "first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement .", "second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity ."], "relation": "used for", "id": "2021.findings-acl.270", "year": 2021, "rel_sent": "With just a few annotated rewrite cases , the decoupled method outperforms the released state - of - the - art endto - end models on both SParC and CoSQL datasets .", "forward": false, "src_ids": "2021.findings-acl.270_4341"}
{"input": "internet slangs is done by using Method| context: user - generated texts include various types of stylistic properties , or noises . such texts are not properly processed by existing morpheme analyzers or language models based on formal texts such as encyclopedias or news articles .", "entity": "internet slangs", "output": "k - mt", "neg_sample": ["internet slangs is done by using Method", "user - generated texts include various types of stylistic properties , or noises .", "such texts are not properly processed by existing morpheme analyzers or language models based on formal texts such as encyclopedias or news articles ."], "relation": "used for", "id": "2021.wnut-1.45", "year": 2021, "rel_sent": "Through our tests , we found that K - MT is better fit to process internet slangs , proper nouns , and coinages , compared to a morpheme analyzer and a character - level WordPiece tokenizer .", "forward": false, "src_ids": "2021.wnut-1.45_1644"}
{"input": "language models is used for OtherScientificTerm| context: creating explanations for answers to science questions is a challenging task that requires multi - hop inference over a large set of fact sentences .", "entity": "language models", "output": "relevance scores", "neg_sample": ["language models is used for OtherScientificTerm", "creating explanations for answers to science questions is a challenging task that requires multi - hop inference over a large set of fact sentences ."], "relation": "used for", "id": "11.textgraphs-1.20", "year": 2021, "rel_sent": "Our system , which achieved second place on the Shared Task leaderboard , combines initial statement retrieval ; language models trained to predict the relevance scores ; and ensembling of a number of the resulting rankings .", "forward": true, "src_ids": "11.textgraphs-1.20_7856"}
{"input": "multilingual embeddings is used for Method| context: nmt systems require a large amount of parallel corpora to obtain good quality translations .", "entity": "multilingual embeddings", "output": "transfer learning", "neg_sample": ["multilingual embeddings is used for Method", "nmt systems require a large amount of parallel corpora to obtain good quality translations ."], "relation": "used for", "id": "2021.mtsummit-research.4", "year": 2021, "rel_sent": "Techniques such as Phrase Table Injection ( PTI ) and back - translation and mixing of language corpora are used for enhancing the parallel data ; whereas pivoting and multilingual embeddings are used to leverage transfer learning .", "forward": true, "src_ids": "2021.mtsummit-research.4_759"}
{"input": "stacked embeddings is used for Method| context: with the rapid growth in technology , social media activity has seen a boom across all age groups . it is humanly impossible to check all the tweets , comments and status manually whether they follow proper community guidelines . a lot of toxicity is regularly posted on these social media platforms .", "entity": "stacked embeddings", "output": "flairnlp framework", "neg_sample": ["stacked embeddings is used for Method", "with the rapid growth in technology , social media activity has seen a boom across all age groups .", "it is humanly impossible to check all the tweets , comments and status manually whether they follow proper community guidelines .", "a lot of toxicity is regularly posted on these social media platforms ."], "relation": "used for", "id": "2021.semeval-1.128", "year": 2021, "rel_sent": "To solve this challenging problem , authors have combined concepts of Linked List for pre - processing and then used the idea of stacked embeddings like BERT Embeddings , Flair Embeddings and Word2Vec on the flairNLP framework to get the desired results .", "forward": true, "src_ids": "2021.semeval-1.128_5025"}
{"input": "natural language processing is used for Task| context: the field of natural language processing ( nlp ) changes rapidly , requiring course offerings to adjust with those changes , and nlp is not just for computer scientists ; it 's a field that should be accessible to anyone who has a sufficient background .", "entity": "natural language processing", "output": "data scientists", "neg_sample": ["natural language processing is used for Task", "the field of natural language processing ( nlp ) changes rapidly , requiring course offerings to adjust with those changes , and nlp is not just for computer scientists ; it 's a field that should be accessible to anyone who has a sufficient background ."], "relation": "used for", "id": "2021.teachingnlp-1.21", "year": 2021, "rel_sent": "Natural Language Processing for Computer Scientists and Data Scientists at a Large State University.", "forward": true, "src_ids": "2021.teachingnlp-1.21_9569"}
{"input": "contrastive learning is used for Task| context: despite pre - trained language models have proven useful for learning high - quality semantic representations , these models are still vulnerable to simple perturbations . recent works aimed to improve the robustness of pre - trained models mainly focus on adversarial training from perturbed examples with similar semantics , neglecting the utilization of different or even opposite semantics . different from the image processing field , the text is discrete and few word substitutions can cause significant semantic changes .", "entity": "contrastive learning", "output": "natural language understanding", "neg_sample": ["contrastive learning is used for Task", "despite pre - trained language models have proven useful for learning high - quality semantic representations , these models are still vulnerable to simple perturbations .", "recent works aimed to improve the robustness of pre - trained models mainly focus on adversarial training from perturbed examples with similar semantics , neglecting the utilization of different or even opposite semantics .", "different from the image processing field , the text is discrete and few word substitutions can cause significant semantic changes ."], "relation": "used for", "id": "2021.acl-long.181", "year": 2021, "rel_sent": "CLINE : Contrastive Learning with Semantic Negative Examples for Natural Language Understanding.", "forward": true, "src_ids": "2021.acl-long.181_16004"}
{"input": "intra - sentence level semantics is used for OtherScientificTerm| context: sentence - level extractive text summarization aims to select important sentences from a given document . however , it is very challenging to model the importance of sentences .", "entity": "intra - sentence level semantics", "output": "internal semantic structure", "neg_sample": ["intra - sentence level semantics is used for OtherScientificTerm", "sentence - level extractive text summarization aims to select important sentences from a given document .", "however , it is very challenging to model the importance of sentences ."], "relation": "used for", "id": "2021.emnlp-main.331", "year": 2021, "rel_sent": "In particular , intra - sentence level semantics leverage Frames and Frame Elements to model internal semantic structure within a sentence , while inter - sentence level semantics leverage Frame - to - Frame relations to model relationships among sentences .", "forward": true, "src_ids": "2021.emnlp-main.331_1128"}
{"input": "german is done by using Method| context: we apply neural coreference resolution to german , surpassing the previous state - of - theart performance by a wide margin of 10- 30 points f1 across three established datasets for german .", "entity": "german", "output": "neural end - to - end coreference resolution", "neg_sample": ["german is done by using Method", "we apply neural coreference resolution to german , surpassing the previous state - of - theart performance by a wide margin of 10- 30 points f1 across three established datasets for german ."], "relation": "used for", "id": "2021.konvens-1.15", "year": 2021, "rel_sent": "Neural End - to - end Coreference Resolution for German in Different Domains.", "forward": false, "src_ids": "2021.konvens-1.15_15874"}
{"input": "faroese is done by using Material| context: we describe the application of language technology methods and resources devised for icelandic , a north germanic language with about 300,000 speakers , in digital language resource creation for faroese , a north germanic language with about 50,000 speakers .", "entity": "faroese", "output": "pos - tagged corpus", "neg_sample": ["faroese is done by using Material", "we describe the application of language technology methods and resources devised for icelandic , a north germanic language with about 300,000 speakers , in digital language resource creation for faroese , a north germanic language with about 50,000 speakers ."], "relation": "used for", "id": "2021.computel-1.9", "year": 2021, "rel_sent": "To achieve this , a state - of - the - art neural PoS tagger for Icelandic , ABLTagger , was trained on a 100,000 word PoS - tagged corpus for Faroese , standardised with methods previously applied to Icelandic corpora .", "forward": false, "src_ids": "2021.computel-1.9_14354"}
{"input": "time - stamped language model ( tslm ) is used for OtherScientificTerm| context: tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process .", "entity": "time - stamped language model ( tslm )", "output": "event information", "neg_sample": ["time - stamped language model ( tslm ) is used for OtherScientificTerm", "tracking entities throughout a procedure described in a text is challenging due to the dynamic nature of the world described in the process ."], "relation": "used for", "id": "2021.naacl-main.362", "year": 2021, "rel_sent": "Secondly , since the transformer - based language models can not encode the flow of events by themselves , we propose a Time - Stamped Language Model ( TSLM ) to encode event information in LMs architecture by introducing the timestamp encoding .", "forward": true, "src_ids": "2021.naacl-main.362_6823"}
{"input": "inflection of sentences is done by using Task| context: morphological tasks have gained decent popularity within the nlp community in the recent years , with large multi - lingual datasets providing morphological analysis of words , either in or out of context . however , the lack of a clear linguistic definition for words destines the annotative work to be incomplete and mired in inconsistencies , especially cross - linguistically .", "entity": "inflection of sentences", "output": "annotation", "neg_sample": ["inflection of sentences is done by using Task", "morphological tasks have gained decent popularity within the nlp community in the recent years , with large multi - lingual datasets providing morphological analysis of words , either in or out of context .", "however , the lack of a clear linguistic definition for words destines the annotative work to be incomplete and mired in inconsistencies , especially cross - linguistically ."], "relation": "used for", "id": "2021.mrl-1.23", "year": 2021, "rel_sent": "To allow annotation for sentence - inflection we define a morphological annotation scheme by a fixed set of inflectional features .", "forward": false, "src_ids": "2021.mrl-1.23_15763"}
{"input": "multi - task learning - based framework is used for Task| context: large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade .", "entity": "multi - task learning - based framework", "output": "multimodal classification task", "neg_sample": ["multi - task learning - based framework is used for Task", "large - scale multi - modal classification aim to distinguish between different multi - modal data , and it has drawn dramatically attentions since last decade ."], "relation": "used for", "id": "2021.maiworkshop-1.5", "year": 2021, "rel_sent": "In this paper , we propose a multi - task learning - based framework for the multimodal classification task , which consists of two branches : multi - modal autoencoder branch and attention - based multi - modal modeling branch .", "forward": true, "src_ids": "2021.maiworkshop-1.5_11490"}
{"input": "arabic language understanding is done by using Method| context: advances in english language representation enabled a more sample - efficient pre - training task by efficiently learning an encoder that classifies token replacements accurately ( electra ) . which , instead of training a model to recover masked tokens , it trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network . on the other hand , current arabic language representation approaches rely only on pretraining via masked language modeling .", "entity": "arabic language understanding", "output": "pre - training text discriminators", "neg_sample": ["arabic language understanding is done by using Method", "advances in english language representation enabled a more sample - efficient pre - training task by efficiently learning an encoder that classifies token replacements accurately ( electra ) .", "which , instead of training a model to recover masked tokens , it trains a discriminator model to distinguish true input tokens from corrupted tokens that were replaced by a generator network .", "on the other hand , current arabic language representation approaches rely only on pretraining via masked language modeling ."], "relation": "used for", "id": "2021.wanlp-1.20", "year": 2021, "rel_sent": "AraELECTRA : Pre - Training Text Discriminators for Arabic Language Understanding.", "forward": false, "src_ids": "2021.wanlp-1.20_16039"}
{"input": "graph attention network is used for Method| context: external syntactic and semantic information has been largely ignored by existing neural coreference resolution models .", "entity": "graph attention network", "output": "syntactically and semantically augmented word representation", "neg_sample": ["graph attention network is used for Method", "external syntactic and semantic information has been largely ignored by existing neural coreference resolution models ."], "relation": "used for", "id": "2021.naacl-main.125", "year": 2021, "rel_sent": "By applying a graph attention network , we can obtain syntactically and semantically augmented word representation , which can be integrated using an attentive integration layer and gating mechanism .", "forward": true, "src_ids": "2021.naacl-main.125_16044"}
{"input": "operationalizations is used for OtherScientificTerm| context: while its implications on language production have been well explored , the hypothesis potentially makes predictions about language comprehension and linguistic acceptability as well . further , it is unclear how uniformity in a linguistic signal - or lack thereof - should be measured , and over which linguistic unit , e.g. , the sentence or language level , this uniformity should hold .", "entity": "operationalizations", "output": "uniform information density", "neg_sample": ["operationalizations is used for OtherScientificTerm", "while its implications on language production have been well explored , the hypothesis potentially makes predictions about language comprehension and linguistic acceptability as well .", "further , it is unclear how uniformity in a linguistic signal - or lack thereof - should be measured , and over which linguistic unit , e.g.", ", the sentence or language level , this uniformity should hold ."], "relation": "used for", "id": "2021.emnlp-main.74", "year": 2021, "rel_sent": "We then explore multiple operationalizations of UID , motivated by different interpretations of the original hypothesis , and analyze the scope over which the pressure towards uniformity is exerted .", "forward": true, "src_ids": "2021.emnlp-main.74_4385"}
{"input": "english is done by using Method| context: though the proposed annotation scheme is conceptually promising , the feasibility is only examined in four indo - european languages .", "entity": "english", "output": "compositional semantic analysis", "neg_sample": ["english is done by using Method", "though the proposed annotation scheme is conceptually promising , the feasibility is only examined in four indo - european languages ."], "relation": "used for", "id": "2021.naacl-main.440", "year": 2021, "rel_sent": "The corpus consists of 1100 English - Chinese parallel sentences , where compositional semantic analysis is available for English , and another 1000 Chinese sentences which has enriched syntactic analysis .", "forward": false, "src_ids": "2021.naacl-main.440_5398"}
{"input": "estimation of semantic answer similarity is done by using Method| context: the evaluation of question answering models compares ground - truth annotations with model predictions . however , as of today , this comparison is mostly lexical - based and therefore misses out on answers that have no lexical overlap but are still semantically similar , thus treating correct answers as false . this underestimation of the true performance of models hinders user acceptance in applications and complicates a fair comparison of different models . therefore , there is a need for an evaluation metric that is based on semantics instead of pure string similarity .", "entity": "estimation of semantic answer similarity", "output": "cross - encoder - based metric", "neg_sample": ["estimation of semantic answer similarity is done by using Method", "the evaluation of question answering models compares ground - truth annotations with model predictions .", "however , as of today , this comparison is mostly lexical - based and therefore misses out on answers that have no lexical overlap but are still semantically similar , thus treating correct answers as false .", "this underestimation of the true performance of models hinders user acceptance in applications and complicates a fair comparison of different models .", "therefore , there is a need for an evaluation metric that is based on semantics instead of pure string similarity ."], "relation": "used for", "id": "2021.mrqa-1.15", "year": 2021, "rel_sent": "In this short paper , we present SAS , a cross - encoder - based metric for the estimation of semantic answer similarity , and compare it to seven existing metrics .", "forward": false, "src_ids": "2021.mrqa-1.15_4497"}
{"input": "dexperts is used for Task| context: despite recent advances in natural language generation , it remains challenging to control attributes of generated text .", "entity": "dexperts", "output": "language detoxification", "neg_sample": ["dexperts is used for Task", "despite recent advances in natural language generation , it remains challenging to control attributes of generated text ."], "relation": "used for", "id": "2021.acl-long.522", "year": 2021, "rel_sent": "We apply DExperts to language detoxification and sentiment - controlled generation , where we outperform existing controllable generation methods on both automatic and human evaluations .", "forward": true, "src_ids": "2021.acl-long.522_5109"}
{"input": "cross - lingual natural language inference is done by using Method| context: in meta - learning , the knowledge learned from previous tasks is transferred to new ones , but this transfer only works if tasks are related . sharing information between unrelated tasks might hurt performance , and it is unclear how to transfer knowledge across tasks that have a hierarchical structure .", "entity": "cross - lingual natural language inference", "output": "natural language processing models", "neg_sample": ["cross - lingual natural language inference is done by using Method", "in meta - learning , the knowledge learned from previous tasks is transferred to new ones , but this transfer only works if tasks are related .", "sharing information between unrelated tasks might hurt performance , and it is unclear how to transfer knowledge across tasks that have a hierarchical structure ."], "relation": "used for", "id": "2021.adaptnlp-1.8", "year": 2021, "rel_sent": "We show that TreeMAML successfully trains natural language processing models for cross - lingual Natural Language Inference by taking advantage of the language phylogenetic tree .", "forward": false, "src_ids": "2021.adaptnlp-1.8_11002"}
{"input": "convolutional neural networks is used for OtherScientificTerm| context: irrespective of the success of the deep learning - based mixed - domain transfer learning approach for solving various natural language processing tasks , it does not lend a generalizable solution for detecting misinformation from covid-19 social media data .", "entity": "convolutional neural networks", "output": "local as well as global context", "neg_sample": ["convolutional neural networks is used for OtherScientificTerm", "irrespective of the success of the deep learning - based mixed - domain transfer learning approach for solving various natural language processing tasks , it does not lend a generalizable solution for detecting misinformation from covid-19 social media data ."], "relation": "used for", "id": "2021.emnlp-main.485", "year": 2021, "rel_sent": "By conducting a systematic investigation , we show that : ( i ) the deep Transformer - based pre - trained models , utilized via the mixed - domain transfer learning , are only good at capturing the local context , thus exhibits poor generalization , and ( ii ) a combination of shallow network - based domain - specific models and convolutional neural networks can efficiently extract local as well as global context directly from the target data in a hierarchical fashion , enabling it to offer a more generalizable solution .", "forward": true, "src_ids": "2021.emnlp-main.485_14221"}
{"input": "code generation is done by using Method| context: code generation is the task of generating code snippets from input user specifications in natural language . leveraging the linguisticallymotivated hierarchical structure of the input can benefit code generation , especially since the specifications are complex sentences containing multiple variables and operations over various data structures . moreover , recent advances in transformer architectures have led to improved performance with tree - to - tree style generation for other seq2seq tasks e.g. , machine translation .", "entity": "code generation", "output": "tree - structured architectures", "neg_sample": ["code generation is done by using Method", "code generation is the task of generating code snippets from input user specifications in natural language .", "leveraging the linguisticallymotivated hierarchical structure of the input can benefit code generation , especially since the specifications are complex sentences containing multiple variables and operations over various data structures .", "moreover , recent advances in transformer architectures have led to improved performance with tree - to - tree style generation for other seq2seq tasks e.g.", ", machine translation ."], "relation": "used for", "id": "2021.findings-acl.384", "year": 2021, "rel_sent": "Analysis of Tree - Structured Architectures for Code Generation.", "forward": false, "src_ids": "2021.findings-acl.384_44"}
{"input": "cluster consistencies is used for Task| context: deep learning methods have recently been applied for this task to deliver state - of - the - art performance .", "entity": "cluster consistencies", "output": "event coreference resolution", "neg_sample": ["cluster consistencies is used for Task", "deep learning methods have recently been applied for this task to deliver state - of - the - art performance ."], "relation": "used for", "id": "2021.acl-long.374", "year": 2021, "rel_sent": "Exploiting Document Structures and Cluster Consistencies for Event Coreference Resolution.", "forward": true, "src_ids": "2021.acl-long.374_1492"}
{"input": "decoder is done by using Method| context: personalized response generation is essential for more human - like conversations . however , how to model user personalization information with no explicit user persona descriptions or demographics still remains under - investigated .", "entity": "decoder", "output": "personalized response embedding", "neg_sample": ["decoder is done by using Method", "personalized response generation is essential for more human - like conversations .", "however , how to model user personalization information with no explicit user persona descriptions or demographics still remains under - investigated ."], "relation": "used for", "id": "2021.gem-1.5", "year": 2021, "rel_sent": "The personalized response embedding is fed to either the decoder of an LSTM - based Seq2Seq model or a transformer language model to help generate more personalized responses .", "forward": false, "src_ids": "2021.gem-1.5_13654"}
{"input": "ocr engine is used for Material| context: despite recent advances , standard sequence labeling systems often fail when processing noisy user - generated text or consuming the output of an optical character recognition ( ocr ) process .", "entity": "ocr engine", "output": "large parallel text corpus", "neg_sample": ["ocr engine is used for Material", "despite recent advances , standard sequence labeling systems often fail when processing noisy user - generated text or consuming the output of an optical character recognition ( ocr ) process ."], "relation": "used for", "id": "2021.findings-acl.27", "year": 2021, "rel_sent": "Using an OCR engine , we generated a large parallel text corpus for training and produced several real - world noisy sequence labeling benchmarks for evaluation .", "forward": true, "src_ids": "2021.findings-acl.27_3079"}
{"input": "multimodal interactions is done by using Method| context: multimodal transformers achieve superior performance in multimodal learning tasks . however , the quadratic complexity of the self - attention mechanism in transformers limits their deployment in low - resource devices and makes their inference and training computationally expensive .", "entity": "multimodal interactions", "output": "( spt )", "neg_sample": ["multimodal interactions is done by using Method", "multimodal transformers achieve superior performance in multimodal learning tasks .", "however , the quadratic complexity of the self - attention mechanism in transformers limits their deployment in low - resource devices and makes their inference and training computationally expensive ."], "relation": "used for", "id": "2021.emnlp-main.189", "year": 2021, "rel_sent": "We conclude that ( SPT ) along with parameter sharing can capture multimodal interactions with reduced model size and improved sample efficiency .", "forward": false, "src_ids": "2021.emnlp-main.189_9354"}
{"input": "semi - supervised learning signal is done by using Material| context: neural generative dialogue agents have shown an increasing ability to hold short chitchat conversations , when evaluated by crowdworkers in controlled settings . however , their performance in real - life deployment - talking to intrinsically - motivated users in noisy environments - is less well - explored .", "entity": "semi - supervised learning signal", "output": "dissatisfied user utterances", "neg_sample": ["semi - supervised learning signal is done by using Material", "neural generative dialogue agents have shown an increasing ability to hold short chitchat conversations , when evaluated by crowdworkers in controlled settings .", "however , their performance in real - life deployment - talking to intrinsically - motivated users in noisy environments - is less well - explored ."], "relation": "used for", "id": "2021.sigdial-1.1", "year": 2021, "rel_sent": "Finally , we show that dissatisfied user utterances can be used as a semi - supervised learning signal to improve the dialogue system .", "forward": false, "src_ids": "2021.sigdial-1.1_5226"}
{"input": "encoding of syntactic structure is used for Method| context: probes are models devised to investigate the encoding of knowledge - e.g. probes are often designed for simplicity , which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information ; one such restriction is linearity .", "entity": "encoding of syntactic structure", "output": "contextual representations", "neg_sample": ["encoding of syntactic structure is used for Method", "probes are models devised to investigate the encoding of knowledge - e.g.", "probes are often designed for simplicity , which has led to restrictions on probe design that may not allow for the full exploitation of the structure of encoded information ; one such restriction is linearity ."], "relation": "used for", "id": "2021.naacl-main.12", "year": 2021, "rel_sent": "We examine the case of a structural probe ( Hewitt and Manning , 2019 ) , which aims to investigate the encoding of syntactic structure in contextual representations through learning only linear transformations .", "forward": true, "src_ids": "2021.naacl-main.12_1439"}
{"input": "concepts is done by using Method| context: automated predictions require explanations to be interpretable by humans . one type of explanation is a rationale , i.e. , a selection of input features such as relevant text snippets from which the model computes the outcome . however , a single overall selection does not provide a complete explanation , e.g. , weighing several aspects for decisions .", "entity": "concepts", "output": "conrat", "neg_sample": ["concepts is done by using Method", "automated predictions require explanations to be interpretable by humans .", "one type of explanation is a rationale , i.e.", ", a selection of input features such as relevant text snippets from which the model computes the outcome .", "however , a single overall selection does not provide a complete explanation , e.g.", ", weighing several aspects for decisions ."], "relation": "used for", "id": "2021.findings-acl.68", "year": 2021, "rel_sent": "Experiments on both singleand multi - aspect sentiment classification tasks show that ConRAT is the first to generate concepts that align with human rationalization while using only the overall label .", "forward": false, "src_ids": "2021.findings-acl.68_10542"}
{"input": "mixture - of - partitions is used for OtherScientificTerm| context: infusing factual knowledge into pre - trained models is fundamental for many knowledge - intensive tasks .", "entity": "mixture - of - partitions", "output": "berts", "neg_sample": ["mixture - of - partitions is used for OtherScientificTerm", "infusing factual knowledge into pre - trained models is fundamental for many knowledge - intensive tasks ."], "relation": "used for", "id": "2021.emnlp-main.383", "year": 2021, "rel_sent": "We evaluate our MoP with three biomedical BERTs ( SciBERT , BioBERT , PubmedBERT ) on six downstream tasks ( inc . NLI , QA , Classification ) , and the results show that our MoP consistently enhances the underlying BERTs in task performance , and achieves new SOTA performances on five evaluated datasets .", "forward": true, "src_ids": "2021.emnlp-main.383_3778"}
{"input": "generating feedback is done by using OtherScientificTerm| context: automatic generation of feedback messages in a natural - language based programming for video games is presented .", "entity": "generating feedback", "output": "multi - phase context vectors", "neg_sample": ["generating feedback is done by using OtherScientificTerm", "automatic generation of feedback messages in a natural - language based programming for video games is presented ."], "relation": "used for", "id": "2021.cnl-1.16", "year": 2021, "rel_sent": "Multi - Phase Context Vectors for Generating Feedback for Natural - Language Based Programming.", "forward": false, "src_ids": "2021.cnl-1.16_14761"}
{"input": "debiasing methods is used for Task| context: automatic detection of toxic language plays an essential role in protecting social media users , especially minority groups , from verbal abuse . however , biases toward some attributes , including gender , race , and dialect , exist in most training datasets for toxicity detection . the biases make the learned models unfair and can even exacerbate the marginalization of people .", "entity": "debiasing methods", "output": "natural language understanding tasks", "neg_sample": ["debiasing methods is used for Task", "automatic detection of toxic language plays an essential role in protecting social media users , especially minority groups , from verbal abuse .", "however , biases toward some attributes , including gender , race , and dialect , exist in most training datasets for toxicity detection .", "the biases make the learned models unfair and can even exacerbate the marginalization of people ."], "relation": "used for", "id": "2021.woah-1.12", "year": 2021, "rel_sent": "Considering that current debiasing methods for general natural language understanding tasks can not effectively mitigate the biases in the toxicity detectors , we propose to use invariant rationalization ( InvRat ) , a game - theoretic framework consisting of a rationale generator and a predictor , to rule out the spurious correlation of certain syntactic patterns ( e.g. , identity mentions , dialect ) to toxicity labels .", "forward": true, "src_ids": "2021.woah-1.12_8314"}
{"input": "equation consistency constraint is used for OtherScientificTerm| context: we study the problem of generating arithmetic math word problems ( mwps ) given a math equation that specifies the mathematical computation and a context that specifies the problem scenario . existing approaches are prone to generating mwps that are either mathematically invalid or have unsatisfactory language quality . they also either ignore the context or require manual specification of a problem template , which compromises the diversity of the generated mwps .", "entity": "equation consistency constraint", "output": "mathematical validity", "neg_sample": ["equation consistency constraint is used for OtherScientificTerm", "we study the problem of generating arithmetic math word problems ( mwps ) given a math equation that specifies the mathematical computation and a context that specifies the problem scenario .", "existing approaches are prone to generating mwps that are either mathematically invalid or have unsatisfactory language quality .", "they also either ignore the context or require manual specification of a problem template , which compromises the diversity of the generated mwps ."], "relation": "used for", "id": "2021.emnlp-main.484", "year": 2021, "rel_sent": "In this paper , we develop a novel MWP generation approach that leverages i ) pre - trained language models and a context keyword selection model to improve the language quality of generated MWPs and ii ) an equation consistency constraint for math equations to improve the mathematical validity of the generated MWPs .", "forward": true, "src_ids": "2021.emnlp-main.484_1965"}
{"input": "generative framework is used for Task| context: document - level entity - based extraction ( ee ) , aiming at extracting entity - centric information such as entity roles and entity relations , is key to automatic knowledge acquisition from text corpora for various domains . most document - level ee systems build extractive models , which struggle to model long - term dependencies among entities at the document level .", "entity": "generative framework", "output": "document - level ee tasks", "neg_sample": ["generative framework is used for Task", "document - level entity - based extraction ( ee ) , aiming at extracting entity - centric information such as entity roles and entity relations , is key to automatic knowledge acquisition from text corpora for various domains .", "most document - level ee systems build extractive models , which struggle to model long - term dependencies among entities at the document level ."], "relation": "used for", "id": "2021.emnlp-main.426", "year": 2021, "rel_sent": "To address this issue , we propose a generative framework for two document - level EE tasks : role - filler entity extraction ( REE ) and relation extraction ( RE ) .", "forward": true, "src_ids": "2021.emnlp-main.426_823"}
{"input": "reviews is used for OtherScientificTerm| context: existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items . incorporating external information ( e.g. , reviews ) is a potential solution to alleviate this problem . given that reviews often provide a rich and detailed user experience on different interests , they are potential ideal resources for providing high - quality recommendations within an informative conversation .", "entity": "reviews", "output": "coherent and informative responses", "neg_sample": ["reviews is used for OtherScientificTerm", "existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items .", "incorporating external information ( e.g.", ", reviews ) is a potential solution to alleviate this problem .", "given that reviews often provide a rich and detailed user experience on different interests , they are potential ideal resources for providing high - quality recommendations within an informative conversation ."], "relation": "used for", "id": "2021.findings-acl.99", "year": 2021, "rel_sent": "In this paper , we design a novel end - to - end framework , namely , Review - augmented Conversational Recommender ( RevCore ) , where reviews are seamlessly incorporated to enrich item information and assist in generating both coherent and informative responses .", "forward": true, "src_ids": "2021.findings-acl.99_1156"}
{"input": "zero - shot cross - lingual semantic parsing is done by using OtherScientificTerm| context: the availability of corpora has led to significant advances in training semantic parsers in english . unfortunately , for languages other than english , annotated data is limited and so is the performance of the developed parsers . recently , pretrained multilingual models have been proven useful for zero - shot cross - lingual transfer in many nlp tasks . what else does it require to apply a parser trained in english to other languages for zero - shot cross - lingual semantic parsing ?", "entity": "zero - shot cross - lingual semantic parsing", "output": "language - independent features", "neg_sample": ["zero - shot cross - lingual semantic parsing is done by using OtherScientificTerm", "the availability of corpora has led to significant advances in training semantic parsers in english .", "unfortunately , for languages other than english , annotated data is limited and so is the performance of the developed parsers .", "recently , pretrained multilingual models have been proven useful for zero - shot cross - lingual transfer in many nlp tasks .", "what else does it require to apply a parser trained in english to other languages for zero - shot cross - lingual semantic parsing ?"], "relation": "used for", "id": "2021.emnlp-main.472", "year": 2021, "rel_sent": "Frustratingly Simple but Surprisingly Strong : Using Language - Independent Features for Zero - shot Cross - lingual Semantic Parsing.", "forward": false, "src_ids": "2021.emnlp-main.472_8929"}
{"input": "alignment is used for Task| context: abstract meaning representations ( amr ) are a broad - coverage semantic formalism which represents sentence meaning as a directed acyclic graph . to train most amr parsers , one needs to segment the graph into subgraphs and align each such subgraph to a word in a sentence ; this is normally done at preprocessing , relying on hand - crafted rules .", "entity": "alignment", "output": "amr parsing", "neg_sample": ["alignment is used for Task", "abstract meaning representations ( amr ) are a broad - coverage semantic formalism which represents sentence meaning as a directed acyclic graph .", "to train most amr parsers , one needs to segment the graph into subgraphs and align each such subgraph to a word in a sentence ; this is normally done at preprocessing , relying on hand - crafted rules ."], "relation": "used for", "id": "2021.emnlp-main.714", "year": 2021, "rel_sent": "A Differentiable Relaxation of Graph Segmentation and Alignment for AMR Parsing.", "forward": true, "src_ids": "2021.emnlp-main.714_14913"}
{"input": "chinese tasks is done by using Method| context: multilingual transformers ( xlm , mt5 ) have been shown to have remarkable transfer skills in zero - shot settings . most transfer studies , however , rely on automatically translated resources ( xnli , xquad ) , making it hard to discern the particular linguistic knowledge that is being transferred , and the role of expert annotated monolingual datasets when developing task - specific models .", "entity": "chinese tasks", "output": "cross - lingual models", "neg_sample": ["chinese tasks is done by using Method", "multilingual transformers ( xlm , mt5 ) have been shown to have remarkable transfer skills in zero - shot settings .", "most transfer studies , however , rely on automatically translated resources ( xnli , xquad ) , making it hard to discern the particular linguistic knowledge that is being transferred , and the role of expert annotated monolingual datasets when developing task - specific models ."], "relation": "used for", "id": "2021.findings-acl.331", "year": 2021, "rel_sent": "We find that cross - lingual models trained on English NLI do transfer well across our Chinese tasks ( e.g. , in 3/4 of our challenge categories , they perform as well / better than the best monolingual models , even on 3/5 uniquely Chinese linguistic phenomena such as idioms , pro drop ) .", "forward": false, "src_ids": "2021.findings-acl.331_4778"}
{"input": "transfer learning is done by using Method| context: open - domain question answering can be reformulated as a phrase retrieval problem , without the need for processing documents on - demand during inference ( seo et al . , 2019 ) . however , current phrase retrieval models heavily depend on sparse representations and still underperform retriever - reader approaches .", "entity": "transfer learning", "output": "query - side fine - tuning strategy", "neg_sample": ["transfer learning is done by using Method", "open - domain question answering can be reformulated as a phrase retrieval problem , without the need for processing documents on - demand during inference ( seo et al .", ", 2019 ) .", "however , current phrase retrieval models heavily depend on sparse representations and still underperform retriever - reader approaches ."], "relation": "used for", "id": "2021.acl-long.518", "year": 2021, "rel_sent": "We also propose a query - side fine - tuning strategy , which can support transfer learning and reduce the discrepancy between training and inference .", "forward": false, "src_ids": "2021.acl-long.518_1729"}
{"input": "reliability testing is used for Method| context: questions of fairness , robustness , and transparency are paramount to address before deploying nlp systems . central to these concerns is the question of reliability : can nlp systems reliably treat different demographics fairly and function correctly in diverse and noisy environments ?", "entity": "reliability testing", "output": "natural language processing systems", "neg_sample": ["reliability testing is used for Method", "questions of fairness , robustness , and transparency are paramount to address before deploying nlp systems .", "central to these concerns is the question of reliability : can nlp systems reliably treat different demographics fairly and function correctly in diverse and noisy environments ?"], "relation": "used for", "id": "2021.acl-long.321", "year": 2021, "rel_sent": "Reliability Testing for Natural Language Processing Systems.", "forward": true, "src_ids": "2021.acl-long.321_3135"}
{"input": "multi - task learning approach is used for Material| context: understanding voluminous historical records provides clues on the past in various aspects , such as social and political issues and even natural science facts . however , it is generally difficult tofully utilize the historical records , since most of the documents are not written in a modern language and part of the contents are damaged over time . as a result , restoring the damaged or unrecognizable parts as well as translating the records into modern languages are crucial tasks .", "entity": "multi - task learning approach", "output": "historical documents", "neg_sample": ["multi - task learning approach is used for Material", "understanding voluminous historical records provides clues on the past in various aspects , such as social and political issues and even natural science facts .", "however , it is generally difficult tofully utilize the historical records , since most of the documents are not written in a modern language and part of the contents are damaged over time .", "as a result , restoring the damaged or unrecognizable parts as well as translating the records into modern languages are crucial tasks ."], "relation": "used for", "id": "2021.naacl-main.317", "year": 2021, "rel_sent": "In response , we present a multi - task learning approach to restore and translate historical documents based on a self - attention mechanism , specifically utilizing two Korean historical records , ones of the most voluminous historical records in the world .", "forward": true, "src_ids": "2021.naacl-main.317_4178"}
{"input": "detecting disambiguation errors is done by using Method| context: lexical disambiguation is a major challenge for machine translation systems , especially if some senses of a word are trained less often than others . identifying patterns of overgeneralization requires evaluation methods that are both reliable and scalable .", "entity": "detecting disambiguation errors", "output": "contrastive conditioning", "neg_sample": ["detecting disambiguation errors is done by using Method", "lexical disambiguation is a major challenge for machine translation systems , especially if some senses of a word are trained less often than others .", "identifying patterns of overgeneralization requires evaluation methods that are both reliable and scalable ."], "relation": "used for", "id": "2021.emnlp-main.803", "year": 2021, "rel_sent": "We propose contrastive conditioning as a reference - free black - box method for detecting disambiguation errors .", "forward": false, "src_ids": "2021.emnlp-main.803_1471"}
{"input": "monolingual models is used for OtherScientificTerm| context: however , most of these machine learning models focus only on the language they were trained on . given the fact that social media platforms are being used in different languages , managing machine learning models for each and every language separately would be chaotic .", "entity": "monolingual models", "output": "false information", "neg_sample": ["monolingual models is used for OtherScientificTerm", "however , most of these machine learning models focus only on the language they were trained on .", "given the fact that social media platforms are being used in different languages , managing machine learning models for each and every language separately would be chaotic ."], "relation": "used for", "id": "2021.ranlp-1.160", "year": 2021, "rel_sent": "We show that multilingual models perform on par with the monolingual models and sometimes even better than the monolingual models to detect false information in social media making them more useful in real - world scenarios .", "forward": true, "src_ids": "2021.ranlp-1.160_6252"}
{"input": "probr is used for OtherScientificTerm| context: in this paper , we investigate the problem of reasoning over natural language statements . prior neural based approaches do not explicitly consider the inter - dependency among answers and their proofs .", "entity": "probr", "output": "joint probabilistic distribution", "neg_sample": ["probr is used for OtherScientificTerm", "in this paper , we investigate the problem of reasoning over natural language statements .", "prior neural based approaches do not explicitly consider the inter - dependency among answers and their proofs ."], "relation": "used for", "id": "2021.findings-acl.277", "year": 2021, "rel_sent": "PROBR defines a joint probabilistic distribution over all possible proof graphs and answers via an induced graphical model .", "forward": true, "src_ids": "2021.findings-acl.277_7463"}
{"input": "encoder - decoder attentions is done by using Method| context: how can we effectively inform content selection in transformer - based abstractive summarization models ?", "entity": "encoder - decoder attentions", "output": "attention head masking technique", "neg_sample": ["encoder - decoder attentions is done by using Method", "how can we effectively inform content selection in transformer - based abstractive summarization models ?"], "relation": "used for", "id": "2021.naacl-main.397", "year": 2021, "rel_sent": "In this work , we present a simple - yet - effective attention head masking technique , which is applied on encoder - decoder attentions to pinpoint salient content at inference time .", "forward": false, "src_ids": "2021.naacl-main.397_4207"}
{"input": "profanity is done by using Method| context: seq2seq models have demonstrated their incredible effectiveness in a large variety of applications . however , recent research has shown that inappropriate language in training samples and well - designed testing cases can induce seq2seq models to output profanity . these outputs may potentially hurt the usability of seq2seq models and make the end - users feel offended .", "entity": "profanity", "output": "nlp models", "neg_sample": ["profanity is done by using Method", "seq2seq models have demonstrated their incredible effectiveness in a large variety of applications .", "however , recent research has shown that inappropriate language in training samples and well - designed testing cases can induce seq2seq models to output profanity .", "these outputs may potentially hurt the usability of seq2seq models and make the end - users feel offended ."], "relation": "used for", "id": "2021.emnlp-main.418", "year": 2021, "rel_sent": "Extensive experimental results show that the proposed training framework can successfully prevent the NLP models from generating profanity .", "forward": false, "src_ids": "2021.emnlp-main.418_8046"}
{"input": "question - answering is done by using Method| context: multi - party dialogue machine reading comprehension ( mrc ) raises an even more challenging understanding goal on dialogue with more than two involved speakers , compared with the traditional plain passage style mrc . to accurately perform the question - answering ( qa ) task according to such multi - party dialogue , models have to handle fundamentally different discourse relationships from common nondialogue plain text , where discourse relations are supposed to connect twofar apart utterances in a linguistics - motivated way .", "entity": "question - answering", "output": "multi - task model", "neg_sample": ["question - answering is done by using Method", "multi - party dialogue machine reading comprehension ( mrc ) raises an even more challenging understanding goal on dialogue with more than two involved speakers , compared with the traditional plain passage style mrc .", "to accurately perform the question - answering ( qa ) task according to such multi - party dialogue , models have to handle fundamentally different discourse relationships from common nondialogue plain text , where discourse relations are supposed to connect twofar apart utterances in a linguistics - motivated way ."], "relation": "used for", "id": "2021.paclic-1.8", "year": 2021, "rel_sent": "Tofurther explore the role of such unusual discourse structure on the correlated QA task in terms of MRC , we propose the first multi - task model for jointly performing QA and discourse parsing ( DP ) on the multi - party dialogue MRC task .", "forward": false, "src_ids": "2021.paclic-1.8_4516"}
{"input": "language and multimodal representations is done by using Method| context: mental health conditions remain underdiagnosed even in countries with common access to advanced medical care . the ability to accurately and efficiently predict mood from easily collectible data has several important implications for the early detection , intervention , and treatment of mental health disorders . one promising data source to help monitor human behavior is daily smartphone usage . however , care must be taken to summarize behaviors without identifying the user through personal ( e.g. , personally identifiable information ) or protected ( e.g. , race , gender ) attributes .", "entity": "language and multimodal representations", "output": "computational models", "neg_sample": ["language and multimodal representations is done by using Method", "mental health conditions remain underdiagnosed even in countries with common access to advanced medical care .", "the ability to accurately and efficiently predict mood from easily collectible data has several important implications for the early detection , intervention , and treatment of mental health disorders .", "one promising data source to help monitor human behavior is daily smartphone usage .", "however , care must be taken to summarize behaviors without identifying the user through personal ( e.g.", ", personally identifiable information ) or protected ( e.g.", ", race , gender ) attributes ."], "relation": "used for", "id": "2021.acl-long.322", "year": 2021, "rel_sent": "Using computational models , we find that language and multimodal representations of mobile typed text ( spanning typed characters , words , keystroke timings , and app usage ) are predictive of daily mood .", "forward": false, "src_ids": "2021.acl-long.322_14707"}
{"input": "visual and textual information is done by using Method| context: previous methods treat it either as a boundary regression task or a span extraction task .", "entity": "visual and textual information", "output": "choice - query interactor", "neg_sample": ["visual and textual information is done by using Method", "previous methods treat it either as a boundary regression task or a span extraction task ."], "relation": "used for", "id": "2021.emnlp-main.324", "year": 2021, "rel_sent": "A choice - query interactor is proposed to match the visual and textual information simultaneously in sentence - moment and token - moment levels , leading to a coarse - and - fine cross - modal interaction .", "forward": false, "src_ids": "2021.emnlp-main.324_5706"}
{"input": "differentially private text generation is used for Task| context: most of privacy protection studies for textual data focus on removing explicit sensitive identifiers . however , personal writing style , as a strong indicator of the authorship , is often neglected . recent studies , such as syntf , have shown promising results on privacy - preserving text mining . however , their anonymization algorithm can only output numeric term vectors which are difficult for the recipients to interpret .", "entity": "differentially private text generation", "output": "authorship anonymization", "neg_sample": ["differentially private text generation is used for Task", "most of privacy protection studies for textual data focus on removing explicit sensitive identifiers .", "however , personal writing style , as a strong indicator of the authorship , is often neglected .", "recent studies , such as syntf , have shown promising results on privacy - preserving text mining .", "however , their anonymization algorithm can only output numeric term vectors which are difficult for the recipients to interpret ."], "relation": "used for", "id": "2021.naacl-main.314", "year": 2021, "rel_sent": "ER - AE : Differentially Private Text Generation for Authorship Anonymization.", "forward": true, "src_ids": "2021.naacl-main.314_10312"}
{"input": "database - style queries is done by using Method| context: neural models have shown impressive performance gains in answering queries from natural language text . however , existing works are unable to support database queries , such as ' list / count all female athletes who were born in 20th century ' , which require reasoning over sets of relevant facts with operations such as join , filtering and aggregation . we show that while state - of - the - art transformer models perform very well for small databases , they exhibit limitations in processing noisy data , numerical operations , and queries that aggregate facts .", "entity": "database - style queries", "output": "modular architecture", "neg_sample": ["database - style queries is done by using Method", "neural models have shown impressive performance gains in answering queries from natural language text .", "however , existing works are unable to support database queries , such as ' list / count all female athletes who were born in 20th century ' , which require reasoning over sets of relevant facts with operations such as join , filtering and aggregation .", "we show that while state - of - the - art transformer models perform very well for small databases , they exhibit limitations in processing noisy data , numerical operations , and queries that aggregate facts ."], "relation": "used for", "id": "2021.acl-long.241", "year": 2021, "rel_sent": "We propose a modular architecture to answer these database - style queries over multiple spans from text and aggregating these at scale .", "forward": false, "src_ids": "2021.acl-long.241_4448"}
{"input": "natural language interventions is done by using Method| context: is it possible to use natural language to intervene in a model 's behavior and alter its prediction in a desired way ?", "entity": "natural language interventions", "output": "language models", "neg_sample": ["natural language interventions is done by using Method", "is it possible to use natural language to intervene in a model 's behavior and alter its prediction in a desired way ?"], "relation": "used for", "id": "2021.findings-acl.364", "year": 2021, "rel_sent": "Ethical - Advice Taker : Do Language Models Understand Natural Language Interventions ?.", "forward": false, "src_ids": "2021.findings-acl.364_100"}
{"input": "turkish search engine queries is done by using Material| context: recognizing named entities in short search engine queries is a difficult task due to their weaker contextual information compared to long sentences . standard named entity recognition ( ner ) systems that are trained on grammatically correct and long sentences fail to perform well on such queries .", "entity": "turkish search engine queries", "output": "named entity recognition dataset", "neg_sample": ["turkish search engine queries is done by using Material", "recognizing named entities in short search engine queries is a difficult task due to their weaker contextual information compared to long sentences .", "standard named entity recognition ( ner ) systems that are trained on grammatically correct and long sentences fail to perform well on such queries ."], "relation": "used for", "id": "2021.ranlp-1.158", "year": 2021, "rel_sent": "TR - SEQ : Named Entity Recognition Dataset for Turkish Search Engine Queries.", "forward": false, "src_ids": "2021.ranlp-1.158_234"}
{"input": "zero - shot cross - domain dst is done by using Method| context: zero - shot cross - domain dialogue state tracking ( dst ) enables us to handle unseen domains without the expense of collecting in - domain data .", "entity": "zero - shot cross - domain dst", "output": "slot descriptions enhanced generative approach", "neg_sample": ["zero - shot cross - domain dst is done by using Method", "zero - shot cross - domain dialogue state tracking ( dst ) enables us to handle unseen domains without the expense of collecting in - domain data ."], "relation": "used for", "id": "2021.naacl-main.448", "year": 2021, "rel_sent": "In this paper , we propose a slot descriptions enhanced generative approach for zero - shot cross - domain DST .", "forward": false, "src_ids": "2021.naacl-main.448_9043"}
{"input": "neural machine translation is done by using OtherScientificTerm| context: achieving satisfying performance in machine translation on domains for which there is no training data is challenging . traditional supervised domain adaptation is not suitable for addressing such zero - resource domains because it relies on in - domain parallel data .", "entity": "neural machine translation", "output": "document - level context", "neg_sample": ["neural machine translation is done by using OtherScientificTerm", "achieving satisfying performance in machine translation on domains for which there is no training data is challenging .", "traditional supervised domain adaptation is not suitable for addressing such zero - resource domains because it relies on in - domain parallel data ."], "relation": "used for", "id": "2021.adaptnlp-1.9", "year": 2021, "rel_sent": "Addressing Zero - Resource Domains Using Document - Level Context in Neural Machine Translation.", "forward": false, "src_ids": "2021.adaptnlp-1.9_8848"}
{"input": "neuro - symbolic algorithms is used for Task| context: developers often include such knowledge , structure as taxonomies , in the documentation of chatbots .", "entity": "neuro - symbolic algorithms", "output": "intent recognition", "neg_sample": ["neuro - symbolic algorithms is used for Task", "developers often include such knowledge , structure as taxonomies , in the documentation of chatbots ."], "relation": "used for", "id": "2021.acl-long.545", "year": 2021, "rel_sent": "By using neuro - symbolic algorithms to incorporate those taxonomies into embeddings of the output space , we were able to improve accuracy in intent recognition .", "forward": true, "src_ids": "2021.acl-long.545_7536"}
{"input": "parameter norm growth is used for Task| context: the capacity of neural networks like the widely adopted transformer is known to be very high . evidence is emerging that they learn successfully due to inductive bias in the training routine , typically a variant of gradient descent ( gd ) .", "entity": "parameter norm growth", "output": "transformer training", "neg_sample": ["parameter norm growth is used for Task", "the capacity of neural networks like the widely adopted transformer is known to be very high .", "evidence is emerging that they learn successfully due to inductive bias in the training routine , typically a variant of gradient descent ( gd ) ."], "relation": "used for", "id": "2021.emnlp-main.133", "year": 2021, "rel_sent": "Effects of Parameter Norm Growth During Transformer Training : Inductive Bias from Gradient Descent.", "forward": true, "src_ids": "2021.emnlp-main.133_7151"}
{"input": "collective model is used for OtherScientificTerm| context: integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering . however , the predictions are made independently , which can be mutually inconsistent .", "entity": "collective model", "output": "globally coherent predictions", "neg_sample": ["collective model is used for OtherScientificTerm", "integrating extracted knowledge from the web to knowledge graphs ( kgs ) can facilitate tasks like question answering .", "however , the predictions are made independently , which can be mutually inconsistent ."], "relation": "used for", "id": "2021.acl-long.363", "year": 2021, "rel_sent": "We propose a two - stage Collective Relation Integration ( CoRI ) model , where the first stage independently makes candidate predictions , and the second stage employs a collective model that accesses all candidate predictions to make globally coherent predictions .", "forward": true, "src_ids": "2021.acl-long.363_2572"}
{"input": "annotation schemes is used for Task| context: relation extraction is a key task in knowledge extraction , and is commonly defined as the task of identifying relations that hold between entities in text .", "entity": "annotation schemes", "output": "meta - relation extraction", "neg_sample": ["annotation schemes is used for Task", "relation extraction is a key task in knowledge extraction , and is commonly defined as the task of identifying relations that hold between entities in text ."], "relation": "used for", "id": "2021.eacl-srw.18", "year": 2021, "rel_sent": "We explore recent works in relation extraction and discuss our plans toformally conceptualise meta - relations for the domain of user - generated health texts , and create a new dataset , annotation scheme and models for meta - relation extraction .", "forward": true, "src_ids": "2021.eacl-srw.18_6809"}
{"input": "checklist is used for Method| context: this is often due to insufficient understanding of the capabilities and limitations of models and the heavy reliance on standard evaluation benchmarks . research into non - standard evaluation to mitigate this brittleness is gaining increasing attention . notably , the behavioral testing principle ' checklist ' , which decouples testing from implementation revealed significant failures in state - of - the - art models for multiple tasks .", "entity": "checklist", "output": "nlp systems", "neg_sample": ["checklist is used for Method", "this is often due to insufficient understanding of the capabilities and limitations of models and the heavy reliance on standard evaluation benchmarks .", "research into non - standard evaluation to mitigate this brittleness is gaining increasing attention .", "notably , the behavioral testing principle ' checklist ' , which decouples testing from implementation revealed significant failures in state - of - the - art models for multiple tasks ."], "relation": "used for", "id": "2021.humeval-1.14", "year": 2021, "rel_sent": "We lay out the challenges and open questions based on our observations of using Checklist for human - in - loop evaluation and improvement of NLP systems .", "forward": true, "src_ids": "2021.humeval-1.14_13675"}
{"input": "weak supervision is done by using Method| context: strategies for improving the training and prediction quality of weakly supervised machine learning models vary in how much they are tailored to a specific task or integrated with a specific model architecture .", "entity": "weak supervision", "output": "training methods", "neg_sample": ["weak supervision is done by using Method", "strategies for improving the training and prediction quality of weakly supervised machine learning models vary in how much they are tailored to a specific task or integrated with a specific model architecture ."], "relation": "used for", "id": "2021.repl4nlp-1.12", "year": 2021, "rel_sent": "Hence , our framework can encompass a wide range of training methods for improving weak supervision , ranging from methods that only look at correlations of rules and output classes ( independently of the machine learning model trained with the resulting labels ) , to those that harness the interplay of neural networks and weakly labeled data .", "forward": false, "src_ids": "2021.repl4nlp-1.12_2128"}
{"input": "conditional and resultant clauses is done by using OtherScientificTerm| context: business process management ( bpm ) is the discipline which is responsible for management of discovering , analyzing , redesigning , monitoring , and controlling business processes . one of the most crucial tasks of bpm is discovering and modelling business processes from text documents .", "entity": "conditional and resultant clauses", "output": "boundaries", "neg_sample": ["conditional and resultant clauses is done by using OtherScientificTerm", "business process management ( bpm ) is the discipline which is responsible for management of discovering , analyzing , redesigning , monitoring , and controlling business processes .", "one of the most crucial tasks of bpm is discovering and modelling business processes from text documents ."], "relation": "used for", "id": "2021.ranlp-1.167", "year": 2021, "rel_sent": "In this paper , we present our system that resolves an end - to - end problem consisting of 1 ) recognizing conditional sentences from technical documents , 2 ) finding boundaries to extract conditional and resultant clauses from each conditional sentence , and 3 ) categorizing resultant clause as Action or Consequence which later helps to generate new steps in our business process model automatically .", "forward": false, "src_ids": "2021.ranlp-1.167_9228"}
{"input": "learning cross - task attribute - attribute similarity is used for Task| context: automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies . the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another . this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs .", "entity": "learning cross - task attribute - attribute similarity", "output": "multi - task attribute - value extraction", "neg_sample": ["learning cross - task attribute - attribute similarity is used for Task", "automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies .", "the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another .", "this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs ."], "relation": "used for", "id": "2021.ecnlp-1.10", "year": 2021, "rel_sent": "Learning Cross - Task Attribute - Attribute Similarity for Multi - task Attribute - Value Extraction.", "forward": true, "src_ids": "2021.ecnlp-1.10_6467"}
{"input": "character - level representations is used for OtherScientificTerm| context: the dataset consists of more than 293 k tokens annotated with sixteen universal part - of - speech categories .", "entity": "character - level representations", "output": "character - level information", "neg_sample": ["character - level representations is used for OtherScientificTerm", "the dataset consists of more than 293 k tokens annotated with sixteen universal part - of - speech categories ."], "relation": "used for", "id": "2021.ranlp-srw.4", "year": 2021, "rel_sent": "Besides pre - trained GloVe and fastText representation , the character - level representations are incorporated to extract character - level information using the bidirectional long - short - term memory encoder .", "forward": true, "src_ids": "2021.ranlp-srw.4_10949"}
{"input": "nel model is used for Task| context: named entity linking ( nel ) or mapping ' strings ' to ' things ' in a knowledge base is a fundamental preprocessing step in systems that require knowledge of entities such as information extraction and question answering . in this work , we lay out and investigate two challenges faced by individuals or organizations building nel systems .", "entity": "nel model", "output": "sports question - answering ( qa ) system", "neg_sample": ["nel model is used for Task", "named entity linking ( nel ) or mapping ' strings ' to ' things ' in a knowledge base is a fundamental preprocessing step in systems that require knowledge of entities such as information extraction and question answering .", "in this work , we lay out and investigate two challenges faced by individuals or organizations building nel systems ."], "relation": "used for", "id": "2021.naacl-industry.26", "year": 2021, "rel_sent": "Second , for a use case where the NEL model is used in a sports question - answering ( QA ) system , we investigate how to close the loop in our analysis by repurposing the best off - the - shelf model ( Bootleg ) to correct sport - related errors .", "forward": true, "src_ids": "2021.naacl-industry.26_1519"}
{"input": "zero - shot setting is done by using Method| context: hate speech and profanity detection suffer from data sparsity , especially for languages other than english , due to the subjective nature of the tasks and the resulting annotation incompatibility of existing corpora .", "entity": "zero - shot setting", "output": "bert representations", "neg_sample": ["zero - shot setting is done by using Method", "hate speech and profanity detection suffer from data sparsity , especially for languages other than english , due to the subjective nature of the tasks and the resulting annotation incompatibility of existing corpora ."], "relation": "used for", "id": "2021.woah-1.2", "year": 2021, "rel_sent": "We observe that , on both similar and distant target tasks and across all languages , the subspace - based representations transfer more effectively than standard BERT representations in the zero - shot setting , with improvements between F1 +10.9 and F1 +42.9 over the baselines across all tested monolingual and cross - lingual scenarios .", "forward": false, "src_ids": "2021.woah-1.2_15406"}
{"input": "commonsense is used for Task| context: smooth and effective communication requires the ability to perform latent or explicit commonsense inference . prior commonsense reasoning benchmarks ( such as socialiqa and commonsenseqa ) mainly focus on the discriminative task of choosing the right answer from a set of candidates , and do not involve interactive language generation as in dialogue . moreover , existing dialogue datasets do not explicitly focus on exhibiting commonsense as a facet .", "entity": "commonsense", "output": "dialogue response generation", "neg_sample": ["commonsense is used for Task", "smooth and effective communication requires the ability to perform latent or explicit commonsense inference .", "prior commonsense reasoning benchmarks ( such as socialiqa and commonsenseqa ) mainly focus on the discriminative task of choosing the right answer from a set of candidates , and do not involve interactive language generation as in dialogue .", "moreover , existing dialogue datasets do not explicitly focus on exhibiting commonsense as a facet ."], "relation": "used for", "id": "2021.sigdial-1.13", "year": 2021, "rel_sent": "In this paper , we present an empirical study of commonsense in dialogue response generation .", "forward": true, "src_ids": "2021.sigdial-1.13_6969"}
{"input": "tree structure is done by using Method| context: dependency parse trees are helpful for discovering the opinion words in aspect - based sentiment analysis ( absa ) ( citation ) . however , the trees obtained from off - the - shelf dependency parsers are static , and could be sub - optimal in absa . this is because the syntactic trees are not designed for capturing the interactions between opinion words and aspect words .", "entity": "tree structure", "output": "learning process", "neg_sample": ["tree structure is done by using Method", "dependency parse trees are helpful for discovering the opinion words in aspect - based sentiment analysis ( absa ) ( citation ) .", "however , the trees obtained from off - the - shelf dependency parsers are static , and could be sub - optimal in absa .", "this is because the syntactic trees are not designed for capturing the interactions between opinion words and aspect words ."], "relation": "used for", "id": "2021.emnlp-main.317", "year": 2021, "rel_sent": "The learning process allows the tree structure to adaptively correlate the aspect and opinion words , enabling us to better identify the polarity in the ABSA task .", "forward": false, "src_ids": "2021.emnlp-main.317_550"}
{"input": "pipelined approach is used for Task| context: most recent work models these two subtasks jointly , either by casting them in one structured prediction framework , or performing multi - task learning through shared representations .", "entity": "pipelined approach", "output": "entity and relation extraction", "neg_sample": ["pipelined approach is used for Task", "most recent work models these two subtasks jointly , either by casting them in one structured prediction framework , or performing multi - task learning through shared representations ."], "relation": "used for", "id": "2021.naacl-main.5", "year": 2021, "rel_sent": "In this work , we present a simple pipelined approach for entity and relation extraction , and establish the new state - of - the - art on standard benchmarks ( ACE04 , ACE05 and SciERC ) , obtaining a 1.7%-2.8 % absolute improvement in relation F1 over previous joint models with the same pre - trained encoders .", "forward": true, "src_ids": "2021.naacl-main.5_11862"}
{"input": "slot filling is done by using Material| context: automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in ai . one way to make headway for this problem is through advancements in a related task known as slot filling . in this task , given an entity query in form of [ entity , slot , ? the recent works in the field try to solve this task in an end - to - end fashion using retrieval - based language models .", "entity": "slot filling", "output": "tacred dataset", "neg_sample": ["slot filling is done by using Material", "automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in ai .", "one way to make headway for this problem is through advancements in a related task known as slot filling .", "in this task , given an entity query in form of [ entity , slot , ?", "the recent works in the field try to solve this task in an end - to - end fashion using retrieval - based language models ."], "relation": "used for", "id": "2021.emnlp-main.148", "year": 2021, "rel_sent": "Moreover , we demonstrate the robustness of our system showing its domain adaptation capability on a new variant of the TACRED dataset for slot filling , through a combination of zero / few - shot learning .", "forward": false, "src_ids": "2021.emnlp-main.148_9349"}
{"input": "explicit and non - explicit discourse relations is done by using Method| context: this paper demonstrates discopy , a novel framework that makes it easy to design components for end - to - end shallow discourse parsing .", "entity": "explicit and non - explicit discourse relations", "output": "contextualized word embeddings", "neg_sample": ["explicit and non - explicit discourse relations is done by using Method", "this paper demonstrates discopy , a novel framework that makes it easy to design components for end - to - end shallow discourse parsing ."], "relation": "used for", "id": "2021.codi-main.12", "year": 2021, "rel_sent": "For the purpose of demonstration , we implement recent neural approaches and integrate contextualized word embeddings to predict explicit and non - explicit discourse relations .", "forward": false, "src_ids": "2021.codi-main.12_11031"}
{"input": "engaging conversations is done by using OtherScientificTerm| context: having engaging and informative conversations with users is the utmost goal for open - domain conversational systems . recent advances in transformer - based language models and their applications to dialogue systems have succeeded to generate fluent and human - like responses . however , they still lack control over the generation process towards producing contentful responses and achieving engaging conversations .", "entity": "engaging conversations", "output": "convlines", "neg_sample": ["engaging conversations is done by using OtherScientificTerm", "having engaging and informative conversations with users is the utmost goal for open - domain conversational systems .", "recent advances in transformer - based language models and their applications to dialogue systems have succeeded to generate fluent and human - like responses .", "however , they still lack control over the generation process towards producing contentful responses and achieving engaging conversations ."], "relation": "used for", "id": "2021.naacl-demos.4", "year": 2021, "rel_sent": "Through automatic and human evaluations , we demonstrate the efficiency of the convlines in producing engaging conversations .", "forward": false, "src_ids": "2021.naacl-demos.4_230"}
{"input": "pedagogy is done by using Task| context: integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena .", "entity": "pedagogy", "output": "automatic annotation of curricular language targets", "neg_sample": ["pedagogy is done by using Task", "integrating an adaptive intelligent tutoring system ( its ) in real - life school contexts requires coverage of the official curricula , which necessitates a broad range and number of activities to practice the official set of language phenomena ."], "relation": "used for", "id": "2021.nlp4call-1.2", "year": 2021, "rel_sent": "Automatic annotation of curricular language targets to enrich activity models and support both pedagogy and adaptive systems.", "forward": false, "src_ids": "2021.nlp4call-1.2_8602"}
{"input": "mwe integration is used for Task| context: recent work has investigated ideograph or stroke level embedding .", "entity": "mwe integration", "output": "machine translation", "neg_sample": ["mwe integration is used for Task", "recent work has investigated ideograph or stroke level embedding ."], "relation": "used for", "id": "2021.nodalida-main.35", "year": 2021, "rel_sent": "MWE integration into MT has seen more than a decade of exploration .", "forward": true, "src_ids": "2021.nodalida-main.35_3613"}
{"input": "multilingual code - switching is used for Task| context: predicting user intent and detecting the corresponding slots from text are two key problems in natural language understanding ( nlu ) . since annotated datasets are only available for a handful of languages , our work focuses particularly on a zero - shot scenario where the target language is unseen during training . in the context of zero - shot learning , this task is typically approached using representations from pre - trained multilingual language models such as mbert or by fine - tuning on data automatically translated into the target language .", "entity": "multilingual code - switching", "output": "zero - shot cross - lingual intent prediction", "neg_sample": ["multilingual code - switching is used for Task", "predicting user intent and detecting the corresponding slots from text are two key problems in natural language understanding ( nlu ) .", "since annotated datasets are only available for a handful of languages , our work focuses particularly on a zero - shot scenario where the target language is unseen during training .", "in the context of zero - shot learning , this task is typically approached using representations from pre - trained multilingual language models such as mbert or by fine - tuning on data automatically translated into the target language ."], "relation": "used for", "id": "2021.mrl-1.18", "year": 2021, "rel_sent": "Multilingual Code - Switching for Zero - Shot Cross - Lingual Intent Prediction and Slot Filling.", "forward": true, "src_ids": "2021.mrl-1.18_14357"}
{"input": "headline grouping is done by using Method| context: recent progress in natural language understanding ( nlu ) has seen the latest models outperform human performance on many standard tasks . these impressive results have led the community to introspect on dataset limitations , and iterate on more nuanced challenges .", "entity": "headline grouping", "output": "unsupervised headline generator swap model", "neg_sample": ["headline grouping is done by using Method", "recent progress in natural language understanding ( nlu ) has seen the latest models outperform human performance on many standard tasks .", "these impressive results have led the community to introspect on dataset limitations , and iterate on more nuanced challenges ."], "relation": "used for", "id": "2021.naacl-main.255", "year": 2021, "rel_sent": "We further propose a novel unsupervised Headline Generator Swap model for the task of HeadLine Grouping that achieves within 3 F-1 of the best supervised model .", "forward": false, "src_ids": "2021.naacl-main.255_5366"}
{"input": "sflm is used for OtherScientificTerm| context: as unlabeled data carry rich task - relevant information , they are proven useful for few - shot learning of language model . the question is how to effectively make use of such data .", "entity": "sflm", "output": "pseudo label", "neg_sample": ["sflm is used for OtherScientificTerm", "as unlabeled data carry rich task - relevant information , they are proven useful for few - shot learning of language model .", "the question is how to effectively make use of such data ."], "relation": "used for", "id": "2021.emnlp-main.718", "year": 2021, "rel_sent": "Given two views of a text sample via weak and strong augmentation techniques , SFLM generates a pseudo label on the weakly augmented version .", "forward": true, "src_ids": "2021.emnlp-main.718_6588"}
{"input": "supervised learning tasks is used for OtherScientificTerm| context: training traditional extractive summarization models relies heavily on human - engineered labels such as sentence - level annotations of summary - worthiness . however , in many use cases , such human - engineered labels do not exist and manually annotating thousands of documents for the purpose of training models may not be feasible .", "entity": "supervised learning tasks", "output": "indirect signals", "neg_sample": ["supervised learning tasks is used for OtherScientificTerm", "training traditional extractive summarization models relies heavily on human - engineered labels such as sentence - level annotations of summary - worthiness .", "however , in many use cases , such human - engineered labels do not exist and manually annotating thousands of documents for the purpose of training models may not be feasible ."], "relation": "used for", "id": "2021.sigdial-1.54", "year": 2021, "rel_sent": "In this paper , we develop a general framework that generates extractive summarization as a byproduct of supervised learning tasks for indirect signals via the help of attention mechanism .", "forward": true, "src_ids": "2021.sigdial-1.54_1239"}
{"input": "automatic capture of emerging scientific concepts is done by using Method| context: scientific knowledge is evolving at an unprecedented rate of speed , with new concepts constantly being introduced from millions of academic articles published every month .", "entity": "automatic capture of emerging scientific concepts", "output": "sciconceptminer", "neg_sample": ["automatic capture of emerging scientific concepts is done by using Method", "scientific knowledge is evolving at an unprecedented rate of speed , with new concepts constantly being introduced from millions of academic articles published every month ."], "relation": "used for", "id": "2021.acl-demo.6", "year": 2021, "rel_sent": "In this paper , we introduce a self - supervised end - to - end system , SciConceptMiner , for the automatic capture of emerging scientific concepts from both independent knowledge sources ( semi - structured data ) and academic publications ( unstructured documents ) .", "forward": false, "src_ids": "2021.acl-demo.6_5652"}
{"input": "images is used for Task| context: images are core components of multi - modal learning in natural language processing ( nlp ) , and results have varied substantially as to whether images improve nlp tasks or not . one confounding effect has been that previous nlp research has generally focused on sophisticated tasks ( in varying settings ) , generally applied to english only .", "entity": "images", "output": "text classification", "neg_sample": ["images is used for Task", "images are core components of multi - modal learning in natural language processing ( nlp ) , and results have varied substantially as to whether images improve nlp tasks or not .", "one confounding effect has been that previous nlp research has generally focused on sophisticated tasks ( in varying settings ) , generally applied to english only ."], "relation": "used for", "id": "2021.eacl-main.4", "year": 2021, "rel_sent": "On the ( In)Effectiveness of Images for Text Classification.", "forward": true, "src_ids": "2021.eacl-main.4_12253"}
{"input": "raw weighted real distances is used for OtherScientificTerm| context: transformer has achieved great success in the nlp field by composing various advanced models like bert and gpt . however , transformer and its existing variants may not be optimal in capturing token distances because the position or distance embeddings used by these methods usually can not keep the precise information of real distances , which may not be beneficial for modeling the orders and relations of contexts .", "entity": "raw weighted real distances", "output": "raw self - attention weights", "neg_sample": ["raw weighted real distances is used for OtherScientificTerm", "transformer has achieved great success in the nlp field by composing various advanced models like bert and gpt .", "however , transformer and its existing variants may not be optimal in capturing token distances because the position or distance embeddings used by these methods usually can not keep the precise information of real distances , which may not be beneficial for modeling the orders and relations of contexts ."], "relation": "used for", "id": "2021.naacl-main.166", "year": 2021, "rel_sent": "Since the raw weighted real distances may not be optimal for adjusting self - attention weights , we propose a learnable sigmoid function to map them into re - scaled coefficients that have proper ranges .", "forward": true, "src_ids": "2021.naacl-main.166_5748"}
{"input": "mapping - based clwes is done by using Method| context: unsupervised cross - lingual word embedding(clwe ) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora . this method relies on the assumption that the two embedding spaces are structurally similar , which does not necessarily hold true in general .", "entity": "mapping - based clwes", "output": "data augmentation", "neg_sample": ["mapping - based clwes is done by using Method", "unsupervised cross - lingual word embedding(clwe ) methods learn a linear transformation matrix that maps two monolingual embedding spaces that are separately trained with monolingual corpora .", "this method relies on the assumption that the two embedding spaces are structurally similar , which does not necessarily hold true in general ."], "relation": "used for", "id": "2021.acl-srw.17", "year": 2021, "rel_sent": "We show that our approach outperforms other alternative approaches given the same amount of data , and , through detailed analysis , we show that data augmentation with the pseudo data from unsupervised machine translation is especially effective for mapping - based CLWEs because ( 1 ) the pseudo data makes the source and target corpora ( partially ) parallel ; ( 2 ) the pseudo data contains information on the original language that helps to learn similar embedding spaces between the source and target languages .", "forward": false, "src_ids": "2021.acl-srw.17_11098"}
{"input": "disambiguation is done by using Method| context: supervised systems have nowadays become the standard recipe for word sense disambiguation ( wsd ) , with transformer - based language models as their primary ingredient . however , while these systems have certainly attained unprecedented performances , virtually all of them operate under the constraining assumption that , given a context , each word can be disambiguated individually with no account of the other sense choices .", "entity": "disambiguation", "output": "feedback loop strategy", "neg_sample": ["disambiguation is done by using Method", "supervised systems have nowadays become the standard recipe for word sense disambiguation ( wsd ) , with transformer - based language models as their primary ingredient .", "however , while these systems have certainly attained unprecedented performances , virtually all of them operate under the constraining assumption that , given a context , each word can be disambiguated individually with no account of the other sense choices ."], "relation": "used for", "id": "2021.emnlp-main.112", "year": 2021, "rel_sent": "To address this limitation and drop this assumption , we propose CONtinuous SEnse Comprehension ( ConSeC ) , a novel approach to WSD : leveraging a recent re - framing of this task as a text extraction problem , we adapt it to our formulation and introduce a feedback loop strategy that allows the disambiguation of a target word to be conditioned not only on its context but also on the explicit senses assigned to nearby words .", "forward": false, "src_ids": "2021.emnlp-main.112_12157"}
{"input": "pretrained language model adaptation is done by using Method| context: it works by adding light - weight adapter modules to a pretrained language model ( prlm ) and only updating the parameters of adapter modules when learning on a downstream task . as such , it adds only a few trainable parameters per new task , allowing a high degree of parameter sharing . in this paper , we study the latter .", "entity": "pretrained language model adaptation", "output": "adapter - based tuning", "neg_sample": ["pretrained language model adaptation is done by using Method", "it works by adding light - weight adapter modules to a pretrained language model ( prlm ) and only updating the parameters of adapter modules when learning on a downstream task .", "as such , it adds only a few trainable parameters per new task , allowing a high degree of parameter sharing .", "in this paper , we study the latter ."], "relation": "used for", "id": "2021.acl-long.172", "year": 2021, "rel_sent": "On the Effectiveness of Adapter - based Tuning for Pretrained Language Model Adaptation.", "forward": false, "src_ids": "2021.acl-long.172_13245"}
{"input": "ensemble of sequence tagging models is used for Task| context: the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever . detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms .", "entity": "ensemble of sequence tagging models", "output": "toxic spans detection", "neg_sample": ["ensemble of sequence tagging models is used for Task", "the upsurge of prolific blogging and microblogging platforms enabled the abusers to spread negativity and threats greater than ever .", "detecting the toxic portions substantially aids to moderate or exclude the abusive parts for maintaining sound online platforms ."], "relation": "used for", "id": "2021.semeval-1.135", "year": 2021, "rel_sent": "CSECU - DSG at SemEval-2021 Task 5 : Leveraging Ensemble of Sequence Tagging Models for Toxic Spans Detection.", "forward": true, "src_ids": "2021.semeval-1.135_4199"}
{"input": "crisscrossed captions is used for OtherScientificTerm| context: by supporting multi - modal retrieval training and evaluation , image captioning datasets have spurred remarkable progress on representation learning . unfortunately , datasets have limited cross - modal associations : images are not paired with other images , captions are only paired with other captions of the same image , there are no negative associations and there are missing positive cross - modal associations . this undermines research into how inter - modality learning impacts intra - modality tasks .", "entity": "crisscrossed captions", "output": "intramodal and intermodal semantic similarity judgments", "neg_sample": ["crisscrossed captions is used for OtherScientificTerm", "by supporting multi - modal retrieval training and evaluation , image captioning datasets have spurred remarkable progress on representation learning .", "unfortunately , datasets have limited cross - modal associations : images are not paired with other images , captions are only paired with other captions of the same image , there are no negative associations and there are missing positive cross - modal associations .", "this undermines research into how inter - modality learning impacts intra - modality tasks ."], "relation": "used for", "id": "2021.eacl-main.249", "year": 2021, "rel_sent": "Crisscrossed Captions : Extended Intramodal and Intermodal Semantic Similarity Judgments for MS - COCO.", "forward": true, "src_ids": "2021.eacl-main.249_1725"}
{"input": "medical knowledge is used for Task| context: in recent years pre - trained language models ( plm ) such as bert have proven to be very effective in diverse nlp tasks such as information extraction , sentiment analysis and question answering . trained with massive general - domain text , these pre - trained language models capture rich syntactic , semantic and discourse information in the text . furthermore , it may require additional medical knowledge to understand clinical text properly .", "entity": "medical knowledge", "output": "extracting clinical relations", "neg_sample": ["medical knowledge is used for Task", "in recent years pre - trained language models ( plm ) such as bert have proven to be very effective in diverse nlp tasks such as information extraction , sentiment analysis and question answering .", "trained with massive general - domain text , these pre - trained language models capture rich syntactic , semantic and discourse information in the text .", "furthermore , it may require additional medical knowledge to understand clinical text properly ."], "relation": "used for", "id": "2021.emnlp-main.435", "year": 2021, "rel_sent": "Incorporating medical knowledge in BERT for clinical relation extraction.", "forward": true, "src_ids": "2021.emnlp-main.435_8871"}
{"input": "domiknows is used for Task| context: we demonstrate a library for the integration of domain knowledge in deep learning architectures . using this library , the structure of the data is expressed symbolically via graph declarations and the logical constraints over outputs or latent variables can be seamlessly added to the deep models .", "entity": "domiknows", "output": "integration of symbolic domain knowledge", "neg_sample": ["domiknows is used for Task", "we demonstrate a library for the integration of domain knowledge in deep learning architectures .", "using this library , the structure of the data is expressed symbolically via graph declarations and the logical constraints over outputs or latent variables can be seamlessly added to the deep models ."], "relation": "used for", "id": "2021.emnlp-demo.27", "year": 2021, "rel_sent": "DomiKnowS : A Library for Integration of Symbolic Domain Knowledge in Deep Learning.", "forward": true, "src_ids": "2021.emnlp-demo.27_9686"}
{"input": "automatically - extracted dense region captions is used for OtherScientificTerm| context: leveraging large - scale unlabeled web videos such as instructional videos for pre - training followed by task - specific finetuning has become the de facto approach for many video - and - language tasks . however , these instructional videos are very noisy , the accompanying asr narrations are often incomplete , and can be irrelevant to or temporally misaligned with the visual content , limiting the performance of the models trained on such data .", "entity": "automatically - extracted dense region captions", "output": "auxiliary text input", "neg_sample": ["automatically - extracted dense region captions is used for OtherScientificTerm", "leveraging large - scale unlabeled web videos such as instructional videos for pre - training followed by task - specific finetuning has become the de facto approach for many video - and - language tasks .", "however , these instructional videos are very noisy , the accompanying asr narrations are often incomplete , and can be irrelevant to or temporally misaligned with the visual content , limiting the performance of the models trained on such data ."], "relation": "used for", "id": "2021.naacl-main.193", "year": 2021, "rel_sent": "To address these issues , we propose an improved video - and - language pre - training method that first adds automatically - extracted dense region captions from the videoframes as auxiliary text input , to provide informative visual cues for learning better video and language associations .", "forward": true, "src_ids": "2021.naacl-main.193_4747"}
{"input": "abstractive summarization task is done by using Method| context: state - of - the - art abstractive summarization models generally rely on extensive labeled data , which lowers their generalization ability on domains where such data are not available .", "entity": "abstractive summarization task", "output": "domain adaptation methods", "neg_sample": ["abstractive summarization task is done by using Method", "state - of - the - art abstractive summarization models generally rely on extensive labeled data , which lowers their generalization ability on domains where such data are not available ."], "relation": "used for", "id": "2021.naacl-main.471", "year": 2021, "rel_sent": "Furthermore , results illustrate that a huge gap still exists between the low - resource and high - resource settings , which highlights the need for more advanced domain adaptation methods for the abstractive summarization task .", "forward": false, "src_ids": "2021.naacl-main.471_11155"}
{"input": "coupled policies is used for Task| context: we present a novel approach to efficiently learn a simultaneous translation model with coupled programmer - interpreter policies .", "entity": "coupled policies", "output": "simultaneous machine translation", "neg_sample": ["coupled policies is used for Task", "we present a novel approach to efficiently learn a simultaneous translation model with coupled programmer - interpreter policies ."], "relation": "used for", "id": "2021.eacl-main.233", "year": 2021, "rel_sent": "Learning Coupled Policies for Simultaneous Machine Translation using Imitation Learning.", "forward": true, "src_ids": "2021.eacl-main.233_14749"}
{"input": "tabular perturbation is done by using Method| context: to grasp the true reasoning ability , the natural language inference model should be evaluated on counterfactual data .", "entity": "tabular perturbation", "output": "tabpert", "neg_sample": ["tabular perturbation is done by using Method", "to grasp the true reasoning ability , the natural language inference model should be evaluated on counterfactual data ."], "relation": "used for", "id": "2021.emnlp-demo.39", "year": 2021, "rel_sent": "TabPert : An Effective Platform for Tabular Perturbation.", "forward": false, "src_ids": "2021.emnlp-demo.39_11723"}
{"input": "semi - supervised document classification is done by using Method| context: nonetheless , their development has happened in isolation , while the combination of both could potentially be effective for tackling task - specific labelled data shortage .", "entity": "semi - supervised document classification", "output": "deep generative models", "neg_sample": ["semi - supervised document classification is done by using Method", "nonetheless , their development has happened in isolation , while the combination of both could potentially be effective for tackling task - specific labelled data shortage ."], "relation": "used for", "id": "2021.eacl-main.76", "year": 2021, "rel_sent": "Combining Deep Generative Models and Multi - lingual Pretraining for Semi - supervised Document Classification.", "forward": false, "src_ids": "2021.eacl-main.76_6432"}
{"input": "paired training data is used for Method| context: metaphor generation is a difficult task , and has seen tremendous improvement with the advent of deep pretrained models .", "entity": "paired training data", "output": "t5 models", "neg_sample": ["paired training data is used for Method", "metaphor generation is a difficult task , and has seen tremendous improvement with the advent of deep pretrained models ."], "relation": "used for", "id": "2021.conll-1.26", "year": 2021, "rel_sent": "We evaluate two methods for generating paired training data , which is then used to train T5 models for free and controlled generation .", "forward": true, "src_ids": "2021.conll-1.26_3109"}
{"input": "generic pretrained model is done by using Method| context: large pre - trained models such as bert are known to improve different downstream nlp tasks , even when such a model is trained on a generic domain . moreover , recent studies have shown that when large domain - specific corpora are available , continued pre - training on domain - specific data can further improve the performance of in - domain tasks . however , this practice requires significant domain - specific data and computational resources which may not always be available .", "entity": "generic pretrained model", "output": "( word based ) n - grams", "neg_sample": ["generic pretrained model is done by using Method", "large pre - trained models such as bert are known to improve different downstream nlp tasks , even when such a model is trained on a generic domain .", "moreover , recent studies have shown that when large domain - specific corpora are available , continued pre - training on domain - specific data can further improve the performance of in - domain tasks .", "however , this practice requires significant domain - specific data and computational resources which may not always be available ."], "relation": "used for", "id": "2021.acl-long.259", "year": 2021, "rel_sent": "We demonstrate that by explicitly incorporating multi - granularity information of unseen and domain - specific words via the adaptation of ( word based ) n - grams , the performance of a generic pretrained model can be greatly improved .", "forward": false, "src_ids": "2021.acl-long.259_2172"}
{"input": "shallow patterns is done by using OtherScientificTerm| context: feed - forward layers constitute two - thirds of a transformer model 's parameters , yet their role in the network remains under - explored . we show that feed - forward layers in transformer - based language models operate as key - value memories , where each key correlates with textual patterns in the training examples , and each value induces a distribution over the output vocabulary .", "entity": "shallow patterns", "output": "lower layers", "neg_sample": ["shallow patterns is done by using OtherScientificTerm", "feed - forward layers constitute two - thirds of a transformer model 's parameters , yet their role in the network remains under - explored .", "we show that feed - forward layers in transformer - based language models operate as key - value memories , where each key correlates with textual patterns in the training examples , and each value induces a distribution over the output vocabulary ."], "relation": "used for", "id": "2021.emnlp-main.446", "year": 2021, "rel_sent": "Our experiments show that the learned patterns are human - interpretable , and that lower layers tend to capture shallow patterns , while upper layers learn more semantic ones .", "forward": false, "src_ids": "2021.emnlp-main.446_8857"}
{"input": "contextualized embeddings is done by using Method| context: pre - trained transformer language models have shown remarkable performance on a variety of nlp tasks . however , recent research has suggested that phrase - level representations in these models reflect heavy influences of lexical content , but lack evidence of sophisticated , compositional phrase information ( yu and ettinger , 2020 ) .", "entity": "contextualized embeddings", "output": "fine - tuning", "neg_sample": ["contextualized embeddings is done by using Method", "pre - trained transformer language models have shown remarkable performance on a variety of nlp tasks .", "however , recent research has suggested that phrase - level representations in these models reflect heavy influences of lexical content , but lack evidence of sophisticated , compositional phrase information ( yu and ettinger , 2020 ) ."], "relation": "used for", "id": "2021.findings-acl.201", "year": 2021, "rel_sent": "Here we investigate the impact of fine - tuning on the capacity of contextualized embeddings to capture phrase meaning information beyond lexical content .", "forward": false, "src_ids": "2021.findings-acl.201_12123"}
{"input": "arguments is done by using Task| context: many forms of argumentation employ images as persuasive means , but research in argument mining has been focused on verbal argumentation sofar .", "entity": "arguments", "output": "image retrieval", "neg_sample": ["arguments is done by using Task", "many forms of argumentation employ images as persuasive means , but research in argument mining has been focused on verbal argumentation sofar ."], "relation": "used for", "id": "2021.argmining-1.4", "year": 2021, "rel_sent": "Image Retrieval for Arguments Using Stance - Aware Query Expansion.", "forward": false, "src_ids": "2021.argmining-1.4_3996"}
{"input": "domain adaptation is done by using OtherScientificTerm| context: pre - trained language models ( ptlms ) acquire domain - independent linguistic knowledge through pre - training with massive textual resources . additional pre - training is effective in adapting ptlms to domains that are not well covered by the pre - training corpora .", "entity": "domain adaptation", "output": "static word embeddings", "neg_sample": ["domain adaptation is done by using OtherScientificTerm", "pre - trained language models ( ptlms ) acquire domain - independent linguistic knowledge through pre - training with massive textual resources .", "additional pre - training is effective in adapting ptlms to domains that are not well covered by the pre - training corpora ."], "relation": "used for", "id": "2021.findings-acl.398", "year": 2021, "rel_sent": "Here , we focus on the static word embeddings of PTLMs for domain adaptation to teach PTLMs domain - specific meanings of words .", "forward": false, "src_ids": "2021.findings-acl.398_15948"}
{"input": "multi - hop reasoning is done by using OtherScientificTerm| context: multi - hop reasoning requires aggregation and inference from multiple facts .", "entity": "multi - hop reasoning", "output": "compositional knowledge", "neg_sample": ["multi - hop reasoning is done by using OtherScientificTerm", "multi - hop reasoning requires aggregation and inference from multiple facts ."], "relation": "used for", "id": "2021.naacl-main.363", "year": 2021, "rel_sent": "First , we introduce several attention- and embedding - based analyses , which indicate that jointly retrieving and reranking approaches can learn compositional knowledge required for multi - hop reasoning .", "forward": false, "src_ids": "2021.naacl-main.363_6159"}
{"input": "fine - tuning is used for Generic| context: the dataset comprises 65k european union ( eu ) laws , officially translated in 23 languages , annotated with multiple labels from the eurovoc taxonomy .", "entity": "fine - tuning", "output": "end - tasks", "neg_sample": ["fine - tuning is used for Generic", "the dataset comprises 65k european union ( eu ) laws , officially translated in 23 languages , annotated with multiple labels from the eurovoc taxonomy ."], "relation": "used for", "id": "2021.emnlp-main.559", "year": 2021, "rel_sent": "Adaptation strategies , namely partial fine - tuning , adapters , BITFIT , LNFIT , originally proposed to accelerate fine - tuning for new end - tasks , help retain multilingual knowledge from pretraining , substantially improving zero - shot cross - lingual transfer , but their impact also depends on the pretrained model used and the size of the label set .", "forward": true, "src_ids": "2021.emnlp-main.559_11628"}
{"input": "human language processing is done by using Method| context: the human mind is a dynamical system , yet many analysis techniques used to study it are limited in their ability to capture the complex dynamics that may characterize mental processes .", "entity": "human language processing", "output": "continuous - time deconvolutional regressive neural network", "neg_sample": ["human language processing is done by using Method", "the human mind is a dynamical system , yet many analysis techniques used to study it are limited in their ability to capture the complex dynamics that may characterize mental processes ."], "relation": "used for", "id": "2021.acl-long.288", "year": 2021, "rel_sent": "Behavioral and fMRI experiments reveal detailed and plausible estimates of human language processing dynamics that generalize better than CDR and other baselines , supporting a potential role for CDRNN in studying human language processing .", "forward": false, "src_ids": "2021.acl-long.288_9497"}
{"input": "text classification is done by using Method| context: existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training .", "entity": "text classification", "output": "certified word substitution robustness methods", "neg_sample": ["text classification is done by using Method", "existing bias mitigation methods to reduce disparities in model outcomes across cohorts have focused on data augmentation , debiasing model embeddings , or adding fairness - based optimization objectives during training ."], "relation": "used for", "id": "2021.findings-acl.294", "year": 2021, "rel_sent": "Does Robustness Improve Fairness ? Approaching Fairness with Word Substitution Robustness Methods for Text Classification.", "forward": false, "src_ids": "2021.findings-acl.294_2351"}
{"input": "utterance rewrite model is used for OtherScientificTerm| context: recently , text - to - sql for multi - turn dialogue has attracted great interest . here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history . current approaches mostly employ end - to - end models and consequently face two challenges . first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement . second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity .", "entity": "utterance rewrite model", "output": "completion of dialogue context", "neg_sample": ["utterance rewrite model is used for OtherScientificTerm", "recently , text - to - sql for multi - turn dialogue has attracted great interest .", "here , the user input of the current turn is parsed into the corresponding sql query of the appropriate database , given all previous dialogue history .", "current approaches mostly employ end - to - end models and consequently face two challenges .", "first , dialogue history modeling and text - tosql parsing are implicitly combined , hence it is hard to carry out interpretable analysis and obtain targeted improvement .", "second , sql annotation of multi - turn dialogue is very expensive , leading to training data sparsity ."], "relation": "used for", "id": "2021.findings-acl.270", "year": 2021, "rel_sent": "In this paper , we propose a novel decoupled multi - turn Text - to - SQL framework , where an utterance rewrite model first explicitly solves completion of dialogue context , and then a single - turn Text - to - SQL parser follows .", "forward": true, "src_ids": "2021.findings-acl.270_4336"}
{"input": "supervised machine learning techniques is used for Task| context: written communication is of utmost importance to the progress of scientific research . the speed of such development , however , may be affected by the scarcity of reviewers to referee the quality of research articles . in this context , automatic approaches that are able to query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity .", "entity": "supervised machine learning techniques", "output": "genre analysis", "neg_sample": ["supervised machine learning techniques is used for Task", "written communication is of utmost importance to the progress of scientific research .", "the speed of such development , however , may be affected by the scarcity of reviewers to referee the quality of research articles .", "in this context , automatic approaches that are able to query linguistic segments in written contributions by detecting the presence or absence of common rhetorical patterns have become a necessity ."], "relation": "used for", "id": "2021.ranlp-1.8", "year": 2021, "rel_sent": "This paper aims to compare supervised machine learning techniques tested to accomplish genre analysis in Introduction sections of software engineering articles .", "forward": true, "src_ids": "2021.ranlp-1.8_12383"}
{"input": "neural agents is used for OtherScientificTerm| context: natural languages display a trade - off among different strategies to convey syntactic structure , such as word order or inflection . this trade - off , however , has not appeared in recent simulations of iterated language learning with neural network agents ( chaabouni et al . , 2019b ) .", "entity": "neural agents", "output": "utterance type distribution", "neg_sample": ["neural agents is used for OtherScientificTerm", "natural languages display a trade - off among different strategies to convey syntactic structure , such as word order or inflection .", "this trade - off , however , has not appeared in recent simulations of iterated language learning with neural network agents ( chaabouni et al .", ", 2019b ) ."], "relation": "used for", "id": "2021.emnlp-main.794", "year": 2021, "rel_sent": "Our simulations show that neural agents mainly strive to maintain the utterance type distribution observed during learning , instead of developing a more efficient or systematic language .", "forward": true, "src_ids": "2021.emnlp-main.794_4651"}
{"input": "pre - trained embeddings is used for Task| context: universal adversarial texts ( uats ) refer to short pieces of text units that can largely affect the predictions of nlp models .", "entity": "pre - trained embeddings", "output": "universal adversarial attacks", "neg_sample": ["pre - trained embeddings is used for Task", "universal adversarial texts ( uats ) refer to short pieces of text units that can largely affect the predictions of nlp models ."], "relation": "used for", "id": "2021.alta-1.14", "year": 2021, "rel_sent": "Our empirical studies on three text classification datasets reveal that : 1 ) CNN based models are more extremely vulnerable to UATs while self - attention models show the most robustness , 2 ) the vulnerability of CNN and LSTM models and robustness of self - attention models could be attributed to whether they rely on training data artifacts for their predictions , and 3 ) the pre - trained embeddings could expose vulnerability to both universal adversarial attack and the UAT transfer attack .", "forward": true, "src_ids": "2021.alta-1.14_161"}
{"input": "machine translation application is done by using Method| context: in this paper , we present miss , an assistant for multi - style simultaneous translation . our proposed translation system has five key features : highly accurate translation , simultaneous translation , translation for multiple text styles , back - translation for translation quality evaluation , and grammatical error correction .", "entity": "machine translation application", "output": "translation assistance system", "neg_sample": ["machine translation application is done by using Method", "in this paper , we present miss , an assistant for multi - style simultaneous translation .", "our proposed translation system has five key features : highly accurate translation , simultaneous translation , translation for multiple text styles , back - translation for translation quality evaluation , and grammatical error correction ."], "relation": "used for", "id": "2021.emnlp-demo.1", "year": 2021, "rel_sent": "Compared with the free commercial translation systems commonly used , our translation assistance system regards the machine translation application as a more complete and fully - featured tool for users .", "forward": false, "src_ids": "2021.emnlp-demo.1_1502"}
{"input": "complex affective states is done by using Method| context: datasets with induced emotion labels are scarce but of utmost importance for many nlp tasks .", "entity": "complex affective states", "output": "reaction gifs", "neg_sample": ["complex affective states is done by using Method", "datasets with induced emotion labels are scarce but of utmost importance for many nlp tasks ."], "relation": "used for", "id": "2021.acl-short.50", "year": 2021, "rel_sent": "The method exploits the online use of reaction GIFs , which capture complex affective states .", "forward": false, "src_ids": "2021.acl-short.50_9831"}
{"input": "continual learning is used for Task| context: although some cl techniques have been proposed for document sentiment classification , we are not aware of any cl work on asc .", "entity": "continual learning", "output": "aspect sentiment classification ( asc ) tasks", "neg_sample": ["continual learning is used for Task", "although some cl techniques have been proposed for document sentiment classification , we are not aware of any cl work on asc ."], "relation": "used for", "id": "2021.naacl-main.378", "year": 2021, "rel_sent": "Adapting BERT for Continual Learning of a Sequence of Aspect Sentiment Classification Tasks.", "forward": true, "src_ids": "2021.naacl-main.378_15696"}
{"input": "few - shot neural text generation is done by using Task| context: large - scale pretrained language models have led to dramatic improvements in text generation . impressive performance can be achieved by finetuning only on a small number of instances ( few - shot setting ) . nonetheless , almost all previous work simply applies random sampling to select the few - shot training instances . little to no attention has been paid to the selection strategies and how they would affect model performance .", "entity": "few - shot neural text generation", "output": "training instance selection", "neg_sample": ["few - shot neural text generation is done by using Task", "large - scale pretrained language models have led to dramatic improvements in text generation .", "impressive performance can be achieved by finetuning only on a small number of instances ( few - shot setting ) .", "nonetheless , almost all previous work simply applies random sampling to select the few - shot training instances .", "little to no attention has been paid to the selection strategies and how they would affect model performance ."], "relation": "used for", "id": "2021.acl-short.2", "year": 2021, "rel_sent": "On Training Instance Selection for Few - Shot Neural Text Generation.", "forward": false, "src_ids": "2021.acl-short.2_10894"}
{"input": "virtual adversarial training is used for Task| context: the real - world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way .", "entity": "virtual adversarial training", "output": "toxic spans detection", "neg_sample": ["virtual adversarial training is used for Task", "the real - world impact of polarization and toxicity in the online sphere marked the end of 2020 and the beginning of this year in a negative way ."], "relation": "used for", "id": "2021.semeval-1.26", "year": 2021, "rel_sent": "UPB at SemEval-2021 Task 5 : Virtual Adversarial Training for Toxic Spans Detection.", "forward": true, "src_ids": "2021.semeval-1.26_8050"}
{"input": "text classification is used for Material| context: automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions .", "entity": "text classification", "output": "user question", "neg_sample": ["text classification is used for Material", "automated frequently asked question ( faq ) retrieval provides an effective procedure to provide prompt responses to natural language based queries , providing an efficient platform for large - scale service - providing companies for presenting readily available information pertaining to customers ' questions ."], "relation": "used for", "id": "2021.sigdial-1.44", "year": 2021, "rel_sent": "We propose two decoupled deep learning architectures trained for ( i ) candidate generation via text classification for a user question , and ( ii ) learning fine - grained semantic similarity between user questions and the FAQ repository for candidate refinement .", "forward": true, "src_ids": "2021.sigdial-1.44_14104"}
{"input": "document - level ee tasks is done by using Method| context: document - level entity - based extraction ( ee ) , aiming at extracting entity - centric information such as entity roles and entity relations , is key to automatic knowledge acquisition from text corpora for various domains . most document - level ee systems build extractive models , which struggle to model long - term dependencies among entities at the document level .", "entity": "document - level ee tasks", "output": "generative framework", "neg_sample": ["document - level ee tasks is done by using Method", "document - level entity - based extraction ( ee ) , aiming at extracting entity - centric information such as entity roles and entity relations , is key to automatic knowledge acquisition from text corpora for various domains .", "most document - level ee systems build extractive models , which struggle to model long - term dependencies among entities at the document level ."], "relation": "used for", "id": "2021.emnlp-main.426", "year": 2021, "rel_sent": "To address this issue , we propose a generative framework for two document - level EE tasks : role - filler entity extraction ( REE ) and relation extraction ( RE ) .", "forward": false, "src_ids": "2021.emnlp-main.426_821"}
{"input": "case - oriented construction framework is used for Material| context: relation extraction ( re ) is an essential topic in natural language processing and has attracted extensive attention . current re approaches achieve fantastic results on common datasets , while they still struggle on practical applications . in this paper , we analyze the above performance gap , the underlying reason of which is that practical applications intrinsically have more hard cases .", "entity": "case - oriented construction framework", "output": "hard case relation extraction dataset ( hacred )", "neg_sample": ["case - oriented construction framework is used for Material", "relation extraction ( re ) is an essential topic in natural language processing and has attracted extensive attention .", "current re approaches achieve fantastic results on common datasets , while they still struggle on practical applications .", "in this paper , we analyze the above performance gap , the underlying reason of which is that practical applications intrinsically have more hard cases ."], "relation": "used for", "id": "2021.findings-acl.249", "year": 2021, "rel_sent": "To make RE models more robust on such practical hard cases , we propose a case - oriented construction framework to build a Hard Case Relation Extraction Dataset ( HacRED ) .", "forward": true, "src_ids": "2021.findings-acl.249_6829"}
{"input": "morpheus - multilingual is used for OtherScientificTerm| context: in this work , we analyze the robustness of neural machine translation systems towards grammatical perturbations in the source . in particular , we focus on morphological inflection related perturbations . while this has been recently studied for english->french ( morpheus ) ( tan et al . , 2020 ) , it is unclear how this extends to any->english translation systems .", "entity": "morpheus - multilingual", "output": "morphological perturbations", "neg_sample": ["morpheus - multilingual is used for OtherScientificTerm", "in this work , we analyze the robustness of neural machine translation systems towards grammatical perturbations in the source .", "in particular , we focus on morphological inflection related perturbations .", "while this has been recently studied for english->french ( morpheus ) ( tan et al .", ", 2020 ) , it is unclear how this extends to any->english translation systems ."], "relation": "used for", "id": "2021.sigmorphon-1.6", "year": 2021, "rel_sent": "We propose MORPHEUS - MULTILINGUAL that utilizes UniMorph dictionaries to identify morphological perturbations to source that adversely affect the translation models .", "forward": true, "src_ids": "2021.sigmorphon-1.6_14896"}
{"input": "dynamic sentence graph is used for OtherScientificTerm| context: pre - trained models like bidirectional encoder representations from transformers ( bert ) , have recently made a big leap forward in natural language processing ( nlp ) tasks . however , there are still some shortcomings in the masked language modeling ( mlm ) task performed by these models .", "entity": "dynamic sentence graph", "output": "local context", "neg_sample": ["dynamic sentence graph is used for OtherScientificTerm", "pre - trained models like bidirectional encoder representations from transformers ( bert ) , have recently made a big leap forward in natural language processing ( nlp ) tasks .", "however , there are still some shortcomings in the masked language modeling ( mlm ) task performed by these models ."], "relation": "used for", "id": "11.textgraphs-1.12", "year": 2021, "rel_sent": "The proposed model also employs a dynamic sentence graph to capture local context effectively .", "forward": true, "src_ids": "11.textgraphs-1.12_3275"}
{"input": "mt community is done by using Task| context: over the years , many different filtering approaches have been proposed . however , varying task definitions and data conditions make it difficult to draw a meaningful comparison .", "entity": "mt community", "output": "data filtering", "neg_sample": ["mt community is done by using Task", "over the years , many different filtering approaches have been proposed .", "however , varying task definitions and data conditions make it difficult to draw a meaningful comparison ."], "relation": "used for", "id": "2021.naacl-main.15", "year": 2021, "rel_sent": "First , we analyze the performance of language identification , a tool commonly used for data filtering in the MT community and identify specific weaknesses .", "forward": false, "src_ids": "2021.naacl-main.15_13754"}
{"input": "ndh - full task is done by using Method| context: communication between human and mobile agents is getting increasingly important as such agents are widely deployed in our daily lives . vision - and - dialogue navigation is one of the tasks that evaluate the agent 's ability to interact with humans for assistance and navigate based on natural language responses .", "entity": "ndh - full task", "output": "training methods", "neg_sample": ["ndh - full task is done by using Method", "communication between human and mobile agents is getting increasingly important as such agents are widely deployed in our daily lives .", "vision - and - dialogue navigation is one of the tasks that evaluate the agent 's ability to interact with humans for assistance and navigate based on natural language responses ."], "relation": "used for", "id": "2021.emnlp-main.518", "year": 2021, "rel_sent": "We further describe several approaches that we try , in order to improve the model performance ( based on curriculum learning , pre - training , and data - augmentation ) , suggesting potential useful training methods on this new NDH - Full task .", "forward": false, "src_ids": "2021.emnlp-main.518_6381"}
{"input": "order is done by using Method| context: discontinuous constituent parsers have always lagged behind continuous approaches in terms of accuracy and speed , as the presence of constituents with discontinuous yield introduces extra complexity to the task . however , a discontinuous tree can be converted into a continuous variant by reordering tokens .", "entity": "order", "output": "bijective function", "neg_sample": ["order is done by using Method", "discontinuous constituent parsers have always lagged behind continuous approaches in terms of accuracy and speed , as the presence of constituents with discontinuous yield introduces extra complexity to the task .", "however , a discontinuous tree can be converted into a continuous variant by reordering tokens ."], "relation": "used for", "id": "2021.emnlp-main.825", "year": 2021, "rel_sent": "To that end , we develop a Pointer Network capable of accurately generating the continuous token arrangement for a given input sentence and define a bijective function to recover the original order .", "forward": false, "src_ids": "2021.emnlp-main.825_14409"}
{"input": "propaganda techniques is used for Material| context: in political news media , propaganda techniques are often employed to express one 's political view , or to influence the audience 's stance . annotation of propaganda techniques are yet to be developed .", "entity": "propaganda techniques", "output": "chinese political news texts", "neg_sample": ["propaganda techniques is used for Material", "in political news media , propaganda techniques are often employed to express one 's political view , or to influence the audience 's stance .", "annotation of propaganda techniques are yet to be developed ."], "relation": "used for", "id": "2021.ijclclp-1.5", "year": 2021, "rel_sent": "In this paper , with an explainable approach , we annotated the use of propaganda techniques in Chinese political news texts , and enlarged the dataset by bootstrapping using a small set of manually annotated data .", "forward": true, "src_ids": "2021.ijclclp-1.5_13433"}
{"input": "knowledge graph inference is done by using Method| context: knowledge graph inference has been studied extensively due to its wide applications . it has been addressed by two lines of research , i.e. , the more traditional logical rule reasoning and the more recent knowledge graph embedding ( kge ) . several attempts have been made to combine kge and logical rules for better knowledge graph inference . even worse , both approaches need to sample ground rules to tackle the scalability issue , as the total number of ground rules is intractable in practice , making them less effective in handling logical rules .", "entity": "knowledge graph inference", "output": "definite horn rule reasoning", "neg_sample": ["knowledge graph inference is done by using Method", "knowledge graph inference has been studied extensively due to its wide applications .", "it has been addressed by two lines of research , i.e.", ", the more traditional logical rule reasoning and the more recent knowledge graph embedding ( kge ) .", "several attempts have been made to combine kge and logical rules for better knowledge graph inference .", "even worse , both approaches need to sample ground rules to tackle the scalability issue , as the total number of ground rules is intractable in practice , making them less effective in handling logical rules ."], "relation": "used for", "id": "2021.emnlp-main.769", "year": 2021, "rel_sent": "UniKER : A Unified Framework for Combining Embedding and Definite Horn Rule Reasoning for Knowledge Graph Inference.", "forward": false, "src_ids": "2021.emnlp-main.769_13038"}
{"input": "essay length is used for Method| context: previous work has shown that automated essay scoring systems , in particular machine learning - based systems , are not capable of assessing the quality of essays , but are relying on essay length , a factor irrelevant to writing proficiency .", "entity": "essay length", "output": "neural essay scoring systems", "neg_sample": ["essay length is used for Method", "previous work has shown that automated essay scoring systems , in particular machine learning - based systems , are not capable of assessing the quality of essays , but are relying on essay length , a factor irrelevant to writing proficiency ."], "relation": "used for", "id": "2021.sustainlp-1.4", "year": 2021, "rel_sent": "Countering the Influence of Essay Length in Neural Essay Scoring.", "forward": true, "src_ids": "2021.sustainlp-1.4_1978"}
{"input": "pre - training process is used for OtherScientificTerm| context: we notice that about 30 % of reviews do not contain obvious opinion words , but still convey clear human - aware sentiment orientation , which is known as implicit sentiment . however , recent neural network - based approaches paid little attention to implicit sentiment entailed in the reviews .", "entity": "pre - training process", "output": "implicit and explicit sentiment orientation", "neg_sample": ["pre - training process is used for OtherScientificTerm", "we notice that about 30 % of reviews do not contain obvious opinion words , but still convey clear human - aware sentiment orientation , which is known as implicit sentiment .", "however , recent neural network - based approaches paid little attention to implicit sentiment entailed in the reviews ."], "relation": "used for", "id": "2021.emnlp-main.22", "year": 2021, "rel_sent": "By aligning the representation of implicit sentiment expressions to those with the same sentiment label , the pre - training process leads to better capture of both implicit and explicit sentiment orientation towards aspects in reviews .", "forward": true, "src_ids": "2021.emnlp-main.22_15139"}
{"input": "cross - domain stance detection is done by using Material| context: stance detection determines whether the author of a text is in favor of , against or neutral to a specific target and provides valuable insights into important events such as presidential election . however , progress on stance detection has been hampered by the absence of large annotated datasets .", "entity": "cross - domain stance detection", "output": "pstance dataset", "neg_sample": ["cross - domain stance detection is done by using Material", "stance detection determines whether the author of a text is in favor of , against or neutral to a specific target and provides valuable insights into important events such as presidential election .", "however , progress on stance detection has been hampered by the absence of large annotated datasets ."], "relation": "used for", "id": "2021.findings-acl.208", "year": 2021, "rel_sent": "Moreover , our PSTANCE dataset can facilitate research in the fields of cross - domain stance detection such as cross - target stance detection where a classifier is adapted from a different but related target .", "forward": false, "src_ids": "2021.findings-acl.208_2332"}
{"input": "adaptive information seeking is used for Task| context: recently , iterative approaches have been proven to be effective for complex questions , by recursively retrieving new evidence at each step . however , almost all existing iterative approaches use predefined strategies , either applying the same retrieval function multiple times or fixing the order of different retrieval functions , which can not fulfill the diverse requirements of various questions .", "entity": "adaptive information seeking", "output": "open - domain question answering", "neg_sample": ["adaptive information seeking is used for Task", "recently , iterative approaches have been proven to be effective for complex questions , by recursively retrieving new evidence at each step .", "however , almost all existing iterative approaches use predefined strategies , either applying the same retrieval function multiple times or fixing the order of different retrieval functions , which can not fulfill the diverse requirements of various questions ."], "relation": "used for", "id": "2021.emnlp-main.293", "year": 2021, "rel_sent": "Adaptive Information Seeking for Open - Domain Question Answering.", "forward": true, "src_ids": "2021.emnlp-main.293_4916"}
{"input": "descriptiveness score is done by using Method| context: generating descriptive sentences that convey non - trivial , detailed , and salient information about images is an important goal of image captioning .", "entity": "descriptiveness score", "output": "pagerank algorithm", "neg_sample": ["descriptiveness score is done by using Method", "generating descriptive sentences that convey non - trivial , detailed , and salient information about images is an important goal of image captioning ."], "relation": "used for", "id": "2021.acl-short.36", "year": 2021, "rel_sent": "A PageRank algorithm is then employed to estimate the descriptiveness score of each node .", "forward": false, "src_ids": "2021.acl-short.36_10664"}
{"input": "shrinking quality is done by using OtherScientificTerm| context: end - to - end simultaneous speech translation ( sst ) , which directly translates speech in one language into text in another language in realtime , is useful in many scenarios but has not been fully investigated .", "entity": "shrinking quality", "output": "blank penalty", "neg_sample": ["shrinking quality is done by using OtherScientificTerm", "end - to - end simultaneous speech translation ( sst ) , which directly translates speech in one language into text in another language in realtime , is useful in many scenarios but has not been fully investigated ."], "relation": "used for", "id": "2021.findings-acl.218", "year": 2021, "rel_sent": "Besides , to improve the model performance in simultaneous scenarios , we propose a blank penalty to enhance the shrinking quality and a Wait - K - Stride - N strategy to allow local reranking during decoding .", "forward": false, "src_ids": "2021.findings-acl.218_8819"}
{"input": "dialogue systems is done by using Method| context: for voice assistants like alexa , google assistant , and siri , correctly interpreting users ' intentions is of utmost importance . however , users sometimes experience friction with these assistants , caused by errors from different system components or user errors such as slips of the tongue . users tend to rephrase their queries until they get a satisfactory response . rephrase detection is used to identify the rephrases and has long been treated as a task with pairwise input , which does not fully utilize the contextual information ( e.g. users ' implicit feedback ) .", "entity": "dialogue systems", "output": "contextual rephrase detection model", "neg_sample": ["dialogue systems is done by using Method", "for voice assistants like alexa , google assistant , and siri , correctly interpreting users ' intentions is of utmost importance .", "however , users sometimes experience friction with these assistants , caused by errors from different system components or user errors such as slips of the tongue .", "users tend to rephrase their queries until they get a satisfactory response .", "rephrase detection is used to identify the rephrases and has long been treated as a task with pairwise input , which does not fully utilize the contextual information ( e.g.", "users ' implicit feedback ) ."], "relation": "used for", "id": "2021.emnlp-main.143", "year": 2021, "rel_sent": "Contextual Rephrase Detection for Reducing Friction in Dialogue Systems.", "forward": false, "src_ids": "2021.emnlp-main.143_353"}
{"input": "multimodal ner is done by using Material| context: multimodal named entity recognition ( mner ) requires to bridge the gap between language understanding and visual context .", "entity": "multimodal ner", "output": "images", "neg_sample": ["multimodal ner is done by using Material", "multimodal named entity recognition ( mner ) requires to bridge the gap between language understanding and visual context ."], "relation": "used for", "id": "2021.wnut-1.11", "year": 2021, "rel_sent": "Can images help recognize entities ? A study of the role of images for Multimodal NER.", "forward": false, "src_ids": "2021.wnut-1.11_11019"}
{"input": "dialog acts is used for Task| context: growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation . however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation .", "entity": "dialog acts", "output": "conversational recommendation", "neg_sample": ["dialog acts is used for Task", "growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation .", "however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation ."], "relation": "used for", "id": "2021.emnlp-main.139", "year": 2021, "rel_sent": "CR - Walker : Tree - Structured Graph Reasoning and Dialog Acts for Conversational Recommendation.", "forward": true, "src_ids": "2021.emnlp-main.139_2629"}
{"input": "neural graph - parser based edge predictor is done by using OtherScientificTerm| context: english treebanks for enhanced ud have been created from gold basic dependencies using a heuristic rule - based converter , which propagates only core arguments .", "entity": "neural graph - parser based edge predictor", "output": "automatic parses", "neg_sample": ["neural graph - parser based edge predictor is done by using OtherScientificTerm", "english treebanks for enhanced ud have been created from gold basic dependencies using a heuristic rule - based converter , which propagates only core arguments ."], "relation": "used for", "id": "2021.eacl-main.67", "year": 2021, "rel_sent": "When using automatic parses , our neural graph - parser based edge predictor outperforms the currently predominant pipelines using a basic - layer tree parser plus converters .", "forward": false, "src_ids": "2021.eacl-main.67_11426"}
{"input": "table - to - text generation is done by using Material| context: this task is very important in many situations , such as changing some conditions , consequences , or properties in a legal document , or changing some key information of an event in a news text . this is very challenging , as it is hard to obtain a parallel corpus for training , and we need tofirst find all text positions that should be changed and then decide how to change them .", "entity": "table - to - text generation", "output": "wikibio", "neg_sample": ["table - to - text generation is done by using Material", "this task is very important in many situations , such as changing some conditions , consequences , or properties in a legal document , or changing some key information of an event in a news text .", "this is very challenging , as it is hard to obtain a parallel corpus for training , and we need tofirst find all text positions that should be changed and then decide how to change them ."], "relation": "used for", "id": "2021.findings-acl.110", "year": 2021, "rel_sent": "We constructed the new dataset WIKIBIOCTE for this task based on the existing dataset WIKIBIO ( originally created for table - to - text generation ) .", "forward": false, "src_ids": "2021.findings-acl.110_13356"}
{"input": "transfer learning is used for Method| context: tables are widely used in various kinds of documents to present information concisely . understanding tables is a challenging problem that requires an understanding of language and table structure , along with numerical and logical reasoning .", "entity": "transfer learning", "output": "tapas", "neg_sample": ["transfer learning is used for Method", "tables are widely used in various kinds of documents to present information concisely .", "understanding tables is a challenging problem that requires an understanding of language and table structure , along with numerical and logical reasoning ."], "relation": "used for", "id": "2021.semeval-1.180", "year": 2021, "rel_sent": "In subtask A , we evaluate how transfer learning and standardizing tables to have a single header row improves TAPAS ' performance .", "forward": true, "src_ids": "2021.semeval-1.180_9037"}
{"input": "insufficient training data is done by using Method| context: however , only qualitative analysis and ablation study are provided as evidence .", "entity": "insufficient training data", "output": "attention mechanism", "neg_sample": ["insufficient training data is done by using Method", "however , only qualitative analysis and ablation study are provided as evidence ."], "relation": "used for", "id": "2021.acl-long.359", "year": 2021, "rel_sent": "We find that ( 1 ) higher attention accuracy may lead to worse performance as it may harm the model 's ability to extract entity mention features ; ( 2 ) the performance of attention is largely influenced by various noise distribution patterns , which is closely related to real - world datasets ; ( 3 ) KG - enhanced attention indeed improves RE performance , while not through enhanced attention but by incorporating entity prior ; and ( 4 ) attention mechanism may exacerbate the issue of insufficient training data .", "forward": false, "src_ids": "2021.acl-long.359_1373"}
{"input": "personalized speech synthesis system is done by using Method| context: in recent years , speech synthesis system can generate speech with high speech quality . however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker .", "entity": "personalized speech synthesis system", "output": "post - filter network", "neg_sample": ["personalized speech synthesis system is done by using Method", "in recent years , speech synthesis system can generate speech with high speech quality .", "however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker ."], "relation": "used for", "id": "2021.ijclclp-2.4", "year": 2021, "rel_sent": "Incorporating Speaker Embedding and Post - Filter Network for Improving Speaker Similarity of Personalized Speech Synthesis System.", "forward": false, "src_ids": "2021.ijclclp-2.4_7079"}
{"input": "basic - level categories is done by using Generic| context: basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al . ( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life . rosch 's original algorithm for detecting blc ( called cue - validity ) is based on the availability of semantic features such as ' has tail ' for ' cat ' , and has remained untested at large . an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis .", "entity": "basic - level categories", "output": "indicator", "neg_sample": ["basic - level categories is done by using Generic", "basic - level categories ( blc ) are an important psycholinguistic concept introduced by rosch et al .", "( 1976 ) ; they are defined as the most inclusive categories for which a concrete mental image of the category as a whole can be formed , and also as those categories which are acquired early in life .", "rosch 's original algorithm for detecting blc ( called cue - validity ) is based on the availability of semantic features such as ' has tail ' for ' cat ' , and has remained untested at large .", "an at - scale algorithm for the automatic determination of blc exists , but it operates without rosch - style semantic features , and is thus unable to verify rosch 's hypothesis ."], "relation": "used for", "id": "2021.emnlp-main.654", "year": 2021, "rel_sent": "As well as confirming the usefulness of Rosch 's cue validity algorithm , we also developed and evaluated our own new indicator for BLC , which models the fact that BLC features tend to be BLC themselves .", "forward": false, "src_ids": "2021.emnlp-main.654_11108"}
{"input": "multi - label text classification is done by using Method| context: multi - label text classification is a challenging task because it requires capturing label dependencies . it becomes even more challenging when class distribution is long - tailed . resampling and re - weighting are common approaches used for addressing the class imbalance problem , however , they are not effective when there is label dependency besides class imbalance because they result in oversampling of common labels .", "entity": "multi - label text classification", "output": "balancing methods", "neg_sample": ["multi - label text classification is done by using Method", "multi - label text classification is a challenging task because it requires capturing label dependencies .", "it becomes even more challenging when class distribution is long - tailed .", "resampling and re - weighting are common approaches used for addressing the class imbalance problem , however , they are not effective when there is label dependency besides class imbalance because they result in oversampling of common labels ."], "relation": "used for", "id": "2021.emnlp-main.643", "year": 2021, "rel_sent": "Balancing Methods for Multi - label Text Classification with Long - Tailed Class Distribution.", "forward": false, "src_ids": "2021.emnlp-main.643_9267"}
{"input": "xslue is used for Task| context: every natural text is written in some style . style is formed by a complex combination of different stylistic factors , including formality markers , emotions , metaphors , etc . one can not form a complete understanding of a text without considering these factors . the factors combine and co - vary in complex ways toform styles . studying the nature of the covarying combinations sheds light on stylistic language in general , sometimes called cross - style language understanding .", "entity": "xslue", "output": "cross - style applications", "neg_sample": ["xslue is used for Task", "every natural text is written in some style .", "style is formed by a complex combination of different stylistic factors , including formality markers , emotions , metaphors , etc .", "one can not form a complete understanding of a text without considering these factors .", "the factors combine and co - vary in complex ways toform styles .", "studying the nature of the covarying combinations sheds light on stylistic language in general , sometimes called cross - style language understanding ."], "relation": "used for", "id": "2021.acl-long.185", "year": 2021, "rel_sent": "Using XSLUE , we propose three interesting cross - style applications in classification , correlation , and generation .", "forward": true, "src_ids": "2021.acl-long.185_11604"}
{"input": "bi - direction interplay is done by using Method| context: multimodal sentiment analysis is the challenging research area that attends to the fusion of multiple heterogeneous modalities . the main challenge is the occurrence of some missing modalities during the multimodal fusion procedure . however , the existing techniques require all modalities as input , thus are sensitive to missing modalities at predicting time .", "entity": "bi - direction interplay", "output": "coupled - translation fusion network ( ctfn )", "neg_sample": ["bi - direction interplay is done by using Method", "multimodal sentiment analysis is the challenging research area that attends to the fusion of multiple heterogeneous modalities .", "the main challenge is the occurrence of some missing modalities during the multimodal fusion procedure .", "however , the existing techniques require all modalities as input , thus are sensitive to missing modalities at predicting time ."], "relation": "used for", "id": "2021.acl-long.412", "year": 2021, "rel_sent": "In this work , the coupled - translation fusion network ( CTFN ) is firstly proposed to model bi - direction interplay via couple learning , ensuring the robustness in respect to missing modalities .", "forward": false, "src_ids": "2021.acl-long.412_15614"}
{"input": "cbt interactions is done by using Task| context: one of the key ideas of cognitive behavioural therapy ( cbt ) is the ability to convert negative or distorted thoughts into more realistic alternatives . although modern machine learning techniques can be successfully applied to a variety of natural language processing tasks , including cognitive behavioural therapy , the lack of a publicly available dataset makes supervised training difficult for tasks such as reforming distorted thoughts .", "entity": "cbt interactions", "output": "automated responses", "neg_sample": ["cbt interactions is done by using Task", "one of the key ideas of cognitive behavioural therapy ( cbt ) is the ability to convert negative or distorted thoughts into more realistic alternatives .", "although modern machine learning techniques can be successfully applied to a variety of natural language processing tasks , including cognitive behavioural therapy , the lack of a publicly available dataset makes supervised training difficult for tasks such as reforming distorted thoughts ."], "relation": "used for", "id": "2021.icnlsp-1.13", "year": 2021, "rel_sent": "Formulating Automated Responses to Cognitive Distortions for CBT Interactions.", "forward": false, "src_ids": "2021.icnlsp-1.13_3089"}
{"input": "anomaly detection is done by using OtherScientificTerm| context: while sentence anomalies have been applied periodically for testing in nlp , we have yet to establish a picture of the precise status of anomaly information in representations from nlp models .", "entity": "anomaly detection", "output": "coarser - grained word position information", "neg_sample": ["anomaly detection is done by using OtherScientificTerm", "while sentence anomalies have been applied periodically for testing in nlp , we have yet to establish a picture of the precise status of anomaly information in representations from nlp models ."], "relation": "used for", "id": "2021.blackboxnlp-1.18", "year": 2021, "rel_sent": "Follow - up analyses support the notion that these models pick up on a legitimate , general notion of sentence oddity , while coarser - grained word position information is likely also a contributor to the observed anomaly detection .", "forward": false, "src_ids": "2021.blackboxnlp-1.18_5205"}
{"input": "parallel sentences is done by using Method| context: the cross - lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences .", "entity": "parallel sentences", "output": "self - labeled word alignment", "neg_sample": ["parallel sentences is done by using Method", "the cross - lingual language models are typically pretrained with masked language modeling on multilingual text or parallel sentences ."], "relation": "used for", "id": "2021.acl-long.265", "year": 2021, "rel_sent": "Specifically , the model first self - label word alignments for parallel sentences .", "forward": false, "src_ids": "2021.acl-long.265_7370"}
{"input": "model introspection is done by using Material| context: recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answers the given question . typically , these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained .", "entity": "model introspection", "output": "synthetic datasets", "neg_sample": ["model introspection is done by using Material", "recent question answering and machine reading benchmarks frequently reduce the task to one of pinpointing spans within a certain text passage that answers the given question .", "typically , these systems are not required to actually understand the text on a deeper level that allows for more complex reasoning on the information contained ."], "relation": "used for", "id": "2021.starsem-1.10", "year": 2021, "rel_sent": "We demonstrate how these synthetic datasets align structured knowledge with natural text and aid model introspection when approaching complex text understanding .", "forward": false, "src_ids": "2021.starsem-1.10_10865"}
{"input": "pathological visual question answering framework is used for Material| context: pathology imaging is broadly used for identifying the causes and effects of diseases or injuries .", "entity": "pathological visual question answering framework", "output": "pathology images", "neg_sample": ["pathological visual question answering framework is used for Material", "pathology imaging is broadly used for identifying the causes and effects of diseases or injuries ."], "relation": "used for", "id": "2021.acl-short.90", "year": 2021, "rel_sent": "In this paper , we aim to develop a pathological visual question answering framework to analyze pathology images and answer medical questions related to these images .", "forward": true, "src_ids": "2021.acl-short.90_5240"}
{"input": "bert is used for OtherScientificTerm| context: we present two novel unsupervised methods for eliminating toxicity in text .", "entity": "bert", "output": "toxic words", "neg_sample": ["bert is used for OtherScientificTerm", "we present two novel unsupervised methods for eliminating toxicity in text ."], "relation": "used for", "id": "2021.emnlp-main.629", "year": 2021, "rel_sent": "Our second method uses BERT to replace toxic words with their non - offensive synonyms .", "forward": true, "src_ids": "2021.emnlp-main.629_12206"}
{"input": "coherent and informative responses is done by using Material| context: existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items .", "entity": "coherent and informative responses", "output": "reviews", "neg_sample": ["coherent and informative responses is done by using Material", "existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items ."], "relation": "used for", "id": "2021.findings-acl.99", "year": 2021, "rel_sent": "In this paper , we design a novel end - to - end framework , namely , Review - augmented Conversational Recommender ( RevCore ) , where reviews are seamlessly incorporated to enrich item information and assist in generating both coherent and informative responses .", "forward": false, "src_ids": "2021.findings-acl.99_1154"}
{"input": "abstractive summarization is done by using Method| context: neural abstractive summarization systems have gained significant progress in recent years . however , abstractive summarization often produce inconsisitent statements or false facts . how to automatically generate highly abstract yet factually correct summaries ?", "entity": "abstractive summarization", "output": "gradient - based adversarial factual consistency evaluation", "neg_sample": ["abstractive summarization is done by using Method", "neural abstractive summarization systems have gained significant progress in recent years .", "however , abstractive summarization often produce inconsisitent statements or false facts .", "how to automatically generate highly abstract yet factually correct summaries ?"], "relation": "used for", "id": "2021.emnlp-main.337", "year": 2021, "rel_sent": "Gradient - Based Adversarial Factual Consistency Evaluation for Abstractive Summarization.", "forward": false, "src_ids": "2021.emnlp-main.337_13394"}
{"input": "metaphor detection task is used for Material| context: metaphors are ubiquitous in human language .", "entity": "metaphor detection task", "output": "md datasets", "neg_sample": ["metaphor detection task is used for Material", "metaphors are ubiquitous in human language ."], "relation": "used for", "id": "2021.unimplicit-1.5", "year": 2021, "rel_sent": "This approach could be applied to other existing MD datasets as well , since the metaphoricity annotations in these benchmark datasets may be outdated .", "forward": true, "src_ids": "2021.unimplicit-1.5_11932"}
{"input": "wmt 2021 news translation and biomedical translation tasks is done by using Method| context: we focused on low - resource pairs , using a simple system .", "entity": "wmt 2021 news translation and biomedical translation tasks", "output": "fujitsu dmath systems", "neg_sample": ["wmt 2021 news translation and biomedical translation tasks is done by using Method", "we focused on low - resource pairs , using a simple system ."], "relation": "used for", "id": "2021.wmt-1.13", "year": 2021, "rel_sent": "This paper describes the Fujitsu DMATH systems used for WMT 2021 News Translation and Biomedical Translation tasks .", "forward": false, "src_ids": "2021.wmt-1.13_10592"}
{"input": "wordnets is done by using Method| context: this paper introduces wn , a new python library for working with wordnets . unlike previous libraries , wn is built from the beginning to accommodate multiple wordnets - for multiple languages or multiple versions of the same wordnet - while retaining the ability to query and traverse them independently . it is also able to download and incorporate wordnets published online .", "entity": "wordnets", "output": "wn python library", "neg_sample": ["wordnets is done by using Method", "this paper introduces wn , a new python library for working with wordnets .", "unlike previous libraries , wn is built from the beginning to accommodate multiple wordnets - for multiple languages or multiple versions of the same wordnet - while retaining the ability to query and traverse them independently .", "it is also able to download and incorporate wordnets published online ."], "relation": "used for", "id": "2021.gwc-1.12", "year": 2021, "rel_sent": "Intrinsically Interlingual : The Wn Python Library for Wordnets.", "forward": false, "src_ids": "2021.gwc-1.12_2503"}
{"input": "dialogpt is used for Material| context: current dialogue summarization systems usually encode the text with a number of general semantic features ( e.g. , keywords and topics ) to gain more powerful dialogue modeling capabilities . however , these features are obtained via open - domain toolkits that are dialog - agnostic or heavily relied on human annotations .", "entity": "dialogpt", "output": "dialogue summarization datasets", "neg_sample": ["dialogpt is used for Material", "current dialogue summarization systems usually encode the text with a number of general semantic features ( e.g.", ", keywords and topics ) to gain more powerful dialogue modeling capabilities .", "however , these features are obtained via open - domain toolkits that are dialog - agnostic or heavily relied on human annotations ."], "relation": "used for", "id": "2021.acl-long.117", "year": 2021, "rel_sent": "We apply DialoGPT to label three types of features on two dialogue summarization datasets , SAMSum and AMI , and employ pre - trained and non pre - trained models as our summarizers .", "forward": true, "src_ids": "2021.acl-long.117_7593"}
{"input": "natural language processing is done by using Generic| context: in recent years , the nlp community has shown increasing interest in analysing how deep learning models work .", "entity": "natural language processing", "output": "controlled tasks", "neg_sample": ["natural language processing is done by using Generic", "in recent years , the nlp community has shown increasing interest in analysing how deep learning models work ."], "relation": "used for", "id": "2021.blackboxnlp-1.37", "year": 2021, "rel_sent": "We propose a new set of such controlled tasks to explore a crucial aspect of natural language processing that has not received enough attention : the need to retrieve discrete information from sequences .", "forward": false, "src_ids": "2021.blackboxnlp-1.37_13148"}
{"input": "random forest classifier is used for OtherScientificTerm| context: the automatic recognition of idioms poses a challenging problem for nlp applications . whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics , there is still ample scope for improvement regarding computational approaches .", "entity": "random forest classifier", "output": "features", "neg_sample": ["random forest classifier is used for OtherScientificTerm", "the automatic recognition of idioms poses a challenging problem for nlp applications .", "whereas native speakers can intuitively handle multiword expressions whose compositional meanings are hard to trace back to individual word semantics , there is still ample scope for improvement regarding computational approaches ."], "relation": "used for", "id": "2021.mwe-1.3", "year": 2021, "rel_sent": "To this end , we apply a Random Forest classifier to analyze the individual contribution of features for automatically detecting idioms , and study the trade - off between recall and precision .", "forward": true, "src_ids": "2021.mwe-1.3_11738"}
{"input": "images is used for OtherScientificTerm| context: multimodal named entity recognition ( mner ) requires to bridge the gap between language understanding and visual context . while many multimodal neural techniques have been proposed to incorporate images into the mner task , the model 's ability to leverage multimodal interactions remains poorly understood .", "entity": "images", "output": "entities", "neg_sample": ["images is used for OtherScientificTerm", "multimodal named entity recognition ( mner ) requires to bridge the gap between language understanding and visual context .", "while many multimodal neural techniques have been proposed to incorporate images into the mner task , the model 's ability to leverage multimodal interactions remains poorly understood ."], "relation": "used for", "id": "2021.wnut-1.11", "year": 2021, "rel_sent": "Can images help recognize entities ? A study of the role of images for Multimodal NER.", "forward": true, "src_ids": "2021.wnut-1.11_11020"}
{"input": "continuous entailment patterns is used for Task| context: combining a pretrained language model ( plm ) with textual patterns has been shown to help in both zero- and few - shot settings . for zero - shot performance , it makes sense to design patterns that closely resemble the text seen during self - supervised pretraining because the model has never seen anything else . supervised training allows for more flexibility .", "entity": "continuous entailment patterns", "output": "lexical inference", "neg_sample": ["continuous entailment patterns is used for Task", "combining a pretrained language model ( plm ) with textual patterns has been shown to help in both zero- and few - shot settings .", "for zero - shot performance , it makes sense to design patterns that closely resemble the text seen during self - supervised pretraining because the model has never seen anything else .", "supervised training allows for more flexibility ."], "relation": "used for", "id": "2021.emnlp-main.556", "year": 2021, "rel_sent": "Continuous Entailment Patterns for Lexical Inference in Context.", "forward": true, "src_ids": "2021.emnlp-main.556_7260"}
{"input": "topic - enriched news representation is used for Method| context: nowadays , fake news detection , which aims to verify whether a news document is trusted or fake , has become urgent and important . most existing methods rely heavily on linguistic and semantic features from the news content , and fail to effectively exploit external knowledge which could help determine whether the news document is trusted .", "entity": "topic - enriched news representation", "output": "fake news classifier", "neg_sample": ["topic - enriched news representation is used for Method", "nowadays , fake news detection , which aims to verify whether a news document is trusted or fake , has become urgent and important .", "most existing methods rely heavily on linguistic and semantic features from the news content , and fail to effectively exploit external knowledge which could help determine whether the news document is trusted ."], "relation": "used for", "id": "2021.acl-long.62", "year": 2021, "rel_sent": "Finally , the topic - enriched news representation combining the entity comparison features is fed into a fake news classifier .", "forward": true, "src_ids": "2021.acl-long.62_7188"}
{"input": "salient content is done by using OtherScientificTerm| context: a crucial difference between single- and multi - document summarization is how salient content manifests itself in the document(s ) . while such content may appear at the beginning of a single document , essential information is frequently reiterated in a set of documents related to a particular topic , resulting in an endorsement effect that increases information salience .", "entity": "salient content", "output": "synopsis", "neg_sample": ["salient content is done by using OtherScientificTerm", "a crucial difference between single- and multi - document summarization is how salient content manifests itself in the document(s ) .", "while such content may appear at the beginning of a single document , essential information is frequently reiterated in a set of documents related to a particular topic , resulting in an endorsement effect that increases information salience ."], "relation": "used for", "id": "2021.newsum-1.13", "year": 2021, "rel_sent": "Our method generates a synopsis from each document , which serves as an endorser to identify salient content from other documents .", "forward": false, "src_ids": "2021.newsum-1.13_997"}
{"input": "ternary weight splitting is used for Method| context: the rapid development of large pre - trained language models has greatly increased the demand for model compression techniques , among which quantization is a popular solution . we find that a binary bert is hard to be trained directly than a ternary counterpart due to its complex and irregular loss landscape .", "entity": "ternary weight splitting", "output": "binarybert", "neg_sample": ["ternary weight splitting is used for Method", "the rapid development of large pre - trained language models has greatly increased the demand for model compression techniques , among which quantization is a popular solution .", "we find that a binary bert is hard to be trained directly than a ternary counterpart due to its complex and irregular loss landscape ."], "relation": "used for", "id": "2021.acl-long.334", "year": 2021, "rel_sent": "Therefore , we propose ternary weight splitting , which initializes BinaryBERT by equivalently splitting from a half - sized ternary network .", "forward": true, "src_ids": "2021.acl-long.334_12104"}
{"input": "vqa models is done by using Method| context: visual question answering ( vqa ) models , in particular modular ones , are commonly trained on large - scale datasets to achieve state - of - the - art performance . however , such datasets are sometimes not available . further , it has been shown that training these models on small datasets significantly reduces their accuracy .", "entity": "vqa models", "output": "curriculum - based learning ( cl ) regime", "neg_sample": ["vqa models is done by using Method", "visual question answering ( vqa ) models , in particular modular ones , are commonly trained on large - scale datasets to achieve state - of - the - art performance .", "however , such datasets are sometimes not available .", "further , it has been shown that training these models on small datasets significantly reduces their accuracy ."], "relation": "used for", "id": "2021.alta-1.3", "year": 2021, "rel_sent": "In this paper , we propose curriculum - based learning ( CL ) regime to increase the accuracy of VQA models trained on small datasets .", "forward": false, "src_ids": "2021.alta-1.3_8264"}
{"input": "masking is used for Task| context: hyperpartisan news show an extreme manipulation of reality based on an underlying and extreme ideological orientation . because of its harmful effects at reinforcing one 's bias and the posterior behavior of people , hyperpartisan news detection has become an important task for computational linguists .", "entity": "masking", "output": "hyperpartisanship detection", "neg_sample": ["masking is used for Task", "hyperpartisan news show an extreme manipulation of reality based on an underlying and extreme ideological orientation .", "because of its harmful effects at reinforcing one 's bias and the posterior behavior of people , hyperpartisan news detection has become an important task for computational linguists ."], "relation": "used for", "id": "2021.ranlp-1.140", "year": 2021, "rel_sent": "Masking and Transformer - based Models for Hyperpartisanship Detection in News.", "forward": true, "src_ids": "2021.ranlp-1.140_816"}
{"input": "extracting fine - grained knowledge graphs is used for Material| context: recent transformer - based approaches demonstrate promising results on relational scientific information extraction . existing datasets focus on high - level description of how research is carried out .", "entity": "extracting fine - grained knowledge graphs", "output": "scientific claims", "neg_sample": ["extracting fine - grained knowledge graphs is used for Material", "recent transformer - based approaches demonstrate promising results on relational scientific information extraction .", "existing datasets focus on high - level description of how research is carried out ."], "relation": "used for", "id": "2021.emnlp-main.381", "year": 2021, "rel_sent": "We extend work in transformer - based joint entity and relation extraction to effectively infer our schema , showing the promise of fine - grained knowledge graphs in scientific claims and beyond .", "forward": true, "src_ids": "2021.emnlp-main.381_3931"}
{"input": "encoding techniques is used for Task| context: previous work shows the efficacy of jointly scoring and selecting sentences with neural sequence generation models . it is , however , not well - understood if the gain is due to better encoding techniques or better redundancy reduction approaches . similarly , the contribution of salience versus diversity components on the created summary is not studied well .", "entity": "encoding techniques", "output": "summarization", "neg_sample": ["encoding techniques is used for Task", "previous work shows the efficacy of jointly scoring and selecting sentences with neural sequence generation models .", "it is , however , not well - understood if the gain is due to better encoding techniques or better redundancy reduction approaches .", "similarly , the contribution of salience versus diversity components on the created summary is not studied well ."], "relation": "used for", "id": "2021.eacl-main.22", "year": 2021, "rel_sent": "Building on the state - of - the - art encoding methods for summarization , we present two adaptive learning models : AREDSUM - SEQ that jointly considers salience and novelty during sentence selection ; and a two - step AREDSUM - CTX that scores salience first , then learns to balance salience and redundancy , enabling the measurement of the impact of each aspect .", "forward": true, "src_ids": "2021.eacl-main.22_4854"}
{"input": "scaffolding loss is used for Task| context: recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains . however , at the same time , this can require a large amount of annotated target examples .", "entity": "scaffolding loss", "output": "recovery of knowledge", "neg_sample": ["scaffolding loss is used for Task", "recent work has shown fine - tuning neural coreference models can produce strong performance when adapting to different domains .", "however , at the same time , this can require a large amount of annotated target examples ."], "relation": "used for", "id": "2021.crac-1.13", "year": 2021, "rel_sent": "We develop methods to improve the span representations via ( 1 ) a retrofitting loss to incentivize span representations to satisfy a knowledge - based distance function and ( 2 ) a scaffolding loss to guide the recovery of knowledge from the span representation .", "forward": true, "src_ids": "2021.crac-1.13_12333"}
{"input": "multiwoz dataset is used for Task| context: one of the difficulties in training dialogue systems is the lack of training data .", "entity": "multiwoz dataset", "output": "transfer learning problems", "neg_sample": ["multiwoz dataset is used for Task", "one of the difficulties in training dialogue systems is the lack of training data ."], "relation": "used for", "id": "2021.acl-long.13", "year": 2021, "rel_sent": "In experiments on the MultiWOZ dataset , two practical transfer learning problems are investigated : 1 ) domain adaptation and 2 ) single - to - multiple domain transfer .", "forward": true, "src_ids": "2021.acl-long.13_10119"}
{"input": "data augmentation methods is used for OtherScientificTerm| context: the existing resources for studying anaphoric zero pronoun interpretation are however still limited .", "entity": "data augmentation methods", "output": "anaphoric zero pronoun systems", "neg_sample": ["data augmentation methods is used for OtherScientificTerm", "the existing resources for studying anaphoric zero pronoun interpretation are however still limited ."], "relation": "used for", "id": "2021.crac-1.9", "year": 2021, "rel_sent": "Data Augmentation Methods for Anaphoric Zero Pronouns.", "forward": true, "src_ids": "2021.crac-1.9_15382"}
{"input": "synthetic text is done by using Method| context: there are concerns that the ability of language models ( lms ) to generate high quality synthetic text can be misused to launch spam , disinformation , or propaganda . therefore , the research community is actively working on developing approaches to detect whether a given text is organic or synthetic . while this is a useful first step , it is important to be able tofurther fingerprint the author lm to attribute its origin . prior work on fingerprinting lms is limited to attributing synthetic text generated by a handful ( usually < 10 ) of pre - trained lms . however , lms such as gpt2 are commonly fine - tuned in a myriad of ways ( e.g. , on a domain - specific text corpus ) before being used to generate synthetic text . it is challenging tofingerprinting fine - tuned lms because the universe of fine - tuned lms is much larger in realistic scenarios .", "entity": "synthetic text", "output": "fine - tuning", "neg_sample": ["synthetic text is done by using Method", "there are concerns that the ability of language models ( lms ) to generate high quality synthetic text can be misused to launch spam , disinformation , or propaganda .", "therefore , the research community is actively working on developing approaches to detect whether a given text is organic or synthetic .", "while this is a useful first step , it is important to be able tofurther fingerprint the author lm to attribute its origin .", "prior work on fingerprinting lms is limited to attributing synthetic text generated by a handful ( usually < 10 ) of pre - trained lms .", "however , lms such as gpt2 are commonly fine - tuned in a myriad of ways ( e.g.", ", on a domain - specific text corpus ) before being used to generate synthetic text .", "it is challenging tofingerprinting fine - tuned lms because the universe of fine - tuned lms is much larger in realistic scenarios ."], "relation": "used for", "id": "2021.findings-acl.409", "year": 2021, "rel_sent": "Our results show that fine - tuning itself is the most effective in attributing the synthetic text generated by fine - tuned LMs .", "forward": false, "src_ids": "2021.findings-acl.409_1986"}
{"input": "fine - tuning classification models is done by using Method| context: supervised models can achieve very high accuracy for fine - grained text classification . in practice , however , training data may be abundant for some types but scarce or even non - existent for others .", "entity": "fine - tuning classification models", "output": "hybrid architecture", "neg_sample": ["fine - tuning classification models is done by using Method", "supervised models can achieve very high accuracy for fine - grained text classification .", "in practice , however , training data may be abundant for some types but scarce or even non - existent for others ."], "relation": "used for", "id": "2021.case-1.24", "year": 2021, "rel_sent": "We propose a hybrid architecture that uses as much labeled data as available for fine - tuning classification models , while also allowing for types with little ( few - shot ) or no ( zero - shot ) labeled data .", "forward": false, "src_ids": "2021.case-1.24_13866"}
{"input": "transformer - based autoregressive language model is used for OtherScientificTerm| context: previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance .", "entity": "transformer - based autoregressive language model", "output": "rhythms", "neg_sample": ["transformer - based autoregressive language model is used for OtherScientificTerm", "previous works for rap generation focused on rhyming lyrics , but ignored rhythmic beats , which are important for rap performance ."], "relation": "used for", "id": "2021.acl-long.6", "year": 2021, "rel_sent": "Second , we design a Transformer - based autoregressive language model which carefully models rhymes and rhythms .", "forward": true, "src_ids": "2021.acl-long.6_4591"}
{"input": "backtranslation feedback is used for Task| context: translating text into a language unknown to the text 's author , dubbed outbound translation , is a modern need for which the user experience has significant room for improvement , beyond the basic machine translation facility .", "entity": "backtranslation feedback", "output": "mt", "neg_sample": ["backtranslation feedback is used for Task", "translating text into a language unknown to the text 's author , dubbed outbound translation , is a modern need for which the user experience has significant room for improvement , beyond the basic machine translation facility ."], "relation": "used for", "id": "2021.naacl-main.14", "year": 2021, "rel_sent": "Backtranslation Feedback Improves User Confidence in MT , Not Quality.", "forward": true, "src_ids": "2021.naacl-main.14_4685"}
{"input": "text generation is done by using Material| context: data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures .", "entity": "text generation", "output": "open - domain structured data record", "neg_sample": ["text generation is done by using Material", "data - to - text annotations can be a costly process , especially when dealing with tables which are the major source of structured data and contain nontrivial structures ."], "relation": "used for", "id": "2021.naacl-main.37", "year": 2021, "rel_sent": "DART : Open - Domain Structured Data Record to Text Generation.", "forward": false, "src_ids": "2021.naacl-main.37_670"}
{"input": "multitask architectures is used for OtherScientificTerm| context: discovering whether words are semantically related and identifying the specific semantic relation that holds between them is of crucial importance for automatic reasoning on text data . for that purpose , different methodologies have been proposed that either ( 1 ) tackle feature engineering , ( 2 ) fine - tune latent semantic spaces , or ( 3 ) take advantage of cognitive links between semantic relations in multitask settings .", "entity": "multitask architectures", "output": "lexico - semantic relations", "neg_sample": ["multitask architectures is used for OtherScientificTerm", "discovering whether words are semantically related and identifying the specific semantic relation that holds between them is of crucial importance for automatic reasoning on text data .", "for that purpose , different methodologies have been proposed that either ( 1 ) tackle feature engineering , ( 2 ) fine - tune latent semantic spaces , or ( 3 ) take advantage of cognitive links between semantic relations in multitask settings ."], "relation": "used for", "id": "2021.findings-acl.244", "year": 2021, "rel_sent": "In this paper , we investigate how feature engineering and multitask architectures can be improved and consequently combined to identify lexico - semantic relations .", "forward": true, "src_ids": "2021.findings-acl.244_9182"}
{"input": "regularized decoding is used for OtherScientificTerm| context: arabic diacritization is a fundamental task for arabic language processing . previous studies have demonstrated that automatically generated knowledge can be helpful to this task . however , these studies regard the auto - generated knowledge instances as gold references , which limits their effectiveness since such knowledge is not always accurate and inferior instances can lead to incorrect predictions .", "entity": "regularized decoding", "output": "noisy knowledge", "neg_sample": ["regularized decoding is used for OtherScientificTerm", "arabic diacritization is a fundamental task for arabic language processing .", "previous studies have demonstrated that automatically generated knowledge can be helpful to this task .", "however , these studies regard the auto - generated knowledge instances as gold references , which limits their effectiveness since such knowledge is not always accurate and inferior instances can lead to incorrect predictions ."], "relation": "used for", "id": "2021.acl-short.68", "year": 2021, "rel_sent": "In this paper , we propose to use regularized decoding and adversarial training to appropriately learn from such noisy knowledge for diacritization .", "forward": true, "src_ids": "2021.acl-short.68_9872"}
{"input": "feature weights is used for Task| context: however , since these corpora are typically extremely noisy , their use is fairly limited . current approaches to deal with this problem mainly focus on filtering using heuristics or single features such as language model scores or bi - lingual similarity .", "entity": "feature weights", "output": "translation", "neg_sample": ["feature weights is used for Task", "however , since these corpora are typically extremely noisy , their use is fairly limited .", "current approaches to deal with this problem mainly focus on filtering using heuristics or single features such as language model scores or bi - lingual similarity ."], "relation": "used for", "id": "2021.wmt-1.118", "year": 2021, "rel_sent": "These feature weights which are optimized directly for the task of improving translation performance , are used to score and filter sentences in the noisy corpora more effectively .", "forward": true, "src_ids": "2021.wmt-1.118_2106"}
{"input": "event prediction is done by using Task| context: event schemas encode knowledge of stereotypical structures of events and their connections . as events unfold , schemas are crucial to act as a scaffolding . previous work on event schema induction focuses either on atomic events or linear temporal event sequences , ignoring the interplay between events via arguments and argument relations .", "entity": "event prediction", "output": "complex event schema induction", "neg_sample": ["event prediction is done by using Task", "event schemas encode knowledge of stereotypical structures of events and their connections .", "as events unfold , schemas are crucial to act as a scaffolding .", "previous work on event schema induction focuses either on atomic events or linear temporal event sequences , ignoring the interplay between events via arguments and argument relations ."], "relation": "used for", "id": "2021.emnlp-main.422", "year": 2021, "rel_sent": "The Future is not One - dimensional : Complex Event Schema Induction by Graph Modeling for Event Prediction.", "forward": false, "src_ids": "2021.emnlp-main.422_11341"}
{"input": "delexicalized cross - lingual dependency parsing is used for Material| context: manually annotating a treebank is time - consuming and labor - intensive . we conduct delexicalized cross - lingual dependency parsing experiments , where we train the parser on one language and test on our target language . however , it is not clear how to determine those closely related languages .", "entity": "delexicalized cross - lingual dependency parsing", "output": "xibe", "neg_sample": ["delexicalized cross - lingual dependency parsing is used for Material", "manually annotating a treebank is time - consuming and labor - intensive .", "we conduct delexicalized cross - lingual dependency parsing experiments , where we train the parser on one language and test on our target language .", "however , it is not clear how to determine those closely related languages ."], "relation": "used for", "id": "2021.ranlp-1.182", "year": 2021, "rel_sent": "Delexicalized Cross - lingual Dependency Parsing for Xibe.", "forward": true, "src_ids": "2021.ranlp-1.182_6022"}
{"input": "task - specific lexicons is used for Material| context: we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains . recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks . while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone .", "entity": "task - specific lexicons", "output": "classification datasets", "neg_sample": ["task - specific lexicons is used for Material", "we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains .", "recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks .", "while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone ."], "relation": "used for", "id": "2021.naacl-main.404", "year": 2021, "rel_sent": "We construct cloze - like masks using task - specific lexicons for three different classification datasets and show that the majority of pretrained performance gains come from generic masks that are not associated with the lexicon .", "forward": true, "src_ids": "2021.naacl-main.404_6942"}
{"input": "bert is used for Task| context: while prior research has found that bert does contain commonsense information to some extent , there has been work showing that pre - trained models can rely on spurious associations ( e.g. , data bias ) rather than key cues in solving sentiment classification and other problems .", "entity": "bert", "output": "commonsense tasks", "neg_sample": ["bert is used for Task", "while prior research has found that bert does contain commonsense information to some extent , there has been work showing that pre - trained models can rely on spurious associations ( e.g.", ", data bias ) rather than key cues in solving sentiment classification and other problems ."], "relation": "used for", "id": "2021.findings-acl.61", "year": 2021, "rel_sent": "We quantitatively investigate the presence of structural commonsense cues in BERT when solving commonsense tasks , and the importance of such cues for the model prediction .", "forward": true, "src_ids": "2021.findings-acl.61_13921"}
{"input": "quantitative aspects of numerals is done by using OtherScientificTerm| context: numerical common sense ( ncs ) is necessary tofully understand natural language text that includes numerals . ncs is knowledge about the numerical features of objects in text , such as size , weight , or color . existing neural language models treat numerals in a text as string tokens in the same way as other words . therefore , they can not reflect the quantitative aspects of numerals in the training process , making it difficult to learn ncs .", "entity": "quantitative aspects of numerals", "output": "loss function", "neg_sample": ["quantitative aspects of numerals is done by using OtherScientificTerm", "numerical common sense ( ncs ) is necessary tofully understand natural language text that includes numerals .", "ncs is knowledge about the numerical features of objects in text , such as size , weight , or color .", "existing neural language models treat numerals in a text as string tokens in the same way as other words .", "therefore , they can not reflect the quantitative aspects of numerals in the training process , making it difficult to learn ncs ."], "relation": "used for", "id": "2021.deelio-1.14", "year": 2021, "rel_sent": "We also propose methods to reflect not only the symbolic aspect but also the quantitative aspect of numerals in the training of language models , using a loss function that depends on the magnitudes of the numerals and a regression model for the masked numeral prediction task .", "forward": false, "src_ids": "2021.deelio-1.14_6489"}
{"input": "uncertainty of neural rankers is done by using Task| context: according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval . the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty . we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers .", "entity": "uncertainty of neural rankers", "output": "conversational search", "neg_sample": ["uncertainty of neural rankers is done by using Task", "according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval .", "the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty .", "we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers ."], "relation": "used for", "id": "2021.eacl-main.12", "year": 2021, "rel_sent": "Then , motivated by our findings we use two techniques to model the uncertainty of neural rankers leading to the proposed stochastic rankers , which output a predictive distribution of relevance as opposed to point estimates .", "forward": false, "src_ids": "2021.eacl-main.12_8013"}
{"input": "prosodic features is used for Method| context: a prerequisite for the computational study of literature is the availability of properly digitized texts , ideally with reliable meta - data and ground - truth annotation . poetry corpora do exist for a number of languages , but larger collections lack consistency and are encoded in various standards , while annotated corpora are typically constrained to a particular genre and/or were designed for the analysis of certain linguistic features ( like rhyme ) .", "entity": "prosodic features", "output": "corpus driven neural models", "neg_sample": ["prosodic features is used for Method", "a prerequisite for the computational study of literature is the availability of properly digitized texts , ideally with reliable meta - data and ground - truth annotation .", "poetry corpora do exist for a number of languages , but larger collections lack consistency and are encoded in various standards , while annotated corpora are typically constrained to a particular genre and/or were designed for the analysis of certain linguistic features ( like rhyme ) ."], "relation": "used for", "id": "2021.eacl-main.325", "year": 2021, "rel_sent": "In this work , we provide large poetry corpora for English and German , and annotate prosodic features in smaller corpora to train corpus driven neural models that enable robust large scale analysis .", "forward": true, "src_ids": "2021.eacl-main.325_3605"}
{"input": "multilingual st task is done by using Method| context: adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp . adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters .", "entity": "multilingual st task", "output": "mbart pre - trained model", "neg_sample": ["multilingual st task is done by using Method", "adapter modules were recently introduced as an efficient alternative tofine - tuning in nlp .", "adapter tuning consists in freezing pre - trained parameters of a model and injecting lightweight modules between layers , resulting in the addition of only a small number of task - specific trainable parameters ."], "relation": "used for", "id": "2021.acl-short.103", "year": 2021, "rel_sent": "Starting from different pre - trained models ( a multilingual ST trained on parallel data or a multilingual BART ( mBART ) trained on non parallel multilingual data ) , we show that adapters can be used to : ( a ) efficiently specialize ST to specific language pairs with a low extra cost in terms of parameters , and ( b ) transfer from an automatic speech recognition ( ASR ) task and an mBART pre - trained model to a multilingual ST task .", "forward": false, "src_ids": "2021.acl-short.103_12591"}
{"input": "data augmentation method is used for OtherScientificTerm| context: unsupervised pre - training has led to much recent progress in natural language understanding .", "entity": "data augmentation method", "output": "task - specific query embeddings", "neg_sample": ["data augmentation method is used for OtherScientificTerm", "unsupervised pre - training has led to much recent progress in natural language understanding ."], "relation": "used for", "id": "2021.naacl-main.426", "year": 2021, "rel_sent": "To obtain additional data for a specific task , we introduce SentAugment , a data augmentation method which computes task - specific query embeddings from labeled data to retrieve sentences from a bank of billions of unlabeled sentences crawled from the web .", "forward": true, "src_ids": "2021.naacl-main.426_15708"}
{"input": "classification systems is done by using OtherScientificTerm| context: incorporating listeners ' interpretations of song lyrics has been shown to significantly improve topic classification accuracy .", "entity": "classification systems", "output": "representations of songs", "neg_sample": ["classification systems is done by using OtherScientificTerm", "incorporating listeners ' interpretations of song lyrics has been shown to significantly improve topic classification accuracy ."], "relation": "used for", "id": "2021.nlp4musa-1.5", "year": 2021, "rel_sent": "Using a different type of interpretation , as compared to previous research , we propose four possible representations of songs as input for classification systems .", "forward": false, "src_ids": "2021.nlp4musa-1.5_2230"}
{"input": "contrastive attention is used for Task| context: in most cases , the normal regions dominate the entire chest x - ray image , and the corresponding descriptions of these normal regions dominate the final report . due to such data bias , learning - based models may fail to attend to abnormal regions .", "entity": "contrastive attention", "output": "automatic chest x - ray report generation", "neg_sample": ["contrastive attention is used for Task", "in most cases , the normal regions dominate the entire chest x - ray image , and the corresponding descriptions of these normal regions dominate the final report .", "due to such data bias , learning - based models may fail to attend to abnormal regions ."], "relation": "used for", "id": "2021.findings-acl.23", "year": 2021, "rel_sent": "Contrastive Attention for Automatic Chest X - ray Report Generation.", "forward": true, "src_ids": "2021.findings-acl.23_2864"}
{"input": "slot tagging task is done by using Method| context: recently , they have been used in many natural language processing applications but not for slot tagging .", "entity": "slot tagging task", "output": "metric - based learning methods", "neg_sample": ["slot tagging task is done by using Method", "recently , they have been used in many natural language processing applications but not for slot tagging ."], "relation": "used for", "id": "2021.eacl-main.134", "year": 2021, "rel_sent": "In this paper , we explore metric - based learning methods in the slot tagging task and propose a novel metric - based learning architecture - Attentive Relational Network .", "forward": false, "src_ids": "2021.eacl-main.134_13058"}
{"input": "language model pre - training is done by using Method| context: in this paper , we detail the relationship between convolutions and self - attention in natural language tasks . we show that relative position embeddings in self - attention layers are equivalent to recently - proposed dynamic lightweight convolutions , and we consider multiple new ways of integrating convolutions into transformer self - attention .", "entity": "language model pre - training", "output": "dynamic convolutions", "neg_sample": ["language model pre - training is done by using Method", "in this paper , we detail the relationship between convolutions and self - attention in natural language tasks .", "we show that relative position embeddings in self - attention layers are equivalent to recently - proposed dynamic lightweight convolutions , and we consider multiple new ways of integrating convolutions into transformer self - attention ."], "relation": "used for", "id": "2021.acl-long.333", "year": 2021, "rel_sent": "To inform future work , we present results comparing lightweight convolutions , dynamic convolutions , and depthwise - separable convolutions in language model pre - training , considering multiple injection points for convolutions in self - attention layers .", "forward": false, "src_ids": "2021.acl-long.333_14466"}
{"input": "phrase alignment is done by using Method| context: deep learning ( dl ) based language models achieve high performance on various benchmarks for natural language inference ( nli ) . and at this time , symbolic approaches to nli are receiving less attention . both approaches ( symbolic and dl ) have their advantages and weaknesses . however , currently , no method combines them in a system to solve the task of nli .", "entity": "phrase alignment", "output": "neural network language model", "neg_sample": ["phrase alignment is done by using Method", "deep learning ( dl ) based language models achieve high performance on various benchmarks for natural language inference ( nli ) .", "and at this time , symbolic approaches to nli are receiving less attention .", "both approaches ( symbolic and dl ) have their advantages and weaknesses .", "however , currently , no method combines them in a system to solve the task of nli ."], "relation": "used for", "id": "2021.starsem-1.7", "year": 2021, "rel_sent": "To merge symbolic and deep learning methods , we propose an inference framework called NeuralLog , which utilizes both a monotonicity - based logical inference engine and a neural network language model for phrase alignment .", "forward": false, "src_ids": "2021.starsem-1.7_12658"}
{"input": "literary style is done by using OtherScientificTerm| context: identifying intertextual relationships between authors is of central importance to the study of literature .", "entity": "literary style", "output": "embeddings", "neg_sample": ["literary style is done by using OtherScientificTerm", "identifying intertextual relationships between authors is of central importance to the study of literature ."], "relation": "used for", "id": "2021.naacl-main.389", "year": 2021, "rel_sent": "We then demonstrate that training embeddings on very small corpora can capture salient aspects of literary style and apply this approach to replicate a previous intertextual study of the Roman historian Livy , which relied on hand - crafted stylometric features .", "forward": false, "src_ids": "2021.naacl-main.389_15984"}
{"input": "knowledge distillation is used for Task| context: recent studies argue that knowledge distillation is promising for speech translation ( st ) using end - to - end models .", "entity": "knowledge distillation", "output": "translating erroneous speech transcriptions", "neg_sample": ["knowledge distillation is used for Task", "recent studies argue that knowledge distillation is promising for speech translation ( st ) using end - to - end models ."], "relation": "used for", "id": "2021.iwslt-1.24", "year": 2021, "rel_sent": "On Knowledge Distillation for Translating Erroneous Speech Transcriptions.", "forward": true, "src_ids": "2021.iwslt-1.24_11331"}
{"input": "intraand inter - sentential relations is done by using Method| context: document - level relation extraction has attracted much attention in recent years . it is usually formulated as a classification problem that predicts relations for all entity pairs in the document . however , previous works indiscriminately represent intraand inter - sentential relations in the same way , confounding the different patterns for predicting them . besides , they create a document graph and use paths between entities on the graph as clues for logical reasoning . however , not all entity pairs can be connected with a path and have the correct logical reasoning paths in their graph . thus many cases of logical reasoning can not be covered .", "entity": "intraand inter - sentential relations", "output": "sire", "neg_sample": ["intraand inter - sentential relations is done by using Method", "document - level relation extraction has attracted much attention in recent years .", "it is usually formulated as a classification problem that predicts relations for all entity pairs in the document .", "however , previous works indiscriminately represent intraand inter - sentential relations in the same way , confounding the different patterns for predicting them .", "besides , they create a document graph and use paths between entities on the graph as clues for logical reasoning .", "however , not all entity pairs can be connected with a path and have the correct logical reasoning paths in their graph .", "thus many cases of logical reasoning can not be covered ."], "relation": "used for", "id": "2021.findings-acl.47", "year": 2021, "rel_sent": "This paper proposes an effective architecture , SIRE , to represent intraand inter - sentential relations in different ways .", "forward": false, "src_ids": "2021.findings-acl.47_7346"}
{"input": "sapbert is used for Method| context: despite the widespread success of self - supervised learning via masked language models ( mlm ) , accurately capturing fine - grained semantic relationships in the biomedical domain remains a challenge . this is of paramount importance for entity - level tasks such as entity linking where the ability to model entity relations ( especially synonymy ) is pivotal .", "entity": "sapbert", "output": "one - model - for - all solution", "neg_sample": ["sapbert is used for Method", "despite the widespread success of self - supervised learning via masked language models ( mlm ) , accurately capturing fine - grained semantic relationships in the biomedical domain remains a challenge .", "this is of paramount importance for entity - level tasks such as entity linking where the ability to model entity relations ( especially synonymy ) is pivotal ."], "relation": "used for", "id": "2021.naacl-main.334", "year": 2021, "rel_sent": "In contrast with previous pipeline - based hybrid systems , SapBERT offers an elegant one - model - for - all solution to the problem of medical entity linking ( MEL ) , achieving a new state - of - the - art ( SOTA ) on six MEL benchmarking datasets .", "forward": true, "src_ids": "2021.naacl-main.334_2295"}
{"input": "intra- and inter - sentential reasoning is used for Task| context: it is usually formulated as a classification problem that predicts relations for all entity pairs in the document . however , previous works indiscriminately represent intraand inter - sentential relations in the same way , confounding the different patterns for predicting them . besides , they create a document graph and use paths between entities on the graph as clues for logical reasoning . however , not all entity pairs can be connected with a path and have the correct logical reasoning paths in their graph . thus many cases of logical reasoning can not be covered .", "entity": "intra- and inter - sentential reasoning", "output": "document - level relation extraction", "neg_sample": ["intra- and inter - sentential reasoning is used for Task", "it is usually formulated as a classification problem that predicts relations for all entity pairs in the document .", "however , previous works indiscriminately represent intraand inter - sentential relations in the same way , confounding the different patterns for predicting them .", "besides , they create a document graph and use paths between entities on the graph as clues for logical reasoning .", "however , not all entity pairs can be connected with a path and have the correct logical reasoning paths in their graph .", "thus many cases of logical reasoning can not be covered ."], "relation": "used for", "id": "2021.findings-acl.47", "year": 2021, "rel_sent": "SIRE : Separate Intra- and Inter - sentential Reasoning for Document - level Relation Extraction.", "forward": true, "src_ids": "2021.findings-acl.47_7348"}
{"input": "information extraction ( ie ) system is done by using Method| context: much past work has focused on extracting information like events , entities , and relations from documents . very little work has focused on analyzing these results for better model understanding .", "entity": "information extraction ( ie ) system", "output": "curation interface", "neg_sample": ["information extraction ( ie ) system is done by using Method", "much past work has focused on extracting information like events , entities , and relations from documents .", "very little work has focused on analyzing these results for better model understanding ."], "relation": "used for", "id": "2021.acl-demo.19", "year": 2021, "rel_sent": "In this paper , we introduce a curation interface that takes an Information Extraction ( IE ) system 's output in a pre - defined format and generates a graphical representation of its elements .", "forward": false, "src_ids": "2021.acl-demo.19_14776"}
{"input": "clustering - based sparse transformer is used for OtherScientificTerm| context: transformer has become ubiquitous in the deep learning field . the sequence length . therefore , long sequences are often encoded by transformer in chunks using a sliding window .", "entity": "clustering - based sparse transformer", "output": "attention", "neg_sample": ["clustering - based sparse transformer is used for OtherScientificTerm", "transformer has become ubiquitous in the deep learning field .", "the sequence length .", "therefore , long sequences are often encoded by transformer in chunks using a sliding window ."], "relation": "used for", "id": "2021.findings-acl.346", "year": 2021, "rel_sent": "In this paper , we propose Cluster - Former , a novel clusteringbased sparse Transformer to perform attention across chunked sequences .", "forward": true, "src_ids": "2021.findings-acl.346_7374"}
{"input": "phrase meaning information is done by using Method| context: pre - trained transformer language models have shown remarkable performance on a variety of nlp tasks . however , recent research has suggested that phrase - level representations in these models reflect heavy influences of lexical content , but lack evidence of sophisticated , compositional phrase information ( yu and ettinger , 2020 ) .", "entity": "phrase meaning information", "output": "contextualized embeddings", "neg_sample": ["phrase meaning information is done by using Method", "pre - trained transformer language models have shown remarkable performance on a variety of nlp tasks .", "however , recent research has suggested that phrase - level representations in these models reflect heavy influences of lexical content , but lack evidence of sophisticated , compositional phrase information ( yu and ettinger , 2020 ) ."], "relation": "used for", "id": "2021.findings-acl.201", "year": 2021, "rel_sent": "Here we investigate the impact of fine - tuning on the capacity of contextualized embeddings to capture phrase meaning information beyond lexical content .", "forward": false, "src_ids": "2021.findings-acl.201_12126"}
{"input": "data augmentation scheme is done by using Task| context: unsupervised consistency training is a way of semi - supervised learning that encourages consistency in model predictions between the original and augmented data . for named entity recognition ( ner ) , existing approaches augment the input sequence with token replacement , assuming annotations on the replaced positions unchanged .", "entity": "data augmentation scheme", "output": "paraphrasing", "neg_sample": ["data augmentation scheme is done by using Task", "unsupervised consistency training is a way of semi - supervised learning that encourages consistency in model predictions between the original and augmented data .", "for named entity recognition ( ner ) , existing approaches augment the input sequence with token replacement , assuming annotations on the replaced positions unchanged ."], "relation": "used for", "id": "2021.emnlp-main.430", "year": 2021, "rel_sent": "In this paper , we explore the use of paraphrasing as a more principled data augmentation scheme for NER unsupervised consistency training .", "forward": false, "src_ids": "2021.emnlp-main.430_13711"}
{"input": "nmt model is used for Task| context: dravidian language family is one of the largest language families in the world .", "entity": "nmt model", "output": "translation", "neg_sample": ["nmt model is used for Task", "dravidian language family is one of the largest language families in the world ."], "relation": "used for", "id": "2021.dravidianlangtech-1.50", "year": 2021, "rel_sent": "The NMT model was applied on translation using English - Tamil , EnglishTelugu , English - Malayalam and Tamil - Telugu corpora provided by the organizers .", "forward": true, "src_ids": "2021.dravidianlangtech-1.50_8654"}
{"input": "labeled sequence translation method is used for Material| context: named entity recognition ( ner ) for low - resource languages is a both practical and challenging research problem .", "entity": "labeled sequence translation method", "output": "source - language training data", "neg_sample": ["labeled sequence translation method is used for Material", "named entity recognition ( ner ) for low - resource languages is a both practical and challenging research problem ."], "relation": "used for", "id": "2021.acl-long.453", "year": 2021, "rel_sent": "The paper first proposes a simple but effective labeled sequence translation method to translate source - language training data to target languages and avoids problems such as word order change and entity span determination .", "forward": true, "src_ids": "2021.acl-long.453_14091"}
{"input": "sports question - answering ( qa ) system is done by using Method| context: named entity linking ( nel ) or mapping ' strings ' to ' things ' in a knowledge base is a fundamental preprocessing step in systems that require knowledge of entities such as information extraction and question answering . in this work , we lay out and investigate two challenges faced by individuals or organizations building nel systems .", "entity": "sports question - answering ( qa ) system", "output": "nel model", "neg_sample": ["sports question - answering ( qa ) system is done by using Method", "named entity linking ( nel ) or mapping ' strings ' to ' things ' in a knowledge base is a fundamental preprocessing step in systems that require knowledge of entities such as information extraction and question answering .", "in this work , we lay out and investigate two challenges faced by individuals or organizations building nel systems ."], "relation": "used for", "id": "2021.naacl-industry.26", "year": 2021, "rel_sent": "Second , for a use case where the NEL model is used in a sports question - answering ( QA ) system , we investigate how to close the loop in our analysis by repurposing the best off - the - shelf model ( Bootleg ) to correct sport - related errors .", "forward": false, "src_ids": "2021.naacl-industry.26_1518"}
{"input": "amr based semantic graph aggregator is used for OtherScientificTerm| context: the tasks of rich semantic parsing , such as abstract meaning representation ( amr ) , share similar goals with information extraction ( ie ) to convert natural language texts into structured semantic representations .", "entity": "amr based semantic graph aggregator", "output": "event trigger nodes", "neg_sample": ["amr based semantic graph aggregator is used for OtherScientificTerm", "the tasks of rich semantic parsing , such as abstract meaning representation ( amr ) , share similar goals with information extraction ( ie ) to convert natural language texts into structured semantic representations ."], "relation": "used for", "id": "2021.naacl-main.4", "year": 2021, "rel_sent": "Our framework consists of two novel components : 1 ) an AMR based semantic graph aggregator to let the candidate entity and event trigger nodes collect neighborhood information from AMR graph for passing message among related knowledge elements ; 2 ) an AMR guided graph decoder to extract knowledge elements based on the order decided by the hierarchical structures in AMR .", "forward": true, "src_ids": "2021.naacl-main.4_10607"}
{"input": "review - enriched and entity - based recommendations is used for OtherScientificTerm| context: existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items . incorporating external information ( e.g. , reviews ) is a potential solution to alleviate this problem . given that reviews often provide a rich and detailed user experience on different interests , they are potential ideal resources for providing high - quality recommendations within an informative conversation .", "entity": "review - enriched and entity - based recommendations", "output": "item suggestions", "neg_sample": ["review - enriched and entity - based recommendations is used for OtherScientificTerm", "existing conversational recommendation ( cr ) systems usually suffer from insufficient item information when conducted on short dialogue history and unfamiliar items .", "incorporating external information ( e.g.", ", reviews ) is a potential solution to alleviate this problem .", "given that reviews often provide a rich and detailed user experience on different interests , they are potential ideal resources for providing high - quality recommendations within an informative conversation ."], "relation": "used for", "id": "2021.findings-acl.99", "year": 2021, "rel_sent": "In detail , we extract sentiment - consistent reviews , perform review - enriched and entity - based recommendations for item suggestions , as well as use a review - attentive encoder - decoder for response generation .", "forward": true, "src_ids": "2021.findings-acl.99_1158"}
{"input": "hypernym is done by using Task| context: although pre - training models have achieved great success in dialogue generation , their performance drops dramatically when the input contains an entity that does not appear in pre - training and fine - tuning datasets ( unseen entity ) . to address this issue , existing methods leverage an external knowledge base to generate appropriate responses . in real - world practical , the entity may not be included by the knowledge base or suffer from the precision of knowledge retrieval .", "entity": "hypernym", "output": "hypernym generation", "neg_sample": ["hypernym is done by using Task", "although pre - training models have achieved great success in dialogue generation , their performance drops dramatically when the input contains an entity that does not appear in pre - training and fine - tuning datasets ( unseen entity ) .", "to address this issue , existing methods leverage an external knowledge base to generate appropriate responses .", "in real - world practical , the entity may not be included by the knowledge base or suffer from the precision of knowledge retrieval ."], "relation": "used for", "id": "2021.emnlp-main.179", "year": 2021, "rel_sent": "Specifically , with the help of a knowledge base , we introduce two auxiliary training objectives : 1 ) Interpret Masked Word , which conjectures the meaning of the masked entity given the context ; 2 ) Hypernym Generation , which predicts the hypernym of the entity based on the context .", "forward": false, "src_ids": "2021.emnlp-main.179_8534"}
{"input": "graph encoder is used for OtherScientificTerm| context: event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning . however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data .", "entity": "graph encoder", "output": "event structures", "neg_sample": ["graph encoder is used for OtherScientificTerm", "event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning .", "however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data ."], "relation": "used for", "id": "2021.acl-long.491", "year": 2021, "rel_sent": "CLEVE contains a text encoder to learn event semantics and a graph encoder to learn event structures respectively .", "forward": true, "src_ids": "2021.acl-long.491_6670"}
{"input": "weather labels is used for OtherScientificTerm| context: the task of generating weather - forecast comments from meteorological simulations has the following requirements : ( i ) the changes in numerical values for various physical quantities need to be considered , ( ii ) the weather comments should be dependent on delivery time and area information , and ( iii ) the comments should provide useful information for users .", "entity": "weather labels", "output": "weather information", "neg_sample": ["weather labels is used for OtherScientificTerm", "the task of generating weather - forecast comments from meteorological simulations has the following requirements : ( i ) the changes in numerical values for various physical quantities need to be considered , ( ii ) the weather comments should be dependent on delivery time and area information , and ( iii ) the comments should provide useful information for users ."], "relation": "used for", "id": "2021.eacl-main.125", "year": 2021, "rel_sent": "We also introduce weather labels representing weather information , such as sunny and rain , for our model to explicitly describe useful information .", "forward": true, "src_ids": "2021.eacl-main.125_4690"}
{"input": "language models is used for Task| context: pretrained language models like bert have advanced the state of the art for many nlp tasks . for resource - rich languages , one has the choice between a number of language - specific models , while multilingual models are also worth considering . these models are well known for their crosslingual performance , but have also shown competitive in - language performance on some tasks . we consider monolingual and multilingual models from the perspective of historical texts , and in particular for texts enriched with editorial notes : how do language models deal with the historical and editorial content in these texts ?", "entity": "language models", "output": "prediction of entities in historical texts", "neg_sample": ["language models is used for Task", "pretrained language models like bert have advanced the state of the art for many nlp tasks .", "for resource - rich languages , one has the choice between a number of language - specific models , while multilingual models are also worth considering .", "these models are well known for their crosslingual performance , but have also shown competitive in - language performance on some tasks .", "we consider monolingual and multilingual models from the perspective of historical texts , and in particular for texts enriched with editorial notes : how do language models deal with the historical and editorial content in these texts ?"], "relation": "used for", "id": "2021.latechclfl-1.3", "year": 2021, "rel_sent": "In particular , language models successfully incorporate notes for the prediction of entities in historical texts .", "forward": true, "src_ids": "2021.latechclfl-1.3_2318"}
{"input": "answering models is used for Material| context: in the visual dialog task guesswhat ? ! two players maintain a dialog in order to identify a secret object in an image . this raises a question : what 's the risk of having an imperfect oracle model ? .", "entity": "answering models", "output": "human generated questions", "neg_sample": ["answering models is used for Material", "in the visual dialog task guesswhat ? !", "two players maintain a dialog in order to identify a secret object in an image .", "this raises a question : what 's the risk of having an imperfect oracle model ?", "."], "relation": "used for", "id": "2021.reinact-1.2", "year": 2021, "rel_sent": "Here we present work in progress in the study of the impact of different answering models in human generated questions in GuessWhat ? ! .", "forward": true, "src_ids": "2021.reinact-1.2_10992"}
{"input": "parallel corpus is done by using Method| context: with more than 7000 languages worldwide , multilingual natural language processing ( nlp ) is essential both from an academic and commercial perspective . researching typological properties of languages is fundamental for progress in multilingual nlp . examples include assessing language similarity for effective transfer learning , injecting inductive biases into machine learning models or creating resources such as dictionaries and inflection tables .", "entity": "parallel corpus", "output": "parcoure", "neg_sample": ["parallel corpus is done by using Method", "with more than 7000 languages worldwide , multilingual natural language processing ( nlp ) is essential both from an academic and commercial perspective .", "researching typological properties of languages is fundamental for progress in multilingual nlp .", "examples include assessing language similarity for effective transfer learning , injecting inductive biases into machine learning models or creating resources such as dictionaries and inflection tables ."], "relation": "used for", "id": "2021.acl-demo.8", "year": 2021, "rel_sent": "ParCourE can be set up for any parallel corpus and can thus be used for typological research on other corpora as well as for exploring their quality and properties .", "forward": false, "src_ids": "2021.acl-demo.8_3686"}
{"input": "pre - trained language model is used for Task| context: recently , various neural models for multi - party conversation ( mpc ) have achieved impressive improvements on a variety of tasks such as addressee recognition , speaker identification and response prediction . however , these existing methods on mpc usually represent interlocutors and utterances individually and ignore the inherent complicated structure in mpc which may provide crucial interlocutor and utterance semantics and would enhance the conversation understanding process .", "entity": "pre - trained language model", "output": "multi - party conversation understanding", "neg_sample": ["pre - trained language model is used for Task", "recently , various neural models for multi - party conversation ( mpc ) have achieved impressive improvements on a variety of tasks such as addressee recognition , speaker identification and response prediction .", "however , these existing methods on mpc usually represent interlocutors and utterances individually and ignore the inherent complicated structure in mpc which may provide crucial interlocutor and utterance semantics and would enhance the conversation understanding process ."], "relation": "used for", "id": "2021.acl-long.285", "year": 2021, "rel_sent": "MPC - BERT : A Pre - Trained Language Model for Multi - Party Conversation Understanding.", "forward": true, "src_ids": "2021.acl-long.285_2421"}
{"input": "empirical explainers is used for OtherScientificTerm| context: amid a discussion about green ai in which we see explainability neglected , we explore the possibility to efficiently approximate computationally expensive explainers .", "entity": "empirical explainers", "output": "attribution maps", "neg_sample": ["empirical explainers is used for OtherScientificTerm", "amid a discussion about green ai in which we see explainability neglected , we explore the possibility to efficiently approximate computationally expensive explainers ."], "relation": "used for", "id": "2021.blackboxnlp-1.17", "year": 2021, "rel_sent": "Empirical Explainers learn from data to predict the attribution maps of expensive explainers .", "forward": true, "src_ids": "2021.blackboxnlp-1.17_1989"}
{"input": "sequence tagging model is done by using OtherScientificTerm| context: automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies . the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another . this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs .", "entity": "sequence tagging model", "output": "model parameters", "neg_sample": ["sequence tagging model is done by using OtherScientificTerm", "automatic extraction of product attribute - value pairs from unstructured text like product descriptions is an important problem for e - commerce companies .", "the attribute schema typically varies from one category of products ( which will be referred as vertical ) to another .", "this leads to extreme annotation efforts for training of supervised deep sequence labeling models such as lstm - crf , and consequently not enough labeled data for some vertical - attribute pairs ."], "relation": "used for", "id": "2021.ecnlp-1.10", "year": 2021, "rel_sent": "Our model jointly learns the similarity between attributes of the two verticals along with the model parameters for the sequence tagging model .", "forward": false, "src_ids": "2021.ecnlp-1.10_6470"}
{"input": "parallel and conversational movie subtitles datasets is used for Task| context: recent progress in task - oriented neural dialogue systems is largely focused on a handful of languages , as annotation of training data is tedious and expensive . machine translation has been used to make systems multilingual , but this can introduce a pipeline of errors . another promising solution is using cross - lingual transfer learning through pretrained multilingual models .", "entity": "parallel and conversational movie subtitles datasets", "output": "cross - lingual intermediate tasks", "neg_sample": ["parallel and conversational movie subtitles datasets is used for Task", "recent progress in task - oriented neural dialogue systems is largely focused on a handful of languages , as annotation of training data is tedious and expensive .", "machine translation has been used to make systems multilingual , but this can introduce a pipeline of errors .", "another promising solution is using cross - lingual transfer learning through pretrained multilingual models ."], "relation": "used for", "id": "2021.emnlp-main.87", "year": 2021, "rel_sent": "Specifically , we use parallel and conversational movie subtitles datasets to design cross - lingual intermediate tasks suitable for downstream dialogue tasks .", "forward": true, "src_ids": "2021.emnlp-main.87_3646"}
{"input": "context is used for Material| context: the user inten - tion and background .", "entity": "context", "output": "dialogue", "neg_sample": ["context is used for Material", "the user inten - tion and background ."], "relation": "used for", "id": "2021.findings-acl.124", "year": 2021, "rel_sent": "In this paper , we explore and quantify the role of context for different aspects of a dialogue , namely emotion , dialogue act , and intent identification , using state - of - the - art dialogue understanding methods as baselines .", "forward": true, "src_ids": "2021.findings-acl.124_15714"}
{"input": "large scale models is used for Generic| context: building open - domain chatbots is a challenging area for machine learning research . while prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results , we highlight other ingredients .", "entity": "large scale models", "output": "skills", "neg_sample": ["large scale models is used for Generic", "building open - domain chatbots is a challenging area for machine learning research .", "while prior work has shown that scaling neural models in the number of parameters and the size of the data they are trained on gives improved results , we highlight other ingredients ."], "relation": "used for", "id": "2021.eacl-main.24", "year": 2021, "rel_sent": "We show that large scale models can learn these skills when given appropriate training data and choice of generation strategy .", "forward": true, "src_ids": "2021.eacl-main.24_8811"}
{"input": "natural language descriptions is used for Task| context: frame - based state representation is widely used in modern task - oriented dialog systems to model user intentions and slot values . however , a fixed design of domain ontology makes it difficult to extend to new services and apis . recent work proposed to use natural language descriptions to define the domain ontology instead of tag names for each intent or slot , thus offering a dynamic set of schema .", "entity": "natural language descriptions", "output": "dialog state tracking", "neg_sample": ["natural language descriptions is used for Task", "frame - based state representation is widely used in modern task - oriented dialog systems to model user intentions and slot values .", "however , a fixed design of domain ontology makes it difficult to extend to new services and apis .", "recent work proposed to use natural language descriptions to define the domain ontology instead of tag names for each intent or slot , thus offering a dynamic set of schema ."], "relation": "used for", "id": "2021.naacl-main.62", "year": 2021, "rel_sent": "In this paper , we conduct in - depth comparative studies to understand the use of natural language description for schema in dialog state tracking .", "forward": true, "src_ids": "2021.naacl-main.62_9966"}
{"input": "automated system is used for Metric| context: easy access , variety of content , and fast widespread interactions are some of the reasons that have made social media increasingly popular in today 's society . however , this has also enabled the widespread propagation of fake news , text that is published with an intent to spread misinformation and sway beliefs . detecting fake news is important to prevent misinformation and maintain a healthy society . while prior works have tackled this problem by building supervised learning systems , automatedly modeling the social media landscape that enables the spread of fake news is challenging . on the contrary , having humans fact check all news is not scalable .", "entity": "automated system", "output": "social media representation quality", "neg_sample": ["automated system is used for Metric", "easy access , variety of content , and fast widespread interactions are some of the reasons that have made social media increasingly popular in today 's society .", "however , this has also enabled the widespread propagation of fake news , text that is published with an intent to spread misinformation and sway beliefs .", "detecting fake news is important to prevent misinformation and maintain a healthy society .", "while prior works have tackled this problem by building supervised learning systems , automatedly modeling the social media landscape that enables the spread of fake news is challenging .", "on the contrary , having humans fact check all news is not scalable ."], "relation": "used for", "id": "2021.internlp-1.7", "year": 2021, "rel_sent": "Thus , in this paper , we propose to approach this problem interactively , where human insight can be continually combined with an automated system , enabling better social media representation quality .", "forward": true, "src_ids": "2021.internlp-1.7_2299"}
{"input": "moderately under - resourced languages is done by using Method| context: multilingual language models have been a crucial breakthrough as they considerably reduce the need of data for under - resourced languages . nevertheless , the superiority of language - specific models has already been proven for languages having access to large amounts of data .", "entity": "moderately under - resourced languages", "output": "multilingual models", "neg_sample": ["moderately under - resourced languages is done by using Method", "multilingual language models have been a crucial breakthrough as they considerably reduce the need of data for under - resourced languages .", "nevertheless , the superiority of language - specific models has already been proven for languages having access to large amounts of data ."], "relation": "used for", "id": "2021.findings-acl.437", "year": 2021, "rel_sent": "Are Multilingual Models the Best Choice for Moderately Under - resourced Languages ? A Comprehensive Assessment for Catalan.", "forward": false, "src_ids": "2021.findings-acl.437_2592"}
{"input": "sequence tagging problems is done by using Method| context: the semeval 2021 task 5 : toxic spans detection is a task of identifying considered - toxic spans in text , which provides a valuable , automatic tool for moderating online contents .", "entity": "sequence tagging problems", "output": "dual networks", "neg_sample": ["sequence tagging problems is done by using Method", "the semeval 2021 task 5 : toxic spans detection is a task of identifying considered - toxic spans in text , which provides a valuable , automatic tool for moderating online contents ."], "relation": "used for", "id": "2021.semeval-1.120", "year": 2021, "rel_sent": "S - NLP at SemEval-2021 Task 5 : An Analysis of Dual Networks for Sequence Tagging.", "forward": false, "src_ids": "2021.semeval-1.120_11674"}
{"input": "speech recognition models is used for Material| context: language is a fundamental component of human communication . however , there is still very little comparable research in speech recognition for african languages .", "entity": "speech recognition models", "output": "african low - resourced languages", "neg_sample": ["speech recognition models is used for Material", "language is a fundamental component of human communication .", "however , there is still very little comparable research in speech recognition for african languages ."], "relation": "used for", "id": "2021.winlp-1.1", "year": 2021, "rel_sent": "Our findings serve both as a guide for future NLP research for Fon and Igbo in particular , and the creation of speech recognition models for other African low - resourced languages in general .", "forward": true, "src_ids": "2021.winlp-1.1_4084"}
{"input": "tail - to - tail is used for Task| context: since most tokens are correct and easily to be predicted / conveyed to the target , then the models may suffer from a severe class imbalance issue .", "entity": "tail - to - tail", "output": "error detection and correction", "neg_sample": ["tail - to - tail is used for Task", "since most tokens are correct and easily to be predicted / conveyed to the target , then the models may suffer from a severe class imbalance issue ."], "relation": "used for", "id": "2021.acl-long.385", "year": 2021, "rel_sent": "Experimental results on standard datasets , especially on the variable - length datasets , demonstrate the effectiveness of TtT in terms of sentence - level Accuracy , Precision , Recall , and F1 - Measure on tasks of error Detection and Correction .", "forward": true, "src_ids": "2021.acl-long.385_9023"}
{"input": "neural transformer framework is used for Task| context: in recent years , the widespread use of social media has led to an increase in the generation of toxic and offensive content on online platforms . in response , social media platforms have worked on developing automatic detection methods and employing human moderators to cope with this deluge of offensive content . while various state - of - the - art statistical models have been applied to detect toxic posts , there are only a few studies that focus on detecting the words or expressions that make a post offensive .", "entity": "neural transformer framework", "output": "detecting toxic spans", "neg_sample": ["neural transformer framework is used for Task", "in recent years , the widespread use of social media has led to an increase in the generation of toxic and offensive content on online platforms .", "in response , social media platforms have worked on developing automatic detection methods and employing human moderators to cope with this deluge of offensive content .", "while various state - of - the - art statistical models have been applied to detect toxic posts , there are only a few studies that focus on detecting the words or expressions that make a post offensive ."], "relation": "used for", "id": "2021.semeval-1.111", "year": 2021, "rel_sent": "WLV - RIT at SemEval-2021 Task 5 : A Neural Transformer Framework for Detecting Toxic Spans.", "forward": true, "src_ids": "2021.semeval-1.111_7281"}
{"input": "information bottleneck principle is used for Task| context: current abstractive summarization systems outperform their extractive counterparts , but their widespread adoption is inhibited by the inherent lack of interpretability . extractive summarization systems , though interpretable , suffer from redundancy and possible lack of coherence .", "entity": "information bottleneck principle", "output": "extraction", "neg_sample": ["information bottleneck principle is used for Task", "current abstractive summarization systems outperform their extractive counterparts , but their widespread adoption is inhibited by the inherent lack of interpretability .", "extractive summarization systems , though interpretable , suffer from redundancy and possible lack of coherence ."], "relation": "used for", "id": "2021.newsum-1.10", "year": 2021, "rel_sent": "We use the Information Bottleneck principle to jointly train the extraction and abstraction in an end - to - end fashion .", "forward": true, "src_ids": "2021.newsum-1.10_8943"}
{"input": "english - hinglish translation is done by using Method| context: code - switching is the embedding of linguistic units or phrases from two or more languages in a single sentence . this phenomenon is practiced in all multilingual communities and is prominent in social media . consequently , there is a growing need to understand code - switched translations by translating the code - switched text into one of the standard languages or vice versa . neural machine translation is a well - studied research problem in the monolingual text .", "entity": "english - hinglish translation", "output": "sequence networks", "neg_sample": ["english - hinglish translation is done by using Method", "code - switching is the embedding of linguistic units or phrases from two or more languages in a single sentence .", "this phenomenon is practiced in all multilingual communities and is prominent in social media .", "consequently , there is a growing need to understand code - switched translations by translating the code - switched text into one of the standard languages or vice versa .", "neural machine translation is a well - studied research problem in the monolingual text ."], "relation": "used for", "id": "2021.calcs-1.4", "year": 2021, "rel_sent": "In this paper , we have used the gated convolutional sequences to sequence networks for English - Hinglish translation .", "forward": false, "src_ids": "2021.calcs-1.4_204"}
{"input": "contextual decomposition is used for Method| context: the field of explainable ai has recently seen an explosion in the number of explanation methods for highly non - linear deep neural networks . the extent to which such methods - that are often proposed and tested in the domain of computer vision - are appropriate to address the explainability challenges in nlp is yet relatively unexplored .", "entity": "contextual decomposition", "output": "attention - based models", "neg_sample": ["contextual decomposition is used for Method", "the field of explainable ai has recently seen an explosion in the number of explanation methods for highly non - linear deep neural networks .", "the extent to which such methods - that are often proposed and tested in the domain of computer vision - are appropriate to address the explainability challenges in nlp is yet relatively unexplored ."], "relation": "used for", "id": "2021.deelio-1.13", "year": 2021, "rel_sent": "To this end , we extend CD to cover the operations necessary for attention - based models .", "forward": true, "src_ids": "2021.deelio-1.13_1005"}
{"input": "distributional methods is used for Material| context: just as the meaning of words is tied to the communities in which they are used , so too is semantic change . but how does lexical semantic change manifest differently across different communities ?", "entity": "distributional methods", "output": "social networks", "neg_sample": ["distributional methods is used for Material", "just as the meaning of words is tied to the communities in which they are used , so too is semantic change .", "but how does lexical semantic change manifest differently across different communities ?"], "relation": "used for", "id": "2021.starsem-1.3", "year": 2021, "rel_sent": "We use distributional methods to quantify lexical semantic change and induce a social network on communities , based on interactions between members .", "forward": true, "src_ids": "2021.starsem-1.3_11333"}
{"input": "aspect labels is used for Task| context: the pilot version annotates english data .", "entity": "aspect labels", "output": "uniform meaning representations ( umr )", "neg_sample": ["aspect labels is used for Task", "the pilot version annotates english data ."], "relation": "used for", "id": "2021.law-1.4", "year": 2021, "rel_sent": "The aspect labels are designed specifically for Uniform Meaning Representations ( UMR ) , an annotation schema that aims to encode crosslingual semantic information .", "forward": true, "src_ids": "2021.law-1.4_10741"}
{"input": "aggressive data distillation is done by using Method| context: while neural networks produce state - of - the - art performance in several nlp tasks , they generally depend heavily on lexicalized information , which transfer poorly between domains .", "entity": "aggressive data distillation", "output": "model distillation method", "neg_sample": ["aggressive data distillation is done by using Method", "while neural networks produce state - of - the - art performance in several nlp tasks , they generally depend heavily on lexicalized information , which transfer poorly between domains ."], "relation": "used for", "id": "2021.naacl-main.360", "year": 2021, "rel_sent": "We present a data distillation technique for delexicalization , which we then combine with a model distillation method to prevent aggressive data distillation .", "forward": false, "src_ids": "2021.naacl-main.360_6659"}
{"input": "conversational artificial intelligence ( ai ) is used for Task| context: almost 30 % of the adult population in the world is experiencing or has experience insomnia . cognitive behaviour therapy for insomnia ( cbt - i ) is one of the most effective treatment , but it has limitations on accessibility and availability . utilising technology is one of the possible solutions , but existing methods neglect conversational aspects , which plays a critical role in sleep therapy .", "entity": "conversational artificial intelligence ( ai )", "output": "sleep coaching programme", "neg_sample": ["conversational artificial intelligence ( ai ) is used for Task", "almost 30 % of the adult population in the world is experiencing or has experience insomnia .", "cognitive behaviour therapy for insomnia ( cbt - i ) is one of the most effective treatment , but it has limitations on accessibility and availability .", "utilising technology is one of the possible solutions , but existing methods neglect conversational aspects , which plays a critical role in sleep therapy ."], "relation": "used for", "id": "2021.eacl-srw.17", "year": 2021, "rel_sent": "To address this issue , we propose a PhD project exploring potentials of developing conversational artificial intelligence ( AI ) for a sleep coaching programme , which is motivated by CBT - I treatment .", "forward": true, "src_ids": "2021.eacl-srw.17_5212"}
{"input": "domain - specific synonym replacement is used for OtherScientificTerm| context: healthcare predictive analytics aids medical decision - making , diagnosis prediction and drug review analysis . therefore , prediction accuracy is an important criteria which also necessitates robust predictive language models . however , the models using deep learning have been proven vulnerable towards insignificantly perturbed input instances which are less likely to be misclassified by humans . recent efforts of generating adversaries using rule - based synonyms and bert - mlms have been witnessed in general domain , but the ever - increasing biomedical literature poses unique challenges .", "entity": "domain - specific synonym replacement", "output": "biomedical named entities", "neg_sample": ["domain - specific synonym replacement is used for OtherScientificTerm", "healthcare predictive analytics aids medical decision - making , diagnosis prediction and drug review analysis .", "therefore , prediction accuracy is an important criteria which also necessitates robust predictive language models .", "however , the models using deep learning have been proven vulnerable towards insignificantly perturbed input instances which are less likely to be misclassified by humans .", "recent efforts of generating adversaries using rule - based synonyms and bert - mlms have been witnessed in general domain , but the ever - increasing biomedical literature poses unique challenges ."], "relation": "used for", "id": "2021.naacl-main.423", "year": 2021, "rel_sent": "We propose BBAEG ( Biomedical BERT - based Adversarial Example Generation ) , a black - box attack algorithm for biomedical text classification , leveraging the strengths of both domain - specific synonym replacement for biomedical named entities and BERT - MLM predictions , spelling variation and number replacement .", "forward": true, "src_ids": "2021.naacl-main.423_14114"}
{"input": "ablation - based methodology is used for Generic| context: an in - depth analysis of the level of language understanding required by existing machine reading comprehension ( mrc ) benchmarks can provide insight into the reading capabilities of machines .", "entity": "ablation - based methodology", "output": "skills", "neg_sample": ["ablation - based methodology is used for Generic", "an in - depth analysis of the level of language understanding required by existing machine reading comprehension ( mrc ) benchmarks can provide insight into the reading capabilities of machines ."], "relation": "used for", "id": "2021.eacl-main.311", "year": 2021, "rel_sent": "We then introduce ablation methods that verify whether these skills are required to succeed on a dataset .", "forward": true, "src_ids": "2021.eacl-main.311_5176"}
{"input": "low - quality synthetic data is done by using Method| context: while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs .", "entity": "low - quality synthetic data", "output": "consistency filters", "neg_sample": ["low - quality synthetic data is done by using Method", "while self - training generates synthetic training data where natural inputs are aligned with noisy outputs , back - training results in natural outputs aligned with noisy inputs ."], "relation": "used for", "id": "2021.emnlp-main.566", "year": 2021, "rel_sent": "We further propose consistency filters to remove low - quality synthetic data before training .", "forward": false, "src_ids": "2021.emnlp-main.566_4459"}
{"input": "low - resource question generation is done by using Method| context: multi - hop question generation requires complex reasoning and coherent language realization . learning a generation model for the problem requires extensive multi - hop question answering ( qa ) data , which are limited due to the manual collection effort . learning this generating and then composing twophase model , however , requires manually labeled question decomposition data , which is labor intensive .", "entity": "low - resource question generation", "output": "latent reasoning", "neg_sample": ["low - resource question generation is done by using Method", "multi - hop question generation requires complex reasoning and coherent language realization .", "learning a generation model for the problem requires extensive multi - hop question answering ( qa ) data , which are limited due to the manual collection effort .", "learning this generating and then composing twophase model , however , requires manually labeled question decomposition data , which is labor intensive ."], "relation": "used for", "id": "2021.findings-acl.265", "year": 2021, "rel_sent": "Latent Reasoning for Low - Resource Question Generation.", "forward": false, "src_ids": "2021.findings-acl.265_5937"}
{"input": "lms is used for OtherScientificTerm| context: any test that promises to assess human knowledge of language ( kol ) for any statistically - based language model ( lm ) must meet three requirements : ( 1 ) comprehensive coverage of linguistic phenomena ; ( 2 ) replicable and statistically - vetted human judgement data ; and ( 3 ) test the lm 's ability to track the gradience of sentence acceptability . to this end , we propose here the li - adger dataset : a comprehensive collection of 519 sentence types ( 4177 sentences ) spanning the field of current generative linguistics , accompanied by attested and replicable human acceptability judgements ( sprouse & almeida , 2012 ; sprouse et al .", "entity": "lms", "output": "gradience of acceptability", "neg_sample": ["lms is used for OtherScientificTerm", "any test that promises to assess human knowledge of language ( kol ) for any statistically - based language model ( lm ) must meet three requirements : ( 1 ) comprehensive coverage of linguistic phenomena ; ( 2 ) replicable and statistically - vetted human judgement data ; and ( 3 ) test the lm 's ability to track the gradience of sentence acceptability .", "to this end , we propose here the li - adger dataset : a comprehensive collection of 519 sentence types ( 4177 sentences ) spanning the field of current generative linguistics , accompanied by attested and replicable human acceptability judgements ( sprouse & almeida , 2012 ; sprouse et al ."], "relation": "used for", "id": "2021.blackboxnlp-1.38", "year": 2021, "rel_sent": "Adopting the ADC reveals how much harder it is for LMs to track the gradience of acceptability across minimal pairs .", "forward": true, "src_ids": "2021.blackboxnlp-1.38_5352"}
{"input": "hierarchical semantic structures is done by using Task| context: transformer networks have revolutionized nlp representation learning since they were introduced . though a great effort has been made to explain the representation in transformers , it is widely recognized that our understanding is not sufficient .", "entity": "hierarchical semantic structures", "output": "visualization", "neg_sample": ["hierarchical semantic structures is done by using Task", "transformer networks have revolutionized nlp representation learning since they were introduced .", "though a great effort has been made to explain the representation in transformers , it is widely recognized that our understanding is not sufficient ."], "relation": "used for", "id": "2021.deelio-1.1", "year": 2021, "rel_sent": "Through visualization , we demonstrate the hierarchical semantic structures captured by the transformer factors , e.g. , word - level polysemy disambiguation , sentence - level pattern formation , and long - range dependency .", "forward": false, "src_ids": "2021.deelio-1.1_8576"}
{"input": "stochastic rankers is done by using OtherScientificTerm| context: according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval . the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty . we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers .", "entity": "stochastic rankers", "output": "uncertainty of neural rankers", "neg_sample": ["stochastic rankers is done by using OtherScientificTerm", "according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval .", "the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty .", "we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers ."], "relation": "used for", "id": "2021.eacl-main.12", "year": 2021, "rel_sent": "Then , motivated by our findings we use two techniques to model the uncertainty of neural rankers leading to the proposed stochastic rankers , which output a predictive distribution of relevance as opposed to point estimates .", "forward": false, "src_ids": "2021.eacl-main.12_8018"}
{"input": "visualsem is used for Method| context: an exciting frontier in natural language understanding ( nlu ) and generation ( nlg ) calls for ( vision - and- ) language models that can efficiently access external structured knowledge repositories . however , many existing knowledge bases only cover limited domains , or suffer from noisy data , and most of all are typically hard to integrate into neural language pipelines .", "entity": "visualsem", "output": "data augmentation", "neg_sample": ["visualsem is used for Method", "an exciting frontier in natural language understanding ( nlu ) and generation ( nlg ) calls for ( vision - and- ) language models that can efficiently access external structured knowledge repositories .", "however , many existing knowledge bases only cover limited domains , or suffer from noisy data , and most of all are typically hard to integrate into neural language pipelines ."], "relation": "used for", "id": "2021.mrl-1.13", "year": 2021, "rel_sent": "We encourage the research community to use VisualSem for data augmentation and/or as a source of grounding , among other possible uses .", "forward": true, "src_ids": "2021.mrl-1.13_9568"}
{"input": "hierarchical transformer encoders is done by using Method| context: generative models for dialog systems have gained much interest because of the recent success of rnn and transformer based models in tasks like question answering and summarization . although the task of dialog response generation is generally seen as a sequence to sequence ( seq2seq ) problem , researchers in the past have found it challenging to train dialog systems using the standard seq2seq models . therefore , to help the model learn meaningful utterance and conversation level features , sordoni et al . ( 2015b ) , serban et al . with the transformer - based models dominating the seq2seq problems lately , the natural question to ask is the applicability of the notion of hierarchy in transformer - based dialog systems .", "entity": "hierarchical transformer encoders", "output": "generalized framework", "neg_sample": ["hierarchical transformer encoders is done by using Method", "generative models for dialog systems have gained much interest because of the recent success of rnn and transformer based models in tasks like question answering and summarization .", "although the task of dialog response generation is generally seen as a sequence to sequence ( seq2seq ) problem , researchers in the past have found it challenging to train dialog systems using the standard seq2seq models .", "therefore , to help the model learn meaningful utterance and conversation level features , sordoni et al .", "( 2015b ) , serban et al .", "with the transformer - based models dominating the seq2seq problems lately , the natural question to ask is the applicability of the notion of hierarchy in transformer - based dialog systems ."], "relation": "used for", "id": "2021.naacl-main.449", "year": 2021, "rel_sent": "In this paper , we propose a generalized framework for Hierarchical Transformer Encoders and show how a standard transformer can be morphed into any hierarchical encoder , including HRED and HIBERT like models , by using specially designed attention masks and positional encodings .", "forward": false, "src_ids": "2021.naacl-main.449_15027"}
{"input": "large - scale natural language understanding systems is done by using Task| context: large - scale conversational assistants like alexa , siri , cortana and google assistant process every utterance using multiple models for domain , intent and named entity recognition . given the decoupled nature of model development and large traffic volumes , it is extremely difficult to identify utterances processed erroneously by such systems .", "entity": "large - scale natural language understanding systems", "output": "error detection", "neg_sample": ["large - scale natural language understanding systems is done by using Task", "large - scale conversational assistants like alexa , siri , cortana and google assistant process every utterance using multiple models for domain , intent and named entity recognition .", "given the decoupled nature of model development and large traffic volumes , it is extremely difficult to identify utterances processed erroneously by such systems ."], "relation": "used for", "id": "2021.findings-acl.44", "year": 2021, "rel_sent": "Error Detection in Large - Scale Natural Language Understanding Systems Using Transformer Models.", "forward": false, "src_ids": "2021.findings-acl.44_7693"}
{"input": "chinese characters is done by using Method| context: recently , word enhancement has become very popular for chinese named entity recognition ( ner ) , reducing segmentation errors and increasing the semantic and boundary information of chinese words . however , these methods tend to ignore the information of the chinese character structure after integrating the lexical information . chinese characters have evolved from pictographs since ancient times , and their structure often reflects more information about the characters .", "entity": "chinese characters", "output": "metadata embedding based cross - transformer", "neg_sample": ["chinese characters is done by using Method", "recently , word enhancement has become very popular for chinese named entity recognition ( ner ) , reducing segmentation errors and increasing the semantic and boundary information of chinese words .", "however , these methods tend to ignore the information of the chinese character structure after integrating the lexical information .", "chinese characters have evolved from pictographs since ancient times , and their structure often reflects more information about the characters ."], "relation": "used for", "id": "2021.acl-long.121", "year": 2021, "rel_sent": "With the structural characteristics of Chinese characters , MECT can better capture the semantic information of Chinese characters for NER .", "forward": false, "src_ids": "2021.acl-long.121_16062"}
{"input": "information sources is used for Task| context: we study the problem of event causality identification ( eci ) to detect causal relation between event mention pairs in text . although deep learning models have recently shown state - of - the - art performance for eci , they are limited to the intra - sentence setting where event mention pairs are presented in the same sentences .", "entity": "information sources", "output": "document - level eci", "neg_sample": ["information sources is used for Task", "we study the problem of event causality identification ( eci ) to detect causal relation between event mention pairs in text .", "although deep learning models have recently shown state - of - the - art performance for eci , they are limited to the intra - sentence setting where event mention pairs are presented in the same sentences ."], "relation": "used for", "id": "2021.naacl-main.273", "year": 2021, "rel_sent": "Various information sources are introduced to enrich the interaction graphs for DECI , featuring discourse , syntax , and semantic information .", "forward": true, "src_ids": "2021.naacl-main.273_1365"}
{"input": "document - level context is used for Task| context: in a real - time simultaneous translation setting and neural machine translation ( nmt ) models start generating target language tokens from incomplete source language sentences and making them harder to translate and leading to poor translation quality . previous research has shown that document - level nmt and comprising of sentence and context encoders and a decoder and leverages context from neighboring sentences and helps improve translation quality . in simultaneous translation settings and the context from previous sentences should be even more critical .", "entity": "document - level context", "output": "simultaneous neural machine translation", "neg_sample": ["document - level context is used for Task", "in a real - time simultaneous translation setting and neural machine translation ( nmt ) models start generating target language tokens from incomplete source language sentences and making them harder to translate and leading to poor translation quality .", "previous research has shown that document - level nmt and comprising of sentence and context encoders and a decoder and leverages context from neighboring sentences and helps improve translation quality .", "in simultaneous translation settings and the context from previous sentences should be even more critical ."], "relation": "used for", "id": "2021.mtsummit-research.17", "year": 2021, "rel_sent": "Studying The Impact Of Document - level Context On Simultaneous Neural Machine Translation.", "forward": true, "src_ids": "2021.mtsummit-research.17_3650"}
{"input": "zero - shot transfer is done by using Method| context: such dialogs are grounded in domain - specific flowcharts , which the agent is supposed tofollow during the conversation .", "entity": "zero - shot transfer", "output": "flonet", "neg_sample": ["zero - shot transfer is done by using Method", "such dialogs are grounded in domain - specific flowcharts , which the agent is supposed tofollow during the conversation ."], "relation": "used for", "id": "2021.emnlp-main.357", "year": 2021, "rel_sent": "Our experiments find that FLONET can do zero - shot transfer to unseen flowcharts , and sets a strong baseline for future research .", "forward": false, "src_ids": "2021.emnlp-main.357_4840"}
{"input": "graph - based visual question answering is done by using Method| context: images are more than a collection of objects or attributes - they represent a web of relationships among interconnected objects . scene graph has emerged as a new modality as a structured graphical representation of images . scene graph encodes objects as nodes connected via pairwise relations as edges .", "entity": "graph - based visual question answering", "output": "language - guided graph neural networks", "neg_sample": ["graph - based visual question answering is done by using Method", "images are more than a collection of objects or attributes - they represent a web of relationships among interconnected objects .", "scene graph has emerged as a new modality as a structured graphical representation of images .", "scene graph encodes objects as nodes connected via pairwise relations as edges ."], "relation": "used for", "id": "2021.maiworkshop-1.12", "year": 2021, "rel_sent": "GraghVQA : Language - Guided Graph Neural Networks for Graph - based Visual Question Answering.", "forward": false, "src_ids": "2021.maiworkshop-1.12_1827"}
{"input": "asr adaptation is used for Material| context: automatic speech recognition ( asr ) robustness toward slot entities are critical in e - commerce voice assistants that involve monetary transactions and purchases . along with effective domain adaptation , it is intuitive that cross utterance contextual cues play an important role in disambiguating domain specific content words from speech .", "entity": "asr adaptation", "output": "e - commerce chatbots", "neg_sample": ["asr adaptation is used for Material", "automatic speech recognition ( asr ) robustness toward slot entities are critical in e - commerce voice assistants that involve monetary transactions and purchases .", "along with effective domain adaptation , it is intuitive that cross utterance contextual cues play an important role in disambiguating domain specific content words from speech ."], "relation": "used for", "id": "2021.ecnlp-1.3", "year": 2021, "rel_sent": "ASR Adaptation for E - commerce Chatbots using Cross - Utterance Context and Multi - Task Language Modeling.", "forward": true, "src_ids": "2021.ecnlp-1.3_1267"}
{"input": "defence strategies is used for Generic| context: natural language processing ( nlp ) tasks , ranging from text classification to text generation , have been revolutionised by the pretrained language models , such as bert . this allows corporations to easily build powerful apis by encapsulating fine - tuned bert models for downstream tasks . however , when a fine - tuned bert model is deployed as a service , it may suffer from different attacks launched by the malicious users .", "entity": "defence strategies", "output": "victim model", "neg_sample": ["defence strategies is used for Generic", "natural language processing ( nlp ) tasks , ranging from text classification to text generation , have been revolutionised by the pretrained language models , such as bert .", "this allows corporations to easily build powerful apis by encapsulating fine - tuned bert models for downstream tasks .", "however , when a fine - tuned bert model is deployed as a service , it may suffer from different attacks launched by the malicious users ."], "relation": "used for", "id": "2021.naacl-main.161", "year": 2021, "rel_sent": "Finally , we investigate two defence strategies to protect the victim model , and find that unless the performance of the victim model is sacrificed , both model extraction and adversarial transferability can effectively compromise the target models .", "forward": true, "src_ids": "2021.naacl-main.161_10962"}
{"input": "dependency distance is used for Task| context: quantitative research on learner writing has traditionally focused on lexical and syntactic features , but there has been increasing interest in incorporating discourse - level properties .", "entity": "dependency distance", "output": "efl writing", "neg_sample": ["dependency distance is used for Task", "quantitative research on learner writing has traditionally focused on lexical and syntactic features , but there has been increasing interest in incorporating discourse - level properties ."], "relation": "used for", "id": "2021.tlt-1.10", "year": 2021, "rel_sent": "Discourse Tree Structure and Dependency Distance in EFL Writing.", "forward": true, "src_ids": "2021.tlt-1.10_2049"}
{"input": "situation - based summarization is done by using Material| context: currently , text chatting is one of the primary means of communication . however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it .", "entity": "situation - based summarization", "output": "gold - standard dataset", "neg_sample": ["situation - based summarization is done by using Material", "currently , text chatting is one of the primary means of communication .", "however , modern text chat still in general does not offer any navigation or even full - featured search , although the high volumes of messages demand it ."], "relation": "used for", "id": "2021.acl-srw.14", "year": 2021, "rel_sent": "Finally , we present the first gold - standard dataset for situation - based summarization .", "forward": false, "src_ids": "2021.acl-srw.14_376"}
{"input": "qualitative properties of natural language arguments is done by using OtherScientificTerm| context: the paper presents a novel discourse - based approach to argument quality assessment defined as a graph classification task , where the depth of reasoning ( argumentation ) is evident from the number and type of detected discourse units and relations between them .", "entity": "qualitative properties of natural language arguments", "output": "discourse - based argument structures", "neg_sample": ["qualitative properties of natural language arguments is done by using OtherScientificTerm", "the paper presents a novel discourse - based approach to argument quality assessment defined as a graph classification task , where the depth of reasoning ( argumentation ) is evident from the number and type of detected discourse units and relations between them ."], "relation": "used for", "id": "2021.ranlp-1.143", "year": 2021, "rel_sent": "The obtained accuracy ranges from 74.5 % to 85.0 % and indicates that discourse - based argument structures reflect qualitative properties of natural language arguments .", "forward": false, "src_ids": "2021.ranlp-1.143_12258"}
{"input": "syntactic structure is done by using Method| context: the validity of the law was confirmed many times for the relation between lengths of a word and its syllables . however , the relation between lengths of sentences ( measured in clauses ) and clauses ( measured in words ) is problematic .", "entity": "syntactic structure", "output": "menzerath - altmann law", "neg_sample": ["syntactic structure is done by using Method", "the validity of the law was confirmed many times for the relation between lengths of a word and its syllables .", "however , the relation between lengths of sentences ( measured in clauses ) and clauses ( measured in words ) is problematic ."], "relation": "used for", "id": "2021.quasy-1.6", "year": 2021, "rel_sent": "The Menzerath - Altmann law in syntactic structure revisited.", "forward": false, "src_ids": "2021.quasy-1.6_4630"}
{"input": "bert variation is done by using Material| context: bert has been shown to be extremely effective on a wide variety of natural language processing tasks , including sentiment analysis and emotion detection . however , the proposed pretraining objectives of bert do not induce any sentiment or emotion - specific biases into the model .", "entity": "bert variation", "output": "pre - training corpora", "neg_sample": ["bert variation is done by using Material", "bert has been shown to be extremely effective on a wide variety of natural language processing tasks , including sentiment analysis and emotion detection .", "however , the proposed pretraining objectives of bert do not induce any sentiment or emotion - specific biases into the model ."], "relation": "used for", "id": "2021.acl-short.38", "year": 2021, "rel_sent": "Using the same pre - training corpora as the original model , Wikipedia and BookCorpus , our BERT variation manages to improve the downstream performance on 4 tasks from emotion detection and sentiment analysis by an average of 1.2 % F-1 .", "forward": false, "src_ids": "2021.acl-short.38_9708"}
{"input": "metaphor detection is done by using Method| context: metaphors are ubiquitous in natural language , and detecting them requires contextual reasoning about whether a semantic incongruence actually exists . most existing work addresses this problem using pre - trained contextualized models . despite their success , these models require a large amount of labeled data and are not linguistically - based .", "entity": "metaphor detection", "output": "contrastive pre - trained model ( cate )", "neg_sample": ["metaphor detection is done by using Method", "metaphors are ubiquitous in natural language , and detecting them requires contextual reasoning about whether a semantic incongruence actually exists .", "most existing work addresses this problem using pre - trained contextualized models .", "despite their success , these models require a large amount of labeled data and are not linguistically - based ."], "relation": "used for", "id": "2021.emnlp-main.316", "year": 2021, "rel_sent": "In this paper , we proposed a ContrAstive pre - Trained modEl ( CATE ) for metaphor detection with semi - supervised learning .", "forward": false, "src_ids": "2021.emnlp-main.316_1487"}
{"input": "machine translation ( mt ) mistakes is done by using Metric| context: social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity . since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages . however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved . in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly .", "entity": "machine translation ( mt ) mistakes", "output": "automatic quality metrics", "neg_sample": ["machine translation ( mt ) mistakes is done by using Metric", "social media companies as well as censorship authorities make extensive use of artificial intelligence ( ai ) tools to monitor postings of hate speech , celebrations of violence or profanity .", "since ai software requires massive volumes of data to train computers , automatic - translation of the online content is usually implemented to compensate for the scarcity of text in some languages .", "however , machine translation ( mt ) mistakes are a regular occurrence when translating sentiment - oriented user - generated content ( ugc ) , especially when a low - resource language is involved .", "in such scenarios , the adequacy of the whole process relies on the assumption that the translation can be evaluated correctly ."], "relation": "used for", "id": "2021.triton-1.6", "year": 2021, "rel_sent": "In this paper , we assess the ability of automatic quality metrics to detect critical machine translation errors which can cause serious misunderstanding of the affect message .", "forward": false, "src_ids": "2021.triton-1.6_15803"}
{"input": "news article headlines is done by using Method| context: we present a covid-19 news dashboard which visualizes sentiment in pandemic news coverage in different languages across europe . the dashboard shows analyses for positive / neutral / negative sentiment and moral sentiment for news articles across countries and languages .", "entity": "news article headlines", "output": "multilingual bert model", "neg_sample": ["news article headlines is done by using Method", "we present a covid-19 news dashboard which visualizes sentiment in pandemic news coverage in different languages across europe .", "the dashboard shows analyses for positive / neutral / negative sentiment and moral sentiment for news articles across countries and languages ."], "relation": "used for", "id": "2021.hackashop-1.15", "year": 2021, "rel_sent": "Then we use a pre - trained multilingual BERT model for sentiment analysis of news article headlines and a dictionary and word vectors -based method for moral sentiment analysis of news articles .", "forward": false, "src_ids": "2021.hackashop-1.15_11337"}
{"input": "deep pretrained language models is used for OtherScientificTerm| context: contextual word representation models have shown massive improvements on a multitude of nlp tasks , yet their word sense disambiguation capabilities remain poorly explained .", "entity": "deep pretrained language models", "output": "anisotropic representations", "neg_sample": ["deep pretrained language models is used for OtherScientificTerm", "contextual word representation models have shown massive improvements on a multitude of nlp tasks , yet their word sense disambiguation capabilities remain poorly explained ."], "relation": "used for", "id": "2021.deelio-1.9", "year": 2021, "rel_sent": "We analyze the representation geometry and find that most layers of deep pretrained language models create highly anisotropic representations , pointing towards the existence of representation degeneration problem in contextual word representations .", "forward": true, "src_ids": "2021.deelio-1.9_11996"}
{"input": "char - level aligner is done by using Method| context: the languages examined are english , fake english , german and greek .", "entity": "char - level aligner", "output": "bert", "neg_sample": ["char - level aligner is done by using Method", "the languages examined are english , fake english , german and greek ."], "relation": "used for", "id": "2021.insights-1.3", "year": 2021, "rel_sent": "Here we investigate whether BERT can also operate as a char - level aligner .", "forward": false, "src_ids": "2021.insights-1.3_12874"}
{"input": "conversational search is used for Method| context: according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval . the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty . we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers .", "entity": "conversational search", "output": "stochastic rankers", "neg_sample": ["conversational search is used for Method", "according to the probability ranking principle ( prp ) , ranking documents in decreasing order of their probability of relevance leads to an optimal document ranking for ad - hoc retrieval .", "the prp holds when two conditions are met : [ c1 ] the models are well calibrated , and , [ c2 ] the probabilities of relevance are reported with certainty .", "we know however that deep neural networks ( dnns ) are often not well calibrated and have several sources of uncertainty , and thus [ c1 ] and [ c2 ] might not be satisfied by neural rankers ."], "relation": "used for", "id": "2021.eacl-main.12", "year": 2021, "rel_sent": "Then , motivated by our findings we use two techniques to model the uncertainty of neural rankers leading to the proposed stochastic rankers , which output a predictive distribution of relevance as opposed to point estimates .", "forward": true, "src_ids": "2021.eacl-main.12_8020"}
{"input": "dimensional sentiment analysis techniques is used for Material| context: valence represents the degree of pleasant and unpleasant ( or positive and negative ) feelings , and arousal represents the degree of excitement and calm .", "entity": "dimensional sentiment analysis techniques", "output": "educational domain", "neg_sample": ["dimensional sentiment analysis techniques is used for Material", "valence represents the degree of pleasant and unpleasant ( or positive and negative ) feelings , and arousal represents the degree of excitement and calm ."], "relation": "used for", "id": "2021.rocling-1.51", "year": 2021, "rel_sent": "We expected that this evaluation campaign could produce more advanced dimensional sentiment analysis techniques for the educational domain .", "forward": true, "src_ids": "2021.rocling-1.51_4120"}
{"input": "distantly - supervised models is used for OtherScientificTerm| context: naturally - occurring bracketings , such as answer fragments to natural language questions and hyperlinks on webpages , can reflect human syntactic intuition regarding phrasal boundaries . their availability and approximate correspondence to syntax make them appealing as distant information sources to incorporate into unsupervised constituency parsing .", "entity": "distantly - supervised models", "output": "syntactic structures", "neg_sample": ["distantly - supervised models is used for OtherScientificTerm", "naturally - occurring bracketings , such as answer fragments to natural language questions and hyperlinks on webpages , can reflect human syntactic intuition regarding phrasal boundaries .", "their availability and approximate correspondence to syntax make them appealing as distant information sources to incorporate into unsupervised constituency parsing ."], "relation": "used for", "id": "2021.naacl-main.234", "year": 2021, "rel_sent": "Experiments demonstrate that our distantly - supervised models trained on naturally - occurring bracketing data are more accurate in inducing syntactic structures than competing unsupervised systems .", "forward": true, "src_ids": "2021.naacl-main.234_1084"}
{"input": "disclosive transparency is done by using Metric| context: broader disclosive transparency - truth and clarity in communication regarding the function of ai systems - is widely considered desirable . unfortunately , it is a nebulous concept , difficult to both define and quantify . this is problematic , as previous work has demonstrated possible trade - offs and negative consequences to disclosive transparency , such as a confusion effect , where ' too much information ' clouds a reader 's understanding of what a system description means . disclosive transparency 's subjective nature has rendered deep study into these problems and their remedies difficult .", "entity": "disclosive transparency", "output": "neural language model - based probabilistic metrics", "neg_sample": ["disclosive transparency is done by using Metric", "broader disclosive transparency - truth and clarity in communication regarding the function of ai systems - is widely considered desirable .", "unfortunately , it is a nebulous concept , difficult to both define and quantify .", "this is problematic , as previous work has demonstrated possible trade - offs and negative consequences to disclosive transparency , such as a confusion effect , where ' too much information ' clouds a reader 's understanding of what a system description means .", "disclosive transparency 's subjective nature has rendered deep study into these problems and their remedies difficult ."], "relation": "used for", "id": "2021.emnlp-main.153", "year": 2021, "rel_sent": "To improve this state of affairs , We introduce neural language model - based probabilistic metrics to directly model disclosive transparency , and demonstrate that they correlate with user and expert opinions of system transparency , making them a valid objective proxy .", "forward": false, "src_ids": "2021.emnlp-main.153_13179"}
{"input": "contextual synonyms is done by using Method| context: contextualised word embeddings is a powerful tool to detect contextual synonyms . however , most of the current state - of - the - art ( sota ) deep learning concept extraction methods remain supervised and underexploit the potential of the context .", "entity": "contextual synonyms", "output": "self - supervised pre - training approach", "neg_sample": ["contextual synonyms is done by using Method", "contextualised word embeddings is a powerful tool to detect contextual synonyms .", "however , most of the current state - of - the - art ( sota ) deep learning concept extraction methods remain supervised and underexploit the potential of the context ."], "relation": "used for", "id": "2021.emnlp-main.690", "year": 2021, "rel_sent": "In this paper , we propose a self - supervised pre - training approach which is able to detect contextual synonyms of concepts being training on the data created by shallow matching .", "forward": false, "src_ids": "2021.emnlp-main.690_14921"}
{"input": "representational correlates is done by using OtherScientificTerm| context: while vector - based language representations from pretrained language models have set a new standard for many nlp tasks , there is not yet a complete accounting of their inner workings . in particular , it is not entirely clear what aspects of sentence - level syntax are captured by these representations , nor how ( if at all ) they are built along the stacked layers of the network .", "entity": "representational correlates", "output": "syntactic perturbations", "neg_sample": ["representational correlates is done by using OtherScientificTerm", "while vector - based language representations from pretrained language models have set a new standard for many nlp tasks , there is not yet a complete accounting of their inner workings .", "in particular , it is not entirely clear what aspects of sentence - level syntax are captured by these representations , nor how ( if at all ) they are built along the stacked layers of the network ."], "relation": "used for", "id": "2021.repl4nlp-1.27", "year": 2021, "rel_sent": "Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models.", "forward": false, "src_ids": "2021.repl4nlp-1.27_10562"}
{"input": "anisotropic representations is done by using Method| context: contextual word representation models have shown massive improvements on a multitude of nlp tasks , yet their word sense disambiguation capabilities remain poorly explained .", "entity": "anisotropic representations", "output": "deep pretrained language models", "neg_sample": ["anisotropic representations is done by using Method", "contextual word representation models have shown massive improvements on a multitude of nlp tasks , yet their word sense disambiguation capabilities remain poorly explained ."], "relation": "used for", "id": "2021.deelio-1.9", "year": 2021, "rel_sent": "We analyze the representation geometry and find that most layers of deep pretrained language models create highly anisotropic representations , pointing towards the existence of representation degeneration problem in contextual word representations .", "forward": false, "src_ids": "2021.deelio-1.9_11995"}
{"input": "pre - training is used for Method| context: this tutorial provides a comprehensive guide to make the most of pre - training for neural machine translation .", "entity": "pre - training", "output": "nmt", "neg_sample": ["pre - training is used for Method", "this tutorial provides a comprehensive guide to make the most of pre - training for neural machine translation ."], "relation": "used for", "id": "2021.acl-tutorials.4", "year": 2021, "rel_sent": "Then we will focus on analysing the role of pre - training in enhancing the performance of NMT , how to design a better pre - training model for executing specific NMT tasks and how to better integrate the pre - trained model into NMT system .", "forward": true, "src_ids": "2021.acl-tutorials.4_910"}
{"input": "multi - step reasoning ( mr ) is done by using Method| context: the key challenge of the visual dialog task is how tofuse features from multimodal sources and extract relevant information from dialog history to answer the current query .", "entity": "multi - step reasoning ( mr )", "output": "transformer", "neg_sample": ["multi - step reasoning ( mr ) is done by using Method", "the key challenge of the visual dialog task is how tofuse features from multimodal sources and extract relevant information from dialog history to answer the current query ."], "relation": "used for", "id": "2021.dialdoc-1.2", "year": 2021, "rel_sent": "For inference , we propose two Sequential Dialog Networks ( SeqDialN ): the first uses LSTM(Hochreiter and Schmidhuber,1997 ) for information propagation ( IP ) and the second uses a modified Transformer ( Vaswani et al . ,2017 ) for multi - step reasoning ( MR ) .", "forward": false, "src_ids": "2021.dialdoc-1.2_7136"}
{"input": "nmt model is used for OtherScientificTerm| context: reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision . although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage . there is an exponential growth in the number of e - commerce users who are not proficient in english . hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india . this can be achieved by an in - domain robust machine translation ( mt ) system . however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc .", "entity": "nmt model", "output": "noisy tokens", "neg_sample": ["nmt model is used for OtherScientificTerm", "reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision .", "although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage .", "there is an exponential growth in the number of e - commerce users who are not proficient in english .", "hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india .", "this can be achieved by an in - domain robust machine translation ( mt ) system .", "however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc ."], "relation": "used for", "id": "2021.ecnlp-1.21", "year": 2021, "rel_sent": "In order to make our NMT model robust enough to handle the noisy tokens in the reviews , we integrate a character based language model to generate word vectors and map the noisy tokens with their correct forms .", "forward": true, "src_ids": "2021.ecnlp-1.21_10406"}
{"input": "well - calibrated model is done by using Method| context: a medical dialogue system is essential for healthcare service as providing primary clinical advice and diagnoses . it has been gradually adopted and practiced in medical organizations , largely due to the advancement of nlp . the introduction of state - of - the - art deep learning models and transfer learning techniques like universal language model fine tuning ( ulmfit ) and knowledge distillation ( kd ) largely contributes to the performance of nlp tasks . however , some deep neural networks are poorly calibrated and wrongly estimate the uncertainty . hence the model is not trustworthy , especially in sensitive medical decisionmaking systems and safety tasks .", "entity": "well - calibrated model", "output": "label smoothing ( ls )", "neg_sample": ["well - calibrated model is done by using Method", "a medical dialogue system is essential for healthcare service as providing primary clinical advice and diagnoses .", "it has been gradually adopted and practiced in medical organizations , largely due to the advancement of nlp .", "the introduction of state - of - the - art deep learning models and transfer learning techniques like universal language model fine tuning ( ulmfit ) and knowledge distillation ( kd ) largely contributes to the performance of nlp tasks .", "however , some deep neural networks are poorly calibrated and wrongly estimate the uncertainty .", "hence the model is not trustworthy , especially in sensitive medical decisionmaking systems and safety tasks ."], "relation": "used for", "id": "2021.icnlsp-1.22", "year": 2021, "rel_sent": "The calibrated ULMFiT ( CULMFiT ) is obtained by incorporating label smoothing ( LS ) to achieve a well - calibrated model .", "forward": false, "src_ids": "2021.icnlsp-1.22_12897"}
{"input": "re - ranking mechanism is used for OtherScientificTerm| context: in previous studies , researchers have faced various issues such as the out - of - vocabulary problem and over / under - specificity problems . over - specific definitions present narrow word meanings , whereas under - specific definitions present general and context - insensitive meanings .", "entity": "re - ranking mechanism", "output": "specificity in definitions", "neg_sample": ["re - ranking mechanism is used for OtherScientificTerm", "in previous studies , researchers have faced various issues such as the out - of - vocabulary problem and over / under - specificity problems .", "over - specific definitions present narrow word meanings , whereas under - specific definitions present general and context - insensitive meanings ."], "relation": "used for", "id": "2021.emnlp-main.194", "year": 2021, "rel_sent": "The proposed method addresses the aforementioned problems by leveraging a pre - trained encoder - decoder model , namely Text - to - Text Transfer Transformer , and introducing a re - ranking mechanism to model specificity in definitions .", "forward": true, "src_ids": "2021.emnlp-main.194_7255"}
{"input": "read operation is done by using Method| context: various machine learning tasks can benefit from access to external information of different modalities , such as text and images . recent work has focused on learning architectures with large memories capable of storing this knowledge .", "entity": "read operation", "output": "kif module", "neg_sample": ["read operation is done by using Method", "various machine learning tasks can benefit from access to external information of different modalities , such as text and images .", "recent work has focused on learning architectures with large memories capable of storing this knowledge ."], "relation": "used for", "id": "2021.tacl-1.6", "year": 2021, "rel_sent": "Each KIF module learns a read operation to access fixed external knowledge .", "forward": false, "src_ids": "2021.tacl-1.6_9751"}
{"input": "dynamic terminology integration is used for Material| context: the majority of language domains require prudent use of terminology to ensure clarity and adequacy of information conveyed . while the correct use of terminology for some languages and domains can be achieved by adapting general - purpose mt systems on large volumes of in - domain parallel data , such quantities of domain - specific data are seldom available for less - resourced languages and niche domains . however , the gravity of this recent calamity created a high demand for reliable translation of critical information regarding pandemic and infection prevention .", "entity": "dynamic terminology integration", "output": "covid-19", "neg_sample": ["dynamic terminology integration is used for Material", "the majority of language domains require prudent use of terminology to ensure clarity and adequacy of information conveyed .", "while the correct use of terminology for some languages and domains can be achieved by adapting general - purpose mt systems on large volumes of in - domain parallel data , such quantities of domain - specific data are seldom available for less - resourced languages and niche domains .", "however , the gravity of this recent calamity created a high demand for reliable translation of critical information regarding pandemic and infection prevention ."], "relation": "used for", "id": "2021.wmt-1.81", "year": 2021, "rel_sent": "Dynamic Terminology Integration for COVID-19 and Other Emerging Domains.", "forward": true, "src_ids": "2021.wmt-1.81_3814"}
{"input": "cross - lingual language model pre - training is done by using OtherScientificTerm| context: compared to monolingual models , cross - lingual models usually require a more expressive vocabulary to represent all languages adequately . we find that many languages are under - represented in recent cross - lingual language models due to the limited vocabulary capacity . however , increasing the vocabulary size significantly slows down the pre - training speed .", "entity": "cross - lingual language model pre - training", "output": "multilingual vocabulary", "neg_sample": ["cross - lingual language model pre - training is done by using OtherScientificTerm", "compared to monolingual models , cross - lingual models usually require a more expressive vocabulary to represent all languages adequately .", "we find that many languages are under - represented in recent cross - lingual language models due to the limited vocabulary capacity .", "however , increasing the vocabulary size significantly slows down the pre - training speed ."], "relation": "used for", "id": "2021.emnlp-main.257", "year": 2021, "rel_sent": "Our experiments show that the multilingual vocabulary learned with VoCap benefits cross - lingual language model pre - training .", "forward": false, "src_ids": "2021.emnlp-main.257_2269"}
{"input": "tailored pre - training model is used for Task| context: large pre - trained language generation models such as gpt-2 have demonstrated their effectiveness as language priors by reaching state - of - the - art results in various language generation tasks . however , the performance of pre - trained models on task - oriented dialog tasks is still under - explored .", "entity": "tailored pre - training model", "output": "task - oriented dialog generation", "neg_sample": ["tailored pre - training model is used for Task", "large pre - trained language generation models such as gpt-2 have demonstrated their effectiveness as language priors by reaching state - of - the - art results in various language generation tasks .", "however , the performance of pre - trained models on task - oriented dialog tasks is still under - explored ."], "relation": "used for", "id": "2021.acl-short.40", "year": 2021, "rel_sent": "PRAL : A Tailored Pre - Training Model for Task - Oriented Dialog Generation.", "forward": true, "src_ids": "2021.acl-short.40_6000"}
{"input": "propositional content and syntactic category information is used for Task| context: expectation - based theories of sentence processing posit that processing difficulty is determined by predictability in context . while predictability quantified via surprisal has gained empirical support , this representation - agnostic measure leaves open the question of how to best approximate the human comprehender 's latent probability model .", "entity": "propositional content and syntactic category information", "output": "incremental sentence processing", "neg_sample": ["propositional content and syntactic category information is used for Task", "expectation - based theories of sentence processing posit that processing difficulty is determined by predictability in context .", "while predictability quantified via surprisal has gained empirical support , this representation - agnostic measure leaves open the question of how to best approximate the human comprehender 's latent probability model ."], "relation": "used for", "id": "2021.cmcl-1.28", "year": 2021, "rel_sent": "Taken together , these results suggest a role for propositional content and syntactic category information in incremental sentence processing .", "forward": true, "src_ids": "2021.cmcl-1.28_1787"}
{"input": "structured graphs is used for Material| context: abstractive conversation summarization has received much attention recently . however , these generated summaries often suffer from insufficient , redundant , or incorrect content , largely due to the unstructured and complex characteristics of human - human interactions .", "entity": "structured graphs", "output": "conversations", "neg_sample": ["structured graphs is used for Material", "abstractive conversation summarization has received much attention recently .", "however , these generated summaries often suffer from insufficient , redundant , or incorrect content , largely due to the unstructured and complex characteristics of human - human interactions ."], "relation": "used for", "id": "2021.naacl-main.109", "year": 2021, "rel_sent": "To this end , we propose to explicitly model the rich structures in conversations for more precise and accurate conversation summarization , by first incorporating discourse relations between utterances and action triples ( ' who - doing - what ' ) in utterances through structured graphs to better encode conversations , and then designing a multi - granularity decoder to generate summaries by combining all levels of information .", "forward": true, "src_ids": "2021.naacl-main.109_8413"}
{"input": "retrieval is done by using Method| context: large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum . the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation . the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation .", "entity": "retrieval", "output": "dual encoder models", "neg_sample": ["retrieval is done by using Method", "large - scale document retrieval systems often utilize two styles of neural network models which live at two different ends of the joint computation vs. accuracy spectrum .", "the first style is dual encoder ( or two - tower ) models , where the query and document representations are computed completely independently and combined with a simple dot product operation .", "the second style is cross - attention models , where the query and document features are concatenated in the input layer and all computation is based on the joint query - document representation ."], "relation": "used for", "id": "2021.emnlp-main.443", "year": 2021, "rel_sent": "Dual encoder models are typically used for retrieval and deep re - ranking , while cross - attention models are typically used for shallow re - ranking .", "forward": false, "src_ids": "2021.emnlp-main.443_241"}
{"input": "clustering is done by using OtherScientificTerm| context: the words in a single morphological paradigm are different inflectional variants of an underlying lemma , meaning that the words share a common core meaning . they also - usually - show a high degree of orthographical similarity .", "entity": "clustering", "output": "randomly initialized centroids", "neg_sample": ["clustering is done by using OtherScientificTerm", "the words in a single morphological paradigm are different inflectional variants of an underlying lemma , meaning that the words share a common core meaning .", "they also - usually - show a high degree of orthographical similarity ."], "relation": "used for", "id": "2021.sigmorphon-1.10", "year": 2021, "rel_sent": "Following these intuitions , we investigate KMeans clustering using two different types of word representations : one focusing on orthographical similarity and the other focusing on semantic similarity . Additionally , we discuss the merits of randomly initialized centroids versus pre - defined centroids for clustering .", "forward": false, "src_ids": "2021.sigmorphon-1.10_5864"}
{"input": "roberta is used for OtherScientificTerm| context: appraisal theories explain how the cognitive evaluation of an event leads to a particular emotion . in contrast to theories of basic emotions or affect ( valence / arousal ) , this theory has not received a lot of attention in natural language processing . yet , in psychology it has been proven powerful : smith and ellsworth ( 1985 ) showed that the appraisal dimensions attention , certainty , anticipated effort , pleasantness , responsibility / control and situational control discriminate between ( at least ) 15 emotion classes .", "entity": "roberta", "output": "appraisal variables", "neg_sample": ["roberta is used for OtherScientificTerm", "appraisal theories explain how the cognitive evaluation of an event leads to a particular emotion .", "in contrast to theories of basic emotions or affect ( valence / arousal ) , this theory has not received a lot of attention in natural language processing .", "yet , in psychology it has been proven powerful : smith and ellsworth ( 1985 ) showed that the appraisal dimensions attention , certainty , anticipated effort , pleasantness , responsibility / control and situational control discriminate between ( at least ) 15 emotion classes ."], "relation": "used for", "id": "2021.wassa-1.17", "year": 2021, "rel_sent": "We evaluate these strategies in two ways : by measuring inter - annotator agreement and by fine- tuning RoBERTa to predict appraisal variables .", "forward": true, "src_ids": "2021.wassa-1.17_9215"}
{"input": "multi - target mrc task is used for Material| context: it has been widely studied recently , especially in open domains . however , few efforts have been made on closed - domain mrc , mainly due to the lack of large - scale training data .", "entity": "multi - target mrc task", "output": "medical domain", "neg_sample": ["multi - target mrc task is used for Material", "it has been widely studied recently , especially in open domains .", "however , few efforts have been made on closed - domain mrc , mainly due to the lack of large - scale training data ."], "relation": "used for", "id": "2021.findings-acl.197", "year": 2021, "rel_sent": "In this paper , we introduce a multi - target MRC task for the medical domain , whose goal is to predict answers to medical questions and the corresponding support sentences from medical information sources simultaneously , in order to ensure the high reliability of medical knowledge serving .", "forward": true, "src_ids": "2021.findings-acl.197_707"}
{"input": "mart - based transformers is used for OtherScientificTerm| context: story visualization is an underexplored task that falls at the intersection of many important research directions in both computer vision and natural language processing . in this task , given a series of natural language captions which compose a story , an agent must generate a sequence of images that correspond to the captions . prior work has introduced recurrent generative models which outperform text - to - image synthesis models on this task . however , there is room for improvement of generated images in terms of visual quality , coherence and relevance .", "entity": "mart - based transformers", "output": "complex interactions between frames", "neg_sample": ["mart - based transformers is used for OtherScientificTerm", "story visualization is an underexplored task that falls at the intersection of many important research directions in both computer vision and natural language processing .", "in this task , given a series of natural language captions which compose a story , an agent must generate a sequence of images that correspond to the captions .", "prior work has introduced recurrent generative models which outperform text - to - image synthesis models on this task .", "however , there is room for improvement of generated images in terms of visual quality , coherence and relevance ."], "relation": "used for", "id": "2021.naacl-main.194", "year": 2021, "rel_sent": "We present a number of improvements to prior modeling approaches , including ( 1 ) the addition of a dual learning framework that utilizes video captioning to reinforce the semantic alignment between the story and generated images , ( 2 ) a copy - transform mechanism for sequentially - consistent story visualization , and ( 3 ) MART - based transformers to model complex interactions between frames .", "forward": true, "src_ids": "2021.naacl-main.194_3825"}
{"input": "clevr_hyp is used for Task| context: in this paper , we take visual understanding to a higher level where systems are challenged to answer questions that involve mentally simulating the hypothetical consequences of performing specific actions in a given scenario .", "entity": "clevr_hyp", "output": "visual question answering", "neg_sample": ["clevr_hyp is used for Task", "in this paper , we take visual understanding to a higher level where systems are challenged to answer questions that involve mentally simulating the hypothetical consequences of performing specific actions in a given scenario ."], "relation": "used for", "id": "2021.naacl-main.289", "year": 2021, "rel_sent": "CLEVR_HYP : A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images.", "forward": true, "src_ids": "2021.naacl-main.289_6935"}
{"input": "clause recommendation is used for Task| context: contracts are a common type of legal document that frequent in several day - to - day business workflows . however , there has been very limited nlp research in processing such documents , and even lesser in generating them . these contracts are made up of clauses , and the unique nature of these clauses calls for specific methods to understand and generate such documents .", "entity": "clause recommendation", "output": "authoring of contract documents", "neg_sample": ["clause recommendation is used for Task", "contracts are a common type of legal document that frequent in several day - to - day business workflows .", "however , there has been very limited nlp research in processing such documents , and even lesser in generating them .", "these contracts are made up of clauses , and the unique nature of these clauses calls for specific methods to understand and generate such documents ."], "relation": "used for", "id": "2021.emnlp-main.691", "year": 2021, "rel_sent": "In this paper , we introduce the task of clause recommendation , as a first step to aid and accelerate the authoring of contract documents .", "forward": true, "src_ids": "2021.emnlp-main.691_15446"}
{"input": "bias mitigation is used for Method| context: fine - tuned language models have been shown to exhibit biases against protected groups in a host of modeling tasks such as text classification and coreference resolution . previous works focus on detecting these biases , reducing bias in data representations , and using auxiliary training objectives to mitigate bias during fine - tuning . although these techniques achieve bias reduction for the task and domain at hand , the effects of bias mitigation may not directly transfer to new tasks , requiring additional data collection and customized annotation of sensitive attributes , and re - evaluation of appropriate fairness metrics .", "entity": "bias mitigation", "output": "lm fine - tuning", "neg_sample": ["bias mitigation is used for Method", "fine - tuned language models have been shown to exhibit biases against protected groups in a host of modeling tasks such as text classification and coreference resolution .", "previous works focus on detecting these biases , reducing bias in data representations , and using auxiliary training objectives to mitigate bias during fine - tuning .", "although these techniques achieve bias reduction for the task and domain at hand , the effects of bias mitigation may not directly transfer to new tasks , requiring additional data collection and customized annotation of sensitive attributes , and re - evaluation of appropriate fairness metrics ."], "relation": "used for", "id": "2021.naacl-main.296", "year": 2021, "rel_sent": "Though challenges remain , we show that UBM promises more efficient and accessible bias mitigation in LM fine - tuning .", "forward": true, "src_ids": "2021.naacl-main.296_11662"}
{"input": "redundant head enlivening ( rhe ) method is used for OtherScientificTerm| context: multi - head self - attention recently attracts enormous interest owing to its specialized functions , significant parallelizable computation , and flexible extensibility .", "entity": "redundant head enlivening ( rhe ) method", "output": "redundant heads", "neg_sample": ["redundant head enlivening ( rhe ) method is used for OtherScientificTerm", "multi - head self - attention recently attracts enormous interest owing to its specialized functions , significant parallelizable computation , and flexible extensibility ."], "relation": "used for", "id": "2021.emnlp-main.260", "year": 2021, "rel_sent": "We propose a redundant head enlivening ( RHE ) method to precisely identify redundant heads , and then vitalize their potential by learning syntactic relations and prior knowledge in the text without sacrificing the roles of important heads .", "forward": true, "src_ids": "2021.emnlp-main.260_566"}
{"input": "information sharing is done by using Method| context: pretraining and multitask learning are widely used to improve the speech translation performance .", "entity": "information sharing", "output": "initialization strategy", "neg_sample": ["information sharing is done by using Method", "pretraining and multitask learning are widely used to improve the speech translation performance ."], "relation": "used for", "id": "2021.acl-long.328", "year": 2021, "rel_sent": "First , a parameter sharing and initialization strategy is proposed to enhance information sharing between the tasks .", "forward": false, "src_ids": "2021.acl-long.328_15431"}
{"input": "argument impact classification task is done by using Method| context: discourse relations among arguments reveal logical structures of a debate conversation . however , no prior work has explicitly studied how the sequence of discourse relations influence a claim 's impact . this paper empirically shows that the discourse relations between two arguments along the context path are essential factors for identifying the persuasive power of an argument .", "entity": "argument impact classification task", "output": "attention and gate mechanisms", "neg_sample": ["argument impact classification task is done by using Method", "discourse relations among arguments reveal logical structures of a debate conversation .", "however , no prior work has explicitly studied how the sequence of discourse relations influence a claim 's impact .", "this paper empirically shows that the discourse relations between two arguments along the context path are essential factors for identifying the persuasive power of an argument ."], "relation": "used for", "id": "2021.acl-long.306", "year": 2021, "rel_sent": "Experimental results and extensive analysis show that the attention and gate mechanisms that explicitly model contexts and texts can indeed help the argument impact classification task defined by Durmus et al .", "forward": false, "src_ids": "2021.acl-long.306_3733"}
{"input": "historical text archives is done by using Method| context: finding the year of writing for a historical text is of crucial importance to historical research . however , the year of original creation is rarely explicitly stated and must be inferred from the text content , historical records , and codicological clues . given a transcribed text , machine learning has successfully been used to estimate the year of production .", "entity": "historical text archives", "output": "estimation approaches", "neg_sample": ["historical text archives is done by using Method", "finding the year of writing for a historical text is of crucial importance to historical research .", "however , the year of original creation is rarely explicitly stated and must be inferred from the text content , historical records , and codicological clues .", "given a transcribed text , machine learning has successfully been used to estimate the year of production ."], "relation": "used for", "id": "2021.nodalida-main.15", "year": 2021, "rel_sent": "In this paper , we present an overview of several estimation approaches for historical text archives spanning from the 12th century until today .", "forward": false, "src_ids": "2021.nodalida-main.15_13219"}
{"input": "relation extraction is done by using Method| context: distantly supervised datasets for relation extraction mostly focus on sentence - level extraction , and they cover very few relations .", "entity": "relation extraction", "output": "hierarchical entity graph convolutional network", "neg_sample": ["relation extraction is done by using Method", "distantly supervised datasets for relation extraction mostly focus on sentence - level extraction , and they cover very few relations ."], "relation": "used for", "id": "2021.ranlp-1.115", "year": 2021, "rel_sent": "A Hierarchical Entity Graph Convolutional Network for Relation Extraction across Documents.", "forward": false, "src_ids": "2021.ranlp-1.115_8750"}
{"input": "disfluencies is done by using Task| context: speech disfluencies are prevalent in spontaneous speech . the rising popularity of voice assistants presents a growing need to handle naturally occurring disfluencies .", "entity": "disfluencies", "output": "semantic parsing", "neg_sample": ["disfluencies is done by using Task", "speech disfluencies are prevalent in spontaneous speech .", "the rising popularity of voice assistants presents a growing need to handle naturally occurring disfluencies ."], "relation": "used for", "id": "2021.eacl-main.150", "year": 2021, "rel_sent": "We find that a state - of - the - art semantic parser does not seamlessly handle disfluencies .", "forward": false, "src_ids": "2021.eacl-main.150_7903"}
{"input": "bert is done by using OtherScientificTerm| context: type- and token - based embedding architectures are still competing in lexical semantic change detection . the recent success of type - based models in semeval-2020 task 1 has raised the question why the success of token - based models on a variety of other nlp tasks does not translate to our field .", "entity": "bert", "output": "orthography", "neg_sample": ["bert is done by using OtherScientificTerm", "type- and token - based embedding architectures are still competing in lexical semantic change detection .", "the recent success of type - based models in semeval-2020 task 1 has raised the question why the success of token - based models on a variety of other nlp tasks does not translate to our field ."], "relation": "used for", "id": "2021.eacl-srw.25", "year": 2021, "rel_sent": "By reducing the influence of orthography we considerably improve BERT 's performance .", "forward": false, "src_ids": "2021.eacl-srw.25_4744"}
{"input": "annotation projection technique is used for OtherScientificTerm| context: the data set comprises a total of 100 contracts , obtained from 25 documents annotated in four different languages : english , german , italian , and polish .", "entity": "annotation projection technique", "output": "annotations", "neg_sample": ["annotation projection technique is used for OtherScientificTerm", "the data set comprises a total of 100 contracts , obtained from 25 documents annotated in four different languages : english , german , italian , and polish ."], "relation": "used for", "id": "2021.nllp-1.1", "year": 2021, "rel_sent": "We show how a simple yet efficient annotation projection technique based on sentence embeddings could be used to automatically transfer annotations across languages .", "forward": true, "src_ids": "2021.nllp-1.1_10852"}
{"input": "metadata extraction is done by using Material| context: applications based on scholarly data are of ever increasing importance . this results in disadvantages for areas where high - quality data and compatible systems are not available , such as non - english publications .", "entity": "metadata extraction", "output": "high - quality data set", "neg_sample": ["metadata extraction is done by using Material", "applications based on scholarly data are of ever increasing importance .", "this results in disadvantages for areas where high - quality data and compatible systems are not available , such as non - english publications ."], "relation": "used for", "id": "2021.sdp-1.8", "year": 2021, "rel_sent": "To advance the mitigation of this imbalance , we use Cyrillic script publications from the CORE collection to create a high - quality data set for metadata extraction .", "forward": false, "src_ids": "2021.sdp-1.8_694"}
{"input": "decoupled transformer is used for Task| context: large transformer models , such as bert , achieve state - of - the - art results in machine reading comprehension ( mrc ) for open - domain question answering ( qa ) . however , transformers have a high computational cost for inference which makes them hard to apply to online qa systems for applications like voice assistants .", "entity": "decoupled transformer", "output": "open - domain mrc", "neg_sample": ["decoupled transformer is used for Task", "large transformer models , such as bert , achieve state - of - the - art results in machine reading comprehension ( mrc ) for open - domain question answering ( qa ) .", "however , transformers have a high computational cost for inference which makes them hard to apply to online qa systems for applications like voice assistants ."], "relation": "used for", "id": "2021.ranlp-1.44", "year": 2021, "rel_sent": "In experiments on the SQUAD 2.0 dataset , a decoupled transformer reduces the computational cost and latency of open - domain MRC by 30 - 40 % with only 1.2 points worse F1 - score compared to a standard transformer .", "forward": true, "src_ids": "2021.ranlp-1.44_12778"}
{"input": "argumentation quality is done by using OtherScientificTerm| context: the combination of gestures , intonations , and textual content plays a key role in argument delivery . however , the current literature mostly considers textual content while assessing the quality of an argument , and it is limited to datasets containing short sequences ( 18 - 48 words ) .", "entity": "argumentation quality", "output": "multimodal cues", "neg_sample": ["argumentation quality is done by using OtherScientificTerm", "the combination of gestures , intonations , and textual content plays a key role in argument delivery .", "however , the current literature mostly considers textual content while assessing the quality of an argument , and it is limited to datasets containing short sequences ( 18 - 48 words ) ."], "relation": "used for", "id": "2021.emnlp-main.515", "year": 2021, "rel_sent": "Through ablation studies , we demonstrate the importance of multimodal cues in modeling argument quality .", "forward": false, "src_ids": "2021.emnlp-main.515_11192"}
{"input": "statistical and natural language processing techniques is used for Material| context: over the years customers ' expectation of getting information instantaneously has given rise to the increased usage of channels like virtual assistants . typically , customers try to get their questions answered by low - touch channels like search and virtual assistant first , before getting in touch with a live chat agent or the phone representative . higher usage of these low - touch systems is a win - win for both customers and the organization since it enables organizations to attain a low cost of service while customers get served without delay .", "entity": "statistical and natural language processing techniques", "output": "financial domain", "neg_sample": ["statistical and natural language processing techniques is used for Material", "over the years customers ' expectation of getting information instantaneously has given rise to the increased usage of channels like virtual assistants .", "typically , customers try to get their questions answered by low - touch channels like search and virtual assistant first , before getting in touch with a live chat agent or the phone representative .", "higher usage of these low - touch systems is a win - win for both customers and the organization since it enables organizations to attain a low cost of service while customers get served without delay ."], "relation": "used for", "id": "2021.fnp-1.1", "year": 2021, "rel_sent": "Data Driven Content Creation using Statistical and Natural Language Processing Techniques for Financial Domain.", "forward": true, "src_ids": "2021.fnp-1.1_10924"}
{"input": "toxic spans detection is done by using Method| context: although several resources and systems have been developed sofar in the context of offensive language , both annotation and tasks have mainly focused on classifying whether a text is offensive or not . however , detecting toxic spans is crucial to identify why a text is toxic and can assist human moderators to locate this type of content on social media .", "entity": "toxic spans detection", "output": "bilstm - crf model", "neg_sample": ["toxic spans detection is done by using Method", "although several resources and systems have been developed sofar in the context of offensive language , both annotation and tasks have mainly focused on classifying whether a text is offensive or not .", "however , detecting toxic spans is crucial to identify why a text is toxic and can assist human moderators to locate this type of content on social media ."], "relation": "used for", "id": "2021.semeval-1.134", "year": 2021, "rel_sent": "SINAI at SemEval-2021 Task 5 : Combining Embeddings in a BiLSTM - CRF model for Toxic Spans Detection.", "forward": false, "src_ids": "2021.semeval-1.134_9376"}
{"input": "cross - document re is used for Task| context: existing relation extraction ( re ) methods typically focus on extracting relational facts between entity pairs within single sentences or documents . however , a large quantity of relational facts in knowledge bases can only be inferred across documents in practice .", "entity": "cross - document re", "output": "knowledge acquisition", "neg_sample": ["cross - document re is used for Task", "existing relation extraction ( re ) methods typically focus on extracting relational facts between entity pairs within single sentences or documents .", "however , a large quantity of relational facts in knowledge bases can only be inferred across documents in practice ."], "relation": "used for", "id": "2021.emnlp-main.366", "year": 2021, "rel_sent": "In this work , we present the problem of cross - document RE , making an initial step towards knowledge acquisition in the wild .", "forward": true, "src_ids": "2021.emnlp-main.366_3859"}
{"input": "triage procedures is done by using Method| context: respiratory insufficiency is a symptom that requires hospitalization . this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients .", "entity": "triage procedures", "output": "automated speech analysis", "neg_sample": ["triage procedures is done by using Method", "respiratory insufficiency is a symptom that requires hospitalization .", "this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients ."], "relation": "used for", "id": "2021.findings-acl.55", "year": 2021, "rel_sent": "Thus we validated the project 's Leading Hypothesis , namely that it is possible to detect respiratory insufficiency in speech utterances , under real - life environmental conditions ; we believe our results justify further enquiries into the use of automated speech analysis to support health professionals in triage procedures .", "forward": false, "src_ids": "2021.findings-acl.55_3027"}
{"input": "fine - tuning is done by using OtherScientificTerm| context: low - resource multilingual neural machine translation ( mnmt ) is typically tasked with improving the translation performance on one or more language pairs with the aid of high - resource language pairs .", "entity": "fine - tuning", "output": "curricula", "neg_sample": ["fine - tuning is done by using OtherScientificTerm", "low - resource multilingual neural machine translation ( mnmt ) is typically tasked with improving the translation performance on one or more language pairs with the aid of high - resource language pairs ."], "relation": "used for", "id": "2021.mtsummit-research.1", "year": 2021, "rel_sent": "We show on the FLORES low - resource translation dataset that these learned curricula can provide better starting points for fine tuning and improve overall performance of the translation system .", "forward": false, "src_ids": "2021.mtsummit-research.1_380"}
{"input": "semantic representation of utterances is done by using OtherScientificTerm| context: a key challenge of dialog systems research is to effectively and efficiently adapt to new domains . a scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few - shot settings .", "entity": "semantic representation of utterances", "output": "[ cls ] token", "neg_sample": ["semantic representation of utterances is done by using OtherScientificTerm", "a key challenge of dialog systems research is to effectively and efficiently adapt to new domains .", "a scalable paradigm for adaptation necessitates the development of generalizable models that perform well in few - shot settings ."], "relation": "used for", "id": "2021.naacl-main.237", "year": 2021, "rel_sent": "Observers are tokens that are not attended to , and are an alternative to the [ CLS ] token as a semantic representation of utterances .", "forward": false, "src_ids": "2021.naacl-main.237_3196"}
{"input": "automatic detection of exaggerated statements is done by using Generic| context: there is a huge difference between a scientific journal reporting ' wine consumption might be correlated to cancer ' , and a media outlet publishing ' wine causes cancer ' citing the journal 's results . the above example is a typical case of a scientific statement being exaggerated as an outcome of the rising problem of media manipulation . given a pair of statements ( say one from the source journal article and the other from the news article covering the results published in the journal ) , is it possible to ascertain with some confidence whether one is an exaggerated version of the other ?", "entity": "automatic detection of exaggerated statements", "output": "three - step approach", "neg_sample": ["automatic detection of exaggerated statements is done by using Generic", "there is a huge difference between a scientific journal reporting ' wine consumption might be correlated to cancer ' , and a media outlet publishing ' wine causes cancer ' citing the journal 's results .", "the above example is a typical case of a scientific statement being exaggerated as an outcome of the rising problem of media manipulation .", "given a pair of statements ( say one from the source journal article and the other from the news article covering the results published in the journal ) , is it possible to ascertain with some confidence whether one is an exaggerated version of the other ?"], "relation": "used for", "id": "2021.eacl-main.289", "year": 2021, "rel_sent": "A Simple Three - Step Approach for the Automatic Detection of Exaggerated Statements in Health Science News.", "forward": false, "src_ids": "2021.eacl-main.289_10501"}
{"input": "joint hierarchies is done by using Method| context: learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations . box embeddings ( vilnis et al . , 2018 ; li et al . , 2019 ; dasgupta et al . , 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure . while it is possible to represent joint hierarchies with this method , the parameters for each hierarchy are decoupled , making generalization between hierarchies infeasible .", "entity": "joint hierarchies", "output": "box - to - box transformations", "neg_sample": ["joint hierarchies is done by using Method", "learning representations of entities and relations in structured knowledge bases is an active area of research , with much emphasis placed on choosing the appropriate geometry to capture the hierarchical structures exploited in , for example , isa or haspart relations .", "box embeddings ( vilnis et al .", ", 2018 ; li et al .", ", 2019 ; dasgupta et al .", ", 2020 ) , which represent concepts as n - dimensional hyperrectangles , are capable of embedding hierarchies when training on a subset of the transitive closure .", "while it is possible to represent joint hierarchies with this method , the parameters for each hierarchy are decoupled , making generalization between hierarchies infeasible ."], "relation": "used for", "id": "2021.repl4nlp-1.28", "year": 2021, "rel_sent": "Box - To - Box Transformations for Modeling Joint Hierarchies.", "forward": false, "src_ids": "2021.repl4nlp-1.28_14377"}
{"input": "encoder - only models is used for OtherScientificTerm| context: we investigate if , given a simple symbol masking strategy , self - attention models are capable of learning nested structures and generalise over their depth . we do so in the simplest setting possible , namely languages consisting of nested parentheses of several kinds .", "entity": "encoder - only models", "output": "randomly masked symbols", "neg_sample": ["encoder - only models is used for OtherScientificTerm", "we investigate if , given a simple symbol masking strategy , self - attention models are capable of learning nested structures and generalise over their depth .", "we do so in the simplest setting possible , namely languages consisting of nested parentheses of several kinds ."], "relation": "used for", "id": "2021.findings-acl.67", "year": 2021, "rel_sent": "We use encoder - only models , which we train to predict randomly masked symbols , in a BERTlike fashion .", "forward": true, "src_ids": "2021.findings-acl.67_581"}
{"input": "commonsense question answering is done by using OtherScientificTerm| context: commonsense question answering ( qa ) requires a model to grasp commonsense and factual knowledge to answer questions about world events .", "entity": "commonsense question answering", "output": "knowledge graphs", "neg_sample": ["commonsense question answering is done by using OtherScientificTerm", "commonsense question answering ( qa ) requires a model to grasp commonsense and factual knowledge to answer questions about world events ."], "relation": "used for", "id": "2021.findings-acl.102", "year": 2021, "rel_sent": "Fusing Context Into Knowledge Graph for Commonsense Question Answering.", "forward": false, "src_ids": "2021.findings-acl.102_14370"}
{"input": "multiple reference image captioning is done by using Method| context: in image captioning , multiple captions are often provided as ground truths , since a valid caption is not always uniquely determined . conventional methods randomly select a single caption and treat it as correct , but there have been few effective training methods that utilize multiple given captions .", "entity": "multiple reference image captioning", "output": "validity - based sampling and smoothing methods", "neg_sample": ["multiple reference image captioning is done by using Method", "in image captioning , multiple captions are often provided as ground truths , since a valid caption is not always uniquely determined .", "conventional methods randomly select a single caption and treat it as correct , but there have been few effective training methods that utilize multiple given captions ."], "relation": "used for", "id": "2021.maiworkshop-1.6", "year": 2021, "rel_sent": "Validity - Based Sampling and Smoothing Methods for Multiple Reference Image Captioning.", "forward": false, "src_ids": "2021.maiworkshop-1.6_8588"}
{"input": "generation of prolonged stories is done by using Method| context: in visual storytelling , a short story is generated based on a given image sequence . despite years of work , most visual storytelling models remain limited in terms of the generated stories ' fixed length : most models produce stories with exactly five sentences because five - sentence stories dominate the training data . the fix - length stories carry limited details and provide ambiguous textual information to the readers .", "entity": "generation of prolonged stories", "output": "stretch - vst", "neg_sample": ["generation of prolonged stories is done by using Method", "in visual storytelling , a short story is generated based on a given image sequence .", "despite years of work , most visual storytelling models remain limited in terms of the generated stories ' fixed length : most models produce stories with exactly five sentences because five - sentence stories dominate the training data .", "the fix - length stories carry limited details and provide ambiguous textual information to the readers ."], "relation": "used for", "id": "2021.acl-demo.42", "year": 2021, "rel_sent": "This paper presents Stretch - VST , a visual storytelling framework that enables the generation of prolonged stories by adding appropriate knowledge , which is selected by the proposed scoring function .", "forward": false, "src_ids": "2021.acl-demo.42_8553"}
{"input": "mistranslation of sentiment is done by using Metric| context: in translating text where sentiment is the main message , human translators give particular attention to sentiment - carrying words . the reason is that an incorrect translation of such words would miss the fundamental aspect of the source text , i.e. the author 's sentiment . in the online world , mt systems are extensively used to translate user - generated content ( ugc ) such as reviews , tweets , and social media posts , where the main message is often the author 's positive or negative attitude towards the topic of the text . it is important in such scenarios to accurately measure how far an mt system can be a reliable real - life utility in transferring the correct affect message .", "entity": "mistranslation of sentiment", "output": "quality metrics", "neg_sample": ["mistranslation of sentiment is done by using Metric", "in translating text where sentiment is the main message , human translators give particular attention to sentiment - carrying words .", "the reason is that an incorrect translation of such words would miss the fundamental aspect of the source text , i.e.", "the author 's sentiment .", "in the online world , mt systems are extensively used to translate user - generated content ( ugc ) such as reviews , tweets , and social media posts , where the main message is often the author 's positive or negative attitude towards the topic of the text .", "it is important in such scenarios to accurately measure how far an mt system can be a reliable real - life utility in transferring the correct affect message ."], "relation": "used for", "id": "2021.ranlp-1.137", "year": 2021, "rel_sent": "We evaluate the efficacy of conventional quality metrics in spotting a mistranslation of sentiment , especially when it is the sole error in the MT output .", "forward": false, "src_ids": "2021.ranlp-1.137_11504"}
{"input": "pretrained models is used for OtherScientificTerm| context: we present a new probing dataset named prost : physical reasoning about objects through space and time . this dataset contains 18,736 multiple - choice questions made from 14 manually curated templates , covering 10 physical reasoning concepts .", "entity": "pretrained models", "output": "physical interactions", "neg_sample": ["pretrained models is used for OtherScientificTerm", "we present a new probing dataset named prost : physical reasoning about objects through space and time .", "this dataset contains 18,736 multiple - choice questions made from 14 manually curated templates , covering 10 physical reasoning concepts ."], "relation": "used for", "id": "2021.findings-acl.404", "year": 2021, "rel_sent": "These results provide support for the hypothesis that current pretrained models ' ability to reason about physical interactions is inherently limited by a lack of real world experience .", "forward": true, "src_ids": "2021.findings-acl.404_15676"}
{"input": "classification datasets is done by using OtherScientificTerm| context: we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains . recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks . while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone .", "entity": "classification datasets", "output": "task - specific lexicons", "neg_sample": ["classification datasets is done by using OtherScientificTerm", "we study how masking and predicting tokens in an unsupervised fashion can give rise to linguistic structures and downstream performance gains .", "recent theories have suggested that pretrained language models acquire useful inductive biases through masks that implicitly act as cloze reductions for downstream tasks .", "while appealing , we show that the success of the random masking strategy used in practice can not be explained by such cloze - like masks alone ."], "relation": "used for", "id": "2021.naacl-main.404", "year": 2021, "rel_sent": "We construct cloze - like masks using task - specific lexicons for three different classification datasets and show that the majority of pretrained performance gains come from generic masks that are not associated with the lexicon .", "forward": false, "src_ids": "2021.naacl-main.404_6941"}
{"input": "graded entailment information is done by using OtherScientificTerm| context: given a word , our framework can create its negation that is similar to how humans perceive negation .", "entity": "graded entailment information", "output": "density matrices", "neg_sample": ["graded entailment information is done by using OtherScientificTerm", "given a word , our framework can create its negation that is similar to how humans perceive negation ."], "relation": "used for", "id": "2021.semspace-1.6", "year": 2021, "rel_sent": "We validate the sensibility of our conversational negation framework by performing experiments , leveraging density matrices to encode graded entailment information .", "forward": false, "src_ids": "2021.semspace-1.6_7743"}
{"input": "pre - trained bert is used for OtherScientificTerm| context: we present experiments on assessing the grammatical correctness of learners ' answers in a language - learning system ( references to the system , and the links to the released data and code are withheld for anonymity ) .", "entity": "pre - trained bert", "output": "grammatical mistakes", "neg_sample": ["pre - trained bert is used for OtherScientificTerm", "we present experiments on assessing the grammatical correctness of learners ' answers in a language - learning system ( references to the system , and the links to the released data and code are withheld for anonymity ) ."], "relation": "used for", "id": "2021.bea-1.15", "year": 2021, "rel_sent": "Due to the paucity of training data , we explore the ability of pre - trained BERT to detect grammatical errors and then fine - tune it using synthetic training data .", "forward": true, "src_ids": "2021.bea-1.15_7198"}
{"input": "crowd - workers is used for Material| context: multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts . current methods confronting consolidation struggle tofuse overlapping information .", "entity": "crowd - workers", "output": "qa - based alignments", "neg_sample": ["crowd - workers is used for Material", "multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts .", "current methods confronting consolidation struggle tofuse overlapping information ."], "relation": "used for", "id": "2021.emnlp-main.778", "year": 2021, "rel_sent": "We employ crowd - workers for constructing a dataset of QA - based alignments , and present a baseline QA alignment model trained over our dataset .", "forward": true, "src_ids": "2021.emnlp-main.778_14513"}
{"input": "country - level and province - level identification is done by using Method| context: dialect and standard language identification are crucial tasks for many arabic natural language processing applications .", "entity": "country - level and province - level identification", "output": "deep learning - based system", "neg_sample": ["country - level and province - level identification is done by using Method", "dialect and standard language identification are crucial tasks for many arabic natural language processing applications ."], "relation": "used for", "id": "2021.wanlp-1.31", "year": 2021, "rel_sent": "In this paper , we present our deep learning - based system , submitted to the second NADI shared task for country - level and province - level identification of Modern Standard Arabic ( MSA ) and Dialectal Arabic ( DA ) .", "forward": false, "src_ids": "2021.wanlp-1.31_13144"}
{"input": "visual tokens is done by using Method| context: almost all existing video grounding methods fall into twoframeworks : 1 ) top - down model : it predefines a set of segment candidates and then conducts segment classification and regression . 2 ) bottom - up model : it directly predicts frame - wise probabilities of the referential segment boundaries . however , all these methods are not end - to - end , i.e. , they always rely on some time - consuming post - processing steps to refine predictions .", "entity": "visual tokens", "output": "cubic embedding layer", "neg_sample": ["visual tokens is done by using Method", "almost all existing video grounding methods fall into twoframeworks : 1 ) top - down model : it predefines a set of segment candidates and then conducts segment classification and regression .", "2 ) bottom - up model : it directly predicts frame - wise probabilities of the referential segment boundaries .", "however , all these methods are not end - to - end , i.e.", ", they always rely on some time - consuming post - processing steps to refine predictions ."], "relation": "used for", "id": "2021.emnlp-main.773", "year": 2021, "rel_sent": "Tofacilitate the end - to - end training , we use a Cubic Embedding layer to transform the raw videos into a set of visual tokens .", "forward": false, "src_ids": "2021.emnlp-main.773_8921"}
{"input": "character based language model is used for OtherScientificTerm| context: reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision . although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage . there is an exponential growth in the number of e - commerce users who are not proficient in english . hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india . this can be achieved by an in - domain robust machine translation ( mt ) system . however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc .", "entity": "character based language model", "output": "noisy tokens", "neg_sample": ["character based language model is used for OtherScientificTerm", "reviews written by the users for a particular product or service play an influencing role for the customers to make an informative decision .", "although online e - commerce portals have immensely impacted our lives , available contents predominantly are in english language- often limiting its widespread usage .", "there is an exponential growth in the number of e - commerce users who are not proficient in english .", "hence , there is a necessity to make these services available in non - english languages , especially in a multilingual country like india .", "this can be achieved by an in - domain robust machine translation ( mt ) system .", "however , the reviews written by the users pose unique challenges to mt , such as misspelled words , ungrammatical constructions , presence of colloquial terms , lack of resources such as in - domain parallel corpus etc ."], "relation": "used for", "id": "2021.ecnlp-1.21", "year": 2021, "rel_sent": "In order to make our NMT model robust enough to handle the noisy tokens in the reviews , we integrate a character based language model to generate word vectors and map the noisy tokens with their correct forms .", "forward": true, "src_ids": "2021.ecnlp-1.21_10407"}
{"input": "defeasible inference task is done by using OtherScientificTerm| context: defeasible reasoning is a mode of reasoning where conclusions can be overturned by taking into account new evidence .", "entity": "defeasible inference task", "output": "graphs", "neg_sample": ["defeasible inference task is done by using OtherScientificTerm", "defeasible reasoning is a mode of reasoning where conclusions can be overturned by taking into account new evidence ."], "relation": "used for", "id": "2021.findings-acl.456", "year": 2021, "rel_sent": "Through automated metrics and human evaluation , we find that our method generates meaningful graphs for the defeasible inference task .", "forward": false, "src_ids": "2021.findings-acl.456_10096"}
{"input": "automatic assessments is done by using Metric| context: a critical bottleneck of obtaining a reliable learnable evaluation metric is the lack of high - quality training data for classifiers to efficiently distinguish plausible and implausible machine - generated stories . previous works relied on heuristically manipulated plausible examples to mimic possible system drawbacks such as repetition , contradiction , or irrelevant content in the text level , which can be unnatural and oversimplify the characteristics of implausible machine - generated stories .", "entity": "automatic assessments", "output": "evaluation metrics", "neg_sample": ["automatic assessments is done by using Metric", "a critical bottleneck of obtaining a reliable learnable evaluation metric is the lack of high - quality training data for classifiers to efficiently distinguish plausible and implausible machine - generated stories .", "previous works relied on heuristically manipulated plausible examples to mimic possible system drawbacks such as repetition , contradiction , or irrelevant content in the text level , which can be unnatural and oversimplify the characteristics of implausible machine - generated stories ."], "relation": "used for", "id": "2021.naacl-main.343", "year": 2021, "rel_sent": "Experiments show that the evaluation metrics trained on our generated data result in more reliable automatic assessments that correlate remarkably better with human judgments compared to the baselines .", "forward": false, "src_ids": "2021.naacl-main.343_1252"}
{"input": "gradient boosting framework is done by using OtherScientificTerm| context: eye - tracking data from reading represent an important resource for both linguistics and natural language processing . the ability to accurately model gaze features is crucial to advance our understanding of language processing .", "entity": "gradient boosting framework", "output": "linguistic and psychometric features", "neg_sample": ["gradient boosting framework is done by using OtherScientificTerm", "eye - tracking data from reading represent an important resource for both linguistics and natural language processing .", "the ability to accurately model gaze features is crucial to advance our understanding of language processing ."], "relation": "used for", "id": "2021.cmcl-1.7", "year": 2021, "rel_sent": "The winning system used a range of linguistic and psychometric features in a gradient boosting framework .", "forward": false, "src_ids": "2021.cmcl-1.7_80"}
{"input": "grammatically restricted inference method is used for OtherScientificTerm| context: automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality . conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation . however , they tend to ignore the code structure information and suffer from grammar and syntax errors .", "entity": "grammatically restricted inference method", "output": "grammar rules", "neg_sample": ["grammatically restricted inference method is used for OtherScientificTerm", "automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality .", "conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation .", "however , they tend to ignore the code structure information and suffer from grammar and syntax errors ."], "relation": "used for", "id": "2021.findings-acl.111", "year": 2021, "rel_sent": "Besides , to guarantee grammar correctness , we employ a grammatically restricted inference method to generate each grammar rule in a legally constrained sub - search - space considering the generated previous rules .", "forward": true, "src_ids": "2021.findings-acl.111_9101"}
{"input": "rnns is used for OtherScientificTerm| context: the main subject and the associated verb in english must agree in grammatical number as per the subject - verb agreement ( sva ) phenomenon . it has been found that the presence of a noun between the verb and the main subject , whose grammatical number is opposite to that of the main subject , can cause speakers to produce a verb that agrees with the intervening noun rather than the main noun ; the former thus acts as an agreement attractor . such attractors have also been shown to pose a challenge for rnn models without explicit hierarchical bias to perform well on sva tasks . previous work suggests that syntactic cues in the input can aid such models to choose hierarchical rules over linear rules for number agreement .", "entity": "rnns", "output": "hierarchical rules of natural language", "neg_sample": ["rnns is used for OtherScientificTerm", "the main subject and the associated verb in english must agree in grammatical number as per the subject - verb agreement ( sva ) phenomenon .", "it has been found that the presence of a noun between the verb and the main subject , whose grammatical number is opposite to that of the main subject , can cause speakers to produce a verb that agrees with the intervening noun rather than the main noun ; the former thus acts as an agreement attractor .", "such attractors have also been shown to pose a challenge for rnn models without explicit hierarchical bias to perform well on sva tasks .", "previous work suggests that syntactic cues in the input can aid such models to choose hierarchical rules over linear rules for number agreement ."], "relation": "used for", "id": "2021.scil-1.37", "year": 2021, "rel_sent": "These results suggest that current RNNs do not capture the underlying hierarchical rules of natural language , but rather use shallower heuristics for their predictions .", "forward": true, "src_ids": "2021.scil-1.37_2790"}
{"input": "pre - training approaches is used for Task| context: loading models pre - trained on the large - scale corpus in the general domain and fine - tuning them on specific downstream tasks is gradually becoming a paradigm in natural language processing . previous investigations prove that introducing a further pre - training phase between pre - training and fine - tuning phases to adapt the model on the domain - specific unlabeled data can bring positive effects . however , most of these further pre - training works just keep running the conventional pre - training task , e.g. , masked language model , which can be regarded as the domain adaptation to bridge the data distribution gap . after observing diverse downstream tasks , we suggest that different tasks may also need a further pre - training phase with appropriate training tasks to bridge the task formulation gap .", "entity": "pre - training approaches", "output": "diverse dialogue tasks", "neg_sample": ["pre - training approaches is used for Task", "loading models pre - trained on the large - scale corpus in the general domain and fine - tuning them on specific downstream tasks is gradually becoming a paradigm in natural language processing .", "previous investigations prove that introducing a further pre - training phase between pre - training and fine - tuning phases to adapt the model on the domain - specific unlabeled data can bring positive effects .", "however , most of these further pre - training works just keep running the conventional pre - training task , e.g.", ", masked language model , which can be regarded as the domain adaptation to bridge the data distribution gap .", "after observing diverse downstream tasks , we suggest that different tasks may also need a further pre - training phase with appropriate training tasks to bridge the task formulation gap ."], "relation": "used for", "id": "2021.emnlp-main.178", "year": 2021, "rel_sent": "Different Strokes for Different Folks : Investigating Appropriate Further Pre - training Approaches for Diverse Dialogue Tasks.", "forward": true, "src_ids": "2021.emnlp-main.178_5149"}
{"input": "conversational recommendation is done by using OtherScientificTerm| context: growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation . however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation .", "entity": "conversational recommendation", "output": "dialog acts", "neg_sample": ["conversational recommendation is done by using OtherScientificTerm", "growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation .", "however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation ."], "relation": "used for", "id": "2021.emnlp-main.139", "year": 2021, "rel_sent": "CR - Walker : Tree - Structured Graph Reasoning and Dialog Acts for Conversational Recommendation.", "forward": false, "src_ids": "2021.emnlp-main.139_2625"}
{"input": "neural models is used for Task| context: data - to - text ( d2 t ) generation in the biomedical domain is a promising - yet mostly unexplored - field of research .", "entity": "neural models", "output": "d2 t generation", "neg_sample": ["neural models is used for Task", "data - to - text ( d2 t ) generation in the biomedical domain is a promising - yet mostly unexplored - field of research ."], "relation": "used for", "id": "2021.inlg-1.40", "year": 2021, "rel_sent": "Here , we apply neural models for D2 T generation to a real - world dataset consisting of package leaflets of European medicines .", "forward": true, "src_ids": "2021.inlg-1.40_12694"}
{"input": "anchor instances is done by using Method| context: difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class .", "entity": "anchor instances", "output": "disentangled semantic representations", "neg_sample": ["anchor instances is done by using Method", "difficult samples of the minority class in imbalanced text classification are usually hard to be classified as they are embedded into an overlapping semantic region with the majority class ."], "relation": "used for", "id": "2021.emnlp-main.252", "year": 2021, "rel_sent": "MISO consists of ( 1 ) a semantic fusion module that learns entangled semantics among difficult and majority samples with an adaptive multi - head attention mechanism , ( 2 ) a mutual information loss that forces our model to learn new representations of entangled semantics in the non - overlapping region of the minority class , and ( 3 ) a coupled adversarial encoder - decoder that fine - tunes disentangled semantic representations to remain their correlations with the minority class , and then using these disentangled semantic representations to generate anchor instances for each difficult sample .", "forward": false, "src_ids": "2021.emnlp-main.252_9848"}
{"input": "event semantic relation reasoning is done by using OtherScientificTerm| context: understanding how events are semantically related to each other is the essence of reading comprehension . recent event - centric reading comprehension datasets focus mostly on event arguments or temporal relations . while these tasks partially evaluate machines ' ability of narrative understanding , human - like reading comprehension requires the capability to process event - based information beyond arguments and temporal reasoning . for example , to understand causality between events , we need to infer motivation or purpose ; to establish event hierarchy , we need to understand the composition of events .", "entity": "event semantic relation reasoning", "output": "natural language queries", "neg_sample": ["event semantic relation reasoning is done by using OtherScientificTerm", "understanding how events are semantically related to each other is the essence of reading comprehension .", "recent event - centric reading comprehension datasets focus mostly on event arguments or temporal relations .", "while these tasks partially evaluate machines ' ability of narrative understanding , human - like reading comprehension requires the capability to process event - based information beyond arguments and temporal reasoning .", "for example , to understand causality between events , we need to infer motivation or purpose ; to establish event hierarchy , we need to understand the composition of events ."], "relation": "used for", "id": "2021.emnlp-main.597", "year": 2021, "rel_sent": "The dataset leverages natural language queries to reason about the five most common event semantic relations , provides more than 6 K questions , and captures 10.1 K event relation pairs .", "forward": false, "src_ids": "2021.emnlp-main.597_12815"}
{"input": "language models is done by using OtherScientificTerm| context: recent work has raised concerns about the inherent limitations of text - only pretraining .", "entity": "language models", "output": "reporting bias", "neg_sample": ["language models is done by using OtherScientificTerm", "recent work has raised concerns about the inherent limitations of text - only pretraining ."], "relation": "used for", "id": "2021.emnlp-main.63", "year": 2021, "rel_sent": "The World of an Octopus : How Reporting Bias Influences a Language Model 's Perception of Color.", "forward": false, "src_ids": "2021.emnlp-main.63_12565"}
{"input": "amr - to - text generation is done by using Method| context: however , efficiently encoding the graph structure in plms is challenging because such models were pretrained on natural language , and modeling structured data may lead to catastrophic forgetting of distributional knowledge .", "entity": "amr - to - text generation", "output": "pretrained language models", "neg_sample": ["amr - to - text generation is done by using Method", "however , efficiently encoding the graph structure in plms is challenging because such models were pretrained on natural language , and modeling structured data may lead to catastrophic forgetting of distributional knowledge ."], "relation": "used for", "id": "2021.emnlp-main.351", "year": 2021, "rel_sent": "Structural Adapters in Pretrained Language Models for AMR - to - Text Generation.", "forward": false, "src_ids": "2021.emnlp-main.351_66"}
{"input": "statement verification is done by using Method| context: question answering from semi - structured tables can be seen as a semantic parsing task and is significant and practical for pushing the boundary of natural language understanding . existing research mainly focuses on understanding contents from unstructured evidence , e.g. , news , natural language sentences and documents . the task of verification from structured evidence , such as tables , charts , and databases , is still less - explored .", "entity": "statement verification", "output": "ensemble solution", "neg_sample": ["statement verification is done by using Method", "question answering from semi - structured tables can be seen as a semantic parsing task and is significant and practical for pushing the boundary of natural language understanding .", "existing research mainly focuses on understanding contents from unstructured evidence , e.g.", ", news , natural language sentences and documents .", "the task of verification from structured evidence , such as tables , charts , and databases , is still less - explored ."], "relation": "used for", "id": "2021.semeval-1.179", "year": 2021, "rel_sent": "Sattiy at SemEval-2021 Task 9 : An Ensemble Solution for Statement Verification and Evidence Finding with Tables.", "forward": false, "src_ids": "2021.semeval-1.179_8323"}
{"input": "edge - aware attentive biases is used for OtherScientificTerm| context: link prediction on knowledge graphs ( kgs ) is a key research topic . previous work mainly focused on binary relations , paying less attention to higher - arity relations although they are ubiquitous in real - world kgs .", "entity": "edge - aware attentive biases", "output": "graph structure", "neg_sample": ["edge - aware attentive biases is used for OtherScientificTerm", "link prediction on knowledge graphs ( kgs ) is a key research topic .", "previous work mainly focused on binary relations , paying less attention to higher - arity relations although they are ubiquitous in real - world kgs ."], "relation": "used for", "id": "2021.findings-acl.35", "year": 2021, "rel_sent": "The fully - connected attention captures universal inter - vertex interactions , while with edge - aware attentive biases to particularly encode the graph structure and its heterogeneity .", "forward": true, "src_ids": "2021.findings-acl.35_7968"}
{"input": "transformer is used for Task| context: neural conversation models have shown great potentials towards generating fluent and informative responses by introducing external background knowledge . nevertheless , it is laborious to construct such knowledge - grounded dialogues , and existing models usually perform poorly when transfer to new domains with limited training samples . therefore , building a knowledge - grounded dialogue system under the low - resource setting is a still crucial issue .", "entity": "transformer", "output": "disentangled learning of response generation", "neg_sample": ["transformer is used for Task", "neural conversation models have shown great potentials towards generating fluent and informative responses by introducing external background knowledge .", "nevertheless , it is laborious to construct such knowledge - grounded dialogues , and existing models usually perform poorly when transfer to new domains with limited training samples .", "therefore , building a knowledge - grounded dialogue system under the low - resource setting is a still crucial issue ."], "relation": "used for", "id": "2021.emnlp-main.173", "year": 2021, "rel_sent": "To better cooperate with this framework , we devise a variant of Transformer with decoupled decoder which facilitates the disentangled learning of response generation and knowledge incorporation .", "forward": true, "src_ids": "2021.emnlp-main.173_10952"}
{"input": "mixture of experts is done by using OtherScientificTerm| context: simultaneous machine translation ( simt ) generates translation before reading the entire source sentence and hence it has to trade off between translation quality and latency . tofulfill the requirements of different translation quality and latency in practical applications , the previous methods usually need to train multiple simt models for different latency levels , resulting in large computational costs .", "entity": "mixture of experts", "output": "multi - head attention", "neg_sample": ["mixture of experts is done by using OtherScientificTerm", "simultaneous machine translation ( simt ) generates translation before reading the entire source sentence and hence it has to trade off between translation quality and latency .", "tofulfill the requirements of different translation quality and latency in practical applications , the previous methods usually need to train multiple simt models for different latency levels , resulting in large computational costs ."], "relation": "used for", "id": "2021.emnlp-main.581", "year": 2021, "rel_sent": "Specifically , our method employs multi - head attention to accomplish the mixture of experts where each head is treated as a wait - k expert with its own waiting words number , and given a test latency and source inputs , the weights of the experts are accordingly adjusted to produce the best translation .", "forward": false, "src_ids": "2021.emnlp-main.581_7160"}
{"input": "xibe is done by using Task| context: manually annotating a treebank is time - consuming and labor - intensive . however , it is not clear how to determine those closely related languages .", "entity": "xibe", "output": "delexicalized cross - lingual dependency parsing", "neg_sample": ["xibe is done by using Task", "manually annotating a treebank is time - consuming and labor - intensive .", "however , it is not clear how to determine those closely related languages ."], "relation": "used for", "id": "2021.ranlp-1.182", "year": 2021, "rel_sent": "Delexicalized Cross - lingual Dependency Parsing for Xibe.", "forward": false, "src_ids": "2021.ranlp-1.182_6021"}
{"input": "auxiliary data is used for Task| context: it has been well - studied within the medical domain . in this paper , we focus on job postings .", "entity": "auxiliary data", "output": "de - identification", "neg_sample": ["auxiliary data is used for Task", "it has been well - studied within the medical domain .", "in this paper , we focus on job postings ."], "relation": "used for", "id": "2021.nodalida-main.21", "year": 2021, "rel_sent": "Our results show that auxiliary data helps to improve de - identification performance .", "forward": true, "src_ids": "2021.nodalida-main.21_7217"}
{"input": "language modeling is done by using Method| context: the uniform information density ( uid ) hypothesis , which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal , has gained traction in psycholinguistics as an explanation for certain syntactic , morphological , and prosodic choices .", "entity": "language modeling", "output": "cognitive regularizer", "neg_sample": ["language modeling is done by using Method", "the uniform information density ( uid ) hypothesis , which posits that speakers behaving optimally tend to distribute information uniformly across a linguistic signal , has gained traction in psycholinguistics as an explanation for certain syntactic , morphological , and prosodic choices ."], "relation": "used for", "id": "2021.acl-long.404", "year": 2021, "rel_sent": "A Cognitive Regularizer for Language Modeling.", "forward": false, "src_ids": "2021.acl-long.404_1175"}
{"input": "unsupervised neural machine translation ( unmt ) is done by using Method| context: unsupervised neural machine translation ( unmt ) that relies solely on massive monolingual corpora has achieved remarkable results in several translation tasks . however , in real - world scenarios , massive monolingual corpora do not exist for some extremely low - resource languages such as estonian , and unmt systems usually perform poorly when there is not adequate training corpus for one language .", "entity": "unsupervised neural machine translation ( unmt )", "output": "self - training", "neg_sample": ["unsupervised neural machine translation ( unmt ) is done by using Method", "unsupervised neural machine translation ( unmt ) that relies solely on massive monolingual corpora has achieved remarkable results in several translation tasks .", "however , in real - world scenarios , massive monolingual corpora do not exist for some extremely low - resource languages such as estonian , and unmt systems usually perform poorly when there is not adequate training corpus for one language ."], "relation": "used for", "id": "2021.naacl-main.311", "year": 2021, "rel_sent": "Self - Training for Unsupervised Neural Machine Translation in Unbalanced Training Data Scenarios.", "forward": false, "src_ids": "2021.naacl-main.311_2559"}
{"input": "question - passage matching is done by using Method| context: dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering .", "entity": "question - passage matching", "output": "dual - encoder model", "neg_sample": ["question - passage matching is done by using Method", "dense passage retrieval has been shown to be an effective approach for information retrieval tasks such as open domain question answering ."], "relation": "used for", "id": "2021.acl-long.477", "year": 2021, "rel_sent": "In this paper , we propose a new contrastive learning method called Cross Momentum Contrastive learning ( xMoCo ) , for learning a dual - encoder model for question - passage matching .", "forward": false, "src_ids": "2021.acl-long.477_13233"}
{"input": "topic adaptation is done by using Method| context: online user comments in public forums are often associated with low quality , hate speech or even excessive demands for moderation . to better exploit their constructive and deliberate potential , we present forumbert .", "entity": "topic adaptation", "output": "self - supervised bert language model fine tuning", "neg_sample": ["topic adaptation is done by using Method", "online user comments in public forums are often associated with low quality , hate speech or even excessive demands for moderation .", "to better exploit their constructive and deliberate potential , we present forumbert ."], "relation": "used for", "id": "2021.konvens-1.17", "year": 2021, "rel_sent": "This is done using a two step procedure : self - supervised BERT language model fine tuning for topic adaptation followed by integration into the forumBERT architecture for online / offline classification .", "forward": false, "src_ids": "2021.konvens-1.17_9467"}
{"input": "paraphrasing methods is used for Task| context: this paper studies the generation methods for paraphrasing in the russian language . there are several transformer - based models ( russian and multilingual ) trained on a collected corpus of paraphrases .", "entity": "paraphrasing methods", "output": "augmentation procedure", "neg_sample": ["paraphrasing methods is used for Task", "this paper studies the generation methods for paraphrasing in the russian language .", "there are several transformer - based models ( russian and multilingual ) trained on a collected corpus of paraphrases ."], "relation": "used for", "id": "2021.bsnlp-1.2", "year": 2021, "rel_sent": "We compare different models , contrast the quality of paraphrases using different ranking methods and apply paraphrasing methods in the context of augmentation procedure for different tasks .", "forward": true, "src_ids": "2021.bsnlp-1.2_14691"}
{"input": "transliteration ( script conversion ) is used for OtherScientificTerm| context: multilingual neural machine translation has achieved remarkable performance by training a single translation model for multiple languages .", "entity": "transliteration ( script conversion )", "output": "lexical gap", "neg_sample": ["transliteration ( script conversion ) is used for OtherScientificTerm", "multilingual neural machine translation has achieved remarkable performance by training a single translation model for multiple languages ."], "relation": "used for", "id": "2021.wat-1.26", "year": 2021, "rel_sent": "Furthermore , we demonstrate the use of transliteration ( script conversion ) for Indic languages in reducing the lexical gap for training a multilingual NMT system .", "forward": true, "src_ids": "2021.wat-1.26_3017"}
{"input": "cs morphological tagging is done by using Method| context: morphological tagging of code - switching ( cs ) data becomes more challenging especially when language pairs composing the cs data have different morphological representations .", "entity": "cs morphological tagging", "output": "transformer - based framework", "neg_sample": ["cs morphological tagging is done by using Method", "morphological tagging of code - switching ( cs ) data becomes more challenging especially when language pairs composing the cs data have different morphological representations ."], "relation": "used for", "id": "2021.calcs-1.10", "year": 2021, "rel_sent": "In this paper , we explore a number of ways of implementing a language - aware morphological tagging method and present our approach for integrating language IDs into a transformer - based framework for CS morphological tagging .", "forward": false, "src_ids": "2021.calcs-1.10_10445"}
{"input": "emotional response generation is done by using Method| context: generating context - aware language that embodies diverse emotions is an important step towards building empathetic nlp systems .", "entity": "emotional response generation", "output": "large - scale language models", "neg_sample": ["emotional response generation is done by using Method", "generating context - aware language that embodies diverse emotions is an important step towards building empathetic nlp systems ."], "relation": "used for", "id": "2021.findings-acl.379", "year": 2021, "rel_sent": "In this paper , we propose a formulation of modulated layer normalization - a technique inspired by computer vision - that allows us to use large - scale language models for emotional response generation .", "forward": false, "src_ids": "2021.findings-acl.379_4535"}
{"input": "non - english mt system is done by using Task| context: the provided parallel data are russian - chinese ( direct ) , russian - english ( indirect ) , and english - chinese ( indirect ) data .", "entity": "non - english mt system", "output": "wmt21 shared triangular mt task", "neg_sample": ["non - english mt system is done by using Task", "the provided parallel data are russian - chinese ( direct ) , russian - english ( indirect ) , and english - chinese ( indirect ) data ."], "relation": "used for", "id": "2021.wmt-1.40", "year": 2021, "rel_sent": "This paper describes Naver Papago 's submission to the WMT21 shared triangular MT task to enhance the non - English MT system with tri - language parallel data .", "forward": false, "src_ids": "2021.wmt-1.40_13939"}
{"input": "ontology - free framework is used for OtherScientificTerm| context: in real - world settings with constantly changing services , dst systems must generalize to new domains and unseen slot types . existing methods for dst do not generalize well to new slot names and many require known ontologies of slot types and values for inference .", "entity": "ontology - free framework", "output": "natural language queries", "neg_sample": ["ontology - free framework is used for OtherScientificTerm", "in real - world settings with constantly changing services , dst systems must generalize to new domains and unseen slot types .", "existing methods for dst do not generalize well to new slot names and many require known ontologies of slot types and values for inference ."], "relation": "used for", "id": "2021.eacl-main.91", "year": 2021, "rel_sent": "We introduce a novel ontology - free framework that supports natural language queries for unseen constraints and slots in multi - domain task - oriented dialogs .", "forward": true, "src_ids": "2021.eacl-main.91_2085"}
{"input": "multi - head attention is used for Method| context: ser ) 25.4 % .", "entity": "multi - head attention", "output": "end - to - end speech recognition system", "neg_sample": ["multi - head attention is used for Method", "ser ) 25.4 % ."], "relation": "used for", "id": "2021.ijclclp-1.2", "year": 2021, "rel_sent": "We use composed of Multi - head Attention to construct an end - to - end speech recognition system and combine it with Connectionist Temporal Classification ( CTC ) for end - to - end training and decoding .", "forward": true, "src_ids": "2021.ijclclp-1.2_13118"}
{"input": "tree - structured graph reasoning is used for Task| context: growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation . however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation .", "entity": "tree - structured graph reasoning", "output": "conversational recommendation", "neg_sample": ["tree - structured graph reasoning is used for Task", "growing interests have been attracted in conversational recommender systems ( crs ) , which explore user preference through conversational interactions in order to make appropriate recommendation .", "however , there is still a lack of ability in existing crs to ( 1 ) traverse multiple reasoning paths over background knowledge to introduce relevant items and attributes , and ( 2 ) arrange selected entities appropriately under current system intents to control response generation ."], "relation": "used for", "id": "2021.emnlp-main.139", "year": 2021, "rel_sent": "CR - Walker : Tree - Structured Graph Reasoning and Dialog Acts for Conversational Recommendation.", "forward": true, "src_ids": "2021.emnlp-main.139_2627"}
{"input": "wikibio is used for Task| context: this task is very important in many situations , such as changing some conditions , consequences , or properties in a legal document , or changing some key information of an event in a news text . this is very challenging , as it is hard to obtain a parallel corpus for training , and we need tofirst find all text positions that should be changed and then decide how to change them .", "entity": "wikibio", "output": "table - to - text generation", "neg_sample": ["wikibio is used for Task", "this task is very important in many situations , such as changing some conditions , consequences , or properties in a legal document , or changing some key information of an event in a news text .", "this is very challenging , as it is hard to obtain a parallel corpus for training , and we need tofirst find all text positions that should be changed and then decide how to change them ."], "relation": "used for", "id": "2021.findings-acl.110", "year": 2021, "rel_sent": "We constructed the new dataset WIKIBIOCTE for this task based on the existing dataset WIKIBIO ( originally created for table - to - text generation ) .", "forward": true, "src_ids": "2021.findings-acl.110_13357"}
{"input": "transformer is used for Task| context: introducing factors , that is to say , word features such as linguistic information referring to the source tokens , is known to improve the results of neural machine translation systems in certain settings , typically in recurrent architectures .", "entity": "transformer", "output": "low - resource machine translation", "neg_sample": ["transformer is used for Task", "introducing factors , that is to say , word features such as linguistic information referring to the source tokens , is known to improve the results of neural machine translation systems in certain settings , typically in recurrent architectures ."], "relation": "used for", "id": "2021.ranlp-1.9", "year": 2021, "rel_sent": "Enriching the Transformer with Linguistic Factors for Low - Resource Machine Translation.", "forward": true, "src_ids": "2021.ranlp-1.9_9243"}
{"input": "multi - step analysis is used for OtherScientificTerm| context: translation can obscure the subjectivity of the sources and flatten down positive and negative aspects . however , we do not compare translations with their sources , but analyse polarity items in two translation variants from the same text sources .", "entity": "multi - step analysis", "output": "distribution of polarity items", "neg_sample": ["multi - step analysis is used for OtherScientificTerm", "translation can obscure the subjectivity of the sources and flatten down positive and negative aspects .", "however , we do not compare translations with their sources , but analyse polarity items in two translation variants from the same text sources ."], "relation": "used for", "id": "2021.motra-1.7", "year": 2021, "rel_sent": "We propose a multi - step analysis to investigate the distribution of polarity items and report on small experiments on a corpus of English to German translations to identify the lack of experience in translation by students .", "forward": true, "src_ids": "2021.motra-1.7_15401"}
{"input": "screenplays is done by using OtherScientificTerm| context: screenplay summarization is the task of extracting informative scenes from a screenplay . the screenplay contains turning point ( tp ) events that change the story direction and thus define the story structure decisively . accordingly , this task can be defined as the tp identification task .", "entity": "screenplays", "output": "dialogue information", "neg_sample": ["screenplays is done by using OtherScientificTerm", "screenplay summarization is the task of extracting informative scenes from a screenplay .", "the screenplay contains turning point ( tp ) events that change the story direction and thus define the story structure decisively .", "accordingly , this task can be defined as the tp identification task ."], "relation": "used for", "id": "2021.nuse-1.6", "year": 2021, "rel_sent": "We suggest using dialogue information , one attribute of screenplays , motivated by previous work that discovered that TPs have a relation with dialogues appearing in screenplays .", "forward": false, "src_ids": "2021.nuse-1.6_7643"}
{"input": "large - scale dataset is done by using Material| context: text simplification is a valuable technique . however , current research is limited to sentence simplification .", "entity": "large - scale dataset", "output": "wikipedia dumps", "neg_sample": ["large - scale dataset is done by using Material", "text simplification is a valuable technique .", "however , current research is limited to sentence simplification ."], "relation": "used for", "id": "2021.emnlp-main.630", "year": 2021, "rel_sent": "Based on Wikipedia dumps , we first construct a large - scale dataset named D - Wikipedia and perform analysis and human evaluation on it to show that the dataset is reliable .", "forward": false, "src_ids": "2021.emnlp-main.630_11484"}
{"input": "chinese second language writing is done by using Method| context: ' with the increasing popularity of learning chinese as a second language ( l2 ) the development of an automatic essay scoring ( aes ) method specially for chinese l2 essays has become animportant task .", "entity": "chinese second language writing", "output": "automated essay scoring method", "neg_sample": ["chinese second language writing is done by using Method", "' with the increasing popularity of learning chinese as a second language ( l2 ) the development of an automatic essay scoring ( aes ) method specially for chinese l2 essays has become animportant task ."], "relation": "used for", "id": "2021.ccl-1.107", "year": 2021, "rel_sent": "A Prompt - independent and Interpretable Automated Essay Scoring Method for Chinese Second Language Writing.", "forward": false, "src_ids": "2021.ccl-1.107_14669"}
{"input": "large - scale zero - shot learning is done by using Method| context: most existing works use visual attributes labeled by humans , not suitable for large - scale applications . we argue that documents like wikipedia pages contain rich visual information , which however can easily be buried by the vast amount of non - visual sentences .", "entity": "large - scale zero - shot learning", "output": "document representations", "neg_sample": ["large - scale zero - shot learning is done by using Method", "most existing works use visual attributes labeled by humans , not suitable for large - scale applications .", "we argue that documents like wikipedia pages contain rich visual information , which however can easily be buried by the vast amount of non - visual sentences ."], "relation": "used for", "id": "2021.naacl-main.250", "year": 2021, "rel_sent": "Revisiting Document Representations for Large - Scale Zero - Shot Learning.", "forward": false, "src_ids": "2021.naacl-main.250_7499"}
{"input": "ernie - m is used for Method| context: recent studies have demonstrated that pre - trained cross - lingual models achieve impressive performance in downstream cross - lingual tasks . this improvement benefits from learning a large amount of monolingual and parallel corpora . although it is generally acknowledged that parallel corpora are critical for improving the model performance , existing methods are often constrained by the size of parallel corpora , especially for low - resource languages .", "entity": "ernie - m", "output": "multilingual representation", "neg_sample": ["ernie - m is used for Method", "recent studies have demonstrated that pre - trained cross - lingual models achieve impressive performance in downstream cross - lingual tasks .", "this improvement benefits from learning a large amount of monolingual and parallel corpora .", "although it is generally acknowledged that parallel corpora are critical for improving the model performance , existing methods are often constrained by the size of parallel corpora , especially for low - resource languages ."], "relation": "used for", "id": "2021.emnlp-main.3", "year": 2021, "rel_sent": "ERNIE - M : Enhanced Multilingual Representation by Aligning Cross - lingual Semantics with Monolingual Corpora.", "forward": true, "src_ids": "2021.emnlp-main.3_7325"}
{"input": "benchmarking fairness of text classification methods is done by using Generic| context: psychometric measures of ability , attitudes , perceptions , and beliefs are crucial for understanding user behavior in various contexts including health , security , e - commerce , and finance . traditionally , psychometric dimensions have been measured and collected using survey - based methods . inferring such constructs from user - generated text could allow timely , unobtrusive collection and analysis .", "entity": "benchmarking fairness of text classification methods", "output": "testbed", "neg_sample": ["benchmarking fairness of text classification methods is done by using Generic", "psychometric measures of ability , attitudes , perceptions , and beliefs are crucial for understanding user behavior in various contexts including health , security , e - commerce , and finance .", "traditionally , psychometric dimensions have been measured and collected using survey - based methods .", "inferring such constructs from user - generated text could allow timely , unobtrusive collection and analysis ."], "relation": "used for", "id": "2021.emnlp-main.304", "year": 2021, "rel_sent": "Our testbed also encompasses self - reported demographic information , including race , sex , age , income , and education - thereby affording opportunities for measuring bias and benchmarking fairness of text classification methods .", "forward": false, "src_ids": "2021.emnlp-main.304_1693"}
{"input": "retrieval augmented generation models is done by using Method| context: automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in ai . one way to make headway for this problem is through advancements in a related task known as slot filling . in this task , given an entity query in form of [ entity , slot , ? the recent works in the field try to solve this task in an end - to - end fashion using retrieval - based language models .", "entity": "retrieval augmented generation models", "output": "training procedures", "neg_sample": ["retrieval augmented generation models is done by using Method", "automatically inducing high quality knowledge graphs from a given collection of documents still remains a challenging problem in ai .", "one way to make headway for this problem is through advancements in a related task known as slot filling .", "in this task , given an entity query in form of [ entity , slot , ?", "the recent works in the field try to solve this task in an end - to - end fashion using retrieval - based language models ."], "relation": "used for", "id": "2021.emnlp-main.148", "year": 2021, "rel_sent": "In this paper , we present a novel approach to zero - shot slot filling that extends dense passage retrieval with hard negatives and robust training procedures for retrieval augmented generation models .", "forward": false, "src_ids": "2021.emnlp-main.148_9347"}
{"input": "zero - shot cross - lingual structured prediction is done by using Task| context: adapting word order from one language to another is a key problem in cross - lingual structured prediction . current sentence encoders ( e.g. , rnn , transformer with position embeddings ) are usually word order sensitive . even with uniform word form representations ( muse , mbert ) , word order discrepancies may hurt the adaptation of models .", "entity": "zero - shot cross - lingual structured prediction", "output": "word reordering", "neg_sample": ["zero - shot cross - lingual structured prediction is done by using Task", "adapting word order from one language to another is a key problem in cross - lingual structured prediction .", "current sentence encoders ( e.g.", ", rnn , transformer with position embeddings ) are usually word order sensitive .", "even with uniform word form representations ( muse , mbert ) , word order discrepancies may hurt the adaptation of models ."], "relation": "used for", "id": "2021.emnlp-main.338", "year": 2021, "rel_sent": "Word Reordering for Zero - shot Cross - lingual Structured Prediction.", "forward": false, "src_ids": "2021.emnlp-main.338_7055"}
{"input": "semantic triplet units is used for OtherScientificTerm| context: human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs . automatic metrics are cheap and reproducible but sometimes poorly correlated with human judgment .", "entity": "semantic triplet units", "output": "summary content units", "neg_sample": ["semantic triplet units is used for OtherScientificTerm", "human evaluation for summarization tasks is reliable but brings in issues of reproducibility and high costs .", "automatic metrics are cheap and reproducible but sometimes poorly correlated with human judgment ."], "relation": "used for", "id": "2021.emnlp-main.531", "year": 2021, "rel_sent": "Finally , we propose in - between metrics , Lite2.xPyramid , where we use a simple regressor to predict how well the STUs can simulate SCUs and retain SCUs that are more difficult to simulate , which provides a smooth transition and balance between automation and manual evaluation .", "forward": true, "src_ids": "2021.emnlp-main.531_2613"}
{"input": "human - machine collaborative framework is used for Task| context: conversational dialogue systems ( cdss ) are hard to evaluate due to the complexity of natural language . automatic evaluation of dialogues often shows insufficient correlation with human judgements . human evaluation is reliable but labor - intensive .", "entity": "human - machine collaborative framework", "output": "evaluating malevolence in dialogues", "neg_sample": ["human - machine collaborative framework is used for Task", "conversational dialogue systems ( cdss ) are hard to evaluate due to the complexity of natural language .", "automatic evaluation of dialogues often shows insufficient correlation with human judgements .", "human evaluation is reliable but labor - intensive ."], "relation": "used for", "id": "2021.acl-long.436", "year": 2021, "rel_sent": "A Human - machine Collaborative Framework for Evaluating Malevolence in Dialogues.", "forward": true, "src_ids": "2021.acl-long.436_5544"}
{"input": "wav2vec 2.0 model is used for Method| context: wav2vec 2.0 is a state - of - the - art speech recognition model which maps speech audio waveforms into latent representations . the largest version of wav2vec 2.0 contains 317 million parameters . hence , the inference latency of wav2vec 2.0 will be a bottleneck in production , leading to high costs and a significant environmental footprint .", "entity": "wav2vec 2.0 model", "output": "student model", "neg_sample": ["wav2vec 2.0 model is used for Method", "wav2vec 2.0 is a state - of - the - art speech recognition model which maps speech audio waveforms into latent representations .", "the largest version of wav2vec 2.0 contains 317 million parameters .", "hence , the inference latency of wav2vec 2.0 will be a bottleneck in production , leading to high costs and a significant environmental footprint ."], "relation": "used for", "id": "2021.sustainlp-1.14", "year": 2021, "rel_sent": "Using a teacher - student approach , we distilled the knowledge from the original wav2vec 2.0 model into a student model , which is 2 times faster , 4.8 times smaller than the original model .", "forward": true, "src_ids": "2021.sustainlp-1.14_14814"}
{"input": "deep learning is used for Task| context: respiratory insufficiency is a symptom that requires hospitalization . this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients .", "entity": "deep learning", "output": "respiratory insufficiency detection", "neg_sample": ["deep learning is used for Task", "respiratory insufficiency is a symptom that requires hospitalization .", "this work investigates whether it is possible to detect this condition by analyzing patient 's speech samples ; the analysis was performed on data collected during the first wave of the covid-19 pandemic in 2020 , and thus limited to respiratory insufficiency in covid-19 patients ."], "relation": "used for", "id": "2021.findings-acl.55", "year": 2021, "rel_sent": "Deep Learning against COVID-19 : Respiratory Insufficiency Detection in Brazilian Portuguese Speech.", "forward": true, "src_ids": "2021.findings-acl.55_3023"}
{"input": "interaction graphs is done by using Material| context: we study the problem of event causality identification ( eci ) to detect causal relation between event mention pairs in text . although deep learning models have recently shown state - of - the - art performance for eci , they are limited to the intra - sentence setting where event mention pairs are presented in the same sentences .", "entity": "interaction graphs", "output": "information sources", "neg_sample": ["interaction graphs is done by using Material", "we study the problem of event causality identification ( eci ) to detect causal relation between event mention pairs in text .", "although deep learning models have recently shown state - of - the - art performance for eci , they are limited to the intra - sentence setting where event mention pairs are presented in the same sentences ."], "relation": "used for", "id": "2021.naacl-main.273", "year": 2021, "rel_sent": "Various information sources are introduced to enrich the interaction graphs for DECI , featuring discourse , syntax , and semantic information .", "forward": false, "src_ids": "2021.naacl-main.273_1368"}
{"input": "dialog systems is used for OtherScientificTerm| context: humans are increasingly interacting with machines through language , sometimes in contexts where the user may not know they are talking to a machine ( like over the phone or a text chatbot ) .", "entity": "dialog systems", "output": "undesired deception", "neg_sample": ["dialog systems is used for OtherScientificTerm", "humans are increasingly interacting with machines through language , sometimes in contexts where the user may not know they are talking to a machine ( like over the phone or a text chatbot ) ."], "relation": "used for", "id": "2021.acl-long.544", "year": 2021, "rel_sent": "Such classifiers could be integrated into dialog systems to avoid undesired deception .", "forward": true, "src_ids": "2021.acl-long.544_10680"}
{"input": "absa task is done by using Task| context: many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task . while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem .", "entity": "absa task", "output": "unsupervised cross - lingual transfer", "neg_sample": ["absa task is done by using Task", "many efforts have been made in solving the aspect - based sentiment analysis ( absa ) task .", "while most existing studies focus on english texts , handling absa in resource - poor languages remains a challenging problem ."], "relation": "used for", "id": "2021.emnlp-main.727", "year": 2021, "rel_sent": "In this paper , we consider the unsupervised cross - lingual transfer for the ABSA task , where only labeled data in the source language is available and we aim at transferring its knowledge to the target language having no labeled data .", "forward": false, "src_ids": "2021.emnlp-main.727_184"}
{"input": "preference comparison is done by using Task| context: high - quality cpc models can significantly benefit applications such as comparative question answering and review - based recommendation . among the existing approaches , non - deep learning methods suffer from inferior performances .", "entity": "preference comparison", "output": "comparative preference classification ( cpc )", "neg_sample": ["preference comparison is done by using Task", "high - quality cpc models can significantly benefit applications such as comparative question answering and review - based recommendation .", "among the existing approaches , non - deep learning methods suffer from inferior performances ."], "relation": "used for", "id": "2021.emnlp-main.546", "year": 2021, "rel_sent": "We study Comparative Preference Classification ( CPC ) which aims at predicting whether a preference comparison exists between two entities in a given sentence and , if so , which entity is preferred over the other .", "forward": false, "src_ids": "2021.emnlp-main.546_5296"}
{"input": "self - critic scores is used for OtherScientificTerm| context: we introduce self - critic pretraining transformers ( script ) for representation learning of text .", "entity": "self - critic scores", "output": "pseudo - log - likelihood", "neg_sample": ["self - critic scores is used for OtherScientificTerm", "we introduce self - critic pretraining transformers ( script ) for representation learning of text ."], "relation": "used for", "id": "2021.naacl-main.409", "year": 2021, "rel_sent": "Also , the self - critic scores can be directly used as pseudo - log - likelihood for efficient scoring .", "forward": true, "src_ids": "2021.naacl-main.409_14293"}
{"input": "cfd models is done by using Material| context: counterfactual statements describe events that did not or can not take place . we consider the problem of counterfactual detection ( cfd ) in product reviews .", "entity": "cfd models", "output": "cfd dataset", "neg_sample": ["cfd models is done by using Material", "counterfactual statements describe events that did not or can not take place .", "we consider the problem of counterfactual detection ( cfd ) in product reviews ."], "relation": "used for", "id": "2021.emnlp-main.568", "year": 2021, "rel_sent": "Moreover , our CFD dataset is compatible with prior datasets and can be merged to learn accurate CFD models .", "forward": false, "src_ids": "2021.emnlp-main.568_10089"}
{"input": "labeling rules is done by using Method| context: instead of using expensive manual annotations , researchers have proposed to train named entity recognition ( ner ) systems using heuristic labeling rules . however , devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise .", "entity": "labeling rules", "output": "glara", "neg_sample": ["labeling rules is done by using Method", "instead of using expensive manual annotations , researchers have proposed to train named entity recognition ( ner ) systems using heuristic labeling rules .", "however , devising labeling rules is challenging because it often requires a considerable amount of manual effort and domain expertise ."], "relation": "used for", "id": "2021.eacl-main.318", "year": 2021, "rel_sent": "To alleviate this problem , we propose GLARA , a graph - based labeling rule augmentation framework , to learn new labeling rules from unlabeled data .", "forward": false, "src_ids": "2021.eacl-main.318_11790"}
{"input": "* sentence planner * model is used for OtherScientificTerm| context: abstractive summarization models heavily rely on copy mechanisms , such as the pointer network or attention , to achieve good performance , measured by textual overlap with reference summaries . as a result , the generated summaries stay close to the formulations in the source document .", "entity": "* sentence planner * model", "output": "abstractive summaries", "neg_sample": ["* sentence planner * model is used for OtherScientificTerm", "abstractive summarization models heavily rely on copy mechanisms , such as the pointer network or attention , to achieve good performance , measured by textual overlap with reference summaries .", "as a result , the generated summaries stay close to the formulations in the source document ."], "relation": "used for", "id": "2021.newsum-1.1", "year": 2021, "rel_sent": "We propose the * sentence planner * model to generate more abstractive summaries .", "forward": true, "src_ids": "2021.newsum-1.1_13392"}
{"input": "bert is done by using OtherScientificTerm| context: tokenization is a fundamental preprocessing step for almost all nlp tasks .", "entity": "bert", "output": "wordpiece tokenization", "neg_sample": ["bert is done by using OtherScientificTerm", "tokenization is a fundamental preprocessing step for almost all nlp tasks ."], "relation": "used for", "id": "2021.emnlp-main.160", "year": 2021, "rel_sent": "In this paper , we propose efficient algorithms for the WordPiece tokenization used in BERT , from single - word tokenization to general text ( e.g. , sentence ) tokenization .", "forward": false, "src_ids": "2021.emnlp-main.160_13945"}
{"input": "t3qa is used for Material| context: weakly - supervised table question - answering ( tableqa ) models have achieved state - of - art performance by using pre - trained bert transformer to jointly encoding a question and a table to produce structured query for the question . however , in practical settings tableqa systems are deployed over table corpora having topic and word distributions quite distinct from bert 's pretraining corpus .", "entity": "t3qa", "output": "topic shift benchmarks", "neg_sample": ["t3qa is used for Material", "weakly - supervised table question - answering ( tableqa ) models have achieved state - of - art performance by using pre - trained bert transformer to jointly encoding a question and a table to produce structured query for the question .", "however , in practical settings tableqa systems are deployed over table corpora having topic and word distributions quite distinct from bert 's pretraining corpus ."], "relation": "used for", "id": "2021.emnlp-main.342", "year": 2021, "rel_sent": "We show that T3QA provides a reasonably good baseline for our topic shift benchmarks .", "forward": true, "src_ids": "2021.emnlp-main.342_13514"}
{"input": "transformer model is done by using Method| context: pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks . in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard . however , a systematic comparison between these models has not yet been done .", "entity": "transformer model", "output": "in - domain language pretraining", "neg_sample": ["transformer model is done by using Method", "pretrained transformer - based models , such as bert and its variants , have become a common choice to obtain state - of - the - art performances in nlp tasks .", "in the identification of adverse drug events ( ade ) from social media texts , for example , bert architectures rank first in the leaderboard .", "however , a systematic comparison between these models has not yet been done ."], "relation": "used for", "id": "2021.eacl-main.149", "year": 2021, "rel_sent": "SpanBERT and PubMedBERT emerged as the best models in our evaluation : this result clearly shows that span - based pretraining gives a decisive advantage in the precise recognition of ADEs , and that in - domain language pretraining is particularly useful when the transformer model is trained just on biomedical text from scratch .", "forward": false, "src_ids": "2021.eacl-main.149_2931"}
{"input": "joint probabilistic distribution is done by using Method| context: in this paper , we investigate the problem of reasoning over natural language statements . prior neural based approaches do not explicitly consider the inter - dependency among answers and their proofs .", "entity": "joint probabilistic distribution", "output": "probr", "neg_sample": ["joint probabilistic distribution is done by using Method", "in this paper , we investigate the problem of reasoning over natural language statements .", "prior neural based approaches do not explicitly consider the inter - dependency among answers and their proofs ."], "relation": "used for", "id": "2021.findings-acl.277", "year": 2021, "rel_sent": "PROBR defines a joint probabilistic distribution over all possible proof graphs and answers via an induced graphical model .", "forward": false, "src_ids": "2021.findings-acl.277_7461"}
{"input": "dialogue acts prediction is done by using Method| context: regular physical activity is associated with a reduced risk of chronic diseases such as type 2 diabetes and improved mental well - being . yet , more than half of the us population is insufficiently active . health coaching has been successful in promoting healthy behaviors .", "entity": "dialogue acts prediction", "output": "transformer - based machine learning models", "neg_sample": ["dialogue acts prediction is done by using Method", "regular physical activity is associated with a reduced risk of chronic diseases such as type 2 diabetes and improved mental well - being .", "yet , more than half of the us population is insufficiently active .", "health coaching has been successful in promoting healthy behaviors ."], "relation": "used for", "id": "2021.sigdial-1.31", "year": 2021, "rel_sent": "We employ both traditional and transformer - based machine learning models for dialogue acts prediction and find them statistically indistinguishable in performance on our health coaching dataset .", "forward": false, "src_ids": "2021.sigdial-1.31_4994"}
{"input": "general placeholders is used for OtherScientificTerm| context: recent neural text generation models have shown significant improvement in generating descriptive text from structured data such as table formats . one of the remaining important challenges is generating more analytical descriptions that can be inferred from facts in a data source . the use of a template - based generator and a pointer - generator is among the potential alternatives for table - to - text generators .", "entity": "general placeholders", "output": "hallucinated phrases", "neg_sample": ["general placeholders is used for OtherScientificTerm", "recent neural text generation models have shown significant improvement in generating descriptive text from structured data such as table formats .", "one of the remaining important challenges is generating more analytical descriptions that can be inferred from facts in a data source .", "the use of a template - based generator and a pointer - generator is among the potential alternatives for table - to - text generators ."], "relation": "used for", "id": "2021.acl-long.115", "year": 2021, "rel_sent": "The copy mechanism is incorporated in the fine - tuning step by using general placeholders to avoid producing hallucinated phrases that are not supported by a table while preserving high fluency .", "forward": true, "src_ids": "2021.acl-long.115_5079"}
{"input": "topic split benchmarks is used for Method| context: weakly - supervised table question - answering ( tableqa ) models have achieved state - of - art performance by using pre - trained bert transformer to jointly encoding a question and a table to produce structured query for the question . however , in practical settings tableqa systems are deployed over table corpora having topic and word distributions quite distinct from bert 's pretraining corpus .", "entity": "topic split benchmarks", "output": "tableqa solutions", "neg_sample": ["topic split benchmarks is used for Method", "weakly - supervised table question - answering ( tableqa ) models have achieved state - of - art performance by using pre - trained bert transformer to jointly encoding a question and a table to produce structured query for the question .", "however , in practical settings tableqa systems are deployed over table corpora having topic and word distributions quite distinct from bert 's pretraining corpus ."], "relation": "used for", "id": "2021.emnlp-main.342", "year": 2021, "rel_sent": "We believe our topic split benchmarks will lead to robust TableQA solutions that are better suited for practical deployment", "forward": true, "src_ids": "2021.emnlp-main.342_13516"}
{"input": "automated segmentation is used for Task| context: any attempt to integrate nlp systems to the study of endangered languages must take into consideration traditional approaches by both nlp and linguistics .", "entity": "automated segmentation", "output": "documentary and descriptive linguistics", "neg_sample": ["automated segmentation is used for Task", "any attempt to integrate nlp systems to the study of endangered languages must take into consideration traditional approaches by both nlp and linguistics ."], "relation": "used for", "id": "2021.computel-1.11", "year": 2021, "rel_sent": "Integrating Automated Segmentation and Glossing into Documentary and Descriptive Linguistics.", "forward": true, "src_ids": "2021.computel-1.11_15036"}
{"input": "pairre is used for OtherScientificTerm| context: distance based knowledge graph embedding methods show promising results on link prediction task , on which two topics have been widely studied : one is the ability to handle complex relations , such as n - to-1 , 1 - to - n and n - to - n , the other is to encode various relation patterns , such as symmetry / antisymmetry . however , the existing methods fail to solve these two problems at the same time , which leads to unsatisfactory results .", "entity": "pairre", "output": "subrelation further", "neg_sample": ["pairre is used for OtherScientificTerm", "distance based knowledge graph embedding methods show promising results on link prediction task , on which two topics have been widely studied : one is the ability to handle complex relations , such as n - to-1 , 1 - to - n and n - to - n , the other is to encode various relation patterns , such as symmetry / antisymmetry .", "however , the existing methods fail to solve these two problems at the same time , which leads to unsatisfactory results ."], "relation": "used for", "id": "2021.acl-long.336", "year": 2021, "rel_sent": "Given simple constraints on relation representations , PairRE can encode subrelation further .", "forward": true, "src_ids": "2021.acl-long.336_12297"}
{"input": "classifiers is used for OtherScientificTerm| context: humans are increasingly interacting with machines through language , sometimes in contexts where the user may not know they are talking to a machine ( like over the phone or a text chatbot ) .", "entity": "classifiers", "output": "undesired deception", "neg_sample": ["classifiers is used for OtherScientificTerm", "humans are increasingly interacting with machines through language , sometimes in contexts where the user may not know they are talking to a machine ( like over the phone or a text chatbot ) ."], "relation": "used for", "id": "2021.acl-long.544", "year": 2021, "rel_sent": "Such classifiers could be integrated into dialog systems to avoid undesired deception .", "forward": true, "src_ids": "2021.acl-long.544_10679"}
{"input": "open - source morphology development is used for Material| context: this work is located in the context where large written and spoken language corpora are available , which creates a set of unique challenges that have to be , and can be , addressed .", "entity": "open - source morphology development", "output": "komi - zyrian language", "neg_sample": ["open - source morphology development is used for Material", "this work is located in the context where large written and spoken language corpora are available , which creates a set of unique challenges that have to be , and can be , addressed ."], "relation": "used for", "id": "2021.iwclul-1.4", "year": 2021, "rel_sent": "Overview of Open - Source Morphology Development for the Komi - Zyrian Language : Past and future.", "forward": true, "src_ids": "2021.iwclul-1.4_6408"}
{"input": "free software library is used for Task| context: best - worst scaling ( bws ) is a methodology for annotation based on comparing and ranking instances , rather than classifying or scoring individual instances . studies have shown the efficacy of this methodology applied to nlp tasks in terms of a higher quality of the datasets produced by following it .", "entity": "free software library", "output": "bws annotation tasks", "neg_sample": ["free software library is used for Task", "best - worst scaling ( bws ) is a methodology for annotation based on comparing and ranking instances , rather than classifying or scoring individual instances .", "studies have shown the efficacy of this methodology applied to nlp tasks in terms of a higher quality of the datasets produced by following it ."], "relation": "used for", "id": "2021.ranlp-1.15", "year": 2021, "rel_sent": "In this system demonstration paper , we present Litescale , a free software library to create and manage BWS annotation tasks .", "forward": true, "src_ids": "2021.ranlp-1.15_7661"}
{"input": "self - supervised learning objectives is used for OtherScientificTerm| context: while pre - trained language models ( ptlms ) have achieved noticeable success on many nlp tasks , they still struggle for tasks that require event temporal reasoning , which is essential for event - centric applications .", "entity": "self - supervised learning objectives", "output": "masked - out event and temporal indicators", "neg_sample": ["self - supervised learning objectives is used for OtherScientificTerm", "while pre - trained language models ( ptlms ) have achieved noticeable success on many nlp tasks , they still struggle for tasks that require event temporal reasoning , which is essential for event - centric applications ."], "relation": "used for", "id": "2021.emnlp-main.436", "year": 2021, "rel_sent": "We design self - supervised learning objectives to recover masked - out event and temporal indicators and to discriminate sentences from their corrupted counterparts ( where event or temporal indicators got replaced ) .", "forward": true, "src_ids": "2021.emnlp-main.436_16026"}
{"input": "translation model is done by using Method| context: for japanese - to - english translation , zero pronouns in japanese pose a challenge , since the model needs to infer and produce the corresponding pronoun in the target side of the english sentence . however , although fully resolving zero pronouns often needs discourse context , in some cases , the local context within a sentence gives clues to the inference of the zero pronoun .", "entity": "translation model", "output": "data augmentation method", "neg_sample": ["translation model is done by using Method", "for japanese - to - english translation , zero pronouns in japanese pose a challenge , since the model needs to infer and produce the corresponding pronoun in the target side of the english sentence .", "however , although fully resolving zero pronouns often needs discourse context , in some cases , the local context within a sentence gives clues to the inference of the zero pronoun ."], "relation": "used for", "id": "2021.wat-1.11", "year": 2021, "rel_sent": "In this study , we propose a data augmentation method that provides additional training signals for the translation model to learn correlations between local context and zero pronouns .", "forward": false, "src_ids": "2021.wat-1.11_12338"}
{"input": "indoor navigation is done by using Method| context: people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals .", "entity": "indoor navigation", "output": "world model", "neg_sample": ["indoor navigation is done by using Method", "people navigating in unfamiliar buildings take advantage of myriad visual , spatial and semantic cues to efficiently achieve their navigation goals ."], "relation": "used for", "id": "2021.alvr-1.9", "year": 2021, "rel_sent": "Pathdreamer : A World Model for Indoor Navigation.", "forward": false, "src_ids": "2021.alvr-1.9_4227"}
{"input": "latent space is used for Method| context: generating diverse texts is an important factor for unsupervised text generation . one approach is to produce the diversity of texts conditioned by the sampled latent code . although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained .", "entity": "latent space", "output": "language gans", "neg_sample": ["latent space is used for Method", "generating diverse texts is an important factor for unsupervised text generation .", "one approach is to produce the diversity of texts conditioned by the sampled latent code .", "although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained ."], "relation": "used for", "id": "2021.eacl-srw.23", "year": 2021, "rel_sent": "Making Use of Latent Space in Language GANs for Generating Diverse Text without Pre - training.", "forward": true, "src_ids": "2021.eacl-srw.23_11805"}
{"input": "large language model is used for Method| context: we address the task of explaining relationships between two scientific documents using natural language text . this task requires modeling the complex content of long technical documents , deducing a relationship between these documents , and expressing the details of that relationship in text . in addition to the theoretical interest of this task , successful solutions can help improve researcher efficiency in search and review .", "entity": "large language model", "output": "autoregressive approaches", "neg_sample": ["large language model is used for Method", "we address the task of explaining relationships between two scientific documents using natural language text .", "this task requires modeling the complex content of long technical documents , deducing a relationship between these documents , and expressing the details of that relationship in text .", "in addition to the theoretical interest of this task , successful solutions can help improve researcher efficiency in search and review ."], "relation": "used for", "id": "2021.acl-long.166", "year": 2021, "rel_sent": "We pretrain a large language model to serve as the foundation for autoregressive approaches to the task .", "forward": true, "src_ids": "2021.acl-long.166_12435"}
{"input": "weighted sum is used for OtherScientificTerm| context: understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing . although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages .", "entity": "weighted sum", "output": "attention weights", "neg_sample": ["weighted sum is used for OtherScientificTerm", "understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing .", "although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages ."], "relation": "used for", "id": "2021.findings-acl.407", "year": 2021, "rel_sent": "HIT incorporates two attention modules , a multi - headed self - attention and an outer product attention module , and computes their weighted sum to obtain the attention weights .", "forward": true, "src_ids": "2021.findings-acl.407_7800"}
{"input": "layers is used for OtherScientificTerm| context: existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks .", "entity": "layers", "output": "discourse information", "neg_sample": ["layers is used for OtherScientificTerm", "existing work on probing of pretrained language models ( lms ) has predominantly focused on sentence - level syntactic tasks ."], "relation": "used for", "id": "2021.naacl-main.301", "year": 2021, "rel_sent": "Across the different models , there are substantial differences in which layers best capture discourse information , and large disparities between models .", "forward": true, "src_ids": "2021.naacl-main.301_15801"}
{"input": "nlp is done by using Method| context: their success in machine translation and other nlp tasks is phenomenal , but their interpretability is challenging .", "entity": "nlp", "output": "neural networks", "neg_sample": ["nlp is done by using Method", "their success in machine translation and other nlp tasks is phenomenal , but their interpretability is challenging ."], "relation": "used for", "id": "2021.naacl-srw.4", "year": 2021, "rel_sent": "Representations of Meaning in Neural Networks for NLP : a Thesis Proposal.", "forward": false, "src_ids": "2021.naacl-srw.4_10691"}
{"input": "acute psychiatric crises is done by using Material| context: we address the problem of predicting psychiatric hospitalizations using linguistic features drawn from social media posts .", "entity": "acute psychiatric crises", "output": "social media data", "neg_sample": ["acute psychiatric crises is done by using Material", "we address the problem of predicting psychiatric hospitalizations using linguistic features drawn from social media posts ."], "relation": "used for", "id": "2021.clpsych-1.14", "year": 2021, "rel_sent": "Our results suggest that this is a useful framework for collecting hospitalization data , and that social media data can be leveraged to predict acute psychiatric crises before they occur , potentially saving lives and improving outcomes for individuals with mental illness .", "forward": false, "src_ids": "2021.clpsych-1.14_14515"}
{"input": "deep learning model is used for Material| context: ever - expanding evaluative texts on online forums have become an important source of sentiment analysis .", "entity": "deep learning model", "output": "implicit reviews", "neg_sample": ["deep learning model is used for Material", "ever - expanding evaluative texts on online forums have become an important source of sentiment analysis ."], "relation": "used for", "id": "2021.rocling-1.35", "year": 2021, "rel_sent": "We introduce a category , implicit evaluative texts , impevals for short , to investigate how the deep learning model works on these implicit reviews .", "forward": true, "src_ids": "2021.rocling-1.35_2429"}
{"input": "explicit multi - hop reasoning is done by using Method| context: although paths of user interests shift in knowledge graphs ( kgs ) can benefit conversational recommender systems ( crs ) , explicit reasoning on kgs has not been well considered in crs , due to the complex of high - order and incomplete paths .", "entity": "explicit multi - hop reasoning", "output": "crfr", "neg_sample": ["explicit multi - hop reasoning is done by using Method", "although paths of user interests shift in knowledge graphs ( kgs ) can benefit conversational recommender systems ( crs ) , explicit reasoning on kgs has not been well considered in crs , due to the complex of high - order and incomplete paths ."], "relation": "used for", "id": "2021.emnlp-main.355", "year": 2021, "rel_sent": "We propose CRFR , which effectively does explicit multi - hop reasoning on KGs with a conversational context - based reinforcement learning model .", "forward": false, "src_ids": "2021.emnlp-main.355_11988"}
{"input": "context information is used for Task| context: in general , there are two strategies to track a dialogue state : predicting it from scratch and updating it from previous state . the scratch - based strategy obtains each slot value by inquiring all the dialogue history , and the previous - based strategy relies on the current turn dialogue to update the previous dialogue state . obviously , it plays different roles for the context information of different granularity to track different kinds of dialogue states .", "entity": "context information", "output": "dialogue state tracking", "neg_sample": ["context information is used for Task", "in general , there are two strategies to track a dialogue state : predicting it from scratch and updating it from previous state .", "the scratch - based strategy obtains each slot value by inquiring all the dialogue history , and the previous - based strategy relies on the current turn dialogue to update the previous dialogue state .", "obviously , it plays different roles for the context information of different granularity to track different kinds of dialogue states ."], "relation": "used for", "id": "2021.acl-long.193", "year": 2021, "rel_sent": "Comprehensive Study : How the Context Information of Different Granularity Affects Dialogue State Tracking ?.", "forward": true, "src_ids": "2021.acl-long.193_6003"}
{"input": "tapas is done by using Method| context: tables are widely used in various kinds of documents to present information concisely . understanding tables is a challenging problem that requires an understanding of language and table structure , along with numerical and logical reasoning .", "entity": "tapas", "output": "transfer learning", "neg_sample": ["tapas is done by using Method", "tables are widely used in various kinds of documents to present information concisely .", "understanding tables is a challenging problem that requires an understanding of language and table structure , along with numerical and logical reasoning ."], "relation": "used for", "id": "2021.semeval-1.180", "year": 2021, "rel_sent": "In subtask A , we evaluate how transfer learning and standardizing tables to have a single header row improves TAPAS ' performance .", "forward": false, "src_ids": "2021.semeval-1.180_9040"}
{"input": "unintended memorization is done by using Method| context: recent works have shown that language models ( lms ) , e.g. , for next word prediction ( nwp ) , have a tendency to memorize rare or unique sequences in the training data . since useful lms are often trained on sensitive data , it is critical to identify and mitigate such unintended memorization . it differs in many aspects from the well - studied central learning setting where all the data is stored at the central server , and minibatch stochastic gradient descent is used to conduct training .", "entity": "unintended memorization", "output": "federated learning", "neg_sample": ["unintended memorization is done by using Method", "recent works have shown that language models ( lms ) , e.g.", ", for next word prediction ( nwp ) , have a tendency to memorize rare or unique sequences in the training data .", "since useful lms are often trained on sensitive data , it is critical to identify and mitigate such unintended memorization .", "it differs in many aspects from the well - studied central learning setting where all the data is stored at the central server , and minibatch stochastic gradient descent is used to conduct training ."], "relation": "used for", "id": "2021.privatenlp-1.1", "year": 2021, "rel_sent": "Thus , we initiate a formal study to understand the effect of different components of FL on unintended memorization in trained NWP models .", "forward": false, "src_ids": "2021.privatenlp-1.1_7045"}
{"input": "t5 - based joint models is used for OtherScientificTerm| context: in interpretable nlp , we require faithful rationales that reflect the model 's decision - making process for an explained instance . while prior work focuses on extractive rationales ( a subset of the input words ) , we investigate their less - studied counterpart : free - text natural language rationales .", "entity": "t5 - based joint models", "output": "faithful free - text rationales", "neg_sample": ["t5 - based joint models is used for OtherScientificTerm", "in interpretable nlp , we require faithful rationales that reflect the model 's decision - making process for an explained instance .", "while prior work focuses on extractive rationales ( a subset of the input words ) , we investigate their less - studied counterpart : free - text natural language rationales ."], "relation": "used for", "id": "2021.emnlp-main.804", "year": 2021, "rel_sent": "Via two tests , * robustness equivalence * and * feature importance agreement * , we find that state - of - the - art T5 - based joint models exhibit desirable properties for explaining commonsense question - answering and natural language inference , indicating their potential for producing faithful free - text rationales .", "forward": true, "src_ids": "2021.emnlp-main.804_11874"}
{"input": "lexical normalization is used for Material| context: these questions may also affect community formation on social networking sites where differences can be attributed to esl learners and native english speakers . however , few studies have addressed these questions .", "entity": "lexical normalization", "output": "noisy english texts", "neg_sample": ["lexical normalization is used for Material", "these questions may also affect community formation on social networking sites where differences can be attributed to esl learners and native english speakers .", "however , few studies have addressed these questions ."], "relation": "used for", "id": "2021.wnut-1.50", "year": 2021, "rel_sent": "The experimental results showed that although intermediate - level ESL learners can read most noisy English texts in the first place , lexical normalization significantly improves the readability of noisy English texts for ESL learners .", "forward": true, "src_ids": "2021.wnut-1.50_11296"}
{"input": "aspect - based sentiment classification is done by using Method| context: graph - based aspect - based sentiment classification ( absc ) approaches have yielded state - of - the - art results , expecially when equipped with contextual word embedding from pre - training language models ( plms ) . however , they ignore sequential features of the context and have not yet made the best of plms .", "entity": "aspect - based sentiment classification", "output": "gcn", "neg_sample": ["aspect - based sentiment classification is done by using Method", "graph - based aspect - based sentiment classification ( absc ) approaches have yielded state - of - the - art results , expecially when equipped with contextual word embedding from pre - training language models ( plms ) .", "however , they ignore sequential features of the context and have not yet made the best of plms ."], "relation": "used for", "id": "2021.emnlp-main.724", "year": 2021, "rel_sent": "BERT4GCN : Using BERT Intermediate Layers to Augment GCN for Aspect - based Sentiment Classification.", "forward": false, "src_ids": "2021.emnlp-main.724_12650"}
{"input": "sentence processing is done by using OtherScientificTerm| context: context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort . however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context .", "entity": "sentence processing", "output": "neuro - cognitive mechanisms", "neg_sample": ["sentence processing is done by using OtherScientificTerm", "context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort .", "however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context ."], "relation": "used for", "id": "2021.findings-acl.332", "year": 2021, "rel_sent": "More generally , our approach adds to a growing literature using methods from computational linguistics to operationalize and test hypotheses about neuro - cognitive mechanisms in sentence processing .", "forward": false, "src_ids": "2021.findings-acl.332_5738"}
{"input": "biomedical relation extraction is done by using Method| context: the recent advancement of pre - trained transformer models has propelled the development of effective text mining models across various biomedical tasks . however , these models are primarily learned on the textual data and often lack the domain knowledge of the entities to capture the context beyond the sentence .", "entity": "biomedical relation extraction", "output": "multimodal graph - based transformer framework", "neg_sample": ["biomedical relation extraction is done by using Method", "the recent advancement of pre - trained transformer models has propelled the development of effective text mining models across various biomedical tasks .", "however , these models are primarily learned on the textual data and often lack the domain knowledge of the entities to capture the context beyond the sentence ."], "relation": "used for", "id": "2021.findings-acl.328", "year": 2021, "rel_sent": "Multimodal Graph - based Transformer Framework for Biomedical Relation Extraction.", "forward": false, "src_ids": "2021.findings-acl.328_7435"}
{"input": "coreference resolution system is used for Material| context: it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words . studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results .", "entity": "coreference resolution system", "output": "english", "neg_sample": ["coreference resolution system is used for Material", "it is often posited that more predictable parts of a speaker 's meaning tend to be made less explicit , for instance using shorter , less informative words .", "studying these dynamics in the domain of referring expressions has proven difficult , with existing studies , both psycholinguistic and corpus - based , providing contradictory results ."], "relation": "used for", "id": "2021.conll-1.36", "year": 2021, "rel_sent": "We obtain these estimates training an existing coreference resolution system for English on a new task , masked coreference resolution , giving us a probability distribution over referents that is conditioned on the context but not the referring expression .", "forward": true, "src_ids": "2021.conll-1.36_7911"}
{"input": "simultaneous speech translation ( simulst ) is used for Task| context: with the increased audiovisualisation of communication , the need for live subtitles in multilingual events is more relevant than ever . however , the word - for - word rate of generation of simulst systems is not optimal for displaying the subtitles in a comprehensible and readable way .", "entity": "simultaneous speech translation ( simulst )", "output": "live subtitling", "neg_sample": ["simultaneous speech translation ( simulst ) is used for Task", "with the increased audiovisualisation of communication , the need for live subtitles in multilingual events is more relevant than ever .", "however , the word - for - word rate of generation of simulst systems is not optimal for displaying the subtitles in a comprehensible and readable way ."], "relation": "used for", "id": "2021.mtsummit-asltrw.4", "year": 2021, "rel_sent": "In an attempt to automatise the process , we aim at exploring the feasibility of simultaneous speech translation ( SimulST ) for live subtitling .", "forward": true, "src_ids": "2021.mtsummit-asltrw.4_5624"}
{"input": "summarization is done by using Task| context: how to generate summaries of different styles without requiring corpora in the target styles , or training separate models ?", "entity": "summarization", "output": "inference time style control", "neg_sample": ["summarization is done by using Task", "how to generate summaries of different styles without requiring corpora in the target styles , or training separate models ?"], "relation": "used for", "id": "2021.naacl-main.476", "year": 2021, "rel_sent": "Inference Time Style Control for Summarization.", "forward": false, "src_ids": "2021.naacl-main.476_3550"}
{"input": "graph - based sentence ordering is done by using OtherScientificTerm| context: dominant sentence ordering models can be classified into pairwise ordering models and set - to - sequence models . however , there is little attempt to combine these two types of models , which inituitively possess complementary advantages .", "entity": "graph - based sentence ordering", "output": "pairwise orderings", "neg_sample": ["graph - based sentence ordering is done by using OtherScientificTerm", "dominant sentence ordering models can be classified into pairwise ordering models and set - to - sequence models .", "however , there is little attempt to combine these two types of models , which inituitively possess complementary advantages ."], "relation": "used for", "id": "2021.emnlp-main.186", "year": 2021, "rel_sent": "In this paper , we propose a novel sentence ordering framework which introduces two classifiers to make better use of pairwise orderings for graph - based sentence ordering ( Yin et al .", "forward": false, "src_ids": "2021.emnlp-main.186_5031"}
{"input": "domain - general knowledge is used for OtherScientificTerm| context: unsupervised machine translation , which utilizes unpaired monolingual corpora as training data , has achieved comparable performance against supervised machine translation .", "entity": "domain - general knowledge", "output": "data - scarce domains", "neg_sample": ["domain - general knowledge is used for OtherScientificTerm", "unsupervised machine translation , which utilizes unpaired monolingual corpora as training data , has achieved comparable performance against supervised machine translation ."], "relation": "used for", "id": "2021.acl-long.225", "year": 2021, "rel_sent": "We assume that domain - general knowledge is a significant factor in handling data - scarce domains .", "forward": true, "src_ids": "2021.acl-long.225_11380"}
{"input": "well - calibrated model is used for Task| context: it has been gradually adopted and practiced in medical organizations , largely due to the advancement of nlp . the introduction of state - of - the - art deep learning models and transfer learning techniques like universal language model fine tuning ( ulmfit ) and knowledge distillation ( kd ) largely contributes to the performance of nlp tasks . however , some deep neural networks are poorly calibrated and wrongly estimate the uncertainty . hence the model is not trustworthy , especially in sensitive medical decisionmaking systems and safety tasks .", "entity": "well - calibrated model", "output": "medical dialogue system", "neg_sample": ["well - calibrated model is used for Task", "it has been gradually adopted and practiced in medical organizations , largely due to the advancement of nlp .", "the introduction of state - of - the - art deep learning models and transfer learning techniques like universal language model fine tuning ( ulmfit ) and knowledge distillation ( kd ) largely contributes to the performance of nlp tasks .", "however , some deep neural networks are poorly calibrated and wrongly estimate the uncertainty .", "hence the model is not trustworthy , especially in sensitive medical decisionmaking systems and safety tasks ."], "relation": "used for", "id": "2021.icnlsp-1.22", "year": 2021, "rel_sent": "In this paper , we investigate the well - calibrated model for ULMFiT and self - distillation ( SD ) in a medical dialogue system .", "forward": true, "src_ids": "2021.icnlsp-1.22_12891"}
{"input": "end - to - end speech translation is done by using Method| context: speech translation is the translation of speech in one language typically to text in another , traditionally accomplished through a combination of automatic speech recognition and machine translation . speech translation has attracted interest for many years , but the recent successful applications of deep learning to both individual tasks have enabled new opportunities through joint modeling , in what we today call ' end - to - end speech translation . '", "entity": "end - to - end speech translation", "output": "model architectures", "neg_sample": ["end - to - end speech translation is done by using Method", "speech translation is the translation of speech in one language typically to text in another , traditionally accomplished through a combination of automatic speech recognition and machine translation .", "speech translation has attracted interest for many years , but the recent successful applications of deep learning to both individual tasks have enabled new opportunities through joint modeling , in what we today call ' end - to - end speech translation . '"], "relation": "used for", "id": "2021.eacl-tutorials.3", "year": 2021, "rel_sent": "Starting from the traditional cascaded approach , we will given an overview on data sources and model architectures to achieve state - of - the art performance with end - to - end speech translation for both high- and low - resource languages .", "forward": false, "src_ids": "2021.eacl-tutorials.3_140"}
{"input": "english gpt-2 is used for Material| context: large generative language models have been very successful for english , but other languages lag behind due to data and computational limitations .", "entity": "english gpt-2", "output": "italian", "neg_sample": ["english gpt-2 is used for Material", "large generative language models have been very successful for english , but other languages lag behind due to data and computational limitations ."], "relation": "used for", "id": "2021.findings-acl.74", "year": 2021, "rel_sent": "Specifically , we describe the adaptation of English GPT-2 to Italian and Dutch by retraining lexical embeddings without tuning the Transformer layers .", "forward": true, "src_ids": "2021.findings-acl.74_7021"}
{"input": "controlled generation of longer sequences is done by using OtherScientificTerm| context: despite recent successes of large pre - trained language models in solving reasoning tasks , their inference capabilities remain opaque . we posit that such models can be made more interpretable by explicitly generating interim inference rules , and using them to guide the generation of task - specific textual outputs .", "entity": "controlled generation of longer sequences", "output": "recursive nature of", "neg_sample": ["controlled generation of longer sequences is done by using OtherScientificTerm", "despite recent successes of large pre - trained language models in solving reasoning tasks , their inference capabilities remain opaque .", "we posit that such models can be made more interpretable by explicitly generating interim inference rules , and using them to guide the generation of task - specific textual outputs ."], "relation": "used for", "id": "2021.acl-long.395", "year": 2021, "rel_sent": "The recursive nature of holds the potential for controlled generation of longer sequences .", "forward": false, "src_ids": "2021.acl-long.395_339"}
{"input": "question - answer pairs is used for OtherScientificTerm| context: multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts . current methods confronting consolidation struggle tofuse overlapping information .", "entity": "question - answer pairs", "output": "predicate - argument relations", "neg_sample": ["question - answer pairs is used for OtherScientificTerm", "multi - text applications , such as multi - document summarization , are typically required to model redundancies across related texts .", "current methods confronting consolidation struggle tofuse overlapping information ."], "relation": "used for", "id": "2021.emnlp-main.778", "year": 2021, "rel_sent": "Our setting exploits QA - SRL , utilizing question - answer pairs to capture predicate - argument relations , facilitating laymen annotation of cross - text alignments .", "forward": true, "src_ids": "2021.emnlp-main.778_14511"}
{"input": "demography is used for Task| context: the existing research on sentiment analysis mainly utilized data curated in limited geographical regions and demography ( e.g. , usa , uk , china ) due to commercial interest and availability of review data . since the user 's attitudes and preferences can be affected by numerous sociocultural factors and demographic characteristics , it is necessary to have annotated review datasets belong to various demography .", "entity": "demography", "output": "linguistic aspects of reviews", "neg_sample": ["demography is used for Task", "the existing research on sentiment analysis mainly utilized data curated in limited geographical regions and demography ( e.g.", ", usa , uk , china ) due to commercial interest and availability of review data .", "since the user 's attitudes and preferences can be affected by numerous sociocultural factors and demographic characteristics , it is necessary to have annotated review datasets belong to various demography ."], "relation": "used for", "id": "2021.ranlp-1.144", "year": 2021, "rel_sent": "The data analysis reveals that demography plays an influential role in the linguistic aspects of reviews .", "forward": true, "src_ids": "2021.ranlp-1.144_16179"}
{"input": "semantic operations is done by using Method| context: neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks . compositional generalization requires algebraic recombination , i.e. , dynamically recombining structured expressions in a recursive manner . however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination .", "entity": "semantic operations", "output": "interpreter", "neg_sample": ["semantic operations is done by using Method", "neural sequence models exhibit limited compositional generalization ability in semantic parsing tasks .", "compositional generalization requires algebraic recombination , i.e.", ", dynamically recombining structured expressions in a recursive manner .", "however , most previous studies mainly concentrate on recombining lexical units , which is an important but not sufficient part of algebraic recombination ."], "relation": "used for", "id": "2021.findings-acl.97", "year": 2021, "rel_sent": "Specifically , we learn two modules jointly : a Composer for producing latent syntax , and an Interpreter for assigning semantic operations .", "forward": false, "src_ids": "2021.findings-acl.97_12467"}
{"input": "symmetry of the compact closed category is used for OtherScientificTerm| context: while the discocat model ( coecke et al . , 2010 ) has been proved a valuable tool for studying compositional aspects of language at the level of semantics , its strong dependency on pregroup grammars poses important restrictions : first , it prevents large - scale experimentation due to the absence of a pregroup parser ; and second , it limits the expressibility of the model to context - free grammars .", "entity": "symmetry of the compact closed category", "output": "word meaning", "neg_sample": ["symmetry of the compact closed category is used for OtherScientificTerm", "while the discocat model ( coecke et al .", ", 2010 ) has been proved a valuable tool for studying compositional aspects of language at the level of semantics , its strong dependency on pregroup grammars poses important restrictions : first , it prevents large - scale experimentation due to the absence of a pregroup parser ; and second , it limits the expressibility of the model to context - free grammars ."], "relation": "used for", "id": "2021.semspace-1.3", "year": 2021, "rel_sent": "We start by showing that standard categorial grammars can be expressed as a biclosed category , where all rules emerge as currying / uncurrying the identity ; we then proceed to model permutation - inducing rules by exploiting the symmetry of the compact closed category encoding the word meaning .", "forward": true, "src_ids": "2021.semspace-1.3_5326"}
{"input": "nlp problems is done by using Method| context: there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization . despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges .", "entity": "nlp problems", "output": "graph neural networks ( gnns )", "neg_sample": ["nlp problems is done by using Method", "there has seen a surge of interests in applying deep learning on graph techniques to nlp , and has achieved considerable success in many nlp tasks , ranging from classification tasks like sentence classification , semantic role labeling and relation extraction , to generation tasks like machine translation , question generation and summarization .", "despite these successes , deep learning on graphs for nlp still face many challenges , including automatically transforming original text sequence data into highly graph - structured data , and effectively modeling complex data that involves mapping between graph - based inputs and other highly structured output data such as sequences , trees , and graph data with multi - types in both nodes and edges ."], "relation": "used for", "id": "2021.naacl-tutorials.3", "year": 2021, "rel_sent": "In addition , hands - on demonstration sessions will be included to help the audience gain practical experience on applying GNNs to solve challenging NLP problems using our recently developed open source library - Graph4NLP , the first library for researchers and practitioners for easy use of GNNs for various NLP tasks .", "forward": false, "src_ids": "2021.naacl-tutorials.3_8181"}
{"input": "constituency parsing task is done by using Method| context: modern approaches to constituency parsing are mono - lingual supervised approaches which require large amount of labelled data to be trained on , thus limiting their utility to only a handful of high - resource languages .", "entity": "constituency parsing task", "output": "cross - lingual transfer learning", "neg_sample": ["constituency parsing task is done by using Method", "modern approaches to constituency parsing are mono - lingual supervised approaches which require large amount of labelled data to be trained on , thus limiting their utility to only a handful of high - resource languages ."], "relation": "used for", "id": "2021.rocling-1.1", "year": 2021, "rel_sent": "UniRNNG involves Cross - lingual Transfer Learning for Constituency Parsing task .", "forward": false, "src_ids": "2021.rocling-1.1_13314"}
{"input": "features is done by using Method| context: one of the mechanisms through which disinformation is spreading online , in particular through social media , is by employing propaganda techniques . these include specific rhetorical and psychological strategies , ranging from leveraging on emotions to exploiting logical fallacies .", "entity": "features", "output": "linguistic analysis", "neg_sample": ["features is done by using Method", "one of the mechanisms through which disinformation is spreading online , in particular through social media , is by employing propaganda techniques .", "these include specific rhetorical and psychological strategies , ranging from leveraging on emotions to exploiting logical fallacies ."], "relation": "used for", "id": "2021.ranlp-1.168", "year": 2021, "rel_sent": "More precisely , we propose a supervised approach to classify textual snippets both as propaganda messages and according to the precise applied propaganda technique , as well as a detailed linguistic analysis of the features characterising propaganda information in text ( e.g. , semantic , sentiment and argumentation features ) .", "forward": false, "src_ids": "2021.ranlp-1.168_1835"}
{"input": "abstractive explainer is done by using OtherScientificTerm| context: how can we generate concise explanations for multi - hop reading comprehension ( rc ) ? the current strategies of identifying supporting sentences can be seen as an extractive question - focused summarization of the input text . however , these extractive explanations are not necessarily concise i.e. not minimally sufficient for answering a question .", "entity": "abstractive explainer", "output": "human - annotated abstractive explanations", "neg_sample": ["abstractive explainer is done by using OtherScientificTerm", "how can we generate concise explanations for multi - hop reading comprehension ( rc ) ?", "the current strategies of identifying supporting sentences can be seen as an extractive question - focused summarization of the input text .", "however , these extractive explanations are not necessarily concise i.e.", "not minimally sufficient for answering a question ."], "relation": "used for", "id": "2021.emnlp-main.490", "year": 2021, "rel_sent": "Given a limited amount of human - annotated abstractive explanations , we train the abstractive explainer in a semi - supervised manner , where we start from the supervised model and then train it further through trial and error maximizing a conciseness - promoted reward function .", "forward": false, "src_ids": "2021.emnlp-main.490_7953"}
{"input": "topic modeling is used for OtherScientificTerm| context: we apply statistical techniques from natural language processing to a collection of western and hong kong - based english - language newspaper articles spanning the years 1998 - 2020 , studying the difference and evolution of its portrayal . we observe that both content and attitudes differ between western and hong kong - based sources .", "entity": "topic modeling", "output": "protests", "neg_sample": ["topic modeling is used for OtherScientificTerm", "we apply statistical techniques from natural language processing to a collection of western and hong kong - based english - language newspaper articles spanning the years 1998 - 2020 , studying the difference and evolution of its portrayal .", "we observe that both content and attitudes differ between western and hong kong - based sources ."], "relation": "used for", "id": "2021.case-1.7", "year": 2021, "rel_sent": "Topic modeling detects salient aspects of protests and shows that Hong Kong - based papers made fewer references to police violence during the Anti - Extradition Law Amendment Bill Movement .", "forward": true, "src_ids": "2021.case-1.7_5998"}
{"input": "cluster overlap of speaker attributes is done by using Method| context: state - of - the - art variational auto - encoders ( vaes ) for learning disentangled latent representations give impressive results in discovering features like pitch , pause duration , and accent in speech data , leading to highly controllable text - to - speech ( tts ) synthesis . however , these lstm - based vaes fail to learn latent clusters of speaker attributes when trained on limited or noisy datasets . further , different latent variables are found to encode the same features , limiting the control and expressiveness during speech synthesis .", "entity": "cluster overlap of speaker attributes", "output": "reordered transformer encoder with minimal mutual information", "neg_sample": ["cluster overlap of speaker attributes is done by using Method", "state - of - the - art variational auto - encoders ( vaes ) for learning disentangled latent representations give impressive results in discovering features like pitch , pause duration , and accent in speech data , leading to highly controllable text - to - speech ( tts ) synthesis .", "however , these lstm - based vaes fail to learn latent clusters of speaker attributes when trained on limited or noisy datasets .", "further , different latent variables are found to encode the same features , limiting the control and expressiveness during speech synthesis ."], "relation": "used for", "id": "2021.findings-acl.312", "year": 2021, "rel_sent": "We show that REMMI reduces the cluster overlap of speaker attributes by at least 30 % over LSTM - VAE .", "forward": false, "src_ids": "2021.findings-acl.312_13163"}
{"input": "sentence selection task is done by using Metric| context: answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage . this task can naturally be reduced to the semantic text similarity problem between question and answer candidate .", "entity": "sentence selection task", "output": "similarity measures", "neg_sample": ["sentence selection task is done by using Metric", "answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage .", "this task can naturally be reduced to the semantic text similarity problem between question and answer candidate ."], "relation": "used for", "id": "2021.paclic-1.29", "year": 2021, "rel_sent": "Combining Karaka relations with different similarity measures shows significant performance improvement for sentence selection task , suggesting them as potentially a semantic similarity measure .", "forward": false, "src_ids": "2021.paclic-1.29_5277"}
{"input": "lexical semantic change is done by using Method| context: just as the meaning of words is tied to the communities in which they are used , so too is semantic change . but how does lexical semantic change manifest differently across different communities ?", "entity": "lexical semantic change", "output": "distributional methods", "neg_sample": ["lexical semantic change is done by using Method", "just as the meaning of words is tied to the communities in which they are used , so too is semantic change .", "but how does lexical semantic change manifest differently across different communities ?"], "relation": "used for", "id": "2021.starsem-1.3", "year": 2021, "rel_sent": "We use distributional methods to quantify lexical semantic change and induce a social network on communities , based on interactions between members .", "forward": false, "src_ids": "2021.starsem-1.3_11335"}
{"input": "spider benchmark is used for Task| context: recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries . despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism .", "entity": "spider benchmark", "output": "text - to - sql translation", "neg_sample": ["spider benchmark is used for Task", "recently , there has been significant progress in studying neural networks to translate text descriptions into sql queries .", "despite achieving good performance on some public benchmarks , existing text - to - sql models typically rely on the lexical matching between words in natural language ( nl ) questions and tokens in table schemas , which may render the models vulnerable to attacks that break the schema linking mechanism ."], "relation": "used for", "id": "2021.acl-long.195", "year": 2021, "rel_sent": "In particular , we introduce Spider - Syn , a human - curated dataset based on the Spider benchmark for text - to - SQL translation .", "forward": true, "src_ids": "2021.acl-long.195_15360"}
{"input": "reinforcement learning is used for Task| context: the task can be any language - related task , from intent detection tofull task - oriented conversations .", "entity": "reinforcement learning", "output": "data generation process", "neg_sample": ["reinforcement learning is used for Task", "the task can be any language - related task , from intent detection tofull task - oriented conversations ."], "relation": "used for", "id": "2021.sigdial-1.12", "year": 2021, "rel_sent": "We use reinforcement learning to optimize the data generation process where the reward signal is the agent 's performance on the task .", "forward": true, "src_ids": "2021.sigdial-1.12_8030"}
{"input": "abductive inference is done by using Task| context: abductive reasoning starts from some observations and aims at finding the most plausible explanation for these observations . to perform abduction , humans often make use of temporal and causal inferences , and knowledge about how some hypothetical situation can result in different outcomes .", "entity": "abductive inference", "output": "generating hypothetical events", "neg_sample": ["abductive inference is done by using Task", "abductive reasoning starts from some observations and aims at finding the most plausible explanation for these observations .", "to perform abduction , humans often make use of temporal and causal inferences , and knowledge about how some hypothetical situation can result in different outcomes ."], "relation": "used for", "id": "2021.starsem-1.6", "year": 2021, "rel_sent": "Generating Hypothetical Events for Abductive Inference.", "forward": false, "src_ids": "2021.starsem-1.6_3117"}
{"input": "mtxnet is used for OtherScientificTerm| context: explainable deep learning models are advantageous in many situations . prior work mostly provide unimodal explanations through post - hoc approaches not part of the original system design . explanation mechanisms also ignore useful textual information present in images .", "entity": "mtxnet", "output": "multimodal explanations", "neg_sample": ["mtxnet is used for OtherScientificTerm", "explainable deep learning models are advantageous in many situations .", "prior work mostly provide unimodal explanations through post - hoc approaches not part of the original system design .", "explanation mechanisms also ignore useful textual information present in images ."], "relation": "used for", "id": "2021.maiworkshop-1.4", "year": 2021, "rel_sent": "In this paper , we propose MTXNet , an end - to - end trainable multimodal architecture to generate multimodal explanations , which focuses on the text in the image .", "forward": true, "src_ids": "2021.maiworkshop-1.4_15283"}
{"input": "layoutlmv2 is done by using Method| context: pre - training of text and layout has proved effective in a variety of visually - rich document understanding tasks due to its effective model architecture and the advantage of large - scale unlabeled scanned / digital - born documents .", "entity": "layoutlmv2", "output": "two - stream multi - modal transformer encoder", "neg_sample": ["layoutlmv2 is done by using Method", "pre - training of text and layout has proved effective in a variety of visually - rich document understanding tasks due to its effective model architecture and the advantage of large - scale unlabeled scanned / digital - born documents ."], "relation": "used for", "id": "2021.acl-long.201", "year": 2021, "rel_sent": "Specifically , with a two - stream multi - modal Transformer encoder , LayoutLMv2 uses not only the existing masked visual - language modeling task but also the new text - image alignment and text - image matching tasks , which make it better capture the cross - modality interaction in the pre - training stage .", "forward": false, "src_ids": "2021.acl-long.201_15327"}
{"input": "contrastive fine - tuning method is used for Method| context: the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain .", "entity": "contrastive fine - tuning method", "output": "bert and bart - based rankers", "neg_sample": ["contrastive fine - tuning method is used for Method", "the performance of state - of - the - art neural rankers can deteriorate substantially when exposed to noisy inputs or applied to a new domain ."], "relation": "used for", "id": "2021.findings-acl.51", "year": 2021, "rel_sent": "In experiments with four passage ranking datasets , the proposed contrastive fine - tuning method obtains improvements on robustness to query reformulations , noise perturbations , and zeroshot transfer for both BERT and BART - based rankers .", "forward": true, "src_ids": "2021.findings-acl.51_12708"}
{"input": "incremental and character - level approach is used for Task| context: recent work has adopted models of pragmatic reasoning for the generation of informative language in , e.g. , image captioning .", "entity": "incremental and character - level approach", "output": "pragmatically informative neural image captioning", "neg_sample": ["incremental and character - level approach is used for Task", "recent work has adopted models of pragmatic reasoning for the generation of informative language in , e.g.", ", image captioning ."], "relation": "used for", "id": "2021.inlg-1.41", "year": 2021, "rel_sent": "We propose a simple but highly effective relaxation of fully rational decoding , based on an existing incremental and character - level approach to pragmatically informative neural image captioning .", "forward": true, "src_ids": "2021.inlg-1.41_6775"}
{"input": "entangled ' representations is used for OtherScientificTerm| context: named entity recognition systems achieve remarkable performance on domains such as english news . it is natural to ask : what are these models actually learning to achieve this ? are they merely memorizing the names themselves ? or are they capable of interpreting the text and inferring the correct entity type from the linguistic context ?", "entity": "entangled ' representations", "output": "contextual and local token information", "neg_sample": ["entangled ' representations is used for OtherScientificTerm", "named entity recognition systems achieve remarkable performance on domains such as english news .", "it is natural to ask : what are these models actually learning to achieve this ?", "are they merely memorizing the names themselves ?", "or are they capable of interpreting the text and inferring the correct entity type from the linguistic context ?"], "relation": "used for", "id": "2021.cl-1.5", "year": 2021, "rel_sent": "Finally , we find that one issue contributing to model errors is the use of ' entangled ' representations that encode both contextual and local token information into a single vector , which can obscure clues .", "forward": true, "src_ids": "2021.cl-1.5_2198"}
{"input": "research literature navigation function is done by using Task| context: health and medical researchers often give clinical and policy recommendations to inform health practice and public health policy . however , no current health information system supports the direct retrieval of health advice .", "entity": "research literature navigation function", "output": "retrieving health advice sentences", "neg_sample": ["research literature navigation function is done by using Task", "health and medical researchers often give clinical and policy recommendations to inform health practice and public health policy .", "however , no current health information system supports the direct retrieval of health advice ."], "relation": "used for", "id": "2021.emnlp-main.486", "year": 2021, "rel_sent": "We also conducted a case study that applied this prediction model to retrieve specific health advice on COVID-19 treatments from LitCovid , a large COVID research literature portal , demonstrating the usefulness of retrieving health advice sentences as an advanced research literature navigation function for health researchers and the general public .", "forward": false, "src_ids": "2021.emnlp-main.486_15172"}
{"input": "event semantics is done by using Method| context: event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning . however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data .", "entity": "event semantics", "output": "text encoder", "neg_sample": ["event semantics is done by using Method", "event extraction ( ee ) has considerably benefited from pre - trained language models ( plms ) by fine - tuning .", "however , existing pre - training methods have not involved modeling event characteristics , resulting in the developed ee models can not take full advantage of large - scale unsupervised data ."], "relation": "used for", "id": "2021.acl-long.491", "year": 2021, "rel_sent": "CLEVE contains a text encoder to learn event semantics and a graph encoder to learn event structures respectively .", "forward": false, "src_ids": "2021.acl-long.491_6665"}
{"input": "data augmentation is used for Task| context: current work in named entity recognition ( ner ) shows that data augmentation techniques can produce more robust models . however , most existing techniques focus on augmenting in - domain data in low - resource scenarios where annotated data is quite limited .", "entity": "data augmentation", "output": "cross - domain named entity recognition", "neg_sample": ["data augmentation is used for Task", "current work in named entity recognition ( ner ) shows that data augmentation techniques can produce more robust models .", "however , most existing techniques focus on augmenting in - domain data in low - resource scenarios where annotated data is quite limited ."], "relation": "used for", "id": "2021.emnlp-main.434", "year": 2021, "rel_sent": "Data Augmentation for Cross - Domain Named Entity Recognition.", "forward": true, "src_ids": "2021.emnlp-main.434_5137"}
{"input": "decoder is done by using Method| context: abstractive summarization for long - document or multi - document remains challenging for the seq2seq architecture , as seq2seq is not good at analyzing long - distance relations in text .", "entity": "decoder", "output": "graph - propagation attention mechanism", "neg_sample": ["decoder is done by using Method", "abstractive summarization for long - document or multi - document remains challenging for the seq2seq architecture , as seq2seq is not good at analyzing long - distance relations in text ."], "relation": "used for", "id": "2021.acl-long.472", "year": 2021, "rel_sent": "Specifically , several graph augmentation methods are designed to encode both the explicit and implicit relations in the text while the graph - propagation attention mechanism is developed in the decoder to select salient content into the summary .", "forward": false, "src_ids": "2021.acl-long.472_8136"}
{"input": "children 's speech is done by using OtherScientificTerm| context: in this type of mismatched condition , the asr performance is degraded due to the acoustic and linguistic mismatch in the attributes between children and adult speakers .", "entity": "children 's speech", "output": "spectral tilt", "neg_sample": ["children 's speech is done by using OtherScientificTerm", "in this type of mismatched condition , the asr performance is degraded due to the acoustic and linguistic mismatch in the attributes between children and adult speakers ."], "relation": "used for", "id": "2021.nodalida-main.10", "year": 2021, "rel_sent": "In this paper , we propose spectral modification by sharpening formants and by reducing the spectral tilt to recognize children 's speech by automatic speech recognition ( ASR ) systems developed using adult speech .", "forward": false, "src_ids": "2021.nodalida-main.10_609"}
{"input": "pre - training is done by using Method| context: as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data . recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod.", "entity": "pre - training", "output": "self - training approach", "neg_sample": ["pre - training is done by using Method", "as the labeling cost for different modules in task - oriented dialog ( tod ) systems is expensive , a major challenge is to train different modules with the least amount of labeled data .", "recently , large - scale pre - trained language models , have shown promising results for few - shot learning in tod."], "relation": "used for", "id": "2021.emnlp-main.142", "year": 2021, "rel_sent": "Self - training Improves Pre - training for Few - shot Learning in Task - oriented Dialog Systems.", "forward": false, "src_ids": "2021.emnlp-main.142_11435"}
{"input": "answer sentence selection task is done by using Metric| context: answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage . this task can naturally be reduced to the semantic text similarity problem between question and answer candidate .", "entity": "answer sentence selection task", "output": "similarity measures", "neg_sample": ["answer sentence selection task is done by using Metric", "answer sentence selection is an important sub - task in question answering ( qa ) that determines the correct answer sentence from a passage .", "this task can naturally be reduced to the semantic text similarity problem between question and answer candidate ."], "relation": "used for", "id": "2021.paclic-1.29", "year": 2021, "rel_sent": "Study of Similarity Measures as Features in Classification for Answer Sentence Selection Task in Hindi Question Answering : Language - Specific v / s Other Measures.", "forward": false, "src_ids": "2021.paclic-1.29_5273"}
{"input": "parallel sentences is done by using Material| context: the quality and quantity of parallel sentences are known as very important training data for constructing neural machine translation ( nmt ) systems . however , these resources are not available for many low - resource language pairs . many existing methods need strong supervision are not suitable . although several attempts at developing unsupervised models , they ignore the language - invariant between languages .", "entity": "parallel sentences", "output": "bilingual corpora of rich - resource language pairs", "neg_sample": ["parallel sentences is done by using Material", "the quality and quantity of parallel sentences are known as very important training data for constructing neural machine translation ( nmt ) systems .", "however , these resources are not available for many low - resource language pairs .", "many existing methods need strong supervision are not suitable .", "although several attempts at developing unsupervised models , they ignore the language - invariant between languages ."], "relation": "used for", "id": "2021.naacl-srw.17", "year": 2021, "rel_sent": "In this paper , we propose an approach based on transfer learning to mine parallel sentences in the unsupervised setting . With the help of bilingual corpora of rich - resource language pairs , we can mine parallel sentences without bilingual supervision of low - resource language pairs .", "forward": false, "src_ids": "2021.naacl-srw.17_9063"}
{"input": "geocoding text data is used for Material| context: text data are an important source of detailed information about social and political events . automated systems parse large volumes of text data to infer or extract structured information that describes actors , actions , dates , times , and locations . one of these sub - tasks is geocoding : predicting the geographic coordinates associated with events or locations described by a given text .", "entity": "geocoding text data", "output": "event data", "neg_sample": ["geocoding text data is used for Material", "text data are an important source of detailed information about social and political events .", "automated systems parse large volumes of text data to infer or extract structured information that describes actors , actions , dates , times , and locations .", "one of these sub - tasks is geocoding : predicting the geographic coordinates associated with events or locations described by a given text ."], "relation": "used for", "id": "2021.case-1.8", "year": 2021, "rel_sent": "I compare the model - based solution , called ELECTRo - map , to the current state - of - the - art open source system for geocoding texts for event data .", "forward": true, "src_ids": "2021.case-1.8_3479"}
{"input": "attention mechanisms is used for OtherScientificTerm| context: unlike the standard image captioning task , news images depict situations where people , locations , and events are of paramount importance .", "entity": "attention mechanisms", "output": "named entities", "neg_sample": ["attention mechanisms is used for OtherScientificTerm", "unlike the standard image captioning task , news images depict situations where people , locations , and events are of paramount importance ."], "relation": "used for", "id": "2021.emnlp-main.542", "year": 2021, "rel_sent": "More specifically , built upon the Transformer architecture , our model is further equipped with novel multi - modal feature fusion techniques and attention mechanisms , which are designed to generate named entities more accurately .", "forward": true, "src_ids": "2021.emnlp-main.542_9488"}
{"input": "semi - automatic mechanism is used for Task| context: most existing works use visual attributes labeled by humans , not suitable for large - scale applications . we argue that documents like wikipedia pages contain rich visual information , which however can easily be buried by the vast amount of non - visual sentences .", "entity": "semi - automatic mechanism", "output": "visual sentence extraction", "neg_sample": ["semi - automatic mechanism is used for Task", "most existing works use visual attributes labeled by humans , not suitable for large - scale applications .", "we argue that documents like wikipedia pages contain rich visual information , which however can easily be buried by the vast amount of non - visual sentences ."], "relation": "used for", "id": "2021.naacl-main.250", "year": 2021, "rel_sent": "To address this issue , we propose a semi - automatic mechanism for visual sentence extraction that leverages the document section headers and the clustering structure of visual sentences .", "forward": true, "src_ids": "2021.naacl-main.250_7504"}
{"input": "out - of - domain detection is done by using Method| context: detecting out - of - domain ( ood ) or unknown intents from user queries is essential in a task - oriented dialog system . a key challenge of ood detection is to learn discriminative semantic features . traditional cross - entropy loss only focuses on whether a sample is correctly classified , and does not explicitly distinguish the margins between categories .", "entity": "out - of - domain detection", "output": "discriminative representations", "neg_sample": ["out - of - domain detection is done by using Method", "detecting out - of - domain ( ood ) or unknown intents from user queries is essential in a task - oriented dialog system .", "a key challenge of ood detection is to learn discriminative semantic features .", "traditional cross - entropy loss only focuses on whether a sample is correctly classified , and does not explicitly distinguish the margins between categories ."], "relation": "used for", "id": "2021.acl-short.110", "year": 2021, "rel_sent": "Modeling Discriminative Representations for Out - of - Domain Detection with Supervised Contrastive Learning.", "forward": false, "src_ids": "2021.acl-short.110_2504"}
{"input": "logical operation is used for OtherScientificTerm| context: an interpretable system for open - domain reasoning needs to express its reasoning process in a transparent form . natural language is an attractive representation for this purpose - it is both highly expressive and easy for humans to understand . however , manipulating natural language statements in logically consistent ways is hard : models must cope with variation in how meaning is expressed while remaining precise .", "entity": "logical operation", "output": "premise statements", "neg_sample": ["logical operation is used for OtherScientificTerm", "an interpretable system for open - domain reasoning needs to express its reasoning process in a transparent form .", "natural language is an attractive representation for this purpose - it is both highly expressive and easy for humans to understand .", "however , manipulating natural language statements in logically consistent ways is hard : models must cope with variation in how meaning is expressed while remaining precise ."], "relation": "used for", "id": "2021.emnlp-main.506", "year": 2021, "rel_sent": "We train BART - based models ( Lewis et al . , 2020 ) to generate the result of applying a particular logical operation to one or more premise statements .", "forward": true, "src_ids": "2021.emnlp-main.506_14800"}
{"input": "guessing task is used for Material| context: in the visual dialog task guesswhat ? ! two players maintain a dialog in order to identify a secret object in an image . this raises a question : what 's the risk of having an imperfect oracle model ? .", "entity": "guessing task", "output": "human dialog", "neg_sample": ["guessing task is used for Material", "in the visual dialog task guesswhat ? !", "two players maintain a dialog in order to identify a secret object in an image .", "this raises a question : what 's the risk of having an imperfect oracle model ?", "."], "relation": "used for", "id": "2021.reinact-1.2", "year": 2021, "rel_sent": "We show that having access to better quality answers has a direct impact on the guessing task for human dialog and argue that better answers could help train better question generation models .", "forward": true, "src_ids": "2021.reinact-1.2_10994"}
{"input": "writing mode is used for OtherScientificTerm| context: the success of authorship attribution relies on the presence of linguistic features specific to individual authors . there is , however , limited research assessing to what extent authorial style remains constant when individuals switch from one writing modality to another .", "entity": "writing mode", "output": "writing style", "neg_sample": ["writing mode is used for OtherScientificTerm", "the success of authorship attribution relies on the presence of linguistic features specific to individual authors .", "there is , however , limited research assessing to what extent authorial style remains constant when individuals switch from one writing modality to another ."], "relation": "used for", "id": "2021.eacl-main.97", "year": 2021, "rel_sent": "We measure the effect of writing mode on writing style in the context of authorship attribution research using a corpus of documents composed online ( in a web browser ) and documents composed offline using a traditional word processor .", "forward": true, "src_ids": "2021.eacl-main.97_12499"}
{"input": "word representations is done by using OtherScientificTerm| context: we describe a new addition to the webvectors toolkit which is used to serve word embedding models over the web . the new elmoviz module adds support for contextualized embedding architectures , in particular for elmo models .", "entity": "word representations", "output": "lexical hyperlinks", "neg_sample": ["word representations is done by using OtherScientificTerm", "we describe a new addition to the webvectors toolkit which is used to serve word embedding models over the web .", "the new elmoviz module adds support for contextualized embedding architectures , in particular for elmo models ."], "relation": "used for", "id": "2021.eacl-demos.18", "year": 2021, "rel_sent": "The module is well integrated into the rest of the WebVectors toolkit , providing lexical hyperlinks to word representations in static embedding models .", "forward": false, "src_ids": "2021.eacl-demos.18_15726"}
{"input": "low - resource nlu is done by using Method| context: recent advances in transfer learning have improved the performance of virtual assistants considerably . nevertheless , creating sophisticated voice - enabled applications for new domains remains a challenge , and meager training data is often a key bottleneck . accordingly , unsupervised learning and ssl ( semi - supervised learning ) techniques continue to be of vital importance .", "entity": "low - resource nlu", "output": "weakly supervised ml techniques", "neg_sample": ["low - resource nlu is done by using Method", "recent advances in transfer learning have improved the performance of virtual assistants considerably .", "nevertheless , creating sophisticated voice - enabled applications for new domains remains a challenge , and meager training data is often a key bottleneck .", "accordingly , unsupervised learning and ssl ( semi - supervised learning ) techniques continue to be of vital importance ."], "relation": "used for", "id": "2021.naacl-industry.36", "year": 2021, "rel_sent": "Combining Weakly Supervised ML Techniques for Low - Resource NLU.", "forward": false, "src_ids": "2021.naacl-industry.36_874"}
{"input": "fsr-2020 is used for Task| context: apart from the issues on tonal and dialectical variations of the taiwanese language , speech artificially contaminated with different types of real - world noise also has to be dealt with in the final test stage ; all of these make fsr-2020 much more challenging than before .", "entity": "fsr-2020", "output": "taiwanese speech recognition", "neg_sample": ["fsr-2020 is used for Task", "apart from the issues on tonal and dialectical variations of the taiwanese language , speech artificially contaminated with different types of real - world noise also has to be dealt with in the final test stage ; all of these make fsr-2020 much more challenging than before ."], "relation": "used for", "id": "2021.ijclclp-1.1", "year": 2021, "rel_sent": "FSR-2020 aims at fostering the development of Taiwanese speech recognition .", "forward": true, "src_ids": "2021.ijclclp-1.1_4803"}
{"input": "contrast set is used for Material| context: neural module networks ( nmn ) are a popular approach for grounding visual referring expressions . prior implementations of nmn use pre - defined and fixed textual inputs in their module instantiation . this necessitates a large number of modules as they lack the ability to share weights and exploit associations between similar textual contexts ( e.g. ' dark cube on the left ' vs. ' black cube on the left ' ) .", "entity": "contrast set", "output": "clevr - ref+", "neg_sample": ["contrast set is used for Material", "neural module networks ( nmn ) are a popular approach for grounding visual referring expressions .", "prior implementations of nmn use pre - defined and fixed textual inputs in their module instantiation .", "this necessitates a large number of modules as they lack the ability to share weights and exploit associations between similar textual contexts ( e.g. '", "dark cube on the left ' vs. ' black cube on the left ' ) ."], "relation": "used for", "id": "2021.emnlp-main.516", "year": 2021, "rel_sent": "We further evaluate the impact of our contextualization by constructing a contrast set for CLEVR - Ref+ , which we call CC - Ref+ .", "forward": true, "src_ids": "2021.emnlp-main.516_1678"}
{"input": "regression testing is used for Method| context: we argue that regression testing is necessary to ensure reliability in the continuous development of nlp tools , especially higher level applications like grammar checkers .", "entity": "regression testing", "output": "gramdivvun", "neg_sample": ["regression testing is used for Method", "we argue that regression testing is necessary to ensure reliability in the continuous development of nlp tools , especially higher level applications like grammar checkers ."], "relation": "used for", "id": "2021.iwclul-1.6", "year": 2021, "rel_sent": "We present a tool for regression testing for GramDivvun , the rule - based open source North Sami grammar checker .", "forward": true, "src_ids": "2021.iwclul-1.6_5876"}
{"input": "text annotation is done by using Method| context: benefiting from the fully modular architecture design , fitannotator provides a systematic solution for the annotation of a variety of natural language processing tasks , including classification , sequence tagging and semantic role annotation , regardless of the language .", "entity": "text annotation", "output": "web - based tool", "neg_sample": ["text annotation is done by using Method", "benefiting from the fully modular architecture design , fitannotator provides a systematic solution for the annotation of a variety of natural language processing tasks , including classification , sequence tagging and semantic role annotation , regardless of the language ."], "relation": "used for", "id": "2021.naacl-demos.5", "year": 2021, "rel_sent": "In this paper , we introduce FITAnnotator , a generic web - based tool for efficient text annotation .", "forward": false, "src_ids": "2021.naacl-demos.5_10450"}
{"input": "chinesebert is used for Method| context: recent pretraining models in chinese neglect two important aspects specific to the chinese language : glyph and pinyin , which carry significant syntax and semantic information for language understanding .", "entity": "chinesebert", "output": "language model pretraining", "neg_sample": ["chinesebert is used for Method", "recent pretraining models in chinese neglect two important aspects specific to the chinese language : glyph and pinyin , which carry significant syntax and semantic information for language understanding ."], "relation": "used for", "id": "2021.acl-long.161", "year": 2021, "rel_sent": "In this work , we propose ChineseBERT , which incorporates both the glyph and pinyin information of Chinese characters into language model pretraining .", "forward": true, "src_ids": "2021.acl-long.161_10438"}
{"input": "scalable method is used for Task| context: several cluster - based methods for semantic change detection with contextual embeddings emerged recently . they allow a fine - grained analysis of word use change by aggregating embeddings into clusters that reflect the different usages of the word . however , these methods are unscalable in terms of memory consumption and computation time . therefore , they require a limited set of target words to be picked in advance . this drastically limits the usability of these methods in open exploratory tasks , where each word from the vocabulary can be considered as a potential target .", "entity": "scalable method", "output": "word usage - change detection", "neg_sample": ["scalable method is used for Task", "several cluster - based methods for semantic change detection with contextual embeddings emerged recently .", "they allow a fine - grained analysis of word use change by aggregating embeddings into clusters that reflect the different usages of the word .", "however , these methods are unscalable in terms of memory consumption and computation time .", "therefore , they require a limited set of target words to be picked in advance .", "this drastically limits the usability of these methods in open exploratory tasks , where each word from the vocabulary can be considered as a potential target ."], "relation": "used for", "id": "2021.naacl-main.369", "year": 2021, "rel_sent": "We propose a novel scalable method for word usage - change detection that offers large gains in processing time and significant memory savings while offering the same interpretability and better performance than unscalable methods .", "forward": true, "src_ids": "2021.naacl-main.369_777"}
{"input": "french is done by using Method| context: grammatical gender may be determined by semantics , orthography , phonology , or could even be arbitrary . identifying patterns in the factors that govern noun genders can be useful for language learners , and for understanding innate linguistic sources of gender bias . traditional manual rule - based approaches may be substituted by more accurate and scalable but harder - to - interpret computational approaches for predicting gender from typological information .", "entity": "french", "output": "interpretable gender classification models", "neg_sample": ["french is done by using Method", "grammatical gender may be determined by semantics , orthography , phonology , or could even be arbitrary .", "identifying patterns in the factors that govern noun genders can be useful for language learners , and for understanding innate linguistic sources of gender bias .", "traditional manual rule - based approaches may be substituted by more accurate and scalable but harder - to - interpret computational approaches for predicting gender from typological information ."], "relation": "used for", "id": "2021.sigtyp-1.9", "year": 2021, "rel_sent": "In this work , we propose interpretable gender classification models for French , which obtain the best of both worlds .", "forward": false, "src_ids": "2021.sigtyp-1.9_14926"}
{"input": "dialogue summarization tasks is done by using Method| context: this paper introduces mediasum , a large - scale media interview dataset consisting of 463.6 k transcripts with abstractive summaries .", "entity": "dialogue summarization tasks", "output": "transfer learning", "neg_sample": ["dialogue summarization tasks is done by using Method", "this paper introduces mediasum , a large - scale media interview dataset consisting of 463.6 k transcripts with abstractive summaries ."], "relation": "used for", "id": "2021.naacl-main.474", "year": 2021, "rel_sent": "We also show that MediaSum can be used in transfer learning to improve a model 's performance on other dialogue summarization tasks .", "forward": false, "src_ids": "2021.naacl-main.474_9302"}
{"input": "prototypical amortization networks is used for OtherScientificTerm| context: event detection , a fundamental task of information extraction , tends to struggle when it needs to recognize novel event types with a few samples , i.e. few - shot event detection ( fsed ) . previous identify - then - classify paradigm attempts to solve this problem in the pipeline manner but ignores the trigger discrepancy between event types , thus suffering from the error propagation .", "entity": "prototypical amortization networks", "output": "transition scores", "neg_sample": ["prototypical amortization networks is used for OtherScientificTerm", "event detection , a fundamental task of information extraction , tends to struggle when it needs to recognize novel event types with a few samples , i.e.", "few - shot event detection ( fsed ) .", "previous identify - then - classify paradigm attempts to solve this problem in the pipeline manner but ignores the trigger discrepancy between event types , thus suffering from the error propagation ."], "relation": "used for", "id": "2021.findings-acl.3", "year": 2021, "rel_sent": "To this end , we first design the Prototypical Amortized Conditional Random Field ( PA - CRF ) to model the label dependency in the few - shot scenario , which builds prototypical amortization networks to approximate the transition scores between labels based on the label prototypes .", "forward": true, "src_ids": "2021.findings-acl.3_13839"}
{"input": "deep q - learning network is used for Task| context: computing precise evidences , namely minimal sets of sentences that support or refute a given claim , rather than larger evidences is crucial in fact verification ( fv ) , since larger evidences may contain conflicting pieces some of which support the claim while the other refute , thereby misleading fv . despite being important , precise evidences are rarely studied by existing methods for fv . it is challenging tofind precise evidences due to a large search space with lots of local optimums .", "entity": "deep q - learning network", "output": "claim verification", "neg_sample": ["deep q - learning network is used for Task", "computing precise evidences , namely minimal sets of sentences that support or refute a given claim , rather than larger evidences is crucial in fact verification ( fv ) , since larger evidences may contain conflicting pieces some of which support the claim while the other refute , thereby misleading fv .", "despite being important , precise evidences are rarely studied by existing methods for fv .", "it is challenging tofind precise evidences due to a large search space with lots of local optimums ."], "relation": "used for", "id": "2021.acl-long.83", "year": 2021, "rel_sent": "Experimental results confirm the effectiveness of DQN in computing precise evidences and demonstrate improvements in achieving accurate claim verification .", "forward": true, "src_ids": "2021.acl-long.83_9770"}
{"input": "sota generative language model is used for OtherScientificTerm| context: many state - of - the - art ( sota ) language models have achieved high accuracy on several multi - hop reasoning problems . however , these approaches tend to not be interpretable because they do not make the intermediate reasoning steps explicit . moreover , models trained on simpler tasks tend tofail when directly tested on more complex problems .", "entity": "sota generative language model", "output": "subgoals", "neg_sample": ["sota generative language model is used for OtherScientificTerm", "many state - of - the - art ( sota ) language models have achieved high accuracy on several multi - hop reasoning problems .", "however , these approaches tend to not be interpretable because they do not make the intermediate reasoning steps explicit .", "moreover , models trained on simpler tasks tend tofail when directly tested on more complex problems ."], "relation": "used for", "id": "2021.naacl-main.97", "year": 2021, "rel_sent": "We implement EVR by extending the classic reasoning paradigm General Problem Solver ( GPS ) with a SOTA generative language model to generate subgoals and perform inference in natural language at each reasoning step .", "forward": true, "src_ids": "2021.naacl-main.97_5172"}
{"input": "logical forms is done by using Method| context: it is often challenging to solve a complex problem from scratch , but much easier if we can access other similar problems with their solutions - a paradigm known as case - based reasoning ( cbr ) .", "entity": "logical forms", "output": "cbr - kbqa", "neg_sample": ["logical forms is done by using Method", "it is often challenging to solve a complex problem from scratch , but much easier if we can access other similar problems with their solutions - a paradigm known as case - based reasoning ( cbr ) ."], "relation": "used for", "id": "2021.emnlp-main.755", "year": 2021, "rel_sent": "Furthermore , we show that CBR - KBQA is capable of using new cases without any further training : by incorporating a few human - labeled examples in the case memory , CBR - KBQA is able to successfully generate logical forms containing unseen KB entities as well as relations .", "forward": false, "src_ids": "2021.emnlp-main.755_10940"}
{"input": "pre - trained models is used for Task| context: data sharing restrictions are common in nlp datasets .", "entity": "pre - trained models", "output": "cross - domain tasks", "neg_sample": ["pre - trained models is used for Task", "data sharing restrictions are common in nlp datasets ."], "relation": "used for", "id": "2021.semeval-1.184", "year": 2021, "rel_sent": "As a little data provided , pre - trained models are suitable to solve the cross - domain tasks .", "forward": true, "src_ids": "2021.semeval-1.184_2025"}
{"input": "general framework is used for Task| context: locating and fixing bugs is a time - consuming task . obviously , unchanged fix is not the correct fix because it is the same as the buggy code that needs to be fixed .", "entity": "general framework", "output": "bug fixing", "neg_sample": ["general framework is used for Task", "locating and fixing bugs is a time - consuming task .", "obviously , unchanged fix is not the correct fix because it is the same as the buggy code that needs to be fixed ."], "relation": "used for", "id": "2021.emnlp-main.282", "year": 2021, "rel_sent": "Based on these , we propose an intuitive yet effective general framework ( called Fix - Filter - Fix or F^3 ) for bug fixing .", "forward": true, "src_ids": "2021.emnlp-main.282_8802"}
{"input": "contextual words is done by using Method| context: syntactic information , especially dependency trees , has been widely used by existing studies to improve relation extraction with better semantic guidance for analyzing the context information associated with the given entities . however , most existing studies suffer from the noise in the dependency trees , especially when they are automatically generated , so that intensively leveraging dependency information may introduce confusions to relation classification and necessary pruning is of great importance in this task .", "entity": "contextual words", "output": "attention mechanism", "neg_sample": ["contextual words is done by using Method", "syntactic information , especially dependency trees , has been widely used by existing studies to improve relation extraction with better semantic guidance for analyzing the context information associated with the given entities .", "however , most existing studies suffer from the noise in the dependency trees , especially when they are automatically generated , so that intensively leveraging dependency information may introduce confusions to relation classification and necessary pruning is of great importance in this task ."], "relation": "used for", "id": "2021.acl-long.344", "year": 2021, "rel_sent": "In this approach , an attention mechanism upon graph convolutional networks is applied to different contextual words in the dependency tree obtained from an off - the - shelf dependency parser , to distinguish the importance of different word dependencies .", "forward": false, "src_ids": "2021.acl-long.344_1259"}
{"input": "auxiliary target language prediction task is used for OtherScientificTerm| context: multilingual neural machine translation ( nmt ) enables one model to serve all translation directions , including ones that are unseen during training , i.e. zero - shot translation . despite being theoretically attractive , current models often produce low quality translations - commonly failing to even produce outputs in the right target language . in this work , we observe that off - target translation is dominant even in strong multilingual systems , trained on massive multilingual corpora .", "entity": "auxiliary target language prediction task", "output": "decoder outputs", "neg_sample": ["auxiliary target language prediction task is used for OtherScientificTerm", "multilingual neural machine translation ( nmt ) enables one model to serve all translation directions , including ones that are unseen during training , i.e.", "zero - shot translation .", "despite being theoretically attractive , current models often produce low quality translations - commonly failing to even produce outputs in the right target language .", "in this work , we observe that off - target translation is dominant even in strong multilingual systems , trained on massive multilingual corpora ."], "relation": "used for", "id": "2021.emnlp-main.578", "year": 2021, "rel_sent": "At the representation level , we leverage an auxiliary target language prediction task to regularize decoder outputs to retain information about the target language .", "forward": true, "src_ids": "2021.emnlp-main.578_9669"}
{"input": "embedding vectors is used for OtherScientificTerm| context: we study the task of learning and evaluating chinese idiom embeddings .", "entity": "embedding vectors", "output": "idioms", "neg_sample": ["embedding vectors is used for OtherScientificTerm", "we study the task of learning and evaluating chinese idiom embeddings ."], "relation": "used for", "id": "2021.ranlp-1.155", "year": 2021, "rel_sent": "Observing that existing Chinese word embedding methods may not be suitable for learning idiom embeddings , we further present a BERT - based method that directly learns embedding vectors for individual idioms .", "forward": true, "src_ids": "2021.ranlp-1.155_14403"}
{"input": "plot - guided adversarial example construction is used for Task| context: according to conducted researches in this regard , learnable evaluation metrics have promised more accurate assessments by having higher correlations with human judgments . a critical bottleneck of obtaining a reliable learnable evaluation metric is the lack of high - quality training data for classifiers to efficiently distinguish plausible and implausible machine - generated stories . previous works relied on heuristically manipulated plausible examples to mimic possible system drawbacks such as repetition , contradiction , or irrelevant content in the text level , which can be unnatural and oversimplify the characteristics of implausible machine - generated stories .", "entity": "plot - guided adversarial example construction", "output": "open - domain story generation", "neg_sample": ["plot - guided adversarial example construction is used for Task", "according to conducted researches in this regard , learnable evaluation metrics have promised more accurate assessments by having higher correlations with human judgments .", "a critical bottleneck of obtaining a reliable learnable evaluation metric is the lack of high - quality training data for classifiers to efficiently distinguish plausible and implausible machine - generated stories .", "previous works relied on heuristically manipulated plausible examples to mimic possible system drawbacks such as repetition , contradiction , or irrelevant content in the text level , which can be unnatural and oversimplify the characteristics of implausible machine - generated stories ."], "relation": "used for", "id": "2021.naacl-main.343", "year": 2021, "rel_sent": "Plot - guided Adversarial Example Construction for Evaluating Open - domain Story Generation.", "forward": true, "src_ids": "2021.naacl-main.343_1249"}
{"input": "weak supervision signal is done by using OtherScientificTerm| context: recent work has shown that monolingual masked language models learn to represent data - driven notions of language variation which can be used for domain - targeted training data selection . dataset genre labels are already frequently available , yet remain largely unexplored in cross - lingual setups .", "entity": "weak supervision signal", "output": "genre metadata", "neg_sample": ["weak supervision signal is done by using OtherScientificTerm", "recent work has shown that monolingual masked language models learn to represent data - driven notions of language variation which can be used for domain - targeted training data selection .", "dataset genre labels are already frequently available , yet remain largely unexplored in cross - lingual setups ."], "relation": "used for", "id": "2021.emnlp-main.393", "year": 2021, "rel_sent": "We harness this genre metadata as a weak supervision signal for targeted data selection in zero - shot dependency parsing .", "forward": false, "src_ids": "2021.emnlp-main.393_7921"}
{"input": "quality estimation ( qe ) is used for Material| context: automatic image captioning has improved significantly over the last few years , but the problem is far from being solved , with state of the art models still often producing low quality captions when used in the wild .", "entity": "quality estimation ( qe )", "output": "image captions", "neg_sample": ["quality estimation ( qe ) is used for Material", "automatic image captioning has improved significantly over the last few years , but the problem is far from being solved , with state of the art models still often producing low quality captions when used in the wild ."], "relation": "used for", "id": "2021.naacl-main.253", "year": 2021, "rel_sent": "In this paper , we focus on the task of Quality Estimation ( QE ) for image captions , which attempts to model the caption quality from a human perspective and * without * access to ground - truth references , so that it can be applied at prediction time to detect low - quality captions produced on * previously unseen images * .", "forward": true, "src_ids": "2021.naacl-main.253_4904"}
{"input": "explicit guidance on how to resolve conversational dependency is used for Method| context: one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis .", "entity": "explicit guidance on how to resolve conversational dependency", "output": "qa models", "neg_sample": ["explicit guidance on how to resolve conversational dependency is used for Method", "one of the main challenges in conversational question answering ( cqa ) is to resolve the conversational dependency , such as anaphora and ellipsis ."], "relation": "used for", "id": "2021.acl-long.478", "year": 2021, "rel_sent": "ExCorD first generates self - contained questions that can be understood without the conversation history , then trains a QA model with the pairs of original and self - contained questions using a consistency - based regularizer .", "forward": true, "src_ids": "2021.acl-long.478_661"}
{"input": "systematic procedure is used for Task| context: we present a systematic procedure for interrater disagreement resolution . the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction .", "entity": "systematic procedure", "output": "annotation tasks", "neg_sample": ["systematic procedure is used for Task", "we present a systematic procedure for interrater disagreement resolution .", "the procedure is general , but of particular use in multiple - annotator tasks geared towards ground truth construction ."], "relation": "used for", "id": "2021.humeval-1.15", "year": 2021, "rel_sent": "Interrater Disagreement Resolution : A Systematic Procedure to Reach Consensus in Annotation Tasks.", "forward": true, "src_ids": "2021.humeval-1.15_8900"}
{"input": "wait - k - stride - n strategy is used for Task| context: end - to - end simultaneous speech translation ( sst ) , which directly translates speech in one language into text in another language in realtime , is useful in many scenarios but has not been fully investigated .", "entity": "wait - k - stride - n strategy", "output": "local reranking", "neg_sample": ["wait - k - stride - n strategy is used for Task", "end - to - end simultaneous speech translation ( sst ) , which directly translates speech in one language into text in another language in realtime , is useful in many scenarios but has not been fully investigated ."], "relation": "used for", "id": "2021.findings-acl.218", "year": 2021, "rel_sent": "Besides , to improve the model performance in simultaneous scenarios , we propose a blank penalty to enhance the shrinking quality and a Wait - K - Stride - N strategy to allow local reranking during decoding .", "forward": true, "src_ids": "2021.findings-acl.218_8823"}
{"input": "speech utterance is done by using Method| context: in recent years , speech synthesis system can generate speech with high speech quality . however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker .", "entity": "speech utterance", "output": "tts system", "neg_sample": ["speech utterance is done by using Method", "in recent years , speech synthesis system can generate speech with high speech quality .", "however , multi - speaker text - to - speech ( tts ) system still require large amount of speech data for each target speaker ."], "relation": "used for", "id": "2021.ijclclp-2.4", "year": 2021, "rel_sent": "The constructed TTS system can generate a speech utterance of the target speaker in fewer than 2 seconds .", "forward": false, "src_ids": "2021.ijclclp-2.4_7084"}
{"input": "visual representations is used for OtherScientificTerm| context: information visualization is critical to analytical reasoning and knowledge discovery .", "entity": "visual representations", "output": "features", "neg_sample": ["visual representations is used for OtherScientificTerm", "information visualization is critical to analytical reasoning and knowledge discovery ."], "relation": "used for", "id": "2021.dash-1.3", "year": 2021, "rel_sent": "The novel visual representations introduced here encode the features delivered by modern text mining models using advanced metaphors such as hypergraphs , nested topologies and tessellated planes .", "forward": true, "src_ids": "2021.dash-1.3_7389"}
{"input": "generating diverse and faithful summaries is done by using Method| context: professional summaries are written with document - level information , such as the theme of the document , in mind . this is in contrast with most seq2seq decoders which simultaneously learn tofocus on salient content , while deciding what to generate , at each decoding step .", "entity": "generating diverse and faithful summaries", "output": "focus sampling method", "neg_sample": ["generating diverse and faithful summaries is done by using Method", "professional summaries are written with document - level information , such as the theme of the document , in mind .", "this is in contrast with most seq2seq decoders which simultaneously learn tofocus on salient content , while deciding what to generate , at each decoding step ."], "relation": "used for", "id": "2021.acl-long.474", "year": 2021, "rel_sent": "Further , we propose a Focus Sampling method to enable generation of diverse summaries , an area currently understudied in summarization .", "forward": false, "src_ids": "2021.acl-long.474_8293"}
{"input": "component statistics is used for Method| context: sentence - level text simplification is currently evaluated using both automated metrics and human evaluation . for automatic evaluation , a combination of metrics is usually employed to evaluate different aspects of the simplification . flesch - kincaid grade level ( fkgl ) is one metric that has been regularly used to measure the readability of system output .", "entity": "component statistics", "output": "posthoc analysis", "neg_sample": ["component statistics is used for Method", "sentence - level text simplification is currently evaluated using both automated metrics and human evaluation .", "for automatic evaluation , a combination of metrics is usually employed to evaluate different aspects of the simplification .", "flesch - kincaid grade level ( fkgl ) is one metric that has been regularly used to measure the readability of system output ."], "relation": "used for", "id": "2021.gem-1.1", "year": 2021, "rel_sent": "Instead of using FKGL , we suggest that the component statistics , along with others , be used for posthoc analysis to understand system behavior .", "forward": true, "src_ids": "2021.gem-1.1_9373"}
{"input": "transformers is used for Task| context: recent progress in natural language processing has led to transformer architectures becoming the predominant model used for natural language tasks . however , in many real- world datasets , additional modalities are included which the transformer does not directly leverage .", "entity": "transformers", "output": "downstream applications", "neg_sample": ["transformers is used for Task", "recent progress in natural language processing has led to transformer architectures becoming the predominant model used for natural language tasks .", "however , in many real- world datasets , additional modalities are included which the transformer does not directly leverage ."], "relation": "used for", "id": "2021.maiworkshop-1.10", "year": 2021, "rel_sent": "We present Multimodal- Toolkit , an open - source Python package to incorporate text and tabular ( categorical and numerical ) data with Transformers for downstream applications .", "forward": true, "src_ids": "2021.maiworkshop-1.10_13421"}
{"input": "grammar rules is done by using Method| context: automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality . conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation . however , they tend to ignore the code structure information and suffer from grammar and syntax errors .", "entity": "grammar rules", "output": "grammatically restricted inference method", "neg_sample": ["grammar rules is done by using Method", "automated program repair ( apr ) aims tofind an automatic solution to program language bugs without human intervention , and it can potentially reduce debugging costs and improve software quality .", "conventional approaches adopt learning - based methods such as sequence - to - sequence models for the patches generation .", "however , they tend to ignore the code structure information and suffer from grammar and syntax errors ."], "relation": "used for", "id": "2021.findings-acl.111", "year": 2021, "rel_sent": "Besides , to guarantee grammar correctness , we employ a grammatically restricted inference method to generate each grammar rule in a legally constrained sub - search - space considering the generated previous rules .", "forward": false, "src_ids": "2021.findings-acl.111_9102"}
{"input": "spatial dependency parsing problem is used for Task| context: information extraction ( ie ) for semi - structured document images is often approached as a sequence tagging problem by classifying each recognized input token into one of the iob ( inside , outside , and beginning ) categories . however , such problem setup has two inherent limitations that ( 1 ) it can not easily handle complex spatial relationships and ( 2 ) it is not suitable for highly structured information , which are nevertheless frequently observed in real - world document images .", "entity": "spatial dependency parsing problem", "output": "semi - structured document information extraction", "neg_sample": ["spatial dependency parsing problem is used for Task", "information extraction ( ie ) for semi - structured document images is often approached as a sequence tagging problem by classifying each recognized input token into one of the iob ( inside , outside , and beginning ) categories .", "however , such problem setup has two inherent limitations that ( 1 ) it can not easily handle complex spatial relationships and ( 2 ) it is not suitable for highly structured information , which are nevertheless frequently observed in real - world document images ."], "relation": "used for", "id": "2021.findings-acl.28", "year": 2021, "rel_sent": "Spatial Dependency Parsing for Semi - Structured Document Information Extraction.", "forward": true, "src_ids": "2021.findings-acl.28_5151"}
{"input": "methodological approach is used for Method| context: in the next decade , we will see a considerable need for nlp models for situated settings where diversity of situations and also different modalities including eye - movements should be taken into account in order to grasp the intention of the user . however , language comprehension in situated settings can not be handled in isolation , where different multimodal cues are inherently present and essential parts of the situations .", "entity": "methodological approach", "output": "situation - specific feature adaptation", "neg_sample": ["methodological approach is used for Method", "in the next decade , we will see a considerable need for nlp models for situated settings where diversity of situations and also different modalities including eye - movements should be taken into account in order to grasp the intention of the user .", "however , language comprehension in situated settings can not be handled in isolation , where different multimodal cues are inherently present and essential parts of the situations ."], "relation": "used for", "id": "2021.hcinlp-1.13", "year": 2021, "rel_sent": "We summarize the challenges of intention extraction and propose a methodological approach to investigate a situation - specific feature adaptation to improve crossmodal mapping and meaning recovery from noisy communication settings .", "forward": true, "src_ids": "2021.hcinlp-1.13_3544"}
{"input": "metaphor processing and understanding is done by using OtherScientificTerm| context: metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging . as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor . however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor .", "entity": "metaphor processing and understanding", "output": "multimodal cues", "neg_sample": ["metaphor processing and understanding is done by using OtherScientificTerm", "metaphor involves not only a linguistic phenomenon , but also a cognitive phenomenon structuring human thought , which makes understanding it challenging .", "as a means of cognition , metaphor is rendered by more than texts alone , and multimodal information in which vision / audio content is integrated with the text can play an important role in expressing and understanding metaphor .", "however , previous metaphor processing and understanding has focused on texts , partly due to the unavailability of large - scale datasets with ground truth labels of multimodal metaphor ."], "relation": "used for", "id": "2021.acl-long.249", "year": 2021, "rel_sent": "Moreover , we propose a range of strong baselines and show the importance of combining multimodal cues for metaphor understanding .", "forward": false, "src_ids": "2021.acl-long.249_3523"}
{"input": "compressing pretrained transformers is done by using Method| context: we generalize deep self - attention distillation in minilm ( wang et al . , 2020 ) by only using self - attention relation distillation for taskagnostic compression of pretrained transformers . moreover , the fine - grained self - attention relations tend tofully exploit the interaction knowledge learned by transformer .", "entity": "compressing pretrained transformers", "output": "minilmv2", "neg_sample": ["compressing pretrained transformers is done by using Method", "we generalize deep self - attention distillation in minilm ( wang et al .", ", 2020 ) by only using self - attention relation distillation for taskagnostic compression of pretrained transformers .", "moreover , the fine - grained self - attention relations tend tofully exploit the interaction knowledge learned by transformer ."], "relation": "used for", "id": "2021.findings-acl.188", "year": 2021, "rel_sent": "MiniLMv2 : Multi - Head Self - Attention Relation Distillation for Compressing Pretrained Transformers.", "forward": false, "src_ids": "2021.findings-acl.188_6734"}
{"input": "few - shot settings is done by using Method| context: the major paradigm of applying a pre - trained language model to downstream tasks is tofinetune it on labeled task data , which often suffers instability and low performance when the labeled examples are scarce .", "entity": "few - shot settings", "output": "post - training methods", "neg_sample": ["few - shot settings is done by using Method", "the major paradigm of applying a pre - trained language model to downstream tasks is tofinetune it on labeled task data , which often suffers instability and low performance when the labeled examples are scarce ."], "relation": "used for", "id": "2021.findings-acl.151", "year": 2021, "rel_sent": "Empirical results show that CMLM surpasses several recent post - training methods in few - shot settings without the need for data augmentation .", "forward": false, "src_ids": "2021.findings-acl.151_10116"}
{"input": "multilingual models is used for Task| context: pretrained language models like bert have advanced the state of the art for many nlp tasks . for resource - rich languages , one has the choice between a number of language - specific models , while multilingual models are also worth considering . these models are well known for their crosslingual performance , but have also shown competitive in - language performance on some tasks . we consider monolingual and multilingual models from the perspective of historical texts , and in particular for texts enriched with editorial notes : how do language models deal with the historical and editorial content in these texts ?", "entity": "multilingual models", "output": "semantical tasks", "neg_sample": ["multilingual models is used for Task", "pretrained language models like bert have advanced the state of the art for many nlp tasks .", "for resource - rich languages , one has the choice between a number of language - specific models , while multilingual models are also worth considering .", "these models are well known for their crosslingual performance , but have also shown competitive in - language performance on some tasks .", "we consider monolingual and multilingual models from the perspective of historical texts , and in particular for texts enriched with editorial notes : how do language models deal with the historical and editorial content in these texts ?"], "relation": "used for", "id": "2021.latechclfl-1.3", "year": 2021, "rel_sent": "We alsofind that multilingual models outperform monolingual models on our data , but that this superiority is linked to the task at hand : multilingual models lose their advantage when confronted with more semantical tasks .", "forward": true, "src_ids": "2021.latechclfl-1.3_2320"}
{"input": "hidden vectors is done by using OtherScientificTerm| context: a health outcome is a measurement or an observation used to capture and assess the effect of a treatment . automatic detection of health outcomes from text would undoubtedly speed up access to evidence necessary in healthcare decision making . prior work on outcome detection has modelled this task as either ( a ) a sequence labelling task , where the goal is to detect which text spans describe health outcomes , or ( b ) a classification task , where the goal is to classify a text into a predefined set of categories depending on an outcome that is mentioned somewhere in that text . however , this decoupling of span detection and classification is problematic from a modelling perspective and ignores global structural correspondences between sentence - level and word - level information present in a given text .", "entity": "hidden vectors", "output": "contextual information", "neg_sample": ["hidden vectors is done by using OtherScientificTerm", "a health outcome is a measurement or an observation used to capture and assess the effect of a treatment .", "automatic detection of health outcomes from text would undoubtedly speed up access to evidence necessary in healthcare decision making .", "prior work on outcome detection has modelled this task as either ( a ) a sequence labelling task , where the goal is to detect which text spans describe health outcomes , or ( b ) a classification task , where the goal is to classify a text into a predefined set of categories depending on an outcome that is mentioned somewhere in that text .", "however , this decoupling of span detection and classification is problematic from a modelling perspective and ignores global structural correspondences between sentence - level and word - level information present in a given text ."], "relation": "used for", "id": "2021.emnlp-main.686", "year": 2021, "rel_sent": "In addition to injecting contextual information to hidden vectors , we use label attention to appropriately weight both word and sentence level information .", "forward": false, "src_ids": "2021.emnlp-main.686_14869"}
{"input": "semantic relevance is done by using OtherScientificTerm| context: keyword or keyphrase extraction is to identify words or phrases presenting the main topics of a document .", "entity": "semantic relevance", "output": "cross - attention", "neg_sample": ["semantic relevance is done by using OtherScientificTerm", "keyword or keyphrase extraction is to identify words or phrases presenting the main topics of a document ."], "relation": "used for", "id": "2021.emnlp-main.146", "year": 2021, "rel_sent": "The cross - attention is calculated to identify the semantic relevance between a candidate and sentences within a document .", "forward": false, "src_ids": "2021.emnlp-main.146_2418"}
{"input": "semantic similarity - based features is used for Task| context: effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage . verbal fluency tasks are among the most often applied tests for early dementia detection due to their efficiency and ease of use . in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) . even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf .", "entity": "semantic similarity - based features", "output": "automatic mci", "neg_sample": ["semantic similarity - based features is used for Task", "effective management of dementia hinges on timely detection and precise diagnosis of the underlying cause of the syndrome at an early mild cognitive impairment ( mci ) stage .", "verbal fluency tasks are among the most often applied tests for early dementia detection due to their efficiency and ease of use .", "in these tasks , participants are asked to produce as many words as possible belonging to either a semantic category ( svf task ) or a phonemic category ( pvf task ) .", "even though both svf and pvf share neurocognitive function profiles , the pvf is typically believed to be less sensitive to measure mci - related cognitive impairment and recent research on fine - grained automatic evaluation of vf tasks has mainly focused on the svf ."], "relation": "used for", "id": "2021.clpsych-1.4", "year": 2021, "rel_sent": "We further show that semantic similarity - based features improve automatic MCI versus HC classification by 29 % over previous approaches for the PVF .", "forward": true, "src_ids": "2021.clpsych-1.4_2465"}
{"input": "textual- and social context - based model is used for Task| context: as the world continues tofight the covid-19 pandemic , it is simultaneously fighting an ' infodemic ' - a flood of disinformation and spread of conspiracy theories leading to health threats and the division of society . to combat this infodemic , there is an urgent need for benchmark datasets that can help researchers develop and evaluate models geared towards automatic detection of disinformation . while there are increasing efforts to create adequate , open - source benchmark datasets for english , comparable resources are virtually unavailable for german , leaving research for the german language lagging significantly behind .", "entity": "textual- and social context - based model", "output": "fake news detection", "neg_sample": ["textual- and social context - based model is used for Task", "as the world continues tofight the covid-19 pandemic , it is simultaneously fighting an ' infodemic ' - a flood of disinformation and spread of conspiracy theories leading to health threats and the division of society .", "to combat this infodemic , there is an urgent need for benchmark datasets that can help researchers develop and evaluate models geared towards automatic detection of disinformation .", "while there are increasing efforts to create adequate , open - source benchmark datasets for english , comparable resources are virtually unavailable for german , leaving research for the german language lagging significantly behind ."], "relation": "used for", "id": "2021.fever-1.9", "year": 2021, "rel_sent": "Furthermore , we propose an explainable textual- and social context - based model for fake news detection , compare its performance to ' black - box ' models and perform feature ablation to assess the relative importance of human - interpretable features in distinguishing fake news from authentic news .", "forward": true, "src_ids": "2021.fever-1.9_10360"}
{"input": "nlp is done by using Method| context: recent impressive improvements in nlp , largely based on the success of contextual neural language models , have been mostly demonstrated on at most a couple dozen high- resource languages . building language mod- els and , more generally , nlp systems for non- standardized and low - resource languages remains a challenging task .", "entity": "nlp", "output": "character - based language models", "neg_sample": ["nlp is done by using Method", "recent impressive improvements in nlp , largely based on the success of contextual neural language models , have been mostly demonstrated on at most a couple dozen high- resource languages .", "building language mod- els and , more generally , nlp systems for non- standardized and low - resource languages remains a challenging task ."], "relation": "used for", "id": "2021.wnut-1.47", "year": 2021, "rel_sent": "Confirming these results a on much larger data set of noisy French user - generated content , we argue that such character - based language models can be an asset for NLP in low - resource and high language variability set- tings .", "forward": false, "src_ids": "2021.wnut-1.47_5923"}
{"input": "knowledge representation strategies is used for Method| context: relational knowledge bases ( kbs ) are commonly used to represent world knowledge in machines . however , while advantageous for their high degree of precision and interpretability , kbs are usually organized according to manually - defined schemas , which limit their expressiveness and require significant human efforts to engineer and maintain .", "entity": "knowledge representation strategies", "output": "language models", "neg_sample": ["knowledge representation strategies is used for Method", "relational knowledge bases ( kbs ) are commonly used to represent world knowledge in machines .", "however , while advantageous for their high degree of precision and interpretability , kbs are usually organized according to manually - defined schemas , which limit their expressiveness and require significant human efforts to engineer and maintain ."], "relation": "used for", "id": "2021.emnlp-main.81", "year": 2021, "rel_sent": "We propose to organize knowledge representation strategies in LMs by the level of KB supervision provided , from no KB supervision at all to entity- and relation - level supervision .", "forward": true, "src_ids": "2021.emnlp-main.81_13407"}
{"input": "cross - lingual transfer is done by using Method| context: multilingual pretrained language models have demonstrated remarkable zero - shot cross - lingual transfer capabilities . such transfer emerges by fine - tuning on a task of interest in one language and evaluating on a distinct language , not seen during the fine - tuning . despite promising results , we still lack a proper understanding of the source of this transfer .", "entity": "cross - lingual transfer", "output": "encoder", "neg_sample": ["cross - lingual transfer is done by using Method", "multilingual pretrained language models have demonstrated remarkable zero - shot cross - lingual transfer capabilities .", "such transfer emerges by fine - tuning on a task of interest in one language and evaluating on a distinct language , not seen during the fine - tuning .", "despite promising results , we still lack a proper understanding of the source of this transfer ."], "relation": "used for", "id": "2021.eacl-main.189", "year": 2021, "rel_sent": "While the encoder is crucial for cross - lingual transfer and remains mostly unchanged during fine - tuning , the task predictor has little importance on the transfer and can be reinitialized during fine - tuning .", "forward": false, "src_ids": "2021.eacl-main.189_13182"}
{"input": "pre - trained model is done by using Method| context: recently , fine - tuning pre - trained language models ( e.g. , multilingual bert ) to downstream cross - lingual tasks has shown promising results . however , the fine - tuning process inevitably changes the parameters of the pre - trained model and weakens its cross - lingual ability , which leads to sub - optimal performance .", "entity": "pre - trained model", "output": "fine - tuning methods", "neg_sample": ["pre - trained model is done by using Method", "recently , fine - tuning pre - trained language models ( e.g.", ", multilingual bert ) to downstream cross - lingual tasks has shown promising results .", "however , the fine - tuning process inevitably changes the parameters of the pre - trained model and weakens its cross - lingual ability , which leads to sub - optimal performance ."], "relation": "used for", "id": "2021.repl4nlp-1.8", "year": 2021, "rel_sent": "The experimental result shows that our fine - tuning methods can better preserve the cross - lingual ability of the pre - trained model in a sentence retrieval task .", "forward": false, "src_ids": "2021.repl4nlp-1.8_10414"}
{"input": "semantic relationship is done by using Method| context: understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing . although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages .", "entity": "semantic relationship", "output": "hierarchical transformer - based framework", "neg_sample": ["semantic relationship is done by using Method", "understanding linguistics and morphology of resource - scarce code - mixed texts remains a key challenge in text processing .", "although word embedding comes in handy to support downstream tasks for low - resource languages , there are plenty of scopes in improving the quality of language representation particularly for code - mixed languages ."], "relation": "used for", "id": "2021.findings-acl.407", "year": 2021, "rel_sent": "HIT is a hierarchical transformer - based framework that captures the semantic relationship among words and hierarchically learns the sentencelevel semantics using a fused attention mechanism .", "forward": false, "src_ids": "2021.findings-acl.407_7797"}
{"input": "food - disease relation extraction is done by using Method| context: the accelerating growth of big data in the biomedical domain , with an endless amount of electronic health records and more than 30 million citations and abstracts in pubmed , introduces the need for automatic structuring of textual biomedical data .", "entity": "food - disease relation extraction", "output": "transfer learning", "neg_sample": ["food - disease relation extraction is done by using Method", "the accelerating growth of big data in the biomedical domain , with an endless amount of electronic health records and more than 30 million citations and abstracts in pubmed , introduces the need for automatic structuring of textual biomedical data ."], "relation": "used for", "id": "2021.bionlp-1.4", "year": 2021, "rel_sent": "SAFFRON : tranSfer leArning For Food - disease RelatiOn extractioN.", "forward": false, "src_ids": "2021.bionlp-1.4_9481"}
{"input": "faithful classifiers is done by using Method| context: pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks . an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models .", "entity": "faithful classifiers", "output": "saloss models", "neg_sample": ["faithful classifiers is done by using Method", "pretrained transformer - based models such as bert have demonstrated state - of - the - art predictive performance when adapted into a range of natural language processing tasks .", "an open problem is how to improve the faithfulness of explanations ( rationales ) for the predictions of these models ."], "relation": "used for", "id": "2021.emnlp-main.645", "year": 2021, "rel_sent": "Using the rationales extracted from vanilla BERT and SaLoss models to train inherently faithful classifiers , we further show that the latter result in higher predictive performance in downstream tasks .", "forward": false, "src_ids": "2021.emnlp-main.645_8896"}
{"input": "class - imbalanced discourse classification is done by using Method| context: as labeling schemas evolve over time , small differences can render datasets following older schemas unusable . this prevents researchers from building on top of previous annotation work and results in the existence , in discourse learning in particular , of many small class - imbalanced datasets .", "entity": "class - imbalanced discourse classification", "output": "multitask semi - supervised learning", "neg_sample": ["class - imbalanced discourse classification is done by using Method", "as labeling schemas evolve over time , small differences can render datasets following older schemas unusable .", "this prevents researchers from building on top of previous annotation work and results in the existence , in discourse learning in particular , of many small class - imbalanced datasets ."], "relation": "used for", "id": "2021.emnlp-main.40", "year": 2021, "rel_sent": "Multitask Semi - Supervised Learning for Class - Imbalanced Discourse Classification.", "forward": false, "src_ids": "2021.emnlp-main.40_12921"}
{"input": "aspect term extraction is used for OtherScientificTerm| context: one of the remaining challenges for aspect term extraction resides in the lack of sufficient annotated data . while self - training is potentially an effective method to address this issue , the pseudo - labels it yields on unlabeled data could induce noise .", "entity": "aspect term extraction", "output": "aspect terms", "neg_sample": ["aspect term extraction is used for OtherScientificTerm", "one of the remaining challenges for aspect term extraction resides in the lack of sufficient annotated data .", "while self - training is potentially an effective method to address this issue , the pseudo - labels it yields on unlabeled data could induce noise ."], "relation": "used for", "id": "2021.emnlp-main.23", "year": 2021, "rel_sent": "Aspect term extraction aims to extract aspect terms from a review sentence that users have expressed opinions on .", "forward": true, "src_ids": "2021.emnlp-main.23_10154"}
{"input": "conversation representation is done by using Method| context: automatically extracting interpersonal relationships of conversation interlocutors can enrich personal knowledge bases to enhance personalized search , recommenders and chatbots .", "entity": "conversation representation", "output": "pride", "neg_sample": ["conversation representation is done by using Method", "automatically extracting interpersonal relationships of conversation interlocutors can enrich personal knowledge bases to enhance personalized search , recommenders and chatbots ."], "relation": "used for", "id": "2021.emnlp-main.380", "year": 2021, "rel_sent": "To infer speakers ' relationships from dialogues we propose PRIDE , a neural multi - label classifier , based on BERT and Transformer for creating a conversation representation .", "forward": false, "src_ids": "2021.emnlp-main.380_6689"}
{"input": "dirichlet neighborhood ensemble is used for Method| context: although deep neural networks have achieved prominent performance on many nlp tasks , they are vulnerable to adversarial examples .", "entity": "dirichlet neighborhood ensemble", "output": "large models", "neg_sample": ["dirichlet neighborhood ensemble is used for Method", "although deep neural networks have achieved prominent performance on many nlp tasks , they are vulnerable to adversarial examples ."], "relation": "used for", "id": "2021.acl-long.426", "year": 2021, "rel_sent": "DNE is agnostic to the network architectures and scales to large models ( e.g. , BERT ) for NLP applications .", "forward": true, "src_ids": "2021.acl-long.426_13418"}
{"input": "text - code matching is done by using Method| context: finding codes given natural language query is beneficial to the productivity of software developers . future progress towards better semantic matching between query and code requires richer supervised training resources .", "entity": "text - code matching", "output": "contrastive learning method", "neg_sample": ["text - code matching is done by using Method", "finding codes given natural language query is beneficial to the productivity of software developers .", "future progress towards better semantic matching between query and code requires richer supervised training resources ."], "relation": "used for", "id": "2021.acl-long.442", "year": 2021, "rel_sent": "We further introduce a contrastive learning method dubbed CoCLR to enhance text - code matching , which works as a data augmenter to bring more artificially generated training instances .", "forward": false, "src_ids": "2021.acl-long.442_11513"}
{"input": "wordlevel data scarcity is done by using Method| context: quality estimation ( qe ) for machine translation has been shown to reach relatively high accuracy in predicting sentence - level scores , relying on pretrained contextual embeddings and human - produced quality scores . however , the lack of explanations along with decisions made by end - to - end neural models makes the results difficult to interpret . furthermore , word - level annotated datasets are rare due to the prohibitive effort required to perform this task , while they could provide interpretable signals in addition to sentence - level qe outputs .", "entity": "wordlevel data scarcity", "output": "qe architecture", "neg_sample": ["wordlevel data scarcity is done by using Method", "quality estimation ( qe ) for machine translation has been shown to reach relatively high accuracy in predicting sentence - level scores , relying on pretrained contextual embeddings and human - produced quality scores .", "however , the lack of explanations along with decisions made by end - to - end neural models makes the results difficult to interpret .", "furthermore , word - level annotated datasets are rare due to the prohibitive effort required to perform this task , while they could provide interpretable signals in addition to sentence - level qe outputs ."], "relation": "used for", "id": "2021.eval4nlp-1.15", "year": 2021, "rel_sent": "In this paper , we propose a novel QE architecture which tackles both the wordlevel data scarcity and the interpretability limitations of recent approaches .", "forward": false, "src_ids": "2021.eval4nlp-1.15_11281"}
{"input": "language gans is used for Task| context: one approach is to produce the diversity of texts conditioned by the sampled latent code . although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained .", "entity": "language gans", "output": "generating diverse texts", "neg_sample": ["language gans is used for Task", "one approach is to produce the diversity of texts conditioned by the sampled latent code .", "although several generative adversarial networks ( gans ) have been proposed thus far , these models still suffer from mode - collapsing if the models are not pre - trained ."], "relation": "used for", "id": "2021.eacl-srw.23", "year": 2021, "rel_sent": "Making Use of Latent Space in Language GANs for Generating Diverse Text without Pre - training.", "forward": true, "src_ids": "2021.eacl-srw.23_11806"}
{"input": "exophoric pronouns is done by using Method| context: resolving pronouns to their referents has long been studied as a fundamental natural language understanding problem . previous works on pronoun coreference resolution ( pcr ) mostly focus on resolving pronouns to mentions in text while ignoring the exophoric scenario . exophoric pronouns are common in daily communications , where speakers may directly use pronouns to refer to some objects present in the environment without introducing the objects first . although such objects are not mentioned in the dialogue text , they can often be disambiguated by the general topics of the dialogue .", "entity": "exophoric pronouns", "output": "topic regularization", "neg_sample": ["exophoric pronouns is done by using Method", "resolving pronouns to their referents has long been studied as a fundamental natural language understanding problem .", "previous works on pronoun coreference resolution ( pcr ) mostly focus on resolving pronouns to mentions in text while ignoring the exophoric scenario .", "exophoric pronouns are common in daily communications , where speakers may directly use pronouns to refer to some objects present in the environment without introducing the objects first .", "although such objects are not mentioned in the dialogue text , they can often be disambiguated by the general topics of the dialogue ."], "relation": "used for", "id": "2021.emnlp-main.311", "year": 2021, "rel_sent": "Extensive experiments demonstrate the effectiveness of adding topic regularization for resolving exophoric pronouns .", "forward": false, "src_ids": "2021.emnlp-main.311_16054"}
{"input": "crowdsourced annotations is done by using Method| context: emotion classification is the task of automatically associating a text with a human emotion . state - of - the - art models are usually learned using annotated corpora or rely on hand - crafted affective lexicons .", "entity": "crowdsourced annotations", "output": "bayesian method", "neg_sample": ["crowdsourced annotations is done by using Method", "emotion classification is the task of automatically associating a text with a human emotion .", "state - of - the - art models are usually learned using annotated corpora or rely on hand - crafted affective lexicons ."], "relation": "used for", "id": "2021.ranlp-1.16", "year": 2021, "rel_sent": "We aggregate the predictions of these models using a Bayesian method originally developed for modelling crowdsourced annotations .", "forward": false, "src_ids": "2021.ranlp-1.16_15111"}
{"input": "transfer language is done by using Material| context: in this paper we explore pos tagging for the scots language . as no linguistically annotated scots data were available , we manually pos tagged a small set that is used for evaluation and training .", "entity": "transfer language", "output": "english", "neg_sample": ["transfer language is done by using Material", "in this paper we explore pos tagging for the scots language .", "as no linguistically annotated scots data were available , we manually pos tagged a small set that is used for evaluation and training ."], "relation": "used for", "id": "2021.vardial-1.5", "year": 2021, "rel_sent": "We use English as a transfer language to examine zero - shot transfer and transfer learning methods .", "forward": false, "src_ids": "2021.vardial-1.5_4794"}
{"input": "transition states is done by using Method| context: transition systems usually contain various dynamic structures ( e.g. , stacks , buffers ) . an ideal transition - based model should encode these structures completely and efficiently . previous works relying on templates or neural network structures either only encode partial structure information or suffer from computation efficiency .", "entity": "transition states", "output": "parallel - friendly attention network", "neg_sample": ["transition states is done by using Method", "transition systems usually contain various dynamic structures ( e.g.", ", stacks , buffers ) .", "an ideal transition - based model should encode these structures completely and efficiently .", "previous works relying on templates or neural network structures either only encode partial structure information or suffer from computation efficiency ."], "relation": "used for", "id": "2021.emnlp-main.339", "year": 2021, "rel_sent": "With the help of parallel - friendly attention network , we are able to encoding transition states with O(1 ) additional complexity ( with respect to basic feature extractors ) .", "forward": false, "src_ids": "2021.emnlp-main.339_1503"}
{"input": "swedish reading comprehension questions is done by using Task| context: an important part when constructing multiple - choice questions ( mcqs ) for reading comprehension assessment are the distractors , the incorrect but preferably plausible answer options .", "entity": "swedish reading comprehension questions", "output": "bert - based distractor generation", "neg_sample": ["swedish reading comprehension questions is done by using Task", "an important part when constructing multiple - choice questions ( mcqs ) for reading comprehension assessment are the distractors , the incorrect but preferably plausible answer options ."], "relation": "used for", "id": "2021.inlg-1.43", "year": 2021, "rel_sent": "BERT - based distractor generation for Swedish reading comprehension questions using a small - scale dataset.", "forward": false, "src_ids": "2021.inlg-1.43_7420"}
{"input": "language models is done by using Method| context: can we get existing language models and refine them for zero - shot commonsense reasoning ?", "entity": "language models", "output": "self - supervised learning approach", "neg_sample": ["language models is done by using Method", "can we get existing language models and refine them for zero - shot commonsense reasoning ?"], "relation": "used for", "id": "2021.emnlp-main.688", "year": 2021, "rel_sent": "To this end , we propose a novel self - supervised learning approach that refines the language model utilizing a set of linguistic perturbations of similar concept relationships .", "forward": false, "src_ids": "2021.emnlp-main.688_2031"}
{"input": "filtering - based approach is used for Generic| context: warning : this paper contains content that may be offensive or upsetting . commonsense knowledge bases ( cskb ) are increasingly used for various natural language processing tasks . since cskbs are mostly human - generated and may reflect societal biases , it is important to ensure that such biases are not conflated with the notion of commonsense .", "entity": "filtering - based approach", "output": "harms", "neg_sample": ["filtering - based approach is used for Generic", "warning : this paper contains content that may be offensive or upsetting .", "commonsense knowledge bases ( cskb ) are increasingly used for various natural language processing tasks .", "since cskbs are mostly human - generated and may reflect societal biases , it is important to ensure that such biases are not conflated with the notion of commonsense ."], "relation": "used for", "id": "2021.emnlp-main.410", "year": 2021, "rel_sent": "Finally , we propose a filtering - based approach for mitigating such harms , and observe that our filtered - based approach can reduce the issues in both resources and models but leads to a performance drop , leaving room for future work to build fairer and stronger commonsense models .", "forward": true, "src_ids": "2021.emnlp-main.410_8448"}
{"input": "lexical surprisal is used for OtherScientificTerm| context: context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort . however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context .", "entity": "lexical surprisal", "output": "local context", "neg_sample": ["lexical surprisal is used for OtherScientificTerm", "context guides comprehenders ' expectations during language processing , and informationtheoretic surprisal is commonly used as an index of cognitive processing effort .", "however , prior work using surprisal has considered only within - sentence context , using n - grams , neural language models , or syntactic structure as conditioning context ."], "relation": "used for", "id": "2021.findings-acl.332", "year": 2021, "rel_sent": "Lexical surprisal calculated from ngram and LSTM language models is used to capture effects of local context ; to capture the effects of broader context a new metric based on topic models , topical surprisal , is introduced .", "forward": true, "src_ids": "2021.findings-acl.332_5741"}
{"input": "lexical normalization task is done by using Method| context: the task of converting a nonstandard text to a standard and readable text is known as lexical normalization . almost all the natural language processing ( nlp ) applications require the text data in normalized form to build quality task - specific models . hence , lexical normalization has been proven to improve the performance of numerous natural language processing tasks on social media .", "entity": "lexical normalization task", "output": "multilingual sequence labeling approach", "neg_sample": ["lexical normalization task is done by using Method", "the task of converting a nonstandard text to a standard and readable text is known as lexical normalization .", "almost all the natural language processing ( nlp ) applications require the text data in normalized form to build quality task - specific models .", "hence , lexical normalization has been proven to improve the performance of numerous natural language processing tasks on social media ."], "relation": "used for", "id": "2021.wnut-1.51", "year": 2021, "rel_sent": "Multilingual Sequence Labeling Approach to solve Lexical Normalization.", "forward": false, "src_ids": "2021.wnut-1.51_11268"}
{"input": "dominant theoretical frameworks is used for Task| context: over the past decade , the field of natural language processing has developed a wide array of computational methods for reasoning about narrative , including summarization , commonsense inference , and event detection . while this work has brought an important empirical lens for examining narrative , it is by and large divorced from the large body of theoretical work on narrative within the humanities , social and cognitive sciences .", "entity": "dominant theoretical frameworks", "output": "nlp community", "neg_sample": ["dominant theoretical frameworks is used for Task", "over the past decade , the field of natural language processing has developed a wide array of computational methods for reasoning about narrative , including summarization , commonsense inference , and event detection .", "while this work has brought an important empirical lens for examining narrative , it is by and large divorced from the large body of theoretical work on narrative within the humanities , social and cognitive sciences ."], "relation": "used for", "id": "2021.emnlp-main.26", "year": 2021, "rel_sent": "In this position paper , we introduce the dominant theoretical frameworks to the NLP community , situate current research in NLP within distinct narratological traditions , and argue that linking computational work in NLP to theory opens up a range of new empirical questions that would both help advance our understanding of narrative and open up new practical applications .", "forward": true, "src_ids": "2021.emnlp-main.26_2453"}
{"input": "query - driven topic model is used for OtherScientificTerm| context: topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus . it has been increasingly widely adopted as a tool in the social sciences , including political science , digital humanities and sociological research in general . one desirable property of topic models is to allow users tofind topics describing a specific aspect of the corpus .", "entity": "query - driven topic model", "output": "query - related topics", "neg_sample": ["query - driven topic model is used for OtherScientificTerm", "topic modeling is an unsupervised method for revealing the hidden semantic structure of a corpus .", "it has been increasingly widely adopted as a tool in the social sciences , including political science , digital humanities and sociological research in general .", "one desirable property of topic models is to allow users tofind topics describing a specific aspect of the corpus ."], "relation": "used for", "id": "2021.findings-acl.154", "year": 2021, "rel_sent": "We propose a novel query - driven topic model that allows users to specify a simple query in words or phrases and return query - related topics , thus avoiding tedious work from domain experts .", "forward": true, "src_ids": "2021.findings-acl.154_952"}
{"input": "automated responses is used for OtherScientificTerm| context: one of the key ideas of cognitive behavioural therapy ( cbt ) is the ability to convert negative or distorted thoughts into more realistic alternatives . although modern machine learning techniques can be successfully applied to a variety of natural language processing tasks , including cognitive behavioural therapy , the lack of a publicly available dataset makes supervised training difficult for tasks such as reforming distorted thoughts .", "entity": "automated responses", "output": "cognitive distortions", "neg_sample": ["automated responses is used for OtherScientificTerm", "one of the key ideas of cognitive behavioural therapy ( cbt ) is the ability to convert negative or distorted thoughts into more realistic alternatives .", "although modern machine learning techniques can be successfully applied to a variety of natural language processing tasks , including cognitive behavioural therapy , the lack of a publicly available dataset makes supervised training difficult for tasks such as reforming distorted thoughts ."], "relation": "used for", "id": "2021.icnlsp-1.13", "year": 2021, "rel_sent": "Formulating Automated Responses to Cognitive Distortions for CBT Interactions.", "forward": true, "src_ids": "2021.icnlsp-1.13_3091"}
{"input": "scoped meaning representations is used for OtherScientificTerm| context: recently , deep neural networks ( dnns ) have achieved great success in semantically challenging nlp tasks , yet it remains unclear whether dnn models can capture compositional meanings , those aspects of meaning that have been long studied in formal semantics .", "entity": "scoped meaning representations", "output": "semantic phenomena", "neg_sample": ["scoped meaning representations is used for OtherScientificTerm", "recently , deep neural networks ( dnns ) have achieved great success in semantically challenging nlp tasks , yet it remains unclear whether dnn models can capture compositional meanings , those aspects of meaning that have been long studied in formal semantics ."], "relation": "used for", "id": "2021.findings-acl.10", "year": 2021, "rel_sent": "To investigate this issue , we propose a Systematic Generalization testbed based on Natural language Semantics ( SyGNS ) , whose challenge is to map natural language sentences to multiple forms of scoped meaning representations , designed to account for various semantic phenomena .", "forward": true, "src_ids": "2021.findings-acl.10_1517"}
{"input": "character - based models is used for Task| context: subword segmentation algorithms have been a de facto choice when building neural machine translation systems . however , most of them need to learn a segmentation model based on some heuristics , which may produce sub - optimal segmentation . this can be problematic in some scenarios when the target language has rich morphological changes or there is not enough data for learning compact composition rules . translating at fully character level has the potential to alleviate the issue , but empirical performances of character - based models has not been fully explored .", "entity": "character - based models", "output": "generating rare and unknown words", "neg_sample": ["character - based models is used for Task", "subword segmentation algorithms have been a de facto choice when building neural machine translation systems .", "however , most of them need to learn a segmentation model based on some heuristics , which may produce sub - optimal segmentation .", "this can be problematic in some scenarios when the target language has rich morphological changes or there is not enough data for learning compact composition rules .", "translating at fully character level has the potential to alleviate the issue , but empirical performances of character - based models has not been fully explored ."], "relation": "used for", "id": "2021.acl-short.69", "year": 2021, "rel_sent": "Further analyses show that compared to subword - based models , character - based models are better at handling morphological phenomena , generating rare and unknown words , and more suitable for transferring to unseen domains .", "forward": true, "src_ids": "2021.acl-short.69_9433"}
{"input": "rewriter - evaluator architecture is used for Method| context: a few approaches have been developed to improve neural machine translation ( nmt ) models with multiple passes of decoding . however , their performance gains are limited because of lacking proper policies to terminate the multi - pass process .", "entity": "rewriter - evaluator architecture", "output": "rewriter - evaluator", "neg_sample": ["rewriter - evaluator architecture is used for Method", "a few approaches have been developed to improve neural machine translation ( nmt ) models with multiple passes of decoding .", "however , their performance gains are limited because of lacking proper policies to terminate the multi - pass process ."], "relation": "used for", "id": "2021.acl-long.443", "year": 2021, "rel_sent": "To address this issue , we introduce a novel architecture of Rewriter - Evaluator .", "forward": true, "src_ids": "2021.acl-long.443_8305"}
{"input": "contrastive learning strategy is used for Method| context: multilingual text summarization requires the ability to understand documents in multiple languages and generate summaries in the corresponding language , which poses more challenges on current summarization systems . however , this problem has been rarely studied due to the lack of large - scale supervised summarization data in multiple languages .", "entity": "contrastive learning strategy", "output": "multilingual summarization system ( calms )", "neg_sample": ["contrastive learning strategy is used for Method", "multilingual text summarization requires the ability to understand documents in multiple languages and generate summaries in the corresponding language , which poses more challenges on current summarization systems .", "however , this problem has been rarely studied due to the lack of large - scale supervised summarization data in multiple languages ."], "relation": "used for", "id": "2021.findings-acl.242", "year": 2021, "rel_sent": "We use the contrastive learning strategy to train our multilingual summarization system ( CALMS ) , which consists of two training objectives , contrastive sentence ranking ( CSR ) and sentence aligned substitution ( SAS ) .", "forward": true, "src_ids": "2021.findings-acl.242_9985"}
{"input": "automatic unsupervised morphological segmentation is used for Material| context: low - resource polysynthetic languages pose many challenges in nlp tasks , such as morphological analysis and machine translation , due to available resources and tools , and the morphologically complex languages .", "entity": "automatic unsupervised morphological segmentation", "output": "inuinnaqtun", "neg_sample": ["automatic unsupervised morphological segmentation is used for Material", "low - resource polysynthetic languages pose many challenges in nlp tasks , such as morphological analysis and machine translation , due to available resources and tools , and the morphologically complex languages ."], "relation": "used for", "id": "2021.americasnlp-1.17", "year": 2021, "rel_sent": "Towards a First Automatic Unsupervised Morphological Segmentation for Inuinnaqtun.", "forward": true, "src_ids": "2021.americasnlp-1.17_8059"}
{"input": "bert models is done by using OtherScientificTerm| context: nlp has a rich history of representing our prior understanding of language in the form of graphs . recent work on analyzing contextualized text representations has focused on hand - designed probe models to understand how and to what extent do these representations encode a particular linguistic phenomenon . however , due to the inter - dependence of various phenomena and randomness of training probe models , detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem .", "entity": "bert models", "output": "probes", "neg_sample": ["bert models is done by using OtherScientificTerm", "nlp has a rich history of representing our prior understanding of language in the form of graphs .", "recent work on analyzing contextualized text representations has focused on hand - designed probe models to understand how and to what extent do these representations encode a particular linguistic phenomenon .", "however , due to the inter - dependence of various phenomena and randomness of training probe models , detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem ."], "relation": "used for", "id": "2021.acl-long.145", "year": 2021, "rel_sent": "Using these probes , we analyze the BERT models on its ability to encode a syntactic and a semantic graph structure , and find that these models encode to some degree both syntactic as well as semantic information ; albeit syntactic information to a greater extent .", "forward": false, "src_ids": "2021.acl-long.145_1333"}
{"input": "multilingual bert is done by using Task| context: since the popularization of the transformer as a general - purpose feature encoder for nlp , many studies have attempted to decode linguistic structure from its novel multi - head attention mechanism . however , much of such work focused almost exclusively on english - a language with rigid word order and a lack of inflectional morphology .", "entity": "multilingual bert", "output": "decoding", "neg_sample": ["multilingual bert is done by using Task", "since the popularization of the transformer as a general - purpose feature encoder for nlp , many studies have attempted to decode linguistic structure from its novel multi - head attention mechanism .", "however , much of such work focused almost exclusively on english - a language with rigid word order and a lack of inflectional morphology ."], "relation": "used for", "id": "2021.eacl-main.264", "year": 2021, "rel_sent": "In this study , we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns .", "forward": false, "src_ids": "2021.eacl-main.264_5082"}
{"input": "few - shot neural text generation is done by using Task| context: large - scale pretrained language models have led to dramatic improvements in few - shot text generation . nonetheless , almost all previous work simply applies random sampling to select the few - shot training instances . little to no attention has been paid to the selection strategies and how they would affect model performance .", "entity": "few - shot neural text generation", "output": "training instance selection", "neg_sample": ["few - shot neural text generation is done by using Task", "large - scale pretrained language models have led to dramatic improvements in few - shot text generation .", "nonetheless , almost all previous work simply applies random sampling to select the few - shot training instances .", "little to no attention has been paid to the selection strategies and how they would affect model performance ."], "relation": "used for", "id": "2021.inlg-1.36", "year": 2021, "rel_sent": "We propose a shared task on training instance selection for few - shot neural text generation .", "forward": false, "src_ids": "2021.inlg-1.36_9896"}
{"input": "index sizes is done by using Method| context: however , no previous work investigated how dense representations perform with large index sizes .", "entity": "index sizes", "output": "sparse - representations", "neg_sample": ["index sizes is done by using Method", "however , no previous work investigated how dense representations perform with large index sizes ."], "relation": "used for", "id": "2021.acl-short.77", "year": 2021, "rel_sent": "We show theoretically and empirically that the performance for dense representations decreases quicker than sparse representations for increasing index sizes .", "forward": false, "src_ids": "2021.acl-short.77_7614"}
{"input": "noise is used for OtherScientificTerm| context: ensuring strong theoretical privacy guarantees on text data is a challenging problem which is usually attained at the expense of utility . however , to improve the practicality of privacy preserving text analyses , it is essential to design algorithms that better optimize this tradeoff .", "entity": "noise", "output": "private vectors", "neg_sample": ["noise is used for OtherScientificTerm", "ensuring strong theoretical privacy guarantees on text data is a challenging problem which is usually attained at the expense of utility .", "however , to improve the practicality of privacy preserving text analyses , it is essential to design algorithms that better optimize this tradeoff ."], "relation": "used for", "id": "2021.trustnlp-1.3", "year": 2021, "rel_sent": "Our idea based on first randomly projecting the vectors to a lower - dimensional space and then adding noise in this projected space generates private vectors that achieve strong theoretical guarantees on its utility .", "forward": true, "src_ids": "2021.trustnlp-1.3_14076"}
{"input": "suicidality prediction is done by using Task| context: progress on nlp for mental health - indeed , for healthcare in general - is hampered by obstacles to shared , community - level access to relevant data .", "entity": "suicidality prediction", "output": "community - level research", "neg_sample": ["suicidality prediction is done by using Task", "progress on nlp for mental health - indeed , for healthcare in general - is hampered by obstacles to shared , community - level access to relevant data ."], "relation": "used for", "id": "2021.clpsych-1.7", "year": 2021, "rel_sent": "Community - level Research on Suicidality Prediction in a Secure Environment : Overview of the CLPsych 2021 Shared Task.", "forward": false, "src_ids": "2021.clpsych-1.7_8681"}
{"input": "computational argumentation tasks is done by using Method| context: approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection .", "entity": "computational argumentation tasks", "output": "syntopical graphs", "neg_sample": ["computational argumentation tasks is done by using Method", "approaches to computational argumentation tasks such as stance detection and aspect detection have largely focused on the text of independent claims , losing out on potentially valuable context provided by the rest of the collection ."], "relation": "used for", "id": "2021.acl-long.126", "year": 2021, "rel_sent": "Syntopical Graphs for Computational Argumentation Tasks.", "forward": false, "src_ids": "2021.acl-long.126_15046"}
{"input": "big corpora of diagnosis in bulgarian is used for Task| context: the task of automatic diagnosis encoding into standard medical classifications and ontologies , is of great importance in medicine - both to support the daily tasks of physicians in the preparation and reporting of clinical documentation , and for automatic processing of clinical reports .", "entity": "big corpora of diagnosis in bulgarian", "output": "classification task", "neg_sample": ["big corpora of diagnosis in bulgarian is used for Task", "the task of automatic diagnosis encoding into standard medical classifications and ontologies , is of great importance in medicine - both to support the daily tasks of physicians in the preparation and reporting of clinical documentation , and for automatic processing of clinical reports ."], "relation": "used for", "id": "2021.ranlp-1.162", "year": 2021, "rel_sent": "Big corpora of diagnosis in Bulgarian annotated with ICD-10 codes is used for the classification task .", "forward": true, "src_ids": "2021.ranlp-1.162_649"}
{"input": "hierarchical mutual information maximization is used for Task| context: these embeddings are generated from the upstream process called multimodal fusion , which aims to extract and combine the input unimodal raw data to produce a richer multimodal representation . previous work either back - propagates the task loss or manipulates the geometric property of feature spaces to produce favorable fusion results , which neglects the preservation of critical task - related information that flows from input to the fusion results .", "entity": "hierarchical mutual information maximization", "output": "multimodal sentiment analysis ( msa )", "neg_sample": ["hierarchical mutual information maximization is used for Task", "these embeddings are generated from the upstream process called multimodal fusion , which aims to extract and combine the input unimodal raw data to produce a richer multimodal representation .", "previous work either back - propagates the task loss or manipulates the geometric property of feature spaces to produce favorable fusion results , which neglects the preservation of critical task - related information that flows from input to the fusion results ."], "relation": "used for", "id": "2021.emnlp-main.723", "year": 2021, "rel_sent": "Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis.", "forward": true, "src_ids": "2021.emnlp-main.723_6443"}
{"input": "causality detection is done by using Task| context: a total of 6 teams submitted runs across the task and 4 of them contributed with a system description paper .", "entity": "causality detection", "output": "fincausal 2021 shared task", "neg_sample": ["causality detection is done by using Task", "a total of 6 teams submitted runs across the task and 4 of them contributed with a system description paper ."], "relation": "used for", "id": "2021.fnp-1.10", "year": 2021, "rel_sent": "We present the FinCausal 2021 Shared Task on Causality Detection in Financial Documents and discuss the participating systems and results .", "forward": false, "src_ids": "2021.fnp-1.10_13098"}
{"input": "semantic mapping approach is used for Task| context: neural relation extraction models have shown promising results in recent years ; however , the model performance drops dramatically given only a few training samples . recent works try leveraging the advance in few - shot learning to solve the low resource problem , where they train label - agnostic models to directly compare the semantic similarities among context sentences in the embedding space . however , the label - aware information , i.e. , the relation label that contains the semantic knowledge of the relation itself , is often neglected for prediction .", "entity": "semantic mapping approach", "output": "low - resource relation extraction tasks", "neg_sample": ["semantic mapping approach is used for Task", "neural relation extraction models have shown promising results in recent years ; however , the model performance drops dramatically given only a few training samples .", "recent works try leveraging the advance in few - shot learning to solve the low resource problem , where they train label - agnostic models to directly compare the semantic similarities among context sentences in the embedding space .", "however , the label - aware information , i.e.", ", the relation label that contains the semantic knowledge of the relation itself , is often neglected for prediction ."], "relation": "used for", "id": "2021.emnlp-main.212", "year": 2021, "rel_sent": "MapRE : An Effective Semantic Mapping Approach for Low - resource Relation Extraction.", "forward": true, "src_ids": "2021.emnlp-main.212_503"}
