Open Access
Chinese EmoBank: Building Valence-Arousal Resources for Dimensional Sentiment Analysis
July 2022, Article No.: 65, pp 1–18

An increasing amount of research has recently focused on dimensional sentiment analysis that represents affective states as continuous numerical values on multiple dimensions, such as valence-arousal (VA) space. Compared to the categorical approach that ...

Dual Discriminator GAN: Restoring Ancient Yi Characters
July 2022, Article No.: 66, pp 1–23

In China, the damage to ancient Yi books is serious. Due to the lack of ancient Yi experts, the restoration of ancient Yi books is progressing very slowly. Artificial intelligence has been successful in the fields of image and text, so it is feasible for ...

Hypernymy Detection for Low-resource Languages: A Study for Hindi, Bengali, and Amharic
July 2022, Article No.: 67, pp 1–21

Numerous attempts at hypernymy relation (e.g., dog “is-a” animal) detection have been made for resource-rich languages like English, whereas efforts made for low-resource languages are scarce, primarily due to the lack of gold-standard datasets and suitable ...

Linguistically Driven Multi-Task Pre-Training for Low-Resource Neural Machine Translation
July 2022, Article No.: 68, pp 1–29

In the present study, we propose novel sequence-to-sequence pre-training objectives for low-resource neural machine translation (NMT): Japanese-specific sequence to sequence (JASS) for language pairs involving Japanese as the source or target language, and ...

Arabic Word Sense Disambiguation for Information Retrieval
July 2022, Article No.: 69, pp 1–19

In the context of using semantic resources for information retrieval, the relationship and distance between concepts are considered important for word sense disambiguation. In this article, we experiment with Conceptual Density and Random Walk with graph ...

Emotion Recognition with Conversational Generation Transfer
July 2022, Article No.: 70, pp 1–17

Emotion recognition in conversation is one of the essential tasks of natural language processing. However, annotated data for this task is insufficient, since such data is hard to collect and annotate. Meanwhile, there is large-scale data for conversational ...

Chinese Event Extraction via Graph Attention Network
July 2022, Article No.: 71, pp 1–12

Event extraction plays an important role in natural language processing (NLP) applications, including question answering and information retrieval. Most of the previous state-of-the-art methods lacked the ability to capture long-range features. ...

Interactive Gated Decoder for Machine Reading Comprehension
July 2022, Article No.: 72, pp 1–19

Owing to the availability of various large-scale Machine Reading Comprehension (MRC) datasets, building an effective model to extract passage spans for question answering has been well studied in previous works. However, in reality, there are some ...

Investigating the Effect of Preprocessing Arabic Text on Offensive Language and Hate Speech Detection
July 2022, Article No.: 73, pp 1–20

Preprocessing of input text can play a key role in text classification by reducing dimensionality and removing unnecessary content. This study aims to investigate the impact of preprocessing on Arabic offensive language classification. We explore six ...

Arabic Fake News Detection: A Fact Checking Based Deep Learning Approach
July 2022, Article No.: 75, pp 1–34

Fake news stories can polarize society, particularly during political events. They undermine confidence in the media in general. Current NLP systems are still lacking the ability to properly interpret and classify Arabic fake news. Given the high stakes ...

Text-to-Speech Synthesis: Literature Review with an Emphasis on Malayalam Language
July 2022, Article No.: 76, pp 1–56

Text-to-Speech Synthesis (TTS) is an active area of research to generate synthetic speech from underlying text. The identified syllables are uttered with proper duration and prosody characteristics to emulate natural speech. It falls under the category of ...

Multi-domain Spoken Language Understanding Using Domain- and Task-aware Parameterization
July 2022, Article No.: 77, pp 1–17

Spoken language understanding (SLU) has been addressed as a supervised learning problem, where a set of training data is available for each domain. However, annotating data for a new domain can be both financially costly and non-scalable. One existing ...

Advancing Chinese Event Detection via Revisiting Character Information
July 2022, Article No.: 78, pp 1–9

Recently, character information has been successfully introduced into the encoder-decoder event detection model to relieve the trigger-word mismatch problem, thus achieving impressive results in languages without natural delimiters (e.g., Chinese). ...

Word Sense Disambiguation using Cooperative Game Theory and Fuzzy Hindi WordNet based on ConceptNet
July 2022, Article No.: 79, pp 1–25

Natural language is fuzzy in nature. The fuzziness of the Hindi language was captured in the Fuzzy Hindi WordNet (FHWN). FHWN assigned membership values to fuzzy relationships by consulting experts from various domains. However, these membership values need ...

Konkani WordNet: Corpus-Based Enhancement using Crowdsourcing
July 2022, Article No.: 80, pp 1–18

Konkani is one of the languages included in the eighth schedule of the Indian constitution. It is the official language of Goa and is spoken mainly in Goa and some places in Karnataka and Kerala. Konkani WordNet or Konkani Shabdamalem (kōṁkanī śabdamālēṁ) ...

Handwritten New Tai Lue Character Recognition Using Convolutional Prior Features and Deep Variationally Sparse Gaussian Process Modeling
July 2022, Article No.: 82, pp 1–25

New Tai Lue is widely used in Southwest China and Southeast Asia. Hence, it is important to study related handwritten character recognition. Considering the many similar characters in handwritten New Tai Lue, this paper proposes an offline handwritten New ...

Word Level Script Identification Using Convolutional Neural Network Enhancement for Scenic Images
July 2022, Article No.: 83, pp 1–29

Script identification from complex and colorful images is an integral part of the text recognition and classification system. Such images may pose twofold challenges: (1) challenges related to the camera, such as blurring effects, non-uniform illumination ...

Combining a Novel Scoring Approach with Arabic Stemming Techniques for Arabic Chatbots Conversation Engine
July 2022, Article No.: 84, pp 1–21

Arabic is recognized as one of the main languages around the world. Many efforts have been made to provide computing solutions to support the language. Developing Arabic chatbots is still an evolving research field and requires extra effort ...



