Summary for 2021-08-30, created on 2021-12-19

An Introduction to Variational Inference arxiv:2108.13083 📈 104

Ankush Ganguly, Samuel W. F. Earp

**Abstract:** Approximating complex probability densities is a core problem in modern statistics. In this paper, we introduce the concept of Variational Inference (VI), a popular method in machine learning that uses optimization techniques to estimate complex probability densities. This property allows VI to converge faster than classical methods, such as Markov Chain Monte Carlo sampling. Conceptually, VI works by choosing a family of probability density functions and then finding the one closest to the actual probability density -- often using the Kullback-Leibler (KL) divergence as the optimization metric. We introduce the Evidence Lower Bound (ELBO) to tractably compute the approximate probability density and we review the ideas behind mean-field variational inference. Finally, we discuss the applications of VI to variational auto-encoders (VAE) and VAE-Generative Adversarial Networks (VAE-GAN). With this paper, we aim to explain the concept of VI and assist in future research with this approach.
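
To make the ELBO concrete, here is a minimal numerical sketch (not from the paper) for a toy conjugate model where the exact posterior is known; it uses the closed-form KL divergence between univariate Gaussians, and all function names and the model itself are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_gaussian(mu_q, sigma_q, mu_p, sigma_p):
    """Closed-form KL(q || p) between two univariate Gaussians."""
    return (np.log(sigma_p / sigma_q)
            + (sigma_q**2 + (mu_q - mu_p)**2) / (2 * sigma_p**2) - 0.5)

def elbo_estimate(x, mu_q, sigma_q, n_samples=1000):
    """Monte Carlo ELBO for a toy model: z ~ N(0, 1), x | z ~ N(z, 1).

    ELBO = E_q[log p(x | z)] - KL(q(z) || p(z)).
    """
    z = rng.normal(mu_q, sigma_q, size=n_samples)   # samples from q
    log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (x - z) ** 2
    return log_lik.mean() - kl_gaussian(mu_q, sigma_q, 0.0, 1.0)

# For this conjugate toy model the exact posterior is N(x/2, 1/sqrt(2)),
# so the ELBO is maximized (and approaches log p(x)) at that setting.
x_obs = 1.3
print(elbo_estimate(x_obs, mu_q=x_obs / 2, sigma_q=1 / np.sqrt(2)))
```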

Deep Reinforcement Learning at the Edge of the Statistical Precipice arxiv:2108.13264 📈 97

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc G. Bellemare

**Abstract:** Deep reinforcement learning (RL) algorithms are predominantly evaluated by comparing their relative performance on a large suite of tasks. Most published results on deep RL benchmarks compare point estimates of aggregate performance such as mean and median scores across tasks, ignoring the statistical uncertainty implied by the use of a finite number of training runs. Beginning with the Arcade Learning Environment (ALE), the shift towards computationally demanding benchmarks has led to the practice of evaluating only a small number of runs per task, exacerbating the statistical uncertainty in point estimates. In this paper, we argue that reliable evaluation in the few-run deep RL regime cannot ignore the uncertainty in results without running the risk of slowing down progress in the field. We illustrate this point using a case study on the Atari 100k benchmark, where we find substantial discrepancies between conclusions drawn from point estimates alone versus a more thorough statistical analysis. With the aim of increasing the field's confidence in reported results with a handful of runs, we advocate for reporting interval estimates of aggregate performance and propose performance profiles to account for the variability in results, as well as present more robust and efficient aggregate metrics, such as interquartile mean scores, to achieve small uncertainty in results. Using such statistical tools, we scrutinize performance evaluations of existing algorithms on other widely used RL benchmarks including the ALE, Procgen, and the DeepMind Control Suite, again revealing discrepancies in prior comparisons. Our findings call for a change in how we evaluate performance in deep RL, for which we present a more rigorous evaluation methodology, accompanied by an open-source library, rliable, to prevent unreliable results from stagnating the field.
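
As a rough illustration of the interval estimates the authors advocate, the sketch below computes the interquartile mean (IQM) over a runs-by-tasks score matrix with a simple percentile bootstrap over runs. The authors' rliable library implements more careful stratified bootstrapping, so treat this as an approximation of their methodology, with all sizes illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def iqm(scores):
    """Interquartile mean: mean of the middle 50% of scores."""
    return stats.trim_mean(scores, proportiontocut=0.25, axis=None)

def iqm_ci(score_matrix, n_boot=2000, alpha=0.05):
    """Percentile bootstrap CI for the IQM over a (runs x tasks) matrix."""
    n_runs = score_matrix.shape[0]
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, n_runs, size=n_runs)  # resample runs
        boots.append(iqm(score_matrix[idx]))
    lo, hi = np.percentile(boots, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return iqm(score_matrix), (lo, hi)

scores = rng.normal(1.0, 0.3, size=(5, 26))  # e.g. 5 runs x 26 Atari 100k games
point, (lo, hi) = iqm_ci(scores)
print(f"IQM = {point:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
```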

Semi-Supervised Exaggeration Detection of Health Science Press Releases arxiv:2108.13493 📈 83

Dustin Wright, Isabelle Augenstein

**Abstract:** Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a tendency of news media to misrepresent scientific papers by exaggerating their findings. Given this, we present a formalization of and study into the problem of exaggeration detection in science communication. While there is an abundance of scientific papers and popular media articles written about them, very rarely do the articles include a direct link to the original paper, making data collection challenging. We address this by curating a set of labeled press release/abstract pairs from existing expert-annotated studies on exaggeration in press releases of scientific papers, suitable for benchmarking the performance of machine learning models on the task. Using limited data from this and previous studies on exaggeration detection in science, we introduce MT-PET, a multi-task version of Pattern Exploiting Training (PET), which leverages knowledge from complementary cloze-style QA tasks to improve few-shot learning. We demonstrate that MT-PET outperforms PET and supervised learning both when data is limited, as well as when there is an abundance of data for the main task.
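
For readers unfamiliar with Pattern Exploiting Training, the sketch below shows the cloze idea PET (and hence MT-PET) builds on: rephrase the task as a masked-language-model question and read off label words. The pattern, the yes/no verbalizer, and the use of bert-base-uncased are illustrative assumptions, not the paper's exact setup:

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

claim = "Coffee cures cancer, study finds."
finding = "Coffee consumption was weakly correlated with tumor size in mice."
# Hypothetical PET-style pattern: cast exaggeration detection as a cloze QA.
pattern = (f'Press release: "{claim}" Paper: "{finding}" '
           f"Does the press release exaggerate the paper? {tok.mask_token} .")
verbalizer = {"exaggerated": "yes", "accurate": "no"}  # label -> single token

inputs = tok(pattern, return_tensors="pt")
with torch.no_grad():
    logits = mlm(**inputs).logits
mask_pos = (inputs.input_ids == tok.mask_token_id).nonzero()[0, 1]
for label, word in verbalizer.items():
    token_id = tok.convert_tokens_to_ids(word)  # "yes"/"no" are single tokens
    print(label, logits[0, mask_pos, token_id].item())
```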

Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring arxiv:2109.00928 📈 75

Yaman Kumar Singla, Avykat Gupta, Shaurya Bagga, Changyou Chen, Balaji Krishnamurthy, Rajiv Ratn Shah

**Abstract:** Automatic Speech Scoring (ASS) is the computer-assisted evaluation of a candidate's speaking proficiency in a language. ASS systems face many challenges like open grammar, variable pronunciations, and unstructured or semi-structured content. Recent deep learning approaches have shown some promise in this domain. However, most of these approaches focus on extracting features from a single audio response, leaving them without the speaker-specific context required to model such a complex task. We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling. In our technique, we take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context vectors from these responses and feed them as additional speaker-specific context to our network to score a particular response. We compare our technique with strong baselines and find that such modeling improves the model's average performance by 6.92% (maximum = 12.86%, minimum = 4.51%). We further show both quantitative and qualitative insights into the importance of this additional context in solving the problem of ASS.
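
A minimal sketch of conditioning on a speaker's other responses might look as follows; the GRU encoder, averaging as pooling, and all dimensions are illustrative assumptions rather than the paper's architecture:

```python
import torch
import torch.nn as nn

class SpeakerConditionedScorer(nn.Module):
    """Sketch: score one response using the candidate's other responses
    as speaker-specific context (all names and sizes are illustrative)."""

    def __init__(self, dim=256):
        super().__init__()
        self.encoder = nn.GRU(input_size=80, hidden_size=dim, batch_first=True)
        self.head = nn.Linear(2 * dim, 1)  # response vector + speaker context

    def encode(self, feats):  # feats: (batch, time, 80), e.g. mel frames
        _, h = self.encoder(feats)
        return h[-1]          # (batch, dim)

    def forward(self, response, other_responses):
        r = self.encode(response)
        # Speaker context: average of the candidate's other response vectors.
        ctx = torch.stack([self.encode(o) for o in other_responses]).mean(0)
        return self.head(torch.cat([r, ctx], dim=-1)).squeeze(-1)

model = SpeakerConditionedScorer()
resp = torch.randn(1, 500, 80)                        # one response
others = [torch.randn(1, 400, 80) for _ in range(3)]  # same speaker's other items
print(model(resp, others).shape)  # torch.Size([1])
```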

SHIFT15M: Multiobjective Large-Scale Fashion Dataset with Distributional Shifts arxiv:2108.12992 📈 45

Masanari Kimura, Takuma Nakamura, Yuki Saito

**Abstract:** Many machine learning algorithms assume that the training data and the test data follow the same distribution. However, such assumptions are often violated in real-world machine learning problems. In this paper, we propose SHIFT15M, a dataset that can be used to properly evaluate models in situations where the distribution of data changes between training and testing. The SHIFT15M dataset has several good properties: (i) Multiobjective. Each instance in the dataset has several numerical values that can be used as target variables. (ii) Large-scale. The SHIFT15M dataset consists of 15 million fashion images. (iii) Coverage of types of dataset shifts. SHIFT15M contains multiple dataset shift problem settings (e.g., covariate shift or target shift). SHIFT15M also enables the performance evaluation of the model under various magnitudes of dataset shifts by switching the magnitude. In addition, we provide software to handle SHIFT15M in a very simple way: https://github.com/st-tech/zozo-shift15m.

Investigating Vulnerabilities of Deep Neural Policies arxiv:2108.13093 📈 44

Ezgi Korkmaz

**Abstract:** Reinforcement learning policies based on deep neural networks are vulnerable to imperceptible adversarial perturbations to their inputs, in much the same way as neural network image classifiers. Recent work has proposed several methods to improve the robustness of deep reinforcement learning agents to adversarial perturbations based on training in the presence of these imperceptible perturbations (i.e. adversarial training). In this paper, we study the effects of adversarial training on the neural policy learned by the agent. In particular, we follow two distinct parallel approaches to investigate the outcomes of adversarial training on deep neural policies based on worst-case distributional shift and feature sensitivity. For the first approach, we compare the Fourier spectrum of minimal perturbations computed for both adversarially trained and vanilla trained neural policies. Via experiments in the OpenAI Atari environments we show that minimal perturbations computed for adversarially trained policies are more focused on lower frequencies in the Fourier domain, indicating a higher sensitivity of these policies to low frequency perturbations. For the second approach, we propose a novel method to measure the feature sensitivities of deep neural policies and we compare these feature sensitivity differences in state-of-the-art adversarially trained deep neural policies and vanilla trained deep neural policies. We believe our results can be an initial step towards understanding the relationship between adversarial training and different notions of robustness for neural policies.
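
One way to probe the frequency content of minimal perturbations, as the first approach describes, is to compare spectral energy below a radial frequency cutoff; the sketch below is only an illustration of the measurement (the cutoff, sizes, and random perturbation are arbitrary choices, not the paper's protocol):

```python
import numpy as np

def low_frequency_energy_ratio(perturbation, cutoff=0.1):
    """Fraction of spectral energy below a normalized frequency cutoff.

    A higher ratio for minimal adversarial perturbations suggests the
    policy is more sensitive to low-frequency directions.
    """
    spectrum = np.abs(np.fft.fft2(perturbation)) ** 2
    fy = np.fft.fftfreq(perturbation.shape[0])[:, None]
    fx = np.fft.fftfreq(perturbation.shape[1])[None, :]
    radius = np.sqrt(fx**2 + fy**2)  # radial frequency of each bin
    return spectrum[radius < cutoff].sum() / spectrum.sum()

delta = np.random.randn(84, 84) * 0.01  # e.g. an Atari-sized perturbation
print(low_frequency_energy_ratio(delta))
```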

Neural HMMs are all you need (for high-quality attention-free TTS) arxiv:2108.13320 📈 27

Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter

**Abstract:** Neural sequence-to-sequence TTS has achieved significantly better output quality than statistical speech synthesis using HMMs. However, neural TTS is generally not probabilistic and the use of non-monotonic attention both increases training time and introduces "babbling" failure modes that are unacceptable in production. This paper demonstrates that the old and new paradigms can be combined to obtain the advantages of both worlds, by replacing the attention in Tacotron 2 with an autoregressive left-right no-skip hidden Markov model defined by a neural network. This leads to an HMM-based neural TTS model with monotonic alignment, trained to maximise the full sequence likelihood without approximations. We discuss how to combine innovations from both classical and contemporary TTS for best results. The final system is smaller and simpler than Tacotron 2, and learns to speak with fewer iterations and less data, whilst achieving the same naturalness prior to the post-net. Unlike Tacotron 2, our system also allows easy control over speaking rate. Audio examples and code are available at https://shivammehta007.github.io/Neural-HMM/
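
The exact sequence likelihood of such a left-right, no-skip HMM is computable with the standard forward algorithm; below is a log-domain sketch with random stand-in emission scores (in the paper these would come from the neural network, so everything here is illustrative):

```python
import numpy as np

def forward_loglik(log_emit, log_stay, log_move):
    """Exact log-likelihood of a left-right, no-skip HMM via the forward
    algorithm. log_emit[t, j] = log p(frame t | state j); transitions only
    allow a self-loop (stay) or an advance by one state (move).

    log_stay/log_move: (num_states,) per-state log transition probabilities.
    """
    T, J = log_emit.shape
    alpha = np.full(J, -np.inf)
    alpha[0] = log_emit[0, 0]  # must start in the first state
    for t in range(1, T):
        stay = alpha + log_stay
        move = np.full(J, -np.inf)
        move[1:] = alpha[:-1] + log_move[:-1]
        alpha = np.logaddexp(stay, move) + log_emit[t]
    return alpha[-1]           # must end in the last state

T, J = 40, 8
log_emit = np.log(np.random.dirichlet(np.ones(J), size=T))  # stand-in scores
p_stay = 0.6
print(forward_loglik(log_emit,
                     np.log(p_stay) * np.ones(J),
                     np.log(1 - p_stay) * np.ones(J)))
```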

Want To Reduce Labeling Cost? GPT-3 Can Help arxiv:2108.13487 📈 23

Shuohang Wang, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng

**Abstract:** Data annotation is a time-consuming and labor-intensive process for many NLP tasks. Although there exist various methods to produce pseudo data labels, they are often task-specific and require a decent amount of labeled data to start with. Recently, the immense language model GPT-3 with 175 billion parameters has achieved tremendous improvement across many few-shot learning tasks. In this paper, we explore ways to leverage GPT-3 as a low-cost data labeler to train other models. We find that, to make the downstream model achieve the same performance on a variety of NLU and NLG tasks, it costs 50% to 96% less to use labels from GPT-3 than to use labels from humans. Furthermore, we propose a novel framework of combining pseudo labels from GPT-3 with human labels, which leads to even better performance with a limited labeling budget. These results present a cost-effective data labeling methodology that is generalizable to many practical applications.
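
A minimal sketch of using GPT-3 as a labeler might look as follows, assuming the 2021-era OpenAI completions endpoint; the engine name, prompt, and label set are illustrative, and outputs should be validated before being used as pseudo labels:

```python
import openai  # assumes the 2021-era OpenAI completions API and a valid key

openai.api_key = "sk-..."  # placeholder

def gpt3_label(text, labels=("positive", "negative")):
    """Ask GPT-3 for a label with a few-shot prompt (prompt is illustrative)."""
    prompt = (
        "Label the sentiment of each review as positive or negative.\n\n"
        "Review: The plot was gripping from start to finish.\nLabel: positive\n\n"
        "Review: A tedious, overlong mess.\nLabel: negative\n\n"
        f"Review: {text}\nLabel:"
    )
    resp = openai.Completion.create(
        engine="davinci", prompt=prompt, max_tokens=1, temperature=0
    )
    answer = resp["choices"][0]["text"].strip().lower()
    return answer if answer in labels else None  # reject malformed outputs

print(gpt3_label("I would happily watch it again."))
```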

Tune It or Don't Use It: Benchmarking Data-Efficient Image Classification arxiv:2108.13122 📈 16

Lorenzo Brigato, Björn Barz, Luca Iocchi, Joachim Denzler

**Abstract:** Data-efficient image classification using deep neural networks in settings where only small amounts of labeled data are available has been an active research area in the recent past. However, an objective comparison between published methods is difficult, since existing works use different datasets for evaluation and often compare against untuned baselines with default hyper-parameters. We design a benchmark for data-efficient image classification consisting of six diverse datasets spanning various domains (e.g., natural images, medical imagery, satellite data) and data types (RGB, grayscale, multispectral). Using this benchmark, we re-evaluate the standard cross-entropy baseline and eight methods for data-efficient deep learning published between 2017 and 2021 at renowned venues. For a fair and realistic comparison, we carefully tune the hyper-parameters of all methods on each dataset. Surprisingly, we find that tuning learning rate, weight decay, and batch size on a separate validation split results in a highly competitive baseline, which outperforms all but one specialized method and performs competitively with the remaining one.
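
The baseline the authors describe is essentially a small validation-split search over a few hyper-parameters. The sketch below illustrates the procedure with a linear SGD classifier standing in for a deep network, so only learning rate and weight decay are searched; the grid values and dataset are illustrative:

```python
import itertools
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# Search only the hyper-parameters highlighted by the paper (values illustrative).
grid = {
    "alpha": [1e-5, 1e-4, 1e-3],  # L2 weight decay strength
    "eta0": [1e-3, 1e-2, 1e-1],   # constant learning rate
}
best = None
for alpha, eta0 in itertools.product(grid["alpha"], grid["eta0"]):
    clf = SGDClassifier(alpha=alpha, learning_rate="constant", eta0=eta0,
                        random_state=0)
    clf.fit(X_tr, y_tr)
    acc = clf.score(X_val, y_val)  # select on the held-out validation split
    if best is None or acc > best[0]:
        best = (acc, alpha, eta0)
print(f"val acc {best[0]:.3f} with alpha={best[1]}, eta0={best[2]}")
```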

Improving Query Representations for Dense Retrieval with Pseudo Relevance Feedback arxiv:2108.13454 📈 15

HongChien Yu, Chenyan Xiong, Jamie Callan

**Abstract:** Dense retrieval systems conduct first-stage retrieval using embedded representations and simple similarity metrics to match a query to documents. Their effectiveness depends on the encoded embeddings capturing the semantics of queries and documents, a challenging task given the shortness and ambiguity of search queries. This paper proposes ANCE-PRF, a new query encoder that uses pseudo relevance feedback (PRF) to improve query representations for dense retrieval. ANCE-PRF uses a BERT encoder that consumes the query and the top retrieved documents from a dense retrieval model, ANCE, and it learns to produce better query embeddings directly from relevance labels. It also keeps the document index unchanged to reduce overhead. ANCE-PRF significantly outperforms ANCE and other recent dense retrieval systems on several datasets. Analysis shows that the PRF encoder effectively captures the relevant and complementary information from PRF documents, while ignoring the noise with its learned attention mechanism.
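
The core input construction, feeding the query together with its top-retrieved documents to one encoder, can be sketched as below; the checkpoint, separator scheme, and [CLS] pooling are illustrative assumptions, not ANCE-PRF's exact configuration:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = AutoModel.from_pretrained("bert-base-uncased")

def prf_query_embedding(query, top_docs):
    """Embed the query together with its top-retrieved documents and use
    the [CLS] vector as the new query embedding (a sketch, not ANCE-PRF's
    exact model or pooling)."""
    text = query + "".join(f" {tok.sep_token} {d}" for d in top_docs)
    inputs = tok(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        out = enc(**inputs)
    return out.last_hidden_state[0, 0]  # [CLS] embedding

q_emb = prf_query_embedding(
    "symptoms of vitamin d deficiency",
    ["Vitamin D deficiency can cause fatigue and bone pain.",
     "Low vitamin D has been linked to muscle weakness."],
)
print(q_emb.shape)  # torch.Size([768])
```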

DNNFusion: Accelerating Deep Neural Networks Execution with Advanced Operator Fusion arxiv:2108.13342 📈 8

Wei Niu, Jiexiong Guan, Yanzhi Wang, Gagan Agrawal, Bin Ren

**Abstract:** Deep Neural Networks (DNNs) have emerged as the core enabler of many major applications on mobile devices. To achieve high accuracy, DNN models have become increasingly deep with hundreds or even thousands of operator layers, leading to high memory and computational requirements for inference. Operator fusion (or kernel/layer fusion) is a key optimization in many state-of-the-art DNN execution frameworks, such as TensorFlow, TVM, and MNN. However, these frameworks usually adopt fusion approaches based on certain patterns that are too restrictive to cover the diversity of operators and layer connections. Polyhedral-based loop fusion techniques, on the other hand, work on a low-level view of the computation without operator-level information, and can also miss potential fusion opportunities. To address this challenge, this paper proposes a novel and extensive loop fusion framework called DNNFusion. The basic idea of this work is to operate at an operator-level view of DNNs, but expand fusion opportunities by developing a classification of both individual operators and their combinations. In addition, DNNFusion includes 1) a novel mathematical-property-based graph rewriting framework to reduce evaluation costs and facilitate subsequent operator fusion, 2) an integrated fusion plan generation that leverages the high-level analysis and accurate light-weight profiling, and 3) additional optimizations during fusion code generation. DNNFusion is extensively evaluated on 15 DNN models with varied types of tasks, model sizes, and layer counts. The evaluation results demonstrate that DNNFusion finds up to 8.8x more fusion opportunities and outperforms four state-of-the-art DNN execution frameworks with a 9.3x speedup. The memory requirement reduction and speedups can enable the execution of many of the target models on mobile devices and even make them part of a real-time application.

SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning arxiv:2108.13035 📈 8

Jiaqi Xu, Bin Li, Bo Lu, Yun-Hui Liu, Qi Dou, Pheng-Ann Heng

**Abstract:** Autonomous surgical execution relieves surgeons of tedious routines and fatigue. Recent learning-based methods, especially reinforcement learning (RL) based methods, achieve promising performance for dexterous manipulation, but usually require simulation to collect data efficiently and reduce hardware costs. The existing learning-based simulation platforms for medical robots suffer from limited scenarios and simplified physical interactions, which degrades the real-world performance of learned policies. In this work, we designed SurRoL, an RL-centered simulation platform for surgical robot learning compatible with the da Vinci Research Kit (dVRK). SurRoL integrates a user-friendly RL library for algorithm development and a real-time physics engine, which is able to support more PSM/ECM scenarios and more realistic physical interactions. Ten learning-based surgical tasks, common in real autonomous surgical execution, are built into the platform. We evaluate SurRoL using RL algorithms in simulation, provide in-depth analysis, deploy the trained policies on the real dVRK, and show that SurRoL achieves better transferability in the real world.

Auto-Split: A General Framework of Collaborative Edge-Cloud AI arxiv:2108.13041 📈 7

Amin Banitalebi-Dehkordi, Naveen Vedula, Jian Pei, Fei Xia, Lanjun Wang, Yong Zhang

**Abstract:** In many industry-scale applications, large and resource-consuming machine learning models reside in powerful cloud servers. At the same time, large amounts of input data are collected at the edge of the cloud. The inference results are also communicated to users or passed to downstream tasks at the edge. The edge often consists of a large number of low-power devices. It is a big challenge to design industry products that support sophisticated deep model deployment and conduct model inference efficiently, so that the model accuracy remains high and the end-to-end latency is kept low. This paper describes the techniques and engineering practice behind Auto-Split, an edge-cloud collaborative prototype of Huawei Cloud. This patented technology is already validated on selected applications, is on its way to broader systematic edge-cloud application integration, and is being made available for public use as an automated pipeline service for end-to-end cloud-edge collaborative intelligence deployment. To the best of our knowledge, there is no existing industry product that provides the capability of Deep Neural Network (DNN) splitting.

How Does Adversarial Fine-Tuning Benefit BERT? arxiv:2108.13602 📈 6

Javid Ebrahimi, Hao Yang, Wei Zhang

**Abstract:** Adversarial training (AT) is one of the most reliable methods for defending against adversarial attacks in machine learning. Variants of this method have been used as regularization mechanisms to achieve SOTA results on NLP benchmarks, and they have been found to be useful for transfer learning and continual learning. We search for the reasons for the effectiveness of AT by contrasting vanilla and adversarially fine-tuned BERT models. We identify partial preservation of BERT's syntactic abilities during fine-tuning as the key to the success of AT. We observe that adversarially fine-tuned models remain more faithful to BERT's language modeling behavior and are more sensitive to the word order. As concrete examples of syntactic abilities, an adversarially fine-tuned model could have an advantage of up to 38% on anaphora agreement and up to 11% on dependency parsing. Our analysis demonstrates that vanilla fine-tuning oversimplifies the sentence representation by focusing heavily on a small subset of words. AT, however, moderates the effect of these influential words and encourages representational diversity. This allows for a more hierarchical representation of a sentence and leads to the mitigation of BERT's loss of syntactic abilities.

Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning arxiv:2108.13032 📈 6

Ran Tian, Joshua Maynez, Ankur P. Parikh

**Abstract:** The highly popular Transformer architecture, based on self-attention, is the foundation of large pretrained models such as BERT, which have become an enduring paradigm in NLP. While powerful, the computational resources and time required to pretrain such models can be prohibitive. In this work, we present an alternative self-attention architecture, Shatter, that more efficiently encodes sequence information by softly partitioning the space of relative positions and applying different value matrices to different parts of the sequence. This mechanism further allows us to simplify the multi-headed attention in Transformer to single-headed. We conduct extensive experiments showing that Shatter achieves better performance than BERT, with pretraining being faster per step (15% on TPU), converging in fewer steps, and offering considerable memory savings (>50%). Put together, Shatter can be pretrained on 8 V100 GPUs in 7 days, and match the performance of BERT_Base -- making the cost of pretraining much more affordable.

Self-balanced Learning For Domain Generalization arxiv:2108.13597 📈 4

Jin Kim, Jiyoung Lee, Jungin Park, Dongbo Min, Kwanghoon Sohn

**Abstract:** Domain generalization aims to learn a prediction model on multi-domain source data such that the model can generalize to a target domain with unknown statistics. Most existing approaches have been developed under the assumption that the source data is well-balanced in terms of both domain and class. However, real-world training data collected with different composition biases often exhibits severe distribution gaps for domain and class, leading to substantial performance degradation. In this paper, we propose a self-balanced domain generalization framework that adaptively learns the weights of losses to alleviate the bias caused by different distributions of the multi-domain source data. The self-balanced scheme is based on an auxiliary reweighting network that iteratively updates the weight of the loss conditioned on the domain and class information by leveraging balanced meta data. Experimental results demonstrate the effectiveness of our method, which outperforms state-of-the-art works on domain generalization.

Survival Prediction of Heart Failure Patients using Stacked Ensemble Machine Learning Algorithm arxiv:2108.13367 📈 4

S. M Mehedi Zaman, Wasay Mahmood Qureshi, Md. Mohsin Sarker Raihan, Ocean Monjur, Abdullah Bin Shams

**Abstract:** Cardiovascular disease, especially heart failure, is one of the major health hazards of our time and a leading cause of death worldwide. Advances in data mining techniques using machine learning (ML) models are paving the way for promising prediction approaches. Data mining is the process of converting the massive volumes of raw data created by healthcare institutions into meaningful information that can aid in making predictions and crucial decisions. Collecting various follow-up data from patients who have had heart failure, analyzing those data, and utilizing several ML models to predict the survival possibility of cardiovascular patients is the key aim of this study. Due to the class imbalance in the dataset, the Synthetic Minority Oversampling Technique (SMOTE) has been implemented. Two unsupervised models (K-Means and Fuzzy C-Means clustering) and three supervised classifiers (Random Forest, XGBoost and Decision Tree) have been used in our study. After thorough investigation, our results demonstrate a superior performance of the supervised ML algorithms over unsupervised models. Moreover, we design and propose a supervised stacked ensemble learning model that achieves accuracy, precision, recall and F1 scores of 99.98%. Our study shows that only certain attributes collected from the patients are imperative to successfully predict survival after heart failure using supervised ML algorithms.
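
For readers who want to reproduce the general pipeline (not the paper's exact dataset or configuration), a minimal sketch combining SMOTE with a stacked ensemble of the three supervised classifiers could look like this; the synthetic data and all settings are illustrative:

```python
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Synthetic stand-in for the heart-failure records (features are illustrative).
X, y = make_classification(n_samples=1000, weights=[0.7], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Balance the minority class on the training split only.
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("xgb", XGBClassifier(random_state=0)),
                ("dt", DecisionTreeClassifier(random_state=0))],
    final_estimator=LogisticRegression(max_iter=1000),
)
stack.fit(X_bal, y_bal)
print(f"test accuracy: {stack.score(X_te, y_te):.3f}")
```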

On the Multilingual Capabilities of Very Large-Scale English Language Models arxiv:2108.13349 📈 4

Jordi Armengol-Estapé, Ona de Gibert Bonet, Maite Melero

**Abstract:** Generative Pre-trained Transformers (GPTs) have recently been scaled to unprecedented sizes in the history of machine learning. These models, solely trained on the language modeling objective, have been shown to exhibit outstanding few-shot learning capabilities in a number of different tasks. Nevertheless, aside from anecdotal experiences, little is known regarding their multilingual capabilities, given the fact that the pre-training corpus is almost entirely composed of English text. In this work, we investigate the multilingual skills of GPT-3, focusing on one language that barely appears in the pre-training corpus, Catalan, which makes the results especially meaningful; we assume that our results may be relevant for other languages as well. We find that the model shows an outstanding performance, particularly in generative tasks, with predictable limitations mostly in language understanding tasks but still with remarkable results given the zero-shot scenario. We investigate its potential and limits in extractive question-answering and natural language generation, as well as the effect of scale in terms of model size.

Enlisting 3D Crop Models and GANs for More Data Efficient and Generalizable Fruit Detection arxiv:2108.13344 📈 4

Zhenghao Fei, Alex Olenskyj, Brian N. Bailey, Mason Earles

**Abstract:** Training real-world neural network models to achieve high performance and generalizability typically requires a substantial amount of labeled data, spanning a broad range of variation. This data-labeling process can be both labor and cost intensive. To achieve desirable predictive performance, a trained model is typically applied to a domain where the data distribution is similar to the training dataset. However, for many agricultural machine learning problems, training datasets are collected at a specific location, during a specific period in time of the growing season. Since agricultural systems exhibit substantial variability in terms of crop type, cultivar, management, seasonal growth dynamics, lighting condition, sensor type, etc., a model trained on one dataset often does not generalize well across domains. To enable more data-efficient and generalizable neural network models in agriculture, we propose a method that generates photorealistic agricultural images from a synthetic 3D crop model domain into real-world crop domains. The method uses a semantically constrained GAN (generative adversarial network) to preserve the fruit position and geometry. We observe that a baseline CycleGAN method generates visually realistic target domain images but does not preserve fruit position information, while our method maintains fruit positions well. Image generation results on vineyard grape day and night images show that the visual outputs of our network are much better than those of a baseline network. Incremental training experiments in vineyard grape detection tasks show that the images generated from our method can significantly speed up the domain adaptation process, increase performance for a given number of labeled images (i.e., data efficiency), and decrease labeling requirements.

Zero Shot on the Cold-Start Problem: Model-Agnostic Interest Learning for Recommender Systems arxiv:2108.13592 📈 3

Philip J. Feng, Pingjun Pan, Tingting Zhou, Hongxiang Chen, Chuanjiang Luo

**Abstract:** User behavior has been validated to be effective in revealing personalized preferences for commercial recommendations. However, few user-item interactions can be collected for new users, which results in a null space for their interests, i.e., the cold-start dilemma. In this paper, a two-tower framework, namely, the model-agnostic interest learning (MAIL) framework, is proposed to address the cold-start recommendation (CSR) problem for recommender systems. In MAIL, one unique tower is constructed to tackle the CSR from a zero-shot view, and the other tower focuses on the general ranking task. Specifically, the zero-shot tower first performs cross-modal reconstruction with dual auto-encoders to obtain virtual behavior data from highly aligned hidden features for new users; the ranking tower can then output recommendations for users based on the data completed by the zero-shot tower. Practically, the ranking tower in MAIL is model-agnostic and can be implemented with any embedding-based deep models. Based on the co-training of the two towers, MAIL presents an end-to-end method for recommender systems that shows an incremental performance improvement. The proposed method has been successfully deployed on the live recommendation system of NetEase Cloud Music to achieve a click-through rate improvement of 13% to 15% for millions of users. Offline experiments on real-world datasets also show its superior performance in CSR. Our code is available.

Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision arxiv:2108.13465 📈 3

Bo Li, Xinyang Jiang, Donglin Bai, Yuge Zhang, Ningxin Zheng, Xuanyi Dong, Lu Liu, Yuqing Yang, Dongsheng Li

**Abstract:** The energy consumption of deep learning models is increasing at a breathtaking rate, which raises concerns due to potential negative effects on carbon neutrality in the context of global warming and climate change. With the progress of efficient deep learning techniques, e.g., model compression, researchers can obtain efficient models with fewer parameters and smaller latency. However, most of the existing efficient deep learning methods do not explicitly consider energy consumption as a key performance indicator. Furthermore, existing methods mostly focus on the inference costs of the resulting efficient models, but neglect the notable energy consumption throughout the entire life cycle of the algorithm. In this paper, we present the first large-scale energy consumption benchmark for efficient computer vision models, where a new metric is proposed to explicitly evaluate the full-cycle energy consumption under different model usage intensity. The benchmark can provide insights for low carbon emission when selecting efficient deep learning algorithms in different model usage scenarios.

The missing link: Developing a safety case for perception components in automated driving arxiv:2108.13294 📈 3

Rick Salay, Krzysztof Czarnecki, Hiroshi Kuwajima, Hirotoshi Yasuoka, Toshihiro Nakae, Vahdat Abdelzad, Chengjie Huang, Maximilian Kahn, Van Duong Nguyen

**Abstract:** Safety assurance is a central concern for the development and societal acceptance of automated driving (AD) systems. Perception is a key aspect of AD that relies heavily on Machine Learning (ML). Despite the known challenges with the safety assurance of ML-based components, proposals have recently emerged for unit-level safety cases addressing these components. Unfortunately, AD safety cases express safety requirements at the system level and these efforts are missing the critical linking argument needed to integrate safety requirements at the system level with component performance requirements at the unit level. In this paper, we propose the Integration Safety Case for Perception (ISCaP), a generic template for such a linking safety argument specifically tailored for perception components. The template takes a deductive and formal approach to define strong traceability between levels. We demonstrate the applicability of ISCaP with a detailed case study and discuss its use as a tool to support incremental development of perception components.

StackGAN: Facial Image Generation Optimizations arxiv:2108.13290 📈 3

Badr Belhiti, Justin Milushev, Avinash Gupta, John Breedis, Johnson Dinh, Jesse Pisel, Michael Pyrcz

**Abstract:** Current state-of-the-art photorealistic generators are computationally expensive, involve unstable training processes, and have real and synthetic distributions that are dissimilar in higher-dimensional spaces. To solve these issues, we propose a variant of the StackGAN architecture. The new architecture incorporates conditional generators to construct an image in many stages. In our model, we generate grayscale facial images in two different stages: noise to edges (stage one) and edges to grayscale (stage two). Our model is trained with the CelebA facial image dataset and achieved a Fréchet Inception Distance (FID) score of 73 for edge images and a score of 59 for grayscale images generated using the synthetic edge images. Although our model achieved subpar results in relation to state-of-the-art models, dropout layers could reduce the overfitting in our conditional mapping. Additionally, since most images can be broken down into important features, improvements to our model can generalize to other datasets. Therefore, our model can potentially serve as a superior alternative to traditional means of generating photorealistic images.

The effects of data size on Automated Essay Scoring engines arxiv:2108.13275 📈 3

Christopher Ormerod, Amir Jafari, Susan Lottridge, Milan Patel, Amy Harris, Paul van Wamelen

**Abstract:** We study the effects of data size and quality on the performance of Automated Essay Scoring (AES) engines designed in accordance with three different paradigms: a frequency- and hand-crafted-feature-based model, a recurrent neural network model, and a pretrained transformer-based language model that is fine-tuned for classification. We expect each type of model to benefit from the size and quality of the training data in very different ways. Standard practices for developing training data for AES engines were established with feature-based methods in mind; however, since neural networks are increasingly being considered in production settings, this work seeks to inform how to establish better training data for neural networks that will be used in production.

Thermodynamics-based Artificial Neural Networks (TANN) for multiscale modeling of materials with inelastic microstructure arxiv:2108.13137 📈 3

Filippo Masi, Ioannis Stefanou

**Abstract:** The mechanical behavior of inelastic materials with microstructure is very complex and hard to grasp with heuristic, empirical constitutive models. For this purpose, multiscale, homogenization approaches are often used for performing reliable, accurate predictions of the macroscopic mechanical behavior of microstructured solids. Nevertheless, the calculation cost of such approaches is extremely high and prohibitive for real-scale applications involving inelastic materials. Recently, data-driven approaches based on deep learning have risen as a promising alternative to replace ad-hoc constitutive laws and speed up multiscale numerical methods. However, such approaches lack a rigorous frame based on the laws of physics. As a result, their application to model materials with complex microstructure in inelasticity is not yet established. Here, we propose Thermodynamics-based Artificial Neural Networks (TANN) for the constitutive modeling of materials with inelastic and complex microstructure. Our approach integrates thermodynamics-aware dimensionality reduction techniques and deep neural networks to identify the constitutive laws and the internal state variables of complex inelastic materials. The ability of TANN to deliver high-fidelity, physically consistent predictions is demonstrated through several examples at both the microscopic and macroscopic scale. In particular, we show the efficiency and accuracy of TANN in predicting the average and local stress-strain response, the internal energy, and the dissipation of both regular and perturbed lattice microstructures in inelasticity. Finally, a double-scale homogenization scheme is used to solve a large-scale boundary value problem. The high performance of the homogenized model using TANN is illustrated through detailed comparisons. An excellent agreement is shown for a variety of monotonic and cyclic stress-strain paths.

Wasserstein Generative Adversarial Uncertainty Quantification in Physics-Informed Neural Networks arxiv:2108.13054 📈 3

Yihang Gao, Michael K. Ng

**Abstract:** In this paper, we study a physics-informed algorithm for Wasserstein Generative Adversarial Networks (WGANs) for uncertainty quantification in solutions of partial differential equations. By using groupsort activation functions in adversarial network discriminators, network generators are utilized to learn the uncertainty in solutions of partial differential equations observed from the initial/boundary data. Under mild assumptions, we show that the generalization error of the computed generator converges to the approximation error of the network with high probability when sufficiently many samples are taken. According to our established error bound, we also find that our physics-informed WGANs place a higher requirement on the capacity of the discriminators than on that of the generators. Numerical results on synthetic examples of partial differential equations are reported to validate our theoretical results and demonstrate how uncertainty quantification can be obtained for solutions of partial differential equations and the distributions of initial/boundary data.
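
The groupsort activation mentioned here is simple to implement; below is a minimal PyTorch sketch (group size 2, i.e. MaxMin, is an illustrative choice):

```python
import torch

def groupsort(x, group_size=2):
    """GroupSort activation: sort each disjoint group of pre-activations.

    With group_size=2 this is MaxMin; it is gradient-norm preserving,
    which is why it suits Lipschitz-constrained WGAN discriminators.
    """
    b, n = x.shape
    assert n % group_size == 0
    return x.view(b, n // group_size, group_size).sort(dim=-1).values.view(b, n)

x = torch.randn(4, 8, requires_grad=True)
print(groupsort(x).shape)  # torch.Size([4, 8])
```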

Communication-Computation Efficient Device-Edge Co-Inference via AutoML arxiv:2108.13009 📈 3

Xinjie Zhang, Jiawei Shao, Yuyi Mao, Jun Zhang

**Abstract:** Device-edge co-inference, which partitions a deep neural network between a resource-constrained mobile device and an edge server, has recently emerged as a promising paradigm to support intelligent mobile applications. To accelerate the inference process, on-device model sparsification and intermediate feature compression are regarded as two prominent techniques. However, as the on-device model sparsity level and the intermediate feature compression ratio have direct impacts on computation workload and communication overhead respectively, and both of them affect the inference accuracy, finding the optimal values of these hyper-parameters brings a major challenge due to the large search space. In this paper, we endeavor to develop an efficient algorithm to determine these hyper-parameters. By selecting a suitable model split point and a pair of encoder/decoder for the intermediate feature vector, this problem is cast as a sequential decision problem, for which a novel automated machine learning (AutoML) framework is proposed based on deep reinforcement learning (DRL). Experiment results on an image classification task demonstrate the effectiveness of the proposed framework in achieving a better communication-computation trade-off and significant inference speedup against various baseline schemes.

X2Teeth: 3D Teeth Reconstruction from a Single Panoramic Radiograph arxiv:2108.13004 📈 3

Yuan Liang, Weinan Song, Jiawei Yang, Liang Qiu, Kun Wang, Lei He

**Abstract:** 3D teeth reconstruction from X-ray is important for dental diagnosis and many clinical operations. However, no existing work has explored the reconstruction of teeth for a whole cavity from a single panoramic radiograph. Different from single object reconstruction from photos, this task has the unique challenge of constructing multiple objects at high resolutions. To conquer this task, we develop a novel ConvNet X2Teeth that decomposes the task into teeth localization and single-shape estimation. We also introduce a patch-based training strategy, such that X2Teeth can be end-to-end trained for optimal performance. Extensive experiments show that our method can successfully estimate the 3D structure of the cavity and reflect the details for each tooth. Moreover, X2Teeth achieves a reconstruction IoU of 0.681, which significantly outperforms the encoder-decoder method by 1.71x and the retrieval-based method by 1.52x. Our method can also be promising for other multi-anatomy 3D reconstruction tasks.

Multi-Task Triplet Loss for Named Entity Recognition using Supplementary Text arxiv:2109.13736 📈 2

Ryan Siskind, Shalin Shah

**Abstract:** Retail item data contains many different forms of text like the title of an item, the description of an item, item name and reviews. It is of interest to identify the item name in the other forms of text using a named entity tagger. However, the title of an item and its description are syntactically different (but semantically similar) in that the title is not necessarily a well-formed sentence while the description is made up of well-formed sentences. In this work, we use a triplet loss to contrast the embeddings of the item title with the description to establish a proof of concept. We find that using the triplet loss in a multi-task NER algorithm improves both the precision and recall by a small percentage. While the improvement is small, we think it is a step in the right direction of using various forms of text in a multi-task algorithm. In addition to precision and recall, the multi-task triplet loss method is also found to significantly improve the exact match accuracy, i.e., the accuracy of tagging the entire set of tokens in the text with correct tags.
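
A minimal sketch of the contrastive component, a triplet loss pulling a title embedding toward its own description and away from another item's, might look like this; the random embeddings, the loss weight, and the combination with the NER loss are illustrative placeholders:

```python
import torch
import torch.nn as nn

# Standard triplet margin loss; in practice the three inputs would come
# from a shared text encoder rather than random tensors.
triplet = nn.TripletMarginLoss(margin=1.0)

dim = 128
anchor = torch.randn(16, dim)    # item title embeddings
positive = torch.randn(16, dim)  # matching description embeddings
negative = torch.randn(16, dim)  # descriptions of different items

contrastive_loss = triplet(anchor, positive, negative)
ner_loss = torch.tensor(0.7)     # placeholder for the tagging loss
loss = ner_loss + 0.1 * contrastive_loss  # multi-task sum (weight illustrative)
print(loss.item())
```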

Sample Efficient Detection and Classification of Adversarial Attacks via Self-Supervised Embeddings arxiv:2108.13797 📈 2

Mazda Moayeri, Soheil Feizi

**Abstract:** Adversarial robustness of deep models is pivotal in ensuring safe deployment in real world settings, but most modern defenses have narrow scope and expensive costs. In this paper, we propose a self-supervised method to detect adversarial attacks and classify them to their respective threat models, based on a linear model operating on the embeddings from a pre-trained self-supervised encoder. We use a SimCLR encoder in our experiments, since we show the SimCLR embedding distance is a good proxy for human perceptibility, enabling it to encapsulate many threat models at once. We call our method SimCat since it uses SimCLR encoder to catch and categorize various types of adversarial attacks, including L_p and non-L_p evasion attacks, as well as data poisonings. The simple nature of a linear classifier makes our method efficient in both time and sample complexity. For example, on SVHN, using only five pairs of clean and adversarial examples computed with a PGD-L_inf attack, SimCat's detection accuracy is over 85%. Moreover, on ImageNet, using only 25 examples from each threat model, SimCat can classify eight different attack types such as PGD-L_2, PGD-L_inf, CW-L_2, PPGD, LPA, StAdv, ReColor, and JPEG-L_inf, with over 40% accuracy. On STL10 data, we apply SimCat as a defense against poisoning attacks, such as BP, CP, FC, CLBD, HTBD, halving the success rate while using only twenty total poisons for training. We find that the detectors generalize well to unseen threat models. Lastly, we investigate the performance of our detection method under adaptive attacks and further boost its robustness against such attacks via adversarial training.

Fast Multi-label Learning arxiv:2108.13570 📈 2

Xiuwen Gong, Dong Yuan, Wei Bao

**Abstract:** Embedding approaches have become one of the most pervasive techniques for multi-label classification. However, the training process of embedding methods usually involves a complex quadratic or semidefinite programming problem, or the model may even involve an NP-hard problem. Thus, such methods are prohibitive for large-scale applications. More importantly, much of the literature has already shown that the binary relevance (BR) method is usually good enough for some applications. Unfortunately, BR runs slowly due to its linear dependence on the size of the input data. The goal of this paper is to provide a simple method, yet with provable guarantees, which can achieve competitive performance without a complex training process. To achieve our goal, we provide a simple stochastic sketch strategy for multi-label classification and present theoretical results from both algorithmic and statistical learning perspectives. Our comprehensive empirical studies corroborate our theoretical findings and demonstrate the superiority of the proposed methods.

Adaptive Label Smoothing To Regularize Large-Scale Graph Training arxiv:2108.13555 📈 2

Kaixiong Zhou, Ninghao Liu, Fan Yang, Zirui Liu, Rui Chen, Li Li, Soo-Hyun Choi, Xia Hu

**Abstract:** Graph neural networks (GNNs), which learn node representations by recursively aggregating information from their neighbors, have become a predominant computational tool in many domains. To handle large-scale graphs, most of the existing methods partition the input graph into multiple sub-graphs (e.g., through node clustering) and apply batch training to save memory cost. However, such batch training will lead to label bias within each batch, and then result in over-confidence in model predictions. Since connected nodes with positively related labels tend to be assigned together, the traditional cross-entropy minimization process will focus on the predictions of the biased classes in the batch, and may intensify the overfitting issue. To overcome the label bias problem, we propose the adaptive label smoothing (ALS) method to replace the one-hot hard labels with smoothed ones, which learns to allocate label confidences from the biased classes to the others. Specifically, ALS propagates node labels to aggregate the neighborhood label distribution in a pre-processing step, and then updates the optimal smoothed labels online to adapt to the specific graph structure. Experiments on real-world datasets demonstrate that ALS can be generally applied to the main scalable learning frameworks to calibrate the biased labels and improve generalization performance.
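
The core move, replacing one-hot targets with labels smoothed toward a neighborhood label distribution, can be sketched as below; note that ALS learns the allocation online, whereas this sketch uses a fixed pre-computed neighbor distribution for illustration:

```python
import torch
import torch.nn.functional as F

def smoothed_targets(hard_labels, neighbor_dist, eps=0.1):
    """Smooth one-hot labels toward a per-node neighborhood label
    distribution instead of spreading eps uniformly (a fixed-distribution
    stand-in for ALS's online allocation)."""
    one_hot = F.one_hot(hard_labels, neighbor_dist.shape[1]).float()
    return (1 - eps) * one_hot + eps * neighbor_dist

labels = torch.tensor([0, 2, 1])
neighbor_dist = torch.tensor([[0.6, 0.3, 0.1],   # aggregated neighbor labels
                              [0.2, 0.3, 0.5],
                              [0.1, 0.8, 0.1]])
targets = smoothed_targets(labels, neighbor_dist)
logits = torch.randn(3, 3)
loss = -(targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
print(loss.item())
```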

An FEA surrogate model with Boundary Oriented Graph Embedding approach arxiv:2108.13509 📈 2

Xingyu Fu, Fengfeng Zhou, Dheeraj Peddireddy, Zhengyang Kang, Martin Byung-Guk Jun, Vaneet Aggarwal

**Abstract:** In this work, we present a Boundary Oriented Graph Embedding (BOGE) approach for the Graph Neural Network (GNN) to serve as a general surrogate model for regressing physical fields and solving boundary value problems. By providing shortcuts for both boundary elements and local neighbor elements, the BOGE approach can embed structured mesh elements into the graph and perform efficient regression on large-scale triangular-mesh-based FEA results, which cannot be realized by other machine-learning-based surrogate methods. Focusing on the cantilever beam problem, our BOGE approach can not only fit the distribution of stress fields but also regress the topological optimization results, which shows its potential for realizing an abstract decision-making design process. The BOGE approach with a 3-layer DeepGCN model achieves the regression with an MSE of 0.011706 (2.41% MAPE) for stress field prediction and an MSE of 0.002735 (with 1.58% of elements having an error larger than 0.01) for topological optimization. The overall concept of the BOGE approach paves the way for a general and efficient deep-learning-based FEA simulator that will benefit both industry and design-related areas.

The Application of Convolutional Neural Networks for Tomographic Reconstruction of Hyperspectral Images arxiv:2108.13458 📈 2

Wei-Chih Huang, Mads Svanborg Peters, Mads Juul Ahlebaek, Mads Toudal Frandsen, René Lynge Eriksen, Bjarke Jørgensen

**Abstract:** A novel method, utilizing convolutional neural networks (CNNs), is proposed to reconstruct hyperspectral cubes from computed tomography imaging spectrometer (CTIS) images. Current reconstruction algorithms are usually subject to long reconstruction times and mediocre precision in cases of a large number of spectral channels. The constructed CNNs deliver higher precision and shorter reconstruction time than a standard expectation maximization algorithm. In addition, the network can handle two different types of real-world images at the same time -- specifically ColorChecker and carrot spectral images are considered. This work paves the way toward real-time reconstruction of hyperspectral cubes from CTIS images.

Ovarian Cancer Prediction from Ovarian Cysts Based on TVUS Using Machine Learning Algorithms arxiv:2108.13387 📈 2

Laboni Akter, Nasrin Akhter

**Abstract:** Ovarian Cancer (OC) is a type of female reproductive malignancy that can be found among young girls and, mostly, women of fertile or reproductive age. A small number of cysts are dangerous and may cause cancer. Early prediction is therefore very important, and several types of screening, such as Transvaginal Ultrasonography (TVUS), are used for this detection. In this research, we employed a real dataset called PLCO with TVUS screening and three machine learning (ML) techniques, namely Random Forest, KNN, and XGBoost, with three target variables. We obtained the best performance from these algorithms in terms of accuracy, recall, F1 score, and precision, with approximate values of 99.50%, 99.50%, 99.49%, and 99.50%, respectively. AUC scores of 99.87%, 98.97%, and 99.88% were observed for the Random Forest, KNN, and XGBoost algorithms, respectively. This approach helps physicians and patients identify ovarian risks early on, reducing ovarian malignancy-related complications and deaths.

Trustworthy AI for Process Automation on a Chylla-Haase Polymerization Reactor arxiv:2108.13381 📈 2

Daniel Hein, Daniel Labisch

**Abstract:** In this paper, genetic programming reinforcement learning (GPRL) is utilized to generate human-interpretable control policies for a Chylla-Haase polymerization reactor. Such continuously stirred tank reactors (CSTRs) with jacket cooling are widely used in the chemical industry, in the production of fine chemicals, pigments, polymers, and medical products. Despite appearing rather simple, controlling CSTRs in real-world applications is quite a challenging problem to tackle. GPRL utilizes already existing data from the reactor and fully automatically generates a set of optimized simplistic control strategies, so-called policies, that the domain expert can choose from. Note that these policies are white-box models of low complexity, which makes them easy to validate and implement in the target control system, e.g., SIMATIC PCS 7. However, despite its low complexity, the automatically generated policy yields high performance in terms of reactor temperature control deviation, which we empirically evaluate on the original reactor template.

Robust Interactive Semantic Segmentation of Pathology Images with Minimal User Input arxiv:2108.13368 📈 2

Mostafa Jahanifar, Neda Zamani Tajeddin, Navid Alemi Koohbanani, Nasir Rajpoot

**Abstract:** From the simple measurement of tissue attributes in a pathology workflow to designing an explainable diagnostic/prognostic AI tool, access to accurate semantic segmentation of tissue regions in histology images is a prerequisite. However, delineating different tissue regions manually is a laborious, time-consuming and costly task that requires expert knowledge. On the other hand, the state-of-the-art automatic deep learning models for semantic segmentation require lots of annotated training data, and there are only a limited number of tissue region annotated images publicly available. To obviate this issue in computational pathology projects and collect large-scale region annotations efficiently, we propose an efficient interactive segmentation network that requires minimal input from the user to accurately annotate different tissue types in the histology image. The user is only required to draw a simple squiggle inside each region of interest, which is then used as the guiding signal for the model. To deal with the complex appearance and amorphous geometry of different tissue regions, we introduce several automatic and minimalistic guiding signal generation techniques that help the model become robust against variation in the user input. By experimenting on a dataset of breast cancer images, we show that not only does our proposed method speed up the interactive annotation process, it can also outperform the existing automatic and interactive region segmentation models.

Predicting Road Flooding Risk with Machine Learning Approaches Using Crowdsourced Reports and Fine-grained Traffic Data arxiv:2108.13265 📈 2

Faxi Yuan, William Mobley, Hamed Farahmand, Yuanchang Xu, Russell Blessing, Shangjia Dong, Ali Mostafavi, Samuel D. Brody

**Abstract:** The objective of this study is to predict road flooding risks based on topographic, hydrologic, and temporal precipitation features using machine learning models. Predictive flood monitoring of road network flooding status plays an essential role in community hazard mitigation, preparedness, and response activities. Existing studies related to the estimation of road inundations either lack observed road inundation data for model validation or focus mainly on road inundation exposure assessment based on flood maps. This study addresses this limitation by using crowdsourced and fine-grained traffic data as an indicator of road inundation, and topographic, hydrologic, and temporal precipitation features as predictor variables. Two tree-based machine learning models (random forest and AdaBoost) were then tested and trained for predicting road inundations in the contexts of 2017 Hurricane Harvey and 2019 Tropical Storm Imelda in Harris County, Texas. The findings from Hurricane Harvey indicate that precipitation is the most important feature for predicting road inundation susceptibility, and that topographic features are more essential than hydrologic features for predicting road inundations in both storm cases. The random forest and AdaBoost models had relatively high AUC scores (0.860 and 0.810 for Harvey, and 0.790 and 0.720 for Imelda, respectively), with the random forest model performing better in both cases. The random forest model showed stable performance for Harvey, while varying significantly for Imelda. This study advances the emerging field of smart flood resilience in terms of predictive flood risk mapping at the road level. For example, such models could help impacted communities and emergency management agencies develop better preparedness and response strategies with improved situational awareness of road inundation likelihood as an extreme weather event unfolds.

Adaptive perturbation adversarial training: based on reinforcement learning arxiv:2108.13239 📈 2

Zhishen Nie, Ying Lin, Sp Ren, Lan Zhang

**Abstract:** Adversarial training has become the primary method for defending against adversarial samples. However, it is hard to apply in practice due to many shortcomings. One shortcoming of adversarial training is that it reduces the recognition accuracy of normal samples. Adaptive perturbation adversarial training is proposed to alleviate this problem. It performs adversarial training with marginal adversarial samples that are close to, but do not cross, the decision boundary, which improves the accuracy of model recognition while maintaining the robustness of the model. However, searching for marginal adversarial samples brings additional computational costs. This paper proposes a method for finding marginal adversarial samples based on reinforcement learning and combines it with the latest fast adversarial training technology, which effectively speeds up the training process and reduces training costs.

Representation of binary classification trees with binary features by quantum circuits arxiv:2108.13207 📈 2

Raoul Heese, Patricia Bickert, Astrid Elisa Niederle

**Abstract:** We propose a quantum representation of binary classification trees with binary features based on a probabilistic approach. By using the quantum computer as a processor for probability distributions, a probabilistic traversal of the decision tree can be realized via measurements of a quantum circuit. We describe how tree inductions and the prediction of class labels of query data can be integrated into this framework. An on-demand sampling method enables predictions with a constant number of classical memory slots, independent of the tree depth. We experimentally study our approach using both a quantum computing simulator and actual IBM quantum hardware. To our knowledge, this is the first realization of a decision tree classifier on a quantum device.

Robust Privacy-Preserving Motion Detection and Object Tracking in Encrypted Streaming Video arxiv:2108.13141 📈 2

Xianhao Tian, Peijia Zheng, Jiwu Huang

**Abstract:** Video privacy leakage is becoming an increasingly severe public problem, especially in cloud-based video surveillance systems. It leads to the new need for secure cloud-based video applications, where the video is encrypted for privacy protection. Despite some methods that have been proposed for encrypted video moving object detection and tracking, none has robust performance against complex and dynamic scenes. In this paper, we propose an efficient and robust privacy-preserving motion detection and multiple object tracking scheme for encrypted surveillance video bitstreams. By analyzing the properties of the video codec and format-compliant encryption schemes, we propose a new compressed-domain feature to capture motion information in complex surveillance scenarios. Based on this feature, we design an adaptive clustering algorithm for moving object segmentation with an accuracy of 4x4 pixels. We then propose a multiple object tracking scheme that uses Kalman filter estimation and adaptive measurement refinement. The proposed scheme does not require video decryption or full decompression and has a very low computation load. The experimental results demonstrate that our scheme achieves the best detection and tracking performance compared with existing works in the encrypted and compressed domain. Our scheme can be effectively used in complex surveillance scenarios with different challenges, such as camera movement/jitter, dynamic background, and shadows.

Automatic Preprocessing and Ensemble Learning for Low Quality Cell Image Segmentation arxiv:2108.13118 📈 2

Sota Kato, Kazuhiro Hotta

**Abstract:** We propose automatic preprocessing and ensemble learning for the segmentation of low-quality cell images. Because cells are difficult to capture under strong light, microscopic cell images tend to have low image quality, which makes them poorly suited for semantic segmentation. We therefore propose a method that translates an input image into images that are easy for deep learning to recognize. The proposed method consists of two deep neural networks. The first network is trained for semantic segmentation in the usual way, and its penultimate feature maps are used as filters that translate the input image into images emphasizing each class. This constitutes the automatic preprocessing, and the translated cell images are easy to classify. The low-quality input cell image is translated by the feature maps of the first network, and the translated images are fed into the second network for semantic segmentation. Since the second network produces multiple segmentation results, we take a weighted ensemble of those segmentation images. The two networks are trained end-to-end, and no high-quality images need to be prepared for the translation. We confirmed that the proposed method translates low-quality cell images into images that are easy to segment, and that segmentation accuracy improves with the weighted ensemble learning.
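
The final weighted-ensemble step can be sketched as follows; the weights and tensor shapes are illustrative assumptions.

```python
import torch

def weighted_ensemble(seg_logits, weights):
    """Combine several per-class segmentation outputs (each of shape
    [C, H, W]) with fixed weights and return the per-pixel class map."""
    probs = [torch.softmax(s, dim=0) for s in seg_logits]
    combined = sum(w * p for w, p in zip(weights, probs))
    return combined.argmax(dim=0)

# e.g. three candidate segmentations of a 4-class, 64x64 image
outs = [torch.randn(4, 64, 64) for _ in range(3)]
mask = weighted_ensemble(outs, weights=[0.5, 0.3, 0.2])
```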

To tune or not to tune? An Approach for Recommending Important Hyperparameters arxiv:2108.13066 📈 2

Mohamadjavad Bahmani, Radwa El Shawi, Nshan Potikyan, Sherif Sakr

**Abstract:** Novel technologies in automated machine learning ease the complexity of algorithm selection and hyperparameter optimization. Hyperparameters are important because they significantly influence the performance of machine learning models. Many optimization techniques have achieved notable success in hyperparameter tuning and have surpassed the performance of human experts. However, relying on such techniques as black-box algorithms can leave practitioners without insight into the relative importance of different hyperparameters. In this paper, we model the relationship between the performance of machine learning models and their hyperparameters to discover trends and gain insights, with empirical results based on six classifiers and 200 datasets. Our results enable users to decide whether it is worth conducting a possibly time-consuming tuning strategy, to focus on the most important hyperparameters, and to choose adequate hyperparameter spaces for tuning. Our experiments show that gradient boosting and AdaBoost outperform other classifiers across the 200 problems, but they need tuning to reach their best performance. Overall, the results of this study provide a quantitative basis for focusing efforts on guided automated hyperparameter optimization and contribute to the development of better automated machine learning frameworks.
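
One common way to extract such importances, sketched here on synthetic data (an assumption, not necessarily the paper's exact protocol), is to fit a surrogate regressor from sampled configurations to observed scores and inspect its feature importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Sample hyperparameter configurations and (synthetic) resulting scores;
# in the synthetic target below, the learning rate dominates by design.
rng = np.random.default_rng(0)
configs = rng.uniform(size=(200, 3))          # e.g. lr, depth, subsample
scores = (0.8 * configs[:, 0]
          + 0.1 * configs[:, 1]
          + 0.05 * rng.normal(size=200))

# Surrogate model: hyperparameters -> score; importances rank the knobs.
surrogate = RandomForestRegressor(n_estimators=200, random_state=0)
surrogate.fit(configs, scores)
for name, imp in zip(["lr", "depth", "subsample"],
                     surrogate.feature_importances_):
    print(f"{name}: {imp:.2f}")
```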

Demystifying Drug Repurposing Domain Comprehension with Knowledge Graph Embedding arxiv:2108.13051 📈 2

Edoardo Ramalli, Alberto Parravicini, Guido Walter Di Donato, Mirko Salaris, Céline Hudelot, Marco Domenico Santambrogio

**Abstract:** Drug repurposing is more relevant than ever due to the rising costs of drug development and the need to respond quickly to emerging diseases. Knowledge graph embedding enables drug repurposing by combining heterogeneous data sources with state-of-the-art machine learning models to predict new drug-disease links in the knowledge graph. As in many machine learning applications, significant work is still required to understand the predictive models' behavior. We propose a structured methodology to better understand the results of machine learning models for drug repurposing, suggesting key elements of the knowledge graph that improve predictions while saving computational resources. We reduce the training set by 11.05% and the embedding space by 31.87% with only a 2% reduction in accuracy, and we increase accuracy by 60% on the open ogbl-biokg graph while adding only 1.53% new triples.
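
For concreteness, a translational embedding scorer in the TransE style is sketched below; the paper does not commit to this particular model, and the entities and dimensions are made up.

```python
import numpy as np

# TransE-style scoring: a triple (head, relation, tail) is plausible
# when the embedding of head + relation lies close to that of tail.
dim = 64
rng = np.random.default_rng(0)
emb_entity = {e: rng.normal(size=dim) for e in ["drugA", "diseaseX"]}
emb_rel = {"treats": rng.normal(size=dim)}

def transe_score(h, r, t):
    # higher (less negative) score means a more plausible link
    return -np.linalg.norm(emb_entity[h] + emb_rel[r] - emb_entity[t])

print(transe_score("drugA", "treats", "diseaseX"))
```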

Integrated Decision and Control at Multi-Lane Intersections with Mixed Traffic Flow arxiv:2108.13038 📈 2

Jianhua Jiang, Yangang Ren, Yang Guan, Shengbo Eben Li, Yuming Yin, Xiaoping Jin

**Abstract:** Autonomous driving at intersections is one of the most complicated and accident-prone traffic scenarios, especially with mixed traffic participants such as vehicles, bicycles, and pedestrians. The driving policy should make safe decisions that handle dynamic traffic conditions and meet the requirements of on-board computation. However, most current research focuses on simplified intersections, considering only surrounding vehicles and idealized traffic lights. This paper improves the integrated decision and control framework and develops a learning-based algorithm to deal with complex intersections and mixed traffic flows, which not only accounts for the realistic characteristics of traffic lights but also learns a safe policy under different safety constraints. We first consider different velocity models for green and red lights in the training process and use a finite state machine to handle the different modes of light transition. We then design different types of distance constraints for vehicles, traffic lights, pedestrians, and bicycles, and formulate the resulting constrained optimal control problems (OCPs). Finally, reinforcement learning (RL) with value and policy networks is adopted to solve the series of OCPs. To verify the safety and efficiency of the proposed method, we design a multi-lane intersection with large-scale mixed traffic participants and practical traffic light phases. Simulation results indicate that the trained decision and control policy balances safety and tracking performance well. Compared with model predictive control (MPC), the computation time is three orders of magnitude lower.
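
The light-transition finite state machine can be sketched as follows; the phase durations are illustrative assumptions, not values from the paper.

```python
from enum import Enum

class Light(Enum):
    GREEN = 0
    YELLOW = 1
    RED = 2

# Each phase maps to (assumed duration in seconds, next phase).
PHASES = {Light.GREEN: (30.0, Light.YELLOW),
          Light.YELLOW: (3.0, Light.RED),
          Light.RED: (30.0, Light.GREEN)}

def step_light(state, t_in_phase, dt):
    """Advance the light FSM by dt seconds; a driving policy would switch
    between velocity models depending on the returned state."""
    duration, nxt = PHASES[state]
    t_in_phase += dt
    if t_in_phase >= duration:
        return nxt, t_in_phase - duration
    return state, t_in_phase

state, t = Light.GREEN, 0.0
for _ in range(400):                 # simulate 40 seconds at 0.1 s steps
    state, t = step_light(state, t, dt=0.1)
```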

Identifying optimal cycles in quantum thermal machines with reinforcement-learning arxiv:2108.13525 📈 1

Paolo Andrea Erdman, Frank Noé

**Abstract:** The optimal control of open quantum systems is a challenging task with a key role in improving existing quantum information processing technologies. We introduce a general framework based on reinforcement learning to discover optimal thermodynamic cycles that maximize the power of out-of-equilibrium quantum heat engines and refrigerators. We apply our method, based on the soft actor-critic algorithm, to three systems: a benchmark two-level-system heat engine, where we recover the known optimal cycle; an experimentally realistic refrigerator based on a superconducting qubit that generates coherence, where we find a non-intuitive control sequence that outperforms previous cycles proposed in the literature; and a heat engine based on a quantum harmonic oscillator, where we find a cycle with an elaborate structure that outperforms the optimized Otto cycle. We then evaluate the corresponding efficiency at maximum power.

Recent advances for quantum classifiers arxiv:2108.13421 📈 1

Weikang Li, Dong-Ling Deng

**Abstract:** Machine learning has achieved dramatic success in a broad spectrum of applications. Its interplay with quantum physics may lead to unprecedented perspectives for both fundamental research and commercial applications, giving rise to an emergent research frontier of quantum machine learning. Along this line, quantum classifiers, which are quantum devices that aim to solve classification problems in machine learning, have attracted tremendous attention recently. In this review, we give a relatively comprehensive overview of studies on quantum classifiers, with a focus on recent advances. First, we review a number of quantum classification algorithms, including quantum support vector machines, quantum kernel methods, quantum decision tree classifiers, quantum nearest neighbor algorithms, and quantum-annealing-based classifiers. Then, we move on to variational quantum classifiers, which are essentially variational quantum circuits for classification. We review different architectures for constructing variational quantum classifiers and introduce the barren plateau problem, in which the training of quantum classifiers may be hindered by exponentially vanishing gradients. In addition, we discuss the vulnerability of quantum classifiers in the adversarial learning setting and the recent experimental progress on different quantum classifiers.

ML-based IoT Malware Detection Under Adversarial Settings: A Systematic Evaluation arxiv:2108.13373 📈 1

Ahmed Abusnaina, Afsah Anwar, Sultan Alshamrani, Abdulrahman Alabduljabbar, RhongHo Jang, Daehun Nyang, David Mohaisen

**Abstract:** The rapid growth of Internet of Things (IoT) devices is paralleled by their presence on the front line of malicious attacks. This has led to an explosion in the number of IoT malware samples, with continued mutation, evolution, and sophistication. Such malicious software is detected using machine learning (ML) algorithms alongside traditional signature-based methods. Although ML-based detectors improve detection performance, they are susceptible to malware evolution and sophistication, and remain limited to the patterns on which they have been trained. This continuing trend motivates the large body of literature on malware analysis and detection, with many systems emerging constantly and outperforming their predecessors. In this work, we systematically examine state-of-the-art malware detection approaches, which utilize various representation and learning techniques, under a range of adversarial settings. Our analyses highlight the instability of the proposed detectors in learning patterns that distinguish benign from malicious software. The results show that software mutations with functionality-preserving operations, such as stripping and padding, significantly deteriorate the accuracy of such detectors. Additionally, our analysis of industry-standard malware detectors shows their instability against malware mutations.
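
Padding is one of the functionality-preserving mutations evaluated; a toy version is sketched below (the file names are hypothetical, and appended overlay bytes are ignored at load time for many executable formats).

```python
import shutil

def pad_binary(src, dst, n_bytes=4096, filler=b"\x00"):
    """Append filler bytes to a copy of a binary: byte-level features
    change while the program's execution is unaffected."""
    shutil.copyfile(src, dst)
    with open(dst, "ab") as f:
        f.write(filler * n_bytes)

# pad_binary("sample.elf", "sample_padded.elf")  # hypothetical file names
```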

Multi-Agent Simulation for AI Behaviour Discovery in Operations Research arxiv:2108.13296 📈 1

Michael Papasimeon, Lyndon Benke

**Abstract:** We describe ACE0, a lightweight platform for evaluating the suitability and viability of AI methods for behaviour discovery in multi-agent simulations. Specifically, ACE0 was designed to explore AI methods for multi-agent simulations used in operations research studies related to new technologies such as autonomous aircraft. Simulation environments used in production are often high-fidelity, complex, require significant domain knowledge and as a result have high R&D costs. Minimal and lightweight simulation environments can help researchers and engineers evaluate the viability of new AI technologies for behaviour discovery in a more agile and potentially cost-effective manner. In this paper we describe the motivation for the development of ACE0. We provide a technical overview of the system architecture, describe a case study of behaviour discovery in the aerospace domain, and provide a qualitative evaluation of the system. The evaluation includes a brief description of collaborative research projects with academic partners, exploring different AI behaviour discovery methods.

Reachability Is NP-Complete Even for the Simplest Neural Networks arxiv:2108.13179 📈 1

Marco Sälzer, Martin Lange

**Abstract:** We investigate the complexity of the reachability problem for (deep) neural networks: does the network compute a valid output given some valid input? It was recently claimed that the problem is NP-complete for general neural networks and conjunctive input/output specifications. We repair some flaws in the original upper and lower bound proofs. We then show that NP-hardness already holds for restricted classes of simple specifications and neural networks with just one layer, as well as for neural networks with minimal requirements on the occurring parameters.
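
Since exact reachability is intractable in general, practitioners often fall back on sound but incomplete over-approximations. The sketch below uses interval bound propagation (a different technique from the paper's exact analysis) for a single ReLU layer with assumed weights.

```python
import numpy as np

# One ReLU layer y = ReLU(Wx + b) with an input box [lo, hi].
W = np.array([[1.0, -2.0], [0.5, 1.0]])
b = np.array([0.0, -1.0])
lo, hi = np.array([-1.0, 0.0]), np.array([1.0, 2.0])

# Propagate the box through the affine map via center/radius form,
# then apply the (monotone) ReLU to the bounds.
center, radius = (lo + hi) / 2, (hi - lo) / 2
y_center = W @ center + b
y_radius = np.abs(W) @ radius
y_lo = np.maximum(y_center - y_radius, 0.0)
y_hi = np.maximum(y_center + y_radius, 0.0)
print(y_lo, y_hi)   # every reachable output lies in this (possibly loose) box
```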

Open Set RF Fingerprinting using Generative Outlier Augmentation arxiv:2108.13099 📈 1

Samurdhi Karunaratne, Samer Hanna, Danijela Cabric

**Abstract:** RF devices can be identified by unique imperfections embedded in the signals they transmit, called RF fingerprints. The closed-set classification of such devices, where identification must be made among an authorized set of transmitters, has been well explored. However, the much more difficult problem of open-set classification, where the classifier must reject unauthorized transmitters while recognizing authorized ones, has only recently been visited. So far, efforts at open-set classification have largely relied on signal samples captured from a known set of unauthorized transmitters to help the classifier learn unauthorized transmitter fingerprints. Since acquiring new transmitters to use as known transmitters is highly expensive, we propose to use generative deep learning methods to emulate unauthorized signal samples for the augmentation of training datasets. We develop two data augmentation techniques, one that exploits a limited number of known unauthorized transmitters and one that does not require any unauthorized transmitters. Experiments conducted on a dataset captured from a WiFi testbed indicate that data augmentation allows for significant increases in open-set classification accuracy, especially when the authorized set is small.
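
The augmentation idea can be caricatured with a crude stand-in for the generative model (the paper uses generative deep learning; the perturbation-based outliers and toy features below are assumptions): synthetic "unauthorized" samples are added as an extra class so the classifier learns a reject region.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_auth = rng.normal(loc=0.0, scale=0.3, size=(200, 2))   # toy fingerprints
y_auth = rng.integers(0, 2, size=200)                    # 2 known devices

# Stand-in "generator": perturb authorized samples to emulate outliers.
X_fake = X_auth + rng.normal(scale=1.5, size=X_auth.shape)
y_fake = np.full(len(X_fake), 2)                         # extra "unknown" class

clf = LogisticRegression(max_iter=1000).fit(
    np.vstack([X_auth, X_fake]), np.concatenate([y_auth, y_fake]))
```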

Data-driven Small-signal Modeling for Converter-based Power Systems arxiv:2108.13046 📈 1

Francesca Rossi, Eduardo Prieto-Araujo, Marc Cheah-Mane, Oriol Gomis-Bellmunt

**Abstract:** This article details a complete procedure for deriving a data-driven small-signal model useful for converter-based power system studies. To compute the model, Decision Tree (DT) regression, both with single DTs and DT ensembles, and spline regression have been employed, and their performance compared in terms of accuracy, training time, and computing time. The methodology includes a comprehensive step-by-step procedure for developing the model: data generation by conventional simulation and mathematical models, database (DB) arrangement, regression training and testing, and prediction for new instances. The methodology was developed on a basic network and then tested on a more complex system to show the validity and usefulness of the suggested approach. Both power system test cases have the essential characteristics of converter-based power systems, simulating high penetration of converter-interfaced generation and the presence of HVDC links. Moreover, we propose a visual representation of the small-signal stability analysis results over a wide range of system operating conditions, exploiting DT regression. Finally, possible applications of the model are discussed, highlighting its potential for further small-signal power system studies.
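
The regression step can be sketched with a single decision tree on synthetic data; the features and damping-ratio target below are illustrative assumptions, not the paper's database.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic operating conditions -> small-signal stability index.
rng = np.random.default_rng(0)
X = rng.uniform(size=(500, 3))          # e.g. load level, converter P, Q
damping = 0.2 - 0.15 * X[:, 1] + 0.02 * rng.normal(size=500)

# Train on most of the database, test on the held-out instances.
model = DecisionTreeRegressor(max_depth=6).fit(X[:400], damping[:400])
pred = model.predict(X[400:])
print(f"test MAE: {np.abs(pred - damping[400:]).mean():.4f}")
```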

The Second International Verification of Neural Networks Competition (VNN-COMP 2021): Summary and Results arxiv:2109.00498 📈 0

Stanley Bak, Changliu Liu, Taylor Johnson

**Abstract:** This report summarizes the second International Verification of Neural Networks Competition (VNN-COMP 2021), held as a part of the 4th Workshop on Formal Methods for ML-Enabled Autonomous Systems that was collocated with the 33rd International Conference on Computer-Aided Verification (CAV). Twelve teams participated in this competition. The goal of the competition is to provide an objective comparison of the state-of-the-art methods in neural network verification, in terms of scalability and speed. Along this line, we used standard formats (ONNX for neural networks and VNNLIB for specifications), standard hardware (all tools are run by the organizers on AWS), and tool parameters provided by the tool authors. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this competition.

DuTrust: A Sentiment Analysis Dataset for Trustworthiness Evaluation arxiv:2108.13140 📈 0

Lijie Wang, Hao Liu, Shuyuan Peng, Hongxuan Tang, Xinyan Xiao, Ying Chen, Hua Wu, Haifeng Wang

**Abstract:** While deep learning models have greatly improved the performance of most artificial intelligence tasks, they are often criticized as untrustworthy due to the black-box problem. Consequently, many works have studied the trustworthiness of deep learning. However, since most open datasets are designed for evaluating the accuracy of model outputs, there is still a lack of appropriate datasets for evaluating the inner workings of neural networks, which clearly hinders the development of trustworthiness research. Therefore, to systematically evaluate the factors involved in building trustworthy systems, we propose a novel, well-annotated sentiment analysis dataset for evaluating robustness and interpretability. To evaluate these factors, our dataset contains diverse annotations covering challenging instance distributions, manual adversarial instances, and sentiment explanations. Several evaluation metrics are further proposed for interpretability and robustness. Based on the dataset and metrics, we conduct comprehensive comparisons of the trustworthiness of three typical models and study the relations between accuracy, robustness, and interpretability. We release this trustworthiness evaluation dataset at \url{https://github/xyz} and hope our work can facilitate progress on building more trustworthy systems for real-world applications.

Deep kernel machines and fast solvers for deep kernel machines arxiv:2108.13097 📈 0

Laurence Aitchison

**Abstract:** Deep neural networks (DNNs), with the flexibility to learn good top-layer representations, have eclipsed shallow kernel methods that lack that flexibility. Here, we take inspiration from DNNs to develop the first non-Bayesian deep kernel method, the deep kernel machine. In addition, we develop a solver for the intermediate-layer kernels in deep kernel machines that converges in around 10 steps, exploiting matrix solvers initially developed in the control theory literature. These are many times faster than the usual gradient descent approach and generalise to arbitrary architectures. While deep kernel machines currently scale poorly in the number of datapoints, we believe this can be rectified in future work, allowing deep kernel machines to form the basis of a new class of much more efficient deep nonlinear function approximators.
